Mahout contributions for next sprint
Saikat Kanjilal
2016-05-13 21:10:25 UTC
Hello Mahout Committers,
I spoke to AndrewM and he suggested that I work on some precision/recall or run-time performance module, or possibly streaming k-means (apparently there's a new JIRA associated with this). I was wondering which of these would be the most impactful and interesting to the project. I would be glad to help in any of these areas.
Thanks in advance.
Saikat Kanjilal
2016-05-13 23:06:08 UTC
If it's OK with everyone, I'll come up with a proposal to float on the dev list and create a JIRA item. If you have already had any thoughts or discussions around the runtime performance module, I'd be much obliged if you let me know.

My thoughts around the runtime perf module:
In a nutshell, the module could:
1) take in an algorithm (or a set of algorithms) and a combination of features
2) measure latency, run-time performance, CPU load, and I/O characteristics of the algorithm
3) spit the results out to some visualization dashboard (could we use Zeppelin for this?)

I imagine it would be something that Mahout committers could run for every release, before and after any new code gets submitted. One other point to consider is whether the performance metrics will cover Samsara alone, or whether the dashboard should show metrics for Samsara connected to each of the backends.
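
To make step 2 a bit more concrete, here is a rough Java sketch of the timing part only. Everything in it is an assumption for illustration: `PerfHarness`, `Result`, `measure`, and the CSV layout are hypothetical names, not existing Mahout code, and the workload is a stand-in loop rather than a real Mahout algorithm. It covers wall-clock latency only, not CPU load or I/O.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical harness for illustration; none of these names exist in Mahout.
final class PerfHarness {

    /** Wall-clock timing summary for one algorithm, in milliseconds. */
    static final class Result {
        final String name;
        final int runs;
        final double meanMs;
        final double minMs;
        final double maxMs;

        Result(String name, long[] nanos) {
            this.name = name;
            this.runs = nanos.length;
            this.meanMs = Arrays.stream(nanos).average().orElse(0.0) / 1e6;
            this.minMs = Arrays.stream(nanos).min().orElse(0L) / 1e6;
            this.maxMs = Arrays.stream(nanos).max().orElse(0L) / 1e6;
        }

        /** One CSV row (name,runs,mean,min,max) that a dashboard could ingest. */
        String toCsv() {
            return String.format(Locale.ROOT, "%s,%d,%.3f,%.3f,%.3f",
                    name, runs, meanMs, minMs, maxMs);
        }
    }

    /** Runs the algorithm warmupRuns times untimed, then timedRuns times timed. */
    static Result measure(String name, Runnable algorithm, int warmupRuns, int timedRuns) {
        if (timedRuns <= 0) {
            throw new IllegalArgumentException("timedRuns must be positive");
        }
        // Warm-up runs give the JIT a chance to compile the hot path first.
        for (int i = 0; i < warmupRuns; i++) {
            algorithm.run();
        }
        long[] nanos = new long[timedRuns];
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            algorithm.run();
            nanos[i] = System.nanoTime() - start;
        }
        return new Result(name, nanos);
    }

    /** Stand-in workload; a real run would call the algorithm under test. */
    static double sumOfSquares(int n) {
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            total += (double) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        Result r = measure("sum-of-squares", () -> sumOfSquares(1_000_000), 3, 10);
        System.out.println("name,runs,mean_ms,min_ms,max_ms");
        System.out.println(r.toCsv());
    }
}
```

Hand-rolled timing like this is easily skewed by JIT and GC effects, so a real module would probably want to build on an established benchmarking tool such as JMH rather than on `System.nanoTime()` loops.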

Thoughts?
Proposal to come in the next week or so.
Thanks in advance.
Subject: RE: Mahout contributions for next sprint
Date: Fri, 13 May 2016 21:41:57 +0000
Hello Saikat
https://github.com/apache/mahout/blob/master/math-scala/src/main/scala/org/apache/mahout/classifier/stats/ConfusionMatrix.scala
Embarrassed to say that I forgot that myself.
The runtime performance module does sound like it would be very useful.
Andy