Sebastian
2016-07-21 12:13:18 UTC
Hi Andrew,
I think this topic is broader than just defining a few traits. A popular
way of integrating ML algorithms is via the combination of dataframes
and pipelines, similar to what scikit-learn and SparkML are offering at the
moment. Maybe it could make sense to integrate with what they have
instead of starting our own efforts?
Best,
Sebastian
Hi All,
I'd like to draw your attention to MAHOUT-1856: https://issues.apache.org/jira/browse/MAHOUT-1856
This is a discussion that has popped up several times over the last couple of years. As we move towards building out our algorithm library, it would be great to nail this down now.
Most importantly, we want to avoid being criticized as "a loose bag of algorithms", as we sometimes have been in the past.
The main point is that it would be good to lay out common traits for Classification, Clustering, and Optimization algorithms.
This is just a start. I created this issue a few months back and intentionally left off Recommender because I was unsure whether there were common traits across recommenders. By traits, I am referring both to the literal meaning and, more specifically, to actual Scala traits.
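To make the idea concrete, here is a minimal sketch of what such Scala traits might look like. It is only an illustration for discussion: every name in it (Model, Fitter, Classifier, Clusterer, Optimizer, OnlineLearner, FirstKCenters) is hypothetical, none of them exists in Mahout, and plain arrays stand in for Mahout's own vector and matrix types.

```scala
// Hypothetical sketch for discussion only; these names are NOT part of Mahout.
object AlgorithmTraits {
  // Assumption: a plain array stands in for a real vector type.
  type Vec = Array[Double]

  // A fitted model maps an input to an output.
  trait Model[In, Out] {
    def predict(x: In): Out
  }

  // Batch training: data in, fitted model out.
  trait Fitter[Data, M] {
    def fit(data: Data): M
  }

  // One trait per algorithm class named in this thread.
  trait Classifier[Label] extends Fitter[Seq[(Vec, Label)], Model[Vec, Label]]
  trait Clusterer extends Fitter[Seq[Vec], Model[Vec, Int]]
  trait Optimizer {
    def minimize(objective: Vec => Double, start: Vec): Vec
  }

  // Online flavor: the model is updated one example at a time.
  trait OnlineLearner[Example, M] {
    def update(model: M, example: Example): M
  }

  // Squared Euclidean distance between two vectors of equal length.
  def sqDist(a: Vec, b: Vec): Double =
    a.zip(b).map { case (p, q) => (p - q) * (p - q) }.sum

  // Toy clusterer: uses the first k points as fixed centers (no iteration).
  // It exists only to show that the traits compose; it is not k-means.
  class FirstKCenters(k: Int) extends Clusterer {
    require(k > 0, "k must be positive")

    def fit(data: Seq[Vec]): Model[Vec, Int] = {
      require(data.nonEmpty, "need at least one point")
      val centers = data.take(k).toIndexedSeq
      new Model[Vec, Int] {
        // Assign a point to the index of its nearest center.
        def predict(x: Vec): Int =
          centers.indices.minBy(i => sqDist(centers(i), x))
      }
    }
  }
}
```

With traits along these lines, calling code could depend on Clusterer or Classifier rather than on a concrete algorithm, which is one possible answer to the "loose bag of algorithms" criticism. Whether recommenders fit the same shape is left open here, as in the issue itself.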
@pat, @tdunning, @ssc, could you give your thoughts on this?
It would also be good to add online flavors of the different algorithm classes into the mix.
@tdunning could you share some thoughts here?
Trevor Grant will be heading up this effort. It would be great if we could all, as a team, come up with abstract design plans for each class of algorithm and determine what the current "classes of algorithms" are, since each of us has our own unique blend of specializations and could give our thoughts on this.
Currently this is really the opening of the conversation.
It would be best to post thoughts on: https://issues.apache.org/jira/browse/MAHOUT-1856
Any feedback is welcome.
Thanks,
Andy