Discussion:
One year of Mahout
Isabel Drost-Fromm
2016-05-24 10:06:55 UTC
Hi,

the past year has seen quite a few releases of Mahout. To my knowledge lots of
integration work went into the project.

What do people think - would it make sense to reflect on that a bit and post the
most important findings on ***@mahout.apache.org? I'm happy to do the final
editing, but I need your input on what should be included...


Isabel
Suneel Marthi
2016-05-24 10:28:27 UTC
Yes, we have been talking about writing a blog post. One of us will have to
start on that.

Some of the accomplishments being:

1. New Mahout Book - 'Apache Mahout: Beyond MapReduce' - Dmitriy Lyubimov,
Andrew Palumbo
2. Backend Integrations with Spark, Flink, H2O
3. Multi-Modal Recommender based on Mahout-Samsara in PredictionIO
(proposed for Apache Incubator) - Pat Ferrel
4. Major Linear Algebra performance enhancements (Sebastiano Vigna,
Dmitriy Lyubimov)
5. Mahout talks in conferences (Apache Big Data 2016, Vancouver)
6. Most recent being visualization and plotting with Apache Zeppelin
(Trevor Grant)

Suneel
Isabel Drost-Fromm
2016-05-24 11:50:45 UTC
Post by Suneel Marthi
Yes, we have been talking about writing a blog post. One of us will have to
start on that.
I can promise to start but if I'm not able to finish for whatever reason please
give it to someone else.

Where should we collaborate on this? I'd love for this to be a community effort,
so some place that accepts PRs would be ideal, maybe? Any suggestions? Otherwise
I'd start on my private github repo with the explicit statement that I'm more
than happy to have stuff moved elsewhere.
Post by Suneel Marthi
1. New Mahout Book - 'Apache Mahout: Beyond MapReduce' - Dmitriy Lyubimov,
Andrew Palumbo
Sounds like a short paragraph.
Post by Suneel Marthi
2. Backend Integrations with Spark, Flink, H2O
I think we could have one retrospective part here - what did we learn walking
through these integrations and one future part - how do downstream users benefit
from using those.
Post by Suneel Marthi
3. Multi-Modal Recommender based on Mahout-Samsara in PredictionIO
(proposed for Apache Incubator) - Pat Ferrel
Sounds like we should have a brief paragraph in the overview post and an
extended one on the deeper technical details and reasoning behind it. Pat, do
you think you can provide some notes on that?
Post by Suneel Marthi
4. Major Linear Algebra performance enhancements (Sebastiano Vigna,
Dmitriy Lyubimov)
Performance enhancements are always a great win.
Post by Suneel Marthi
5. Mahout talks in conferences (Apache Big Data 2016, Vancouver)
6. Most recent being visualization and plotting with Apache Zeppelin
(Trevor Grant)
Yeah for screenshots and community pictures ;)


Isabel
Suneel Marthi
2016-05-24 12:05:47 UTC
Post by Isabel Drost-Fromm
Post by Suneel Marthi
Yes, we have been talking about writing a blog post. One of us will have to
start on that.
I can promise to start but if I'm not able to finish for whatever reason please
give it to someone else.
Where should we collaborate on this? I'd love for this to be a community effort,
so some place that accepts PRs would be ideal, maybe? Any suggestions? Otherwise
I'd start on my private github repo with the explicit statement that I'm more
than happy to have stuff moved elsewhere.
Post by Suneel Marthi
1. New Mahout Book - 'Apache Mahout: Beyond MapReduce' - Dmitriy Lyubimov,
Andrew Palumbo
Sounds like a short paragraph.
http://www.weatheringthroughtechdays.com/2016/02/mahout-samsara-book-is-out.html
Post by Isabel Drost-Fromm
Post by Suneel Marthi
2. Backend Integrations with Spark, Flink, H2O
I think we could have one retrospective part here - what did we learn walking
through these integrations and one future part - how do downstream users benefit
from using those.
<I don't wanna recall the pains>
Post by Isabel Drost-Fromm
Post by Suneel Marthi
3. Multi-Modal Recommender based on Mahout-Samsara in PredictionIO
(proposed for Apache Incubator) - Pat Ferrel
Sounds like we should have a brief paragraph in the overview post and an
extended one on the deeper technical details and reasoning behind it, Pat, do
you think you can provide some notes to that?
https://www.mapr.com/blog/mahout-spark-what%E2%80%99s-new-recommenders

https://www.mapr.com/blog/mahout-spark-whats-new-recommenders%E2%80%94part-2
Post by Isabel Drost-Fromm
Post by Suneel Marthi
4. Major Linear Algebra performance enhancements (Sebastiano Vigna,
Dmitriy Lyubimov)
Performance enhancements are always a great win.
Post by Suneel Marthi
5. Mahout talks in conferences (Apache Big Data 2016, Vancouver)
6. Most recent being visualization and plotting with Apache Zeppelin
(Trevor Grant)
Yeah for screenshots and community pictures ;)
https://trevorgrant.org/2016/05/19/visualizing-apache-mahout-in-r-via-apache-zeppelin-incubating/
Post by Isabel Drost-Fromm
Isabel
Andrew Musselman
2016-05-25 16:51:17 UTC
I've tried inviting the rest of the PMC but was not able to; I've commented
on https://issues.apache.org/jira/browse/INFRA-11957 so we'll see how to
resolve it.
Post by Isabel Drost-Fromm
Where should we collaborate on this? I'd love for this to be a community effort,
so some place that accepts PRs would be ideal, maybe? Any suggestions? Otherwise
I'd start on my private github repo with the explicit statement that I'm more
than happy to have stuff moved elsewhere.
We could start a directory in gh-pages branch for the blog drafts (where
https://github.com/apache/mahout/tree/gh-pages
Not sure how we would want to deal with the markdown rendering that way.
Andrew Musselman
2016-05-25 18:47:31 UTC
Infra is adding accounts and I'm making them available as they're created.

i***@apache.org
2016-05-26 08:02:48 UTC
Post by Andrew Musselman
I've tried inviting the rest of the PMC but was not able to; I've commented
on https://issues.apache.org/jira/browse/INFRA-11957 so we'll see how to
resolve it.
Great. Thanks for the heads up.


Isabel
i***@apache.org
2016-05-26 08:07:08 UTC
Post by Andrew Musselman
I've tried inviting the rest of the PMC but was not able to; I've commented
on https://issues.apache.org/jira/browse/INFRA-11957 so we'll see how to
resolve it.
At least for myself I just received the account confirmation and invitation to the Mahout blog.

Thanks Andrew for taking care of this,
Isabel
Suneel Marthi
2016-06-08 02:28:04 UTC
Shall we get going with this? It's about time we called out the
accomplishments and the new shape of Mahout.
i***@apache.org
2016-06-08 13:51:14 UTC
Post by Suneel Marthi
Shall we get going with this? It's about time we called out the
accomplishments and the new shape of Mahout.
I haven't been actively contributing to Mahout for far too long. I posted a PR
to the GitHub repo; let me know if this should be done differently.

Isabel
Suneel Marthi
2016-06-08 16:35:48 UTC
This is very helpful; we can start filling in the meat.
Post by i***@apache.org
Post by Suneel Marthi
Shall we get going with this? It's about time we called out the
accomplishments and the new shape of Mahout.
I haven't been actively contributing to Mahout for too long, posted a PR to the
Github repo, let me know if this should be done differently.
Isabel
Isabel Drost-Fromm
2016-05-25 10:29:39 UTC
Post by Suneel Marthi
Yes, we have been talking about writing a blog post. One of us will have to
start on that.
I've created https://issues.apache.org/jira/browse/INFRA-11972 according to http://www.apache.org/dev/project-blogs


Isabel
Suneel Marthi
2016-05-25 10:51:38 UTC
We had a blog site set up for the project this week at
https://blogs.apache.org/mahout and some of us have access to write a blog.
Post by Isabel Drost-Fromm
Post by Suneel Marthi
Yes, we have been talking about writing a blog post. One of us will have to
start on that.
I've created https://issues.apache.org/jira/browse/INFRA-11972 according
to http://www.apache.org/dev/project-blogs
Isabel
Isabel Drost-Fromm
2016-05-25 11:00:48 UTC
Post by Suneel Marthi
We had a blog site set up for the project this week at
https://blogs.apache.org/mahout and some of us have access to write a blog.
Oh - nice. Apparently I missed that information on dev@ - who has access? Does it make sense to give admin rights to the whole PMC?

Isabel
Suneel Marthi
2016-05-25 13:02:54 UTC
We just got the access this Monday and I didn't have time to post on dev@
since I am on work travel.

Could someone (from PMC) please file a ticket with infra to grant access
for all of PMC?
Post by Isabel Drost-Fromm
Post by Suneel Marthi
We had a blog site set up for the project this week at
https://blogs.apache.org/mahout and some of us have access to write a blog.
Does it make sense to give admin rights to the whole PMC?
Isabel
Andrew Musselman
2016-05-25 15:40:02 UTC
Anyone with admin rights can add people I believe; I'll do it today.
i***@apache.org
2016-05-26 13:39:53 UTC
Post by Isabel Drost-Fromm
https://github.com/apache/mahout/tree/gh-pages
Makes sense to me.


Isabel