Discussion:
looking to contribute to the project
dustin vanstee
2017-03-15 01:12:11 UTC
Hi, I have been looking into Mahout and think it has some very nice
ML/linear algebra capabilities. I would like to contribute to the project, and
I was hoping someone on the mailing list might be able to give me a few
ideas about where I could start. Thanks!
Trevor Grant
2017-03-15 07:24:58 UTC
Hey Dustin!

Welcome to the community.

At the moment, we are in the middle of a release. The most immediate thing
you could help with would be to help us test the release candidate. See
Andrew's email.

Moving forward though, there are lots of opportunities.
Some things that have been kicked around on here over the last few months
include:
- Migrating the website to a git-based setup so that non-committers can edit
and contribute to the docs.
- Expanding the algorithms section (are there any algorithms you are
familiar with? Implementing one in Mahout would be a good start).
- I have been toying with some Docker-based integration tests, if you happen
to be familiar with Docker and using it for Maven ITs (or want to learn).
- Beginner issues: at the moment there aren't many on the JIRA board because
we fixed most in preparation for the release.

Testing the release would be a good starting point, however, because it will
get you familiar with building Mahout (a necessary first step).

Items 1 and 3 are a bit advanced for someone just starting out, so unless
you have some specific familiarity, I would direct you toward number 2.

In that case, check out:
https://github.com/apache/mahout/tree/master/math-scala/src/main/scala/org/apache/mahout/math/algorithms

That is the algorithm framework; look through it. If there is an
algorithm you have in mind (try to start with an easy one), let us know and
open a JIRA ticket!

Best,

tg

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things." -Virgil*
Manuel Sequino
2017-03-15 15:42:01 UTC
Hi Trevor,
I'd like to contribute to Mahout, especially by working on something
Docker-related. I am pretty new, but I think I could help.

What about this bullet?

"I have been toying with some docker based integration tests if you happen
to be familiar with Dockers and using them for maven IT (or want to learn)"

Where can I get more info? Jira doesn't contain the "docker" keyword.

Best regards,

---------------------------------------
Manuel Sequino

Email: ***@gmail.com
Skype: manuel.sequino
+39 320 4869904

Linkedin page <https://it.linkedin.com/pub/manuel-sequino/96/261/494>
--------------------------------------
Trevor Grant
2017-03-19 00:28:10 UTC
Hey Manuel,

(I think I accidentally dropped ***@m.a.o when replying; I've added them
back.)
Let me open a WIP PR and then we can discuss on there.

In general though, the current form creates a Docker image with Hadoop
and/or Spark and mounts the project directory in the Docker image at
/opt/mahout (which is also Mahout Home).
A script also runs at startup that runs a few of the examples/CLI
drivers.

We want:
- A script which runs through an exhaustive list of tests (CLI
drivers/examples/etc.)
- A way to tell whether those tests passed or failed (checking the output?)
- A way to fail the build if the examples/etc. fail. (No idea how this
works; I've always tried to make a build succeed, never tried to fail one.)
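
A minimal sketch of the last two items in that list, written in Python only for illustration. It is not taken from the dockerITs branch. The `CHECKS` list, `run_checks`, and `exit_code` are hypothetical names, and the two commands are placeholders standing in for the real CLI drivers and examples. It assumes that a command's exit code is a good enough pass/fail signal; checking the captured output would be an extra step.

```python
import subprocess
import sys

# Hypothetical check list: each entry is (label, argv). The real list would
# hold the Mahout CLI drivers and examples; these placeholders only
# demonstrate one passing and one failing command.
CHECKS = [
    ("passing placeholder", [sys.executable, "-c", "print('ok')"]),
    ("failing placeholder", [sys.executable, "-c", "raise SystemExit(1)"]),
]


def run_checks(checks):
    """Run each command and return a list of (label, passed) pairs."""
    results = []
    for label, argv in checks:
        # A zero exit code is treated as a pass. Output is captured so it
        # could also be inspected for expected strings if needed.
        proc = subprocess.run(argv, capture_output=True, text=True)
        results.append((label, proc.returncode == 0))
    return results


def exit_code(results):
    """Return 0 if every check passed, 1 otherwise."""
    return 0 if all(passed for _, passed in results) else 1


if __name__ == "__main__":
    results = run_checks(CHECKS)
    for label, passed in results:
        print(("PASS" if passed else "FAIL") + ": " + label)
    # Exiting non-zero is the signal a calling build step can act on.
    sys.exit(exit_code(results))
```

If a script like this were run from a build step that treats a non-zero exit status as an error, a single failing example would be enough to fail the build.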




Post by Manuel Sequino
Hi Trevor,
let's start with this task.
I did some experiments with Maven and Docker, but I am still not
comfortable.
Now it looks clear; if I have any doubts, I'll get back to you.
Just one problem: I don't know how to write a JIRA or what to put in it.
Could you direct me?
Best regards,
Manuel
Post by Trevor Grant
Hey Manuel,
Awesome!! I don't think I even started a JIRA yet. I was literally just
toying. I saw some cool stuff when building Apache Streams-Incubating and
copied it. Having Maven kick off Docker images is a strange thing.
https://github.com/rawkintrevo/mahout/tree/docker-based-its/dockerITs
At this point I 1) recognize it is a thing we should do to streamline our
testing, and 2) don't know enough to intelligently write a JIRA.
The idea is, there should be a Maven phase where we fire up pseudo Spark
and Hadoop clusters, and then run all of the examples, CLI drivers, and
shell tests, and fail loudly should any of those tests fail.
As I was telling Saikat, I'm also kind of busy with 100 other things. If you
want to take point on this, feel free to write a JIRA, copy or fork what
I've done so far, and go.
Again, also check out Apache Streams-Incubating, since I am admittedly
copying them.
tg