Discussion:
[Hello] from NASa
Steven NASa
2016-05-20 14:07:33 UTC
Hi Folk & Masters,

My name is *NASa*. I currently work for a B2C e-commerce company in China,
doing transaction-processing development in C++ and Java on Linux.

As you know, a *Recommender System* is valuable to an e-commerce shopping
website like Amazon. I have been asked to design and implement a
Recommender System that can bring some value to my company. Our system is
written in C++, so I searched for a robust Machine Learning framework in
C++ that would help me implement a Recommender System easily. I did not
find one that satisfies my requirements, only some C++ math libraries.

Our system is built on internal distributed frameworks (RPC and DB access)
written in C++ on Linux. But I find it really inconvenient to implement a
Recommender System in C++ from scratch without the support of a
distributed computing library, for example implementing *Collaborative
Filtering* with SVD in a distributed way. So I have been looking for a
framework/library that is designed for distributed systems. That is how I
came to *Mahout*.

I would like to build a library that helps people quickly and easily build
a Recommender System on a distributed system and use Machine Learning
algorithms in a distributed way. Apache has many amazing projects that
help people build robust distributed systems easily, so I am moving to the
Java environment.

I am new to *Mahout*, *Hadoop*, *Spark*, and *Scala*. I took Andrew Ng’s
“Machine Learning” course on Coursera
<https://www.coursera.org/learn/machine-learning/home/welcome>, so I have
basic knowledge of Machine Learning, and I am now moving on to *Deep
Learning*, *Convex Optimization*, and implementations of other
mathematical optimization methods. I am still learning and getting
familiar with Mahout. I hope to contribute some code to Mahout in the near
future, learning by coding and coding by learning.
NASa 2016/05/20
​
Khurrum Nasim
2016-05-20 15:32:18 UTC
Sounds more like demand prediction to me.

However, your system should be able to interact with other non-C/C++ systems.
There is something called Apache Thrift.

Which brings me to the following: would it be a valuable feature for the Mahout library to provide
connectivity with other systems using Thrift?

Thoughts?

Khurrum

p.s. Andrew Ng can put you to sleep easily.

Andrew Musselman
2016-05-20 16:59:03 UTC
Steven, thanks for reaching out, and welcome to the project!

If you want to discuss how to build a recommender system, the user list is
probably more appropriate, and we all hang out there too.

If you'd like to contribute to the project, dev's the right list. Let us
know if you have any trouble getting up and running and we can help out.

Best
Andrew
​
Suneel Marthi
2016-05-20 17:00:34 UTC
Welcome to the project Steven!!
​
Pat Ferrel
2016-05-21 14:06:35 UTC
Hi Steven,

We have implemented SVD, ALS, and CCO for recommenders, but these are only core algorithms, not really recommenders as Mahout has done in the past. The reason for this is that there are data prep, data ingestion, and serving components that, in a modern system, must also be supplied. So far Mahout has stayed away from actually including servers, either for input or output.

That said, there is plenty of room for algorithm development in Mahout. I worked on the CCO algorithm, which uses PredictionIO (proposed for the Apache Incubator) to supply the serving components.

Someone with your experience in real-life use of recommenders is certainly welcome.

What type of project did you have in mind?



Steven NASa
2016-05-21 14:30:27 UTC
Hi Pat,

Thank you for your reply. I fully understand that core algorithms and data
are two different parts of the system; this is why we have two major ideas:
"Big data" and "Machine Learning".

My requirements for a recommender are just like what Amazon does:
item-based, but the number of items and users is very large, which leads
to a huge matrix. So I am still learning how to use Mahout to do the matrix
computation on a distributed system. Once I am familiar with Mahout, I think
I can do some work on GPU acceleration for matrix computation and some
other mathematical optimization.
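
To make the item-based idea concrete, here is a minimal single-machine sketch
in plain Python. It is only an illustration under my own assumptions: the
function names (item_similarities, recommend) and the nested-dict layout of
the ratings are hypothetical, it uses plain cosine similarity, and it is not
how Mahout implements its distributed algorithms.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(ratings):
    """Cosine similarity between every pair of items rated by a common user.

    ratings: dict user -> dict item -> rating (a sparse user-item matrix).
    Returns dict item -> dict other_item -> similarity.
    """
    norms = defaultdict(float)  # squared L2 norm of each item column
    dots = defaultdict(float)   # dot product of each co-rated item pair
    for item_ratings in ratings.values():
        items = sorted(item_ratings)
        for i, a in enumerate(items):
            norms[a] += item_ratings[a] ** 2
            for b in items[i + 1:]:
                dots[(a, b)] += item_ratings[a] * item_ratings[b]

    sims = defaultdict(dict)
    for (a, b), dot in dots.items():
        denom = sqrt(norms[a]) * sqrt(norms[b])
        if denom == 0.0:
            continue  # an all-zero column has no defined cosine similarity
        sims[a][b] = sims[b][a] = dot / denom
    return sims


def recommend(ratings, sims, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted rating sum."""
    seen = ratings[user]
    scores = defaultdict(float)
    for item, rating in seen.items():
        for other, s in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += s * rating
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores, key=lambda it: (-scores[it], it))[:top_n]
```

The pairwise loop only touches items that share a user, so the work follows
the sparsity of the data. With very many items and users, the co-occurrence
counts are what has to be spread across machines, which is the part a
distributed framework takes over.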
About the data prep, I think we can define an abstraction of conventions
for data prep, data ingestion, and serving components. Users can follow
these conventions to feed data to Mahout.
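
As a rough sketch of what such a convention could look like, assume each
input line is "user,item[,weight]" with "#" for comments. This format and the
load_interactions name are hypothetical, chosen only for illustration; they
are not an existing Mahout input format or API.

```python
import csv
from collections import defaultdict


def load_interactions(lines, delimiter=","):
    """Parse 'user,item[,weight]' rows into a dict user -> item -> weight.

    Blank lines and lines starting with '#' are skipped. A missing weight
    defaults to 1.0, so plain (user, item) events can be fed in unchanged.
    """
    matrix = defaultdict(dict)
    for row in csv.reader(lines, delimiter=delimiter):
        if not row or row[0].startswith("#"):
            continue
        if len(row) < 2:
            raise ValueError(f"expected at least user and item, got {row!r}")
        user, item = row[0].strip(), row[1].strip()
        weight = float(row[2]) if len(row) > 2 and row[2].strip() else 1.0
        matrix[user][item] = weight
    return dict(matrix)
```

Any source that can emit rows in the agreed shape (log files, a DB export, an
RPC stream) could then feed the same downstream computation without the
algorithm code knowing where the data came from.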

Steven NASa
2016/05/21
​
Pat Ferrel
2016-05-22 01:05:26 UTC
Well, IMO big data tensor math is Mahout’s strongest point and GPUs are immediately on the roadmap.


Khurrum Nasim
2016-05-22 19:17:07 UTC
Interesting.
