I really appreciate that you wrote at such length to clarify my confusion.
Much appreciated. Thank you so much :)
Post by Dmitriy Lyubimov
Prakash,
(1) to be clear, the ASF trademark and branding policy is not to endorse
the views of 3rd party publications and to ask 3rd party writers to
disclose that their views are not endorsed by the ASF project. To that end,
an ASF project can't really tell you that some publication is
"(in)appropriate". 3rd party publications stand on their own account and
cannot by default be tied to the ASF views. That said, committers have
their opinions, which of course exhibit certain variation, and some things
do get linked on the site or mentioned on Twitter via the Mahout account. But
some do not. Best practice is always to ask for pointers on the list first.
(2) I am not sure what your definition of "appropriate" is, but on a
personal note, most of these links were quite "appropriate" at the time in
the sense that they were published before 2/2014 and before release 0.10,
and therefore were describing what was in the project at that
time. Thus, the MIA fuzzy k-means example in your very link dates back to
June 2011 and is relevant to a release circa 0.6 or 0.7. So if you mean
whether those algorithms were "in the fold" back then, the answer is yes,
they were. I see no contradiction between these publications and the
current reality.
(3) If something deprecated reasonably works for a particular purpose, I
think there's no reason not to use/write about it.
*However, I just don't think most of these particular deprecated
Java-based MR algorithms work for the purposes of an established benchmark
or a standard in research -- modern edgy ML is usually much faster
(and often, more convenient too). *
Don't mean to come across as preachy, but research is usually held to
quite a different standard when it comes to claims than an ad-hoc industrial
application or a blog entry. I simply can't see how any of the MR stuff can
work for that purpose today.
(4) if your "appropriate"-ness question is really about why they were
deprecated, well, there are two main reasons for that. First, it seems that
the realization of MR limitations w.r.t. iterative applications quickly
caught up with both users and contributors, and, second, most contributors
abandoned their MR contributions (most likely for the same reason). I
contributed a couple of MR algorithms back in 2010-2011 but I am absolutely
fine with them being deprecated and written off the books. If something is
not being used, or people (exactly as your case has demonstrated) don't get
answers to their questions, or bugs are not being fixed, it is difficult to
justify keeping the code. It is much easier to focus on what is actually
being used and maintained instead. That is the very banal and boring reason
for the deprecations.
(5) Finally, if your goal is simply to learn "how the project works", just
like Suneel said, I'd suggest following the release notes and the project site
(news and howtos) -- your last link in fact should perhaps be your first.
And the list, of course.
As you probably can tell by release notes, the last two years were
practically exclusively about multiplatform Mahout involvement with Spark,
Flink and H2O backends, as well as the Samsara environment for general
numeric analysis (but no MR stuff beyond very nominal fixes).
I also agree that it looks like the Mahout site perhaps should be more
clear about the status of MR algorithms (it used to be more clear, I think,
but all news eventually becomes old news).
Hope this clarifies.
-d
On Thu, Apr 28, 2016 at 12:02 PM, Prakash Poudyal <
Post by Prakash Poudyal
Hi!
Thank you for your emails !!
Actually, I need to use fuzzy clustering to cluster sentences in my
research. This is my goal.
I started using Mahout's Fuzzy K-means clustering last week! I
found several blog links and many other helpful documents! I went
through them because, being new, I find this the best, easiest and fastest
way to learn how Mahout works. In my opinion, many newcomers do the same as I
do. Only after getting used to the tools do people focus on the work and
go deeper.
I have gone through many blogs and sites to learn about Mahout; some of
them are:
http://technobium.com/introduction-to-clustering-using-apache-mahout/
http://tuxdna.github.io/pages/mahout.html
https://github.com/tdunning/MiA/blob/master/src/main/java/mia/clustering/ch09/FuzzyKMeansExample.java
http://www.programering.com/a/MDNwgTMwATI.html
https://www.safaribooksonline.com/library/view/apache-mahout-clustering/9781783284436/ch04.html
https://ymnliu.wordpress.com/2015/11/05/install-apache-mahout-in-eclipse/
https://mahout.apache.org/
What do you say about these sites? Are these sites not appropriate?
I raised my problem several times, on the mailing list and even on IRC, but
I only got a response today :(
So finally, it would be great if you could answer my
following questions.
Is Apache Mahout an appropriate tool for clustering sentences through
fuzzy clustering?
If the answer is "YES":
Which version of Mahout?
Can you write the steps that I need to follow, or give me
appropriate documentation (links)?
Thanks
Prakash Poudyal
Portugal