Thanks so much. Will run some more experiments to validate the initial
outcomes. A Spark upgrade is definitely in the pipeline and will likely
solve some of these performance issues.
other than the Mahout version change. I'm sorry, the data is sensitive, so
sharing it won't be possible.
instead of computeAtBzipped for AtB. This is meant to speed things up,
though, so I'm not sure it's relevant as far as this problem is concerned.
Post by Pat Ferrel
I have been using the same function through all those versions of Mahout.
I'm running on newer versions of Spark, 1.4-1.6.2. Using my datasets there
has been no slowdown. I assume that you are only changing the Mahout
version, leaving data, Spark, HDFS, and all config the same. In which case I
wonder if you are somehow running into limits of your machine, like memory?
Have you allocated a fixed executor memory limit?
There has been almost no code change to item similarity. Dmitriy, do you
know if the underlying AtB has changed? I seem to recall the partitioning
was set to 'auto' around 0.11. We were having problems with large numbers of
small part files from Spark Streaming causing partitioning headaches, as I
recall. In some unexpected way the input structure was trickling down into
partitioning decisions made in Spark.
The first thing I'd try is giving the job more executor memory; the second
is to upgrade Spark. A 3x slowdown is a pretty big deal if it isn't helped
by these easy fixes, so can you share your data?
Mahout 0.11 targets Spark 1.3+.
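To make the executor-memory suggestion concrete, here is a minimal sketch of
the kind of configuration meant. The values and the app name are placeholders;
how the conf reaches the context backing the similarity job depends on your
deployment.

  import org.apache.spark.{SparkConf, SparkContext}

  // Pin an explicit executor memory limit instead of relying on defaults.
  // The sizes here are examples only; tune them to your cluster.
  val conf = new SparkConf()
    .setAppName("item-similarity")            // hypothetical app name
    .set("spark.executor.memory", "8g")       // fixed per-executor heap
    .set("spark.executor.cores", "4")         // cores per executor

  // Plain SparkContext shown for illustration; pass the conf to however
  // you build the context used by the similarity job.
  val sc = new SparkContext(conf)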
I don't have anything off the top of my head affecting A'B specifically,
but I think there were some changes affecting in-memory multiplication
(which is of course used in distributed A'B).
I am not particularly familiar with, nor do I remember, the details of row
similarity off the top of my head; I really wish the original contributor
would comment on that. I'm trying to see if I can come up with anything
useful, though.
What behavior do you see in this job -- CPU-bound or I/O-bound?
(1) I/O many times exceeds the input size, so spills are inevitable. Tuning
memory sizes and checking the Spark spill locations to make sure the disks
there are not slow is therefore critical. Also, I think Spark 1.6 added a
lot of flexibility in managing task/cache/shuffle memory sizes; it may help
in some unexpected way (see the config sketch after this list).
(2) Sufficient cache: many pipelines commit reused matrices to cache
(MEMORY_ONLY), which is the default Mahout algebra behavior, assuming there
is enough cache memory for only good things to happen. If there is not,
however, it will cause recomputation of results that were evicted (not
saying this is a known case for row similarity in particular). Make sure
this is not the case; for scatter-type exchanges it is especially bad. The
sketch after this list shows one way to pin a reused matrix explicitly.
(3) A'B -- try to hack and play with the implementation in the AtB
(Spark-side) class. See if you can come up with a better arrangement.
(4) In-memory computations (the MMul class), if that is the bottleneck, can
in practice be quick-hacked with multithreaded multiplication and a bridge
to native solvers (netlib-java), at least for dense cases. This has been
found to improve the performance of distributed multiplications a bit. It
works best if you get 2 threads in the backend and all threads in the front
end; a rough threading sketch follows below.
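A sketch of the knobs points (1) and (2) refer to. The Spark property names
are standard 1.6-era settings; the checkpoint/CacheHint call assumes the
usual Mahout Samsara DRM API, and all values are illustrative only.

  import org.apache.spark.SparkConf
  import org.apache.mahout.math.drm._   // DrmLike, CacheHint (assumed imports)

  // (1) Spill and memory tuning: put spills on fast local disks and use the
  // Spark 1.6 unified memory settings. Values are examples only.
  val conf = new SparkConf()
    .set("spark.local.dir", "/fast/disk1,/fast/disk2")
    .set("spark.memory.fraction", "0.6")
    .set("spark.memory.storageFraction", "0.5")

  // (2) If a reused matrix is being evicted and recomputed, pin it with an
  // explicit cache hint instead of the default MEMORY_ONLY behavior.
  def pinForReuse[K](drm: DrmLike[K]) =
    drm.checkpoint(CacheHint.MEMORY_AND_DISK)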
There are other known things that can improve multiplication speed over the
public Mahout version; I hope Mahout will improve on those in the future.
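For point (4), this is not the actual MMul code path, just an illustration of
the kind of quick hack meant: a row-blocked, multithreaded dense multiply in
plain Scala. A real bridge would hand the inner loops to netlib-java's gemm
instead.

  // Illustrative only. a is m x k, b is k x n, both row-major Array[Double].
  // Each thread owns a disjoint block of rows of the result, so there are
  // no write conflicts on c.
  def parMultiply(a: Array[Double], b: Array[Double],
                  m: Int, k: Int, n: Int): Array[Double] = {
    val c = new Array[Double](m * n)
    (0 until m).par.foreach { i =>
      var p = 0
      while (p < k) {
        val aip = a(i * k + p)
        var j = 0
        while (j < n) {
          c(i * n + j) += aip * b(p * n + j)
          j += 1
        }
        p += 1
      }
    }
    c
  }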
-d
Hi,
I've been working with LLR in Mahout for a while now, mostly using the
SimilarityAnalysis.cooccurrencesIDSs function. I recently upgraded the
Mahout libraries to 0.11, and subsequently also tried 0.12, and the same
program is running significantly slower (at least 3x based on initial
analysis).
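For context, the call in question looks roughly like this; the construction
of the IndexedDataset is omitted because it depends on the data source, and
the helper name is made up.

  import org.apache.mahout.math.cf.SimilarityAnalysis
  import org.apache.mahout.math.indexeddataset.IndexedDataset

  // `interactions` is an IndexedDataset built from the (user, item) input.
  // Default thresholds are used here; the real job may pass its own.
  def itemSimilarity(interactions: IndexedDataset) =
    SimilarityAnalysis.cooccurrencesIDSs(Array(interactions))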
Looking into the tasks more carefully, comparing 0.10 and 0.11 shows that
the amount of shuffle being done in 0.11 is significantly higher,
especially in the AtB step. This could possibly be a reason for the
reduction in performance.
I am working on Spark 1.2.0, though, so it's possible that this is what is
causing the problem. It works fine with Mahout 0.10.
Any ideas why this might be happening?
Thank you,
Nikaash Puri