Discussion:
MAHOUT-1876 - Lucene compatibility
Raviteja Lokineni
2016-08-04 17:53:33 UTC
Hi Devs,

Issue link: https://issues.apache.org/jira/browse/MAHOUT-1876

Problem statement: Mahout should be compatible with the latest Lucene
version. I was trying to solve a text clustering problem when I ran into
an error saying that the Lucene version in use is not supported. That's when
I raised this issue; one of the guys suggested that I try fixing it, so I
started doing so.

Proposed solution:

1. Change the Lucene version in the POM file and fix all the compilation
failures
2. Fix any tests that fail because of this change

Current progress:

1. Fixed all the compilation issues
2. Comparing the test failures before the fix vs. after the fix

I might need some help with the test failures; I see that the same
tests fail both before and after the fix.
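
For anyone following along, the before/after comparison described above could be sketched roughly as follows in Java. This is only a minimal illustration, not part of the pull request: the class name `TestFailureDiff` and the second test name in `main` are hypothetical, and collecting the failing test names from the build output is left out.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares the failing tests of two runs.
final class TestFailureDiff {

    /** Tests that fail after the change but did not fail before it. */
    static Set<String> introduced(Set<String> before, Set<String> after) {
        Set<String> result = new TreeSet<>(after);
        result.removeAll(before);
        return result;
    }

    /** Tests that fail in both runs, so they are unlikely to be caused by the change. */
    static Set<String> preExisting(Set<String> before, Set<String> after) {
        Set<String> result = new TreeSet<>(before);
        result.retainAll(after);
        return result;
    }

    public static void main(String[] args) {
        // Illustrative input only; the second name is made up.
        Set<String> before = new TreeSet<>(Arrays.asList(
                "RecommenderJobTest.testCompleteJobWithFiltering"));
        Set<String> after = new TreeSet<>(Arrays.asList(
                "RecommenderJobTest.testCompleteJobWithFiltering",
                "SomeLuceneTest.someCase"));
        System.out.println("Introduced by the change: " + introduced(before, after));
        System.out.println("Failing in both runs:     " + preExisting(before, after));
    }
}
```

If `introduced` comes back empty, the remaining failures were already present before the upgrade, which matches the observation above.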

Let me know if you have any comments.

Thanks,
--
*Raviteja Lokineni* | Business Intelligence Developer
TD Ameritrade

E: ***@gmail.com

[image: View Raviteja Lokineni's profile on LinkedIn]
<http://in.linkedin.com/in/ravitejalokineni>
Raviteja Lokineni
2016-08-04 18:05:47 UTC
Most of the MR tests are failing with:

java.lang.NullPointerException: null
at __randomizedtesting.SeedInfo.seed([C8C254E3D9D80CC9:845EFDB07D41873E]:0)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:633)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:421)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:281)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:125)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:348)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
at org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java:77)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:168)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJobTest.testCompleteJobWithFiltering(RecommenderJobTest.java:881)

Do I need to have Hadoop installed on my local machine?

Raviteja Lokineni
2016-08-04 20:00:48 UTC
All the tests pass with my code change and the Lucene upgrade. (The test
failure that I reported above was on my local machine; I guess you can
ignore it.)

So is this pull request good to merge?

https://github.com/apache/mahout/pull/247

https://travis-ci.org/apache/mahout/builds/149840408

Raviteja Lokineni
2016-08-04 22:09:31 UTC
Are there any prerequisites for running the tests on a local machine?

Most of the MR module tests failed with:

java.lang.NullPointerException: null
at __randomizedtesting.SeedInfo.seed([C8C254E3D9D80CC9:845EFDB07D41873E]:0)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:633)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:421)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:281)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:125)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:348)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
at org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java:77)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:168)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJobTest.testCompleteJobWithFiltering(RecommenderJobTest.java:881)

Actually, Travis CI is not set up to test all Mahout modules. Could you
please test again on your local machine and report any errors?
Andy
Raviteja Lokineni
2016-08-05 01:53:03 UTC
I'm up for the challenge, but it looks like nothing failed this time either.

Repo: https://github.com/bond-/mahout
Branch: mahout-1876

See the full output in the gist:
https://gist.github.com/bond-/33c349136b6914db4892233b8876b9d7

If someone wants to re-verify by running the tests, please go ahead and let me
know.

Maven >= 3.3.3,
Java >= 1.7,
MAHOUT_HOME=/path/to/install,
and also MAHOUT_LOCAL=true
if you're not running against a Hadoop cluster. (It looks like you're not.)
Upgrading this dependency is not a trivial task.
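
As a rough illustration of the checklist above, a small Java preflight check could look like the sketch below. Everything in it is an assumption for illustration: the class name `LocalTestPreflight` is hypothetical, treating any non-empty `MAHOUT_LOCAL` value as "local mode" is a guess about how the variable is interpreted, and the Maven version is not checked at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical preflight check for the prerequisites listed in this thread.
final class LocalTestPreflight {

    /** Returns a list of problems; an empty list means all checks passed. */
    static List<String> check(Map<String, String> env, String javaSpecVersion) {
        List<String> problems = new ArrayList<>();
        if (isBlank(env.get("MAHOUT_HOME"))) {
            problems.add("MAHOUT_HOME is not set");
        }
        // Assumption: any non-empty value means "run locally, no Hadoop cluster".
        if (isBlank(env.get("MAHOUT_LOCAL"))) {
            problems.add("MAHOUT_LOCAL is not set");
        }
        if (javaMajor(javaSpecVersion) < 7) {
            problems.add("Java >= 1.7 is required, found " + javaSpecVersion);
        }
        return problems;
    }

    /** Maps "1.7" to 7 and "1.8" to 8, and leaves "9", "11", ... unchanged. */
    static int javaMajor(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    private static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    public static void main(String[] args) {
        List<String> problems =
                check(System.getenv(), System.getProperty("java.specification.version"));
        if (problems.isEmpty()) {
            System.out.println("Environment looks ready for local test runs.");
        } else {
            for (String problem : problems) {
                System.out.println("Problem: " + problem);
            }
        }
    }
}
```

Running it before the Maven build only reports missing settings; it does not change the environment.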
Raviteja Lokineni
2016-08-05 15:12:47 UTC
The tests failed previously on my Windows box because winutils.exe could
not be found in HADOOP_HOME.
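
A quick way to check for this condition before running the tests is sketched below. It rests on assumptions: as far as I understand, Hadoop 2.x resolves its home directory from the `hadoop.home.dir` system property and falls back to the `HADOOP_HOME` environment variable, then expects `bin\winutils.exe` underneath it. The class name `WinutilsCheck` is hypothetical and the code does not call any Hadoop API.

```java
import java.io.File;
import java.util.Locale;

// Hypothetical diagnostic: does winutils.exe exist where Hadoop is assumed to look?
final class WinutilsCheck {

    /**
     * Builds the expected winutils.exe location, preferring the system property
     * over the environment variable. Returns null if neither is set.
     */
    static File winutilsPath(String hadoopHomeDirProperty, String hadoopHomeEnv) {
        String home = hadoopHomeDirProperty != null ? hadoopHomeDirProperty : hadoopHomeEnv;
        if (home == null || home.trim().isEmpty()) {
            return null;
        }
        return new File(new File(home, "bin"), "winutils.exe");
    }

    static boolean isWindows(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name");
        if (!isWindows(os)) {
            System.out.println("Not Windows (" + os + "); winutils.exe is not needed.");
            return;
        }
        File exe = winutilsPath(System.getProperty("hadoop.home.dir"),
                                System.getenv("HADOOP_HOME"));
        if (exe == null) {
            System.out.println("Neither hadoop.home.dir nor HADOOP_HOME is set.");
        } else {
            System.out.println(exe + (exe.isFile() ? " found." : " is missing."));
        }
    }
}
```

A missing executable would be consistent with the NullPointerException in ProcessBuilder.start shown earlier in the thread, since a null command element is rejected there, though that link is my reading of the trace rather than something confirmed here.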

To test that, I am now building winutils.exe from source using the
following resource:
http://zutai.blogspot.com/2014/06/build-install-and-run-hadoop-24-240-on.html

Raviteja Lokineni
2016-08-05 16:01:13 UTC
I have built winutils.exe, but the tests still fail to run on Windows
due to linker errors.

So, for now, Linux builds are successful on my end. Let me know what your
plan is for the upgrade pull request.

On Fri, Aug 5, 2016 at 11:12 AM, Raviteja Lokineni <
Post by Raviteja Lokineni
The reason the tests failed previously on my windows box is because it
wasn't able to locate winutils.exe in HADOOP_HOME.
Now, to test that I am building winutils.exe from source using the
following resource: http://zutai.blogspot.com/2014/06/build-
install-and-run-hadoop-24-240-on.html
On Thu, Aug 4, 2016 at 9:53 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
I'm up for the challenge but looks like nothing failed this time too.
Repo: https://github.com/bond-/mahout
Branch: mahout-1876
See the full output in the gist: https://gist.github.com/
bond-/33c349136b6914db4892233b8876b9d7
If someone wants to re-verify running the tests, please go ahead and let
me know.
Maven >= 3.3.3,
java >= 1.7,
MAHOUT_HOME=/path/to/install,
also MAHOUT_LOCAL=true
If ur not running against a hadoop cluster. (Looks like ur not)
Upgrading this dep is not a trivial task.
-------- Original message --------
Date: 08/04/2016 6:09 PM (GMT-05:00)
Subject: Re: MAHOUT-1876 - Lucene compatibility
Are there any pre-requisites to run tests on local machines?
java.lang.NullPointerException: null
at __randomizedtesting.SeedInfo.seed([C8C254E3D9D80CC9:845EFDB0
7D41873E]:0)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Sh
ell.java:650)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(
RawLocalFileSystem.java:633)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(
RawLocalFileSystem.java:421)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSyste
m.java:281)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(
JobSubmissionFiles.java:125)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(
JobSubmitter.java:348)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(
UserGroupInformation.java:1556)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
at org.apache.mahout.cf.taste.hadoop.preparation.PreparePrefere
nceMatrixJob.
run(PreparePreferenceMatrixJob.java:77)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.
run(RecommenderJob.java:168)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJobTest.
testCompleteJobWithFiltering(RecommenderJobTest.java:881)
Actually, Travis CI is not set up to test all mahout modules. Could
you
please test again on your local machine, and report any errors.
Andy
-------- Original message --------
Date: 08/04/2016 4:01 PM (GMT-05:00)
Subject: Re: MAHOUT-1876 - Lucene compatibility
All the tests are successful even with my code change and upgrade to
Lucene. (The test failure that I reported above was on my local
machine, I
guess you can ignore that.)
So is this pull request good to merge?
https://github.com/apache/mahout/pull/247
https://travis-ci.org/apache/mahout/builds/149840408
On Thu, Aug 4, 2016 at 2:05 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
Do I need to have hadoop installed on my local machine?
Suneel Marthi
2016-08-05 16:11:22 UTC
Permalink
Most of us are Linux or MacOS users, so we never hit any of the build issues
you have encountered.
I must say that we never test on Windows :), but it's expected that you would
run into issues on Windows.

The patch looks good and we should be able to merge this over the weekend,
thanks again for the contrib.



On Fri, Aug 5, 2016 at 12:01 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
I have built winutils.exe, but the tests still fail to run on Windows
due to linker errors.
So, for now, Linux builds are successful on my end. Let me know what your
plan is for the pull request to upgrade.
On Fri, Aug 5, 2016 at 11:12 AM, Raviteja Lokineni <
Post by Raviteja Lokineni
The reason the tests failed previously on my Windows box is that it
wasn't able to locate winutils.exe in HADOOP_HOME.
Now, to test that, I am building winutils.exe from source using the
following resource:
http://zutai.blogspot.com/2014/06/build-install-and-run-hadoop-24-240-on.html
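Before rebuilding anything, it may be worth confirming whether the binary is where Hadoop would look for it. The sketch below is only an illustration: `has_winutils` is a made-up helper name, and it assumes a POSIX-style shell on Windows (Git Bash, Cygwin or similar) and the usual layout in which winutils.exe sits in a `bin` directory under HADOOP_HOME.

```shell
# Hypothetical helper (not part of Hadoop or Mahout): succeeds only when
# winutils.exe exists under <hadoop_home>/bin.
has_winutils() {
  [ -n "$1" ] && [ -f "$1/bin/winutils.exe" ]
}

# Report the result for the current environment; an unset HADOOP_HOME
# is treated the same as a missing binary.
if has_winutils "${HADOOP_HOME:-}"; then
  echo "winutils.exe found under $HADOOP_HOME/bin"
else
  echo "winutils.exe not found; check HADOOP_HOME" >&2
fi
```

If the check fails, the tests that shell out through Hadoop's local file system are likely to fail in the way reported earlier in the thread.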
On Thu, Aug 4, 2016 at 9:53 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
I'm up for the challenge, but it looks like nothing failed this time either.
Repo: https://github.com/bond-/mahout
Branch: mahout-1876
See the full output in the gist:
https://gist.github.com/bond-/33c349136b6914db4892233b8876b9d7
If someone wants to re-verify by running the tests, please go ahead and let
me know.
Maven >= 3.3.3,
Java >= 1.7,
MAHOUT_HOME=/path/to/install,
and also MAHOUT_LOCAL=true
if you're not running against a Hadoop cluster. (Looks like you're not.)
Upgrading this dep is not a trivial task.
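The prerequisites listed above could be set up and sanity-checked roughly as follows. This is a sketch under assumptions: `version_ge` is a hypothetical helper that relies on GNU `sort -V`, the install path is the placeholder from the message, and the version strings passed in are examples to be replaced with whatever `mvn -v` and `java -version` report on your machine.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Environment from the message above; the path is a placeholder.
export MAHOUT_HOME="/path/to/install"
export MAHOUT_LOCAL=true

# Example version strings only; substitute the ones installed locally.
version_ge "3.3.9" "3.3.3" && echo "Maven version is new enough"
version_ge "1.8.0" "1.7"   && echo "Java version is new enough"
```

Exporting MAHOUT_LOCAL here applies it to every later command in the same shell session.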
-------- Original message --------
Date: 08/04/2016 6:09 PM (GMT-05:00)
Subject: Re: MAHOUT-1876 - Lucene compatibility
Are there any pre-requisites to run tests on local machines?
Raviteja Lokineni
2016-08-05 16:14:24 UTC
Permalink
Yay for the heads-up on merging!

FYI, I take back what I said about the failure on Windows, though. I had to
include the hadoop.dll file on PATH. The tests are running now (I am running
them just to satisfy myself ;) ).
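A quick way to confirm that a library such as hadoop.dll is reachable is to walk the PATH entries and look for it. The helper below is hypothetical (`find_on_path` is not a Hadoop or Mahout tool) and assumes a POSIX-style shell where PATH is colon-separated, as in Git Bash or Cygwin.

```shell
# Hypothetical helper: print the first PATH entry that contains the named
# file. The body runs in a subshell so IFS and globbing changes stay local.
find_on_path() (
  set -f        # no glob expansion while splitting PATH
  IFS=':'
  for dir in $PATH; do
    if [ -f "${dir:-.}/$1" ]; then
      printf '%s\n' "${dir:-.}/$1"
      exit 0    # exits only this function's subshell
    fi
  done
  exit 1
)

if location="$(find_on_path hadoop.dll)"; then
  echo "hadoop.dll found at $location"
else
  echo "hadoop.dll is not on PATH" >&2
fi
```

An empty PATH entry is treated as the current directory, which matches how shells conventionally interpret it.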
Raviteja Lokineni
2016-08-05 16:41:32 UTC
Permalink
Just an FYI: all the tests are successful on Windows too ;)
+1
________________________________
Sent: Friday, August 5, 2016 12:14:24 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Raviteja Lokineni
2016-08-05 17:57:02 UTC
Permalink
Hi Andrew,

It looks like the examples don't work unless they run on a Hadoop cluster.
If I get some time, I will download a Cloudera QuickStart VM and test it out.

Thanks,
Raviteja
Thanks again Raviteja,
Tests pass in my Linux env as well.
FYI, if the Windows script has not yet been officially deprecated, it
should be soon.
As Suneel said, someone will merge it over the weekend. In the meantime,
it would be good to ensure that some of the examples in the
$MAHOUT_HOME/examples/bin dir are working. Could you try running
classify-wikipedia.sh option (2), cluster-reuters.sh option (1) or (2), and
classify-20newsgroups.sh option (1) in (pseudo)cluster mode if possible?
This would ensure that seq2sparse, which relies heavily on Lucene, is
working correctly.
Thanks again for the great contribution.
Andy
-------- Original message --------
Date: 08/05/2016 12:42 PM (GMT-05:00)
Subject: Re: MAHOUT-1876 - Lucene compatibility
Just a FYI, all the tests are successful on windows too ;)
+1
________________________________
Sent: Friday, August 5, 2016 12:14:24 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Yay! for the heads up on merging.
FYI, I take back my word on failure on windows though. I had to include the
hadoop.dll file on PATH. Tests are running (I am running it just to
Suneel Marthi
2016-08-05 18:08:46 UTC
Permalink
You don't need a Hadoop cluster for that.

Set MAHOUT_LOCAL=true
and you should be able to run locally.
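
A minimal sketch of what that looks like in a POSIX shell. Note that in bash a bare `set MAHOUT_LOCAL=true` only changes the positional parameters; the variable has to be exported to reach the scripts. The `mahout_run_mode` helper is hypothetical and only illustrates the assumed convention that any non-empty MAHOUT_LOCAL requests local mode. Whether a given example script still honors the variable depends on the Mahout revision in use.

```shell
# Hypothetical helper, not part of Mahout: reports which mode a launcher
# would pick under the assumption that any non-empty MAHOUT_LOCAL means
# "run locally, without a Hadoop cluster".
mahout_run_mode() {
  if [ -n "${MAHOUT_LOCAL:-}" ]; then
    echo "local"
  else
    echo "hadoop"
  fi
}

# Export the variable so that child processes (the example scripts) see it.
export MAHOUT_LOCAL=true
echo "Mahout would run in $(mahout_run_mode) mode"

# Then, from a Mahout checkout (MAHOUT_HOME is an assumed path):
#   cd "$MAHOUT_HOME/examples/bin" && ./classify-20newsgroups.sh
```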

Raviteja Lokineni
2016-08-05 18:15:44 UTC
Permalink
This is what I get.

$ ./classify-20newsgroups.sh
/home/lok268/projects/mahout/examples/bin/set-dfs-commands.sh: line 36: /bin/hadoop: No such file or directory
/home/lok268/projects/mahout/examples/bin/set-dfs-commands.sh: line 38: [: too many arguments
/home/lok268/projects/mahout/examples/bin/set-dfs-commands.sh: line 43: [: -eq: unary operator expected
Can't determine Hadoop version.
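
These three errors are consistent with HADOOP_HOME being unset: `$HADOOP_HOME/bin/hadoop` then expands to `/bin/hadoop`, and unquoted empty or multi-word values passed to `[` produce "too many arguments" and "unary operator expected". The sketch below is not the actual contents of set-dfs-commands.sh; `detect_hadoop_major` is a hypothetical function showing how quoting and an early check would avoid those failures. It relies only on `hadoop version` printing a first line of the form "Hadoop 2.4.1".

```shell
# Hypothetical guard, NOT the real set-dfs-commands.sh logic.
detect_hadoop_major() {
  hadoop_bin="${HADOOP_HOME:-}/bin/hadoop"

  # Fail early and clearly instead of trying to run /bin/hadoop.
  if [ -z "${HADOOP_HOME:-}" ] || [ ! -x "$hadoop_bin" ]; then
    echo "HADOOP_HOME is unset or has no executable bin/hadoop" >&2
    return 1
  fi

  # First line of `hadoop version` looks like "Hadoop 2.4.1".
  version_line=$("$hadoop_bin" version 2>/dev/null | head -n 1)
  major=$(printf '%s\n' "$version_line" \
    | sed -n 's/^Hadoop \([0-9][0-9]*\)\..*/\1/p')

  # Quoting "$major" keeps `[` well-formed even when the value is empty.
  if [ -z "$major" ]; then
    echo "Can't determine Hadoop version." >&2
    return 1
  fi
  echo "$major"
}
```

With HADOOP_HOME pointing at a Hadoop 2.x install, `detect_hadoop_major` prints `2`; with it unset, the function returns a non-zero status and a single readable message.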
Suneel Marthi
2016-08-05 18:21:29 UTC
Permalink
Are you running this in a Windows command prompt or in Cygwin?

I suggest using Cygwin.

Raviteja Lokineni
2016-08-05 18:28:08 UTC
Permalink
Nope, in a Linux environment.
Raviteja Lokineni
2016-08-06 03:41:09 UTC
Permalink
Guys, I found an issue: Lucene 6.x is compatible only with Java 8. What's
the plan for Mahout compatibility? Do you guys want to call a vote on Java
compatibility?
Hi Raviteja,
Since this upgrade affects the entire Mahout MapReduce text processing
pipeline it is important to make sure that it is working in the end to end
examples.
Could you please set up a Hadoop 2.4.1 pseudo cluster and run through the
previously mentioned examples?
The instructions are here (this is from 2.7.1 but should be the same for
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
Thanks very much,
Andy
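
For reference, the pseudo-distributed setup on that Hadoop page comes down to two small config files, then formatting and starting HDFS. A minimal sketch, assuming an unpacked Hadoop 2.x distribution; `write_pseudo_conf` is a hypothetical helper, and the property values are the ones the single-node guide uses.

```shell
# Hypothetical helper: writes the two minimal config files for a
# single-node (pseudo-distributed) HDFS into the given directory.
write_pseudo_conf() {
  conf_dir="$1"
  mkdir -p "$conf_dir"

  # Default filesystem: a NameNode on localhost.
  cat > "$conf_dir/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

  # A single node can only hold one replica of each block.
  cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}

# Usage, assuming HADOOP_HOME points at the unpacked distribution:
#   write_pseudo_conf "$HADOOP_HOME/etc/hadoop"
#   "$HADOOP_HOME/bin/hdfs" namenode -format
#   "$HADOOP_HOME/sbin/start-dfs.sh"
```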
________________________________
Sent: Friday, August 5, 2016 2:38 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Ahh- yes I think we started removing MAHOUT_LOCAL capability I see the
https://github.com/apache/mahout/commit/daad3a4ce618cbd05be468c4ce6e451618f3a028
MAHOUT-1665: Update hadoop commands in example scripts (akm) closes apache/mahout#98
So it would make sense that you are seeing that Error in local mode.
Raviteja Lokineni
2016-08-06 21:15:18 UTC
Permalink
I will let you know by tomorrow. Will run them now.
We will likely move to Java 8 at some point of course, but I personally
would not be inclined to enforce it right now as most of our current new
work is Scala-based, and this (the lucene dep.) is only used in legacy
components. Admittedly though, one useful legacy component. Were you
able to get the examples to run in pseudo-cluster mode with lucene 6?
Thanks,
Andy
________________________________
Sent: Saturday, August 6, 2016 5:03:45 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Thank you Raviteja, this is something that we will have to discuss.
Raviteja Lokineni
2016-08-07 02:17:52 UTC
Permalink
Hi Andy,

I ran the following tests as you have specified:

- classify-wikipedia.sh
  - Option 2
- cluster-reuters.sh
  - Option 1,2
- classify-20newsgroups.sh
  - Option 1

All these examples *ran successfully* on a Cloudera QuickStart VM 5.7. I
had to change the cluster JVM to 1.8 to make it work; otherwise Lucene was
failing with an incompatible class major/minor version error (because
Lucene 6.1.0 was built for JVM 1.8).
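
The major/minor version error can be confirmed by reading the version bytes of any class inside the jar. A small sketch using only POSIX tools; `class_major_version` is a hypothetical helper, and the jar and class names in the usage comment are assumptions about a local setup.

```shell
# Print the class-file major version of a compiled .class file.
# Layout: bytes 0-3 are the magic number 0xCAFEBABE, bytes 4-5 the minor
# version, bytes 6-7 the major version (big-endian).
class_major_version() {
  # od prints the two bytes at offset 6 as unsigned decimals, e.g. "  0  52".
  bytes=$(od -An -j6 -N2 -tu1 "$1")
  hi=$(echo $bytes | cut -d' ' -f1)
  lo=$(echo $bytes | cut -d' ' -f2)
  echo $(( hi * 256 + lo ))
}

# Usage (jar path and class name are assumptions):
#   unzip -p lucene-core-6.1.0.jar org/apache/lucene/index/IndexWriter.class \
#     > /tmp/IndexWriter.class
#   class_major_version /tmp/IndexWriter.class
```

A class compiled for Java 8 reports major version 52, and one compiled for Java 7 reports 51. A Java 7 JVM refuses to load anything above 51, which matches the failure seen before the cluster JVM was switched to 1.8.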

On seeing that this patch wasn't working with Java 8, I was like why, why,
why?

Thanks,
Raviteja

Raviteja Lokineni
2016-08-07 02:20:13 UTC
Permalink
*correction: wasn't working with Java 7

On Sat, Aug 6, 2016 at 10:17 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
Hi Andy,
- classify-wikipedia.sh
- Option 2
- cluster-reuters.sh
- Option 1,2
- classify-20newsgroups.sh
- Option 1
All these examples *ran successfully* on a cloudera quickstart vm 5.7. I
had to change the cluster JVM to 1.8 to make it work otherwise lucene was
failing with incompatible class major/minor version error (because lucene
6.1.0 was built for JVM 1.8).
On seeing that this patch wasn't working with Java 8, I was like why, why,
why?
Thanks,
Raviteja
On Sat, Aug 6, 2016 at 5:15 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
I will let you know by tomorrow. Will run them now.
We will likely move to Java 8 at some point of course, but I personally
would not be inclined to enforce it right now as most of our current new
work is Scala-based, and this (the lucene dep.) is only used in legacy
components. Admittedly though, one useful legacy component. Were you
able to get the examples to run in pseudo-cluster mode with lucene 6?
Thanks,
Andy
________________________________
Sent: Saturday, August 6, 2016 5:03:45 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Thank you Raviteja, this is something that we will have to discuss.
________________________________
Sent: Friday, August 5, 2016 11:41:09 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Guys, found an issue lucene 6.x is compatible only with Java 8. What's the
plan for mahout compatibility? Do you guys want to call in a vote for Java
compatibility?
Hi Raviteja,
Since this upgrade affects the entire Mahout MapReduce text processing
pipeline it is important to make sure that it is working in the end to
end
examples.
Could you please set up a Hadoop 2.4.1 pseudo cluster and run through
the
previously mentioned examples?
The instructions are here (this is from 2.7.1 but should be the same
for
<https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/h
adoop-common/
SingleCluster.html>https://hadoop.apache.org/docs/r2.7.2/
hadoop-project-dist/hadoop-common/SingleCluster.html#
Pseudo-Distributed_Operation
Thanks very much,
Andy
________________________________
Sent: Friday, August 5, 2016 2:38 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Ahh- yes I think we started removing MAHOUT_LOCAL capability I see the
https://github.com/apache/mahout/commit/daad3a4ce618cbd05be468c4ce6e45
1618f3a028
[https://avatars3.githubusercontent.com/u/692523?v=3&s=200]<https://
github.com/apache/mahout/commit/daad3a4ce618cbd05be468c4ce6e
451618f3a028>
MAHOUT-1665: Update hadoop commands in example scripts (akm) closes a

·
daad3a4ce618cbd05be468c4ce6e451618f3a028>
github.com

pache/mahout#98
So it would make sense that you are seeing that Error in local mode.
________________________________
Sent: Friday, August 5, 2016 2:28:08 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Nope in a Linux environment.
Post by Suneel Marthi
Are you running this on a Windows prompt or in Cygwin?
I suggest using Cygwin.
On Fri, Aug 5, 2016 at 2:15 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
This is what I get.
$ ./classify-20newsgroups.sh
line 36: /bin/hadoop: No such file or directory
line 38: [: too many arguments
line 43: [: -eq: unary operator expected
Can't determine Hadoop version.
Post by Suneel Marthi
You don't need a Hadoop cluster for that,
set MAHOUT_LOCAL=true
and you should be able to run locally.
On Fri, Aug 5, 2016 at 1:57 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
Hi Andrew,
Looks like the examples don't work unless on a Hadoop cluster. If I
get some time I will download a Cloudera QuickStart VM and test it out.
Thanks,
Raviteja
On Fri, Aug 5, 2016 at 12:53 PM, Andrew Palumbo <
Thanks again Raviteja,
Tests pass in my Linux env as well.
FYI, if the Windows script has not yet been officially deprecated, it
should be soon.
As Suneel said, someone will merge it over the weekend. In the meantime
it would be good to ensure that some of the examples are working in the
$MAHOUT_HOME/examples/bin dir. Could you try running
classify-wikipedia.sh option (2), cluster-reuters.sh option (1) or (2),
and classify-20newsgroups.sh option 1 in (pseudo)cluster mode if
possible?
This would be to ensure that seq2sparse, which relies heavily on Lucene,
is working correctly.
Thanks again for the great contribution.
Andy
-------- Original message --------
Date: 08/05/2016 12:42 PM (GMT-05:00)
Subject: Re: MAHOUT-1876 - Lucene compatibility
Just a FYI, all the tests are successful on windows too ;)
On Fri, Aug 5, 2016 at 12:18 PM, Andrew Palumbo <
+1
________________________________
Sent: Friday, August 5, 2016 12:14:24 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Yay! for the heads up on merging.
FYI, I take back my word on failure on windows though. I had to include
the hadoop.dll file on PATH. Tests are running (I am running it just to
Raviteja Lokineni
2016-08-08 01:01:08 UTC
Permalink
Submitted another PR with Lucene 5.5.2 and Java 7 compatibility. Based on
the devs' preference we can choose one of the patches.

https://github.com/apache/mahout/pull/248

I ran all the necessary tests specified above and all were successful.

Thanks,
Raviteja

On Sat, Aug 6, 2016 at 7:20 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
*correction: wasn't working with Java 7
On Sat, Aug 6, 2016 at 10:17 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
Hi Andy,
- classify-wikipedia.sh
  - Option 2
- cluster-reuters.sh
  - Option 1,2
- classify-20newsgroups.sh
  - Option 1
All these examples *ran successfully* on a Cloudera QuickStart VM 5.7. I
had to change the cluster JVM to 1.8 to make it work; otherwise Lucene was
failing with an incompatible class major/minor version error (because Lucene
6.1.0 was built for JVM 1.8).
On seeing that this patch wasn't working with Java 8, I was like why,
why, why?
Thanks,
Raviteja
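For anyone following along, the "incompatible class major/minor version" failure mentioned above can be illustrated with a small sketch. It assumes only the standard class-file header layout (magic number, minor version, major version) and the well-known major versions 51 for Java 7 and 52 for Java 8. The function names and the sample header bytes are made up for illustration and are not part of Mahout or Lucene.

```python
import struct

# Class-file major versions for the two JVM releases discussed in this thread.
JAVA_7_MAJOR = 51
JAVA_8_MAJOR = 52


def class_file_major_version(header: bytes) -> int:
    """Return the major version stored in the first 8 bytes of a .class file."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes of class-file header")
    # Layout: 4-byte magic, 2-byte minor version, 2-byte major version (big-endian).
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major


def loads_on(jvm_major: int, header: bytes) -> bool:
    """A JVM can load a class only if the class's major version is not newer."""
    return class_file_major_version(header) <= jvm_major


# Hypothetical header bytes of a class compiled for Java 8 (major version 52).
java8_header = struct.pack(">IHH", 0xCAFEBABE, 0, JAVA_8_MAJOR)
print(loads_on(JAVA_7_MAJOR, java8_header))  # False
print(loads_on(JAVA_8_MAJOR, java8_header))  # True
```

To check a real jar, one could read the first 8 bytes of any `.class` entry and pass them to `class_file_major_version`.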
On Sat, Aug 6, 2016 at 5:15 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
I will let you know by tomorrow. Will run them now.
We will likely move to Java 8 at some point of course, but I personally
would not be inclined to enforce it right now as most of our current new
work is Scala-based, and this (the lucene dep.) is only used in legacy
components. Admittedly though, one useful legacy component. Were you
able to get the examples to run in pseudo-cluster mode with lucene 6?
Thanks,
Andy
________________________________
Sent: Saturday, August 6, 2016 5:03:45 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Thank you Raviteja, this is something that we will have to discuss.
________________________________
Sent: Friday, August 5, 2016 11:41:09 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Guys, found an issue: Lucene 6.x is compatible only with Java 8. What's the
plan for Mahout compatibility? Do you guys want to call a vote on Java
compatibility?
Hi Raviteja,
Since this upgrade affects the entire Mahout MapReduce text processing
pipeline it is important to make sure that it is working in the end-to-end
examples.
Could you please set up a Hadoop 2.4.1 pseudo cluster and run through the
previously mentioned examples?
The instructions are here (this is from 2.7.1 but should be the same for
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
Thanks very much,
Andy
________________________________
Sent: Friday, August 5, 2016 2:38 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Ahh- yes I think we started removing MAHOUT_LOCAL capability I see the
https://github.com/apache/mahout/commit/daad3a4ce618cbd05be468c4ce6e451618f3a028
MAHOUT-1665: Update hadoop commands in example scripts (akm) closes apache/mahout#98
So it would make sense that you are seeing that Error in local mode.
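A possible reading of the `/bin/hadoop: No such file or directory` message quoted in this thread is that the script builds the launcher path from an empty `HADOOP_HOME`. The contents of the script are not shown here, so that is an assumption, and the sketch below only illustrates the mechanism. The function names and the `/opt/hadoop` path are hypothetical.

```python
import posixpath
from typing import Mapping, Optional


def naive_hadoop_path(env: Mapping[str, str]) -> str:
    """Mimic an unguarded "$HADOOP_HOME/bin/hadoop" expansion (assumed behaviour)."""
    return env.get("HADOOP_HOME", "") + "/bin/hadoop"


def checked_hadoop_path(env: Mapping[str, str]) -> Optional[str]:
    """Return the launcher path, or None when HADOOP_HOME is unset or empty."""
    home = env.get("HADOOP_HOME", "")
    if not home:
        return None
    return posixpath.join(home, "bin", "hadoop")


print(naive_hadoop_path({}))                                # /bin/hadoop
print(checked_hadoop_path({}))                              # None
print(checked_hadoop_path({"HADOOP_HOME": "/opt/hadoop"}))  # /opt/hadoop/bin/hadoop
```

Checking for the `None` case before invoking the launcher would surface a clear "HADOOP_HOME is not set" message instead of the chain of shell errors.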
Suneel Marthi
2016-08-08 18:08:01 UTC
Permalink
We can't drop support for Java 7 yet, so I would suggest that you close the
PR for Lucene 6.
Thanks Raviteja,
Someone will review the PR shortly.
And these ran w/o issue for you in cluster mode, correct?
Andy
Raviteja Lokineni
2016-08-08 18:21:41 UTC
Permalink
Done. Closed it.
Raviteja Lokineni
2016-08-10 15:07:11 UTC
Permalink
Hi Guys, just wanted to follow up. Any update/what's the plan?

On Mon, Aug 8, 2016 at 2:21 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
Done. Closed it.
Post by Suneel Marthi
We can't drop support for Java 7 yet, so I would suggest that u close the
PR for Lucene 6.
Thanks Raviteja,
Someone will review the PR shortly.
And these ran w/o issue for you in cluster mode, correct?
Andy
________________________________
Sent: Sunday, August 7, 2016 9:01:08 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Submitted another PR with Lucene 5.5.2 and Java 7 compatibility. Based
on
the devs preference we can choose one of patches.
https://github.com/apache/mahout/pull/248
I did all the necessary tests specified above and all are successful.
Thanks,
Raviteja
On Sat, Aug 6, 2016 at 7:20 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
*correction: wasn't working with Java 7
On Sat, Aug 6, 2016 at 10:17 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
Hi Andy,
- classify-wikipedia.sh
- Option 2
- cluster-reuters.sh
- Option 1,2
- classify-20newsgroups.sh
- Option 1
All these examples *ran successfully* on a cloudera quickstart vm
5.7. I
Post by Raviteja Lokineni
Post by Raviteja Lokineni
had to change the cluster JVM to 1.8 to make it work otherwise lucene
was
Post by Raviteja Lokineni
Post by Raviteja Lokineni
failing with incompatible class major/minor version error (because
lucene
Post by Raviteja Lokineni
Post by Raviteja Lokineni
6.1.0 was built for JVM 1.8).
On seeing that this patch wasn't working with Java 8, I was like why,
why, why?
Thanks,
Raviteja
On Sat, Aug 6, 2016 at 5:15 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
I will let you know by tomorrow. Will run them now.
We will likely move to Java 8 at some point of course, but I
personally
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
would not be inclined to enforce it right now as most of our
current
new
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
work is Scala-based, and this (the lucene dep.) is only used in
legacy
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
components. Admittedly though, one useful legacy component.
Were
you
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
able to get the examples to run in pseudo-cluster mode with lucene
6?
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Thanks,
Andy
________________________________
Sent: Saturday, August 6, 2016 5:03:45 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Thank you Raviteja, this is something that we will have to discuss.
________________________________
Sent: Friday, August 5, 2016 11:41:09 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Guys, found an issue lucene 6.x is compatible only with Java 8.
What's
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
the
plan for mahout compatibility? Do you guys want to call in a vote
for
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Java
compatibility?
Hi Raviteja,
Since this upgrade affects the entire Mahout MapReduce text
processing
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
pipeline it is important to make sure that it is working in the
end
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
to end
examples.
Could you please set up a Hadoop 2.4.1 pseudo cluster and run
through
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
the
previously mentioned examples?
The instructions are here (this is from 2.7.1 but should be the
same
Post by Raviteja Lokineni
Post by Raviteja Lokineni
Post by Raviteja Lokineni
for
<https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/h
adoop-common/
SingleCluster.html>https://hadoop.apache.org/docs/r2.7.2/
hadoop-project-dist/hadoop-common/SingleCluster.html#
Pseudo-Distributed_Operation
Thanks very much,
Andy
________________________________
Sent: Friday, August 5, 2016 2:38 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Ahh- yes I think we started removing MAHOUT_LOCAL capability I see the
https://github.com/apache/mahout/commit/daad3a4ce618cbd05be468c4ce6e451618f3a028
MAHOUT-1665: Update hadoop commands in example scripts (akm) closes apache/mahout#98
So it would make sense that you are seeing that Error in local mode.
________________________________
Sent: Friday, August 5, 2016 2:28:08 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Nope in a Linux environment.
Post by Suneel Marthi
r u running this on windows prompt or in Cygwin.
Suggest use Cygwin.
On Fri, Aug 5, 2016 at 2:15 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
This is what I get.
$ ./classify-20newsgroups.sh
/home/lok268/projects/mahout/examples/bin/set-dfs-commands.s line 36: /bin/hadoop: No such file or directory
/home/lok268/projects/mahout/examples/bin/set-dfs-commands.s line 38: [: too many arguments
/home/lok268/projects/mahout/examples/bin/set-dfs-commands.s line 43: [: -eq: unary operator expected
Can't determine Hadoop version.
On Fri, Aug 5, 2016 at 2:08 PM, Suneel Marthi <
Post by Suneel Marthi
u don't need a hadoop cluster for that,
set MAHOUT_LOCAL=true
and u should be able to run locally
On Fri, Aug 5, 2016 at 1:57 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
Hi Andrew,
Looks like the examples don't seem to work unless on a hadoop cluster. If I
get some time I will download a cloudera quickstart vm and test it out.
Thanks,
Raviteja
On Fri, Aug 5, 2016 at 12:53 PM, Andrew Palumbo <
Thanks again Raviteja,
Tests pass in my Linux env as well.
FYI, if the windows script has not yet been officially deprecated it
should be soon.
As Suneel said, someone will merge it over the weekend. In the meantime
it would be good to ensure that some of the examples are working in the
$MAHOUT_HOME/examples/bin dir. Could you try running
classify-wikipedia.sh option (2), cluster-reuters.sh option (1) or (2) and
classify-20newsgroups.sh option 1 in (pseudo)cluster mode if possible?
This would be to ensure that seq2sparse, which relies heavily on lucene,
is working correctly.
Thanks again for the great contribution.
Andy
-------- Original message --------
Date: 08/05/2016 12:42 PM (GMT-05:00)
Subject: Re: MAHOUT-1876 - Lucene compatibility
Just a FYI, all the tests are successful on windows too ;)
On Fri, Aug 5, 2016 at 12:18 PM, Andrew Palumbo <
+1
________________________________
Sent: Friday, August 5, 2016 12:14:24 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Yay! for the heads up on merging.
FYI, I take back my word on failure on windows though. I had to include
the hadoop.dll file on PATH. Tests are running (I am running it just to
--
*Raviteja Lokineni* | Business Intelligence Developer
TD Ameritrade

E: ***@gmail.com

[image: View Raviteja Lokineni's profile on LinkedIn]
<http://in.linkedin.com/in/ravitejalokineni>
Raviteja Lokineni
2016-08-08 18:13:20 UTC
Permalink
Hi Andy,

Yep, no issues with the cluster runs for either patch. The only difference
is the Java version compatibility.

- https://github.com/apache/mahout/pull/247
   - Lucene 6.1.0
   - Works with Java 8 and above
   - The Hadoop cluster should also be running on Java 8 and above
- https://github.com/apache/mahout/pull/248
   - Lucene 5.5.2
   - Works with Java 7 and above
   - The Hadoop cluster should also be running on Java 7 and above
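
For anyone who wants to check which Java release a given jar's classes target before picking a patch, here is a minimal sketch that reads the class-file header. The class name ClassVersionCheck and both helper methods are made up for illustration and are not part of Mahout or Lucene. The only facts it relies on are the class-file layout (magic number 0xCAFEBABE, then minor and major version) and the mapping of major version 51 to Java 7 and 52 to Java 8. The header bytes in main are hand-built stand-ins, not read from a real Lucene jar.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassVersionCheck {

    /** Reads the class-file header and returns its major version number. */
    public static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("Not a class file (bad magic number)");
        }
        data.readUnsignedShort();        // minor version, not needed here
        return data.readUnsignedShort(); // major version
    }

    /** Maps a class-file major version to the Java release (51 -> 7, 52 -> 8). */
    public static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) throws IOException {
        // Hand-built 8-byte header standing in for a real .class file:
        // magic number, minor version 0, major version 52.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        int major = majorVersion(new ByteArrayInputStream(header));
        System.out.println("major=" + major + " -> Java " + javaRelease(major));
    }
}
```

To use it on a real class, pass an InputStream opened on a .class entry extracted from the jar instead of the hand-built header. A JVM refuses to load classes whose major version is newer than it supports, which matches the major/minor version error seen on the Java 7 cluster.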

Thanks,
Raviteja
Thanks Raviteja,
Someone will review the PR shortly.
And these ran w/o issue for you in cluster mode, correct?
Andy
________________________________
Sent: Sunday, August 7, 2016 9:01:08 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Submitted another PR with Lucene 5.5.2 and Java 7 compatibility. Based on
the devs' preference we can choose one of the patches.
https://github.com/apache/mahout/pull/248
I did all the necessary tests specified above and all are successful.
Thanks,
Raviteja
On Sat, Aug 6, 2016 at 7:20 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
*correction: wasn't working with Java 7
On Sat, Aug 6, 2016 at 10:17 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
Hi Andy,
- classify-wikipedia.sh
   - Option 2
- cluster-reuters.sh
   - Option 1,2
- classify-20newsgroups.sh
   - Option 1
All these examples *ran successfully* on a cloudera quickstart vm 5.7. I
had to change the cluster JVM to 1.8 to make it work otherwise lucene was
failing with incompatible class major/minor version error (because lucene
6.1.0 was built for JVM 1.8).
On seeing that this patch wasn't working with Java 8, I was like why,
why, why?
Raviteja
On Sat, Aug 6, 2016 at 5:15 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
I will let you know by tomorrow. Will run them now.
We will likely move to Java 8 at some point of course, but I personally
would not be inclined to enforce it right now as most of our current new
work is Scala-based, and this (the lucene dep.) is only used in legacy
components. Admittedly though, one useful legacy component. Were you
able to get the examples to run in pseudo-cluster mode with lucene 6?
Thanks,
Andy
________________________________
Sent: Saturday, August 6, 2016 5:03:45 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Thank you Raviteja, this is something that we will have to discuss.
________________________________
Sent: Friday, August 5, 2016 11:41:09 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Guys, found an issue lucene 6.x is compatible only with Java 8. What's the
plan for mahout compatibility? Do you guys want to call in a vote for Java
compatibility?
Hi Raviteja,
Since this upgrade affects the entire Mahout MapReduce text processing
pipeline it is important to make sure that it is working in the end to end
examples.
Could you please set up a Hadoop 2.4.1 pseudo cluster and run through the
previously mentioned examples?
The instructions are here (this is from 2.7.1 but should be the same for
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
Thanks very much,
Andy
________________________________
Sent: Friday, August 5, 2016 2:38 PM
Subject: Re: MAHOUT-1876 - Lucene compatibility
Ahh- yes I think we started removing MAHOUT_LOCAL capability I see the
https://github.com/apache/mahout/commit/daad3a4ce618cbd05be468c4ce6e451618f3a028
MAHOUT-1665: Update hadoop commands in example scripts (akm) closes apache/mahout#98
So it would make sense that you are seeing that Error in local mode.
________________________________
Sent: Friday, August 5, 2016 2:28:08 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Nope in a Linux environment.
Post by Suneel Marthi
r u running this on windows prompt or in Cygwin.
Suggest use Cygwin.
On Fri, Aug 5, 2016 at 2:15 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
This is what I get.
$ ./classify-20newsgroups.sh
line 36: /bin/hadoop: No such file or directory
line 38: [: too many arguments
line 43: [: -eq: unary operator expected
Can't determine Hadoop version.
On Fri, Aug 5, 2016 at 2:08 PM, Suneel Marthi <
Post by Suneel Marthi
u don't need a hadoop cluster for that,
set MAHOUT_LOCAL=true
and u should be able to run locally
On Fri, Aug 5, 2016 at 1:57 PM, Raviteja Lokineni <
Post by Raviteja Lokineni
Hi Andrew,
Looks like the examples don't seem to work unless on a hadoop cluster. If I
get some time I will download a cloudera quickstart vm and test it out.
Thanks,
Raviteja
On Fri, Aug 5, 2016 at 12:53 PM, Andrew Palumbo <
Thanks again Raviteja,
Tests pass in my Linux env as well.
FYI, if the windows script has not yet been officially deprecated it
should be soon.
As Suneel said, someone will merge it over the weekend. In the meantime
it would be good to ensure that some of the examples are working in the
$MAHOUT_HOME/examples/bin dir. Could you try running
classify-wikipedia.sh option (2), cluster-reuters.sh option (1) or (2) and
classify-20newsgroups.sh option 1 in (pseudo)cluster mode if possible?
This would be to ensure that seq2sparse, which relies heavily on lucene,
is working correctly.
Thanks again for the great contribution.
Andy
-------- Original message --------
Date: 08/05/2016 12:42 PM (GMT-05:00)
Subject: Re: MAHOUT-1876 - Lucene compatibility
Just a FYI, all the tests are successful on windows too ;)
On Fri, Aug 5, 2016 at 12:18 PM, Andrew Palumbo <
+1
________________________________
Sent: Friday, August 5, 2016 12:14:24 PM
To: mahout
Subject: Re: MAHOUT-1876 - Lucene compatibility
Yay! for the heads up on merging.
FYI, I take back my word on failure on windows though. I had to include
the hadoop.dll file on PATH. Tests are running (I am running it just to
--
*Raviteja Lokineni* | Business Intelligence Developer
TD Ameritrade

E: ***@gmail.com

[image: View Raviteja Lokineni's profile on LinkedIn]
<http://in.linkedin.com/in/ravitejalokineni>