Discussion:
[jira] [Created] (MAHOUT-1876) Mahout fails to read from lucene index of solr-6.1.0
Raviteja Lokineni (JIRA)
2016-07-19 16:05:20 UTC
Raviteja Lokineni created MAHOUT-1876:
-----------------------------------------

Summary: Mahout fails to read from lucene index of solr-6.1.0
Key: MAHOUT-1876
URL: https://issues.apache.org/jira/browse/MAHOUT-1876
Project: Mahout
Issue Type: Bug
Affects Versions: 0.12.2
Environment: Solr: 6.1.0
JDK: 1.8.0_92
Mahout: 0.12.2
OS: Linux
Reporter: Raviteja Lokineni


Command: {noformat}bin/mahout lucene.vector --dir ~/softwares/solr-6.1.0/server/solr/nlp-core/data/index --output /tmp/solr-nlp-core/out.vec --field rspns_val --dictOut /tmp/solr-nlp-core/dictionary.txt --norm 2{noformat}

Stacktrace:
{noformat}
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/mahout-examples-0.12.2-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/mahout-mr-0.12.2-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/lib/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "main" org.apache.lucene.index.IndexFormatTooNewException: Format version is not supported (resource: ChecksumIndexInput(MMapIndexInput(path="/home/lok268/softwares/solr-6.1.0/server/solr/nlp-core/data/index/segments_2"))): 6 (needs to be between 0 and 1)
at org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:148)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:329)
at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:56)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
at org.apache.mahout.utils.vectors.lucene.Driver.dumpVectors(Driver.java:89)
at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:277)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
Suneel Marthi (JIRA)
2016-07-19 17:26:21 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384544#comment-15384544 ]

Suneel Marthi commented on MAHOUT-1876:
---------------------------------------

Yes, this is not supported. Mahout is still on Lucene 4.6.x and hence is not compatible with Solr 6.0.
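To make the mismatch in the reported stack trace concrete, here is a minimal sketch that pulls the version numbers out of the exception message. The class and method names (`FormatVersionMessage`, `parse`) are hypothetical helpers, not part of Lucene or Mahout, and the index path in the sample message is a placeholder. It relies only on `java.util.regex`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Lucene or Mahout API): extracts the version
// numbers from an IndexFormatTooNewException message like the one reported.
public class FormatVersionMessage {
    // Matches the tail of the message: "<found> (needs to be between <min> and <max>)"
    private static final Pattern TAIL =
        Pattern.compile("(\\d+) \\(needs to be between (\\d+) and (\\d+)\\)\\s*$");

    /** Returns {found, minSupported, maxSupported}, or null if the text does not match. */
    public static int[] parse(String message) {
        Matcher m = TAIL.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    public static void main(String[] args) {
        // Placeholder path; the rest mirrors the message format in the report.
        String msg = "Format version is not supported (resource: "
            + "ChecksumIndexInput(MMapIndexInput(path=\"/path/to/index/segments_2\"))): "
            + "6 (needs to be between 0 and 1)";
        int[] v = parse(msg);
        // The index is "too new" when the found version exceeds the supported maximum.
        System.out.println("found=" + v[0] + " supported=" + v[1] + ".." + v[2]
            + " tooNew=" + (v[0] > v[2]));
    }
}
```

Read this way, the message says the segments file was written in format version 6, while the Lucene library bundled with Mahout only accepts versions 0 through 1. That is consistent with the index having been written by a newer Lucene release than the one reading it.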
Suneel Marthi (JIRA)
2016-07-19 17:50:20 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384581#comment-15384581 ]

Suneel Marthi commented on MAHOUT-1876:
---------------------------------------

To add more context to the previous post: we tried moving to Lucene 4.10.x back in March 2015, but that completely broke the vectorization code in the legacy MapReduce code and failed all tests.
ASF GitHub Bot (JIRA)
2016-08-04 18:40:20 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408289#comment-15408289 ]

ASF GitHub Bot commented on MAHOUT-1876:
----------------------------------------

GitHub user bond- opened a pull request:

https://github.com/apache/mahout/pull/247

MAHOUT-1876: Upgrade lucene to 6.1.0 and fix compilation failures

Looked at the lucene migrate guides and past deprecation warnings to find alternatives to removed features

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bond-/mahout mahout-1876

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/mahout/pull/247.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #247

----
commit aae0eb332396bb242dfddf558d3005af6cfab575
Author: Raviteja Lokineni <***@gmail.com>
Date: 2016-08-04T18:38:48Z

MAHOUT-1876: Upgrade lucene to 6.1.0 and fix compilation failures

Looked at the lucene migrate guides and past deprecation warnings to find alternatives to removed features

----
ASF GitHub Bot (JIRA)
2016-08-04 18:41:20 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408291#comment-15408291 ]

ASF GitHub Bot commented on MAHOUT-1876:
----------------------------------------

Github user bond- commented on the issue:

https://github.com/apache/mahout/pull/247

Submitted this PR to check the Travis CI test results.
Raviteja Lokineni (JIRA)
2016-08-04 20:07:20 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408412#comment-15408412 ]

Raviteja Lokineni commented on MAHOUT-1876:
-------------------------------------------

Travis CI reported a clean build for my pull request, so I believe my changes are good to merge.

https://travis-ci.org/apache/mahout/builds/149840408
ASF GitHub Bot (JIRA)
2016-08-08 00:54:20 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411149#comment-15411149 ]

ASF GitHub Bot commented on MAHOUT-1876:
----------------------------------------

GitHub user bond- opened a pull request:

https://github.com/apache/mahout/pull/248

MAHOUT-1876: Upgrade lucene to 5.5.2 and fix compilation failures

Looked at the Lucene migrate guides and past deprecation warnings to find alternatives to removed features. This PR is compatible with Java 7 and above.

All tests successful: https://gist.github.com/bond-/6f7872cd9557fce5f09cdc3d9915b996
Also tested the following examples successfully on a CDH 5.5 cluster with Java 7:
- classify-wikipedia.sh
  - Option 2
- cluster-reuters.sh
  - Options 1, 2
- classify-20newsgroups.sh
  - Option 1

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bond-/mahout mahout-1876-java7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/mahout/pull/248.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #248

----
commit ebfba1c0d9f47df1d93c3494e8c932f4ef9b59dd
Author: Raviteja Lokineni <***@gmail.com>
Date: 2016-08-07T17:51:50Z

MAHOUT-1876: Upgrade lucene to 5.5.2 and fix compilation failures

Looked at the lucene migrate guides and past deprecation warnings to find alternatives to removed features

commit 984f7c4101e00b3ca911a663c15492117564c906
Author: Raviteja Lokineni <***@gmail.com>
Date: 2016-08-07T18:21:44Z

Merge remote-tracking branch 'upstream/master' into mahout-1876-java7

----
ASF GitHub Bot (JIRA)
2016-08-08 18:16:20 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412197#comment-15412197 ]

ASF GitHub Bot commented on MAHOUT-1876:
----------------------------------------

Github user bond- commented on the issue:

https://github.com/apache/mahout/pull/247

Closing this PR since Java 7 support in Mahout can't be dropped at this moment.
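The constraint behind this decision can be sketched as a simple lookup. The minimum Java versions below are an assumption drawn from the Lucene release notes (Lucene 5.x on Java 7, Lucene 6.x on Java 8), and the class name `LuceneJavaBaseline` is hypothetical, not part of Mahout or Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not part of Mahout or Lucene. The minimum Java
// versions are an assumption based on the Lucene release notes.
public class LuceneJavaBaseline {
    private static final Map<Integer, Integer> MIN_JAVA = new LinkedHashMap<>();
    static {
        MIN_JAVA.put(5, 7); // Lucene 5.x: Java 7 or later
        MIN_JAVA.put(6, 8); // Lucene 6.x: Java 8 or later
    }

    /** True if the given Lucene major version can run on the given Java major version. */
    public static boolean runsOn(int luceneMajor, int javaMajor) {
        Integer min = MIN_JAVA.get(luceneMajor);
        if (min == null) {
            throw new IllegalArgumentException("unknown Lucene major version: " + luceneMajor);
        }
        return javaMajor >= min;
    }

    public static void main(String[] args) {
        System.out.println("Lucene 6 on Java 7: " + runsOn(6, 7));
        System.out.println("Lucene 5 on Java 7: " + runsOn(5, 7));
    }
}
```

Under that assumption, a project that must keep Java 7 support can move to Lucene 5.5.2 but not to 6.1.0, which matches the choice between the two pull requests in this thread.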
ASF GitHub Bot (JIRA)
2016-08-08 18:16:20 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412198#comment-15412198 ]

ASF GitHub Bot commented on MAHOUT-1876:
----------------------------------------

Github user bond- closed the pull request at:

https://github.com/apache/mahout/pull/247
Post by Raviteja Lokineni (JIRA)
Mahout fails to read from lucene index of solr-6.1.0
----------------------------------------------------
Key: MAHOUT-1876
URL: https://issues.apache.org/jira/browse/MAHOUT-1876
Project: Mahout
Issue Type: Bug
Affects Versions: 0.12.2
Environment: Solr: 6.1.0
JDK: 1.8.0_92
Mahout: 0.12.2
OS: Linux
Reporter: Raviteja Lokineni
Command: {noformat}bin/mahout lucene.vector --dir ~/softwares/solr-6.1.0/server/solr/nlp-core/data/index --output /tmp/solr-nlp-core/out.vec --field rspns_val --dictOut /tmp/solr-nlp-core/dictionary.txt --norm 2{noformat}
{noformat}
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/mahout-examples-0.12.2-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/mahout-mr-0.12.2-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/lok268/softwares/apache-mahout-distribution-0.12.2/lib/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "main" org.apache.lucene.index.IndexFormatTooNewException: Format version is not supported (resource: ChecksumIndexInput(MMapIndexInput(path="/home/lok268/softwares/solr-6.1.0/server/solr/nlp-core/data/index/segments_2"))): 6 (needs to be between 0 and 1)
at org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:148)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:329)
at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:56)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
at org.apache.mahout.utils.vectors.lucene.Driver.dumpVectors(Driver.java:89)
at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:277)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
{noformat}
ASF GitHub Bot (JIRA)
2016-08-11 05:45:21 UTC
Permalink
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416598#comment-15416598 ]

ASF GitHub Bot commented on MAHOUT-1876:
----------------------------------------

Github user asfgit closed the pull request at:

https://github.com/apache/mahout/pull/248
Suneel Marthi (JIRA)
2016-08-11 05:46:20 UTC
Permalink
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi resolved MAHOUT-1876.
-----------------------------------
Resolution: Fixed
Fix Version/s: 0.13.0

Merged. Thanks again for the contribution.
Suneel Marthi (JIRA)
2016-08-11 05:46:21 UTC
Permalink
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi updated MAHOUT-1876:
----------------------------------
Summary: Mahout fails to read from lucene index of solr-5.5.2 (was: Mahout fails to read from lucene index of solr-6.1.0)
Suneel Marthi (JIRA)
2016-08-11 05:47:20 UTC
Permalink
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi updated MAHOUT-1876:
----------------------------------
Environment:
Solr: 5.5.2
JDK: 1.7.0
Mahout: 0.12.2
OS: Linux

was:
Solr: 6.1.0
JDK: 1.8.0_92
Mahout: 0.12.2
OS: Linux
Hudson (JIRA)
2016-08-11 06:32:21 UTC
Permalink
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416639#comment-15416639 ]

Hudson commented on MAHOUT-1876:
--------------------------------

SUCCESS: Integrated in Mahout-Quality #3384 (See [https://builds.apache.org/job/Mahout-Quality/3384/])
MAHOUT-1876: Upgrade lucene to 5.5.2 and fix compilation failures, this (smarthi: rev 4d0cd66a6269eb02fceaabdb11d70fd38d433474)
* mr/src/main/java/org/apache/mahout/common/lucene/AnalyzerUtils.java
* mr/src/main/java/org/apache/mahout/vectorizer/encoders/LuceneTextValueEncoder.java
* integration/src/main/java/org/apache/mahout/text/wikipedia/WikipediaAnalyzer.java
* integration/src/main/java/org/apache/mahout/utils/vectors/lucene/CachedTermInfo.java
* integration/src/test/java/org/apache/mahout/utils/vectors/lucene/DriverTest.java
* integration/src/test/java/org/apache/mahout/clustering/TestClusterDumper.java
* mr/src/main/java/org/apache/mahout/vectorizer/TFIDF.java
* pom.xml
* integration/src/main/java/org/apache/mahout/text/MailArchivesClusteringAnalyzer.java
* examples/src/main/java/org/apache/mahout/classifier/NewsgroupHelper.java
* integration/src/test/java/org/apache/mahout/utils/nlp/collocations/llr/BloomTokenFilterTest.java
* integration/src/main/java/org/apache/mahout/utils/vectors/lucene/ClusterLabels.java
* integration/src/test/java/org/apache/mahout/utils/vectors/lucene/CachedTermInfoTest.java
* integration/src/main/java/org/apache/mahout/utils/vectors/lucene/Driver.java
* mr/src/test/java/org/apache/mahout/vectorizer/encoders/TextValueEncoderTest.java
* integration/src/test/java/org/apache/mahout/utils/vectors/lucene/LuceneIterableTest.java
* integration/src/main/java/org/apache/mahout/utils/vectors/lucene/AbstractLuceneIterator.java
* integration/src/main/java/org/apache/mahout/utils/regex/AnalyzerTransformer.java
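
For readers following the trace in this issue: the failure is a version mismatch. The segments file reports format version 6, while the Lucene reader that Mahout 0.12.2 ran with accepts only 0 to 1. The following is a minimal Java sketch, not part of the Mahout patch, that extracts those three numbers from such an exception message. The class name `FormatVersionCheck`, its methods, and the regular expression are hypothetical, and the sketch assumes the message keeps the shape shown in the report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not a Lucene or Mahout API.
public class FormatVersionCheck {

    // Matches the tail of the message, e.g. "): 6 (needs to be between 0 and 1)".
    private static final Pattern TAIL = Pattern.compile(
        ":\\s*(-?\\d+) \\(needs to be between (-?\\d+) and (-?\\d+)\\)\\s*$");

    /** Returns {found, min, max}, or null if the message has a different shape. */
    static int[] parse(String message) {
        Matcher m = TAIL.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    /** True when the index on disk is newer than the reader supports. */
    static boolean indexIsNewerThanReader(String message) {
        int[] v = parse(message);
        return v != null && v[0] > v[2];
    }

    public static void main(String[] args) {
        // Message text copied from the stack trace in the report.
        String msg = "Format version is not supported (resource: "
            + "ChecksumIndexInput(MMapIndexInput(path=\"/home/lok268/softwares/"
            + "solr-6.1.0/server/solr/nlp-core/data/index/segments_2\"))): "
            + "6 (needs to be between 0 and 1)";
        int[] v = parse(msg);
        System.out.println("found=" + v[0] + " supported=" + v[1] + ".." + v[2]);
        System.out.println("index newer than reader: " + indexIsNewerThanReader(msg));
    }
}
```

A found value above the supported maximum means the index was written by a newer Lucene than the one on the classpath, which is consistent with the fix recorded here being a Lucene upgrade rather than a change to the index.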
Raviteja Lokineni (JIRA)
2016-08-11 14:49:22 UTC
Permalink
[ https://issues.apache.org/jira/browse/MAHOUT-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417361#comment-15417361 ]

Raviteja Lokineni commented on MAHOUT-1876:
-------------------------------------------

[~smarthi] Will you be updating the changelog at a later stage?

[~Andrew_Palumbo] A minor thing, but aren't you guys supposed to add the contributor email? See: http://mahout.apache.org/developers/github.html
