Discussion:
[jira] [Created] (MAHOUT-1799) Read null row vectors from file in TextDelimeterReaderWriter driver
Jussi Jousimo (JIRA)
2016-02-29 14:50:18 UTC
Jussi Jousimo created MAHOUT-1799:
-------------------------------------

Summary: Read null row vectors from file in TextDelimeterReaderWriter driver
Key: MAHOUT-1799
URL: https://issues.apache.org/jira/browse/MAHOUT-1799
Project: Mahout
Issue Type: Improvement
Components: spark
Reporter: Jussi Jousimo
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
Jussi Jousimo (JIRA)
2016-02-29 14:55:18 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jussi Jousimo updated MAHOUT-1799:
----------------------------------
Description: Since some row vectors in a sparse matrix can be null, Mahout writes them out to a file with the row label only. However, Mahout cannot read these files back: it throws an exception when it encounters a label-only row.
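A minimal sketch of a reader that tolerates label-only rows, written in Python rather than Mahout's Scala. The line layout (row label, a tab, then space-separated `column:value` pairs), the delimiter defaults, and the name `parse_row` are assumptions for illustration; this is not the code from the pull request.

```python
def parse_row(line, row_delim="\t", elem_delim=" ", value_delim=":"):
    """Parse one text-delimited row into (label, {column: value}).

    A label-only line (a null row vector) yields an empty dict
    instead of raising an exception.
    """
    # Split the row label from the rest of the line; `rest` is empty
    # when the line holds only a label.
    label, _, rest = line.rstrip("\n").partition(row_delim)
    elements = {}
    for token in rest.split(elem_delim):
        if not token:  # nothing after the label, or a doubled delimiter
            continue
        column, _, value = token.partition(value_delim)
        elements[column] = float(value)
    return label, elements
```

With this shape, a line that holds only a label parses to an empty row rather than failing, so a file can be written and read back without losing the labels of null rows.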
ASF GitHub Bot (JIRA)
2016-03-07 23:08:40 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183954#comment-15183954 ]

ASF GitHub Bot commented on MAHOUT-1799:
----------------------------------------

Github user dlyubimov commented on the pull request:

https://github.com/apache/mahout/pull/182#issuecomment-193496785

Noted; this needs Pat's review.
Suneel Marthi (JIRA)
2016-03-12 23:26:33 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi updated MAHOUT-1799:
----------------------------------
Assignee: Pat Ferrel
Suneel Marthi (JIRA)
2016-03-12 23:26:33 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi updated MAHOUT-1799:
----------------------------------
Fix Version/s: 0.12.0
ASF GitHub Bot (JIRA)
2016-03-17 15:53:33 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199763#comment-15199763 ]

ASF GitHub Bot commented on MAHOUT-1799:
----------------------------------------

Github user pferrel commented on the pull request:

https://github.com/apache/mahout/pull/182#issuecomment-197944335
Pat Ferrel (JIRA)
2016-03-17 15:57:33 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199770#comment-15199770 ]

Pat Ferrel commented on MAHOUT-1799:
------------------------------------

I can't test this or even merge it right now, so if someone else can merge it, great. Otherwise it doesn't seem like a requirement for the release, so unless someone speaks up I'll push it to 1.0.
Pat Ferrel (JIRA)
2016-03-17 15:57:33 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pat Ferrel updated MAHOUT-1799:
-------------------------------
Fix Version/s: 1.0.0 (was: 0.12.0)
ASF GitHub Bot (JIRA)
2016-03-18 00:36:33 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200707#comment-15200707 ]

ASF GitHub Bot commented on MAHOUT-1799:
----------------------------------------

Github user statguy commented on the pull request:

https://github.com/apache/mahout/pull/182#issuecomment-198145198

@pferrel, are you asking why null rows should not give an error? For example, when I compute user-item recommendations, not all users get recommendations. I write the matrix to an intermediate file and read it back to process it further, filling those null rows with some sort of average recommendations. I could write a unit test for this too, but I would like to get approval first.
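The post-processing step described above could look roughly like the following Python sketch. The comment does not say which average is meant; taking the mean of each column over the rows where it appears is an assumption, as are the function name `fill_empty_rows` and the dict-of-dicts representation of the matrix.

```python
def fill_empty_rows(rows):
    """Replace empty rows with a per-column mean.

    `rows` maps a row label to a {column: value} dict. Each column's
    mean is taken over the rows in which that column appears.
    """
    totals, counts = {}, {}
    for elements in rows.values():
        for column, value in elements.items():
            totals[column] = totals.get(column, 0.0) + value
            counts[column] = counts.get(column, 0) + 1
    average = {column: totals[column] / counts[column] for column in totals}
    # Non-empty rows are kept as they are; empty rows get a copy of the average.
    return {label: (elements if elements else dict(average))
            for label, elements in rows.items()}
```

This step only works if the reader preserves label-only rows in the first place; otherwise the users without recommendations never reach it.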
ASF GitHub Bot (JIRA)
2016-03-30 05:57:25 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217455#comment-15217455 ]

ASF GitHub Bot commented on MAHOUT-1799:
----------------------------------------

Github user smarthi commented on the pull request:

https://github.com/apache/mahout/pull/182#issuecomment-203260920

@statguy Thanks for the PR. Please also provide unit tests that validate the problem and the fix.
ASF GitHub Bot (JIRA)
2016-05-21 14:13:13 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15295045#comment-15295045 ]

ASF GitHub Bot commented on MAHOUT-1799:
----------------------------------------

Github user statguy commented on the pull request:

https://github.com/apache/mahout/pull/182#issuecomment-220779951

I have now provided the unit test.
ASF GitHub Bot (JIRA)
2016-05-21 16:15:12 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15295109#comment-15295109 ]

ASF GitHub Bot commented on MAHOUT-1799:
----------------------------------------

Github user pferrel commented on the pull request:

https://github.com/apache/mahout/pull/182#issuecomment-220786346

OK, I'll have to hand-merge this as per our process, so it will be in the mainline for 0.13.
Suneel Marthi (JIRA)
2016-05-22 00:07:12 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi resolved MAHOUT-1799.
-----------------------------------
Resolution: Fixed
ASF GitHub Bot (JIRA)
2016-05-22 00:07:12 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15295314#comment-15295314 ]

ASF GitHub Bot commented on MAHOUT-1799:
----------------------------------------

Github user asfgit closed the pull request at:

https://github.com/apache/mahout/pull/182
Suneel Marthi (JIRA)
2016-05-22 00:14:13 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi updated MAHOUT-1799:
----------------------------------
Fix Version/s: 0.13.0 (was: 1.0.0)
Hudson (JIRA)
2016-05-22 00:45:12 UTC
[ https://issues.apache.org/jira/browse/MAHOUT-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15295341#comment-15295341 ]

Hudson commented on MAHOUT-1799:
--------------------------------

FAILURE: Integrated in Mahout-Quality #3358 (See [https://builds.apache.org/job/Mahout-Quality/3358/])
MAHOUT-1799:Read null row vectors from file in (smarthi: rev bd1f7bdabceeaaaffd3e7d9c372d40b3b714afc8)
* spark/src/main/scala/org/apache/mahout/drivers/TextDelimitedReaderWriter.scala
* spark/src/test/scala/org/apache/mahout/drivers/TextDelimitedReaderWriterSuite.scala
* math-scala/src/main/scala/org/apache/mahout/math/indexeddataset/Schema.scala