Mallet train before init results in cryptic error, fix workflow #1

Open
JaimieMurdock opened this issue Jan 6, 2018 · 0 comments

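Add checks in ldamark/benchmark.py that verify the files Mallet expects (such as out.mallet) exist before Mallet is invoked, so a missing file is caught up front instead of surfacing as a subprocess.CalledProcessError. When a file is missing, tell the user to run init first.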

jaimie@compy386:~/workspace/inpho/ldamark 
$ ldamark ap --m mallet --f train --iterations 100 --topics 50
Traceback (most recent call last):
  File "/home/jaimie/anaconda3/envs/py27/bin/ldamark", line 11, in <module>
    load_entry_point('ldamark', 'console_scripts', 'ldamark')()
  File "/home/jaimie/workspace/inpho/ldamark/ldamark/benchmark.py", line 155, in main
    stderr=subprocess.STDOUT).split("\n")[-2]
  File "/home/jaimie/anaconda3/envs/py27/lib/python2.7/subprocess.py", line 219, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['/usr/bin/time', '--format', '%e,%U,%S', './mallet-2.0.8RC3/bin/mallet', 'train-topics', '--input', 'out.mallet', '--num-topics', '50', '--output-state', 'mallet_out.gz', '--output-topic-keys', 'mallet_out.txt', '--output-doc-topics', 'mallet_out.txt', '--num-iterations', '100']' returned non-zero exit status 1
# 17:23:34 ✗ (1.673s) 1

jaimie@compy386:~/workspace/inpho/ldamark 
$ /usr/bin/time --format %e,%U,%S ./mallet-2.0.8RC3/bin/mallet train-topics --input out.mallet --num-topics 50 --output-state mallet_out.gz --output-topic-keys mallet_out.txt --output-doc-topics mallet_out.txt --num-iterations 100
Mallet LDA: 50 topics, 6 topic bits, 111111 topic mask
java.io.FileNotFoundException: out.mallet (No such file or directory)
	at java.io.FileInputStream.open0(Native Method)
	at java.io.FileInputStream.open(FileInputStream.java:195)
	at java.io.FileInputStream.<init>(FileInputStream.java:138)
	at cc.mallet.types.InstanceList.load(InstanceList.java:787)
	at cc.mallet.topics.tui.TopicTrainer.main(TopicTrainer.java:199)
Unable to restore instance list out.mallet: java.lang.IllegalArgumentException: Couldn't read InstanceList from file out.mallet
Command exited with non-zero status 1
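
A minimal sketch of what such a check could look like, in Python. The helper names `missing_mallet_inputs` and `check_mallet_inputs` are hypothetical and not part of ldamark; the only file name taken from the output above is `out.mallet`, and any other files Mallet needs would have to be added to the list.

```python
import os
import sys


def missing_mallet_inputs(paths):
    """Return the entries of `paths` that are not existing regular files."""
    return [p for p in paths if not os.path.isfile(p)]


def check_mallet_inputs(paths=("out.mallet",)):
    """Exit with a readable hint if any expected Mallet input is missing.

    Hypothetical helper: the default path comes from the traceback in this
    issue; the wording of the message is an assumption.
    """
    missing = missing_mallet_inputs(paths)
    if missing:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("Missing Mallet input file(s): {}. "
                 "Run the init step first.".format(", ".join(missing)))
```

Calling `check_mallet_inputs()` just before the `subprocess.check_output` call in `main` would replace the traceback with a one-line message pointing at init.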