Experiences converting DBpedia's bz2 files #50

Open
kurzum opened this issue Aug 8, 2017 · 0 comments

kurzum commented Aug 8, 2017

Hi all,
I just tried to convert DBpedia to HDT and would like to share my experience. Some things did not work for me, although I am not sure whether that is my fault or just missing features.

Java

  1. I saw that bz2 works for Java. Good!
  2. I tried to pass several bz2 files as input in Java, and this didn't work (several input files are not allowed).
  3. I tried to read one file via stdin in Java, which worked.
  4. I tried to read several files in Java via stdin with process substitution, mvn exec <(lbzip2 file1.bz2) <(lbzip2 file2.bz2), but it didn't work.
  5. I tried to build one big Java JAR with all dependencies using mvn assembly:single, but it only created the bin files (I was too lazy to adjust the descriptors in pom.xml).
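
Since reading a single stream from stdin worked (item 3), one possible way around the single-input limitation is to decompress several bz2 files into one continuous stream first. The sketch below is only an illustration under assumptions: it assumes the inputs are line-based (such as N-Triples), and the function name concat_bz2 is hypothetical and not part of hdt-java.

```python
import bz2
import sys
from typing import BinaryIO, Iterable


def concat_bz2(paths: Iterable[str], out: BinaryIO) -> int:
    """Decompress each bz2 file in turn and write its lines to `out`.

    Returns the number of lines written. Assumes line-based input
    (for example N-Triples), so plain concatenation stays valid.
    """
    count = 0
    for path in paths:
        with bz2.open(path, "rb") as src:
            for line in src:
                # Terminate an unterminated last line so that the first
                # line of the next file does not fuse with it.
                if not line.endswith(b"\n"):
                    line += b"\n"
                out.write(line)
                count += 1
    return count


if __name__ == "__main__":
    # Hypothetical usage: python concat_bz2.py file1.bz2 file2.bz2 | <converter reading stdin>
    concat_bz2(sys.argv[1:], sys.stdout.buffer)
```

The output could then be piped into whatever command reads a single file from stdin. Whether the converter handles the combined size is a separate question.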

cpp

  1. Although I ran apt-get install serdi serd-dbg, make failed. After I disabled serd in the Makefile, it compiled.
  2. bz2 didn't work. The decompression itself seemed to work, but the parser threw a lot of errors.
  3. Reading via stdin with process substitution failed, as the cpp version doesn't accept stdin.
  4. Reading from several input files worked. Well, it said "Sorting triples", and I lost patience as the percentage didn't go up.

Overall, I solved it for now with sort -um <(lbzip2 file1.bz2) .... | gzip > core.gz
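
For readers unfamiliar with sort -um: it merges inputs that are each already sorted and drops duplicate lines. A rough Python equivalent of that workaround might look like the sketch below. It assumes every input file is already sorted in byte order (what sort produces under LC_ALL=C) and that every line ends with a newline. The names merge_unique and merge_bz2_to_gz are hypothetical.

```python
import bz2
import gzip
import heapq
from typing import Iterable, Iterator, Sequence


def merge_unique(sorted_streams: Iterable[Iterable[bytes]]) -> Iterator[bytes]:
    """Merge already-sorted line streams and drop duplicates (like sort -um)."""
    previous = None
    for line in heapq.merge(*sorted_streams):
        # Duplicates are adjacent after a sorted merge, so comparing
        # with the previous line is enough.
        if line != previous:
            yield line
            previous = line


def merge_bz2_to_gz(paths: Sequence[str], out_path: str) -> None:
    """Merge sorted bz2 files into a single deduplicated gzip file."""
    files = [bz2.open(p, "rb") for p in paths]
    try:
        with gzip.open(out_path, "wb") as out:
            for line in merge_unique(files):
                out.write(line)
    finally:
        for f in files:
            f.close()
```

heapq.merge reads the inputs lazily, so memory use stays small even for large dumps. If an input is not sorted, the output will be neither sorted nor fully deduplicated.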

My question is just whether all of the above is intended and expected behaviour, or whether I should try some of the methods again, as I might have done something wrong.
All the best,
Sebastian

@D063520 D063520 added this to the 3.1.0 milestone Mar 21, 2022