SegFaults on Linux DiaNN v. 1.8 #134
-
Hi Matteo,
Thanks for letting me know. Temporary workarounds:
option 1: run the prediction on Windows and then use the resulting .predicted.speclib file on Linux;
option 2: just rerun until .predicted.speclib is generated correctly.
I will look into this. It may be a DIA-NN bug or a PyTorch bug (DIA-NN relies on PyTorch for this prediction step).
Best,
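The second workaround (rerunning until the library is generated) can be scripted. Below is a minimal sketch in Python, assuming the prediction command is supplied as an argument list and that a zero exit code plus a non-empty output file is a good enough sign of success. The names `run_until_ok` and `max_tries` are hypothetical, and no DIA-NN options are assumed.

```python
import subprocess
import sys
from pathlib import Path


def run_until_ok(cmd, output, max_tries=10):
    """Rerun cmd until it exits with 0 and `output` exists and is non-empty.

    Returns the number of attempts used; raises RuntimeError if all fail.
    """
    out = Path(output)
    for attempt in range(1, max_tries + 1):
        # Remove a stale or partial file so a crashed run is not mistaken for success.
        out.unlink(missing_ok=True)
        result = subprocess.run(cmd)
        # On POSIX a negative return code means the process was killed by a
        # signal (for example -11 for SIGSEGV).
        if result.returncode == 0 and out.exists() and out.stat().st_size > 0:
            return attempt
        print(f"attempt {attempt} failed (exit code {result.returncode})",
              file=sys.stderr)
    raise RuntimeError(f"no valid {out} after {max_tries} attempts")
```

Here `cmd` would be the user's own DIA-NN command line split into a list, and `output` the path where the .predicted.speclib file is expected. A size check cannot tell whether the library content is correct, so it only guards against runs that crashed before writing anything.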
-
Hello Vadim, After a longer break I would like to come back to this problem with segfaults in the prediction step.
If I run this command, the core dumps happen at different stages of the algorithm.
It took a few minutes to get the segfault.
Now, this looks more troubling, because it would seem that this occurs after the library has been created.
Say I would like, for the time being, to run the prediction stage on Windows. I don't know, maybe it is somehow related to me passing in only one raw folder.
Now, after this bug report, a small question: how should I trim my command so that DIA-NN only generates the library, without any additional steps afterwards? Is that possible?
Best wishes,
MatteoLacki, AG Tenzer
-
I am only a user of DIA-NN, and I find that it works faster on Linux.
-
I did try to trace where the segfault happens; it seems to be inside the PyTorch library, during a neural network forward pass. It can be either a PyTorch bug (in which case I will not be able to do anything about it, other than check whether an update works fine), or DIA-NN preparing the input data for it incorrectly. But I put some checks on the latter, and it all seems fine.
-
I may be experiencing the same non-deterministic issue. Initially, I thought it was a single-thread vs. multi-thread issue, since I only saw the bug when I ran the command without the
I am running DIA-NN with the following command:
I ran it via GDB, which helped track it down to the following:
It's hard to debug further since the program is not open source.
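For anyone who wants to collect a comparable trace, running the binary under GDB can be automated. This is a minimal sketch, assuming `gdb` is installed and on the PATH; the helper names `gdb_backtrace_cmd` and `run_under_gdb` are hypothetical, and the program command line is whatever the user already runs. Without debug symbols the backtrace will be sparse, though frames inside shared libraries may still carry names.

```python
import subprocess


def gdb_backtrace_cmd(program_cmd):
    """Build a gdb command line that runs program_cmd and then prints
    a backtrace for every thread (useful if the run stopped on a crash)."""
    return [
        "gdb", "-batch",               # run the -ex commands, then exit
        "-ex", "run",                  # start the program
        "-ex", "thread apply all bt",  # backtrace of all threads
        "--args", *program_cmd,        # everything after this is the program
    ]


def run_under_gdb(program_cmd, log_path):
    """Run the program under gdb and save gdb's combined output to log_path."""
    with open(log_path, "w") as log:
        result = subprocess.run(
            gdb_backtrace_cmd(program_cmd),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return result.returncode
```

Because the crash is intermittent, several runs may be needed before the log contains a backtrace from an actual segfault.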
-
Hello Vadim,
We have observed some segfaults (not entirely deterministic) while testing the Linux build of version 1.8.
Basically, after installing the binary, I am running a bash script like this:
For 32 threads (in this case only) it resulted in
You might have noticed that I kept the data on a RAM disk. Don't worry about running out of RAM: we have 256 GB of it, so that should not play a role.
I have also seen a segfault without the RAM disk, while running 128 threads.
What should I do to help you eliminate that bug?
Best wishes,
Matteo Lacki
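Since the crashes are not deterministic, one way to quantify them for a bug report is to repeat the same command and count how each run ends. This is a minimal sketch under the assumption of a POSIX system, where Python reports a process killed by a signal as a negative return code; `describe` and `tally_runs` are hypothetical helper names, and the command is whatever the user already runs.

```python
import signal
import subprocess
from collections import Counter


def describe(returncode):
    """Map a subprocess return code to a short label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # POSIX: a negative value means "terminated by this signal number".
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return f"exit {returncode}"


def tally_runs(cmd, repeats=10):
    """Run cmd `repeats` times and count how each run ended."""
    outcomes = Counter()
    for _ in range(repeats):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        outcomes[describe(result.returncode)] += 1
    return outcomes
```

Calling `tally_runs` once per thread count and comparing the resulting counters would show whether the failure rate actually depends on the number of threads.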