EXC_BAD_ACCESS #16
Comments
It's likely UB. Another phenotype is aborting with:
[WFA::Backtrace] Wrong type trace.2
|
I had some difficulty with the C++ API. In my case the key issue seemed to
be the initialization of the reduction parameter. For the time being I'm
using the C API and setting the reduction explicitly.
…On Wed, Apr 20, 2022, 13:56 Armin Töpfer ***@***.***> wrote:
It's likely UB. Another phenotype is aborting with
[WFA::Backtrace] Wrong type trace.2
|
Another UB hit
@ekg do you have a code snippet to reproduce dual affine-gap in C? |
See https://github.com/vcflib/vcflib/blob/master/src/Variant.cpp#L2158, starting from that line.
Let us know if this fixes what you're seeing.
…On Wed, Apr 20, 2022, 14:07 Armin Töpfer ***@***.***> wrote:
Another UB hit
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../subprojects/wfa/wavefront/wavefront_extend.c:116:38 in
../subprojects/wfa/system/mm_allocator.c:400:66: runtime error: addition of unsigned offset to 0x632000000800 overflowed to 0x6320000007f8
|
Okay, I added it: https://github.com/armintoepfer/aligner-testbed/blob/main/src/main.cpp#L168-L215 Did anything obviously break during my copy/paste? |
I can see you already set
|
We have successfully executed the code on a 2017 MacBook Air (Intel i5-5350U) running macOS Monterey 12.3.1.
The problem you are experiencing seems to be related to unaligned memory accesses during the extend()/LCP() computation. We optimize this function by comparing blocks of 64 bits (8 characters) of the input sequences at a time, and this optimization requires unaligned memory access. To help you better, can you let us know which machine/core you are using to run the benchmark? In any case, make sure that compiling
Lastly, note that short executions might not be representative. See a flame graph of the WFA2-lib execution for short sequences, where most of the time is spent in the initial allocation and final deallocation. Also, note that WFA is running in exact mode here. You could even obtain better performance using
Let us know. |
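For readers following along, here is a minimal sketch of the block-wise comparison described above. It is not the WFA2-lib implementation: the function name `lcp_blockwise` and its signature are hypothetical, and the use of `memcpy` for the 8-byte loads is an assumption chosen for illustration. Loading through a cast `uint64_t*` at an arbitrary offset is what alignment sanitizers typically flag, whereas `memcpy` into a local variable is well-defined for any alignment and is usually compiled to a single load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not WFA2-lib code): returns the length of the
 * longest common prefix of a[0..a_len) and b[0..b_len), comparing
 * 8 characters (64 bits) per step where possible. */
static size_t lcp_blockwise(const char *a, size_t a_len,
                            const char *b, size_t b_len) {
  assert(a != NULL && b != NULL);
  const size_t n = a_len < b_len ? a_len : b_len;
  size_t i = 0;
  /* Compare full 8-byte blocks. memcpy keeps the unaligned loads
   * well-defined regardless of where a + i and b + i point. */
  while (i + 8 <= n) {
    uint64_t wa, wb;
    memcpy(&wa, a + i, sizeof wa);
    memcpy(&wb, b + i, sizeof wb);
    if (wa != wb) break; /* the mismatch lies inside this block */
    i += 8;
  }
  /* Finish byte by byte: this locates the mismatch inside the last
   * block and handles a tail shorter than 8 characters. */
  while (i < n && a[i] == b[i]) ++i;
  return i;
}
```

Finishing byte by byte keeps the sketch independent of endianness; a tuned version would more likely XOR the two words and count zero bits to find the mismatch position directly.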
First of all, great to hear that you could run it. I'm using a standard x86 i7 in my iMac with the latest Apple Clang and GCC 11. The issue is independent of the -march setting. Can you try running with multiple rounds? Maybe under a debugger? It does not happen with the C API directly. The C API call is also slower than the C++ version. Any idea why? The way we map and align is similar to the minimap2 approach. Alignment of very short sequences has been working great so far. Do you think aligning full 20 kb vs 20 kb CLR reads with WFA will be faster than first mapping, cutting into small regions, and then aligning? I can obviously try, but maybe you have done that study already. |
Food for thought, I've added
|
Ok, I've tried on an Intel i7-6500U (Ubuntu 18.04) and I couldn't reproduce the unaligned memory problem. These are, indeed, short sequences, and both KSW2 and WFA run pretty fast:
=> KSW2
=> Exact WFA
In all cases, the measurements are really small. I have profiled the WFA case, and we spend a substantial amount of time doing bookkeeping (e.g., reaping internal buffers). I guess we could do better if we focused on these cases. But, for the time being, KSW2 has the upper hand over exact WFA for these short sequences (the execution times being so small). Note that, comparing CIGARs (using the penalties you provided), WFA returns a better score/CIGAR for 77.1% of the pairs. I'm not aware of the band size used for KSW2, but this aspect might be interesting to explore (along with how suboptimal alignments might affect the results of the downstream analyses). Perhaps it's not relevant to get the exact optimum in these cases. |
For the long: I refer to the previous results.
I believe that the newest biWFA could do even better. We could also check how close to optimal the KSW2 CIGARs are. |
Then, for clr1: we have 2 sequences of length 18779 and 18956, aligning at edit distance 3645 (e~19%). It seems that there are no big indels; rather, the errors are distributed along the sequences.
| 20220420 22:19:21.555 | INFO | Number of sequence pairs : 1
| 20220420 22:20:18.136 | INFO | Number of sequence pairs : 1
Compared to the exact WFA, KSW2 does a pretty good job and returns the correct/optimal alignment. In this case in particular, the exact WFA is forced to explore a lot of the DP matrix:
Meanwhile, using the
./at ../data/clr1.txt --miniwfa=false --wfa2-c=true --wfa2-cpp=false --ksw2=false --rounds 20
This is a good example of a sequence that is not particularly favourable to the WFA. In any case, the time is comparable when using heuristics (I guess that in the playground of heuristics we could tune it and do better, as KSW2 could too) and 6x slower when calculating the optimal CIGAR. I think we can take it from here and optimize the cases you are interested in. |
I'm running into an issue that I can't reproduce if I just give it one sequence pair...
ASAN/UBSAN gives something else...
You can try to reproduce with https://github.com/armintoepfer/clr-align-challenge and then