Speed up WFA when two sequences differ greatly in length #13

lh3 · 2022-04-12T03:55:06Z

When patching gaps between two anchors, we sometimes need to align two sequences of vastly different lengths. Suppose we are aligning a 10bp sequence against a 100kb sequence. The active band [lo,hi] in the original WFA will grow from 1 to 100010 in size. The total number of iterations is about 100010²/2. Based on the running time of WFA2-lib, I guess WFA2-lib has a similar behavior. This is not necessary given that a wavefront can only consist of 10 cells in this example.

Generally, a key observation is that the active band is determined by the cells in the current stripe (the wfalm terminology). Sometimes we can decrease "hi" or increase "lo" when the wavefront hits the end of the query or the target. I added a hack to miniwfa to achieve that. For 10bp vs 100kb, WFA2-lib takes 44 sec while the modified miniwfa takes 0.2 sec. ksw2 only takes 0.01s mainly because WFA doesn't have an advantage on such examples and partly because my hack is inefficient. I think there should be a faster and cleaner solution but I haven't found that.

RagnarGrootKoerkamp · 2022-04-12T06:43:08Z

Is this in the context of global or semi-global alignment (I suppose global given that you're aligning between anchors)?

Could you explain the code you added a bit more? It looks quadratic to me but surely that's just my misunderstanding. Is the following correct:
WFA keeps diagonals that already reached the end active, and (tries to) update them on every increment of s. Your patch decreases the band whenever this happens, so that only diagonals that haven't reached the end yet are updated.

lh3 · 2022-04-12T14:26:19Z

Is this in the context of global or semi-global alignment (I suppose global given that you're aligning between anchors)?

It is global.

Could you explain the code you added a bit more?

Function wf_stripe_shrink() finds the smallest lower bound and the largest upper bound among cells in the current stripe. This function is called every 64 cycles.

It looks quadratic to me

Let q=max(x,o1+e1,o2+e2). Then wf->n = q+1. Perhaps it is safer to swap the order of for (j) and for (d). In that case, the time complexity of calling the function once is O(qw+q) where w is the change of the bandwidth. Most of time, w=0. When w>0, the following iterations will be faster. With the current loop, the time complexity can be O(qw+q²) in the worst case, I think.

PS:

WFA keeps diagonals that already reached the end active, and (tries to) update them on every increment of s. Your patch decreases the band whenever this happens, so that only diagonals that haven't reached the end yet are updated.

Almost correct, except that the "whenever" part – my implementation only calls the function when s%64==0 as most of time this function has no effect.

probably this has no effect in practice. See also smarco/WFA2-lib#13

smarco · 2022-04-17T15:24:02Z

Sorry for the late reply.

I think I understand the situation you are presenting. In the case of the WFA2-lib, it calls the function wavefront_compute_trim_ends() every time a new wavefront is computed.

This should have a similar effect as your wf_stripe_shrink() (?)

For 10bp vs 100kb, WFA2-lib takes 44 sec while the modified miniwfa takes 0.2 sec

But it clearly isn't. So I must have made an implementation mistake here.

mainly because WFA doesn't have an advantage on such examples

Right. In some extreme corner cases, like this one, the WFA doesn't have the upper-hand.

the modified miniwfa takes 0.2 sec. ksw2 only takes 0.01s

Although the time measurement is too small to make an assessment, this case invites to make a fine profile of this case and see what is going on.

I'm leaving this open until I can address this case.
Thanks for the feedback.

lh3 added a commit to lh3/miniwfa that referenced this issue Apr 12, 2022

improved the worst case during band shrinking

f87dfb9

probably this has no effect in practice. See also smarco/WFA2-lib#13

smarco added bug Something isn't working feature Feature request discussion Discussion labels Apr 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up WFA when two sequences differ greatly in length #13

Speed up WFA when two sequences differ greatly in length #13

lh3 commented Apr 12, 2022

RagnarGrootKoerkamp commented Apr 12, 2022 •

edited

Loading

lh3 commented Apr 12, 2022 •

edited

Loading

smarco commented Apr 17, 2022

Speed up WFA when two sequences differ greatly in length #13

Speed up WFA when two sequences differ greatly in length #13

Comments

lh3 commented Apr 12, 2022

RagnarGrootKoerkamp commented Apr 12, 2022 • edited Loading

lh3 commented Apr 12, 2022 • edited Loading

smarco commented Apr 17, 2022

RagnarGrootKoerkamp commented Apr 12, 2022 •

edited

Loading

lh3 commented Apr 12, 2022 •

edited

Loading