Upgrade dependency opensearch-java 2.1.0 to newer #4821
Comments
PR: apache#1648 Locally it seemed to pass, but let's see with the CI. Then I will run some perf tests |
Flame graphs and thread dump... Everything seems blocked, but it is really hard to see what's wrong. I think I need to capture them shortly after starting a run: I waited too long and James seems completely bloated. Will try again |
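For a quick look at whether "everything seems blocked" without reading a full dump, thread states can be counted from inside the JVM early in a run. This is only a minimal sketch using the standard `java.lang.management` API; the class name `ThreadStateSnapshot` is made up for illustration and is not part of James.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: summarizes a thread dump as a count per thread state.
final class ThreadStateSnapshot {

    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false/false: skip lock details, which keeps the snapshot cheap.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // A large BLOCKED or WAITING count compared to RUNNABLE hints at contention.
        System.out.println(countByState());
    }
}
```

Taking a few of these snapshots some seconds apart, right after the run starts, may make it easier to see when threads start piling up.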
Can you try to get a heap dump (maybe upon OOM, the JVM has an option for this) and share it with the team? |
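On HotSpot JVMs the option in question is `-XX:+HeapDumpOnOutOfMemoryError`, optionally combined with `-XX:HeapDumpPath=<path>`. If waiting for the OOM is not practical, a dump can also be triggered on demand. The sketch below assumes a HotSpot-based JVM; `HeapDumper` and the file naming are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

// Hypothetical helper: writes an .hprof heap dump of the current JVM.
final class HeapDumper {

    static Path dump(Path directory) {
        HotSpotDiagnosticMXBean diagnostics =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // The target file must not exist yet and must end with ".hprof".
        Path target = directory.resolve("heap-" + System.nanoTime() + ".hprof");
        try {
            // live = true: only dump objects that are still reachable.
            diagnostics.dumpHeap(target.toString(), true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        Path tmp = Path.of(System.getProperty("java.io.tmpdir"));
        System.out.println("Heap dump written to " + dump(tmp));
    }
}
```

The resulting file can be opened in a heap analyzer; dumps of a loaded server can be large, so compressing before sharing is probably needed.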
A CPU spending 80% of its time allocating arrays: there's something wrong! |
Noticed that, not sure where to look yet though ^^'
I did actually, just forgot to upload it yesterday, as it was too big for the zip for github. |
This time I tried to capture earlier, even if it gets bloated pretty quickly again. I think in the opensearch driver threads, what's interesting is:
We can see it getting errors on ByteArrayBuffer, but from the core5 lib of Apache. I'm not sure how far I can dig into it, seeing how hard it is for me to fully focus on a task these days. But will try |
Q: This heap dump was taken from the JVM running the tests, right? 5 times "One instance of "org.apache.hc.core5.reactor.InternalDataChannel" loaded by "jdk.internal.loader.ClassLoaders$AppClassLoader @ 0x740050000" occupies 104,961,680 (15.68%) bytes". It is exactly 5 tests in So my guess is that test memory blows up because we do not release the hc5 client gracefully after each test / upon James shutdown. I checked for production code, the |
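To illustrate the guess above (one retained channel per test when the client is never released), here is a small sketch. It does not use the real opensearch-java or hc5 classes: `TrackedClient` is a made-up stand-in that only counts open instances, so the numbers show the pattern, not the actual James test setup.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a per-test client that is (or is not) closed.
final class ClientLifecycleSketch {

    static final AtomicInteger OPEN = new AtomicInteger();

    // Stand-in for a client owning I/O buffers; NOT the real hc5 client.
    static final class TrackedClient implements AutoCloseable {
        private boolean closed;

        TrackedClient() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                OPEN.decrementAndGet();
            }
        }
    }

    // Simulates `count` tests and returns how many clients are still open afterwards.
    static int runTests(int count, boolean closeAfterEach) {
        OPEN.set(0);
        for (int i = 0; i < count; i++) {
            TrackedClient client = new TrackedClient();
            try {
                // the test body would use the client here
            } finally {
                if (closeAfterEach) {
                    client.close();
                }
            }
        }
        return OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println("left open without close: " + runTests(5, false));
        System.out.println("left open with close: " + runTests(5, true));
    }
}
```

With five simulated tests and no close, five instances stay open, which matches the shape of the "5 times one instance" report; closing in a `finally` block (or an after-each hook) brings that back to zero.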
I tried honestly to reproduce it first running |
Really can't get around this. Been trying to check the code of opensearch-java but I hardly understand all those shenanigans. What I think though, is that if you look here: https://heaphero.io/heap-report-wc.jsp?p=RThCZ0lOakJ3K2dCd3FiSEc4VEhsdEkxUGFsRW8rR2tWWjNxdm5rZjVTeFBmUEVYc0xHVEpCK0NkSEF4MVpoYlFnaHBWNkU1dXJKVHpHdnFxOW1Xamc9PQ== (cool online tool btw @quantranhong1999 ) the value of one of the byte arrays reported as a leak suspect (actually the same for all):
So it seems to happen for scrolls. Maybe it doesn't clean those up until the scroll is finished, and because in preprod we have lots of records, it goes boom before the end? Something along those lines I guess. |
Well, this is a run with searchOverrides off, only IMAP simulation.
0.8.2: [screenshot not preserved]
http5 update: [screenshot not preserved]
We can see quite a difference here on the searches. With the work on this PR, dropping the REST client, we lose 35% on searchDeleted and 48% on searchUnseen for P99. There is also a 10% perf loss on the mean for searchDeleted and searchUnseen, compared to 0.8.2. I think the work done on the Java client is still not good enough to get rid of the REST client for transport yet. I am in favor of waiting and not merging this work yet. Other feedback welcome :) |
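For reference, a loss figure like the ones above can be derived from raw latencies as a relative change between two percentiles. This is a minimal sketch, assuming a nearest-rank percentile (the load-testing tool may compute P99 differently); the class name and the sample values are made up and are not the measurements from these runs.

```java
import java.util.Arrays;

// Hypothetical helper for comparing latency percentiles between two runs.
final class LatencyComparison {

    // Nearest-rank percentile: smallest sample with at least p% of samples at or below it.
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Positive result means the candidate is slower than the baseline.
    static double relativeChangePercent(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds, for illustration only.
        long[] baseline = {10, 20, 30, 40};
        long[] candidate = {15, 30, 45, 60};
        long p99Base = percentile(baseline, 99);
        long p99Cand = percentile(candidate, 99);
        System.out.println("P99 change: " + relativeChangePercent(p99Base, p99Cand) + "%");
    }
}
```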
Agree |
+1 |
Alright putting it back to the backlog then, but was worth a try, seems better than last time already, just not good enough yet :) |
Latest master IMAP: [screenshot not preserved]
Latest master with opensearch http5 transport client: [screenshot not preserved]
With searchOverrides disabled of course, to force the hits on opensearch. Huge loss of performance... Much worse than a few months ago actually, which is rather odd... Also the docker image size is the same for both, so there is no gain there either... Not worth the switch at all |
May I ask for flame graphs of the terrible run? |
When I finish with release runs |
The core5 lib used for transport with OS is clearly taking around 30% cpu and 50% memory. Badly optimized? |
The update of opensearch-java 2.1.0 -> 2.6.0 makes a test case fail.
Investigate why and complete the update successfully.
apache#1647 (comment)
apache#1647 (comment)
EDIT @Arsnael :
We need as well to: