perf: improve user specified flush delay automatically by statistics #159
Conversation
force-pushed from 08c3c3b to dfa11e7
Codecov Report — Base: 96.19% // Head: 96.17% // Decreases project coverage by -0.02%.

```
@@            Coverage Diff             @@
##             main     #159      +/-   ##
==========================================
- Coverage   96.19%   96.17%   -0.02%
==========================================
  Files          36       36
  Lines       25382    25401      +19
==========================================
+ Hits        24415    24429      +14
- Misses        846      851       +5
  Partials      121      121
```

View full report at Codecov.
Related to #156.
Hello @rueian – this looks interesting. Some initial thoughts after looking at the PR (I have not even run it yet):
Also, I suppose this should be tested with read-intensive commands – in https://github.com/FZambia/pipelines I only tried the PUBLISH command (which has a very small response), but if the client reads a lot from Redis, then smoother write flushes of read-intensive commands may be preferable for CPU – just a shot in the dark.
Another thought: let's say we have bigger latency and set some delay we are comfortable with in the app. As far as I understand, this auto adjustment does not take latency into account – there may not be much sense in reducing the delay automatically given a large RTT (since it may result in larger CPU usage on Redis in the end). BTW, I experimented locally with https://github.com/Shopify/toxiproxy yesterday to add some latency to Redis. But I am on macOS; for Linux …
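For reference, adding latency via the toxiproxy Go client can look roughly like this (a sketch; the proxy name, ports, and latency values are arbitrary assumptions):

```go
package main

import (
	"log"

	toxiproxy "github.com/Shopify/toxiproxy/client"
)

func main() {
	// Connect to a locally running toxiproxy server (default API port 8474).
	client := toxiproxy.NewClient("localhost:8474")

	// Route Redis traffic through the proxy: the app connects to :26379,
	// toxiproxy forwards to the real Redis on :6379.
	proxy, err := client.CreateProxy("redis", "localhost:26379", "localhost:6379")
	if err != nil {
		log.Fatal(err)
	}

	// Add 5ms of latency (±1ms jitter) to data flowing from Redis back
	// to the client.
	if _, err := proxy.AddToxic("redis-latency", "latency", "downstream", 1.0, toxiproxy.Attributes{
		"latency": 5,
		"jitter":  1,
	}); err != nil {
		log.Fatal(err)
	}
}
```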
I tried to experiment a bit. Still, the more I think about this, the more I tend to think that automatic adjustment does a somewhat separate thing: it does not reduce CPU, it goes towards throughput. For example, for parallelism 1 and delay 100 microseconds I got similar results to yours, but CPU usage increased 3x on both the app and Redis when using automatic adjustment. But I suppose if I wanted to improve throughput, I could just tune the `MaxFlushDelay` parameter myself, since as a developer I know my app and its load better.

This may be a good thing in general, but it feels like a completely different mode than what we had before with `MaxFlushDelay`. And I still doubt whether Rueidis really needs complex behaviour like this – probably better to drop the idea altogether, otherwise we risk getting pretty unmanageable behavior. I really worry that runtime adjustments may be too tricky to maintain, and there are too many corner cases, types/sizes of data, and latencies involved. Maybe I am too conservative in that – I just prefer keeping things as simple as possible.
Indeed. I use the word …
Let's start with the question: how long should we wait for more commands? Sure, we could just sleep for the full user-specified `MaxFlushDelay`, but that can hurt badly. For example, sleeping for 100 microseconds increased time per op by ~290%:

```
▶ benchstat ori-0us-p1.txt old-100us-p1.txt
name        old time/op  new time/op   delta
Rueidis-10  3.08µs ± 0%  11.99µs ± 0%  +289.49%  (p=0.000 n=19+20)
```

To avoid sleeping too long, I would like to make each sleep, including the successive flush, take a similar time to previous flushes. In other words, I want the sleep duration plus the following flush to be close to the duration of an average flush.

Assuming I track the average duration and average size of previous flushes, I can also derive an approximated average receiving time of a byte by tracking the durations between each explicit flush. Therefore, the original question can be approximated to: how many bytes should I wait for so that flushing them takes a similar time to the average flush? The sleeping duration is then that byte count multiplied by the average receiving time of a byte.
Sure, but there are many cases where the traffic pattern can't be fitted to a fixed `MaxFlushDelay`.
We probably can also have a mode allowing the user to set …
Don't worry. This won't be merged until we find a better solution.
I think you implemented some bright ideas here, but I am not sure at this point that I'd use this mode myself. Probably it's better to wait for more feedback here, maybe from other users.
Thinking more… not sure whether it's a good idea or not – can this batching strategy be a pluggable element somehow, just to keep the Rueidis core simple yet extensible?
It is a good idea. Probably something like this …
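A minimal sketch of such an extension point, assuming a hypothetical `FlushStrategy` interface (the name and methods are illustrative, not an actual API):

```go
package pipe

import "time"

// FlushStrategy decides how long the writer loop may wait for more
// commands before flushing the outgoing buffer. A hypothetical
// extension point: the default implementation could reproduce the
// current fixed MaxFlushDelay behavior, while adaptive strategies
// plug in their own statistics.
type FlushStrategy interface {
	// Delay returns how long to wait before the next flush, given the
	// number of bytes currently buffered.
	Delay(buffered int) time.Duration
	// ObserveFlush feeds back the size and duration of each explicit
	// flush so adaptive strategies can adjust themselves.
	ObserveFlush(bytes int, took time.Duration)
}

// fixedDelay is the simplest strategy: always wait a constant delay.
type fixedDelay struct{ d time.Duration }

func (f fixedDelay) Delay(int) time.Duration         { return f.d }
func (f fixedDelay) ObserveFlush(int, time.Duration) {}
```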
force-pushed from dfa11e7 to ba3adb3
force-pushed from d936e9a to 8410f48
force-pushed from fdb6041 to b9b0f4b
Choose the flush delay automatically based on statistics from previous flush histories. It generally achieves better throughput than just using the `MaxFlushDelay` provided by users directly, while still keeping reasonable CPU usage.

Benchmark comparison to the old v0.0.89, using the code from https://github.com/FZambia/pipelines:
- `PIPE_PARALLELISM=1 PIPE_MAX_FLUSH_DELAY=100`
- `PIPE_PARALLELISM=128 PIPE_MAX_FLUSH_DELAY=100`
- `PIPE_PARALLELISM=1 PIPE_MAX_FLUSH_DELAY=20`
- `PIPE_PARALLELISM=128 PIPE_MAX_FLUSH_DELAY=20`