
Replace mutex with ReadWriteLock in CipherManager #981

Closed

Conversation

torusrxxx (Contributor)

CipherManager seems to be accessed a lot, but it is guarded by a mutex, so I replaced the mutex with a ReadWriteLock.
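For illustration, a minimal sketch of what such a locking scheme looks like (not the actual CipherManager code; the class name, bounded map, and key encoding are assumptions): lookups take the shared read lock, while a cache miss computes the digest outside the locks and then takes the exclusive write lock to insert.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the cache inside CipherManager, not the real class.
final class DigestedKeyCache {
    private static final int MAX_ENTRIES = 128;

    // Insertion-ordered map with FIFO eviction; get() does not mutate it,
    // so holding only the read lock during lookups is safe.
    private final Map<String, byte[]> cache = new LinkedHashMap<String, byte[]>() {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
            return size() > MAX_ENTRIES;
        }
    };
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    byte[] getDigestedKey(byte[] routingKey) throws NoSuchAlgorithmException {
        String mapKey = Base64.getEncoder().encodeToString(routingKey);

        // Fast path: shared read lock for cache hits.
        lock.readLock().lock();
        try {
            byte[] cached = cache.get(mapKey);
            if (cached != null) {
                return cached.clone();
            }
        } finally {
            lock.readLock().unlock();
        }

        // Cache miss: compute outside any lock, then take the exclusive write
        // lock to insert. A concurrent thread may compute the same digest;
        // the duplicate put is harmless.
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(routingKey);
        lock.writeLock().lock();
        try {
            cache.put(mapKey, digest);
        } finally {
            lock.writeLock().unlock();
        }
        return digest.clone();
    }
}
```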

bertm (Contributor) commented Oct 5, 2024

Benchmarking shows that the proposed change decreases the performance of getDigestedKey regardless of cache hit ratio.

Below are the results of a microbenchmark testing for:

  • different cache implementations
    • cacheType = ReadWriteLocked - the proposed new implementation
    • cacheType = Synchronized - the existing implementation
    • cacheType = NoCache - no key cache, just use SHA256.getMessageDigest/returnMessageDigest
    • cacheType = NoCache_NoDigestPool - no key cache, just use a fresh MessageDigest.getInstance("SHA-256", provider)
  • different cache hit ratios (through number of unique keys used)
    • numberOfKeys = 128 matches the cache size, giving 100% cache hit rate
    • numberOfKeys = 1024 reduces this to 12.5%
  • different routing key types
    • routingKeyType = SSK has 66 byte keys
    • routingKeyType = CHK has 34 byte keys

Benchmark is run using 4 threads on a ~3 year old i7-10510U.

         (cacheType)  (numberOfKeys)  (routingKeyType)  Mode  Cnt  Score   Error  Units

-100% cache hit ratio, SSK keys-
     ReadWriteLocked             128               SSK  avgt   50  0,932 ± 0,024  us/op
        Synchronized             128               SSK  avgt   50  0,703 ± 0,035  us/op
             NoCache             128               SSK  avgt   50  1,275 ± 0,023  us/op
NoCache_NoDigestPool             128               SSK  avgt   50  0,811 ± 0,055  us/op

-100% cache hit ratio, CHK keys-
     ReadWriteLocked             128               CHK  avgt   50  0,930 ± 0,016  us/op
        Synchronized             128               CHK  avgt   50  0,727 ± 0,025  us/op
             NoCache             128               CHK  avgt   50  1,041 ± 0,014  us/op
NoCache_NoDigestPool             128               CHK  avgt   50  0,462 ± 0,020  us/op

-12.5% cache hit ratio, SSK keys-
     ReadWriteLocked            1024               SSK  avgt   50  7,885 ± 0,732  us/op
        Synchronized            1024               SSK  avgt   50  2,282 ± 0,135  us/op
             NoCache            1024               SSK  avgt   50  1,320 ± 0,105  us/op
NoCache_NoDigestPool            1024               SSK  avgt   50  0,764 ± 0,031  us/op

-12.5% cache hit ratio, CHK keys-
     ReadWriteLocked            1024               CHK  avgt   50  5,661 ± 0,530  us/op
        Synchronized            1024               CHK  avgt   50  2,339 ± 0,130  us/op
             NoCache            1024               CHK  avgt   50  1,028 ± 0,014  us/op
NoCache_NoDigestPool            1024               CHK  avgt   50  0,485 ± 0,023  us/op

This shows that the new implementation is between 28% slower (high cache hit ratio, CHK keys) and 246% slower (low cache hit ratio, SSK keys).

For what it's worth, in a flame graph of the slowest benchmark, only a small fraction of the time is spent actually computing the hashes (annotated with red borders); everything else is overhead, most of it due to the locking.
[flame graph image]

That all said: this PR is fine code-wise, but it does not appear to solve any actual problem.
If anyone has benchmarks showing otherwise, I would love to see them!
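For reference, a benchmark along these lines could be structured roughly as follows (a JMH sketch using the parameter names from the table above; KeyDigestCache and its factory are hypothetical stand-ins, not Freenet's actual classes):

```java
import java.security.MessageDigest;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Threads(4)                       // run with 4 threads to maximize contention
@State(Scope.Benchmark)
public class GetDigestedKeyBenchmark {

    @Param({"ReadWriteLocked", "Synchronized", "NoCache", "NoCache_NoDigestPool"})
    public String cacheType;

    @Param({"128", "1024"})       // 128 keys = 100% hit rate, 1024 keys = 12.5%
    public int numberOfKeys;

    @Param({"SSK", "CHK"})        // SSK: 66 byte keys, CHK: 34 byte keys
    public String routingKeyType;

    private byte[][] keys;
    private KeyDigestCache cache;

    @Setup
    public void setup() throws Exception {
        int keyLength = "SSK".equals(routingKeyType) ? 66 : 34;
        keys = new byte[numberOfKeys][];
        for (int i = 0; i < numberOfKeys; i++) {
            keys[i] = new byte[keyLength];
            ThreadLocalRandom.current().nextBytes(keys[i]);
        }
        cache = KeyDigestCache.create(cacheType);
    }

    @Benchmark
    public byte[] getDigestedKey() throws Exception {
        byte[] key = keys[ThreadLocalRandom.current().nextInt(numberOfKeys)];
        return cache.getDigestedKey(key);
    }

    // Minimal stand-in for the implementations under test; a real benchmark
    // would wire in the ReadWriteLocked / Synchronized / pooled variants here.
    public interface KeyDigestCache {
        byte[] getDigestedKey(byte[] routingKey) throws Exception;

        static KeyDigestCache create(String cacheType) {
            if ("NoCache_NoDigestPool".equals(cacheType)) {
                return key -> MessageDigest.getInstance("SHA-256").digest(key);
            }
            throw new UnsupportedOperationException("wire in implementation for: " + cacheType);
        }
    }
}
```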

torusrxxx (Contributor, Author)

Everyone wants to get rid of locks. You need to test in a multi-threaded environment before claiming locks are useless. Even if ReadWriteLock is a bit slower, it lets more threads through than a mutex. But in this case, I think getDigestedKey might be so fast on modern CPUs that it doesn't need a cache anyway.
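For illustration, a cacheless getDigestedKey would reduce to little more than a single SHA-256 computation (a hypothetical sketch, not Freenet's actual code); the NoCache_NoDigestPool rows in the table above suggest this costs well under a microsecond per call for 34–66 byte keys.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical cacheless variant: compute the digest on every call,
// with no shared state and therefore no lock contention.
final class UncachedDigest {
    static byte[] getDigestedKey(byte[] routingKey) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256").digest(routingKey);
    }
}
```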

bertm (Contributor) commented Oct 6, 2024

> Everyone wants to get rid of locks.

That is not the point I was raising. The locks (whether monitors or read-write) are necessary for synchronizing access to the cache - when there is one.

> You need to test in a multi-threaded environment before claiming locks are useless.

Which is why the microbenchmark above was run using 4 threads, to maximize contention.

> Even if ReadWriteLock is a bit slower, it lets more threads through than a mutex.

The benchmark result appears to indicate the opposite.

> But in this case, I think getDigestedKey might be so fast on modern CPUs that it doesn't need a cache anyway.

This is exactly the point I was trying to raise 👍

torusrxxx (Contributor, Author)

Let's remove the unnecessary caches instead.

torusrxxx closed this on Oct 6, 2024