
Replace mutex with ReadWriteLock in CipherManager #981

Closed

Conversation

torusrxxx (Contributor)

CipherManager seems to be accessed a lot, but it is guarded by a mutex, so I replaced the mutex with a ReadWriteLock.
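For illustration, a minimal sketch of what such a locking scheme looks like (not the actual CipherManager code; the class name, bounded map, and key encoding are assumptions): lookups take the shared read lock, while a cache miss computes the digest outside the locks and then takes the exclusive write lock to insert.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the cache inside CipherManager, not the real class.
final class DigestedKeyCache {
    private static final int MAX_ENTRIES = 128;

    // Insertion-ordered map with FIFO eviction; get() does not mutate it,
    // so holding only the read lock during lookups is safe.
    private final Map<String, byte[]> cache = new LinkedHashMap<String, byte[]>() {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
            return size() > MAX_ENTRIES;
        }
    };
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    byte[] getDigestedKey(byte[] routingKey) throws NoSuchAlgorithmException {
        String mapKey = Base64.getEncoder().encodeToString(routingKey);

        // Fast path: shared read lock for cache hits.
        lock.readLock().lock();
        try {
            byte[] cached = cache.get(mapKey);
            if (cached != null) {
                return cached.clone();
            }
        } finally {
            lock.readLock().unlock();
        }

        // Cache miss: compute outside any lock, then take the exclusive write
        // lock to insert. A concurrent thread may compute the same digest;
        // the duplicate put is harmless.
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(routingKey);
        lock.writeLock().lock();
        try {
            cache.put(mapKey, digest);
        } finally {
            lock.writeLock().unlock();
        }
        return digest.clone();
    }
}
```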

bertm (Contributor) commented Oct 5, 2024

Benchmarking shows that the proposed change decreases the performance of getDigestedKey regardless of cache hit ratio.

Below are the results of a microbenchmark testing for:

  • different cache implementations
    • cacheType = ReadWriteLocked - the proposed new implementation
    • cacheType = Synchronized - the existing implementation
    • cacheType = NoCache - no key cache, just use SHA256.getMessageDigest/returnMessageDigest
    • cacheType = NoCache_NoDigestPool - no key cache, just use a fresh MessageDigest.getInstance("SHA-256", provider)
  • different cache hit ratios (through number of unique keys used)
    • numberOfKeys = 128 matches the cache size, giving 100% cache hit rate
    • numberOfKeys = 1024 reduces this to 12.5%
  • different routing key types
    • routingKeyType = SSK has 66 byte keys
    • routingKeyType = CHK has 34 byte keys

Benchmark is run using 4 threads on a ~3 year old i7-10510U.

         (cacheType)  (numberOfKeys)  (routingKeyType)  Mode  Cnt  Score   Error  Units

-100% cache hit ratio, SSK keys-
     ReadWriteLocked             128               SSK  avgt   50  0,932 ± 0,024  us/op
        Synchronized             128               SSK  avgt   50  0,703 ± 0,035  us/op
             NoCache             128               SSK  avgt   50  1,275 ± 0,023  us/op
NoCache_NoDigestPool             128               SSK  avgt   50  0,811 ± 0,055  us/op

-100% cache hit ratio, CHK keys-
     ReadWriteLocked             128               CHK  avgt   50  0,930 ± 0,016  us/op
        Synchronized             128               CHK  avgt   50  0,727 ± 0,025  us/op
             NoCache             128               CHK  avgt   50  1,041 ± 0,014  us/op
NoCache_NoDigestPool             128               CHK  avgt   50  0,462 ± 0,020  us/op

-12.5% cache hit ratio, SSK keys-
     ReadWriteLocked            1024               SSK  avgt   50  7,885 ± 0,732  us/op
        Synchronized            1024               SSK  avgt   50  2,282 ± 0,135  us/op
             NoCache            1024               SSK  avgt   50  1,320 ± 0,105  us/op
NoCache_NoDigestPool            1024               SSK  avgt   50  0,764 ± 0,031  us/op

-12.5% cache hit ratio, CHK keys-
     ReadWriteLocked            1024               CHK  avgt   50  5,661 ± 0,530  us/op
        Synchronized            1024               CHK  avgt   50  2,339 ± 0,130  us/op
             NoCache            1024               CHK  avgt   50  1,028 ± 0,014  us/op
NoCache_NoDigestPool            1024               CHK  avgt   50  0,485 ± 0,023  us/op

This shows that the new implementation is between 28% slower (high cache hit ratio, CHK keys) and 246% slower (low cache hit ratio, SSK keys).

For what it's worth, in a flame graph of the slowest benchmark, only a small fraction of the time is spent actually computing the hashes (annotated with red borders); everything else is overhead, most of it due to the locking.
[flame graph image]

That all said: this PR is fine code-wise, but it does not appear to solve any actual problem.
If anyone has benchmarks showing otherwise, I would love to see them!
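For reference, a benchmark along these lines could be structured roughly as follows (a JMH sketch using the parameter names from the table above; KeyDigestCache and its factory are hypothetical stand-ins, not Freenet's actual classes):

```java
import java.security.MessageDigest;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Threads(4)                       // run with 4 threads to maximize contention
@State(Scope.Benchmark)
public class GetDigestedKeyBenchmark {

    @Param({"ReadWriteLocked", "Synchronized", "NoCache", "NoCache_NoDigestPool"})
    public String cacheType;

    @Param({"128", "1024"})       // 128 keys = 100% hit rate, 1024 keys = 12.5%
    public int numberOfKeys;

    @Param({"SSK", "CHK"})        // SSK: 66 byte keys, CHK: 34 byte keys
    public String routingKeyType;

    private byte[][] keys;
    private KeyDigestCache cache;

    @Setup
    public void setup() throws Exception {
        int keyLength = "SSK".equals(routingKeyType) ? 66 : 34;
        keys = new byte[numberOfKeys][];
        for (int i = 0; i < numberOfKeys; i++) {
            keys[i] = new byte[keyLength];
            ThreadLocalRandom.current().nextBytes(keys[i]);
        }
        cache = KeyDigestCache.create(cacheType);
    }

    @Benchmark
    public byte[] getDigestedKey() throws Exception {
        byte[] key = keys[ThreadLocalRandom.current().nextInt(numberOfKeys)];
        return cache.getDigestedKey(key);
    }

    // Minimal stand-in for the implementations under test; a real benchmark
    // would wire in the ReadWriteLocked / Synchronized / pooled variants here.
    public interface KeyDigestCache {
        byte[] getDigestedKey(byte[] routingKey) throws Exception;

        static KeyDigestCache create(String cacheType) {
            if ("NoCache_NoDigestPool".equals(cacheType)) {
                return key -> MessageDigest.getInstance("SHA-256").digest(key);
            }
            throw new UnsupportedOperationException("wire in implementation for: " + cacheType);
        }
    }
}
```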

torusrxxx (Contributor, Author)

Everyone wants to get rid of locks. You need to test in a multi-threaded environment before claiming locks are useless. Even if ReadWriteLock is a bit slower, it lets more threads through than a mutex. But in this case, I think getDigestedKey might be so fast on modern CPUs that it doesn't need a cache anyway.
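For illustration, a cacheless getDigestedKey would reduce to little more than a single SHA-256 computation (a hypothetical sketch, not Freenet's actual code); the NoCache_NoDigestPool rows in the table above suggest this costs well under a microsecond per call for 34–66 byte keys.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical cacheless variant: compute the digest on every call,
// with no shared state and therefore no lock contention.
final class UncachedDigest {
    static byte[] getDigestedKey(byte[] routingKey) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256").digest(routingKey);
    }
}
```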

bertm (Contributor) commented Oct 6, 2024

> Everyone wants to get rid of locks.

That is not the point I was raising. The locks (whether monitors or read-write) are necessary for synchronizing access to the cache - when there is one.

> You need to test in a multi-threaded environment before claiming locks are useless.

Which is why the microbenchmark above was run using 4 threads, to maximize contention.

> Even if ReadWriteLock is a bit slower, it lets more threads through than a mutex.

The benchmark result appears to indicate the opposite.

> But in this case, I think getDigestedKey might be so fast on modern CPUs that it doesn't need a cache anyway.

This is exactly the point I was trying to raise 👍

torusrxxx (Contributor, Author)

Let's remove the unnecessary caches instead.

torusrxxx closed this on Oct 6, 2024