Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Big integer abstraction for Name accumulators & rug-based BigInt backend #373

Merged
merged 11 commits into from
Dec 6, 2023

Conversation

matheus23
Copy link
Member

@matheus23 matheus23 commented Nov 24, 2023

This introduces the Big trait in the wnfs-nameaccumulator crate which abstracts the type & functions needed from the bigint library to support name accumulators.
This allows us to experiment with different backends.
I've implemented a rug based backend and tested it against the num-bigint-dig backend, it improves name accumulator performance by roughly 2x:

Benchmark output
[philipp@nixos:~/program/work/rs-wnfs]$ cargo bench -p wnfs-bench --bench nameaccumulator
   Compiling wnfs-nameaccumulator v0.1.25 (/home/philipp/program/work/rs-wnfs/wnfs-nameaccumulator)
   Compiling wnfs-bench v0.1.25 (/home/philipp/program/work/rs-wnfs/wnfs-bench)
    Finished bench [optimized] target(s) in 2.05s
     Running nameaccumulator.rs (target/release/deps/nameaccumulator-969d61c944503e93)
Benchmarking NameSegment::<BigNumDig>::new_hashed: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.1s, enable flat sampling, or reduce sample count to 50.
Benchmarking NameSegment::<BigNumDig>::new_hashed: Collecting 100 samples in estima
NameSegment::<BigNumDig>::new_hashed
                        time:   [1.4227 ms 1.4422 ms 1.4615 ms]
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high severe

Benchmarking NameSegment::<BigNumRug>::new_hashed: Collecting 100 samples in estima
NameSegment::<BigNumRug>::new_hashed
time: [247.75 µs 250.97 µs 254.11 µs]
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild

Benchmarking NameSegment::<BigNumDig>::new(rng): Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.9s, enable flat sampling, or reduce sample count to 50.
Benchmarking NameSegment::<BigNumDig>::new(rng): Collecting 100 samples in estimate
NameSegment::<BigNumDig>::new(rng)
time: [1.3551 ms 1.3711 ms 1.3873 ms]
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low severe
2 (2.00%) low mild
6 (6.00%) high mild
1 (1.00%) high severe

Benchmarking NameSegment::<BigNumRug>::new(rng): Collecting 100 samples in estimate
NameSegment::<BigNumRug>::new(rng)
time: [460.11 µs 461.60 µs 463.13 µs]
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low severe
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe

Benchmarking NameAccumulator::<BigNumDig>::add: Collecting 100 samples in estimated
NameAccumulator::<BigNumDig>::add
time: [1.3226 ms 1.3311 ms 1.3398 ms]
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

Benchmarking NameAccumulator::<BigNumRug>::add: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.9s, enable flat sampling, or reduce sample count to 60.
Benchmarking NameAccumulator::<BigNumRug>::add: Collecting 100 samples in estimated
NameAccumulator::<BigNumRug>::add
time: [735.55 µs 739.12 µs 742.85 µs]
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild

Benchmarking NameAccumulator::<BigNumDig> serialization: Collecting 100 samples in
NameAccumulator::<BigNumDig> serialization
time: [986.84 ns 1.0084 µs 1.0381 µs]
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe

Benchmarking NameAccumulator::<BigNumRug> serialization: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.0s, enable flat sampling, or reduce sample count to 60.
Benchmarking NameAccumulator::<BigNumRug> serialization: Collecting 100 samples in
NameAccumulator::<BigNumRug> serialization
time: [597.78 ns 608.67 ns 622.15 ns]
Found 9 outliers among 100 measurements (9.00%)
7 (7.00%) high mild
2 (2.00%) high severe

The downside of the rug backend is that you need to have the gmp libraries installed locally.

Currently WIP because:

  • Needs cleanup of the Big trait. There's some weirdness to it. Function names and stuff.
  • Consider splitting the Big::Num type into Big::U2048 and Big::U256, or Big::GroupElem and Big::Exponent, or whatever is needed to support a crypto-bigint backend. Not now.
  • Figure out what to is going on with snapshot tests (why isn't the num-bigint-dig backend generating exactly the same data?).
  • Decide whether backends must use e.g. the same pseudo-random prime derivation algorithms. No, it just needs to be pseudorandom.
  • Add a note to the readme about the rug feature.

@matheus23 matheus23 self-assigned this Nov 24, 2023
@fabricedesre
Copy link
Collaborator

Very nice perf improvement!

About:

The downside of the rug backend is that you need to have the gmp libraries installed locally.

The doc at https://docs.rs/gmp-mpfr-sys/1.6.1/gmp_mpfr_sys/index.html#experimental-optional-features says that it builds gmp. I guess it's statically linked so there is no runtime dependency to install.

@matheus23
Copy link
Member Author

Oh that's cool.

One thing to figure out I guess would be licensing. GMP is LGPLv3 and GPLv2 dual-licensed.
Not sure what that'd mean for our Apache 2/MIT licensing.

@matheus23
Copy link
Member Author

So after some additional research & talking to people it seems like LGPL is designed to allow projects that use something that's LGPL to be licensed differently.

Looking at rug reverse dependencies: https://crates.io/crates/rug/reverse_dependencies there's lots of projects that are licensed as MIT/Apache 2.0 with rug in their direct dependencies.

Unlike all of these other crates, I think I'll add a note in the nameaccumulator README that by enabling the "rug" feature, you'll be depending on rug, just as an FYI.

Copy link

codecov bot commented Dec 6, 2023

Codecov Report

Merging #373 (7b3248f) into main (17c14c4) will increase coverage by 0.27%.
The diff coverage is 78.20%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #373      +/-   ##
==========================================
+ Coverage   54.57%   54.85%   +0.27%     
==========================================
  Files          56       57       +1     
  Lines        3879     3936      +57     
  Branches      933      947      +14     
==========================================
+ Hits         2117     2159      +42     
- Misses       1162     1168       +6     
- Partials      600      609       +9     
Files Coverage Δ
wnfs/src/private/directory.rs 68.92% <ø> (ø)
wnfs/src/public/node/node.rs 60.81% <ø> (ø)
wnfs/src/root_tree.rs 63.23% <ø> (ø)
wnfs-nameaccumulator/src/uint256_serde_be.rs 40.00% <50.00%> (-32.73%) ⬇️
wnfs-nameaccumulator/src/fns.rs 78.94% <76.47%> (-10.53%) ⬇️
wnfs-common/src/utils/test.rs 56.75% <73.68%> (-2.51%) ⬇️
wnfs-nameaccumulator/src/name.rs 84.71% <81.48%> (+2.18%) ⬆️
wnfs-nameaccumulator/src/traits.rs 79.03% <79.03%> (ø)

... and 3 files with indirect coverage changes

@matheus23 matheus23 marked this pull request as ready for review December 6, 2023 12:28
@matheus23 matheus23 requested a review from a team as a code owner December 6, 2023 12:28
@matheus23 matheus23 changed the title Big integer abstraction for Name accumulators & rug-based BigInt backend feat: Big integer abstraction for Name accumulators & rug-based BigInt backend Dec 6, 2023
@matheus23
Copy link
Member Author

This also implements wnfs-wg/spec#76 since that simplifies the Big trait & the previous implementation used big-endian in places that the spec actually required little-endian (so the previous implementation was incorrect).

Now it's all big-endian within wnfs-nameaccumulator, which makes it much harder to make mistakes such as these.

@matheus23 matheus23 merged commit c407818 into main Dec 6, 2023
10 checks passed
@matheus23 matheus23 deleted the matheus23/gmp-nameaccumulators branch December 6, 2023 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants