Merkle trees first commit #612

Open: wants to merge 12 commits into base: main
17 changes: 17 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion Cargo.toml
@@ -5,7 +5,8 @@ members = [
     "fastcrypto-derive",
     "fastcrypto-tbls",
     "fastcrypto-zkp",
-    "fastcrypto-cli"
+    "fastcrypto-cli",
+    "fastcrypto-data",
 ]

 # Dependencies that should be kept in sync through the whole workspace
15 changes: 15 additions & 0 deletions fastcrypto-data/Cargo.toml
@@ -0,0 +1,15 @@
[package]
name = "fastcrypto-data"
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
authors = ["Mysten Labs <build@mystenlabs.com>"]
readme = "../README.md"
description = "Collection of useful cryptographic data structures"
repository = "https://github.com/MystenLabs/fastcrypto"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
rs_merkle = "1.4.1"
fastcrypto = { path = "../fastcrypto" }
4 changes: 4 additions & 0 deletions fastcrypto-data/README.md
@@ -0,0 +1,4 @@
fastcrypto-data
===

A collection of useful cryptographic data structures.
4 changes: 4 additions & 0 deletions fastcrypto-data/src/lib.rs
@@ -0,0 +1,4 @@
// Copyright (c) 2022, Mysten Labs, Inc.
// SPDX-License-Identifier: Apache-2.0

pub mod merkle_tree;
298 changes: 298 additions & 0 deletions fastcrypto-data/src/merkle_tree.rs
@@ -0,0 +1,298 @@
// Copyright (c) 2022, Mysten Labs, Inc.
// SPDX-License-Identifier: Apache-2.0

//! This module implements a Merkle tree (Merkle, R.C. (1988): A Digital Signature Based on a
//! Conventional Encryption Function). An arbitrary number of elements of a given type `T` can be
//! added as leaves, and the tree can then produce a proof, of size logarithmic in the number of
//! leaves, that the leaf at a given index has a given value. Such a proof can be checked by a
//! small verifier which only needs to know the root of the tree and the number of leaves.
//!
//! # Example
//! ```rust
//! # use fastcrypto::hash::Sha256;
//! # use fastcrypto_data::merkle_tree::*;
//! let elements = [[1u8], [2u8], [3u8]];
//! let mut tree = MerkleTree::<32, Sha256, [u8; 1]>::new();
Review comment (Contributor): Could this use a const like Sha256Length instead of the literal 32?

//! tree.insert_all(elements.iter());
//!
//! let index = 1;
//! let proof = tree.prove(index);
//!
//! let verifier = tree.create_verifier().unwrap();
//! assert!(verifier.verify(index, &elements[index], &proof));
//! ```
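
To make the "logarithmic proof" claim in the module documentation concrete, here is a minimal, std-only sketch that collects the sibling hashes for one leaf. Everything in it is an assumption for illustration: `toy_hash` (std's non-cryptographic `DefaultHasher`), the 8-byte `Digest`, `node_hash`, and `sibling_path` are hypothetical names, and the sketch only handles a power-of-two number of leaves. It does not describe how rs_merkle builds proofs internally.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Digest = [u8; 8];

/// Stand-in hash built on std's `DefaultHasher`. NOT cryptographic; illustration only.
fn toy_hash(data: &[u8]) -> Digest {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish().to_be_bytes()
}

/// Hash of an internal node: toy_hash(0x01 || left || right).
fn node_hash(left: &Digest, right: &Digest) -> Digest {
    let mut input = vec![0x01u8];
    input.extend_from_slice(left);
    input.extend_from_slice(right);
    toy_hash(&input)
}

/// Collect the sibling hashes on the path from leaf `index` up to the root.
/// Assumption of this sketch: the number of leaves is a power of two.
fn sibling_path(leaf_hashes: &[Digest], mut index: usize) -> Vec<Digest> {
    assert!(leaf_hashes.len().is_power_of_two() && index < leaf_hashes.len());
    let mut level = leaf_hashes.to_vec();
    let mut path = Vec::new();
    while level.len() > 1 {
        // The sibling of node `index` is `index ^ 1` (flip the lowest bit).
        path.push(level[index ^ 1]);
        // Hash adjacent pairs to obtain the next level up.
        level = level
            .chunks(2)
            .map(|pair| node_hash(&pair[0], &pair[1]))
            .collect();
        index /= 2;
    }
    path
}
```

With 1024 leaves the path holds 10 hashes, and with 8 leaves it holds 3, which is the base-2 logarithm of the leaf count in both cases.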

use std::borrow::Borrow;
Review comment (Contributor): What is the use case for this data structure? There are many variants of Merkle trees, e.g., key-value, with deletions, etc. Also, what is the expected number of objects, and what is the expected insertion/deletion pattern?

use std::marker::PhantomData;

use fastcrypto::error::FastCryptoError;
use fastcrypto::hash::HashFunction;
use rs_merkle::{Hasher, MerkleProof, MerkleTree as ExternalMerkleTree};

/// A Merkle tree with an arbitrary number of elements of type `T`. [`MerkleTree::prove`] generates
/// a proof that the leaf at a given index has a certain hash value.
///
/// New elements may be added at any time, but once a verifier is created with
/// [`MerkleTree::create_verifier`], it only accepts proofs for the state of the tree at that point.
///
/// To avoid second-preimage attacks, a 0x00 byte is prepended to the hash data for leaf nodes (see
/// `LeafHasher`), and 0x01 is prepended when computing internal node hashes (see
/// `InternalNodeHasher`).
pub struct MerkleTree<const DIGEST_LENGTH: usize, H: HashFunction<DIGEST_LENGTH>, T: AsRef<[u8]>> {
Review comment (Contributor): Why do we need T here? Could we have a generic insert() instead?

Review comment (Contributor): Maybe instead of requiring AsRef, let's require a trait Hashable or just Serialize. (AsRef requires using OnceCell in some cases, which we should avoid unless we have to.)

    tree: ExternalMerkleTree<InternalNodeHasher<DIGEST_LENGTH, H>>,
    _type: PhantomData<T>,
}
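
One possible reading of the review question about dropping `T` is sketched here: the tree stores only leaf hashes, and `insert` itself is generic over anything that can be viewed as bytes. This is a hypothetical, std-only illustration, not the PR's code. `UntypedTree` is an invented name, and std's non-cryptographic `DefaultHasher` with an 8-byte output stands in for a real hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical tree without an element type parameter; only leaf hashes are stored.
#[derive(Default)]
pub struct UntypedTree {
    leaf_hashes: Vec<[u8; 8]>,
}

impl UntypedTree {
    /// Accepts any element viewable as bytes (`Vec<u8>`, `[u8; N]`, `str`, ...)
    /// and returns the index of the new leaf.
    pub fn insert<E: AsRef<[u8]> + ?Sized>(&mut self, element: &E) -> usize {
        // Stand-in hash, NOT cryptographic. The 0x00 leaf prefix mirrors the PR.
        let mut hasher = DefaultHasher::new();
        hasher.write(&[0x00]);
        hasher.write(element.as_ref());
        self.leaf_hashes.push(hasher.finish().to_be_bytes());
        self.leaf_hashes.len() - 1
    }
}
```

The trade-off is that a type parameter on the struct ties tree, proof, and verifier to one element type at compile time, while a generic method lets callers mix element types in a single tree.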

impl<const DIGEST_LENGTH: usize, H: HashFunction<DIGEST_LENGTH>, T: AsRef<[u8]>>
    MerkleTree<DIGEST_LENGTH, H, T>
{
    /// Create a new, empty tree.
    pub fn new() -> Self {
        MerkleTree {
            tree: ExternalMerkleTree::new(),
            _type: PhantomData,
        }
    }
}

impl<const DIGEST_LENGTH: usize, H: HashFunction<DIGEST_LENGTH>, T: AsRef<[u8]>> Default
    for MerkleTree<DIGEST_LENGTH, H, T>
{
    fn default() -> Self {
        Self::new()
    }
}

/// This verifier can verify proofs generated by [MerkleTree::prove].
pub struct MerkleTreeVerifier<
    const DIGEST_LENGTH: usize,
    H: HashFunction<DIGEST_LENGTH>,
    T: AsRef<[u8]>,
> {
    root: [u8; DIGEST_LENGTH],
    number_of_leaves: usize,
    _hash_function: PhantomData<H>,
    _type: PhantomData<T>,
}

impl<const DIGEST_LENGTH: usize, H: HashFunction<DIGEST_LENGTH>, T: AsRef<[u8]>>
    MerkleTreeVerifier<DIGEST_LENGTH, H, T>
{
    /// Verify a [Proof] that an element with the given hash was at this index of this tree at the time
    /// this verifier was created.
    fn verify_with_hash(
        &self,
        index: usize,
        leaf_hash: [u8; DIGEST_LENGTH],
        proof: &Proof<DIGEST_LENGTH, H, T>,
    ) -> bool {
        proof
            .proof
            .verify(self.root, &[index], &[leaf_hash], self.number_of_leaves)
    }

    /// Verify a [Proof] that an element was at this index of this tree at the time this verifier was
    /// created.
    pub fn verify(&self, index: usize, element: &T, proof: &Proof<DIGEST_LENGTH, H, T>) -> bool {
        self.verify_with_hash(
            index,
            LeafHasher::<DIGEST_LENGTH, H>::hash(element.as_ref()),
            proof,
        )
    }
}
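
The verifier above delegates the actual check to rs_merkle. As a rough illustration of the general technique only (not of rs_merkle's internals), the following std-only sketch recomputes the root from a leaf hash and its sibling path. All names (`toy_hash`, `node_hash`, `verify_path`, the 8-byte `Digest`) are hypothetical, the hash is std's non-cryptographic `DefaultHasher`, and the sketch assumes a balanced tree, so it does not need the `number_of_leaves` value that the PR's verifier stores.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Digest = [u8; 8];

/// Stand-in hash built on std's `DefaultHasher`. NOT cryptographic; illustration only.
fn toy_hash(data: &[u8]) -> Digest {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish().to_be_bytes()
}

/// Hash of an internal node: toy_hash(0x01 || left || right).
fn node_hash(left: &Digest, right: &Digest) -> Digest {
    let mut input = vec![0x01u8];
    input.extend_from_slice(left);
    input.extend_from_slice(right);
    toy_hash(&input)
}

/// Recompute the root bottom-up from `leaf_hash` and its siblings and compare it to `root`.
/// Assumption of this sketch: the tree is perfectly balanced.
fn verify_path(root: &Digest, mut index: usize, leaf_hash: Digest, siblings: &[Digest]) -> bool {
    let mut current = leaf_hash;
    for sibling in siblings {
        // An even index means the current node is a left child, an odd index a right child.
        current = if index % 2 == 0 {
            node_hash(&current, sibling)
        } else {
            node_hash(sibling, &current)
        };
        index /= 2;
    }
    // After consuming all siblings the index must have reached the root position.
    index == 0 && current == *root
}
```

Because the index decides on which side each sibling is hashed, a proof for index 1 does not verify when presented for index 0.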

impl<const DIGEST_LENGTH: usize, H: HashFunction<DIGEST_LENGTH>, T: AsRef<[u8]>>
    MerkleTree<DIGEST_LENGTH, H, T>
Review comment (Contributor): Please merge this with the impl of this struct above.

{
    /// Hash an element using the hash function used for this tree.
    pub fn hash(element: &T) -> [u8; DIGEST_LENGTH] {
Review comment (Contributor): When will this be used?

        LeafHasher::<DIGEST_LENGTH, H>::hash(element.as_ref())
    }

    /// Insert a leaf hash into this tree and return the index of the newly inserted leaf.
    fn insert_hash(&mut self, hash: [u8; DIGEST_LENGTH]) -> usize {
        self.tree.insert(hash).commit();
        self.tree.leaves_len() - 1
    }

    /// Insert all hashes in the iterator into this tree. The leaves are given consecutive indices
    /// and the return value is the index of the last leaf.
    fn insert_hashes(&mut self, hashes: impl Iterator<Item = [u8; DIGEST_LENGTH]>) -> usize {
        for hash in hashes {
            self.tree.insert(hash);
        }
        self.tree.commit();
        self.tree.leaves_len() - 1
    }

    /// Insert element in this tree and return the index of the newly inserted element.
    pub fn insert(&mut self, element: &T) -> usize {
        self.insert_hash(Self::hash(element))
    }

    /// Insert all elements in the iterator into this tree. The elements are given consecutive indices
    /// and the return value is the index of the last element.
    pub fn insert_all(&mut self, elements: impl Iterator<Item = impl Borrow<T>>) -> usize {
        self.insert_hashes(elements.map(|element| Self::hash(element.borrow())))
    }

    /// Create a proof for the element at the given index.
    pub fn prove(&self, index: usize) -> Proof<DIGEST_LENGTH, H, T> {
        Proof {
            proof: self.tree.proof(&[index]),
            _type: PhantomData,
        }
    }

    /// Create a [MerkleTreeVerifier] for the current state of this tree.
    pub fn create_verifier(
        &self,
    ) -> Result<MerkleTreeVerifier<DIGEST_LENGTH, H, T>, FastCryptoError> {
        Ok(MerkleTreeVerifier {
            root: self
                .tree
                .root()
                .ok_or_else(|| FastCryptoError::GeneralError("Tree is empty".to_string()))?,
            number_of_leaves: self.tree.leaves_len(),
            _hash_function: PhantomData,
            _type: PhantomData,
        })
    }

    /// Return the number of leaves in this tree.
    pub fn number_of_leaves(&self) -> usize {
        self.tree.leaves_len()
    }

    /// Return the root of this tree.
    pub fn root(&self) -> Result<[u8; DIGEST_LENGTH], FastCryptoError> {
        self.tree
            .root()
            .ok_or_else(|| FastCryptoError::GeneralError("Tree is empty".to_string()))
    }
}
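
One edge case in the insertion methods may be worth a note: they return `leaves_len() - 1`, which would underflow if an empty iterator were inserted into a tree that is still empty (assuming `leaves_len()` reports 0 in that state). A hedged, std-only sketch of an alternative return type follows. `LeafStore` is a hypothetical stand-in for the tree and tracks only the leaf hashes.

```rust
/// Hypothetical stand-in for the tree; only the list of leaf hashes matters here.
#[derive(Default)]
pub struct LeafStore {
    leaf_hashes: Vec<[u8; 32]>,
}

impl LeafStore {
    /// Append all hashes and return the index of the last leaf,
    /// or `None` if the store is still empty afterwards.
    pub fn insert_hashes(&mut self, hashes: impl Iterator<Item = [u8; 32]>) -> Option<usize> {
        self.leaf_hashes.extend(hashes);
        // `checked_sub` yields `None` instead of underflowing when the length is 0.
        self.leaf_hashes.len().checked_sub(1)
    }
}
```

Returning `Option<usize>` pushes the empty case to the caller. Returning a `Result` with the crate's error type would be an equally plausible design.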

/// A proof that a leaf with a given index in a Merkle Tree has a certain hash value.
pub struct Proof<const DIGEST_LENGTH: usize, H: HashFunction<DIGEST_LENGTH>, T: AsRef<[u8]>> {
    proof: MerkleProof<InternalNodeHasher<DIGEST_LENGTH, H>>,
    _type: PhantomData<T>,
}

/// A hash function which, given input `X`, computes `H(PREFIX || X)`.
struct PrefixedHasher<const PREFIX: u8, const DIGEST_LENGTH: usize, H: HashFunction<DIGEST_LENGTH>>
{
    _hasher: PhantomData<H>,
}

impl<const PREFIX: u8, const DIGEST_LENGTH: usize, H: HashFunction<DIGEST_LENGTH>> Clone
    for PrefixedHasher<PREFIX, DIGEST_LENGTH, H>
{
    fn clone(&self) -> Self {
        Self {
            _hasher: PhantomData,
        }
    }
}

impl<const PREFIX: u8, const DIGEST_LENGTH: usize, H: HashFunction<DIGEST_LENGTH>> Hasher
    for PrefixedHasher<PREFIX, DIGEST_LENGTH, H>
{
    type Hash = [u8; DIGEST_LENGTH];

    fn hash(data: &[u8]) -> Self::Hash {
        // Prepend the domain-separation prefix to the data before hashing.
        let mut input = Vec::with_capacity(data.len() + 1);
        input.push(PREFIX);
        input.extend_from_slice(data);
        H::digest(input).digest
    }
}

/// Computes H(0x01 || X)
type InternalNodeHasher<const DIGEST_LENGTH: usize, H> = PrefixedHasher<0x01, DIGEST_LENGTH, H>;

/// Computes H(0x00 || X)
type LeafHasher<const DIGEST_LENGTH: usize, H> = PrefixedHasher<0x00, DIGEST_LENGTH, H>;
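
The prefix-as-const-generic pattern used by `PrefixedHasher` can be tried out without fastcrypto or rs_merkle. The sketch below is a std-only approximation under stated assumptions: the `Digester` trait is a hypothetical stand-in for fastcrypto's `HashFunction`, `Toy` wraps std's non-cryptographic `DefaultHasher`, and digests are 8 bytes long.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::marker::PhantomData;

/// Hypothetical minimal hashing trait standing in for fastcrypto's `HashFunction`.
trait Digester {
    fn digest(data: &[u8]) -> [u8; 8];
}

/// Toy digester backed by std's `DefaultHasher`. NOT cryptographic; illustration only.
struct Toy;

impl Digester for Toy {
    fn digest(data: &[u8]) -> [u8; 8] {
        let mut hasher = DefaultHasher::new();
        hasher.write(data);
        hasher.finish().to_be_bytes()
    }
}

/// Computes `D(PREFIX || data)`; the prefix is fixed at the type level.
#[allow(dead_code)]
struct Prefixed<const PREFIX: u8, D: Digester>(PhantomData<D>);

impl<const PREFIX: u8, D: Digester> Prefixed<PREFIX, D> {
    fn hash(data: &[u8]) -> [u8; 8] {
        let mut input = Vec::with_capacity(data.len() + 1);
        input.push(PREFIX);
        input.extend_from_slice(data);
        D::digest(&input)
    }
}

/// Leaf and internal-node hashers differ only in their prefix byte.
type Leaf = Prefixed<0x00, Toy>;
type Node = Prefixed<0x01, Toy>;
```

Hashing the same bytes through `Leaf` and `Node` gives different digests, which is the property the tree relies on to keep leaves and internal nodes in separate domains.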

#[cfg(test)]
mod tests {
    use crate::merkle_tree::{LeafHasher, MerkleTree, MerkleTreeVerifier, Proof};
    use fastcrypto::hash::{HashFunction, Sha256};
    use rs_merkle::proof_serializers::ReverseHashesOrder;
    use rs_merkle::{Hasher, MerkleProof};
    use std::marker::PhantomData;

    #[test]
    fn test_merkle_tree() {
        let mut tree = MerkleTree::<32, Sha256, Vec<u8>>::new();

        // An empty tree does not have a root
        assert!(tree.root().is_err());
        assert!(tree.create_verifier().is_err());

        let elements = [vec![1u8], vec![2u8], vec![3u8]];
        let index = 1;
        let element = &elements[index];

        // Adding elements should change the number of leaves
        assert_eq!(0, tree.number_of_leaves());
        assert_eq!(elements.len() - 1, tree.insert_all(elements.iter()));
        assert_eq!(elements.len(), tree.number_of_leaves());

        // Generate proof for a given element and verify
        let proof = tree.prove(index);
        let verifier = tree.create_verifier().unwrap();
        assert!(verifier.verify(index, element, &proof));
        assert!(!verifier.verify(index, &elements[index - 1], &proof));

        // Adding another element changes the root and the old proof should no longer verify
        let root = tree.root().unwrap();
        tree.insert(&vec![4u8]);
        let new_root = tree.root().unwrap();
        assert_ne!(root, new_root);
        let new_verifier = tree.create_verifier().unwrap();
        assert!(!new_verifier.verify(index, element, &proof));
    }

    #[test]
    fn test_preimage_attack() {
        let mut tree = MerkleTree::<32, Sha256, Vec<u8>>::new();
        let elements = [vec![0u8], vec![1u8], vec![2u8], vec![3u8]];
        tree.insert_all(elements.iter());

        // Create a proof for the first element in the tree and verify
        let proof = tree.prove(0);
        let verifier = tree.create_verifier().unwrap();
        assert!(verifier.verify(0, &elements[0], &proof));
        assert!(verifier.verify_with_hash(0, LeafHasher::<32, Sha256>::hash(&elements[0]), &proof));

        // Create a modified proof where the parents of the original leaves are treated as leaves
        let modified_proof = Proof {
            proof: MerkleProof::from_bytes(&proof.proof.serialize::<ReverseHashesOrder>()[0..32])
                .unwrap(),
            _type: PhantomData,
        };

        // Compute the leaf hash of this new modified tree. This is equal to the hash of the first
        // internal node, i.e. the parent of the first two leaves:
        let mut hasher = Sha256::default();
        hasher.update([1u8]);
        hasher.update(LeafHasher::<32, Sha256>::hash(&elements[0]));
        hasher.update(LeafHasher::<32, Sha256>::hash(&elements[1]));
        let hash = hasher.finalize().digest;

        // This modified proof should fail against the actual tree because the number of hashes does
        // not match the depth of the tree.
        assert!(!verifier.verify_with_hash(0, hash, &modified_proof));

        // If we modify the verifier to think that the tree only has two elements instead of four, the
        // modified proof verifies. This is avoided if we either a) use the MerkleTreeVerifier struct
        // as created by the tree to verify proofs or b) use verify, where we need to provide the
        // element instead of the hash.
        let modified_verifier = MerkleTreeVerifier::<32, Sha256, Vec<u8>> {
            root: tree.tree.root().unwrap(),
            number_of_leaves: 2,
            _hash_function: Default::default(),
            _type: Default::default(),
        };
        assert!(modified_verifier.verify_with_hash(0, hash, &modified_proof));
    }
}
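
The attack exercised by `test_preimage_attack` can also be illustrated without the library. The std-only sketch below compares a tree without prefixes to one with the 0x00/0x01 prefixes: an attacker presents the concatenated child hashes of each internal node as if they were leaf elements of a smaller tree. All names (`toy_hash`, `tagged_hash`, `merkle_root`, `forged_elements`) are hypothetical, the hash is std's non-cryptographic `DefaultHasher`, and the sketch assumes a power-of-two number of elements.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Digest = [u8; 8];

/// Stand-in hash built on std's `DefaultHasher`. NOT cryptographic; illustration only.
fn toy_hash(data: &[u8]) -> Digest {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish().to_be_bytes()
}

/// Hash `data`, optionally preceded by a one-byte domain-separation prefix.
fn tagged_hash(prefix: Option<u8>, data: &[u8]) -> Digest {
    let mut input = Vec::with_capacity(data.len() + 1);
    if let Some(byte) = prefix {
        input.push(byte);
    }
    input.extend_from_slice(data);
    toy_hash(&input)
}

/// (leaf prefix, node prefix): 0x00/0x01 when separated, none otherwise.
fn tags(separated: bool) -> (Option<u8>, Option<u8>) {
    if separated {
        (Some(0x00), Some(0x01))
    } else {
        (None, None)
    }
}

/// Root over `elements`. Assumes a non-empty, power-of-two number of elements.
fn merkle_root(elements: &[Vec<u8>], separated: bool) -> Digest {
    let (leaf_tag, node_tag) = tags(separated);
    let mut level: Vec<Digest> = elements.iter().map(|e| tagged_hash(leaf_tag, e)).collect();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| tagged_hash(node_tag, &[pair[0], pair[1]].concat()))
            .collect();
    }
    level[0]
}

/// Attacker's "elements": the concatenated leaf hashes of each adjacent pair.
fn forged_elements(elements: &[Vec<u8>], separated: bool) -> Vec<Vec<u8>> {
    let (leaf_tag, _) = tags(separated);
    elements
        .chunks(2)
        .map(|pair| [tagged_hash(leaf_tag, &pair[0]), tagged_hash(leaf_tag, &pair[1])].concat())
        .collect()
}
```

Without prefixes, the two forged elements produce the same root as the four honest ones. With prefixes, a forged element is hashed under 0x00 while the internal node it imitates was hashed under 0x01, so the roots differ.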