Implement persistent commitments #543

Merged 20 commits on Jul 20, 2023
Changes from 4 commits
2 changes: 1 addition & 1 deletion fcomm/src/lib.rs
@@ -555,7 +555,7 @@ impl<F: LurkField + Serialize + DeserializeOwned> LurkPtr<F> {
LurkPtr::ZStorePtr(z_store_ptr) => {
let z_store = &z_store_ptr.z_store;
let z_ptr = z_store_ptr.z_ptr;
s.intern_z_expr_ptr(z_ptr, z_store)
s.intern_z_expr_ptr(&z_ptr, z_store)
.expect("failed to intern z_ptr")
}
}
36 changes: 36 additions & 0 deletions src/cli/commitment.rs
@@ -0,0 +1,36 @@
use anyhow::Result;

use lurk::{field::LurkField, ptr::Ptr, store::Store, z_ptr::ZExprPtr, z_store::ZStore};
use serde::{Deserialize, Serialize};

/// Holds data for commitments.
///
/// WARNING: CONTAINS PRIVATE DATA
Member:
Does that warning differ qualitatively from any other instance of a store?

Member Author:
Yes, this one contains secrets

Member (@huitseeker, Jul 19, 2023):
That I would expect. The idea that other stores don't is more surprising.

In practical terms, consider that the code base you're editing here will be reviewed by a professional cryptography auditor. How does the comment you're offering here help them contextualize how files for Commitment should be considered, as opposed to other store files?

Member Author:
Will address this tomorrow

#[derive(Serialize, Deserialize)]
pub struct Commitment<F: LurkField> {
pub hidden: ZExprPtr<F>,
pub zstore: ZStore<F>,
}

impl<'a, F: LurkField + Serialize + Deserialize<'a>> Commitment<F> {
arthurpaulino marked this conversation as resolved.
#[allow(dead_code)]
pub fn new(secret: F, payload: Ptr<F>, store: &mut Store<F>) -> Result<Self> {
let hidden = store.hide(secret, payload);
let mut zstore = Some(ZStore::<F>::default());
let hidden = store.get_z_expr(&hidden, &mut zstore)?.0;
Ok(Self {
hidden,
zstore: zstore.unwrap(),
})
}

#[cfg(not(target_arch = "wasm32"))]
huitseeker marked this conversation as resolved.
pub fn persist(&self, hash: &str) -> Result<()> {
use super::{field_data::FieldData, paths::commitment_path};
use std::{fs::File, io::BufWriter};

let fd = &FieldData::wrap::<F, Commitment<F>>(self)?;
bincode::serialize_into(BufWriter::new(&File::create(commitment_path(hash))?), fd)?;
Ok(())
}
}
27 changes: 27 additions & 0 deletions src/cli/field_data.rs
@@ -0,0 +1,27 @@
use anyhow::Result;
use serde::{Deserialize, Serialize};

use lurk::field::{LanguageField, LurkField};

/// A wrapper for data whose deserialization depends on a certain LurkField
#[derive(Serialize, Deserialize)]
pub struct FieldData {
pub field: LanguageField,
arthurpaulino marked this conversation as resolved.
data: Vec<u8>,
huitseeker marked this conversation as resolved.
}

#[allow(dead_code)]
impl FieldData {
Member:

Ah, but with this version you are moving the conversion to/from bytes to wrapping time. I suspect you could peg what you want on a

struct Labeled<T: Serialize + DeserializeOwned> {
  label: LanguageField,
  val: T,
}

Member Author:

For this case I do intend to deserialize in two steps. We first read from the file system to learn the field, and then we read from the vector to get the data with the correct field elements.
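The two-step idea above — read a small field tag first, fail fast on a mismatch, and only then decode the payload — can be sketched with a toy byte format. This is a stand-in for the real bincode-encoded `FieldData`, not its actual encoding; the tag values and error strings are illustrative assumptions:

```rust
// Toy stand-in for the two-step scheme: a one-byte field tag followed by
// the payload bytes. The real code uses `bincode` and `LanguageField`.
#[derive(Debug, PartialEq)]
enum FieldTag {
    Pallas,
    Vesta,
}

fn decode_tag(byte: u8) -> Option<FieldTag> {
    match byte {
        0 => Some(FieldTag::Pallas),
        1 => Some(FieldTag::Vesta),
        _ => None,
    }
}

// Step 1: read only the tag and bail early on a mismatch;
// step 2: hand the remaining bytes to the field-specific decoder.
fn extract(bytes: &[u8], expected: FieldTag) -> Result<&[u8], String> {
    let (tag, payload) = bytes.split_first().ok_or("empty input")?;
    let tag = decode_tag(*tag).ok_or("unknown field tag")?;
    if tag != expected {
        return Err(format!("field mismatch: got {tag:?}"));
    }
    Ok(payload)
}

fn main() {
    let bytes = [0u8, 42, 43];
    // The payload is only reached after the tag check succeeds.
    assert_eq!(extract(&bytes, FieldTag::Pallas).unwrap(), &[42, 43]);
    assert!(extract(&bytes, FieldTag::Vesta).is_err());
    println!("ok");
}
```

The point of keeping the payload as an opaque `Vec<u8>` is exactly this early exit: a field mismatch is reported before any field-specific deserialization is attempted.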

Member:

Yes. You can definitely have those two steps, but in sequence within the same function, with the above structure.

Member Author:

I don't get the idea. I don't want to deserialize the vector of bytes if there is some inconsistency in the field. I want to error earlier. In other words, the vector is desirable

Member Author:

For the record, I did try something like you propose in my first attempt but then I got stuck because Rust doesn't have dependent types. I wanted T<F> where F: LurkField

Collaborator:

Hmmm.

As partially unrelated topic, I don't think any strategy that persists important Lurk data using an ad-hoc/bincode serialization of LanguageField is a good idea. It would be A Bad Thing if changes to that enum (which are so likely they are predictable, so predictable we should plan for them) led to supposedly durable data becoming unreadable.

Member Author (@arthurpaulino, Jul 19, 2023):

How can we encode that information, then? Should we create another enum whose stability we try to assure?

Collaborator:

If you want to have a completely general, 'dynamic' serialization, then it's going to require more design.

But is that really what is needed here? I think you just need to know what LanguageField you are currently using. Then everything will be written and read using that LanguageField. Moreover, it follows from our general cryptographic assumptions that any value used as a commitment (or as the hash part of a Lurk expression) cannot be produced by hashing some preimage in another LanguageField.

Therefore, as long as you are looking values up by their hash/digest, then it's fine to completely segregate them by field. So, given that this work is still using a filesystem-based commitment store, you could have a directory structure like:

.lurk/commitments/pallas
.lurk/commitments/vesta
.lurk/commitments/bls12-381
.lurk/commitments/grumpkin
…

If the current LanguageField is pallas, then you can write all commitments to the .lurk/commitments/pallas directory and look all commitments up there too. There is never any possibility that a valid commitment (say you are looking it up) expressed as an element of pallas::Scalar will be stored elsewhere.

Also, since (as above), we also cannot have collisions between fields (assuming indexes are always hashes, and the chosen field/hash combinations still preserve our security assumptions) you could even get the 'dynamic' behavior by searching for a given commitment in all available field directories. To prevent having to search in multiple places, you could use symlinks (for example) to provide a single index (perhaps hierarchically structured to avoid too-large directories, etc.)

Obviously, with a more powerful database management system than 'the file system', a different approach could be taken. But I think the above (especially the simplest version, which is probably all we need initially) should be fine.

The point is: I think you may be trying to solve the wrong problem. Certainly if the goal is a quick PR that decrees a format through code then that is the case.

While we may want to eventually have a format that allows dynamically mixing field types, that will need to be worked out as a careful extension of z_data.
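The field-segregated layout proposed above can be sketched with plain path handling. The `.lurk` base directory and the lowercase field names are assumptions for illustration, not the crate's actual API:

```rust
use std::path::PathBuf;

// Illustrative sketch of field-segregated commitment storage:
// one subdirectory per LanguageField, so commitments from different
// fields can never collide on disk.
fn commitment_path(base: &str, field: &str, hash: &str) -> PathBuf {
    PathBuf::from(base)
        .join("commitments")
        .join(field.to_lowercase()) // e.g. "pallas", "vesta"
        .join(hash)
}

fn main() {
    let p = commitment_path(".lurk", "Pallas", "0xdeadbeef");
    println!("{}", p.display()); // prints ".lurk/commitments/pallas/0xdeadbeef"
}
```

Lookup then uses the current field's directory only; the cryptographic assumption that a commitment in one field cannot be the hash of a preimage in another is what makes this segregation safe.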

Member Author (@arthurpaulino, Jul 19, 2023):

That works for commitments, but lurk verify <proof_id> would need to ask the user for the field, which is annoying. So I thought we might as well use the same infra that's already available, which gives us extra consistency by ensuring that we won't open a commitment generated in one field while we're in another.

Collaborator:

Aren't proofs also always with respect to a known prover/verifier, which is itself necessarily parameterized on curve+field?

Also, why the heck are proofs (apparently) being stored with an id that is the timestamp rather than using content-addressing as previously?


Question: Are you trying to support verifying any Lurk proof, or do we assume that a given REPL session will only verify proofs of statements in the current field?

Either way, if you content-address the proofs, you should be able to use the symlink approach as above. (NOTE: in that model, you will need to check the actual location of resolved symlinks to get that meta-data — but I still think that's not what should be needed here.)


Opinion: Proof IDs should be the content address of the claim — just as they are in fcomm and clutch. This is actually important because it allows for caching of proofs. It's easy to imagine applications for which equivalent proofs are requested more than once (even many, many times). For example, that's how the current Lurk website works (or was designed to): we serve real proofs of expected claims in a way that is cost effective but still accurate.

A real outsourced-but-provable computation service could do the same.
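The content-addressing opinion above amounts to: derive the proof ID by hashing the serialized claim, so equivalent claims always map to the same ID and proofs can be cached. A minimal sketch, using std's non-cryptographic hasher as a stand-in for the real digest (fcomm uses the claim's actual content address, not this hasher):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Proof ID = content address of the claim: hash the serialized claim
// bytes and use the digest as the ID. A real implementation would use a
// cryptographic hash, not DefaultHasher.
fn proof_id(claim_bytes: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    claim_bytes.hash(&mut h);
    format!("{:016x}", h.finish())
}

fn main() {
    let claim = b"(eval (+ 1 2)) => 3"; // hypothetical serialized claim
    // Equivalent claims always map to the same ID, enabling caching:
    // a repeated proof request can be served from the proof store.
    assert_eq!(proof_id(claim), proof_id(claim));
    println!("{}", proof_id(claim));
}
```

Contrast this with a timestamp ID, where two requests for the same claim produce distinct IDs and the cache can never hit.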

#[inline]
pub fn wrap<F: LurkField, T: Serialize>(t: &T) -> Result<Self> {
Ok(Self {
field: F::FIELD,
data: bincode::serialize(t)?,
})
}

#[inline]
pub fn extract<'a, T: Deserialize<'a>>(&'a self) -> Result<T> {
Ok(bincode::deserialize(&self.data)?)
}
}
51 changes: 25 additions & 26 deletions src/cli/lurk_proof.rs
@@ -7,35 +7,21 @@ use lurk::{
lang::{Coproc, Lang},
Status,
},
field::{LanguageField, LurkField},
field::LurkField,
proof::nova,
public_parameters::public_params,
z_ptr::ZExprPtr,
z_store::ZStore,
};

/// A wrapper for data whose deserialization depends on a certain LurkField
#[derive(Serialize, Deserialize)]
pub struct FieldData {
field: LanguageField,
data: Vec<u8>,
}
#[cfg(not(target_arch = "wasm32"))]
use std::{fs::File, io::BufReader, io::BufWriter};

#[allow(dead_code)]
impl FieldData {
#[inline]
pub fn wrap<F: LurkField, T: Serialize>(t: &T) -> Result<Self> {
Ok(Self {
field: F::FIELD,
data: bincode::serialize(t)?,
})
}

#[inline]
pub fn open<'a, T: Deserialize<'a>>(&'a self) -> Result<T> {
Ok(bincode::deserialize(&self.data)?)
}
}
#[cfg(not(target_arch = "wasm32"))]
use super::{
field_data::FieldData,
paths::{proof_meta_path, proof_path},
};

/// Carries extra information to help with visualization, experiments etc
#[derive(Serialize, Deserialize)]
@@ -51,6 +37,15 @@ pub struct LurkProofMeta<F: LurkField> {
pub zstore: ZStore<F>,
}

impl<F: LurkField + Serialize> LurkProofMeta<F> {
#[cfg(not(target_arch = "wasm32"))]
pub fn persist(&self, id: &str) -> Result<()> {
let fd = &FieldData::wrap::<F, LurkProofMeta<F>>(self)?;
bincode::serialize_into(BufWriter::new(&File::create(proof_meta_path(id))?), fd)?;
Ok(())
}
}

type F = pasta_curves::pallas::Scalar; // TODO: generalize this

/// Minimal data structure containing just enough for proof verification
@@ -67,6 +62,13 @@ pub enum LurkProof<'a> {
}

impl<'a> LurkProof<'a> {
#[cfg(not(target_arch = "wasm32"))]
pub fn persist(&self, id: &str) -> Result<()> {
let fd = &FieldData::wrap::<F, LurkProof<'_>>(self)?;
bincode::serialize_into(BufWriter::new(&File::create(proof_path(id))?), fd)?;
Ok(())
}

#[allow(dead_code)]
fn verify(self) -> Result<bool> {
match self {
@@ -96,12 +98,9 @@ impl<'a> LurkProof<'a> {

#[cfg(not(target_arch = "wasm32"))]
pub fn verify_proof(proof_id: &str) -> Result<()> {
use super::paths::proof_path;
use std::{fs::File, io::BufReader};

let file = File::open(proof_path(proof_id))?;
let fd: FieldData = bincode::deserialize_from(BufReader::new(file))?;
let lurk_proof: LurkProof = fd.open()?;
let lurk_proof: LurkProof = fd.extract()?;
Self::print_verification(proof_id, lurk_proof.verify()?);
Ok(())
}
2 changes: 2 additions & 0 deletions src/cli/mod.rs
@@ -1,3 +1,5 @@
mod commitment;
mod field_data;
mod lurk_proof;
mod paths;
mod repl;
24 changes: 17 additions & 7 deletions src/cli/paths.rs
@@ -23,8 +23,13 @@ pub fn proofs_dir() -> PathBuf {
}

#[cfg(not(target_arch = "wasm32"))]
pub fn lurk_leaf_dirs() -> [PathBuf; 1] {
[proofs_dir()]
pub fn commits_dir() -> PathBuf {
lurk_dir().join(Path::new("commits"))
}

#[cfg(not(target_arch = "wasm32"))]
pub fn lurk_leaf_dirs() -> [PathBuf; 2] {
[proofs_dir(), commits_dir()]
}

#[cfg(not(target_arch = "wasm32"))]
@@ -35,6 +40,16 @@ pub fn create_lurk_dirs() -> Result<()> {
Ok(())
}

#[cfg(not(target_arch = "wasm32"))]
pub fn repl_history() -> PathBuf {
lurk_dir().join(Path::new("repl-history"))
}

#[cfg(not(target_arch = "wasm32"))]
pub fn commitment_path(hash: &str) -> PathBuf {
commits_dir().join(Path::new(hash))
}

#[cfg(not(target_arch = "wasm32"))]
pub fn proof_path(name: &str) -> PathBuf {
proofs_dir().join(Path::new(name)).with_extension("proof")
@@ -44,8 +59,3 @@ pub fn proof_path(name: &str) -> PathBuf {
pub fn proof_meta_path(name: &str) -> PathBuf {
proofs_dir().join(Path::new(name)).with_extension("meta")
}

#[cfg(not(target_arch = "wasm32"))]
pub fn repl_history() -> PathBuf {
lurk_dir().join(Path::new("repl-history"))
}
123 changes: 91 additions & 32 deletions src/cli/repl.rs
@@ -37,13 +37,7 @@ use lurk::{
};

#[cfg(not(target_arch = "wasm32"))]
use std::{fs::File, io::BufWriter};

#[cfg(not(target_arch = "wasm32"))]
use super::{
lurk_proof::{LurkProof, LurkProofMeta},
paths::{proof_meta_path, proof_path},
};
use super::lurk_proof::{LurkProof, LurkProofMeta};

#[derive(Completer, Helper, Highlighter, Hinter)]
struct InputValidator {
@@ -166,8 +160,6 @@ impl Repl<F> {

#[cfg(not(target_arch = "wasm32"))]
pub fn prove_last_frames(&mut self) -> Result<()> {
use crate::cli::lurk_proof::FieldData;

match self.evaluation.as_mut() {
None => bail!("No evaluation to prove"),
Some(Evaluation {
@@ -211,37 +203,30 @@ impl Repl<F> {
assert_eq!(self.rc * num_steps, n_frames);
assert!(proof.verify(&pp, num_steps, &public_inputs, &public_outputs)?);

let lurk_proof_wrap = FieldData::wrap::<F, LurkProof<'_>>(&LurkProof::Nova {
let lurk_proof = &LurkProof::Nova {
proof,
public_inputs,
public_outputs,
num_steps,
rc: self.rc,
lang: (*self.lang).clone(),
})?;

let lurk_proof_meta_wrap =
FieldData::wrap::<F, LurkProofMeta<F>>(&LurkProofMeta {
iterations: *iterations,
evaluation_cost: *cost,
generation_cost: generation.duration_since(start).as_nanos(),
compression_cost: compression.duration_since(generation).as_nanos(),
status,
expression,
environment,
result,
zstore: zstore.unwrap(),
})?;
};

let lurk_proof_meta = &LurkProofMeta {
iterations: *iterations,
evaluation_cost: *cost,
generation_cost: generation.duration_since(start).as_nanos(),
compression_cost: compression.duration_since(generation).as_nanos(),
status,
expression,
environment,
result,
zstore: zstore.unwrap(),
};

let id = &format!("{}", timestamp());
bincode::serialize_into(
BufWriter::new(&File::create(proof_path(id))?),
&lurk_proof_wrap,
)?;
bincode::serialize_into(
BufWriter::new(&File::create(proof_meta_path(id))?),
&lurk_proof_meta_wrap,
)?;
lurk_proof.persist(id)?;
lurk_proof_meta.persist(id)?;
println!("Proof ID: \"{id}\"");
Ok(())
}
@@ -250,6 +235,39 @@
}
}

#[cfg(not(target_arch = "wasm32"))]
fn hide(&mut self, secret: F, payload: Ptr<F>) -> Result<()> {
use super::commitment::Commitment;

let commitment = Commitment::new(secret, payload, &mut self.store)?;
let hash = &format!("0x{}", commitment.hidden.value().hex_digits());
commitment.persist(hash)?;
println!("Data: {}\nHash: {hash}", payload.fmt_to_string(&self.store));
Ok(())
}

#[cfg(not(target_arch = "wasm32"))]
fn fetch(&mut self, hash: &str) -> Result<()> {
use super::{commitment::Commitment, field_data::FieldData, paths::commitment_path};
use std::{fs::File, io::BufReader};

let file = File::open(commitment_path(hash))?;
let fd: FieldData = bincode::deserialize_from(BufReader::new(file))?;
if fd.field != F::FIELD {
bail!("Invalid field: {}. Expected {}", &fd.field, &F::FIELD)
} else {
let commitment: Commitment<F> = fd.extract()?;
if format!("0x{}", commitment.hidden.value().hex_digits()) != hash {
bail!("Hash mismatch. Corrupted commitment file.")
} else {
self.store
.intern_z_expr_ptr(&commitment.hidden, &commitment.zstore);
println!("Data for {hash} is now available");
}
}
Ok(())
}

#[inline]
fn eval_expr(&mut self, expr_ptr: Ptr<F>) -> Result<(IO<F>, usize, Vec<Ptr<F>>)> {
Ok(Evaluator::new(expr_ptr, self.env, &mut self.store, self.limit, &self.lang).eval()?)
@@ -416,6 +434,47 @@ impl Repl<F> {
process::exit(1);
}
}
"lurk.commit" => {
#[cfg(not(target_arch = "wasm32"))]
{
let first = self.peek1(cmd, args)?;
let (first_io, ..) = self.eval_expr(first)?;
self.hide(ff::Field::ZERO, first_io.expr)?;
}
}
"lurk.hide" => {
#[cfg(not(target_arch = "wasm32"))]
{
let (first, second) = self.peek2(cmd, args)?;
let (first_io, ..) = self
.eval_expr(first)
.with_context(|| "evaluating first arg")?;
let (second_io, ..) = self
.eval_expr(second)
.with_context(|| "evaluating second arg")?;
let Some(secret) = self.store.fetch_num(&first_io.expr) else {
bail!("Secret must be a number. Got {}", first_io.expr.fmt_to_string(&self.store))
};
self.hide(secret.into_scalar(), second_io.expr)?;
}
}
"fetch" => {
#[cfg(not(target_arch = "wasm32"))]
{
let first = self.peek1(cmd, args)?;
let n = self.store.lurk_sym("num");
let expr = self.store.list(&[n, first]);
let (expr_io, ..) = self
.eval_expr(expr)
.with_context(|| "evaluating first arg")?;
let hash = self
.store
.fetch_num(&expr_io.expr)
.expect("must be a number");
#[cfg(not(target_arch = "wasm32"))]
self.fetch(&format!("0x{}", hash.into_scalar().hex_digits()))?;
}
}
"clear" => self.env = self.store.nil(),
"set-env" => {
// The state's env is set to the result of evaluating the first argument.