Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add serde to Field #506

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

add serde to Field #506

wants to merge 3 commits into from

Conversation

mimoo
Copy link
Contributor

@mimoo mimoo commented Oct 29, 2022

I presume this is no good, but let's start a discussion o.o

It is currently really painful to use arkworks in real-world projects because as soon as you want to serialize to disk or to send something over the wire, you can't use serde. Or at least, you have to resort to very verbose tricks (using serde_with, for example) to be able to serialize/deserialize any struct you have that contains Field.

@mimoo mimoo requested review from a team as code owners October 29, 2022 23:27
@mimoo mimoo requested review from Pratyush, mmagician and weikengchen and removed request for a team October 29, 2022 23:27
@weikengchen
Copy link
Member

This is highly subject to debate.

@mimoo
Copy link
Contributor Author

mimoo commented Oct 29, 2022

BTW how do people deal with this in projects that run in prod? We chose to use serde_with everywhere but I'm thinking of just forking arkworks with this branch and using this atm.

@Pratyush
Copy link
Member

Thanks for the PR! Couple of points

  • Let's make it optional, so that it's not on always
  • Should we directly implement it for all types that impl CanonicalSerialize/Deserialize, instead of one-off impls?

@Pratyush
Copy link
Member

Pratyush commented Nov 1, 2022

Hm I had something more like this in mind:

// in ark-serialize
pub type Compressed<T: CanonicalSerialize + CanonicalDeserialize>(T);
pub type Uncompressed<T: CanonicalSerialize + CanonicalDeserialize>(T);

#[cfg(feature = "serde")]
impl<T: CanonicalSerialize> Serialize for Compressed<T> {
}

#[cfg(feature = "serde")]
impl<T: CanonicalDeserialize> Deserialize for Compressed<T> {
}

#[cfg(feature = "serde")]
impl<T: CanonicalSerialize> Serialize for Uncompressed<T> {
}

#[cfg(feature = "serde")]
impl<T: CanonicalDeserialize> Deserialize for Uncompressed<T> {
}

You can use this in your code as follows:

#[derive(Serialize, Deserialize)]
pub struct Proof {
    pub type: Compressed<Whatever> // or Uncompressed<Whatever>
}

Or like this, at serialization/deserialization boundaries:

#[derive(CanonicalSerialize, CanonicalDeserialize)]
pub struct Proof { ... }

fn broadcast_to_network(p: Proof, r: &mut Read) {
    Compressed(p).serialize_to_json(r); // or whatever is your serde-interfacing code
}

If the latter works for your code, I think it's better because it requires minimal code in arkworks libraries.

@mimoo
Copy link
Contributor Author

mimoo commented Nov 1, 2022

Humm, I'm not sure I understand your example. What is Uncompressed?

Also, if I understand correctly:

  • either I have to implement CanonicalSerialize, CanonicalDeserialize on my types, that's probably annoying
  • or I have to always handle this wrapper type around my Field elements, which could be in arrays, tuples, and other types. That sounds a bit annoying as well : o

As in, if I have the following:

pub struct Proof {
    pub single: Compressed<Field>,
    pub vec: Vec<Compressed<Field>>,
}

then I'd have to do things like proof.vec.into_iter().map(|x| x.0)... right? Although if I implement Deref it might not be that bad...

@Pratyush
Copy link
Member

Pratyush commented Nov 1, 2022

The idea is to define Uncompressed and Compressed as newtype wrappers around T in ark-serialize.

Regarding the two options, you can derive CanonicalSerialize and CanonicalDeserialize easily for your structs; we have derive macros for these traits. Probably that would be the least effort approach

@mimoo
Copy link
Contributor Author

mimoo commented Nov 1, 2022

hummm I see, but what is the point of Uncompressed?

@Pratyush
Copy link
Member

Pratyush commented Nov 1, 2022

Ah Uncompressed and Compressed together implement the different behaviours of compression. The idea is that the Serialize impl of Compressed<T> will call T::serialize_compressed, while the Serialize impl for Uncompressed<T> will call T::serialize_uncompressed. (ditto for the Deserialize impls).

We need these additional wrapper types because a blanket impl<T: CanonicalSerialize> serde::Serialize for T is not possible due to Rust's orphan rules

@mmaker
Copy link
Member

mmaker commented Nov 1, 2022

Do we ever need to send an uncompressed point through the wire?

@Pratyush
Copy link
Member

Pratyush commented Nov 1, 2022

People definitely use it when serializing to disk.

@mimoo
Copy link
Contributor Author

mimoo commented Dec 3, 2022

Sorry for the delayed replayed.

If I understand correctly this means I will be able to write code like this:

#[derive(Serialize, Deserialize)]
#[serde(bound = "T: CanonicalDeserialize + CanonicalSerialize")]
pub struct Proof<T> where T: CanonicalDeserialize + CanonicalSerialize {
    pub single: Compressed<T>,
    pub vec: Vec<Compressed<T>>,
}

It is annoying to have to bind T on the struct directly. Also if I want to implement Clone on that struct then Compressed will have to implement Clone. And this for every traits that I want.

The other problem is that you'll have to dereference with thing.0 every time you want to act on the type. I'm wondering if implementing Deref will be enough in most cases.

EDIT: I just tried implementing this in our codebase and it's becoming quite a nightmare to track :D Ideally I want to just be able to write

#[derive(Serialize, Deserialize)]
pub struct Proof<T> {
    pub single: T,
    pub vec: Vec<T>,
}

and be able to instantiate T with types from arkworks

@Pratyush
Copy link
Member

Pratyush commented Dec 5, 2022

@mimoo The usage I had in mind was more along the following lines:

#[derive(CanonicalSerialize, CanonicalDeserialize)]
pub struct Proof {
    // Your actual proof contents
}

// elsewhere, in your serialization code
let proof = Pickles::prove(stuff);
let compressed = Compressed(proof);
compressed.serialize_json(); // or whatever else serde method you want to use.

// similarly, while deserializing:
let compressed: Compressed<Proof> = deserialize_json();
let proof = compressed_proof.0;
// rest of your code.

I think that should be easier to handle, no?

@mimoo
Copy link
Contributor Author

mimoo commented Dec 5, 2022

oh, but that means we have to implement Canonical(De)Serialize on all of our types instead of serde + arkworks has to implement these traits on Rust primitive types (not sure if this is already done). Let me think about this : o

@Pratyush
Copy link
Member

Pratyush commented Dec 5, 2022

You wouldn't need to impl CanonicalSerialize/CanonicalDeserialize for all your types. For example, a hypothetical Transaction type would look like:

#[derive(Serialize, Deserialize)]
pub struct Transaction {
    state_root: [u8; 32],
    signature: [u8; 64],
    proof: Compressed<Proof>,
}

#[derive(CanonicalSerialize, CanonicalDeserialize)]
pub struct Proof {
    // proof contents
}

That is, you can be as granular as you want, and insert the CanonicalSerialize → Serialize transition wherever you want in the stack.

@mimoo
Copy link
Contributor Author

mimoo commented Dec 6, 2022

I see, another problem I can see is that I'm assuming the derivers don't come with the helpers the serde ones come with. For example #[serde(bound)] or #[serde(skip)] . I guess it might not be super annoying to just use the Compressed<T> wrapper when you need that... I'll experiment.

It doesn't seem like it plays well with unbounded type parameters also:

Screen Shot 2022-12-05 at 5 35 33 PM

@Pratyush
Copy link
Member

Pratyush commented Dec 7, 2022

We can add a ark_serialize::bound attribute. Re: the "not adding bounds by default", we went with that approach to avoid the situation with the std derives which add incorrect bounds.

@burdges
Copy link
Contributor

burdges commented Mar 11, 2023

As you need wrappers anyways, I'd suggest this become a separate ark-serde crate under algebra/serde. It'll look like pratyush's code above, but avoid feature flag confusion and provide more visible documentation.

I'd also suggest the verb-ish form Compress and Uncompress or NoCompress since the wrapped element is not already compressed.

@mimoo
Copy link
Contributor Author

mimoo commented Mar 11, 2023

I'm running into another issue now: I can't nicely JSON serialize some of the types in arkworks. For example, a Radix2EvaluationDomain is just going to look like a bytearray instead of a nice JSON struct. I talk more about this here: o1-labs/proof-systems#1044

I think there's no good solution here besides implementing serde for these types of structs. I still don't understand why we don't do this, and just choose compressed format as the default for elliptic curve points.

PS: at this point I'm considering creating a fork called arkworks_with_serde lol

@burdges
Copy link
Contributor

burdges commented Mar 12, 2023

a Radix2EvaluationDomain is just going to look like a bytearray instead of a nice JSON struct.

Why is this a problem? Users cannot read finite field elements anyways.

I still don't understand why we don't do this, and just choose compressed format as the default for elliptic curve points.

Why do so many serialization formats exist? Why is serde never going to be part of std? It's total chaos.. Inconsistent, incorrect, or inadequately defined behavior, incorrect canonicalization, format changes, semver fights, poor memory usage on HSMs, etc. Are capnproto arenas using unsafe or C code? Voila, secret leakage risk in serde!

It's basically universal that cryptography libraries only ever serialize as bytes, even when they support serde. Afaik even JS crypto libraries and the Web Crypto API obey this convention.

Also, ark-serialize has options for both compression and validation, but you might change validation criteria, like by doing a curve check but not a subgroup check, or something even more subtle. We'd never propagate this through serde correctly.

@mimoo
Copy link
Contributor Author

mimoo commented Mar 13, 2023

Why is this a problem? Users cannot read finite field elements anyways.

there are more than just fields in this struct. In our case we'd like to see a human-readable value for log_size_of_group

Why do so many serialization formats exist? Why is serde never going to be part of std? It's total chaos..

I don't agree, serde is pretty commonly accepted everywhere in the Rust ecosystem, most libraries have a serde feature flag

@burdges
Copy link
Contributor

burdges commented Apr 13, 2023

Just fyi, I wrote wrapper types like this for arkworks-scale which lives at https://github.com/w3f/ark-scale

It's simpler than serde because parity-scale-codec works almost exactly like arkworks' serialization conceptually. It's straightforward to serialize directly from an iterator into the bytes that'll decode as a Vec<T> too, which maybe doable with serde since the visitor goes in sequence: https://github.com/w3f/ring-vrf/blob/eb6c4b24ef0c35a821eb521236277174b054af0e/ark-scale/src/lib.rs#L247 and not sure if this is useful for most people anyways.

@burdges
Copy link
Contributor

burdges commented Aug 21, 2023

@mmaker
Copy link
Member

mmaker commented Sep 20, 2023

Hello there just to say we are at zksummit and still thinking about this issue. Hope you guys are all well

@mmaker
Copy link
Member

mmaker commented Jan 26, 2024

Trying to be constructive, here's what it seems to be needed to move forward that issue:

Add an optional serde feature (ref.) that and instruct serde in such a way that it will be able to serialize/deserialized compressed/uncompressed with/without validation (ref.). Make ser/deser compressed with validation the default, but allow either with serde compile flags (or with wrapper structs, ref.) all other options.

@mimoo want to take a stab at this with me?

@mimoo
Copy link
Contributor Author

mimoo commented Jan 27, 2024

sorry I have no time on my hands atm :o

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants