Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change attestation flow to be pull based #1109

Merged
merged 40 commits into from
Oct 18, 2024

Conversation

HCastano
Copy link
Collaborator

@HCastano HCastano commented Oct 10, 2024

While designing the system for issuing attestation checks at session boundaries I found
myself having a lot of trouble figuring out how to move potential validators and
attestations through various states.

This PR updates the attestation flow in such a way that validators are now expected to
request an attestation before any action that requires one (e.g validate(), session
rotation, endpoint changes, etc.).

When a validator wants to trigger one of these actions the extrinsic will require that a
quote is attched. It will then reach out to the Attestation pallet ((through the
AttestationHandler) to verify its validity.

This design lets us keep the attestation related state (e.g the pending attestations) in
the Attestation pallet, and the staking related state (e.g information about TSS servers)
in the Staking pallet. With this we can remove the messy infrastrucutre of the
ValidationQueue and the associated on_initialize hooks.

There's a few open questions that I want to talk through before finishing up this PR:

  • Do we like this flow? It's going to add an extra extrinsic roundtrip for TSS
    servers/operators, but I think it's worth it since it keeps the pallet code simpler
  • Do we have any strong opinions towards keeping the off-chain worker attestation code? I
    belive we can get rid of it (probably in a follow-up PR) since TSS servers will be
    responsible for keeping track of when they need to request an attestation

For follow ups I want to:

  • Add a /request_attestation endpoint to the TSS so an operator can get a quote back
    which they can use for extrinsics like validate()
    • The internal logic from this should be used for automated calls to
      Attestation::request_attestation() for actions that are kicked off by the chain
      but require a quote, e.g confirm_key_reshare() in the future
  • Add quote guards to actions modifying ServerInfo
  • Add a quote guard to confirm_key_reshare()

Closes #1089.

Since we're verifying the attestation in the extrinsic we can also write to these structures in the
extrinsic again.
We need to match what the TSS is committing to (e.g what we write to storage) on the Staking pallet
end, but we don't always have that information available on the Attestation pallet side. By having
the expected data come in alongside the quote we can more easily check that it does match.
We don't need all this infrastructure anymore since requests managed better by individual pallets.
Not sure if we're gonna need this in their current form anymore. Need to discuss with the others.
We don't want it to get used for the Client's `wasm` builds.
@JesseAbram
Copy link
Member

yes I like this flow but two things. I don't love the double calls. If we understand the nonce and the sending old attestation issue maybe we can have the nonce generated deterministic with like the last block hash or some randomness that is only known at a certain point.

If we are going to do the double call the nonce could be in the event, just food for thought, would be pretty clean

This let's us request quotes in a non-extrinsic way, useful for testing and benchmarking.
This was using the caller, who is a Substrate/Validator account.
We're now using "real" PCKs to get this to pass
This will make it easier for us to control the expected nonce in the context of benchmarking and
testing.
@@ -151,21 +150,70 @@ pub mod pallet {
#[pallet::weight({
<T as Config>::WeightInfo::attest()
})]
pub fn attest(origin: OriginFor<T>, quote: Vec<u8>) -> DispatchResult {
pub fn attest(origin: OriginFor<T>, _quote: Vec<u8>) -> DispatchResult {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this has been changed to do nothing and always pass for the sake of the TSS tests.

I think it makes sense to get rid of it with this new sort of model, but I would want to do that in a follow up.

We can also change the implementation a bit to actually check attestations if that's something that's desired. Let me know your thoughts.

@HCastano HCastano marked this pull request as ready for review October 16, 2024 20:53
Copy link
Contributor

@ameba23 ameba23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see how this greatly simplifies and improves things for the validate case. But i feel like that case is a bit special because it is initiated by the node operator wanting to join, not in response to an on-chain event.

The change_endpoint and change_threshold_accounts endpoints are no problem because they are also initiated by the node operator so will work great with this flow.

But lets think what the flow would be for doing an attestation at the point of doing a reshare (or at a session boundary). Note that the quote cannot be submitted in the confirm_key_reshare extrinsic as you suggest above, because at that point the reshare protocol has already run and the TS server already has a parent keyshare.

I am guessing something like this (and correct me if you imagine it differently):

  • TS server's /reshare http endpoint is called by the offchain worker. In the handler for that endpoint:
    • The TS server submits a request_attestation extrinsic.
    • The TS server queries PendingAttestations to get the nonce.
    • The TS server creates a quote and submits it using some extrinsic which isn't yet implemented, lets call it submit_quote_before_reshare.
  • In the handler for submit_quote_before_reshare:
    • The quote gets validated.
    • If quotes from all other reshare parties have also already been validated, the reshare is marked as ready, and an offchain worker will make another http request to the TS server, which actually does the reshare protocol, just as the /reshare endpoint does now.

It looks to me like this works, but is a little bit inefficient, as a round trip could be saved by including the nonce in the initial http request.

I expect you've thought this through and have a better idea of how this would be done - but i would like to see an example, even if its just a description of how it would work, before we merge this.

All this would be made simpler if we already had a clear plan of exactly at what points we want / need to do attestations, which unfortunately we don't yet. But i think if we have a good attestation flow for both initially joining the validator set, and becoming a signer, then i am happy.

pallets/staking/Cargo.toml Outdated Show resolved Hide resolved
@HCastano
Copy link
Collaborator Author

@ameba23

But lets think what the flow would be for doing an attestation at the point of doing a reshare (or at a session boundary). Note that the quote cannot be submitted in the confirm_key_reshare extrinsic as you suggest above, because at that point the reshare protocol has already run and the TS server already has a parent keyshare.

Yeah, you're right here in that the key material would have already been shared with the signer. Our only recourse at this point is to slash the NextSigner that submitted the wrong proof - which we can do since they're bonded by this point.

I am guessing something like this (and correct me if you imagine it differently):

* TS server's `/reshare` http endpoint is called by the offchain worker. In the handler for that endpoint:
  
  * The TS server submits a `request_attestation` extrinsic.
  * The TS server queries `PendingAttestations` to get the nonce.
  * The TS server creates a quote and submits it using some extrinsic which isn't yet implemented, lets call it `submit_quote_before_reshare`.

* In the handler for `submit_quote_before_reshare`:
  
  * The quote gets validated.
  * If quotes from all other reshare parties have also already been validated, the reshare is marked as ready, and an offchain worker will make another http request to the TS server, which actually does the reshare protocol, just as the `/reshare` endpoint does now.

If we don't want to go through confirm_key_reshare() this is the reasonable flow to take. One thing here is the nonce gets emitted as an event from request_attestation(), so technically that's one less chain query (but not like it makes a huge difference here).

It looks to me like this works, but is a little bit inefficient, as a round trip could be saved by including the nonce in the initial http request.

Yeah, I don't really know if we have any requirements for the nonce other than just being fresh, so we could use the latest blockhash or something here as the nonce.

I expect you've thought this through and have a better idea of how this would be done - but i would like to see an example, even if its just a description of how it would work, before we merge this.

I was just gonna go with the confirm_key_reshare() approach, but now I'm thinking about how else to do this. Main thing is that I don't want to intermingle state from the Staking Extension and Attestation pallets since that gets messy and is complicated to reason about.

All this would be made simpler if we already had a clear plan of exactly at what points we want / need to do attestations, which unfortunately we don't yet. But i think if we have a good attestation flow for both initially joining the validator set, and becoming a signer, then i am happy.

Imo it's basically at two points:

  1. Anything that changes the state of a validator: their status as a candidate (e.g validate()), their ServerInfo (e.g change_endpoint()), etc.
  2. Session boundaries, which in a way is a special case of their candidacy status

We did also talk about having the chain randomly issue attestation challenges during the session, as well as having TSS servers issue them during signing rounds, but both of these seem like overkill to me.

Copy link
Contributor

@ameba23 ameba23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok great, following the discussion yesterday and your comments i think we should merge this. We can remove the ignored tests, the empty function and the stuff where we add pending attestations in the genesis config for those tests.

Then as a next PR we can add a quote to the change_endpoint / change_threshold_accounts extrinsics, and use that to have a test which demonstrates the whole process of requesting and making an attestation on the TSS side. This will be useful for me as i can then use that to make and verify real quotes on our TDX machine - by changing the endpoint to the same as what it was before.

Finally we can come to a decision as to whether we want attestation as part of the reshare process. Personally I think we should and we should not worry about extra round trips there because we are not making anybody wait. I am gonna make an issue to discuss this.

@HCastano
Copy link
Collaborator Author

@ameba23 I'm going to remove the other attestation code in a follow up PR just to make it more obvious that these things were removed.

@HCastano HCastano merged commit 13910ee into master Oct 18, 2024
8 checks passed
@HCastano HCastano deleted the hc/change-attestation-to-be-pull-based branch October 18, 2024 20:19
@HCastano HCastano added the Bump `spec_version` A change which affects the current runtime behaviour label Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bump `spec_version` A change which affects the current runtime behaviour
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consolidate ValidationQueue and PendingAttestations
3 participants