Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Access] Add endpoints to execution nodes to support tx result err msgs #1398

Conversation

AndriiDiachuk
Copy link
Contributor

Description

onflow/flow-go#4754

This pull request introduces a new GRPC API call which gets the error messages of all failed transactions in the block ordered by transaction index.

Copy link
Contributor

@peterargue peterargue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @AndriiDiachuk!

I think this is a good start, and there's a little more needed for the usecase. I think we will need 2 new endpoints:

GetTransactionErrorMessage -> Takes a transactionID and blockID, and returns the error message string for that tx.
GetTransactionErrorMessagesByBlockID -> similar to what you have. takes a blockID, and returns a set of objects, one for each tx with an error.

I think the GetTransactionErrorMessagesByBlockID response should contain a repeated message that includes the transaction's ID, index and error string.

@@ -50,6 +50,11 @@ service ExecutionAPI {
rpc GetTransactionResultsByBlockID(GetTransactionsByBlockIDRequest)
returns (GetTransactionResultsResponse);

// GetTransactionErrorMessagesByBlockID gets the error messages of all failed transactions in the
// block ordered by transaction index
rpc GetTransactionErrorMessagesByBlockID(GetTransactionsByBlockIDRequest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's create a separate request object for this GetTransactionErrorMessagesByBlockIDRequest

@@ -141,6 +146,10 @@ message GetTransactionResultsResponse {
entities.EventEncodingVersion event_encoding_version = 2;
}

message GetTransactionErrorMessagesResponse {
repeated string error_messages = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll need a way to differentiate between the messages for different errors.

@AndriiDiachuk
Copy link
Contributor Author

AndriiDiachuk commented Nov 3, 2023

Thanks for the PR @AndriiDiachuk!

I think this is a good start, and there's a little more needed for the usecase. I think we will need 2 new endpoints:

GetTransactionErrorMessage -> Takes a transactionID and blockID, and returns the error message string for that tx. GetTransactionErrorMessagesByBlockID -> similar to what you have. takes a blockID, and returns a set of objects, one for each tx with an error.

I think the GetTransactionErrorMessagesByBlockID response should contain a repeated message that includes the transaction's ID, index and error string.

Thanks for response @peterargue. Originally, I wanted to implement a few methods like you have suggested but after taking a closer look on current implementation I have noticed that we sync execution data using execution data sync module which syncs error messages only block by block basis. So I thought that we should mirror this logic and sync error messages for all failed transactions in the block. Regarding the response, I've omitted txid and index since we can derive that data by looking at the transactions in block and filtering failed transactions.

For example, if we consider block txns: [A(success), B(failure), C(success), D(failure), E(failure)].
We will get the following response: [err_msg_B, err_msg_D, err_msg_E],
and we can derive txid and index by applying filter func F(res:LightTransactionResult) bool { return res.Failed }.
Main goal is to reduce the size of the response.

If you are worried about consistency of mapping error messages to the actual light execution results we can add a checksum of involved tx ids or a mechanism similar to signer indices(a bitset of tx indexes that are returned). If you think we can neglect with extra space I can always include full tx ids and indexes.

…chuk/4754-add-endpoints-to-execution-nodes-to-suppport-tx-result-err-msgs
@peterargue
Copy link
Contributor

Thanks for response @peterargue. Originally, I wanted to implement a few methods like you have suggested but after taking a closer look on current implementation I have noticed that we sync execution data using execution data sync module which syncs error messages only block by block basis. So I thought that we should mirror this logic and sync error messages for all failed transactions in the block. Regarding the response, I've omitted txid and index since we can derive that data by looking at the transactions in block and filtering failed transactions.

For example, if we consider block txns: [A(success), B(failure), C(success), D(failure), E(failure)]. We will get the following response: [err_msg_B, err_msg_D, err_msg_E], and we can derive txid and index by applying filter func F(res:LightTransactionResult) bool { return res.Failed }. Main goal is to reduce the size of the response.

If you are worried about consistency of mapping error messages to the actual light execution results we can add a checksum of involved tx ids or a mechanism similar to signer indices(a bitset of tx indexes that are returned). If you think we can neglect with extra space I can always include full tx ids and indexes.

So my thinking on this is:

  • Not every Access node/observer will need all tx error messages for every block. The public nodes, nodes that are used with blockchain indexers and other special usecases may, but the majority will only care about a small subset.
  • While it may be simplest in an initial version to have every node sync all error messages for every block, this may not scale well. Even if this is how the initial implementation works on ANs, let's still add the endpoint to get the error from a specific tx. I'd rather get the work done now, than have to rescope it later since doing both is minimally more work than just one.
  • I see your point about the response size. I worry that if there's a bug or execution fork on an EN, it could returns results for a different set of tx, resulting in inaccurate information cached and returned from the AN. While it does add a little overhead (~40 bytes per tx), the overhead is generally much less than the size of the error messages themselves and allows the AN to validate it got errors for all of the expected tx.

@AndriiDiachuk
Copy link
Contributor Author

Thanks for response @peterargue. Originally, I wanted to implement a few methods like you have suggested but after taking a closer look on current implementation I have noticed that we sync execution data using execution data sync module which syncs error messages only block by block basis. So I thought that we should mirror this logic and sync error messages for all failed transactions in the block. Regarding the response, I've omitted txid and index since we can derive that data by looking at the transactions in block and filtering failed transactions.
For example, if we consider block txns: [A(success), B(failure), C(success), D(failure), E(failure)]. We will get the following response: [err_msg_B, err_msg_D, err_msg_E], and we can derive txid and index by applying filter func F(res:LightTransactionResult) bool { return res.Failed }. Main goal is to reduce the size of the response.
If you are worried about consistency of mapping error messages to the actual light execution results we can add a checksum of involved tx ids or a mechanism similar to signer indices(a bitset of tx indexes that are returned). If you think we can neglect with extra space I can always include full tx ids and indexes.

So my thinking on this is:

  • Not every Access node/observer will need all tx error messages for every block. The public nodes, nodes that are used with blockchain indexers and other special usecases may, but the majority will only care about a small subset.
  • While it may be simplest in an initial version to have every node sync all error messages for every block, this may not scale well. Even if this is how the initial implementation works on ANs, let's still add the endpoint to get the error from a specific tx. I'd rather get the work done now, than have to rescope it later since doing both is minimally more work than just one.
  • I see your point about the response size. I worry that if there's a bug or execution fork on an EN, it could returns results for a different set of tx, resulting in inaccurate information cached and returned from the AN. While it does add a little overhead (~40 bytes per tx), the overhead is generally much less than the size of the error messages themselves and allows the AN to validate it got errors for all of the expected tx.

@peterargue Made some changes with your suggestions. Waiting for you feedback. Thanks in advance.

@peterargue peterargue merged commit 19ae56e into onflow:master Nov 17, 2023
1 check passed
@AndriiDiachuk AndriiDiachuk deleted the AndriiDiachuk/4754-add-endpoints-to-execution-nodes-to-suppport-tx-result-err-msgs branch November 21, 2023 12:38
@peterargue peterargue changed the title Andrii diachuk/4754 add endpoints to execution nodes to suppport tx result err msgs [Access] Add endpoints to execution nodes to support tx result err msgs Nov 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants