Skip to content
This repository has been archived by the owner on Apr 9, 2024. It is now read-only.

Refactor Internal Endorser #189

Closed

Conversation

mariaschett
Copy link
Contributor

@mariaschett mariaschett commented Dec 21, 2022

Fixes #165 .

This PR merges the types.ValidatedProvenance and common.ProvenanceIR into types.ProvenanceIR.

Benefits:

  • only one internal, central representation of provenances
  • reduces code duplication, in particular binary digest & binary name

Downside:
To avoid cyclic dependencies, we have a bit of a round-about.

  1. To determine whether we have a SLSAv02 or Amber provenance, we need to first parse the intoto.statement with types.ParseStatementData. This only sets the field provenanceStatement in a ProvenanceIR p.
  2. To determine the other fields in ProvenanceIR we need to call common.SetProvenanceValues on p (this has to live in common to avoid cyclic dependencies).

@@ -92,6 +93,13 @@ func NewProvenanceIR(binarySHA256Digest string, options ...func(p *ProvenanceIR)
return provenance
}

// WithBinaryName adds a binary name when creating a new ProvenanceIR.
func WithBinaryName(binaryName string) func(p *ProvenanceIR) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we want to verify this, but I could be wrong. Was there anything about this when you did the requirements analysis?

Copy link
Contributor Author

@mariaschett mariaschett Dec 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not at all, I removed it!

@mariaschett mariaschett force-pushed the 165-generalize-endorser branch 2 times, most recently from b579e0e to 4251b13 Compare December 30, 2022 11:03
@mariaschett mariaschett marked this pull request as ready for review December 30, 2022 12:34

if diff := cmp.Diff(got, want, cmp.AllowUnexported(ProvenanceIR{})); diff != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had intentionally used cmp.Diff. This is the standard pattern that we should use everywhere (see #188). Please revert this change, and rewrite the test using cmp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... I didn't know where to get a good in-toto test statement for generating a NewProvenanceIR(provenanceStatement *intoto.Statement---so this was the quickest fix I could think of.

I see two option:

  1. I think I could use ParseStatementData to get the *intoto.Statement from the provenanceExample path, but that seemed a bit round-about. Or am I missing a simpler solution to get a test intoto.Statement?

2.make sure the cmp.Diff does not compare the provenanceStatement, but I am not sure if that is possible.

Do you think that any of those two much better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, what exactly is this test case testing? The name still says TestFromProvenance_Amber, but FromProvenance is replaced with SetProvenanceData, which is not even called in this test.

I think fixing this test will be easier once SetProvenanceData is fixed (ref my other comments on SetXYZData).

predType := prov.PredicateType()
// SetProvenanceData sets, depending on the predicate of the underlying provenance intoto.Statement,
// the data for verifying the provenance.
func SetProvenanceData(provenanceIR *types.ProvenanceIR) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the immutable style.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, do you think it needs to be changed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and after a closer look at the code I think it is doable.

// ProvenanceIR wraps an intoto.Statement representing a valid SLSA provenance statement. A provenance statement is valid if it contains a single subject, with a SHA256 hash.
// ProvenanceIR also holds internal intermediate representations of data from provenances. We want to map different provenances of different build types to ProvenanceIR, so
// all fields except for `provenanceStatement` are optional.
type ProvenanceIR struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why this is moved from common to here. In particular, ProvenanceIR is an internal representation, and belongs to internal not pkg, which by convention provides the public interface.

Please move the content of this file to common, and leave only ParseStatementData in this file.

internal/verifier/verifier.go Show resolved Hide resolved
return nil, fmt.Errorf("could not parse the provenance bytes: %v", err)
}

if err := SetAmberProvenanceData(provenanceIR); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general we want ParseProvenanceFile and ParseStatementData (or perhaps ParseProvenanceData considering the new change) to be only different in that one uses a file as input and the other uses a byte array as input.

I suggest changing SetAmberProvenanceData to take a byte array as input, first calling types.ParseStatementData(statementBytes) and then the rest. I also suggest renaming it to ParseProvenanceData. Then in ParseProvenanceFile as before, you only need to call one function (ParseProvenanceData) and no other logic.

Copy link
Contributor Author

@mariaschett mariaschett Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure SetAmberProvenanceData to take bytes as input will work with the dependencies.

I was quite struggling to get the dependencies right, but I might be missing something.

I suggest I'll first move the definition of ProvenanceIR from types to common as in the comment below and then revisit this comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well yes, you don't want to introduce a dependency to internal/common in the files and packages in pkg. But if all these functions are tightly related to ProvenanceIR they should go into common too.

But having a function ParseProvenanceFile in this file returning a data structure in pkg (e.g., intoto.Statement) gives a more convenient API.

Currently types.ValidatedProvenance is the glue between internal and pkg. The only thing that types.ValidatedProvenance adds to intoto.Statement is a validity check. If we only care about this validity in contexts where ProvenanceIR is used, then we can move the validation logic to a function that creates an instance of ProvenanceIR. Then we can remove types.ValidatedProvenance, and replace all its occurrences with intoto.Statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently types.ValidatedProvenance is the glue between internal and pkg.

Yes, you are right, and I think types.ValidatedProvenance being the glue is the key.

I just tried to move ProvenanceIR to internal.common, which brings me in a full circle of where I started: needing both: ProvenanceIR and ValidatedProvenance.

Now after all this, I think we should keep both: ValidatedProvenance and ProvenanceIR.

The main reason is:

We have to go from bytes to ValidatedProvenance to ProvenanceIR. The last step, from ValidatedProvenace to ProvenanceIR depends on ValidatedProvenance.

So I do not think it a good idea to merge the two types, if we want to keep the flexibility, especially as

replace all its occurrences with intoto.Statement

Yes, that is what we would need to do, but I feel we are loosing information and control over the type in this case.

To summarize: I think we should keep ValidatedProvenance and ProvenanceIR.

However, I am not sure about the guarantees the two types give (and related their names).

Should ProvenanceIR hold the binary digest/name? More generally: do we, internally, only want to work on ProvenanceIR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize: I think we should keep ValidatedProvenance and ProvenanceIR.

Sounds good.

However, I am not sure about the guarantees the two types give (and related their names).

Should ProvenanceIR hold the binary digest/name? More generally: do we, internally, only want to work on ProvenanceIR?

I think it would be nice to do that. That would require converting ValidatedProvenance to ProvenanceIR as early as possible. Having separate fields in ProvenanceIR for digest and name makes sense to me, although IIRC, we won't have a reference value for name.

// The field is private so that invalid instances cannot be created.
provenance intoto.Statement
provenanceStatement *intoto.Statement
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we still need to keep the provenance statement? It feels redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For GetBinarySHA256Digest and GetBinaryName.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update after discussion: We need the intoto.Statement ih the ReproducibleProvenanceVerifier for LoadBuildConfig

@@ -200,3 +201,20 @@ func GetMaterialsGitURI(pred ProvenancePredicate) []string {
}
return gitURIs
}

// SetSLSAv02ProvenanceData sets data to verify a SLSA v02 provenance in the given ProvenanceIR.
func SetSLSAv02ProvenanceData(provenanceIR *types.ProvenanceIR) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I suggest renaming this to ParseProvenanceData, and changing it to take a byte array as input.

func SetAmberProvenanceData(provenanceIR *types.ProvenanceIR) error {
buildType := AmberBuildTypeV1

predicate, err := slsav02.ParseSLSAv02Predicate(provenanceIR.GetProvenance().Predicate)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you make the other changes I suggested, then here you could call slsav02.ParseProvenanceData and reuse more of the common functionality between slsav02 and amber.

Copy link
Contributor Author

@mariaschett mariaschett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, I answered most comments.

Meta question: Are you happy with the change?

It is mainly a refactoring to get rid of the two definitions ProvenanceIR and ValidatedProvenance, but it is not strictly necessary.

I can drop it, if you prefer!

predType := prov.PredicateType()
// SetProvenanceData sets, depending on the predicate of the underlying provenance intoto.Statement,
// the data for verifying the provenance.
func SetProvenanceData(provenanceIR *types.ProvenanceIR) error {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, do you think it needs to be changed?


if diff := cmp.Diff(got, want, cmp.AllowUnexported(ProvenanceIR{})); diff != "" {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... I didn't know where to get a good in-toto test statement for generating a NewProvenanceIR(provenanceStatement *intoto.Statement---so this was the quickest fix I could think of.

I see two option:

  1. I think I could use ParseStatementData to get the *intoto.Statement from the provenanceExample path, but that seemed a bit round-about. Or am I missing a simpler solution to get a test intoto.Statement?

2.make sure the cmp.Diff does not compare the provenanceStatement, but I am not sure if that is possible.

Do you think that any of those two much better?

internal/verifier/verifier.go Show resolved Hide resolved
// The field is private so that invalid instances cannot be created.
provenance intoto.Statement
provenanceStatement *intoto.Statement
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For GetBinarySHA256Digest and GetBinaryName.

return nil, fmt.Errorf("could not parse the provenance bytes: %v", err)
}

if err := SetAmberProvenanceData(provenanceIR); err != nil {
Copy link
Contributor Author

@mariaschett mariaschett Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure SetAmberProvenanceData to take bytes as input will work with the dependencies.

I was quite struggling to get the dependencies right, but I might be missing something.

I suggest I'll first move the definition of ProvenanceIR from types to common as in the comment below and then revisit this comment.

predType := prov.PredicateType()
// SetProvenanceData sets, depending on the predicate of the underlying provenance intoto.Statement,
// the data for verifying the provenance.
func SetProvenanceData(provenanceIR *types.ProvenanceIR) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and after a closer look at the code I think it is doable.


if diff := cmp.Diff(got, want, cmp.AllowUnexported(ProvenanceIR{})); diff != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, what exactly is this test case testing? The name still says TestFromProvenance_Amber, but FromProvenance is replaced with SetProvenanceData, which is not even called in this test.

I think fixing this test will be easier once SetProvenanceData is fixed (ref my other comments on SetXYZData).

internal/verifier/verifier.go Show resolved Hide resolved
return nil, fmt.Errorf("could not parse the provenance bytes: %v", err)
}

if err := SetAmberProvenanceData(provenanceIR); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well yes, you don't want to introduce a dependency to internal/common in the files and packages in pkg. But if all these functions are tightly related to ProvenanceIR they should go into common too.

But having a function ParseProvenanceFile in this file returning a data structure in pkg (e.g., intoto.Statement) gives a more convenient API.

Currently types.ValidatedProvenance is the glue between internal and pkg. The only thing that types.ValidatedProvenance adds to intoto.Statement is a validity check. If we only care about this validity in contexts where ProvenanceIR is used, then we can move the validation logic to a function that creates an instance of ProvenanceIR. Then we can remove types.ValidatedProvenance, and replace all its occurrences with intoto.Statement.

testutil.AssertEq(t, "builder image digest", gotBuilderImageDigest, "9e2ba52487d945504d250de186cb4fe2e3ba023ed2921dd6ac8b97ed43e76af9")

gotRepoURIs := got.GetRepoURIs()
testutil.AssertEq(t, "repo URI", gotRepoURIs[0], "https://github.com/project-oak/transparent-release")
}

func TestFromProvenance_Slsav1(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the name of the test.

@mariaschett
Copy link
Contributor Author

Close in favour of the fresh PR #210 adding discussed changes.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generalize endorser to work on ProvenanceIR
2 participants