This repository has been archived by the owner on Nov 23, 2023. It is now read-only.

Bug: collection.json with identical STAC id doesn't update root catalog. #2018

Open
17 tasks
billgeo opened this issue Sep 13, 2022 · 1 comment
Labels
bug Something isn't working needs refinement Needs to be discussed by the team

Comments

billgeo commented Sep 13, 2022

Bug Description

Submitting data to a dataset whose collection.json or catalog.json has the same STAC ID as another dataset results in the dataset being imported to S3, but the 'update root catalog' function then fails because a catalog can't have two children with the same ID.

Need to check what happens when submitting a partial dataset version update, like adding one item.json to a dataset.

Tasks

  • Look at options for resolving this bug. Possible options are:

  • drop the static catalogue

  • Restrict the supplied files so that they can only be collection.json

  • If a collection.json or catalog.json is being updated, store the STAC IDs in DynamoDB, mapped to the dataset, for every dataset import

  • Check that STAC ID doesn't exist in a different dataset

  • If it isn't unique send a useful message back to the user
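
The last three tasks could be sketched roughly as below. This is only an illustration, not the Geostore implementation: the names `register_stac_id` and `DuplicateStacIdError` are hypothetical, and a plain dictionary stands in for the DynamoDB table that would map each STAC ID to its dataset.

```python
from typing import Dict


class DuplicateStacIdError(Exception):
    """Raised when a STAC ID already belongs to a different dataset."""


def register_stac_id(registry: Dict[str, str], stac_id: str, dataset_id: str) -> None:
    """Record that `stac_id` belongs to `dataset_id`, or fail if it is taken.

    `registry` is a hypothetical in-memory stand-in for a DynamoDB table
    keyed by STAC ID. Re-importing into the same dataset is allowed.
    """
    owner = registry.get(stac_id)
    if owner is not None and owner != dataset_id:
        # Message intended to be passed back to the user.
        raise DuplicateStacIdError(
            f"STAC ID '{stac_id}' is already used by dataset '{owner}'; "
            f"use a unique ID before importing into '{dataset_id}'."
        )
    registry[stac_id] = dataset_id


registry: Dict[str, str] = {}
register_stac_id(registry, "example-collection", "dataset-1")
register_stac_id(registry, "example-collection", "dataset-1")  # same dataset: fine
try:
    register_stac_id(registry, "example-collection", "dataset-2")
except DuplicateStacIdError as error:
    print(error)
```

Running the check before any other validation or file copy would mean a rejected import leaves both S3 and the root catalog untouched.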

How to Reproduce

  1. Create a dataset and import a dataset version
  2. Create a 2nd dataset and import a dataset version with the same staging data as the 1st version

What did you expect to happen?

  • The Step Function notices that the STAC ID is exactly the same as that of another dataset, stops the process, and returns a message to the user before any other validation, file copy, etc.
  • The root catalog is not updated

What actually happened?

  • Dataset version was processed successfully
  • Root catalog had a new version that was identical to the previous version, and it had no child link to the newly created dataset

Software Context

Operating system: AWS Console Lambda 'test' tool

Environment: Nonprod

Relevant software versions:

  • AWS CLI:
  • Poetry:

Additional context

Definition of Done

  • This bug is done:
    • Bug resolved to user's satisfaction
    • Automated tests are passing
    • Code is peer reviewed and pushed to master
    • Deployed successfully to test environment
    • Checked against CODING guidelines
    • Relevant new tasks are added to backlog and communicated to the team
    • Important decisions recorded in the issue ticket
    • Readme/Changelog/Diagrams are updated
    • Product Owner has approved as complete
    • No regression to functional or non-functional requirements
@billgeo billgeo added bug Something isn't working Epic This is a Zenhub label and can be ignored labels Sep 13, 2022
@billgeo billgeo changed the title from "Bug: collection.json with identical" to "Bug: collection.json with identical id" Sep 13, 2022
@billgeo billgeo removed the Epic This is a Zenhub label and can be ignored label Sep 13, 2022

billgeo commented Sep 13, 2022

Two options:

  1. Rewrite the STAC ID in the Geostore to match the random ID string created by the Geostore dataset function, so it is always unique
  2. Check that the IDs are unique across datasets; if they are not, exit the Step Function and return a message to the user (could use pystac or put all the STAC IDs in DynamoDB)
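
As a rough illustration of the first option (a sketch only; `with_dataset_id` and the sample IDs are hypothetical, and the STAC document is treated as a plain parsed JSON dictionary):

```python
import copy
from typing import Any, Dict


def with_dataset_id(stac_object: Dict[str, Any], dataset_id: str) -> Dict[str, Any]:
    """Return a copy of a collection/catalog whose `id` is the dataset ID.

    The supplied document is left unmodified, so the staged file can still
    be compared against its checksum.
    """
    updated = copy.deepcopy(stac_object)
    updated["id"] = dataset_id
    return updated


supplied = {"type": "Collection", "id": "duplicated-id", "links": []}
stored = with_dataset_id(supplied, "hypothetical-generated-dataset-id")
print(stored["id"])
```

One caveat with this approach: STAC items refer to their parent collection by ID in the `collection` field, so any items in the dataset would presumably need the same rewrite to stay consistent.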

@billgeo billgeo changed the title from "Bug: collection.json with identical id" to "Bug: collection.json with identical STAC id doesn't update root catalog." Sep 14, 2022