Berkeley schema ingest #1295
Merged
naglepuff force-pushed the `berkeley-schema-ingest` branch from `1e075ce` to `c9e6e62` on July 31, 2024 19:09
naglepuff force-pushed the `berkeley-schema-migration` branch from `7da694b` to `481461a` on August 1, 2024 18:38
naglepuff commented Aug 6, 2024 (six review comments)
jeffbaumes approved these changes Aug 6, 2024
marySalvi approved these changes Aug 6, 2024
New slots were added to the NMDC schema that break our current usage of the function.
The NMDC schema changed the slot that links a Biosample to a study. Previously, the relationship was stored in the `part_of` slot on class `Biosample`; that slot has been renamed to `associated_studies`. Also, temporarily disable the backup link through omics_processing, as that relationship has changed in a more complex way.
This isn't necessary since `associated_studies` is required and has a cardinality of 1..*.
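To illustrate the slot rename described in these comments, here is a minimal sketch of reading study links from a biosample document. It is not the code from this PR: the function name `study_ids_for_biosample` and the example IDs are hypothetical, and the document is assumed to be a plain dict as it might come back from the source mongo database.

```python
from typing import Any


def study_ids_for_biosample(biosample: dict[str, Any]) -> list[str]:
    """Return the study IDs linked to a biosample document (hypothetical helper)."""
    # Berkeley-schema documents carry the link in `associated_studies`;
    # the older `part_of` slot is deliberately not consulted here.
    studies = biosample.get("associated_studies")
    if not studies:
        # The slot is required with cardinality 1..*, so an empty or missing
        # value means the document does not conform to the newer schema.
        raise ValueError(f"Biosample {biosample.get('id')} has no associated_studies")
    return list(studies)


# Example IDs are made up for illustration.
example = {"id": "nmdc:bsm-00-000001", "associated_studies": ["nmdc:sty-00-000001"]}
print(study_ids_for_biosample(example))
```

Because the slot is required, the sketch fails loudly instead of falling back to another lookup path, which matches the comment that a fallback isn't necessary.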
These records were formerly known as omics_processing records. Some small schema tweaks needed to be reflected in ingest. Note how obtaining input biosamples is simpler, since we now only need to query one collection of processes. This results from the change in database structure that puts related objects into the same collection.
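A rough sketch of the "one collection of processes" idea follows. It is an assumption-heavy illustration, not the PR's implementation: the function name, the `has_input`/`has_output` field usage, the in-memory index standing in for the process collection, and the `nmdc:bsm-` ID prefix check are all assumed for the example.

```python
from typing import Any, Callable


def input_biosample_ids(
    start_inputs: list[str],
    processes_by_output: dict[str, dict[str, Any]],
    is_biosample: Callable[[str], bool] = lambda id_: id_.startswith("nmdc:bsm-"),
) -> set[str]:
    """Walk process inputs backwards until biosample IDs are reached."""
    found: set[str] = set()
    seen: set[str] = set()
    pending = list(start_inputs)
    while pending:
        current = pending.pop()
        if current in seen:
            continue  # guard against cycles or shared intermediates
        seen.add(current)
        if is_biosample(current):
            found.add(current)
            continue
        # Every intermediate is looked up in the same index, standing in
        # for a single collection of processes.
        producer = processes_by_output.get(current)
        if producer:
            pending.extend(producer.get("has_input", []))
    return found


# Made-up process records; IDs are illustrative only.
processes = [
    {"id": "proc-1", "has_input": ["nmdc:bsm-00-000001"], "has_output": ["nmdc:procsm-00-000001"]},
    {"id": "proc-2", "has_input": ["nmdc:procsm-00-000001"], "has_output": ["nmdc:procsm-00-000002"]},
]
index = {out: proc for proc in processes for out in proc["has_output"]}
print(sorted(input_biosample_ids(["nmdc:procsm-00-000002"], index)))
```

With a single lookup table, the traversal needs no per-process-type branching, which is the simplification the comment points at.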
Note that this should be updated in the SQL schema to be optional/nullable.
Note that in the future we might want to make instrument a fully fledged model in our database.
naglepuff force-pushed the `berkeley-schema-migration` branch from `481461a` to `f4645eb` on August 7, 2024 19:50
naglepuff force-pushed the `berkeley-schema-ingest` branch from `89a22e3` to `d117b81` on August 7, 2024 19:50
Fix #1291
Breaking changes
This breaks the ingest process for any `nmdc-schema` version <11.0.0. For deployments of the NMDC Data Portal using this code base and ingest, ingest must be pointed at a database that conforms to the "Berkeley Schema".
Changes
This set of changes updates ingest to accept data from a source mongo database running version >=11.0.0 of the NMDC Schema (Berkeley). It does not attempt to update the data portal in any other way. From an end-user perspective, the changes here should have no impact: users should see the same data, presented in the same way as data ingested from an older version of the mongo database. It also does not attempt to rename endpoints, files, classes, functions, or variables to match the new schema (e.g. the term "omics processing" will still exist in our code).
Some specific changes include:
- [...] (`part_of` -> `associated_studies`)
- [...] `type` field.
- [...] `instrument_id` field from a Data Generation object and do a lookup in the `instrument_set` collection.
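The instrument lookup mentioned in the last item could look roughly like the sketch below. This is an illustration under assumptions, not the PR's code: the helper name `instrument_name_for`, the `name` field on instrument documents, and the dict keyed by instrument ID (standing in for the `instrument_set` collection) are hypothetical.

```python
from typing import Any, Optional


def instrument_name_for(
    data_generation: dict[str, Any],
    instruments_by_id: dict[str, dict[str, Any]],
) -> Optional[str]:
    """Resolve a Data Generation's instrument reference to a display name."""
    instrument_id = data_generation.get("instrument_id")
    if instrument_id is None:
        return None  # no instrument recorded on this record
    # `instruments_by_id` stands in for documents from `instrument_set`.
    instrument = instruments_by_id.get(instrument_id)
    if instrument is None:
        return None  # dangling reference; callers decide how to report it
    return instrument.get("name")


# Made-up documents for illustration.
instrument_set = {"nmdc:inst-00-000001": {"id": "nmdc:inst-00-000001", "name": "Example sequencer"}}
record = {"id": "nmdc:dgns-00-000001", "instrument_id": "nmdc:inst-00-000001"}
print(instrument_name_for(record, instrument_set))
```

Returning `None` for a missing or unresolved reference keeps the sketch compatible with a nullable instrument column on the SQL side.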