Commit
Merge remote-tracking branch 'origin/main' into 1148-add-a-data-structure-that-specifies-a-migration-collections-to-be-migrated-source-schema-and-destination-schema
eecavanna committed Oct 4, 2023
2 parents cc6aa15 + 03bcd93 commit 963cbd3
Showing 29 changed files with 28,998 additions and 30,852 deletions.
56 changes: 56 additions & 0 deletions RELEASE_NOTES_v7.7.2_to_v7.8.0.md → CHANGELOG.md
@@ -1,3 +1,59 @@
# Changelog

All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) beginning with the 8.1.0 version. Previous content was copied over without reformatting.

Versioning for this project is based on [Semantic Versioning](https://semver.org/spec/v2.0.0.html) with the following guidelines:

* Patch versions: changes that have no effect on the data format described by the schema. For example: updates to slot descriptions, adding examples.
* Minor versions: changes that have backwards-compatible effects on the data format. For example: adding a new, non-required slot to a class, broadening the range of a slot.
* Major versions: changes that have backwards-incompatible effects on the data format. These changes will require existing data to be migrated in order to be compatible with the new version. For example: adding a required slot to a class, changing a slot from multivalued to single-valued.

## Unreleased

## 8.1.0 - 2023-10-03

### Fixed

- Remove incorrect description of `lat_lon` slot
- Remove non-monotonic range override on `used` slot of `MetaproteomicsAnalysisActivity` class.

## 8.0.0 - 2023-09-21

### Overhaul of the definition and usage of CURIe prefixes.
- Most notably, the `default_curi_maps` assertions have been
removed from all schema source files, like `src/schema/nmdc.yaml`. All prefixes that will be used in the schema
(`slot_uri`s, `mappings`, `id_prefixes`, etc.) must be defined in either `nmdc.yaml` or another source file that it
`imports`. There is now a single file that contains all prefix definitions across the merged schema:
`project/jsonld/nmdc.context.jsonld`. Note that it uses two patterns for expanding prefixes. Both are accessed from the
`@context` outer key.
- direct: `"EFO": "http://www.ebi.ac.uk/efo/"`
- via an `@id` inner key: `"ENVO": { "@id": "http://purl.obolibrary.org/obo/ENVO_"}`.
Other keys in these dictionaries can usually be ignored.
- The `GOLD` prefix is no longer allowed in the schema or any schema-compliant data. Only `gold` is allowed now.
- A discussion of prefixes, CURIes, identifiers and mappings has been added: `src/docs/prefixes_curies_ids_mappings_etc.md`
- https://bioregistry.io is now consistently preferred over http://identifiers.org as a CURIe resolving service.
The version in this release is a draft, and community members are welcome to ask questions or make suggestions.
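
The two prefix-expansion patterns described above can be handled uniformly when resolving a CURIe. The following is a minimal sketch, not code from this repository: the function names are hypothetical, the context is a hand-written excerpt rather than the real `nmdc.context.jsonld`, and the extra `@prefix` key is only there to illustrate an "other key" that gets ignored.

```python
from typing import Optional

# Hypothetical excerpt shaped like the `@context` key described above.
# Only the two prefix URIs are taken from the changelog; the `@prefix`
# entry is an assumed example of an extra key that can be ignored.
CONTEXT_DOC = {
    "@context": {
        "EFO": "http://www.ebi.ac.uk/efo/",
        "ENVO": {"@id": "http://purl.obolibrary.org/obo/ENVO_", "@prefix": True},
    }
}


def prefix_expansion(context: dict, prefix: str) -> Optional[str]:
    """Return the URI base for a prefix, or None if it is not defined."""
    entry = context.get(prefix)
    if isinstance(entry, str):
        # Direct pattern: the value is the expansion itself.
        return entry
    if isinstance(entry, dict):
        # Inner-key pattern: the expansion sits under `@id`.
        return entry.get("@id")
    return None


def expand_curie(context: dict, curie: str) -> str:
    """Expand `prefix:local_id` into a full URI using the context."""
    prefix, _, local_id = curie.partition(":")
    base = prefix_expansion(context, prefix)
    if base is None:
        raise KeyError(f"prefix {prefix!r} is not defined in the context")
    return base + local_id


context = CONTEXT_DOC["@context"]
print(expand_curie(context, "EFO:0000001"))
print(expand_curie(context, "ENVO:00002001"))
```

Raising on an undefined prefix mirrors the rule that every prefix must be declared; in this sketch a capitalized `GOLD:` CURIe fails simply because the excerpt does not define it.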

### New data migration code:
- `Extraction`s must replace usages of the `sample_mass` slot with `input_mass`
- replacement of `GOLD` prefixes with `gold` prefixes in three classes
- updates to `src/data` files: `/valid` shows the post-migration state.
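
The kind of change these migrations make can be sketched on plain dictionaries. This is an illustration only, not the migration code shipped in the release: the function names, the sample identifier, and the shape of the mass value are assumptions, and the changelog does not say which three classes or slots carry `GOLD` prefixes.

```python
import copy


def migrate_extraction(doc: dict) -> dict:
    """Return a copy of an Extraction-like dict with `sample_mass` renamed to `input_mass`."""
    migrated = copy.deepcopy(doc)
    if "sample_mass" in migrated:
        migrated["input_mass"] = migrated.pop("sample_mass")
    return migrated


def lowercase_gold_prefix(value):
    """Rewrite a leading `GOLD:` prefix to `gold:`; leave any other value unchanged."""
    if isinstance(value, str) and value.startswith("GOLD:"):
        return "gold:" + value[len("GOLD:"):]
    return value


# Hypothetical pre-migration record; the id and value structure are made up.
old_extraction = {
    "id": "nmdc:extrp-99-abc123",
    "sample_mass": {"has_numeric_value": 0.25, "has_unit": "g"},
}

new_extraction = migrate_extraction(old_extraction)
print(sorted(new_extraction))                # ['id', 'input_mass']
print(lowercase_gold_prefix("GOLD:Gb0001"))  # gold:Gb0001
```

Working on a deep copy keeps the pre-migration record intact, which makes it easy to compare the two states side by side.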

### Other
- SPARQL queries have been added or updated in `assets/sparql`
- see also `nmdc_schema/class_sparql.py`
- example Python QC code, using LinkML SchemaView, has been added
- `nmdc_schema/list_id_prefixes_and_patterns.py`
- `nmdc_schema/list_slot_usages.py`
- `nmdc_schema/list_structured_patterns.py`
- many definitional attributes of slots have been moved out of per-class `slot_usage` blocks,
especially in `src/schema/workflow_execution_activity.yaml`. Likewise, all classes should now assert their slots with a
`slots` list, not implicitly via the `slot_usage` blocks. `required: true`, customized `range`s and customized `description`s
can still be found in `slot_usage` blocks. Note that some of these are non-monotonic and need further attention.
An example would be when the global definition of a slot uses an enumeration `range` but a `slot_usage` uses a `string` `range`.
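
A check for `slot_usage` range overrides like the one in that example could look roughly like the sketch below. It works on plain dictionaries shaped like a merged schema rather than on LinkML SchemaView, and every name in it (`range_overrides`, `example_slot`, `ExampleEnum`, `ExampleClass`) is hypothetical. It only lists overrides as candidates for review: deciding whether one is actually non-monotonic still requires checking whether the local range is a specialization of the global one.

```python
def range_overrides(slots: dict, classes: dict) -> list:
    """List (class, slot, global_range, local_range) where slot_usage changes the range."""
    findings = []
    for class_name, class_def in classes.items():
        for slot_name, usage in class_def.get("slot_usage", {}).items():
            global_range = slots.get(slot_name, {}).get("range")
            local_range = usage.get("range")
            # Only report usages that set a range differing from the global one.
            if local_range is not None and local_range != global_range:
                findings.append((class_name, slot_name, global_range, local_range))
    return findings


# Hypothetical schema fragments: a globally enumerated slot that one
# class loosens to a plain string in its slot_usage block.
slots = {"example_slot": {"range": "ExampleEnum"}}
classes = {
    "ExampleClass": {
        "slots": ["example_slot"],
        "slot_usage": {"example_slot": {"range": "string", "required": True}},
    },
    "OtherClass": {
        "slots": ["example_slot"],
        "slot_usage": {"example_slot": {"required": True}},
    },
}

for finding in range_overrides(slots, classes):
    print(finding)
```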

## 7.8.0 - 2023-08-30

- More aggressive `.gitignore` for cleaner merges
- for releases: `git add -f examples nmdc_schema project src`
- Refactored Makefile and project.Makefile
32 changes: 0 additions & 32 deletions RELEASE_NOTES_v7.8.0_to_v8.0.0.md

This file was deleted.

@@ -217,9 +217,9 @@ nmdc:bsm-99-dtTMNb a nmdc:Biosample ;
MIXS:0000647 [ a nmdc:QuantityValue ;
nmdc:has_raw_value "MIxS does not provide an example" ] ;
MIXS:0000652 [ a nmdc:TextValue ;
nmdc:has_raw_value "arsenic;0.09 micrograms per gram" ],
nmdc:has_raw_value "mercury;0.09 micrograms per gram" ],
[ a nmdc:TextValue ;
nmdc:has_raw_value "mercury;0.09 micrograms per gram" ] ;
nmdc:has_raw_value "arsenic;0.09 micrograms per gram" ] ;
MIXS:0000735 [ a nmdc:QuantityValue ;
nmdc:has_raw_value "0.2 micrometer" ] ;
MIXS:0000736 [ a nmdc:QuantityValue ;
26 changes: 13 additions & 13 deletions examples/output/Database-functional-annotations.ttl
@@ -2,33 +2,33 @@

[] a nmdc:Database ;
nmdc:functional_annotation_set [ a nmdc:FunctionalAnnotation ;
nmdc:has_function "CATH:1.10.10.200" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "MetaNetX:MNXR101574" ],
nmdc:has_function "EC:1.1.1.1" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "KEGG_PATHWAY:rsk00410" ],
nmdc:has_function "EGGNOG:veNOG12876" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "KEGG.ORTHOLOGY:K00001" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "EGGNOG:veNOG12876" ],
nmdc:has_function "RHEA:12345" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "PANTHER.FAMILY:PTHR12345" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "MetaCyc:RXN-14904" ],
nmdc:has_function "CATH:1.10.10.200" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "PFAM:PF11779" ],
nmdc:has_function "SUPFAM:SSF57615" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "TIGRFAM:TIGR00010" ],
nmdc:has_function "SEED:Biotin_biosynthesis" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "RHEA:12345" ],
nmdc:has_function "KEGG.REACTION:R00100" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "SEED:Biotin_biosynthesis" ],
nmdc:has_function "MetaCyc:RXN-14904" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "SUPFAM:SSF57615" ],
nmdc:has_function "PFAM:PF11779" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "GO:0032571" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "EC:1.1.1.1" ],
nmdc:has_function "TIGRFAM:TIGR00010" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "KEGG_PATHWAY:rsk00410" ],
[ a nmdc:FunctionalAnnotation ;
nmdc:has_function "KEGG.REACTION:R00100" ] .
nmdc:has_function "MetaNetX:MNXR101574" ] .

42 changes: 21 additions & 21 deletions examples/output/Database-mags-activities.ttl
@@ -17,17 +17,6 @@ nmdc:wfmag-99-5MiDJM a nmdc:MagsAnalysisActivity ;
nmdc:input_contig_num 169782 ;
nmdc:low_depth_contig_num 0 ;
nmdc:mags_list [ a nmdc:MagBin ;
nmdc:bin_name "bins.3" ;
nmdc:bin_quality "LQ" ;
nmdc:completeness "2.0"^^xsd:float ;
nmdc:contamination "0.0"^^xsd:float ;
nmdc:gene_count 294 ;
nmdc:num_16s 0 ;
nmdc:num_23s 0 ;
nmdc:num_5s 0 ;
nmdc:num_t_rna 1 ;
nmdc:number_of_contig 11 ],
[ a nmdc:MagBin ;
nmdc:bin_name "bins.2" ;
nmdc:bin_quality "LQ" ;
nmdc:completeness "51.25"^^xsd:float ;
@@ -38,6 +27,17 @@ nmdc:wfmag-99-5MiDJM a nmdc:MagsAnalysisActivity ;
nmdc:num_5s 1 ;
nmdc:num_t_rna 26 ;
nmdc:number_of_contig 426 ],
[ a nmdc:MagBin ;
nmdc:bin_name "bins.3" ;
nmdc:bin_quality "LQ" ;
nmdc:completeness "2.0"^^xsd:float ;
nmdc:contamination "0.0"^^xsd:float ;
nmdc:gene_count 294 ;
nmdc:num_16s 0 ;
nmdc:num_23s 0 ;
nmdc:num_5s 0 ;
nmdc:num_t_rna 1 ;
nmdc:number_of_contig 11 ],
[ a nmdc:MagBin ;
nmdc:bin_name "bins.1" ;
nmdc:bin_quality "LQ" ;
@@ -71,16 +71,16 @@ nmdc:wfmag-99-VOgM5i a nmdc:MagsAnalysisActivity ;
nmdc:input_contig_num 78376 ;
nmdc:low_depth_contig_num 0 ;
nmdc:mags_list [ a nmdc:MagBin ;
nmdc:bin_name "bins.3" ;
nmdc:bin_name "bins.2" ;
nmdc:bin_quality "LQ" ;
nmdc:completeness "17.61"^^xsd:float ;
nmdc:completeness "0.0"^^xsd:float ;
nmdc:contamination "0.0"^^xsd:float ;
nmdc:gene_count 313 ;
nmdc:gene_count 383 ;
nmdc:num_16s 0 ;
nmdc:num_23s 0 ;
nmdc:num_5s 0 ;
nmdc:num_t_rna 7 ;
nmdc:number_of_contig 58 ],
nmdc:num_t_rna 5 ;
nmdc:number_of_contig 74 ],
[ a nmdc:MagBin ;
nmdc:bin_name "bins.1" ;
nmdc:bin_quality "LQ" ;
@@ -93,16 +93,16 @@ nmdc:wfmag-99-VOgM5i a nmdc:MagsAnalysisActivity ;
nmdc:num_t_rna 4 ;
nmdc:number_of_contig 74 ],
[ a nmdc:MagBin ;
nmdc:bin_name "bins.2" ;
nmdc:bin_name "bins.3" ;
nmdc:bin_quality "LQ" ;
nmdc:completeness "0.0"^^xsd:float ;
nmdc:completeness "17.61"^^xsd:float ;
nmdc:contamination "0.0"^^xsd:float ;
nmdc:gene_count 383 ;
nmdc:gene_count 313 ;
nmdc:num_16s 0 ;
nmdc:num_23s 0 ;
nmdc:num_5s 0 ;
nmdc:num_t_rna 5 ;
nmdc:number_of_contig 74 ] ;
nmdc:num_t_rna 7 ;
nmdc:number_of_contig 58 ] ;
nmdc:name "MAGs activiity 1781_86089" ;
nmdc:started_at_time "2021-01-10T00:00:00+00:00" ;
nmdc:too_short_contig_num 75364 ;