relations for collections missing #54

Open
acwittmann opened this issue Nov 24, 2023 · 2 comments

@acwittmann

Hi Robert,
would it be possible to also get the relations (related literature) for collections (previously parents)?
When I do e.g.
ds = pg.PanDataSet(839065) # Is child dataset
ds.relations
[{'id': 'ref66016',
'title': 'Late Holocene primary productivity and sea surface temperature variations in the northeastern Arabian Sea: Implications for winter monsoon variability',
'uri': 'https://doi.org/10.1002/2013PA002579',
'type': 'Related to'}]

However:
dn = pg.PanDataSet('10.1594/PANGAEA.839067') # Is parent record.
dn.relations
[]

Thanks!
Astrid

@fspreck-indiscale
Contributor

This doesn't seem to be caused by collections, cf https://doi.pangaea.de/10.1594/PANGAEA.957900:

from pangaeapy.pandataset import PanDataSet
ds = PanDataSet("https://doi.pangaea.de/10.1594/PANGAEA.957900")
print(ds.relations)

prints

[{'id': 'ref118074',
  'title': 'Regional and global impact of CO2 uptake in the Benguela Upwelling System through preformed nutrients',
  'uri': 'https://doi.org/10.1038/s41467-023-38208-y',
  'type': 'Supplement to'},
 {'id': 'ref118076',
  'title': 'DSHIP Landsystem',
  'uri': 'http://dship.bsh.de/',
  'type': 'Source'},
 {'id': 'ref115066',
  'title': 'Methods of Seawater Analysis: Third, Completely Revised and Extended Edition',
  'uri': 'https://doi.org/10.1002/9783527613984',
  'type': 'References'}]

https://doi.pangaea.de/10.1594/PANGAEA.839067 doesn't have a relation; instead it has a "supplement to" entry in its citation. So in your example, while dn.relations is empty, dn.supplement_to should show

{'id': 'ref66016',
 'title': 'Late Holocene primary productivity and sea surface temperature variations in the northeastern Arabian Sea: Implications for winter monsoon variability',
 'uri': 'https://doi.org/10.1002/2013PA002579',
 'year': '2014'}

You are the experts on this, but is this probably due to a data model change in PANGAEA? Maybe old datasets have the "supplement to" information in their citations and new ones have it in their relations? In my example, ds.supplement_to is empty. This is of course a bit confusing when using pangaeapy to extract the "supplement to" information (and probably for any other tool relying on the panmd representation). @huberrob Do you think it makes sense to set PanDataSet.supplement_to to the relation with "type": "Supplement to" if there is one and supplement_to would otherwise be empty?
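
To make the proposed fallback concrete, here is a minimal sketch. It is not pangaeapy's implementation; it only works on plain dicts and lists shaped like the outputs shown above. The helper name resolve_supplement_to is hypothetical, and returning an empty dict when nothing is found is an assumption.

```python
from typing import Any, Dict, List, Optional


def resolve_supplement_to(
    supplement_to: Optional[Dict[str, Any]],
    relations: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: prefer an existing supplement_to value and
    otherwise fall back to the first relation of type "Supplement to"."""
    # Keep whatever was already extracted from the citation.
    if supplement_to:
        return supplement_to
    # Otherwise look for a matching entry in the relations list.
    for relation in relations:
        if relation.get("type") == "Supplement to":
            # Drop the "type" key so the result resembles the
            # supplement_to dict shown above (assumption: the other
            # keys can be reused as they are).
            return {k: v for k, v in relation.items() if k != "type"}
    # Assumption: "nothing found" is represented by an empty dict.
    return {}


# Relations as printed for PANGAEA.957900 above (titles shortened here).
relations = [
    {"id": "ref118074", "title": "Regional and global impact of CO2 uptake ...",
     "uri": "https://doi.org/10.1038/s41467-023-38208-y", "type": "Supplement to"},
    {"id": "ref118076", "title": "DSHIP Landsystem",
     "uri": "http://dship.bsh.de/", "type": "Source"},
]
print(resolve_supplement_to({}, relations))
```

With this behaviour, a dataset like PANGAEA.839067, where supplement_to is already filled from the citation, would be left unchanged, and only datasets with an empty supplement_to would pick up the value from their relations.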

@huberrob
Contributor

huberrob commented Dec 8, 2023

Yes, this is not a collection issue. Such information is most probably not available for all datasets/collections during the curation process and is thus missing.

And @fspreck-indiscale I agree that supplements should be added to the relations dict as you propose.
