Translation of thesaurus.xml files to RDF representations (using SKOS)

vliz-be-opsci/gbif-thes2rdf

Intro

This little project provides an automated strategy to build RDF serialisations for the various thesaurus-xml files provided at https://rs.gbif.org/vocabulary/

These GBIF vocabularies introduce decent and stable URIs for curated term lists in use in the domain. Currently they are made available only in a GBIF-specific thesaurus XML format.

Scope

To provide a solution that automatically converts thesaurus XML into ttl and jsonld representations (targeting SKOS).
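
To make the scope concrete, here is a minimal sketch of such a mapping using only the Python standard library. It is not the code in ./bin/gbif_thes2rdf.py: the sample XML, the namespace URIs, the choice of SKOS properties and the helper names `literal` and `thesaurus_to_turtle` are all assumptions made for illustration and should be checked against a real thesaurus file.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces of the thesaurus XML; verify against a real file.
NS = {"t": "http://rs.gbif.org/thesaurus/"}
DC = "{http://purl.org/dc/terms/}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, hand-written sample (not an actual GBIF vocabulary).
SAMPLE = """<thesaurus xmlns="http://rs.gbif.org/thesaurus/"
    xmlns:dc="http://purl.org/dc/terms/"
    dc:URI="http://example.org/vocabulary/rank">
  <concept dc:URI="http://example.org/vocabulary/rank/kingdom">
    <preferred><term dc:title="kingdom" xml:lang="en"/></preferred>
    <alternative><term dc:title="regnum" xml:lang="la"/></alternative>
  </concept>
</thesaurus>"""


def literal(term):
    """Render a <term> element as a Turtle literal with an optional language tag."""
    text = term.get(DC + "title", "").replace("\\", "\\\\").replace('"', '\\"')
    lang = term.get(XML_LANG)
    return f'"{text}"@{lang}' if lang else f'"{text}"'


def thesaurus_to_turtle(xml_text):
    """Map one thesaurus document to a skos:ConceptScheme with skos:Concepts."""
    root = ET.fromstring(xml_text)
    scheme = root.get(DC + "URI")
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{scheme}> a skos:ConceptScheme .",
    ]
    for concept in root.findall("t:concept", NS):
        uri = concept.get(DC + "URI")
        props = [f"skos:inScheme <{scheme}>"]
        props += [f"skos:prefLabel {literal(t)}"
                  for t in concept.findall("t:preferred/t:term", NS)]
        props += [f"skos:altLabel {literal(t)}"
                  for t in concept.findall("t:alternative/t:term", NS)]
        lines.append(f"<{uri}> a skos:Concept ;")
        lines.append("    " + " ;\n    ".join(props) + " .")
    return "\n".join(lines) + "\n"


print(thesaurus_to_turtle(SAMPLE))
```

Writing Turtle by hand keeps the sketch dependency-free; a real implementation would more likely build an RDF graph with a dedicated library and let it serialise both ttl and jsonld.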

Usage

Python dependency management for this project relies on python-poetry.

Make sure a local ./data folder holds the thesaurus XML files to be converted. (These can be nested in subdirs: **/*.xml)
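
As an illustration of what the `**/*.xml` pattern picks up, the sketch below collects XML files recursively with `pathlib`. The helper name `find_thesaurus_files` and the temporary folder layout are hypothetical; the actual script may discover its inputs differently.

```python
from pathlib import Path
import tempfile


def find_thesaurus_files(data_dir):
    """Return all *.xml files under data_dir, including nested subdirs, sorted."""
    return sorted(Path(data_dir).rglob("*.xml"))


# Demonstrate on a throwaway folder so the example does not depend on ./data.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "gbif").mkdir()
    (root / "rank.xml").write_text("<thesaurus/>")
    (root / "gbif" / "sex.xml").write_text("<thesaurus/>")
    (root / "notes.txt").write_text("not a thesaurus")  # ignored: wrong suffix
    found = [p.relative_to(root).as_posix() for p in find_thesaurus_files(root)]
    print(found)
```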

Then simply run

$ poetry install          # only once, to install the Python package dependencies locally in a virtual environment
$ poetry shell            # to activate the virtual environment where said dependencies are available
$ ./bin/gbif_thes2rdf.py  # to actually run the conversion -- can be run multiple times, will overwrite ttl and jsonld

The resulting **/*.ttl (text/turtle) and **/*.jsonld (application/ld+json) files will be placed next to their XML source.
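
The naming rule can be sketched as follows: each output keeps the path and stem of its source and only swaps the suffix. The `OUTPUT_FORMATS` table and the `output_paths` helper are hypothetical names used for illustration, and the example path is made up.

```python
from pathlib import Path

# Suffix -> media type, as stated in the text above.
OUTPUT_FORMATS = {".ttl": "text/turtle", ".jsonld": "application/ld+json"}


def output_paths(xml_path):
    """Map each media type to the output file that sits next to xml_path."""
    src = Path(xml_path)
    return {mime: src.with_suffix(ext) for ext, mime in OUTPUT_FORMATS.items()}


for mime, path in output_paths("data/gbif/rank.xml").items():
    print(mime, "->", path.as_posix())
```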

Ref

  • see some selection of thesauri to consider. Note: the locally provided ./bin/get_thes_from_urls.sh script will download them to a local ./data folder

TODO

Have dialogue with gbif people and/or site admins on

  1. how to integrate this in the rs.gbif.org workflow for these vocabs (they appear to be managed in this github repo) and
  2. how to publish them over there via Content Negotiation
