DATAFAQS environment variables
We'll discuss each shell environment variable, how it is used, and how it should be set to effect changes in how DataFAQs is deployed or behaves.
The df- scripts used to manage the FAqT Brick are guided by shell environment variables. This page lists the variables that can influence the df- scripts. Running df-vars.sh will show all variables that can affect the operations, along with their current values and their defaults if not set.
```
CSV2RDF4LOD_HOME                         /opt/csv2rdf4lod-automation
DATAFAQS_HOME                            /opt/DataFAQs
DATAFAQS_BASE_URI                        !!! -- MUST BE SET -- !!! source datafaqs-source-me.sh
DATAFAQS_LOG_DIR                         (not required)
DATAFAQS_PUBLISH_METADATA_GRAPH_NAME     (will default to: http://www.w3.org/ns/sparql-servi...)
DATAFAQS_PUBLISH_TDB                     (will default to: false)
DATAFAQS_PUBLISH_TDB_DIR                 (will default to: VVV/publish/tdb/)
DATAFAQS_PUBLISH_VIRTUOSO                (will default to: false)
CSV2RDF4LOD_CONVERT_DATA_ROOT            (not required, but vload will copy files when loading)
CSV2RDF4LOD_PUBLISH_VIRTUOSO_HOME        (will default to: /opt/virtuoso)
CSV2RDF4LOD_PUBLISH_VIRTUOSO_ISQL_PATH   (will default to: /opt/virtuoso/bin/isql)
CSV2RDF4LOD_PUBLISH_VIRTUOSO_PORT        (will default to: 1111)
CSV2RDF4LOD_PUBLISH_VIRTUOSO_USERNAME    (will default to: dba)
CSV2RDF4LOD_PUBLISH_VIRTUOSO_PASSWORD    (will default to: dba)
CSV2RDF4LOD_CONCURRENCY                  (will default to: 1)

see documentation for variables in: https://github.com/timrdf/DataFAQs/wiki/DATAFAQS-environment-variables
```
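Defaults like those above can be applied with the standard shell parameter-expansion idiom. A minimal sketch of that pattern, using a few of the variables from the table (the exact logic in df-vars.sh may differ):

```shell
#!/bin/bash
# Hypothetical sketch: fall back to the documented defaults when a
# variable is unset (values taken from the table above).
: ${CSV2RDF4LOD_PUBLISH_VIRTUOSO_HOME:='/opt/virtuoso'}
: ${CSV2RDF4LOD_PUBLISH_VIRTUOSO_ISQL_PATH:="$CSV2RDF4LOD_PUBLISH_VIRTUOSO_HOME/bin/isql"}
: ${CSV2RDF4LOD_PUBLISH_VIRTUOSO_PORT:='1111'}
: ${CSV2RDF4LOD_PUBLISH_VIRTUOSO_USERNAME:='dba'}
: ${CSV2RDF4LOD_CONCURRENCY:='1'}

echo "isql: $CSV2RDF4LOD_PUBLISH_VIRTUOSO_ISQL_PATH"
echo "port: $CSV2RDF4LOD_PUBLISH_VIRTUOSO_PORT"
```

Setting any of these variables in your shell before running a df- script overrides the default.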
Any URI that DataFAQs creates will be situated within DATAFAQS_BASE_URI, rooted under $DATAFAQS_BASE_URI/datafaqs/.
For example, if DATAFAQS_BASE_URI is set at the machine level to http://sparql.tw.rpi.edu, URIs like http://sparql.tw.rpi.edu/datafaqs/epoch/2012-01-19/faqt/1 will be created.
export DATAFAQS_BASE_URI='http://sparql.tw.rpi.edu'
void:dataDumps of the RDF graphs collected by DataFAQs will be placed under $DATAFAQS_BASE_URI/datafaqs/dumps/.
For another example, when the DataFAQs node will be at http://aquarius.tw.rpi.edu/projects/datafaqs/, use:
export DATAFAQS_BASE_URI='http://aquarius.tw.rpi.edu/projects'
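The convention above is plain string concatenation under the base URI. An illustrative sketch (the epoch date is an example, not a required value):

```shell
#!/bin/bash
# Illustrative only: shows how URIs fall under $DATAFAQS_BASE_URI/datafaqs/.
export DATAFAQS_BASE_URI='http://sparql.tw.rpi.edu'
epoch='2012-01-19'   # example epoch date

faqt_uri="$DATAFAQS_BASE_URI/datafaqs/epoch/$epoch/faqt/1"
echo "$faqt_uri"
```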
Places in the implementation that use DATAFAQS_BASE_URI:

- bin/df-epoch.sh uses it to name some graphs and named graphs (e.g. <$DATAFAQS_BASE_URI/datafaqs/epoch/$epoch/config/faqt-services> and void:dataDump <{{DATAFAQS_BASE_URI}}/datafaqs/dump/{{DUMP}}>).
- bin/df-epoch-metadata.py uses it to determine URIs of graphs, named graphs, epochs, faqt services, datasets, etc. (e.g. <{{DATAFAQS_BASE_URI}}/datafaqs/epoch/{{EPOCH}}>).
- packages/faqt.python/faqt/faqt.py uses it to self-report some provenance of any FAqT Service.
When creating a new epoch with $DATAFAQS_HOME/bin/df-epoch, if DATAFAQS_PUBLISH_THROUGHOUT_EPOCH is set to true, then the RDF produced will be loaded as it is produced. Loading at each step in the epoch allows clients to monitor the development of an epoch from the web.
export DATAFAQS_PUBLISH_THROUGHOUT_EPOCH='true'
Note that either $DATAFAQS_PUBLISH_TDB or $DATAFAQS_PUBLISH_VIRTUOSO must be true for this to take effect.
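The gate just described can be sketched as a small shell predicate (the function name is hypothetical, not part of DataFAQs):

```shell
#!/bin/bash
# Hypothetical helper: true only when per-step publishing is requested
# AND at least one triple-store backend is enabled.
should_publish_throughout_epoch() {
  [ "$DATAFAQS_PUBLISH_THROUGHOUT_EPOCH" = 'true' ] &&
  { [ "$DATAFAQS_PUBLISH_TDB" = 'true' ] ||
    [ "$DATAFAQS_PUBLISH_VIRTUOSO" = 'true' ]; }
}

DATAFAQS_PUBLISH_THROUGHOUT_EPOCH='true'
DATAFAQS_PUBLISH_VIRTUOSO='true'
should_publish_throughout_epoch && echo 'loading RDF as it is produced'
```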
DATAFAQS_PUBLISH_METADATA_GRAPH_NAME defaults to http://www.w3.org/ns/sparql-service-description#NamedGraph:
export DATAFAQS_PUBLISH_METADATA_GRAPH_NAME='http://www.w3.org/ns/sparql-service-description#NamedGraph'
If DATAFAQS_PUBLISH_VIRTUOSO is true, $DATAFAQS_HOME/bin/df-load-triple-store.sh will load the Virtuoso triple store using:
- CSV2RDF4LOD_CONVERT_DATA_ROOT
- CSV2RDF4LOD_PUBLISH_VIRTUOSO_PORT
- CSV2RDF4LOD_PUBLISH_VIRTUOSO_ISQL_PATH
- CSV2RDF4LOD_PUBLISH_VIRTUOSO_USERNAME
- CSV2RDF4LOD_PUBLISH_VIRTUOSO_PASSWORD
- CSV2RDF4LOD_CONCURRENCY (defaults to 1; number of threads to use to load triples)
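To show how these settings fit together, here is a hedged sketch of assembling a Virtuoso load command from them; the real logic lives in bin/df-load-triple-store.sh, and the file and graph names here are examples only. (TTLP_MT is Virtuoso's multithreaded Turtle loader, invoked through isql.)

```shell
#!/bin/bash
# Hypothetical sketch: assemble (but do not run) an isql load command
# from the csv2rdf4lod-automation settings, using their defaults.
isql="${CSV2RDF4LOD_PUBLISH_VIRTUOSO_ISQL_PATH:-/opt/virtuoso/bin/isql}"
port="${CSV2RDF4LOD_PUBLISH_VIRTUOSO_PORT:-1111}"
user="${CSV2RDF4LOD_PUBLISH_VIRTUOSO_USERNAME:-dba}"
pass="${CSV2RDF4LOD_PUBLISH_VIRTUOSO_PASSWORD:-dba}"
file='epoch.ttl'                                      # example input file
graph="$DATAFAQS_BASE_URI/datafaqs/epoch/2012-01-19"  # example target graph

cmd="$isql $port $user $pass exec=\"DB.DBA.TTLP_MT(file_to_string_output('$file'),'','$graph');\""
echo "$cmd"   # the load script would execute this
```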
This reuses the settings from csv2rdf4lod-automation.
Sesame publishing is not yet implemented.
- DATAFAQS_PUBLISH_SESAME='true'
- DATAFAQS_PUBLISH_SESAME_HOME='~/utilities/sesame/openrdf-sesame-2.6.10/' - the base directory for bin/console.sh
- DATAFAQS_PUBLISH_SESAME_SERVER='http://localhost:8080/openrdf-sesame' - the Sesame server hosting one or more repositories.
- DATAFAQS_PUBLISH_SESAME_REPOSITORY_ID='spo-balance' - the repository identifier. Many contexts are permitted within a single Sesame Repository.
Done once within console.sh:

```
> connect http://localhost:8080/openrdf-sesame .
> create native .
Please specify values for the following variables:
Repository ID [native]: spo-balance
Repository title [Native store]: spo-balance
Triple indexes [spoc,posc]: spoc,posc
Repository created
```
Done for each file/graph to load:

```
cat load.sc clear.sc
connect http://localhost:8080/openrdf-sesame.
open spo-balance.
load input_112b6d7443852b15aa3153fa41d7ebf3.rdf into http://xmlns.com/foaf/0.1 .
exit .
connect http://localhost:8080/openrdf-sesame.
open spo-balance.
clear http://xmlns.com/foaf/0.1 .
exit .
```
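The two console scripts above could be generated per file/graph with a heredoc; a hypothetical sketch, reusing the file and graph names from the example:

```shell
#!/bin/bash
# Hypothetical generator for the per-graph Sesame console scripts above.
file='input_112b6d7443852b15aa3153fa41d7ebf3.rdf'
graph='http://xmlns.com/foaf/0.1'

cat > load.sc <<EOF
connect http://localhost:8080/openrdf-sesame.
open spo-balance.
load $file into $graph .
exit .
EOF

cat > clear.sc <<EOF
connect http://localhost:8080/openrdf-sesame.
open spo-balance.
clear $graph .
exit .
EOF
```

Each generated script would then be fed to bin/console.sh to load or clear that graph.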
If you don't care about the logs, set it to /dev/null; if you want to poke around, use export DATAFAQS_LOG_DIR=`pwd`.
The base URI for the version-controlled source code of the FAqT Services. Used in the faqt.python package to provide PROV-O assertions about the services.
See DATAFAQS_PROVENANCE_CODE_RAW_BASE. This is the base URI for the page (not the source code).
http://pypi.python.org/pypi/ckanclient requires an API key. The FAqT Services access the key from this environment variable.