batch-loader

Application for batch loading GW ScholarSpace

Setup

Requires Python >= 3.5

Get this code.

 git clone git@github.com:DigitalWPI/batch-loader.git
 OR
 git clone https://github.com/DigitalWPI/batch-loader.git

Create a virtualenv.

 virtualenv -p python3 ENV
 source ENV/bin/activate

Install the required Python libraries.
```
 pip install -r requirements.txt
```
Copy the example configuration file.
```
 cp example.config.py config.py
```
Edit config.py. The file is annotated with descriptions of the configuration options.

Running

To run batch-loader:

`python batch_loader.py <path to csv>`

If the CSV has a `fulltext_url` column for the related resource instead of a `files` column, add `--url`:

`python batch_loader.py <path to csv> --url`

See example.csv and url_example.csv for sample input.

The loader can also be run on JSON files that use the same elements as the CSV:

`python batch_loader.py <path to json file> --json`

Further options control which collections to ingest into (if your rake task can handle that), whether to generate TIFFs, and the print level. Run `python batch_loader.py --help` to see all the options.

Specification of CSV

The first row must contain the field names.
Fields that take multiple values should be placed in multiple columns, with an incrementing integer appended to each field name. For example, "author1", "author2", "author3". A repeating field with only a single entry must still end with "1". (Fields with multiple values will be passed as lists to GWSS.)
The following fields are required: files (or fulltext_url when using the --url variant), object_type, title, author1, type_of_work1, rights.
The following fields are optional, but if provided must use these field names: first_file, gwss_id. (TODO)
Additional fields included in the CSV will be passed to GWSS using the provided field names. For example, a "subtitle" field included in the CSV will be passed as "subtitle" to GWSS.
The ordering of fields is not significant.
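
The column rules above can be illustrated with a short Python sketch. This is not code from batch_loader.py: the helper names `missing_required_fields` and `collapse_repeating_fields` are hypothetical, the sample values are made up, and the assumption that a repeating field is passed under its name without the numeric suffix is ours. It only shows one way to check a header row against the required fields and to gather numbered columns into lists.

```python
import csv
import io
import re
from collections import defaultdict

# Required columns as listed in this README; "files" or "fulltext_url"
# is added depending on whether the --url variant is used.
REQUIRED = ("object_type", "title", "author1", "type_of_work1", "rights")

# Matches a field name that ends in an integer, e.g. "author2".
NUMBERED = re.compile(r"^(?P<name>.*?)(?P<index>\d+)$")


def missing_required_fields(fieldnames, use_url=False):
    """Return the required column names absent from the header row."""
    source = "fulltext_url" if use_url else "files"
    present = set(fieldnames)
    return [name for name in (source,) + REQUIRED if name not in present]


def collapse_repeating_fields(row):
    """Gather numbered columns into lists, ordered by their suffix.

    Assumption: any column ending in digits is a repeating field, and
    empty cells are skipped. A real field whose name happens to end in
    a digit would need special handling.
    """
    result = {}
    repeating = defaultdict(list)
    for key, value in row.items():
        match = NUMBERED.match(key)
        if match is None:
            result[key] = value
        elif value:
            repeating[match.group("name")].append((int(match.group("index")), value))
    for name, pairs in repeating.items():
        result[name] = [value for _, value in sorted(pairs)]
    return result


# Made-up sample data; column order is deliberately arbitrary.
sample = io.StringIO(
    "title,author2,author1,type_of_work1,rights,object_type,files\n"
    "My Thesis,Alan Turing,Ada Lovelace,Thesis,All rights reserved,etd,thesis.pdf\n"
)
reader = csv.DictReader(sample)
print(missing_required_fields(reader.fieldnames))
for row in reader:
    print(collapse_repeating_fields(row))
```

Running a check like this before loading can catch a missing required column early, without starting an ingest.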

TODO:

Support updating an item that already has a repo id.
Write an output CSV containing the repo id.
Handle errors when an import fails.
