A script to upload remote media to the Internet Archive

jacksongoode/ia-remote-upload

Internet Archive Remote Media Uploader

This project provides a script to bulk upload remote files to the Internet Archive from a CSV file containing URLs and metadata.

💡 Recommendation: Because Google is a monopoly, you can get far better download/upload speeds by using a Colab notebook. However, IA might think you are a spam bot if you upload too quickly 🙃

Open in Colab

If you would like to run the script yourself, instructions are below.

Installation

Install packages into a venv. You can use uv to do this:

macOS and Linux

curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt

Windows

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
uv venv
.venv\Scripts\activate
uv pip install -r requirements.txt

Then log in to your Internet Archive account with:

ia configure

Note where ia configure stores the credentials file (ia.ini) and copy that file into the project directory.
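The copy step can also be scripted. The following is a minimal sketch, not part of this project: the function name copy_credentials and the candidate paths are assumptions, so check them against the path that ia configure prints on your machine.

```python
import shutil
from pathlib import Path

# Assumed search locations for the credentials file. These are guesses;
# confirm them against the path printed by `ia configure`.
DEFAULT_CANDIDATES = [
    Path.home() / ".config" / "internetarchive" / "ia.ini",
    Path.home() / ".config" / "ia.ini",
    Path.home() / ".ia",
]


def copy_credentials(dest_dir, candidates=None):
    """Copy the first existing credentials file into dest_dir as ia.ini.

    Returns the path of the copy, or None if no candidate exists.
    """
    paths = DEFAULT_CANDIDATES if candidates is None else candidates
    target = Path(dest_dir) / "ia.ini"
    for candidate in paths:
        candidate = Path(candidate)
        if not candidate.is_file():
            continue
        # Nothing to do if the file is already in place.
        if candidate.resolve() != target.resolve():
            shutil.copyfile(candidate, target)
        return target
    return None
```

Calling copy_credentials(".") from the project directory would place ia.ini next to the script if one of the assumed locations holds the file.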

Usage

Create a CSV file with the following columns:

  • file - The full URL to the file to upload
  • title - The title for the item on the Internet Archive
  • creator - The creator/author for the item
  • date - The date for the item
  • mediatype - The mediatype for the item, e.g. "movies"

Optionally:

  • identifier - A unique ID for the item (a hash-based or random ID can also be generated)

You can also add any other metadata fields you want as extra columns. See the Internet Archive CLI documentation for the available fields.
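To illustrate the column layout described above, here is a minimal sketch that writes such a CSV with Python's csv module. The helper names (write_rows, hash_identifier), the "upload" prefix, and the SHA-1 based identifier scheme are assumptions made for this example; the script may generate identifiers differently.

```python
import csv
import hashlib

# Required columns, as listed in this README.
REQUIRED = ["file", "title", "creator", "date", "mediatype"]


def hash_identifier(url, prefix="upload"):
    """Hypothetical ID scheme: prefix plus 12 hex chars of the URL's SHA-1."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:12]
    return f"{prefix}-{digest}"


def write_rows(path, rows):
    """Validate rows, then write them to a CSV with required columns first."""
    # Check every row before touching the file, so a bad row
    # never leaves a half-written CSV behind.
    for index, row in enumerate(rows):
        missing = [col for col in REQUIRED if not row.get(col)]
        if missing:
            raise ValueError(f"row {index} is missing {missing}")

    # Any keys beyond the required ones become extra metadata columns.
    extra = sorted({key for row in rows for key in row} - set(REQUIRED))
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=REQUIRED + extra)
        writer.writeheader()
        writer.writerows(rows)
```

Passing write_rows a list with one dict per file, each holding the five required keys plus an optional identifier such as hash_identifier(url), produces a file that can be handed to the script.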

Run

python ia_remote_upload.py path/to/csv.csv -w 1

This spawns threads that download each file and upload it to the Internet Archive along with the given metadata.
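The threaded pattern described here can be sketched with a thread pool. This is an illustration under stated assumptions, not the script's actual code: process_rows, handle_row, and the workers parameter are hypothetical names, and handle_row stands in for whatever downloads one file and uploads it.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_rows(rows, handle_row, workers=1):
    """Run handle_row on each row in a thread pool.

    handle_row is a hypothetical callable that downloads and uploads one
    row and raises on failure. Returns (succeeded, failed), where failed
    holds (url, error message) pairs.
    """
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its row so results can be attributed.
        futures = {pool.submit(handle_row, row): row for row in rows}
        for future in as_completed(futures):
            row = futures[future]
            try:
                future.result()  # re-raises any exception from the worker
                succeeded.append(row["file"])
            except Exception as exc:
                failed.append((row["file"], str(exc)))
    return succeeded, failed
```

Collecting failures instead of aborting on the first error lets the remaining rows finish, and the failed list can then be written out for a later retry.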

Progress and results are logged to log.txt. Any failed downloads are saved in failed.txt.
