This repository contains the sample data for the Programming Historian's lesson Detecting Text Reuse with Passim, written by Matteo Romanello and Simon Hengchen (currently in preparation).

Data come from two different sources (see respective READMEs for license statements and further details):

  1. books from EEBO (Early English Books Online) → more info
  2. newspaper articles from impresso → more info

The Jupyter notebook explore-passim-output.ipynb contains an example of how to load passim's JSON output into a pandas DataFrame to compute some statistics.
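As a rough illustration of the loading step, the sketch below parses line-delimited JSON records and counts passages per cluster using only the standard library. It is not taken from the notebook: the sample records are invented, and the field names (`cluster`, `id`, `series`, `text`) as well as the one-object-per-line layout are assumptions about passim's output that should be checked against your own files.

```python
import json
from collections import Counter
from io import StringIO

# Invented sample records for illustration only. The field names and the
# one-JSON-object-per-line layout are assumptions, not confirmed by this README.
SAMPLE = """\
{"cluster": 1, "id": "doc-a", "series": "eebo", "text": "to be or not to be"}
{"cluster": 1, "id": "doc-b", "series": "eebo", "text": "to be, or not to be"}
{"cluster": 2, "id": "doc-c", "series": "impresso", "text": "lorem ipsum"}
"""


def load_records(handle):
    """Parse one JSON object per non-empty line of an open text handle."""
    return [json.loads(line) for line in handle if line.strip()]


def cluster_sizes(records):
    """Count how many passages belong to each cluster id."""
    return Counter(rec["cluster"] for rec in records)


records = load_records(StringIO(SAMPLE))
sizes = cluster_sizes(records)

# Print clusters from largest to smallest.
for cluster_id, n_passages in sizes.most_common():
    print(cluster_id, n_passages)
```

With pandas installed, `pandas.DataFrame(records)` turns the same list into a DataFrame, and `pandas.read_json(path, lines=True)` reads a line-delimited file directly.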

To run the notebook and the script eebo/code/main.py, first install the required dependencies into a new virtual environment (created with conda, pyenv, venv, etc.):

pip install -r requirements.txt