GitHub - timjones1/capstone1-web-scraping: Scraping legal case information from bailii.org using beautiful soup, SpaCy, Blackstone and Pandas.

Scraping legal case information from bailii.org

This is part one of a capstone project I completed during the Cambridge Spark bootcamp. The project was proposed by wavelength.law. I used Beautiful Soup to extract legal case information from the bailii.org website. I also used Blackstone, a pretrained spaCy named entity recognition model trained on English legal texts, to extract the relevant sections and Acts from the legal cases.
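As a rough illustration of the Beautiful Soup step, here is a minimal sketch that parses an inline HTML string instead of fetching a live page. The sample HTML, the case name in it, the link path, and the `extract_case_info` helper are all made up for this example. The real bailii.org markup and the code in the notebook may look quite different.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a case page. The real bailii.org markup
# may differ, so the tags used below are assumptions for illustration only.
SAMPLE_HTML = """
<html>
  <head><title>Example v Sample [2019] EWHC 1 (Ch)</title></head>
  <body>
    <p>The claimant relies on section 2 of the Example Act 2000.</p>
    <p>The defendant disputes the claim.</p>
    <a href="/ew/cases/EWHC/Ch/2019/1.html">Full judgment</a>
  </body>
</html>
"""


def extract_case_info(html):
    """Return the page title, paragraph texts, and link targets as a dict."""
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is None when the page has no <title> element.
    title = soup.title.get_text(strip=True) if soup.title else None
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    # href=True skips anchors that have no href attribute.
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return {"title": title, "paragraphs": paragraphs, "links": links}


info = extract_case_info(SAMPLE_HTML)
print(info["title"])
print(len(info["paragraphs"]), "paragraphs,", len(info["links"]), "link(s)")
```

The paragraph texts collected this way are the kind of input that could then be passed to the Blackstone model for entity extraction.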

My project report is available in this repository: Capstone 1 Project Report for Wavelength.law.pdf

Installation of Software.

spaCy 2.1.8 is required to work with Blackstone; see the Blackstone GitHub page for more details:

https://github.com/ICLRandD/Blackstone

A blackstone.yml file is included with the conda environment configuration that worked for me (it can be loaded with conda env create -f blackstone.yml). Package versions are below:

spacy version 2.1.8

blackstone version 0.1.15
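Since Blackstone depends on this specific spaCy version, it can help to compare what is installed against the versions listed above. The sketch below does this with a small helper. `PINNED` and `find_mismatches` are hypothetical names for this example and are not part of the repository.

```python
from typing import Dict, List

# Versions copied from the list above.
PINNED = {"spacy": "2.1.8", "blackstone": "0.1.15"}


def find_mismatches(installed: Dict[str, str], pinned: Dict[str, str]) -> List[str]:
    """Describe every pinned package that is missing or at a different version."""
    problems = []
    for name, wanted in pinned.items():
        found = installed.get(name)
        if found is None:
            problems.append("{}: not installed (expected {})".format(name, wanted))
        elif found != wanted:
            problems.append("{}: found {} (expected {})".format(name, found, wanted))
    return problems


# An environment that matches the pins reports no problems.
print(find_mismatches({"spacy": "2.1.8", "blackstone": "0.1.15"}, PINNED))

# A newer spaCy and a missing Blackstone are both reported.
for line in find_mismatches({"spacy": "2.2.0"}, PINNED):
    print(line)
```

In a real environment the `installed` mapping would have to be filled in by hand or from the output of conda list or pip list; the dictionaries passed in above are illustrative only.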

Repository files:

Capstone 1 Project Report for Wavelength.law.pdf
README.md
Wavelength Data Extraction.ipynb
blackstone.yml
legislation_linker.py
readme.md
