Pali Canon Database

Pali Canon Database

Introduction

The Pali Canon database is a SQL relational database of the Pali Canon that provides a structural representation of the three baskets:

Sutta Piṭaka
Vinaya Piṭaka
Abhidhamma Piṭaka

By organizing these texts into a relational schema, the database can faciliate research and textual analysis. Scholars and practioners can run complex queries for in-depth exploration of the Pali Canon.

All data is sourced from SuttaCentral.

Schema Overview

Entity-Relationship Diagram

Author

Contains author/translator codes and full names.

author_uid: Unique identifier for each author
author_short: Short name or abbreviation for the author
author: Full name of the author

Language

Lists the languages used in the database.

lang: Language code
lang_name: Full name of the language

TextInfo

Contains metadata about each text in the Pali Canon.

uid: Unique identifier for each text
parent_uid: Identifier of the parent text (for hierarchical organization)
blurb: Brief description of the text
original_title: Title in the original language
translated_title: Translated title
acronym: Acronym or short code for the text
difficulty: Indicates the difficulty level of the text
basket: Basket to which the translation belongs to
root_lang_name: Name of the original language
root_lang: Code of the original language
type: Type or category of the text

Translation

Contains information and text content of each translation.

translation_id: Unique identifier for each translation
uid: Identifier of the original text (corresponds to TextInfo.uid)
lang: Language code of the translation (corresponds to Language.lang)
author_uid: Identifier of the translator (corresponds to Author.author_uid)
local_file_path: Path to the translation file
basket: Basket to which the translation belongs to
text_context: The text content of the translation

Installation and Setup

Install the latest version of the Pali Canon database in the Release section.
Install DB Browser for easy viewing and access.

Usage

Example queries

Searching for all occurences of "impermanence" in the text content of English translations:

SELECT 
   tr.translation_id, 
   tr.uid, 
	ti.parent_uid,
	ti.basket,
   ti.translated_title,
   a.author,
   tr.text_content
FROM 
   Translation AS tr
LEFT JOIN TextInfo ti
   ON tr.uid = ti.uid
LEFT JOIN Author a 
   ON tr.author_uid = a.author_uid
WHERE 
   tr.text_content LIKE '%impermanence%' 
   AND tr.lang = 'en';

Retrieving the lineage of a text

WITH RECURSIVE lineage AS (
  SELECT uid, parent_uid, basket
  FROM TextInfo 
  WHERE uid = 'mn11'
  UNION ALL
  SELECT t.uid, t.parent_uid, t.basket
  FROM TextInfo t
  INNER JOIN lineage l ON t.uid = l.parent_uid
)
SELECT GROUP_CONCAT(uid, ' > ') AS lineage, basket
FROM lineage
ORDER BY LENGTH(uid) DESC
LIMIT 1;

This query builds the lineage of mn11 recursively. For example, the output looks like:

lineage	basket
mn11 > mn-sihanadavagga > mn-mulapannasa > mn > middle	sutta

Architecture

All texts and metadata are sourced from SuttaCentral, which provides comprehensive access to the Pali Canon.

Future work / To-do

Dockerize project to simplify deployment.
Enhance query speed and implement more advanced search capabilities, especially for searching through textual content.
Implement CDC strategy to update target databases when data sources are updated.

Contribute

Contributions are welcome!

Issues: If you have any problems, questions, suggestions for improvement and new features, please open an issue.
Pull requests: Fork the repo and open a pull request to submit contrubitions to the project.

Name		Name	Last commit message	Last commit date
Latest commit History 56 Commits
etl_scripts		etl_scripts
pali_canon_dbt		pali_canon_dbt
sc-data @ 16688e4		sc-data @ 16688e4
suttacentral @ d0dad3a		suttacentral @ d0dad3a
.gitignore		.gitignore
.gitmodules		.gitmodules
Architecture.png		Architecture.png
README.md		README.md
erd.png		erd.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Pali Canon Database

Introduction

Schema Overview

Entity-Relationship Diagram

Author

Language

TextInfo

Translation

Installation and Setup

Usage

Example queries

Architecture

Future work / To-do

Contribute

About

Releases 2

Languages

gyk-jane/pali-canon-db

Folders and files

Latest commit

History

Repository files navigation

Pali Canon Database

Introduction

Schema Overview

Entity-Relationship Diagram

Author

Language

TextInfo

Translation

Installation and Setup

Usage

Example queries

Architecture

Future work / To-do

Contribute

About

Topics

Resources

Stars

Watchers

Forks

Releases 2

Languages