Bengali Word-Finder

Psycholinguistic experiments often require sets of words that begin with, end with, or contain certain sounds, or that have a certain number of syllables. Finding such words by introspection is slow and unreliable.

The word-finder uses an underlying corpus to generate lists of words matching specified phonological descriptions. A description can specify the presence or absence of certain sounds at given positions, or the number of syllables. Apart from single sounds, the tool also pre-defines linguistically relevant sound groups, so that it is possible to find, for example, words that begin with a nasal, or contain a voiced retroflex. You can use boolean operators (AND and OR) to combine multiple conditions.
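As a rough illustration of the idea (not the word-finder's actual code), the sketch below filters a toy lexicon with conditions combined by AND and OR. All names here (LEXICON, NASALS, begins_with, find_words, and so on) are hypothetical, the sound groups and transcriptions are illustrative rather than taken from the SHRUTI corpus, and syllables are approximated by counting vowel symbols.

```python
# Hypothetical sketch: none of these names come from the word-finder's source.
# Toy lexicon mapping a word to its list of sound symbols (illustrative only).
LEXICON = {
    "nodi": ["n", "o", "d", "i"],
    "mon": ["m", "o", "n"],
    "Daal": ["D", "aa", "l"],
    "kotha": ["k", "o", "th", "a"],
}

# Illustrative sound groups; the tool's own groups may be defined differently.
NASALS = {"n", "m", "N"}
VOICED_RETROFLEXES = {"D", "Dh"}
VOWELS = {"a", "aa", "i", "o", "u", "e"}


def begins_with(group):
    """Condition: the first sound belongs to the given group."""
    return lambda phones: bool(phones) and phones[0] in group


def contains(group):
    """Condition: at least one sound belongs to the given group."""
    return lambda phones: any(p in group for p in phones)


def has_syllables(n):
    """Condition: n syllables, approximated as the number of vowel symbols."""
    return lambda phones: sum(p in VOWELS for p in phones) == n


def all_of(*conditions):
    """Combine conditions with AND."""
    return lambda phones: all(c(phones) for c in conditions)


def any_of(*conditions):
    """Combine conditions with OR."""
    return lambda phones: any(c(phones) for c in conditions)


def find_words(lexicon, condition):
    """Return the sorted words whose sounds satisfy the condition."""
    return sorted(w for w, phones in lexicon.items() if condition(phones))


nasal_initial = find_words(LEXICON, begins_with(NASALS))
nasal_and_disyllabic = find_words(
    LEXICON, all_of(begins_with(NASALS), has_syllables(2))
)
nasal_or_retroflex = find_words(
    LEXICON, any_of(begins_with(NASALS), contains(VOICED_RETROFLEXES))
)
```

Representing each condition as a function keeps the AND/OR combinations composable; the real program exposes this through its own interactive commands, which may work quite differently.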

It is also possible to preview, edit, and filter the current selection before generating an output file with the list of words.

To use:

1. Clone this repository.
2. Open a terminal window and change to the base directory (cd path/to/directory/Bengali_Word_Finder/).
3. Run bengali_word_finder.py with Python 3 (python3 bengali_word_finder.py).

Note that the other scripts are imported as modules by the main program, so make sure the directory structure remains the same.
Detailed documentation and example output are in the documentation folder. Usage instructions are also available through the HELP command inside the main program.

This tool currently works with the data and transcription system of the Bengali SHRUTI corpus. Words are transliterated using the ITRANS format (Indian languages TRANSliteration). To use it with a related language whose corpus has a similar transcription scheme (reference on pg. 52 here), replace the file shruti.dic with the pronunciation dictionary from your corpus of choice.
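Before swapping in another dictionary, it can help to check that it parses the way you expect. The sketch below is only an assumption about the file layout: it supposes one entry per line, with the word followed by whitespace-separated sound symbols. The actual layout of shruti.dic may differ, and load_pronunciations is a hypothetical helper, not a function from this repository.

```python
import io


def load_pronunciations(lines):
    """Parse 'word sound sound ...' lines into {word: [sounds]}.

    Assumed layout (not verified against shruti.dic): one entry per line,
    fields separated by whitespace, word first.
    """
    lexicon = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        word, *sounds = line.split()
        if sounds:
            # Keep only the first pronunciation listed for a word.
            lexicon.setdefault(word, sounds)
    return lexicon


# Toy input standing in for a dictionary file (illustrative entries only).
sample = io.StringIO("nodi n o d i\nmon m o n\n\nmon m o n o\n")
lexicon = load_pronunciations(sample)
```

The function accepts any iterable of lines, so an open file object can be passed in place of the in-memory sample.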

Using the word-finder with a different transcription scheme or a phonologically unrelated language is possible, but requires some changes to the code. If you want to use it for another language, are interested in contributing, or have any suggestions, please drop me an email at auromita.mitra@gmail.com!
