Speech Recognition

Live Transcription of Swahili Audio to Swahili Text

Introduction

The World Food Programme wants to collect nutritional information about food bought and sold in Kenya. In this project, selected participants install an app on their mobile phones and, whenever they buy food, activate it by voice to record the list of purchased items in Swahili. The app is expected to transcribe their speech to text live and store the information in a database in an easy-to-process form.

Objective

This project builds, trains, and deploys a deep learning model that transcribes Swahili audio to Swahili text.

How to start

Machine Setup:

First, you need to have Python 3 installed.

Next, clone the GitHub repository:

git clone https://github.com/10Academy-Group-4/Week-4

Finally, install the requirements. The command below is for Anaconda users; otherwise, replace pip with pip3 and python with python3:

pip install -r requirements.txt

Docker:

This is a containerized Flask application, and a Docker image with all prerequisites installed is available on Docker Hub. Here is how to use it:

Pull docker image

docker pull nebasam/stt-swahili

Run docker image

docker run --rm -it -p 33507:33507/tcp nebasam/stt-swahili:latest
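
Once the container is running, one quick way to confirm that the app is reachable is to request a page on the published port. The sketch below assumes the app answers at http://localhost:33507/ (the port comes from the docker run command above; the root path and the helper names service_url and is_service_up are assumptions for illustration, not something this README documents).

```python
import urllib.error
import urllib.request


def service_url(host: str = "localhost", port: int = 33507) -> str:
    """Build the base URL for the container's published port."""
    return f"http://{host}:{port}/"


def is_service_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if a server answers at the URL, False if nothing is reachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is still running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: nothing is listening.
        return False


if __name__ == "__main__":
    url = service_url()
    print(url, "is up" if is_service_up(url) else "is not reachable")
```

If the check reports that the service is not reachable, make sure the container is still running and that the -p 33507:33507 mapping was passed to docker run.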

Data

Dataset for Swahili: https://github.com/getalp/ALFFA_PUBLIC

Data_Features

Input features (X): audio clips of spoken words
Target labels (y):  text transcript of what was spoken
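
To make the X/y pairing concrete, the sketch below parses a Kaldi-style transcript listing, where each line holds an utterance ID followed by its transcript, and pairs every ID with an audio file path. The line layout, the .wav naming scheme, the sample utterances, and the function names are all assumptions for illustration; check the ALFFA repository for the actual file layout.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def parse_transcripts(lines: List[str]) -> Dict[str, str]:
    """Map utterance IDs to transcripts from 'utt_id some words' lines."""
    transcripts = {}
    for line in lines:
        parts = line.split(maxsplit=1)
        if not parts:
            continue  # skip blank lines
        utt_id = parts[0]
        text = parts[1].strip() if len(parts) > 1 else ""
        transcripts[utt_id] = text
    return transcripts


def build_pairs(transcripts: Dict[str, str], audio_dir: str) -> List[Tuple[str, str]]:
    """Pair each audio path (X) with its transcript (y), sorted by utterance ID."""
    return [
        ((Path(audio_dir) / f"{utt_id}.wav").as_posix(), text)
        for utt_id, text in sorted(transcripts.items())
    ]


if __name__ == "__main__":
    # Made-up sample lines, not taken from the dataset.
    sample = ["utt_001 mkate na maziwa", "utt_002 sukari"]
    for audio_path, transcript in build_pairs(parse_transcripts(sample), "audio"):
        print(audio_path, "->", transcript)
```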

Directory_Structure

artifacts - A directory that contains meta files and other artifacts generated during the project
notebook - A directory that contains notebooks describing how the classes for meta generation and preprocessing work
scripts - A directory that contains scripts for meta generation, preprocessing and feature extraction
test_data - A directory that holds the data used to run tests for every commit or merge on the main branch
tests - A directory that holds the test code run for every commit or merge on the main branch
data.dvc - DVC file for versioning the data
requirements.txt - A file listing the project's dependencies
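
The project's own meta generation scripts are not reproduced here. As a rough illustration of what a metadata step for audio can look like, the sketch below reads basic properties of a WAV file with Python's standard wave module. The function name wav_metadata and the choice of fields are assumptions, not the repository's actual implementation.

```python
import wave
from typing import Dict, Union


def wav_metadata(path: str) -> Dict[str, Union[int, float]]:
    """Read basic properties of a WAV file without decoding the samples."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate": rate,
            # Duration is the frame count divided by frames per second.
            "duration_seconds": frames / float(rate),
        }
```

Records like these could be collected per clip and written to a meta file, for example to filter out clips that are too short or too long before training.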

Testing

Python's built-in unittest library was used to test the functions and classes in the project. A .travis.yml file was added to automate testing of every commit or merge to the main branch. The data used for testing is in the test_data directory.
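
For readers new to unittest, a test case in this style can be sketched as follows. The helper clean_transcript is a hypothetical preprocessing function written only for this example; it is not taken from the project's scripts.

```python
import unittest


def clean_transcript(text: str) -> str:
    """Hypothetical preprocessing helper: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


class TestCleanTranscript(unittest.TestCase):
    def test_lowercases_text(self):
        self.assertEqual(clean_transcript("Mkate NA Maziwa"), "mkate na maziwa")

    def test_collapses_whitespace(self):
        self.assertEqual(clean_transcript("  mkate   na maziwa "), "mkate na maziwa")
```

Saved in the tests directory under a name such as test_clean_transcript.py (a hypothetical file name), a file like this can be run with python -m unittest discover -s tests.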

Modelling

To get an idea of how the models are set up and investigated, take a look at the notebooks for Models, WordError and Augmentation.
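
Word error rate (WER) is the usual metric for judging a transcription model: the word-level edit distance between the reference and the predicted transcript, divided by the number of reference words. The sketch below is a minimal, dependency-free version of that standard definition; it is not the code from the WordError notebook, and the function names are assumptions.

```python
from typing import List


def edit_distance(reference: List[str], hypothesis: List[str]) -> int:
    """Levenshtein distance between two token lists (substitute, insert, delete)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_token in enumerate(reference, start=1):
        current = [i]
        for j, hyp_token in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_token != hyp_token)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)
```

Because insertions count as errors, WER can exceed 1.0 when the predicted transcript is much longer than the reference.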

Deployment

The user interface was built with Flask. The model was dockerized and deployed on Heroku at https://swahili-stt.herokuapp.com/

Contributors

Michael Darko Ahwireng

Toyin Hawau Olamide

Nebiyu Samuel

Sibitenda Harriet

Same Michael

Mubarak Sani

Khairat Ayinde
