The World Food Program wants to collect nutritional information about food bought and sold in Kenya. In this project, selected people install an app on their mobile phones and, whenever they buy food, activate the app by voice and dictate the list of items they have bought in Swahili. The app is expected to transcribe the speech to text in real time and store the information in a database in an easy-to-process form.
This project builds, trains and deploys a deep learning model which transcribes Swahili audio to Swahili text.
- Machine Setup:
First, you need to have Python 3 installed.
Next, clone the GitHub repository:
git clone https://github.com/10Academy-Group-4/Week-4
Finally, install the requirements. Anaconda users can run the command below as is; otherwise, replace pip with pip3 and python with python3.
pip install -r requirements.txt
- Docker:
This is a containerized Flask application. A Docker image with all prerequisites installed is available on Docker Hub. Here is how to use it:
Pull the Docker image:
docker pull nebasam/stt-swahili
Run the Docker image (the app is published on port 33507 of the host):
docker run --rm -it -p 33507:33507/tcp nebasam/stt-swahili:latest
- Dataset for Swahili: https://github.com/getalp/ALFFA_PUBLIC
Input features (X): audio clips of spoken words
Target labels (y): text transcript of what was spoken
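The pairing of audio clips (X) with transcripts (y) can be sketched as follows. This is a minimal illustration, not the project's own preprocessing code. It assumes a Kaldi-style transcript file in which each line holds an utterance ID followed by the spoken text, and it assumes each clip is a .wav file named after its utterance ID. The function names `load_transcripts` and `pair_audio_with_text` are hypothetical.

```python
from pathlib import Path


def load_transcripts(text_file):
    """Map utterance IDs to transcripts.

    Assumes each non-empty line looks like: "<utterance_id> <transcript words>".
    """
    transcripts = {}
    for line in Path(text_file).read_text(encoding="utf-8").splitlines():
        parts = line.strip().split(maxsplit=1)
        if not parts:
            continue  # skip blank lines
        utt_id = parts[0]
        transcripts[utt_id] = parts[1] if len(parts) > 1 else ""
    return transcripts


def pair_audio_with_text(audio_dir, transcripts):
    """Return (wav_path, transcript) pairs for clips that have a transcript.

    Assumes each clip is named "<utterance_id>.wav" (an illustrative convention).
    """
    pairs = []
    for wav_path in sorted(Path(audio_dir).rglob("*.wav")):
        transcript = transcripts.get(wav_path.stem)
        if transcript is not None:
            pairs.append((wav_path, transcript))
    return pairs
```

Clips without a matching transcript are skipped rather than raising an error, which keeps the sketch tolerant of partially labeled data. Whether that is the right behavior depends on how the meta files in this project are generated.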
- Artifacts: A directory which contains artifacts such as meta files and other artifacts generated through the project
- Notebook: A directory which contains notebooks describing the functionality of the classes used for meta generation and preprocessing
- Scripts: A directory which contains scripts for meta generation, preprocessing and feature extraction
- test_data: A directory which has data for running tests on every commit or merge to the main branch
- tests: A directory which has the code for testing every commit or merge to the main branch
- data.dvc: DVC file for versioning the data
- requirements.txt: A file listing the dependencies of the project
Python's built-in unittest library was used to test the functions and classes in the project. A .travis.yml file was added to automate testing of every commit or merge made to the main branch. The data used for testing is in the test_data directory.
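A test written with unittest might look like the sketch below. The function under test, `clean_transcript`, is a hypothetical stand-in for a preprocessing step and is not taken from the project's scripts; only the unittest calls themselves are standard.

```python
import unittest


def clean_transcript(text):
    """Hypothetical preprocessing step: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


class TestCleanTranscript(unittest.TestCase):
    def test_lowercases_text(self):
        self.assertEqual(clean_transcript("Habari Yako"), "habari yako")

    def test_collapses_whitespace(self):
        self.assertEqual(clean_transcript("  habari   yako "), "habari yako")
```

Tests of this shape placed in the tests directory can be discovered and run with `python -m unittest discover`, which is also the kind of command a CI configuration would typically invoke.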
To get an idea of how the models are set up and investigated, take a look at the notebooks for Models, WordError and Augmentation.
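For readers unfamiliar with the metric behind the WordError notebook, word error rate (WER) is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a generic implementation of that definition and is not necessarily how the notebook computes it; the function name `word_error_rate` is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
print(word_error_rate("nimenunua mkate na maziwa", "nimenunua mkate maziwa"))  # 0.25
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, so it is an error rate rather than a percentage bounded by 100.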
The user interface was built with Flask. The model was dockerized and deployed on Heroku at https://swahili-stt.herokuapp.com/