This is based on our work on automatic speech recognition models for Igbo and Fon. This study led to a paper, OkwuGbé: End-to-End Speech Recognition for Fon and Igbo, accepted at the AfricaNLP workshop of EACL 2021.
OkwuGbé is a step towards building speech recognition systems for African low-resourced languages. Using Fon and Igbo as our case study, we conduct a comprehensive linguistic analysis of each language and describe the creation of end-to-end, deep neural network-based speech recognition models for both languages.
The open-source code for the end-to-end automatic speech recognition model and the data preprocessing steps for the Fon language is available here.
The Igbo implementation is here; for the full model architecture, see here.
Implementation of the Fon automatic speech recognition model from the original paper «OkwuGbé: End-to-End Speech Recognition for Fon and Igbo», accepted at the AfricaNLP workshop, EACL 2021.
Authors: Bonaventure F. P. Dossou and Chris C. Emezue
The end-to-end speech recognition model for Fon is built in PyTorch, with code adapted from: https://www.assemblyai.com/blog/end-to-end-speech-recognition-pytorch
Addition: an attention mechanism, added by Bonaventure F. P. Dossou
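To give a rough idea of what adding attention to a recurrent acoustic encoder can look like in PyTorch, here is a minimal sketch. It is not the repository's code: this README does not say which attention variant is used, so the sketch assumes self-attention via `torch.nn.MultiheadAttention` on top of a bidirectional GRU. The class name `AttentiveEncoder` and all sizes (`n_feats`, `hidden`, `n_heads`, `n_classes`) are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn


class AttentiveEncoder(nn.Module):
    """Hypothetical encoder: BiGRU -> self-attention -> per-frame classifier.

    Illustrative only; layer choices and sizes are assumptions,
    not the architecture from the paper.
    """

    def __init__(self, n_feats=128, hidden=64, n_heads=4, n_classes=30):
        super().__init__()
        self.gru = nn.GRU(n_feats, hidden, batch_first=True, bidirectional=True)
        # Self-attention over the GRU outputs (assumed variant).
        self.attn = nn.MultiheadAttention(2 * hidden, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(2 * hidden)
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        # x: (batch, time, n_feats), e.g. mel-spectrogram frames
        h, _ = self.gru(x)                  # (batch, time, 2 * hidden)
        ctx, weights = self.attn(h, h, h)   # weights: (batch, time, time)
        h = self.norm(h + ctx)              # residual connection + layer norm
        return self.classifier(h), weights  # per-frame character logits


torch.manual_seed(0)
model = AttentiveEncoder()
model.eval()
feats = torch.randn(2, 50, 128)  # random stand-in for 2 utterances of 50 frames
with torch.no_grad():
    logits, weights = model(feats)
```

Each row of `weights` shows how strongly one output frame attends to every other frame of the same utterance, which can be handy for inspecting what the model has learned.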
Please cite our paper using the citation below if you use our work in any way:
@article{2103.07762,
  Author = {Bonaventure F. P. Dossou and Chris C. Emezue},
  Title = {OkwuGbé: End-to-End Speech Recognition for Fon and Igbo},
  Year = {2021},
  Eprint = {arXiv:2103.07762},
  Howpublished = {African NLP, EACL 2021}
}