GitHub - szymonrucinski/good-mood: AlexNet trained on mel spectrograms from the Berlin EmoDB dataset to recognize speakers' emotions

Introduction

The goal of this project is to build a production-ready API and application that classifies customers' emotions from audio recordings. The model is trained on images, namely mel spectrograms of the audio files. It is trained from scratch and uses the AlexNet architecture.

Client

The application provides an intuitive user interface. It allows users to upload audio files and get predictions.

Dataset

The dataset used is EMO-DB. It contains 4.5k audio files covering 7 emotions. Dataset is available here. It contains recordings of 10 different speakers. Each speaker recorded 15 sentences in 7 different emotions. Each sentence was recorded 3 times, in German.
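EMO-DB file names encode the label, so the emotion can be read directly from the name. The sketch below assumes the standard EMO-DB naming scheme (for example `03a01Fa.wav`: two-digit speaker, three-character sentence code, one German emotion letter, one take letter). The function name `parse_emodb_name` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Emotion letter (German initial) -> English label, following the EMO-DB naming scheme.
EMOTION_CODES = {
    "W": "anger",      # Wut
    "L": "boredom",    # Langeweile
    "E": "disgust",    # Ekel
    "A": "fear",       # Angst
    "F": "happiness",  # Freude
    "T": "sadness",    # Trauer
    "N": "neutral",
}


def parse_emodb_name(path):
    """Split a name such as '03a01Fa.wav' into speaker, sentence, emotion and take."""
    stem = Path(path).stem
    if len(stem) != 7 or stem[5] not in EMOTION_CODES:
        raise ValueError(f"unexpected EMO-DB file name: {path!r}")
    return {
        "speaker": stem[0:2],
        "sentence": stem[2:5],
        "emotion": EMOTION_CODES[stem[5]],
        "take": stem[6],
    }


print(parse_emodb_name("wav/03a01Fa.wav"))
```

The resulting labels could then be stored next to each spectrogram image when the training set is assembled.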

Architecture

The model is based on the AlexNet architecture, chosen for its simplicity and good performance. It is composed of 5 convolutional layers and 3 fully connected layers, and uses the ReLU activation function and max pooling. It was trained for 10 epochs with a batch size of 32 and achieved an accuracy of 0.7 on the validation set.
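To make the layer structure concrete, the sketch below traces how the spatial size of an input image changes through the 5 convolutional layers and the max pooling layers. It assumes the kernel sizes, strides, paddings and channel counts of the AlexNet variant shipped with torchvision and a 224x224 RGB input. The model in this repository may use different values.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution or pooling window along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# (kind, out_channels, kernel, stride, padding); values assumed from torchvision's AlexNet.
ALEXNET_FEATURES = [
    ("conv", 64, 11, 4, 2),
    ("pool", None, 3, 2, 0),
    ("conv", 192, 5, 1, 2),
    ("pool", None, 3, 2, 0),
    ("conv", 384, 3, 1, 1),
    ("conv", 256, 3, 1, 1),
    ("conv", 256, 3, 1, 1),
    ("pool", None, 3, 2, 0),
]


def trace_shapes(size=224, channels=3):
    """Return (kind, channels, height, width) after every feature layer."""
    shapes = []
    for kind, out_channels, kernel, stride, padding in ALEXNET_FEATURES:
        size = conv_out(size, kernel, stride, padding)
        if kind == "conv":
            channels = out_channels  # pooling keeps the channel count
        shapes.append((kind, channels, size, size))
    return shapes


shapes = trace_shapes()
for shape in shapes:
    print(shape)

# Number of inputs seen by the first fully connected layer.
_, final_channels, final_height, final_width = shapes[-1]
flattened_size = final_channels * final_height * final_width
print(flattened_size)
```

Under these assumptions the feature extractor ends with a 256x6x6 tensor, which is flattened before the 3 fully connected layers.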

Convolutional layers extract features from the images, and fully connected layers classify them. Max pooling reduces the spatial size of the feature maps. The ReLU activation function introduces non-linearity into the model. The network is trained on the following mel spectrograms.
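As a small illustration of the two non-convolutional operations described above, the sketch below applies ReLU and 2x2 max pooling to a tiny feature map using plain Python lists. It is only meant to show what the operations compute. The actual model would rely on the corresponding PyTorch layers, and the window size and stride used here are assumptions.

```python
def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled.append([
            max(feature_map[r + dr][c + dc]
                for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, stride)
        ])
    return pooled


feature_map = [
    [-1.0, 2.0, 0.5, -3.0],
    [4.0, -2.0, 1.5, 0.0],
    [0.0, -1.0, -0.5, -2.0],
    [3.0, 1.0, -4.0, 2.5],
]

activated = relu(feature_map)
pooled = max_pool2d(activated)
print(pooled)  # [[4.0, 1.5], [3.0, 2.5]]
```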

Stack

The application is written in Python. It uses PyTorch to build and train the models, FastAPI to expose the API, Docker to containerize the application, Librosa to extract features from audio files, Pandas to load and manipulate data, NumPy to manipulate data, Matplotlib to plot data, scikit-learn to split data into train and test sets, and SciPy to save and load the model.

Build

The run_docker.sh script contains all the commands necessary to build and run the container. It runs the container and exposes the API and the Jupyter Notebook server on ports 4444 and 8888, respectively.

chmod +x run_docker.sh
./run_docker.sh

Accessing JupyterNotebooks

Querying Flask API

The API was tested using Postman. It can be queried using the following configuration.
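For scripted access without Postman, a request could look roughly like the sketch below, which uses only the Python standard library. Port 4444 comes from the Build section. The `/predict` route, the `file` form field name and the JSON response are assumptions, not confirmed by this README, so adjust them to match the running service.

```python
import json
import os
import urllib.request
import uuid

# Hypothetical route; only the port is taken from the Build section.
API_URL = "http://localhost:4444/predict"


def build_multipart(field, filename, payload, content_type="audio/wav"):
    """Encode one file as a multipart/form-data body; return (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def predict(wav_path, url=API_URL, field="file"):
    """Upload a WAV file and return the decoded JSON response (assumed format)."""
    with open(wav_path, "rb") as handle:
        body, header = build_multipart(field, os.path.basename(wav_path), handle.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `predict("sample.wav")` would upload the file and return the parsed response, provided the route and field name match the service.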
