The dataset contains two folders, train and valid. Each folder contains speech utterances as .wav files belonging to five classes: disgust, fear, happy, neutral, and sad. The task is to classify each utterance into one of these classes. We explore different audio features to see which works best for model learning, and we use CNNs in the model architecture.
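As an illustration of what one such audio feature could look like, here is a minimal sketch that turns a waveform into a log-magnitude spectrogram and adds a channel axis so it has the image-like shape a 2D CNN usually takes. This is not necessarily the feature or the settings used in this project: the function name `log_spectrogram`, the 16 kHz sampling rate, and the 25 ms / 10 ms frame and hop sizes are all assumptions made for the example, and a synthetic tone stands in for a real .wav utterance.

```python
import numpy as np

# The five classes named in the dataset description.
CLASSES = ["disgust", "fear", "happy", "neutral", "sad"]


def log_spectrogram(signal, frame_len=400, hop=160, eps=1e-10):
    """Return a (frames, frame_len // 2 + 1) log-magnitude spectrogram.

    frame_len and hop are illustrative values (25 ms / 10 ms at an assumed
    16 kHz sampling rate), not settings taken from the project.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        # Zero-pad short clips so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + eps)


# Synthetic one-second 440 Hz tone standing in for a real utterance.
sr = 16000  # assumed sampling rate
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(tone)
# Trailing channel axis gives the (height, width, channels) layout of an image.
cnn_input = spec[..., np.newaxis]
print(spec.shape, cnn_input.shape)
```

Other candidate features, such as mel spectrograms or MFCCs, can be passed to a CNN in the same way, as a 2D time-frequency array with a channel axis.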
The model weights are available here: https://drive.google.com/file/d/1y0V5cXVUFzimo3xhcES6-Ukk-Qk3fYpt/view?usp=sharing