dheerajkallakuri/High-Accuracy-Keyword-Spotting-on-Edge
High-Accuracy Keyword Spotting on Edge

Overview

This project develops a low-power, real-time embedded audio-analysis system that detects specific absolutist keywords, which can serve as markers of mental-health-related language. The system is built around an Arduino Nano BLE Sense board equipped with a digital microphone, enabling audio data collection, model training, and real-time keyword detection.

Project Phases

Phase 1: Data Collection

  1. Gather Audio Samples: Record audio samples for a set of given absolutist keywords.
  2. Expand Dataset: Integrate the gathered audio samples with the existing Speech Command dataset to create a comprehensive keyword spotting dataset.
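As a rough illustration of step 2, the merge can be scripted with the Python standard library. The layout below (one subdirectory per label under data/keywords, mirroring the Speech Command dataset's one-folder-per-word structure) is an assumption for the sketch, not necessarily the project's exact paths:

```python
import shutil
from pathlib import Path

def merge_keywords(keywords_dir, dataset_dir):
    """Copy recorded clips from <keywords_dir>/<label>/*.wav into the
    Speech Commands layout (<dataset_dir>/<label>/*.wav)."""
    keywords_dir, dataset_dir = Path(keywords_dir), Path(dataset_dir)
    copied = 0
    for label_dir in sorted(p for p in keywords_dir.iterdir() if p.is_dir()):
        target = dataset_dir / label_dir.name  # e.g. dataset/all/
        target.mkdir(parents=True, exist_ok=True)
        for wav in label_dir.glob("*.wav"):
            shutil.copy2(wav, target / wav.name)
            copied += 1
    return copied
```

Keeping the recorded clips in the same folder-per-label convention as the Speech Command dataset means the combined directory can be fed to the same training pipeline with no special-casing.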

Phase 2: Model Training

  1. Feature Extraction: Extract relevant features from the audio data using techniques discussed in class (e.g., MFCCs, spectrograms).
  2. Model Training: Train a machine learning model to detect the absolutist keywords using the extracted features.
  3. Model Validation: Validate the model to ensure it accurately identifies the keywords.
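To make the feature-extraction step concrete, here is a minimal NumPy-only log-spectrogram sketch. The window and hop sizes are illustrative values, not the project's exact front-end settings; a full MFCC front end would additionally apply a mel filterbank and a DCT:

```python
import numpy as np

def log_spectrogram(audio, frame_len=640, hop=320):
    """Frame a 1-D audio signal and return a log-magnitude spectrogram.

    frame_len=640 / hop=320 correspond to 40 ms windows with 50% overlap
    at a 16 kHz sample rate (illustrative, not the project's settings).
    """
    n_frames = 1 + (len(audio) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([audio[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Real FFT per frame -> magnitude -> log (epsilon avoids log(0))
    spectra = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(spectra + 1e-6)

# One second of 16 kHz audio yields a 49-frame feature matrix.
features = log_spectrogram(np.random.randn(16000))
```

A 2-D feature matrix like this is what the keyword-spotting model consumes in place of raw waveforms.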

Phase 3: Deployment and Testing

  1. Deploy Model: Implement the trained model on the Arduino Nano BLE Sense board.
  2. Real-Time Testing: Test the system for real-time keyword spotting and evaluate its performance.

Dataset

The dataset consists of:

  • Audio samples of absolutist keywords recorded specifically for this project.
  • The existing Speech Command dataset, with which the recorded samples are combined.

Setup

  1. Record Audio Samples:

    • Use a recording device to collect audio samples of the following absolutist keywords: “all,” “must,” “never,” “none,” “only,” and “silence.”
    • Save these samples in the data/keywords directory.
    • The Open Speech Recording tool can be used to record the audio clips.
    • Download the Speech Command dataset.
  2. Integrate Dataset:

    • Combine the recorded samples with the Speech Command dataset in the data directory.
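Before combining the datasets, it can help to confirm each recorded clip matches the format the Speech Command dataset uses (16 kHz, 16-bit, mono, roughly 1 s WAV files). A small standard-library check, written as a sketch under that format assumption:

```python
import wave

def check_clip(path, expected_rate=16000):
    """Verify a clip is mono, 16-bit PCM at 16 kHz and return its
    duration in seconds; mismatched clips would not line up with the
    Speech Commands recordings they are merged with."""
    with wave.open(path, "rb") as wf:
        assert wf.getnchannels() == 1, "expected mono audio"
        assert wf.getsampwidth() == 2, "expected 16-bit samples"
        assert wf.getframerate() == expected_rate, "expected 16 kHz"
        return wf.getnframes() / wf.getframerate()
```

Running this over data/keywords before the merge catches recordings saved at the wrong sample rate or in stereo.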

Training the Model

Train the model in the cloud with Google Colaboratory or locally in a Jupyter Notebook, using model.ipynb.

*Estimated training time: ~2 hours.*

Deployment on Arduino Nano BLE Sense

  1. Convert Model:

    • Convert the trained model to a format compatible with the Arduino board (e.g., TensorFlow Lite).
  2. Deploy Model:

    • Upload the model and the necessary code to the Arduino Nano BLE Sense board.
    • Refer to the micro_speech folder.
  3. Real-Time Testing:

    • Use the test audio clips in the testing_audio folder.
    • Test the system for real-time keyword spotting and evaluate its performance.
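For TensorFlow Lite Micro deployments, the converted .tflite model is typically compiled into the sketch as a C byte array (the micro_speech example generates one with `xxd -i`). The steps above can be sketched with a stdlib-only Python equivalent; the `g_model` name follows the micro_speech convention but should be treated as an assumption here:

```python
from pathlib import Path

def tflite_to_c_array(tflite_path, var_name="g_model"):
    """Render a .tflite flatbuffer as C source, similar to `xxd -i`,
    so it can be compiled into the Arduino sketch."""
    data = Path(tflite_path).read_bytes()
    # Emit the bytes as hex literals, 12 per line, for readability.
    body = ",\n  ".join(
        ", ".join(f"0x{b:02x}" for b in data[i : i + 12])
        for i in range(0, len(data), 12)
    )
    return (f"const unsigned char {var_name}[] = {{\n  {body}\n}};\n"
            f"const unsigned int {var_name}_len = {len(data)};\n")
```

The resulting source file replaces the model data the micro_speech sketch ships with, after which the sketch can be compiled and uploaded to the board as usual.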

Results

  • The model reaches 96% accuracy after training.
  • In addition to the five keywords, “unknown” and “silence” classes are included during training.
  • The confusion matrix shows high accuracy on keywords such as “all,” “only,” and “silence.”
  • Performance is comparatively weaker on “must,” “none,” and “never.”

Video Demonstration

For a visual demonstration of this project, please refer to the video linked below:

Project Video Demonstration


Reference

TensorFlow Lite micro_speech example
