Skip to content

Multi-Lingual-Speech-App enables real-time transcription, translation, and speech synthesis across English, Spanish, and French. A good tool for multilingual communication.

Notifications You must be signed in to change notification settings

Op27/Multi-Lingual-Speech-App

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 

Repository files navigation

Multi-Lingual Speech App

Overview

The Multi-Lingual Speech App uses AI to automate the flow from audio recording through translation to speech synthesis. This application is designed to transcribe speech into text, translate the text from English to Spanish or French, and then synthesize the translated text back into speech.

Demo Video

Demo.mp4

Key Features

  • Automated Audio Processing Workflow: Streamlines the process of converting spoken words into another language's speech output.
  • Speech-to-Text Transcription: Utilizes Google Speech-to-Text API to accurately transcribe audio recordings into text.
  • Multi-Language Translation: Employs DeepL API for high-quality translations from English to Spanish or French.
  • Text-to-Speech Synthesis: Uses Google Text-to-Speech API to generate natural-sounding speech in Spanish or French from the translated text.

How to Use

  1. Choose your target language (Spanish or French) from dropdown.
  2. Start by recording your speech through the application interface.
  3. The app automatically transcribes the speech into text using the Google Speech-to-Text API.
  4. The app translates the text using DeepL
  5. Finally, listen to the translated speech, synthesized by the Google Text-to-Speech API.

Installation

See the setion below titled "Installation Guide".

License

This project is licensed under the MIT License. See the LICENSE file for details.


Installation Guide

This section guides you through setting up the Multi-Lingual Speech App on your local machine for development and testing purposes. Follow these steps to get a copy of the project up and running.

Prerequisites

Before installing the application, ensure you have the following:

  • Python 3.6 or later installed on your system.
  • Pip for installing Python packages.

Obtaining API Keys

  1. Google Cloud Speech-to-Text and Text-to-Speech APIs:

    • Visit the Google Cloud Console.
    • Create a new project or select an existing one.
    • Enable the Speech-to-Text and Text-to-Speech APIs for your project.
    • Go to the "Credentials" page and create a new API key for your application.
  2. DeepL API:

    • Sign up for an account at DeepL.
    • Access the API subscription page and subscribe to a plan that suits your needs.
    • Obtain your DeepL API key from the account overview or API plan details.

Setting Up the Application

  1. Clone the Repository:

    • Clone this repository to your local machine using git clone https://github.com/Op27/Multi-Lingual-Speech-App.git.
  2. Install Dependencies:

    • Navigate to the project directory and install the required Python packages using:
      pip install -r requirements.txt
      
  3. Configure API Keys:

    • For security reasons, it's best to set your API keys as environment variables. On your system, set the following variables:
      • For Windows:
        set GOOGLE_APPLICATION_CREDENTIALS="path_to_your_google_credentials_json_file"
        set DEEPL_API_KEY="your_deepl_api_key"
        
      • For Unix/Linux/Mac:
        export GOOGLE_APPLICATION_CREDENTIALS="path_to_your_google_credentials_json_file"
        export DEEPL_API_KEY="your_deepl_api_key"
        
    • Replace path_to_your_google_credentials_json_file with the path to the JSON file containing your Google Cloud credentials, and your_deepl_api_key with your actual DeepL API key.
  4. Run the Application:

    • With the environment variables set and dependencies installed, you can now run the application. Navigate to the app's directory and execute:
      python app.py
      
  5. Accessing the Application:

    • Open your web browser and go to http://localhost:5000 (or whichever port your application runs on) to start using the app.

About

Multi-Lingual-Speech-App enables real-time transcription, translation, and speech synthesis across English, Spanish, and French. A good tool for multilingual communication.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published