Merge pull request #225 from saksham0103/main
Text-to-speech api
gantavyamalviya authored Oct 13, 2022
2 parents c21a26c + ad0b966 commit 4f60aed
Showing 7 changed files with 253 additions and 0 deletions.
47 changes: 47 additions & 0 deletions Text-to-Speech-main/README.md
@@ -0,0 +1,47 @@
## Text to Speech (Speech Synthesis)
A Text-to-Speech web app that narrates on-screen text aloud in real time, built with the browser's Web Speech API. It uses the SpeechSynthesis interface, which lets programs read out their text content (normally via the device's default speech synthesizer).

The site is live at: https://text2speeches.netlify.app/

<a target="_blank" href="https://text2speeches.netlify.app/">
<img src="https://github.com/rahulkarda/Text-to-Speech/blob/main/images/text2speeches.jpg?raw=true" width="100%" alt="Text To Speech Converter"/>
</a>
<br>

## Tech Stack
![](https://img.shields.io/badge/Code-HTML5-informational?style=flat&logo=html5&logoColor=white&color=brightgreen)
![](https://img.shields.io/badge/Code-CSS3-informational?style=flat&logo=css3&logoColor=white&color=brightgreen)
![](https://img.shields.io/badge/Code-JavaScript-informational?style=flat&logo=javascript&logoColor=white&color=brightgreen)
![](https://img.shields.io/badge/Code-Bootstrap-informational?style=flat&logo=bootstrap&logoColor=white&color=brightgreen)

[Bootstrap](https://getbootstrap.com/) is a free and open-source CSS framework directed at responsive, mobile-first front-end web development.

[Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API) enables you to incorporate voice data into web apps. It has two parts: SpeechSynthesis (Text-to-Speech) and SpeechRecognition (Asynchronous Speech Recognition). The SpeechSynthesis interface allows programs to read out their text content (normally via the device's default speech synthesizer).
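
As a minimal sketch of the SpeechSynthesis half of the API, the snippet below speaks a string with the default voice. `speakText` is a hypothetical helper name that does not appear in this project, and the `synth` parameter is an assumption added so the function can be exercised outside a browser.

```javascript
// Hypothetical helper (not part of this project): speak `text` once.
// `synth` defaults to the browser's global speechSynthesis object.
function speakText(text, synth = globalThis.speechSynthesis) {
  // Feature-detect first: the API is missing in some environments.
  if (!synth || typeof globalThis.SpeechSynthesisUtterance !== 'function') {
    return false;
  }
  // An utterance wraps the text plus any voice settings.
  const utterance = new globalThis.SpeechSynthesisUtterance(text);
  synth.speak(utterance);
  return true;
}
```

In a browser this would typically be called from a click handler, for example `speakText('Hello from the Web Speech API')`. The boolean return value only reports whether the API was available, not whether audio was actually produced.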

## Interfaces Used
<h3>Speech Synthesis</h3>

1. SpeechSynthesis - The controller interface for the speech service. It can be used to retrieve information about the synthesis voices available on the device, start and pause speech, and issue other commands.
2. SpeechSynthesisUtterance - Represents a speech request. It contains the content the speech service should read and information about how to read it (e.g. language, pitch and volume).
3. SpeechSynthesisVoice - Represents a voice that the system supports. Every SpeechSynthesisVoice has its own relative speech service, including information about language, name and URI.
4. Window.speechSynthesis - A property of the Window object that returns a SpeechSynthesis object, the entry point into the Web Speech API's speech synthesis functionality.
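
The sketch below shows one way the settings on a SpeechSynthesisUtterance could be applied safely. The helper names `clampOption` and `configureUtterance` are hypothetical and not part of this project. The numeric ranges follow the Web Speech API's documented limits for `rate` (0.1 to 10), `pitch` (0 to 2) and `volume` (0 to 1); falling back to 1 for non-numeric input is an assumption of this sketch.

```javascript
// Documented ranges for utterance properties in the Web Speech API.
const LIMITS = { rate: [0.1, 10], pitch: [0, 2], volume: [0, 1] };

// Hypothetical helper: coerce a form value to a number inside the allowed range.
function clampOption(name, value) {
  const [min, max] = LIMITS[name];
  const n = Number(value);
  if (Number.isNaN(n)) return 1; // assumption: use the API default of 1
  return Math.min(max, Math.max(min, n));
}

// Hypothetical helper: copy a voice and clamped settings onto an utterance.
// `voice` would be one of the SpeechSynthesisVoice objects from getVoices().
function configureUtterance(utterance, { voice = null, rate = 1, pitch = 1 } = {}) {
  utterance.voice = voice; // null means "use the default voice"
  utterance.rate = clampOption('rate', rate);
  utterance.pitch = clampOption('pitch', pitch);
  return utterance;
}
```

In a browser, the result could be passed straight to `window.speechSynthesis.speak(...)`. Clamping in one place keeps slider values that arrive as strings from reaching the utterance out of range.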

## Optimizations
To improve this project, I would start by implementing the following features:

1. Adding support for more languages
2. Adding more voice types
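
One possible starting point for multi-language support, sketched under assumptions: instead of keeping only English voices, the voice list could be grouped by primary language so the UI can offer a language picker. `groupVoicesByLanguage` is a hypothetical helper that is not in the current code. It only relies on each voice having a `lang` string such as `en-US`, and it also splits on underscores in case a platform reports tags like `en_US`.

```javascript
// Hypothetical helper: group voices by primary language subtag ("en", "hi", ...).
// `voices` is an array of objects with a `lang` property, such as the
// SpeechSynthesisVoice objects returned by speechSynthesis.getVoices().
function groupVoicesByLanguage(voices) {
  const groups = new Map();
  for (const voice of voices) {
    // Take the part before the first "-" or "_" and normalise its case.
    const language = voice.lang.split(/[-_]/)[0].toLowerCase();
    if (!groups.has(language)) {
      groups.set(language, []);
    }
    groups.get(language).push(voice);
  }
  return groups;
}
```

The keys of the returned `Map` could populate a language dropdown, with the matching voices shown in a second dropdown once a language is chosen.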

## Lessons Learned
My learning focused on using the Web Speech API to build a simple UI where the user can type some text, start speech synthesis to convert it to speech, and choose narration options such as different voice types. I learned how the Web Speech API works and the difference between SpeechSynthesis (Text-to-Speech) and SpeechRecognition (Asynchronous Speech Recognition).











Binary file added Text-to-Speech-main/favicon.ico
Binary file not shown.
1 change: 1 addition & 0 deletions Text-to-Speech-main/images/note.txt
@@ -0,0 +1 @@
//Add screenshot of text to speech web app
Binary file added Text-to-Speech-main/images/text2speeches.jpg
44 changes: 44 additions & 0 deletions Text-to-Speech-main/index.html
@@ -0,0 +1,44 @@
<!-- Developed by saksham gupta -->
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <link rel="shortcut icon" href="favicon.ico" type="image/x-icon">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Speech Synthesis</title>
  <script src="https://kit.fontawesome.com/4fc31833c8.js" crossorigin="anonymous"></script>
  <link href='https://fonts.googleapis.com/css?family=Pacifico' rel='stylesheet' type='text/css'>
  <link rel="stylesheet" href="style.css">
</head>
<body>

  <div class="voiceinator">

    <h1>The Voiceinator 2002</h1>

    <select name="voice" id="voices">
      <option value="">Select A Voice</option>
    </select>

    <!-- Each label's "for" must match an input id; rate below 0.1 is out of range for the API -->
    <label for="rate">Rate:</label>
    <input name="rate" id="rate" type="range" min="0.1" max="3" value="1" step="0.1">

    <label for="pitch">Pitch:</label>
    <input name="pitch" id="pitch" type="range" min="0" max="2" value="1" step="0.1">

    <textarea name="text">Hello! I'm saksham. 👋</textarea>
    <button id="stop">Stop!</button>
    <button id="speak">Speak</button>

    <div class="socials">
      <a href="https://github.com/saksham0103"><i class="fa-brands fa-github"></i></a>
      <p><a href="#" id="dev">Developed by Saksham Gupta</a></p>
    </div>
  </div>

  <script src="script.js"></script>

</body>
</html>
39 changes: 39 additions & 0 deletions Text-to-Speech-main/script.js
@@ -0,0 +1,39 @@
// A single utterance is reused; changing any of its settings restarts playback.
const msg = new SpeechSynthesisUtterance();
let voices = [];
const voicesDropdown = document.querySelector('[name="voice"]');
const options = document.querySelectorAll('[type="range"], [name="text"]');
const speakButton = document.querySelector('#speak');
const stopButton = document.querySelector('#stop');
msg.text = document.querySelector('[name="text"]').value;

// Fill the dropdown with the English voices the browser exposes.
// Runs as a `voiceschanged` listener, so `this` is the speechSynthesis object.
function populateVoices() {
  voices = this.getVoices();
  voicesDropdown.innerHTML = voices
    .filter(voice => voice.lang.includes('en'))
    .map(voice => `<option value="${voice.name}">${voice.name} (${voice.lang})</option>`)
    .join('');
}

function setVoice() {
  msg.voice = voices.find(voice => voice.name === this.value);
  toggle();
}

// Cancel any speech in progress and, unless told otherwise, start again.
function toggle(startOver = true) {
  speechSynthesis.cancel();
  if (startOver) {
    speechSynthesis.speak(msg);
  }
}

// Copy a changed control (rate, pitch or text) onto the utterance by its name.
function setOption() {
  console.log(this.name, this.value);
  msg[this.name] = this.value;
  toggle();
}

speechSynthesis.addEventListener('voiceschanged', populateVoices);
voicesDropdown.addEventListener('change', setVoice);
options.forEach(option => option.addEventListener('change', setOption));
// Wrap the call so the click event is not passed in as `startOver`.
speakButton.addEventListener('click', () => toggle());
stopButton.addEventListener('click', () => toggle(false));
122 changes: 122 additions & 0 deletions Text-to-Speech-main/style.css
@@ -0,0 +1,122 @@
html {
  font-size: 10px;
  box-sizing: border-box;
}

*, *:before, *:after {
  box-sizing: inherit;
}

body {
  margin: 0;
  padding: 0;
  font-family: sans-serif;
  background-color: #3BC1AC;
  display: flex;
  min-height: 100vh;
  align-items: center;

  background-image:
    radial-gradient(circle at 100% 150%, #3BC1AC 24%, #42D2BB 25%, #42D2BB 28%, #3BC1AC 29%, #3BC1AC 36%, #42D2BB 36%, #42D2BB 40%, transparent 40%, transparent),
    radial-gradient(circle at 0 150%, #3BC1AC 24%, #42D2BB 25%, #42D2BB 28%, #3BC1AC 29%, #3BC1AC 36%, #42D2BB 36%, #42D2BB 40%, transparent 40%, transparent),
    radial-gradient(circle at 50% 100%, #42D2BB 10%, #3BC1AC 11%, #3BC1AC 23%, #42D2BB 24%, #42D2BB 30%, #3BC1AC 31%, #3BC1AC 43%, #42D2BB 44%, #42D2BB 50%, #3BC1AC 51%, #3BC1AC 63%, #42D2BB 64%, #42D2BB 71%, transparent 71%, transparent),
    radial-gradient(circle at 100% 50%, #42D2BB 5%, #3BC1AC 6%, #3BC1AC 15%, #42D2BB 16%, #42D2BB 20%, #3BC1AC 21%, #3BC1AC 30%, #42D2BB 31%, #42D2BB 35%, #3BC1AC 36%, #3BC1AC 45%, #42D2BB 46%, #42D2BB 49%, transparent 50%, transparent),
    radial-gradient(circle at 0 50%, #42D2BB 5%, #3BC1AC 6%, #3BC1AC 15%, #42D2BB 16%, #42D2BB 20%, #3BC1AC 21%, #3BC1AC 30%, #42D2BB 31%, #42D2BB 35%, #3BC1AC 36%, #3BC1AC 45%, #42D2BB 46%, #42D2BB 49%, transparent 50%, transparent);
  background-size: 100px 50px;
}

.voiceinator {
  padding: 2rem;
  width: 50rem;
  margin: 0 auto;
  border-radius: 1rem;
  position: relative;
  background: white;
  overflow: hidden;
  z-index: 1;
  box-shadow: 0 0 5px 5px rgba(0,0,0,0.1);
}

h1 {
  width: calc(100% + 4rem);
  margin: -2rem 0 2rem -2rem;
  padding: .5rem;
  background: #ffc600;
  border-bottom: 5px solid #F3C010;
  text-align: center;
  font-size: 5rem;
  font-weight: 100;
  font-family: 'Pacifico', cursive;
  text-shadow: 3px 3px 0 #F3C010;
}

.voiceinator input,
.voiceinator button,
.voiceinator select,
.voiceinator textarea {
  width: 100%;
  display: block;
  margin: 10px 0;
  padding: 10px;
  border: 0;
  font-size: 2rem;
  background: #F7F7F7;
  outline: 0;
}

textarea {
  height: 20rem;
}

.voiceinator button {
  background: #ffc600;
  border: 0;
  width: 49%;
  float: left;
  font-family: 'Pacifico', cursive;
  margin-bottom: 0;
  font-size: 2rem;
  border-bottom: 5px solid #F3C010;
  cursor: pointer;
  position: relative;
}

.voiceinator button:active {
  top: 2px;
}

.voiceinator button:nth-of-type(1) {
  margin-right: 2%;
}

.voiceinator {
  width: 50vw;
  margin-top: 5vh;
  margin-bottom: 5vh;
}

.socials {
  font-size: 2rem;
  clear: both;
  text-align: center;
  padding: 5rem 2rem 2rem 2rem;
}

.fa-github {
  font-size: 35px;
}

.socials a {
  color: black;
  text-decoration: none;
}

.socials a:hover {
  text-decoration: underline;
}

@media screen and (max-width: 800px) {
  .voiceinator {
    width: 33rem;
  }
  h1 {
    font-size: 3rem;
  }
}
