MBTA Pronunciation Lexicon

This repo stores a Pronunciation Lexicon used to help AWS Polly pronounce MBTA-specific terms and place names correctly.

Status

Recommended Use

The lexicon uses the W3C Pronunciation Lexicon Specification (PLS) format, a standard that other text-to-speech engines can also read. However, this lexicon is intended only for use with AWS Polly's neural engine, and it includes some "cheats" where we specify slightly incorrect IPA to get Polly to pronounce something more naturally. Use with other TTS engines is currently neither supported nor recommended.

Comprehensiveness

This lexicon is not comprehensive: it contains only "fixes" for specific issues we've noticed with AWS Polly, rather than being a database of correct pronunciations for every possible MBTA term. It will evolve over time as we notice more issues.

This repo includes automation that keeps the copy of the lexicon in our AWS account synced with the committed copy, so our own apps that use Polly will always have the most up-to-date corrections.

Development

The lexicon contains two main types of entry: phonemes and aliases.

Phonemes

Provide a phonetic pronunciation of a word or phrase using IPA.

Example:

<lexeme>
  <grapheme>Quincy</grapheme>
  <phoneme>ˈkwɪnzi</phoneme>
</lexeme>

Careful: the IPA primary stress character, ˈ (U+02C8), looks very similar to a single quote, ' (U+0027), but the two are not interchangeable.

Aliases

Replace one word or phrase with another. Useful for expanding acronyms.

Example:

<lexeme>
  <grapheme>MBTA</grapheme>
  <alias>Massachusetts Bay Transportation Authority</alias>
</lexeme>
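
For context, both kinds of entry sit inside a single `<lexicon>` root element. The sketch below shows roughly how the two examples above might look in a complete PLS file. The root attributes follow the general shape AWS documents for Polly lexicons; the `xml:lang` value of `en-US` is an assumption here, and the actual file in this repo may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal sketch of a full lexicon file; xml:lang="en-US" is assumed. -->
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa"
    xml:lang="en-US">

  <!-- Phoneme entry: spells out the pronunciation in IPA. -->
  <lexeme>
    <grapheme>Quincy</grapheme>
    <phoneme>ˈkwɪnzi</phoneme>
  </lexeme>

  <!-- Alias entry: substitutes replacement text before synthesis. -->
  <lexeme>
    <grapheme>MBTA</grapheme>
    <alias>Massachusetts Bay Transportation Authority</alias>
  </lexeme>
</lexicon>
```

The `alphabet="ipa"` attribute on the root is what tells the engine to read every `<phoneme>` value as IPA.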

Testing

The AWS Polly dashboard can be used to test speech synthesis.

  • Select the Neural engine.
  • Enable the Customize pronunciation switch and select the MBTA lexicon.
  • You can enable the SSML switch and use phoneme tags to test potential changes without uploading a whole new lexicon. Note that when this is enabled, the input must be enclosed in a <speak> tag.
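
As a rough illustration of the SSML approach, input along these lines could be pasted into the console to try out a candidate pronunciation. The sentence is made up for the example, and the IPA is the Quincy entry shown earlier.

```xml
<!-- Hypothetical test input; swap in the word and IPA you want to try. -->
<speak>
  The next stop is
  <phoneme alphabet="ipa" ph="ˈkwɪnzi">Quincy</phoneme>
  Center.
</speak>
```

If the result sounds right, the same IPA string can then be moved into a `<phoneme>` entry in the lexicon.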
