No external library dependencies except the Wiktionary database.
Before using this library, you need to run jptranscription\phonetics\wikitionary_processor.py
Afterwards, move the file it produces to jptranscription\phonetics\lang-de.json
I'm developing a script to transcribe German words into Japanese sounds using IPA.
For example, the IPA of "Auto" is "aʊ̯to", and its first sound "Au" ("aʊ̯") is transcribed as "アオ".
My first idea was to break German words into their syllables and then create a Katakana mapping for every syllable. But since there are far too many German syllables (more than 100,000), this approach is not feasible. Then I remembered that there are far fewer sounds and sound combinations once the words are converted into their IPA reading.
So this project attempts to generate the Katakana pronunciation of all German words using IPA.
Since some sound combinations contain shorter sounds or sound combinations that might be pronounced differently on their own, I look for a match starting from the longest possible substring.
I'm using Wiktionary as a lookup table for German words. It has an IPA entry for most of them. Each word is then transcribed into Katakana using its IPA sounds.
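The longest-match idea can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name `transcribe` and the tiny table `IPA_TO_KATAKANA` are made up for this example, and apart from "aʊ̯" → "アオ" (taken from the "Auto" example above) the mappings are assumptions.

```python
# Hypothetical, deliberately tiny IPA -> Katakana table (illustration only).
# Only the "aʊ̯" entry comes from the README; the rest are assumed.
IPA_TO_KATAKANA = {
    "aʊ̯": "アオ",
    "to": "ト",
    "a": "ア",
    "o": "オ",
}


def transcribe(ipa: str, table=None) -> str:
    """Greedy longest-match transcription of an IPA string."""
    if table is None:
        table = IPA_TO_KATAKANA
    longest = max(len(key) for key in table)
    result = []
    pos = 0
    while pos < len(ipa):
        # Try the longest candidate substring first, then shrink it.
        for size in range(min(longest, len(ipa) - pos), 0, -1):
            chunk = ipa[pos:pos + size]
            if chunk in table:
                result.append(table[chunk])
                pos += size
                break
        else:
            # No substring starting at this position is in the table.
            raise ValueError(f"no mapping for {ipa[pos]!r} at position {pos}")
    return "".join(result)


print(transcribe("aʊ̯to"))  # アオト
```

Because the table also contains the shorter sound "a", matching the shortest substring first would consume "a" alone and then get stuck on the rest of the diphthong. Starting from the longest substring picks "aʊ̯" as a whole.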
Note
Pronunciation remarks for Japanese speakers:
- If the word ends in 「ス」, in 99% of cases it is pronounced as just "s" instead of "su".
- For example, 「ハウス」 is pronounced "Haus" (not "Hausu").
- If the word ends in 「ト」, the "o" in "to" may or may not be silent. In that case the reader has to look at the German original and decide case by case.
- I want to thank rhasspy whose library gruut helped me get a basic understanding of German words and their IPA
- I want to thank dmort27, whose library epitran led me to the Wiktionary library
- This website helped me check whether my transcriptions sounded similar
- I want to thank the author of this blog post, which served as my starting point