Category Archives: CMUSphinx

Using the grapheme-to-phoneme feature in CMU Sphinx-4

Foreword: This article summarizes and updates the previous articles [1] on the new grapheme-to-phoneme (g2p) feature in the CMU Sphinx-4 speech recognizer [2]. To support automatic g2p transcription in Sphinx-4, a new weighted finite-state transducer (WFST) library was written in Java [3]; its current API will be presented in a future…

Letter to Phoneme Conversion in CMU Sphinx-4: Literature review

1. Foreword: Currently, Sphinx-4 uses a predefined dictionary to map words to sequences of phonemes. I propose modifications to the Sphinx-4 code that will enable it to use trained models (built with some kind of machine learning algorithm) to map letters to phonemes, and thus to map words to sequences of phonemes without the need of a…