Tag Archives: C++

Using the grapheme-to-phoneme feature in CMU Sphinx-4

Foreword This article summarizes and updates the previous articles [1] related to the new grapheme-to-phoneme (g2p) feature in the CMU Sphinx-4 speech recognizer [2]. To support automatic g2p transcription in Sphinx-4, a new weighted finite-state transducer (WFST) library was created in Java [3], whose current API will be presented in a future… Read More »

Automating the creation of joint multigram language models as WFST: Part 2

(originally posted at http://cmusphinx.sourceforge.net/2012/06/automating-the-creation-of-joint-multigram-language-models-as-wfst-part-2/) Foreword This article presents an updated version of the model training application originally discussed in [1], taking into account the compatibility issues with the phonetisaurus decoder presented in [2]. The updated code introduces routines to regenerate a new binary FST model compatible with the phonetisaurus decoder, as suggested in [2], which will be… Read More »

Automating the creation of joint multigram language models as WFST

Notice: This article is outdated. The application described here is now part of the SphinxTrain application. Please refer to recent articles in the CMUSphinx category for the latest info. (originally posted at http://cmusphinx.sourceforge.net/2012/06/automating-the-creation-of-joint-multigram-language-models-as-wfst/) Foreword Previous articles have introduced the C++ code to align a pronunciation dictionary [1] and shown how this aligned dictionary can be used in… Read More »

Porting phonetisaurus many-to-many alignment python script to C++

Notice: This article is outdated. The application described here is now part of the SphinxTrain application. Please refer to recent articles in the CMUSphinx category for the latest info. (originally posted at http://cmusphinx.sourceforge.net/2012/05/porting-phonetisaurus-many-to-many-alignment-python-script-to-c/) Foreword Following our previous article on phonetisaurus [1] and the decision to use this framework as the g2p conversion method for my GSoC… Read More »

Phonetisaurus: A WFST-driven Phoneticizer – Framework Review

Foreword This article analyzes the phonetisaurus g2p [1], [2] code by describing its main parts and the algorithms behind them. Phonetisaurus is a modular system and includes support for several third-party components. The system has been implemented primarily in Python, but also leverages the OpenFST framework [3]. 1. Overall Architecture The procedure for model… Read More »