Category Archives: phonetisaurus

Compatibility issues using binary fst models generated by OpenGrm NGram Library with phonetisaurus decoder

(originally posted at …) Foreword: Previous articles have shown how to use the OpenGrm NGram Library to encode joint multigram language models as WFSTs [1] and provided code that simplifies and automates fst model training [2]. As described in [1], the binary fst models generated with the procedures described in those articles… Read More »

Using OpenGrm NGram Library for the encoding of joint multigram language models as WFST

(originally posted at …) Foreword: This article reviews the OpenGrm NGram Library [1] and its use for language modeling in ASR. OpenGrm builds on the OpenFst library [2] to create, access, and manipulate n-gram language models, and it can be used as the language model training toolkit for integrating phonetisaurus’ model… Read More »

Porting phonetisaurus many-to-many alignment python script to C++

Notice: This article is outdated. The application described here is now part of the SphinxTrain application. Please refer to recent articles in the CMUSphinx category for the latest info. (originally posted at …) Foreword: Following our previous article on phonetisaurus [1] and the decision to use this framework as the g2p conversion method for my GSoC… Read More »

Phonetisaurus: A WFST-driven Phoneticizer – Framework Review

Foreword: This article analyzes the phonetisaurus g2p [1], [2] code by describing its main parts and the algorithms behind them. Phonetisaurus is a modular system and includes support for several third-party components. The system has been implemented primarily in Python, but also leverages the OpenFst framework [3]. 1. Overall Architecture: The procedure for model… Read More »