Category Archives: GSoC 2012

Porting openFST to java: Part 1

Notice: Parts of this article may be outdated. The Java FST framework has recently seen many API changes and performance improvements. Please refer to recent articles in the Java FST Framework category for the latest info. Foreword This article is the first part of a series of articles on porting OpenFST [1] to Java.… Read More »

Phonetisaurus: A WFST-driven Phoneticizer – Framework Review

Foreword This article tries to analyze the Phonetisaurus g2p [1], [2] code by describing its main parts and the algorithms behind them. Phonetisaurus is a modular system and includes support for several third-party components. The system has been implemented primarily in Python, but also leverages the OpenFST framework [3]. 1. Overall Architecture The procedure for model… Read More »

Letter to Phoneme Conversion in CMU Sphinx-4: Literature review

1. Foreword Currently Sphinx-4 uses a predefined dictionary for mapping words to sequences of phonemes. I propose modifications to the Sphinx-4 code that will enable it to use trained models (through some kind of machine learning algorithm) to map letters to phonemes and thus map words to sequences of phonemes without the need of a… Read More »