Month: July 2012 | ICT Research Blog

Notice: Parts of this article may be outdated. There are many changes to its API and performance improvements recently in the java fst framework. Please refer to recent articles in Java FST Framework category for the latest info.

Foreword

This article, the fourth in a series regarding porting openFST to java, describes the latest version of the java fst library, which contains all the required code for porting phonetisaurus g2p decoder [1] to java and eventually integrate it with sphinx4.

1. Basic Usage

Creating as simple wfst like the one in in figure 1 below is a straightforward procedure

Figure 1: Example weighted finite-state transducer

		// Create a new Semiring
		TropicalSemiring ts = new TropicalSemiring();
		// Create the wfst
		Fst<Double> wfst = new Fst<Double>(ts);
		// Create input and output symbol tables 
		// and assign these to the wfst  
		Mapper<Integer, String> isyms = new Mapper<Integer, String>();
		isyms.put(0, "a");
		isyms.put(1, "b");
		isyms.put(2, "c");
		wfst.setIsyms(isyms);
 
		Mapper<Integer, String> osyms = new Mapper<Integer, String>();
		osyms.put(0, "x");
		osyms.put(1, "y");
		osyms.put(2, "z");
		wfst.setIsyms(osyms);
 
		// Create state 0
		State<Double> state = new State<Double>(ts.zero());
		// Add it to the wfst
		wfst.addState(state);
		// Set it as the start state
		wfst.setStart(s.getId());
		// Add arcs to state 0 
		state.addArc(new Arc<Double>(0, 0, 0.5, "1"));
		state.addArc(new Arc<Double>(1, 1, 1.5, "1"));
 
		// Create state 1
		state = new State<Double>(ts.zero());
		// Add it to the wfst
		wfst.addState(state);
		// Add arcs to state 1
		state.addArc(new Arc<Double>(2, 2, 2.5, "2"));
 
		// Create (final) state 2
		state = new State<Double>(new Weight<Double>(3.5));
		// Add it to the wfst
		wfst.addState(state);

Then this wfst can be serialized as binary java fst

		try {
			wfst.saveModel("path/to/filename");
		} catch (IOException e) {
			e.printStackTrace();
		}

or exported to openFst text format

1	Convert.export(wfst, wfst.getSemiring(), "path/to/exported");

Finally it can also be used for several operations as described in the following section.

2. Fst Operations

As already mentioned, priority was given to operations that are required for the porting of phonetisaurus’ g2p decoder to java. All operations described are defined in their own class having the same name with the operation under the edu.cmu.sphinx.fst.operations package.

2.1. ArcSort

This operation sorts the arcs in an FST per state. It is defined as follows:

1	public static <T extends Comparable<T>> void apply(Fst<T> fst, Comparator<Arc<T>> cmp)

The Comparator cmp can be either ILabelCompare or OlabelCompare which short the arcs according to their input or output labels accordingly.

2.2. Compose

This operation computes the composition of two transducers [2]. It is defined as follows:

1	public static <T extends Comparable<T>> Fst<T> get(Fst<T> fst1, Fst<T> fst2, Semiring<T> semiring)

2.3. Connect

This operation removes all states that are not contained on a path leading from the initial state to a final state. It is defined as follows:

1	public static <T extends Comparable<T>> void apply(Fst<T> fst)

2.4. Determinize

This operation determinizes a weighted transducer. The result will be an equivalent FST that has the property that no state has two transitions with the same input label. For this algorithm, epsilon transitions are treated as regular symbols [3]. It is defined as follows:

1	public static <T extends Comparable<T>> Fst<T> get(Fst<T> fst)

2.5. ExtendFinal

This operations extends a wfst by adding a new final state with weight 0 and adding epsilon transitions from the existing final states to the new one, with weights equal to the existing state’s final weight. It is defined as follows:

1	public static <T extends Comparable<T>> void apply(Fst<T> fst)

Furthermore there is also a procedure to undo this change, which is defined as:

1	public static <T extends Comparable<T>> void undo(Fst<T> fst)

2.6. NShortestPaths

This operation produces an FST containing the n -shortest paths in the input FST [4], [5]. It is defined as follows:

1	public static <T extends Comparable<T>> Fst<T> get(Fst<T> fst, int n)

where n is the number of the best paths to return.

2.7. Project

This operation projects an FST onto its domain or range by either copying each arc’s input label to its output label or vice versa. It is defined as follows:

1	public static <T extends Comparable<T>> void apply(Fst<T> fst, ProjectType pType)

where pType is an enumeration talking values either INPUT or OUTPUT which project the input labels to the output or the output to the input accordingly.

2.8. Reverse

This operation reverses an FST. Internally it first extends the input to a single final state and this extension is undone before exiting leaving the input fst unchanged. It is defined as follows:

1	public static <T extends Comparable<T>> Fst<T> get(Fst<T> fst)

2.9. RmEpsilon

This operation removes epsilon-transitions (when both the input and output label are an epsilon) from a transducer. The result will be an equivalent FST that has no such epsilon transitions [2]. It is defined as follows:

1	public static <T extends Comparable<T>> Fst<T> get(Fst<T> fst)

3. Performance in g2p decoding

The performance of the fst java library, was evaluated in g2p decoding by porting the phonetisaurus’ g2p decoder to java (available at [6]). N-gram fst models were created (for n=6,7,8,9 and 10) and then loaded in the decoder in order to phoneticize 5 different words. The loading time for each model and also the time of various operations taking place during the decoding, were measured and their average are summarized in the table below. The table also shows the memory used by the java VM after loading the model (this refers more or less to the memory needed by the model) and also the maximum amount of memory utilized during the g2p decoding. All tests were performed on an Intel Core Duo CPU running openSuSE 12.1 x64.

Table 1: Performance tests of java g2p decoder

The graphs below visualize the first four rows of the above table.

Figure 2: Performance graphs for the java g2p decoder

4. Conclusion – Future work

Studying the performance table in the previous section it is clear that the critical procedure in the decoding process is the model loading (deserialization) which usually take more than a minute to complete. Although this is an issue that needs to be fixed, a quick workaround is to load it in advance and keep a copy of it in memory for providing pronunciations for all required words. This of course comes with additional requirements for memory which are more or less equal to the amount shown in 3rd in the previous section’s table (row “Memory usage after load”).

Next step is of course to evaluate the g2p decoder’s ability to provide correct pronunciations for words and compare it with the original pronunciations produced by phonetisaurus. Having a same quality java g2p decoder will first of confirm the correctness of the java fst library code and enable us to continue with its integration with CMUSphinx 4.

References
[1] J. Salatas, “Phonetisaurus: A WFST-driven Phoneticizer – Framework Review”, ICT Research Blog, May 2012.
[2] M. Mohri, “Weighted automata algorithms”, Handbook of Weighted Automata. Springer, pp. 213-250, 2009.
[3] M. Mohri, “Finite-State Transducers in Language and Speech Processing”, Computational Linguistics, 23:2, 1997.
[4] M. Mohri, “Semiring Framework and Algorithms for Shortest-Distance Problems”, Journal of Automata, Languages and Combinatorics, 7(3), pp. 321-350, 2002.
[5] M. Mohri, M. Riley, “An Efficient Algorithm for the n-best-strings problem”, Proceedings of the International Conference on Spoken Language Processing 2002 (ICSLP ’02).
[6] G2P java decoder SVN repository

(originally posted at http://cmusphinx.sourceforge.net/2012/07/porting-openfst-to-java-part-3/)

Foreword

This article, the third in a series regarding, porting openFST to java, introduces the latest update to the java code, which resolve the previously raised issues regarding the java fst architecture in general and its compatibility with the original openFST format for saving models. [1]

1. Code Changes

1.1. Simplified java generics usage

As suggested in [1], the latest java fst code revision (11456), available in the cmusphinx SVN Repository [2], assumes only the base Weight class and modifies the State, Arc and Fst classes definition to simply use a type parameter.

The above modifications provide an easier to use api. As an example the construction of a basic FST in the class edu.cmu.sphinx.fst.demos.basic.FstTest is simplified as follows

...
Fst<Double> fst = new Fst<Double>();
 
// State 0
State<Double> s = new State<Double>(); 
s.AddArc(new Arc<Double>(new Weight<Double>(0.5), 1, 1, 1));
s.AddArc(new Arc<Double>(new Weight<Double>(1.5), 2, 2, 1));
fst.AddState(s);
 
// State 1
s = new State<Double>();
s.AddArc(new Arc<Double>(new Weight<Double>(2.5), 3, 3, 2));
fst.AddState(s);
 
// State 2 (final)
s = new State<Double>(new Weight<Double>(3.5));
fst.AddState(s);
...

1.2. openFST models compatibilty

Besides the simplified java generics usage above, the most important change is the code to load an openFST model in text format and convert it to a java fst serialized model. This is achieved also in the latest java fst code revision (11456) [2].

2. Converting openFST models to java

2.1. Installation

The procedure below is tested on an Intel CPU running openSuSE 12.1 x64 with gcc 4.6.2, Oracle Java Platform (JDK) 7u5, and ant 1.8.2.

In order to convert an openFST model in text format to java fst model, the first step is to checkout from the cmusphinx SVN repository the latest java fst code revision:

# svn co https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/branches/g2p/fst
Next step is to build the java fst code
# cd fst # ant jar Buildfile: <path-to>/fst/build.xml jar: build-subprojects: init: [mkdir] Created dir: <path-to>/fst/bin build-project: [echo] fst: <path-to>/fst/build.xml [javac] <path-to>/fst/build.xml:38: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds [javac] Compiling 10 source files to <path-to>/fst/bin [javac] <path-to>/fst/build.xml:42: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds build: [jar] Building jar: <path-to>/fst/fst.jar BUILD SUCCESSFUL Total time: 2 seconds #

2.2. Usage

Having completed the installation as described above, and trained an openfst model named binary.fst as described in [3], with the latest model training code revision (11455) [4] the model is also saved in the openFST text format in a file named binary.fst.txt. The conversion to a java fst model is performed using the openfst2java.sh which can be found in the root directory of the java fst code. The openfst2java.sh accepts two parameters being the openfs input text model and the java fst output model as follows:

# ./openfst2java.sh binary.fst.txt binary.fst.ser Parsing input model... Saving as binary java fst model... Import completed. Total States Imported: 1091667 Total Arcs Imported: 2652251 #

The newly generated binary.fst.ser model can then be loaded in java, as follows:

try {
	Fst<Double> fst = (Fst<Double>) Fst.loadModel("binary.fst.ser");
} catch (ClassNotFoundException e) {
	// TODO Auto-generated catch block
	e.printStackTrace();
} catch (IOException e) {
	// TODO Auto-generated catch block
	e.printStackTrace();
}

3. Performance: Memory Usage

Testing the conversion and loading of the cmudict.fst model generated in [3], reveal that the conversion task requires about 1.0GB and the loading of the converted model requires about 900MB of RAM.

4. Conclusion – Future Works

Having the ability to convert and load an openFST model in java, takes the “Letter to Phoneme Conversion in CMU Sphinx-4” project to the next step, which is the port of phonetisaurus decoder to java which will eventually lead to its integration with cmusphinx 4.

A major concern at this point is the high memory utilization while loading large models. Although it is expected for java applications to consume more memory compared to a similar C++ application, this could be a problem especially when running in low end machines and needs further investigation and optimization (if possible).

References

[1] J. Salatas, “Porting openFST to java: Part 2”, ICT Research Blog, May 2012.

[2] Java fst SVN (Revision 11456)

[3] J. Salatas, “Automating the creation of joint multigram language models as WFST: Part 2”, ICT Research Blog, June 2012.

[4] openFST model training SVN (Revision 11455)

Monthly Archives: July 2012

Porting openFST to java: Part 4

Porting openFST to java: Part 3