Monthly Archives: September 2016

Korean data

These are Korean speech and associated data available from LDC. There are duplications. I included the finite state morphology and morphologically annotated text.

LDC2006S42 Korean Broadcast News Speech
/projects/ldc/ldc-standard-license/2006/LDC2006S42
LDC2006T14 Korean Broadcast News Transcripts
/projects/ldc/ldc-standard-license/2006/LDC2006T14
LDC2006S36 West Point Korean Speech
/projects/ldc/ldc-standard-license/2006/LDC2006S36
LDC2004L01 Klex: Finite-State Lexical Transducer for Korean
/projects/ldc/ldc-standard-license/2004/LDC2004L01
LDC2004T03 Morphologically Annotated Korean Text
/projects/ldc/ldc-standard-license/2004/LDC2004T03
LDC2003S07 Korean Telephone Conversations Complete Set
/projects/ldc/ldc-standard-license/2003/LDC2003S07
LDC2003L02 Korean Telephone Conversations Lexicon
/projects/ldc/ldc-standard-license/2003/LDC2003L02
LDC2003S03 Korean Telephone Conversations Speech
/projects/ldc/ldc-standard-license/2003/LDC2003S03
LDC2003T08 Korean Telephone Conversations Transcripts
/projects/ldc/ldc-standard-license/2003/LDC2003T08
LDC96S54 CALLFRIEND Korean

Spanish data

These are Spanish speech and associated data available from LDC. There appear to be duplications, so we should work back from the later publications.

LDC2014T23 Fisher and CALLHOME Spanish–English Speech Translation (not on server)
LDC2010T04 Fisher Spanish – Transcripts
/projects/ldc/ldc-standard-license/2010/LDC2010T04
LDC2010S01 Fisher Spanish Speech
/projects/ldc/ldc-standard-license/2010/LDC2010S01
LDC2006S37 West Point Heroico Spanish Speech (not on server)
1997 HUB5 Spanish Transcripts (not on server)

LDC2002S25 1997 HUB5 Spanish Evaluation

LDC2001T61 CALLHOME Spanish Dialogue Act Annotation

LDC98S74 1997 Spanish Broadcast News Speech (HUB4-NE)

1997 Spanish Broadcast News Transcripts (HUB4-NE)

LDC98T29 HUB5 Spanish Telephone Speech Corpus

LDC98T27 HUB5 Spanish Transcripts

LDC96L16 CALLHOME Spanish Lexicon
/projects/ldc/ldc-standard-license/1996/LDC96L16

LDC96S35 CALLHOME Spanish Speech
/projects/ldc/ldc-standard-license/1996/LDC96S35

LDC96T17 CALLHOME Spanish Transcripts
/projects/ldc/ldc-standard-license/1996/LDC96T17

LDC96S57 CALLFRIEND Spanish-Caribbean Dialect

LDC96S58 CALLFRIEND Spanish-Non-Caribbean Dialect

LDC95S28 LATINO-40 Spanish Read News

This entry was posted in language data on September 11, 2016 by Mats Rooth.

Language groups for this semester

Spanish: Simone, Eliza, Bridget, Jimmy, Naomi

German: Alyssa, Jacob, Andrea, Shibo

Korean: Chloe, Hankyul, Shiven, Shohini

If anyone is missing from this list, please join one of the groups!

This entry was posted in Class info on September 7, 2016 by Shohini.

fsttopsort

fsttopsort topologically sorts its input if acyclic, modifying it. Otherwise, the input is unchanged. When sorted, all transitions are from lower to higher state IDs. (documentation source)

Cyclic Example:

fstprint --isymbols=words.txt --osymbols=words.txt L1.fst

0 1 <eps> <eps>
1 2 a a
1 3 f f
2 0 <eps> <eps>
3 2 d d

When we apply fsttopsort, we get a warning saying “Input FST is cyclic”.

fsttopsort L1.fst L1sorted.fst
WARNING: fsttopsort: Input FST is cyclic
fstprint --isymbols=words.txt --osymbols=words.txt L1sorted.fst

0 1 <eps> <eps>
1 2 a a
1 3 f f
2 0 <eps> <eps>
3 2 d d

As we expected, the output looks unchanged.

Acyclic Example:

fstprint --isymbols=words.txt --osymbols=words.txt L2.fst

1
0 2 <eps> <eps>
0 3 a a
2 5 f f
3 4 c c
4 6 b b
4 1 b b
5 3 d d
6 1 a a

When we run the operation,

fsttopsort L2.fst L2sorted.fst
fstprint --isymbols=words.txt --osymbols=words.txt L2sorted.fst

6
0 1 <eps> <eps>
0 3 a a
1 2 f f
2 3 d d
3 4 c c
4 5 b b
4 6 b b
5 6 a a

we can see that the state IDs are now topologically sorted: every transition goes from a lower to a higher state ID.
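
To see where the new numbering comes from, here is a minimal sketch that uses the standard tsort utility (not OpenFst) on the arcs of L2.fst as printed above. It only looks at source and destination states and assumes the arc list is typed in by hand.

```shell
# Arcs of L2.fst as printed above: source state, destination state, labels.
# The final-state line ("1") has no destination and is left out.
arcs='0 2 <eps> <eps>
0 3 a a
2 5 f f
3 4 c c
4 6 b b
4 1 b b
5 3 d d
6 1 a a'

# tsort reads "before after" pairs and prints one valid ordering, one ID per line.
order=$(printf '%s\n' "$arcs" | awk 'NF >= 3 { print $1, $2 }' | tsort | tr '\n' ' ')
order=${order% }
echo "old state IDs in topological order: $order"
```

In this example there is only one valid ordering, so the position of each old state ID in the list lines up with the renumbering in L2sorted.fst (old state 2 becomes 1, old state 1 becomes 6).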

(I produced all the images by running ‘fstdraw [fst filename] | dot -Tpng >[png filename]’)

This entry was posted in Uncategorized on September 7, 2016 by Heejung Moon.

fstequal

The fstequal command exits with a return code indicating whether or not the two compiled fsts passed in are equal. The return code will be 0 if they are equal, and nonzero if they are not.

The easiest way to see the return code is to type `echo $?` immediately after executing fstequal. This command shows the return code of the last executed command, and thus will show you a 0 if the two fsts are equal. For example, let’s consider a perfectly straight fst:

jms852@kay:~$ fstprint straight.fst
0 0 0 0
0 1 1 0
1 2 0 0
2 3 0 0
3 3 0 0
3 3 1 0
jms852@kay:~$ fstequal straight.fst straight.fst
jms852@kay:~$ echo $?
0

Then let’s add another fst that forks off instead of continuing straight, and compare that one:

jms852@kay:~$ fstprint fork.fst
0 0 0 0
0 1 1 0
1 2 0 0
2 3 0 0
2 4 1 0
3 Infinity
4 Infinity
jms852@kay:~$ fstequal straight.fst fork.fst
jms852@kay:~$ echo $?
2

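Two fsts can compare as equal even if they were not compiled from the exact same text file. For example, if I create straight2.fst from a text file in which the states are numbered differently:

jms852@kay:~$ cat straight.txt
0 0 0 0
0 2 1 0
2 1 0 0
1 3 0 0
3 3 0 0
3 3 1 0

You can see that it is still straight, but the nodes have different labels.

However, fstequal still says that they are equal. This is most likely because fstcompile, unless given --keep_state_numbering, renumbers states in order of first appearance, so the two compiled files end up with the same states and arcs in the same order, which is what fstequal checks: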

jms852@kay:~$ fstequal straight.fst straight2.fst
jms852@kay:~$ echo $?
0
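
In a script it is usually more convenient to branch on the return code directly than to print it with echo. The following is a minimal sketch; report_equal is a hypothetical helper name, and the two calls at the end use stand-in commands so that the sketch runs even without OpenFst installed.

```shell
# report_equal CMD ARGS...: run a comparison command and describe its exit status.
report_equal() {
  if "$@"; then
    echo "equal"
  else
    echo "not equal (exit status $?)"
  fi
}

# Stand-ins for fstequal: "true" exits with 0, the sh call exits with 2.
# With OpenFst installed, use: report_equal fstequal straight.fst fork.fst
report_equal true
report_equal sh -c 'exit 2'
```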

This entry was posted in Uncategorized on September 7, 2016 by James.

fstprune

According to the documentation:

This operation deletes states and arcs in the input FST that do not belong to a successful path whose weight is no more (w.r.t. the natural semiring order) than the threshold t ⊗-times the weight of the shortest path in the input FST.

Weights need to be commutative and have the path property. Both destructive and constructive implementations are available.

Example:

The fst:

0 0 0 0 0.699999988
0 1 0 0 0.299899995
0 2 0 0 9.99999975e-05
1
2 Infinity

Running

fstprune --weight=3 unpruned.fst pruned.fst

generates the new fst:

0 0 0 0 0.699999988
0 1 0 0 0.299899995
1

State 2 has been removed, as well as the transition to that state: its final weight is Infinity, so no successful path passes through it.
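
The threshold arithmetic can be sketched with awk. This assumes the tropical semiring (the default arc type for fstcompile), where ⊗ is ordinary addition and smaller weights are better. The path names and the hand-computed path weights are illustrative: 0.2999 for the arc into state 1, 0.9999 for one pass around the self-loop followed by that arc, and inf for any path ending in state 2.

```shell
# prune_report BEST T: read "name weight" lines and mark each path keep or prune.
prune_report() {
  awk -v best="$1" -v t="$2" '
    BEGIN { limit = best + t }
    {
      verdict = ($2 == "inf" || $2 + 0 > limit) ? "prune" : "keep"
      print $1, verdict
    }'
}

report=$(printf '%s\n' \
  'via_state_1 0.2999' \
  'loop_then_state_1 0.9999' \
  'via_state_2 inf' | prune_report 0.2999 3)
echo "$report"
```

With a threshold of 3 the limit is 3.2999, so both paths through state 1 survive and only the path through state 2 is dropped.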

The demo can be found at /projects/speech/sys/kaldi-master/egs/rm/s5-avt26/demo/

This entry was posted in Uncategorized on September 7, 2016 by Alyssa Victoria Trigg.

fstrandgen

Based on the documentation:

This operation randomly generates a set of successful paths in the input FST. The operation relies on an ArcSelector object for randomly selecting an outgoing transition at a given state in the input FST. The default arc selector, UniformArcSelector, randomly selects a transition using the uniform distribution. LogProbArcSelector randomly selects a transition w.r.t. the weights treated as negative log probabilities after normalizing for the total weight leaving the state. In all cases, finality is treated as a transition to a super-final state.
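
To make the LogProbArcSelector description concrete, here is a small sketch of the normalization step for a single state. The three weights are hypothetical, not taken from G.fst; each is treated as a negative log probability and divided by the total leaving the state.

```shell
# Hypothetical weights on three arcs leaving one state (negative log probabilities).
weights='1 2 3'

probs=$(printf '%s\n' $weights | awk '
  { w[NR] = $1; total += exp(-$1) }     # unnormalized probability exp(-w)
  END { for (i = 1; i <= NR; i++) printf "%.3f\n", exp(-w[i]) / total }')
echo "$probs"
```

The arc with the smallest weight gets the largest probability. With the uniform selector each of the three arcs would instead be chosen with probability 1/3. If I remember the command-line tool correctly, the selector is chosen with the --select flag (for example --select=log_prob), but check fstrandgen --help on your installation.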

_____________________________________

Example:

fstrandgen G.fst rand1.fst

fstprint --acceptor --isymbols=words.txt rand1.fst
0 1 WILL
1 2 WABASH
2 3 SEVENTEENTH
3 4 OCTOBER
4 5
5

fstdraw --acceptor --isymbols=words.txt rand1.fst | dot -Tx11

My demo is under /projects/speech/sys/kaldi-master/egs/rm/s5-sb2295/demo/fstrandgen and you can run it with source demo.sh

This entry was posted in Kaldi on September 7, 2016 by Shohini.

fstintersect

fstintersect computes the intersection of two FSAs. Intersection means the same as in math: the intersection of A and B accepts exactly the strings that are accepted by both FSA A and FSA B.

For example:

fstprint a.fst
0 1 1 1
1 2 2 2
1 3 3 3
2 4 0 0
3 4 0 0
4

fstprint b.fst
0 1 1 1
1 2 2 2
2

fstintersect a.fst b.fst | fstprint
0 1 1 1
1 2 2 2
2 3 0 0
3
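
The same result can be checked at the level of strings. This is only a sketch of set intersection for two small finite languages, not of what fstintersect does internally; the label sequences accepted by a.fst and b.fst are written out by hand, ignoring the epsilon (label 0) arcs.

```shell
# Label sequences accepted by a.fst and b.fst in the example above.
tmp=$(mktemp -d)
printf '%s\n' '1 2' '1 3' > "$tmp/a.strings"
printf '%s\n' '1 2'       > "$tmp/b.strings"

# Keep the lines of a.strings that also appear, as whole lines, in b.strings.
common=$(grep -Fxf "$tmp/b.strings" "$tmp/a.strings")
echo "$common"
rm -r "$tmp"
```

Only the sequence 1 2 is in both sets, which is the one string accepted by the intersected FSA.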

This entry was posted in Uncategorized on September 7, 2016 by Eliza Weaver.

fstreverse

The operation fstreverse produces an fst that will contain all of the reverse strings of the input fst. If fst A transduces string x to y with weight a, then the reverse of A transduces the reverse of x to the reverse of y with weight a.Reverse(). Typically, a = a.Reverse().

========================
Initial automaton, L.fst
fstprint --osymbols=words.txt --isymbols=words.txt L.fst
0 1 a a
1 2 b b
1 3 c c
2 4 <eps> <eps>
3 4 <eps> <eps>
4

========================
Run fstreverse on L.fst to produce L2.fst
fstreverse L.fst L2.fst
fstprint --osymbols=words.txt --isymbols=words.txt L2.fst
0 5 <eps> <eps>
1
2 1 a a
3 2 b b
4 2 c c
5 3 <eps> <eps>
5 4 <eps> <eps>

========================
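
As a quick sanity check at the string level, L.fst accepts "a b" and "a c" (the <eps> arcs add nothing), so L2.fst should accept those strings with the symbols in reverse order. A minimal awk sketch, with the two strings typed in by hand:

```shell
# Reverse the order of the symbols on each line.
reversed=$(printf '%s\n' 'a b' 'a c' | awk '{
  out = $NF
  for (i = NF - 1; i >= 1; i--) out = out " " $i
  print out
}')
echo "$reversed"
```

The output, "b a" and "c a", matches the paths through L2.fst from state 0 to the final state 1.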

This entry was posted in Uncategorized on September 7, 2016 by Shiven.

Kaldi – show-alignments

Once you have your alignments you might need to retrieve data from them.

Kaldi’s show-alignments generates an alignment file that is “readable for humans”. Here’s how to invoke it:

show-alignments "$basePath/data/lang/phones.txt" final.mdl ark:ali.1 > ali.1.txt
show-alignments "$basePath/data/lang/phones.txt" final.mdl ark:ali.2 > ali.2.txt
show-alignments "$basePath/data/lang/phones.txt" final.mdl ark:ali.3 > ali.3.txt
show-alignments "$basePath/data/lang/phones.txt" final.mdl ark:ali.4 > ali.4.txt

The file phones.txt is in data/lang/;

The file final.mdl is in exp/mono_ali/;

The files ali.1, ali.2, ali.3, ali.4 are in exp/mono_ali/. They have to be unzipped (gunzip) before being used.
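
The four invocations can also be written as a loop that reads the gzipped files directly through a Kaldi rspecifier pipe, without unzipping them first. This is a sketch under assumptions: it takes the directory layout described above relative to $basePath, show_all_alignments is a hypothetical helper name, and when show-alignments is not on the PATH it only prints the commands it would run.

```shell
# Assumed layout, following the post: $basePath/data/lang and $basePath/exp/mono_ali.
base=${basePath:-.}
ali_dir="$base/exp/mono_ali"

show_all_alignments() {
  for i in 1 2 3 4; do
    # Kaldi rspecifier that pipes the gzipped archive through gunzip.
    rspec="ark:gunzip -c $ali_dir/ali.$i.gz|"
    if command -v show-alignments >/dev/null 2>&1; then
      show-alignments "$base/data/lang/phones.txt" "$ali_dir/final.mdl" "$rspec" > "ali.$i.txt"
    else
      echo "would run: show-alignments $base/data/lang/phones.txt $ali_dir/final.mdl '$rspec' > ali.$i.txt"
    fi
  done
}
show_all_alignments
```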

Before running show-alignments, the alignment files look like this:

f01br16b22k1-s003 3 12 18 17 1826 1825 1825 1825 1828 1830 1829 1829 1829 1829 1256 1258 1257 1257 1260 1259 1259 1259 1259 1259 1259 362 361 361 361 364 363 363 363 366 2750 2749 2749 2749 2749 2749 2749 2749 2749 2752 2751 2751 2754 2753 2753 2753 1928 1927 1927 1930 1929 1929 1932 1931 1931 1931 1931 980 982 984 1826 1825 1825 1825 1828 1827 1827 1827 1830 1829 2678 2677 2677 2677 2680 2679 2682 2681 2486 2485 2485 2485 2485 2485 2485 2488 2487 2487 2490 2489 2489 2489 2504 2503 2503 2503 2503 2506 2508 2738 2737 2737 2737 2737 2740 2739 2739 2739 2742 974 973 973 973 973 976 978 1814 1816 1818 2504 2503 2503 2506 2505 2508 2507 2507 2507 4 14 15 15 15 15 15 15 12 10 10 10 10 10 10 10 10 10 10 10 10 10 10 18 17 17 17 17 17 17 17 17 17

The above is utterance number 3 (s003) for female speaker 1 (f01br16b22k1) in the West Point Brazilian Portuguese LDC Corpus.

After you run show-alignments, you will see:

f01br16b22k1-s003 [ 3 12 18 17 ] [ 1826 1825 1825 1825 1828 1830 1829 1829 1829 1829 ] [ 1256 1258 1257 1257 1260 1259 1259 1259 1259 1259 1259 ] [ 362 361 361 361 364 363 363 363 366 ] [ 2750 2749 2749 2749 2749 2749 2749 2749 2749 2752 2751 2751 2754 2753 2753 2753 ] [ 1928 1927 1927 1930 1929 1929 1932 1931 1931 1931 1931 ] [ 980 982 984 ] [ 1826 1825 1825 1825 1828 1827 1827 1827 1830 1829 ] [ 2678 2677 2677 2677 2680 2679 2682 2681 ] [ 2486 2485 2485 2485 2485 2485 2485 2488 2487 2487 2490 2489 2489 2489 ] [ 2504 2503 2503 2503 2503 2506 2508 ] [ 2738 2737 2737 2737 2737 2740 2739 2739 2739 2742 ] [ 974 973 973 973 973 976 978 ] [ 1814 1816 1818 ] [ 2504 2503 2503 2506 2505 2508 2507 2507 2507 ] [ 4 14 15 15 15 15 15 15 12 10 10 10 10 10 10 10 10 10 10 10 10 10 10 18 17 17 17 17 17 17 17 17 17 ]

f01br16b22k1-s003 SIL m_B ew1_E a_B v_I o1_E eh1_S m_B ujn1_I t_I u_E v_B eh1_I lj_I u_E SIL

The first line shows the transition IDs (one per frame), grouped in brackets by phone. The second line shows the phones, each aligned with the left-most transition ID of its group. Transition IDs are integers that encode the PDF (probability density function), the phone identity, and whether the transition is a self-loop or a forward transition.

The text part above corresponds to phones and the position they occupy in a word: _B is the beginning of a word, _E is the end, _I is word-internal phone, and _S is a singleton (word with one sound only).
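
Since each transition ID covers one frame, counting the IDs inside each bracket gives the duration of each phone in frames. Here is a minimal awk sketch on a shortened, made-up two-line excerpt in the same format (utt1 and its contents are illustrative, not from the corpus):

```shell
# Shortened, hypothetical excerpt: transition-ID line, then phone line.
sample='utt1 [ 3 12 18 17 ] [ 1826 1825 1825 ] [ 4 14 15 ]
utt1 SIL m_B SIL'

durations=$(printf '%s\n' "$sample" | awk '
  $2 == "[" {                       # transition-ID line: count IDs per bracket
    n = 0
    for (i = 2; i <= NF; i++) {
      if ($i == "[") count = 0
      else if ($i == "]") frames[++n] = count
      else count++
    }
    next
  }
  NF > 1 {                          # phone line: pair each phone with its count
    for (i = 2; i <= NF; i++) print $i, frames[i - 1]
  }')
echo "$durations"
```

If the features were extracted with a 10 ms frame shift (a common setting, but check your own configuration), multiplying a frame count by 0.01 gives the duration in seconds.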

The file above can be tweaked using your preferred tool (sed, awk, or similar) to look like this:

m ew1 a v o1 eh1 m ujn1 t u v eh1 lj u
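
One possible way to do that tweak with awk is sketched below. It assumes that silence is always written SIL and that the word-position suffixes are exactly _B, _E, _I and _S, as in the example above.

```shell
line='f01br16b22k1-s003 SIL m_B ew1_E a_B v_I o1_E eh1_S m_B ujn1_I t_I u_E v_B eh1_I lj_I u_E SIL'

phones=$(printf '%s\n' "$line" | awk '{
  out = ""
  for (i = 2; i <= NF; i++) {          # field 1 is the utterance ID
    p = $i
    if (p == "SIL") continue           # drop silence
    sub(/_[BEIS]$/, "", p)             # strip the word-position suffix
    out = (out == "" ? p : out " " p)
  }
  print out
}')
echo "$phones"
```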

These are the aligned phones of utterance s003 for female speaker 1 above (literal translation: “My grandfather is very old”; don’t blame the speaker, they were reading prompts for the corpus).

This entry was posted in Kaldi on September 7, 2016 by Simone Harmath De Lemos.
