Before I was able to get this command to run (Step 5 from Mats’ run.sh), we needed some files that we weren’t aware of. In total, you will need the following files before this command will work. All paths are relative to run.sh.
- data/local/dict/lexicon.txt
- data/local/dict/nonsilence_phones.txt
- data/local/dict/optional_silence.txt
- data/local/dict/silence_phones.txt
- path.sh
prepare_lang.sh will complain if any one of these is missing. The complaint for path.sh is a little less clear, since not having this file seems to result in other errors.
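Since a missing path.sh shows up as unrelated errors, it can help to check for all five files before calling prepare_lang.sh. Below is a minimal sketch of such a check. The function name `missing_files` is my own and not part of Kaldi, and it assumes you pass it the directory containing run.sh.

```python
from pathlib import Path

# The files listed above, relative to the directory containing run.sh.
REQUIRED_FILES = [
    "data/local/dict/lexicon.txt",
    "data/local/dict/nonsilence_phones.txt",
    "data/local/dict/optional_silence.txt",
    "data/local/dict/silence_phones.txt",
    "path.sh",
]


def missing_files(run_dir="."):
    """Return the required files that do not exist under run_dir."""
    base = Path(run_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


# Report anything missing from the current directory.
for name in missing_files():
    print("missing:", name)
```

An empty result only means the files exist. It says nothing about whether their contents are valid.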
lexicon.txt contains one lexical entry per line: a word, a space, and then the phones in that word, separated by spaces.
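To make the lexicon format concrete, here is a small sketch that splits one entry into its word and phones. The sample entries and phone symbols are invented for illustration and do not come from any real lexicon. `parse_lexicon_line` is a hypothetical helper, not a Kaldi function.

```python
def parse_lexicon_line(line):
    """Split one lexicon line into (word, [phones])."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected a word and at least one phone: {line!r}")
    return fields[0], fields[1:]


# Hypothetical entries, for illustration only.
sample = ["hello HH AH L OW", "low L OW"]
for entry in sample:
    word, phones = parse_lexicon_line(entry)
    print(word, phones)
```

Splitting on any whitespace is a bit more lenient than the single-space format described above, which seemed like the safer choice for a sanity check.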
nonsilence_phones.txt contains one phone symbol per line.
optional_silence.txt contains the symbol for an optional silence. This is just sil, but the file still needs to exist. Make sure that there is a newline at the end.
silence_phones.txt can be identical to optional_silence.txt.
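The three phone files are simple enough to generate. The sketch below writes all of them, each ending in a newline. It assumes that the non-silence phone set is simply every phone that appears in the lexicon. That is my assumption, not something prepare_lang.sh requires, and `write_phone_files` is a made-up name.

```python
from pathlib import Path


def write_phone_files(dict_dir, lexicon_lines, silence="sil"):
    """Write the three phone files into dict_dir and return the non-silence phones."""
    dict_dir = Path(dict_dir)
    dict_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: every phone used in the lexicon is a non-silence phone.
    phones = set()
    for line in lexicon_lines:
        fields = line.split()
        phones.update(fields[1:])  # skip the word itself
    phones.discard(silence)
    nonsilence = sorted(phones)

    # One symbol per line, each line (including the last) newline-terminated.
    (dict_dir / "nonsilence_phones.txt").write_text(
        "".join(p + "\n" for p in nonsilence)
    )
    (dict_dir / "optional_silence.txt").write_text(silence + "\n")
    # silence_phones.txt is written as an identical copy of optional_silence.txt.
    (dict_dir / "silence_phones.txt").write_text(silence + "\n")
    return nonsilence
```

For example, `write_phone_files("data/local/dict", open("data/local/dict/lexicon.txt"))` would fill in the remaining three files next to an existing lexicon.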
path.sh can be copied from RM, though you may need to edit the KALDI_ROOT variable, since this is a relative path.
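To see why a relative KALDI_ROOT may need editing, the sketch below resolves the same relative value from two different recipe locations. Both directories and the value `../../..` are hypothetical examples, so check what your copied path.sh actually contains.

```python
import posixpath  # Kaldi recipes run on Unix-like systems, so use POSIX path rules


def resolve_kaldi_root(recipe_dir, kaldi_root):
    """Resolve a possibly relative KALDI_ROOT against the recipe directory."""
    return posixpath.normpath(posixpath.join(recipe_dir, kaldi_root))


# Hypothetical locations, for illustration only.
inside_kaldi = resolve_kaldi_root("/opt/kaldi/egs/rm/s5", "../../..")
elsewhere = resolve_kaldi_root("/home/me/projects/myrecipe", "../../..")
print(inside_kaldi)
print(elsewhere)
```

The first call lands on the Kaldi checkout, while the second ends up in an unrelated directory. That second case is the one where KALDI_ROOT has to be edited.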
The German versions of all of these can be seen in kaldi-master/egs/vm1.