Hrvatska znanstvena bibliografija (Croatian Scientific Bibliography)

Bibliographic record number: 357927

Conference proceedings

Authors: Martinčić-Ipšić, Sanda; Grzybek, Peter; Mačutek, Jan; Matešić, Mihaela
Title: Distributions of Automatically Segmented Phonemes in Croatian Speech
Source: 6. Znanstveni skup s međunarodnim sudjelovanjem "Istraživanja govora" / G. Varošanec-Škarić ; D. Horga (eds.). - Zagreb :
Conference: 6. Znanstveni skup s međunarodnim sudjelovanjem "Istraživanja govora"
Place and date: Zagreb, Croatia, 6-8.12.2007
Keywords: automatic speech segmentation; distributions
Abstract:
In this paper we describe an automatic segmentation procedure for Croatian speech, based on a monophone speech recognition system and on word-level transcriptions of speech signals. Since the speech files are transcribed at the word level, the utterances have to be segmented at the phone level for the training procedures. For the word segmentation and recognition task, we have developed a phonetic dictionary in which a set of phonetic symbols is used to transcribe the words from the Croatian speech database; the selected phonemes are derived from the SAMPA symbols proposed for Croatian [2]. The phonetic dictionary comprises all words, including all inflected word forms, that occur in the Croatian speech corpora, together with their phonetic transcriptions [8]. Croatian orthographic-to-phonetic rules are used for the automatic conversion of graphemes into phonemes. The initial phone-level segmentation of speech is performed by automatic alignment of speech signals and word transcriptions, based on monophone hidden Markov models (HMMs) [3]. The automatic segmentation is performed by forced alignment of the spoken utterance with the corresponding transcription, using the monophone speech recognizer. The forced alignment assumes that all phones in the utterance are initially segmented into equal parts. The monophone models were trained by iterations of the Baum-Welch algorithm [4]. The Viterbi algorithm was used to find the most likely sequence of HMM states [4]. The output of the Viterbi algorithm is the automatically determined time intervals of the spoken phones in the speech signals. The automatically segmented phones are used as input for the speech recognition and speech synthesis training procedures.
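The forced alignment described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one HMM state per phone (real monophone models usually have several), ignores transition probabilities, and takes the per-frame log-likelihoods as given. The function names `uniform_segmentation` and `forced_align` are hypothetical.

```python
import math


def uniform_segmentation(num_frames, num_phones):
    """Flat start: split the frames into (nearly) equal-length phone segments."""
    bounds = [round(i * num_frames / num_phones) for i in range(num_phones + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(num_phones)]


def forced_align(log_likes):
    """Viterbi forced alignment over a known left-to-right phone sequence.

    log_likes[t][j] is the log-likelihood of frame t under phone j of the
    transcription (assumption: one state per phone, no transition scores).
    Returns one (start_frame, end_frame) pair per phone, end exclusive.
    """
    num_frames = len(log_likes)
    num_phones = len(log_likes[0])
    if num_frames < num_phones:
        raise ValueError("need at least one frame per phone")

    neg_inf = -math.inf
    score = [[neg_inf] * num_phones for _ in range(num_frames)]
    back = [[0] * num_phones for _ in range(num_frames)]
    score[0][0] = log_likes[0][0]  # the path must start in the first phone

    for t in range(1, num_frames):
        for j in range(num_phones):
            stay = score[t - 1][j]
            move = score[t - 1][j - 1] if j > 0 else neg_inf
            if move > stay:
                score[t][j] = move + log_likes[t][j]
                back[t][j] = j - 1
            else:
                score[t][j] = stay + log_likes[t][j]
                back[t][j] = j

    # Backtrace from the last phone at the last frame.
    path = [num_phones - 1]
    for t in range(num_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()

    # Collapse the frame-level path into phone segments.
    segments = []
    start = 0
    for t in range(1, num_frames + 1):
        if t == num_frames or path[t] != path[t - 1]:
            segments.append((start, t))
            start = t
    return segments
```

In an iterative setup, `uniform_segmentation` would supply the initial boundaries used to estimate the models, and `forced_align` would then refine those boundaries once the models produce frame log-likelihoods.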
Since we use HMMs for acoustic modeling of Croatian speech in both speech recognition and speech synthesis, the same automatic segmentation procedure was performed, and the same automatically segmented phones are used to train the acoustic models of both systems [7]. Automatic segmentation results are presented for 13 hours of speech from 25 professional speakers. The indirect measures used to assess automatic segmentation performance are phoneme recognition correctness and word recognition correctness and accuracy. Additionally, Croatian phoneme durations were calculated from the automatically segmented phones. The calculated durations of 674746 phones were used to test a theoretical model of phoneme duration, based on Altmann's [9] findings for vowel duration. A first attempt is made to apply this model to standard Croatian data and to extend it to consonants.
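As a rough illustration of how phoneme durations can be derived from automatically segmented phones, the sketch below averages segment lengths per phoneme. The 10 ms frame shift, the `(phone, start_frame, end_frame)` tuple layout, and the function name `mean_durations` are assumptions made for the example; the abstract does not specify them, and the theoretical duration model itself is not reproduced here.

```python
from collections import defaultdict


def mean_durations(segments, frame_shift_s=0.010):
    """Mean duration in seconds for each phoneme label.

    segments: iterable of (phone, start_frame, end_frame), end exclusive.
    frame_shift_s: assumed time step between frames (hypothetical 10 ms).
    """
    total_frames = defaultdict(int)
    counts = defaultdict(int)
    for phone, start, end in segments:
        total_frames[phone] += end - start
        counts[phone] += 1
    return {
        phone: total_frames[phone] * frame_shift_s / counts[phone]
        for phone in total_frames
    }
```

For example, two segments of the phoneme `a` lasting 10 and 20 frames give a mean duration of about 0.15 s under the assumed frame shift.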
Type of participation: Lecture
Type of presentation in proceedings: Abstract
Type of review: International peer review
Project / topic: 009-0361935-0852
Original language: eng
Category: Scientific
Scientific fields:
Information and communication sciences
Entered into CROSBI by: smarti@ffri.hr, 3 June 2008 at 11:22


