• Title/Summary/Keyword: Prosodic boundary detection


Automatic Detection of Korean Prosodic Boundaries Using Acoustic and Grammatical Information (음성정보와 문법정보를 이용한 한국어 운율 경계의 자동 추정)

  • Kim, Sun-Hee;Jeon, Je-Hun;Hong, Hye-Jin;Chung, Min-Hwa
    • MALSORI / no.66 / pp.117-130 / 2008
  • This paper presents a method for automatically detecting Korean prosodic boundaries using both acoustic and grammatical information, with the aim of improving the performance of speech information processing systems. While most previous work relies solely on grammatical information, our method uses both grammatical information from a Maximum-Entropy-based grammar model with 10 grammatical features and acoustic information from a GMM-based acoustic model with 14 acoustic features. Korean prosodic structure has two intonationally defined prosodic units, the intonation phrase (IP) and the accentual phrase (AP). Experimental results show that the detection rate of AP boundaries is 82.6%, which is higher than the labeler agreement rate in hand transcription, and that the detection rate of IP boundaries is 88.7%, which is slightly lower than the labeler agreement rate.

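The abstract above does not say how the grammar-model and acoustic-model scores are combined. As a minimal sketch, assuming a log-linear interpolation (a common choice, not confirmed as the authors' method), the per-label posterior from a grammar model and the per-label log-likelihood from an acoustic model could be merged as follows. The label set, the function names `combine` and `detect_boundary`, and the `weight` parameter are all hypothetical.

```python
import math

# Hypothetical label set: no boundary, accentual phrase, intonation phrase.
LABELS = ("none", "AP", "IP")


def combine(grammar_post, acoustic_loglik, weight=0.5):
    """Log-linearly interpolate two per-label score dictionaries.

    grammar_post: label -> posterior probability (e.g. from a MaxEnt model).
    acoustic_loglik: label -> log-likelihood (e.g. from per-label GMMs).
    weight: assumed interpolation weight on the grammar score, in [0, 1].
    """
    scores = {
        label: weight * math.log(grammar_post[label])
        + (1.0 - weight) * acoustic_loglik[label]
        for label in LABELS
    }
    # Normalize with log-sum-exp so the result is a proper distribution.
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {label: math.exp(s - log_z) for label, s in scores.items()}


def detect_boundary(grammar_post, acoustic_loglik, weight=0.5):
    """Return the label with the highest combined posterior."""
    posterior = combine(grammar_post, acoustic_loglik, weight)
    return max(posterior, key=posterior.get)
```

With `weight=1.0` the decision depends only on the grammar model, and with `weight=0.0` only on the acoustic model; in practice such a weight would be tuned on held-out data.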

The Role of Prosodic Boundary Cues in Word Segmentation in Korean

  • Kim, Sa-Hyang
    • Speech Sciences / v.13 no.1 / pp.29-41 / 2006
  • This study investigates the degree to which various prosodic cues at the boundaries of prosodic phrases in Korean contribute to word segmentation. Since most phonological words in Korean are produced as one Accentual Phrase (AP), it was hypothesized that the detection of acoustic cues at AP boundaries would facilitate word segmentation. The prosodic characteristics of Korean APs include initial strengthening at the beginning of the phrase and pitch rise and final lengthening at the end. A perception experiment utilizing an artificial language learning paradigm revealed that cues conforming to the aforementioned prosodic characteristics of Korean facilitated listeners' word segmentation. Results also indicated that duration and amplitude cues were more helpful in segmentation than pitch. Nevertheless, results did show that a pitch cue that did not conform to the Korean AP interfered with segmentation.


Automatic Synthesis Method Using Prosody-Rich Database (대용량 운율 음성데이타를 이용한 자동합성방식)

  • 김상훈
    • Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.87-92 / 1998
  • In general, synthesis unit databases have been constructed by recording isolated words. In that case, each word boundary carries a typical prosodic pattern such as falling intonation or pre-boundary lengthening. To obtain natural synthetic speech from this kind of database, the original speech must be artificially modified. However, this artificial processing instead results in unnatural, unintelligible synthetic speech because of excessive prosodic modification of the speech signal. To overcome these problems, we collected thousands of sentences for the synthesis database. To build phone-level synthesis units, we trained a speech recognizer on the recorded speech and then segmented phone boundaries automatically. In addition, we used a laryngograph for epoch detection. From the automatically generated synthesis database, we chose the best phone and concatenated it directly without any prosody processing. To select the best phone among multiple candidates, we used prosodic information such as the break strength of word boundaries, phonetic context, cepstrum, pitch, energy, and phone duration. A pilot test yielded some positive results.

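The candidate selection described in the abstract above can be illustrated with a small sketch. It assumes a weighted-sum target cost over a subset of the listed features (phonetic context, break strength, pitch, energy, duration); the abstract does not give the actual cost function, and the cepstral term is omitted here for brevity. The `PhoneUnit` type, the `WEIGHTS` values, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhoneUnit:
    phone: str
    left_context: str
    right_context: str
    break_strength: int   # break index of the nearest word boundary
    pitch_hz: float
    energy_db: float
    duration_ms: float


# Assumed, untuned weights for illustration only.
WEIGHTS = {"context": 1.0, "break": 0.5, "pitch": 2.0, "energy": 0.1, "duration": 1.0}


def target_cost(target, cand):
    """Weighted distance between the desired unit and a database candidate."""
    context_mismatches = (target.left_context != cand.left_context) + (
        target.right_context != cand.right_context
    )
    cost = WEIGHTS["context"] * context_mismatches
    cost += WEIGHTS["break"] * abs(target.break_strength - cand.break_strength)
    # Pitch and duration are compared relative to the target value.
    cost += WEIGHTS["pitch"] * abs(target.pitch_hz - cand.pitch_hz) / target.pitch_hz
    cost += WEIGHTS["energy"] * abs(target.energy_db - cand.energy_db)
    cost += WEIGHTS["duration"] * abs(target.duration_ms - cand.duration_ms) / target.duration_ms
    return cost


def select_best(target, candidates):
    """Pick the lowest-cost candidate that has the same phone identity."""
    same_phone = [c for c in candidates if c.phone == target.phone]
    if not same_phone:
        raise ValueError(f"no candidate for phone {target.phone!r}")
    return min(same_phone, key=lambda c: target_cost(target, c))
```

Because the selected unit is concatenated without prosody modification, the quality of the result in such a scheme depends on the database containing a candidate whose natural prosody is already close to the target.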

The Role of Post-lexical Intonational Patterns in Korean Word Segmentation

  • Kim, Sa-Hyang
    • Speech Sciences / v.14 no.1 / pp.37-62 / 2007
  • The current study examines the role of post-lexical tonal patterns of a prosodic phrase in word segmentation. In a word-spotting experiment, native Korean listeners were asked to spot a disyllabic or trisyllabic word in a twelve-syllable speech stream composed of three Accentual Phrases (AP). Words occurred with various post-lexical intonation patterns. The results showed that listeners spotted more words in phrase-initial than in phrase-medial position, suggesting that the AP-final H tone from the preceding AP helped listeners to segment the phrase-initial word in the target AP. Results also showed that listeners' error rates were significantly lower when words occurred with an initial rising tonal pattern, which is the most frequent intonational pattern imposed upon multisyllabic words in Korean, than with non-rising patterns. This result was observed in both AP-initial and AP-medial positions, regardless of the frequency and legality of overall AP tonal patterns. Tonal cues other than the initial rising tone did not positively influence the error rate. These results not only indicate that a rising tone in AP-initial and AP-final position is a reliable cue for word boundary detection for Korean listeners, but further suggest that phrasal intonation contours serve as a possible word boundary cue in languages without lexical prominence.
