Search | Korea Science

A knowledge-based pronunciation generation system for French (지식 기반 프랑스어 발음열 생성 시스템)

Kim, Sunhee
- Phonetics and Speech Sciences / v.10 no.1 / pp.49-55 / 2018
This paper aims to describe a knowledge-based pronunciation generation system for French. It has been reported that a rule-based pronunciation generation system outperforms most of the data-driven ones for French; however, only a few related studies are available due to existing language barriers. We provide basic information about the French language from the point of view of the relationship between orthography and pronunciation, and then describe our knowledge-based pronunciation generation system, which consists of morphological analysis, Part-of-Speech (POS) tagging, grapheme-to-phoneme generation, and phone-to-phone generation. The evaluation results show that the word error rate of POS tagging, based on a sample of 1,000 sentences, is 10.70% and that of phoneme generation, using 130,883 entries, is 2.70%. This study is expected to contribute to the development and evaluation of speech synthesis or speech recognition systems for French.
https://doi.org/10.13064/KSSS.2018.10.1.049

Phonetic Contrasts of One-syllable Words and Speech Intelligibility in Adults with Hearing Impairments (청각장애 성인의 일음절 낱말대조 명료도 특성)

Kim Soo-Jin;Do Yeon-Ji
- MALSORI / no.56 / pp.1-13 / 2005
This study examined the speech intelligibility of one-syllable words with phonetic contrasts and analyzed segmental factors that can predict overall speech intelligibility in hearing-impaired adults. To identify the speech error characteristics, a Korean word list was audio-recorded by 7 hearing-impaired adults, and 35 listeners selected the word they heard out of 5 choices. Based in part on previous studies of the speech of the hearing impaired, the word list consisted of monosyllabic consonant-vowel-consonant (CVC) real word pairs. Stimulus words included 77 phonetic contrast pairs. The results showed that the percentage of errors for contrasts in final position (coda) was higher than in any other position in the syllable. The factors underlying the intelligibility deficit in phonetic contrasts of the hearing-impaired were then analyzed through stepwise regression analysis. Overall intelligibility was predicted by the error rates of the manner contrast at coda, the voicing contrast (homorganic triplets) at onset, and the high-low contrast at nucleus.

Real-time implementation of the 2.4kbps EHSX Speech Coder Using a TMS320C6701™ DSP Core (TMS320C6701™을 이용한 2.4kbps EHSX 음성 부호화기의 실시간 구현)

양용호;이인성;권오주
- The Journal of Korean Institute of Communications and Information Sciences / v.29 no.7C / pp.962-970 / 2004
This paper presents an efficient implementation of the 2.4 kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder on a TMS320C6701™ floating-point digital signal processor. The EHSX speech codec is based on harmonic and CELP (Code Excited Linear Prediction) modeling of the excitation signal, applied respectively to voiced and unvoiced frames. In this paper, we present optimization methods that reduce complexity for real-time implementation. The complexity of the CELP filtering, which accounts for the main part of the EHSX algorithm's complexity, can be reduced by converting the program from floating-point variables to fixed-point variables. We also present efficient optimization methods, including code allocation that takes the DSP architecture into account and a low-complexity algorithm for harmonic/pitch search in the encoder. Finally, we obtained a subjective quality of MOS 3.28 in a speech quality test using PESQ (Perceptual Evaluation of Speech Quality), ITU-T Recommendation P.862, and achieved the goal of real-time operation of the EHSX codec.

The comparison of the voice between the free field and the external auditory canal (음장과 외이도 내부에서의 음성 비교)

Heo, Seung-Deok;Kim, Lee-Suk;Ko, Do-Heung;Lee, Jung-Hak
- Speech Sciences / v.7 no.4 / pp.83-90 / 2000
The purpose of this study was to examine some acoustic characteristics in the ear canal. It was assumed that a sound outside the external auditory canal could differ from the sound inside it. The acoustic signals were captured by a probe microphone placed within 1 cm of the tympanic membrane, and a reference microphone was placed over the upper pinna. Three vowels, /a/, /i/, and /u/, were recorded from a normal adult male speaker. Parameters such as the formant frequencies ($F1 \sim F5$) and the peak intensity were measured using a speech analyser, PCquirer. It was found that the entrance of the external auditory canal functions as a narrowing point for speech that passes through the free field. Results show that acoustic characteristics were changed for speech discrimination rather than speech perception.

A Study on Speech Support for the Blind (시각 장애자를 위한 음성 지원에 관한 연구)

Jang, S.H.;Ham, K.K.;Choi, S.H.;Min, H.K.;Huh, W.
- Proceedings of the KOSOMBE Conference / v.1993 no.05 / pp.113-115 / 1993
In this paper, we propose a speech support system for personal computers for the blind. The system consists of a hardware part and a software part. The hardware part consists of a personal computer and a sound card. The software part consists of a sound driver system, a character table, and a sound output algorithm. The system can recognize characters input from the keyboard and character strings produced by programs.

Korean Part-of-Speech Tagging System Using Resolution Rules for Individual Ambiguous Word (어절별 중의성 해소 규칙을 이용한 혼합형 한국어 품사 태깅 시스템)

Park, Hee-Geun;Ahn, Young-Min;Seo, Young-Hoon
- Journal of KIISE:Computing Practices and Letters / v.13 no.6 / pp.427-431 / 2007
In this paper we describe a Korean part-of-speech tagging approach that uses resolution rules for individual ambiguous words together with statistical information. Our tagging approach resolves lexical ambiguities by common rules, rules for individual ambiguous words, and a statistical approach. Common rules cover idioms and phrases in common use, including phrases composed of main and auxiliary verbs. To enhance tagging accuracy, we built resolution rules for each word that has several distinct morphological analysis results. Each rule may refer to morphemes, morphological tags, and/or word senses, not only of the ambiguous word itself but also of the words around it. A statistical approach based on HMM is then applied to ambiguous words that are not resolved by the rules. Experiments show that the part-of-speech tagging approach has high accuracy and broad coverage.

The implementation of database for high quality Embedded Text-to-speech system (고품질 내장형 음성합성 시스템을 위한 음성합성 DB구현)

Kwon, Oh-Il
- Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.4 s.304 / pp.103-110 / 2005
A speech database is one of the most important parts of a Text-to-speech (TTS) system. In particular, an embedded TTS system needs a smaller database than a server TTS system, so compression and statistical reduction of the database are very important factors in an embedded TTS system. However, this compression and statistical reduction of the database always causes a loss of quality in the synthesised speech. In this paper, we propose a method of constructing a database for a high-quality embedded TTS system and verify the quality of the synthesised speech with an MOS (Mean Opinion Score) test.

A Survey of Machine Translation and Parts of Speech Tagging for Indian Languages

Khedkar, Vijayshri;Shah, Pritesh
- International Journal of Computer Science & Network Security / v.22 no.4 / pp.245-253 / 2022
Commenced in 1954 by IBM, machine translation has expanded immensely, particularly in this period. Machine translation can be broken into seven main steps, namely token generation, morphological analysis, lexeme identification, Part of Speech tagging, chunking, parsing, and word disambiguation. Morphological analysis plays a major role when translating Indian languages, as it supports accurate parts of speech taggers and word sense disambiguation. The paper presents various machine translation methods used by different researchers for Indian languages along with their performance and drawbacks. Further, the paper concentrates on parts of speech (POS) tagging for Marathi using various methods such as rule-based tagging, unigram, bigram, and more. After careful study, it is concluded that parts of speech tagging is a major step in machine translation. Also, for the Marathi language, the Hidden Markov Model gives the best results for parts of speech tagging, with an accuracy of 93%, which can be further improved depending on the dataset.
https://doi.org/10.22937/IJCSNS.2022.22.4.31

A Study on Exceptional Pronunciations For Automatic Korean Pronunciation Generator (한국어 자동 발음열 생성 시스템을 위한 예외 발음 연구)

Kim Sunhee
- MALSORI / no.48 / pp.57-67 / 2003
This paper presents a systematic description of exceptional pronunciations for automatic Korean pronunciation generation. An automatic pronunciation generator for Korean is an essential part of a Korean speech recognition system and a TTS (Text-To-Speech) system. It is composed of a set of regular rules and an exceptional pronunciation dictionary. The exceptional pronunciation dictionary is created by extracting the words that have exceptional pronunciations, based on the characteristics of such words identified through phonological research and a systematic analysis of the entries of Korean dictionaries. The method thus contributes to improving the performance of automatic pronunciation generators for Korean, as well as the performance of Korean speech recognition and TTS systems.

Prediction of Prosodic Boundaries Using Dependency Relation

Kim, Yeon-Jun;Oh, Yung-Hwan
- The Journal of the Acoustical Society of Korea / v.18 no.4E / pp.26-30 / 1999
This paper introduces a prosodic phrasing method for Korean to improve the naturalness of speech synthesis, especially in text-to-speech conversion. In prosodic phrasing, it is necessary to understand the structure of a sentence through a language processing procedure, such as part-of-speech (POS) tagging and parsing, since syntactic structure correlates better with the prosodic structure of speech than other factors do. In this paper, the prosodic phrasing procedure is treated from two perspectives: dependency parsing and prosodic phrasing using dependency relations. This is appropriate for Ural-Altaic languages, since a prosodic boundary in speech usually coincides with the governor of a dependency relation. In experiments, the proposed method achieved a 12% improvement in prosodic boundary prediction accuracy on a speech corpus consisting of 300 sentences uttered by 3 speakers.

