Search | Korea Science

Enhancement of speech with time-variant and colored noise

Mine, Katsutoshi;Kitazaki, Masato;Wakabayashi, Katsuyoshi;Morimoto, Yuji
- Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 1990.10b / pp.1098-1102 / 1990
We consider a method for enhancing speech signals degraded by additive random noise that is time-variant and/or colored. For such noise, it is effective to exploit the characteristics of both speech and noise. The objective of speech enhancement is to improve the overall quality and articulation of speech degraded by time-variant and/or colored random noise. In the proposed method, a distribution model of the speech spectrum is supplied as information to the noise reduction system. The proposed system improves the SNR by about 10 dB when the input SNR is 0 dB.

Dysarthric speaker identification with different degrees of dysarthria severity using deep belief networks

Farhadipour, Aref;Veisi, Hadi;Asgari, Mohammad;Keyvanrad, Mohammad Ali
- ETRI Journal / v.40 no.5 / pp.643-652 / 2018
Dysarthria is a degenerative disorder of the central nervous system that affects the control of articulation and pitch; therefore, it affects the uniqueness of the sound produced by the speaker. Hence, dysarthric speaker recognition is a challenging task. In this paper, a feature-extraction method based on deep belief networks is presented for the task of identifying a speaker suffering from dysarthria. The effectiveness of the proposed method is demonstrated and compared with that of well-known Mel-frequency cepstral coefficient features. For classification, a multi-layer perceptron neural network with two structures is proposed. Our evaluations using the Universal Access speech database produced promising results and outperformed other baseline methods. In addition, speaker identification under both text-dependent and text-independent conditions is explored. The highest accuracy achieved using the proposed system is 97.3%.
https://doi.org/10.4218/etrij.2017-0260

Formant Locus Overlapping Method to Enhance Naturalness of Synthetic Speech (합성음의 자연도 향상을 위한 포먼트 궤적 중첩 방법)

안승권;성굉모
- Journal of the Korean Institute of Telematics and Electronics B / v.28B no.10 / pp.755-760 / 1991
In this paper, we propose a new formant locus overlapping method that effectively enhances the naturalness of synthetic speech produced by a demisyllable-based Korean text-to-speech system. First, Korean demisyllables are divided into several segments that have linear formant transition characteristics. Then a database composed of the start point and length of each formant segment is built. When synthesizing speech with this demisyllable database, we concatenate the formant loci using the proposed overlapping method, which closely simulates the human articulation mechanism. We implemented a Korean text-to-speech system using this method and showed that the formant loci of the synthetic speech are similar to those of natural speech. Finally, we illustrate that the spectrograms produced by the proposed method are more similar to natural speech than those of the conventional method.

The maximum phonation time and temporal aspects in Korean stops in children with spastic cerebral palsy (경직형 뇌성마비 아동의 최대 발성지속시간과 파열음 산출 시 조음시간 특성 비교)

Jeong, Jin-Ok;Kim, Deog-Yong;Sim, Hyun-Sub;Park, Eun-Sook
- Phonetics and Speech Sciences / v.3 no.1 / pp.135-143 / 2011
This study evaluated the respiratory capacity of children with spastic cerebral palsy, grouped by GMFCS (Gross Motor Function Classification System) level, and identified the acoustic characteristics of three types of Korean stops (stop consonants), whose production requires temporal coordination of the larynx and supra-larynx, in these children. Thirty-two children with dysarthria due to spastic cerebral palsy were divided into two subgroups: 14 children classified at GMFCS levels I~III were placed in Group I and 18 classified at GMFCS levels IV~V were placed in Group II; 18 children with normal speech were selected as the control group. Prolonged phonation of /a/ (sustained vowel /a/) and nine Korean VCV syllables were used. The acoustic characteristics examined were maximum phonation time (MPT), closure duration, and aspiration duration. The results were as follows: 1) The MPTs of the cerebral palsy (CP) groups, both Group I and Group II, were significantly shorter than those of the normal group. 2) The closure durations of the two CP groups were longer than those of the normal group for all 9 target syllables. 3) The aspiration durations of the two CP groups were longer than those of the normal group. 4) The closure durations of the normal group and CP Group I differed significantly among tense, aspirated, and lax stops. However, CP Group II differed from the normal group. 5) The aspiration durations of the normal group and CP Group I differed significantly among aspirated, tense, and lax stops. However, CP Group II differed from the normal group. 6) The place of articulation had less influence than the manner of articulation on closure and aspiration durations.

A study on adaptive noise cancellation for enhancement of digital speech articulation (디지털음성명료도 향상을 위한 적응형 잡음제거 기법에 관한 연구)

Kim, Soo-Yong;Jee, Suk-Kun
- Journal of the Korea Institute of Information and Communication Engineering / v.11 no.5 / pp.961-968 / 2007
Today, radio communication devices can be used anywhere and at any time, sometimes in acoustically noisy environments. Acoustic noise causes many problems in a communication system. In a noisy environment, the speaker cannot send clear information to the receiver, because the received signal contains both speech and noise. A digital filter is useful for removing the noise and recovering the desired signal. One such method is an adaptive digital filter based on the adaptive noise canceller, which automatically adjusts its filter parameters. This work addresses algorithms for improving articulation under real acoustic noise by means of two adaptive filtering methods. One is the adaptive noise canceller with two input channels, and the other is the spectral subtraction filter with one input channel. The experimental results show that the adaptive noise canceller is useful for reducing non-stationary noise, while the spectral amplitude filter is effective for stationary noise.
https://doi.org/10.6109/jkiice.2007.11.5.961

Gaps-In-Noise Test Performance in Children with Speech Sound Disorder and Cognitive Difficulty

Jung, Yu Kyung;Lee, Jae Hee
- Journal of Audiology & Otology / v.24 no.3 / pp.133-139 / 2020
Background and Objectives: The Gaps-In-Noise (GIN) test is a clinically effective measure of the integrity of the central auditory nervous system. The GIN procedure can be applied to a pediatric population above 7 years of age. The present study used the GIN test to compare auditory temporal resolution among typically developing children, children with speech sound disorder (SSD), and children with cognitive difficulty (CD). Subjects and Methods: Children aged 8 to 11 years (total n=30) participated in this study. There were 10 children in each of the following three groups: typically developing children, children with SSD, and children with CD. The Urimal Test of Articulation and Phonology was conducted as a clinical assessment of the children's articulation and phonology. The Korean version of the Wechsler Intelligence Scale for Children-III (K-WISC-III) was administered as a screening test for general cognitive function. According to the procedure of Musiek, the pre-recorded stimuli of the GIN test were presented at 50 dB SL. The results were scored by the approximated threshold and the overall percent correct score (%). Results: All the typically developing children had normal auditory temporal resolution based on the clinical cutoff criteria of the GIN test. The children with SSD or CD had significantly reduced gap detection performance compared to age-matched typically developing children. The children's intelligence score measured by the K-WISC-III test explained 37% of the variance in the percent-correct score. Conclusions: Children with SSD or CD exhibited poorer ability to resolve rapid temporal acoustic cues over time compared to the age-matched typically developing children. The ability to detect a brief temporal gap embedded in a stimulus may be related to general cognitive ability or phonological processing.
https://doi.org/10.7874/jao.2019.00381

AN ACOUSTIC ANALYSIS OF PRONUNCIATION IN CHILDREN WITH ANGLE'S CLASS II DIV. 1 MALOCCLUSION (Angle씨 II급 1류 부정교합아동의 발음에 관한 음향학적 연구)

Park, Yun-Chung;Lee, Sang-Hoon;Shon, Dong-Su
- Journal of the Korean Academy of Pediatric Dentistry / v.24 no.1 / pp.95-111 / 1997
The human speech organ consists of the respiration system (lung, larynx), phonation system (vocal cord), articulation system (esophagus, pharynx, uvula, teeth, gingiva, palate, tongue, lip) and resonating system (oral cavity, nasal cavity, paranasal sinus). Because teeth are components of the articulation system, it has been reported that persons with abnormally positioned teeth generally have abnormal occlusion and pronunciation. In this study, /ㅅ(s)/, the most commonly mispronounced consonant in children with malocclusion, was used together with the seven single vowels: /사(sa), 서($s\delta$), 소(so), 수(su), 스($s\omega$), 시(si), 세(se)/ and /ㅏ(a), ㅓ($\delta$), ㅗ(o), ㅜ(u), ㅡ($\omega$), ㅣ(i), ㅔ(e)/ were recorded and analyzed with a computer speech analysis program, and their formants were measured and compared to investigate differences in pronunciation between children with Angle's Class I occlusion and those with Angle's Class II div. 1 malocclusion. The results were as follows: 1. In the Angle's Class II div. 1 group, there were no significant differences in F1 for any recorded sound compared with the Angle's Class I group (p>0.05). 2. In the consonants, there were significant differences in F2 of /스($s\omega$)/ and in the F2/F1 ratio of /사(sa), 서($s\delta$), 시(si)/ between the two groups (p<0.05). 3. In the vowels, there was a significant difference in the F2/F1 ratio of /ㅓ($\delta$)/ (p<0.05) and no significant differences in F2/F1 ratio between the two groups (p>0.05). 4. In the consonants, there were significant differences in F2 and the F2/F1 ratio when the succeeding vowel was high or low, and in the F2/F1 ratio when it was front, according to tongue position (p<0.05). 5. In the vowels, there were no significant differences in formants according to tongue position (p>0.05).

Statistical Analysis of Korean Phonological Variations Using a Grapheme-to-phoneme System (발음열 자동 생성기를 이용한 한국어 음운 변화 현상의 통계적 분석)

이경님;정민화
- The Journal of the Acoustical Society of Korea / v.21 no.7 / pp.656-664 / 2002
We present a statistical analysis of Korean phonological variations using a Grapheme-to-Phoneme (GTP) system. The GTP system used for the experiments generates pronunciation variants by applying rules that model obligatory and optional phonemic changes and allophonic changes. These rules are derived from morphophonological analysis and the government's standard pronunciation rules. The GTP system is optimized for continuous speech recognition by generating phonetic transcriptions for training and constructing a pronunciation dictionary for recognition. In this paper, we describe Korean phonological variations by analyzing the statistics of phonemic change rule applications for the 60,000 sentences in the Samsung PBS Speech DB. Our results show that the most frequent obligatory phonemic variations are, in order, liaison, tensification, aspirationalization, and nasalization of obstruents, and that the most frequent optional phonemic variations are, in order, initial consonant h-deletion, insertion of a final consonant with the same place of articulation as the next consonant, and deletion of a final consonant with the same place of articulation as the next consonant. These statistics can be used to improve the performance of speech recognition systems.

Building a Conceptual Model Using Ontology for the Efficient Retrieval of Cases from Fuzzy-CBR of Collision Avoidance Support System

Park, Gyei-Kark;Benedictos, John Leslie RM;Shin, Sung-Chul;Im, Nam-Kyun;Yi, Mi-Ra
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.04a / pp.245-250 / 2007
We have proposed Fuzzy-CBR to find a solution from past knowledge retrieved from the database and adapted to a new situation. However, an ontology is needed to identify the concepts, relations, and instances involved in a situation in order to improve and facilitate the efficient retrieval of similar cases from the CBR database. This paper proposes a way to apply ontology for identifying the concepts involved in a new case, used as inputs, for a ship collision avoidance support system and in solving for similarity through document articulation and abstraction levels. These ontologies will be used to build a conceptual model of a maneuvering situation.
