Integrated Search | Korea Science

A Phonetic Study of Russian Soft Plosives

변군혁
- 대한음성학회지:말소리 / No. 61 / pp.15-29 / 2007
The present study investigates acoustic cues of Russian soft plosive consonants. In previous studies, Russian soft consonants were distinguished from hard consonants by F1 and F2 of the following vowels. The results showed (1) that F0 in the vowels following soft plosives was lower than in those following hard plosives, and (2) that the VOT of soft plosives was longer than that of hard plosives. Hence, the present study suggests that, in addition to F1 and F2, VOT and F0 serve as acoustic cues that differentiate soft plosives from hard plosives in Russian.

A Study on the Implementation of Artificial Neural Net Models with Feature Set Input for Recognition of Korean Plosive Consonants

김기석;김인범;황희융
- 대한전기학회:학술대회논문집 / 대한전기학회 1990 Summer Conference Proceedings / pp.535-538 / 1990
The main problem in speech recognition is the enormous variability in acoustic signals due to complex but predictable contextual effects. In plosive consonants especially, it is very difficult to find invariant cues because of these contextual effects, yet humans use them as helpful information in plosive consonant recognition. In this paper we experimented with three artificial neural net models for the recognition of plosive consonants. Model I used a "Multi-layer Perceptron", Model II used a variation of the "Self-organizing Feature Map Model", and Model III used an "Interactive and Competitive Model" to examine contextual effects. The recognition experiment was performed on nine Korean plosive consonants. We used VCV speech chains for the experiment on contextual effects. Each speech chain consists of the Korean plosive consonants /g, d, b, K, T, P, k, t, p/ (/ㄱ, ㄷ, ㅂ, ㄲ, ㄸ, ㅃ, ㅋ, ㅌ, ㅍ/) and eight Korean monophthongs. The inputs to the neural net models were several temporal cues (duration of the silence, transition, and VOT) together with the extent of the VC formant transitions, the presence of voicing energy during closure, burst intensity, presence of aspiration, amount of low-frequency energy present at voicing onset, and CV formant transition extent, all taken from the acoustic signals. Model I showed a recognition rate of about 55-67%, Model II about 60%, and Model III about 67%.

A Study on the Automatic Recognition of Korean Plosives: A Study on the Automatic Discrimination of the Korean Alveolar Stops

최윤석;김기석;황희융
- 대한전기학회:학술대회논문집 / 대한전기학회 1987 General Meeting and 40th Anniversary Conference, Society Headquarters / pp.330-333 / 1987
This paper is a study on the automatic discrimination of the Korean alveolar stops. In Korean, an automatic speech recognition system must discriminate aspirated/tense plosives, because Korean speakers distinguish aspirated/tense plosive allophones from tense and lax plosives. In order to detect acoustic cues for automatic recognition of [ㄲ, ㄸ, ㅃ], we experimented with the discrimination of [ㄷ, ㄸ, ㅌ]. As acoustic parameters we used temporal cues such as VOT and silence duration, and energy cues such as the ratio of high-frequency energy to low-frequency energy. The experiments used VCV speech data in which V is one of the 8 simple vowels and C is one of the 3 alveolar stops. Experiments on the 192 speech samples yielded recognition rates of about 82%-95%.
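The figure of 192 samples is consistent with a full crossing of vowels and consonants, 8 × 3 × 8 = 192. The sketch below only illustrates that count. It assumes each sample is one V1-C-V2 combination, which the abstract does not state explicitly, and it uses placeholder vowel labels because the eight simple vowels are not listed.

```python
from itertools import product

# Placeholder labels (assumption): the abstract does not list the eight simple vowels.
VOWELS = [f"V{i}" for i in range(1, 9)]
# The three alveolar stops named in the abstract.
STOPS = ["ㄷ", "ㄸ", "ㅌ"]


def vcv_tokens(vowels, consonants):
    """Return every (V1, C, V2) combination as a tuple."""
    return list(product(vowels, consonants, vowels))


tokens = vcv_tokens(VOWELS, STOPS)
print(len(tokens))  # 8 * 3 * 8 = 192
```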

Acoustic Properties Associated with the Plosive Production of Adults with Cerebral Palsy

김정연;황민아;박창일;지민제
- 음성과학 / Vol. 8, No. 3 / pp.209-224 / 2001
The purpose of this study was to identify the acoustic properties of nine word-initial Korean plosives in the speech of adults with cerebral palsy. Normal adults and two groups of adults with cerebral palsy (an athetoid group and a spastic group) participated in this study. The speech material consisted of monosyllabic CVC real-word pairs. Among the various acoustic properties of plosives, the aspiration duration was measured. Adults with cerebral palsy exhibited different patterns of aspiration duration for triplets of Korean plosives compared to normal adults. In addition, the plosive production of the spastic group was distinguished from that of the athetoid group. Such acoustic characteristics of the plosives of adults with cerebral palsy may negatively affect the intelligibility of their speech.

Acoustic Comparisons of Vowel and Plosive Productions between the Normal and the Hearing-Impaired Children

오영자;지민제;김영태
- 음성과학 / Vol. 7, No. 2 / pp.51-70 / 2000
Twenty normal and 20 severe-to-profound hearing-impaired subjects participated in the present study. The two groups were matched by chronological age. Each subject recorded three vowels, /i/, /a/, and /u/, and nine $VC_{plosive}V$ (hereafter, VCV) disyllables, /epe/, /ep'e/, /$ep^{h}e$/, /ete/, /et'e/, /$et^{h}e$/, /eke/, /ek'e/, and /$ek^{h}e$/, five times each. The formant frequencies $F_1$, $F_2$, and $F_3$ were measured for the three vowels, and six measures were made for the nine disyllables. The six measures were (1) the total duration of the disyllable, (2) the duration of the first vowel, (3) the duration of the closed period, (4) the ratio of the first vowel over the first vowel plus the closure period of the consonant, (5) the duration of the aspiration, and (6) the duration of the second vowel. Results show that the three formants and each of the measures were significantly different between the two groups of subjects.
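Measure (4) can be read as the first-vowel duration divided by the sum of the first-vowel duration and the closure duration. The sketch below is one way to compute it. The function name, the millisecond unit, and the example durations are illustrative assumptions and do not come from the study.

```python
def vowel_ratio(v1_ms: float, closure_ms: float) -> float:
    """Measure (4): first-vowel duration over first vowel plus closure."""
    total = v1_ms + closure_ms
    if total <= 0:
        raise ValueError("durations must sum to a positive value")
    return v1_ms / total


# Hypothetical durations, not data from the study.
print(vowel_ratio(120.0, 80.0))  # 0.6
```

The ratio is bounded between 0 and 1, so it can be compared across tokens spoken at different rates.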

A Study on a Searching, Extraction and Approximation-Synthesis of Transition Segment in Continuous Speech

이시우
- 한국정보처리학회논문지 / Vol. 7, No. 4 / pp.1299-1304 / 2000
In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist in a frame. I therefore propose a method for searching, extracting, and approximately synthesizing the TSIUVC (Transition Segment Including UnVoiced Consonant) so that voiced and unvoiced consonants do not coexist in a frame. The method is based on the zero-crossing rate and a pitch detector using the FIR-STREAK digital filter. As a result, the extraction rates of TSIUVC are 84.8% (plosive), 94.9% (fricative), and 92.3% (affricate) for female voices, and 88% (plosive), 94.9% (fricative), and 92.3% (affricate) for male voices. I also obtain high-quality approximation-synthesis waveforms within the TSIUVC by using frequency information below 0.547 kHz and above 2.813 kHz. The method can be applied to low-bit-rate speech coding, speech analysis, and speech synthesis.
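The zero-crossing rate mentioned above is a standard frame-level measure that tends to be high for unvoiced, noise-like segments and low for voiced ones. The sketch below shows only this generic measure on synthetic sine frames. It does not reproduce the paper's TSIUVC search or the FIR-STREAK pitch detector, and the sampling rate, frame length, and test frequencies are assumptions chosen for illustration.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Synthetic 20 ms frames at an assumed 8 kHz sampling rate (not data from the paper).
SR = 8000
N = 160
low = [math.sin(2 * math.pi * 100 * n / SR + 0.1) for n in range(N)]
high = [math.sin(2 * math.pi * 3000 * n / SR + 0.1) for n in range(N)]

# The high-frequency frame crosses zero far more often than the low-frequency one.
print(zero_crossing_rate(low), zero_crossing_rate(high))
```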

Korean Plain Plosives Produced by Chinese Female Speakers: Sentence vs. Paragraph

강반;김지은;이충우
- 말소리와 음성과학 / Vol. 7, No. 2 / pp.111-117 / 2015
The purpose of this study is to investigate how Chinese learners of Korean produce Korean plain plosives differently in a reading passage and in isolated sentences. There are several studies on Korean plosives produced by Chinese speakers, but studies comparing production in a reading passage with production in isolated sentences are rare. To this end, ten Chinese speakers' VOT values for Korean plain plosives were measured using Speech Analyzer. The results show that there is no significant difference between the plain plosive production of a reading passage and that of isolated sentences. In further studies, the measurement of pitch together with VOT is needed.
https://doi.org/10.13064/KSSS.2015.7.2.111

A Study on the Speech Recognition of Korean Phonemes Using Recurrent Neural Network Models

김기석;황희영
- 대한전기학회논문지 / Vol. 40, No. 8 / pp.782-791 / 1991
In fields of pattern recognition such as speech recognition, several new techniques using Artificial Neural Network Models have been proposed and implemented. In particular, the Multilayer Perceptron Model has been shown to be effective in static speech pattern recognition. Speech, however, has dynamic or temporal characteristics, and the most important point in implementing speech recognition systems for continuous speech with Artificial Neural Network Models is the learning of these dynamic characteristics and of the distributed cues and contextual effects that result from them. The Recurrent Multilayer Perceptron Model is known to be able to learn sequences of patterns. This paper presents the results of applying the Recurrent Model, which can potentially learn the temporal characteristics of speech, to phoneme recognition. The test data consist of 144 Vowel + Consonant + Vowel speech chains made up of 4 Korean monophthongs and 9 Korean plosive consonants. The input parameters of the Artificial Neural Network model are the FFT coefficients, residual error, and zero-crossing rates. The baseline model showed a recognition rate of 91% for vowels and 71% for plosive consonants for one male speaker. In various other experiments we obtained better recognition rates than with the existing multilayer perceptron model, which showed the recurrent model to be better suited to speech recognition. The possibility of using Recurrent Models for speech recognition was also explored by changing the configuration of this baseline model.

A Study on Multi-Pulse Speech Coding Method by using Selected Information in a Frequency Domain

이시우
- 인터넷정보학회논문지 / Vol. 7, No. 4 / pp.57-66 / 2006
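In this study, we propose a new multi-pulse speech coding method (FBD-MPC) that searches for and extracts transition segments including unvoiced consonants in continuous speech and approximately synthesizes them in the frequency band. In the experiments, the TSIUVC extraction rates were 84.8% (plosive), 94.9% (fricative), and 92.3% (affricate) for female voices, and 88% (plosive), 94.9% (fricative), and 92.3% (affricate) for male voices. In addition, the TSIUVC speech waveforms could be approximately synthesized with good quality using frequency information below 0.547 kHz and above 2.813 kHz. An evaluation of MPC using voiced/unvoiced selection information against FBD-MPC using voiced/silence/TSIUVC showed that the speech quality of FBD-MPC was improved over that of MPC.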

A Study for Acoustic Cues of Pyoung-An Do Dialect Using LPC

송철규;이명호;김영배
- 대한의용생체공학회:의공학회지 / Vol. 13, No. 3 / pp.195-200 / 1992
This paper deals with the acoustic cues of the Pyoung-An Do dialect using linear prediction. It also describes a statistical comparison between standard-tone speech data and the Pyoung-An Do dialect. The analysis focused mainly on the distribution of formants and pitch periods according to accent variation. For the purpose of objective comparison, the experiments were performed by extracting formants from the LPC spectrum and pitch periods from average magnitude difference function waveforms. To sum up the results: in disyllabic words (VCV pattern), prepositioned vowels have a longer phonation time than postpositioned vowels, and the intrinsic phonation time is longer in the low vowels than in the high ones. The affricate consonants show the mixed characteristics of the plosive and fricative consonants. The remarkable acoustic cues are the low-frequency noise-like waves just before the first formants in the plosive consonants and the high-frequency noise-like waves in the fricative consonants. Phonation time is not affected by the kind of prepositioned or postpositioned vowel.
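The average magnitude difference function (AMDF) used for pitch-period extraction compares a frame with a shifted copy of itself. The lag with the smallest mean absolute difference is taken as the pitch period. The sketch below is a generic AMDF on a synthetic signal, not the authors' procedure. The sampling rate, lag search range, and test signal are assumptions, since the abstract gives no such parameters.

```python
import math


def amdf(frame, lag):
    """Mean absolute difference between the frame and itself shifted by `lag`."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def pitch_period(frame, min_lag, max_lag):
    """Lag in samples with the smallest AMDF value in [min_lag, max_lag]."""
    return min(range(min_lag, max_lag + 1), key=lambda k: amdf(frame, k))


# Synthetic voiced-like frame (assumption): 125 Hz fundamental plus its second
# harmonic, sampled at 8 kHz, so the true period is 8000 / 125 = 64 samples.
SR = 8000
F0 = 125
frame = [
    math.sin(2 * math.pi * F0 * n / SR)
    + 0.5 * math.sin(2 * math.pi * 2 * F0 * n / SR)
    for n in range(400)
]

period = pitch_period(frame, 20, 100)
print(period, SR / period)  # period in samples, estimated F0 in Hz
```

The search range is kept below twice the true period here so that the minimum at a multiple of the period cannot be picked. Real speech would need a range matched to the expected pitch of the speaker.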

Search results: 34 items; processing time: 0.019 s
