Search | Korea Science

A Study on Korean Speech Analysis using Walsh Transform (Walsh변환을 이용한 한국어 숫자음 음성분석에 관한 연구)

김계현;김준현
- The Transactions of the Korean Institute of Electrical Engineers / v.37 no.4 / pp.251-256 / 1988
This work describes a speech analysis of the Korean digits ('1'-'10') spoken by several speakers, using the Fast Walsh Transform (FWHT) method. Because the FWHT involves only addition and subtraction operations, it is faster and needs less memory than FFT (Fast Fourier Transform) or LPC (Linear Predictive Coding) analysis. We investigated whether the FWHT method can find speaker-independent features (cues that represent the same word regardless of the speaker). In the experiment, 70% of the utterances of the same word (Korean digit '2') spoken by several speakers showed similar patterns.
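As a rough illustration of why the transform needs only additions and subtractions, the following is a minimal Python sketch of a fast Walsh-Hadamard butterfly. It is not the authors' implementation: the function name `fwht`, the unnormalized scaling, and the natural (Hadamard) output ordering are assumptions made for this example.

```python
def fwht(samples):
    """Unnormalized fast Walsh-Hadamard transform, natural (Hadamard) order.

    Assumption for this sketch: the input length is a power of two.
    Only additions and subtractions are performed on the data.
    """
    data = list(samples)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Combine blocks of size h into blocks of size 2h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = data[i], data[i + h]
                data[i], data[i + h] = a + b, a - b
        h *= 2
    return data


frame = [1, 0, 1, 0, 0, 1, 1, 0]
spectrum = fwht(frame)  # [4, 2, 0, -2, 0, 2, 0, 2]
```

Applying the function twice returns the input scaled by its length, so the same routine can serve as the inverse after dividing by n.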

A STUDY OF THE KOREAN SINGLE VOWEL SOUND DISTORTION IN RELATION TO THE PALATAL PLATE THICKNESS -LINEAR PREDICTION CORRELATION AND LOG AREA RATIO ANALYSES BY COMPUTER- (구개상의 두께가 한국어 단모음 발음에 미치는 영향에 관한 연구 -컴퓨터를 이용한 선형 예측 분석과 LOG AREA RATIO 분석-)

Lee, Joung-Man;Choi, Dae-Gyun;Park, Nam-Soo;Choi, Boo-Byung
- The Journal of Korean Academy of Prosthodontics / v.26 no.1 / pp.31-49 / 1988
This study investigated the sound distortion that follows alteration of the palatal plate thickness. Three male subjects who were born in Seoul and spoke the Seoul dialect were recruited from the student population of K university. First, their vowel sounds /아(a)/, /어(e)/, /오(o)/, /우(u)/, /으(ɨ)/, /이(i)/, and /에(e)/ were recorded without a plate inserted, and then the same sounds were recorded with palatal plates of different thicknesses. The palatal plates were constructed to cover the alveolar and palatal surfaces of the maxilla, with an approximate thickness of 1.0 mm (B-type), 2.5 mm (C-type), or 2.5 mm over the alveolar ridge and 1.0 mm elsewhere (D-type). A series of analyses was run on a computer (16-bit IBM PC/AT) to analyze the sound distortions, using the LPC and Log Area Ratio methods. The findings led to the following conclusions: 1. Sound distortions were relatively minute in each condition and informant; however, /이(i)/ was the most distorted vowel in all conditions. 2. By and large, sound distortion was large with the C- and D-type plates. However, the distortion rates showed no correlation across the 3 informants or across the tested vowels. 3. The LPC and Log Area Ratio distortion rates were similar. 4. The sound distortion with a plate inserted could be quantified numerically with the LPC and Log Area Ratio methods.
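For readers unfamiliar with the Log Area Ratio representation, the sketch below shows one common way to derive it from LPC reflection coefficients. The abstract does not state which definition the authors used; the function name `log_area_ratios` and the sign convention ln((1 + k) / (1 - k)) are assumptions, and some texts use the reciprocal ratio.

```python
import math


def log_area_ratios(reflection_coeffs):
    """Map reflection coefficients k_i (|k_i| < 1) to log area ratios.

    Assumed convention for this sketch: g_i = ln((1 + k_i) / (1 - k_i)).
    Other references define the ratio the other way round, which only
    flips the sign of each g_i.
    """
    ratios = []
    for k in reflection_coeffs:
        if not -1.0 < k < 1.0:
            raise ValueError("reflection coefficients must lie in (-1, 1)")
        ratios.append(math.log((1.0 + k) / (1.0 - k)))
    return ratios


# Hypothetical reflection coefficients for one analysis frame.
frame_lar = log_area_ratios([0.5, -0.2, 0.0])
```

A distortion measure between two recordings could then be, for example, a distance between their per-frame log area ratio vectors; that choice is likewise an assumption and is not taken from the abstract.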

Time-Synchronization Method for Dubbing Signal Using SOLA (SOLA를 이용한 더빙 신호의 시간축 동기화)

이기승;지철근;차일환;윤대희
- Journal of Broadcast Engineering / v.1 no.2 / pp.85-95 / 1996
This paper proposes a time-synchronization technique for dubbed signals based on the SOLA (Synchronized Over-Lap and Add) method, which has been widely used to modify the time scale of speech signals. In broadcast audio recording environments, a high level of background noise makes a dubbing process necessary. Since the time difference between the original and the dubbed signal is about 200 milliseconds, processing is required to synchronize the dubbed signal with the corresponding image. The proposed method finds the starting point of the dubbing signal using the short-time energy of the two signals. Thereafter, LPC cepstrum analysis and a DTW (Dynamic Time Warping) process are applied to synchronize the phoneme positions of the two signals. After the matched points are determined by the minimum mean square error between the original and dubbed LPC cepstra, the SOLA method is applied to the dubbed signal to maintain the consistency of the corresponding phase. The effectiveness of the proposed method is verified by comparing the waveforms and spectrograms of the original and the time-synchronized dubbing signals.
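The DTW alignment step can be sketched as follows. This is a generic textbook formulation, not the paper's exact procedure: the function name `dtw_path`, the squared-Euclidean frame distance, and the three-way step pattern are assumptions, and the feature vectors stand in for per-frame LPC cepstra.

```python
def dtw_path(ref, dub):
    """Align two sequences of feature vectors with dynamic time warping.

    Returns (total_cost, path), where path is a list of (ref_index,
    dub_index) pairs from the first frames to the last frames.
    """
    def dist(a, b):
        # Squared Euclidean distance between two frames (assumed metric).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n, m = len(ref), len(dub)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(ref[i - 1], dub[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy one-dimensional "cepstra": the dubbed signal holds one frame longer.
original = [[0.0], [1.0], [2.0]]
dubbed = [[0.0], [1.0], [1.0], [2.0]]
total, alignment = dtw_path(original, dubbed)
```

In a synchronization setting, the resulting index pairs indicate where the dubbed signal would need to be stretched or compressed before a time-scale modification method such as SOLA is applied.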

Acoustic parameters for induced emotion categorizing and dimensional approach (자연스러운 정서 반응의 범주 및 차원 분류에 적합한 음성 파라미터)

Park, Ji-Eun;Park, Jeong-Sik;Sohn, Jin-Hun
- Science of Emotion and Sensibility / v.16 no.1 / pp.117-124 / 2013
This study examined how precisely MFCC, LPC, energy, and pitch-related parameters of speech data, which have mainly been used in voice recognition systems, can predict vocal emotion categories as well as the dimensions of vocal emotion. 110 college students participated in the experiment. To elicit more realistic emotional responses, we used well-defined emotion-inducing stimuli. Because a dimensional approach is more useful for realistic emotion classification, the study analyzed the relationship between the MFCC, LPC, energy, and pitch parameters of the speech data and four emotional dimensions (valence, arousal, intensity, and potency). Stepwise multiple regression analysis identified the best vocal cue parameters for predicting each dimension. The emotion categorization accuracy obtained by LDA was 62.7%, and the four dimension regression models were statistically significant (p<.001). These results suggest that the parameters could also be applied to spontaneous vocal emotion recognition.

Analysis of Unaspirated sound for Korean (한국어의 경음에 대한 분석)

Lim Soo-Ho;Kim Joo-Gon;Kim Bum-Guk;Jung Ho-Youl;Chung Hyun-Yeol
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.41-44 / 2004
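This paper examines the phonological and acoustic characteristics of the tense consonants (경음), which appear only in Korean, carries out speech recognition experiments based on those characteristics, and analyzes the results. For the recognition experiments, the input speech was labeled with 48 phoneme-like units (PLU; Phoneme Likely Unit), and phoneme recognition and word recognition experiments were then performed for each phoneme group while increasing the LPC (Linear Predictive Coding) order. In the phoneme recognition experiment, the tense consonant group showed the lowest recognition rate, indicating that more analysis of tense consonants is needed. The best results were obtained at an LPC order of 23, where the recognition rates for tense consonants and for all phonemes were 34.11% and 46.1%, respectively. In the word recognition experiment, the recognition rates were likewise best at LPC orders 23 and 25, at 81.68% and 81.87%. These results show that Korean tense consonants are closely related to the recognition performance of the overall system.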

A Study on Combining Bimodal Sensors for Robust Speech Recognition (강인한 음성인식을 위한 이중모드 센서의 결합방식에 관한 연구)

이철우;계영철;고인선
- The Journal of the Acoustical Society of Korea / v.20 no.6 / pp.51-56 / 2001
Recent research has focused on jointly using lip motion and speech for reliable speech recognition in noisy environments. To this end, this paper proposes a method of combining a visual speech recognizer and a conventional speech recognizer, with each output properly weighted. In particular, we propose a method of autonomously determining the weights according to the amount of noise in the speech. The correlations between adjacent speech samples and the residual errors of the LPC analysis are used for this determination. Simulation results show that the speech recognizer combined in this way achieves a recognition rate of 83% even in severely noisy environments.

A Simple Pitch Tracking Algorithm based on the Energy Operator (에너지 연산자에 기초한 간단한 피치 추적 방법)

Tai-Ho Lee
- Journal of the Institute of Convergence Signal Processing / v.5 no.1 / pp.1-5 / 2004
A new method for estimating the pitch-frequency contour of voiced speech is presented. The method is based on the double application of Kaiser's energy operator [1], which can extract the amplitude and frequency of a sinusoidal waveform. According to the modulation model, a vowel can be represented by a combination of damped sinusoids representing formants, modulated by pitch pulses. Therefore, the amplitude envelope of each component gives a pitch-like waveform, and the pitch can be obtained by averaging the frequencies of this waveform. The first part is the same as Gopalan's approach [9], but by replacing the LPC-based spectral analysis with a second application of the energy operator, the algorithm becomes very simple and can be processed on-line. Although the estimation is rather coarse, the suggested algorithm can be useful for getting a general sketch of the pitch contour on-line.
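The discrete-time form of Kaiser's (Teager-Kaiser) energy operator is commonly written as ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The Python sketch below shows only this single operator, not the paper's two-stage pitch tracker; the function names `teager_energy` and `sinusoid` are assumptions made for the example.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input because the
    operator needs one neighbour on each side.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def sinusoid(amplitude, omega, length, phase=0.0):
    """Helper for the demo: amplitude * cos(omega * n + phase)."""
    return [amplitude * math.cos(omega * n + phase) for n in range(length)]


# For a pure sinusoid the operator output is constant and equal to
# amplitude^2 * sin(omega)^2, which couples amplitude and frequency.
tone = sinusoid(amplitude=2.0, omega=0.3, length=50, phase=0.1)
energy = teager_energy(tone)
```

Because the output depends on both amplitude and frequency, energy-separation schemes apply the operator to the signal and to its difference signal to recover the two quantities separately.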

A COMPUTER ANALYSIS ON THE KOREAN CONSONANT SOUND DISTORTION IN RELATION TO THE PALATAL PLATE THICKNESS -Dentoalveolar and hard palatal consonant- (구개상의 두께에 따른 한국어 자음의 발음 변화에 관한 컴퓨터 분석 - 치조음, 경구개음-)

Woo, Yi-Hyung;Choi, Dae-Kyun;Choi, Boo-Byung;Park, Nam-Soo
- The Journal of Korean Academy of Prosthodontics / v.25 no.1 / pp.71-94 / 1987
This study was carried out to investigate the sound distortion that follows alteration of the palatal plate thickness. Two healthy male subjects (24 years old), both born in Seoul and speakers of the Seoul dialect, were selected. First, their sounds /na(나)/, /da(다)/, /la(라)/, /ja(자)/, /cha(차)/, and /ta(타)/ were recorded without plates inserted, and then the sounds were recorded with palatal plates of different thicknesses. The plate was fabricated in 3 types: a palatal thickness of 1.0 mm (B-type), 2.5 mm (C-type), and 2.5 mm in the dentoalveolar portion with 1.0 mm in the remaining portion (D-type). A series of analyses was run on a computer (16-bit) to analyze the sound distortions. The analyses used LPC (without weighting, pre-weighting, post-weighting) of the consonant and vowel portions, the formant frequencies of the vowels, and the word duration of the consonants. The findings led to the following conclusions: 1. The distortion rates of the 2 informants were not correlated. 2. Generally, vowels were not affected by the palatal plate thickness in the formant analysis; however, more distortion was detected in the LPC analysis, especially with the C- and D-type plates. 3. Consonant distortion was more evident with the C- and D-type plates. 4. The second formant was the most disturbed and was reduced in all consonants when the palatal plate was inserted, especially with the C- and D-type plates. 5. Word duration was shortened with the plate inserted (except for /ja/ and /cha/), especially with the C- and D-types. 6. Dentoalveolar and hard palatal sounds were severely distorted with the plate inserted, and they were mainly affected by the thickness of the dentoalveolar portion. 7. There was a correlation between palatal thickness and consonant quality.

A Study on the recognition of local name using Spatio-Temporal method (Spatio-temporal방법을 이용한 지역명 인식에 관한 연구)

지원우
- Proceedings of the Acoustical Society of Korea Conference / 1993.06a / pp.121-124 / 1993
This paper is a study on word recognition using a neural network. A limited-vocabulary, speaker-independent, isolated word recognition system has been built. The system recognizes isolated words without performing segmentation, phoneme identification, or dynamic time warping. It uses a static pattern approach to recognize a spatio-temporal pattern. The preprocessing includes only the removal of leading and trailing silence and the determination of word length. An LPC analysis is performed on each of 24 equally spaced frames. The PARCOR coefficients plus 3 other features are extracted from each frame. To simplify the structure of the neural network, the outputs are encoded in binary form to reduce the number of output nodes.
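PARCOR (reflection) coefficients are usually obtained as a by-product of the Levinson-Durbin recursion on a frame's autocorrelation sequence. The sketch below is a generic version of that recursion, not the system described in the abstract; the function name `levinson_durbin` and the sign convention (prediction error e[n] = x[n] + Σ a[j]·x[n−j]) are assumptions, and some texts define PARCOR coefficients with the opposite sign.

```python
def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, ks, err): a[0] = 1 followed by the predictor coefficients,
    the reflection coefficients ks[0..order-1], and the final prediction
    error power.
    """
    if len(r) < order + 1:
        raise ValueError("need order + 1 autocorrelation values")
    a = [1.0] + [0.0] * order
    err = float(r[0])
    ks = []
    for i in range(1, order + 1):
        # Correlation of the current prediction error with lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, ks, err


# Autocorrelation of a first-order process with coefficient 0.5:
# only the first reflection coefficient is non-zero.
coeffs, parcor, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

Running this once per frame and concatenating the reflection coefficients of all frames would give a fixed-length input vector for a static pattern classifier; how the paper assembles its feature vector beyond what the abstract states is not known.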

Development of Integrated Speech Training Aids for Hearing Impaired (청각 장애인용 통합형 발음 훈련 기기의 개발)

박상희;김동준
- Journal of Biomedical Engineering Research / v.13 no.4 / pp.275-284 / 1992
In this study, a speech training aid that can display the vocal tract shape and other speech parameters together in real time in a single system is implemented, and a self-training program for this system is developed. To estimate the vocal tract shape, the speech production process is assumed to be an AR model. Through LPC analysis, the vocal tract shape, intensity, and log spectrum are calculated. The fundamental frequency and nasality are measured using vibration sensors.
