Integrated Search | Korea Science

Walsh변환을 이용한 한국어 숫자음 음성분석에 관한 연구 (A Study on Korean Speech Analysis using Walsh Transform)

김계현;김준현
- 대한전기학회논문지
- Vol. 37, No. 4
- pp.251-256
- 1988
This work describes a speech analysis of the Korean numbers ('1'-'10'), spoken by several speakers, using the Fast Walsh Transform (FWHT) method. The FWHT involves only addition and subtraction operations, so it is faster and needs less memory than FFT (Fast Fourier Transform) or LPC (Linear Predictive Coding) analysis. We investigated whether the FWHT method can find speaker-independent features (cues that stay the same for a given word regardless of the speaker). In the experiment, 70% of the utterances of the same word (Korean number '2') spoken by several speakers showed similar patterns.
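
The addition-and-subtraction-only structure mentioned in the abstract above can be illustrated with a minimal sketch of a fast Walsh-Hadamard butterfly. This is not the authors' implementation: the function name `fwht`, the natural (Hadamard) output ordering, and the absence of normalization are assumptions made for illustration. A sequency-ordered Walsh transform would differ only by a permutation of the outputs.

```python
def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    Uses only additions and subtractions; len(x) must be a power of two.
    """
    a = list(x)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    h = 1
    while h < n:
        # Combine pairs of elements that are h apart (one butterfly stage).
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                u, v = a[j], a[j + h]
                a[j], a[j + h] = u + v, u - v
        h *= 2
    return a


if __name__ == "__main__":
    frame = [1, 0, 1, 0, 0, 1, 1, 0]  # hypothetical 8-sample frame
    print(fwht(frame))
```

Applying `fwht` twice returns the input scaled by its length, which gives a quick sanity check without a separate inverse routine.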

구개상의 두께가 한국어 단모음 발음에 미치는 영향에 관한 연구 -컴퓨터를 이용한 선형 예측 분석과 LOG AREA RATIO 분석- (A STUDY OF THE KOREAN SINGLE VOWEL SOUND DISTORTION IN RELATION TO THE PALATAL PLATE THICKNESS -LINEAR PREDICTION CORRELATION AND LOG AREA RATIO ANALYSES BY COMPUTER-)

이정만;최대균;박남수;최부병
- 대한치과보철학회지
- Vol. 26, No. 1
- pp.31-49
- 1988
This study was performed to investigate the sound distortion that follows alteration of the palatal plate thickness. Three subjects who were born in Seoul and spoke the Seoul dialect were recruited from the male student population of K university. First, their pronunciations of /아(a)/, /어(e)/, /오(o)/, /우(u)/, /으(ɨ)/, /이(i)/, and /에(e)/ were recorded without a plate inserted, and then the sounds were recorded with palatal plates of different thicknesses. The palatal plates were constructed to cover the alveolar and palatal surfaces of the maxilla in three types: approximately 1.0 mm thick; approximately 2.5 mm thick; and 2.5 mm thick over the alveolar ridge with 1.0 mm elsewhere. These were named B-, C-, and D-type, respectively. A series of analyses was run on a computer (16-bit IBM PC/AT) to analyze the sound distortions, using LPC and Log Area Ratio analysis. The findings led to the following conclusions: 1. Sound distortions were relatively small in every condition and for every informant; however, /이(i)/ was the most distorted vowel in all conditions. 2. By and large, sound distortion was larger with the C- and D-type plates. However, the distortion rates showed no correlation across the 3 informants or across the tested vowels. 3. The distortion rates obtained by LPC and by Log Area Ratio analysis were similar. 4. The sound distortion caused by inserting a plate could be expressed as a numeric value with the LPC and Log Area Ratio methods.
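
As a rough illustration of the two analyses named in the abstract above, the sketch below derives LPC coefficients and reflection (PARCOR) coefficients from an autocorrelation sequence with the Levinson-Durbin recursion, then converts the reflection coefficients to log area ratios. It is not the authors' program. The function names, the sign convention (prediction-error filter A(z) = 1 + sum of a[j] z^-j), and the orientation of the ratio, log((1 - k) / (1 + k)), are assumptions; texts differ on both conventions.

```python
import math


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, reflection, err): a[0] == 1.0 and a[1:] are the coefficients of
    the prediction-error filter, reflection holds one coefficient per stage,
    and err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = float(r[0])
    reflection = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        reflection.append(k)
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, reflection, err


def log_area_ratios(reflection):
    """Log area ratios under the assumed convention; requires |k| < 1."""
    return [math.log((1.0 - k) / (1.0 + k)) for k in reflection]


if __name__ == "__main__":
    # Hypothetical autocorrelation of a first-order process, r[m] = 0.5 ** m.
    a, ks, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
    print(a, ks, err, log_area_ratios(ks))
```

A distortion measure like the one the study reports could then compare the coefficient vectors of the plate and no-plate recordings, but the exact distance used there is not stated in the abstract.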

SOLA를 이용한 더빙 신호의 시간축 동기화 (Time-Synchronization Method for Dubbing Signal Using SOLA)

이기승;지철근;차일환;윤대희
- 방송공학회논문지
- Vol. 1, No. 2
- pp.85-95
- 1996
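This paper proposes a method for time-aligning a dubbed signal with the original speech signal using SOLA (Synchronized Overlap and Add), a technique widely used for time-scale modification of speech. In broadcast recording, re-recording in a studio is sometimes needed, for example because of high-level background noise. Such a re-recorded signal differs in timing from the original recording by roughly 200 msec, and this difference causes a mismatch between lip movements and speech when the audio is combined with the picture. To solve this problem, the start points of the word phrases (eojeol) in the original and dubbed signals are first aligned using the energy contour, and then LPC cepstrum analysis and DTW (Dynamic Time Warping) are applied to synchronize the phoneme positions within each phrase. The points where the phonemes match are determined by searching for the points where the squared error between the LPC cepstra of the original and dubbed signals is minimal. During synthesis, the SOLA method is used so that the phase relationship between adjacent frames stays consistent. Computer simulations confirmed that the speech signal time-corrected by the proposed algorithm coincides in time with the original recording in terms of waveform, spectrogram, and listening.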
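
The alignment step described above, finding the frame correspondence that minimizes the squared error between feature vectors, can be sketched with a basic dynamic time warping routine. This is a generic textbook DTW under stated assumptions, not the paper's algorithm: the function name `dtw_align`, the symmetric step pattern, and the use of plain lists of per-frame feature vectors (standing in for LPC cepstra) are all illustrative choices.

```python
def dtw_align(ref, dub):
    """Align two sequences of feature vectors with dynamic time warping.

    The local cost is the squared Euclidean distance between frames.
    Returns (total_cost, path), where path is a list of (ref_index, dub_index).
    """
    inf = float("inf")
    n, m = len(ref), len(dub)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((p - q) ** 2 for p, q in zip(ref[i - 1], dub[j - 1]))
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    original = [[0.0], [1.0], [2.0]]       # hypothetical 1-D features per frame
    dubbed = [[0.0], [1.0], [1.0], [2.0]]  # same content, one frame stretched
    print(dtw_align(original, dubbed))
```

The returned path tells which dubbed frame corresponds to each original frame; a time-scale modification stage such as SOLA would then stretch or compress the dubbed signal accordingly.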

자연스러운 정서 반응의 범주 및 차원 분류에 적합한 음성 파라미터 (Acoustic parameters for induced emotion categorizing and dimensional approach)

박지은;박정식;손진훈
- 감성과학
- Vol. 16, No. 1
- pp.117-124
- 2013
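This study examined how well the emotion in natural speech can be recognized, in terms of categories and dimensions, using acoustic features commonly used in speech recognizers: MFCC, LPC, energy, and pitch-related parameters. To obtain natural emotional response data, emotion-inducing stimuli whose validity and effectiveness had been established in previous studies were used; seven emotion-inducing stimuli were presented to 110 university students, and the induced vocal responses were recorded and used for analysis. With the parameters extracted from each speech sample as independent variables, the seven emotion categories were classified by linear discriminant analysis (LDA). To overcome the limitations of categorical classification, stepwise multiple regression models were derived to identify the speech feature parameters that best predict four emotion dimensions (valence, arousal, intensity, potency). The average discrimination rate for the seven emotion categories was 62.7%, and the regression models predicting the four dimensions were also statistically significant at the p<.001 level. In conclusion, the results identify parameters that are useful for classifying natural emotional vocal responses and show that emotion can be classified through both categorical and dimensional approaches; directions for improving the study are described in the discussion.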

한국어의 경음에 대한 분석 (Analysis of Unaspirated sound for Korean)

임수호;김주곤;김범국;정호열;정현열
- 한국음향학회:학술대회논문집
- 한국음향학회 2004년도 춘계학술발표대회 논문집, Vol. 23, No. 1
- pp.41-44
- 2004
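This paper examines the phonological and acoustic characteristics of the tense consonants (경음), which appear only in Korean, performs speech recognition experiments based on them, and analyzes the results. For the recognition experiments, the input speech was labeled with 48 phoneme-like units (PLU; Phoneme Likely Unit), and phoneme and word recognition experiments were carried out for each phoneme group while increasing the LPC (Linear Predictive Coding) analysis order. In the phoneme recognition experiment, the tense consonant group had the lowest recognition rate, indicating that more analysis of tense consonants is needed. The best results were obtained at LPC order 23, where the tense consonant and overall phoneme recognition rates were 34.11% and 46.1%, respectively. In the word recognition experiment, recognition rates were also highest at LPC orders 23 and 25, at 81.68% and 81.87%. These results indicate that Korean tense consonants are closely related to the recognition performance of the whole system.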

강인한 음성인식을 위한 이중모드 센서의 결합방식에 관한 연구 (A Study on Combining Bimodal Sensors for Robust Speech Recognition)

이철우;계영철;고인선
- 한국음향학회지
- Vol. 20, No. 6
- pp.51-56
- 2001
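Methods that use lip movements together with speech have recently been studied actively to make speech recognition reliable in very noisy environments. For the same purpose, this paper proposes a method that combines the results of a visual speech recognizer and an acoustic speech recognizer by weighting each of them. In particular, it proposes a way to determine the weights automatically according to the noise level of the input speech. The correlation between input samples and the residual error of LPC analysis are used to determine the weights. In simulations, the recognizer combined in this way achieved a recognition rate of about 83% even in very noisy environments.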

에너지 연산자에 기초한 간단한 피치 추적 방법 (A Simple Pitch Tracking Algorithm based on the Energy Operator)

Tai-Ho Lee
- 융합신호처리학회논문지
- Vol. 5, No. 1
- pp.1-5
- 2004
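A new method is presented for estimating the pitch frequency contour of voiced speech. The method is based on applying the energy operator [1] twice. Kaiser's energy operator can extract the amplitude and frequency information of a sinusoid. According to the modulation model, voiced speech can be regarded as a combination of formants modulated by the pitch signal, so extracting the amplitude envelope of this waveform yields a waveform resembling the pitch signal. The pitch frequency is then obtained by detecting the average frequency of this waveform. The first part is the same as Gopalan's approach [9], but instead of the subsequent steps such as LPC spectrum analysis, the energy operator is applied once more, which gives a much simpler algorithm that can be used online. The estimates are rough, but the method is expected to be useful for obtaining a general sketch of the pitch contour online.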
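
For reference, the discrete-time energy operator attributed to Kaiser in the abstract above is commonly written as ψ[x(n)] = x(n)² - x(n-1)·x(n+1). The sketch below shows only this operator, not the paper's two-pass pitch tracker; the function name `teager_energy` and the test signal are assumptions for illustration. For a pure sinusoid A·cos(Ωn + φ) the output is the constant A²·sin²(Ω), which is why the operator carries both amplitude and frequency information.

```python
import math


def teager_energy(x):
    """Discrete energy operator: psi[n] = x[n]**2 - x[n-1] * x[n+1].

    The output has len(x) - 2 samples because both neighbors are needed.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


if __name__ == "__main__":
    amplitude, omega = 2.0, 0.3  # hypothetical sinusoid parameters (rad/sample)
    signal = [amplitude * math.cos(omega * n + 0.5) for n in range(50)]
    energy = teager_energy(signal)
    print(energy[0], (amplitude * math.sin(omega)) ** 2)
```

Only three adjacent samples are needed per output value, which is consistent with the abstract's point that the approach suits online use.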

구개상의 두께에 따른 한국어 자음의 발음 변화에 관한 컴퓨터 분석 - 치조음, 경구개음- (A COMPUTER ANALYSIS ON THE KOREAN CONSONANT SOUND DISTORTION IN RELATION TO THE PALATAL PLATE THICKNESS -Dentoalveolar and hard palatal consonant-)

우이형;최대균;최부병;박남수
- 대한치과보철학회지
- Vol. 25, No. 1
- pp.71-94
- 1987
This study was carried out to investigate the sound distortion that follows alteration of the palatal plate thickness. Two healthy male subjects (24 years old) were selected; both were born in Seoul and spoke the Seoul dialect. First, their pronunciations of /na(나)/, /da(다)/, /la(라)/, /ja(자)/, /cha(차)/, and /ta(타)/ were recorded without a plate inserted, and then the sounds were recorded with palatal plates of different thicknesses, in succession. The plates were fabricated in three types: a palatal thickness of 1.0 mm; a palatal thickness of 2.5 mm; and 2.5 mm over the dentoalveolar portion with 1.0 mm over the remaining portion. These were named B-, C-, and D-type, respectively. A series of analyses was run on a computer (16 bit) to analyze the sound distortions. The recordings were analyzed by LPC (without weighting, pre-weighting, post-weighting) of the consonant and vowel portions, the formant frequencies of the vowels, and the word duration of the consonants. The findings led to the following conclusions: 1. The distortion rates of the 2 informants were not correlated. 2. Generally, vowels were not affected by palatal plate thickness in the formant analysis; however, more distortion was detected in the LPC analysis, especially with the C- and D-type plates. 3. Consonant distortion was more evident with the C- and D-type plates. 4. The second formant was the most disturbed and was reduced in all consonants when a palatal plate was inserted, especially with the C- and D-type plates. 5. Word duration was shortened with a plate inserted (except for /ja/ and /cha/), especially with the C- and D-types. 6. Dentoalveolar and hard palatal sounds were severely distorted with a plate inserted, and they were mainly affected by the thickness of the dentoalveolar portion. 7. There was a correlation between palatal thickness and consonant quality.

Spatio-temporal방법을 이용한 지역명 인식에 관한 연구 (A Study on the recognition of local name using Spatio-Temporal method)

지원우
- 한국음향학회:학술대회논문집
- 한국음향학회 1993년도 학술논문발표회 논문집, Vol. 12, No. 1
- pp.121-124
- 1993
This paper is a study on word recognition using a neural network. A limited-vocabulary, speaker-independent, isolated word recognition system has been built. The system recognizes isolated words without performing segmentation, phoneme identification, or dynamic time warping. It uses a static-pattern approach to recognize a spatio-temporal pattern. The preprocessing consists only of removing leading and trailing silence and determining the word length. An LPC analysis is performed on each of 24 equally spaced frames. The PARCOR coefficients plus 3 other features are extracted from each frame. To simplify the structure of the neural network, the outputs are encoded in binary form, which reduces the number of output nodes.

청각 장애인용 통합형 발음 훈련 기기의 개발 (Development of Integrated Speech Training Aids for Hearing Impaired)

박상희;김동준
- 대한의용생체공학회:의공학회지
- Vol. 13, No. 4
- pp.275-284
- 1992
In this study, a speech training aid that can display the vocal tract shape and other speech parameters together in real time in a single system is implemented, and a self-training program for the system is developed. To estimate the vocal tract shape, the speech production process is assumed to be an AR model. Through LPC analysis, the vocal tract shape, intensity, and log spectrum are calculated. The fundamental frequency and nasality are measured using vibration sensors.

95 search results (processing time: 0.029 s)
