Integrated Search | Korea Science

Voice Activity Detection with Run-Ratio Parameter Derived from Runs Test Statistic

Oh, Kwang-Cheol
- 음성과학
- Vol. 10, No. 1
- pp.95-105
- 2003
This paper describes a new parameter for voice activity detection, which serves as the front end of automatic speech recognition systems. The new parameter, called the run-ratio, is derived from the runs test statistic used to test a given sequence for randomness. For a random sequence, the run-ratio is about 1. To apply the run-ratio to voice activity detection, the samples of the input audio signal are converted to a binary sequence of positive and negative values. Silence regions in the audio signal can then be regarded as random sequences, so their run-ratio is about 1. The run-ratio is far lower than 1 for voiced regions and higher than 1 for fricative sounds. The run-ratio can therefore discriminate speech from background sounds. The proposed voice activity detector outperformed the conventional energy-based detector in terms of error mean and variance, showing small deviation from the true speech boundaries and a low chance of missing real utterances.
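The abstract does not give the exact run-ratio formula, so the following is only a minimal sketch under one plausible reading: the number of observed sign runs divided by the number of runs the Wald-Wolfowitz runs test expects for a random sequence with the same counts of positive and negative samples. The function names, the handling of single-sign frames, and the synthetic test signals are illustrative assumptions, not the author's implementation.

```python
import math
import random


def count_runs(bits):
    """Count maximal blocks of identical consecutive symbols."""
    if not bits:
        return 0
    return 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)


def run_ratio(samples):
    """Observed runs divided by the runs expected for a random sequence.

    Assumption: the expected count is the runs-test mean 2*n1*n2/(n1+n2) + 1.
    """
    bits = [1 if s >= 0 else 0 for s in samples]
    n_pos = sum(bits)
    n_neg = len(bits) - n_pos
    if n_pos == 0 or n_neg == 0:
        # Sketch choice: a frame with a single sign has no sign changes.
        return 0.0
    expected = 2.0 * n_pos * n_neg / (n_pos + n_neg) + 1.0
    return count_runs(bits) / expected


# Synthetic stand-ins (assumptions): white noise for silence, a 100 Hz tone
# at 8 kHz for a voiced sound, a sign-alternating signal for a fricative.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
tone = [math.sin(2.0 * math.pi * 100.0 * t / 8000.0 + 0.1) for t in range(4000)]
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(4000)]

for name, frame in (("noise", noise), ("low tone", tone), ("alternating", alternating)):
    print(name, round(run_ratio(frame), 3))
```

Under these assumptions the noise frame lands near 1, the low-frequency tone well below 1, and the rapidly alternating signal well above 1, which matches the qualitative behaviour described in the abstract.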

Robust Entropy Based Voice Activity Detection Using Parameter Reconstruction in Noisy Environment

Han, Hag-Yong;Lee, Kwang-Seok;Koh, Si-Young;Hur, Kang-In
- Journal of information and communication convergence engineering
- Vol. 1, No. 4
- pp.205-208
- 2003
Voice activity detection is an important problem in speech recognition and speech communication. This paper introduces a new feature parameter, reconstructed from the spectral entropy of information theory, for robust voice activity detection in noisy environments, and then analyzes it and compares its performance with that of the energy method of voice activity detection. In experiments, we confirmed that the spectral entropy and its reconstructed parameter are superior to the energy method for robust voice activity detection in various noise environments.
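As a rough illustration of the underlying feature, the sketch below computes the spectral entropy of one frame in the usual way: the power spectrum is normalized to a probability mass function and its Shannon entropy is taken. The "reconstructed" parameter of the paper is not specified in the abstract and is not attempted here. The naive DFT, the normalization by the log of the bin count, and the test frames are assumptions made to keep the example self-contained.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """One-sided power spectrum via a naive DFT (fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    spectrum = power_spectrum(frame)
    total = sum(spectrum)
    if total <= 0.0:
        return 0.0  # an all-zero frame carries no spectral distribution
    probs = [p / total for p in spectrum]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


# Hypothetical frames: a pure tone (peaky spectrum) and white noise (flat spectrum).
rng = random.Random(1)
tone_frame = [math.sin(2.0 * math.pi * 8.0 * t / 64.0) for t in range(64)]
noise_frame = [rng.gauss(0.0, 1.0) for _ in range(64)]

print("tone entropy ", round(spectral_entropy(tone_frame), 3))
print("noise entropy", round(spectral_entropy(noise_frame), 3))
```

A structured, harmonic frame concentrates its power in a few bins and yields a low entropy, whereas a noise-like frame spreads power across bins and yields a high one. Unlike frame energy, this measure does not change when the frame is scaled, which is one reason entropy features are considered for noisy conditions.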

엔트로피 차와 신호의 에너지에 기반한 잡음환경에서의 음성검출 (Voice Activity Detection Based on Signal Energy and Entropy-difference in Noisy Environments)

하동경;조석제;진강규;신옥근
- Journal of Advanced Marine Engineering and Technology
- Vol. 32, No. 5
- pp.768-774
- 2008
In many areas of speech signal processing, such as automatic speech recognition and packet-based voice communication, VAD (voice activity detection) plays an important role in the performance of the overall system. In this paper, we present a new feature parameter for VAD, which is the product of the energy of the signal and the difference of two types of entropy. To this end, we first define a Mel filter-bank based entropy and calculate its difference from the conventional entropy in the frequency domain. The difference is then multiplied by the spectral energy of the signal to yield the final feature parameter, which we call PEED (product of energy and entropy difference). Through experiments, we verified that the proposed VAD parameter is more efficient than the conventional spectral entropy based parameter across various SNRs and noisy environments.
https://doi.org/10.5916/jkosme.2008.32.5.768
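The PEED abstract can be read as a three-step recipe: compute a Mel filter-bank entropy, subtract the conventional spectral entropy, and multiply by the spectral energy. The sketch below follows that reading only loosely. The triangular filter design, the number of filters, the normalization of each entropy by the log of its bin count, and the sign of the difference are all assumptions, since the abstract specifies none of them.

```python
import math


def entropy(values):
    """Shannon entropy of non-negative values after normalizing them to sum to 1."""
    total = sum(values)
    if total <= 0.0:
        return 0.0
    return -sum((v / total) * math.log(v / total) for v in values if v > 0.0)


def hz_to_mel(freq_hz):
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_energies(power, sample_rate, n_filters=20):
    """Energies of triangular filters spaced evenly on the mel scale (assumed design)."""
    nyquist = sample_rate / 2.0
    top_mel = hz_to_mel(nyquist)
    edges = [mel_to_hz(top_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = nyquist / (len(power) - 1)
    bands = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        energy = 0.0
        for k, p in enumerate(power):
            freq = k * bin_hz
            if lo < freq <= mid:
                energy += p * (freq - lo) / (mid - lo)
            elif mid < freq < hi:
                energy += p * (hi - freq) / (hi - mid)
        bands.append(energy)
    return bands


def peed(power, sample_rate, n_filters=20):
    """Energy times (mel entropy minus spectral entropy); the sign is an assumption."""
    energy = sum(power)
    h_spec = entropy(power) / math.log(len(power))
    bands = mel_band_energies(power, sample_rate, n_filters)
    h_mel = entropy(bands) / math.log(len(bands))
    return energy * (h_mel - h_spec)


# Hypothetical one-sided power spectrum: a flat floor plus one spectral peak.
example_power = [1.0 + 50.0 * math.exp(-((k - 20) / 4.0) ** 2) for k in range(129)]
print(peed(example_power, 8000))
```

Because both entropies are scale invariant, the energy factor is what makes this value grow with signal level; the entropy difference only shapes it.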

성대마비로 인한 기식 음성에 대한 Cepstral 분석 (A Cepstral Analysis of Breathy Voice with Vocal Fold Paralysis)

강영애;성철재
- 말소리와 음성과학
- Vol. 4, No. 2
- pp.89-94
- 2012
The aim of this study is to investigate the usefulness of the parameters CPP (cepstral peak prominence) and LTAS (long term average spectrum) band energy for the analysis of breathy voice with vocal fold paralysis. Thirty-four female subjects who had vocal fold paralysis after thyroidectomy participated in this study. According to the perceptual judgements of three speech pathologists and one phonetic scholar, the subjects were divided into two groups: a breathy voice group (n = 21) and a non-breathy voice group (n = 13). A maximum sustained phonation task was measured for acoustic analysis. CPP-related (i.e. mean F0, mean CPP, and mean CPPs) and LTAS-related (i.e. minimum, maximum, and mean) parameters were used. An independent samples t-test was conducted. Regarding CPP, there are significant differences in mean CPP and mean CPPs between the groups. The values of mean CPP and CPPs in the non-breathy voice group are higher than those in the breathy voice group. CPP could be regarded as a useful parameter for breathy voice analysis in the clinic. As for LTAS, the energy from 0 to 2 kHz is significantly different between the groups. The minimum value of the non-breathy group is lower than that of the breathy group, whereas the maximum value of the non-breathy group is higher. The frequency band below 2 kHz seems to be related to breathy voice.
https://doi.org/10.13064/KSSS.2012.4.2.089

음향 파라미터에 의한 정서적 음성의 음질 분석 (Analysis of the Voice Quality in Emotional Speech Using Acoustical Parameters)

조철우;리타오
- 대한음성학회지:말소리
- Vol. 55
- pp.119-130
- 2005
The aim of this paper is to investigate some acoustical characteristics of the voice quality features in an emotional speech database. Six different parameters are measured and compared for 6 different emotions (normal, happiness, sadness, fear, anger, boredom) and 6 different speakers. Inter-speaker variability and intra-speaker variability are measured. Some intra-speaker consistency in how the parameters change across the emotions is observed, but inter-speaker consistency is not observed.

음성 활동 구간 검출을 위한 스펙트랄 엔트로피의 재구성 효과 (Reconstruction Effect of the Spectral Entropy for the Voice Activity Detection)

권호민;한학용;이광석;고시영;허강인
- 한국음향학회:학술대회논문집
- 한국음향학회 (Acoustical Society of Korea) 2002 Summer Conference Proceedings, Vol. 21, No. 1
- pp.25-28
- 2002
Voice activity detection is an important problem in speech recognition and communication. This paper introduces a feature parameter, reconstructed from the spectral entropy of information theory, for robust voice activity detection in noisy environments, and analyzes it and compares its performance with that of the energy method of voice activity detection. In experiments, we confirmed that the spectral entropy is a better feature parameter than the energy method for robust voice activity detection in various noise environments.

Classification of Pathological Voice Using Artificial Neural Network with Normalized Parameters

Li, Tao;Bak, Il-Suh;Jo, Cheol-Woo
- 음성과학
- Vol. 11, No. 1
- pp.21-29
- 2004
In this paper, we examined the effect of normalization on classifying voices into normal and abnormal (pathological) classes using an artificial neural network. The average value of each parameter was used to normalize each set of parameter values. Artificial neural networks were used as classifiers. The effect of normalization was evaluated by comparing the discrimination results between the original and normalized parameter sets.
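The normalization step can be sketched as follows, assuming that "normalizing by the average" means dividing every value of a parameter by that parameter's mean over the data set (the abstract does not say whether division or subtraction was used). The parameter values below are made up for illustration.

```python
def normalize_by_mean(rows):
    """Divide each column (one acoustic parameter) by its mean over all rows."""
    n_rows = len(rows)
    n_params = len(rows[0])
    means = [sum(row[j] for row in rows) / n_rows for j in range(n_params)]
    normalized = []
    for row in rows:
        normalized.append(
            [row[j] / means[j] if means[j] != 0.0 else 0.0 for j in range(n_params)]
        )
    return normalized


# Hypothetical data: each row is one voice sample, each column one parameter
# on a very different scale.
samples = [
    [0.8, 120.0, 3.5],
    [1.2, 180.0, 2.5],
    [1.0, 150.0, 3.0],
]

for row in normalize_by_mean(samples):
    print([round(v, 3) for v in row])
```

After this step every parameter has a mean of 1, so parameters with large raw magnitudes no longer dominate the inputs of a classifier such as a neural network.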

새로운 시간축 정규화 방법을 이용한 한국어 고립단어 인식기 (Korean isolated word recognizer using new time alignment method of speech signal)

남명우;박규홍;노승용
- 대한전자공학회논문지SP
- Vol. 38, No. 5
- pp.567-575
- 2001
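This paper proposes a new method for obtaining parameters of a fixed size regardless of the utterance length of the speech signal. The performance of a speech recognizer depends on how the similarity (the distance between patterns) between parameters extracted from speech signals is compared. However, speaker-dependent variation in the speech signal and differences in speaking rate make it difficult to extract fixed-size parameters from the speech signal. The proposed method represents the parameters obtained from the speech signal as a spectrogram and then normalizes them to a fixed size using a two-dimensional DCT (Discrete Cosine Transform). To verify the effectiveness of the proposed method, speech parameters obtained from 32 band-pass filters modeling auditory cells were processed with the 2D DCT and then used as the input to a neural network. To compare recognition rates with existing approaches, one existing method for obtaining normalized input was selected and a comparative experiment was performed. The experiments showed that the proposed method achieved higher recognition rates and faster recognition than the existing method in both speaker-dependent and speaker-independent isolated word recognition.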
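One way to obtain a fixed-size representation from a variable-length spectrogram with a 2D DCT is to keep only a fixed block of low-order coefficients. The sketch below does this with a direct DCT-II. The block size (8 x 8), the division by the number of time-frequency cells, and the synthetic spectrograms are assumptions; the paper's actual coefficient selection is not described in the abstract.

```python
import math


def dct2_fixed_size(spectrogram, keep_time=8, keep_freq=8):
    """Low-order 2D DCT-II coefficients of a (frames x bands) spectrogram.

    The output is always keep_time x keep_freq, whatever the number of frames.
    Dividing by the cell count (an assumption) keeps the scale duration independent.
    """
    n_t = len(spectrogram)
    n_f = len(spectrogram[0])
    coeffs = []
    for u in range(keep_time):
        time_basis = [math.cos(math.pi * (t + 0.5) * u / n_t) for t in range(n_t)]
        row = []
        for v in range(keep_freq):
            freq_basis = [math.cos(math.pi * (f + 0.5) * v / n_f) for f in range(n_f)]
            acc = 0.0
            for t in range(n_t):
                for f in range(n_f):
                    acc += spectrogram[t][f] * time_basis[t] * freq_basis[f]
            row.append(acc / (n_t * n_f))
        coeffs.append(row)
    return coeffs


def fake_spectrogram(n_frames, n_bands=32):
    """Hypothetical stand-in for the outputs of 32 band-pass filters over time."""
    return [
        [1.0 + math.sin(math.pi * t / n_frames) * math.cos(math.pi * f / n_bands)
         for f in range(n_bands)]
        for t in range(n_frames)
    ]


short_word = dct2_fixed_size(fake_spectrogram(30))
long_word = dct2_fixed_size(fake_spectrogram(50))
print(len(short_word), len(short_word[0]), len(long_word), len(long_word[0]))
```

Both utterances map to the same 8 x 8 block even though they differ in length, so the result can feed a network with a fixed number of inputs.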

애성환자에서 음향지표인 RAP, PPQ 및 APQ의 유용성 (Significance of Acoustic Parameter - RAP, PPQ, APQ- in Hoarseness)

안철민;이종혁;강현국;이용배
- 대한후두음성언어의학회지
- Vol. 6, No. 1
- pp.22-26
- 1995
Changes of voice, especially hoarseness, reflect irregular vibration of the vocal cords. Computerized acoustic analysis has therefore provided many acoustic parameters for the objective evaluation of voice. We objectively investigated the vocal vibration of normal persons and patients with hoarseness in Korea. The RAP (relative average perturbation), PPQ (pitch period perturbation quotient) and APQ (amplitude perturbation quotient) of normal persons were compared with those of patients with hoarseness, using the Multi-Dimensional Voice Program, to assess the possibility of distinguishing pathologic vocal vibration from normal. Multivariate statistical analysis showed that RAP, PPQ and APQ differed between the normal persons and the patients with hoarseness. In conclusion, RAP, PPQ and APQ might be meaningful screening parameters for distinguishing patients with hoarseness from normal persons.
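The three measures share one structure: the mean absolute deviation of each cycle from a local moving average, divided by the overall mean. The sketch below uses smoothing windows of 3 cycles for RAP, 5 for PPQ and 11 for APQ, as these measures are commonly defined; the abstract itself does not state the windows, so treat them, the percent scaling, and the sample data as assumptions rather than the exact Multi-Dimensional Voice Program computation.

```python
def perturbation_quotient(values, window):
    """Mean |x_i - local mean over `window` cycles| over the global mean, in percent."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    n = len(values)
    if n < window:
        raise ValueError("need at least `window` cycles")
    half = window // 2
    deviations = []
    for i in range(half, n - half):
        local_mean = sum(values[i - half:i + half + 1]) / window
        deviations.append(abs(values[i] - local_mean))
    mean_deviation = sum(deviations) / len(deviations)
    mean_value = sum(values) / n
    return 100.0 * mean_deviation / mean_value


def rap(periods):
    """Relative average perturbation of pitch periods (3-cycle window, assumed)."""
    return perturbation_quotient(periods, 3)


def ppq(periods):
    """Pitch period perturbation quotient (5-cycle window, assumed)."""
    return perturbation_quotient(periods, 5)


def apq(amplitudes):
    """Amplitude perturbation quotient of peak amplitudes (11-cycle window, assumed)."""
    return perturbation_quotient(amplitudes, 11)


# Hypothetical pitch periods in milliseconds: a steady voice and an irregular one.
steady = [8.00, 8.01, 7.99, 8.00, 8.02, 7.98, 8.00, 8.01, 7.99, 8.00, 8.01, 7.99]
irregular = [8.0, 8.6, 7.5, 8.4, 7.7, 8.8, 7.4, 8.3, 7.9, 8.7, 7.6, 8.2]

print("steady    RAP %.3f  PPQ %.3f" % (rap(steady), ppq(steady)))
print("irregular RAP %.3f  PPQ %.3f" % (rap(irregular), ppq(irregular)))
```

Larger cycle-to-cycle irregularity raises all three values, which is why such measures are candidates for screening hoarse voices against normal ones.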

기식성 애성 판정을 위한 객관적 음향지표 : VTI(Voice Turbulence Index)의 유용성 (Acoustic Parameter for an Objective Assessment of Breathiness : The Significance of Voice Turbulence Index(VTI))

김형태;김민식;조승호
- 대한음성언어의학회:학술대회논문집
- 대한음성언어의학회 6th Academic Conference Symposium, 1996
- pp.78-78
- 1996
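Acoustic parameters that can objectively assess breathy hoarseness have not yet been studied extensively, and assessment currently relies only on psychoacoustic evaluation. The authors sought to confirm the usefulness of the VTI (voice turbulence index) of the Multi-Dimensional Voice Program (model 4305, Kay Elemetrics Corp, USA), which can serve as an objective acoustic parameter for breathy hoarseness in computerized acoustic analysis, by comparing it between normal subjects and patients with vocal fold lesions. (abridged)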
