• Title/Summary/Keyword: 김화자


Members' Works (회원작품)

  • Korea Institute of Registered Architects
    • Korean Architects / no.9 s.138 / pp.41-55 / 1980

Speaker Identification Using Higher-Order Statistics In Noisy Environment (고차 통계를 이용한 잡음 환경에서의 화자식별)

  • Shin, Tae-Young;Kim, Gi-Sung;Kwon, Young-Uk;Kim, Hyung-Soon
    • The Journal of the Acoustical Society of Korea / v.16 no.6 / pp.25-35 / 1997
  • Most speech analysis methods developed to date are based on second-order statistics, and one of their biggest drawbacks is a dramatic performance degradation in noisy environments. In contrast, methods based on higher-order statistics (HOS), which have the property of suppressing Gaussian noise, enable robust feature extraction in noisy environments. In this paper we propose a text-independent speaker identification system using higher-order statistics and compare its performance with that of the conventional second-order-statistics-based method in both white and colored noise environments. The proposed speaker identification system is based on the vector quantization approach and employs an HOS-based voiced/unvoiced detector to extract feature parameters from voiced speech only, which has a non-Gaussian distribution and is known to contain most speaker-specific characteristics. Experimental results on a 50-speaker database show that the higher-order-statistics-based method gives better identification performance than the conventional second-order-statistics-based method in noisy environments.

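The claim that higher-order statistics suppress Gaussian noise can be illustrated with the normalized fourth-order cumulant (excess kurtosis), which is zero in theory for a Gaussian signal and nonzero for most non-Gaussian ones. The sketch below is only an illustration under stated assumptions: the helper `excess_kurtosis`, the sample count, and the Laplacian stand-in for a non-Gaussian signal are hypothetical choices, not the feature extraction used in the paper.

```python
import random


def excess_kurtosis(samples):
    """Normalized fourth-order cumulant: E[(x - m)^4] / var^2 - 3."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    var = sum(c * c for c in centered) / n
    fourth = sum(c ** 4 for c in centered) / n
    return fourth / (var * var) - 3.0


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000

# Gaussian noise: its fourth-order cumulant is zero in theory.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Laplacian samples (difference of two exponentials): an assumed stand-in
# for a non-Gaussian signal, not a model of voiced speech.
laplacian = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]

k_gauss = excess_kurtosis(gaussian)
k_laplace = excess_kurtosis(laplacian)
print(f"Gaussian  excess kurtosis: {k_gauss:+.3f}")
print(f"Laplacian excess kurtosis: {k_laplace:+.3f}")
```

A second-order statistic such as the variance is of the same order for both signals here, whereas the fourth-order cumulant stays near zero only for the Gaussian one, which is why a cumulant-based feature is less affected by additive Gaussian noise.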

The Reduction of Computation in MLLR Framework using PCA or ICA for Speaker Adaptation (화자적응에서 PCA 또는 ICA를 이용한 MLLR알고리즘 연산량 감소)

  • 김지운;정재호
    • The Journal of the Acoustical Society of Korea / v.22 no.6 / pp.452-456 / 2003
  • We discuss how to reduce the number and the dimension of the matrix inversions required in the MLLR framework for speaker adaptation. To find a smaller set of variables with less redundancy that still represents the data as well as possible, we apply PCA (principal component analysis) and ICA (independent component analysis). The additional computation introduced by PCA or ICA is small enough to be disregarded. With 10 components for ICA and 12 components for PCA, performance is similar to that of the ordinary MLLR framework with 36 components. If the dimension of the speaker-independent (SI) model parameters is n, the computation of the inverse matrices in MLLR is proportional to n⁴, i.e. O(n⁴). Compared with ordinary MLLR, the total computation required for speaker adaptation is therefore reduced to about 1/81 with PCA and about 1/167 with ICA.
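The reduction factors quoted in the abstract follow directly from the n⁴ scaling. The short check below reproduces the arithmetic; the function name `cost_ratio` is a hypothetical helper, and the only inputs are the component counts stated in the abstract (36, 12 and 10).

```python
FULL_DIM = 36  # components in ordinary MLLR, as stated in the abstract


def cost_ratio(reduced_dim, full_dim=FULL_DIM, exponent=4):
    """Relative inversion cost when the cost scales as n**exponent."""
    return (reduced_dim / full_dim) ** exponent


pca_factor = 1 / cost_ratio(12)  # (36 / 12) ** 4
ica_factor = 1 / cost_ratio(10)  # (36 / 10) ** 4
print(f"PCA: cost shrinks by a factor of about {pca_factor:.2f}")
print(f"ICA: cost shrinks by a factor of about {ica_factor:.2f}")
```

For PCA the factor is 3⁴ = 81. For ICA it is 3.6⁴ ≈ 167.96, so the 1/167 quoted in the abstract appears to be that value truncated.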

A Study on the Speaker Adaptation in CDHMM (CDHMM의 화자적응에 관한 연구)

  • Kim, Gwang-Tae
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.2 / pp.116-127 / 2002
  • A new approach is proposed that improves the speaker adaptation algorithm for a CDHMM speech recognizer by using a variable number of observation density functions. The proposed method uses observation density functions with more than one mixture in each state to represent speech characteristics in detail. The number of mixtures in each state is determined either by the number of frames or by the determinant of the variance. The MAP parameters are then extracted for every mixture determined by these two methods. In addition, the state segmentation required for speaker adaptation can segment the adaptation speech more precisely by using a speaker-independent model trained on a sufficiently large database as a priori knowledge. The state duration distribution is used to adapt the speech duration information that arises from the speaker's utterance habits and speed. The recognition rates of the proposed methods are significantly higher than that of the conventional method using one mixture in each state.

Framework Switching of Speaker Overlap Detection System (화자 겹침 검출 시스템의 프레임워크 전환 연구)

  • Kim, Hoinam;Park, Jisu;Cha, Shin;Son, Kyung A;Yun, Young-Sun;Park, Jeon Gue
    • Journal of Software Assessment and Valuation / v.17 no.1 / pp.101-113 / 2021
  • In this paper, we introduce a speaker overlap detection system and examine the process of converting the existing system to a different artificial intelligence framework. Speaker overlap occurs when two or more speakers speak at the same time during a conversation. It can degrade performance in speech recognition and speaker recognition, and detecting it is an active research topic because doing so can prevent that degradation. As artificial intelligence is applied more widely, demand for switching between artificial intelligence frameworks is growing. However, switching is difficult because performance degradation is often observed afterwards, owing to the unique characteristics of each framework. This paper explains the process of converting a Keras-based speaker overlap detection system to a PyTorch-based system and discusses the components to consider. After the switch, the PyTorch-based system performed better than the existing Keras-based system, which suggests that this work is valuable as a fundamental study on systematic framework conversion.

An Enhancement of Microphone Array System Using Hybrid Window Algorithm (Hybrid Window 알고리듬을 이용한 마이크로폰 어레이 시스템의 성능 개선)

  • Lee Hak-Ju;Kim Ki-Man;Lee Won-Cheol;Cha Il-Whan;Youn Dae-Hee;Lee Chungyong
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.185-188 / 2000
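  • In this study, we propose a system that tracks a speaker's position in real time using spatial information extracted from the speaker's speech signal, and we implement it in real time. CPSP (Cross Power Spectrum Phase), a representative existing algorithm for speaker localization, is vulnerable to the reverberation that occurs severely in indoor environments. To improve tracking performance, the implemented system therefore applies a proposed hybrid window algorithm that is robust to reverberation. The hybrid window algorithm designs a hybrid window suited to the indoor environment and applies it to the received speech signal, which reduces the cross-correlation caused by reverberant signals and raises the correlation of the direct-path signals, enabling more accurate time delay estimation. To analyze the performance of the proposed system, we tested the conventional CPSP algorithm and the system with the proposed hybrid window in a real environment, using hardware implemented in real time on a DSP. The system with the proposed algorithm tracked the speaker's position at least 22% more successfully.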


Improvement in Supervector Linear Kernel SVM for Speaker Identification Using Feature Enhancement and Training Length Adjustment (특징 강화 기법과 학습 데이터 길이 조절에 의한 Supervector Linear Kernel SVM 화자식별 개선)

  • So, Byung-Min;Kim, Kyung-Wha;Kim, Min-Seok;Yang, Il-Ho;Kim, Myung-Jae;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea / v.30 no.6 / pp.330-336 / 2011
  • In this paper, we propose a new method to improve the performance of supervector linear kernel SVM (Support Vector Machine) for speaker identification. The method is based on splitting each training datum into several shorter utterance segments. We evaluate performance on four different databases and use PCA (Principal Component Analysis), GKPCA (Greedy Kernel PCA) and KMDA (Kernel Multimodal Discriminant Analysis) for feature enhancement. The proposed method improves speaker identification performance with the supervector linear kernel SVM.

An Enhancement of Microphone Array System Using Hybrid Window Algorithm (CPSP의 저주파 위상 복원을 이용한 화자 위치 추적 알고리듬의 성능 개선)

  • Lee Hak-Ju;Kim Ki-Man;Lee Won-Cheol;Lee Chungyong
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.213-216 / 2000
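  • In this study, we propose a low-frequency phase restoration algorithm that is more robust to reverberation than CPSP (Cross Power Spectrum Phase), the representative existing algorithm for estimating a speaker's position from the speaker's speech signal with a microphone array. The CPSP function has the form of a normalized cross-correlation, and the TDOA (Time Difference Of Arrival), the speaker's spatial information, is extracted from the index of the maximum of the CPSP function. However, spatial-information estimation based on the CPSP function is vulnerable to the reverberation that occurs severely in indoor environments. The proposed low-frequency phase restoration algorithm analyzes, in the frequency domain, how reverberation affects the CPSP function and restores the phase components distorted by reverberation, enabling more reliable TDOA estimation. Reverberation distorts the CPSP phase more severely at high frequencies than at low frequencies; we prove this mathematically by modeling the arrival time of each reverberant signal as a random variable with a geometric distribution. We also confirmed the performance improvement of the improved algorithm through simulations using speech signals collected in a real environment.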

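The baseline that this abstract describes, reading the TDOA off the peak of the CPSP function, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the names `dft` and `cpsp_tdoa`, the synthetic white-noise source, and the 5-sample delay are all hypothetical, the naive DFT is used only to keep the example dependency-free (a real system would use an FFT), and the proposed low-frequency phase restoration step is not implemented.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for a short demo."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def cpsp_tdoa(x1, x2, max_lag):
    """Return the delay of x2 relative to x1, in samples, from the CPSP peak."""
    n = len(x1) + len(x2)  # zero-pad to avoid circular wrap-around
    spec1 = dft(list(x1) + [0.0] * (n - len(x1)))
    spec2 = dft(list(x2) + [0.0] * (n - len(x2)))
    # Cross power spectrum, then normalize each bin so only the phase remains.
    cross = [b * a.conjugate() for a, b in zip(spec1, spec2)]
    phase_only = [c / (abs(c) + 1e-12) for c in cross]
    cpsp = [v.real for v in dft(phase_only, inverse=True)]
    # Negative lags live at the end of the inverse transform (index lag % n).
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cpsp[lag % n])


rng = random.Random(1)
source = [rng.gauss(0.0, 1.0) for _ in range(64)]  # synthetic source signal
delay = 5                                          # assumed true TDOA in samples
mic1 = source
mic2 = [0.0] * delay + source[:-delay]             # delayed copy at the second microphone

estimated = cpsp_tdoa(mic1, mic2, max_lag=16)
print("estimated TDOA (samples):", estimated)
```

Because the normalization discards the magnitude of every frequency bin, all bins contribute equally to the peak, which is also why phase distortion from reverberation in any band degrades the estimate.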

A Study on Realization of Continuous Speech Recognition System of Speaker Adaptation (화자적응화 연속음성 인식 시스템의 구현에 관한 연구)

  • 김상범;김수훈;허강인;고시영
    • The Journal of the Acoustical Society of Korea / v.18 no.3 / pp.10-16 / 1999
  • In this paper, we study a speaker-adaptive continuous speech recognition system using MAPE (Maximum A Posteriori Probability Estimation), which can adapt with even a small amount of adaptation speech data. Speaker adaptation is performed by MAPE after concatenation training, which builds sentence-unit HMMs by linking syllable-unit HMMs, and Viterbi segmentation, which automatically segments the adaptation speech data into syllable units without hand labelling. For car-control speech, the recognition rate of the adapted HMM was 77.18%, approximately 6% higher than that of the unadapted HMM (in the case of O(n) DP).

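Why MAP estimation copes with small amounts of adaptation data can be seen from the textbook conjugate-prior update for a Gaussian mean, which interpolates between the speaker-independent mean and the sample mean of the adaptation frames. The abstract does not give the update used in the paper, so the sketch below is an assumption: `map_adapt_mean`, the prior weight `tau`, and the toy two-dimensional frames are hypothetical.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP estimate of a Gaussian mean with a conjugate prior centred on prior_mean.

    tau is a hypothetical prior weight: larger values trust the
    speaker-independent mean more.
    """
    n = len(frames)
    dim = len(prior_mean)
    sums = [sum(frame[d] for frame in frames) for d in range(dim)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


si_mean = [0.0, 0.0]           # speaker-independent mean (toy values)
few = [[1.0, 2.0]] * 2         # two adaptation frames
many = [[1.0, 2.0]] * 200      # two hundred adaptation frames

adapted_few = map_adapt_mean(si_mean, few)
adapted_many = map_adapt_mean(si_mean, many)
print("adapted with   2 frames:", adapted_few)
print("adapted with 200 frames:", adapted_many)
```

With only two frames the estimate stays close to the speaker-independent mean, and with two hundred it moves most of the way to the speaker's own data, so the model is never overwritten by a handful of observations.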