• Title/Summary/Keyword: intra-speaker variation (화자내 변이)

Speaker Verification System Based on HMM Robust to Noise Environments (잡음환경에 강인한 HMM기반 화자 확인 시스템에 관한 연구)

  • 위진우;강철호
    • The Journal of the Acoustical Society of Korea / v.20 no.7 / pp.69-75 / 2001
  • Intra-speaker variation, noisy environments, and mismatch between training and test conditions are the major reasons speaker verification systems cannot yet be used in practice. In this study, we propose a robust end-point detection algorithm, noise cancellation with microphone property compensation, and an inter-speaker discrimination technique based on cepstrum weighting for robust speaker verification. Simulation results show that the average speaker verification rate improves by 17.65% with the proposed end-point detection algorithm using the LPC residual, and by 36.93% with the proposed noise cancellation and microphone property compensation algorithm. The proposed weighting function for discriminating inter-speaker variations also improves the average speaker verification rate by 6.515%.

A Study on Adaptive Model Updating and a Priori Threshold Decision for Speaker Verification System (화자 확인 시스템을 위한 적응적 모델 갱신과 사전 문턱치 결정에 관한 연구)

  • 진세훈;이재희;강철호
    • The Journal of the Acoustical Society of Korea / v.19 no.5 / pp.20-26 / 2000
  • In a speaker verification system, updating HMM (hidden Markov model) parameters from a small amount of data and deciding the threshold a priori are crucial for dealing with long-term variability in people's voices. In this paper we present a speaker model updating technique that adapts to session-to-session intra-speaker variability, together with an a priori threshold determination technique. The proposed updating technique reduces the verification errors caused by session-to-session intra-speaker variability by adapting the speaker model parameters to new speech data through Baum-Welch re-estimation. The proposed a priori threshold is determined from a hybrid score measure that combines the world-model-based technique and the cohort-model-based technique. The results show that the proposed technique leads to better performance, and that the performance difference between the a posteriori threshold approach and the proposed a priori threshold approach is small.

The Speaker Recognition System using the Pitch Alteration (피치변경을 이용한 화자인식 시스템)

  • Jung JongSoon;Bae MyungJin
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.115-118 / 2002
  • The parameters used in a speaker recognition system should fully express the speaker's characteristics contained in the speech. That is, a feature is useful for distinguishing between speakers if its inter-speaker variance is larger than its intra-speaker variance. Minimizing errors between speakers also requires improved recognition techniques in addition to distinguishing features. Recent simulation results show that more accurate performance is obtained by using both dynamic characteristics and the constant characteristics that arise from speaking habits. We therefore propose the following approach. Prosodic information is used as a feature vector of the speech. The feature vectors generally used in speaker recognition systems model spectral information and perform well in noise-free conditions. However, these feature vectors are distorted in noisy conditions, which reduces the recognition rate. In this paper, we alter the pitch contour segment by segment so that a dynamic characteristic can be estimated, and use the result as a recognition feature. Simulations confirmed that this dynamic characteristic is very robust in noisy conditions. Acceptance or rejection is decided by comparison with the test pattern, and the recognition rate with the proposed algorithm is higher than with spectral and prosodic information. In particular, the simulations show that a stable recognition rate can be obtained in noisy conditions.

The Application of an HMM-based Clustering Method to Speaker Independent Word Recognition (HMM을 기본으로한 집단화 방법의 불특정화자 단어 인식에 응용)

  • Lim, H.;Park, S.-Y.;Park, M.-W.
    • The Journal of the Acoustical Society of Korea / v.14 no.5 / pp.5-10 / 1995
  • In this paper we present an HMM-based clustering procedure for obtaining multiple statistical models that can absorb the variation among speakers who say words in different ways. The HMM-clustered models obtained with this technique are applied to speaker-independent isolated word recognition. The HMM clustering method splits off from the training set all observation sequences whose likelihood scores fall below a threshold and creates a new model from the observation sequences in the new cluster. Clustering is iterated by classifying each observation sequence as belonging to the cluster whose model gives the maximum likelihood score. If any cluster has changed since the previous iteration, the model for that cluster is re-estimated with the Baum-Welch re-estimation procedure. Because the clustering procedure and the parameter estimation are integrated, this method is more efficient than the conventional template-based clustering technique. Experimental data show that the HMM-based clustering procedure yields a 1.43% performance improvement over the conventional template-based clustering method and a 2.08% improvement over the single-HMM method for recognition of isolated Korean digits.
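
The reassignment pass described in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function name `reassign_sequences`, the score matrix, and the threshold value are all hypothetical, and the log-likelihoods are assumed to be precomputed (for example with the forward algorithm). HMM training and Baum-Welch re-estimation are not shown.

```python
import numpy as np

def reassign_sequences(loglik, prev_labels, threshold):
    """One reassignment pass over precomputed scores (hypothetical helper).

    loglik[i, k] is assumed to be the log-likelihood of observation
    sequence i under the model of cluster k.
    """
    loglik = np.asarray(loglik, dtype=float)
    prev_labels = np.asarray(prev_labels)
    # Classify each sequence into the cluster whose model scores highest.
    labels = loglik.argmax(axis=1)
    best = loglik.max(axis=1)
    # Sequences whose best score is still below the threshold are
    # candidates to be split off into a new cluster.
    poor = np.flatnonzero(best < threshold)
    # A cluster has changed if it gained or lost at least one sequence;
    # only those clusters would need re-estimation.
    moved = labels != prev_labels
    changed = sorted({int(c) for c in labels[moved]} |
                     {int(c) for c in prev_labels[moved]})
    return labels.tolist(), poor.tolist(), changed

# Toy scores: 4 sequences, 2 cluster models (made-up numbers).
scores = [[-10.0, -50.0], [-12.0, -40.0], [-45.0, -11.0], [-80.0, -90.0]]
labels, poor, changed = reassign_sequences(scores, [0, 0, 0, 1], threshold=-60.0)
```

In a full loop, the models of the clusters listed in `changed` would be re-estimated and the pass repeated until no cluster changes.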

Deep neural networks for speaker verification with short speech utterances (짧은 음성을 대상으로 하는 화자 확인을 위한 심층 신경망)

  • Yang, IL-Ho;Heo, Hee-Soo;Yoon, Sung-Hyun;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea / v.35 no.6 / pp.501-509 / 2016
  • We propose a method to improve the robustness of speaker verification on short test utterances. The accuracy of state-of-the-art i-vector/probabilistic linear discriminant analysis systems can degrade when test utterances are short. The proposed method compensates for the utterance variations of short test feature vectors using deep neural networks. We design three types of DNN (Deep Neural Network) structures, each trained with different target output vectors. Each DNN is trained to minimize the discrepancy between the feed-forward output for a given short utterance feature and the corresponding original long utterance feature. We use the short 2-10 s condition of the NIST (National Institute of Standards and Technology, U.S.) 2008 SRE (Speaker Recognition Evaluation) corpus to evaluate the method. The experimental results show that the proposed method reduces the minimum detection cost relative to the baseline system.

Statistical analysis on long-term change of jitter component on continuous speech signal (음성신호의 Jitter 성분의 장시간 변화에 관한 통계적 분석)

  • Jo, Cheolwoo
    • Phonetics and Speech Sciences / v.12 no.4 / pp.73-80 / 2020
  • In this study, a method for measuring the jitter component in continuous speech is presented. In conventional jitter measurement, pitch variability is commonly measured from sustained vowels. In continuous speech, such as a spoken sentence, the existing measurement method is distorted by the prosodic information of the sentence. We therefore propose a method to reduce the pitch fluctuations caused by prosodic information in continuous speech. To remove this pitch fluctuation component, a curve representing the fluctuation is obtained via polynomial interpolation of the pitch track in the analysis interval, and the shift is removed according to the curve. The variability of the pitch frequency is then obtained by measuring jitter from the pitch trajectory from which the shift has been removed. To measure the effects of the proposed method, parameter values before and after the operations are compared using samples from the Kay Pentax MEEI database. Statistical analysis of the experimental results showed that jitter components in continuous speech can be measured effectively by the proposed method, and that the values are comparable to the parameters of sustained vowels from the same speaker.
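
The detrend-then-measure idea in this abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the paper's procedure: a least-squares polynomial fit (`np.polyfit`) stands in for the polynomial interpolation the abstract mentions, the degree of 3 is arbitrary, jitter is computed with the common "local jitter" definition (mean absolute difference of consecutive periods divided by the mean period), and the pitch track is synthetic. The function names are hypothetical.

```python
import numpy as np

def local_jitter(f0_hz):
    """Mean absolute difference of consecutive periods over the mean period."""
    periods = 1.0 / np.asarray(f0_hz, dtype=float)
    return np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def remove_prosodic_trend(f0_hz, degree=3):
    """Subtract a fitted polynomial curve and restore the mean pitch.

    The least-squares fit and the default degree are assumptions made
    for this sketch.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    x = np.linspace(0.0, 1.0, f0.size)  # normalized time axis
    trend = np.polyval(np.polyfit(x, f0, degree), x)
    return f0 - trend + f0.mean()

# Synthetic pitch track: a rising contour from 100 to 200 Hz plus a small
# cycle-to-cycle perturbation of +/-0.5 Hz (made-up data).
n = np.arange(50)
f0 = np.linspace(100.0, 200.0, 50) + 0.5 * (-1.0) ** n

jitter_raw = local_jitter(f0)
jitter_detrended = local_jitter(remove_prosodic_trend(f0))
```

On this synthetic track the raw measurement is inflated by the rising contour, while the detrended measurement reflects mainly the cycle-to-cycle perturbation.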

The meaning based on Yin-Yang and Five Elements Principle in Semantic Landscape Composition of 'the Forty Eight Poems of Soswaewon' ('소쇄원(瀟灑園) 48영'의 의미경관 구성에 있어서 음양오행론적(陰陽五行論的) 의미(意味))

  • Jang, Il-Young;Shin, Sang-Sup
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.31 no.2 / pp.43-57 / 2013
  • The purpose of this study is to identify the potential semantic landscape composition of "the Forty Eight Poems of Soswaewon" according to the Yin-Yang and Five Elements Principle (陰陽五行論), a system of thought about the relationship between human nature and the universal cosmic order. The discussion of this topic can be summarized as follows:
    1. Among the Yin-Yang-based landscape compositions of the Forty Eight Poems of Soswaewon, poems embodying interactions between nature and human behavior focus on the dynamic aspects of the poetic narrator as he appreciates or explores the hills and streams, living free from worldly cares. Many of these poems were composed in the east and south, which are assigned to yang. In contrast, poems embodying nature and seasonal scenery, the static landscape composition of yin, were often composed in or near the north and west. The poems that embody natural and artificial scenery fall into two categories: those in which the author, Kim In-hu, expresses a semantic landscape drawn from seasonal scenery in nature, and those in which he depicts realistic garden images as they are. Across the Forty Eight Poems, Kim focused on embodying seasonal scenery rather than expressing human behavior. In addition, Poem No. 1 and Poem No. 48 (the last poem, titled 'Jangwon Jeyeong') were composed in the same place, which suggests that Kim understood this place as a space of beginning and end in which yin and yang, the principle of the natural cycle, are inherent.
    2. Interpreting the landscape in the Forty Eight Poems according to Ohaeng-ron (the five elements principle), wood (木) and fire (火) are typical examples of a world brought together by emanation. Many of the poems depicting the sentiments of wood focus on seasonal scenery and are located in the Ogogmun (五曲門) area in the east of Soswaewon as a whole. Their content shows generation and curvedness/straightness, with flexibility and simplicity. Many of the poems depicting the sentiments of fire focus on human behavior and were composed in the Aeyangdan area in the south of Soswaewon, over which the sun rises at noon. These poems are all in a state of side movement characterized by emanation and ascension, which are attributes of yang.
    3. Under the same Ohaeng-ron interpretation, metal (金) and water (水) are typical examples of a world brought together by convergence. All of the poems depicting the sentiments of metal focus on seasonal scenery and were composed in the bamboo grove area in the west of Soswaewon as a whole. They represent autumn scenery among the four seasons, symbolizing the faithfulness of a man of virtue (seonbi) with integrity and righteousness. Poems depicting the sentiments of water were composed in the vicinity of Jewoldang in the north, possibly the topmost part of Soswaewon. They fall into two categories: poems embodying the act of welcoming the first full moon deep in the night after sunset, and poems embodying the natural scenery of a snowscape. All of these poems express an atmosphere of turning into yin through convergence.
    4. Poems depicting the sentiments of earth (土), a complex of convergence and emanation, were composed in the vicinity of the mountain stream around Gwangpunggak, which is located in the center of Soswaewon. These poems convey the actions of the author, Kim, by way of natural phenomena and artificial scenery.