• Title/Summary/Keyword: 화자독립 (speaker independence)

Search Results: 231

A Study on Development and Real-Time Implementation of Voice Recognition Algorithm (화자독립방식에 의한 음성인식 알고리즘 개발 및 실시간 실현에 관한 연구)

  • Jung, Yang-geun;Jo, Sang Young;Yang, Jun Seok;Park, In-Man;Han, Sung Hyun
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.18 no.4
    • /
    • pp.250-258
    • /
    • 2015
  • In this research, we proposed a new approach to implement real-time motion control of a biped robot based on voice commands for unmanned FA. Voice is one of the most convenient ways for humans and robots to communicate. To command many robot tasks by voice, the same number of voice commands must be recognizable; however, the more commands there are to recognize, the longer the recognition time becomes. In this paper, a practical voice recognition system which can recognize a large number of task commands is proposed. The proposed system consists of a general-purpose microprocessor and a voice recognition processor which can recognize a limited number of voice patterns. For a given biped robot, the robot tasks are classified and organized such that the number of tasks under each directory is not more than the maximum recognition capacity of the voice recognition processor, so that the tasks under each directory can be distinguished by voice command. Simulation and experiment illustrated the reliability of the voice recognition rates for application to the manufacturing process.
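
The directory-based organization of task commands described in this abstract can be sketched as follows. This is only an illustrative Python sketch under assumed values (the per-directory capacity `MAX_PATTERNS` and the task names are hypothetical, not taken from the paper):

```python
# Illustrative sketch (not the paper's implementation): grouping robot task
# commands into directories so that no directory exceeds the maximum number
# of patterns the voice recognition processor can hold at once.

MAX_PATTERNS = 8  # assumed per-directory capacity of the recognition chip

tasks = [
    "walk forward", "walk backward", "turn left", "turn right",
    "stand up", "sit down", "pick up", "put down",
    "start welding", "stop welding", "open gripper", "close gripper",
]

def build_directories(task_names, capacity):
    """Split the flat task list into directories of at most `capacity` entries,
    so each directory can be loaded into the processor and its commands
    distinguished by a single recognition pass."""
    return [task_names[i:i + capacity] for i in range(0, len(task_names), capacity)]

directories = build_directories(tasks, MAX_PATTERNS)
for idx, group in enumerate(directories):
    print(f"directory {idx}: {group}")
```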

Experimental Determination of Differential Fast Neutron Spectra in a Reactor using Threshold Detectors

  • Kim, Dong-Hoon
    • Nuclear Engineering and Technology
    • /
    • v.4 no.4
    • /
    • pp.280-293
    • /
    • 1972
  • The differential fast neutron spectra above 0.5 MeV at particular spatial positions in the reactor (TRIGA MARK-II) core have been determined experimentally using several threshold activation detectors. The series expansion technique utilizing the concept of least-squares optimization was used to obtain an approximate solution to the set of integral equations defined by the experimentally determined activation data. The influence of using different weighting functions in the solution was analyzed for each measurement. To carry out the necessary mathematical calculations, a computer code for the UNIVAC 1106 digital computer was prepared. Good agreement was achieved between the differential fast neutron spectra determined in this work and the flux computed independently using space-independent multigroup transport theory.
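
The least-squares unfolding step described above can be illustrated with a small sketch. This is not the paper's UNIVAC 1106 code; the response matrix, activations, and weights below are invented solely to show how a weighted least-squares solution of the activation equations might look:

```python
# Illustrative sketch of least-squares spectrum unfolding: measured activations
# b_i are modeled as sums of group fluxes phi_j weighted by group-averaged
# detector responses R_ij, and the flux is recovered by weighted least squares.
# All numbers below are hypothetical.

import numpy as np

# Response matrix R[i, j]: detector i response to energy group j (arbitrary units).
R = np.array([
    [0.9, 0.5, 0.1],
    [0.4, 0.8, 0.3],
    [0.1, 0.6, 0.7],
    [0.0, 0.2, 0.9],
])
activations = np.array([1.4, 1.6, 1.3, 1.0])   # measured saturated activities
weights = np.array([1.0, 0.8, 1.2, 1.0])       # weighting per measurement

# Weighted least squares: minimize || W^(1/2) (R phi - b) ||^2
W = np.sqrt(weights)[:, None]
phi, *_ = np.linalg.lstsq(W * R, np.sqrt(weights) * activations, rcond=None)
print("recovered group fluxes:", phi)
```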

Implementation of Motorized Wheelchair using Speaker Independent Voice Recognition Chip and Wireless Microphone (화자 독립 방식의 음성 인식 칩 및 무선 마이크를 이용한 전동 휄체어의 구현)

  • Song, Byung-Seop;Lee, Jung-Hyun;Park, Jung-Jae;Park, Hee-Joon;Kim, Myoung-Nam
    • Journal of Sensor Science and Technology
    • /
    • v.13 no.1
    • /
    • pp.20-26
    • /
    • 2004
  • For disabled persons who cannot use their limbs, a motorized wheelchair activated by a voice recognition module employing a speaker-independent method was implemented. A wireless voice transfer device was designed and employed for user convenience. The wheelchair was designed to operate by either voice or keypad, at the user's selection, so that it can be manipulated with the keypad if necessary. The speaker-independent method was chosen for the voice recognition module so that anyone can manipulate the wheelchair when assistance is needed. Using the implemented wheelchair, the performance and motion of the system were examined; it achieved a voice recognition rate of over 97% and proper movements.
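
As a rough illustration of the dual voice/keypad input described above, a minimal sketch is given below. The command tables and function names are hypothetical, not the authors' firmware:

```python
# Minimal sketch of the dual input idea: the wheelchair accepts either a
# recognized voice command index from the speaker-independent recognition chip
# or a keypad code, and maps both onto the same motor actions.

VOICE_COMMANDS = {0: "forward", 1: "backward", 2: "left", 3: "right", 4: "stop"}
KEYPAD_CODES = {"2": "forward", "8": "backward", "4": "left", "6": "right", "5": "stop"}

def resolve_action(source, value):
    """Return a motor action from either input source; None if unrecognized."""
    if source == "voice":
        return VOICE_COMMANDS.get(value)
    if source == "keypad":
        return KEYPAD_CODES.get(value)
    return None

print(resolve_action("voice", 0))     # forward
print(resolve_action("keypad", "5"))  # stop
```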

Single-Channel Speech Separation Using Phase Model-Based Soft Mask (위상 모델 기반의 소프트 마스크를 이용한 단일 채널 음성분리)

  • Lee, Yun-Kyung;Kwon, Oh-Wook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.2
    • /
    • pp.141-147
    • /
    • 2010
  • In this paper, we propose a new speech separation algorithm to extract and enhance the target speech signals from mixed speech signals by utilizing both magnitude and phase information. Since previous statistical modeling algorithms assume that the log power spectrum values of the mixed speech signals are independent in the time and frequency domains, discontinuities occur in the resulting separated speech signals. To reduce the discontinuities, we apply a smoothing filter in the time-frequency domain. To further improve speech separation performance, we propose a statistical model based on both the magnitude and phase information of speech signals. Experimental results show that the proposed algorithm improves the signal-to-interference ratio (SIR) by 1.5 dB compared with the previous magnitude-only algorithms.
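
A generic soft-mask separation step of the kind discussed above might look like the following sketch. It uses a Wiener-style magnitude mask with time smoothing on synthetic data; the authors' phase model is not reproduced here:

```python
# Generic soft-mask sketch: given estimated magnitude spectra for the target
# and interfering speakers, build a per-bin soft mask, smooth it along time to
# reduce discontinuities, and apply it to the mixture STFT.  Inputs are synthetic.

import numpy as np

rng = np.random.default_rng(0)
freq_bins, frames = 257, 100
target_mag = rng.gamma(2.0, 1.0, (freq_bins, frames))   # estimated |S_target|
interf_mag = rng.gamma(2.0, 1.0, (freq_bins, frames))   # estimated |S_interf|
mixture = (target_mag + interf_mag) * np.exp(1j * rng.uniform(0, 2 * np.pi, (freq_bins, frames)))

# Soft mask from relative magnitudes (Wiener-like).
mask = target_mag**2 / (target_mag**2 + interf_mag**2 + 1e-12)

# Simple moving-average smoothing along the time axis.
kernel = np.ones(5) / 5.0
mask_smooth = np.apply_along_axis(lambda m: np.convolve(m, kernel, mode="same"), 1, mask)

separated = mask_smooth * mixture   # masked mixture STFT (mixture phase kept)
print("mask range:", mask_smooth.min(), mask_smooth.max())
print("separated STFT shape:", separated.shape)
```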

An Improvement of Stochastic Feature Extraction for Robust Speech Recognition (강인한 음성인식을 위한 통계적 특징벡터 추출방법의 개선)

  • 김회린;고진석
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.2
    • /
    • pp.180-186
    • /
    • 2004
  • The presence of noise in speech signals degrades the performance of recognition systems in which there are mismatches between the training and test environments. To make a speech recognizer robust, it is necessary to compensate for these mismatches. In this paper, we studied an improvement of stochastic feature extraction based on band-SNR for robust speech recognition. First, we proposed a modified version of the multi-band spectral subtraction (MSS) method which adjusts the subtraction level of the noise spectrum according to band-SNR. In the proposed method, referred to as M-MSS, a noise normalization factor was newly introduced to finely control the over-estimation factor depending on the band-SNR. We also modified the architecture of the stochastic feature extraction (SFE) method; a better performance was obtained when the spectral subtraction was applied in the power spectrum domain rather than in the mel-scale domain. This method is denoted M-SFE. Lastly, we applied the M-MSS method to the modified stochastic feature extraction structure, which is denoted the MMSS-MSFE method. The proposed methods were evaluated on isolated word recognition under various noise environments. Relative to the ordinary spectral subtraction (SS) method, the average error rates of the M-MSS, M-SFE, and MMSS-MSFE methods were reduced by 18.6%, 15.1%, and 33.9%, respectively. From these results, we conclude that the proposed methods are good candidates for robust feature extraction in noisy speech recognition.
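
The band-SNR-dependent subtraction idea behind M-MSS can be sketched roughly as below. The over-estimation schedule and the value of the noise normalization factor `rho` are assumptions for illustration, not the authors' formula:

```python
# Hedged sketch of band-SNR-dependent spectral subtraction in the spirit of
# multi-band spectral subtraction: the subtraction level per band grows as the
# band SNR drops, scaled by an assumed noise normalization factor.

import numpy as np

def band_snr_db(power_spec, noise_power, band_edges):
    """Per-band SNR in dB for one frame of power spectrum."""
    snrs = []
    for lo, hi in band_edges:
        s, n = power_spec[lo:hi].sum(), noise_power[lo:hi].sum() + 1e-12
        snrs.append(10.0 * np.log10(max(s / n, 1e-12)))
    return np.array(snrs)

def multiband_subtract(power_spec, noise_power, band_edges, rho=0.9):
    """Subtract noise per band with an over-estimation factor that grows as
    the band SNR drops; rho plays the role of a noise normalization factor."""
    clean = np.copy(power_spec)
    snrs = band_snr_db(power_spec, noise_power, band_edges)
    for (lo, hi), snr in zip(band_edges, snrs):
        alpha = 4.0 if snr < 0 else max(1.0, 4.0 - 0.15 * snr)  # assumed schedule
        clean[lo:hi] = np.maximum(power_spec[lo:hi] - rho * alpha * noise_power[lo:hi],
                                  0.01 * power_spec[lo:hi])      # spectral floor
    return clean

# Toy frame: 64-bin power spectrum with flat noise.
rng = np.random.default_rng(1)
frame = rng.gamma(2.0, 1.0, 64)
noise = np.full(64, 0.5)
bands = [(0, 16), (16, 32), (32, 48), (48, 64)]
print(multiband_subtract(frame, noise, bands)[:8])
```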

A Study on a Model Parameter Compensation Method for Noise-Robust Speech Recognition (잡음환경에서의 음성인식을 위한 모델 파라미터 변환 방식에 관한 연구)

  • Chang, Yuk-Hyeun;Chung, Yong-Joo;Park, Sung-Hyun;Un, Chong-Kwan
    • The Journal of the Acoustical Society of Korea
    • /
    • v.16 no.5
    • /
    • pp.112-121
    • /
    • 1997
  • In this paper, we study a model parameter compensation method for noise-robust speech recognition. Model parameter compensation is performed on a sentence-by-sentence basis, and no other information is used. Parallel model combination (PMC), well known as a model parameter compensation algorithm, is implemented and used as a reference for performance comparison. We also propose a modified PMC method which tunes the model parameters with an association factor that controls the average variability of the Gaussian mixtures and the variability of a single Gaussian mixture per state for more robust modeling. We obtain a re-estimation solution for the environmental variables based on the expectation-maximization (EM) algorithm in the cepstral domain. To evaluate the performance of the model compensation methods, we perform experiments on speaker-independent isolated word recognition. The noise sources used are white Gaussian noise and driving-car noise. To obtain corrupted speech, we added noise to clean speech at various signal-to-noise ratios (SNR). We use a noise mean and variance modeled from 3 frames of noise data. Experimental results show that the vector Taylor series (VTS) approach is superior to the other methods. The zero-order VTS approach is similar in scheme to the modified PMC method in that it adapts the mean vector only, but its recognition rate is higher than those of the PMC and modified PMC methods based on the log-normal approximation.
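
The log-normal approximation that underlies PMC can be sketched as follows. For brevity the sketch works directly on log-spectral Gaussians and omits the cepstral/log-spectral DCT step and the per-mixture association factor of the modified PMC:

```python
# Simplified sketch of the log-normal approximation used in parallel model
# combination (PMC): clean-speech and noise Gaussians are mapped to the linear
# spectral domain, added, and mapped back to the log-spectral domain.

import numpy as np

def lognormal_pmc(mu_s, var_s, mu_n, var_n, gain=1.0):
    """Combine clean-speech and noise Gaussians (log-spectral domain) into a
    noisy-speech Gaussian via the log-normal approximation."""
    # Log-normal moments in the linear spectral domain.
    lin_mu_s = gain * np.exp(mu_s + var_s / 2.0)
    lin_var_s = (gain**2) * np.exp(2 * mu_s + var_s) * (np.exp(var_s) - 1.0)
    lin_mu_n = np.exp(mu_n + var_n / 2.0)
    lin_var_n = np.exp(2 * mu_n + var_n) * (np.exp(var_n) - 1.0)

    # Additive combination in the linear domain.
    lin_mu = lin_mu_s + lin_mu_n
    lin_var = lin_var_s + lin_var_n

    # Map back to the log-spectral domain.
    var_y = np.log(lin_var / lin_mu**2 + 1.0)
    mu_y = np.log(lin_mu) - var_y / 2.0
    return mu_y, var_y

mu_y, var_y = lognormal_pmc(np.array([1.0, 0.5]), np.array([0.2, 0.3]),
                            np.array([0.0, 0.1]), np.array([0.1, 0.1]))
print("combined mean:", mu_y, "combined variance:", var_y)
```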

Isolated Word Recognition Using k-clustering Subspace Method and Discriminant Common Vector (k-clustering 부공간 기법과 판별 공통벡터를 이용한 고립단어 인식)

  • Nam, Myung-Woo
    • Journal of the Institute of Electronics Engineers of Korea TE
    • /
    • v.42 no.1
    • /
    • pp.13-20
    • /
    • 2005
  • In this paper, Korean isolated words are recognized using CVEM, which was suggested by M. Bilginer et al. CVEM is an algorithm that easily extracts the common properties from training voice signals and does not require complex calculation; in addition, CVEM shows high accuracy in recognition results. However, CVEM has a couple of problems: it cannot be used with a large number of training voices, and there is no discriminant information among the extracted common vectors. To obtain optimal common vectors for given voice classes, a variety of voices should be used for training, but CVEM cannot sustain consistently high recognition accuracy because of its limit on the number of training voices, and the absence of discriminant information among common vectors can be a source of critical errors. To solve the above problems and improve the recognition rate, the k-clustering subspace method and DCVEM are suggested. Various experiments using a voice signal database made by ETRI were carried out to prove the validity of the suggested methods. The experimental results show improvements in performance, and with the proposed methods all of the CVEM problems can be solved without the calculation problem.
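
The common vector idea underlying CVEM can be illustrated with a minimal sketch. The paper's DCVEM and k-clustering extensions are not reproduced; the toy data and function below are purely illustrative:

```python
# Minimal sketch of the common vector idea: for one word class, the component
# lying in the subspace spanned by within-class difference vectors is removed,
# leaving a "common vector" that is the same for every training utterance of
# that class.

import numpy as np

def common_vector(class_vectors, member=0):
    """class_vectors: (m, d) feature vectors of one class, with m - 1 <= d."""
    ref = class_vectors[member]
    diffs = np.delete(class_vectors, member, axis=0) - ref   # difference subspace
    q, _ = np.linalg.qr(diffs.T)          # orthonormal basis of that subspace
    return ref - q @ (q.T @ ref)          # remove the difference component

rng = np.random.default_rng(2)
samples = rng.normal(size=(4, 16))        # 4 utterances, 16-dim features (toy data)
cv_a = common_vector(samples, member=0)
cv_b = common_vector(samples, member=2)
print("same common vector from any member:", np.allclose(cv_a, cv_b))
```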

A Speech Translation System for Hotel Reservation (호텔예약을 위한 음성번역시스템)

  • 구명완;김재인;박상규;김우성;장두성;홍영국;장경애;김응인;강용범
    • The Journal of the Acoustical Society of Korea
    • /
    • v.15 no.4
    • /
    • pp.24-31
    • /
    • 1996
  • In this paper, we present a speech translation system for hotel reservation, KT-STS (Korea Telecom Speech Translation System). KT-STS is a speech-to-speech translation system which translates a spoken utterance in Korean into one in Japanese. The system has been designed around the task of hotel reservation (dialogues between a Korean customer and a hotel reservation desk in Japan). It consists of a Korean speech recognition system, a Korean-to-Japanese machine translation system, and a Korean speech synthesis system. The Korean speech recognition system is an HMM (hidden Markov model)-based speaker-independent, continuous speech recognizer with a vocabulary of about 300 words. A bigram language model is used as the forward language model, and a dependency grammar is used as the backward language model. For machine translation, we use a dependency grammar and the direct transfer method. The Korean speech synthesizer uses demiphones as the synthesis unit and the method of periodic waveform analysis and reallocation. KT-STS runs in nearly real time on a SPARC20 workstation with one TMS320C30 DSP board. We achieved a word recognition rate of 94.68% and a sentence recognition rate of 82.42% in the speech recognition tests. On Korean-to-Japanese translation tests, we achieved a translation success rate of 100%. We also conducted an international joint experiment in which our system was connected via a leased line with another system developed by KDD in Japan.
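
As a small illustration of the forward bigram language model mentioned above, the following toy sketch scores a tokenized sentence with add-k smoothing. The corpus, vocabulary, and smoothing choice are invented; the actual KT-STS model is not described in enough detail here to reproduce:

```python
# Toy bigram language model sketch: estimate bigram counts from a tiny corpus
# and score a sentence with add-k smoothing in the log domain.

import math
from collections import defaultdict

corpus = [["<s>", "i", "want", "a", "room", "</s>"],
          ["<s>", "i", "want", "two", "rooms", "</s>"]]

unigram = defaultdict(int)
bigram = defaultdict(int)
for sent in corpus:
    for w1, w2 in zip(sent, sent[1:]):
        unigram[w1] += 1
        bigram[(w1, w2)] += 1

vocab = {w for s in corpus for w in s}

def bigram_logprob(sentence, k=0.5):
    """Add-k smoothed log probability of a tokenized sentence."""
    logp = 0.0
    for w1, w2 in zip(sentence, sentence[1:]):
        p = (bigram[(w1, w2)] + k) / (unigram[w1] + k * len(vocab))
        logp += math.log(p)
    return logp

print(bigram_logprob(["<s>", "i", "want", "a", "room", "</s>"]))
```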

A Study on Isolated Word Recognition using Improved Multisection Vector Quantization Recognition System (개선된 MSVQ 인식 시스템을 이용한 단독어 인식에 관한 연구)

  • An, Tae-Ok;Kim, Nam-Joong;Song, Chul;Kim, Soon-Hyeob
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.16 no.2
    • /
    • pp.196-205
    • /
    • 1991
  • This paper is a study on speaker-independent isolated word recognition that proposes a newly improved MSVQ (multisection vector quantization) recognition system, which improves on the classical MSVQ recognition system. The difference is that, in the proposed system, the test pattern has one more section than the reference pattern. 146 DDD area names are selected as the recognition vocabulary, and 12th-order LPC cepstral coefficients are used as the feature parameter; when the codebook is generated, MINSUM and MINMAX criteria are used to find the centroids. According to the experimental results, this method is better than the VQ (vector quantization) recognition method, the DTW (dynamic time warping) pattern matching method, and the classical MSVQ method in both recognition rate and recognition time.
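
A rough sketch of classical multisection VQ matching is shown below to make the scoring idea concrete. It does not include the paper's improvement (the extra test section) or the MINSUM/MINMAX codebook training; the codebooks and features are toy data:

```python
# Rough sketch of multisection VQ matching: each word's frames are split into a
# fixed number of sections, a small codebook is kept per section, and a test
# utterance is scored by accumulated quantization distortion.

import numpy as np

def split_sections(frames, n_sections):
    """Split (T, d) frames into n_sections roughly equal groups."""
    return np.array_split(frames, n_sections)

def section_distortion(section_frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    d = np.linalg.norm(section_frames[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()

def msvq_score(test_frames, section_codebooks):
    sections = split_sections(test_frames, len(section_codebooks))
    return sum(section_distortion(s, cb) for s, cb in zip(sections, section_codebooks))

rng = np.random.default_rng(3)
# Toy "codebooks": 4 sections, 4 codewords each, 12-dim features (e.g., LPC cepstra).
codebooks = [rng.normal(size=(4, 12)) for _ in range(4)]
test = rng.normal(size=(40, 12))
print("accumulated distortion:", msvq_score(test, codebooks))
```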

A Study on the Korean Broadcasting Speech Recognition (한국어 방송 음성 인식에 관한 연구)

  • 김석동;송도선;이행세
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.1
    • /
    • pp.53-60
    • /
    • 1999
  • This paper is a study on Korean broadcasting speech recognition. We present methods for large-vocabulary continuous speech recognition; our main concerns are the language modeling and the search algorithm. The acoustic model is a uni-phone semi-continuous hidden Markov model, and the language model is an N-gram model. The search algorithm consists of three phases in order to utilize all available acoustic and linguistic information. First, we use a forward Viterbi beam search to find word-end frames and to estimate the related scores. Second, we use a backward Viterbi beam search to find word-begin frames and to estimate the related scores. Finally, we use an A* search to combine the above two results with the N-gram language model and to obtain the recognition results. Using these methods, a maximum word recognition rate of 96.0% and a syllable recognition rate of 99.2% are achieved for the speaker-independent continuous speech recognition problem with a vocabulary size of about 12,000.
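
The first search phase mentioned above, a Viterbi beam search in the log domain, can be sketched as follows. The HMM, the beam width, and the scores are invented, and the backward pass and A* recombination of the paper are not reproduced:

```python
# Toy Viterbi beam search sketch (log domain): states whose partial score falls
# more than `beam` below the frame-best score are pruned at each frame.

import numpy as np

def viterbi_beam(log_trans, log_emit, beam=3.0):
    """log_trans: (S, S) transition log-probs; log_emit: (T, S) emission
    log-probs per frame.  Returns the best path log-score."""
    T, S = log_emit.shape
    scores = np.full(S, -np.inf)
    scores[0] = log_emit[0, 0]                 # start in state 0
    for t in range(1, T):
        new = np.full(S, -np.inf)
        active = np.where(scores > scores.max() - beam)[0]   # beam pruning
        for j in range(S):
            cand = scores[active] + log_trans[active, j]
            new[j] = cand.max() + log_emit[t, j]
        scores = new
    return scores.max()

rng = np.random.default_rng(4)
S, T = 5, 20
log_trans = np.log(rng.dirichlet(np.ones(S), size=S))   # toy transition matrix
log_emit = np.log(rng.dirichlet(np.ones(S), size=T))    # toy per-frame emissions
print("best path log-score:", viterbi_beam(log_trans, log_emit))
```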
