• Title/Summary/Keyword: voice recognition rate improvement

The research on the MEMS device improvement which is necessary for the noise environment in the speech recognition rate improvement (잡음 환경에서 음성 인식률 향상에 필요한 MEMS 장치 개발에 관한 연구)

  • Yang, Ki-Woong; Lee, Hyung-keun
    • Journal of the Korea Institute of Information and Communication Engineering, v.22 no.12, pp.1659-1666, 2018
  • When the input is a mixture of voice and other sound, the voice recognition rate drops because of the noise. To overcome the limits of software processing, we improve the speech recognition rate by improving the MEMS device, which is the hardware component. A MEMS microphone is a voice input device that is implemented and used in various shapes. Conventional MEMS microphones generally perform well, but in special environments such as noisy ones, processing performance deteriorates because voice and sound are mixed. To overcome these problems, we developed a newly designed MEMS device that can detect voice characteristics at the initial input device.

A Study on Improved Method of Voice Recognition Rate (음성 인식률 개선방법에 관한 연구)

  • Kim, Young-Po; Lee, Han-Young
    • The Journal of the Korea institute of electronic communication sciences, v.8 no.1, pp.77-83, 2013
  • In this paper, we propose and study a method for improving the voice recognition rate. Voices are commonly detected with the most widely used method, the HMM (Hidden Markov Model) algorithm. For voice detection, the zero-crossing rate was calculated for each unit of voice, and the presence of data was then determined. For voice recognition, the patterns shown by the forms of voices were analyzed and compared with patterns that had already been learned. In the experiment, the existing HMM algorithm achieved a recognition rate of 80%, while the proposed algorithm based on voice-form patterns achieved 92%, an improvement of 12 percentage points.
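
The zero-crossing step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the frame size or the decision rule, so `frame_len`, `hop`, and the toy signal below are assumptions chosen only for illustration; this is not the authors' implementation.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into frames; an incomplete tail is dropped."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


# Toy signal (assumed): four slowly varying samples, then four alternating ones.
signal = [0.5, 0.6, 0.4, 0.5, 0.3, -0.3, 0.3, -0.3]
rates = [zero_crossing_rate(f) for f in frame_signal(signal, frame_len=4, hop=4)]
print(rates)
```

A detector would compare each frame's rate, usually together with an energy measure, against thresholds tuned on data; the abstract does not state those thresholds, so none are assumed here.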

Voice Recognition Performance Improvement using the Convergence of Bayesian method and Selective Speech Feature (베이시안 기법과 선택적 음성특징 추출을 융합한 음성 인식 성능 향상)

  • Hwang, Jae-Chun
    • Journal of the Korea Convergence Society, v.7 no.6, pp.7-11, 2016
  • Voice recognition systems operating in white-noise environments do not recognize voice correctly when the voice mixture varies. In this paper, we therefore propose a method that combines a Bayesian technique with selective voice feature extraction for effective voice recognition. For selective voice extraction we use bank frequency response coefficients, with variables observed for every possible pair of observations; speech features are extracted selectively from the noise information in the voice signal, using the energy ratio at the output. Combined with Bayesian voice recognition, the method removes noise and improves recognition rates. We confirmed that its recognition rate in vocabulary recognition is 2.3% higher than that of the HMM and CHMM methods.

A Study on the Realization of Wireless Home Network System Using High-performance Speech Recognition in Variable Position (가변위치 고음성인식 기술을 이용한 무선 홈 네트워크 시스템 구현에 관한 연구)

  • Yoon, Jun-Chul; Choi, Sang-Bang; Park, Chan-Sub; Kim, Se-Yong; Kim, Ki-Man; Kang, Suk-Youb
    • Journal of the Korea Institute of Information and Communication Engineering, v.14 no.4, pp.991-998, 2010
  • In realizing a wireless home network system that uses speech recognition in an indoor environment, background noise and reverberation are the two main causes of degradation in the voice recognition system. This study realizes a home network system that is resistant to reverberation and background noise by using a voice section detection method based on spectral entropy. Spectral subtraction can reduce the effect of reverberation and remove noise that is independent of the voice signal by eliminating, in the spectrum, the signal distorted by reverberation. Effective spectral subtraction requires correct separation of voice sections and silent sections, so performance is improved by applying the entropy-based voice section detection method. Experiments and indoor environment tests were carried out to measure the command recognition rate in an indoor recognition environment. The results show that the command recognition rate improved in both a static environment and a reverberant room when the voice section detection method based on spectral entropy was used.
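
Why spectral subtraction depends on separating voice sections from silent sections can be seen in a minimal sketch of the common magnitude-domain form: the noise estimate is taken from frames labelled as silence. The averaging rule, the `floor` parameter, and the example spectra are assumptions for illustration; the abstract does not specify the authors' variant, and this sketch does not model reverberation.

```python
def estimate_noise_magnitude(silent_spectra):
    """Average the magnitude spectra of frames labelled as silence, bin by bin."""
    count = len(silent_spectra)
    return [sum(bins) / count for bins in zip(*silent_spectra)]


def spectral_subtract(noisy_magnitude, noise_magnitude, floor=0.01):
    """Subtract the noise estimate per bin, keeping a small spectral floor.

    The floor (an assumed value) prevents negative magnitudes when the
    noise estimate exceeds the observed magnitude in a bin.
    """
    return [
        max(noisy - noise, floor * noisy)
        for noisy, noise in zip(noisy_magnitude, noise_magnitude)
    ]


# Assumed toy data: two silent frames and one noisy speech frame, three bins each.
silent_spectra = [[1.0, 2.0, 1.0], [3.0, 2.0, 1.0]]
noise = estimate_noise_magnitude(silent_spectra)
cleaned = spectral_subtract([5.0, 2.5, 0.5], noise)
print(noise, cleaned)
```

If speech frames are mislabelled as silence, the noise estimate absorbs speech energy and the subtraction removes part of the voice, which is why the quality of the voice section detector matters.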

Voice Recognition Performance Improvement using a convergence of Voice Energy Distribution Process and Parameter (음성 에너지 분포 처리와 에너지 파라미터를 융합한 음성 인식 성능 향상)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence, v.13 no.10, pp.313-318, 2015
  • Traditional speech enhancement methods distort the sound spectrum depending on how the remaining noise is estimated, and invalid noise estimates lower speech recognition performance. In this paper, we propose a speech detection method that combines sound energy distribution processing with sound energy parameters. The proposed method reduces the influence of noise in order to maximize voice energy. In addition, the log energy features of intervals with small log energy are adjusted relative to regions with large energy so that they resemble the log energy features of a voice signal containing noise, which reduces the mismatch between the training and recognition environments. Recognition experiments confirmed improved recognition performance compared to the conventional method. In a car noise environment, the pause hit rate showed an accuracy of 97.1% and 97.3% in the low SNR region (0 dB and 5 dB) and 98.3% and 98.6% in the high SNR region (10 dB and 15 dB).

Cyber Threats Analysis of AI Voice Recognition-based Services with Automatic Speaker Verification (화자식별 기반의 AI 음성인식 서비스에 대한 사이버 위협 분석)

  • Hong, Chunho; Cho, Youngho
    • Journal of Internet Computing and Services, v.22 no.6, pp.33-40, 2021
  • Automatic Speech Recognition (ASR) is a technology that analyzes human speech sound into speech signals and then automatically converts them into character strings that humans can understand. Speech recognition technology has evolved from the basic level of recognizing a single word to the advanced level of recognizing sentences consisting of multiple words. In real-time voice conversation, a high recognition rate improves the convenience of natural information delivery and expands the scope of voice-based applications. On the other hand, as speech recognition technology is actively applied, concerns about related cyber attacks and threats are also increasing. According to existing studies, research on the technology itself, such as the design of Automatic Speaker Verification (ASV) techniques and the improvement of accuracy, is being actively conducted. However, few studies analyze attacks and threats in depth and variety. In this study, we propose a cyber attack model that bypasses voice authentication by simply manipulating voice frequency and voice speed against AI voice recognition services equipped with automated identification technology, and we analyze cyber threats through extensive experiments on the automated identification systems of commercial smartphones. Through this, we intend to convey the seriousness of the related cyber threats and raise interest in research on effective countermeasures.

Voice Recognition Performance Improvement using the Convergence of Voice signal Feature and Silence Feature Normalization in Cepstrum Feature Distribution (음성 신호 특징과 셉스트럽 특징 분포에서 묵음 특징 정규화를 융합한 음성 인식 성능 향상)

  • Hwang, Jae-Cheon
    • Journal of the Korea Convergence Society, v.8 no.5, pp.13-17, 2017
  • Existing methods for extracting speech features from a speech signal produce incorrect recognition because the threshold value for speech is not clear. This article presents a modeling method for improving speech recognition performance that combines speech feature extraction with silence features normalized for non-speech. The proposed method minimizes the effect of noise, and the speech recognition model combines speech signal feature extraction for each speech frame with silence feature normalization. The method also produces the original speech signal with an energy spectrum similar to entropy, so the speech is less affected by noise. Silence feature normalization improves the signal-to-noise ratio. We fixed the reference value for speech and non-speech classification in the cepstrum. In a performance analysis comparing the presented method with CHMM and HMM, the recognition rate improved by 2.7%p in the speech-dependent case and by 0.7%p in the speech-independent case.

A Study on Word Selection Method and Device Improvement for Improving Speech Recognition Rate of Speech-Language-impaired in Severe Noise Environment (심한 소음환경에서 언어장애인 음성 인식률 향상을 위한 단어선정 방법 및 장치 개선에 관한 연구)

  • Yang, Ki-Woong; Lee, Hyung-keun
    • Journal of the Korea Institute of Information and Communication Engineering, v.23 no.5, pp.555-567, 2019
  • The speech recognition rate drops in noisy environments, and it is difficult for people with speech disabilities or impaired speech to use speech recognition in social life. To reduce this inconvenience, 280 words were selected with an improved word selection method that takes into account the pronunciation characteristics of people with language impairments. The MEMS device developed for the experiment was made with material, lead wire type, length, and direction taken into account. Using the developed word selection method and MEMS device, we improved the speech recognition rate, which is degraded by incorrect pronunciation and severe noise. The improved word selection method and MEMS device are presented together with the results.

Voice Activity Detection Based on Entropy in Noisy Car Environment (차량 잡음 환경에서 엔트로피 기반의 음성 구간 검출)

  • Roh, Yong-Wan; Lee, Kue-Bum; Lee, Woo-Seok; Hong, Kwang-Seok
    • Journal of the Institute of Convergence Signal Processing, v.9 no.2, pp.121-128, 2008
  • Accurate voice activity detection (VAD) has a great impact on the performance of speech applications including speech recognition, speech coding, and speech communication. In this paper, we propose voice activity detection methods that can adapt to various car noise situations during driving. Existing voice activity detection uses methods such as time energy, frequency energy, zero-crossing rate, and spectral entropy, whose weak point is a rapid decline in performance in noisy environments. Building on the existing spectral entropy approach to VAD, we propose voice activity detection methods using MFB (Mel-frequency filter bank) spectral entropy, gradient FFT (Fast Fourier Transform) spectral entropy, and gradient MFB spectral entropy. The MFB is the FFT multiplied by the Mel scale, a nonlinear scale that reflects the characteristics of human sound perception. The proposed MFB spectral entropy method clearly improves the ability to discriminate between speech and non-speech in various noisy car environments, achieving 93.21% accuracy in experiments. Compared to the spectral entropy method, the proposed voice activity detection gives an average improvement in the correct detection rate of more than 3.2%.
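
The baseline spectral entropy feature that this abstract builds on can be sketched as follows. This is only the plain FFT-style spectral entropy, computed here with a direct DFT to stay self-contained; the Mel filter bank and gradient variants proposed in the paper are not reproduced, and the frame length and test signals are assumptions.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum via a direct DFT (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (bits) of the power spectrum normalized to sum to 1."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0:
        return 0.0
    probs = [p / total for p in power]
    return -sum(q * math.log2(q) for q in probs if q > 0)


n = 16
tone = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]  # energy in one bin
impulse = [1.0] + [0.0] * (n - 1)                             # flat spectrum
print(spectral_entropy(tone), spectral_entropy(impulse))
```

A spectrum concentrated in a few bins gives low entropy and a flat, noise-like spectrum gives high entropy, which is the property a spectral entropy detector thresholds on.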

A Study on the Multilingual Speech Recognition using International Phonetic Language (IPA를 활용한 다국어 음성 인식에 관한 연구)

  • Kim, Suk-Dong; Kim, Woo-Sung; Woo, In-Sung
    • Journal of the Korea Academia-Industrial cooperation Society, v.12 no.7, pp.3267-3274, 2011
  • Recently, speech recognition technology has developed dramatically, driven by the growing number of mobile device user environments and the influence of a variety of speech recognition software. However, in multilingual speech recognition, a lack of understanding of multilingual lexical models and the limited capacity of systems hinder improvement of the recognition rate. It is not easy to represent multilingual speech in a single acoustic model, and systems that use several acoustic models have lower speech recognition rates. It is therefore necessary to research and develop a multilingual speech recognition system that represents speech in various languages in a single acoustic model. Building on research into using a multilingual acoustic model on mobile devices, this paper studies a system that recognizes Korean and English using the International Phonetic Alphabet (IPA). Focusing on finding an IPA model that satisfies both Korean and English phonemes, we obtained voice recognition rates of 94.8% in Korean and 95.36% in English.