• Title/Summary/Keyword: 영교차율 (zero-crossing rate)

Search Result 59, Processing Time 0.022 seconds

Classification of Korean Traditional Musical Instruments Using Feature Functions and k-nearest Neighbor Algorithm (특성함수 및 k-최근접이웃 알고리즘을 이용한 국악기 분류)

  • Kim Seok-Ho;Kwak Kyung-Sup;Kim Jae-Chun
    • Journal of Korea Multimedia Society / v.9 no.3 / pp.279-286 / 2006
  • The classification method used in this paper is applied to Korean traditional music for the first time. Among the frequency distribution vectors, the average peak value is proposed and shown to be effective compared with the previous classification success rate. The mean, variance, spectral centroid, average peak value, and zero-crossing rate (ZCR) are used to classify Korean traditional musical instruments. Spectral analysis is used to achieve automatic classification of Korean traditional instruments. In the spectral domain, various functions are introduced to extract features from the data files, and the k-NN classification algorithm is applied in the experiments. Taegum, gayagum, and violin are classified with an accuracy of 94.44%, which is higher than the previous success rate of 87%.
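
The feature-plus-k-NN pipeline in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the average peak value is left out because the abstract does not define it, and the features are used unscaled, which a real system would probably normalize.

```python
import numpy as np

def extract_features(signal, sample_rate):
    """Return [mean, variance, spectral centroid, zero-crossing rate]."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate)
    # Magnitude-weighted mean frequency of the spectrum.
    centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)
    # Fraction of adjacent sample pairs whose signs differ.
    signs = np.signbit(signal)
    zcr = np.mean(signs[1:] != signs[:-1])
    return np.array([signal.mean(), signal.var(), centroid, zcr])

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    train_x = np.asarray(train_x, dtype=float)
    distances = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(distances)[:k]
    labels, counts = np.unique(np.asarray(train_y)[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

In use, one feature vector per recording would be stored with its instrument label, and an unseen recording would be labeled by calling `knn_predict` on its feature vector.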

A Study on Real Time Pitch Alteration of Speech Signal (음성신호의 실시간 피치변경에 관한 연구)

  • 김종국;박형빈;배명진
    • The Journal of the Acoustical Society of Korea / v.23 no.1 / pp.82-89 / 2004
  • This paper describes how to reduce the effect of the occupation threshold by controlling the transformation of the mixture components of HMM parameters in a hierarchical tree structure to prevent over-adaptation. To reduce correlations between data elements and to remove elements with low variance, we employ PCA (principal component analysis) and ICA (independent component analysis), which give as good a representation as possible and reduce the effect of over-adaptation. When the occupation threshold is lowered and the number of transformation functions is increased, the ordinary MLLR adaptation algorithm yields a lower recognition rate than SI models, whereas the proposed MLLR adaptation algorithm improves the word recognition rate by more than 2% compared with SI models.

Recognition of Korean Phonemes in the Spoken Isolated Words Using Distributed Neural Network (분산 신경망을 이용한 고립 단어 음성에 나타난 음소 인식)

  • Kim, Seon-Il;Lee, Haing-Sei
    • The Journal of the Acoustical Society of Korea / v.14 no.6 / pp.54-61 / 1995
  • In this paper, we implemented a distributed neural network that recognizes phonemes frame by frame for 30 Korean proverb sentences consisting of 106 isolated words. PLP cepstra, energy, and zero crossings were chosen as the speech features, and they were fed to the distributed neural networks over a wide span of frames around each frame to capture good temporal characteristics. A man in his twenties spoke the 30 proverbs 5 times. Four of the sets were used to train the neural network, and the remaining set was held out for testing. Silence was left between words for easy discrimination. The frame recognition rate of the large-grouping neural network is 95.3% when 4 sets were used for training.
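
Feeding a network the features from a wide span of frames around each frame is commonly done by stacking neighboring frames into one input vector. The sketch below shows that idea only; the function name, the window sizes, and the edge padding are assumptions, since the abstract does not state them.

```python
import numpy as np

def stack_context(features, left=3, right=3):
    """Concatenate each frame with its neighbours.

    features: array of shape (num_frames, num_coeffs).
    Returns an array of shape (num_frames, (left + 1 + right) * num_coeffs).
    Edge frames are padded by repeating the first or last frame.
    """
    features = np.asarray(features, dtype=float)
    padded = np.pad(features, ((left, right), (0, 0)), mode="edge")
    num_frames = features.shape[0]
    # Slice i holds, for every frame t, the frame at offset (i - left) from t.
    windows = [padded[i:i + num_frames] for i in range(left + 1 + right)]
    return np.concatenate(windows, axis=1)
```

Each output row then carries the per-frame features (for example PLP cepstra, energy, and zero crossings) of the current frame together with those of its neighbors.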

DNN based Robust Speech Feature Extraction and Signal Noise Removal Method Using Improved Average Prediction LMS Filter for Speech Recognition (음성 인식을 위한 개선된 평균 예측 LMS 필터를 이용한 DNN 기반의 강인한 음성 특징 추출 및 신호 잡음 제거 기법)

  • Oh, SangYeob
    • Journal of Convergence for Information Technology / v.11 no.6 / pp.1-6 / 2021
  • In speech recognition, the adoption of DNNs has increased the use of speech recognition, but parallel training requires more computation than the conventional GMM, and overfitting occurs when the amount of data is small. To solve this problem, we propose an efficient method for robust speech feature extraction and speech signal noise removal that works even when the amount of data is small. The feature extraction step efficiently extracts speech energy by applying the frame-energy difference for speech together with the zero-crossing ratio and level-crossing ratio, which are affected by the speech signal. To remove noise, an improved average-prediction LMS filter is applied; it removes the noise in the speech signal with little loss of speech information while preserving the intrinsic characteristics of speech during detection of the speech signal. The improved LMS filter processes noise in the input speech signal by adjusting the active parameter threshold for the input signal. Comparing the proposed method with the conventional frame energy method confirmed that the error rate at the start point of speech is 7% and that the error rate at the end point is improved by 11%.
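
For orientation, the textbook LMS update that such filters build on can be sketched as a one-step linear predictor. This is the standard LMS algorithm, not the improved average-prediction variant proposed in the paper, whose details the abstract does not give; the function name, filter order, and step size are illustrative assumptions.

```python
import numpy as np

def lms_predict(signal, order=4, step=0.05):
    """One-step-ahead LMS linear predictor.

    Returns (prediction, error), where error[n] = signal[n] - prediction[n].
    The first `order` samples are left at zero because no full history exists.
    """
    signal = np.asarray(signal, dtype=float)
    weights = np.zeros(order)
    prediction = np.zeros_like(signal)
    error = np.zeros_like(signal)
    for n in range(order, signal.size):
        past = signal[n - order:n][::-1]      # most recent sample first
        prediction[n] = weights @ past
        error[n] = signal[n] - prediction[n]
        weights += step * error[n] * past     # standard LMS weight update
    return prediction, error
```

The prediction tracks the correlated part of the input, while broadband noise that cannot be predicted from past samples stays in the error signal.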

Variable Quad Rate ADPCM for Efficient Speech Transmission and Real Time Implementation on DSP (효율적인 음성신호의 전송을 위한 4배속 가변 변환율 ADPCM기법 및 DSP를 이용한 실시간 구현)

  • 한경호
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.18 no.1 / pp.129-136 / 2004
  • In this paper, we propose a quad variable-rate ADPCM coding method for efficient speech transmission and implement it for real-time processing on a TMS320C6711 DSP. The modified ADPCM uses four variable coding rates, 16 kbps, 24 kbps, 32 kbps, and 40 kbps, on windows of speech samples to deliver good-quality speech with few data bits, and real-time encoding and decoding are implemented on the DSP. ZCR is used to identify the influence of noise on the speech signal and to decide the rate-change threshold. For noise-dominant signals, low coding rates are applied to minimize the data bits, and for signals with little noise, high coding rates are applied to enhance the speech quality. Because silent periods take up more than half of the signal in most speech telecommunications, speech quality close to that of 40 kbps can be obtained with comparably few data bits, as shown by simulation and experiments. The TMS320C6711 DSK board has 128K of flash memory and a performance of 1333 MIPS, and it meets the requirements for real-time implementation of the proposed coding algorithm.
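
The ZCR-driven rate decision could look roughly like the sketch below. It assumes that a higher ZCR indicates a noisier window and uses made-up threshold values; the abstract states neither, so both are placeholders, as are the function names.

```python
import numpy as np

RATES_KBPS = (16, 24, 32, 40)  # the four coding rates named in the abstract

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(np.asarray(frame, dtype=float))
    return float(np.mean(signs[1:] != signs[:-1]))

def select_rate(frame, thresholds=(0.1, 0.2, 0.3)):
    """Pick a coding rate: the noisier the window (higher ZCR), the lower the rate.

    The threshold values are hypothetical placeholders, not taken from the paper.
    """
    zcr = zero_crossing_rate(frame)
    # Number of thresholds reached: 0 (cleanest) to 3 (noisiest).
    noisiness = int(np.searchsorted(thresholds, zcr, side="right"))
    return RATES_KBPS[len(RATES_KBPS) - 1 - noisiness]
```

An encoder would call `select_rate` once per window of speech samples and switch the ADPCM quantizer to the returned rate.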

Voice Activity Detection Based on Entropy in Noisy Car Environment (차량 잡음 환경에서 엔트로피 기반의 음성 구간 검출)

  • Roh, Yong-Wan;Lee, Kue-Bum;Lee, Woo-Seok;Hong, Kwang-Seok
    • Journal of the Institute of Convergence Signal Processing / v.9 no.2 / pp.121-128 / 2008
  • Accurate voice activity detection has a great impact on the performance of speech applications, including speech recognition, speech coding, and speech communication. In this paper, we propose voice activity detection methods that can adapt to various car noise situations during driving. Existing voice activity detection has used methods such as time energy, frequency energy, zero-crossing rate, and spectral entropy, whose weak point is a rapid decline in performance in noisy environments. Building on the existing spectral entropy approach to VAD, we propose voice activity detection methods using MFB (Mel-frequency filter bank) spectral entropy, gradient FFT (Fast Fourier Transform) spectral entropy, and gradient MFB spectral entropy. MFB is the FFT multiplied by the Mel scale, a nonlinear scale that reflects the characteristics of human sound perception. The proposed MFB spectral entropy method clearly improves the ability to discriminate between speech and non-speech in various noisy car environments, achieving 93.21% accuracy in experiments. Compared with the spectral entropy method, the proposed voice activity detection gives an average improvement in the correct detection rate of more than 3.2%.
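
The baseline spectral entropy that this abstract builds on can be sketched as below. This shows only the plain FFT version; the MFB and gradient variants proposed in the paper are not reproduced here. The function name and the Hann window are assumptions.

```python
import numpy as np

def spectral_entropy(frame):
    """Shannon entropy (in bits) of the frame's normalized power spectrum."""
    frame = np.asarray(frame, dtype=float)
    power = np.abs(np.fft.rfft(frame * np.hanning(frame.size))) ** 2
    prob = power / (np.sum(power) + 1e-12)
    prob = prob[prob > 0]              # skip empty bins to avoid log2(0)
    return float(-np.sum(prob * np.log2(prob)))
```

A frame whose energy is concentrated in a few frequency bins scores a low entropy, while broadband noise scores a high one, so a detector can compare the value against a threshold tuned for the noise condition.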

A Study on Improved Method of Voice Recognition Rate (음성 인식률 개선방법에 관한 연구)

  • Kim, Young-Po;Lee, Han-Young
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.1 / pp.77-83 / 2013
  • In this paper, we propose and study a method for improving the voice recognition rate. Voices are generally detected by applying the most widely used method, the HMM (Hidden Markov Model) algorithm. For voice detection, the zero-crossing ratio was calculated per unit of voice before the presence of data was determined. For voice recognition, the patterns shown by the forms of voices were analyzed and then compared with previously learned patterns. In the experiments, compared with the 80% recognition rate of the existing HMM algorithm, the proposed algorithm based on recognizing the patterns shown by the forms of voices achieved a recognition rate of 92%, an improvement of about 12 percentage points.

A Partial Discharge Diagnostic System for Power Cable Using FBDS(Frequency Band Detection Sensor) (주파수대역 검출센서를 이용한 전력케이블의 부분방전 진단 시스템)

  • Lee, Chul-hee;Choi, Hyung-ki;Hong, Soo-mi;Jeoung, Eui-bung;Park, Kee-Young
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.1 / pp.157-163 / 2017
  • This system is a diagnostic system that checks whether a partial discharge is occurring in a power cable. PD (partial discharge) is detected by an FBDS (Frequency Band Detection Sensor), an acoustic sensor capable of detecting each frequency band. The waveform of PD sound is similar to noise and is systematically generated by partial discharge. Therefore, in this paper, we discriminate between normal and abnormal cases using the relative level crossing rate (RLCR) and a spectrogram of the frequency energy rate.

A Study on Extraction of Pitch and TSIUVC in Continuous Speech (연속음성신호에서 피치와 TSIUVC 추출에 관한 연구)

  • Lee See-Woo
    • Journal of Internet Computing and Services / v.6 no.4 / pp.85-92 / 2005
  • In this paper, I propose a new method for extracting pitch pulses and TSIUVC in continuous speech. The TSIUVC search and extraction method is based on the zero-crossing rate and on an individual pitch pulse extraction method using a FIR-STREAK filter. As a result, the extraction rate of individual pitch pulses was 96% for male voices and 85% for female voices. The TSIUVC extraction rates are 94.9% under 88% for male voices and 94.9% under 84.8% for female voices. This method can be applied to a new speech coding scheme based on Voiced/Silence/TSIUVC, to speech analysis, and to speech synthesis.

Improving the Performance of a Speech Recognition System in a Vehicle by Distinguishing Male/Female Voice (성별 구별방법에 의한 자동차 내 음성 인식 성능 향상)

  • Yang, Jin-Woo;Kim, Sun-Hyeop
    • Journal of KIISE:Software and Applications / v.27 no.12 / pp.1174-1182 / 2000
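  • To ensure both driver safety and convenience in a moving vehicle, this paper proposes a system that allows speech input and output at all times without operating an auxiliary switch. To obtain a noise-robust threshold, the reference energy and zero-crossing rate were updated every 1.5 seconds, and endpoint detection was performed automatically and accurately in real time in two stages, primary and secondary, using a band-pass filter. In addition, male and female speakers were distinguished by pitch detection to select the model, and, to use the model best suited to the vehicle's speed, separate male and female models were prepared for the Idle-40km, 40-80km, and 80-100km ranges. The speech feature vector was 13th-order PLP, and the recognition algorithm was OSDP (One-Stage Dynamic Programming). Experiments were conducted by speed range on downtown Seoul roads and on the Inner Ring Road. In speaker-independent recognition, the rates were 96.8% for men and 95.1% for women at 40-80km, and 91.6% for men and 90.6% for women at 80-100km. In speaker-dependent recognition, the rates were 98% for men and 96% for women at 40-80km, and 96% for men and 94% for women at 80-100km, which demonstrates the effectiveness of the system.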
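
Refreshing the reference energy and zero-crossing rate every 1.5 seconds might be sketched as follows. The abstract does not say how the references are chosen, so this sketch assumes, purely for illustration, that the quietest frame in each 1.5-second block represents the background noise; the function names and frame length are also hypothetical.

```python
import numpy as np

def frame_features(signal, frame_len):
    """Per-frame short-time energy and zero-crossing rate."""
    signal = np.asarray(signal, dtype=float)
    num_frames = signal.size // frame_len
    frames = signal[:num_frames * frame_len].reshape(num_frames, frame_len)
    energy = np.mean(frames ** 2, axis=1)
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return energy, zcr

def block_references(signal, sample_rate, frame_len=200, update_sec=1.5):
    """Return one (reference energy, reference ZCR) pair per update_sec block.

    Assumption: the lowest-energy frame of each block is taken as the noise
    reference; the paper's actual update rule is not given in the abstract.
    """
    energy, zcr = frame_features(signal, frame_len)
    frames_per_block = max(1, int(round(update_sec * sample_rate / frame_len)))
    references = []
    for start in range(0, energy.size, frames_per_block):
        block = slice(start, start + frames_per_block)
        quietest = start + int(np.argmin(energy[block]))
        references.append((float(energy[quietest]), float(zcr[quietest])))
    return references
```

An endpoint detector could then mark a frame as speech when its energy or ZCR rises sufficiently above the references of its own block, so the thresholds follow changes in road noise.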
