• Title/Abstract/Keyword: Speech signal processing

331 search results

Influence Analysis of Food on Body Organs by Applying Speech Signal Processing Techniques (음성신호처리 기술을 적용한 음식물이 인체 장기에 미치는 영향 분석)

  • Kim, Bong-Hyun;Cho, Dong-Uk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • Vol.37 No.5A
    • /
    • pp.388-394
    • /
    • 2012
  • In this paper, an analysis of the influence of food on human body organs is proposed by applying speech signal processing techniques. To date, most research on the influence of food on body organs has been of the form "ingredient A of a food may have a beneficial effect on organ B"; numerical, quantified studies of these effects have rarely been performed. This paper therefore proposes a method to quantify the effects with numerical data, in order to uncover new facts and information. In particular, it investigates the effect of tomatoes on human heart function. The experiment collects voice samples before eating and 5 minutes, 30 minutes, and 1 hour after eating from 15 males in their 20s with no cardiac abnormalities; changes in the voice signal components are then used to measure changes in heart condition, digitizing and quantifying the effect of tomatoes on cardiac function.

CSL Computerized Speech Lab - Model 4300B Software version 5.X

  • Ahn, Cheol-Min
    • Proceedings of the KSLP Conference
    • /
    • The Korean Society of Logopedics and Phoniatrics, 4th Annual Conference Symposium and Workshop, 1995
    • /
    • pp.154-164
    • /
    • 1995
  • CSL Model 4300B is a highly flexible audio processing package designed to provide a wide variety of speech analysis operations for both new and sophisticated users. Operations include: 1) data acquisition, 2) file management, 3) graphics, 4) numerical display, 5) audio output, 6) signal editing, and 7) a variety of analysis functions. The external module includes: 1) input control, 2) output control, and 3) jacks. The software includes: 1) a wide range of speech display manipulation, 2) editing, and 3) analysis. (omitted)


Voice Activity Detection Based on Entropy in Noisy Car Environment (차량 잡음 환경에서 엔트로피 기반의 음성 구간 검출)

  • Roh, Yong-Wan;Lee, Kue-Bum;Lee, Woo-Seok;Hong, Kwang-Seok
    • Journal of the Institute of Convergence Signal Processing
    • /
    • Vol.9 No.2
    • /
    • pp.121-128
    • /
    • 2008
  • Accurate voice activity detection has a great impact on the performance of speech applications, including speech recognition, speech coding, and speech communication. In this paper, we propose voice activity detection methods that can adapt to the various car-noise conditions encountered during driving. Existing voice activity detection has used methods such as time-domain energy, frequency-domain energy, zero-crossing rate, and spectral entropy, whose performance declines rapidly in noisy environments. Building on spectral-entropy-based VAD, we propose voice activity detection using MFB (Mel-frequency filter bank) spectral entropy, gradient FFT (Fast Fourier Transform) spectral entropy, and gradient MFB spectral entropy. The MFB is the FFT spectrum weighted by the Mel scale, a nonlinear scale that reflects how human auditory perception responds to speech. The proposed MFB spectral entropy method clearly improves the ability to discriminate between speech and non-speech in various noisy car environments, achieving 93.21% accuracy in our experiments. Compared to the spectral entropy method, the proposed voice activity detection gives an average improvement in the correct detection rate of more than 3.2%.
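
The Mel filter-bank spectral entropy feature described above can be illustrated with a short sketch. The Python code below is a minimal, assumed implementation (the frame length, hop, number of Mel bands, and the simple mean-minus-half-sigma threshold are my choices, not the authors' settings): it builds triangular Mel filters, computes the entropy of the normalized band energies per frame, and flags low-entropy frames as speech.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular Mel-spaced filter bank, shape (n_filters, n_fft//2 + 1)."""
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising slope
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling slope
    return fb

def mfb_spectral_entropy(frame, fb, n_fft):
    """Spectral entropy of the Mel filter-bank energies for one frame."""
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), n_fft)) ** 2
    band_energy = fb @ spec + 1e-12
    p = band_energy / band_energy.sum()      # probability-like distribution over bands
    return -np.sum(p * np.log(p))            # low entropy -> peaky, speech-like spectrum

# Frame-wise VAD decision by thresholding the entropy (toy input)
sr, n_fft, frame_len, hop = 16000, 512, 400, 160
fb = mel_filterbank(24, n_fft, sr)
x = np.random.randn(sr)                      # stand-in for a 1-second noisy recording
frames = [x[i:i + frame_len] for i in range(0, len(x) - frame_len, hop)]
entropy = np.array([mfb_spectral_entropy(f, fb, n_fft) for f in frames])
is_speech = entropy < (entropy.mean() - 0.5 * entropy.std())   # assumed simple threshold
```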


Evaluation Performance of Speech Coder in Speech Signal Processing

  • Lee, Kwang-Seok
    • Journal of information and communication convergence engineering
    • /
    • Vol.5 No.2
    • /
    • pp.177-180
    • /
    • 2007
  • We compared the CS-ACELP and QCELP speech coders for CDMA cellular systems under channel-error conditions, evaluated their performance with measured values, and identify an effective coding scheme for overcoming channel errors. The CS-ACELP speech coder, which uses an LSP vector quantizer, shows transparent speech quality: the spectral distortion (SD) is 0.92 dB and the proportion of outlier frames above 2 dB is 2.9% at a BER of 0.10%. The CS-ACELP coder, which uses an MA predictor, also shows better SVR and SEGSNR results than the QCELP coder (IS-96), which adopts a DPCM-type predictor, when bit errors occur at BERs from 0.01% to 0.50%.
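
The spectral-distortion figures quoted above (0.92 dB average SD, 2.9% of frames above 2 dB) are typically computed as the RMS log-spectral difference between the original and quantized LPC envelopes. The sketch below shows one common way to compute that measure; the function name, the toy LPC coefficient sets, and the use of scipy.signal.freqz are assumptions for illustration, not the paper's code.

```python
import numpy as np
from scipy.signal import freqz

def spectral_distortion_db(lpc_orig, lpc_quant, n_freq=256):
    """RMS log-spectral distortion (dB) between two LPC envelopes 1/|A(e^jw)|."""
    _, h1 = freqz([1.0], lpc_orig, worN=n_freq)
    _, h2 = freqz([1.0], lpc_quant, worN=n_freq)
    diff_db = 20.0 * np.log10(np.abs(h1) + 1e-12) - 20.0 * np.log10(np.abs(h2) + 1e-12)
    return np.sqrt(np.mean(diff_db ** 2))

# Per-frame SD and the >2 dB outlier rate reported in such evaluations (toy LPC sets)
frames_orig = [np.array([1.0, -1.2, 0.5]), np.array([1.0, -0.9, 0.3])]
frames_quant = [np.array([1.0, -1.18, 0.52]), np.array([1.0, -0.95, 0.28])]
sd = np.array([spectral_distortion_db(a, b) for a, b in zip(frames_orig, frames_quant)])
print("mean SD = %.2f dB, outliers >2 dB = %.1f%%" % (sd.mean(), 100 * np.mean(sd > 2.0)))
```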

Noise Elimination Using Improved MFCC and Gaussian Noise Deviation Estimation

  • Oh, Sang-Yeob
    • Journal of the Korea Society of Computer and Information
    • /
    • Vol.28 No.1
    • /
    • pp.87-92
    • /
    • 2023
  • With the continuous development of speech recognition systems, recognition rates have improved rapidly, but a remaining drawback is that speech cannot be recognized accurately when various voices are mixed with noise in the usage environment. To raise the vocabulary recognition rate when processing speech with environmental noise, the noise must be removed. Even in existing HMM, CHMM, GMM, and DNN systems built on AI models, unexpected noise occurs or quantization noise is inherently added to the digital signal; when this happens, the source signal is altered or corrupted, which lowers the recognition rate. To solve this problem, the MFCC was improved and applied so as to efficiently extract the features of the speech signal for each frame. To remove noise from the speech signal, a noise removal method using a Gaussian model with noise deviation estimation was improved and applied. The performance of the proposed model was evaluated using a cross-correlation coefficient to assess the accuracy of the speech. Evaluating the recognition rate of the proposed method confirmed that the difference in the average value of the correlation coefficient improved by 0.53 dB.
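
As a rough illustration of the kind of pipeline described above, the sketch below estimates the noise deviation from an assumed speech-free leading segment (a Gaussian noise assumption), applies a very simple magnitude spectral subtraction, and scores the result with a normalized cross-correlation coefficient. It is not the paper's improved MFCC; every constant and function name here is an assumption.

```python
import numpy as np

def estimate_noise_std(noisy, noise_len):
    """Gaussian noise deviation estimate from an assumed speech-free leading segment."""
    return np.std(noisy[:noise_len])

def spectral_subtract(noisy, noise_std, frame=256):
    """Very simple magnitude spectral subtraction driven by the estimated deviation."""
    out = np.zeros_like(noisy)
    for i in range(0, len(noisy) - frame, frame):
        spec = np.fft.rfft(noisy[i:i + frame])
        mag = np.maximum(np.abs(spec) - noise_std * np.sqrt(frame), 0.0)
        out[i:i + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame)
    return out

def cross_correlation_coefficient(x, y):
    """Normalized cross-correlation, used here as the accuracy measure."""
    x, y = x - x.mean(), y - y.mean()
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12))

# Toy usage: clean tone with a silent lead-in, plus Gaussian noise; denoise, then compare
sr = 8000
t = np.arange(sr) / sr
clean = np.sin(2 * np.pi * 220 * t)
clean[:800] = 0.0                                   # noise-only segment for the estimator
noisy = clean + 0.3 * np.random.randn(len(clean))
denoised = spectral_subtract(noisy, estimate_noise_std(noisy, 800))
print(cross_correlation_coefficient(clean, noisy),
      cross_correlation_coefficient(clean, denoised))
```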

A Study on Extracting Valid Speech Sounds by the Discrete Wavelet Transform (이산 웨이브렛 변환을 이용한 유효 음성 추출에 관한 연구)

  • Kim, Jin-Ok;Hwang, Dae-Jun;Baek, Han-Uk;Jeong, Jin-Hyeon
    • The KIPS Transactions:PartB
    • /
    • Vol.9B No.2
    • /
    • pp.231-236
    • /
    • 2002
  • The classification of speech-sound blocks builds on the multi-resolution analysis property of the discrete wavelet transform, which is used to reduce the computation time of the pre-processing stage of speech recognition. A merging algorithm is proposed to extract valid speech sounds in terms of position and frequency range; it performs unvoiced/voiced classification and denoising. Since the merging algorithm can determine the processing parameters from the voice content alone and is independent of system noise, it is useful for extracting valid speech sounds. The merging algorithm adapts to arbitrary system noise, achieves a good denoising signal-to-noise ratio, and allows convenient tuning for system implementation.
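
A minimal picture of the multi-resolution analysis the paper relies on: the sketch below runs a plain Haar DWT over several levels and flags frames whose detail-band energy exceeds a threshold as candidate speech blocks. The Haar wavelet, frame size, and threshold are my assumptions; the paper's merging algorithm and its parameter decisions are not reproduced.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar discrete wavelet transform: (approximation, detail)."""
    x = x[:len(x) // 2 * 2]
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def multiresolution_energies(x, levels=4):
    """Per-level detail energies from a multi-level Haar DWT."""
    energies, approx = [], x
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        energies.append(float(np.sum(detail ** 2)))
    return energies

def valid_speech_blocks(x, frame=512, levels=4, threshold=1.0):
    """Mark frames whose total wavelet detail energy exceeds a threshold as candidate speech."""
    flags = []
    for i in range(0, len(x) - frame, frame):
        e = sum(multiresolution_energies(x[i:i + frame], levels))
        flags.append(e > threshold)
    return flags

# Toy usage: near-silence followed by a burst of tonal, speech-like energy
x = np.concatenate([0.01 * np.random.randn(4096),
                    np.sin(2 * np.pi * 300 * np.arange(4096) / 8000)])
print(valid_speech_blocks(x, threshold=0.5))
```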

Real-time Implementation of the 2.4 kbps EHSX Speech Coder Using a TMS320C6701™ DSP Core (TMS320C6701™을 이용한 2.4kbps EHSX 음성 부호화기의 실시간 구현)

  • Yang, Yong-Ho;Lee, In-Sung;Kwon, Oh-Ju
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • Vol.29 No.7C
    • /
    • pp.962-970
    • /
    • 2004
  • This paper presents an efficient implementation of the 2.4 kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder on a TMS320C6701™ floating-point digital signal processor. The EHSX codec models the excitation signal with harmonic and CELP (Code Excited Linear Prediction) models, respectively, according to the frame characteristic, i.e., voiced or unvoiced speech. We describe the optimization methods used to reduce complexity for real-time implementation. The complexity of the CELP filtering, which dominates the overall complexity of the EHSX algorithm, can be reduced by converting the floating-point code to fixed-point code. We also present efficient optimizations, including code allocation that takes the DSP architecture into account and a low-complexity harmonic/pitch search algorithm in the encoder. Finally, we obtained a speech quality score of MOS 3.28 in tests using PESQ (perceptual evaluation of speech quality, ITU-T Recommendation P.862) and achieved the goal of real-time operation of the EHSX codec.
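
The floating-point to fixed-point conversion mentioned above is the standard way to cut filtering complexity on fixed-point or mixed DSP targets. The sketch below shows the idea on a small FIR filter using an assumed Q15 format with a 32-bit accumulator; it is a generic illustration, not the EHSX coder's actual code.

```python
import numpy as np

Q = 15                        # Q15 fixed-point format assumed for this sketch
SCALE = 1 << Q

def to_q15(x):
    """Quantize floating-point values in [-1, 1) to 16-bit-range Q15 integers."""
    return np.clip(np.round(np.asarray(x) * SCALE), -SCALE, SCALE - 1).astype(np.int32)

def fir_q15(x_q, h_q):
    """FIR filtering with Q15 operands and a wide accumulator, as in fixed-point DSP code."""
    y = np.zeros(len(x_q), dtype=np.int64)
    for n in range(len(x_q)):
        acc = 0
        for k in range(len(h_q)):
            if n - k >= 0:
                acc += int(x_q[n - k]) * int(h_q[k])   # Q15 * Q15 -> Q30 accumulator
        y[n] = acc >> Q                                # back to Q15 with truncation
    return y

# Compare against the floating-point reference
x = 0.5 * np.sin(2 * np.pi * np.arange(64) / 16)
h = np.array([0.25, 0.5, 0.25])
y_float = np.convolve(x, h)[:len(x)]
y_fixed = fir_q15(to_q15(x), to_q15(h)).astype(np.float64) / SCALE
print(np.max(np.abs(y_float - y_fixed)))               # small quantization error expected
```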

Speech Signal Processing using Pitch Synchronous Multi-Spectra and DSP System Design in Cochlear Implant (피치동기 다중 스펙트럼을 이용한 청각보철장치의 음성신호처리 및 DSP 시스템 설계)

  • Shin, J. I.;Park, S. J.;Shin, D. K.;Lee, J. H.;Park, S. H.
    • Journal of Biomedical Engineering Research
    • /
    • Vol.20 No.4
    • /
    • pp.495-502
    • /
    • 1999
  • In this paper, we propose efficient speech signal processing algorithms and a system for a cochlear implant. The outer and middle ear, which perform amplification, low-pass filtering, and AGC, are modeled by an analog system, while the inner ear, acting as a time-delayed multi-filter, and the transducer are implemented with a DSP circuit that enables real-time processing. In particular, the basilar-membrane characteristic of the inner ear is modeled by a nonlinear filter bank, and the tonotopy and periodicity of the auditory system are satisfied by using a pitch-synchronous multi-spectra (PSMS) method. Moreover, most of the speech processing is performed in software, so the system can be easily modified, and because the program is written in C, it can be easily ported to systems using other processors.
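
The band-splitting idea behind such cochlear-implant front ends can be sketched generically: band-pass the signal into a few tonotopically ordered channels, rectify, and low-pass to obtain per-channel stimulation envelopes. The code below is such a generic channel-vocoder-style sketch; the channel count, filter orders, and cutoff frequencies are assumptions, and the paper's pitch-synchronous multi-spectra (PSMS) processing is not reproduced.

```python
import numpy as np
from scipy.signal import butter, lfilter

def bandpass_bank(sr, n_channels=8, f_lo=200.0, f_hi=3500.0, order=4):
    """Log-spaced band-pass filters approximating tonotopically ordered channels."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    return [butter(order, [edges[i], edges[i + 1]], btype="band", fs=sr)
            for i in range(n_channels)]

def channel_envelopes(x, sr, bank, f_env=200.0):
    """Band-pass each channel, rectify, and low-pass to get per-channel envelopes."""
    b_env, a_env = butter(2, f_env, btype="low", fs=sr)
    return np.stack([lfilter(b_env, a_env, np.abs(lfilter(b, a, x)))
                     for b, a in bank])

# Toy usage: two tones land in different channels of the bank
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 1800 * t)
env = channel_envelopes(x, sr, bandpass_bank(sr))
print(env.shape)   # (n_channels, n_samples): stimulation envelopes per electrode channel
```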


The Convergence Speed Enhancement using a Cosine Modulated Filter Banks and a Decimation Technique (코사인 변조된 필터 뱅크와 Decimation을 이용한 수렴 속도 성능 개선)

  • Choi Chang-Kwon;Cho Byung-Mo
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • Proceedings of the Acoustical Society of Korea Conference 1999, Vol.18 No.2
    • /
    • pp.193-196
    • /
    • 1999
  • This paper proposes a method that improves convergence speed in modeling acoustic impulse responses by using cosine-modulated filter banks and decimation, and applies it to noise cancellation. In the proposed structure, the input signal is split into subbands with a filter bank and then decimated, which reduces the ratio of the largest to the smallest eigenvalue of the filter input signal and reduces the number of filter taps. Lowering the sampling frequency in each subband also expands the signal spectrum, and feeding these subband signals to the adaptive filter improves the convergence speed. Experimental results show that for colored noise the proposed method did not achieve a better MSE (Mean Square Error) than the LMS algorithm; for modeling a real acoustic system, however, it achieved nearly the same MSE, converged faster in all cases, and yielded improved speech quality when applied to speech enhancement.
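
A compact way to see why subband decomposition plus decimation speeds up adaptive filtering is to run an independent (N)LMS filter in each decimated band, as sketched below. The cosine-modulated prototype, band count, tap count, and step size are assumptions chosen for illustration; the toy task identifies a short FIR path per band.

```python
import numpy as np

def cosine_modulated_bank(n_bands, proto_len=64):
    """Pseudo-QMF style analysis filters: a low-pass prototype modulated by cosines."""
    n = np.arange(proto_len)
    proto = np.sinc((n - (proto_len - 1) / 2) / (2 * n_bands)) * np.hamming(proto_len)
    proto /= proto.sum()
    return [2 * proto * np.cos(np.pi / n_bands * (k + 0.5)
                               * (n - (proto_len - 1) / 2) + (-1) ** k * np.pi / 4)
            for k in range(n_bands)]

def subband_lms(x, d, n_bands=4, taps=16, mu=0.05):
    """Analyze x and d into decimated subbands and run an independent NLMS filter in each."""
    bank = cosine_modulated_bank(n_bands)
    err = []
    for h in bank:
        xs = np.convolve(x, h)[::n_bands]          # analysis filtering + decimation
        ds = np.convolve(d, h)[::n_bands]
        w = np.zeros(taps)
        e_band = np.zeros(len(xs))
        for n in range(taps, len(xs)):
            u = xs[n - taps:n][::-1]
            e_band[n] = ds[n] - w @ u
            w += mu * e_band[n] * u / (u @ u + 1e-8)   # normalized LMS update
        err.append(e_band)
    return err

# Toy system identification: the desired signal is the input through an unknown FIR path
rng = np.random.default_rng(0)
x = rng.standard_normal(8000)
d = np.convolve(x, [0.6, -0.3, 0.1])[:len(x)]
errors = subband_lms(x, d)
print([float(np.mean(e[-500:] ** 2)) for e in errors])   # per-band steady-state MSE
```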


Using speech enhancement parameter for ASR (잡음환경의 ASR 성능개선을 위한 음성강조 파라미터)

  • Cha, Young-Dong;Kim, Young-Sub;Hur, Kang-In
    • Proceedings of the Korea Institute of Convergence Signal Processing
    • /
    • Proceedings of the Korea Institute of Signal Processing and Systems Summer Conference, 2006
    • /
    • pp.63-66
    • /
    • 2006
  • Speech recognition systems have the advantage of letting people operate a system by voice alone, without additional equipment, but various technical difficulties and low recognition rates in real environments have kept them from being used widely. Among these factors, background noise is pointed to as a cause of reduced recognition accuracy. To improve the performance of ASR (Automatic Speech Recognition) in such noisy environments, we propose a feature parameter that adds a lateral-inhibition function. We compared recognition rates between MFCC, the parameter widely used in ASR, and the proposed parameter using an HMM recognizer. The experimental results confirm that using the proposed parameter improves ASR performance in noisy environments.
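
The lateral-inhibition idea can be sketched as an on-center/off-surround kernel applied across log filter-bank channels before the cepstral transform, sharpening spectral peaks relative to plain MFCC. The kernel shape and strength below are assumptions for illustration, not the authors' parameter.

```python
import numpy as np
from scipy.fftpack import dct

def lateral_inhibition(log_energies, strength=0.5):
    """Subtract a weighted average of neighboring channels (on-center/off-surround kernel)."""
    kernel = np.array([-strength / 2, 1.0, -strength / 2])
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, log_energies)

def cepstral_features(log_energies, n_ceps=13, inhibit=True):
    """MFCC-like cepstra; optionally sharpen band energies with lateral inhibition first."""
    e = lateral_inhibition(log_energies) if inhibit else log_energies
    return dct(e, type=2, axis=1, norm="ortho")[:, :n_ceps]

# Toy usage with random log filter-bank energies (frames x 24 Mel bands)
log_e = np.log(np.random.rand(100, 24) + 1e-3)
mfcc_plain = cepstral_features(log_e, inhibit=False)
mfcc_li = cepstral_features(log_e, inhibit=True)
print(mfcc_plain.shape, mfcc_li.shape)
```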
