• Title/Summary/Keyword: Speech detection

Search Result 469, Processing Time 0.026 seconds

A Study on Speech Period and Pitch Detection for Continuous Speech Recognition (연속음성인식을 위한 음성구간과 피치검출에 관한 연구)

  • Kim Tai Suk;Chang jong chil
    • Journal of Korea Multimedia Society
    • /
    • v.8 no.1
    • /
    • pp.56-61
    • /
    • 2005
  • In this thesis, propose speech period and pitch detection for continuous speech recognition. This mathod is distinguishes between vowel and consonant to frame unit in continuous speech, for distinguishable voice. Powerful extraction of speech period could threshold energy make use of input signal to real noise environment. Also algorithm of this method distinguish between vowel and consonant at the same time in voice make use of zero crossing rate and short time energy to extractible speech period.

  • PDF

Noise Whitening-Based Pitch Detection for Speech Highly Corrupted by Colored Noise

  • Byun, Kyung-Jin;Jeong, Sang-Bae;Kim, Hoi-Rin;Hahn, Min-Soo
    • ETRI Journal
    • /
    • v.25 no.1
    • /
    • pp.49-51
    • /
    • 2003
  • Pitch estimation is important in various speech research areas, but when the speech is noisy, accurate pitch estimation with conventional pitch detectors is almost impossible. To solve this problem, we propose a new pitch detection algorithm for noisy speech using a noise whitening technique on the background noise and obtain successful results.

  • PDF

The Pitch Detection Using Variable LPF (Variable LPF에 의한 피치검출)

  • 백금란
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1993.06a
    • /
    • pp.88-92
    • /
    • 1993
  • In speech signal processing, it is necessary to detect exactly the pitch. The algorithms of pitch extraction which have been proposed until now are difficult to detect pitches over wide range speech signals. Thus we propose a new algorithm which uses the G-peak extraction to do it. It is the method that finds the most MZI(maximum zero-crossing interval) at each frame and convolve it with speech signal ; this is the same with passing speech signals to variable LPF. Finally we obtained the pitch, improve the accuracy of pitch detection and extract it with the high speed.

  • PDF

An Explicit Voiced Speech Classification by using the Fluctuation of Maximum Magitudes (최대진폭의 Fluctuation에 의한 유성음구간 Explicit 검출)

  • 배명진
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1987.11a
    • /
    • pp.86-88
    • /
    • 1987
  • Accurate detection of the voicved segment in speech signals is important for robust pitch extraction. This paper describes an explicit detection algorithmfor detecting the voiced segment in speech signals. Thsi algoithm is based on the fluctuation properties of maximum magnitudes in each frame of speech signals. The performance of this detector is evaluated and compared to that obtained from manually classifying 150 recorded digit utterances.

  • PDF

A Comparative Study of Voice Activity Detection Algorithms in Adverse Environments (잡음 환경에서의 음성 검출 알고리즘 비교 연구)

  • Yang Kyong-Chul;Yook Dong-Suk
    • Proceedings of the KSPS conference
    • /
    • 2006.05a
    • /
    • pp.45-48
    • /
    • 2006
  • As the speech recognition systems are used in many emerging applications, robust performance of speech recognition systems under extremely noisy conditions become more important. The voice activity detection (VAD) has been taken into account as one of the important factors for robust speech recognition. In this paper, we investigate conventional VAD algorithms and analyze the weak and the strong points of each algorithm.

  • PDF

Robust Entropy Based Voice Activity Detection Using Parameter Reconstruction in Noisy Environment

  • Han, Hag-Yong;Lee, Kwang-Seok;Koh, Si-Young;Hur, Kang-In
    • Journal of information and communication convergence engineering
    • /
    • v.1 no.4
    • /
    • pp.205-208
    • /
    • 2003
  • Voice activity detection is a important problem in the speech recognition and speech communication. This paper introduces new feature parameter which are reconstructed by spectral entropy of information theory for robust voice activity detection in the noise environment, then analyzes and compares it with energy method of voice activity detection and performance. In experiments, we confirmed that spectral entropy and its reconstructed parameter are superior than the energy method for robust voice activity detection in the various noise environment.

Voice Activity Detection Based on SNR and Non-Intrusive Speech Intelligibility Estimation

  • An, Soo Jeong;Choi, Seung Ho
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.11 no.4
    • /
    • pp.26-30
    • /
    • 2019
  • This paper proposes a new voice activity detection (VAD) method which is based on SNR and non-intrusive speech intelligibility estimation. In the conventional SNR-based VAD methods, voice activity probability is obtained by estimating frame-wise SNR at each spectral component. However these methods lack performance in various noisy environments. We devise a hybrid VAD method that uses non-intrusive speech intelligibility estimation as well as SNR estimation, where the speech intelligibility score is estimated based on deep neural network. In order to train model parameters of deep neural network, we use MFCC vector and the intrusive speech intelligibility score, STOI (Short-Time Objective Intelligent Measure), as input and output, respectively. We developed speech presence measure to classify each noisy frame as voice or non-voice by calculating the weighted average of the estimated STOI value and the conventional SNR-based VAD value at each frame. Experimental results show that the proposed method has better performance than the conventional VAD method in various noisy environments, especially when the SNR is very low.

Boll's Spectral Subtraction Algorithm by New Voice Activity Detection (새로운 음성 활동 검출법에 의한 Boll의 스펙트럼 차감 알고리즘)

  • 류종훈;김대경;박장식;손경식
    • Journal of Korea Multimedia Society
    • /
    • v.4 no.1
    • /
    • pp.46-55
    • /
    • 2001
  • In this paper, a new voice activity detection method estimating SNR of enhanced speech with extended spectral subtraction (ESS) is proposed. Voice activity detection is performed by putting an second Wiener filter behind an Wiener filter used in the ESS to estimate speech and noise power of output signal of first Wiener filter. The proposed voice activity detection method does not require many computational loads and performs well under severe input SNR. Boll's spectral substraction algorithm with proposed voice activity detection was compared to ESS under several noise environment having different time-frequency distributions. During speech and non-speech activity, performance of Boll's spectral substraction algorithm with proposed voice activity detection is superior to that of ESS.

  • PDF

A New Endpoint Detection Method Based on Chaotic System Features for Digital Isolated Word Recognition System

  • Zang, Xian;Chong, Kil-To
    • Proceedings of the IEEK Conference
    • /
    • 2009.05a
    • /
    • pp.37-39
    • /
    • 2009
  • In the research of speech recognition, locating the beginning and end of a speech utterance in a background of noise is of great importance. Since the background noise presenting to record will introduce disturbance while we just want to get the stationary parameters to represent the corresponding speech section, in particular, a major source of error in automatic recognition system of isolated words is the inaccurate detection of beginning and ending boundaries of test and reference templates, thus we must find potent method to remove the unnecessary regions of a speech signal. The conventional methods for speech endpoint detection are based on two simple time-domain measurements - short-time energy, and short-time zero-crossing rate, which couldn't guarantee the precise results if in the low signal-to-noise ratio environments. This paper proposes a novel approach that finds the Lyapunov exponent of time-domain waveform. This proposed method has no use for obtaining the frequency-domain parameters for endpoint detection process, e.g. Mel-Scale Features, which have been introduced in other paper. Comparing with the conventional methods based on short-time energy and short-time zero-crossing rate, the novel approach based on time-domain Lyapunov Exponents(LEs) is low complexity and suitable for Digital Isolated Word Recognition System.

  • PDF

A Novel Two-Level Pitch Detection Approach for Speaker Tracking in Robot Control

  • Hejazi, Mahmoud R.;Oh, Han;Kim, Hong-Kook;Ho, Yo-Sung
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2005.06a
    • /
    • pp.89-92
    • /
    • 2005
  • Using natural speech commands for controlling a human-robot is an interesting topic in the field of robotics. In this paper, our main focus is on the verification of a speaker who gives a command to decide whether he/she is an authorized person for commanding. Among possible dynamic features of natural speech, pitch period is one of the most important ones for characterizing speech signals and it differs usually from person to person. However, current techniques of pitch detection are still not to a desired level of accuracy and robustness. When the signal is noisy or there are multiple pitch streams, the performance of most techniques degrades. In this paper, we propose a two-level approach for pitch detection which in compare with standard pitch detection algorithms, not only increases accuracy, but also makes the performance more robust to noise. In the first level of the proposed approach we discriminate voiced from unvoiced signals based on a neural classifier that utilizes cepstrum sequences of speech as an input feature set. Voiced signals are then further processed in the second level using a modified standard AMDF-based pitch detection algorithm to determine their pitch periods precisely. The experimental results show that the accuracy of the proposed system is better than those of conventional pitch detection algorithms for speech signals in clean and noisy environments.

  • PDF