• Title/Summary/Keyword: Speech signal processing

Investigating the Effects of Hearing Loss and Hearing Aid Digital Delay on Sound-Induced Flash Illusion

  • Moradi, Vahid;Kheirkhah, Kiana;Farahani, Saeid;Kavianpour, Iman
    • Journal of Audiology & Otology / v.24 no.4 / pp.174-179 / 2020
  • Background and Objectives: The integration of auditory-visual speech information improves speech perception; however, if the auditory system input is disrupted due to hearing loss, auditory and visual inputs cannot be fully integrated. Additionally, temporal coincidence of auditory and visual input is an important factor in integrating the input of these two senses. The acoustic pathway is delayed when the signal passes through digital signal processing. Therefore, this study aimed to investigate the effects of hearing loss and the hearing aid digital delay circuit on sound-induced flash illusion. Subjects and Methods: A total of 13 adults with normal hearing, 13 with mild to moderate hearing loss, and 13 with moderate to severe hearing loss were enrolled in this study. Subsequently, the sound-induced flash illusion test was conducted, and the results were analyzed. Results: The results showed that hearing aid digital delay and hearing loss had no detrimental effect on sound-induced flash illusion. Conclusions: Transmission velocity and neural transduction rate of the auditory inputs decreased in patients with hearing loss. Hence, auditory and visual sensory inputs cannot be integrated completely. However, the transmission rate of the auditory input was approximately normal when the hearing aid was prescribed. Thus, it can be concluded that the processing delay in the hearing aid circuit is insufficient to disrupt the integration of auditory and visual information.

Central Auditory Processing Tests as Diagnostic Tools for the Early Identification of Elderly Individuals with Mild Cognitive Impairment

  • Jalaei, Bahram;Valadbeigi, Ayub;Panahi, Rasool;Nahrani, Morteza Hamidi;Arefi, Hossein Namvar;Zia, Maryam;Ranjbar, Nastaran
    • Journal of Audiology & Otology / v.23 no.2 / pp.83-88 / 2019
  • Background and Objectives: Mild cognitive impairment (MCI) is a disorder that usually occurs in the elderly, leading to dementia in some progressive cases. The purpose of this study is to examine the utility of central auditory processing tests as early diagnostic tools for identifying the elderly with MCI. Subjects and Methods: This study was conducted on 20 elderly patients with MCI and 20 healthy matched peers. Speech perception in a quiet environment and in background noise, and temporal resolution, were assessed using the Speech Perception in Noise (SPIN) and Gap in Noise (GIN) tests, respectively. Results: The ability to understand speech in a quiet environment did not differ significantly between the two groups. However, SPIN at the three signal-to-noise ratios and the temporal resolution scores were significantly different between the two groups (p<0.001). Conclusions: Individuals with MCI appear to have poorer speech comprehension in noise and lower temporal resolution than peers of the same age without cognitive defects. Because the GIN test seems to be less influenced by intervening factors, we propose that it can be a useful tool for the early screening of elderly people with cognitive problems.

A Study on Extraction of Pitch and TSIUVC in Continuous Speech (연속음성신호에서 피치와 TSIUVC 추출에 관한 연구)

  • Lee See-Woo
    • Journal of Internet Computing and Services / v.6 no.4 / pp.85-92 / 2005
  • In this paper, I propose a new method for extracting pitch pulses and TSIUVC in continuous speech. The TSIUVC search and extraction method is based on the zero-crossing rate and on an individual pitch pulse extraction method that uses an FIR-STREAK filter. The extraction rate of individual pitch pulses was 96% for male voices and 85% for female voices. The TSIUVC extraction rates range from 88% to 94.9% for male voices and from 84.8% to 94.9% for female voices. This method can be applied to a new Voiced/Silence/TSIUVC speech coding scheme, to speech analysis, and to speech synthesis.
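
The abstract above names the zero-crossing rate as one basis of the TSIUVC search. As a minimal sketch of that quantity only (not the author's extraction method), the following computes the zero-crossing rate of non-overlapping frames. The function names and the frame length are illustrative assumptions, not taken from the paper.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(signal, frame_len):
    """ZCR of consecutive non-overlapping frames (a trailing partial frame is dropped)."""
    return [
        zero_crossing_rate(signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
```

Unvoiced, noise-like segments tend to give high values and voiced segments low values, which is why the rate is a common cue for locating unvoiced consonants.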

A Study on the Signal Processing for Content-Based Audio Genre Classification (내용기반 오디오 장르 분류를 위한 신호 처리 연구)

  • 윤원중;이강규;박규식
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.6 / pp.271-278 / 2004
  • In this paper, we propose a content-based audio genre classification algorithm that automatically classifies query audio into five genres (Classic, Hiphop, Jazz, Rock, and Speech) using a digital signal processing approach. A 20-second query audio file is segmented into 23 ms frames with a non-overlapping Hamming window, and a 54-dimensional feature vector, including Spectral Centroid, Rolloff, Flux, LPC, and MFCC, is extracted from each query audio. k-NN, Gaussian, and GMM classifiers are used for classification. To choose the optimum features from the 54-dimensional feature vector, the SFS (Sequential Forward Selection) method is applied to select 10 optimum feature dimensions, which are then used for genre classification. The experimental results verify the superior performance of the proposed method, which provides a success rate of nearly 90% for genre classification, a 10% to 20% improvement over previous methods. For an actual user system environment, the feature vector is extracted from a random interval of the query audio, and this shows an overall success rate of 80%, except in the extreme cases of the beginning and ending portions of the query audio file.
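
One of the features listed above, the spectral centroid of a Hamming-windowed frame, can be sketched as follows. This is an illustrative sketch under assumptions: the function names, the naive DFT, and the sample rate used in the usage note are not from the paper, and the paper's other features (Rolloff, Flux, LPC, MFCC) are not shown.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) DFT, adequate for a sketch)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one windowed frame, in Hz."""
    window = hamming(len(frame))
    mags = magnitude_spectrum([x * w for x, w in zip(frame, window)])
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    freqs = [k * sample_rate / len(frame) for k in range(len(mags))]
    return sum(f * m for f, m in zip(freqs, mags)) / total
```

For example, with an assumed sample rate of 8000 Hz, a frame holding a 3000 Hz tone yields a higher centroid than a frame holding a 1000 Hz tone. A practical implementation would replace the naive DFT with an FFT.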

Robust Voice Activity Detection Using the Spectral Peaks of Vowel Sounds

  • Yoo, In-Chul;Yook, Dong-Suk
    • ETRI Journal / v.31 no.4 / pp.451-453 / 2009
  • This letter proposes the use of vowel sound detection for voice activity detection. Vowels have distinctive spectral peaks. These are likely to remain higher than their surroundings even after severe corruption. Therefore, by developing a method of detecting the spectral peaks of vowel sounds in corrupted signals, voice activity can be detected even in low signal-to-noise ratio (SNR) conditions. Experimental results indicate that the proposed algorithm performs reliably under various noise and low SNR conditions. This method is suitable for mobile environments where the characteristics of noise may not be known in advance.

Statistical Voice Activity Detector Based on Signal Subspace Model (신호 준공간 모델에 기반한 통계적 음성 검출기)

  • Ryu, Kwang-Chun;Kim, Dong-Kook
    • The Journal of the Acoustical Society of Korea / v.27 no.7 / pp.372-378 / 2008
  • Voice activity detectors (VAD) are important in wireless communication and speech signal processing. In conventional VAD methods, an expression for the likelihood ratio test (LRT) based on statistical models is derived in the discrete Fourier transform (DFT) domain. Speech or noise is then decided by comparing the value of the expression with a threshold. This paper presents a new statistical VAD method based on a signal subspace approach. Probabilistic principal component analysis (PPCA) is employed to obtain a signal subspace model that incorporates a probabilistic model of the noisy signal into the signal subspace method. The proposed approach provides a novel decision rule based on the LRT in the signal subspace domain. Experimental results show that the proposed signal subspace model based VAD method outperforms those based on the widely used Gaussian distribution in the DFT domain.
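
The conventional DFT-domain decision that the abstract describes (compute a likelihood ratio, compare it with a threshold) can be sketched as below. This uses the commonly cited Gaussian-model form of the per-bin log likelihood ratio with a maximum-likelihood estimate of the a priori SNR; it is an assumption that this matches the baseline used in the paper, and it is not the proposed PPCA subspace method. The function names and the threshold value are illustrative.

```python
import math


def log_likelihood_ratio(noisy_power, noise_var):
    """Mean per-bin log likelihood ratio for one frame.

    noisy_power: |X_k|^2 for each DFT bin of the frame.
    noise_var:   estimated noise variance for each bin (must be > 0).
    """
    total = 0.0
    for p, lam in zip(noisy_power, noise_var):
        gamma = p / lam                # a posteriori SNR
        xi = max(gamma - 1.0, 0.0)     # ML estimate of the a priori SNR
        total += gamma * xi / (1.0 + xi) - math.log(1.0 + xi)
    return total / len(noisy_power)


def is_speech(noisy_power, noise_var, threshold=1.0):
    """Declare speech when the averaged log likelihood ratio exceeds the threshold."""
    return log_likelihood_ratio(noisy_power, noise_var) > threshold
```

A frame whose bin powers equal the noise variance gives a ratio of zero and is classified as noise, while a frame well above the noise floor is classified as speech.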

A Post-processing for Binary Mask Estimation Toward Improving Speech Intelligibility in Noise (잡음환경 음성명료도 향상을 위한 이진 마스크 추정 후처리 알고리즘)

  • Kim, Gibak
    • Journal of Broadcast Engineering / v.18 no.2 / pp.311-318 / 2013
  • This paper deals with a noise reduction algorithm that uses binary masking in the time-frequency domain. To improve speech intelligibility in noise, noise-masked speech is decomposed into time-frequency units, and mask "0" is assigned to masker-dominant regions, removing the time-frequency units where noise is dominant compared to speech. In previous research, Gaussian mixture models were used to classify the speech-dominant and noise-dominant regions, which correspond to mask "1" and mask "0", respectively. In each frequency band, data were collected and used to train the Gaussian mixture models, and a detection procedure was applied to the test data to decide whether each time-frequency unit belongs to the speech-dominant or noise-dominant region. In this paper, we consider the correlation of masks in the frequency domain and propose a post-processing method that exploits the Viterbi algorithm.
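
The masking rule described above (mask "1" for speech-dominant units, mask "0" for noise-dominant units) can be illustrated with a small sketch. It assumes the speech and noise powers of each time-frequency unit are known, which corresponds to an ideal mask rather than the GMM-estimated mask or the Viterbi post-processing of the paper. The function names and the 0 dB default threshold are assumptions.

```python
import math


def binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Return a 0/1 mask per time-frequency unit.

    speech_power and noise_power are same-shaped lists of lists
    (frames x frequency bands) of non-negative powers.
    """
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0:
                row.append(0)      # no speech energy: treat as noise-dominant
            elif n <= 0:
                row.append(1)      # no noise energy: speech-dominant
            else:
                snr_db = 10 * math.log10(s / n)
                row.append(1 if snr_db > threshold_db else 0)
        mask.append(row)
    return mask


def apply_mask(noisy, mask):
    """Zero out the noise-dominant units of a noisy time-frequency representation."""
    return [[x * m for x, m in zip(x_row, m_row)] for x_row, m_row in zip(noisy, mask)]
```

In practice the true speech and noise powers are unavailable at test time, which is why the mask has to be estimated by a classifier.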

Implementation of Real-Time Adaptive Noise Cancellation System Using DSP Processor (DSP 프로세서를 이용한 실시간 ANC 시스템 구현에 관한 연구)

  • Lee Young Il;Choi Hong Sub
    • MALSORI / no.52 / pp.121-132 / 2004
  • This paper aims at a real-time implementation of an adaptive noise cancellation (ANC) system using a DSP processor. The ACHARF algorithm, which guarantees stability and fast convergence by means of an adaptive compensator, is used on this DSP system. For the experiments, the TLV320AIC23 stereo CODEC from TI Inc. is used with a TMS320C6413 DSP processor. The primary and reference input signals are obtained from two microphones. The primary input is the voice plus noise signal, and the reference input is white noise or real noise. The experimental results show that the ANC system using a DSP processor with ACHARF is an effective speech enhancement method for various speech processing units.
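
The two-input structure described above (a primary input of voice plus noise and a reference input of noise) can be sketched with a normalized LMS (NLMS) FIR filter. NLMS is swapped in here purely for illustration; it is not the ACHARF algorithm used in the paper. The function names, tap count, step size, and the synthetic signals are all assumptions.

```python
import math
import random


def nlms_cancel(primary, reference, num_taps=4, mu=0.1, eps=1e-8):
    """Subtract an adaptively filtered reference from the primary input.

    Returns the error signal, which serves as the enhanced output.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps      # recent reference samples, newest first
    output = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - noise_estimate
        norm = eps + sum(h * h for h in history)
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        output.append(e)
    return output


def make_demo_signals(n=4000, seed=0):
    """Synthetic data: a sine 'voice' plus reference noise passed through a short FIR path."""
    rng = random.Random(seed)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    voice = [0.5 * math.sin(2 * math.pi * 0.01 * t) for t in range(n)]
    primary = [
        v + 0.6 * reference[t] + (0.3 * reference[t - 1] if t > 0 else 0.0)
        for t, v in enumerate(voice)
    ]
    return voice, primary, reference
```

Calling `nlms_cancel(primary, reference)` on the synthetic signals returns an output in which the residual noise shrinks as the filter weights adapt toward the assumed noise path.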

Development of Realtime Phonetic Typewriter (실시간 음성타자 시스템 구현)

  • Cho, W.Y.;Choi, D.I.
    • Proceedings of the KIEE Conference / 1999.11c / pp.727-729 / 1999
  • We have developed a real-time phonetic typewriter implemented on an IBM PC with a sound card, running Windows 95. In this system, speech signal analysis, neural network learning, output neuron labeling, and visualization of recognition results are performed in real time. The development environment for speech processing is established by adding various functions, such as editing, saving, and loading of speech data and 3-D or gray-level display of spectrograms. Recognition experiments using Korean phones yielded an accuracy of 71.42% for 13 basic consonants and 90.01% for 7 basic vowels.
