• Title/Abstract/Keywords: Speech processing strategy

16 search results (processing time: 0.02 s)

Performance Evaluation of Cochlear Implants Speech Processing Strategy Using Neural Spike Train Decoding

  • 김두희;김진호;김경환
    • The Korean Society of Medical and Biological Engineering: Journal of Biomedical Engineering Research / Vol. 28, No. 2 / pp. 271-279 / 2007
  • We propose a novel method for evaluating cochlear implant (CI) speech processing strategies based on neural spike train decoding. From the formant trajectories of input speech and the auditory nerve responses to the electrical pulse trains generated by a specific CI speech processing strategy, an optimal linear decoding filter was obtained and used to estimate the formant trajectory of incoming speech. The performance of a strategy is evaluated by comparing the true and estimated formant trajectories. We compared a newly developed strategy, which mimics the auditory periphery more closely by using a nonlinear time-varying filter, with a conventional linear-filter-based strategy. The formant trajectories were estimated more accurately with the nonlinear time-varying strategy. The advantage was more prominent when the background noise level was high and the spectral characteristics of the noise were close to those of speech. This confirms the superiority observed with other evaluation methods, such as acoustic simulation and spectral analysis.

A Simulation Study on Improvements of Speech Processing Strategy of Cochlear Implants Using Adaptation Effect of Inner Hair Cell and Auditory Nerve Synapse

  • 김진호;김경환
    • The Korean Society of Medical and Biological Engineering: Journal of Biomedical Engineering Research / Vol. 28, No. 2 / pp. 205-211 / 2007
  • A novel envelope extraction algorithm for the speech processor of cochlear implants, called the adaptation algorithm, was developed based on the adaptation effect of the inner hair cell (IHC)/auditory nerve (AN) synapse. We performed acoustic simulations and hearing experiments with 12 normal-hearing subjects to compare the adaptation algorithm with the existing standard envelope extraction method. The results show that the speech processing strategy using the adaptation algorithm significantly improved the speech recognition rate under most channel/noise conditions compared to the conventional strategy. We verified that the proposed adaptation algorithm may yield better speech perception under a considerable amount of noise than the conventional speech processing strategy.
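
The abstract above does not spell out the "standard envelope extraction method" or the adaptation algorithm itself. As a rough illustration of the conventional baseline only, the sketch below assumes a common textbook approach: full-wave rectification followed by a low-pass filter. The function name `envelope`, the one-pole filter, and the 400 Hz cutoff are all assumptions made for this example, not details taken from the paper.

```python
import math

def envelope(x, fs, cutoff_hz=400.0):
    """Full-wave rectify x, then smooth it with a one-pole low-pass filter.

    x         -- list of samples from one analysis channel
    fs        -- sampling rate in Hz
    cutoff_hz -- smoothing cutoff (assumed value, not from the paper)
    """
    # One-pole smoothing coefficient derived from the cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, state = [], 0.0
    for sample in x:
        # Move the filter state toward the rectified sample.
        state += alpha * (abs(sample) - state)
        env.append(state)
    return env
```

For a steady tone, the output settles around the mean of the rectified waveform with a small ripple. An adaptation-based scheme would replace this stage with one that emphasizes onsets.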

Improved Melody Recognition Performance of a Cochlear Implant Speech Processing Strategy Using Instantaneous Frequency Encoding Based on Teager Energy Operator

  • Choi, Sung-Jin;Ryu, Sang-Baek;Kim, Kyung-Hwan
    • The Korean Society of Medical and Biological Engineering: Journal of Biomedical Engineering Research / Vol. 31, No. 6 / pp. 417-426 / 2010
  • We present a speech processing strategy incorporating instantaneous frequency (IF) encoding to enhance the melody recognition performance of cochlear implants. For IF extraction from incoming sound, we propose the use of a Teager energy operator (TEO), which is advantageous for its lower computational load. From time-frequency analysis, we verified that the TEO-based method provides proper IF encoding of input sound, which is crucial for melody recognition. A similar benefit could also be obtained with a Hilbert transform (HT), but at a much higher computational cost. The melody recognition performance of the proposed speech processing strategy was compared with that of a conventional strategy using envelope extraction and that of HT-based IF encoding. Hearing tests on normal subjects were performed using acoustic simulation and a musical contour identification task. No significant difference in melody recognition performance was observed between the TEO-based and HT-based IF encodings, and both were superior to the conventional strategy. However, the TEO-based strategy was advantageous in that it was approximately 35% faster than the HT-based strategy.
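
To make the TEO-based IF extraction mentioned above concrete, here is a minimal sketch. It uses the discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], together with the DESA-2 energy separation formula. The abstract does not say which TEO-based estimator the authors used, so DESA-2 and the function names `teager` and `desa2_frequency` are assumptions chosen for illustration.

```python
import math

def teager(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def desa2_frequency(x, fs):
    """Estimate instantaneous frequency in Hz with DESA-2 (valid below fs/4)."""
    # Symmetric difference y[n] = x[n+1] - x[n-1], defined for n = 1 .. len(x)-2.
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_x = teager(x)  # psi_x[k] belongs to sample n = k + 1
    psi_y = teager(y)  # psi_y[k] belongs to sample n = k + 2
    freqs = []
    for k, py in enumerate(psi_y):
        px = psi_x[k + 1]  # align both energies at sample n = k + 2
        if px <= 0.0:
            # The energy estimate is unusable here (e.g. silence or noise).
            freqs.append(float("nan"))
            continue
        # For a sinusoid, 1 - psi_y / (2 * psi_x) equals cos(2 * omega).
        c = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        omega = 0.5 * math.acos(c)  # radians per sample
        freqs.append(omega * fs / (2.0 * math.pi))
    return freqs
```

Each output value needs only a handful of multiplications on neighboring samples, which is the kind of low per-sample cost the abstract contrasts with a Hilbert-transform approach. For a pure tone, the estimate stays at the tone frequency.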

Simulation of speech processing and coding strategy for cochlear implants

  • 김영훈;박광석
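    • The Korean Society of Medical and Biological Engineering: Conference Proceedings / 1991 Fall Conference of the Korean Society of Medical and Biological Engineering / pp. 30-33 / 1991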
  • The objective of a speech processor for cochlear implants is to deliver speech information to the central nervous system. In this study, we present a method that simulates speech processing and coding strategies for cochlear implants, and we tested two simulated processing methods on 12 adults with normal hearing. Formant sinusoidal coding was better than formant pulse coding in the consonant perception test and in learning effects (p < 0.05).


Performance Evaluation of Speech Onset Representation Characteristic of Cochlear Implants Speech Processor using Spike Train Decoding

  • 김두희;김진호;김경환
    • The Korean Society of Medical and Biological Engineering: Journal of Biomedical Engineering Research / Vol. 28, No. 5 / pp. 694-702 / 2007
  • The adaptation effect originating from the chemical synapse between the auditory nerve and the inner hair cell is advantageous for the accurate representation of temporal cues of incoming speech, such as speech onset. It is therefore expected that modifying conventional cochlear implant (CI) speech processing strategies to incorporate the adaptation effect will considerably improve speech perception performance, such as the consonant perception score. Our purpose in this paper was to evaluate our new CI speech processing strategy incorporating the adaptation effect by observing auditory nerve responses. By classifying the presence or absence of speech from the auditory nerve responses, i.e., spike trains, we could quantitatively compare the speech onset detection performance of the conventional and improved strategies. We verified the effectiveness of the adaptation effect in improving the speech onset representation characteristics.

A Research on Speech Processing and Coding Strategy for Cochlear Implants

  • 채대곤;변정근;최두일;백승화;박상희
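    • The Korean Society of Medical and Biological Engineering: Conference Proceedings / 1993 Fall Conference of the Korean Society of Medical and Biological Engineering / pp. 175-179 / 1993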
  • A speech processing and coding strategy for cochlear implants was studied in order to create a speech signal processing system that extracts stimulus parameters including formants, pitch, and amplitude information. In this study, we present a method that extracts characteristic information from the speech signal and adapts it for patients with hearing impairment.


Speech processing strategy and executive function: Korean children's stop perception

  • Kong, Eun Jong;Yoo, Jeewon
    • Phonetics and Speech Sciences / Vol. 9, No. 3 / pp. 57-65 / 2017
  • The current study explored how Korean-speaking children processed multiple acoustic cues (VOT and f0) for the stop laryngeal contrast (/t'/, /t/, and /$t^h$/) and examined whether individual perceptual strategies could be related to the general cognitive ability to perform executive functions (EF). Fifteen children (aged 7 to 8) participated in a speech perception task, identifying the three Korean laryngeal stops (3AFC) while listening to auditory stimuli of C-/a/ with synthetically varied VOT and f0. They completed a series of EF tasks measuring working memory, inhibition, and cognitive shifting ability. The findings showed that the children used the two cues in a highly correlated manner. While the children used VOT consistently for the three laryngeal categories, their use of f0 was either reduced or enhanced depending on the phonetic category. Importantly, the children's strategy of suppressing f0 for the tense-aspirated contrast was meaningfully associated with better cognitive abilities such as working memory, inhibition, and attentional shifting. As a preliminary experimental investigation, the current research demonstrated that listeners with inefficient processing strategies were poor at EF skills, suggesting that cognitive skills might be responsible for developmental variation in processing sub-phonemic information for the linguistic contrast.

On-Line Linear Combination of Classifiers Based on Incremental Information in Speaker Verification

  • Huenupan, Fernando;Yoma, Nestor Becerra;Garreton, Claudio;Molina, Carlos
    • ETRI Journal / Vol. 32, No. 3 / pp. 395-405 / 2010
  • A novel multiclassifier system (MCS) strategy is proposed and applied to a text-dependent speaker verification task. The presented scheme optimizes the linear combination of classifiers on an on-line basis. In contrast to ordinary MCS approaches, neither a priori distributions nor pre-tuned parameters are required. The idea is to improve the most accurate classifier by making use of the incremental information provided by the second classifier. The on-line multiclassifier optimization approach is applicable to any pattern recognition problem. The proposed method needs neither a priori distributions nor pre-estimated weights, and makes no assumptions about training/testing matching conditions. Results with the Yoho database show that the presented approach can lead to reductions in equal error rate as high as 28% when compared with the most accurate classifier, and 11% against a standard method for the optimization of linear combinations of classifiers.
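
As a small illustration of the two quantities this abstract revolves around, the sketch below shows a fixed-weight linear combination of two classifiers' scores and an approximate equal error rate (EER) computed by sweeping a decision threshold. It is not the paper's on-line optimization, which the abstract does not describe in detail. The function names `combine` and `equal_error_rate`, the fixed weight `w`, and the convention that higher scores mean "accept" are assumptions made for this example.

```python
def combine(scores_a, scores_b, w):
    """Linearly combine two classifiers' scores with weight w on the first."""
    return [w * a + (1.0 - w) * b for a, b in zip(scores_a, scores_b)]

def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping the threshold over all observed scores.

    genuine  -- scores of true-speaker trials (higher means accept)
    impostor -- scores of impostor trials
    Returns the mean of the false-acceptance and false-rejection rates at
    the threshold where the two rates are closest.
    """
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)     # true speakers rejected
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer
```

A combination is useful when `equal_error_rate` on the combined scores is lower than on either classifier alone. An on-line scheme would choose `w` per trial rather than fixing it in advance.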

A Robust Method for Speech Replay Attack Detection

  • Lin, Lang;Wang, Rangding;Yan, Diqun;Dong, Li
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 1 / pp. 168-182 / 2020
  • Spoofing attacks, especially replay attacks, pose great security challenges to automatic speaker verification (ASV) systems. Current work on replay attack detection has primarily focused on either developing new features or improving classifier performance, ignoring the effects of feature variability, e.g., channel variability. In this paper, we first establish a mathematical model for replay speech and introduce a method for eliminating the negative interference of the channel. Then a novel feature is proposed to detect replay attacks. To further boost the detection performance, four post-processing methods using normalization techniques are investigated. We evaluate our proposed method on the ASVspoof 2017 dataset. The experimental results show that our approach outperforms the competing methods in terms of detection accuracy. More interestingly, we find that the proposed normalization strategy can also improve the performance of existing algorithms.

Intonational Pattern Frequency of Seoul Korean and Its Implication to Word Segmentation

  • Kim, Sa-Hyang
    • Speech Sciences / Vol. 15, No. 2 / pp. 21-30 / 2008
  • The current study investigated distributional properties of the Korean Accentual Phrase (AP) and their implications for word segmentation. The properties examined were the frequency of various AP tonal patterns, the types of tonal patterns imposed on content words, and the average number and temporal location of content words within the AP. A total of 414 sentences from the Read speech corpus and the Radio corpus were used for the data analysis. The results showed that 84% of the APs contained one content word, and that almost 90% of the content words were located in AP-initial position. When the AP-initial onset was not an aspirated or tense consonant, the most common AP patterns were LH, LHH, and LHLH (78%), and 88% of the multisyllabic content words started with a rising tone in AP-initial position. When the AP-initial onset was an aspirated or tense consonant, the most common AP patterns were HH, HHLH, and HHL (72%), and 74% of the multisyllabic content words started with a level H tone in AP-initial position. The data further showed that 84.1% of APs ended with a final H tone. The findings provide valuable information about the prosodic pattern and structure of Korean APs, and account for the results of a previous study which showed that Korean listeners are sensitive to AP-initial rising and AP-final high tones (Kim, 2007). This is in line with other cross-linguistic research which has revealed a correlation between prosodic probability and speech processing strategy.
