• Title/Summary/Keyword: Auditory model


Speech Recognition Performance Improvement using Gamma-tone Feature Extraction Acoustic Model (감마톤 특징 추출 음향 모델을 이용한 음성 인식 성능 향상)

  • Ahn, Chan-Shik;Choi, Ki-Ho
    • Journal of Digital Convergence
    • /
    • Vol. 11 No. 7
    • /
    • pp.209-214
    • /
    • 2013
  • To improve the recognition performance of speech recognition systems, a method that incorporates human listening ability into the system was adopted: in noisy environments, the speech signal is separated from the noise and the desired speech signal is selected. In practice, however, recognition performance degrades because environmental changes caused by noise make speech detection inaccurate and create a mismatch with the trained model. This paper proposes gamma-tone feature extraction combined with an acoustic learning model to improve speech recognition. The proposed method reflects human auditory perception by applying auditory scene analysis in feature extraction and in training the recognition model. In a performance evaluation in noisy environments, removing noise from signals at -10 dB and -5 dB yielded SNR improvements of 3.12 dB and 2.04 dB, respectively.
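
    The abstract above gives no implementation details; purely as an illustration of what gamma-tone feature extraction involves, the following NumPy sketch builds an ERB-spaced gammatone filterbank (4th-order FIR approximation) and returns log band energies per frame. The function names, band count, and frame/hop lengths are assumptions for illustration, not the authors' configuration.

    ```python
    import numpy as np

    def erb(f):
        """Equivalent rectangular bandwidth (Glasberg & Moore, 1990)."""
        return 24.7 * (4.37 * f / 1000.0 + 1.0)

    def erb_space(f_low, f_high, n):
        """Center frequencies equally spaced on the ERB-rate scale."""
        rate = lambda f: 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)
        pts = np.linspace(rate(f_low), rate(f_high), n)
        return (10 ** (pts / 21.4) - 1.0) * 1000.0 / 4.37

    def gammatone_features(x, fs, n_bands=32, frame_len=0.025, hop=0.010):
        """Log band energies from a 4th-order gammatone filterbank (FIR approximation)."""
        t = np.arange(0, 0.064, 1.0 / fs)            # 64 ms impulse responses
        flen, fhop = int(frame_len * fs), int(hop * fs)
        feats = []
        for fc in erb_space(100.0, 0.45 * fs, n_bands):
            b = 1.019 * erb(fc)                      # bandwidth of this channel
            g = t ** 3 * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
            g /= np.sqrt(np.sum(g ** 2))             # unit-energy normalization
            y = np.convolve(x, g, mode="same")
            n_frames = 1 + (len(y) - flen) // fhop
            energies = [np.log(np.sum(y[i * fhop:i * fhop + flen] ** 2) + 1e-12)
                        for i in range(n_frames)]
            feats.append(energies)
        return np.array(feats).T                     # shape: (frames, bands)

    if __name__ == "__main__":
        fs = 16000
        x = np.random.randn(fs)                      # 1 s of noise as a stand-in signal
        print(gammatone_features(x, fs).shape)
    ```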

Effects of vowel types and sentence positions in standard passage on auditory and cepstral and spectral measures in patients with voice disorders (모음 유형과 표준문단의 문장 위치가 음성장애 환자의 청지각적 및 켑스트럼 및 스펙트럼 분석에 미치는 효과)

  • Mi-Hyeon Choi;Seong Hee Choi
    • Phonetics and Speech Sciences
    • /
    • Vol. 15 No. 4
    • /
    • pp.81-90
    • /
    • 2023
  • Auditory-perceptual assessment and acoustic analysis are commonly used in clinical practice for voice evaluation. This study explores the effects of speech task context on auditory-perceptual assessment and acoustic measures in patients with voice disorders. Sustained vowel phonations (/a/, /e/, /i/, /o/, /u/, /ɯ/, /ʌ/) and connected speech (the standardized paragraph 'kaeul' and its nine sub-sentences) were obtained from 22 patients with voice disorders. GRBAS ('G', 'R', 'B', 'A', 'S') and CAPE-V ('OS', 'R', 'B', 'S', 'P', 'L') auditory-perceptual ratings were made by two certified speech-language pathologists specializing in voice disorders, using blinded and randomized voice samples. In addition, spectral and cepstral measures were analyzed with the Analysis of Dysphonia in Speech and Voice (ADSV) model. Voice quality ratings on the GRBAS scale were not significantly affected by vowel type except for 'B', whereas 'OS', 'R', and 'B' in CAPE-V were affected by vowel type (p<.05). Measurements of CPP and the L/H ratio were also influenced by vowel type and sentence position. CPP values for the standard paragraph showed significant negative correlations with all vowels, with the highest correlation observed for the /e/ vowel (r=-.739), and the CPP of the second sentence had the strongest correlation with all vowels. Depending on the speech stimulus, CAPE-V may be affected more than GRBAS in auditory-perceptual assessment, and vowel type and sentence position with consonants influenced the 'B' scale, CPP, and the L/H ratio. When using vowels in the voice assessment of patients with voice disorders, it would be beneficial to use not only /a/ but also /i/, which is acoustically highly correlated with 'breathy'; the /e/ vowel was highly correlated acoustically with the standardized passage and its sub-sentences. Furthermore, given that most dysphonic signals are aperiodic, the second sentence of the 'kaeul' passage, which is the most acoustically correlated with all vowels, can be used with CPP. These results provide clinical evidence of the impact of speech tasks on auditory-perceptual and acoustic measures and may help to provide guidelines for voice evaluation in patients with voice disorders.
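
    ADSV computes CPP and the L/H spectral ratio with its own validated algorithms; purely to illustrate what these measures capture, here is a simplified NumPy sketch (whole-signal, unsmoothed cepstrum, regression fitted only over the searched quefrency range). It will not reproduce ADSV values exactly, and the function names and defaults are illustrative assumptions.

    ```python
    import numpy as np

    def cpp(x, fs, f0_range=(60.0, 300.0)):
        """Simplified cepstral peak prominence (dB): height of the cepstral peak
        above a regression line fitted over the searched quefrency range."""
        x = x * np.hamming(len(x))
        n = len(x)
        log_spec = 20 * np.log10(np.abs(np.fft.fft(x, n)) + 1e-12)
        ceps = np.real(np.fft.ifft(log_spec))               # real cepstrum, dB units
        quef = np.arange(n) / fs                             # quefrency in seconds
        idx = np.where((quef >= 1.0 / f0_range[1]) & (quef <= 1.0 / f0_range[0]))[0]
        peak = idx[np.argmax(ceps[idx])]
        slope, intercept = np.polyfit(quef[idx], ceps[idx], 1)
        return ceps[peak] - (slope * quef[peak] + intercept)

    def lh_ratio(x, fs, cutoff=4000.0):
        """Low/high spectral ratio (dB): energy below vs. above the cutoff frequency."""
        spec = np.abs(np.fft.rfft(x * np.hamming(len(x)))) ** 2
        freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
        return 10 * np.log10((spec[freqs < cutoff].sum() + 1e-12) /
                             (spec[freqs >= cutoff].sum() + 1e-12))
    ```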

Development of 3D cochlear model to evaluate ECAP (ECAP 평가를 위한 3차원 달팽이관 모델 개발)

  • Kang, Soojin;Woo, Jihwan
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • Vol. 50 No. 6
    • /
    • pp.287-293
    • /
    • 2013
  • A cochlear implant (CI) is an auditory prosthesis that delivers electrical stimulation through electrodes inserted into the cochlea. To evaluate CI performance, it is important to understand how the auditory nerve responds to electrical stimulation; in the clinic, the electrically evoked compound action potential (ECAP) is measured for this purpose. In this study, we developed a 3D finite element (FE) cochlear model to simulate the ECAP in response to electrical stimulation. The model produced ECAPs similar to those measured in animal experiments and in the clinic. This 3D FE cochlear model could be used in studies of electrical stimulation methods to improve CIs by analyzing neural responses to electrical stimulation.
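
    The paper's 3D finite element model cannot be reproduced from the abstract alone; a much simpler and widely used approximation is to treat the ECAP as the convolution of a single-fiber "unitary response" with the firing-latency distribution of the excited nerve population. The sketch below follows that simplified idea with entirely made-up waveform and latency parameters, not values from the paper.

    ```python
    import numpy as np

    fs = 100_000                                   # 100 kHz simulated recording rate
    t = np.arange(0, 3e-3, 1.0 / fs)               # 3 ms analysis window

    # Illustrative biphasic unitary response (single-fiber contribution, arbitrary shape).
    tau = 0.25e-3
    ur = 1e-7 * (t / tau) * np.exp(1 - t / tau) * np.sin(2 * np.pi * t / (4 * tau))

    # Assumed Gaussian spread of spike latencies across 5000 excited fibers.
    rng = np.random.default_rng(0)
    latencies = rng.normal(loc=0.7e-3, scale=0.15e-3, size=5000)
    hist, _ = np.histogram(latencies, bins=len(t), range=(0, t[-1]))

    # ECAP approximation: population latency histogram convolved with the unitary response.
    ecap = np.convolve(hist, ur)[: len(t)]
    print(f"largest negative deflection at {t[np.argmin(ecap)] * 1e3:.2f} ms")
    ```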

Auditory Sensation by the Inserted "Electronic Cochlea" in the Cases of the Experimentally Destroyed Receptor Organ of Corti of the Cat (와우 수용기 모세포를 파괴한 가묘의 "전기와우" 삽입에 의한 "청각감")

  • 장인원;김성남;양한모;정규화;최윤호;정종진;조용범;국태진;이정헌
    • Proceedings of the KOR-BRONCHOESO Conference
    • /
    • 13th Academic Conference of the Korean Broncho-Esophagological Society, 1979: Program and Abstracts
    • /
    • pp.4.3-4
    • /
    • 1979
  • In cats whose receptor organ of Corti had been experimentally destroyed, an "electronic cochlea" was inserted near the auditory neurons through the scala tympani as the input of the inner device, and the outer device was placed near the receiver of an audiometer. While noise was presented through the outer device, kymographic records were obtained, showing the following: 1) The correlation between increasing intensity and amplitude showed parallel responses. 2) The auricular reflex to repeated sound stimulation first increased considerably and then decreased. 3) In these experimental animals, an absolutely non-responsive period, a relatively non-responsive period, and a responsive period were observed. 4) The above reflexes indicate that "auditory sensation" can be induced by an inserted "electronic cochlea" in cats with an experimentally destroyed receptor organ of Corti.


A Comparison of Front-Ends for Robust Speech Recognition

  • Kim, Doh-Suk;Jeong, Jae-Hoon;Lee, Soo-Young;Kil, Rhee M.
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 17 No. 3E
    • /
    • pp.3-11
    • /
    • 1998
  • The Zero-Crossings with Peak Amplitudes (ZCPA) model, motivated by the human auditory periphery, was proposed to extract reliable features from speech signals even in noisy environments for robust speech recognition. In this paper, the performance of the ZCPA model is further improved by incorporating conventional speech processing techniques into the model output. Spectral and cepstral representations of the ZCPA model output are compared, and the incorporation of dynamic features with several different lengths of time-derivative window is evaluated. Comparative evaluations against other front-ends in real-world noisy environments are also performed and show the superiority of the ZCPA model.
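
    For readers unfamiliar with ZCPA, the sketch below illustrates its core idea with NumPy/SciPy: within each band of a filterbank, the interval between successive upward zero-crossings gives a frequency estimate, which is accumulated into a histogram weighted by the logarithm of the intervening peak amplitude. The band layout, histogram bin edges, and filter order here are illustrative assumptions, not the authors' front-end.

    ```python
    import numpy as np
    from scipy.signal import butter, lfilter

    def zcpa_frame(frame, fs, n_bands=16, n_bins=16):
        """ZCPA-style spectral histogram for one analysis frame (simplified)."""
        band_edges = np.logspace(np.log10(100.0), np.log10(0.45 * fs), n_bands + 1)
        bin_edges = np.logspace(np.log10(100.0), np.log10(0.45 * fs), n_bins + 1)
        hist = np.zeros(n_bins)
        for lo, hi in zip(band_edges[:-1], band_edges[1:]):
            b, a = butter(2, [lo / (fs / 2), hi / (fs / 2)], btype="band")
            y = lfilter(b, a, frame)
            zc = np.where((y[:-1] < 0) & (y[1:] >= 0))[0]        # upward zero-crossings
            for i0, i1 in zip(zc[:-1], zc[1:]):
                f_est = fs / (i1 - i0)                            # inverse-interval frequency
                peak = np.max(np.abs(y[i0:i1]))
                k = np.searchsorted(bin_edges, f_est) - 1
                if 0 <= k < n_bins:
                    hist[k] += np.log1p(peak)                     # peak-amplitude weighting
        return hist

    if __name__ == "__main__":
        fs = 16000
        frame = np.sin(2 * np.pi * 440 * np.arange(1024) / fs)    # toy 440 Hz frame
        print(zcpa_frame(frame, fs))
    ```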


A Virtual Reality System for the Cognitive and Behavioral Assessment of Schizophrenia (정신분열병 환자의 인지적/행동적 특성평가를 위한 가상현실시스템 구현)

  • Cho, Won-Geun;Kim, Ho-Sung;Ku, Jung-Hun;Kim, Jae-Hun;Kim, Byoung-Nyun;Lee, Jang-Han;Kim, Sun I.
    • Proceedings of the Korean Society for Emotion and Sensibility Conference
    • /
    • Proceedings of the 2003 Spring Conference of the Korean Society for Emotion and Sensibility
    • /
    • pp.94-100
    • /
    • 2003
  • Patients with schizophrenia have thinking disorders such as delusions or hallucinations because they have a deficit in the ability to systematize and integrate information; as a result, they cannot integrate or systematize visual, auditory, and tactile stimuli. In this study we suggest a virtual reality system for assessing the cognitive ability of schizophrenia patients, based on the brain's multimodal integration model. The virtual reality system presents multimodal stimuli, such as visual and auditory stimuli, to the patient and can evaluate the patient's multimodal integration and working-memory integration abilities by having the patient interpret and react to multimodal stimuli that must be remembered for a given period of time. The clinical study showed that the virtual reality program developed here is comparable to the WCST and the SPM.


A study imitating human auditory system for tracking the position of sound source (인간의 청각 시스템을 응용한 음원위치 추정에 관한 연구)

  • Bae, Jeen-Man;Cho, Sun-Ho;Park, Chong-Kuk
    • Proceedings of the KIEE Conference
    • /
    • Proceedings of the 2003 KIEE Conference, Information and Control Section B
    • /
    • pp.878-881
    • /
    • 2003
  • To acquire a clear voice signal from a designated speaker with a surveillance camera, video-conferencing system, or hands-free microphone while suppressing interfering noise, the speaker's position must first be estimated automatically. The basic algorithm for estimating a sound source position measures the TDOA (Time Difference Of Arrival) of the same signal arriving at two microphones. This work uses the ADF (Adaptive Delay Filter) [4] and the CPS (Cross Power Spectrum) [5], two of the most important TDOA analysis methods, and proposes real-time sound source localization with an improved model, NI-ADF, which makes it possible to estimate the sound source direction on both sides of the microphone pair. NI-ADF draws on the observation that the human auditory system accepts a sound through activated nerves only when it exceeds a certain level at a given frequency, and it exploits the sound level difference caused by diffraction when the microphones are mounted on a physical system. When an existing bidirectional adaptive-filter algorithm localizes a sound source, the computation more than doubles compared with one-way estimation; the proposed algorithm overcomes this weakness and estimates the sound source position in both directions in real time.
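
    The NI-ADF algorithm itself is not specified in the abstract. As a baseline for the cross-power-spectrum side of the comparison, here is a common CPS-based TDOA estimator with PHAT weighting written in NumPy; the function names, the PHAT weighting, and the two-microphone geometry used to turn a delay into an angle are assumptions for illustration, not the paper's method.

    ```python
    import numpy as np

    SPEED_OF_SOUND = 343.0  # m/s at room temperature

    def tdoa_cps(x1, x2, fs, max_delay_s):
        """Estimate the time difference of arrival between two microphone signals
        from the (PHAT-weighted) cross-power spectrum."""
        n = len(x1) + len(x2)
        X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
        cps = X1 * np.conj(X2)
        cc = np.fft.irfft(cps / (np.abs(cps) + 1e-12), n)   # generalized cross-correlation
        max_shift = int(max_delay_s * fs)
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
        return (np.argmax(np.abs(cc)) - max_shift) / fs      # seconds; sign = which channel leads

    def direction_from_tdoa(tau, mic_spacing_m):
        """Far-field angle (radians) from the broadside of a two-microphone pair."""
        return np.arcsin(np.clip(SPEED_OF_SOUND * tau / mic_spacing_m, -1.0, 1.0))

    # toy usage: a 3-sample delay between two noise signals, 16 kHz, 20 cm spacing
    fs, d = 16000, 0.2
    rng = np.random.default_rng(1)
    s = rng.standard_normal(fs)
    tau = tdoa_cps(s, np.roll(s, 3), fs, max_delay_s=d / SPEED_OF_SOUND)
    print(tau, direction_from_tdoa(tau, d))
    ```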


Feline vocal communication

  • Tavernier, Chloe;Ahmed, Sohail;Houpt, Katherine Albro;Yeon, Seong Chan
    • Journal of Veterinary Science
    • /
    • Vol. 21 No. 1
    • /
    • pp.18.1-18.17
    • /
    • 2020
  • Cats vocalize to communicate with one another and to express their internal states. The vocal repertoire of the cat is wide, and up to 21 different vocalizations have been described in the literature, although it is more than likely that the repertoire contains additional types. An ethogram was created in this paper describing the currently known vocalizations of the domestic cat based on an auditory classification. The audiogram also allows a visual classification, which can increase the accuracy of vocalization differentiation. Classification can be risky, as it is sometimes unclear whether different types of vocalization are produced in different environments or whether a single type of vocalization is used with variation in its acoustic parameters; for example, isolation calls produced by kittens differ depending on the context. The environment has an important impact on vocal behaviour, and thus feral cats and pet cats vocalize differently. Pet cats are able to establish efficient communication with humans thanks to the flexibility of their vocalization behaviour. This review allowed us to create a simple model of the cat's vocal repertoire.

Autonomic, Respiratory and Subjective Effects of Long-term Exposure to Aversive Loud Noise : Tonic Effects in Accumulated Stress Model

  • Sohn, Jin-Hun;Sokhadze, Estate;Choi, Sang-Sup;Lee, Kyung-Hwa
    • Science of Emotion and Sensibility
    • /
    • Vol. 2 No. 2
    • /
    • pp.37-42
    • /
    • 1999
  • Long-term exposure to loud noise affects performance, since it changes arousal level, distracts attention, and can also evoke subjective stress accompanied by negative emotional states. The purpose of this study was to analyze the dynamics of subjective and physiological variables during a relatively long (30 min) exposure to white noise (85 dB[A]). Physiological signals were recorded from 15 college students during 30 min of intense auditory stimulation. Autonomic variables, namely skin conductance level, the number of non-specific SCRs, inter-beat intervals in the ECG, a heart rate variability index (the HF/LF ratio of HRV), and skin temperature, as well as respiration rate, were analyzed on a 5-min epoch basis. Psychological assessment (subjective rating of stress level) was also repeated every 5 min. Statistical analysis was employed to trace the time course of the subjective and autonomic physiological variables and their relationships. Results showed that the intense noise evoked subjective stress as well as the associated autonomic nervous system responses; however, the physiological variables underwent specific changes over the course of exposure to the loud white noise. Probable psychophysiological mechanisms mediating reactivity to long-term high-intensity auditory stimulation are discussed, namely short-term activation, followed by transient adaptation (with a relatively stable autonomic balance) and then a subsequent wave of arousal due to tonic sympathetic dominance.
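
    For readers who want to compute the HRV index mentioned above, the following sketch derives an HF/LF ratio from a series of inter-beat intervals by resampling the tachogram evenly and integrating the standard LF (0.04-0.15 Hz) and HF (0.15-0.40 Hz) bands of a Welch spectrum. The resampling rate, band limits, and function name are conventional choices, not taken from the paper.

    ```python
    import numpy as np
    from scipy.signal import welch

    def hf_lf_ratio(ibi_s, fs_resample=4.0):
        """HF/LF ratio of heart rate variability from inter-beat intervals (seconds)."""
        t_beats = np.cumsum(ibi_s)                                 # beat occurrence times
        t_even = np.arange(t_beats[0], t_beats[-1], 1.0 / fs_resample)
        tachogram = np.interp(t_even, t_beats, ibi_s)              # evenly resampled IBIs
        tachogram = tachogram - np.mean(tachogram)
        f, pxx = welch(tachogram, fs=fs_resample, nperseg=min(256, len(tachogram)))
        df = f[1] - f[0]
        lf = np.sum(pxx[(f >= 0.04) & (f < 0.15)]) * df            # low-frequency power
        hf = np.sum(pxx[(f >= 0.15) & (f < 0.40)]) * df            # high-frequency power
        return hf / (lf + 1e-12)

    # toy usage: ~5 minutes of ~70 bpm beats with a 0.25 Hz respiratory modulation
    rng = np.random.default_rng(2)
    beat_times = np.arange(350) * 0.86
    ibi = 0.86 + 0.05 * np.sin(2 * np.pi * 0.25 * beat_times) + 0.01 * rng.standard_normal(350)
    print(hf_lf_ratio(ibi))
    ```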


MEG Measurement Using a 40-channel SQUID System (40 채널 SQUID 시스템을 이용한 뇌자도 측정)

  • Kwon, H.;Lee, Y.H.;Kim, J.M.;Kim, K.W.;Park, Y.K.
    • Progress in Superconductivity
    • /
    • Vol. 4 No. 1
    • /
    • pp.19-26
    • /
    • 2002
  • We have previously developed a 40-channel SQUID system. An important figure of merit of an MEG system is the localization error, within which the underlying current source can be localized. With this system, we investigated the localization error in terms of the standard deviation of the coordinates of the ECDs and the systematic error due to inadequate modeling. To do this, we localized single current dipoles from the tangential components of auditory evoked fields. Equivalent current dipoles (ECDs) at the N1m peak were estimated based on a locally fitted spherical conductor model. In addition, we performed skull-phantom and simulation measurements to investigate the contribution of various error sources to the localization error. Background noise was found to be the main source of error and could explain the observed standard deviation, and the systematic error introduced by modeling the head as a spherical conductor was much smaller than the standard deviation due to the background noise. We also demonstrated the performance of the system by measuring the evoked fields to grammatical violations in sentence comprehension.
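
    The abstract describes ECD fitting under a spherical conductor model without giving equations. As a minimal sketch of the standard approach for radial sensors, the code below uses the fact that the radial field component outside a spherical conductor depends only on the primary current dipole, and fits dipole position and moment by least squares. The sensor geometry, noise level, and all numerical values are illustrative assumptions, not the system described in the paper.

    ```python
    import numpy as np
    from scipy.optimize import least_squares

    MU0_4PI = 1e-7  # mu0 / (4*pi), SI units

    def radial_field(r_sens, r_dip, q):
        """Radial B-field of a current dipole q (A*m) at r_dip, seen by sensors at
        r_sens, outside a spherical conductor centred at the origin (volume currents
        do not contribute to the radial component)."""
        a = r_sens - r_dip                                          # (n_sensors, 3)
        b = MU0_4PI * np.cross(q, a) / np.linalg.norm(a, axis=1, keepdims=True) ** 3
        r_hat = r_sens / np.linalg.norm(r_sens, axis=1, keepdims=True)
        return np.sum(b * r_hat, axis=1)

    def fit_ecd(r_sens, b_meas, x0):
        """Least-squares ECD fit: position in metres, moment in nA*m (for scaling)."""
        def residual(p):
            return (radial_field(r_sens, p[:3], p[3:] * 1e-9) - b_meas) * 1e15  # in fT
        sol = least_squares(residual, x0)
        return sol.x[:3], sol.x[3:] * 1e-9

    # toy usage: 40 radial sensors on a 12 cm hemisphere, one simulated dipole + noise
    rng = np.random.default_rng(0)
    theta, phi = rng.uniform(0, np.pi / 2, 40), rng.uniform(0, 2 * np.pi, 40)
    r_sens = 0.12 * np.c_[np.sin(theta) * np.cos(phi),
                          np.sin(theta) * np.sin(phi),
                          np.cos(theta)]
    b = radial_field(r_sens, np.array([0.0, 0.02, 0.06]), np.array([10e-9, 0.0, 0.0]))
    b += rng.normal(0.0, 5e-15, 40)                                 # 5 fT sensor noise
    pos, mom = fit_ecd(r_sens, b, x0=np.r_[0.0, 0.0, 0.05, 1.0, 1.0, 1.0])
    print(pos, mom)
    ```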
