Search | Korea Science

A Study on the Optimal Mahalanobis Distance for Speech Recognition

Lee, Chang-Young
- Speech Sciences / v.13 no.4 / pp.177-186 / 2006
To improve feature vector classification and thereby reduce the recognition error rate of speaker-independent speech recognition, we employ the Mahalanobis distance as the similarity measure between feature vectors. The metric matrix of the Mahalanobis distance is assumed to be diagonal in order to reduce memory use and computation time. We propose that the diagonal elements be given in terms of the variations of the feature vector components. Geometrically, this prescription tends to redistribute the data set into the shape of a hypersphere in the feature vector space. The idea is applied to speech recognition by a hidden Markov model with fuzzy vector quantization. The results show that recognition improves with an appropriate choice of the relevant adjustable parameter. The Viterbi score difference between the two winners in the recognition test shows general behavior in accord with that of the recognition error rate.
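As a rough illustration of the diagonal-metric idea in the abstract above, the following Python sketch scales each squared component difference by that component's variance raised to a power. The abstract does not state the actual formula, so the function names, the use of variance as the measure of "variation", and the exponent `alpha` standing in for the "adjustable parameter" are all assumptions made here for illustration.

```python
import math


def component_variances(vectors):
    """Per-component (population) variance of equal-length feature vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [sum((v[i] - means[i]) ** 2 for v in vectors) / n for i in range(dim)]


def diagonal_mahalanobis(x, y, variances, alpha=1.0):
    """Distance under a diagonal metric matrix.

    Assumption: the i-th diagonal element is variances[i] ** (-alpha).
    alpha = 0 reduces to the Euclidean distance; alpha = 1 is the usual
    variance-normalized (diagonal Mahalanobis) distance.
    """
    total = 0.0
    for a, b, var in zip(x, y, variances):
        total += (a - b) ** 2 / (var ** alpha)
    return math.sqrt(total)


# Toy data: the second component varies far more than the first.
training = [[0.0, 0.0], [2.0, 20.0]]
variances = component_variances(training)  # [1.0, 100.0]
euclidean = diagonal_mahalanobis(training[0], training[1], variances, alpha=0.0)
normalized = diagonal_mahalanobis(training[0], training[1], variances, alpha=1.0)
```

With `alpha = 1.0` both components contribute equally to the distance, which matches the abstract's picture of the data being redistributed toward a hypersphere.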

Designing of efficient super-wide bandwidth extension system using enhanced parameter estimation in time domain (시간 영역에서 개선된 파라미터 추론을 통한 효율적인 초광대역 확장 시스템 설계)

Jeon, Jong-jeon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.431-433 / 2018
This paper proposes a system that provides super-wideband speech, generated from a wideband speech signal by an artificial bandwidth extension technique in the time domain. A wideband excitation signal and line spectrum pairs (LSP) are extracted in the time domain based on the source-filter model. The two parameters are each extended by a bandwidth extension algorithm to estimate the super-wideband speech parameters, from which the speech is synthesized. A subjective test shows that the super-wideband speech has better quality than the wideband speech signal.

Erratum to: Voice quality of normal elderly people after a 3 oz water-swallow test: An acoustic analysis (Erratum to: 3온스 물 삼킴검사 이후 정상 노년층의 음질 변화: 음향학적 분석)

Lee, Sol Hee;Choi, Hong-Shik;Choi, Seong-Hee;Kim, HyangHee
- Phonetics and Speech Sciences / v.11 no.1 / pp.65-66 / 2019
https://doi.org/10.13064/KSSS.2019.11.1.065

Quality of Life in Older Adults with Cochlear Implantation: Can It Be Equal to That of Healthy Older Adults?

Tokat, Taskin;Muderris, Togay;Bozkurt, Ergul Basaran;Ergun, Ugurtan;Aysel, Abdulhalim;Catli, Tolgahan
- Korean Journal of Audiology / v.25 no.3 / pp.138-145 / 2021
Background and Objectives: This study aimed to evaluate the audiologic results after cochlear implantation (CI) in older patients and the degree of improvement in their quality of life (QoL). Subjects and Methods: Patients over 65 years old who underwent CI at the implant center in Bozyaka Training and Research Hospital were included in this study (n=54; 34 males and 20 females). The control group comprised patients over 65 years old with normal hearing (n=54; 34 males and 20 females). We administered three questionnaires [World Health Organization Quality of Life-BREF (WHOQOL-BREF), World Health Organization Quality of Life-OLD (WHOQOL-OLD), and Geriatric Depression Scale (GDS)] to evaluate the QoL, CI-related effects on activities of daily life, and social activities in all the subjects. Moreover, correlations between speech recognition and the QoL scores were evaluated. The duration of implant use and comorbidities were also examined as potential factors affecting QoL. Results: The patients had remarkable improvements (the mean score of postoperative speech perception 75.7%) in speech perception after CI. The scores for the WHOQOL-OLD and WHOQOL-BREF questionnaire responses were similar in both the study and control groups, except for two subdomains (social relations and social participation). The patients with longer-term CI use had higher scores than those with short-term CI use. In general, the changes in GDS scores were not significant (p<0.05). Conclusions: The treatment of hearing loss with CI conferred significant improvement in patients' QoL (p<0.01). The evaluation of QoL can provide multidimensional insights into a geriatric patient's progress and, therefore, should be considered by audiologists.
https://doi.org/10.7874/jao.2020.00458

Quality of Life in Older Adults with Cochlear Implantation: Can It Be Equal to That of Healthy Older Adults?

Tokat, Taskin;Muderris, Togay;Bozkurt, Ergul Basaran;Ergun, Ugurtan;Aysel, Abdulhalim;Catli, Tolgahan
- Journal of Audiology & Otology / v.25 no.3 / pp.138-145 / 2021
https://doi.org/10.7874/jao.2020.00458

Two-Microphone Binary Mask Speech Enhancement in Diffuse and Directional Noise Fields

Abdipour, Roohollah;Akbari, Ahmad;Rahmani, Mohsen
- ETRI Journal / v.36 no.5 / pp.772-782 / 2014
Two-microphone binary mask speech enhancement (2mBMSE) has been of particular interest in recent literature and has shown promising results. Current 2mBMSE systems rely on spatial cues of speech and noise sources. Although these cues are helpful for directional noise sources, they lose their efficiency in diffuse noise fields. We propose a new system that is effective in both directional and diffuse noise conditions. The system exploits two features. The first determines whether a given time-frequency (T-F) unit of the input spectrum is dominated by a diffuse or directional source. A diffuse signal is certainly a noise signal, but a directional signal could correspond to a noise or speech source. The second feature discriminates between T-F units dominated by speech or directional noise signals. Speech enhancement is performed using a binary mask, calculated based on the proposed features. In both directional and diffuse noise fields, the proposed system segregates speech T-F units with hit rates above 85%. It outperforms previous solutions in terms of signal-to-noise ratio and perceptual evaluation of speech quality improvement, especially in diffuse noise conditions.
https://doi.org/10.4218/etrij.14.0113.0917
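To make the binary-mask terminology in the abstract above concrete, here is a minimal Python sketch of applying a 0/1 mask to time-frequency (T-F) units and scoring a hit rate against a reference mask. It does not reproduce the paper's two features or its mask estimation; the function names are hypothetical, and the hit rate is assumed here to mean the fraction of speech-dominated units that the estimated mask retains.

```python
def apply_binary_mask(spectrum, mask):
    """Keep T-F units where mask is 1 and zero out the rest.

    spectrum and mask are lists of rows (time frames) of equal shape.
    """
    return [
        [value if keep else 0.0 for value, keep in zip(row, mask_row)]
        for row, mask_row in zip(spectrum, mask)
    ]


def hit_rate(estimated_mask, ideal_mask):
    """Fraction of speech-dominated units (ideal mask == 1) kept by the estimate.

    Assumption: this is the intended meaning of "hit rate"; the abstract
    does not define it.
    """
    speech_units = 0
    hits = 0
    for est_row, ideal_row in zip(estimated_mask, ideal_mask):
        for est, ideal in zip(est_row, ideal_row):
            if ideal:
                speech_units += 1
                if est:
                    hits += 1
    if speech_units == 0:
        raise ValueError("ideal mask contains no speech-dominated units")
    return hits / speech_units


# Toy 2x2 magnitude spectrum with made-up masks.
spectrum = [[1.0, 2.0], [3.0, 4.0]]
estimated = [[1, 0], [0, 1]]
ideal = [[1, 1], [0, 1]]
enhanced = apply_binary_mask(spectrum, estimated)
rate = hit_rate(estimated, ideal)
```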

A Study on the Effective Command Delivery of Commanders Using Speech Recognition Technology (국방 분야에서 전장 소음 환경 하에 음성 인식 기술 연구)

Yeong-hoon Kim;Hyun Kwon
- Convergence Security Journal / v.24 no.2 / pp.161-165 / 2024
Recently, speech recognition models have been advancing, accompanied by the development of various speech processing technologies to obtain high-quality data. In the defense sector, efforts are being made to integrate technologies that effectively remove noise from speech data in noisy battlefield situations and enable efficient speech recognition. This paper proposes a method for effective speech recognition amid diverse noise in a battlefield scenario, allowing commanders to convey orders. The proposed method involves noise removal from noisy speech followed by text conversion using OpenAI's Whisper model. Experimental results show that the proposed method reduces the Character Error Rate (CER) by 6.17% compared to the existing method that does not remove noise. Additionally, potential applications of the proposed method in the defense sector are discussed.
https://doi.org/10.33778/kcsa.2024.24.2.161
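The Character Error Rate mentioned in the abstract above is conventionally computed as the character-level edit distance between the recognized text and the reference transcript, divided by the reference length. The Python sketch below follows that common definition; the paper's exact scoring procedure (for example, any text normalization) is not given in the abstract, so the function names and the sample strings are illustrative assumptions.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example: one substituted character in a six-character reference.
cer = character_error_rate("attack", "attach")
```

Comparing this value for transcripts produced with and without a noise-removal step would give the kind of CER difference the abstract reports.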

Modification of Pitch Algorithm and Its Application to Noise (피치 알고리즘 수정 및 소음에의 적용)

Shin, Sung-Hwan;Ih, Jeong-Guon
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2002.11a / pp.354.1-354 / 2002
Pitch is a percept related to frequency, one of the psychological attributes of tones, and, together with loudness and timbre, an important factor in determining sound quality. While pitch has been actively studied in speech recognition and speech separation, research on pitch for the analysis and improvement of product sound quality is still insufficient. (omitted)

The Effects of Nasalance on Quality of Voice (비성이 음질에 미치는 영향에 대한 음향학적 연구)

Ahn, Jong-Bok;Shin, Myung-Sun;Noh, Dong-Woo;Paik, Eun-A;Jeong, Ok-Ran
- Speech Sciences / v.9 no.3 / pp.133-140 / 2002
The purpose of this study was to investigate any changes in acoustic qualities of voice as a function of nasalance, in order to determine the relationship between vocal quality and nasalance. Twenty normal subjects (10 males and 10 females) vocalized /a/, /$\tilde{a}$/, and /a$\eta$/. The changes in nasalance and acoustic characteristics of the voice were analyzed by Nasometer (Model 6200-3, Kay Elemetrics Co.) and Dr. Speech 4.0 (Tiger Electronics Co.), respectively. One-way ANOVA was used to examine any changes in jitter, shimmer, harmonics-to-noise ratio (HNR), and normalized noise energy (NNE) relative to the nasalance in the 3 types of vocalization. The Pearson r correlation coefficient was used to identify the relationship between the nasalance and the vocal quality. There were no statistically significant changes in jitter, shimmer, HNR, and NNE. The jitter, however, tended to increase as the nasalance score increased, compared to the other vocal parameters. In addition, the NNE showed an increase on /$\tilde{a}$/ and /a$\eta$/, more so on /a$\eta$/. Thus, it was speculated that NNE could be used to identify or screen resonance disorders with hypernasality.

Comparison of Self-Reporting Voice Evaluations between Professional and Non-Professional Voice Users with Voice Disorders by Severity and Type (음성장애가 있는 직업적 음성사용자와 비직업적 음성사용자의 음성장애 중증도와 유형에 따른 자기보고식 음성평가 차이)

Kim, Jaeock
- Phonetics and Speech Sciences / v.7 no.4 / pp.67-76 / 2015
The purpose of this study was to compare professional (Pro) and non-professional (Non-pro) voice users with voice disorders on self-reported voice evaluations using the Korean-Voice Handicap Index (K-VHI) and Korean-Voice Related Quality of Life (K-VRQOL). In addition, the scores were compared by voice quality and voice disorder type. Ninety-four Pro and 106 Non-pro voice users were asked to fill out the K-VHI and K-VRQOL, were perceptually evaluated on GRBAS scales, and were divided into three types of voice disorders (functional, organic, and neurologic) by an experienced speech-language pathologist and an otolaryngologist. The results showed that the functional (F) and physical (P) scores of K-VHI in the Pro group were significantly higher than those in the Non-pro group. As the voice quality evaluated by the G scale got worse, the scores of all aspects except emotional (E) of K-VHI and social-emotional (SE) of K-VRQOL were higher. All scores of K-VHI and K-VRQOL in neurologic voice disorders were significantly higher than those in functional and organic voice disorders. In conclusion, professional voice users are more sensitive to the functional and physical handicaps resulting from their voice problems, and this holds even more strongly for patients with severe and neurologic voice disorders.
https://doi.org/10.13064/KSSS.2015.7.4.067
