Search | Korea Science

A Study of Data Augmentation and Auto Speech Recognition for the Elderly (한국어 노인 음성 데이터 증강 및 인식 연구)

Keon Hee Kim;Seoyoon Park;Hansaem Kim
- Annual Conference on Human and Language Technology
- /
- 2023.10a
- /
- pp.56-60
- /
- 2023
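Conventional speech recognition has focused on young and middle-aged adults, but as population aging accelerates, the need for research on elderly speech is growing. However, elderly speech datasets are still far less available than datasets of young and middle-aged adult speech. To help address this shortage, this study investigates methods for augmenting scarce elderly speech data. To this end, we analyzed the features of elderly speech and augmented the data by synthesizing the 'frequency' and 'speech rate' features onto general adult speech. We then fine-tuned the Whisper small model and measured the CER (Character Error Rate) on elderly speech, finding that using the augmented dataset together with the existing elderly dataset was the most effective.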
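For readers unfamiliar with the metric, CER is conventionally the character-level edit distance between the recognized text and the reference, divided by the reference length. The sketch below illustrates that standard definition only; it is not the authors' evaluation code, the function names are hypothetical, and dropping spaces before scoring is an assumption that the abstract does not confirm.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance divided by reference length.

    Assumption: spaces are removed before scoring; real evaluations may
    normalize text differently.
    """
    ref = reference.replace(" ", "")
    hyp = hypothesis.replace(" ", "")
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted syllable out of four reference characters.
    print(cer("노인 음성", "노인 음석"))
```

Each precomposed Hangul syllable counts as one character here, so a single wrong syllable in a four-syllable reference contributes one error.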

Aging Voice (노인성 음성)

김영호
- Proceedings of the KSLP Conference
- /
- 2003.11a
- /
- pp.205-207
- /
- 2003
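Although diseases such as laryngeal cancer and neurological disorders do become more frequent in old age, the most common cause of voice change is age-related change in the larynx. Even conditions that occur regardless of age, such as vocal nodules, vary in severity depending on the effects of aging. A proper understanding of the aging process is therefore essential when dealing with voice problems in the elderly. (abridged)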

Development of Voice Activity Detection Algorithm for Elderly Voice based on the Higher Order Differential Energy Operator (고차 미분에너지 기반 노인 음성에서의 음성 구간 검출 알고리즘 연구)

Lee, JiYeoun
- Journal of Digital Convergence
- /
- v.14 no.11
- /
- pp.249-255
- /
- 2016
Because elderly voices contain considerable noise caused by physiological changes in respiration, phonation, and resonance, the performance of convergence health-care equipment that relies on elderly speech, such as speech recognition, synthesis, and analysis programs, deteriorates. Research is therefore needed to enable health-care instruments to be operated by elderly voices. In this study, a voice activity detection method using a symmetric higher-order differential energy operator (SHODEO) was developed and compared with the auto-correlation function (ACF) and the average magnitude difference function (AMDF). It was confirmed to outperform these methods in detecting voice intervals. The voice activity detection method will be applied to a voice interface for the elderly to improve the accessibility of smart devices.
https://doi.org/10.14400/JDC.2016.14.11.249
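The two baselines named in this abstract, ACF and AMDF, have standard textbook definitions that can be sketched briefly. The code below illustrates only those definitions on a synthetic periodic frame; it does not implement SHODEO, whose details the abstract does not give, and the function names, frame length, and lag range are illustrative assumptions rather than the paper's settings.

```python
import math


def acf(frame, lag):
    """Short-time auto-correlation at one lag: sum of x[i] * x[i + lag]."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))


def amdf(frame, lag):
    """Average magnitude difference at one lag: mean of |x[i] - x[i + lag]|."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def best_lags(frame, min_lag, max_lag):
    """Return (lag maximizing ACF, lag minimizing AMDF) within the lag range."""
    lags = range(min_lag, max_lag + 1)
    acf_lag = max(lags, key=lambda k: acf(frame, k))
    amdf_lag = min(lags, key=lambda k: amdf(frame, k))
    return acf_lag, amdf_lag


if __name__ == "__main__":
    period = 40  # samples per cycle of the synthetic "voiced" frame (assumed)
    frame = [math.sin(2 * math.pi * i / period) for i in range(200)]
    print(best_lags(frame, 20, 60))
```

For a periodic frame, the ACF peaks and the AMDF dips at the lag equal to the period, which is why both measures are commonly used as periodicity cues for separating voiced speech from noise.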

Syllabic Speech Rate Control for Improving Elderly Speech Recognition of Smart Devices (음절 별 발화속도 조절을 통한 노인 음성인식 개선)

Kyeong, Ju Won;Son, Gui Young;Kwon, Soonil
- Proceedings of the Korea Information Processing Society Conference
- /
- 2015.10a
- /
- pp.1711-1714
- /
- 2015
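Although smart devices have become tools for communicating with society, they remain difficult for the elderly to use. A voice interface based on speech recognition technology can improve the usability of smart devices for the elderly. However, because general speech recognition systems are tuned to the speaking style of young and middle-aged adults, the recognition rate drops when aged speech is input as-is. Based on the analysis that the per-syllable speech rate of the elderly often falls outside the range in which a general speech recognition system can guarantee its performance, this study adjusted the per-syllable speech rate of elderly speakers, and the average speech recognition rate for elderly men and women rose by 15.3%. This lays a foundation for raising recognition rates by readjusting speech rate, one of the causes of recognition errors for elderly speech. We expect that, by enabling the elderly to carry out tasks easily and accurately on smart devices, this will facilitate their social participation and access to information and, further, contribute to communication between generations.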
https://doi.org/10.3745/PKIPS.y2015m10a.1711

Treatment of Aging Voice (Presbyphonia) (노인성 음성의 치료)

Gwon, Taek-Gyun
- Proceedings of the KSLP Conference
- /
- 2014.03a
- /
- pp.34-37
- /
- 2014

Recent Research Trends in Aging Voice (노인성 음성에 대한 최신 연구동향)

Im, Jae-Yeol
- Proceedings of the KSLP Conference
- /
- 2014.03a
- /
- pp.38-39
- /
- 2014

Clinical Manifestation of Aging Voice (노인성 음성의 임상양상)

Baek, Min-Kwan;Kim, Dong Young
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
- /
- v.25 no.1
- /
- pp.16-19
- /
- 2014
Presbyphonia is a combination of physiological and structural changes due to aging of the larynx in elderly patients with voice problems. Some of these changes are inevitable, while others may be avoidable or reversible. Vocal fatigue is the most common clinical symptom of the aging voice. Voice problems with aging arise from various causes, including organic lesions of the larynx. It is essential that clinicians understand the physiologic and pathologic changes of the aging voice in order to minimize glottal incompetence and improve the vocal performance and quality of life of the elderly.

Aerodynamic Characteristics of Young and Elderly Adult Patients with Voice Disorders during Continuous Speech (젊은 성인 및 노인 음성장애 환자의 연속발화시 공기역학적 특성 비교)

Pyo, Hwa-young
- The Journal of the Korea Contents Association
- /
- v.19 no.12
- /
- pp.270-278
- /
- 2019
This study compared the aerodynamic characteristics of young and elderly adult male patients with voice disorders during continuous speech. Aerodynamic measurements were obtained after 12 young male patients and 9 elderly male patients read a paragraph. The elderly group showed longer duration and lower airflow rate and air volume than the younger group, but the differences were not significant except for phonation time. Therefore, when interpreting aerodynamic measures of airflow and air volume in elderly patients with voice disorders, various conditions (e.g., reading materials, pulmonary function) should be taken into account as well as age.
https://doi.org/10.5392/JKCA.2019.19.12.270

End-to-end speech recognition models using limited training data (제한된 학습 데이터를 사용하는 End-to-End 음성 인식 모델)

Kim, June-Woo;Jung, Ho-Young
- Phonetics and Speech Sciences
- /
- v.12 no.4
- /
- pp.63-71
- /
- 2020
Speech recognition is one of the areas in which deep learning and machine learning techniques are being actively commercialized. However, most speech recognition systems on the market are developed on data with limited speaker diversity and tend to perform well only on typical adult speakers, because most models are trained on speech databases collected from adult males and females. This causes problems in recognizing the speech of the elderly, children, and dialect speakers. Solving these problems may require maintaining a large database or collecting data for speaker adaptation. Instead, this paper proposes a new end-to-end speech recognition method consisting of an acoustic augmented recurrent encoder and a transformer decoder with linguistic prediction. The proposed method can deliver reliable acoustic and language model performance under limited-data conditions. It was evaluated on Korean elderly and children's speech with a limited amount of training data and showed better performance than a conventional method.
https://doi.org/10.13064/KSSS.2020.12.4.063

A comparison of the perceptual-auditory voice quality evaluation (GRBAS) and voice-related quality of life (K-VRQOL) according to choir type of elderly women choir members (여성 노인 합창단원의 합창단 유형에 따른 청지각적 음성평가(GRBAS) 및 음성관련 삶의 질(K-VRQOL) 비교)

Lee, Hyeonjung;Kang, Binna;Kim, Soo Ji
- Phonetics and Speech Sciences
- /
- v.12 no.2
- /
- pp.51-61
- /
- 2020
The purpose of this study is to compare the voice characteristics and voice-related quality of life of elderly female choir members using the perceptual-auditory voice quality evaluation (GRBAS) and K-VRQOL scales. The participants were 77 women over 60 years old who were actively engaged in a choir in either Seoul or Busan. Two kinds of choirs, indicating different engagement levels, were considered: regular choirs and church choirs. For the perceptual-auditory voice quality evaluation, experts listened to sustained /a/ vowels and graded them on the GRBAS scale. In the between-group comparison, the elderly female members of regular choirs showed higher satisfaction with their voice use, in terms of subjective perception, than those who performed in church choirs. In addition, the analysis showed that the satisfaction level was high in the physical function area of the K-VRQOL scale. This study confirmed that choral activities can yield positive results not only in improving voice function in old age but also in improving the subjective perception of voice use, suggesting the need for systematic music programs to improve aging voices.
https://doi.org/10.13064/KSSS.2020.12.2.051
