Search | Korea Science

Comparison of Adult and Child's Speech Recognition of Korean (한국어에서의 성인과 유아의 음성 인식 비교)

Yoo, Jae-Kwon;Lee, Kyoung-Mi
- The Journal of the Korea Contents Association / v.11 no.5 / pp.138-147 / 2011
Most Korean speech databases have been developed for adults' speech rather than children's, whereas children's speech databases exist for various other languages. Because children's and adults' speech differ widely in acoustic and linguistic characteristics, a Korean children's speech database needs to be developed. In this paper, to find the differences between them in Korean, we built speech recognizers using HMM and tested them according to gender, age, and the presence of VTLN (Vocal Tract Length Normalization). The results show that the recognizer built from children's speech has a much higher recognition rate than the one built from adults' speech, and that using VTLN helps improve the recognition rate in Korean.
https://doi.org/10.5392/JKCA.2011.11.5.138

Implementation of A Morphological Analyzer Based on Pseudo-morpheme for Large Vocabulary Speech Recognizing (대어휘 음성인식을 위한 의사형태소 분석 시스템의 구현)

양승원
- Journal of Korea Society of Industrial Information Systems / v.4 no.2 / pp.102-108 / 1999
Deciding the processing unit is important in a large vocabulary speech recognition system. We propose the pseudo-morpheme as the recognition unit to resolve the problems of recognition systems that use the phrase or the general morpheme. We implement a morphological analysis system and tagger for pseudo-morphemes. A speech processing system using the pseudo-morpheme can achieve better results than systems using the phrase or the general morpheme, so the quality of the whole spoken language translation system can be improved. The analysis ratio of our implemented system is similar to that of common morphological analysis systems.

Integrated Verbal and Nonverbal Sentiment Analysis System for Evaluating Reliability of Video Contents (영상 콘텐츠의 신뢰도 평가를 위한 언어와 비언어 통합 감성 분석 시스템)

Shin, Hee Won;Lee, So Jeong;Son, Gyu Jin;Kim, Hye Rin;Kim, Yoonhee
- KIPS Transactions on Software and Data Engineering / v.10 no.4 / pp.153-160 / 2021
With the advent of the "age of video" due to the simplification of video content production and the convenience of broadcasting channel operation, review videos on various products are drawing attention. We propose RASIA, an integrated reliability analysis system based on verbal and nonverbal sentiment analysis of review videos. RASIA extracts and quantifies each emotional value obtained through language sentiment analysis and facial analysis of the reviewer in the video. Subsequently, we conduct an integrated reliability analysis of the standardized verbal and nonverbal sentiment values. RASIA provides a new objective indicator for evaluating the reliability of a review video.
https://doi.org/10.3745/KTSDE.2021.10.4.153

A Clinical Study of Patients with Persistent Voice Disorders after Laryngeal Microsurgery (후두미세수술후 지속적인 음성장애환자에 대한 임상적 고찰)

김명상;표화영;최홍식;김영호;김광문
- Proceedings of the KSLP Conference / 1997.11a / pp.257-257 / 1997
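The subjects were patients who visited the Eye and ENT Hospital and the Department of Otorhinolaryngology at Yongdong Severance Hospital between October 1996 and June 1997 with a chief complaint of voice disorder, underwent laryngeal microsurgery for benign vocal fold disease, and received voice testing. A total of 85 patients were included. Their medical records were reviewed retrospectively to determine the diagnosis at surgery, whether a laser was used during surgery, and whether dysphonia was present after surgery. Of these, 12 patients showed persistent or recurrent dysphonia after surgery. To determine the causes of dysphonia, the types of treatment, and the treatment outcomes, laryngeal videostroboscopy, psychoacoustic evaluation, and voice analysis results were reviewed and analyzed statistically. (abridged)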

A study on the Prosody Generation of Korean Sentences using Artificial Neural networks (인공 신경망을 이용한 한국어 문장단위 운율 발생에 관한 연구)

이일구;민경중;강찬구;임운천
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.105-108 / 1999
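To improve the naturalness of synthesized speech in a TTS (Text-To-Speech) system, the prosodic rules that exist for a language must be implemented accurately. Extracting those rules requires building a vast body of language data. However, this approach cannot capture the prosody completely even for language data that contains the prosodic phenomena, so it cannot improve the quality of the synthesized speech. This paper proposes two artificial neural networks for learning the prosody of Korean speech. One network learns the pitch variation of each phoneme in a sentence, and the other learns the energy variation. Both are BP neural networks with 11 inputs representing 11 phonemes, and they output the polynomial coefficients that approximate the pitch and energy contours of the middle phoneme. Before training and evaluating the neural network system, meaningful sentences were composed from phonetically balanced isolated words. A male speaker read the sentences, and the recordings were used to build a speech DB. Prosodic information for each phoneme was collected from the speech DB to create target patterns and training patterns suited to the networks. The target patterns consist of second-order polynomial coefficients for pitch and energy, obtained from trend lines through regression analysis. The networks trained on these target patterns gave good results.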
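The abstract above describes target patterns made of second-order polynomial coefficients fitted by regression to each phoneme's pitch and energy contour. A minimal sketch of that fitting step follows, assuming an ordinary least-squares fit. The function name `fit_quadratic` and the sample contour are hypothetical and are not taken from the paper.

```python
def fit_quadratic(ts, ys):
    """Least-squares fit of y ~ a*t**2 + b*t + c. Returns (a, b, c)."""
    # Power sums used by the normal equations: s[k] = sum(t**k), r[k] = sum(y * t**k).
    s = [sum(t ** k for t in ts) for k in range(5)]
    r = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    # Augmented normal-equation matrix for the unknowns (c, b, a).
    m = [
        [s[0], s[1], s[2], r[0]],
        [s[1], s[2], s[3], r[1]],
        [s[2], s[3], s[4], r[2]],
    ]
    n = 3
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("need at least three distinct time points")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    c, b, a = x
    return a, b, c


# Made-up pitch contour sampled from 2*t**2 - 3*t + 120 (Hz), for illustration only.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
pitch = [2 * t * t - 3 * t + 120 for t in times]
a, b, c = fit_quadratic(times, pitch)
```

Because the sample contour is an exact quadratic, the fit recovers its coefficients up to floating-point error. Under this assumption, the three coefficients per phoneme would serve as the network's target vector, one set for pitch and one for energy.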

PRAAT Software: A Speech Interaction Tool to Analyze Teacher Voices (PRAAT 소프트웨어: 교사 목소리 분석을 위한 맞춤법 상호작용 도구)

Kidd, Ella Jane
- Journal of Convergence for Information Technology / v.9 no.9 / pp.158-165 / 2019
Through the use of speech software technology, this paper examines the effects of voice interactions within the inner circle of English. The fundamental frequency (F0) was obtained by analyzing the speech effects of native speakers (aged 30-55) based on nationality, age, and gender. The findings reveal that the Caucasian British female (age 33) and the Caucasian American male (age 55) produced the most interactive speech. The contributing factor is the students' experience with various language styles throughout their language acquisition studies. The results are compatible with Traunmüller & Eriksson (1995) and previous studies, which agree that continuous speech above average is paramount to student engagement and interaction.
https://doi.org/10.22156/CS4SMB.2019.9.9.158

Analysis and Use of Intonation Features for Emotional States (감정 상태에 따른 발화문의 억양 특성 분석 및 활용)

Lee, Ho-Joon;Park, Jong C.
- Annual Conference on Human and Language Technology / 2008.10a / pp.145-150 / 2008
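This paper uses as an emotional speech corpus a total of 240 sentences, consisting of 8 sentences uttered by 6 speakers in 5 emotional states, to analyze the intonation patterns characteristic of each emotional state and to discuss how to apply these patterns to a speech synthesis system. To this end, we analyze the characteristic intonation patterns of each emotional state with a focus on the length of the intonation phrase, the phrase-final boundary tone of the intonation phrase, and declination, and we show the process of applying to a speech synthesis system the intonation features that can distinguish joy, sadness, anger, and fear. Through this study, we confirmed a rise in intonation for the emotion of anger and found characteristic intonation patterns for each emotion.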

Context sentiment analysis based on Speech Tone (발화 음성을 기반으로 한 감정분석 시스템)

Jung, Jun-Hyeok;Park, Soo-Duck;Kim, Min-Seung;Park, So-Hyun;Han, Sang-Gon;Cho, Woo-Hyun
- Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.1037-1040 / 2017
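As machine learning and deep learning technologies advance rapidly, numerous AI voice assistants are being released. However, they analyze only the words in the speaker's sentence to return a result and cannot recognize nonverbal elements, so their results have a structural limitation. This study therefore converts nonverbal elements of human communication, such as speech rate and changes in tone, into numerical data and performs supervised learning based on "Plutchik's wheel of emotions." Incoming speech data is then analyzed with the kNN algorithm against the previously machine-learned data.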
https://doi.org/10.3745/PKIPS.y2017m11a.1037
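The entry above classifies speech by applying kNN to numeric nonverbal features such as speech rate and tone change. A minimal sketch of a kNN vote follows. The two-dimensional feature layout (syllables per second, pitch range in Hz), the training values, and the name `knn_predict` are all hypothetical, chosen only to illustrate the technique and not taken from the paper.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of (feature_vector, label) pairs; query: a feature_vector.
    """
    # Sort by Euclidean distance to the query and keep the k closest points.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up labeled examples: (speech rate in syllables/s, pitch range in Hz).
train = [
    ((5.8, 90.0), "joy"),
    ((6.1, 110.0), "joy"),
    ((3.0, 20.0), "sadness"),
    ((2.7, 25.0), "sadness"),
    ((6.5, 140.0), "anger"),
    ((6.9, 150.0), "anger"),
]

label_slow = knn_predict(train, (2.9, 22.0))   # slow speech, narrow pitch range
label_fast = knn_predict(train, (6.7, 145.0))  # fast speech, wide pitch range
```

In practice the features would likely need to be scaled to comparable ranges first, since the raw pitch range dominates the Euclidean distance here.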

Hangul-Centered Romanization of Hangul (한글중심의 한글 로마자삼기)

Sin, Gyeong-Gu
- Annual Conference on Human and Language Technology / 1990.11a / pp.73-80 / 1990
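This paper examines various methods of romanizing Hangul and analyzes the problems of the current Ministry of Education romanization system, which aimed at accurate phonetic transcription centered on foreigners. It also proposes a romanization scheme that takes the linguistic awareness of Koreans as its standard and is based on a one-to-one correspondence between Hangul and Roman letters.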

Analysis of Voice Parameters on Different Phonatory Tasks using Multi-Channel Phonatory Function Analyzer in Healthy Adults (다채널 음성분석장치를 이용한 정상 성인에서의 발성 방식에 따른 음성변수 분석)

성명훈;이상준;김광현;노종렬;권택균;이강진;박광석;최종민
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.13 no.2 / pp.132-138 / 2002
Background and Objectives: The complex physiologic structure of the larynx can vibrate in three or more different ways that yield acoustically and perceptually distinct vocal quality. The purpose of this study is to examine the normal range of voice parameters in the Multi-Channel Phonatory Function Analyzer and to investigate how the voice parameters differ according to phonatory pattern. Materials and Methods: Forty normal adult speakers (20 men and 20 women), with ages ranging from the third to the fourth decade, pronounced low, comfortable, and high tone /a/; comfortable tone /æ/, /i/, /o/, and /u/; and fry and falsetto. Voice was analyzed with a newly developed multi-channel phonatory function analyzer. Results: The normal range of voice parameters in this system was similar to the existing data. Fry shows high jitter and falsetto shows low SQ. Fry and falsetto show low OQ in men but no difference in women. Jitter, OQ, and SQ differed between men and women in modal register, whereas there was no gender difference in fry and falsetto. In the frequency magnitude spectrum and EGG, modal register, fry, and falsetto have distinguishing patterns. Conclusions: Modal register, fry, and falsetto are distinguishable in voice parameters and show different vibratory patterns.
