Integrated Search | Korea Science

Utterance Verification using Phone-Level Log-Likelihood Ratio Patterns in Word Spotting Systems

김정현;권석봉;김회린
- Phonetics and Speech Sciences / Vol. 1, No. 1 / pp. 55-62 / 2009
This paper proposes an improved method to verify a keyword segment produced by a word spotting system. First, a baseline word spotting system is implemented. To improve the performance of word spotting systems, we use a two-pass structure that consists of a word spotting system and an utterance verification system. When a basic likelihood ratio test (LRT) based utterance verification system is used to verify the keywords, certain problems arise that degrade performance. We therefore propose a method that uses phone-level log-likelihood ratio (PLLR) patterns in computing confidence measures for each keyword. The proposed method generates weights according to the PLLR patterns and assigns a different weight to each phone when generating confidence measures for the keywords. The proposed method is shown to be more appropriate for word spotting systems, and it improves the final word spotting accuracy.
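As a rough illustration of the idea in this abstract, the sketch below combines phone-level log-likelihood ratios into a keyword confidence score using per-phone weights. The abstract does not give the weighting rule, so `pattern_weights` (a logistic function that emphasises low-scoring phones) and all function names are assumptions made for illustration, not the authors' method.

```python
import math

def phone_llr(target_loglik, anti_loglik, n_frames):
    """Frame-normalised log-likelihood ratio for one phone segment.

    target_loglik / anti_loglik are log-likelihoods of the segment under
    the phone model and under an alternative (anti) model.
    """
    return (target_loglik - anti_loglik) / n_frames

def pattern_weights(pllrs, slope=1.0):
    """Hypothetical weighting rule (an assumption, not from the paper):
    phones with a low PLLR receive a larger weight."""
    return [1.0 / (1.0 + math.exp(slope * x)) for x in pllrs]

def keyword_confidence(pllrs, weights=None):
    """Weighted average of phone-level LLRs.

    With weights=None every phone counts equally, which corresponds to a
    plain LRT-style average.
    """
    if weights is None:
        weights = [1.0] * len(pllrs)
    return sum(w * x for w, x in zip(weights, pllrs)) / sum(weights)

# Illustrative values only: one well-matched phone and one poorly matched phone.
pllrs = [2.0, -2.0]
uniform_score = keyword_confidence(pllrs)
weighted_score = keyword_confidence(pllrs, pattern_weights(pllrs))
```

With this choice of weights, a keyword containing one badly matched phone scores lower than it would under the uniform average, which makes it easier to reject with a fixed threshold.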

Utterance Verification Using Search Confusion Rate and Its N-Best Approach

Kim, Kyu-Hong;Kim, Hoi-Rin;Hahn, Min-Soo
- ETRI Journal / Vol. 27, No. 4 / pp. 461-464 / 2005
Recently, a variety of confidence measures for utterance verification have been studied to improve speech recognition performance by rejecting out-of-vocabulary inputs. Most conventional confidence measures for utterance verification are based primarily on hypothesis testing or an approximated posterior probability, and their performance depends on the robustness of an alternative hypothesis or the prior probability. We introduce a novel confidence measure called the search confusion rate (SCR), which does not require an alternative hypothesis or an approximation of the posterior probability. Our confusion-based approach shows better performance in additive noise-corrupted speech as well as in clean speech.

MODELING QUANTITATIVE VARIATION - In the Kyungnam Dialect of Korean -

Cho, Yong-Hyung
- Speech Sciences / Vol. 1 / pp. 137-152 / 1997
The objectives of this paper are to see how declination is realized at different positions in, and for different lengths of, the utterance, to see whether the $F_0$ value changes in a predictable way throughout the utterance, and, if so, to find the quantitative model that best fits the declination. The experimental results are as follows. First, the peak value over the utterance can be affected by the position of the peak and the length of the utterance. Second, the choice of quantitative model depends on the list length. Third, in every speaker's speech there is a baseline (the lowest $F_0$ value a speaker can use), and the $F_0$ will not fall below the baseline. Fourth, the peak $F_0$ of the last word in each list shows little variation in pitch value (target $F_0$), while the number of words in the list affects the starting $F_0$ values.

Increase in Speaking Rate by $3{\sim}8$-year-old Korean Children

김태경;장경희;이필영
- Speech Sciences / Vol. 13, No. 3 / pp. 83-95 / 2006
This study attempts to suggest a criterion for Korean language development. For this purpose, we investigated the speaking rates of spontaneous utterances produced by 144 children aged 3 to 8. We analyzed each subject's speaking rate and its relationship to the speaker's age, gender, and utterance length. To determine the relative contributions of these variables to the speaking rate, a multiple regression analysis was conducted. The results can be summarized as follows: (1) The mean and maximum values of the speaking rate increased with age. (2) A statistically significant increase in speaking rate appeared at two-year intervals. (3) There was no significant difference in speaking rate between the male and female groups. (4) The multiple regression analysis showed that, along with the speaker's age, the utterance length (the mean number of syllables per utterance) is also important in estimating the speaking rate.

Deep neural networks for speaker verification with short speech utterances

양일호;허희수;윤성현;유하진
- The Journal of the Acoustical Society of Korea / Vol. 35, No. 6 / pp. 501-509 / 2016
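This paper proposes a method to improve speaker verification performance on short test utterances. When the test utterance is short, the performance of speaker verification systems based on i-vectors and probabilistic linear discriminant analysis degrades. The proposed method transforms feature vectors extracted from short utterances with a deep neural network to compensate for the variability caused by utterance length. We propose three ways of using deep neural networks, which differ in the output labels used during training. Each network is trained to reduce the difference between its output for an input short-utterance feature and the feature extracted from the original long utterance. We evaluate the proposed method under the short 2-10 s condition of the NIST (National Institute of Standards and Technology, USA) 2008 SRE (Speaker Recognition Evaluation) corpus. The experimental results show that the minimum detection cost decreases compared with the conventional method that uses within-class covariance normalization and linear discriminant analysis. We also compare performance with a method based on short-utterance variance normalization.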
https://doi.org/10.7776/ASK.2016.35.6.501

Research of Korean utterance-final boundary tones in Emotion speeches

박미영
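- The Phonetic Society of Korea: Conference Proceedings / Proceedings of the 2007 Joint Conference of the Phonetic Society of Korea and the Korean Association of Speech Sciences / pp. 193-196 / 2007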
The purpose of this paper is to identify the characteristics of boundary tones in Korean emotional speech. I mainly focus on investigating the patterns and f0 values of boundary tones and the f0 values in utterance-final phrases.

Prosodic Patterns in Castilian Spanish Short Declarative Sentences

Kimura, Takuya
- The Phonetic Society of Korea: Conference Proceedings / Proceedings of the October 1996 Conference of the Phonetic Society of Korea / pp. 554-559 / 1996
An utterance is normally divided into two or more intonation groups. Each intonation group has its own intonation pattern. The pitch movement of a Spanish utterance is basically determined by a combination of two factors: the position of the stressed syllables and the intonation pattern. The pitch of a syllable can be affected by that of the preceding syllables. This is a physiological effect rather than a phonological one.

Development of the video-based smart utterance deep analyser (SUDA) application

이수복;곽효정;윤재민;신동춘;심현섭
- Phonetics and Speech Sciences / Vol. 12, No. 2 / pp. 63-72 / 2020
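This study concerns the development of SUDA (smart utterance deep analyser), a hybrid app that automatically analyzes the utterances of children and adults recorded on video in everyday life. In particular, it lets children and parents film and upload their interactions at a time and place of their choosing, and it helps users observe and analyze the data as it accumulates over time. SUDA runs on Android phones, iPhones, and tablet PCs, can record and upload large video files, and provides differentiated functions according to the user's role (general user, expert, administrator). In expert mode, the expert works with the automated system to analyze the subject's utterances in detail in terms of speech and language (disfluency, number of morphemes, number of syllables, number of words, speech rate, response time, etc.). That is, once the SUDA system has semi-automatically transcribed and analyzed the subject's utterances, a speech-language pathologist reviews and supplements the results for use in the diagnosis of and intervention for communication disorders. General users (parents) can monitor the expert's analysis results in graph form, and administrators can manage the whole system, including utterance analysis and video deletion. The system has clinical significance in that semi-automated utterance analysis reduces the burden on therapists and researchers, and parents can easily receive a variety of information about speech and language development based on their child's utterances. We also intend to build longitudinal data applicable to the diagnosis of and intervention for Korean children who stutter, and to use it as a basis for finding predictors of recovery from stuttering.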
https://doi.org/10.13064/KSSS.2020.12.2.063

The influences of speech rate, utterance length and sentence complexity of disfluency in preschool children who stutter and children who do not stutter

김예슬;심현섭
- Phonetics and Speech Sciences / Vol. 13, No. 1 / pp. 53-64 / 2021
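According to the demands and capacities model, disfluency in children who stutter is known to be influenced by external and internal environments. The purpose of this study is to compare and analyze differences in disfluency between children who stutter and children who do not stutter according to changes in the linguistic environment (speech rate, utterance length, and syntactic complexity), which is one of the external environments. The participants were 9 children who stutter and 9 children who do not stutter, aged 4-6. A sentence imitation task was administered to obtain the disfluency frequencies of the two groups. Analysis of the differences between the two groups showed that, when utterance length was manipulated, children who stutter were more disfluent than children who do not stutter at the average speech rate regardless of utterance length. When speech rate was manipulated, children who stutter showed more disfluency than children who do not stutter at the fast speech rate. When both speech rate and utterance length were manipulated, children who stutter showed higher disfluency than children who do not stutter at the fast speech rate regardless of utterance length. When syntactic complexity was manipulated, children who stutter showed more disfluency than children who do not stutter in complex sentences. Disfluency in children who stutter was thus affected by speech rate, utterance length, and syntactic complexity. This suggests that children who stutter are weaker than children who do not stutter in speech motor control and language processing. It is therefore important in clinical practice for therapists and parents to match their speech rate and utterance length to the child's level when treating children who stutter.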
https://doi.org/10.13064/KSSS.2021.13.1.053

A Machine Learning based Method for Measuring Inter-utterance Similarity for Example-based Chatbot

양민철;이연수;임해창
- Journal of the Korea Academia-Industrial cooperation Society / Vol. 11, No. 8 / pp. 3021-3027 / 2010
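An example-based chatbot generates a response by retrieving, from a dialogue example database, the example utterance most similar to the user's utterance. Although finding the most similar utterance is directly tied to the appropriateness of the response, previous work has paid little attention to which features to use for similar-utterance retrieval and which approach works well. To improve retrieval accuracy and example utilization, this study proposes a machine learning method that uses a variety of lexical and semantic features. In experiments, the proposed method outperformed existing methods in terms of 1) utilization of the dialogue example database, 2) precision of example-utterance matching, and 3) answer quality.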
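The retrieval step described in this abstract can be sketched as follows. The abstract does not specify the learned similarity measure, so this sketch substitutes a plain token-overlap (Jaccard) score as a stand-in; the function names and the toy example database are hypothetical.

```python
def jaccard(a, b):
    """Token-overlap similarity between two utterances (stand-in measure,
    not the machine-learned similarity proposed in the paper)."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def respond(user_utterance, examples, similarity=jaccard):
    """Return the response paired with the most similar example utterance.

    examples is a list of (example_utterance, response) pairs.
    """
    _, best_response = max(
        examples, key=lambda pair: similarity(user_utterance, pair[0])
    )
    return best_response

# Hypothetical toy dialogue example database.
EXAMPLES = [
    ("how are you today", "I am fine, thanks."),
    ("what is your name", "I am a chatbot."),
]

reply = respond("what is your name please", EXAMPLES)
```

Because `similarity` is a parameter, a trained scoring function over lexical and semantic features could be passed in place of `jaccard` without changing the retrieval logic.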
https://doi.org/10.5762/KAIS.2010.11.8.3021

382 search results (processing time: 0.025 s)
