Search | Korea Science

An Adaptive Utterance Verification Framework Using Minimum Verification Error Training

Shin, Sung-Hwan;Jung, Ho-Young;Juang, Biing-Hwang
- ETRI Journal / v.33 no.3 / pp.423-433 / 2011
This paper introduces an adaptive and integrated utterance verification (UV) framework using minimum verification error (MVE) training as a new set of solutions suitable for real applications. UV is traditionally considered an add-on procedure to automatic speech recognition (ASR) and thus treated separately from the ASR system model design. This traditional two-stage approach often fails to cope with a wide range of variations, such as a new speaker or a new environment which is not matched with the original speaker population or the original acoustic environment that the ASR system is trained on. In this paper, we propose an integrated solution to enhance the overall UV system performance in such real applications. The integration is accomplished by adapting and merging the target model for UV with the acoustic model for ASR based on the common MVE principle at each iteration in the recognition stage. The proposed iterative procedure for UV model adaptation also involves revision of the data segmentation and the decoded hypotheses. Under this new framework, remarkable enhancement in not only recognition performance, but also verification performance has been obtained.
https://doi.org/10.4218/etrij.11.0110.0489

Development of a Voice Command-Based Mobile Application for Improving Document Editing Accessibility (문서 편집 접근성 향상을 위한 음성 명령 기반 모바일 어플리케이션 개발)

Park, Joo Hyun;Park, Seah;Lee, Muneui;Lim, Soon-Bum
- Journal of Korea Multimedia Society / v.21 no.11 / pp.1342-1352 / 2018
Voice command systems are an important means of ensuring accessibility to digital devices for people with disabilities and in situations where both hands are not free. Interest in services using speech recognition technology has been increasing. In this study, we developed a mobile writing application using voice recognition and voice command technology that helps people create and edit documents easily. The application is characterized by minimal touching of the screen and by writing memos by voice. We systematically designed a mode that distinguishes voice writing from voice commands so that writing and command execution can be used together in one voice interface. It provides a shortcut function for controlling the cursor by voice, which makes document editing as convenient as possible. This allows people to conveniently access writing applications by voice under both physical and environmental constraints.
https://doi.org/10.9717/kmms.2018.21.11.1342

The Speaker Recognition System using the Pitch Alteration (피치변경을 이용한 화자인식 시스템)

Jung JongSoon;Bae MyungJin
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.115-118 / 2002
The parameters used in a speaker recognition system should fully express the speaker's characteristics contained in speech. That is, a feature whose inter-speaker variance is larger than its intra-speaker variance is useful for distinguishing between speakers. To minimize the error between speakers, improved recognition technology is required in addition to distinguishing characteristics. Recent simulation results show that more accurate performance is obtained by using both dynamic characteristics and the constant characteristics that arise from speaking habits. We therefore propose the following approach. Prosodic information is used as a feature vector of speech. The feature vector generally used in speaker recognition systems models spectral information and performs well in noise-free conditions. However, this feature vector is distorted in noisy conditions, which reduces the recognition rate. In this paper, we alter the pitch contour segment by segment so that a dynamic characteristic can be estimated, and use it as a recognition feature. Simulation confirmed that the dynamic characteristic is very robust in noisy conditions. Acceptance or rejection is decided by comparison with the test pattern, and the recognition rate of the proposed algorithm is higher than that obtained using spectral and prosodic information. In particular, the simulation showed that a stable recognition rate can be obtained in noisy conditions.

Improvement of Reliability based Information Integration in Audio-visual Person Identification (시청각 화자식별에서 신뢰성 기반 정보 통합 방법의 성능 향상)

Tariquzzaman, Md.;Kim, Jin-Young;Hong, Joon-Hee
- MALSORI / no.62 / pp.149-161 / 2007
In this paper we propose a modified reliability function for improving bimodal speaker identification (BSI) performance. The conventional reliability function, used by N. Fox [1], is extended by introducing an optimization factor. We evaluated the proposed method in the BSI domain. A BSI system was implemented based on GMM and tested using the VidTIMIT database. Speaker identification experiments verified the usefulness of the proposed method, showing improved performance: a 39% reduction in error rate.

The Phonological and Orthographic activation in Korean Word Recognition(II) (한국어 단어 재인에서의 음운정보와 철자정보의 활성화(II))

Choi Wonil;Nam Kichun
- Proceedings of the KSPS conference / 2003.10a / pp.33-36 / 2003
Two experiments were conducted to support the suggestion in Wonil Choi & Kichun Nam (2003) that the same information processing is used in both input modalities, visual and auditory. A primed lexical decision task was performed, with pseudowords used as prime stimuli. No priming effect occurred in any experimental condition. This result might be interpreted as visual facilitative information and phonological inhibitory information cancelling each other out.

Noise robust distant sound recognition (잡음 환경에 강인한 원거리 음향 정보 검출 기술 연구)

Yoo, In-Chul;Yook, Dong-Suk
- Proceedings of the KSPS conference / 2007.05a / pp.37-38 / 2007
This paper reviews the issues in implementing sound recognizers in real environments. The first is signal corruption caused by background noise and reverberation. The second is the open-set problem, that is, the problem of rejecting out-of-vocabulary words and noises. Both issues must be solved to build noise-robust recognizers.

A Study on the Performance Improvement of a Stock Information Retrieval System using Continuous Speech Recognition Technology (연속음성인식기술을 이용한 음성인식 증권정보 시스템의 성능 향상에 대한 연구)

구명완
- Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.51-55 / 1998
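This paper introduces the speech-recognition stock information system developed by Korea Telecom and currently in service at the number 700-3000, and describes Korea Telecom's current research on improving speech recognition performance. The service system currently in operation can be used by 120 people simultaneously, and its software and hardware are separated so that hardware changes are minimized even when the software version is updated. The performance improvement methods under consideration are attempting isolated-word recognition using continuous speech recognition technology, implementing a rejection function, and obtaining context-dependent phonemes by means of tied states. A change to a continuous HMM modeling approach is also under study.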

The Recognition and Pedagogy of Chinese Tones (중국어 성조의 인지와 교육)

Shim So Hee
- MALSORI / no.40 / pp.65-78 / 2000
Korean learners of Chinese have difficulty pronouncing Chinese tones, which distinguish the meanings of words, because the Korean language has no such tones. This makes it hard for Koreans to acquire Chinese. In this paper, I present the following. First, I examine the characteristics of the tones pronounced by Korean speakers, using the methods of modern experimental phonetics. Second, I present a pedagogy of Chinese tones that takes into account the typical errors shown in the experiments on Korean speakers. The pedagogy presented in this paper, which is based on the experimental results, is not perfect. However, I expect this paper to serve as an instrumental tool to help Korean speakers improve their command of Chinese.

New Services based on speech recognition technology (음성인식기술을 이용한 새로운 서비스)

구명완
- Proceedings of the Acoustical Society of Korea Conference / 1995.06a / pp.47-51 / 1995
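This paper examines recent trends in the technologies needed to commercialize systems that use speech recognition, as well as the services being put to practical use with current technology. Recent speech recognition research aimed at practical use is actively addressing the selection of basic units for speech recognition, a function for rejecting a speaker's speech, and real-time implementation. Practical services feasible with current technology include those that reduce the cost of currently provided services, such as telephone directory assistance and voice dialing, and those that provide new services by applying speech recognition to traffic information, weather information, and movie theater reservations.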

Virtual Reality based Situation Immersive English Dialogue Learning System (가상현실 기반 상황몰입형 영어 대화 학습 시스템)

Kim, Jin-Won;Park, Seung-Jin;Min, Ga-Young;Lee, Keon-Myung
- Journal of Convergence for Information Technology / v.7 no.6 / pp.245-251 / 2017
This paper presents an English conversation training system in which learners practice their English conversation skills by conversing by voice with native speaker characters in a virtual reality environment. The proposed system allows learners to talk with multiple native speaker characters in various scenarios in the virtual reality environment. It recognizes the learners' speech and generates voices using a speech synthesis method. Interacting with the characters by voice in the virtual reality environment immerses the learners in the conversation situations. A scoring system that evaluates the learner's pronunciation provides positive feedback that keeps learners engaged in the learning context.
https://doi.org/10.22156/CS4SMB.2017.7.6.245

Search Result 527, Processing Time 0.023 seconds
