Integrated Search | Korea Science

Design of a Korean Speech Recognition Platform

권오욱;김회린;유창동;김봉완;이용주
- 대한음성학회지:말소리 / No. 51 / pp.151-165 / 2004
A Korean speech recognition platform is designed for educational and research purposes. It is based on an object-oriented architecture and is easy to modify, so researchers can readily evaluate the performance of a recognition algorithm of interest. The platform is intended to save development time for those working on speech recognition. It includes the following modules: noise reduction, end-point detection, mel-frequency cepstral coefficient (MFCC) and perceptual linear prediction (PLP)-based feature extraction, hidden Markov model (HMM)-based acoustic modeling, n-gram language modeling, n-best search, and Korean language processing. The decoder can handle both lexical search trees for large vocabulary speech recognition and finite-state networks for small-to-medium vocabulary speech recognition. It performs a word-dependent n-best search with a bigram language model in the first (forward) search stage, then extracts a word lattice and rescores each lattice path with a trigram language model in the second stage.
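The second-stage rescoring described in this abstract can be illustrated with a minimal Python sketch. It is not the platform's implementation: the lattice is flattened into a short list of paths, and the trigram table, the back-off floor, the acoustic scores, and the function names are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical toy trigram log-probabilities; a real system would load a trained model.
TRIGRAM_LOGPROB = {
    ("<s>", "<s>", "turn"): math.log(0.5),
    ("<s>", "turn", "on"): math.log(0.6),
    ("turn", "on", "light"): math.log(0.4),
    ("turn", "on", "night"): math.log(0.01),
    ("on", "light", "</s>"): math.log(0.7),
    ("on", "night", "</s>"): math.log(0.2),
}
UNSEEN_LOGPROB = math.log(1e-4)  # crude floor for unseen trigrams (assumption)


def trigram_score(words):
    """Sum trigram log-probabilities over a sentence padded with boundary symbols."""
    padded = ["<s>", "<s>"] + list(words) + ["</s>"]
    total = 0.0
    for i in range(2, len(padded)):
        key = (padded[i - 2], padded[i - 1], padded[i])
        total += TRIGRAM_LOGPROB.get(key, UNSEEN_LOGPROB)
    return total


def rescore(hypotheses, lm_weight=1.0):
    """Re-rank (words, acoustic_logprob) pairs by acoustic plus weighted trigram score."""
    scored = [
        (acoustic + lm_weight * trigram_score(words), words)
        for words, acoustic in hypotheses
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [words for _, words in scored]


# First-pass paths: the acoustically better path is the less plausible sentence.
first_pass = [
    (("turn", "on", "night"), -10.0),
    (("turn", "on", "light"), -10.5),
]
best = rescore(first_pass)[0]
print(best)
```

In this toy case the trigram score outweighs the small acoustic gap, so the second pass promotes the more plausible word sequence. A real decoder would rescore lattice arcs directly rather than enumerating whole paths.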

Improvement and Evaluation of the Korean Large Vocabulary Continuous Speech Recognition Platform (ECHOS)

권석봉;윤성락;장규철;김용래;김봉완;김회린;유창동;이용주;권오욱
- 대한음성학회지:말소리 / No. 59 / pp.53-68 / 2006
We report evaluation results for the Korean speech recognition platform called ECHOS. The platform has an object-oriented, reusable architecture so that researchers can easily evaluate their own algorithms. It provides all the modules needed to build a large vocabulary speech recognizer: noise reduction, end-point detection, feature extraction, hidden Markov model (HMM)-based acoustic modeling, cross-word modeling, n-gram language modeling, n-best search, word graph generation, and Korean-specific language processing. The platform supports both lexical search trees and finite-state networks. It performs a word-dependent n-best search with a bigram in the forward search stage and rescores the lattice with a trigram in the backward stage. In an 8000-word continuous speech recognition task, the platform with a lexical tree produces 40% more word errors but takes 50% less recognition time than the HTK platform with a flat lexicon. Incorporating cross-word modeling reduces ECHOS recognition errors by 40%. With the number of Gaussian mixtures increased to 16, ECHOS yields word accuracy comparable to Julius, an earlier lexical tree-based platform.

A Study of Automatic Evaluation Platform for Speech Recognition Engine in the Vehicle Environment

이성재;강선미
- 한국통신학회논문지 / Vol. 37, No. 7C / pp.538-543 / 2012
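The performance of the speech recognizer is the most important part of an in-vehicle speech interface used while driving. This paper describes the development of a platform that automates the performance evaluation of in-vehicle speech recognizers. The platform consists of a main program, a relay program, database management, and a statistics module. For performance evaluation, a simulation environment reflecting real driving conditions was built, and experiments were run by feeding pre-recorded driving noise and speaker voices through a microphone. The experiments confirmed the validity of the speech recognition results obtained with the proposed platform. With the platform, users can automate speech recognition tests, manage recognition results efficiently, and compute statistics, and so evaluate in-vehicle speech recognizers effectively.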
https://doi.org/10.7840/KICS.2012.37.7C.538

Development of a Baseline Platform for Spoken Dialog Recognition System

정민화;서정연;이용주;한명수
- 대한음성학회 Conference Proceedings / 대한음성학회 May 2003 Conference Proceedings / pp.32-35 / 2003
This paper describes our recent work on developing a baseline platform for Korean spoken dialog recognition. We have collected a speech corpus of about 65 hours with auditory transcriptions. Linguistic information at various levels, such as morphology, syntax, semantics, and discourse, is attached to the speech database using automatic or semi-automatic tagging tools.

Development of a Korean Speech Recognition Platform (ECHOS)

권오욱;권석봉;장규철;윤성락;김용래;장광동;김회린;유창동;김봉완;이용주
- 한국음향학회지 / Vol. 24, No. 8 / pp.498-504 / 2005
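We introduce ECHOS, a Korean speech recognition platform developed for educational and research purposes. ECHOS provides the basic modules for speech recognition, has a simple, easy-to-understand object-oriented structure, and is implemented in C++ using the Standard Template Library (STL). Its input is digital speech data sampled at 8 or 16 kHz, and its output is the 1-best recognition result, N-best recognition results, and a word graph. ECHOS consists of MFCC and PLP feature extraction, HMM-based acoustic models, n-gram language models, and search algorithms that support finite-state networks (FSN) and lexical trees, and it can handle tasks ranging from isolated word recognition to large vocabulary continuous speech recognition. To verify the platform's operation, we compare the performance of ECHOS with that of the hidden Markov model toolkit (HTK). On an FSN command recognition task, ECHOS achieves nearly the same recognition rate as HTK, while recognition time roughly doubles because of the object-oriented implementation. On 8000-word continuous speech recognition, ECHOS uses a lexical tree search algorithm, unlike HTK, so the word error rate increases by 40% while recognition time falls to 0.5 times that of HTK.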
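To make the MFCC front end and the 8 or 16 kHz input rates more concrete, the Python sketch below computes mel-spaced filter center frequencies for both sampling rates. It uses the common 2595 * log10(1 + f/700) mel formula. The filter count of 23 and the function names are assumptions for illustration, since the abstract does not state the filterbank settings used in ECHOS.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to mels (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def filter_centers(sample_rate, num_filters=23):
    """Center frequencies (Hz) equally spaced on the mel scale below Nyquist.

    num_filters=23 is a hypothetical default, not a documented ECHOS setting.
    """
    top_mel = hz_to_mel(sample_rate / 2.0)
    step = top_mel / (num_filters + 1)
    return [mel_to_hz(step * (k + 1)) for k in range(num_filters)]


centers_8k = filter_centers(8000)
centers_16k = filter_centers(16000)
print(round(centers_8k[0], 1), round(centers_8k[-1], 1))
print(round(centers_16k[0], 1), round(centers_16k[-1], 1))
```

With the same number of filters, the 16 kHz configuration spreads the centers over twice the bandwidth, which is one reason a front end is typically configured per sampling rate.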

Status Report on the Korean Speech Recognition Platform

권오욱;권석봉;장규철;윤성락;김용래;장광동;김회린;유창동;김봉완;이용주
- 대한음성학회 Conference Proceedings / 대한음성학회 Fall 2005 Conference Proceedings / pp.215-218 / 2005
This paper reports the current development status of the Korean speech recognition platform (ECHOS). We implement new modules, including ETSI feature extraction, backward search with a trigram, and utterance verification. The ETSI feature extraction module is implemented by converting the publicly available software into an object-oriented program. We show that trigram language modeling in the backward search pass reduces the word error rate from 23.5% to 22% on a large vocabulary continuous speech recognition task. We validate the utterance verification module by examining word graphs with confidence scores.
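For readers unfamiliar with the metric, the Python sketch below shows one common way to compute word error rate: word-level edit distance divided by the reference length. It also works out the relative reduction implied by the 23.5% and 22% figures above. The function names and the example sentences are hypothetical, and the abstract does not describe the authors' scoring tool.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Assumes a non-empty reference transcript.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before, after):
    """Fraction of the original error rate that was removed."""
    return (before - after) / before


wer = word_error_rate("turn on the light", "turn on light now")
gain = relative_reduction(23.5, 22.0)
print(wer, round(gain, 4))
```

Under this definition, the drop from 23.5% to 22% is 1.5 points absolute, or roughly 6.4% relative.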

Design of Multimodal User Interface using Speech and Gesture Recognition for Wearable Watch Platform

성기은;박유진;강순주
- 정보과학회 컴퓨팅의 실제 논문지 / Vol. 21, No. 6 / pp.418-423 / 2015
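As technology advances, the functions of wearable devices are becoming more diverse and complex. Because of this complexity, even ordinary users sometimes find the functions hard to use. This paper aims to provide users with a convenient and simple interface. Speech recognition is intuitive and easy to use from the user's point of view, and it allows a wide variety of commands to be entered. However, using speech recognition on a wearable device runs into hardware constraints such as computing power and power consumption. In addition, a wearable device cannot know when the user will issue a voice command, so speech recognition would have to run at all times to receive commands, which is impractical because of the power it consumes. Gesture recognition is used to compensate for these shortcomings of speech recognition. This paper explains how a multimodal interface that combines speech and gestures can provide users with a convenient interface.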
https://doi.org/10.5626/KTCP.2015.21.6.418

Development of Speech Recognition and Synthetic Application for the Hearing Impairment

이원주;김우린;함혜원;윤상운
- 한국컴퓨터정보학회 Conference Proceedings / 한국컴퓨터정보학회 2020 62nd Summer Conference Proceedings, Vol. 28, No. 2 / pp.129-130 / 2020
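This paper presents the implementation of an Android application system that supports communication for hearing-impaired people. Using the STT (Speech to Text) API of Google Cloud Platform, the application recognizes speech and outputs the content of a conversation as text. It also outputs text as speech through speech synthesis using TTS (Text to Speech). In addition, a foreground service (Service) uses the accelerometer sensor to launch the application when the smartphone is shaken two or three times, which improves the usability of the application.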
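The shake-to-launch behavior can be sketched as a threshold test on acceleration magnitude. The example below is written in Python for illustration only, whereas the described system is an Android application. The threshold, the sample format, and the function names are assumptions that do not come from the paper.

```python
import math

GRAVITY = 9.81          # m/s^2
SHAKE_THRESHOLD = 2.0   # multiples of gravity; hypothetical tuning value
MIN_SHAKES = 2          # the abstract mentions two or three shakes


def count_shakes(samples, threshold=SHAKE_THRESHOLD):
    """Count rising edges where acceleration magnitude exceeds the threshold."""
    shakes = 0
    above = False
    for x, y, z in samples:
        g_force = math.sqrt(x * x + y * y + z * z) / GRAVITY
        if g_force > threshold and not above:
            shakes += 1
        above = g_force > threshold
    return shakes


def should_launch(samples):
    """Decide whether a window of (x, y, z) samples contains enough shakes."""
    return count_shakes(samples) >= MIN_SHAKES


resting = (0.0, 0.0, 9.81)   # phone lying still
jolt = (25.0, 0.0, 9.81)     # strong sideways acceleration
window = [resting, jolt, resting, jolt, jolt, resting]
print(count_shakes(window), should_launch(window))
```

Counting rising edges rather than raw samples keeps one long jolt from being counted as several shakes. A production detector would likely also bound the time window in which the shakes must occur.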

VR-simulated Sailor Training Platform for Emergency

박철웅;정진기;양현승
- 한국항해항만학회 Conference Proceedings / 한국항해항만학회 2015 Fall Conference / pp.175-178 / 2015
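To prevent human error, which accounts for 60-80% of the causes of maritime accidents at home and abroad, this paper proposes a virtual reality sailor training platform for emergencies. Using virtual reality technology, the proposed platform provides an interaction method for mastering emergency procedures and a crowd control method for controlling crowd agents in a virtual ship environment. The interaction method uses speech recognition and action recognition to increase training immersion. Crowd control applies an agent behavior model that reflects social characteristics to provide a natural simulation. To test the efficiency of the proposed platform, a virtual training scenario for a fire aboard a ship was implemented as a standalone training platform.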

Speech Recognition based Smart Home System using 5W1H Programming Model

백영태;이세훈;김지성;신보배
- 한국컴퓨터정보학회 Conference Proceedings / 한국컴퓨터정보학회 2017 55th Winter Conference Proceedings, Vol. 25, No. 1 / pp.43-44 / 2017
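When a commercial speech recognition device communicates with other embedded modules and acts as the central processing server of a smart home, it offers only limited modules and services, and functions the manufacturer has not developed are unavailable. To solve this problem, this paper proposes a platform on which users can develop a module with the desired function through simple steps and freely add speech recognition commands. The concept of the proposed platform is not tied to a specific OS and is designed so that it can be provided on various systems. The experimental platform was built on Windows, but the same concept can be applied to build it on other systems.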