Integrated Search | Korea Science

Implementation of a Robust Speech Recognizer in Noisy Car Environment Using a DSP

정익주
- Speech Sciences, Vol. 15, No. 2, pp. 67-77, 2008
In this paper, we implemented a robust speech recognizer on the TMS320VC33 DSP. For this implementation, we built speech and noise databases suited to a recognizer that uses spectral subtraction for noise removal. The recognizer has an explicit structure in that the speech signal is enhanced by spectral subtraction before endpoint detection and feature extraction. This makes the operation of the recognizer clear and helps build HMMs with minimal model mismatch. Because the recognizer was developed for controlling car facilities and for voice dialing, it has two recognition engines: a speaker-independent engine for controlling car facilities and a speaker-dependent engine for voice dialing. We adopted a continuous HMM for the former and a conventional DTW algorithm for the latter. Through various off-line recognition tests, we selected optimal values of several recognition parameters for a resource-limited embedded recognizer, which led to HMMs with three mixtures per state. The speech database with added car noise is enhanced by spectral subtraction before HMM parameter estimation, in order to reduce the model mismatch caused by the nonlinear distortion that spectral subtraction introduces. The hardware module we developed includes a microcontroller for the host interface, which handles the protocol between the DSP and a host.
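The speaker-dependent engine above relies on a conventional DTW algorithm. As a rough illustration only, the following sketch shows textbook DTW template matching in Python. The function names, the template labels, and the toy one-dimensional features are assumptions made for this example; the abstract does not describe the actual features, step constraints, or fixed-point details of the DSP implementation.

```python
import math

def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b frame repeated
                cost[i][j - 1],      # b advances, a frame repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda name: dtw_distance(utterance, templates[name]))

# Hypothetical enrolled templates for voice dialing (toy 1-D features).
templates = {
    "home": [(0.0,), (1.0,), (2.0,)],
    "office": [(5.0,), (5.0,), (4.0,)],
}
# The same pattern as "home", spoken more slowly (first frame repeated).
utterance = [(0.0,), (0.0,), (1.0,), (2.0,)]
best = recognize(utterance, templates)
```

Because DTW allows frames to repeat, the slower utterance still aligns with the "home" template at zero cost, which is why this kind of matching suits a small set of user-enrolled names.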

Classification of Doppler Audio Signals for Moving Target Using Hidden Markov Model in Pulse Doppler Radar

심재훈;이정호;배건성
- Journal of IKEEE, Vol. 22, No. 3, pp. 624-629, 2018
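In pulse Doppler radar (PDR) for surveillance and reconnaissance, moving targets are generally identified by the radar operator, who listens to the Doppler audio signal and relies on training and experience. This paper proposes a method that automatically identifies the class of a moving target using Mel Frequency Cepstral Coefficients (MFCC) as feature parameters and a Hidden Markov Model (HMM) as the classifier, both widely used in speech recognition, and analyzes and verifies the classification performance through simulation.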
https://doi.org/10.7471/ikeee.2018.22.3.624
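HMM-based classification of this kind usually trains one model per target class and picks the class whose model gives the observed sequence the highest likelihood. The sketch below illustrates that decision rule with the scaled forward algorithm. It assumes discrete observation symbols for brevity, whereas the paper uses MFCC features; the class names and all model parameters are invented for the example.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through transitions, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob

def classify(obs, models):
    """Pick the class whose HMM assigns the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical two-state models: (start, transition, emission) per class.
models = {
    "class_a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "class_b": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}
label = classify([0, 0, 0, 1, 0], models)
```

With continuous MFCC vectors, the emission lookup `emit[s][obs[t]]` would be replaced by a density evaluation (for example a Gaussian mixture), while the recursion and the argmax over classes stay the same.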

Semantic Event Detection in Golf Video Using Hidden Markov Model

김천석;추진호;배태면;진성호;노용만
- Journal of Korea Multimedia Society, Vol. 7, No. 11, pp. 1540-1549, 2004
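This paper proposes an algorithm that detects semantic events in golf video using hidden Markov models. The goal is to identify and classify events so as to facilitate highlight-based video indexing and summarization. The proposed algorithm first defines four events through analysis of golf video and designs an HMM for each event from the states that make it up. To estimate the parameters of each event's HMM, it uses ten visual features based on MPEG-7 visual descriptors. Experimental results show that the proposed method achieves good detection performance in identifying a variety of golf events.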

Recognition Time Reduction Technique for the Time-synchronous Viterbi Beam Search

이강성
- The Journal of the Acoustical Society of Korea, Vol. 20, No. 6, pp. 46-50, 2001
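This paper proposes the score cache technique, a new recognition-time reduction algorithm applicable to HMM (Hidden Markov Model) speech recognition systems. Whereas many other techniques accept some loss of recognition accuracy in order to cut computation and recognition time, the proposed score cache technique reduces recognition time substantially without any loss of recognition accuracy. It applies to continuous speech recognition as well as isolated word recognition, and it can greatly reduce recognition time simply by replacing a single function, leaving the structure of an existing recognizer untouched. It can also be combined with existing computation-reduction algorithms for further savings. Applying the score cache technique reduced the amount of computation by up to 54%.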
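The abstract does not say how the score cache is organized. One plausible reading, sketched below purely as an assumption, is that observation scores are memoized per (distribution, frame) pair so that distributions shared by several word models are evaluated only once per frame. The class `ScoreCache`, the scoring function, and the toy word models are all hypothetical and are not taken from the paper.

```python
import math

class ScoreCache:
    """Memoizes observation scores so each (pdf_id, frame index) is computed once."""

    def __init__(self, score_fn):
        self._score_fn = score_fn
        self._cache = {}
        self.calls = 0     # lookups requested by the search
        self.computed = 0  # times the expensive function actually ran

    def __call__(self, pdf_id, frame_index, frame):
        self.calls += 1
        key = (pdf_id, frame_index)
        if key not in self._cache:
            self._cache[key] = self._score_fn(pdf_id, frame)
            self.computed += 1
        return self._cache[key]

def gaussian_log_score(pdf_id, frame):
    """Hypothetical score: unit-variance Gaussian log-density with mean pdf_id."""
    return sum(-0.5 * (x - pdf_id) ** 2 - 0.5 * math.log(2 * math.pi) for x in frame)

def count_evaluations(frames, word_models):
    """Score every state of every word at every frame through the cache."""
    cached = ScoreCache(gaussian_log_score)
    for t, frame in enumerate(frames):
        for pdf_ids in word_models.values():
            for pdf_id in pdf_ids:
                cached(pdf_id, t, frame)
    return cached.calls, cached.computed

frames = [(0.1,), (0.9,), (2.2,)]
# Two toy words that share their first two distributions.
word_models = {"word_a": [0, 1, 2], "word_b": [0, 1, 3]}
calls, computed = count_evaluations(frames, word_models)
```

Since the cache returns exactly the value the original function would have produced, the search result is unchanged, which is consistent with the claim of no accuracy loss. The saving in this toy setup comes only from the shared distributions and says nothing about the 54% figure reported in the paper.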

Alphabetical Gesture Recognition using HMM

윤호섭;소정;민병우
- Proceedings of the Korean Information Science Society Conference, 1998 Fall Conference Proceedings, Vol. 25, No. 2 (2), pp. 384-386, 1998
The use of hand gestures provides an attractive alternative to cumbersome interface devices for human-computer interaction (HCI). Many methods for hand gesture recognition using visual analysis have been proposed, such as syntactical analysis, neural networks (NN), and hidden Markov models (HMM). In our research, an HMM is proposed for alphabetical hand gesture recognition. In the preprocessing stage, the proposed approach consists of three procedures: hand localization, hand tracking, and gesture spotting. The hand localization procedure detects candidate regions on the basis of skin color and motion in an image, using color histogram matching and time-varying edge difference techniques. The hand tracking algorithm finds the centroid of the moving hand region in each frame and connects those centroids to produce a trajectory. For the feature database, the proposed approach uses the mesh feature code as the codebook of the HMM. In our experiments, 1300 alphabetical gestures and 1300 untrained gestures are used for training and testing, respectively. The experimental results demonstrate that the proposed approach yields a high and satisfactory recognition rate for images with different sizes, shapes, and skew angles.
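The hand tracking step described above, finding the centroid of the hand region in each frame and connecting the centroids, can be illustrated with a small sketch. It assumes the localization step has already produced a binary mask per frame; the function names and the toy masks are made up for the example, and skipping frames without a detected region is a choice of this sketch, not something the abstract states.

```python
def centroid(mask):
    """Centroid (row, col) of the nonzero pixels of a binary mask, or None if empty."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        return None
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)

def trajectory(masks):
    """Connect per-frame centroids, skipping frames with no detected hand region."""
    return [p for p in (centroid(m) for m in masks) if p is not None]

# Toy 3x3 masks: the hand region moves from the top-left to the bottom-right.
masks = [
    [[1, 1, 0], [1, 1, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # no region detected in this frame
    [[0, 0, 0], [0, 1, 1], [0, 1, 1]],
]
path = trajectory(masks)
```

The resulting list of points is the raw trajectory; a recognizer would then convert it into discrete codes before feeding it to the HMM.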

Performance Improvement of Intrusion Detection System based on Hidden Markov Model through Privilege Flows Modeling

박혁장;조성배
- Journal of the Korean Information Science Society: Information and Communication, Vol. 29, No. 6, pp. 674-684, 2002
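Existing intrusion detection systems have mainly used misuse detection because it is easy to implement, but coping with new intrusions ultimately requires anomaly detection. Among anomaly detection methods, the HMM is a tool for modeling and evaluating events whose generating mechanism is unknown, and it has the advantage of a higher intrusion detection rate than other techniques. Its drawback is that modeling normal behavior takes a long time. This paper analyzes a variety of intrusion patterns used in real attacks, proposes a modeling technique that extracts events at the time of privilege transitions, and evaluates whether it reduces modeling time and false-positive errors. In the experiments, the detection rate increased and the modeling time decreased compared with modeling all events.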

Speaker Adaptation in VQ and HMM Based Speech Recognition

이대룡
- Proceedings of the Acoustical Society of Korea Conference, 1991 Conference Proceedings, pp. 54-57, 1991
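In this paper, we build speaker-dependent and speaker-independent isolated-word speech recognition systems using HMM and VQ, and study methods of adapting them to a new speaker. Speaker adaptation methods fall broadly into two groups: those that adapt the VQ codebook and those that adapt the HMM parameters. For codebook adaptation we propose two algorithms. The first quantizes the new speaker's adaptation speech against the existing codebook and takes the mean of the adaptation speech assigned to each codevector as the new speaker's codebook. The second, when quantizing the new speaker's adaptation speech against the reference codebook, takes into account in the distance-error computation the probability that each HMM state emits each codevector, so that adaptation speech is indexed to a codevector whose emission probability is very high even if its distance error is large; the mean of all adaptation speech data assigned to each codevector then becomes the new codebook. This takes 5 to 10 times less computation time than building a new codebook for the adaptation speech data with the LBG algorithm initialized from the existing reference codebook. The adaptation speech data are then re-indexed with the new codebook, and the resulting index sequences are used to adapt the HMM parameters. For codebook adaptation, the proposed algorithms cut computation time by a factor of 5 to 10 relative to the existing adaptation method while achieving better recognition rates. In addition, for the same adaptation method, adapting a speaker-independent model gave better recognition results than adapting a speaker-dependent model.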
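The first codebook adaptation algorithm described above amounts to a single quantize-and-average pass. The sketch below illustrates only that first variant under stated assumptions: Euclidean distance is used for quantization, and a codevector that receives no adaptation data is left unchanged, which the abstract does not specify. The function names and the toy vectors are invented for the example, and the second, HMM-weighted variant is not shown.

```python
import math

def nearest(codebook, vec):
    """Index of the codevector closest to vec in Euclidean distance."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], vec))

def adapt_codebook(codebook, adaptation_vectors):
    """Replace each codevector by the mean of the adaptation vectors quantized to it."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for vec in adaptation_vectors:
        i = nearest(codebook, vec)
        counts[i] += 1
        for d in range(dim):
            sums[i][d] += vec[d]
    return [
        # Assumption: codevectors with no adaptation data keep their old value.
        tuple(s / counts[i] for s in sums[i]) if counts[i] else tuple(codebook[i])
        for i in range(len(codebook))
    ]

reference = [(0.0, 0.0), (10.0, 10.0), (20.0, 20.0)]
new_speaker = [(1.0, 1.0), (3.0, 1.0), (9.0, 11.0)]
adapted = adapt_codebook(reference, new_speaker)
```

Unlike LBG, which iterates quantization and re-estimation until convergence, this is one pass over the adaptation data, which is consistent with the reported reduction in computation time.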

On the Use of a Parallel-Branch Subunit Model in Continuous HMM for Improved Word Recognition

박용규;은종관
- The Journal of the Acoustical Society of Korea, Vol. 14, No. 2E, pp. 25-32, 1995
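To improve word recognition performance, we propose the use of a parallel-branch subunit model. In a continuous-density HMM, this model is obtained by branching each subunit according to its probability distribution functions (mixture components). Results show that the proposed method achieves a higher recognition rate than the previously proposed parallel-branch subunit model [1] and the single-branch model. In this study, a high recognition rate is obtained through an appropriate combination of the number of mixture components and the number of branches for each subunit. A set of 1036 Korean isolated words was used in the recognition experiments.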

A Segmentation-Based HMM and MLP Hybrid Classifier for English Legal Word Recognition

김계경;김진호;박희주
- Journal of the Korean Institute of Intelligent Systems, Vol. 11, No. 3, pp. 200-207, 2001
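This paper proposes a recognition system for handwritten English legal words on bank checks that combines a segmentation-based hidden Markov model with a multi-layer perceptron. To recognize the segmentation results of variable-length handwritten English words, an explicit segmentation-based word-level recognizer is implemented with hidden Markov models, and an implicit segmentation-based word-level recognizer is implemented with a multi-layer perceptron. The two heterogeneous recognizers are then combined according to a new combined probability estimation scheme so as to maximize their complementarity. Applied to the CENPARMI English check database of Concordia University, Canada, the proposed system achieved comparatively better recognition performance than previous work.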

Implementation of HMM-Based Speech Recognizer Using TMS320C6711 DSP

Bae Hyojoon;Jung Sungyun;Son Jongmok;Kwon Hongseok;Kim Siho;Bae Keunsung
- Proceedings of the Institute of Electronics Engineers of Korea Conference, 2004 ICEIC: The International Conference on Electronics Informations and Communications, pp. 391-394, 2004
This paper focuses on the DSP implementation of an HMM-based, speaker-independent speech recognizer that can handle a vocabulary of several hundred words. First, we develop an HMM-based speech recognition system on the PC that operates on a frame basis, with feature extraction and Viterbi decoding processed in parallel to keep the processing delay as small as possible. Techniques such as linear discriminant analysis, state-based Gaussian selection, and a phonetic tied mixture model are employed to reduce the computational burden and memory size. The system is then optimized and compiled for the TMS320C6711 DSP for real-time operation. The implemented system uses 486 kbytes of memory for data and acoustic models and 24.5 kbytes for program code. A maximum processing time of 29.2 ms per 32 ms frame of speech confirms the real-time operation of the implemented system.
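The real-time claim can be checked with simple arithmetic: a frame-synchronous recognizer keeps up with the input as long as the worst-case processing time per frame stays below the duration of speech that each frame covers. The short sketch below works this out from the two figures in the abstract, assuming that consecutive 32 ms frames do not overlap, which the abstract does not state explicitly. The function and variable names are illustrative.

```python
def real_time_factor(processing_ms, frame_ms):
    """Ratio of processing time to audio duration; below 1.0 means real time."""
    return processing_ms / frame_ms

MAX_PROCESSING_MS = 29.2  # worst-case time per frame, from the abstract
FRAME_MS = 32.0           # speech covered by one frame, from the abstract

rtf = real_time_factor(MAX_PROCESSING_MS, FRAME_MS)  # about 0.91
headroom_ms = FRAME_MS - MAX_PROCESSING_MS           # about 2.8 ms spare per frame
runs_in_real_time = rtf < 1.0
```

Under that assumption, the worst case uses roughly 91% of the available time and leaves a little under 3 ms of slack per frame.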

964 search results (processing time: 0.025 s)
