• Title/Summary/Keyword: word spotting


Utterance Verification using Phone-Level Log-Likelihood Ratio Patterns in Word Spotting Systems (핵심어 인식기에서 단어의 음소레벨 로그 우도 비율의 패턴을 이용한 발화검증 방법)

  • Kim, Chong-Hyon;Kwon, Suk-Bong;Kim, Hoi-Rin
    • Phonetics and Speech Sciences
    • /
    • v.1 no.1
    • /
    • pp.55-62
    • /
    • 2009
  • This paper proposes an improved method for verifying keyword segments produced by a word spotting system. First, a baseline word spotting system is implemented. To improve its performance, we use a two-pass structure consisting of a word spotting system followed by an utterance verification system. Verifying keywords with a basic likelihood ratio test (LRT) based utterance verification system has certain problems that degrade performance. We therefore propose a method that uses phone-level log-likelihood ratio (PLLR) patterns in computing the confidence measure for each keyword. The proposed method generates weights according to the PLLR patterns and assigns a different weight to each phone when computing the confidence measure of a keyword. The proposed method is shown to be better suited to word spotting systems and improves the final word spotting accuracy.

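The idea of a phone-weighted confidence measure in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how the weights are derived from the PLLR patterns, so the weights here are plain inputs; the function names, the duration normalization, and the threshold value are all assumptions made for illustration, not the authors' method.

```python
from typing import Sequence


def phone_llr(target_loglik: float, anti_loglik: float, n_frames: int) -> float:
    """Duration-normalized log-likelihood ratio for one phone segment.

    target_loglik: log-likelihood under the phone model.
    anti_loglik:   log-likelihood under an alternative (anti) model.
    """
    if n_frames <= 0:
        raise ValueError("a phone segment needs at least one frame")
    return (target_loglik - anti_loglik) / n_frames


def keyword_confidence(pllrs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of phone-level LLRs over the phones of one keyword.

    The weights are hypothetical inputs; with equal weights this reduces
    to a plain average of the phone-level ratios.
    """
    if not pllrs or len(pllrs) != len(weights):
        raise ValueError("need exactly one weight per phone")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * r for w, r in zip(weights, pllrs)) / total


def verify_keyword(pllrs: Sequence[float], weights: Sequence[float],
                   threshold: float = 0.0) -> bool:
    """Accept the keyword hypothesis when its confidence reaches the threshold."""
    return keyword_confidence(pllrs, weights) >= threshold
```

With equal weights, one strongly negative phone can cancel a strongly positive one; unequal weights let the verifier emphasize the phones considered more reliable.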

A study on the recognition of continuous speech using CHMM word spotting (CHMM Word Spotting 기법을 이용한 연속음성 인식에 관한 연구)

  • 김수훈
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06c
    • /
    • pp.373-377
    • /
    • 1994
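  • This paper examines HMM word spotting techniques for building a continuous speech recognition system. The HMM word spotting techniques used in the experiments are the O(n)DP method and the OPDP (One Pass DP) method. Two recognition systems are considered: one that uses only the mel-cepstrum as its parameter, and one that combines it with regression coefficients as dynamic parameters. The recognition algorithms are the O(n)DP method and the One Pass DP method with syntactic control by a finite-state automaton. The recognition units are a mixture of syllables and words, and all training was carried out at the syllable level. The recognition results of the O(n)DP and OPDP methods were compared on 25 continuous speech sentences to verify the effect of syntactic control on continuous speech recognition. In the experiments, the average recognition rates were 90.6% and 90.9% for O(n)DP and 98.4% and 98.6% for OPDP, respectively; syntactic control by a finite-state automaton improved the recognition rate by 7.5% on average.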


Keyword Spotting on Hangul Document Images Using Character Feature Models (문자 별 특징 모델을 이용한 한글 문서 영상에서 키워드 검색)

  • Park, Sang-Cheol;Kim, Soo-Hyung;Choi, Deok-Jai
    • The KIPS Transactions:PartB
    • /
    • v.12B no.5 s.101
    • /
    • pp.521-526
    • /
    • 2005
  • In this paper, we propose a keyword spotting system as an alternative search method for poor-quality Korean document images and compare it with an OCR-based document retrieval system. The system consists of character segmentation, feature extraction for the query keyword, and word-to-word matching. In the character segmentation step, we propose an effective method for removing the connectivity between adjacent characters and a character segmentation method that minimizes the variance of character widths. In the query creation step, the feature vector for the query is constructed by combining per-typeface character models. In the matching step, word-to-word matching is applied based on character-to-character matching. We demonstrate that the proposed keyword spotting system is more efficient than the OCR-based one for searching keywords in Korean document images, especially when document quality is poor and the point size is small.

A Study of Fundamental Frequency for Focused Word Spotting in Spoken Korean (한국어 발화음성에서 중점단어 탐색을 위한 기본주파수에 대한 연구)

  • Kwon, Soon-Il;Park, Ji-Hyung;Park, Neung-Soo
    • The KIPS Transactions:PartB
    • /
    • v.15B no.6
    • /
    • pp.595-602
    • /
    • 2008
  • The focused word of each sentence helps in recognizing and understanding spoken Korean. To find a method for spotting focused words in speech signals, we analyzed the average and variance of the fundamental frequency (F0) and the average energy extracted from the focused word and from the other words in a sentence, using speech data from 100 spoken sentences. The results showed that focused words have either a higher relative average F0 or a higher relative variance of F0 than other words. Our findings contribute to characterizing the prosody of spoken Korean and to keyword extraction based on natural language processing.
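
The relative F0 statistics mentioned in the abstract above could be computed along the following lines. This is only a sketch under stated assumptions: the abstract does not define "relative", so each word's F0 mean and variance are divided here by the sentence-wide mean and variance, and the function name and input layout (one list of voiced-frame F0 values in Hz per word) are hypothetical.

```python
from statistics import fmean, pvariance
from typing import List, Sequence, Tuple


def relative_f0_stats(
    words: Sequence[Tuple[str, Sequence[float]]],
) -> List[Tuple[str, float, float]]:
    """Return (word, relative mean F0, relative F0 variance) for each word.

    "Relative" is assumed to mean the word-level statistic divided by the
    same statistic computed over all voiced frames of the sentence.
    """
    all_frames = [f0 for _, frames in words for f0 in frames]
    if not all_frames:
        raise ValueError("no voiced frames in the sentence")
    sentence_mean = fmean(all_frames)
    sentence_var = pvariance(all_frames)

    stats = []
    for word, frames in words:
        rel_mean = fmean(frames) / sentence_mean
        # A perfectly flat sentence has zero variance; report 0.0 in that case.
        rel_var = pvariance(frames) / sentence_var if sentence_var > 0 else 0.0
        stats.append((word, rel_mean, rel_var))
    return stats
```

A word whose relative mean or relative variance is clearly above 1.0 would be a candidate focused word under the finding reported above; where to place that cutoff is not stated in the abstract.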

A study on the Method of the Keyword Spotting Recognition in the Continuous speech using Neural Network (신경 회로망을 이용한 연속 음성에서의 keyword spotting 인식 방식에 관한 연구)

  • Yang, Jin-Woo;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea
    • /
    • v.15 no.4
    • /
    • pp.43-49
    • /
    • 1996
  • This research proposes a speaker-independent Korean continuous speech recognition system for 247 DDD area names using a keyword spotting technique. The recognition algorithm is the Dynamic Programming Neural Network (DPNN), which integrates DP with a multi-layer perceptron as a model that handles time-axis distortion and spectral pattern variation in speech. To improve performance, we divide the word models into keyword models and non-keyword models. We also experiment with a postprocessing procedure for evaluating system performance. The experimental results are as follows. The isolated-word recognition rate is 93.45% in the speaker-dependent case and 84.05% in the speaker-independent case. The recognition rate for simple dialogue sentences in the keyword spotting experiment is 77.34% in the speaker-dependent case and 70.63% in the speaker-independent case.


Sentence Rejection using Word Spotting Ratio in the Phoneme-based Recognition Network (음소기반 인식 네트워크에서의 단어 검출률을 이용한 문장거부)

  • Kim, Hyung-Tai;Ha, Jin-Young
    • Proceedings of the KSPS conference
    • /
    • 2005.04a
    • /
    • pp.99-102
    • /
    • 2005
  • Research efforts have been made on out-of-vocabulary word rejection to improve the confidence of speech recognition systems. However, little attention has been paid to rejecting non-recognition sentences. With the emergence of pronunciation correction systems based on speech recognition technology, non-recognition sentences need to be rejected to give users more accurate and robust results. In this paper, we introduce a sentence rejection system based on a standard phoneme recognition network that needs no special filler models. Instead, we use the word spotting ratio to determine whether an input sentence is accepted or rejected. Experimental results show that comparable performance, in terms of the average of FRR and FAR, can be achieved using only a standard phoneme-based recognition network.

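The accept/reject decision described in the abstract above can be sketched as follows. The abstract does not define the word spotting ratio, so it is assumed here to be the fraction of the expected sentence's words that the recognizer spotted; the function names and the default threshold are likewise hypothetical. FRR and FAR follow their usual definitions (valid sentences wrongly rejected, invalid sentences wrongly accepted).

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def word_spotting_ratio(expected: Sequence[str], spotted: Sequence[str]) -> float:
    """Fraction of expected words that were spotted (assumed definition)."""
    if not expected:
        raise ValueError("the expected sentence must contain at least one word")
    remaining = Counter(spotted)
    hits = 0
    for word in expected:
        if remaining[word] > 0:  # each spotted token can match only once
            remaining[word] -= 1
            hits += 1
    return hits / len(expected)


def accept_sentence(expected: Sequence[str], spotted: Sequence[str],
                    threshold: float = 0.6) -> bool:
    """Accept the input sentence when enough expected words were spotted."""
    return word_spotting_ratio(expected, spotted) >= threshold


def frr_far(decisions: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Compute (FRR, FAR) from (accepted, is_valid) pairs."""
    pairs = list(decisions)
    valid = [accepted for accepted, is_valid in pairs if is_valid]
    invalid = [accepted for accepted, is_valid in pairs if not is_valid]
    if not valid or not invalid:
        raise ValueError("need both valid and invalid sentences")
    frr = sum(1 for accepted in valid if not accepted) / len(valid)
    far = sum(1 for accepted in invalid if accepted) / len(invalid)
    return frr, far
```

Sweeping the threshold and averaging the resulting FRR and FAR gives the kind of single summary figure the abstract refers to.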

A Study on Keyword Spotting System Using Pseudo N-gram Language Model (의사 N-gram 언어모델을 이용한 핵심어 검출 시스템에 관한 연구)

  • 이여송;김주곤;정현열
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.3
    • /
    • pp.242-247
    • /
    • 2004
  • Conventional keyword spotting systems use a connected word recognition network consisting of keyword models and filler models. As a result, such systems cannot effectively build language models of word occurrence for detecting keywords in a large vocabulary continuous speech recognition system with large text data. To solve this problem, we propose a keyword spotting system that uses a pseudo N-gram language model for detecting keywords, and we investigate how system performance changes with the occurrence frequencies of keywords and filler models. When the unigram probabilities of the keyword and filler models were set to 0.2 and 0.8, respectively, the experiments showed CA (Correctly Accept for In-Vocabulary) of 91.1% and CR (Correctly Reject for Out-Of-Vocabulary) of 91.7%, which corresponds to a 14% improvement in average CA-CR performance over conventional methods in terms of ERR (Error Reduction Rate).

A Study on the Real-time Word Spotting by Continuous density HMM (연속분포 HMM에 의한 실시간 Word Spotting 에 관한 연구)

  • 서상원
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1995.06a
    • /
    • pp.92-95
    • /
    • 1995
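  • This paper describes a real-time robot arm control system using continuous-density HMMs. The system accepts naturally spoken robot arm control commands and performs command recognition and robot control within a keyword recognition framework. Left-to-right HMMs were built for the robot body parts, directions, angles, and motion commands, while the remaining non-keywords were pooled into a garbage HMM that models ergodic state transitions. A separate garbage model for particles, interjections, and similar words, and a garbage model for silence and background noise, were also built and included in training and recognition, so that connected word recognition achieves the effect of keyword spotting. A simple grammatical constraint on the use of keywords was assumed. Speech data were collected from 35 male speakers using conceptual sentences composed for data collection over 30 sentence patterns. The control command recognition rate is above 95% for speakers in the training set and above 90% for speakers outside it. The system also showed favorable recognition performance when non-keywords outside the trained vocabulary were used.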


A Feature-Based Word Spotting for Content-Based Retrieval of Machine-Printed English Document Images (내용기반의 인쇄체 영문 문서 영상 검색을 위한 특징 기반 단어 검색)

  • Jeong, Gyu-Sik;Gwon, Hui-Ung
    • Journal of KIISE:Software and Applications
    • /
    • v.26 no.10
    • /
    • pp.1204-1218
    • /
    • 1999
  • Most existing digital libraries for document image retrieval provide only a limited retrieval service because their indexes are built from document titles and/or abstracts. This paper proposes a word spotting system for full English document image retrieval based on word image shape features. To improve both the efficiency and the precision of retrieval, the system 1) uses a combination of the holistic features used in existing word spotting systems, 2) performs image matching by comparing the order of features in a word in addition to the number of features and their positions, and 3) adopts a two-stage retrieval strategy that first obtains results by image feature matching and then applies OCR (Optical Character Recognition) to part of those results for filtering. The proposed system operates as follows: given a document image, its structure is analyzed and it is segmented into a set of word regions. Word shape features are then extracted and stored. Given a user's text query, the corresponding word image is generated and its features are extracted. These reference features are compared with the stored features to find similar words. The proposed system was implemented on an IBM-PC in a web environment, and experiments were performed with English document images. Experimental results show the effectiveness of the proposed methods.

Phonological Process and Word Recognition in Continuous Speech: Evidence from Coda-neutralization (음운 현상과 연속 발화에서의 단어 인지 - 종성중화 작용을 중심으로)

  • Kim, Sun-Mi;Nam, Ki-Chun
    • Phonetics and Speech Sciences
    • /
    • v.2 no.2
    • /
    • pp.17-25
    • /
    • 2010
  • This study explores whether Koreans exploit their native coda-neutralization process when recognizing words in continuous Korean speech. According to the phonological rules of Korean, the coda-neutralization process must precede the liaison process when the latter occurs between words, so that liaison consonants are coda-neutralized ones such as /b/, /d/, or /g/ rather than non-neutralized ones such as /p/, /t/, /k/, /ʧ/, /ʤ/, or /s/. Consequently, if Korean listeners use their native coda-neutralization rules when processing speech input, word recognition will be hampered when non-neutralized consonants precede vowel-initial targets. Word-spotting and word-monitoring tasks were conducted in Experiments 1 and 2, respectively. In both experiments, listeners recognized words faster and more accurately when vowel-initial target words were preceded by coda-neutralized consonants than when they were preceded by non-neutralized ones. The results show that Korean listeners exploit the coda-neutralization process when processing their native spoken language.
