• Title/Summary/Keyword: phoneme similarity


A Query-by-Speech Scheme for Photo Albuming (음성 질의 기반 디지털 사진 검색 기법)

  • Kim Tae-Sung;Suh Young-Joo;Lee Yong-Ju;Kim Hoi-Rin
    • MALSORI / no.57 / pp.99-112 / 2006
  • In this paper, we introduce two methods for retrieving photos through their attached speech documents. We compare the pattern of a speech query with those of the speech documents recorded in digital cameras, measure their similarities, and retrieve the photos whose speech documents have high similarity scores. In the first approach, a phoneme recognition scheme is used as a pre-processor for the pattern matching; in the second, vector quantization (VQ) and dynamic time warping (DTW) are applied to match the speech query with the documents in the signal domain itself. Experimental results show that the performance of the first approach depends heavily on that of the phoneme recognition, while its processing time is short. The second method greatly improves performance. Its processing time is longer than that of the first method because of DTW, but it can be reduced by using approximation methods.

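The DTW matching step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dtw_distance` and `rank_documents`, the Euclidean frame distance, and the toy one-dimensional feature vectors are all assumptions made for illustration, and the VQ step and the approximation methods from the paper are omitted.

```python
import math

def dtw_distance(query, document):
    """Classic DTW alignment cost between two sequences of feature vectors.

    Assumption: frames are compared with Euclidean distance; the paper's
    actual features and local distance are not given in the abstract.
    """
    n, m = len(query), len(document)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(query[i - 1], document[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def rank_documents(query, documents):
    """Return document names sorted from most to least similar (lowest cost first)."""
    return sorted(documents, key=lambda name: dtw_distance(query, documents[name]))

# Toy data: the second document is a time-stretched copy of the query.
query = [(0.0,), (1.0,), (2.0,)]
documents = {
    "photo_a": [(5.0,), (6.0,)],
    "photo_b": [(0.0,), (1.0,), (1.0,), (2.0,)],
}
ranking = rank_documents(query, documents)
```

The full cost table is quadratic in the sequence lengths, which is consistent with the abstract's remark that the DTW-based method is slower than the phoneme-recognition approach.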

A Study on Phoneme Extractions and Recognitions for Handwritten Korean Characters using Context-Free Grammar (CFG 방법을 이용한 필기체 한글에서의 자소추출과 인식에 관한 연구)

  • 김형래;박인갑;서동필;김에녹
    • Journal of the Korean Institute of Telematics and Electronics B / v.29B no.9 / pp.8-16 / 1992
  • This paper presents a method that recognizes handwritten Korean characters by using a context-free grammar. The input characters are thinned in order to reduce the amount of data, and the thinned characters are converted into one-dimensional strings according to six forms. When a point of contact between phonemes is found, the two phonemes are separated by placing an index mark (\) at that point. The context-free grammar for the input characters is classified into group grammars according to the similarity of phonemes, and the input characters are parsed using the pushdown automata method. Because bent parts occur frequently in handwritten characters, we try to correct them by using a parsing distance measure, which recognizes a character according to the minimum value obtained by measuring the weighted distance between two sentences. In the experiment, the recognition rate was 93.8% for 275 handwritten Korean characters.


Algorithm for Concatenating Multiple Phonemic Units for Small Size Korean TTS Using RE-PSOLA Method

  • Bak, Il-Suh;Jo, Cheol-Woo
    • Speech Sciences / v.10 no.1 / pp.85-94 / 2003
  • In this paper, an algorithm for reducing the size of a text-to-speech database is proposed. The algorithm is based on the characteristics of Korean phonemic units. From the initial database, a reduced phoneme unit set is derived from the articulatory similarity of concatenating phonemes. The speech data were read by one female announcer for 1000 phonetically balanced sentences. All the recorded speech was then segmented by phoneticians. The total size of the original speech data is about 640 MB, including the laryngograph signal. To synthesize the waveform, RE-PSOLA (Residual-Excited Pitch Synchronous Overlap and Add) was used. The voice quality of the synthesized speech was compared with the original speech in terms of spectrographic information and objective tests. The quality of the synthesized speech was not much degraded when the size of the synthesis DB was reduced from 320 MB to 82 MB.


A Study on the Improvement of Automatic Text Recognition of Road Signs Using Location-based Similarity Verification (위치기반 유사도 검증을 이용한 도로표지 안내지명 자동인식 개선방안 연구)

  • Chong, Kyusoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.18 no.6 / pp.241-250 / 2019
  • Road signs are guidance facilities for road users, and the Ministry of Land, Infrastructure and Transport has established and operates a system to make managing these road signs more convenient. The role of road signs will diminish in future autonomous driving, but they will continue to be needed. For accurate machine recognition of the text on road signs, automatic road sign recognition equipment has been developed that applies image-based text recognition technology. Yet misrecognition is common because of irregular specifications and external environmental factors such as manual manufacturing, illumination, light reflection, and rainfall. The purpose of this study is to derive location-based destination names for detecting misrecognition errors that cannot be overcome by image analysis, and to improve the automatic recognition of destination names on road signs by using a Levenshtein similarity verification method based on phoneme separation.
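
As a rough illustration of Levenshtein similarity on separated phonemes, the sketch below decomposes Hangul syllables into jamo with Unicode NFD normalization and compares the resulting sequences. It is only a sketch under assumptions: the helper names (`to_jamo`, `levenshtein`, `jamo_similarity`, `best_match`), the normalization of the distance by the longer sequence length, and the candidate place names are all hypothetical and not taken from the paper.

```python
import unicodedata

def to_jamo(text):
    """Split Hangul syllables into conjoining jamo via canonical decomposition (NFD)."""
    return list(unicodedata.normalize("NFD", text))

def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]

def jamo_similarity(a, b):
    """Similarity in [0, 1]; normalizing by the longer length is an assumption."""
    ja, jb = to_jamo(a), to_jamo(b)
    longest = max(len(ja), len(jb))
    return 1.0 if longest == 0 else 1.0 - levenshtein(ja, jb) / longest

def best_match(recognized, candidates):
    """Pick the candidate destination name closest to the recognized text."""
    return max(candidates, key=lambda name: jamo_similarity(recognized, name))

# Hypothetical location-based candidate list; "서을" stands in for a misread "서울".
candidates = ["서울", "수원", "부산"]
corrected = best_match("서을", candidates)
```

Comparing at the jamo level makes a single wrong vowel count as one edit out of five phonemes, whereas a syllable-level comparison would treat it as one wrong syllable out of two.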

Error Correction Methode Improve System using Out-of Vocabulary Rejection (미등록어 거절을 이용한 오류 보정 방법 개선 시스템)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of Digital Convergence / v.10 no.8 / pp.173-178 / 2012
  • In a model generated for a recognition vocabulary, tri-phones that were not prepared in advance are produced. The model therefore cannot generate initial parameter estimates for those words, and the system has the disadvantage that it cannot configure the model. As a result, the precision of the Gaussian model falls and recognition degrades. We propose an error correction system that uses an out-of-vocabulary rejection algorithm. When the system creates a vocabulary recognition model, the recognition rate is improved by rejecting vocabulary that is not registered. In addition, the system captures lexical analysis and meaning using probability distributions, and deactivates the string from before the phoneme change was applied. The system analysis determines the error correction rate using the phoneme similarity rate and reliability. In the performance comparison, the error correction rate improved by 2.8% with the method using error patterns, fault patterns, and meaning patterns.

Voice Recognition using a Phoneme based Similarity Algorithm in Home Networks (음소 기반의 유사율 알고리즘을 이용한 Home Network 환경에서의 음성 인식)

  • Lee, Chang-Sub;Yu, Jae-Bong;Park, Joon-Seok;Yang, Soo-Ho;Kim, Yu-Seop;Park, Chan-Young
    • Proceedings of the Korea Information Processing Society Conference / 2005.05a / pp.767-770 / 2005
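  • Voice data transmitted over a network suffers data loss during transmission because of external factors such as noise. When voice data transmitted in this way is passed through a speech recognizer, the recognition rate is lower than when the data is fed to the recognizer directly. To improve the speech recognition rate for controlling a home network, this study proposes a scheme that takes voice data as input, applies a phoneme-based similarity algorithm to examine its similarity to the words registered in a previously built dictionary of home-network terms, and controls the home network with the extracted result. When the phoneme-based similarity algorithm was used together with multiple utterances at a threshold of 85%, the recognition rate for words matching the dictionary was 100%, and the false recognition rate for words not in the dictionary was reduced to 2%.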

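The threshold-based dictionary matching described in the abstract above can be sketched as follows. The abstract does not specify the similarity algorithm, so `difflib.SequenceMatcher.ratio()` from the Python standard library is swapped in here as a stand-in. The dictionary entries, the romanized phoneme symbols, and the function names are hypothetical. How the paper combines multiple utterances is not stated, so that step is left out.

```python
from difflib import SequenceMatcher

# Hypothetical command dictionary: command -> phoneme sequence (romanized for readability).
DICTIONARY = {
    "light_on": ["b", "u", "l", "k", "y", "eo", "j", "o"],
    "door_open": ["m", "u", "n", "y", "eo", "l", "eo"],
}

def similarity_rate(a, b):
    """Stand-in similarity in [0, 1]: 2 * matches / total length (difflib's ratio)."""
    return SequenceMatcher(None, a, b).ratio()

def match_command(phonemes, dictionary=DICTIONARY, threshold=0.85):
    """Return the best dictionary command, or None if it scores below the threshold."""
    best_word, best_score = None, 0.0
    for word, reference in dictionary.items():
        score = similarity_rate(phonemes, reference)
        if score > best_score:
            best_word, best_score = word, score
    # Rejecting low-scoring input is what keeps out-of-dictionary words from
    # being mapped onto a command.
    return best_word if best_score >= threshold else None

# One phoneme corrupted in transmission: still accepted (14/16 = 0.875).
noisy = ["b", "u", "r", "k", "y", "eo", "j", "o"]
# A word that is not in the dictionary: rejected.
unknown = ["s", "a", "r", "a", "ng"]
accepted = match_command(noisy)
rejected = match_command(unknown)
```

The threshold trades the two error types reported in the abstract against each other: raising it rejects more out-of-dictionary words but also more noisy in-dictionary utterances.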

Facial Animation Generation by Korean Text Input (한글 문자 입력에 따른 얼굴 에니메이션)

  • Kim, Tae-Eun;Park, You-Shin
    • The Journal of the Korea institute of electronic communication sciences / v.4 no.2 / pp.116-122 / 2009
  • In this paper, we propose a new method that generates the mouth-shape trajectory for characters entered by the user. It is based on basic syllables and is suited to mouth-shape generation. We examine the principles by which Korean characters are formed, find similarities among mouth shapes, and select basic syllables accordingly. We also consider the articulation of each phoneme, create a new mouth-shape trajectory, and apply it to the face of a 3D avatar.
