Search | Korea Science

A Study on the Improvement of Automatic Text Recognition of Road Signs Using Location-based Similarity Verification (위치기반 유사도 검증을 이용한 도로표지 안내지명 자동인식 개선방안 연구)

Chong, Kyusoo
- The Journal of The Korea Institute of Intelligent Transport Systems, v.18 no.6, pp.241-250, 2019
Road signs are guide facilities for road users, and the Ministry of Land, Infrastructure and Transport has established and operates a system to make these road signs easier to manage. The role of road signs will diminish with future autonomous driving, but they will continue to be needed. To recognize the text on road signs accurately by machine, automatic road sign recognition equipment has been developed that applies image-based text recognition technology. Yet misrecognition is frequent because of irregular specifications and factors such as manual manufacturing, illumination, light reflection, and rainfall. The purpose of this study is to derive location-based destination names in order to find misrecognition errors that image analysis cannot overcome, and to improve the automatic recognition of road-sign destination names by using a Levenshtein similarity verification method based on phoneme separation.
https://doi.org/10.12815/kits.2019.18.6.241
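
The phoneme-separated Levenshtein check described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names (`to_jamo`, `levenshtein`, `similarity`, `best_match`), the normalization by the longer jamo sequence, and the candidate list in the usage note are all assumptions. Each Hangul syllable is split into its jamo with Unicode arithmetic, so two names that differ by one vowel score as closer than two names that differ by a whole syllable.

```python
# Hangul syllables occupy U+AC00..U+D7A3 and decompose arithmetically
# into an initial consonant, a medial vowel, and an optional final consonant.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def to_jamo(text):
    """Split every precomposed Hangul syllable into jamo; keep other characters."""
    out = []
    for ch in text:
        idx = ord(ch) - 0xAC00
        if 0 <= idx < 11172:
            out.append(INITIALS[idx // 588])
            out.append(MEDIALS[(idx % 588) // 28])
            if idx % 28:
                out.append(FINALS[idx % 28])
        else:
            out.append(ch)
    return out


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Assumed normalization: 1 minus distance over the longer jamo sequence."""
    ja, jb = to_jamo(a), to_jamo(b)
    longest = max(len(ja), len(jb))
    return 1.0 if longest == 0 else 1 - levenshtein(ja, jb) / longest


def best_match(recognized, candidates):
    """Pick the candidate name most similar to the recognized text."""
    return max(candidates, key=lambda name: similarity(recognized, name))
```

With the illustrative candidates `["대전", "대천", "부산"]`, `best_match("대젼", ...)` selects `대전`, because only one jamo differs. In the setting the abstract describes, the candidate list would presumably be the destination names that are plausible for the sign's location.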

The syllable recovery rule-based system and the application of a morphological analysis method for the post-processing of a continuous speech recognition (연속음성인식 후처리를 위한 음절 복원 rule-based 시스템과 형태소분석기법의 적용)

박미성;김미진;김계성;최재혁;이상조
- Journal of the Korean Institute of Telematics and Electronics C, v.36C no.3, pp.47-56, 1999
Various phonological alterations occur in continuous Korean speech, and they are one of the major reasons Korean speech recognition is difficult. This paper presents a rule-based system that converts the character string output by speech recognition into a text-based character string. The recovery results are morphologically analyzed, and only correct text strings are generated. Recovery follows four kinds of rules: a syllable-boundary final-consonant/initial-consonant recovery rule, a vowel-process recovery rule, a last-syllable final-consonant recovery rule, and a monosyllable process rule. We use x-clustering information for efficient recovery and postfix-syllable frequency information to restrict the recovery candidates passed to the morphological analyzer. Because the system is rule-based, it does not require a large pronunciation dictionary or phoneme dictionary, and it has the advantage that an existing text-based morphological analyzer can be used.
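
To make the first of the four rule types concrete, here is a minimal sketch of undoing liaison at a syllable boundary, where a final consonant has been pronounced as the next syllable's initial (for example, 국어 pronounced as 구거). The abstract does not give the actual rules, so the function names and the single rule below are assumptions; the sketch only generates candidates, which a morphological analyzer would then have to accept or reject.

```python
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def decompose(ch):
    """Return (initial, medial, final) indices, or None for non-Hangul."""
    idx = ord(ch) - 0xAC00
    if not 0 <= idx < 11172:
        return None
    return idx // 588, (idx % 588) // 28, idx % 28


def compose(initial, medial, final):
    """Build a precomposed Hangul syllable from jamo indices."""
    return chr(0xAC00 + (initial * 21 + medial) * 28 + final)


def liaison_candidates(pronounced):
    """Return the input plus each variant obtained by undoing one liaison.

    Assumed rule: if a syllable has no final consonant and the next syllable
    starts with a consonant that can also be a final, move that consonant
    back and give the next syllable the silent initial ㅇ.
    """
    candidates = [pronounced]
    for pos in range(len(pronounced) - 1):
        left = decompose(pronounced[pos])
        right = decompose(pronounced[pos + 1])
        if left is None or right is None:
            continue
        consonant = INITIALS[right[0]]
        if left[2] != 0 or consonant == "ㅇ" or consonant not in FINALS:
            continue
        restored = compose(left[0], left[1], FINALS.index(consonant)) + compose(
            INITIALS.index("ㅇ"), right[1], right[2]
        )
        candidates.append(pronounced[:pos] + restored + pronounced[pos + 2:])
    return candidates
```

Keeping the unchanged input in the candidate list reflects that not every open syllable followed by a consonant is the result of liaison; choosing among candidates is left to the later analysis step.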

A Study on the Multilingual Speech Recognition for On-line International Game (온라인 다국적 게임을 위한 다국어 혼합 음성 인식에 관한 연구)

Kim, Suk-Dong;Kang, Heung-Soon;Woo, In-Sung;Shin, Chwa-Cheul;Yoon, Chun-Duk
- Journal of Korea Game Society, v.8 no.4, pp.107-114, 2008
Demand for multilingual speech recognition in games has grown, along with the need for a multilingual system that represents the phonetics of many different languages with a single phonetic model. Accordingly, research is needed on a multilingual system that can represent speech composed of various languages with a single lexical model. This paper is basic research toward an integrated system built on a multilingual lexical model, and it presents a system that recognizes Korean and English speech as IPA (International Phonetic Alphabet). We focused on finding an IPA model that satisfies Korean and English phonemes simultaneously. As a result, we obtained a speech recognition rate of 90.62% for Korean and 91.71% for English.

A Phonetic Study of German (2) (독어음의 음성학적 고찰(2) - 현대독어의 복모음에 관하여 -)

Yun Jong-sun
- MALSORI, no.19_20, pp.33-42, 1990
Those who are interested in the German diphthongs will find that they are classified into three kinds of forms according to their gliding directions: closing, centring, and rising. The German [aI], for example, which derives from [i:] of Middle High German, is regarded as a distinctive feature that distinguishes New High German from Middle High German. The diphthong [aI] is called a falling one, because the sonority of the sound diminishes as the articulation proceeds: the end part of the diphthong [aI] is less sonorous than the beginning part. In most German diphthongs the diminution of prominence is caused by the fact that the end part is inherently less sonorous than the beginning. This applies to the other closing and centring diphthongs as well. This diminution of sonority influences methods of constructing systems of phonetic notation. The above-mentioned less sonorous end part [I] of the diphthong shows that it differs from an analogous sound in another context. It is useful to demonstrate the occurrence of particular allophones by introducing special symbols to denote them (here: aI→ae). Forms of transcription embodying extra symbols are called narrow. But strict adherence to the principle 'one sound, one symbol' would involve introducing a large number of symbols, which would render phonetic transcriptions cumbrous and difficult to read. A broad style of transcription provides 'one symbol for each phoneme' of the language that is transcribed. Phonemic transcriptions are simple and unambiguous to everyone who knows the principles governing the use of allophones in the language transcribed.
Among the German ways of transcribing diphthongs (aɪ, aʊ, ɔʏ: ae, ao, ɔø; ae, ao, ɔø), the phonemic (broad) transcription is generally to be recommended, for instance in teaching the pronunciation of a foreign language, since it combines accuracy with the greatest measure of simplicity (some passages and terms from Daniel Jones).

Effect of Frenulotomy in Tongue-Tie : Focused on Alveolar Sounds (설소대 단축증 아동의 설소대 절개술 전 후 치조음 발음 양상의 변화)

안서지;양해동;김병철;신지철;고중화
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics, v.11 no.1, pp.5-11, 2000
Background and Objectives: Tongue-tie, or partial ankyloglossia, is manifested by an abnormally short and thick lingual frenulum. The degree of tongue-tie ranges from mild to, rarely, severe, and the treatment of choice is frenulotomy. Theoretically, tongue-tie can affect the production of alveolar sounds. The purpose of this study is to evaluate the degree of articulation problems and the efficacy of frenulotomy itself on alveolar sounds in tongue-tie patients. Materials and Methods: The authors prospectively performed preoperative and postoperative speech evaluations of tongue-tie patients using a picture consonant test. The percentage of consonants correct (PCC) and the mean value of each alveolar phoneme according to articulation site were evaluated. To exclude factors other than frenulotomy itself that improve articulation, the postoperative picture consonant test was performed 1 month after surgery. Results: Preoperative speech evaluation was performed on 37 patients (21 male, 16 female) and postoperative speech evaluation on 17 patients (9 male, 8 female); the other 20 patients were lost to follow-up. Low PCC was observed in tongue-tie patients, and among patients 2 to 4 years old, PCC was higher in females than in males. Overall PCC improved after frenulotomy. The preoperative mean value of liquids and fricatives was lower than that of the other alveolar phonemes (p < 0.05), and it improved postoperatively (p < 0.05). Conclusion: Frenulotomy itself can improve the articulation of liquids and fricatives at short-term follow-up. Speech therapy would be needed to improve the other alveolar phonemes.
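
For readers unfamiliar with the PCC measure used above, it is the share of target consonants produced correctly, expressed as a percentage. The helper below is a minimal sketch under the assumption that target and produced consonants are already aligned one to one; the function name and the sample phoneme labels are illustrative and not taken from the study.

```python
def percentage_consonants_correct(target, produced):
    """PCC = 100 * (correctly produced consonants) / (target consonants).

    Assumes the two sequences are aligned position by position.
    """
    if not target or len(target) != len(produced):
        raise ValueError("target and produced must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(target, produced))
    return 100.0 * correct / len(target)
```

For instance, three correct consonants out of four targets gives a PCC of 75.0.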

Evaluation of Word Recognition System For Mobile Telephone (이동전화를 위한 단어 인식기의 성능평가)

Kim Min-Jung;Hwang Cheol-Jun;Chung Ho-Youl;Chung Hyun-Yeol
- Proceedings of the Acoustical Society of Korea Conference, spring, pp.92-95, 1999
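As a basic experiment toward implementing a voice-operated mobile phone, this paper evaluates a recognizer by running word recognition experiments on directly recorded word data that are frequently used on mobile phones. The word database used in the experiments consists of utterances from 600 speakers: 360 from Seoul (180 male, 180 female) and 240 from the Gyeongsang region (120 male, 120 female). The vocabulary comprises 55 words, including the main functions and control words commonly used on mobile phones as well as digits, and each speaker uttered each word three times. The data were collected in a somewhat noisy office environment at a sampling rate of 8 kHz. The basic recognition units were 48 phoneme-like units (PLUs), and the feature parameters were the mel-cepstrum as static features and regression coefficients as dynamic features. The OPDP (One Pass Dynamic Programming) algorithm was used in the recognition experiments. Models were trained both per region and independently of region, by adapting existing 16 kHz initial models to the data collected at 8 kHz. Both the regional models and the region-independent model were evaluated on test data for each region and on test data pooled across regions. The experiments yielded relatively high recognition rates of 90% or more, confirming the effectiveness of the recognition system.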

Fast Speech Recognition System using Classification of Energy Labeling (에너지 라벨링 그룹화를 이용한 고속 음성인식시스템)

Han Su-Young;Kim Hong-Ryul;Lee Kee-Hee
- Journal of the Korea Society of Computer and Information, v.9 no.4 s.32, pp.77-83, 2004
This paper proposes the Classification of Energy Labeling. Energy parameters of the input signal, extracted from each phoneme, are labeled, and labeling groups are selected according to the detected energies of the input signal. DTW is then performed only within the selected labeling group, which makes DTW processing faster than the previous algorithm. Because this method presupposes accurate parameter detection in the steps that detect speech duration and energy parameters, it uses variable windows whose size is determined by the pitch period. The pitch period is detected first; the window scale is then set between 200 frames and 300 frames. The proposed method makes it possible to cancel the influence of windowing and reduces the computational complexity by 25%.
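
The core idea, running DTW only against templates that share the input's energy label, can be sketched as follows. This is a simplified illustration and not the authors' algorithm: the label here is a single quantized mean energy per signal rather than per-phoneme labels, there are no pitch-synchronous windows, and the names `energy_label`, `build_groups`, `recognize`, and the `bin_width` parameter are all assumptions.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, 1):
            cur.append(abs(x - y) + min(prev[j], cur[j - 1], prev[j - 1]))
        prev = cur
    return prev[-1]


def energy_label(signal, bin_width=1.0):
    """Quantize the mean energy of a non-empty signal into an integer label."""
    energy = sum(s * s for s in signal) / len(signal)
    return int(energy // bin_width)


def build_groups(templates):
    """Group reference templates (name -> sequence) by their energy label."""
    groups = {}
    for name, seq in templates.items():
        groups.setdefault(energy_label(seq), {})[name] = seq
    return groups


def recognize(signal, groups):
    """Run DTW only inside the group that matches the input's energy label."""
    group = groups.get(energy_label(signal), {})
    if not group:
        return None
    return min(group, key=lambda name: dtw_distance(signal, group[name]))
```

The saving comes from skipping DTW for every template outside the selected group; the trade-off in this simplified form is that an input whose label falls into the wrong bin is never compared with its true template.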

The Analysis and Recognition of Korean Speech Signal using the Phoneme (음소에 의한 한국어 음성의 분석과 인식)

Kim, Yeong-Il;Lee, Geon-Gi;Lee, Mun-Su
- The Journal of the Acoustical Society of Korea, v.6 no.2, pp.38-47, 1987
Because the Korean language can be classified phonemically according to the characteristics and structure of its pronunciation, Korean syllables can be divided into phonemes such as consonants and vowels. The divided phonemes are analyzed using the partial autocorrelation method, with partial autocorrelation coefficients of order 15. The analysis shows that the same consonants, vowels, and end consonants have similar characteristics across syllables. The experiment was carried out by dividing 675 syllables into consonants, vowels, and end consonants. The recognition rates for consonants, vowels, end consonants, and syllables are 85.0%, 90.7%, 85.5%, and 72.1%, respectively. In conclusion, Korean syllables divided into phonemes can be analyzed and recognized with minimal data and a short processing time. Furthermore, it is shown that Korean syllables, words, and sentences can be recognized in the same way.
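
Partial autocorrelation (PARCOR) coefficients such as the order-15 set mentioned above are commonly obtained with the Levinson-Durbin recursion on a frame's autocorrelation sequence. The sketch below shows that standard recursion; it is not taken from the paper, the function names are assumptions, and the sign convention (prediction polynomial A(z) = 1 + a1 z^-1 + ...) is one of two in common use, so other references may report the coefficients with the opposite sign.

```python
def autocorrelation(signal, max_lag):
    """Biased-sum autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(signal)
    return [
        sum(signal[t] * signal[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def parcor(r, order):
    """Levinson-Durbin recursion returning reflection coefficients k_1..k_order.

    r must hold at least order + 1 autocorrelation values.
    """
    a = [1.0] + [0.0] * order  # prediction polynomial coefficients
    err = r[0]                 # prediction error energy
    ks = []
    for m in range(1, order + 1):
        if err <= 0:
            raise ValueError("prediction error vanished; frame is degenerate")
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= 1 - k * k
    return ks
```

For a frame of samples `frame`, `parcor(autocorrelation(frame, 15), 15)` would give an order-15 feature vector of the kind the abstract refers to; how the paper windows the signal or compares the vectors is not stated there.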

Efficient context dependent process modeling using state tying and decision tree-based method (상태 공유와 결정트리 방법을 이용한 효율적인 문맥 종속 프로세스 모델링)

Ahn, Chan-Shik;Oh, Sang-Yeob
- Journal of Korea Multimedia Society, v.13 no.3, pp.369-377, 2010
In vocabulary recognition systems based on HMMs (Hidden Markov Models), models unseen during training lead to a low recognition rate. When the recognition vocabulary is modified or extended, the models must be rebuilt from the collected database and retrained, which incurs additional cost and time. This study proposes an efficient context-dependent process modeling method using decision tree-based state tying. The proposed method reduces model rebuilding and provides robust and accurate context-dependent acoustic modeling. It also reduces the number of models and handles models unseen in training by substituting a similar context-dependent phoneme model. The system achieved a vocabulary-dependent recognition rate of 98.01% and a vocabulary-independent recognition rate of 97.38%.

Implementation of TTS Engine for Natural Voice (자연음 TTS(Text-To-Speech) 엔진 구현)

Cho Jung-Ho;Kim Tae-Eun;Lim Jae-Hwan
- Journal of Digital Contents Society, v.4 no.2, pp.233-242, 2003
A TTS (Text-To-Speech) system is a computer-based system that should be able to read any text aloud. Producing a natural voice requires general knowledge of the language and a great deal of time and effort. Furthermore, the sound pattern of English is variable, involving both phonemic and morphological analysis, so it is very difficult to maintain a consistent pattern. To handle these problems, we present a system based on phonemic analysis of vowels and consonants. By analyzing phonological variations frequently found in spoken English, we derived phonemic contexts that trigger the multilevel application of the corresponding phonological processes, which consist of phonemic and allophonic rules. As a result, we have rule data organized by phoneme and an engine that economizes on system resources. The proposed system can be used not only in communication systems but also in office automation and other applications.

