• Title/Abstract/Keyword: Recognition Errors


음성 자료에 대한 규칙 기반 Named Entity 인식 (Rule-based Named Entity (NE) Recognition from Speech)

  • 김지환
    • 대한음성학회지:말소리, No. 58, pp. 45-66, 2006
  • In this paper, a rule-based (transformation-based) NE recognition system is proposed. The system uses Brill's rule inference approach. The performance of the rule-based system is compared with that of IdentiFinder, one of the most successful stochastic systems. In the baseline case (no punctuation and no capitalisation), both systems show almost equal performance. They also perform similarly when additional information such as punctuation, capitalisation and name lists is available. The performance of both systems degrades linearly with the number of speech recognition errors, and their rates of degradation are almost equal. These results show that automatic rule inference is a viable alternative to the HMM-based approach to NE recognition while retaining the advantages of a rule-based approach.
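
A minimal sketch of the transformation-based (Brill-style) rule inference the abstract refers to, on toy data. The rule template ("retag a token based on the previous word"), the lexicon, and the example sentence are illustrative assumptions, not the paper's implementation:

```python
# Brill-style transformation-based learning for NE tagging (illustrative sketch).
from collections import Counter

def baseline_tags(tokens, lexicon):
    """Assign each token its lexicon tag if known, otherwise 'O'."""
    return [lexicon.get(tok, "O") for tok in tokens]

def candidate_rules(tokens, tags, gold):
    """Propose rules (prev_word, from_tag, to_tag) at every current error."""
    for i, (t, g) in enumerate(zip(tags, gold)):
        if t != g and i > 0:
            yield (tokens[i - 1], t, g)

def apply_rule(tokens, tags, rule):
    prev_word, from_tag, to_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tokens[i - 1] == prev_word:
            out[i] = to_tag
    return out

def learn(tokens, gold, lexicon, max_rules=10):
    """Greedily pick the rule that most reduces errors, until no gain remains."""
    tags, rules = baseline_tags(tokens, lexicon), []
    for _ in range(max_rules):
        scores = Counter()
        for rule in candidate_rules(tokens, tags, gold):
            fixed = apply_rule(tokens, tags, rule)
            scores[rule] = sum(t != g for t, g in zip(tags, gold)) - \
                           sum(t != g for t, g in zip(fixed, gold))
        if not scores or scores.most_common(1)[0][1] <= 0:
            break
        best, _ = scores.most_common(1)[0]
        rules.append(best)
        tags = apply_rule(tokens, tags, best)
    return rules, tags

# Toy example: "mr smith visited seoul" with a lexicon that misses "smith".
tokens = ["mr", "smith", "visited", "seoul"]
gold = ["O", "PERSON", "O", "LOCATION"]
lexicon = {"seoul": "LOCATION"}          # "smith" unknown -> baseline 'O'
rules, tags = learn(tokens, gold, lexicon)
print(rules)  # [('mr', 'O', 'PERSON')]
print(tags)   # ['O', 'PERSON', 'O', 'LOCATION']
```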


딥러닝을 통한 문서 내 표 항목 분류 및 인식 방법 (Methods of Classification and Character Recognition for Table Items through Deep Learning)

  • 이동석;권순각
    • 한국멀티미디어학회논문지, Vol. 24, No. 5, pp. 651-658, 2021
  • In this paper, we propose methods for character recognition and classification of table items through deep learning. First, table areas are detected in a document image with a CNN. The table areas are then split by separators such as vertical lines. The text in the document is recognized with a neural network that combines a CNN and an RNN. To correct character recognition errors, multiple candidates for the recognized result are provided for sentences with low recognition accuracy.
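
A small sketch of the kind of candidate-based correction step the abstract describes, assuming each recognized sentence comes with a confidence score. The threshold, vocabulary, and scoring rule are illustrative assumptions, not the paper's model:

```python
# Pick among multiple OCR candidates for a low-confidence sentence (sketch).
VOCAB = {"total", "amount", "due", "date", "invoice", "number"}

def score(candidate: str) -> float:
    """Fraction of a candidate's words that appear in the known vocabulary."""
    words = candidate.lower().split()
    return sum(w in VOCAB for w in words) / max(len(words), 1)

def correct(recognized: str, confidence: float, candidates: list[str],
            threshold: float = 0.8) -> str:
    """Keep the original if confident; otherwise pick the best-scoring candidate."""
    if confidence >= threshold:
        return recognized
    return max([recognized, *candidates], key=score)

# Example: a low-confidence cell where "amouni" is a likely misread of "amount".
print(correct("total amouni due", 0.55,
              ["total amount due", "total amount doe"]))
# -> "total amount due"
```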

사용자의 성향 기반의 얼굴 표정을 통한 감정 인식률 향상을 위한 연구 (A study on the enhancement of emotion recognition through facial expression detection in user's tendency)

  • 이종식;신동희
    • 감성과학, Vol. 17, No. 1, pp. 53-62, 2014
  • Although technology for recognizing human emotions has many applications, it remains a difficult, unsolved problem because emotion recognition itself is hard. Human emotions can largely be recognized from video and from speech, and much research is under way on image-based methods, speech-based methods, and methods that combine the two. In particular, research on emotion recognition from facial images, the most universal way emotions are expressed, is being actively pursued. However, such systems still show large variations and errors depending on the user's environment and the user's adaptation. This paper proposes a mechanism that improves the emotion recognition rate by understanding and analyzing the user's inner tendency and using this analysis to support the accuracy of emotion recognition; by applying the analyzed tendency to the emotion recognition system, errors in facial-expression-based emotion recognition can be reduced and performance improved. In particular, the proposed method can provide a higher emotion recognition rate for users with weak facial expressions or who rarely express their emotions.

대학생들이 또렷한 음성과 대화체로 발화한 영어문단의 구글음성인식 (Google speech recognition of an English paragraph produced by college students in clear or casual speech styles)

  • 양병곤
    • 말소리와 음성과학, Vol. 9, No. 4, pp. 43-50, 2017
  • These days, voice models of speech recognition software are sophisticated enough to process the natural speech of people without any previous training. However, not much research has been reported on the use of speech recognition tools in the field of pronunciation education. This paper examined Google speech recognition of a short English paragraph produced by Korean college students in clear and casual speech styles in order to diagnose and resolve students' pronunciation problems. Thirty-three Korean college students participated in the recording of the English paragraph. The Google soundwriter was employed to collect data on the word recognition rates of the paragraph. Results showed that the total word recognition rate was 73% with a standard deviation of 11.5%. The word recognition rate of clear speech was around 77.3%, while that of casual speech amounted to 68.7%. The low recognition rate of casual speech was attributed both to individual pronunciation errors and to the software itself, as shown in its fricative recognition. Various distributions of unrecognized words were observed depending on each participant and proficiency group. From the results, the author concludes that speech recognition software is useful for diagnosing each individual's or group's pronunciation problems. Further studies on progressive improvement of learners' erroneous pronunciations would be desirable.
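
A rough sketch of how a per-speaker word recognition rate like the 73% reported above could be computed: count how many words of the reference paragraph appear, in order, in the recognizer's transcript. The greedy in-order matching and the example sentences are illustrative assumptions, not the author's actual scoring method:

```python
# Word recognition rate: percentage of reference words found in the transcript.
def word_recognition_rate(reference: str, transcript: str) -> float:
    ref = reference.lower().split()
    hyp = transcript.lower().split()
    matched, j = 0, 0
    for word in ref:                      # greedy in-order matching
        while j < len(hyp) and hyp[j] != word:
            j += 1
        if j < len(hyp):
            matched += 1
            j += 1
    return 100.0 * matched / len(ref)

reference = "please call stella ask her to bring these things"
transcript = "please call stella ask her to bring this thing"
print(f"{word_recognition_rate(reference, transcript):.1f}%")  # 77.8%
```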

A Study on DNN-based STT Error Correction

  • Jong-Eon Lee
    • International journal of advanced smart convergence, Vol. 12, No. 4, pp. 171-176, 2023
  • This study describes a speech recognition error correction system that detects and corrects speech recognition errors before natural language processing, in order to increase the success rate of intent analysis with optimal efficiency across various service domains. An encoder is constructed to embed each correct utterance token and the one or more error utterance tokens corresponding to it so that they all lie close together, with similar vector values, in a dense vector space. An error detector then finds utterance tokens that lie within a preset Manhattan distance of a correct utterance token in this space, and each detected error token is corrected by replacing it with the nearest correct utterance token by Manhattan distance.
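
A minimal sketch of the described correction step, assuming token embeddings are already available as vectors: tokens within a preset Manhattan (L1) distance of a correct token are treated as errors and replaced by the nearest correct token. The vectors, vocabulary, and radius below are illustrative assumptions, not the paper's trained encoder:

```python
# Nearest-correct-token correction by Manhattan distance (illustrative sketch).
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

# Embeddings of known-correct tokens (e.g. domain vocabulary).
correct_tokens = {
    "결제": [0.9, 0.1, 0.0],
    "취소": [0.1, 0.8, 0.2],
}

def correct(token, vector, radius=0.5):
    """Return the nearest correct token if within the radius, else the input."""
    nearest, dist = min(
        ((t, manhattan(vector, v)) for t, v in correct_tokens.items()),
        key=lambda pair: pair[1],
    )
    return nearest if dist <= radius else token

# An STT output token whose embedding falls close to "결제".
print(correct("겹제", [0.8, 0.2, 0.1]))  # -> "결제"
```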

인두피판성형술 전후의 언어 평가 (SPEECH-LANGUAGE EVALUATION BEFORE AND AFTER PHARYNGOPLASTY)

  • 유양근;한진순;김정록;황순정
    • 대한구순구개열학회지, Vol. 3, No. 2, pp. 61-66, 2000
  • General characteristics of speech in cleft palate patients are hypernasality and articulation disorders, which are affected by velopharyngeal inadequacy (VPI). Seventeen subjects with a chief complaint of 'nasal sounds and inaccurate pronunciation' underwent a speech-language evaluation before and after pharyngoplasty. Hypernasality and obligatory articulation errors improved, but compensatory articulation errors remained after pharyngoplasty. These results indicate that resonance may be normal or improved following successful surgical management of VPI, but compensatory articulation errors will still persist. The separate recognition of hypernasality, compensatory and obligatory articulation errors in cleft palate patients is important in determining the timing of therapy and the selection of appropriate therapy targets.


Korean Broadcast News Transcription Using Morpheme-based Recognition Units

  • Kwon, Oh-Wook;Alex Waibel
    • The Journal of the Acoustical Society of Korea, Vol. 21, No. 1E, pp. 3-11, 2002
  • Broadcast news transcription is one of the hardest tasks in speech recognition because broadcast speech signals have much variability in speech quality, channel and background conditions. We developed a Korean broadcast news speech recognizer. We used a morpheme-based dictionary and language model to reduce the out-of-vocabulary (OOV) rate. We concatenated the original morpheme pairs of short length or high frequency in order to reduce insertion and deletion errors due to short morphemes. We used a lexicon with multiple pronunciations to reflect inter-morpheme pronunciation variations without severe modification of the search tree. By using merged morphemes as recognition units, we achieved an OOV rate of 1.7% with a 64k vocabulary, comparable to European languages. We implemented a hidden Markov model-based recognizer with vocal tract length normalization and online speaker adaptation by maximum likelihood linear regression. Experimental results showed that the recognizer yielded a 21.8% morpheme error rate for anchor speech and 31.6% for mostly noisy reporter speech.
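
A rough sketch of the morpheme-pair merging idea: adjacent morphemes are concatenated into a single recognition unit when the pair is frequent or when one of the morphemes is very short. The thresholds and the toy corpus are illustrative assumptions, not the recognizer's actual settings:

```python
# Merge short or frequent adjacent morpheme pairs into single units (sketch).
from collections import Counter

def merge_pairs(sentences, min_count=2, max_len=1):
    """Concatenate adjacent morphemes that are frequent as a pair or very short."""
    pair_counts = Counter(
        (a, b) for sent in sentences for a, b in zip(sent, sent[1:])
    )
    merged_sents = []
    for sent in sentences:
        out, i = [], 0
        while i < len(sent):
            if i + 1 < len(sent) and (
                pair_counts[(sent[i], sent[i + 1])] >= min_count
                or len(sent[i]) <= max_len
                or len(sent[i + 1]) <= max_len
            ):
                out.append(sent[i] + sent[i + 1])   # merged recognition unit
                i += 2
            else:
                out.append(sent[i])
                i += 1
        merged_sents.append(out)
    return merged_sents

corpus = [["뉴스", "를", "전하", "ㅂ니다"], ["뉴스", "를", "듣", "는다"]]
print(merge_pairs(corpus))
# [['뉴스를', '전하', 'ㅂ니다'], ['뉴스를', '듣는다']]
```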

Invariant Range Image Multi-Pose Face Recognition Using Fuzzy c-Means

  • Phokharatkul, Pisit;Pansang, Seri
    • 제어로봇시스템학회 학술대회논문집, ICCAS 2005, pp. 1244-1248, 2005
  • In this paper, we propose fuzzy c-means (FCM) to solve recognition errors in invariant range-image, multi-pose face recognition. Scale, center and pose error problems were solved using geometric transformations. Face data was digitized into range images using a laser range finder, which does not depend on the ambient light source. The digitized range-image face data is then used as a model to generate multi-pose data. Each pose's data was reduced in size by linear reduction and stored in the database. The reduced range-image face data was transformed into a gradient face model for facial feature extraction and for matching using fuzzy memberships adjusted by fuzzy c-means. The proposed method was tested using facial range images from 40 people with normal facial expressions. The detection and recognition accuracy of the system was about 93 percent, and the system was robust enough to overcome typical image-acquisition problems such as noise, vertically rotated faces and limited range resolution.
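
A compact sketch of the fuzzy c-means clustering used for the fuzzy memberships, run on toy 2-D data rather than gradient-face features; the data, c=2 clusters, and fuzzifier m=2 are illustrative assumptions:

```python
# Standard fuzzy c-means: alternate center and membership updates (sketch).
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)           # memberships sum to 1 per sample
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]          # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)) * (1.0 / d ** (2 / (m - 1))).sum(
            axis=1, keepdims=True))                            # membership update
    return centers, U

X = np.array([[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]])
centers, U = fuzzy_c_means(X)
print(np.round(centers, 2))   # two centers, near (0.05, 0.05) and (0.95, 0.95)
print(np.round(U, 2))         # each sample's membership in the two clusters
```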


형상 역공학을 통한 공정중 금형 가공물의 자동인식 (Automatic Recognition of In-Process mold Dies Based on Reverse Engineering Technology)

  • 김정권;윤길상;최진화;김동우;조명우;박균명
    • 한국공작기계학회 학술대회논문집, 2003 Fall Conference, pp. 420-425, 2003
  • Generally, reverse engineering means obtaining CAD data from an unidentified shape using a vision or 3D laser scanner system. In this paper, we study an unidentified model with a machine-vision-based reverse engineering system to obtain information about the in-process model. Vision technology is now widely used in factories because it can inspect in-process objects easily, quickly and accurately. The following tasks were mainly investigated and implemented. We obtained more precise data by correcting the camera's distortion, compensating for slit-beam error and revising the acquired images. Furthermore, we approximated curves and surfaces with B-splines for precision. There have been many case studies of shape recognition, but they have been unsuitable for field application because they required too much processing time and suffered frequent recognition failures. This paper proposes a recognition algorithm that prevents such errors and can be applied in the field.
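
A small sketch of the B-spline approximation step: noisy scanned profile points are smoothed into a parametric B-spline curve. The synthetic quarter-circle points and the smoothing factor are illustrative assumptions, not the paper's scan data:

```python
# Fit a smoothing parametric B-spline to noisy profile points (sketch).
import numpy as np
from scipy.interpolate import splprep, splev

# Noisy samples along a quarter-circle profile, as a slit-beam scan might give.
t = np.linspace(0, np.pi / 2, 20)
x = np.cos(t) + np.random.normal(0, 0.01, t.size)
y = np.sin(t) + np.random.normal(0, 0.01, t.size)

# s controls the approximation tolerance (larger s -> smoother curve).
tck, u = splprep([x, y], s=0.01)

# Evaluate the fitted curve densely for downstream CAD/feature extraction.
u_fine = np.linspace(0, 1, 200)
x_fit, y_fit = splev(u_fine, tck)
print(x_fit[:3], y_fit[:3])
```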


Logical Activity Recognition Model for Smart Home Environment

  • Choi, Jung-In;Lim, Sung-Ju;Yong, Hwan-Seung
    • 한국컴퓨터정보학회논문지, Vol. 20, No. 9, pp. 67-72, 2015
  • Recently, studies on interaction between humans and things through motion recognition have been increasing due to the expansion of the IoT (Internet of Things). This paper proposes a system that recognizes a user's logical activity in a home environment by attaching sensors to various objects. We employ Arduino sensors and infer the logical activity using the physical activity model developed in our previous research. The system can recognize activities such as watching TV, listening to music, talking, eating, cooking, sleeping and using a computer. After generating experimental data from virtual scenarios, the average recognition rate was 95%, although the result can vary with the sensor configuration and physical activity recognition errors. To present the recognized results to the user, we visualized them in various graphs.
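
A minimal sketch of mapping object-attached sensor events to a logical activity such as "watching TV". The sensor names and rules below are illustrative assumptions, not the paper's actual sensor set or model:

```python
# Rule-based mapping from active object sensors to a logical activity (sketch).
RULES = {
    frozenset({"tv_power", "sofa_pressure"}): "watching TV",
    frozenset({"speaker_power"}): "listening to music",
    frozenset({"stove_heat", "kitchen_motion"}): "cooking",
    frozenset({"keyboard_touch", "monitor_power"}): "using computer",
}

def recognize(active_sensors: set[str]) -> str:
    """Return the activity whose required sensors are all currently active."""
    for required, activity in RULES.items():
        if required <= active_sensors:
            return activity
    return "unknown"

print(recognize({"tv_power", "sofa_pressure", "light_on"}))  # watching TV
print(recognize({"stove_heat"}))                              # unknown
```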