• Title/Summary/Keyword: Recognition Improvement


An Improved Face Recognition Method Using SIFT-Grid (SIFT-Grid를 사용한 향상된 얼굴 인식 방법)

  • Kim, Sung Hoon;Kim, Hyung Ho;Lee, Hyon Soo
    • Journal of Digital Convergence / v.11 no.2 / pp.299-307 / 2013
  • The aim of this paper is to improve identification performance and reduce computational cost in a face recognition system based on SIFT-Grid. First, we propose a method for composing an integrated template by removing similar SIFT keypoints and blending distinct keypoints across the various training images of one face class. The integrated template is built by computing a similarity matrix and a threshold-based histogram from the keypoints within the same sub-region, where the sub-regions are obtained by applying SIFT-Grid to the training images. Second, we propose a similarity computation method for efficiently identifying a test image from the composed integrated templates. The test image is compared one-to-one with the integrated template of each face class, and a similarity score and a threshold-voting score are then calculated for each sub-region. In face recognition experiments, the proposed method was found to be more accurate than two other SIFT-Grid-based methods, and its computational cost was also lower.

Improvement of Speech Recognition System Using the Trained Model of Speech Feature (음성특성 학습 모델을 이용한 음성인식 시스템의 성능 향상)

  • 송점동
    • The Journal of Information Technology / v.3 no.4 / pp.1-12 / 2000
  • Speech can be divided into high-frequency and low-frequency speech according to its features. However, recognizers have so far been constructed without regard to this feature, which results in relatively low recognition rates and a need for large amounts of data in speech recognition research. In this paper, we propose a method that classifies a speaker's speech by this feature using the formant frequency, and a method that recognizes speech after constructing a recognizer model reflecting the high- and low-frequency features of the speaker's speech. For the experiment, we constructed the recognizer model using 47 Korean monophones and trained it using the speech of 20 women and men, respectively. We classified the speech by feature using a formant frequency table, which consisted of formant frequencies and pitch values, and then performed recognition using the trained model that matched the feature of the speech. As a result, the proposed system outperformed the existing method in recognition rate.


Enhanced Vein Detection Method by Using Image Scaler Based on Poly Phase Filter (Poly Phase Filter 기반의 영상 스케일러를 이용한 개선 된 정맥 영역 추출 방법)

  • Kim, HeeKyung;Lee, Seungmin;Kang, Bongsoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.5 / pp.734-739 / 2018
  • Fingerprint recognition and iris recognition, which are biometric methods, are easily influenced by external factors such as sunlight. Recently, finger vein recognition has come into use as a method that relies on internal features. For accurate finger vein recognition, it is important to clearly separate the vein and background regions. This separation is difficult when the illumination is not normalized, so a method that separates the vein region from the background region after normalizing the illumination of the input image has been proposed. In this paper, we propose a method that improves image quality and processing time compared with the binarization and labeling method of the existing finger vein recognition system, including the image stretching process based on the existing illumination normalization method.

Fast Hand-Gesture Recognition Algorithm For Embedded System (임베디드 시스템을 위한 고속의 손동작 인식 알고리즘)

  • Hwang, Dong-Hyun;Jang, Kyung-Sik
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.7 / pp.1349-1354 / 2017
  • In this paper, we propose a fast hand-gesture recognition algorithm for embedded systems. Existing hand-gesture recognition algorithms are difficult to use on low-performance systems such as embedded systems and mobile devices because of the high computational complexity of the contour tracing method, which extracts all points of the hand contour. Instead of using algorithms based on contour tracing, the proposed algorithm uses a concentric-circle tracing method to estimate the abstracted contour of the fingers, then classifies hand gestures by extracting features. The proposed algorithm has an average recognition rate of 95% and an average execution time of 1.29 ms, a maximum performance improvement of 44% over the algorithm using the existing contour tracing method. This confirms that the algorithm can be used on low-performance systems such as embedded systems and mobile devices.
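The abstract above does not spell out how concentric-circle tracing works, so the following is only a minimal sketch of the general idea under stated assumptions: the hand is given as a binary mask, a center point is already known, and the number of background-to-hand transitions along one circle is used as a rough feature (each transition may correspond to a finger or the wrist). The function name `count_crossings` and its parameters are hypothetical and not taken from the paper.

```python
import math

def count_crossings(mask, cy, cx, radius, samples=360):
    """Count background->hand transitions along a circle.

    mask   : list of rows, each a list of 0/1 (1 = hand pixel); assumed input
    cy, cx : assumed circle center (for example, a palm center)
    radius : circle radius in pixels
    Only `samples` points are visited, instead of every contour pixel.
    """
    height, width = len(mask), len(mask[0])
    values = []
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        y = int(round(cy + radius * math.sin(angle)))
        x = int(round(cx + radius * math.cos(angle)))
        inside = 0 <= y < height and 0 <= x < width
        values.append(mask[y][x] if inside else 0)
    # values[k - 1] wraps around at k == 0, closing the circle.
    return sum(
        1 for k in range(samples)
        if values[k - 1] == 0 and values[k] == 1
    )
```

Repeating the count for several radii would give one small feature vector per hand image; how the paper turns such measurements into gesture classes is not described in the abstract.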

Palm Area Detection by Maximum Hand Width (손 최장너비 기반 손바닥 영역 검출)

  • Choi, Eun Chang;Kim, Jun Yeon;Lee, Jae Won;Lim, Jong Gwan
    • The Journal of the Korea Contents Association / v.18 no.4 / pp.398-405 / 2018
  • In HCI, hand gesture recognition is attracting attention as a method for interaction and information exchange between users and devices, along with the development of IT devices. In hand gesture recognition through image processing, palm region detection is a key process that contributes to improved processing speed and recognition rate. In this paper, we propose a new method for segmenting the image between the hand and wrist for palm area detection. The anatomical characteristics of the hand are used to calculate the distance between the iliac bones of the thumb and little finger, where the hand is widest, from the horizontal projection histogram of the hand image, and the palm area is then detected by drawing a circle with that width as its diameter. To verify the superiority of this method, multiple-stage template matching is used to compare recognition performance against four conventional methods for 10 hand gestures. Note that few studies report performance evaluations of palm area detection, although there are many studies on hand gesture recognition.
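The width-based step in the abstract above can be illustrated roughly as follows. This is a sketch under assumptions that the abstract does not confirm: the hand is a binary mask with the wrist toward the bottom, the horizontal projection is the count of hand pixels per row, and the circle is centered on the midpoint of the widest row. The function name `palm_circle` is hypothetical.

```python
def palm_circle(mask):
    """Find the widest row of a binary hand mask.

    mask: list of rows, each a list of 0/1 (1 = hand pixel); assumed input.
    Returns (row, center_col, diameter) for a circle whose diameter equals
    the maximum hand width, or None if the mask contains no hand pixels.
    """
    best = None
    for r, row in enumerate(mask):
        width = sum(row)  # horizontal projection: hand pixels in this row
        if width and (best is None or width > best[2]):
            cols = [c for c, v in enumerate(row) if v]
            center = (cols[0] + cols[-1]) / 2  # midpoint of the hand span
            best = (r, center, width)
    return best


example = [
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
]
print(palm_circle(example))  # (2, 2.0, 5)
```

Pixels inside the returned circle would be kept as the palm region; the paper's exact centering and wrist-cutting rules may differ from this simplification.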

Effective Recognition of Velopharyngeal Insufficiency (VPI) Patient's Speech Using Simulated Speech Model (모의 음성 모델을 이용한 효과적인 구개인두부전증 환자 음성 인식)

  • Sung, Mee Young;Kwon, Tack-Kyun;Sung, Myung-Whun;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.5 / pp.1243-1250 / 2015
  • This paper presents an effective method for recognizing VPI patients' speech for a VPI speech reconstruction system. A speaker adaptation technique is employed to improve VPI speech recognition. We propose using simulated speech to generate an initial model for speaker adaptation, so that the small amount of available VPI speech can be used effectively for model adaptation. We obtain 83.60% average word accuracy by applying MLLR for speaker adaptation. The proposed speaker adaptation method using the simulated speech model brings a 6.38% improvement in average accuracy. The experimental results demonstrate that the proposed speaker adaptation method is highly effective for developing a recognition system for VPI speech, for which a large speech database is difficult to construct.

Signal Processing for Speech Recognition in Noisy Environment (잡음 환경에서 음성 인식을 위한 신호처리)

  • Kim, Weon-Goo;Lim, Yong-Hoon;Cha, Il-Whan;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea / v.11 no.2 / pp.73-84 / 1992
  • This paper studies noise subtraction methods and distance measures for speech recognition in a noisy environment, and investigates the noise robustness of the distance measures applied to isolated word recognition in white Gaussian and colored noise (vehicle noise) environments. Noise subtraction methods that can be used as a pre-processor for the speech recognition system, such as the spectral subtraction method, the autocorrelation subtraction method, adaptive noise cancellation and acoustic beamforming, are studied. Distance measures such as the Log Likelihood Ratio ($d_{LLR}$), the cepstral distance measure ($d_{CEP}$), the weighted cepstral distance measure ($d_{WCEP}$), the spectral slope distance measure ($d_{RPS}$) and the cepstral projection distance measures ($d_{CP}$, $d_{BCP}$, $d_{WCP}$, $d_{BWCP}$) are also investigated. Testing of the distance measures for speaker-dependent isolated word recognition in a noisy environment indicates that $d_{RPS}$ and $d_{WCEP}$, which weight higher-order cepstral coefficients more heavily, give a considerable performance improvement over $d_{CEP}$ and $d_{LLR}$. In addition, when no pre-emphasis is performed, the recognizer can maintain higher performance under high noise conditions.
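To make the contrast between an unweighted and a weighted cepstral distance concrete, here is a small sketch. The abstract does not give the formulas, so two assumptions are made: the plain distance is a squared Euclidean distance over cepstral coefficients, and the weighted variant multiplies the k-th coefficient difference by its index k (one common way to emphasize higher-order coefficients). The function names are hypothetical.

```python
def cepstral_distance(c_ref, c_test):
    """Squared Euclidean distance between two cepstral vectors (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(c_ref, c_test))


def weighted_cepstral_distance(c_ref, c_test):
    """Index-weighted variant: the k-th difference is scaled by k (k = 1, 2, ...).

    The index weighting is an assumption made for illustration; the paper's
    exact weights are not stated in the abstract.
    """
    return sum(
        (k * (a - b)) ** 2
        for k, (a, b) in enumerate(zip(c_ref, c_test), start=1)
    )


ref = [1.0, 2.0, 3.0]
test = [1.0, 1.0, 1.0]
print(cepstral_distance(ref, test))           # 5.0
print(weighted_cepstral_distance(ref, test))  # 40.0
```

With index weighting, a mismatch in the third coefficient contributes nine times more than the same mismatch in the first, which is the behavior the abstract attributes to the better-performing measures.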


The FE-SM/SONN for Recognition of the Car Skid Mark (자동차 스키드마크 인식을 위한 FE-SM/SONN)

  • Koo, Gun-Seo
    • Journal of the Korea Society of Computer and Information / v.17 no.1 / pp.125-132 / 2012
  • In this paper, we propose FE-SM/SONN for recognizing blurred and smeared skid mark images caused by the sudden braking of a vehicle. In blurred and smeared skid marks, the tread pattern image is ambiguous. To improve recognition of such images, FE-SM/SONN reads skid marks using fuzzy logic and distinguishes tread patterns with a SONN (Self Organization Neural Networks) recognizer. To substantiate this, 48 tire models and 144 skid marks were compared, and the overall recognition ratio was 89%. This is a 13.51% improvement in recognition over the existing back propagation recognizer and an 8.78% improvement over FE-MCBP. The expected effect of this research is the recognition of ambiguous images by extracting distinguishing features, and the findings indicate that even when the tread pattern image is in grey scale, fuzzy logic makes the tread pattern recognizable.

A Study on Error Correction Using Phoneme Similarity in Post-Processing of Speech Recognition (음성인식 후처리에서 음소 유사율을 이용한 오류보정에 관한 연구)

  • Han, Dong-Jo;Choi, Ki-Ho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.6 no.3 / pp.77-86 / 2007
  • Recently, systems based on speech recognition interfaces, such as telematics terminals, have been under development. However, speech recognition still produces many errors, and studies on error correction are being actively conducted. This paper proposes an error correction method for the post-processing of speech recognition based on the features of Korean phonemes. To support this algorithm, we use a phoneme similarity that takes the features of Korean phonemes into account. The phoneme similarity used in this paper trains on data by mono-phoneme and uses MFCC and LPC to extract features for each Korean phoneme. In addition, it uses the Bhattacharyya distance measure to obtain the similarity between one phoneme and another. Using the phoneme similarity, errors in eo-jeol that cannot be morphologically analyzed can be corrected, after which syllable recovery and morphological analysis are performed again. The experimental results show improvements of 7.5% and 5.3% for MFCC and LPC, respectively.
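As an illustration of the distance measure named in the abstract above, the sketch below computes the Bhattacharyya distance between two Gaussian distributions with diagonal covariance. Modeling each phoneme's feature vectors (MFCC or LPC) as a diagonal Gaussian is an assumption made here; the abstract does not say how the phoneme models are parameterized. The function name is hypothetical.

```python
import math

def bhattacharyya_diag(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    mean1, mean2 : per-dimension means
    var1, var2   : per-dimension variances (must be positive)
    """
    distance = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v = (v1 + v2) / 2  # averaged variance for this dimension
        distance += (m1 - m2) ** 2 / (8 * v)              # mean separation term
        distance += 0.5 * math.log(v / math.sqrt(v1 * v2))  # variance mismatch term
    return distance


print(bhattacharyya_diag([0.0], [1.0], [2.0], [1.0]))  # 0.5
```

A distance of zero means the two phoneme models are identical; a similarity score could then be derived from the distance (for example with `math.exp(-d)`), though the paper's own mapping from distance to similarity is not given in the abstract.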


LSTM RNN-based Korean Speech Recognition System Using CTC (CTC를 이용한 LSTM RNN 기반 한국어 음성인식 시스템)

  • Lee, Donghyun;Lim, Minkyu;Park, Hosung;Kim, Ji-Hwan
    • Journal of Digital Contents Society / v.18 no.1 / pp.93-99 / 2017
  • A hybrid approach using a Long Short Term Memory (LSTM) Recurrent Neural Network (RNN) has shown great improvement in speech recognition accuracy. Training an acoustic model based on the hybrid approach requires forced alignment of the HMM state sequence from a Gaussian Mixture Model (GMM)-Hidden Markov Model (HMM), and training the GMM-HMM takes a long computation time. This paper proposes an end-to-end approach for LSTM RNN-based Korean speech recognition to improve learning speed. A Connectionist Temporal Classification (CTC) algorithm is proposed to implement this approach. The proposed method showed almost equal performance in recognition rate, while learning was 1.27 times faster.
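CTC removes the need for a frame-level forced alignment because it maps any frame-level label sequence to an output sequence by merging repeated labels and dropping a special blank symbol. The sketch below shows only that collapsing rule, applied to the best label per frame (greedy decoding); it is not the training loss and not the paper's system. The blank index of 0 and the function name are assumptions.

```python
def ctc_collapse(frame_labels, blank=0):
    """Apply the CTC collapsing rule to a sequence of per-frame label ids.

    Consecutive repeats are merged first, then blanks are removed, so a
    blank between two identical labels keeps them as separate outputs.
    """
    output = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            output.append(label)
        previous = label
    return output


# Frames: blank, 1, 1, blank, 1, 2, 2, blank
print(ctc_collapse([0, 1, 1, 0, 1, 2, 2, 0]))  # [1, 1, 2]
```

Many different frame-level sequences collapse to the same output, and CTC training sums over all of them, which is why no pre-computed alignment from a GMM-HMM is needed.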