• Title/Abstract/Keywords: Recognition of Korea


음소 인식을 위한 특징 추출의 위치와 지속 시간 길이에 관한 연구 (A Study on Duration Length and Place of Feature Extraction for Phoneme Recognition)

  • 김범국;정현열
    • 한국음향학회지 / Vol. 13, No. 4 / pp. 32-39 / 1994
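  • As a basic study toward implementing a Korean speech recognition system, phoneme recognition was performed on all Korean phonemes to find 1) the optimal position that best represents the characteristics of each phoneme and 2) the appropriate duration length for obtaining the highest recognition rate. For the recognition experiments, 21-dimensional cepstral coefficients were used as feature parameters, and speaker-dependent recognition experiments were conducted for three speakers using the Bayes decision rule. The results show that the optimal feature extraction position giving the highest recognition rate is 10-50 ms for vowels, 40-100 ms for fricatives and affricates, 10-50 ms for nasals and liquids, and 10-50 ms for plosives. In addition, for recognition over all 35 phonemes, duration information of about 60-70 ms is sufficient to obtain the highest recognition rate.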
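
The abstract above names the Bayes decision rule but not the class-conditional model, so the following is only a minimal sketch under stated assumptions: diagonal-covariance Gaussian densities, toy 2-dimensional features instead of the 21-dimensional cepstral vectors, and hypothetical names (`log_gaussian`, `bayes_decide`, `phoneme_models`). It shows the rule itself: choose the class with the largest prior times likelihood.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def bayes_decide(x, classes):
    """Return the class name maximizing log prior + log likelihood."""
    return max(
        classes,
        key=lambda name: math.log(classes[name]["prior"])
        + log_gaussian(x, classes[name]["mean"], classes[name]["var"]),
    )

# Made-up toy models for two phoneme classes (assumption, not data from the paper).
phoneme_models = {
    "a": {"prior": 0.5, "mean": [1.0, 0.0], "var": [0.5, 0.5]},
    "s": {"prior": 0.5, "mean": [-1.0, 2.0], "var": [0.5, 0.5]},
}

decision = bayes_decide([0.8, 0.3], phoneme_models)
```

In a setting like the one described, the feature vector would be extracted from a fixed window (for example 60-70 ms) at the chosen position within each phoneme; the window handling is left out of this sketch.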


Embedded System 기반 Vision Box 설계와 적용 (Design and Application of Vision Box Based on Embedded System)

  • 이종혁
    • 한국정보통신학회논문지 / Vol. 13, No. 8 / pp. 1601-1607 / 2009
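  • A vision system captures image information acquired through a camera and analyzes it to recognize objects, and is used in a variety of industrial settings, including vehicle model classification. Because of this need, much research has been done on vehicle model classification, but the complex computations involved make processing time-consuming. This paper designs a Vision Box based on an embedded system and proposes a vehicle model recognition system that uses it. In preliminary tests of the proposed Vision Box on vehicle model classification, it achieved a 100% recognition rate per vehicle model under optimized environmental conditions; under small changes in illumination and rotation, vehicle models could still be recognized, but the pattern score decreased. Applying the proposed Vision Box system in an industrial setting confirmed that it can meet industry requirements for processing time, recognition rate, and other criteria.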

3D Holographic Image Recognition by Using Graphic Processing Unit

  • Lee, Jeong-A;Moon, In-Kyu;Liu, Hailing;Yi, Faliu
    • Journal of the Optical Society of Korea / Vol. 15, No. 3 / pp. 264-271 / 2011
  • In this paper we examine and compare the computational speeds of three-dimensional (3D) object recognition by digital holography using central processing unit (CPU) and graphics processing unit (GPU) computing. The holographic fringe pattern of a 3D object is obtained using an in-line interferometry setup. Fourier matched filters are applied to the complex image reconstructed from the holographic fringe pattern using a GPU chip for real-time 3D object recognition. It is shown that 3D object recognition with GPU computing is significantly faster than with CPU computing. To the best of our knowledge, this is the first report comparing the calculation time of 3D object recognition based on digital holography with CPU vs. GPU computing.
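
The Fourier matched filtering step mentioned above can be sketched as a frequency-domain cross-correlation. This is an illustrative CPU-only example using NumPy, not the authors' implementation: the random complex array stands in for a reconstructed holographic image, and the function name `matched_filter_peak` and the array sizes are assumptions.

```python
import numpy as np

def matched_filter_peak(scene, reference):
    """Locate the reference in the scene with a Fourier matched filter."""
    # Correlation theorem: circular cross-correlation equals
    # IFFT(FFT(scene) * conj(FFT(reference))).
    spectrum = np.fft.fft2(scene) * np.conj(np.fft.fft2(reference))
    correlation = np.abs(np.fft.ifft2(spectrum))
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    return tuple(int(i) for i in peak), correlation

# Synthetic stand-in for a complex reconstructed image (assumption).
rng = np.random.default_rng(0)
reference = np.zeros((32, 32), dtype=complex)
reference[:8, :8] = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))

# The "scene" is the reference circularly shifted by 5 rows and 7 columns.
scene = np.roll(reference, shift=(5, 7), axis=(0, 1))

peak, _ = matched_filter_peak(scene, reference)
```

The correlation peak lands at the shift that aligns the reference with the scene. The cost is dominated by the three FFTs, which is the part a GPU implementation would be expected to accelerate.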

Smart Phone Road Signs Recognition Model Using Image Segmentation Algorithm

  • Huang, Ying;Song, Jeong-Young
    • 한국정보통신학회: Conference Proceedings / 한국정보통신학회 2012 Fall Conference / pp. 887-890 / 2012
  • Image recognition is one of the most important research directions in pattern recognition. Image-based automatic road identification technology is widely used today, and intelligent systems have become the trend of the times. This paper studies image segmentation algorithm theory and its application in a road sign recognition system. Using image processing techniques, it makes a systematic study of the three main parts of the automatic road sign recognition algorithm: image segmentation, character segmentation, and image and character recognition. The experimental results show that a road sign recognition model built on the image segmentation algorithm can be used effectively in smartphone systems and applications.


반음절 문맥종속 모델을 이용한 한국어 4 연숫자음 인식에 관한 연구 (A Study on Korean 4-connected Digit Recognition Using Demi-syllable Context-dependent Models)

  • 이기영;최성호;이호영;배명진
    • 한국음향학회지 / Vol. 22, No. 3 / pp. 175-181 / 2003
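  • Korean digits are monosyllabic, and because of liaison effects between connected digits, demi-syllable-based models have been proposed for recognizing Korean connected digits. Methods using the previously proposed demi-syllable or demi-syllable + demi-syllable recognition models have not yet shown good recognition performance. This paper proposes a Korean 4-connected digit recognition method using extended context-dependent demi-syllable models. In the experiments, the SiTEC 4-connected digit database was used for the connected digits, and the C-HMM of HTK 3.0 was used for training and recognition. Compared with existing methods, the proposed method showed a relatively good recognition rate of 92%.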

Human Gait Recognition Based on Spatio-Temporal Deep Convolutional Neural Network for Identification

  • Zhang, Ning;Park, Jin-ho;Lee, Eung-Joo
    • 한국멀티미디어학회논문지 / Vol. 23, No. 8 / pp. 927-939 / 2020
  • Gait recognition can identify people from a long distance, which is very important for improving the intelligence of monitoring systems. Among many human features, gait features have the advantages of being remotely available, robust, and secure. Traditional gait feature extraction, affected by the development of behavior recognition, can only rely on manual feature extraction, which cannot meet the needs of fine-grained gait recognition. The emergence of deep convolutional neural networks has freed researchers from complex feature design engineering: features can be learned automatically from data, and this approach has been widely used. In this paper, we conduct feature metric learning in three-dimensional space by combining the three-dimensional convolution features of the gait sequence with a Siamese structure. This method can capture spatial and temporal information from the continuous periodic gait sequence, and further improve the accuracy and practicality of gait recognition.

얼굴영상과 음성을 이용한 멀티모달 감정인식 (Multimodal Emotion Recognition using Face Image and Speech)

  • 이현구;김동주
    • 디지털산업정보학회논문지 / Vol. 8, No. 1 / pp. 29-40 / 2012
  • A challenging research issue of growing importance to those working in human-computer interaction is to endow a machine with emotional intelligence. Emotion recognition technology therefore plays an important role in human-computer interaction research, and it allows more natural and more human-like communication between human and computer. In this paper, we propose a multimodal emotion recognition system using face and speech to improve recognition performance. In face-based emotion recognition, the distance measurement is calculated by 2D-PCA of the MCS-LBP image and a nearest neighbor classifier; in speech-based emotion recognition, the likelihood measurement is obtained by a Gaussian mixture model algorithm based on pitch and mel-frequency cepstral coefficient features. The individual matching scores obtained from face and speech are combined using a weighted-summation operation, and the fused score is used to classify the human emotion. In the experiments, the proposed method improves recognition accuracy by about 11.25% to 19.75% compared with the uni-modal approaches. These results confirm that the proposed approach achieves a significant performance improvement.
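
The weighted-summation fusion described above can be illustrated with a short sketch. The abstract does not give the normalization or the weights, so min-max normalization, the weight value, the emotion labels, and all scores below are assumptions chosen only to make the example concrete.

```python
def min_max(scores):
    """Scale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse_scores(face_distances, speech_log_likelihoods, face_weight=0.5):
    """Weighted-sum fusion of per-class scores from two modalities."""
    # A smaller distance is better, so flip it into a similarity.
    face_sim = [1.0 - d for d in min_max(face_distances)]
    # A larger likelihood is better, so only rescale it.
    speech_sim = min_max(speech_log_likelihoods)
    return [
        face_weight * f + (1.0 - face_weight) * s
        for f, s in zip(face_sim, speech_sim)
    ]

emotions = ["anger", "happiness", "sadness", "neutral"]  # hypothetical labels
face_d = [0.9, 0.2, 0.7, 0.4]             # made-up nearest-neighbor distances
speech_ll = [-41.0, -35.0, -30.0, -38.0]  # made-up GMM log-likelihoods

fused = fuse_scores(face_d, speech_ll, face_weight=0.6)
predicted = emotions[fused.index(max(fused))]
```

Here the speech scores alone would favor "sadness", while the fused score selects "happiness" because the face modality carries the larger weight. In practice the weight would be tuned on held-out data.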

2D - PCA와 영상분할을 이용한 얼굴인식 (Face Recognition using 2D-PCA and Image Partition)

  • 이현구;김동주
    • 디지털산업정보학회논문지 / Vol. 8, No. 2 / pp. 31-40 / 2012
  • Face recognition refers to the process of identifying individuals based on their facial features. It has recently become one of the most popular research areas in computer vision, machine learning, and pattern recognition because it spans numerous consumer applications, such as access control, surveillance, security, credit-card verification, and criminal identification. However, illumination variation on the face generally degrades the performance of face recognition systems in practical environments. This paper therefore proposes a novel face recognition system using a fusion approach based on local binary pattern and two-dimensional principal component analysis. To minimize illumination effects, the face image undergoes the local binary pattern operation, and the resultant image is divided into two sub-images. The two-dimensional principal component analysis algorithm is then applied separately to each sub-image. The individual scores obtained from the two sub-images are integrated using a weighted-summation rule, and the fused score is used to classify the unknown user. The proposed system was evaluated on the Yale B database and the CMU-PIE database, and it shows better recognition results than existing face recognition techniques.

한국어 음성 인식을 위한 mono-phone 구성의 기초 연구 (The Basic Study on making mono-phone for Korean Speech Recognition)

  • 황영수;송민석
    • 한국음향학회: Conference Proceedings / Proceedings of the 한국음향학회 2000 Conference, Vol. 19, No. 2 / pp. 45-48 / 2000
  • When building a large-vocabulary speech recognition system, it is better to use the segment than the syllable or the word as the recognition unit. In this paper, we present a basic study on constructing mono-phones for Korean speech recognition. For the experiments, we use the OGI speech toolkit (U.S.A.). The results show that the recognition rate when the diphthong is treated as a single unit is superior to that when the diphthong is treated as two units, i.e. a glide plus a vowel. The recognition rate also differs slightly with the number of consonants.


동적 도시 환경에서 의미론적 시각적 장소 인식 (Semantic Visual Place Recognition in Dynamic Urban Environment)

  • 사바 아르샤드;김곤우
    • 로봇학회논문지 / Vol. 17, No. 3 / pp. 334-338 / 2022
  • In visual simultaneous localization and mapping (vSLAM), correct recognition of a place benefits relocalization and improves map accuracy. However, its performance is significantly affected by environmental conditions such as variation in light, viewpoint, and season, and by the presence of dynamic objects. This research addresses the problem of feature occlusion caused by interfering dynamic objects, which leads to poor performance of visual place recognition algorithms. To overcome this problem, this research analyzes the role of scene semantics in correctly detecting a place in challenging environments and presents a semantics-aided visual place recognition method. Because semantics are invariant to viewpoint changes and dynamic environments, they can improve the overall performance of the place matching method. The proposed method is evaluated on two benchmark datasets with dynamic environments and seasonal changes. Experimental results show improved performance of the visual place recognition method for vSLAM.