• Title/Abstract/Keywords: Size recognition

기존 3차원 인터랙션 동작인식 기술 현황 파악을 위한 메타분석 (Analysis of 3D Motion Recognition using Meta-analysis for Interaction)

  • 김용우;황민철;김종화;우진철;김치중;김지혜
    • 대한인간공학회지 / Vol. 29, No. 6 / pp. 925-932 / 2010
  • Most research in the field of three-dimensional interaction has reported different accuracy depending on the sensing technique, mode, and method. Furthermore, implementations of interaction have lacked consistency across application fields. This study therefore identifies research trends in three-dimensional interaction using meta-analysis. A keyword search of databases returned 153 domestic papers and 188 international papers covering three-dimensional interaction. Analytical coding tables selected 18 domestic papers and 28 international papers for analysis. Frequency analysis was carried out on the method of action, element, number, and accuracy, and accuracy was then verified through the effect size of the meta-analysis. As a result, the effect size of sensor-based methods was higher than that of vision-based methods, but the effect size was small, at 0.02. The effect size of vision-based methods using hand motion was higher than that of sensor-based methods using hand motion. Therefore, implementing three-dimensional interaction with sensor-based methods, and with vision-based methods when hand motions are used, is more efficient. This study is significant as a comprehensive analysis of three-dimensional motion recognition for interaction and suggests application directions for three-dimensional interaction.

영상 인식을 위한 2차원 자동 변형 템플릿 매칭 (Two-dimensional Automatic Transformation Template Matching for Image Recognition)

  • 한영모
    • 한국산학기술학회논문지 / Vol. 20, No. 9 / pp. 1-6 / 2019
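  • Template matching is one method for image recognition. Conventional template matching runs a block matching algorithm (BMA) while varying the two-dimensional translational displacement of the template within the given matching image. The size and shape of the template do not change while the block matching algorithm runs. The position of the target object is then determined by comparing the matching error values, computed with a similarity measure, of the blocks at each two-dimensional displacement. Because only the two-dimensional displacement is considered, the success rate drops when the size and orientation of the target object differ between the template and the matching image. In contrast, this paper adds new variables that adjust the two-dimensional orientation and size of the template, and the optimal values of these variables are computed automatically for the block at each two-dimensional displacement. Using these optimal values, the template is automatically transformed into the one best suited to each block. The matching error of each block is then computed against the automatically transformed template. The position of the target object is determined by comparing these matching errors, in which the differences in orientation and size have been compensated. More stable results can therefore be obtained when orientation and size differ. For ease of use, the work focuses on designing the algorithm in a closed form that requires no information beyond the template image, such as distance information.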
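The conventional baseline described in this abstract (a fixed template slid over every two-dimensional displacement, with the lowest matching error winning) can be sketched as follows. This is only an illustration of plain block matching, not the paper's automatically transforming template. The sum of absolute differences (SAD) as the similarity measure, the function names, and the toy data are all assumptions made for this example.

```python
def sad(image, template, top, left):
    """Sum of absolute differences between the template and one image block."""
    return sum(
        abs(image[top + r][left + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def block_match(image, template):
    """Return (top, left, error) of the block with the smallest SAD.

    Only translation is searched; the template keeps its size and shape,
    which is the limitation the abstract attributes to the conventional method.
    """
    rows = len(image) - len(template) + 1
    cols = len(image[0]) - len(template[0]) + 1
    best = None
    for top in range(rows):
        for left in range(cols):
            err = sad(image, template, top, left)
            if best is None or err < best[2]:
                best = (top, left, err)
    return best


# Hypothetical 4x4 grey-level image containing the 2x2 template at row 1, column 1.
image = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
    [0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
match = block_match(image, template)
```

Because the search covers translation only, a target that is rotated or scaled relative to the template would raise the error at the true position, which is the failure case the paper sets out to address.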

모의 음성 모델을 이용한 효과적인 구개인두부전증 환자 음성 인식 (Effective Recognition of Velopharyngeal Insufficiency (VPI) Patient's Speech Using Simulated Speech Model)

  • 성미영;권택균;성명훈;김우일
    • 한국정보통신학회논문지 / Vol. 19, No. 5 / pp. 1243-1250 / 2015
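  • This paper introduces an effective VPI speech recognition technique as a step toward technology for restoring the speech of VPI patients to normal speech. To use a small amount of VPI patient speech effectively for model adaptation, we propose using simulated speech from normal speakers as the prior model for speaker adaptation. Speaker adaptation with the MLLR technique yields an average recognition rate of 83.60%, and using the simulated speech model as the prior model for speaker adaptation brings an average improvement of 6.38% in recognition rate. The phoneme recognition evaluation demonstrates that the proposed speaker adaptation method substantially improves speech recognition performance. These results show that the proposed speaker adaptation technique using a simulated speech model is effective for building a better-performing VPI patient speech recognizer when a large amount of VPI patient speech is difficult to obtain.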

최근접 이웃 결정방법 알고리즘을 이용한 도로교통안전표지판 영상인식의 구현 (A Study on the Implement of Image Recognition the Road Traffic Safety Information Board using Nearest Neighborhood Decision Making Algorithm)

  • 정진용;김동현;이소행
    • 경영과정보연구 / Vol. 4 / pp. 257-284 / 2000
  • As the number of drivers who own cars increases, comprehensive studies of automobiles for traffic safety have become an important issue. A visual recognition system for radio-controlled driving is part of the sensor processor of an unmanned autonomous vehicle system. When a driver drives on an unknown highway or general road, the system builds a model from the successively input road traffic information. The proposed recognition system for Road Traffic Safety Information Boards automatically recognizes and distinguishes such a board as one kind of road traffic information. The overall process of the proposed system is as follows. We photographed Road Traffic Safety Information Boards with a digital camera and normalized each image to a bitmap image file of size $200{\times}200$ bytes with Photoshop 5.0. Existing True Color consists of about sixteen million colors. Because it requires large capacity and much computation time, we converted it to 256 colors. We applied the erosion and dilation algorithms 30 times to remove unnecessary image content. We extracted the original image with the Region Splitting Technique, a kind of segmentation. We formed three groups (Attention Information Board, Prohibit Information Board, and Introduction Information Board) by RYB (Red, Yellow, Blue) color segmentation. We minimized the influence of board image size, direction, and rounding. We also minimized the influence of position and of brightness (light and darkness) using eigenvectors and eigenvalues. The data sampled from these feature values were stored in the learning Code Book database. The proposed system first distinguishes the three groups in the learning Code Book database, and then recognizes a board by comparing it with entries in the same group using Nearest Neighborhood Decision Making.
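The final two-stage decision described in this abstract (first pick the colour group, then compare only within that group) might look like the sketch below. It assumes Euclidean distance between feature vectors and a flat list as the code book. The group names follow the abstract, while the board labels, feature values, and function name are hypothetical.

```python
import math


def nearest_in_group(codebook, group, feature):
    """Return the label of the closest code book entry inside one group.

    codebook: list of (group, label, feature_vector) tuples.
    Raises ValueError if the group has no entries.
    """
    candidates = [(label, vec) for g, label, vec in codebook if g == group]
    label, _ = min(candidates, key=lambda item: math.dist(item[1], feature))
    return label


# Hypothetical code book with two-dimensional feature vectors.
codebook = [
    ("prohibit", "no_entry", (1.0, 0.0)),
    ("prohibit", "no_parking", (0.0, 1.0)),
    ("attention", "slippery_road", (0.95, 0.1)),
]

# The colour segmentation step is assumed to have already chosen the group.
label = nearest_in_group(codebook, "prohibit", (0.9, 0.2))
```

Restricting the search to one group means an entry from another group is never returned, even when it is closer in feature space, as `slippery_road` is to the query here.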

공동 이용을 위한 음성 인식 및 합성용 음성코퍼스의 발성 목록 설계 (Design of Linguistic Contents of Speech Copora for Speech Recognition and Synthesis for Common Use)

  • 김연화;김형주;김봉완;이용주
    • 대한음성학회지:말소리 / No. 43 / pp. 89-99 / 2002
  • Recently, research on improving large-vocabulary continuous speech recognition and speech synthesis has been carried out intensively as the field of speech information technology progresses rapidly. In speech recognition, the development of stochastic methods such as HMM requires a large amount of speech data for training, and in speech synthesis, recent practice shows that better-quality synthesis can be produced by selecting and connecting variable-size units of speech data from a large amount of speech data. In this paper we design and discuss linguistic contents for speech corpora for speech recognition and synthesis to be shared in common.

모델 축소를 위한 그룹 모델 클러스터링 방법에 대한 연구 (Group Model Clustering Method for Model Downsizing)

  • 박미나;하진영
    • 산업기술연구 / Vol. 28, No. A / pp. 185-189 / 2008
  • Practical pattern recognition systems must overcome the very-large-class problem. Sometimes it is almost impossible to build a model for every class because of memory and time constraints. In such cases, grouping similar models is helpful. In this paper, we propose GMC (Group Model Clustering) to build a large-class Chinese character recognition system. We built hidden Markov models for 10% of the total classes, then classified the remaining classes into the already trained group classes. Finally, group models are trained using the group-model-clustered data. Recognition is performed using only the group models, in order to achieve reduced model size and improved recognition speed.

방송 뉴스 인식을 위한 언어 모델 적응 (Language Model Adaptation for Broadcast News Recognition)

  • 김현숙;전형배;김상훈;최준기;윤승
    • 대한음성학회지:말소리 / No. 51 / pp. 99-115 / 2004
  • In this paper, we propose LM adaptation for broadcast news recognition. We collect recent articles from the internet in real time, build a small recent LM, and then interpolate the recent LM with an existing LM built from a large existing broadcast news corpus. Because the collected recent corpus contains articles that are related to the test set as well as articles that are unrelated, we performed interpolation experiments to find the best type of articles from the recent corpus. When we built an adapted LM from the existing LM and a recent LM made of articles similar to the test set, selected with the Tf-Idf method, we obtained the best result: the ERR of pseudo-morpheme-based recognition performance showed a 17.2% improvement, and the number of OOV words fell from 70 to 27.
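The interpolation step in this abstract can be illustrated with a minimal sketch. It assumes simple linear interpolation of word probabilities, which the abstract does not specify, and uses toy unigram tables in place of real language models. The function name, the weight, and all probabilities are hypothetical.

```python
def interpolate(recent_lm, base_lm, weight):
    """Linearly mix two word-probability tables.

    weight is the share given to recent_lm; (1 - weight) goes to base_lm.
    Words missing from one table contribute probability 0 from that table.
    """
    vocab = set(recent_lm) | set(base_lm)
    return {
        word: weight * recent_lm.get(word, 0.0)
        + (1.0 - weight) * base_lm.get(word, 0.0)
        for word in vocab
    }


# Toy unigram tables: a small recent LM and a large existing LM.
recent = {"election": 0.5, "storm": 0.5}
base = {"election": 0.1, "market": 0.9}
adapted = interpolate(recent, base, weight=0.3)
```

A word such as `storm`, absent from the existing model, receives non-zero probability after mixing, which is one way an adapted LM can reduce the number of OOV words.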

음소변동규칙의 발견빈도에 기반한 음성인식 발음사전 구성 (Generating Pronunciation Lexicon for Continuous Speech Recognition Based on Observation Frequencies of Phonetic Rules)

  • 나민수;정민화
    • 대한음성학회지:말소리 / No. 64 / pp. 137-153 / 2007
  • The pronunciation lexicon of a continuous speech recognition system should contain enough pronunciation variations to be used for building a search space large enough to contain a correct path, whereas the size of the pronunciation lexicon needs to be constrained for effective decoding and lower perplexities. This paper describes a procedure for selecting pronunciation variations to be included in the lexicon based on the frequencies of the corresponding phonetic rules observed in the training corpus. Likelihood of a phonetic rule's application is estimated using the observation frequency of the rule and is used to control the construction of a pronunciation lexicon. Experiments with various pronunciation lexica show that the proposed method is helpful to improve the speech recognition performance.
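One way to picture the selection procedure in this abstract is the sketch below. It assumes that a rule's likelihood is its relative frequency (times applied divided by times applicable) and that a fixed threshold controls which variants enter the lexicon. Neither detail is stated in the abstract, and the rule names, counts, and romanized entries are hypothetical.

```python
def rule_likelihoods(applied, applicable):
    """Relative frequency with which each rule fired where it could have."""
    return {
        rule: applied.get(rule, 0) / count
        for rule, count in applicable.items()
        if count > 0
    }


def build_lexicon(variants, likelihoods, threshold):
    """Keep canonical forms plus variants whose rule meets the threshold.

    variants: list of (word, pronunciation, rule) tuples;
    rule is None for the canonical pronunciation.
    """
    lexicon = {}
    for word, pron, rule in variants:
        if rule is None or likelihoods.get(rule, 0.0) >= threshold:
            lexicon.setdefault(word, []).append(pron)
    return lexicon


# Hypothetical counts observed in a training corpus.
applied = {"nasalization": 80, "tensification": 5}
applicable = {"nasalization": 100, "tensification": 50}
likelihoods = rule_likelihoods(applied, applicable)

variants = [
    ("gukmul", "k u k m u l", None),
    ("gukmul", "k u ng m u l", "nasalization"),
    ("gukbap", "k u k p a p", None),
    ("gukbap", "k u k pp a p", "tensification"),
]
lexicon = build_lexicon(variants, likelihoods, threshold=0.5)
```

Raising the threshold shrinks the lexicon and lowering it admits rarer variants, which mirrors the trade-off between search-space coverage and lexicon size that the abstract describes.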

A Novel Algorithm for Face Recognition From Very Low Resolution Images

  • Senthilsingh, C.;Manikandan, M.
    • Journal of Electrical Engineering and Technology / Vol. 10, No. 2 / pp. 659-669 / 2015
  • Face recognition is highly significant in security-based applications. Normally, high-resolution images offer more detail, and recognizing a face from a reasonably high-resolution image is easier than recognizing one from a very low-resolution image. This paper addresses the problem of recognizing faces from very low-resolution images as small as $8{\times}8$. With the use of CCTV (Closed Circuit Television) and other surveillance camera-based applications for security purposes, the need to overcome the shortcomings of very low-resolution images has been rising. Present-day face recognition algorithms cannot provide adequate performance when applied to VLR images. Existing methods use super-resolution (SR) methods and relation-based super-resolution methods to reconstruct images from very low-resolution images. This paper uses a learning-based super-resolution method to extract and construct images from very low-resolution images. Experimental results show that the proposed SR algorithm based on relationship learning outperforms the existing algorithms on public face databases.

Object Recognition using Smart Tag and Stereo Vision System on Pan-Tilt Mechanism

  • Kim, Jin-Young;Im, Chang-Jun;Lee, Sang-Won;Lee, Ho-Gil
    • 제어로봇시스템학회:학술대회논문집 / 제어로봇시스템학회 2005 ICCAS / pp. 2379-2384 / 2005
  • We propose a novel method for object recognition using a smart tag system with stereo vision on a pan-tilt mechanism. We developed a smart tag that includes an IRED device. The smart tag is attached to the object. We also developed a stereo vision system that pans and tilts so that the object image is centered in each full image view. The stereo vision system on the pan-tilt mechanism can map the position of the IRED to the robot coordinate system by using the pan-tilt angles. Then, to map the size and pose of the object to the robot coordinate system, we used a simple model-based vision algorithm. To increase the feasibility of tag-based object recognition, we implemented our approach using techniques that are as easy and simple as possible.