• Title/Summary/Keyword: Size recognition

Analysis of 3D Motion Recognition using Meta-analysis for Interaction (기존 3차원 인터랙션 동작인식 기술 현황 파악을 위한 메타분석)

  • Kim, Yong-Woo;Whang, Min-Cheol;Kim, Jong-Hwa;Woo, Jin-Cheol;Kim, Chi-Jung;Kim, Ji-Hye
    • Journal of the Ergonomics Society of Korea / v.29 no.6 / pp.925-932 / 2010
  • Studies of three-dimensional interaction report different accuracies depending on sensing, mode, and method, and interaction has been implemented inconsistently across application fields. This study therefore uses meta-analysis to identify research trends in three-dimensional interaction. A keyword search of databases returned 153 domestic and 188 international papers on three-dimensional interaction, of which analytical coding tables selected 18 domestic and 28 international papers for analysis. Frequency analysis was carried out on the method of action, element, number, and accuracy, and accuracy was then verified using the effect size from the meta-analysis. The effect size of sensor-based methods was higher than that of vision-based methods, although it was small, at 0.02. With hand motion, the effect size of vision-based methods was higher than that of sensor-based methods. Implementing three-dimensional interaction with sensor-based methods, or with vision-based methods using hand motion, is therefore more efficient. This study is significant as a comprehensive analysis of three-dimensional motion recognition for interaction, and it suggests application directions for three-dimensional interaction.

Two-dimensional Automatic Transformation Template Matching for Image Recognition (영상 인식을 위한 2차원 자동 변형 템플릿 매칭)

  • Han, Young-Mo
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.9 / pp.1-6 / 2019
  • One method for image recognition is template matching. In conventional template matching, the block matching algorithm (BMA) is performed while changing the two-dimensional translational displacement of the template within a given matching image. The template size and shape do not change during the BMA. Since only two-dimensional translational displacement is considered, the success rate decreases if the size and direction of the object do not match in the template and the matching image. In this paper, a variable is added to adjust the two-dimensional direction and size of the template, and the optimal value of the variable is automatically calculated in the block corresponding to each two-dimensional translational displacement. Using the calculated optimal value, the template is automatically transformed into an optimal template for each block. The matching error value of each block is then calculated based on the automatically deformed template. Therefore, a more stable result can be obtained for the difference in direction and size. For ease of use, this study focuses on designing the algorithm in a closed form that does not require additional information beyond the template image, such as distance information.
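
The conventional baseline described in this abstract, block matching over two-dimensional translations only, can be sketched as below. This is an illustration under stated assumptions, not the paper's closed-form method: the function name `block_match` and the choice of the sum of absolute differences as the matching error are assumptions, and the template's size and direction stay fixed, which is exactly the limitation the paper addresses.

```python
import numpy as np

def block_match(image, template):
    """Exhaustive translational block matching (hypothetical helper).

    Slides the fixed-size template over every valid position of a
    grayscale image and returns the (row, col) offset with the smallest
    sum of absolute differences, together with that error value.
    """
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    ih, iw = image.shape
    th, tw = template.shape
    best_pos, best_err = None, np.inf
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            block = image[y:y + th, x:x + tw]
            err = np.abs(block - template).sum()  # assumed matching error
            if err < best_err:
                best_pos, best_err = (y, x), err
    return best_pos, best_err
```

Because the template is never rescaled or rotated here, an object that appears at a different size or direction in the image yields a large error at every offset.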

Effective Recognition of Velopharyngeal Insufficiency (VPI) Patient's Speech Using Simulated Speech Model (모의 음성 모델을 이용한 효과적인 구개인두부전증 환자 음성 인식)

  • Sung, Mee Young;Kwon, Tack-Kyun;Sung, Myung-Whun;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.5 / pp.1243-1250 / 2015
  • This paper presents an effective method for recognizing the speech of VPI patients, intended for a VPI speech reconstruction system. A speaker adaptation technique is employed to improve VPI speech recognition. The paper proposes using simulated speech to generate an initial model for speaker adaptation, so that the small amount of available VPI speech is used effectively for model adaptation. Applying MLLR for speaker adaptation yields an average word accuracy of 83.60%. The proposed speaker adaptation method using the simulated speech model brings a 6.38% improvement in average accuracy. The experimental results demonstrate that the proposed method is highly effective for developing a recognition system for VPI speech, for which constructing a large speech database is impractical.

A Study on the Implement of Image Recognition the Road Traffic Safety Information Board using Nearest Neighborhood Decision Making Algorithm (최근접 이웃 결정방법 알고리즘을 이용한 도로교통안전표지판 영상인식의 구현)

  • Jung Jin-Yong;Kim Dong-Hyun;Lee So-Haeng
    • Management & Information Systems Review / v.4 / pp.257-284 / 2000
  • As the number of drivers who own cars increases, comprehensive studies of automobiles for traffic safety have become an important issue. A visual recognition system for radio-controlled driving is part of the sensor processor of an unmanned autonomous vehicle system. When a driver drives on an unknown highway or general road, the system produces a model from the successively input road traffic information. The proposed recognition system for road traffic safety information boards automatically recognizes and distinguishes such a board as one kind of road traffic information. The overall process is as follows. We photographed road traffic safety information boards with a digital camera and normalized each image to a bitmap file of $200{\times}200$ bytes with Photo Shop 5.0. True Color represents about sixteen million colors; because it requires large capacity and much computation time, we converted the images to 256 colors. We applied the erosion and dilation algorithms 30 times to remove unnecessary image content. We extracted the original image with the region splitting technique, a kind of segmentation. We formed three groups (Attention Information Board, Prohibit Information Board, and Introduction Information Board) by RYB (Red, Yellow, Blue) color segmentation. We minimized the influence of the board's image size, direction, and rounding, and we also minimized the influence of position and of the brightness of light and darkness using eigenvectors and eigenvalues. The data sampled from these feature values were used to build the learning Code Book database. The proposed system first distinguishes the three groups in the learning Code Book database, and then recognizes a board by comparing it with the boards in the same group using Nearest Neighborhood Decision Making.
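
The final two-stage decision described above (pick the color group first, then choose the nearest stored board within that group) might look like the following minimal sketch. The function name `classify_sign`, the codebook layout, the example labels, and the use of Euclidean distance are all assumptions for illustration; the abstract does not state the distance measure or the feature dimensionality.

```python
import numpy as np

def classify_sign(feature, group, codebook):
    """Nearest-neighbor decision inside one color group (hypothetical helper).

    codebook maps a group name to a list of (label, feature_vector) pairs.
    Only entries in the given group are compared, mirroring the
    group-then-match order described in the abstract.
    """
    entries = codebook[group]
    labels = [label for label, _ in entries]
    vectors = np.array([vec for _, vec in entries], dtype=float)
    query = np.asarray(feature, dtype=float)
    # Euclidean distance is an assumed choice of metric.
    distances = np.linalg.norm(vectors - query, axis=1)
    return labels[int(np.argmin(distances))]

# Hypothetical codebook with made-up two-dimensional features.
codebook = {
    "prohibit": [("no_entry", [1.0, 0.0]), ("no_parking", [0.0, 1.0])],
    "attention": [("curve", [1.0, 1.0])],
}
result = classify_sign([0.9, 0.1], "prohibit", codebook)
```

Restricting the search to one group keeps the number of comparisons small, which is the apparent reason for grouping by color before matching.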

Design of Linguistic Contents of Speech Copora for Speech Recognition and Synthesis for Common Use (공동 이용을 위한 음성 인식 및 합성용 음성코퍼스의 발성 목록 설계)

  • Kim Yoen-Whoa;Kim Hyoung-Ju;Kim Bong-Wan;Lee Yong-Ju
    • MALSORI / no.43 / pp.89-99 / 2002
  • Research on improving large-vocabulary continuous speech recognition and speech synthesis is being carried out intensively as speech information technology progresses rapidly. In speech recognition, stochastic methods such as HMMs require a large amount of speech data for training; in speech synthesis, recent practice shows that better quality can be achieved by selecting and concatenating variable-size units of speech from a large amount of speech data. In this paper we design and discuss the linguistic contents of speech corpora for speech recognition and synthesis that are intended for common use.

Group Model Clustering Method for Model Downsizing (모델 축소를 위한 그룹 모델 클러스터링 방법에 대한 연구)

  • Park, Mi-Na;Ha, Jin-Young
    • Journal of Industrial Technology / v.28 no.A / pp.185-189 / 2008
  • Practical pattern recognition systems must cope with the very-large-class problem. Sometimes it is almost impossible to build a model for every class because of memory and time constraints. In such cases, grouping similar models is helpful. In this paper, we propose GMC (Group Model Clustering) to build a large-class Chinese character recognition system. We built hidden Markov models for 10% of the total classes, then classified the remaining classes into the already trained group classes. Finally, group models are trained using the data clustered by group model. Recognition is performed using only the group models, in order to reduce model size and improve recognition speed.

Language Model Adaptation for Broadcast News Recognition (방송 뉴스 인식을 위한 언어 모델 적응)

  • Kim Hyun Suk;Jeon Hyung Bae;Kim Sanghun;Choi Joon Ki;Yun Seung
    • MALSORI / no.51 / pp.99-115 / 2004
  • In this paper, we propose LM adaptation for broadcast news recognition. We collect information on recent articles from the internet in real time, build a small recent LM, and then interpolate the recent LM with an existing LM built from a large existing broadcast news corpus. Because the collected recent corpus contains both articles related to the test set and unrelated ones, we performed interpolation experiments to find the best type of articles from the recent corpus. When we built an adapted LM from the existing LM and a recent LM trained on articles selected by the Tf-Idf method as similar to the test set, we obtained the best result, in which the ERR of pseudo-morpheme-based recognition performance showed a 17.2% improvement and the number of OOVs fell from 70 to 27.
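
The interpolation step can be illustrated with a minimal sketch. It assumes linear interpolation of unigram probabilities with a single weight; the abstract does not state the interpolation form, the n-gram order, or the weight, and the name `interpolate_lm` and the example probabilities are hypothetical.

```python
def interpolate_lm(recent_lm, base_lm, weight):
    """Linearly interpolate two unigram LMs (assumed form, hypothetical helper).

    recent_lm and base_lm map a word to its probability; weight is the
    share given to the recent LM. Words missing from one model contribute
    probability 0.0 from that model.
    """
    vocab = set(recent_lm) | set(base_lm)
    return {
        word: weight * recent_lm.get(word, 0.0)
        + (1.0 - weight) * base_lm.get(word, 0.0)
        for word in vocab
    }

# Made-up probabilities for illustration only.
recent = {"election": 0.5, "news": 0.5}
base = {"news": 0.75, "weather": 0.25}
adapted = interpolate_lm(recent, base, weight=0.5)
```

In this toy case a word that appears only in the recent articles ("election") receives nonzero probability in the adapted model, which is how such adaptation can reduce the OOV count.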

Generating Pronunciation Lexicon for Continuous Speech Recognition Based on Observation Frequencies of Phonetic Rules (음소변동규칙의 발견빈도에 기반한 음성인식 발음사전 구성)

  • Na, Min-Soo;Chung, Min-Hwa
    • MALSORI / no.64 / pp.137-153 / 2007
  • The pronunciation lexicon of a continuous speech recognition system should contain enough pronunciation variations to be used for building a search space large enough to contain a correct path, whereas the size of the pronunciation lexicon needs to be constrained for effective decoding and lower perplexities. This paper describes a procedure for selecting pronunciation variations to be included in the lexicon based on the frequencies of the corresponding phonetic rules observed in the training corpus. Likelihood of a phonetic rule's application is estimated using the observation frequency of the rule and is used to control the construction of a pronunciation lexicon. Experiments with various pronunciation lexica show that the proposed method is helpful to improve the speech recognition performance.
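
The selection procedure might be sketched as follows. The sketch assumes that a rule's likelihood is estimated as the number of times it was observed to apply divided by the number of contexts where it could apply, and that a variant is kept when its rule's likelihood reaches a threshold; the abstract confirms only that observation frequencies drive the estimate. All names, counts, and pronunciations are hypothetical.

```python
def rule_likelihoods(applied_counts, applicable_counts):
    """Estimate each rule's application likelihood as a relative frequency."""
    return {
        rule: applied_counts.get(rule, 0) / total
        for rule, total in applicable_counts.items()
        if total > 0
    }

def build_lexicon(variants, likelihoods, threshold):
    """Keep canonical pronunciations plus variants from likely rules.

    variants maps a word to a list of (pronunciation, rule) pairs, where
    rule is None for the canonical pronunciation.
    """
    lexicon = {}
    for word, prons in variants.items():
        lexicon[word] = [
            pron
            for pron, rule in prons
            if rule is None or likelihoods.get(rule, 0.0) >= threshold
        ]
    return lexicon

# Made-up counts and pronunciations for illustration only.
likelihoods = rule_likelihoods(
    {"nasalization": 80, "tensification": 5},
    {"nasalization": 100, "tensification": 100},
)
variants = {
    "gukmul": [
        ("k u k m u l", None),
        ("k u ng m u l", "nasalization"),
        ("k u k m m u l", "tensification"),
    ]
}
lexicon = build_lexicon(variants, likelihoods, threshold=0.5)
```

Raising the threshold shrinks the lexicon and the search space, while lowering it admits rarer variants; this is the trade-off the abstract describes.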

A Novel Algorithm for Face Recognition From Very Low Resolution Images

  • Senthilsingh, C.;Manikandan, M.
    • Journal of Electrical Engineering and Technology / v.10 no.2 / pp.659-669 / 2015
  • Face recognition is highly significant in security-based applications. High-resolution images offer more detail, and recognizing a face from a reasonably high-resolution image is easier than recognizing one from a very low resolution image. This paper addresses the problem of recognizing faces from very low resolution images whose size is as low as $8{\times}8$. With the use of CCTV (Closed Circuit Television) and other surveillance-camera-based applications for security purposes, the need to overcome the shortcomings of very low resolution images has been on the rise. Present-day face recognition algorithms do not provide adequate performance on very low resolution (VLR) images. Existing approaches use super-resolution (SR) methods and Relation Based Super Resolution methods to reconstruct images from very low resolution images. This paper uses a learning-based super-resolution method to extract and construct images from very low resolution images. Experimental results show that the proposed SR algorithm based on relationship learning outperforms existing algorithms on public face databases.

Object Recognition using Smart Tag and Stereo Vision System on Pan-Tilt Mechanism

  • Kim, Jin-Young;Im, Chang-Jun;Lee, Sang-Won;Lee, Ho-Gil
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2005.06a / pp.2379-2384 / 2005
  • We propose a novel method for object recognition using a smart tag system with stereo vision on a pan-tilt mechanism. We developed a smart tag that includes an IRED device. The smart tag is attached to the object. We also developed a stereo vision system that pans and tilts so that the object image is centered in each whole image view. The stereo vision system on the pan-tilt mechanism can map the position of the IRED to the robot coordinate system using the pan-tilt angles. Then, to map the size and pose of the object to the robot coordinate system, we used a simple model-based vision algorithm. To increase the feasibility of tag-based object recognition, we implemented our approach using techniques that are as simple as possible.
