• Title/Abstract/Keywords: Human recognition technology

Search results: 755 items

A Robust Fingertip Extraction and Extended CAMSHIFT based Hand Gesture Recognition for Natural Human-like Human-Robot Interaction

  • 이래경;안수용;오세영
    • Journal of Institute of Control, Robotics and Systems / Vol. 18, No. 4 / pp. 328-336 / 2012
  • In this paper, we propose robust fingertip extraction and extended Continuously Adaptive Mean Shift (CAMSHIFT) based hand gesture recognition for natural, human-like HRI (Human-Robot Interaction). First, for efficient and rapid hand detection, hand candidate regions are segmented by combining a robust $YC_bC_r$ skin color model with Haar-like feature-based AdaBoost detection. From the extracted hand candidate regions, we estimate the palm region and fingertip position using distance-transform-based voting and the geometric features of the hand. From the hand orientation and palm center position, we find the optimal fingertip position and its orientation. Then, using extended CAMSHIFT, we reliably track the 2D hand gesture trajectory with the extracted fingertip. Finally, we apply conditional density propagation (CONDENSATION) to recognize pre-defined temporal motion trajectories. Experimental results show that the proposed algorithm not only rapidly extracts the hand region with an accurately extracted fingertip and its angle, but also robustly tracks the hand under different illumination, size, and rotation conditions. Using these results, we successfully recognize multiple hand gestures.
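
As a rough Python/OpenCV sketch of two stages named in this abstract, skin-color segmentation in $YC_bC_r$ space with a distance-transform palm estimate, plus CAMSHIFT tracking, the fragment below may be useful; the color bounds, window handling, and use of the raw skin mask as the tracking probability image are illustrative assumptions, and the AdaBoost detector, fingertip voting, and CONDENSATION stages are omitted.

```python
import cv2
import numpy as np

# Assumed YCrCb skin bounds; the paper's exact skin model is not given.
LOWER = np.array([0, 133, 77], dtype=np.uint8)
UPPER = np.array([255, 173, 127], dtype=np.uint8)

def skin_mask(frame_bgr):
    """Binary mask of skin-colored pixels in YCrCb space."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    return cv2.inRange(ycrcb, LOWER, UPPER)

def palm_center(mask):
    """Palm center as the skin pixel farthest from the region boundary
    (a crude stand-in for the paper's distance-transform-based voting)."""
    dist = cv2.distanceTransform(mask, cv2.DIST_L2, 5)
    _, radius, _, center = cv2.minMaxLoc(dist)
    return center, radius  # (x, y) and an approximate palm radius

def track_hand(frames, init_window):
    """Track the hand window across frames with CAMSHIFT on the skin mask."""
    term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    window = init_window  # (x, y, w, h) from an up-front hand detector
    for frame in frames:
        box, window = cv2.CamShift(skin_mask(frame), window, term)
        yield box  # rotated rect: ((cx, cy), (w, h), angle)
```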

Human-Computer Interaction Survey for Intelligent Robot

  • 홍석주;이칠우
    • Proceedings of the Korea Contents Association 2006 Fall Comprehensive Conference / pp. 507-511 / 2006
  • An intelligent robot is an independent, autonomously driven system that, like a human, judges and acts on its own based on sensory organs such as sight and hearing. Humans communicate not only through language but also through nonverbal means such as gestures, and if a robot could understand these nonverbal means of communication, it could become a much more familiar companion to humans. Driven by this demand, HCI (Human-Computer Interaction) technologies such as face recognition and gesture recognition are being actively studied, but many problems remain to be solved. In this paper, focusing on recent research results, we introduce the key elements and application examples of gesture recognition technology, one of the most natural ways of communicating with humans among the foundational technologies for intelligent robots.


Classification of Three Different Emotion by Physiological Parameters

  • Jang, Eun-Hye;Park, Byoung-Jun;Kim, Sang-Hyeob;Sohn, Jin-Hun
    • Journal of the Ergonomics Society of Korea / Vol. 31, No. 2 / pp. 271-279 / 2012
  • Objective: This study classified three different emotional states (boredom, pain, and surprise) using physiological signals. Background: Emotion recognition studies have tried to recognize human emotions from physiological signals; such recognition is important for applying emotion detection to human-computer interaction systems. Method: 122 college students participated in this experiment. Three different emotional stimuli were presented to the participants, and physiological signals, i.e., EDA (Electrodermal Activity), SKT (Skin Temperature), PPG (Photoplethysmogram), and ECG (Electrocardiogram), were measured for 1 minute as a baseline and for 1 to 1.5 minutes during each emotional state. The signals were analyzed over 30-second windows from the baseline and from the emotional state, and 27 features were extracted from them. Statistical analysis for emotion classification was done with DFA (discriminant function analysis) in SPSS 15.0, using the difference values obtained by subtracting the baseline values from the emotional-state values. Results: Physiological responses during the emotional states differed significantly from those during the baseline, and the emotion classification accuracy was 84.7%. Conclusion: Our study has shown that emotions can be classified from various physiological signals. However, future work is needed to obtain additional signals from other modalities, such as facial expression, face temperature, or voice, to improve the classification rate and to examine the stability and reliability of this result against the accuracy of emotion classification with other algorithms. Application: This work offers a better chance of recognizing various human emotions from physiological signals and can be applied to human-computer interaction systems for emotion recognition. It can also be useful in developing emotion theory, profiling emotion-specific physiological responses, and establishing the basis for emotion recognition systems in human-computer interaction.
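
A minimal sketch of the classification step under stated assumptions: X_baseline and X_emotion are hypothetical (n_trials, 27) feature matrices extracted from the EDA/SKT/PPG/ECG windows, and scikit-learn's linear discriminant analysis stands in for the SPSS discriminant function analysis used in the paper.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def classify_emotions(X_baseline, X_emotion, y):
    """X_baseline, X_emotion: (n_trials, 27) feature matrices;
    y: labels (boredom, pain, surprise).
    The paper classifies the baseline-subtracted difference values."""
    X = X_emotion - X_baseline
    lda = LinearDiscriminantAnalysis()
    return cross_val_score(lda, X, y, cv=5).mean()  # mean accuracy
```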

Developing Interactive Game Contents using 3D Human Pose Recognition

  • 최윤지;박재완;송대현;이칠우
    • Journal of the Korea Contents Association / Vol. 11, No. 12 / pp. 619-628 / 2011
  • Vision-based 3D human pose recognition is generally used in HCI (Human-Computer Interaction) as a means of conveying human gestures. Compared with recognition based on a 2D pose model, which can only recognize simple 2D motion poses in special environments, a pose model that describes 3D joints can use joint-angle information and the shape of body parts as prior knowledge, and so has the advantage of recognizing complex 3D poses in more general environments. This paper describes the development of interactive game content that uses pose recognition based on 3D human joint information as its interface. In the proposed system, a pose is recognized by comparing the user's current pose with pose templates built from the 3D positions of 14 body joints. The resulting system lets users control the game content naturally with body movements alone, without any additional devices. We evaluate the performance of the proposed 3D recognition technique by applying it to the game content, and plan future work on pose recognition that is more robust in diverse environments.
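
The abstract does not spell out the matching rule, but nearest-template matching over the 14 joint positions is one plausible reading. A minimal sketch, with the centering step and the distance threshold as assumptions:

```python
import numpy as np

def pose_distance(pose, template):
    """Mean Euclidean distance between corresponding joints,
    after centering both poses to remove translation."""
    p = pose - pose.mean(axis=0)
    t = template - template.mean(axis=0)
    return np.linalg.norm(p - t, axis=1).mean()

def recognize(pose, templates, threshold=0.15):
    """pose: (14, 3) joint positions; templates: dict name -> (14, 3) array.
    Returns the best-matching template name, or None if nothing is close."""
    name, dist = min(((k, pose_distance(pose, v)) for k, v in templates.items()),
                     key=lambda kv: kv[1])
    return name if dist < threshold else None
```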

Development of a Hand-Posture Recognition System Using a 3D Hand Model

  • 장효영;변증남
    • Proceedings of the Korean Institute of Electrical Engineers 2007 Symposium, Information and Control Section / pp. 219-221 / 2007
  • Recent moves toward ubiquitous computing require more natural human-computer interaction (HCI) interfaces that provide high information accessibility. Hand gestures, i.e., gestures performed by one or two hands, are emerging as a viable technology to complement or replace conventional HCI technology. This paper deals with hand-posture recognition, for which database construction is important. The human hand is composed of 27 bones, and the movement of its joints is modeled with 23 degrees of freedom. Even for the same hand posture, captured images may differ depending on the user's characteristics and the relative position between the hand and the cameras. To ease the difficulty of defining hand postures and to construct a database of manageable size, we present a method using a 3D hand model. The database is built from the hand joint angles for each posture, together with the corresponding silhouette images obtained from many viewpoints by projecting the model onto image planes. The proposed method does not require additional equations to define the movement constraints of each joint. It also makes it easy to obtain images of one hand posture from many viewpoints and distances, so the database can be constructed more precisely and concretely. The validity of the method is evaluated by applying it to a hand-posture recognition system.
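
A toy sketch of the database-building idea: project 3D hand-model points into image planes from many viewpoints and store the result per posture. The pinhole camera model, the intrinsics, and the hand_model callable are all illustrative assumptions, not the authors' setup.

```python
import numpy as np

def project(points3d, R, t, f=500.0, cx=160.0, cy=120.0):
    """Pinhole projection of 3D hand-model points into an image plane."""
    cam = points3d @ R.T + t            # world -> camera coordinates
    uv = cam[:, :2] / cam[:, 2:3]       # perspective divide
    return uv * f + np.array([cx, cy])  # apply assumed intrinsics

def build_database(hand_model, postures, viewpoints):
    """For each (posture, viewpoint) pair, store the projected points.
    hand_model(angles) -> (N, 3) surface points; all names are illustrative."""
    db = {}
    for name, angles in postures.items():
        pts = hand_model(angles)
        for i, (R, t) in enumerate(viewpoints):
            db[(name, i)] = project(pts, R, t)
    return db
```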


Human Action Recognition Using Pyramid Histograms of Oriented Gradients and Collaborative Multi-task Learning

  • Gao, Zan;Zhang, Hua;Liu, An-An;Xue, Yan-Bing;Xu, Guang-Ping
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 8, No. 2 / pp. 483-503 / 2014
  • In this paper, human action recognition using pyramid histograms of oriented gradients and collaborative multi-task learning is proposed. First, we accumulate global activity and construct a motion history image (MHI) for the RGB and depth channels respectively to encode the dynamics of an action in the two modalities; different action descriptors are then extracted from the depth and RGB MHIs to represent the global textural and structural characteristics of the actions. Specifically, the average value in hierarchical blocks, GIST, and pyramid histogram of oriented gradients descriptors are employed to represent human motion. To demonstrate the superiority of the proposed method, we evaluate the descriptors with KNN, SVM with linear and RBF kernels, SRC, and CRC models on the DHA dataset, a well-known dataset for human action recognition. Large-scale experimental results show that our descriptors are robust, stable, and efficient, and outperform the state-of-the-art methods. In addition, we investigate the descriptors further by combining them on the DHA dataset, and observe that the combined descriptors perform much better than any single descriptor. With these multimodal features, we also propose a collaborative multi-task learning method for model learning and inference based on transfer learning theory. The main contributions lie in four aspects: 1) the proposed encoding scheme can filter out the stationary parts of the human body and reduce noise interference; 2) different kinds of features and models are assessed, and the neighboring-gradient information and pyramid layers are very helpful for representing the actions; 3) the proposed model can fuse features from different modalities regardless of the sensor types, value ranges, and feature dimensions; 4) the latent common knowledge among different modalities can be discovered by transfer learning to boost performance.
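
Two of the building blocks named here, the MHI update and a pyramid histogram of oriented gradients, can be sketched in plain NumPy; the bin count, pyramid depth, and normalization are assumptions rather than the paper's settings.

```python
import numpy as np

def update_mhi(mhi, moving, t, duration=1.0):
    """Classic motion history image update (Bobick & Davis): moving
    pixels get timestamp t, entries older than `duration` decay to zero."""
    out = mhi.copy()
    out[moving] = t
    out[out < t - duration] = 0
    return out

def phog(img, levels=2, bins=9):
    """Toy pyramid histogram of oriented gradients over an intensity image."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi  # unsigned orientation in [0, pi)
    feats = []
    h, w = img.shape
    for lv in range(levels + 1):
        n = 2 ** lv  # n x n grid at this pyramid level
        for i in range(n):
            for j in range(n):
                sl = (slice(i*h//n, (i+1)*h//n), slice(j*w//n, (j+1)*w//n))
                hist, _ = np.histogram(ang[sl], bins=bins, range=(0, np.pi),
                                       weights=mag[sl])
                feats.append(hist / (hist.sum() + 1e-9))
    return np.concatenate(feats)
```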

Multimodal audiovisual speech recognition architecture using a three-feature multi-fusion method for noise-robust systems

  • Sanghun Jeon;Jieun Lee;Dohyeon Yeo;Yong-Ju Lee;SeungJun Kim
    • ETRI Journal / Vol. 46, No. 1 / pp. 22-34 / 2024
  • Exposure to varied noisy environments impairs the recognition performance of artificial intelligence-based speech recognition technologies. Services with degraded performance can be deployed as limited systems that assure good performance only in certain environments, but this impairs the general quality of speech recognition services. This study introduces an audiovisual speech recognition (AVSR) model that is robust to various noise settings by mimicking the elements of human dialogue recognition. The model converts word embeddings and log-Mel spectrograms into feature vectors for audio recognition. A dense spatial-temporal convolutional neural network model extracts features from log-Mel spectrograms transformed for visual-based recognition. This approach exhibits improved aural and visual recognition capabilities. We assess the signal-to-noise ratio in nine synthesized noise environments, with the proposed model exhibiting lower average error rates. The error rate for the AVSR model using the three-feature multi-fusion method is 1.711%, compared with the general rate of 3.939%. Owing to its enhanced stability and recognition rate, this model is applicable in noise-affected environments.
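
A minimal sketch of the audio front end and a naive multi-feature fusion, assuming librosa for the log-Mel spectrogram; the paper's three-feature multi-fusion network is considerably more elaborate than this concatenation.

```python
import numpy as np
import librosa

def log_mel(wave, sr=16000, n_mels=80):
    """Log-Mel spectrogram, the audio front end named in the abstract.
    Sample rate and mel-band count are assumptions."""
    mel = librosa.feature.melspectrogram(y=wave, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel)

def fuse(feature_vectors):
    """Naive fusion of per-modality feature vectors by concatenation;
    a stand-in for the paper's three-feature multi-fusion method."""
    return np.concatenate(feature_vectors)
```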

A Study on the Facial Expression Recognition using Deep Learning Technique

  • Jeong, Bong Jae;Kang, Min Soo;Jung, Yong Gyu
    • International Journal of Advanced Culture Technology / Vol. 6, No. 1 / pp. 60-67 / 2018
  • In this paper, a method for identifying facial expressions on an Android intelligent device and extracting the matching expression pattern is proposed. Understanding and expressing emotion are very important to human-computer interaction, and technology for identifying human expressions has become very popular. Instead of searching for the symbols that users often use, facial expressions can be identified with a camera, a technique that is readily usable now. This paper uses third-party datasets available on the web to improve facial expression recognition accuracy and refines a convolutional neural network algorithm to build a facial expression recognition model that matches the user's facial expression to similar expressions, reaching 66% accuracy. With this approach there is no need to search for symbols: when the camera recognizes an expression, the corresponding symbol appears immediately. The service thus supplies the symbols people use when sending messages to others and offers considerable convenience, since users no longer need to hunt through countless symbols. This reflects a growing trend in deep learning; future work should adopt an algorithm better suited to expression recognition to further improve accuracy.
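
For illustration only, a small Keras CNN of the kind commonly used for expression classification; the layer sizes, 48x48 grayscale input, and seven-class output are assumptions, since the abstract does not specify the architecture.

```python
import tensorflow as tf

def build_model(num_classes=7, input_shape=(48, 48, 1)):
    """A small CNN expression classifier; all sizes are illustrative."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=input_shape),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

# model = build_model()
# model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
#               metrics=["accuracy"])
```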

Emotion Recognition Modeling Considering Physical and Cognitive Factors

  • 송성호;박희환;지용관;박지형;박장현
    • Proceedings of the Korean Society for Precision Engineering 2005 Spring Conference / pp. 1937-1943 / 2005
  • Emotion recognition technology is a crucial factor in the ubiquitous computing era, as it enables various intelligent services for humans. This paper builds a system that recognizes human emotions on a 2-dimensional model from two bio-signals, GSR and HRV. Since it is too difficult to model the human biological system analytically, a statistical method, the Hidden Markov Model (HMM), is used; it is defined by the transition probabilities among states and the distribution of measurable observations. In experiments on each emotion, we obtained average recognition rates of 64% for the first HMM results and 55% for the second HMM results.
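
A minimal sketch of per-emotion HMM classification on GSR/HRV observation sequences, using hmmlearn as a stand-in; the state count and Gaussian observation model are assumptions.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def train_models(sequences_by_emotion, n_states=3):
    """Fit one Gaussian HMM per emotion on (T, 2) GSR/HRV sequences."""
    models = {}
    for emotion, seqs in sequences_by_emotion.items():
        X = np.vstack(seqs)            # stack all training sequences
        lengths = [len(s) for s in seqs]
        m = GaussianHMM(n_components=n_states, n_iter=50)
        m.fit(X, lengths)
        models[emotion] = m
    return models

def classify(models, seq):
    """Pick the emotion whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda e: models[e].score(seq))
```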


Hybrid Facial Representations for Emotion Recognition

  • Yun, Woo-Han;Kim, DoHyung;Park, Chankyu;Kim, Jaehong
    • ETRI Journal / Vol. 35, No. 6 / pp. 1021-1028 / 2013
  • Automatic facial expression recognition is a widely studied problem in computer vision and human-robot interaction. A range of facial descriptors has been studied for facial expression recognition, and some prominent descriptors were presented in the first facial expression recognition and analysis challenge (FERA2011). In that competition, the Local Gabor Binary Pattern Histogram Sequence descriptor showed the most powerful description capability. In this paper, we introduce hybrid facial representations for facial expression recognition that have more powerful description capability with lower dimensionality. Our descriptors consist of a block-based descriptor, which represents micro-orientation and micro-geometric structure information, and a pixel-based descriptor, which represents texture information. We validate the descriptors on two public databases, and the results show that they perform well with a relatively low dimensionality.
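
A toy descriptor in the same hybrid spirit, concatenating per-block LBP texture histograms with per-block gradient-orientation statistics; the paper's actual block-based and pixel-based descriptors differ in detail.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def hybrid_descriptor(face, grid=(4, 4), P=8, R=1):
    """face: 2D grayscale array. Combines texture (LBP histograms) with
    micro-orientation (mean gradient angle/magnitude) per block."""
    lbp = local_binary_pattern(face, P, R, method="uniform")
    gy, gx = np.gradient(face.astype(float))
    ang, mag = np.arctan2(gy, gx), np.hypot(gx, gy)
    h, w = face.shape
    feats = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            sl = (slice(i*h//grid[0], (i+1)*h//grid[0]),
                  slice(j*w//grid[1], (j+1)*w//grid[1]))
            hist, _ = np.histogram(lbp[sl], bins=P + 2, range=(0, P + 2))
            feats.append(hist / (hist.sum() + 1e-9))   # texture block
            feats.append([ang[sl].mean(), mag[sl].mean()])  # orientation block
    return np.hstack([np.asarray(f, dtype=float).ravel() for f in feats])
```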