• Title/Summary/Keyword: vision-based recognition


A Computer Vision-based Assistive Mobile Application for the Visually Impaired (컴퓨터 비전 기반 시각 장애 지원 모바일 응용)

  • Secondes, Arnel A.;Otero, Nikki Anne Dominique D.;Elijorde, Frank I.;Byun, Yung-Cheol
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.12 / pp.2138-2144 / 2016
  • People with visual disabilities face environmental, social, and technological barriers. Navigating through places and recognizing objects are already big challenges for them, and they require assistance. This study aimed to develop an Android-based assistive application for the visually impaired. Specifically, it aimed to create a system that helps visually impaired individuals perform significant tasks through object recognition and that identifies locations through GPS and Google Maps. The researchers used an Android phone that allows a visually impaired individual to go from one place to another with the aid of the application. Google Maps is integrated to utilize GPS for identifying locations and giving distance directions, and the system has a cloud server for storing pinned locations. Furthermore, Haar-like features were used for object recognition.
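Haar-like features of the kind the abstract mentions rest on integral images, which reduce any rectangular pixel sum to four table lookups. A minimal NumPy sketch, with an illustrative toy image rather than anything from the paper:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row/column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, r, c, h, w):
    """Sum of pixels in the h-by-w rectangle with top-left corner (r, c)."""
    return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

def haar_two_rect_horizontal(ii, r, c, h, w):
    """Two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, r, c, h, half) - rect_sum(ii, r, c + half, h, half)

# Toy image: bright left half, dark right half -> strong positive response.
img = np.hstack([np.full((4, 4), 255), np.zeros((4, 4), dtype=int)])
ii = integral_image(img)
print(haar_two_rect_horizontal(ii, 0, 0, 4, 8))  # 4 * 4 * 255 = 4080
```

A cascade classifier (as in OpenCV's Haar detectors) evaluates many such features per window; this sketch shows only the feature computation itself.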

Human Action Recognition Using Deep Data: A Fine-Grained Study

  • Rao, D. Surendra;Potturu, Sudharsana Rao;Bhagyaraju, V
    • International Journal of Computer Science & Network Security / v.22 no.6 / pp.97-108 / 2022
  • The video-assisted human action recognition [1] field is one of the most active in computer vision research. Since the depth data [2] obtained by Kinect cameras have more benefits than traditional RGB data, research on human action detection has recently increased because of the Kinect camera. We conducted a systematic study of strategies for recognizing human activity based on depth data in this article. All methods are grouped into depth-map tactics and skeleton tactics. A comparison with some of the more traditional strategies is also covered. We then examined the specifics of different depth behavior databases and provided a straightforward distinction between them. Finally, we address the advantages and disadvantages of depth- and skeleton-based techniques.

Hand Motion Design for Performance Enhancement of Vision Based Hand Signal Recognizer (영상기반의 안정적 수신호 인식기를 위한 손동작 패턴 설계 방법)

  • Shon, Su-Won;Beh, Joung-Hoon;Yang, Cheol-Jong;Wang, Han;Ko, Han-Seok
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.4 / pp.30-37 / 2011
  • This paper proposes a language set of hand motions for enhancing the performance of a vision-based hand signal recognizer. Based on a statistical analysis of the angular tendency of hand movements in sign language and of hand motions in practical use, we construct four motion primitives as building blocks for basic hand motions. By combining these motion primitives, we design a discernible 'fundamental hand motion set' aimed at improving hand signal recognition. To demonstrate the validity of the proposed design method, we develop a 'fundamental hand motion set' recognizer based on a hidden Markov model (HMM). The recognition system showed a 99.01% recognition rate on the proposed language set. This result validates that the proposed language set enhances discernibility among the hand motions, so that the performance of the hand signal recognizer is improved.
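An HMM recognizer of the kind described scores the observed primitive sequence against one model per hand-motion class and picks the most likely class. A toy forward-algorithm sketch; the two states, four primitives, and all matrix values below are illustrative stand-ins, not the paper's trained parameters:

```python
import numpy as np

# Hypothetical 2-state HMM over 4 motion primitives (e.g. up/down/left/right).
pi = np.array([0.6, 0.4])                # initial state probabilities
A = np.array([[0.7, 0.3],                # state transition probabilities
              [0.4, 0.6]])
B = np.array([[0.5, 0.1, 0.2, 0.2],      # per-state emission probabilities
              [0.1, 0.5, 0.2, 0.2]])     # over the 4 primitive symbols

def forward_log_likelihood(obs):
    """Log-likelihood of a primitive index sequence under the HMM."""
    alpha = pi * B[:, obs[0]]            # forward variable at t = 0
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # predict, then weight by emission
    return np.log(alpha.sum())

# A recognizer would evaluate this for one HMM per hand-motion class
# and return the class with the highest likelihood.
print(forward_log_likelihood([0, 0, 1, 2]))
```

For longer sequences a scaled or log-space forward pass avoids underflow; the plain version here is kept short for clarity.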

Guidelines for Data Construction when Estimating Traffic Volume based on Artificial Intelligence using Drone Images (드론영상과 인공지능 기반 교통량 추정을 위한 데이터 구축 가이드라인 도출 연구)

  • Han, Dongkwon;Kim, Doopyo;Kim, Sungbo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.3 / pp.147-157 / 2022
  • Recently, many studies have analyzed traffic or performed object recognition that classifies vehicles through artificial intelligence-based prediction models using CCTV (Closed Circuit TeleVision) or drone images. Developing an object recognition deep learning model for accurate traffic estimation requires systematic data construction, yet related standardized guidelines are insufficient. In this study, previous studies were analyzed to derive guidelines for establishing artificial intelligence training data for traffic estimation using drone images, referencing business reports as well as guidelines for AI training data and quality management. The guidelines for data construction are divided into data acquisition, preprocessing, and validation, with notes and evaluation indices presented for each item. The guidelines aim to assist the development of a robust and generalized artificial intelligence model for estimating road traffic from drone images.
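A common evaluation index in the validation stage of such object-annotation datasets is intersection-over-union between an annotated box and a reference box. The abstract does not name its exact indices, so the sketch below is only an illustrative example of one:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # overlap area, 0 if disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 2x2 boxes overlapping in one unit square: IoU = 1 / 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A validation pass might flag annotations whose IoU against a second annotator's boxes falls below an agreed threshold.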

MultiView-Based Hand Posture Recognition Method Based on Point Cloud

  • Xu, Wenkai;Lee, Ick-Soo;Lee, Suk-Kwan;Lu, Bo;Lee, Eung-Joo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.7 / pp.2585-2598 / 2015
  • Hand posture recognition has played a very important role in Human Computer Interaction (HCI) and Computer Vision (CV) for many years. The challenge arises mainly from self-occlusions caused by the limited view of the camera. In this paper, a robust hand posture recognition approach based on 3D point clouds from two RGB-D sensors (Kinect) is proposed to make maximum use of the 3D information in the depth maps. Through noise reduction and registration of the two point sets obtained from the two designed views, a multi-view hand posture point cloud carrying the most 3D information can be acquired. Moreover, we utilize the accurate reconstruction and classify each point cloud by directly matching the normalized point set with templates of different classes from a dataset, which reduces training time and computation. Experimental results on a posture dataset captured by Kinect sensors (digits 1 to 10) demonstrate the effectiveness of the proposed method.
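Direct matching of a normalized point set against class templates can be sketched with a symmetric Chamfer distance. The normalization (center, unit RMS radius) and the distance choice below are common conventions, not necessarily the authors' exact ones:

```python
import numpy as np

def normalize(points):
    """Center a point cloud and scale it to unit RMS radius."""
    p = points - points.mean(axis=0)
    scale = np.sqrt((p ** 2).sum(axis=1).mean())
    return p / scale if scale > 0 else p

def chamfer(a, b):
    """Symmetric Chamfer distance between (N,3) and (M,3) point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def classify(cloud, templates):
    """Label of the template closest to the normalized input cloud."""
    q = normalize(cloud)
    return min(templates, key=lambda k: chamfer(q, normalize(templates[k])))

# Illustrative templates; a scaled, shifted copy of one still matches it,
# since normalization removes translation and scale.
line = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]], float)
square = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], float)
print(classify(5 * line + 3, {"line": line, "square": square}))
```

Because no per-class model is trained, adding a new posture class is just adding a template, which is the training-time saving the abstract points to.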

A Novel Image Segmentation Method Based on Improved Intuitionistic Fuzzy C-Means Clustering Algorithm

  • Kong, Jun;Hou, Jian;Jiang, Min;Sun, Jinhua
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3121-3143 / 2019
  • Segmentation plays an important role in image processing and computer vision. The intuitionistic fuzzy C-means (IFCM) clustering algorithm has emerged as an effective technique for image segmentation in recent years. However, the standard fuzzy C-means (FCM) and IFCM algorithms are sensitive to noise and to the initial cluster centers, and they ignore the spatial relationships of pixels. In view of these shortcomings, an improved algorithm based on IFCM is proposed in this paper. First, we propose a modified non-membership function to generate the intuitionistic fuzzy set and a method of determining initial cluster centers based on grayscale features; together these highlight the effect of uncertainty in the intuitionistic fuzzy set and improve robustness to noise. Second, an improved nonlinear kernel function is proposed to map the data into kernel space, measuring the distance between the data and the cluster centers more accurately. Third, a local spatial-gray information measure is introduced, which considers membership degree, gray features, and spatial position simultaneously. Finally, we propose a new measure of intuitionistic fuzzy entropy that accounts for both the fuzziness and the intuition of the intuitionistic fuzzy set. Experimental results show that, compared with other IFCM-based algorithms, the proposed algorithm has better segmentation and clustering performance.
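For orientation, the plain FCM baseline that IFCM extends alternates membership and center updates until convergence. The 1-D sketch below shows only standard FCM on gray levels, none of the paper's intuitionistic, kernel, or spatial extensions:

```python
import numpy as np

def fcm(data, c=2, m=2.0, iters=50, seed=0):
    """Plain fuzzy C-means on 1-D data (e.g. pixel gray levels)."""
    rng = np.random.default_rng(seed)
    u = rng.random((c, data.size))
    u /= u.sum(axis=0)                        # memberships sum to 1 per point
    for _ in range(iters):
        w = u ** m                            # fuzzified memberships
        centers = (w @ data) / w.sum(axis=1)  # membership-weighted centers
        d = np.abs(data[None, :] - centers[:, None]) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))         # standard FCM membership update
        u = inv / inv.sum(axis=0)
    return centers, u

# Two well-separated intensity groups; centers settle near 0.1 and 5.1.
data = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
centers, u = fcm(data)
print(np.sort(centers))
```

Sensitivity to the random initialization visible here (the `seed` argument) is exactly the weakness the paper's grayscale-based center initialization targets.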

Multimodal Attention-Based Fusion Model for Context-Aware Emotion Recognition

  • Vo, Minh-Cong;Lee, Guee-Sang
    • International Journal of Contents / v.18 no.3 / pp.11-20 / 2022
  • Human emotion recognition is an exciting topic that has attracted many researchers for a long time. In recent years, there has been increasing interest in exploiting contextual information for emotion recognition. Previous explorations in psychology show that emotional perception is influenced by facial expressions as well as by contextual information from the scene, such as human activities, interactions, and body poses. Those explorations started a trend in computer vision of treating contexts as modalities for inferring the predicted emotion along with facial expressions. However, contextual information has not been fully exploited: the scene emotion created by the surrounding environment can shape how people perceive emotion. Besides, additive fusion in the usual multimodal training fashion is not practical, because the contributions of the modalities to the final prediction are not equal. The purpose of this paper is to contribute to this growing area of research by exploring the effectiveness of the emotional scene gist of the input image for inferring the emotional state of the primary target. The emotional scene gist includes emotion, emotional feelings, and actions or events that directly trigger emotional reactions in the input image. We also present an attention-based fusion network that combines multimodal features according to their impact on the target emotional state. We demonstrate the effectiveness of the method through a significant improvement on the EMOTIC dataset.
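Attention-based fusion replaces equal-weight additive fusion with learned per-modality weights. In the minimal sketch below, the softmax weighting is a common choice and the scores are stand-ins for the output of a learned attention sub-network, not the paper's architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))     # shift for numerical stability
    return e / e.sum()

def attention_fusion(features, scores):
    """Fuse per-modality feature vectors with softmax attention weights,
    so modalities contribute unequally to the final representation."""
    w = softmax(np.asarray(scores, dtype=float))
    fused = sum(wi * np.asarray(f, dtype=float) for wi, f in zip(w, features))
    return fused, w

face = [1.0, 0.0]     # stand-in facial-expression feature
scene = [0.0, 1.0]    # stand-in scene-context feature
fused, w = attention_fusion([face, scene], scores=[2.0, 0.0])
print(fused, w)       # face dominates because its score is higher
```

With equal scores this degenerates to plain averaging, which is the additive-fusion baseline the abstract argues against.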

A Real Time Lane Detection Algorithm Using LRF for Autonomous Navigation of a Mobile Robot (LRF 를 이용한 이동로봇의 실시간 차선 인식 및 자율주행)

  • Kim, Hyun Woo;Hawng, Yo-Seup;Kim, Yun-Ki;Lee, Dong-Hyuk;Lee, Jang-Myung
    • Journal of Institute of Control, Robotics and Systems / v.19 no.11 / pp.1029-1035 / 2013
  • This paper proposes a real-time lane detection algorithm using an LRF (Laser Range Finder) for the autonomous navigation of a mobile robot. Many technologies exist for vehicle safety, such as airbags, ABS, and EPS. Real-time lane detection is a fundamental requirement for an automobile system that utilizes information from outside the vehicle. Representative lane recognition methods are vision-based and LRF-based systems. A vision-based system recognizes the three-dimensional environment well only under good image-capturing conditions; unexpected obstacles such as bad illumination, occlusions, and vibrations prevent vision alone from satisfying this fundamental requirement. In this paper, we introduce a three-dimensional lane detection algorithm using an LRF, which is very robust against illumination. For three-dimensional lane detection, the difference in laser reflection between the asphalt and the lane marking, which depends on color and distance, is utilized to extract feature points. A stable tracking algorithm is also introduced empirically in this research. The performance of the proposed lane detection and tracking algorithm has been verified through real experiments.
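Exploiting the stronger laser reflection of lane paint versus asphalt can be sketched as intensity thresholding of the scan followed by polar-to-Cartesian conversion. The threshold value and frame conventions below are illustrative, not the paper's calibration:

```python
import numpy as np

def lane_points(angles_deg, ranges_m, intensities, asphalt_thresh):
    """Keep scan points whose reflection intensity exceeds the asphalt
    level (painted lane markings reflect more strongly than asphalt)
    and convert them from polar scan coordinates to the robot x-y frame."""
    mask = np.asarray(intensities) > asphalt_thresh
    a = np.radians(np.asarray(angles_deg)[mask])
    r = np.asarray(ranges_m)[mask]
    return np.column_stack([r * np.cos(a), r * np.sin(a)])

# One bright return at 90 degrees, 2 m: a candidate lane feature point.
print(lane_points([0.0, 90.0], [1.0, 2.0], [10.0, 200.0], asphalt_thresh=100.0))
```

In practice the threshold would vary with distance (return intensity falls off with range), and the surviving points would feed the tracking stage.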

Vision based Traffic Light Detection and Recognition Methods for Daytime LED Traffic Light (비전 기반 주간 LED 교통 신호등 인식 및 신호등 패턴 판단에 관한 연구)

  • Kim, Hyun-Koo;Park, Ju H.;Jung, Ho-Youl
    • IEMEK Journal of Embedded Systems and Applications / v.9 no.3 / pp.145-150 / 2014
  • This paper presents an effective vision-based method for LED traffic light detection in the daytime. First, the proposed method calculates horizontal coordinates to set a region of interest (ROI) on the input image sequence. Second, it uses color segmentation to extract the regions of green and red traffic lights. Next, to distinguish traffic lights from noise, a shape filter and Haar-like feature values are used. Finally, a weighted temporal delay filter is applied to remove the blinking effect of LED traffic lights, and the detection state and weight are used to classify the traffic light type. For the simulations, the proposed method was implemented on an Intel Core CPU at 2.80 GHz with 4 GB RAM and tested on urban and rural road video. The average traffic light detection rate is 94.50 %, the average recognition rate of the traffic light type is 90.24 %, and the average computing time of the proposed method is 11 ms.
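A weighted temporal delay filter for suppressing LED blinking can be sketched as an exponentially smoothed detection weight compared against a decision threshold; the gain and threshold values below are illustrative, not the paper's tuned ones:

```python
def temporal_filter(detections, gain=0.3, on_thresh=0.4):
    """Smooth noisy per-frame 0/1 detections of a blinking LED light:
    the internal weight rises while the light is detected and decays
    gradually on a miss, so single-frame dropouts are bridged."""
    weight, states = 0.0, []
    for d in detections:
        weight += gain * (d - weight)        # exponential smoothing
        states.append(weight >= on_thresh)   # thresholded on/off state
    return states

# The single dropout at index 3 does not flip the reported state.
print(temporal_filter([1, 1, 1, 0, 1, 1]))
```

The trade-off is a short turn-on delay (the first frame here still reads off), which is the "delay" in the filter's name.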

The Basic Position Tracking Technology of Power Connector Receptacle based on the Image Recognition (영상인식 기반 파워 컨넥터 리셉터클의 위치 확인을 위한 기초 연구)

  • Ko, Yun-Seok
    • The Journal of the Korea institute of electronic communication sciences / v.12 no.2 / pp.309-314 / 2017
  • Recently, fields such as service robots, autonomous electric cars, and torpedo ladle cars operated autonomously to improve the efficiency of steel mill management have been receiving great attention. However, developing an automatic power supply that requires no human intervention remains a problem. In this paper, a computer vision-based position tracking technology is studied that can recognize a power connector receptacle and identify its position, and its feasibility is finally verified using the OpenCV library.
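One common OpenCV-style way to locate a marker such as a receptacle is normalized cross-correlation template matching (what `cv2.matchTemplate` with `TM_CCOEFF_NORMED` computes). The pure-NumPy sketch below mirrors that idea; it is an assumption about the approach, not the paper's verified implementation:

```python
import numpy as np

def match_position(image, template):
    """Locate a template (e.g. a receptacle marker) in a grayscale image
    by maximizing zero-mean normalized cross-correlation."""
    th, tw = template.shape
    t = template - template.mean()
    best, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
            score = (p * t).sum() / denom if denom > 0 else 0.0
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

# Plant a distinctive pattern at row 3, column 2 and recover its position.
img = np.zeros((8, 8))
tmpl = np.array([[1.0, 2.0], [3.0, 4.0]])
img[3:5, 2:4] = tmpl
print(match_position(img, tmpl))  # (3, 2)
```

The nested loops are fine for a sketch; a real implementation would use OpenCV's vectorized matcher for frame-rate operation.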