• Title/Summary/Keyword: Real-time object recognition

Road Object Graph Modeling Method for Efficient Road Situation Recognition (효과적인 도로 상황 인지를 위한 도로 객체 그래프 모델링 방법)

  • Ariunerdene, Nyamdavaa; Jeong, Seongmo; Song, Seokil
    • Journal of Platform Technology / v.9 no.4 / pp.3-9 / 2021
  • In this paper, a graph data model is introduced to effectively recognize the situation between objects on the road detected by vehicles or road infrastructure sensors. The proposed method builds a graph database by modeling each object on the road as a node of the graph and the relationship between objects as an edge, and updates object and edge properties in real time. The relationship between objects, represented as an edge, is set when there is a possibility of approach between the objects, considering the position, direction, and speed of each object. Finally, we propose a spatial indexing technique for graph nodes and edges so that the road object graph database can be updated continuously in real time. To show the superiority of the proposed indexing technique, we compare the index-based database update method to a non-indexed update method through simulation. The simulation results show that the proposed method outperforms the non-indexed method by more than a factor of 10.
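
A minimal Python sketch of the modeling idea described above may help: road objects become nodes, a "possible approach" relation becomes an edge, and a uniform-grid spatial index keeps each update local. The class name, the 3x3-cell neighbor scan, the sampled approach test, and all thresholds are assumptions for illustration, not the paper's actual data model or index structure.

```python
import math
from collections import defaultdict

class RoadObjectGraph:
    """Sketch: road objects as nodes, 'possible approach' relations as edges,
    with a uniform-grid spatial index so updates touch only nearby objects."""

    def __init__(self, cell_size=20.0, horizon=3.0):
        self.cell_size = cell_size      # grid cell edge length (m), assumed
        self.horizon = horizon          # look-ahead time for approach test (s)
        self.nodes = {}                 # obj_id -> {'pos': (x, y), 'vel': (vx, vy)}
        self.edges = set()              # frozenset({id_a, id_b})
        self.grid = defaultdict(set)    # (cx, cy) -> set of obj_ids

    def _cell(self, pos):
        return (int(pos[0] // self.cell_size), int(pos[1] // self.cell_size))

    def update_object(self, obj_id, pos, vel):
        # Move the node between grid cells and refresh its edges.
        if obj_id in self.nodes:
            self.grid[self._cell(self.nodes[obj_id]['pos'])].discard(obj_id)
        self.nodes[obj_id] = {'pos': pos, 'vel': vel}
        self.grid[self._cell(pos)].add(obj_id)
        self._refresh_edges(obj_id)

    def _neighbors(self, obj_id):
        # Only the 3x3 block of cells around the object is scanned,
        # which is what makes the indexed update cheap.
        cx, cy = self._cell(self.nodes[obj_id]['pos'])
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for other in self.grid[(cx + dx, cy + dy)]:
                    if other != obj_id:
                        yield other

    def _approaching(self, a, b):
        # Edge condition (assumed): predicted positions come within one
        # cell length of each other at some sampled time inside the horizon.
        pa, va = self.nodes[a]['pos'], self.nodes[a]['vel']
        pb, vb = self.nodes[b]['pos'], self.nodes[b]['vel']
        for t in (0.0, self.horizon / 2, self.horizon):
            dx = (pa[0] + va[0] * t) - (pb[0] + vb[0] * t)
            dy = (pa[1] + va[1] * t) - (pb[1] + vb[1] * t)
            if math.hypot(dx, dy) < self.cell_size:
                return True
        return False

    def _refresh_edges(self, obj_id):
        self.edges = {e for e in self.edges if obj_id not in e}
        for other in self._neighbors(obj_id):
            if self._approaching(obj_id, other):
                self.edges.add(frozenset((obj_id, other)))

g = RoadObjectGraph()
g.update_object('car1', (0.0, 0.0), (10.0, 0.0))
g.update_object('car2', (25.0, 0.0), (-5.0, 0.0))   # closing in -> edge created
print(g.edges)
```

The grid is what the indexed update buys: refreshing an object's edges scans only nearby cells instead of every node, which is the effect the simulation comparison in the paper quantifies.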

Utilization of Laser Range Measurements for Guiding Unmanned Agricultural Machinery

  • Jung, I. G.; Park, W. P.; Kim, S. C.; Sung, J. H.; Chung, S. O.
    • Agricultural and Biosystems Engineering / v.2 no.2 / pp.69-74 / 2001
  • Detection of operation lines in farm work, object recognition, and obstacle avoidance are essential prerequisite technologies for unmanned agricultural machinery. A CCD camera, which has been widely used for these functions, is expensive and makes real-time signal processing difficult. In this study, a laser range sensor was selected as the guiding vision for unmanned agricultural machinery such as a tractor. To achieve this capability, algorithms for distance measurement, signal filtering, object recognition, and obstacle avoidance were developed. Computer simulations were carried out to evaluate the performance of the algorithms, and experiments were also conducted with various materials and shapes. The laser beam lost intensity on poorly reflective materials, yielding range values shorter than the actual distance, so a compensation technique was considered necessary. The object detection system was mounted on an agricultural tractor and its performance was evaluated. Field tests showed that the guidance system enabled the unmanned agricultural machinery to detect and avoid obstacles while following its path.
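
The filtering and compensation steps the abstract mentions can be sketched as follows; the sliding-window median is standard, while the linear intensity-deficit model and its gain are assumptions rather than the paper's calibration.

```python
import numpy as np

def median_filter(ranges, k=5):
    """Sliding-window median to suppress single-beam dropouts in a laser scan."""
    r = np.asarray(ranges, dtype=float)
    pad = k // 2
    padded = np.pad(r, pad, mode='edge')
    return np.array([np.median(padded[i:i + k]) for i in range(len(r))])

def compensate_range(rng, intensity, full_intensity=1.0, gain=0.05):
    """Illustrative compensation: poorly reflective targets return weak echoes
    and read shorter than they are, so inflate the range as intensity drops.
    The linear model and its gain are assumptions, not the paper's calibration."""
    deficit = max(0.0, full_intensity - intensity)
    return rng * (1.0 + gain * deficit)

scan = [4.98, 5.01, 0.2, 5.02, 4.99]        # one dropout beam
print(median_filter(scan))                   # dropout replaced by its neighbors
print(compensate_range(4.6, intensity=0.3))  # dark target read slightly longer
```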

Illumination-Invariant Traffic Sign Recognition in the Driving Environment for Intelligent Vehicles (지능형 자동차를 위한 조명 변화에 강인한 도로표지판 검출 및 인식)

  • Lee, Taewoo; Lim, Kwangyong; Bae, Guntae; Byun, Hyeran; Choi, Yeongwoo
    • Journal of KIISE / v.42 no.2 / pp.203-212 / 2015
  • This paper proposes a traffic sign recognition method for real road environments. Video streams in driving environments have two characteristics that distinguish them from general object video streams. First, the number of traffic sign types is limited and their shapes are mostly simple. Second, the camera cannot capture clear images of road scenes, since illumination changes frequently and weather conditions vary continuously. In this paper, we improve the modified census transform (MCT) to extract features effectively from road scenes with many illumination changes. The extracted features are collected into histograms and transformed into very high-dimensional vectors by dense descriptors. The high-dimensional descriptors are then encoded into a low-dimensional feature vector by Fisher vector coding with a Gaussian mixture model. The proposed method achieves illumination-invariant detection and recognition, and its performance is sufficient to detect and recognize traffic signs in real time with high accuracy.
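
For context, a NumPy sketch of the standard modified census transform (MCT) the paper improves on: every pixel of a 3x3 neighborhood is compared against the neighborhood mean rather than the center pixel, which is what buys robustness to illumination changes. The authors' specific improvement to MCT is not reproduced here.

```python
import numpy as np

def modified_census_transform(img):
    """Modified census transform: each pixel of a 3x3 neighborhood is
    compared against the neighborhood *mean*, yielding a 9-bit code
    (0..511) per position."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint16)
    # Collect the nine shifted views of the image (the 3x3 neighborhood).
    patches = [img[dy:h - 2 + dy, dx:w - 2 + dx]
               for dy in range(3) for dx in range(3)]
    mean = sum(patches) / 9.0
    for bit, p in enumerate(patches):
        codes |= ((p > mean).astype(np.uint16) << bit)
    return codes  # histogrammed downstream into a 512-bin descriptor

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32))
codes = modified_census_transform(img)
hist = np.bincount(codes.ravel().astype(np.int64), minlength=512)
print(hist.shape)  # (512,) MCT histogram, e.g. pooled over a detection window
```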

Intelligent Interface Using Hand Gesture Recognition Based on Artificial Intelligence (인공지능 기반 손 제스처 인식 정보를 활용한 지능형 인터페이스)

  • Hangjun Cho; Junwoo Yoo; Eun Soo Kim; Young Jae Lee
    • Journal of Platform Technology / v.11 no.1 / pp.38-51 / 2023
  • We propose an intelligent interface algorithm that uses hand gesture recognition based on artificial intelligence. Functionally, the method is an interface that recognizes various motions quickly and intelligently by tracking and recognizing user hand gestures with MediaPipe and artificial intelligence techniques such as KNN, LSTM, and CNN. To evaluate the performance of the proposed algorithm, it was applied to a self-made 2D top-view racing game and to robot control. In the game, the algorithm controlled the various movements of the virtual object in detail and robustly. Applied to robot control in the real world, it controlled movement, stopping, left turns, and right turns. In addition, by controlling the main character of the game and the real-world robot at the same time, the optimized motion was implemented as an intelligent interface for controlling a coexistent virtual and real space. The proposed algorithm enables sophisticated control through natural and intuitive body movements and fine finger-movement recognition, and users become skilled with it in a short period of time, so it can serve as a basis for developing intelligent user interfaces.
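
The MediaPipe-plus-classifier pipeline can be sketched as below, shown with the KNN branch only; the dummy training data and the wrist-relative normalization are illustrative assumptions, not the paper's exact feature design.

```python
import cv2
import mediapipe as mp
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data: real use would record landmark samples per
# gesture; two dummy classes keep the sketch self-contained.
knn = KNeighborsClassifier(n_neighbors=3)
rng = np.random.default_rng(0)
X_train = rng.random((20, 63))            # 21 landmarks x (x, y, z)
y_train = ['stop'] * 10 + ['go'] * 10
knn.fit(X_train, y_train)

hands = mp.solutions.hands.Hands(max_num_hands=1)
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        pts = result.multi_hand_landmarks[0].landmark
        feat = np.array([[p.x, p.y, p.z] for p in pts]).ravel()
        # Wrist-relative coordinates so hand position in the frame does not
        # dominate the prediction (an assumption, not the paper's design).
        feat -= np.tile(feat[:3], 21)
        print(knn.predict([feat])[0])     # e.g. issue a 'stop' / 'go' command
    cv2.imshow('hands', frame)
    if cv2.waitKey(1) & 0xFF == 27:       # Esc quits
        break
cap.release()
cv2.destroyAllWindows()
```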

Recognition and Modeling of 3D Environment based on Local Invariant Features (지역적 불변특징 기반의 3차원 환경인식 및 모델링)

  • Jang, Dae-Sik
    • Journal of the Korea Society of Computer and Information / v.11 no.3 / pp.31-39 / 2006
  • This paper presents a novel approach to real-time recognition of 3D environments and objects for various applications such as intelligent robots, intelligent vehicles, and intelligent buildings. First, we establish three fundamental principles that humans use for recognizing and interacting with the environment. These principles have led to the development of an integrated approach to real-time 3D recognition and modeling, as follows: 1) It starts with a rapid but approximate characterization of the geometric configuration of the workspace by identifying global plane features. 2) It quickly recognizes known objects in the environment and replaces them with their database models based on 3D registration. 3) It models geometric details on the fly, adaptively to the needs of the given task, based on a multi-resolution octree representation. SIFT features with their 3D position data, referred to here as stereo-sis SIFT, are used extensively, together with point clouds, for fast extraction of global plane features, fast recognition of objects, and fast registration of scenes, as well as for overcoming the incomplete and noisy nature of point clouds.
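
Step 1, the rapid extraction of a global plane feature, is classically done by RANSAC over a point cloud. The sketch below shows that generic step on synthetic data; it does not reproduce stereo-sis SIFT itself.

```python
import numpy as np

def ransac_plane(points, iters=200, tol=0.02, rng=np.random.default_rng(0)):
    """Fit a dominant plane to an Nx3 point cloud with RANSAC: sample three
    points, build the plane normal from their cross product, count inliers
    within `tol` of the plane, keep the best hypothesis."""
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        a, b, c = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(b - a, c - a)
        norm = np.linalg.norm(n)
        if norm < 1e-9:
            continue  # degenerate (collinear) sample
        n = n / norm
        dist = np.abs((points - a) @ n)
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

# Synthetic scene: a floor plane plus scattered clutter.
rng = np.random.default_rng(1)
floor = np.column_stack([rng.uniform(-1, 1, 500),
                         rng.uniform(-1, 1, 500),
                         rng.normal(0, 0.005, 500)])
clutter = rng.uniform(-1, 1, (100, 3))
mask = ransac_plane(np.vstack([floor, clutter]))
print(mask.sum(), "of 600 points assigned to the dominant plane")
```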

Real-time Traffic Sign Recognition using Rotation-invariant Fast Binary Patterns (회전에 강인한 고속 이진패턴을 이용한 실시간 교통 신호 표지판 인식)

  • Hwang, Min-Chul; Ko, Byoung Chul; Nam, Jae-Yeal
    • Journal of Broadcast Engineering / v.21 no.4 / pp.562-568 / 2016
  • In this paper, we focus on the recognition of speed-limit signs among the various types of traffic signs, because speed-limit signs are closely related to safe driving. Although the histogram of oriented gradients (HOG) and local binary patterns (LBP) are representative features for object recognition, they are weak with respect to rotation because they do not consider rotation of the target object when generating patterns. This paper therefore proposes the fast rotation-invariant binary patterns (FRIBP) algorithm to generate a binary pattern that is robust against rotation. The proposed FRIBP algorithm deletes an unused layer of the histogram and eliminates the shift and comparison operations in order to extract the desired features quickly. The FRIBP algorithm is successfully applied to the German Traffic Sign Recognition Benchmark (GTSRB) dataset, and the results show that its recognition capability is similar to that of other methods, while its recognition speed is considerably faster than that of related works, at approximately 0.47 seconds for 12,630 test images.
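
The abstract does not spell out FRIBP's internals (the histogram layer deletion and the removal of shift/comparison operations), so the sketch below shows only the baseline it accelerates: an 8-neighbor LBP made rotation invariant by mapping each code to the minimum over its circular bit rotations, precomputed as a lookup table.

```python
import numpy as np

def min_rotation(code, bits=8):
    """Map an LBP code to the minimum over all circular bit rotations,
    the classic trick that makes the pattern rotation invariant."""
    best = code
    for _ in range(bits - 1):
        code = ((code >> 1) | ((code & 1) << (bits - 1))) & ((1 << bits) - 1)
        best = min(best, code)
    return best

def rotation_invariant_lbp(img):
    """8-neighbor LBP on a grayscale image, then the rotation-invariant map.
    FRIBP's own speedups are not reproduced here; this is the baseline idea."""
    img = np.asarray(img, dtype=np.int32)
    h, w = img.shape
    center = img[1:h-1, 1:w-1]
    # Neighbors listed in circular order so bit rotation = spatial rotation.
    offsets = [(-1,-1), (-1,0), (-1,1), (0,1), (1,1), (1,0), (1,-1), (0,-1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1+dy:h-1+dy, 1+dx:w-1+dx]
        codes |= ((neighbor >= center).astype(np.int32) << bit)
    lut = np.array([min_rotation(c) for c in range(256)])
    return lut[codes]

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, (16, 16))
print(np.unique(rotation_invariant_lbp(patch)).size,
      "distinct rotation-invariant patterns")
```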

Real-Time Human Tracker Based on Location and Motion Recognition of User for Smart Home (스마트 홈을 위한 사용자 위치와 모션 인식 기반의 실시간 휴먼 트랙커)

  • Choi, Jong-Hwa; Park, Se-Young; Shin, Dong-Kyoo; Shin, Dong-Il
    • The KIPS Transactions: Part A / v.16A no.3 / pp.209-216 / 2009
  • The ubiquitous smart home is the home of the future: it takes advantage of context information about the human and the home environment and provides automatic home services for the human. Human location and motion are the most important contexts in the ubiquitous smart home. We present a real-time human tracker that predicts human location and motion for the ubiquitous smart home, using four network cameras for real-time tracking. This paper explains the tracker's architecture and presents an algorithm detailing its two functions, prediction of human location and of human motion. Location prediction uses three kinds of background images (IMAGE1: the empty room; IMAGE2: the room with furniture and home appliances; IMAGE3: IMAGE2 plus the human). By analyzing the three images, the tracker determines which piece of furniture (or home appliance) the human is near, and it predicts human motion using a support vector machine (SVM). Locating the human from the three images took an average of 0.037 seconds. The SVM feature for motion recognition is derived from the number of pixels in each array line of the moving object. We evaluated each motion 1,000 times; the average accuracy over all motions was 86.5%.
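
A toy sketch of the motion feature described above: difference the frame containing the person against the furnished-room background and count foreground pixels per image row, then feed the resulting vertical profile to an SVM. The synthetic frames, labels, and threshold are all illustrative.

```python
import numpy as np
from sklearn.svm import SVC

def motion_feature(image3, image2, thresh=30):
    """Subtract the furnished-room background (IMAGE2) from the frame with
    the person (IMAGE3) and count foreground pixels per row, giving a
    vertical profile of the moving object."""
    diff = np.abs(image3.astype(int) - image2.astype(int))
    return (diff > thresh).sum(axis=1)  # one count per image row

# Hypothetical training set: profiles for two motions (labels illustrative).
rng = np.random.default_rng(0)
h, w = 120, 160
background = rng.integers(0, 256, (h, w))
X, y = [], []
for label, rows in (('standing', slice(20, 110)), ('sitting', slice(60, 110))):
    for _ in range(30):
        frame = background.copy()
        frame[rows, 70:90] = rng.integers(0, 256, (rows.stop - rows.start, 20))
        X.append(motion_feature(frame, background))
        y.append(label)
clf = SVC(kernel='rbf').fit(X, y)

test = background.copy()
test[25:105, 72:88] = 255                # tall silhouette -> 'standing'
print(clf.predict([motion_feature(test, background)])[0])
```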

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun; Yang, Seong-Hun; Oh, Seung-Jin; Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the requirements for analysis and utilization. Because many industries lack the skilled manpower to analyze videos, machine learning and artificial intelligence are actively used to assist. Demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and re-identification (Re-ID) has therefore also increased rapidly. However, object detection and tracking suffer from difficulties that degrade performance, such as occlusion and the re-appearance of an object after it leaves the recording location. Accordingly, action and emotion detection models built on top of object detection and tracking also have difficulty extracting data for each object. In addition, deep learning architectures composed of multiple models suffer from performance degradation due to bottlenecks and a lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and the AWS Rekognition emotion recognition service. The proposed system uses single-linkage hierarchical clustering for Re-ID and several processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, offers near real-time processing, and prevents tracking failures due to object departure and re-appearance, occlusion, and so on. By continuously linking the action and facial emotion detection results of each object to the same identity, videos can be analyzed efficiently. The re-identification model extracts a feature vector from the bounding box of each object image detected by the tracking model in each frame, and applies single-linkage hierarchical clustering against feature vectors from past frames to identify objects whose tracks were lost. Through this process, an object that re-appears after occlusion or after leaving the scene can be re-associated with its earlier track, so the action and facial emotion detection results of a newly detected object can be linked to those of the object that appeared in the past. To improve processing performance, we introduce a per-object bounding box queue and a feature queue, which reduce RAM requirements while maximizing GPU throughput. We also introduce the IoF (Intersection over Face) algorithm, which links facial emotions recognized through AWS Rekognition to object tracking information.
The academic significance of this study is that the two-stage re-identification model attains real-time performance, even in a high-cost setting that also performs action and facial emotion detection, through its processing techniques rather than by sacrificing accuracy with simple metrics. The practical implication is that industrial fields which require action and facial emotion detection but struggle with object tracking failures can analyze videos effectively with the proposed system. With its high re-identification accuracy and processing performance, the system can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value. In future work, to measure object tracking performance more precisely, we plan to experiment with the MOT Challenge dataset used by many international conferences, investigate the cases the IoF algorithm cannot handle in order to develop a complementary algorithm, and apply the model to datasets from various fields related to intelligent video analysis.
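
The core re-identification step, single-linkage hierarchical clustering over appearance vectors, can be sketched with SciPy as follows; the cosine metric, distance threshold, and ID-assignment policy are assumptions, not the paper's tuned configuration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def reassign_ids(past_feats, past_ids, new_feats, dist_thresh=0.5):
    """Cluster past and new appearance vectors together with single linkage;
    a new detection inherits the track ID of any past detection in its
    cluster, otherwise it receives a fresh ID."""
    feats = np.vstack([past_feats, new_feats])
    labels = fcluster(linkage(feats, method='single', metric='cosine'),
                      t=dist_thresh, criterion='distance')
    past_labels, new_labels = labels[:len(past_feats)], labels[len(past_feats):]
    next_id, assigned = max(past_ids) + 1, []
    for lab in new_labels:
        matches = [pid for pid, pl in zip(past_ids, past_labels) if pl == lab]
        if matches:
            assigned.append(matches[0])   # re-identified: reuse the old track ID
        else:
            assigned.append(next_id)      # unseen appearance: open a new track
            next_id += 1
    return assigned

rng = np.random.default_rng(0)
person_a, person_b = rng.normal(0, 1, 128), rng.normal(5, 1, 128)
past = np.stack([person_a, person_b])
new = np.stack([person_a + rng.normal(0, 0.05, 128),  # person A re-appears
                rng.normal(-5, 1, 128)])              # brand-new person
print(reassign_ids(past, [1, 2], new))                # expected: [1, 3]
```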

AnoVid: A Deep Neural Network-based Tool for Video Annotation (AnoVid: 비디오 주석을 위한 심층 신경망 기반의 도구)

  • Hwang, Jisu; Kim, Incheol
    • Journal of Korea Multimedia Society / v.23 no.8 / pp.986-1005 / 2020
  • In this paper, we propose AnoVid, an automated video annotation tool based on deep neural networks that automatically generates various metadata for each scene or shot in a long drama video containing rich elements. To this end, a novel metadata schema for drama video is designed. Based on this schema, the AnoVid annotation tool uses a total of six deep neural network models, for object detection, place recognition, time zone recognition, person recognition, activity detection, and description generation, to generate rich video annotation data. In addition, AnoVid not only generates a JSON-format video annotation data file automatically, but also provides various visualization facilities for checking the video content analysis results. Through experiments on a real drama video, "Misaeng", we show the practical effectiveness and performance of the proposed video annotation tool, AnoVid.
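
As an illustration of the JSON output side, one annotation record covering the six model outputs might look like the following; every field name and value here is hypothetical, since the schema is defined in the paper itself.

```python
import json

# Hypothetical shape of a per-shot annotation record; the field names are
# illustrative, not AnoVid's actual schema.
annotation = {
    "video": "drama_ep01.mp4",
    "shot": {"index": 12, "start_frame": 3600, "end_frame": 3720},
    "place": "office",
    "time_zone": "day",
    "persons": [{"name": "person_1", "bbox": [412, 96, 188, 340]}],
    "objects": [{"label": "desk", "bbox": [0, 300, 640, 180]}],
    "activities": [{"label": "talking", "persons": ["person_1"]}],
    "description": "A man is talking at a desk in an office."
}
with open("annotations.json", "w") as f:
    json.dump([annotation], f, indent=2, ensure_ascii=False)
```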

Dual Autostereoscopic Display Platform for Multi-user Collaboration with Natural Interaction

  • Kim, Hye-Mi; Lee, Gun-A.; Yang, Ung-Yeon; Kwak, Tae-Jin; Kim, Ki-Hong
    • ETRI Journal / v.34 no.3 / pp.466-469 / 2012
  • In this letter, we propose a dual autostereoscopic display platform employing a natural interaction method, which is useful for sharing visual data among users. To provide 3D visualization of a model to collaborating users, a beamsplitter is used with a pair of autostereoscopic displays, producing the visual illusion of a floating 3D image. To interact with the virtual object, we track the user's hands with a depth camera. The gesture recognition technique operates without any initialization process, such as specific poses or gestures, and supports several commands for controlling virtual objects. Experimental results show that our system visualizes 3D models in real time and handles them well under unconstrained conditions, such as complicated backgrounds or a user wearing short sleeves.
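
The letter does not publish its hand-tracking internals; a common first step with a depth camera, shown here purely as an illustrative sketch, is to segment the nearest surface in front of the sensor as the hand before classifying gestures.

```python
import numpy as np

def nearest_blob_mask(depth, band=0.15):
    """Keep only pixels within a thin depth band in front of everything else,
    assuming the interacting hand is the closest surface to the sensor.
    The band width and the 'nearest object is the hand' assumption are
    illustrative, not taken from the letter."""
    valid = depth > 0                       # zeros are sensor dropouts
    near = depth[valid].min()
    return valid & (depth < near + band)

rng = np.random.default_rng(0)
depth = rng.uniform(1.5, 3.0, (240, 320))   # synthetic background (meters)
depth[100:140, 150:180] = 0.6               # synthetic hand close to the camera
mask = nearest_blob_mask(depth)
ys, xs = np.nonzero(mask)
print("hand centroid:", xs.mean(), ys.mean())  # input to gesture commands
```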