• Title/Summary/Keyword: vision-based recognition

Attention Deep Neural Networks Learning based on Multiple Loss functions for Video Face Recognition (비디오 얼굴인식을 위한 다중 손실 함수 기반 어텐션 심층신경망 학습 제안)

  • Kim, Kyeong Tae;You, Wonsang;Choi, Jae Young
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.10
    • /
    • pp.1380-1390
    • /
    • 2021
  • Video face recognition (FR) is one of the most actively studied topics in computer vision because of its wide range of applications. In particular, research using the attention mechanism is being actively conducted. In video face recognition, attention indicates where to focus within the whole input or a specific region, or which frame to focus on when there are many frames. In this paper, we propose a novel attention-based deep learning method. The main novelties of our method are (1) the combination of two loss functions, namely a weighted Softmax loss function and a Triplet loss function, and (2) the feasibility of end-to-end learning that includes both the feature embedding network and the attention weight computation. Through the combined loss function and end-to-end learning, the feature embedding network has a positive effect on the attention weight computation. To demonstrate the effectiveness of the proposed method, extensive comparative experiments were carried out on the IJB-A dataset using its standard evaluation protocols. The proposed method achieved recognition rates better than or comparable to those of other state-of-the-art video FR methods.
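The combination of the two loss functions named in this abstract can be illustrated with a minimal NumPy sketch. The abstract does not say how the Softmax loss is weighted or how the two terms are balanced, so the per-class weights `class_weights`, the balance factor `lam`, and the `margin` value below are assumptions made only for illustration, not the authors' formulation.

```python
import numpy as np

def weighted_softmax_loss(logits, labels, class_weights):
    """Softmax cross-entropy where each sample is weighted by its class weight (assumed scheme)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = log_probs[np.arange(len(labels)), labels]    # log-probability of the true class
    w = class_weights[labels]
    return float(-(w * picked).sum() / w.sum())

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on squared Euclidean distances: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = ((anchor - positive) ** 2).sum(axis=1)
    d_neg = ((anchor - negative) ** 2).sum(axis=1)
    return float(np.maximum(0.0, d_pos - d_neg + margin).mean())

def combined_loss(logits, labels, class_weights,
                  anchor, positive, negative, lam=1.0, margin=0.2):
    """Sum of the two terms; `lam` is a hypothetical balance factor."""
    return (weighted_softmax_loss(logits, labels, class_weights)
            + lam * triplet_loss(anchor, positive, negative, margin))
```

In an end-to-end setting, both terms would be computed from the same embedding network so that gradients from each reach the shared weights; the sketch only shows the loss arithmetic.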

Design of an efficient learning-based face detection system (학습기반 효율적인 얼굴 검출 시스템 설계)

  • Kim Hyunsik;Kim Wantae;Park Byungjoon
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.19 no.3
    • /
    • pp.213-220
    • /
    • 2023
  • Face recognition is a very important process in video monitoring and is a type of biometric technology. It is mainly used for identification and security purposes, such as ID cards, licenses, and passports. The recognition process involves many variables and is complex, so development has been slow. In this paper, we propose a face recognition method using a CNN, an approach that has been re-examined thanks to recent advances in computers and algorithms, and compare it with the feature comparison method, an existing face recognition algorithm, to verify its performance. The proposed face search method is divided into a face region extraction step and a learning step. For learning, face images were standardized to 50×50 pixels, and learning was conducted while minimizing unnecessary nodes. Convolution- and pooling-based techniques, which are among the deep learning technologies, were used for learning. 1,000 face images were randomly selected from among 7,000 Caltech images, and testing yielded a final recognition rate of 98%.

Photon-counting linear discriminant analysis for face recognition at a distance

  • Yeom, Seok-Won
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.12 no.3
    • /
    • pp.250-255
    • /
    • 2012
  • Face recognition has wide applications in security and surveillance systems as well as in robot vision and machine interfaces. Conventional challenges in face recognition include pose, illumination, and expression, and face recognition at a distance involves additional challenges because long-distance images are often degraded due to poor focusing and motion blurring. This study investigates the effectiveness of applying photon-counting linear discriminant analysis (Pc-LDA) to face recognition in harsh environments. A related technique, Fisher linear discriminant analysis, has been found to be optimal, but it often suffers from the singularity problem because the number of available training images is generally much smaller than the number of pixels. Pc-LDA, on the other hand, realizes the Fisher criterion in high-dimensional space without any dimensionality reduction. Therefore, it provides more invariant solutions to image recognition under distortion and degradation. Two decision rules are employed: one is based on Euclidean distance; the other, on normalized correlation. In the experiments, the asymptotic equivalence of the photon-counting method to the Fisher method is verified with simulated data. Degraded facial images are employed to demonstrate the robustness of the photon-counting classifier in harsh environments. Four types of blurring point spread functions are applied to the test images in order to simulate long-distance acquisition. The results are compared with those of the conventional Eigenface and Fisherface methods. The results indicate that Pc-LDA is better than conventional facial recognition techniques.
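The two decision rules mentioned in this abstract can be contrasted with a short NumPy sketch. It does not implement Pc-LDA itself. It assumes that each class is summarized by a reference vector (here a class mean in some feature space) and that normalized correlation means the inner product divided by the product of the vector norms; both are assumptions, since the abstract does not define them.

```python
import numpy as np

def classify_euclidean(x, class_means):
    """Return the index of the class mean closest to x in Euclidean distance."""
    dists = np.linalg.norm(class_means - x, axis=1)
    return int(np.argmin(dists))

def classify_normalized_correlation(x, class_means):
    """Return the index of the class mean with the highest normalized correlation with x.

    Assumes x and every class mean are nonzero vectors.
    """
    num = class_means @ x
    den = np.linalg.norm(class_means, axis=1) * np.linalg.norm(x)
    return int(np.argmax(num / den))
```

The two rules can disagree when the class references differ greatly in magnitude, because normalized correlation ignores vector length while Euclidean distance does not.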

Human Primitive Motion Recognition Based on the Hidden Markov Models (은닉 마르코프 모델 기반 동작 인식 방법)

  • Kim, Jong-Ho;Yun, Yo-Seop;Kim, Tae-Young;Lim, Cheol-Su
    • Journal of Korea Multimedia Society
    • /
    • v.12 no.4
    • /
    • pp.521-529
    • /
    • 2009
  • In this paper, we present a vision-based method for recognizing primitive human motions. It models reference motion patterns, recognizes a user's motion, and measures the similarity between the reference motion and the user's motion. To recognize a motion, we provide a pattern modeling method based on Hidden Markov Models. In addition, we provide a method for measuring the similarity between the reference motion and the user's motion using the edit distance algorithm. Experimental results show that the recognition rate of our method is above 93%. Our method can be used in motion recognition games, motion-based posture recognition, and rehabilitation training systems.
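The similarity measurement described in this abstract can be sketched with the standard Levenshtein edit distance. The abstract does not say how motions are encoded or how the distance is turned into a score, so the sketch assumes each motion is a sequence of discrete symbols (for example, quantized pose labels) and normalizes by the longer sequence length; the `similarity` function is a hypothetical helper.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]                          # deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def similarity(reference, observed):
    """Map the edit distance to [0, 1]; 1.0 means identical sequences (assumed normalization)."""
    longest = max(len(reference), len(observed))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(reference, observed) / longest
```

For instance, a reference sequence and a user sequence that differ in one symbol out of four would score 0.75 under this normalization.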

An Approach for Localization Around Indoor Corridors Based on Visual Attention Model (시각주의 모델을 적용한 실내 복도에서의 위치인식 기법)

  • Yoon, Kook-Yeol;Choi, Sun-Wook;Lee, Chong-Ho
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.17 no.2
    • /
    • pp.93-101
    • /
    • 2011
  • For a mobile robot, recognizing its current location is essential for autonomous navigation. In particular, loop closing detection, in which the robot recognizes a location it has visited before, is a core problem in localization. A considerable amount of research has been conducted on appearance-based loop closing detection and localization because vision sensors are inexpensive and support a variety of approaches to this problem. In scenes that consist of repeated structures, such as corridors, perceptual aliasing, in which two different locations are recognized as the same, occurs frequently. In this paper, we propose an improved method for recognizing locations in scenes that have similar structures. We extract salient regions from images using a visual attention model and calculate weights using distinctive features in the salient regions. This makes it possible to emphasize unique features in the scene and thereby distinguish similar-looking locations. In corridor recognition experiments, the proposed method showed improved recognition performance: 78.2% accuracy for single-floor corridor recognition and 71.5% for multi-floor corridor recognition.

Research on the Convergence of CCTV Video Information with Disaster Recognition and Real-time Crisis Response System (CCTV 영상 정보와 재난재해 인식 및 실시간 위기 대응 시스템의 융합에 관한 연구)

  • Kim, Ki-Bong;Geum, Gi-Moon;Jang, Chang-Bok
    • Journal of the Korea Convergence Society
    • /
    • v.8 no.3
    • /
    • pp.15-22
    • /
    • 2017
  • People generally believe that disaster forecast and warning systems and response systems are well established in this age of cutting-edge technology. In fact, reliable disaster response systems are not properly in place, as we witnessed in the Sewol ferry disaster in 2014. Existing forecast and warning systems are based on sensor information with low efficiency, and image information is handled only manually by monitoring staff. In addition, the interconnection between warning systems and response systems, which is needed to decide how to cope with a recognized disaster, is very insufficient. This paper introduces a CCTV-based disaster recognition and real-time crisis response system composed of a CCTV image recognition engine and a crisis response technique. This system makes it possible to overcome the limitations of existing sensor-based forecast and warning systems and to resolve the problems that arise when monitoring staff are absent during a crisis.

Learning Similarity between Hand-posture and Structure for View-invariant Hand-posture Recognition (관측 시점에 강인한 손 모양 인식을 위한 손 모양과 손 구조 사이의 학습 기반 유사도 결정 방법)

  • Jang Hyo-Young;Jung Jin-Woo;Bien Zeung-Nam
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.16 no.3
    • /
    • pp.271-274
    • /
    • 2006
  • This paper deals with a method for deciding the similarity between the shapes of hand-postures and their structures in order to improve the performance of a vision-based hand-posture recognition system. Hand-posture recognition by vision sensors is difficult because the human hand is an object with many degrees of freedom; hence, captured images present complex self-occlusion effects and, even for one hand-posture, various appearances depending on the viewing direction. Therefore, many approaches limit the relative angle between cameras and hands or use multiple cameras. The former approach, however, restricts the user's operation area. The latter requires additional consideration of how to merge the results from each camera image to obtain the final recognition result. To recognize hand-postures, we use both appearance and structural features and decide the similarity between the two types of features by learning.

Map-Building and Position Estimation based on Multi-Sensor Fusion for Mobile Robot Navigation in an Unknown Environment (이동로봇의 자율주행을 위한 다중센서융합기반의 지도작성 및 위치추정)

  • Jin, Tae-Seok;Lee, Min-Jung;Lee, Jang-Myung
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.13 no.5
    • /
    • pp.434-443
    • /
    • 2007
  • Presently, the exploration of an unknown environment is an important task for the new generation of mobile service robots, and mobile robots are navigated by a number of methods, using navigation systems such as the sonar-sensing system or the visual-sensing system. To fully utilize the strengths of both the sonar and visual sensing systems, this paper presents a technique for localizing a mobile robot using fused data from multiple ultrasonic sensors and a vision system. The mobile robot is designed to operate in a well-structured environment that can be represented by planes, edges, corners, and cylinders in terms of structural features. In the case of ultrasonic sensors, these features carry range information in the form of a circular arc, generally called an RCD (Region of Constant Depth). Localization is the continual provision of knowledge of position, which is deduced from its a priori position estimate. The environment of the robot is modeled as a two-dimensional grid map. We define a vision-based environment recognition method and a physically based sonar sensor model, and employ an extended Kalman filter to estimate the position of the robot. The performance and simplicity of the approach are demonstrated with the results of sets of experiments using a mobile robot.
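The role of the extended Kalman filter in this abstract can be illustrated with a minimal predict/update sketch. The abstract does not give the motion or sensor models, so the unicycle motion model, the single range measurement to a known landmark, and all function and parameter names below are assumptions chosen for illustration; they stand in for the sonar and vision models of the paper rather than reproduce them.

```python
import numpy as np

def ekf_predict(x, P, v, w, dt, Q):
    """Propagate pose [px, py, theta] with an assumed unicycle model and linearized covariance."""
    px, py, th = x
    x_pred = np.array([px + v * dt * np.cos(th),
                       py + v * dt * np.sin(th),
                       th + w * dt])
    # Jacobian of the motion model with respect to the state
    F = np.array([[1.0, 0.0, -v * dt * np.sin(th)],
                  [0.0, 1.0,  v * dt * np.cos(th)],
                  [0.0, 0.0,  1.0]])
    return x_pred, F @ P @ F.T + Q

def ekf_update_range(x, P, z, landmark, r_var):
    """Correct the pose with one range measurement z to a known landmark (assumed sensor model)."""
    dx, dy = landmark[0] - x[0], landmark[1] - x[1]
    rng = np.hypot(dx, dy)                      # predicted range
    H = np.array([[-dx / rng, -dy / rng, 0.0]])  # Jacobian of the range with respect to the state
    S = H @ P @ H.T + r_var                     # innovation variance, shape (1, 1)
    K = P @ H.T / S                             # Kalman gain, shape (3, 1)
    x_new = x + (K * (z - rng)).ravel()
    P_new = (np.eye(3) - K @ H) @ P
    return x_new, P_new
```

The sketch omits heading-angle wrapping and assumes the robot is never exactly at the landmark position, since the range Jacobian is undefined there.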

3D VISION SYSTEM FOR THE RECOGNITION OF FREE PARKING SITE LOCATION

  • Jung, H.G.;Kim, D.S.;Yoon, P.J.;Kim, J.H.
    • International Journal of Automotive Technology
    • /
    • v.7 no.3
    • /
    • pp.361-367
    • /
    • 2006
  • This paper describes a novel stereo-vision-based method for localizing a free parking site, which recognizes the target position for an automatic parking system. Pixel structure classification and feature-based stereo matching extract the 3D information of the parking site in real time. The pixel structure represents the intensity configuration around a pixel, and the feature-based stereo matching uses a step-by-step investigation strategy to reduce the computational load. This paper considers only parking sites divided by markings, which are generally drawn according to relevant standards. The parking site marking is separated using a plane surface constraint and is transformed into a bird's-eye view, on which template matching is performed to determine the location of the parking site. An obstacle depth map, which is generated from the disparity of adjacent vehicles, can be used to guide template matching by limiting the search range and orientation. The proposed method, which uses both the obstacle depth map and the bird's-eye view of the parking site marking, increases operation speed and robustness to visual noise by effectively limiting the search range.

Automatic detection system for surface defects of home appliances based on machine vision (머신비전 기반의 가전제품 표면결함 자동검출 시스템)

  • Lee, HyunJun;Jeong, HeeJa;Lee, JangGoon;Kim, NamHo
    • Smart Media Journal
    • /
    • v.11 no.9
    • /
    • pp.47-55
    • /
    • 2022
  • Quality control is an important factor in the smart factory manufacturing process. Currently, quality inspection of home appliance parts produced by the molding process is mostly performed by the operator with the naked eye, resulting in a high inspection error rate. To improve quality competitiveness, an automatic defect detection system was designed and implemented. The proposed system acquires an image by photographing an object with a high-performance scan camera at a specific location and identifies products that are defective due to scratches, dents, or foreign substances according to the vision inspection algorithm. In this study, the depth-based branch decision (DBD) algorithm was developed to increase the recognition rate of scratch defects, and accuracy was improved.