• Title/Summary/Keyword: video recognition


Recognition of dog's front face using deep learning and machine learning (딥러닝 및 기계학습 활용 반려견 얼굴 정면판별 방법)

  • Kim, Jong-Bok;Jang, Dong-Hwa;Yang, Kayoung;Kwon, Kyeong-Seok;Kim, Jung-Kon;Lee, Joon-Whoan
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.12 / pp.1-9 / 2020
  • As pet dogs rapidly increase in number, abandoned and lost dogs are also increasing. In Korea, animal registration has been in force since 2014, but the registration rate remains low owing to safety and effectiveness issues, and biometrics is attracting attention as an alternative. To increase the recognition rate in biometrics, it is necessary to collect biometric images in as consistent a form as possible, ideally from the front of the face. This paper proposes a method to determine whether a dog is facing forward in a real-time video. The proposed method detects the dog's eyes and nose using deep learning and extracts five types of face-orientation information from the relative sizes and positions of the detected features. A machine learning classifier then determines whether the dog is facing forward. We used 2,000 dog images for training, validation, and testing. YOLOv3 and YOLOv4 were used to detect the eyes and nose, and a multi-layer perceptron (MLP), random forest (RF), and support vector machine (SVM) were used as classifiers. When YOLOv4 and the RF classifier were used with all five types of the proposed face-orientation information, the recognition rate was best, at 95.25%, and real-time processing was shown to be possible.
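The geometric idea behind frontality cues of this kind can be sketched as follows; the landmark boxes, feature formulas, and thresholds here are illustrative assumptions, not the paper's actual five features or its trained RF classifier:

```python
def face_orientation_features(left_eye, right_eye, nose):
    """Each landmark is an (x, y, w, h) detection box; returns relative
    geometry features similar in spirit to the paper's orientation cues."""
    lx, ly, lw, lh = left_eye
    rx, ry, rw, rh = right_eye
    nx, ny, nw, nh = nose
    eye_dist = abs((rx + rw / 2) - (lx + lw / 2))   # inter-eye distance
    eye_size_ratio = (lw * lh) / (rw * rh)          # relative eye box areas
    # horizontal nose offset from the eye midpoint, normalized by eye distance
    nose_offset = ((nx + nw / 2) - ((lx + lw / 2) + (rx + rw / 2)) / 2) / max(eye_dist, 1e-6)
    return eye_dist, eye_size_ratio, nose_offset

def looks_frontal(features, size_tol=0.25, offset_tol=0.15):
    """Toy rule standing in for the trained MLP/RF/SVM classifier:
    frontal faces have similar eye sizes and a centered nose."""
    _, eye_size_ratio, nose_offset = features
    return abs(eye_size_ratio - 1.0) <= size_tol and abs(nose_offset) <= offset_tol
```

In the paper these features feed a learned classifier rather than fixed thresholds; the rule above only illustrates why relative size and position carry orientation information.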

Freeway Congestion Information Display Criteria Considering Drivers' Recognition (운전자 인지도를 고려한 연속류 혼잡도 표출기준)

  • Jo, Soon Gee;Kim, Hyoungsoo;Lee, Chungwon
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.5D / pp.611-617 / 2009
  • With advanced technologies applied to transportation, real-time traffic information has become necessary not only for drivers but also for agencies. Traffic conditions are normally represented at three levels of congestion: "free", "slow", and "jammed". These categories and criteria were set up for traffic management, even though the information is provided to drivers. This study examines how drivers perceive current congestion levels and derives the traffic categories and criteria they actually recognize. To collect data on drivers' recognition, a survey of freeway travelers was conducted in which respondents judged traffic-flow speed from video images of a freeway section. The survey showed that respondents preferred a 4-level traffic condition including "delayed" over the 3-level one, with criteria of 20 km/h, 50 km/h, and 75 km/h. These results are expected to contribute to building traffic information more appropriate for drivers and to provide an operational guideline for traffic monitoring centers.
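The survey-derived 4-level criteria map directly to a threshold function; the 20/50/75 km/h cutoffs are from the abstract, while the boundary handling (which side each cutoff falls on) is an assumption:

```python
def congestion_level(speed_kmh):
    """Map a measured speed to the driver-recognized traffic condition,
    using the 4-level criteria reported in the study (boundaries assumed)."""
    if speed_kmh < 20:
        return "jammed"
    elif speed_kmh < 50:
        return "delayed"
    elif speed_kmh < 75:
        return "slow"
    return "free"
```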

Recognizing the Direction of Action using Generalized 4D Features (일반화된 4차원 특징을 이용한 행동 방향 인식)

  • Kim, Sun-Jung;Kim, Soo-Wan;Choi, Jin-Young
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.5 / pp.518-528 / 2014
  • In this paper, we propose a method to recognize the action direction of a human by developing 4D space-time (4D-ST, [x,y,z,t]) features. For this, we propose 4D space-time interest points (4D-STIPs, [x,y,z,t]), which are extracted using 3D space (3D-S, [x,y,z]) volumes reconstructed from images of a finite number of different views. Since the proposed features are constructed from volumetric information, features for an arbitrary 2D space (2D-S, [x,y]) viewpoint can be generated by projecting the 3D-S volumes and 4D-STIPs onto the corresponding image planes in the training step. We can recognize the directions of actors in a test video because our training sets, which are projections of 3D-S volumes and 4D-STIPs onto various image planes, contain the direction information. The recognition process is divided into two steps: first we recognize the action class, and then we recognize the action direction using the direction information. For both steps, we use the projected 3D-S volumes and 4D-STIPs to construct motion history images (MHIs) and non-motion history images (NMHIs), which encode the moving and non-moving parts of an action, respectively. For action recognition, features are trained by support vector data description (SVDD) according to the action class and recognized by support vector domain density description (SVDDD). For direction recognition after the action is identified, each action is likewise trained using SVDD according to the direction class and recognized by SVDDD. In experiments, we train the models using 3D-S volumes from the INRIA Xmas Motion Acquisition Sequences (IXMAS) dataset and recognize action direction on a new SNU dataset built for evaluating action direction recognition.
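Projecting 3D-S points onto a 2D-S image plane for a chosen viewpoint can be sketched with a simple pinhole model; the rotation axis, camera offset, and focal length here are illustrative assumptions, not the paper's calibration:

```python
import math

def project_points(points_3d, yaw_deg, focal=1.0):
    """Rotate 3D-S points about the vertical (y) axis by the camera yaw,
    then apply a pinhole projection onto the 2D-S image plane.
    The depth offset of 5.0 (pushing points in front of the camera) is assumed."""
    th = math.radians(yaw_deg)
    c, s = math.cos(th), math.sin(th)
    projected = []
    for x, y, z in points_3d:
        xr = c * x + s * z            # rotation about the y axis
        zr = -s * x + c * z + 5.0     # assumed camera-to-volume distance
        projected.append((focal * xr / zr, focal * y / zr))
    return projected
```

Repeating this projection for many yaw angles is what lets a single reconstructed volume generate training views for arbitrary 2D viewpoints.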

Object Tracking Based on Exactly Reweighted Online Total-Error-Rate Minimization (정확히 재가중되는 온라인 전체 에러율 최소화 기반의 객체 추적)

  • JANG, Se-In;PARK, Choong-Shik
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.53-65 / 2019
  • Object tracking is one of the important steps in building video-based surveillance systems, and is considered an essential task alongside object detection and recognition. Various machine learning methods (e.g., least squares, the perceptron, and the support vector machine) can be applied in different tracking-system designs. Generative methods (e.g., principal component analysis) have commonly been used for their simplicity and effectiveness, but they focus only on modeling the target object. Owing to this limitation, discriminative methods (e.g., binary classification) were adopted to distinguish the target object from the background. Among machine learning methods for binary classification, total-error-rate minimization is one of the most successful: it can achieve a global minimum thanks to a quadratic approximation of the step function, whereas other methods (e.g., the support vector machine) seek local minima using nonlinear losses (e.g., the hinge loss). This quadratic approximation gives total-error-rate minimization appropriate properties for solving binary-classification optimization problems. However, the method was originally formulated in a batch-mode setting, which limits it to offline learning, and with limited computing resources offline learning cannot handle large-scale data sets. In contrast, online learning can update its solution without storing all training samples, which makes it essential for many applications as data sets grow.
Since object tracking must handle data samples in real time, online total-error-rate minimization is needed to address tracking problems efficiently. An online version of the method was previously developed, but it relies on an approximately reweighted technique. Although this online version achieved good performance in biometric applications, it assumes that total-error-rate minimization is reached only asymptotically, as the number of training samples goes to infinity. Under this approximation, learning errors continuously accumulate as training samples arrive, so the approximated online solution can drift toward a wrong solution, which causes significant errors when applied to surveillance systems. In this paper, we propose an exactly reweighted technique that recursively updates the solution of total-error-rate minimization in an online manner; unlike the approximately reweighted version, it achieves the exact minimization. The proposed exact online learning method is then applied to object tracking. Our tracking system adopts particle filtering, with an observation model that combines generative and discriminative methods to leverage the advantages of both. In experiments, the proposed system achieves promising performance on eight public video sequences compared with competing object-tracking systems, and paired t-tests are reported to assess the quality of the results.
The proposed online learning method can be extended to deep learning architectures covering both shallow and deep networks, and other online learning methods that need an exact reweighting process can reuse our reweighting technique. Beyond object tracking, the method can also be applied to object detection and recognition, so it can contribute to the online learning, object tracking, and detection and recognition communities.
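The flavor of an exactly reweighted recursive update can be illustrated with a least-squares analogue, where the Sherman-Morrison rank-1 identity keeps the online solution identical to the batch solution after every sample, with no approximation drift; this is a stand-in sketch, not the authors' total-error-rate formulation:

```python
import numpy as np

class OnlineExactLS:
    """Toy exact online learner: maintains the exact regularized least-squares
    solution after each new sample, analogous in spirit to the paper's
    exactly reweighted recursion (not the authors' code)."""
    def __init__(self, dim, reg=1e-3):
        self.P = np.eye(dim) / reg    # inverse of the regularized Gram matrix
        self.b = np.zeros(dim)        # accumulated x * y terms
        self.w = np.zeros(dim)        # current solution

    def update(self, x, y):
        x = np.asarray(x, dtype=float)
        Px = self.P @ x
        # Sherman-Morrison rank-1 update of the inverse: exact, not approximate
        self.P -= np.outer(Px, Px) / (1.0 + x @ Px)
        self.b += y * x
        self.w = self.P @ self.b      # exact batch solution, updated online
        return self.w
```

Because each update is exact, the online solution never accumulates approximation error as more samples arrive, which is the property the exact reweighting aims for.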

Implementation of a Context-Based Intelligent Image Surveillance System (컨텍스트 기반의 지능형 영상 감시 시스템 구현에 관한 연구)

  • Moon, Sung-Ryong;Shin, Seong
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.3 / pp.11-22 / 2010
  • This paper presents an implementation of an intelligent image surveillance system using context information, addressing the temporal-spatial constraints that make real-time processing difficult. We propose a scene-analysis algorithm that runs in real time in various environments on low-resolution video (320×240) at 30 frames per second. The proposed algorithm removes the background and meaningless frames from the continuous stream, and uses the wavelet transform and an edge histogram to detect shot boundaries. A representative key frame within each shot is then selected by a key-frame selection parameter, and edge histograms and mathematical morphology are used to detect only the motion region. For the motion region of each detected object, we define four basic contexts according to the angles of feature points, using the vertical-to-horizontal ratio: standing, lying, sitting, and walking. Finally, we perform scene analysis with a simple context model composed of a general context and an emergency context, estimated from the connection status between contexts, and configure a system to check whether real-time processing is possible. The proposed system shows a recognition rate of 92.5% on low-resolution video with an average processing speed of 0.74 seconds per frame, confirming that real-time processing is feasible.
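Shot-boundary detection from edge histograms can be sketched as follows; the histogram input (a list of edge orientations per frame), bin count, and threshold are illustrative assumptions rather than the paper's wavelet-based implementation:

```python
def edge_histogram(edge_angles, bins=8):
    """Hypothetical stand-in: edge_angles is a list of edge orientations in
    degrees for one frame; returns a normalized orientation histogram."""
    hist = [0] * bins
    for angle in edge_angles:
        hist[int(angle % 360) * bins // 360] += 1
    total = max(sum(hist), 1)
    return [h / total for h in hist]

def is_shot_boundary(prev_frame, next_frame, threshold=0.5):
    """Declare a boundary when the L1 distance between consecutive
    edge histograms exceeds a tuned threshold (value assumed)."""
    h1 = edge_histogram(prev_frame)
    h2 = edge_histogram(next_frame)
    return sum(abs(a - b) for a, b in zip(h1, h2)) > threshold
```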

Social Network Analysis of TV Drama via Location Knowledge-learned Deep Hypernetworks (장소 정보를 학습한 딥하이퍼넷 기반 TV드라마 소셜 네트워크 분석)

  • Nan, Chang-Jun;Kim, Kyung-Min;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.22 no.11 / pp.619-624 / 2016
  • A social-aware video conveys not only the relationships between characters but also diverse information on topics such as economics, politics, and culture as a story unfolds. In particular, the speaking habits and behavioral patterns of people in different situations are very important for the analysis of social relationships. However, it is difficult for a computer to analyze this dynamic multi-modal drama data effectively. To solve this problem, previous studies employed the deep concept hierarchy (DCH) model to automatically construct and analyze social networks in a TV drama. Nevertheless, since location knowledge was not included, they could only analyze the social network of a story as a whole. In this research, we include location knowledge and analyze the social relations at different locations. We use approximately 4,400 minutes of the TV drama Friends as our dataset. We perform face recognition on the characters using a convolutional-recursive neural network model and use a bag-of-features model to classify scenes. Then, for each kind of scene, we establish the social network between the characters using the deep concept hierarchy model and analyze how the social network changes as the story unfolds.
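Building a per-location social network from scene-level character co-occurrence can be sketched as a counting step; this simplification stands in for the deep concept hierarchy model, and the scene representation here is assumed:

```python
from collections import Counter
from itertools import combinations

def build_social_network(scenes):
    """scenes: list of (location, set_of_characters) pairs.
    Returns, per location, edge weights counting how often two characters
    co-appear; a simplified stand-in for the DCH-based construction."""
    networks = {}
    for location, chars in scenes:
        edges = networks.setdefault(location, Counter())
        for a, b in combinations(sorted(chars), 2):
            edges[(a, b)] += 1    # one co-appearance strengthens the edge
    return networks
```

Separating the counts by location is what lets the analysis compare how the same characters relate differently in different places.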

A Study on Changes in Representative Emotion Size through a Video-Image Color Emotion Tree (영상 이미지 색채 감성트리를 통한 대표감성크기 변화 연구)

  • Lee, Yean-Ran;Park, Hyo-Jin
    • The Journal of the Korea Contents Association / v.15 no.11 / pp.10-17 / 2015
  • This study of emotional computing examines how representative emotion values change continuously. In the emotional computing approach, sensibilities are assigned numerical values and emotions are processed through an emotion tree. For emotional assessment, the Core Affect coordinates of James A. Russell were used, and the purpose of running the emotion tree was to verify the correlation between the tree and emotional computing. The attributes of the emotion tree were configured as color, brightness, and saturation. With a 50% increase in brightness, pleasure (X-axis) increased by 10.49 points. With 50% brightness and 50% GREEN, pleasure (X-axis) increased by 10.49 points and arousal (Y-axis) by 15.85 points. With 50% brightness and 50% BLUE, pleasure (X-axis) changed by 10.49 points and arousal (Y-axis) by 14.65 points. For changes in representative emotion size, with 50% brightness and 50% RED, the emotion "excitement" increased by 5.4% and "depression" decreased by 4.2%; with 50% brightness and 50% GREEN, "excitement" increased by 8.6% and "depression" decreased by 5.5%. The increases and decreases of representative emotions under changes in emotional attributes were analyzed quantitatively. Further study through execution of the video-image color emotion tree in emotional computing is needed to approximate human emotion more closely.

A Study on the Improvement of Teaching Competence of Pre-service Science Teachers based on the Teaching Evaluation and Reflective Journal Writings on Science Class (수업 평가와 반성 저널쓰기를 통한 예비 과학교사들의 수업 수행 능력 개선에 대한 연구)

  • Kim, Hyun-Jung;Hong, Hun-Gi;Jeon, Hwa-Young
    • Journal of The Korean Association For Science Education / v.30 no.6 / pp.836-849 / 2010
  • The purpose of this study is to analyze changes in the teaching competence of pre-service science teachers through teaching evaluation and reflective journal writing during the student-teaching period at a high school. To do this, we videotaped all science classes of the six pre-service teachers participating in this study, evaluated their teaching, and collected the recorded video clips, reflective journals, interviews, instructional materials, and teaching evaluations they provided. From the "Standards for teaching evaluation of science instruction" developed by Korea Education Curriculum and Assessment, sixteen evaluation elements were selected and used for the analysis. According to our results, all pre-service teachers showed improvement in most of the evaluation elements as the number of science classes increased, with the lowest improvement in "designing a meaningful learning program," one of the sixteen elements. However, there were substantial individual differences in the teachers' competence on each evaluation element. Although at the beginning of student teaching they considered "understanding of scientific concepts" the most important part of a science class, by the end they recognized that "interaction and respect" and "managing student behaviors" are also important. They found that writing a reflective journal based on the recorded video clips and teaching evaluation helped improve their teaching competence, and this improvement in turn influenced their career orientation toward becoming a school teacher.

Active Water-Level and Distance Measurement Algorithm using Light Beam Pattern (광패턴을 이용한 능동형 수위 및 거리 측정 기법)

  • Kim, Nac-Woo;Son, Seung-Chul;Lee, Mun-Seob;Min, Gi-Hyeon;Lee, Byung-Tak
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.4 / pp.156-163 / 2015
  • In this paper, we propose an active water-level and distance measurement algorithm using a light-beam pattern. In place of conventional water-level gauges of the pressure, float-well, ultrasonic, radar, and other types, research on video-analysis-based water-level measurement has been increasing as the importance of accurate measurement and convenient monitoring is emphasized. By actively projecting a reference light-beam pattern onto a bridge or embankment, we suggest a new approach that analyzes the projected pattern image obtained from a camera and automatically measures the water level and the distance between the camera and the bridge or levee. In contrast to conventional methods, which passively analyze captured video to recognize a watermark or specific marker attached to a bridge, we actively use a reference light-beam pattern suited to the installed bridge environment, which makes the water-level measurement robust. The reasons are as follows: first, the algorithm is effective against poor visibility and against pollution or damage of the watermark; second, it enables real-time, portable monitoring of the local situation by day and night; furthermore, it needs no additional floodlighting. Tests were simulated under indoor conditions with distance measurements over 0.4-1.4 m and height measurements over 13.5-32.5 cm.
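The triangulation geometry behind measuring distance from a projected pattern can be sketched with similar triangles; the baseline, focal length, and mounting geometry below are illustrative assumptions, not the paper's setup:

```python
def distance_from_pattern(pixel_shift, baseline_m, focal_px):
    """Structured-light style triangulation sketch (parameters assumed):
    the projected pattern shifts in the camera image in inverse proportion
    to the distance of the surface it lands on."""
    if pixel_shift <= 0:
        raise ValueError("pattern not detected")
    return baseline_m * focal_px / pixel_shift

def water_level(distance_to_surface_m, sensor_height_m):
    """Water level = sensor mounting height minus the measured distance
    down to the water surface (geometry assumed for illustration)."""
    return sensor_height_m - distance_to_surface_m
```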

Comparative Analysis of CNN Deep Learning Model Performance Based on Quantization Application for High-Speed Marine Object Classification (고속 해상 객체 분류를 위한 양자화 적용 기반 CNN 딥러닝 모델 성능 비교 분석)

  • Lee, Seong-Ju;Lee, Hyo-Chan;Song, Hyun-Hak;Jeon, Ho-Seok;Im, Tae-ho
    • Journal of Internet Computing and Services / v.22 no.2 / pp.59-68 / 2021
  • As the rapidly growing field of artificial intelligence (AI) began to be applied to marine environments such as ships, research on CNN-based models specialized for digital video has become active. In the E-Navigation service, which combines various technologies to detect floating objects posing a collision risk, reduce human error, and prevent fires inside ships, real-time processing is of critical importance. Adding more functions, however, demands high-performance processors, which raises prices and places a cost burden on shipowners. This study therefore proposes a method of processing information at a high rate while maintaining accuracy by applying quantization techniques to a deep learning model. First, videos were pre-processed for the detection of floating matter at sea to ensure efficient transmission of video data to the deep learning input. Second, quantization, one of the lightweight techniques for deep learning models, was applied to reduce memory usage and increase processing speed. Finally, the proposed model, with video pre-processing and quantization applied, was run on various embedded boards to measure its accuracy and processing speed and test its performance. The proposed method reduced memory usage by a factor of four and improved processing speed about four to five times while maintaining the original recognition accuracy.
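A minimal sketch of symmetric int8 post-training quantization, one common scheme for shrinking model memory; the abstract does not specify the paper's exact quantization settings, so the single-scale symmetric mapping here is an assumption:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric post-training quantization: map float32 weights to int8
    with a single scale factor (one of several possible schemes)."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for accuracy comparison."""
    return q.astype(np.float32) * scale
```

Storing int8 instead of float32 is what yields the roughly fourfold memory reduction reported, and integer arithmetic on embedded boards drives the speedup.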