• Title/Summary/Keyword: Facial Detection


Rotation Invariant Face Detection Using HOG and Polar Coordinate Transform

  • Jang, Kyung-Shik
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.11
    • /
    • pp.85-92
    • /
    • 2021
  • In this paper, a method for effectively detecting rotated faces and their rotation angles, regardless of the rotation angle, is proposed. Rotated face detection is a challenging task due to the large variation in facial appearance. Under the proposed polar coordinate transformation, the spatial arrangement of the facial components is maintained regardless of the rotation angle, so rotation introduces no variation in facial appearance. Accordingly, features such as HOG, which are normally used for detecting upright frontal faces and are sensitive to rotation, can be used effectively for detecting rotated faces. Only frontal, unrotated training data is needed. HOG features extracted from the polar-transformed images are learned with an SVM, and rotated faces are detected. Experiments on 3600 rotated face images show a rotation angle detection rate of 97.94%. Furthermore, the positions and rotation angles of rotated faces are accurately detected in images with backgrounds containing multiple rotated faces.
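
As a rough illustration of the pipeline this abstract describes, the sketch below polar-transforms a face crop (so in-plane rotation becomes a shift along one axis), extracts HOG features, and trains a linear SVM. OpenCV, scikit-image, and scikit-learn are assumed stand-ins; the window size, HOG parameters, and the `polar_hog`/`train` helpers are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: polar transform + HOG + linear SVM, assuming frontal-face training crops.
import cv2
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def polar_hog(face_gray, out_size=64):
    """Map a square grayscale face crop to polar coordinates, then compute HOG."""
    h, w = face_gray.shape
    center = (w / 2.0, h / 2.0)
    max_radius = min(center)
    polar = cv2.warpPolar(face_gray, (out_size, out_size), center, max_radius,
                          cv2.WARP_POLAR_LINEAR)
    return hog(polar, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

def train(face_crops, nonface_crops):
    """Train on upright frontal faces only, as the abstract states."""
    X = np.array([polar_hog(img) for img in face_crops + nonface_crops])
    y = np.array([1] * len(face_crops) + [0] * len(nonface_crops))
    return LinearSVC(C=1.0).fit(X, y)
```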

Detection of video editing points using facial keypoints (얼굴 특징점을 활용한 영상 편집점 탐지)

  • Joshep Na;Jinho Kim;Jonghyuk Park
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.4
    • /
    • pp.15-30
    • /
    • 2023
  • Recently, various services using artificial intelligence (AI) are emerging in the media field as well. However, most video editing, which involves finding an editing point and splicing the video, is still carried out manually, requiring a lot of time and human resources. Therefore, this study proposes a methodology that can detect the editing points of a video according to whether the person in the video is speaking, using a Video Swin Transformer. The proposed structure first detects facial keypoints through face alignment; through this process, the temporal and spatial changes of the face are reflected from the input video data. Then, the behavior of the person in the video is classified by the Video Swin Transformer-based model proposed in this study. Specifically, the feature map generated by the Video Swin Transformer from the video data is combined with the facial keypoints detected through face alignment, and utterance is classified through convolution layers. In conclusion, the performance of the video editing point detection model using facial keypoints proposed in this paper improved from 87.46% to 89.17% compared to the model without facial keypoints.
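
A minimal PyTorch sketch of the fusion step this abstract describes: per-frame backbone features (the Video Swin Transformer in the paper, treated here as a precomputed tensor) are concatenated with facial keypoints and classified through convolution layers. All shapes, layer widths, and the `UtteranceHead` module itself are assumptions.

```python
import torch
import torch.nn as nn

class UtteranceHead(nn.Module):
    """Fuses per-frame video features with facial keypoints to classify speaking vs. not speaking."""
    def __init__(self, feat_channels=768, num_keypoints=68):
        super().__init__()
        # Project flattened (x, y) keypoints per frame to the backbone channel width.
        self.kp_proj = nn.Linear(num_keypoints * 2, feat_channels)
        self.fuse = nn.Sequential(
            nn.Conv1d(feat_channels * 2, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv1d(256, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, 2)  # speaking / not speaking

    def forward(self, video_feats, keypoints):
        # video_feats: (B, T, C) pooled backbone features per frame
        # keypoints:   (B, T, num_keypoints, 2) from a face-alignment model
        b, t, k, _ = keypoints.shape
        kp = self.kp_proj(keypoints.reshape(b, t, k * 2))            # (B, T, C)
        x = torch.cat([video_feats, kp], dim=-1).transpose(1, 2)     # (B, 2C, T)
        x = self.fuse(x).squeeze(-1)                                 # (B, 64)
        return self.classifier(x)
```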

Real-time Slant Face detection using improvement AdaBoost algorithm (개선한 아다부스트 알고리즘을 이용한 기울어진 얼굴 실시간 검출)

  • Na, Jong-Won
    • Journal of Advanced Navigation Technology
    • /
    • v.12 no.3
    • /
    • pp.280-285
    • /
    • 2008
  • Traditional face detection methods use the difference-picture method to detect movement. However, most of them do not consider this approach for real-time use, or a real-time implementation of the algorithm is complicated and not easy. In this paper, to detect faces in real time, the RGB video input is first converted to YCbCr. Next, the difference image between two adjacent video frames is computed and Glassfire labeling is performed. Regions whose label values exceed a threshold are recognized as motion areas and extracted from the video. Face detection is then performed on the extracted motion regions, and the AdaBoost algorithm is used to detect and extract the required facial features.
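
A hedged OpenCV sketch of the stages listed above: luminance difference of adjacent frames, connected-component labeling (standing in for Glassfire labeling), filtering by region size, and an AdaBoost-trained Haar cascade run on the extracted motion regions. The thresholds and the cascade file are assumptions, not the author's settings.

```python
import cv2

cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces_in_motion(prev_bgr, curr_bgr, diff_thresh=25, min_area=500):
    """Detect faces only inside regions that moved between two adjacent frames."""
    prev_y = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2YCrCb)[:, :, 0]
    curr_y = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2YCrCb)[:, :, 0]
    diff = cv2.absdiff(curr_y, prev_y)
    _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
    n_labels, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    faces = []
    for i in range(1, n_labels):                       # label 0 is the background
        x, y, w, h, area = stats[i]
        if area < min_area:
            continue
        roi = cv2.cvtColor(curr_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
        for (fx, fy, fw, fh) in cascade.detectMultiScale(roi, 1.1, 4):
            faces.append((x + fx, y + fy, fw, fh))
    return faces
```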


Face Detection using Brightness Distribution in the Surrounding Area of Eye (눈 주변영역의 명암분포를 이용한 얼굴탐지)

  • Hwang, Dae-Dong;Park, Joo-Chul;Kim, Gye-Young
    • The KIPS Transactions: Part B
    • /
    • v.16B no.6
    • /
    • pp.443-450
    • /
    • 2009
  • This paper develops a novel technique for face detection using the brightness distribution in the area surrounding the eyes. The proposed face detection consists of facial component candidate extraction, facial component candidate filtering through eye-lip combination, left/right eye classification using brightness distribution, and face verification by confirming edges in the nose region. Because the proposed technique does not use any skin color information, it can detect multiple faces in color images with complicated backgrounds and different illumination levels. The experimental results reveal that the proposed technique outperforms traditional techniques in terms of detection ratio.
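
The abstract does not specify how the brightness distribution around a candidate is compared, so the snippet below is only an illustrative guess at the left/right eye classification step: it contrasts mean brightness on either side of an eye candidate. The heuristic direction and the `classify_eye_side` helper are assumptions, not the authors' method.

```python
def classify_eye_side(gray, box, margin_ratio=0.5):
    """Classify an eye candidate box (x, y, w, h) in a grayscale image as 'left' or 'right'
    by comparing mean brightness of its surrounding areas (hypothetical heuristic)."""
    x, y, w, h = box
    m = max(1, int(w * margin_ratio))
    outer_left = gray[y:y + h, max(0, x - m):x]
    outer_right = gray[y:y + h, x + w:min(gray.shape[1], x + w + m)]
    if outer_left.size == 0 or outer_right.size == 0:
        return None  # candidate touches the image border; cannot compare both sides
    # Assumption: the darker neighboring region (temple/hair) hints at which eye this is.
    return "left" if outer_left.mean() < outer_right.mean() else "right"
```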

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.89-106
    • /
    • 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly. With this growth in video data, the requirements for analysis and utilization are also increasing. Due to the lack of skilled manpower to analyze videos in many industries, machine learning and artificial intelligence are actively used to assist human analysts. In this situation, the demand for various computer vision technologies such as object detection and tracking, action detection, emotion detection, and Re-ID has also increased rapidly. However, object detection and tracking technology faces many difficulties that degrade performance, such as re-appearance after an object leaves the recording location, and occlusion. Accordingly, action and emotion detection models built on object detection and tracking models also have difficulty extracting data for each object. In addition, deep learning architectures composed of multiple models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed model uses single-linkage hierarchical clustering based Re-ID and processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, provides near real-time processing performance, and prevents tracking failures due to object departure and re-appearance, occlusion, etc. By continuously linking the action and facial emotion detection results of each object to the same object, videos can be analyzed efficiently. The re-identification model extracts a feature vector from the bounding box of the object image detected by the object tracking model for each frame, and applies single-linkage hierarchical clustering over the feature vectors extracted from past frames to identify the same object when tracking has failed. Through this process, the same object can be re-tracked in cases of re-appearance after leaving the scene or occlusion. As a result, the action and facial emotion detection results of an object newly recognized after a tracking failure can be linked to those of the object that appeared in the past. As a way to improve processing performance, we introduce a per-object Bounding Box Queue and a Feature Queue method that reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which allows facial emotions recognized through AWS Rekognition to be linked with object tracking information. The academic significance of this study is that, through the proposed processing techniques, the two-stage re-identification model can achieve real-time performance even in the high-cost setting of performing action and facial emotion detection, without sacrificing accuracy by resorting to simple metrics. The practical implication is that various industrial fields that require action and facial emotion detection, but struggle with object tracking failures, can analyze videos effectively through the proposed model.
The proposed model, with its high re-tracking accuracy and processing performance, can be used in various fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value. In the future, in order to measure object tracking performance more precisely, experiments should be conducted using the MOT Challenge dataset, which is used by many international conferences. We will investigate the problems that the IoF algorithm cannot solve in order to develop a complementary algorithm. In addition, we plan to conduct additional research to apply this model to datasets from various fields related to intelligent video analysis.
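
A small sketch of the single-linkage re-identification idea, assuming appearance embeddings (e.g., from a Torchreid model) have already been extracted per detection: SciPy's hierarchical clustering groups the embeddings so that a new track ID created after a tracking failure can be merged with the earlier identity. The cosine metric, distance threshold, and `merge_track_ids` helper are assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def merge_track_ids(features, track_ids, distance_threshold=0.3):
    """features: (N, D) appearance embeddings, one per detection.
    track_ids: list of N tracker-assigned IDs. Returns {track_id: identity_cluster}."""
    z = linkage(features, method="single", metric="cosine")
    clusters = fcluster(z, t=distance_threshold, criterion="distance")
    identity = {}
    for tid, cid in zip(track_ids, clusters):
        # Two different track IDs belonging to the same person fall into the same
        # cluster, so their detections can be linked back to one identity.
        identity.setdefault(tid, int(cid))
    return identity
```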

Enhancing the performance of the facial keypoint detection model by improving the quality of low-resolution facial images (저화질 안면 이미지의 화질 개선를 통한 안면 특징점 검출 모델의 성능 향상)

  • KyoungOok Lee;Yejin Lee;Jonghyuk Park
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.2
    • /
    • pp.171-187
    • /
    • 2023
  • When a person's face is captured by a recording device such as a low-pixel surveillance camera, it is difficult to recognize the face due to the low image quality. In situations where a person's face is difficult to recognize, problems such as failing to identify a criminal suspect or a missing person may occur. Existing studies on face recognition used refined datasets, so performance could not be measured in diverse environments. Therefore, to address poor face recognition performance on low-quality images, this paper proposes a method that first improves the quality of low-quality facial images from various environments to generate high-quality images, and then improves the performance of facial keypoint detection. To confirm the practical applicability of the proposed architecture, an experiment was conducted on a dataset in which people appear relatively small within the overall image. In addition, by choosing a facial image dataset that includes mask-wearing situations, the possibility of extending the method to real-world problems was explored. Measuring the performance of the keypoint detection model after improving the quality of the face images confirmed that face detection improved by an average of 3.47 times for images without a mask and 9.92 times for images with a mask. The RMSE of the facial keypoints decreased by an average of 8.49 times with a mask and 2.02 times without a mask. Therefore, the applicability of the proposed method was verified by the increased recognition rate for low-quality facial images after image quality improvement.
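
A minimal sketch of the evaluation idea in this abstract: detect keypoints on a low-quality face image and on its quality-improved version, then compare keypoint RMSE against ground truth. The `enhance` and `detect_keypoints` callables stand in for the super-resolution and landmark models used in the paper and are assumptions.

```python
import numpy as np

def keypoint_rmse(pred, gt):
    """Root-mean-square error between predicted and ground-truth (N, 2) landmark arrays."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    return float(np.sqrt(np.mean(np.sum((pred - gt) ** 2, axis=1))))

def compare_before_after(low_res_img, gt_landmarks, enhance, detect_keypoints):
    """Return (RMSE on the low-quality image, RMSE after quality improvement)."""
    rmse_low = keypoint_rmse(detect_keypoints(low_res_img), gt_landmarks)
    rmse_enh = keypoint_rmse(detect_keypoints(enhance(low_res_img)), gt_landmarks)
    return rmse_low, rmse_enh
```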

Quantified Lockscreen: Integration of Personalized Facial Expression Detection and Mobile Lockscreen application for Emotion Mining and Quantified Self (Quantified Lockscreen: 감정 마이닝과 자기정량화를 위한 개인화된 표정인식 및 모바일 잠금화면 통합 어플리케이션)

  • Kim, Sung Sil;Park, Junsoo;Woo, Woontack
    • Journal of KIISE
    • /
    • v.42 no.11
    • /
    • pp.1459-1466
    • /
    • 2015
  • Lockscreen is one of the most frequently encountered interfaces by smartphone users. Although users perform unlocking actions every day, there are no benefits in using lockscreens apart from security and authentication purposes. In this paper, we replace the traditional lockscreen with an application that analyzes facial expressions in order to collect facial expression data and provide real-time feedback to users. To evaluate this concept, we have implemented Quantified Lockscreen application, supporting the following contributions of this paper: 1) an unobtrusive interface for collecting facial expression data and evaluating emotional patterns, 2) an improvement in accuracy of facial expression detection through a personalized machine learning process, and 3) an enhancement of the validity of emotion data through bidirectional, multi-channel and multi-input methodology.

Analysis of Understanding Using Deep Learning Facial Expression Recognition for Real Time Online Lectures (딥러닝 표정 인식을 활용한 실시간 온라인 강의 이해도 분석)

  • Lee, Jaayeon;Jeong, Sohyun;Shin, You Won;Lee, Eunhye;Ha, Yubin;Choi, Jang-Hwan
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.12
    • /
    • pp.1464-1475
    • /
    • 2020
  • Due to the spread of COVID-19, online lectures have become more prevalent. However, it was found that many students and professors are experiencing a lack of communication. This study is therefore designed to improve interactive communication between professors and students in real-time online lectures. To do so, we explore deep learning approaches for automatic recognition of students' facial expressions and classification of their understanding into three classes (Understand / Neutral / Not Understand). We use the 'BlazeFace' model for face detection and a 'ResNet-GRU' model for facial expression recognition (FER). We name this entire process the Degree of Understanding (DoU) algorithm. The DoU algorithm can analyze a multitude of students collectively and present the result as visualized statistics. To our knowledge, this study is significant in that it is the first to offer statistics on lecture understanding using FER. As a result, the algorithm achieved a rapid speed of 0.098 sec/frame with a high accuracy of 94.3% in a CPU environment, demonstrating the potential to be applied to real-time online lectures. The DoU algorithm can be extended to various fields where facial expressions play important roles in communication, such as interaction with hearing-impaired people.
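
A rough sketch of the DoU pipeline, assuming MediaPipe's BlazeFace-based face detector and a stand-in `fer_model` in place of the paper's ResNet-GRU classifier; the three class labels follow the abstract, everything else is an assumption.

```python
import cv2
import mediapipe as mp

LABELS = ["Understand", "Neutral", "Not Understand"]
detector = mp.solutions.face_detection.FaceDetection(min_detection_confidence=0.5)

def degree_of_understanding(frame_bgr, fer_model):
    """Detect the first face in a frame and classify its expression into a DoU class."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    result = detector.process(rgb)
    if not result.detections:
        return None
    h, w = frame_bgr.shape[:2]
    box = result.detections[0].location_data.relative_bounding_box
    x, y = max(0, int(box.xmin * w)), max(0, int(box.ymin * h))
    crop = frame_bgr[y:y + int(box.height * h), x:x + int(box.width * w)]
    return LABELS[fer_model(crop)]  # fer_model is assumed to return a class index 0..2
```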

Vestibular Schwannoma Atypically Invading Temporal Bone

  • Park, Soo Jeong;Yang, Na-Rae;Seo, Eui Kyo
    • Journal of Korean Neurosurgical Society
    • /
    • v.57 no.4
    • /
    • pp.292-294
    • /
    • 2015
  • Vestibular schwannoma (VS) usually presents with widening of the internal auditory canal (IAC), and these bony changes are typically limited to the IAC, not extending to the temporal bone. Temporal bone invasion by VS is extremely rare. We report a 51-year-old man in whom a unilateral VS caused temporal bone destruction beyond the IAC. The bony destruction extended anteriorly to the carotid canal and inferiorly to the jugular foramen. On histopathologic examination, the tumor showed typical benign schwannoma and did not show any unusual vascularity or malignant features. The facial nerve was severely compressed and distorted by the tumor, which unevenly eroded the temporal bone in the surgical field. Vestibular schwannoma with atypical invasion of the temporal bone can be successfully treated with a combined translabyrinthine and lateral suboccipital approach without facial nerve dysfunction. Early detection and careful dissection of the facial nerve with intraoperative monitoring should be considered during the operation, due to severe adhesion and distortion of the facial nerve by the tumor and the eroded temporal bone.

Invariant Range Image Multi-Pose Face Recognition Using Fuzzy c-Means

  • Phokharatkul, Pisit;Pansang, Seri
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings
    • /
    • 2005.06a
    • /
    • pp.1244-1248
    • /
    • 2005
  • In this paper, we propose fuzzy c-means (FCM) to solve recognition errors in invariant range-image, multi-pose face recognition. Scale, center, and pose error problems were solved using geometric transformations. Face data was digitized into range image data using a laser range finder, which does not depend on the ambient light source. The digitized range image face data is then used as a model to generate multi-pose data. Each pose's data size was reduced by linear reduction for storage in the database. The reduced range image face data was transformed into a gradient face model for facial feature image extraction and for matching using fuzzy membership adjusted by fuzzy c-means. The proposed method was tested using facial range images from 40 people with normal facial expressions. The detection and recognition system achieved an accuracy of about 93 percent. At the same time, the system is robust enough to overcome typical image-acquisition problems such as noise, vertically rotated faces, and limited range resolution.
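
A small NumPy sketch of the standard fuzzy c-means membership formula that underlies the matching step described above; the fuzzifier m, feature representation, and cluster centers are assumptions rather than the authors' full recognition system.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0, eps=1e-9):
    """X: (N, D) feature vectors, centers: (C, D). Returns (N, C) membership degrees."""
    # Pairwise distances between each sample and each cluster center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    # Standard FCM membership: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)
```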
