• Title/Summary/Keyword: vision-based recognition


The Audience Behavior-based Emotion Prediction Model for Personalized Service (고객 맞춤형 서비스를 위한 관객 행동 기반 감정예측모형)

  • Ryoo, Eun Chung;Ahn, Hyunchul;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.73-85 / 2013
  • In today's information society, the importance of knowledge services that use information to create value grows day by day. Advances in IT have made information easy to collect and use, and many companies across a variety of industries actively apply customer information to their marketing. Since the start of the 21st century, companies have also actively used culture and the arts to manage their corporate image and to support marketing closely tied to their commercial interests. Because it is difficult to attract or maintain consumers' interest through technology alone, cultural activities have become a common tool of differentiation among firms, and many firms have turned the customer's experience into a new marketing strategy in order to respond effectively to a competitive market. Accordingly, there is a rapidly emerging need for personalized services that offer people new experiences based on personal profile information capturing the characteristics of the individual. Personalized service that uses individual profile information such as language, symbols, behavior, and emotions is very important today: it makes it possible to judge the interaction between people and content and to maximize customers' experience and satisfaction. Various related studies provide customer-centered services, and emotion recognition research in particular has emerged recently. Existing studies have recognized emotion mostly from bio-signals, and most concern the voice and face, which show large emotional changes; however, limitations of equipment and service environments make it difficult to predict people's emotions this way. In this paper, we therefore develop an emotion prediction model based on a vision-based interface to overcome these limitations. Emotion recognition based on people's gestures and postures has been studied by several researchers; this paper develops a model that recognizes people's emotional states from body gesture and posture using the difference-image method, and identifies the best-validated model for predicting four emotions. The proposed model aims to automatically determine and predict four human emotions (sadness, surprise, joy, and disgust). To build the model, an event booth was installed in the lobby of KOCCA, where suitable stimulus videos were shown so that participants' body gestures and postures could be collected as their emotions changed. Body movements were then extracted using the difference-image method, and the data were refined to build the proposed model with a neural network. Three time-frame settings were tested (20 frames, 30 frames, and 40 frames), and the model with the best performance was adopted. Before the three models were built, the entire set of 97 samples was divided into learning, test, and validation sets. The prediction model was constructed as an artificial neural network trained by the back-propagation algorithm, with the learning rate set to 10% and the momentum rate to 10%; the sigmoid function was used as the transfer function, and the network is a three-layer perceptron with one hidden layer and four output nodes (a code sketch follows this abstract). Based on the test set, training was stopped at 50,000 iterations after the minimum error had been reached, in order to locate the appropriate stopping point. We then measured each model's accuracy and selected the best model for predicting each emotion. The results showed prediction accuracies of 100% for sadness and 96% for joy with the 20-frame model, and 88% for surprise and 98% for disgust with the 30-frame model. These findings are expected to provide an effective algorithm for personalized services in industries such as advertising, exhibitions, and performances.
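
The network described in the abstract is small enough to sketch directly: a three-layer perceptron with one hidden layer and four sigmoid output nodes, trained by back-propagation with learning rate 0.1 and momentum 0.1, as stated. The input and hidden dimensions below are assumptions, since the abstract does not give them.

```python
# Minimal sketch of the abstract's classifier; N_FEATURES and N_HIDDEN are
# assumed values, everything else follows the stated specification.
import torch
import torch.nn as nn

N_FEATURES = 40    # assumed: movement features from the difference images
N_HIDDEN = 20      # assumed hidden-layer width
N_EMOTIONS = 4     # Sadness, Surprise, Joy, Disgust

model = nn.Sequential(
    nn.Linear(N_FEATURES, N_HIDDEN),
    nn.Sigmoid(),                      # sigmoid transfer function, as stated
    nn.Linear(N_HIDDEN, N_EMOTIONS),
    nn.Sigmoid(),                      # one output node per emotion
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.1)
loss_fn = nn.MSELoss()

def train_step(x, target):
    """One back-propagation update on a batch of movement-feature vectors."""
    optimizer.zero_grad()
    loss = loss_fn(model(x), target)
    loss.backward()
    optimizer.step()
    return loss.item()
```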

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the requirements for analyzing and using that data. Because many industries lack the skilled manpower to analyze video, machine learning and artificial intelligence are actively used to assist. In this situation, demand has also grown rapidly for computer vision technologies such as object detection and tracking, action detection, emotion detection, and re-identification (Re-ID). However, object detection and tracking suffers from many difficulties that degrade performance, such as an object re-appearing after leaving the recording location, and occlusion. Accordingly, action and emotion detection models built on top of object detection and tracking also have difficulty extracting data for each object. In addition, deep learning architectures composed of several models suffer performance degradation from bottlenecks and a lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and the AWS Rekognition emotion recognition service. The proposed system uses single-linkage hierarchical clustering for Re-ID together with processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, delivers near-real-time processing, and prevents tracking failures caused by object departure and re-emergence, occlusion, and the like. By continuously linking each object's action and facial emotion detection results to the same identity, videos can be analyzed efficiently. The re-identification model extracts a feature vector from the bounding box of each object image detected by the tracking model in every frame, and applies single-linkage hierarchical clustering over the feature vectors from past frames to identify an object whose track was lost. Through this process, an object that was lost to tracking can be re-tracked when it re-appears or becomes un-occluded after leaving the scene, so the action and facial emotion detection results of an object newly recognized after a tracking failure can be linked to those of the object that appeared in the past. To improve processing performance, we introduce a per-object Bounding Box Queue and a Feature Queue method that reduce RAM requirements while maximizing GPU memory throughput, as well as the IoF (Intersection over Face) algorithm, which links facial emotions recognized through AWS Rekognition with object tracking information; a sketch of the clustering and IoF steps follows this abstract. The academic significance of this study is that, with these processing techniques, the two-stage re-identification model achieves real-time performance even in the high-cost setting of simultaneous action and facial emotion detection, without the accuracy loss incurred when simple metrics are used to reach real-time speed. The practical implication is that the many industrial fields that need action and facial emotion detection but are hindered by tracking failures can analyze videos effectively with the proposed system. With its high re-tracking accuracy and processing performance, the system can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value. In the future, to measure tracking performance more precisely, experiments are needed on the MOT Challenge dataset, which is used by many international conferences. We will also investigate the cases the IoF algorithm cannot handle in order to develop a complementary algorithm, and we plan further research applying the system to datasets from various fields related to intelligent video analysis.
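
Two pieces of the pipeline above reduce to compact operations: the single-linkage clustering that re-associates appearance features across tracking failures, and the IoF box test that attaches a recognized facial emotion to a tracked object. The sketch below shows both under assumptions the abstract does not pin down: a cosine appearance metric, a fixed distance threshold, and the natural reading of "Intersection over Face".

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def reid_by_single_linkage(features, threshold=0.3):
    """Group appearance feature vectors (one per detection) into identities.

    features: (n_detections, d) array from the Re-ID embedding model;
    returns one cluster label per detection, equal labels = same identity.
    The cosine metric and threshold are assumptions.
    """
    dists = pdist(np.asarray(features), metric="cosine")  # pairwise distances
    tree = linkage(dists, method="single")                # single-linkage merging
    return fcluster(tree, t=threshold, criterion="distance")

def iof(face_box, object_box):
    """Intersection over Face: fraction of the face box covered by the object
    box, used to attach a facial emotion to a tracked object. Boxes are
    (x1, y1, x2, y2); this exact formula is an assumption based on the name."""
    x1 = max(face_box[0], object_box[0])
    y1 = max(face_box[1], object_box[1])
    x2 = min(face_box[2], object_box[2])
    y2 = min(face_box[3], object_box[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    face_area = (face_box[2] - face_box[0]) * (face_box[3] - face_box[1])
    return inter / face_area if face_area > 0 else 0.0
```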

Semantic Classification of DSM Using Convolutional Neural Network Based Deep Learning (합성곱 신경망 기반의 딥러닝에 의한 수치표면모델의 객체분류)

  • Lee, Dae Geon;Cho, Eun Ji;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.6 / pp.435-444 / 2019
  • Recently, DL (Deep Learning) has been rapidly applied in various fields. In particular, classification and object recognition from images are major tasks in computer vision. Most DL work on imagery is based on the CNN (Convolutional Neural Network), and improving the performance of DL models is a central issue. While most CNNs take images as training data, this paper aims to classify and recognize objects using a DSM (Digital Surface Model), together with slope and aspect information derived from the DSM, instead of images (the derivation is sketched below). The DSM data sets used in the experiment were established by the DGPF (German Society for Photogrammetry, Remote Sensing and Geoinformatics) and provided by the ISPRS (International Society for Photogrammetry and Remote Sensing). The CNN-based SegNet model, which is regarded as highly efficient and performant, was used to train on the data sets. In addition, this paper proposes a scheme for generating training data efficiently from a limited number of samples. The results demonstrate that the DSM and its derived data are feasible inputs for semantic classification with desirable accuracy using DL.
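
The slope and aspect channels mentioned above are standard terrain derivatives of the DSM, so they can be sketched in a few lines; the grid cell size and the aspect convention below are assumptions.

```python
# Minimal sketch of deriving the slope and aspect channels fed to SegNet
# alongside the DSM; formulas are the standard finite-difference terrain
# derivatives, under one common aspect convention.
import numpy as np

def slope_aspect(dsm, cell_size=1.0):
    """dsm: 2-D array of surface heights, one value per grid cell."""
    dz_dy, dz_dx = np.gradient(dsm, cell_size)               # height gradients
    slope = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))    # steepness, degrees
    aspect = np.degrees(np.arctan2(-dz_dx, dz_dy)) % 360.0   # downslope direction
    return slope, aspect

# Stack the DSM and its derivatives as a 3-channel input, mirroring RGB:
# x = np.stack([dsm, *slope_aspect(dsm)], axis=-1)
```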

Design and Implementation of Mobile Vision-based Augmented Galaga using Real Objects (실제 물체를 이용한 모바일 비전 기술 기반의 실감형 갤러그의 설계 및 구현)

  • Park, An-Jin;Yang, Jong-Yeol;Jung, Kee-Chul
    • Journal of Korea Game Society / v.8 no.2 / pp.85-96 / 2008
  • Recently, research on augmented games as a new game genre has attracted a lot of attention. An augmented game overlays virtual objects on an augmented reality (AR) environment, allowing players to interact with that environment by manipulating real and virtual objects. However, it is difficult to release existing augmented games to ordinary players, as they generally rely on very expensive and inconvenient 'backpack' systems. To solve this problem, several augmented games using camera-equipped mobile devices have been proposed, but they can only be enjoyed at a previously prepared location, since a 'color marker' or 'pattern marker' is needed to register the virtual objects with the real environment. Accordingly, this paper introduces an augmented game called augmented Galaga, based on the well-known traditional game Galaga and executed on mobile devices, so that players can experience the game without any economic burden. Augmented Galaga uses real objects in real environments and recognizes them using the Scale-Invariant Feature Transform (SIFT) and Euclidean distance, as sketched below. Virtual aliens appear randomly around specific objects, several such objects are used to keep the game interesting, and players attack the aliens by pointing the mobile device at a specific object and clicking a button. As a result, we expect augmented Galaga to provide an exciting experience for players without any economic burden, following the game paradigm in which the user interacts both with the physical world captured by the mobile camera and with virtual aliens generated by the mobile device.
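
The recognition step described above, SIFT descriptors compared by Euclidean distance, can be sketched with standard OpenCV calls; the ratio test and the match-count threshold are assumptions layered on top of what the abstract states.

```python
# Minimal sketch: decide whether a known real object appears in the camera
# frame by matching SIFT descriptors under the Euclidean (L2) norm.
import cv2

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)   # Euclidean distance on descriptors

def recognize(reference_gray, frame_gray, min_matches=10):
    """Return True if the reference object is visible in the camera frame."""
    _, ref_desc = sift.detectAndCompute(reference_gray, None)
    _, frame_desc = sift.detectAndCompute(frame_gray, None)
    if ref_desc is None or frame_desc is None:
        return False
    matches = matcher.knnMatch(ref_desc, frame_desc, k=2)
    # Lowe's ratio test keeps only distinctive matches (assumed addition)
    good = [pair[0] for pair in matches
            if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance]
    return len(good) >= min_matches
```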


Noise Removal using Fuzzy Mask Filter (퍼지 마스크 필터를 이용한 잡음 제거)

  • Lee, Sang-Jun;Yoon, Seok-Hyun;Kim, Kwang-Baek
    • Journal of the Korea Society of Computer and Information / v.15 no.11 / pp.41-45 / 2010
  • Image processing techniques are fundamental to human vision-based image information processing, with widely studied areas such as image transformation, enhancement, restoration, and compression. One goal shared by these areas is enhancing image information for correct information retrieval. As a prerequisite for image recognition and interpretation, image enhancement includes noise filtering. Conventional filtering algorithms may achieve a high noise removal rate but usually have difficulty preserving boundary information; as a result, they often require additional image processing steps, at the cost of more CPU time and a higher risk of information loss. In this paper, we propose a fuzzy mask filtering algorithm that achieves a high noise removal rate with fewer of these side effects. The algorithm first derives a threshold from the mask contents using fuzzy logic, and then determines the output pixel value from that threshold; a sketch of this idea follows. In an experiment with random impulse noise and salt-and-pepper noise, the proposed algorithm removed noise more effectively without information loss.
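
The abstract gives only the outline of the filter, a fuzzy-logic threshold computed from the mask that then decides the output pixel, so the sketch below is one plausible reading rather than the paper's exact rule set: the membership of "this pixel is noise" grows with its deviation from the 3x3 mask median, and the output blends the original value with the median accordingly, which tends to preserve edges better than a plain median filter.

```python
# Sketch in the spirit of a fuzzy mask filter; the membership shape and the
# low/high thresholds are assumptions, not taken from the paper.
import numpy as np

def fuzzy_mask_filter(img, low=10.0, high=60.0):
    """img: 2-D grayscale array; returns an impulse-noise-suppressed copy."""
    out = img.astype(np.float64).copy()
    pad = np.pad(img.astype(np.float64), 1, mode="edge")
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            mask = pad[y:y + 3, x:x + 3]          # 3x3 mask around the pixel
            med = np.median(mask)
            dev = abs(pad[y + 1, x + 1] - med)    # deviation from mask median
            # fuzzy "is noise" membership: 0 below low, 1 above high
            mu = np.clip((dev - low) / (high - low), 0.0, 1.0)
            out[y, x] = (1.0 - mu) * out[y, x] + mu * med
    return out.astype(img.dtype)
```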

The effect of Emotional Leadership of Dental Hygienist on job satisfaction (치과위생사의 감성리더십이 직무만족에 미치는 영향)

  • Lee, Hye-Kyung
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.12 / pp.708-714 / 2019
  • This study investigated the effect of emotional leadership on the job satisfaction of dental hygienists, in order to identify a way to improve job satisfaction through emotional leadership, one of the skills dental hygienists should possess. The subjects were 237 dental hygienists working in dental clinics in Jeollabuk-do, surveyed from April 16 to April 28, 2018. The data were analyzed using SPSS 12.0, with the following results. Among the emotional leadership dimensions, leadership ability scored lowest (3.15), and average job satisfaction was 3.43; satisfaction with salary was low, ranging from 2.49 to 2.98. Final academic achievement and dental career had significant influences on emotional leadership (p<0.05). Marital status, self-cognition ability, self-management ability, and relationship management ability influenced the subjects' job satisfaction. The explanatory power of the multiple regression model was 26.5%, and the model was statistically significant (p<0.05). Based on these results, the self-recognition, self-management, and relationship management components of emotional leadership are all strongly related to the job satisfaction of dental hygienists. Strengthening dental hygienists' emotional leadership can thus enhance empathy with, understanding of, and management of others' feelings; institutional support should therefore be provided for developing an emotional leadership program based on education and training.

Performance Enhancement of the Attitude Estimation using Small Quadrotor by Vision-based Marker Tracking (영상기반 물체추적에 의한 소형 쿼드로터의 자세추정 성능향상)

  • Kang, Seokyong;Choi, Jongwhan;Jin, Taeseok
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.5 / pp.444-450 / 2015
  • The accuracy of a small, low-cost CCD camera is insufficient on its own to provide data for precisely tracking unmanned aerial vehicles (UAVs). This study shows how a UAV can hover over a human-designated target object using a CCD camera rather than imprecise GPS data. To realize this, UAVs need to recognize their attitude and position in both known and unknown environments, and this localization should occur naturally. Estimating the UAV's attitude through environment recognition is one of the most important problems for UAV hovering. In this paper, we describe a method for estimating the attitude of a UAV using image information of a marker on the floor. The method combines the position observed from GPS sensors with the attitude estimated from images captured by a fixed camera. Using the a priori known path of the UAV in world coordinates and a perspective camera model, we derive the geometric constraint equations relating the image-frame coordinates of the floor marker to the estimated UAV attitude. Since these equations are based on the estimated position, measurement error is always present, so the proposed method uses the error between the observed and estimated image coordinates to localize the UAV, applying a Kalman filter scheme (sketched below). Its performance is verified by the image processing results and by experiment.
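
The fusion step above is a standard Kalman filter; the sketch below shows the generic predict/update cycle it would follow, with the state layout, system matrices, and noise covariances left as assumptions since the abstract only names the scheme.

```python
# Minimal linear Kalman filter sketch: a marker measurement in image
# coordinates corrects a state predicted from the known motion model.
import numpy as np

class KalmanFilter:
    def __init__(self, F, H, Q, R, x0, P0):
        # F: state transition, H: measurement model,
        # Q/R: process/measurement noise covariances (all assumed here)
        self.F, self.H, self.Q, self.R = F, H, Q, R
        self.x, self.P = x0, P0

    def predict(self):
        """Propagate the state estimate along the motion model."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x

    def update(self, z):
        """Correct with a measurement z, e.g. marker image coordinates."""
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ (z - self.H @ self.x)    # innovation correction
        self.P = (np.eye(len(self.x)) - K @ self.H) @ self.P
        return self.x
```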

A Collaborative Video Annotation and Browsing System using Linked Data (링크드 데이터를 이용한 협업적 비디오 어노테이션 및 브라우징 시스템)

  • Lee, Yeon-Ho;Oh, Kyeong-Jin;Sean, Vi-Sal;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.203-219 / 2011
  • Previously, ordinary users simply watched video content without any specific requirement or purpose. Today, however, while watching a video users try to learn and discover more about the things that appear in it. The demand for finding multimedia and browsing information about objects of interest is therefore spreading with the increasing use of multimedia such as video, which is available not only on internet-capable devices such as computers but also on smart TVs and smartphones. Meeting these requirements makes labor-intensive annotation of the objects in video content inevitable, and many researchers have actively studied methods of annotating the objects that appear in video. In keyword-based annotation, related information about an object appearing in the video is attached directly, and annotation data containing all related information about the object must be managed individually; users have to enter all of that information themselves. Consequently, when a user browses for information related to an object, only the limited resources that exist in the annotated data can be found, and placing annotations demands a huge workload from the user. To reduce this workload and minimize the annotation effort, existing object-based annotation work has attempted automatic annotation using computer vision techniques such as object detection, recognition, and tracking. With such techniques, the wide variety of objects appearing in video content must all be detected and recognized, and fully automated annotation still faces difficulties. To overcome these difficulties, we propose a system consisting of two modules. The first is an annotation module that lets many annotators collaboratively annotate the objects in video content, accessing semantic data through Linked Data. Annotation data managed by the annotation server is represented using an ontology so that the information can easily be shared and extended. Since the annotation data does not itself include all the relevant information about an object, objects appearing in the video are simply connected to existing objects in Linked Data to obtain all related information. In other words, the annotation server stores only a URI and metadata such as position, time, and size; when the user needs other related information about the object, it is retrieved from Linked Data through the object's URI (see the sketch below). The second module lets viewers browse interesting information about an object, while watching the video, using the annotation data collaboratively generated by many users. With this system, a simple user interaction automatically generates the query, all related information is retrieved from Linked Data, and the additional information about the object is offered to the user. In the future Semantic Web environment, the proposed system is expected to establish a better video content service environment by offering users relevant information about the objects on the screen of any internet-capable device such as a PC, smart TV, or smartphone.
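
The storage scheme described above, where the annotation server keeps only a URI plus position/time/size and everything else lives in Linked Data, can be illustrated with a toy record and lookup. The record fields and the DBpedia query below are illustrative assumptions, not the paper's actual schema.

```python
# Minimal sketch of a lightweight annotation record and a Linked Data
# lookup through the object's URI.
from SPARQLWrapper import SPARQLWrapper, JSON

annotation = {                          # what the annotation server stores
    "uri": "http://dbpedia.org/resource/Eiffel_Tower",  # illustrative URI
    "time": 12.4,                       # seconds into the video
    "position": (320, 180),             # top-left corner in the frame
    "size": (80, 120),                  # bounding-box width x height
}

def fetch_related_info(uri):
    """Pull an object's description from Linked Data through its URI."""
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery(f"""
        SELECT ?abstract WHERE {{
            <{uri}> <http://dbpedia.org/ontology/abstract> ?abstract .
            FILTER (lang(?abstract) = 'en')
        }}""")
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return rows[0]["abstract"]["value"] if rows else None

# print(fetch_related_info(annotation["uri"]))
```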

Efficient Object Localization using Color Correlation Back-projection (칼라 상관관계 역투영법을 적용한 효율적인 객체 지역화 기법)

  • Lee, Yong-Hwan;Cho, Han-Jin;Lee, June-Hwan
    • Journal of Digital Convergence / v.14 no.5 / pp.263-271 / 2016
  • Localizing an object in an image is a common task in computer vision. Because existing methods detect only a single object per image, while real pictures contain multiple similar objects, their use in applications is limited. This paper proposes an efficient object localization method for image recognition. The proposed method uses color correlation back-projection in the YCbCr chromaticity color space to address the localization problem. The algorithm lets users detect and locate the primary position of an object within the image, and candidate regions can be detected accurately without any prior information about the number of objects. To evaluate the algorithm, we measured the success rate of object localization on a commonly used image database; the experiments showed a 21% improvement in success ratio. This study builds on spatially localized color features and correlation-based localization, and its main contribution is applying the correlogram in a new way to object localization. The sketch below shows the closest standard back-projection operation.
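
The closest standard operation to the paper's color correlation back-projection is histogram back-projection over the chroma plane, sketched below; the correlogram-based weighting that is the paper's actual contribution is not reproduced here, and the bin counts are assumptions. Note that OpenCV orders the channels Y, Cr, Cb.

```python
# Minimal sketch: back-project a target's chroma histogram onto a scene to
# get a likelihood map of where the target's colors occur.
import cv2

def backproject_chroma(target_bgr, scene_bgr, bins=32):
    """Return a per-pixel likelihood map of the target's colors in the scene."""
    t = cv2.cvtColor(target_bgr, cv2.COLOR_BGR2YCrCb)
    s = cv2.cvtColor(scene_bgr, cv2.COLOR_BGR2YCrCb)
    # Histogram over the chroma channels (Cr, Cb), ignoring luma Y
    hist = cv2.calcHist([t], [1, 2], None, [bins, bins], [0, 256, 0, 256])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    return cv2.calcBackProject([s], [1, 2], hist, [0, 256, 0, 256], scale=1)

# The peak of the smoothed map gives the primary object location:
# bp = cv2.GaussianBlur(backproject_chroma(target, scene), (15, 15), 0)
# _, _, _, peak_xy = cv2.minMaxLoc(bp)
```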

The Method of Wet Road Surface Condition Detection With Image Processing at Night (영상처리기반 야간 젖은 노면 판별을 위한 방법론)

  • KIM, Youngmin;BAIK, Namcheol
    • Journal of Korean Society of Transportation / v.33 no.3 / pp.284-293 / 2015
  • The objective of this paper is to determine road surface conditions from images collected by closed-circuit television (CCTV) cameras installed along the roadside. First, techniques for detecting wet surfaces at nighttime were examined. The literature review showed that image processing using polarization is one of the preferred options; however, the polarization characteristics of road surface images are hard to use at night because lighting is irregular or absent. In this study, we propose a new discriminant for detecting wet and dry road surfaces from nighttime CCTV image data. To detect road surface conditions at night, we applied the wavelet packet transform to analyze road surface textures and, to exploit the luminance of night CCTV images, built an intensity histogram based on the HSI (Hue Saturation Intensity) color model. With a set of 200 images taken in the field, we constructed a detection hyperplane using an SVM (Support Vector Machine); a sketch of this pipeline follows. Field tests verified the ability to detect wet road surfaces and produced reliable results. The outcome of this study is also expected to be used for monitoring road surfaces to improve safety.
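
The classification pipeline above, wavelet-packet texture energies plus an intensity histogram feeding an SVM, can be sketched as follows; the wavelet family, decomposition depth, histogram bin count, and kernel are assumptions, since the abstract specifies only the transform, the HSI intensity feature, and the SVM hyperplane.

```python
# Minimal sketch: texture + intensity features for wet/dry night road
# surface classification with an SVM.
import numpy as np
import pywt
from sklearn.svm import SVC

def surface_features(gray, wavelet="db2", level=2, bins=16):
    """gray: 2-D grayscale road patch (intensity ~ the HSI 'I' channel)."""
    wp = pywt.WaveletPacket2D(gray.astype(float), wavelet, maxlevel=level)
    # Energy of each wavelet-packet subband captures the surface texture
    energies = [np.mean(node.data ** 2) for node in wp.get_level(level)]
    hist, _ = np.histogram(gray, bins=bins, range=(0, 255), density=True)
    return np.concatenate([energies, hist])

# clf = SVC(kernel="rbf").fit(X_train, y_train)   # hyperplane from ~200 images
# label = clf.predict([surface_features(patch)])  # wet vs. dry
```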