• Title/Summary/Keyword: video recognition

Search Result 696, Processing Time 0.02 seconds

Abnormal Behavior Recognition Based on Spatio-temporal Context

  • Yang, Yuanfeng;Li, Lin;Liu, Zhaobin;Liu, Gang
    • Journal of Information Processing Systems
    • /
    • v.16 no.3
    • /
    • pp.612-628
    • /
    • 2020
  • This paper presents a new approach for detecting abnormal behaviors in complex surveillance scenes where anomalies are subtle and difficult to distinguish due to the intricate correlations among multiple objects' behaviors. Specifically, a cascaded probabilistic topic model was put forward for learning the spatial context of local behavior and the temporal context of global behavior in two different stages. In the first stage of topic modeling, unlike the existing approaches using either optical flows or complete trajectories, spatio-temporal correlations between the trajectory fragments in video clips were modeled by the latent Dirichlet allocation (LDA) topic model based on Markov random fields to obtain the spatial context of local behavior in each video clip. The local behavior topic categories were then obtained by exploiting the spectral clustering algorithm. Based on the construction of a dictionary through the process of local behavior topic clustering, the second phase of the LDA topic model learns the correlations of global behaviors and temporal context. In particular, an abnormal behavior recognition method was developed based on the learned spatio-temporal context of behaviors. The specific identification method adopts a top-down strategy and consists of two stages: anomaly recognition of video clip and anomalous behavior recognition within each video clip. Evaluation was performed using the validity of spatio-temporal context learning for local behavior topics and abnormal behavior recognition. Furthermore, the performance of the proposed approach in abnormal behavior recognition improved effectively and significantly in complex surveillance scenes.

A Study for Improved Human Action Recognition using Multi-classifiers (비디오 행동 인식을 위하여 다중 판별 결과 융합을 통한 성능 개선에 관한 연구)

  • Kim, Semin;Ro, Yong Man
    • Journal of Broadcast Engineering
    • /
    • v.19 no.2
    • /
    • pp.166-173
    • /
    • 2014
  • Recently, human action recognition have been developed for various broadcasting and video process. Since a video can consist of various scenes, keypoint approaches have been more attracted than template based methods for real application. Keypoint approahces tried to find regions having motion in video, and made 3-dimensional patches. Then, descriptors using histograms were computed from the patches, and a classifier based on machine learning method was applied to detect actions in video. However, a single classifier was difficult to handle various human actions. In order to improve this problem, approaches using multi classifiers were used to detect and to recognize objects. Thus, we propose a new human action recognition using decision-level fusion with support vector machine and sparse representation. The proposed method extracted descriptors based on keypoint approach from a video, and acquired results from each classifier for human action recognition. Then, we applied weights which were acquired by training stage to fuse each results from two classifiers. The experiment results in this paper show better result than a previous fusion method.

Face Detection and Recognition in MPEG Compressed Video (MPEG 압축 비디오 상에서의 얼굴 영역 추출 및 인식)

  • 여창욱;유명현
    • Korean Journal of Cognitive Science
    • /
    • v.11 no.2
    • /
    • pp.79-87
    • /
    • 2000
  • In this paper we present a face recognition and face detection algorithm in MPEG compressed video. The proposed method consists three stage of processing steps. The first step is to produce a spatially reduced DC image form MPEG compressed video for processing. And the second step is face detection on reduced DC image. Finally, the last step is face recognition on partially extracted compressed frames which contain the detected faces. The spatially reduced DC image is produced from two dimensional inverse DCT of the DC coefficient and the first two AC coefficients. The face detection is performed on DC image and face recognition is performed on one extracted frame per GOP by using the K-L transform. In order to evaluate the proposed method, we carried out experiments on video database. The experiment results show the proposed method is very efficient and helpful for target tasks.

  • PDF

Study on News Video Character Extraction and Recognition (뉴스 비디오 자막 추출 및 인식 기법에 관한 연구)

  • 김종열;김성섭;문영식
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.40 no.1
    • /
    • pp.10-19
    • /
    • 2003
  • Caption information in news videos can be useful for video indexing and retrieval since it usually suggests or implies the contents of the video very well. In this paper, a new algorithm for extracting and recognizing characters from news video is proposed, without a priori knowledge such as font type, color, size of character. In the process of text region extraction, in order to improve the recognition rate for videos with complex background at low resolution, continuous frames with identical text regions are automatically detected to compose an average frame. The image of the averaged frame is projected to horizontal and vertical direction, and we apply region filling to remove backgrounds to produce the character. Then, K-means color clustering is applied to remove remaining backgrounds to produce the final text image. In the process of character recognition, simple features such as white run and zero-one transition from the center, are extracted from unknown characters. These feature are compared with the pre-composed character feature set to recognize the characters. Experimental results tested on various news videos show that the proposed method is superior in terms of caption extraction ability and character recognition rate.

Face Detection and Recognition for Video Retrieval (비디오 검색을 위한 얼굴 검출 및 인식)

  • lslam, Mohammad Khairul;Lee, Hyung-Jin;Paul, Anjan Kumar;Baek, Joong-Hwan
    • Journal of Advanced Navigation Technology
    • /
    • v.12 no.6
    • /
    • pp.691-698
    • /
    • 2008
  • We present a novel method for face detection and recognition methods applicable to video retrieval. The person matching efficiency largely depends on how robustly faces are detected in the video frames. Face regions are detected in video frames using viola-jones features boosted with the Adaboost algorithm After face detection, PCA (Principal Component Analysis) follows illumination compensation to extract features that are classified by SVM (Support Vector Machine) for person identification. Experimental result shows that the matching efficiency of the ensembled architecture is quit satisfactory.

  • PDF

Automatic Poster Generation System Using Protagonist Face Analysis

  • Yeonhwi You;Sungjung Yong;Hyogyeong Park;Seoyoung Lee;Il-Young Moon
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.4
    • /
    • pp.287-293
    • /
    • 2023
  • With the rapid development of domestic and international over-the-top markets, a large amount of video content is being created. As the volume of video content increases, consumers tend to increasingly check data concerning the videos before watching them. To address this demand, video summaries in the form of plot descriptions, thumbnails, posters, and other formats are provided to consumers. This study proposes an approach that automatically generates posters to effectively convey video content while reducing the cost of video summarization. In the automatic generation of posters, face recognition and clustering are used to gather and classify character data, and keyframes from the video are extracted to learn the overall atmosphere of the video. This study used the facial data of the characters and keyframes as training data and employed technologies such as DreamBooth, a text-to-image generation model, to automatically generate video posters. This process significantly reduces the time and cost of video-poster production.

Human Posture Recognition: Methodology and Implementation

  • Htike, Kyaw Kyaw;Khalifa, Othman O.
    • Journal of Electrical Engineering and Technology
    • /
    • v.10 no.4
    • /
    • pp.1910-1914
    • /
    • 2015
  • Human posture recognition is an attractive and challenging topic in computer vision due to its promising applications in the areas of personal health care, environmental awareness, human-computer-interaction and surveillance systems. Human posture recognition in video sequences consists of two stages: the first stage is training and evaluation and the second is deployment. In the first stage, the system is trained and evaluated using datasets of human postures to ‘teach’ the system to classify human postures for any future inputs. When the training and evaluation process is deemed satisfactory as measured by recognition rates, the trained system is then deployed to recognize human postures in any input video sequence. Different classifiers were used in the training such as Multilayer Perceptron Feedforward Neural networks, Self-Organizing Maps, Fuzzy C Means and K Means. Results show that supervised learning classifiers tend to perform better than unsupervised classifiers for the case of human posture recognition.

Improved Bimodal Speech Recognition Study Based on Product Hidden Markov Model

  • Xi, Su Mei;Cho, Young Im
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.13 no.3
    • /
    • pp.164-170
    • /
    • 2013
  • Recent years have been higher demands for automatic speech recognition (ASR) systems that are able to operate robustly in an acoustically noisy environment. This paper proposes an improved product hidden markov model (HMM) used for bimodal speech recognition. A two-dimensional training model is built based on dependently trained audio-HMM and visual-HMM, reflecting the asynchronous characteristics of the audio and video streams. A weight coefficient is introduced to adjust the weight of the video and audio streams automatically according to differences in the noise environment. Experimental results show that compared with other bimodal speech recognition approaches, this approach obtains better speech recognition performance.

Object Recognition Algorithm with Partial Information

  • Yoo, Suk Won
    • International Journal of Advanced Culture Technology
    • /
    • v.7 no.4
    • /
    • pp.229-235
    • /
    • 2019
  • Due to the development of video and optical technology today, video equipments are being used in a variety of fields such as identification, security maintenance, and factory automation systems that generate products. In this paper, we investigate an algorithm that effectively recognizes an experimental object in an input image with a partial problem due to the mechanical problem of the input imaging device. The object recognition algorithm proposed in this paper moves and rotates the vertices constituting the outline of the experimental object to the positions of the respective vertices constituting the outline of the DB model. Then, the discordance values between the moved and rotated experimental object and the corresponding DB model are calculated, and the minimum discordance value is selected. This minimum value is the final discordance value between the experimental object and the corresponding DB model, and the DB model with the minimum discordance value is selected as the recognition result for the experimental object. The proposed object recognition method obtains satisfactory recognition results using only partial information of the experimental object.

Real-Time Cattle Action Recognition for Estrus Detection

  • Heo, Eui-Ju;Ahn, Sung-Jin;Choi, Kang-Sun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.4
    • /
    • pp.2148-2161
    • /
    • 2019
  • In this paper, we present a real-time cattle action recognition algorithm to detect the estrus phase of cattle from a live video stream. In order to classify cattle movement, specifically, to detect the mounting action, the most observable sign of the estrus phase, a simple yet effective feature description exploiting motion history images (MHI) is designed. By learning the proposed features using the support vector machine framework, various representative cattle actions, such as mounting, walking, tail wagging, and foot stamping, can be recognized robustly in complex scenes. Thanks to low complexity of the proposed action recognition algorithm, multiple cattle in three enclosures can be monitored simultaneously using a single fisheye camera. Through extensive experiments with real video streams, we confirmed that the proposed algorithm outperforms a conventional human action recognition algorithm by 18% in terms of recognition accuracy even with much smaller dimensional feature description.