• Title/Abstract/Keyword: Video processing


Action Recognition Method in Sports Video Shear Based on Fish Swarm Algorithm

  • Jie Sun;Lin Lu
    • Journal of Information Processing Systems / v.19 no.4 / pp.554-562 / 2023
  • This research offers a sports video action recognition approach based on the fish swarm algorithm, motivated by the low accuracy of existing sports video action recognition methods. A modified fish swarm algorithm is proposed to construct invariant features and reduce their dimensionality; based on this algorithm, local and global features can be classified. Experimental findings on a typical sports action data set demonstrate that the dimensionality-reduced, fused invariant features successfully retain the key details of sports actions. According to this research, the average recognition time of the proposed method for walking, running, squatting, sitting, and bending is less than 326 seconds, and the average recognition rate is higher than 94%, showing that the method can significantly improve the performance and efficiency of online sports video motion recognition.
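
The abstract does not spell out the modified fish swarm algorithm, so the following is only a generic sketch of how an artificial fish swarm search can drive feature selection and dimension reduction: each fish is a binary feature mask, "prey" and "follow" moves flip mask bits, and a Fisher-style separability score stands in for the paper's unspecified fitness function. All constants and the toy data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    """Class-separability (Fisher-style) score of the selected features;
    a stand-in for the paper's unspecified objective, with a small
    penalty on subset size to reward dimension reduction."""
    if mask.sum() == 0:
        return -np.inf
    Xs = X[:, mask.astype(bool)]
    mu = Xs.mean(axis=0)
    between = sum(((Xs[y == c].mean(axis=0) - mu) ** 2).sum() for c in np.unique(y))
    within = sum(Xs[y == c].var(axis=0).sum() for c in np.unique(y)) + 1e-9
    return between / within - 0.01 * mask.sum()

def flip(mask, k=2):
    """Move a fish: flip k random bits of its binary feature mask."""
    out = mask.copy()
    out[rng.choice(len(mask), size=k, replace=False)] ^= 1
    return out

def afsa_select(X, y, n_fish=20, visual_tries=5, iters=50):
    d = X.shape[1]
    school = rng.integers(0, 2, size=(n_fish, d))
    best = max(school, key=lambda m: fitness(m, X, y)).copy()
    for _ in range(iters):
        for i in range(n_fish):
            # Prey behaviour: try neighbours in the "visual range" and move
            # to the first one that improves fitness.
            for _ in range(visual_tries):
                cand = flip(school[i])
                if fitness(cand, X, y) > fitness(school[i], X, y):
                    school[i] = cand
                    break
            else:
                # Follow behaviour: drift toward the best fish found so far.
                school[i] = flip(best)
            if fitness(school[i], X, y) > fitness(best, X, y):
                best = school[i].copy()
    return best.astype(bool)

# Toy demo: 40 noisy features; only the first 5 carry class information.
X = rng.normal(size=(200, 40))
y = rng.integers(0, 2, size=200)
X[y == 1, :5] += 2.0
print("selected features:", np.flatnonzero(afsa_select(X, y)))
```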

A Method for Structuring Digital Video

  • Lee, Jae-Yeon;Jeong, Se-Yoon;Yoon, Ho-Sub;Kim, Kyu-Heon;Bae, Younglae-J;Jang, Jong-whan
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1998.06b / pp.92-97 / 1998
  • For efficient searching and browsing of digital video, it is essential to extract the internal structure of the video content. For example, a news video consists of several sections such as politics, economics, and sports, and each section consists of individual topics. With this information in hand, users can more easily access the required video frames. This paper addresses the problem of automatic shot boundary detection and selection of representative frames (R-frames), which are the essential steps in recognizing the internal structure of video content. For shot boundary detection, a new algorithm is proposed that has dual detectors designed specifically for abrupt boundaries (cuts) and gradually changing boundaries, respectively. Compared to existing algorithms, which have mostly tried to detect both types with a single mechanism, the proposed algorithm proves more robust and accurate. For R-frame selection, simple mechanical approaches, such as selecting one frame every other second, have commonly been adopted; however, this approach often selects too many R-frames in static shots while dropping important frames in dynamic shots. To improve the selection mechanism, a new R-frame selection algorithm is proposed that uses motion information extracted from pixel differences.
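
As an illustration of the two ideas in the abstract, the sketch below implements a dual-detector shot boundary test (a high single-step threshold for cuts plus an accumulated-difference test for gradual transitions) and motion-budgeted R-frame selection from pixel differences. The thresholds are invented for the toy example, not taken from the paper.

```python
import numpy as np

def frame_diff(a, b):
    """Mean absolute pixel difference between consecutive gray frames."""
    return float(np.abs(a.astype(np.float32) - b.astype(np.float32)).mean())

def detect_shots(frames, cut_thr=30.0, grad_thr=8.0, grad_total=40.0):
    """Dual detectors: a high single-step threshold catches cuts, while an
    accumulated-difference (twin-comparison style) test catches gradual
    transitions. Thresholds here are illustrative."""
    boundaries, acc, start = [], 0.0, None
    for i in range(1, len(frames)):
        d = frame_diff(frames[i - 1], frames[i])
        if d >= cut_thr:                       # abrupt cut
            boundaries.append(i)
            acc, start = 0.0, None
        elif d >= grad_thr:                    # possible gradual transition
            if start is None:
                start, acc = i, 0.0
            acc += d
            if acc >= grad_total:
                boundaries.append(start)
                acc, start = 0.0, None
        else:
            acc, start = 0.0, None
    return boundaries

def select_r_frames(frames, motion_budget=60.0):
    """Pick a new representative frame whenever accumulated motion (pixel
    difference) exceeds a budget: static shots yield few R-frames, dynamic
    shots yield more."""
    r_frames, acc = [0], 0.0
    for i in range(1, len(frames)):
        acc += frame_diff(frames[i - 1], frames[i])
        if acc >= motion_budget:
            r_frames.append(i)
            acc = 0.0
    return r_frames

# Toy clip: two near-static shots joined by a hard cut at frame 10.
rng = np.random.default_rng(1)
shot_a = [np.full((48, 64), 40, np.uint8) + rng.integers(0, 3, (48, 64), dtype=np.uint8) for _ in range(10)]
shot_b = [np.full((48, 64), 200, np.uint8) - rng.integers(0, 3, (48, 64), dtype=np.uint8) for _ in range(10)]
clip = shot_a + shot_b
print("boundaries:", detect_shots(clip), "r-frames:", select_r_frames(clip))
```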


A Real-time Multiview Video Coding System using Fast Disparity Estimation

  • Bae, Kyung-Hoon;Woo, Byung-Kwang
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.22 no.7 / pp.37-42 / 2008
  • In this paper, a real-time multiview video coding system using fast disparity estimation is proposed. In the multiview encoder, adaptive disparity-motion estimation (DME) for effective 3-dimensional (3D) processing is proposed. That is, by adaptively predicting the mutual correlation between stereo images in the key frame with the proposed algorithm, the bandwidth of the stereo input images can be compressed to the level of a conventional 2D image, and a predicted image can be effectively reconstructed using a reference image and adaptive disparity vectors. In the multiview decoder, intermediate view reconstruction (IVR) using an adaptive disparity search algorithm (DSA) for real-time multiview video processing is proposed. The proposed IVR reduces the processing time of disparity estimation by adaptively selecting the disparity search range. Accordingly, the proposed multiview video coding system increases coding-rate efficiency and improves resolution.
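
The abstract's adaptive disparity search can be illustrated with scanline block matching in which each block's search window is centred on its left neighbour's disparity. This is a generic sketch of the idea, not the authors' exact DSA; the block size, search ranges, and SAD cost are assumptions.

```python
import numpy as np

def block_disparity(left, right, block=8, full_range=32, local_range=4):
    """Scanline block matching with an adaptive search range: each block's
    search window is centred on its left neighbour's disparity and narrowed
    to +/- local_range, falling back to a full search at each row start."""
    h, w = left.shape
    disp = np.zeros((h // block, w // block), dtype=np.int32)
    for by in range(h // block):
        prev_d = None
        for bx in range(w // block):
            y, x = by * block, bx * block
            ref = left[y:y + block, x:x + block].astype(np.float32)
            if prev_d is None:
                lo, hi = 0, full_range                 # cold start: full range
            else:
                lo = max(0, prev_d - local_range)      # adaptive: narrow window
                hi = prev_d + local_range
            best_d, best_err = 0, np.inf
            for d in range(lo, hi + 1):
                if x - d < 0:                          # candidate leaves image
                    break
                cand = right[y:y + block, x - d:x - d + block].astype(np.float32)
                err = np.abs(ref - cand).mean()        # SAD matching cost
                if err < best_err:
                    best_d, best_err = d, err
            disp[by, bx] = prev_d = best_d
    return disp

# Toy stereo pair: the right view is the left view shifted 3 pixels.
# (The leftmost block column stays 0 -- the usual border artifact.)
rng = np.random.default_rng(2)
left = rng.integers(0, 255, (64, 96)).astype(np.uint8)
right = np.roll(left, -3, axis=1)
print(block_disparity(left, right))
```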

Video Processing of MPEG Compressed Data for 3D Stereoscopic Conversion

  • Kim, Man-Bae
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1998.06a / pp.3-8 / 1998
  • The conversion of monoscopic video to 3D stereoscopic video has been studied by some pioneering researchers. In spite of the commercial potential of the technology, two problems have hindered progress in this research area: vertical motion parallax and high computational complexity. The former degrades 3D perception, while the latter demands complex hardware. Previous research has dealt with NTSC video, thus requiring complex processing steps, one of which is motion estimation. This paper proposes a 3D stereoscopic conversion method for MPEG-encoded data. The proposed method has the advantage that motion estimation can be avoided by processing the MPEG compressed data to extract motion information, and that camera and object motion in random directions can be handled.
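
The core claim is that motion vectors already present in the MPEG bitstream can replace explicit motion estimation. The sketch below assumes the macroblock motion vectors have already been parsed from the stream (the parsing itself is omitted) and shows one plausible way to turn the dominant motion into a frame delay for the synthesized second view while suppressing vertical parallax; the paper's actual delay rule is not given in the abstract, so every constant here is illustrative.

```python
import numpy as np

def dominant_motion(mv_field):
    """Median motion vector over a frame's macroblock MV field.
    mv_field: (rows, cols, 2) array of (dx, dy) vectors as they would be
    read out of the P-frame macroblocks of an MPEG stream; the bitstream
    parsing itself is omitted from this sketch."""
    return np.median(mv_field.reshape(-1, 2), axis=0)

def stereo_delay(mv_field, max_delay=4, gain=0.5):
    """Choose the frame delay for the synthesized second-eye view from the
    dominant horizontal motion: fast pans need short delays, slow pans long
    ones, and dominant vertical motion disables the effect so that no
    vertical parallax is introduced. All constants are illustrative."""
    dx, dy = dominant_motion(mv_field)
    if abs(dy) > abs(dx):            # mostly vertical motion: unsafe for 3D
        return 0
    if abs(dx) < 1e-3:               # static scene: use the maximum delay
        return max_delay
    return int(np.clip(gain * 16.0 / abs(dx), 0, max_delay))

# Toy MV field for one frame: a uniform 2 px/frame pan to the right.
mv = np.tile(np.array([2.0, 0.0]), (18, 22, 1))
print("second-view delay in frames:", stereo_delay(mv))
```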


Resource Efficient AI Service Framework Associated with a Real-Time Object Detector

  • Jun-Hyuk Choi;Jeonghun Lee;Kwang-il Hwang
    • Journal of Information Processing Systems / v.19 no.4 / pp.439-449 / 2023
  • This paper deals with a resource-efficient artificial intelligence (AI) service architecture for multi-channel video streams. As the AI service, we consider an object detection model, the most representative model for video applications. Since most object detection models are designed for a single-channel video stream, extra resource consumption is inevitable when processing multi-channel video streams. Therefore, we propose a resource-efficient AI service framework that can be associated with various AI service models. Our framework has a modular architecture consisting of adaptive frame control (AFC) manager, multiplexer (MUX), adaptive channel selector (ACS), and YOLO interface units. In order to run only a single YOLO process regardless of the number of channels, we propose a novel approach for efficiently handling multi-channel input streams. Experiments show that the framework can perform the object detection service with minimal resource utilization even with multi-channel streams, and that each service can be guaranteed within its deadline.
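
A minimal sketch of the multiplexing idea: producers for each channel tag frames with a channel ID and push them into one shared queue, and a single worker (standing in for the one YOLO process) consumes the queue and routes results back per channel. The AFC and ACS policies are reduced to trivial placeholders here.

```python
import queue
import threading
import time

frame_q = queue.Queue(maxsize=64)            # MUX: one queue for all channels
result_qs = {c: queue.Queue() for c in range(3)}

def camera(chan, n_frames=5, fps=10):
    """Per-channel producer. A real AFC manager would adapt or drop frames
    per channel; here every frame is forwarded, tagged with its channel ID."""
    for i in range(n_frames):
        frame_q.put((chan, i, f"frame-{chan}-{i}"))
        time.sleep(1.0 / fps)
    frame_q.put((chan, None, None))          # end-of-stream marker

def detector(n_channels=3):
    """Single detector worker, standing in for the one YOLO process: it
    consumes the multiplexed queue regardless of how many channels exist
    and demultiplexes results back to per-channel result queues."""
    finished = 0
    while finished < n_channels:
        chan, idx, frame = frame_q.get()
        if idx is None:
            finished += 1
            continue
        boxes = [("object", 0.9)]            # dummy inference result
        result_qs[chan].put((idx, boxes))

threads = [threading.Thread(target=camera, args=(c,)) for c in range(3)]
threads.append(threading.Thread(target=detector))
for t in threads:
    t.start()
for t in threads:
    t.join()
for c, q in result_qs.items():
    print(f"channel {c}: {q.qsize()} detection results routed back")
```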

CREATING JOYFUL DIGESTS BY EXPLOITING SMILE/LAUGHTER FACIAL EXPRESSIONS PRESENT IN VIDEO

  • Kowalik, Uwe;Hidaka, Kota;Irie, Go;Kojima, Akira
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.267-272 / 2009
  • Video digests provide an effective way to check video content rapidly owing to their very compact form. By watching a digest, users can easily decide whether a specific item is worth seeing in full, so the impression created by the digest greatly influences the user's choice when selecting video content. We propose a novel method of automatic digest creation that evokes a joyful impression by exploiting smile/laughter facial expressions as emotional cues of joy in video. We assume that a digest presenting smiling/laughing faces appeals to the user, since he/she is assured that the smile/laughter expression is caused by joyful events inside the video. For detecting smiling/laughing faces we have developed a neural-network-based method for classifying facial expressions. Video segmentation is performed by automatic shot detection, and for creating joyful digests, appropriate shots are automatically selected by shot ranking based on the smile/laughter detection result. We report the results of user trials conducted to assess the visual impression of 'joyful' digests produced automatically by our system. The results show that users tend to prefer emotional digests containing laughing faces, suggesting that the attractiveness of automatically created video digests can be improved by extracting emotional cues of the content through automatic facial expression analysis as proposed in this paper.
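
The digest-assembly step can be sketched as greedy shot ranking: given per-frame smile probabilities (which the paper obtains from a neural-network facial-expression classifier, omitted here), score each shot by its mean smile score and keep the best shots until a length budget is filled. The budget and scores below are toy values.

```python
def rank_shots(shots, smile_scores, digest_len=10.0):
    """Greedy digest assembly: rank shots by the mean smile/laughter score
    of their frames and keep the best shots until the length budget is
    spent. shots: list of (start_frame, end_frame, duration_sec);
    smile_scores: per-frame smile probabilities from a facial-expression
    classifier (a neural network in the paper, omitted here)."""
    scored = []
    for start, end, dur in shots:
        frames = smile_scores[start:end]
        scored.append((sum(frames) / max(len(frames), 1), dur, start, end))
    scored.sort(reverse=True)                # most joyful shots first
    digest, used = [], 0.0
    for score, dur, start, end in scored:
        if used + dur <= digest_len:
            digest.append((start, end, round(score, 2)))
            used += dur
    return sorted(digest)                    # restore temporal order

# Toy example: four 4-second shots; the second is full of smiles.
scores = [0.1] * 10 + [0.9] * 10 + [0.2] * 10 + [0.7] * 10
shots = [(0, 10, 4.0), (10, 20, 4.0), (20, 30, 4.0), (30, 40, 4.0)]
print(rank_shots(shots, scores))             # -> the 0.9 and 0.7 shots
```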


Video Expression Recognition Method Based on Spatiotemporal Recurrent Neural Network and Feature Fusion

  • Zhou, Xuan
    • Journal of Information Processing Systems / v.17 no.2 / pp.337-351 / 2021
  • Automatically recognizing facial expressions in video sequences is a challenging task because there is little direct correlation between facial features and subjective emotions in video. To overcome this problem, a video facial expression recognition method using a spatiotemporal recurrent neural network and feature fusion is proposed. First, the video is preprocessed, and a double-layer cascade structure is used to detect the face in each video image. Two deep convolutional neural networks then extract temporal-domain and spatial-domain facial features from the video: a spatial convolutional neural network extracts spatial information features from each frame of the static expression images, while a temporal convolutional neural network extracts dynamic information features from the optical flow across multiple frames of expression images. A multiplicative fusion is performed on the spatiotemporal features learned by the two deep convolutional neural networks. Finally, the fused features are input to a support vector machine to perform the facial expression classification task. Experimental results on the eNTERFACE, RML, and AFEW6.0 datasets show that the recognition rates obtained by the proposed method reach 88.67%, 70.32%, and 63.84%, respectively, and comparative experiments show that the proposed method obtains higher recognition accuracy than other recently reported methods.
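
A compact sketch of the fusion-and-classification stage: random vectors stand in for the two CNNs' spatial and temporal descriptors, the multiplicative fusion is an element-wise product as the abstract describes, and scikit-learn's SVC plays the SVM role. The feature dimension and data are invented for the demo.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)

def fuse(spatial_feats, temporal_feats):
    """Multiplicative fusion: element-wise product of the spatial (per-frame
    appearance) and temporal (optical-flow) descriptors, as the abstract
    describes. The two CNN extractors are replaced by random features here."""
    return spatial_feats * temporal_feats

# Toy descriptors for 200 clips over 3 expression classes (128-D each).
n, d = 200, 128
y = rng.integers(0, 3, size=n)
spatial = rng.normal(size=(n, d)) + y[:, None] * 0.5   # class-dependent shift
temporal = rng.normal(size=(n, d)) + y[:, None] * 0.5
X = fuse(spatial, temporal)

# SVM classification stage on the fused spatiotemporal features.
clf = SVC(kernel="rbf").fit(X[:150], y[:150])
print("holdout accuracy:", (clf.predict(X[150:]) == y[150:]).mean())
```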

A Hadoop-based Multimedia Transcoding System for Processing Social Media in the PaaS Platform of SMCCSE

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku;Jeong, Changsung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.11 / pp.2827-2848 / 2012
  • Previously, we described a social media cloud computing service environment (SMCCSE). This SMCCSE supports the development of social networking services (SNSs) that include audio, image, and video formats. A social media cloud computing PaaS platform, a core component of the SMCCSE, processes large amounts of social media in a parallel and distributed manner to support a reliable SNS. Here, we propose a Hadoop-based multimedia system for image and video transcoding, a necessary function of our PaaS platform. Our system consists of two modules: an image transcoding module and a video transcoding module. We design and implement the system using a MapReduce framework running on a Hadoop Distributed File System (HDFS) and the media processing libraries Xuggler and JAI. In this way, our system dramatically reduces the encoding time for transcoding large amounts of image and video files into specific formats depending on user-requested options (such as resolution, bit rate, and frame rate). To evaluate system performance, we measure the total image and video transcoding time for image and video data sets under various experimental conditions, and compare the video transcoding performance of our cloud-based approach with that of the traditional frame-level parallel-processing approach. Based on experiments performed on a 28-node cluster, the proposed Hadoop-based multimedia transcoding system delivers excellent speed and quality.
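
The MapReduce transcoding idea can be approximated outside of Java/Xuggler with a Hadoop Streaming-style mapper: each mapper reads video paths from stdin, shells out to ffmpeg with the user-requested resolution, bit rate, and frame rate, and emits input/output pairs. This is a simplified analogue, not the paper's HDFS-based implementation; the OPTIONS values are hypothetical.

```python
#!/usr/bin/env python3
"""Hadoop Streaming-style transcoding mapper (a simplified analogue of the
paper's MapReduce + Xuggler design). Each input line is a video path; the
mapper shells out to ffmpeg with the requested resolution, bit rate, and
frame rate, and emits "<input>\t<output>" for an identity reducer."""
import subprocess
import sys
from pathlib import Path

# Hypothetical user-requested transcoding options.
OPTIONS = {"resolution": "640x360", "bitrate": "800k", "framerate": "24"}

def transcode(src: str) -> str:
    out = str(Path(src).with_suffix("")) + "_transcoded.mp4"
    subprocess.run(
        ["ffmpeg", "-y", "-i", src,
         "-s", OPTIONS["resolution"],      # target resolution
         "-b:v", OPTIONS["bitrate"],       # target video bit rate
         "-r", OPTIONS["framerate"],       # target frame rate
         out],
        check=True, capture_output=True)
    return out

if __name__ == "__main__":
    for line in sys.stdin:                 # one video path per input record
        src = line.strip()
        if src:
            print(f"{src}\t{transcode(src)}")
```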

An Optimal Video Editing Method using Frame Information Pre-Processing

  • Lee, Jun-Pyo;Cho, Chul-Young;Lee, Jong-Soon;Kim, Tae-Yeong;Kwon, Cheol-Hee
    • Journal of the Korea Society of Computer and Information / v.15 no.7 / pp.27-32 / 2010
  • Using our proposed method, portions of an MPEG coded bitstream can be cut and pasted efficiently to rearrange the audio and video sequences. The proposed method decodes the MPEG stream within only one GOP (Group of Pictures), edits the decoded video frames, and encodes them back into an MPEG stream, so precise editing is possible. A pre-processing step is specially designed to make cut-and-paste processing easy: while preparing the MPEG streams for editing, detailed frame information is extracted. In addition, video quality is not degraded after the proposed editing process is applied. The experimental results show significant improvements over traditional video editing algorithms in terms of efficiency and exactness.
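
The pre-processing and GOP-limited re-encoding can be sketched as an edit planner: index the I-frames once, then for any cut point, stream-copy all GOPs before the containing GOP and re-encode only that one GOP. The actual bitstream decoding and re-encoding are omitted; the frame types are toy data.

```python
from dataclasses import dataclass

@dataclass
class FrameInfo:
    index: int
    ftype: str   # 'I', 'P', or 'B', gathered during the pre-processing pass

def gop_starts(frames):
    """Pre-processing: index every GOP boundary (each I-frame) once, so any
    later cut point can be resolved without rescanning the whole stream."""
    return [f.index for f in frames if f.ftype == "I"]

def plan_cut(frames, cut_frame):
    """Plan an edit at cut_frame: GOPs wholly inside the kept range are
    stream-copied bit-exactly (so no quality is lost), and only the single
    GOP containing the cut point is decoded, trimmed, and re-encoded."""
    gop_start = max(s for s in gop_starts(frames) if s <= cut_frame)
    return {
        "copy_until": gop_start,                 # bit-exact stream-copy region
        "reencode_gop": (gop_start, cut_frame),  # decode/trim/re-encode here
    }

# Toy stream: 48 frames with the GOP pattern I B B P B B P B B P B B.
pattern = "IBBPBBPBBPBB"
frames = [FrameInfo(i, pattern[i % 12]) for i in range(48)]
print(plan_cut(frames, cut_frame=30))   # -> copy to 24, re-encode 24..30
```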

A Study on Optimization of Intelligent Video Surveillance System based on Embedded Module

  • Kim, Jin Su;Kim, Min-Gu;Pan, Sung Bum
    • Smart Media Journal / v.7 no.2 / pp.40-46 / 2018
  • A conventional CCTV surveillance system for preventing accidents and incidents, in which one person monitors multiple CCTV channels, misses 95% of events after 22 minutes of monitoring. To address this issue, researchers have studied computer-based intelligent video surveillance systems that notify people of abnormal situations. However, because such systems suffer from power consumption and cost problems, intelligent video surveillance systems based on embedded modules have been studied. This paper implements an embedded-module-based intelligent video surveillance system that detects intruders, fires, loitering, and falling. Moreover, algorithm and embedded-module optimization methods are applied to achieve real-time processing. The system is implemented on a Raspberry Pi, where the algorithm processing time is 0.95 seconds before optimization and 0.47 seconds after optimization, a 50.52% reduction. This demonstrates that real-time processing is possible with an intelligent video surveillance system based on embedded modules.
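
As a flavour of the embedded pipeline, the sketch below runs OpenCV MOG2 background subtraction on downscaled frames, one typical optimization for boards like the Raspberry Pi; the abstract does not detail the paper's actual optimization steps, so all thresholds and the synthetic scene are illustrative.

```python
import cv2
import numpy as np

# Downscaling before detection is one typical embedded-side optimization;
# the abstract does not detail the paper's actual optimization steps.
SCALE = 0.5
subtractor = cv2.createBackgroundSubtractorMOG2(history=50, varThreshold=25)

def detect_intruder(frame_bgr, min_area=150):
    """Background-subtraction intruder detector sized for an embedded board:
    shrink the frame, update the MOG2 model, and report any large moving blob."""
    small = cv2.resize(frame_bgr, None, fx=SCALE, fy=SCALE)
    mask = subtractor.apply(small)
    mask = cv2.medianBlur(mask, 5)               # suppress salt noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return any(cv2.contourArea(c) >= min_area * SCALE ** 2 for c in contours)

# Synthetic test: a static scene, then a bright object walks in at frame 25.
rng = np.random.default_rng(4)
base = rng.integers(20, 40, (240, 320, 3)).astype(np.uint8)
for t in range(40):
    frame = base.copy()
    if t >= 25:
        frame[100:160, 10 + 5 * t:70 + 5 * t] = 255   # the moving "intruder"
    hit = detect_intruder(frame)                      # always update the model
    if t > 5 and hit:                                 # skip model warm-up frames
        print(f"frame {t}: intruder detected")
        break
```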