• Title/Summary/Keyword: Video Software Method


Spatial and Temporal Resolution Selection for Bit Stream Extraction in H.264 Scalable Video Coding (H.264 SVC에서 비트 스트림 추출을 위한 공간과 시간 해상도 선택 기법)

  • Kim, Nam-Yun;Hwang, Ho-Young
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.1
    • /
    • pp.102-110
    • /
    • 2010
  • H.264 SVC (Scalable Video Coding) provides the advantages of low disk storage requirements and high scalability. However, a streaming server or a user terminal has to extract a bit stream from the SVC file. This paper proposes a bit stream extraction method that obtains the maximum PSNR value while the bit rate does not exceed the available network bandwidth. To do this, the paper obtains, offline, information about the extraction points that yield the maximum PSNR value, and decides the spatial/temporal resolution of the bit stream at run time. This resolution information, along with the available network bandwidth, is used as the parameters to a bit stream extractor. Through experiments with the JSVM reference software, we show that the proposed bit stream extraction method obtains a higher PSNR value.
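The run-time selection step described in this abstract can be sketched as a simple constrained maximization: among the extraction points measured offline, pick the one with the highest PSNR whose bitrate fits the available bandwidth. A minimal sketch, where the point list, field names, and numeric values are all illustrative assumptions, not data from the paper:

```python
# Hypothetical sketch of run-time spatial/temporal resolution selection:
# each extraction point carries an offline-measured (bitrate, PSNR) pair.

def select_extraction_point(points, available_bandwidth_kbps):
    """points: list of dicts with 'spatial', 'temporal', 'bitrate_kbps', 'psnr_db'."""
    feasible = [p for p in points if p["bitrate_kbps"] <= available_bandwidth_kbps]
    if not feasible:
        return None  # no layer combination fits the channel
    return max(feasible, key=lambda p: p["psnr_db"])

points = [
    {"spatial": "QCIF", "temporal": 15, "bitrate_kbps": 192,  "psnr_db": 32.1},
    {"spatial": "CIF",  "temporal": 15, "bitrate_kbps": 512,  "psnr_db": 35.4},
    {"spatial": "CIF",  "temporal": 30, "bitrate_kbps": 768,  "psnr_db": 36.0},
    {"spatial": "4CIF", "temporal": 30, "bitrate_kbps": 2048, "psnr_db": 38.7},
]
best = select_extraction_point(points, available_bandwidth_kbps=1000)
print(best["spatial"], best["temporal"])  # CIF 30
```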

A Real-time Face Recognition System using Fast Face Detection (빠른 얼굴 검출을 이용한 실시간 얼굴 인식 시스템)

  • Lee Ho-Geun;Jung Sung-Tae
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.12
    • /
    • pp.1247-1259
    • /
    • 2005
  • This paper proposes a real-time face recognition system which detects multiple faces from low-resolution video such as web-camera video. The face recognition system consists of a face detection step and a face classification step. First, it finds face region candidates by using an AdaBoost-based object detection method, which is fast and robust. It generates a reduced feature vector for each face region candidate by using principal component analysis. Second, face classification uses principal component analysis and a multi-class SVM. Experimental results show that the proposed method achieves real-time face detection and face recognition from low-resolution video. Additionally, we implement an auto-tracking face recognition system using a pan-tilt web camera, and a wireless on/off digital door-lock system based on the face recognition system.
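The PCA feature-reduction step described in this abstract can be sketched as follows. This is illustrative only: the paper's classifier is a multi-class SVM, which is omitted here, and the sample data, image size, and number of components are assumptions.

```python
import numpy as np

# Illustrative PCA feature reduction for face-region candidates:
# fit an "eigenface" basis on training faces, then project each
# candidate into the reduced feature space.

def fit_pca(face_vectors, n_components):
    """face_vectors: (n_samples, n_pixels) matrix of flattened face images."""
    mean = face_vectors.mean(axis=0)
    centered = face_vectors - mean
    # Eigen-decomposition of the covariance matrix; columns of 'basis'
    # are the top principal components.
    cov = centered.T @ centered / (len(face_vectors) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    basis = eigvecs[:, order]
    return mean, basis

def project(face_vector, mean, basis):
    return (face_vector - mean) @ basis  # reduced feature vector

rng = np.random.default_rng(0)
faces = rng.normal(size=(20, 64))      # 20 synthetic 8x8 "faces"
mean, basis = fit_pca(faces, n_components=5)
feat = project(faces[0], mean, basis)
print(feat.shape)  # (5,)
```

The reduced vectors would then be fed to the SVM classifier in place of the raw pixels.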

New developmental direction of telecommunications for Disabilities Welfare (장애인복지를 위한 정보통신의 발전방향)

  • Park, Min-Soo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.4 no.1
    • /
    • pp.35-43
    • /
    • 2000
  • This paper studies the developmental direction of telecommunications for disabilities welfare. The method of this study is the Delphi method. Persons with disabilities are classified as having motor disability, visual handicap, hearing impairment, or language and speech disorders. Persons with motor disability need the following: speech recognition technology, video recognition technology, and breath-capacity recognition technology. Persons with visual handicap need the following: display recognition technology, speech recognition technology, text recognition technology, intelligent conversion handling technology, and video recognition and speech synthesis technology. Persons with hearing impairment and language and speech disorders need the following: speech signal handling technology, speech recognition technology, intelligent conversion handling technology, video recognition technology, and speech synthesis technology. The results of this study are as follows: first, a telecommunications organization for persons with disabilities must be established. Second, persons with disabilities need universal service. Third, persons with disabilities need information education. Fourth, telecommunications research needs support. Fifth, small telecommunications companies need support. Sixth, the software industry needs new development. Seventh, persons with disabilities need standard guidelines for telecommunications.


Development of An Intelligent G-Learning Virtual Learning Platform Based on Real Video (실 화상 기반의 지능형 G-러닝 가상 학습 플랫폼 개발)

  • Jae-Yeon Park;Sung-Jun Park
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.24 no.2
    • /
    • pp.79-86
    • /
    • 2024
  • In this paper, we propose a virtual learning platform based on the various interactions that occur during real class activities, rather than the existing content-delivery-oriented learning metaverse platforms. In this study, we provide a learning environment that combines AI and a virtual environment, allowing students to solve problems by talking to a real-time AI. We also applied G-learning techniques to improve class immersion. The Virtual Edu platform developed through this study provides an effective learning experience combining self-directed learning, interest-stimulating game simulation, and the PBL teaching method, and we propose a new educational method that improves student participation and learning effectiveness. In experiments, we tested performance on learning activities in a real-time video classroom. As a result, we found that classes progressed stably.

Selective Interpolation Filter for Video Coding (비디오 압축을 위한 선택적인 보간 필터)

  • Nam, Jung-Hak;Jo, Hyun-Ho;Sim, Dong-Gyu;Lee, Soo-Youn
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.1
    • /
    • pp.58-66
    • /
    • 2012
  • Even after the establishment of the H.264/AVC standard, the Video Coding Experts Group (VCEG) of ITU-T has researched promising coding techniques to increase coding efficiency based on the Key Technology Area (KTA) software. Recently, the Joint Collaborative Team on Video Coding (JCT-VC), composed of VCEG and the Moving Picture Experts Group (MPEG) of ISO/IEC, has been developing a next-generation video standard, namely HEVC, intended to double the coding efficiency of H.264/AVC. An adaptive interpolation technique, one of various next-generation techniques, reportedly yields higher coding efficiency. However, it has high computational complexity and does not deal with the various error characteristics of videos. In this paper, we investigate the characteristics of interpolation filters and propose an effective fixed interpolation filter bank covering diverse error properties. Experimental results show that the proposed method achieves bitrate reductions of 0.7% and 1.3% compared to the fixed directional interpolation filter (FDIF) of KTA and the directional interpolation filter (DIF) of the HEVC test model, respectively.
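For context on what a fixed interpolation filter computes: H.264/AVC's standard half-pel luma interpolation uses the 6-tap kernel (1, -5, 20, 20, -5, 1)/32, and a fixed filter bank like the one proposed above selects among several such fixed kernels. A minimal 1-D sketch of the standard half-pel filter (the edge handling by clamping is a simplification for illustration, not the standard's exact border rule):

```python
# Standard H.264/AVC 6-tap half-pel luma filter applied to one row.
TAPS = (1, -5, 20, 20, -5, 1)

def half_pel(row, i):
    """Half-sample value between row[i] and row[i+1]."""
    # Gather the six integer samples around position i, clamping at borders.
    samples = [row[max(0, min(len(row) - 1, i + k))] for k in range(-2, 4)]
    acc = sum(t * s for t, s in zip(TAPS, samples))
    return max(0, min(255, (acc + 16) >> 5))  # rounded divide by 32, clipped

row = [10, 10, 10, 200, 200, 200]
print(half_pel(row, 2))  # value interpolated across the 10 -> 200 edge
```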

Content-based Shot Boundary Detection from MPEG Data using Region Flow and Color Information (영역 흐름 및 칼라 정보를 이용한 MPEG 데이타의 내용 기반 셧 경계 검출)

  • Kang, Hang-Bong
    • Journal of KIISE:Software and Applications
    • /
    • v.27 no.4
    • /
    • pp.402-411
    • /
    • 2000
  • Detecting shot boundaries in video data is an important step in video indexing and retrieval. Some approaches detect shot changes by computing color histogram differences or the variances of DCT coefficients. However, these approaches do not consider the content or meaningful features of the image data that are useful in high-level video processing. In particular, it is desirable to detect these features from compressed video data because this requires less processing overhead. In this paper, we propose a new method to detect shot boundaries from MPEG data using region flow and color information. First, we reconstruct DC images and compute region flow information and color histogram differences from HSV-quantized images. Then, we compute the points at which the region flow has discontinuities or the color histogram differences are high. Finally, we decide which of those points are shot boundaries according to our proposed algorithm.
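The color-histogram-difference cue mentioned above can be sketched minimally: a shot boundary is declared wherever the histogram distance between consecutive (DC-image) frames exceeds a threshold. The bin count, the threshold, and the toy frames are illustrative assumptions; the paper combines this cue with region flow.

```python
# Toy shot-boundary detector using per-frame intensity histograms.

def histogram(frame, bins=8, max_val=256):
    h = [0] * bins
    for v in frame:
        h[v * bins // max_val] += 1
    return h

def hist_diff(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))  # L1 distance

def shot_boundaries(frames, threshold):
    hists = [histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if hist_diff(hists[i - 1], hists[i]) > threshold]

# Four tiny "frames": a hard cut happens between frames 1 and 2.
frames = [[10] * 16, [12] * 16, [240] * 16, [238] * 16]
print(shot_boundaries(frames, threshold=16))  # [2]
```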


Education of media by production of image contents - Focusing on Non-Linear Editing (영상 콘텐츠 제작을 통한 미디어 교육 - 비선형 편집을 중심으로)

  • Park, Sung-Dae
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.9
    • /
    • pp.1096-1103
    • /
    • 2019
  • Today, the influence of digital video media is gradually growing due to the development of information and telecommunication technology, and most juveniles consume a great deal of image content on their smartphones and computers. They not only consume such image content but also upload videos they make to YouTube or Vimeo, recording and editing them by themselves. As this phenomenon shows, media education that uses and expresses video in various ways, such as making, reading, and expressing video media, is very important. Education in using advanced digital devices and software for producing image content is essential, as current media education cannot be discussed without it. This study treats media education from the viewpoint of education in image content production. It discusses the education method that can be learned from the process of image content production based on a Non-Linear Editing System.

MPEG Video Segmentation using Two-stage Neural Networks and Hierarchical Frame Search (2단계 신경망과 계층적 프레임 탐색 방법을 이용한 MPEG 비디오 분할)

  • Kim, Joo-Min;Choi, Yeong-Woo;Chung, Ku-Sik
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.1_2
    • /
    • pp.114-125
    • /
    • 2002
  • In this paper, we propose a hierarchical segmentation method that first segments the video data into units of shots by detecting cuts and dissolves, and then decides the types of camera operations or object movements in each shot. In our previous work [1], each picture group is divided into one of three detailed categories, Shot (in case of scene change), Move (in case of camera operation or object movement), and Static (in case of almost no change between images), by analyzing the DC (Direct Current) components of I (Intra) frames. In this process, we designed a two-stage hierarchical neural network with inputs of various multiple features combined. Then, the system detects the accurate shot position and the types of camera operations or object movements by searching the P (Predicted) and B (Bi-directional) frames of the current picture group selectively and hierarchically. Also, the statistical distributions of macro block types in P or B frames are used for the accurate detection of the cut position, and another neural network, with inputs of macro block types and motion vectors together, is used to detect dissolves, types of camera operations, and object movements. The proposed method can reduce the processing time by using only the DC coefficients of I frames without decoding and by searching P and B frames selectively and hierarchically. The proposed method classified the picture groups with an accuracy of 93.9-100.0% and the cuts with an accuracy of 96.1-100.0% on three different types of video data. Also, it classified the types of camera operations or object movements with accuracies of 90.13% and 89.28% on two different types of video data.

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.89-106
    • /
    • 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly. With this increase in video data, the requirements for its analysis and utilization are also increasing. Due to the lack of skilled manpower to analyze videos in many industries, machine learning and artificial intelligence are actively used to assist. In this situation, the demand for various computer vision technologies such as object detection and tracking, action detection, emotion detection, and Re-ID has also increased rapidly. However, object detection and tracking technology faces many difficulties that degrade performance, such as re-appearance after an object leaves the video recording location, and occlusion. Accordingly, action and emotion detection models based on object detection and tracking models also have difficulty extracting data for each object. In addition, deep learning architectures consisting of various models suffer from performance degradation due to bottlenecks and a lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed model uses single-linkage hierarchical-clustering-based Re-ID and processing methods that maximize hardware throughput. It has higher accuracy than a re-identification model using simple metrics, offers near-real-time processing performance, and prevents tracking failure due to object departure and re-emergence, occlusion, etc. By continuously linking the action and facial emotion detection results of each object to the same object, videos can be analyzed efficiently.
The re-identification model extracts a feature vector from the bounding box of the object image detected by the object tracking model for each frame, and applies single-linkage hierarchical clustering with the feature vectors extracted from past frames to identify the same object when tracking fails. Through this process, the same object can be re-tracked after re-appearance or occlusion. As a result, the action and facial emotion detection results of an object newly recognized after a tracking failure can be linked to those of the object that appeared in the past. As a way to improve processing performance, we introduce a per-object Bounding Box Queue and a Feature Queue method that reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which allows facial emotions recognized through AWS Rekognition to be linked with object tracking information. The academic significance of this study is that the two-stage re-identification model can achieve real-time performance through processing techniques, even in a high-cost environment performing action and facial emotion detection, without reducing accuracy by resorting to simple metrics. The practical implication is that the various industrial fields that require action and facial emotion detection, but face many difficulties due to object tracking failures, can analyze videos effectively with the proposed model. With its high re-tracking accuracy and processing performance, the proposed model can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where the integration of tracking information and extracted metadata creates great industrial and business value.
In the future, to measure the object tracking performance more precisely, an experiment using the MOT Challenge dataset, which is used by many international conferences, needs to be conducted. We will investigate the problems that the IoF algorithm cannot solve in order to develop a complementary algorithm. In addition, we plan to conduct additional research applying this model to datasets from various fields related to intelligent video analysis.
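The single-linkage re-identification idea described in this abstract can be sketched as follows: detections are grouped so that a new detection joins a past identity when the minimum distance between their feature vectors falls below a threshold. The distance metric, feature vectors, and threshold here are illustrative assumptions, not the paper's actual configuration.

```python
# Toy single-linkage grouping of appearance feature vectors.

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def single_linkage(features, threshold):
    """Greedy single-linkage grouping: returns a cluster id per feature.
    Two clusters merge when ANY cross pair is closer than the threshold."""
    labels = list(range(len(features)))
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if dist(features[i], features[j]) < threshold:
                old, new = labels[j], labels[i]
                labels = [new if l == old else l for l in labels]
    return labels

# Two detections of the same person (close features) and one other person.
feats = [(0.0, 0.1), (0.05, 0.12), (5.0, 5.0)]
print(single_linkage(feats, threshold=0.5))  # [0, 0, 2]
```

In the system described above, a detection that merges into a past cluster would inherit that identity's accumulated action and emotion results.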

Multi-Threaded Parallel H.264/AVC Decoder for Multi-Core Systems (멀티코어 시스템을 위한 멀티스레드 H.264/AVC 병렬 디코더)

  • Kim, Won-Jin;Cho, Keol;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.11
    • /
    • pp.43-53
    • /
    • 2010
  • Wide deployment of high resolution video services leads to active studies on high speed video processing. Especially, prevalent employment of multi-core systems accelerates researches on high resolution video processing based on parallelization of multimedia software. In this paper, we propose a novel parallel H.264/AVC decoding scheme on a multi-core platform. Parallel H.264/AVC decoding is challenging not only because parallelization may incur significant synchronization overhead but also because software may have complicated dependencies. To overcome such issues, we propose a novel approach called Multi-Threaded Parallelization(MTP). In MTP, to reduce synchronization overhead, a separate thread is allocated to each stage in the pipeline. In addition, an efficient memory reuse technique is used to reduce the memory requirement. To verify the effectiveness of the proposed approach, we parallelized FFmpeg H.264/AVC decoder with the proposed technique using OpenMP, and carried out experiments on an Intel Quad-Core platform. The proposed design performs better than FFmpeg H.264/AVC decoder before the parallelization by 53%. We also reduced the amount of memory usage by 65% and 81% for a high-definition(HD) and a full high-definition(FHD) video, respectively compared with that of popular existing method called 2Dwave.