• Title/Summary/Keyword: Video sequence

Object Tracking Algorithm using Feature Map based on Siamese Network (Siamese Network의 특징맵을 이용한 객체 추적 알고리즘)

  • Lim, Su-Chang;Park, Sung-Wook;Kim, Jong-Chan;Ryu, Chang-Su
    • Journal of Korea Multimedia Society / v.24 no.6 / pp.796-804 / 2021
  • In computer vision, visual tracking addresses the problem of localizing a specific object in a video sequence given its bounding box. In this paper, we propose a tracking method that introduces feature correlation comparison into a Siamese network to improve its matching discrimination. We propose a way to compute the object location through a correlation operation, which locates the matching parts and thereby solves the search problem. The higher layers of the network extract rich object information, whereas the lower layers retain more location information. To reduce the error of the object center point, we built a Siamese network that extracts both the distribution and the location information of target objects. In the experiments, the average center error rate was less than 25%.
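The correlation step described in this abstract can be illustrated with a minimal sketch. It assumes that the template and search features are single-channel 2-D maps and that the response is a plain, unnormalized sliding dot product; the function names `cross_correlate` and `locate_peak` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def cross_correlate(search: Matrix, template: Matrix) -> Matrix:
    """Slide the template over the search map and return the dot-product response."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response: Matrix = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            # Unnormalized correlation score for the window whose top-left is (y, x).
            score = sum(
                search[y + dy][x + dx] * template[dy][dx]
                for dy in range(th)
                for dx in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def locate_peak(response: Matrix) -> Tuple[int, int]:
    """Return (row, col) of the first maximum in the response map."""
    best_pos = (0, 0)
    best_val = response[0][0]
    for y, row in enumerate(response):
        for x, val in enumerate(row):
            if val > best_val:
                best_val, best_pos = val, (y, x)
    return best_pos
```

The peak gives the top-left corner of the best-matching window in feature-map coordinates. An unnormalized score favors regions with large activations, so a practical tracker would likely normalize the response or learn the features so that this bias does not dominate.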

SAD-Based Reordering of Feature Map Sequence for VCM (VCM 을 위한 SAD 기반 특징맵 시퀀스 재배열)

  • Kim, Dong-Ha;Yoon, Yong-Uk;Kim, Jae-Gon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / fall / pp.30-32 / 2021
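  • As the amount of video consumed by machines for machine vision tasks has grown, MPEG is standardizing VCM (Video Coding for Machines) as a video coding standard for machines. VCM encodes and decodes the video or the features that are input to a machine analysis network and evaluates task accuracy relative to compression. This paper proposes a method that reorders the per-channel feature maps based on SAD (Sum of Absolute Differences) when feature data extracted from a machine analysis network is encoded and decoded with an existing video codec. Although the proposed method does not reach the VCM anchor performance, it performs better than encoding the features with a video codec without channel reordering.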
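One way to picture SAD-based channel reordering is the sketch below. The abstract does not state the ordering rule, so this assumes a greedy nearest-neighbor chain: start from channel 0 and repeatedly append the remaining channel with the smallest SAD to the last one placed. The names `sad` and `reorder_channels_by_sad` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def sad(a: Matrix, b: Matrix) -> float:
    """Sum of absolute differences between two equally sized feature maps."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def reorder_channels_by_sad(channels: List[Matrix]) -> List[int]:
    """Return a channel order in which consecutive maps are similar (greedy, assumed rule)."""
    if not channels:
        return []
    order = [0]
    remaining = set(range(1, len(channels)))
    while remaining:
        last = channels[order[-1]]
        # Pick the closest remaining channel; sorted() makes tie-breaking deterministic.
        nxt = min(sorted(remaining), key=lambda i: sad(last, channels[i]))
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

Placing similar feature maps next to each other in the frame sequence is meant to help the inter prediction of a conventional video codec. The decoder needs the returned order to restore the original channel layout.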

Ontology Modeling and Rule-based Reasoning for Automatic Classification of Personal Media (미디어 영상 자동 분류를 위한 온톨로지 모델링 및 규칙 기반 추론)

  • Park, Hyun-Kyu;So, Chi-Seung;Park, Young-Tack
    • Journal of KIISE / v.43 no.3 / pp.370-379 / 2016
  • Recently, personal media have been produced in a variety of ways as smart devices have become widespread, and services that use these data are in demand. Research on media analysis and recognition technology has therefore been actively conducted, and meaningful objects can now be recognized from media. Existing systems based on a media ontology rely on video titles, tags, and script information, and so cannot classify media by what actually appears in the video. In this paper, we propose a system that automatically classifies video using the objects shown in the media data. To do this, we use description logic-based reasoning together with rule-based inference for processing events whose activities may occur in varying order. The description logic-based reasoning system proposed in this paper represents the relations among the objects in the media as an activity ontology. We describe how a second, rule-based reasoning system defines an event according to the order of the inferred activities, and how this order-based reasoning system automatically classifies the event into the appropriate category. To evaluate the efficiency of the proposed approach, we conducted an experiment using media data classified into valid categories through analysis of YouTube videos.

Video Compression using Characteristics of Wavelet Coefficients (웨이브렛 계수의 특성을 이용한 비디오 영상 압축)

  • 문종현;방만원
    • Journal of Broadcast Engineering / v.7 no.1 / pp.45-54 / 2002
  • This paper proposes a video compression algorithm that uses the characteristics of wavelet coefficients. The proposed algorithm provides a lower bit rate and faster running time while guaranteeing the reconstructed image quality as perceived by the human visual system. In this approach, each video sequence is decomposed into a pyramid structure of subimages at various resolutions to exploit the multiresolution capability of the discrete wavelet transform. Similarities between two neighboring frames are then obtained from the low-frequency subband, which contains the important information of an image, and motion information is extracted from the similarity criteria. Four region selection filters are designed according to the similarity criteria, and compression is carried out by encoding the coefficients in the preservation regions and replacement regions of the high-frequency subbands. The region selection filters classify the high-frequency subbands into preservation regions and replacement regions based on the similarity criteria, and the coefficients in the replacement regions are replaced by those of a reference frame or reduced to zero according to block-based similarities between the reference frame and successive frames. Encoding is carried out by quantizing and arithmetic coding the wavelet coefficients in the preservation regions and the replacement regions separately. The reference frame is updated at the lowest point when the curve of similarity rates shows a concave pattern. Simulation results show that the proposed algorithm provides a high compression ratio with adequate image quality. It also outperforms the previous Milton's algorithm in image quality, compression ratio, and running time, achieving a bit rate below 0.2 bpp, a PSNR of 32 dB, and a running time of 10 ms for a standard video image of $352 \times 240$ pixels.

Intelligent Diagnosis Assistant System of Capsule Endoscopy Video Through Analysis of Video Frames (영상 프레임 분석을 통한 대용량 캡슐내시경 영상의 지능형 판독보조 시스템)

  • Lee, H.G.;Choi, H.K.;Lee, D.H.;Lee, S.C.
    • Journal of Intelligence and Information Systems / v.15 no.2 / pp.33-48 / 2009
  • Capsule endoscopy is one of the most remarkable inventions of the last ten years. Because it causes less pain for patients, it is considered a more convenient way to examine the entire digestive system than a conventional endoscope. However, the diagnosis process is known to require a very long inspection time from clinical experts, because uncontrollable movement of the capsule endoscope produces many duplicate images of the same areas of the digestive system. In this paper, we propose a method that helps clinical diagnosticians obtain highly valuable information from capsule-endoscopy video. Our software system builds three global maps in the temporal domain over the entire input video sequence: a movement map, a characteristic map, and a brightness map. The movement map can be used to effectively remove duplicated adjacent images. The characteristic and brightness maps provide frame content analyses that can be used to quickly segment regions or locate features (such as blood) in the stream. Our experiments show results for four patients with different health conditions. The resulting maps clearly capture the movements and characteristics of the image frames. Our method may help diagnosticians quickly find the locations of lesions, bleeding, or other areas of interest.
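The idea of removing duplicated adjacent images can be sketched as follows. The abstract does not define how the movement map is computed, so this assumes a simple stand-in: the mean absolute pixel difference between a frame and the last kept frame, compared against a hypothetical `threshold`. The function names are illustrative only.

```python
from typing import List

Frame = List[List[float]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def drop_duplicate_frames(frames: List[Frame], threshold: float) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame."""
    kept: List[int] = []
    last_kept = None
    for idx, frame in enumerate(frames):
        # Always keep the first frame; afterwards keep only frames that show movement.
        if last_kept is None or mean_abs_diff(frame, last_kept) > threshold:
            kept.append(idx)
            last_kept = frame
    return kept
```

Comparing against the last kept frame rather than the immediately preceding one avoids discarding a slow drift in which every single step falls below the threshold.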

Fast Disparity Vector Estimation using Motion vector in Stereo Image Coding (스테레오 영상에서 움직임 벡터를 이용한 고속 변이 벡터 추정)

  • Doh, Nam-Keum;Kim, Tae-Yong
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.5 / pp.56-65 / 2009
  • Stereoscopic images consist of a left image and a right image, and therefore contain more data than a single image. An efficient image compression technique is thus needed, and DPCM-based predictive coding is used in most video coding standards. Motion and disparity estimation are needed to realize predictive coding, and they are typically performed with the block matching algorithm used in most video coding standards. The full search algorithm is the baseline block matching algorithm: it finds the optimal block by comparing the base block with every block in the search area. It gives the best accuracy in finding optimal blocks, but it has a very large computational load. In this paper, we propose a fast disparity estimation algorithm that uses the motion and disparity vector information of the previous frame in stereo image coding. Disparity vector estimation is accelerated by reducing the search area with a global disparity vector and by limiting the search points using the motion vectors and disparity vectors of the previous frame. Experimental results show that the proposed algorithm performs better on simple image sequences than on complex ones. We conclude that fast disparity vector estimation is possible for simple image sequences by reducing computational complexity.
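The trade-off between full search and a restricted search can be sketched with a small block matcher. This is only an illustration under stated assumptions: SAD is used as the matching cost, the restricted search is a small window centered on a predicted vector, and the function `best_vector` with its `predicted` and `radius` parameters is hypothetical. How the paper derives the prediction from the global disparity vector and the previous frame's vectors is not reproduced here.

```python
from typing import List, Tuple

Image = List[List[int]]


def best_vector(
    cur: Image,
    ref: Image,
    top: int,
    left: int,
    size: int,
    predicted: Tuple[int, int] = (0, 0),
    radius: int = 2,
) -> Tuple[Tuple[int, int], int]:
    """Search a (2*radius+1)^2 window around `predicted` for the lowest-SAD block.

    Returns ((dy, dx), sad). A large radius around (0, 0) approximates full search;
    a small radius around a good prediction tests far fewer candidates.
    """
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(predicted[0] - radius, predicted[0] + radius + 1):
        for dx in range(predicted[1] - radius, predicted[1] + radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference image.
            if y < 0 or x < 0 or y + size > h or x + size > w:
                continue
            cost = sum(
                abs(cur[top + i][left + j] - ref[y + i][x + j])
                for i in range(size)
                for j in range(size)
            )
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    if best is None:
        raise ValueError("no valid candidate inside the reference image")
    return best[1], best[0]
```

With `radius=1` the search tests at most 9 candidates per block, compared with 81 for `radius=4`, so the benefit depends entirely on how reliable the predicted vector is. That is consistent with the abstract's observation that simple sequences benefit more than complex ones.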

Channel Condition Adaptive Error Concealment using Scalability Coding (채널상태에 적응적인 계층 부호화를 이용한 오류 은닉 방법 연구)

  • Han Seung-Gyun;Park Seung-Ho;Suh Doug-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.1B / pp.8-17 / 2004
  • In this paper, we propose an adaptive error concealment technique for scalable video coding over an error-prone wireless network environment, and we show that applying the proposed techniques to scalable video data is very effective. We propose two methods of error concealment. In the first, the error is concealed using the motion vector of the base layer and the data of the previous VOP. In the second, the concealment source is chosen adaptively according to whether a motion vector exists at the error position: when a motion vector exists, the error is concealed using the co-located data of the base layer; when the motion vector is zero, it is concealed using the co-located data of the previous VOP. We show that, depending on the error patterns caused by wireless network conditions and on the characteristics of the sequence, the decoder should refer to either the base layer data or the previous enhancement layer data for effective error concealment. Although this paper uses the scalable coding of MPEG-4, these error concealment techniques can be applied to any DCT-based codec.
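The second, adaptive method can be sketched as a per-block source selection. This is a minimal illustration under assumptions not stated in the abstract: frames are 2-D lists of equal size, the base layer has already been upsampled to the enhancement-layer resolution, and lost blocks and motion vectors are given per block index. The name `conceal_frame` and its parameters are hypothetical.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]
BlockIndex = Tuple[int, int]


def conceal_frame(
    damaged: Frame,
    lost_blocks: List[BlockIndex],
    motion_vectors: Dict[BlockIndex, Tuple[int, int]],
    base_layer: Frame,
    previous_vop: Frame,
    block: int = 2,
) -> Frame:
    """Fill each lost block from the base layer (moving) or the previous VOP (static)."""
    out = [row[:] for row in damaged]
    for by, bx in lost_blocks:
        mv = motion_vectors.get((by, bx), (0, 0))
        # Non-zero motion: the previous VOP is likely stale, so use the base layer.
        # Zero motion: the co-located data of the previous VOP is a good substitute.
        source = base_layer if mv != (0, 0) else previous_vop
        for y in range(by * block, (by + 1) * block):
            for x in range(bx * block, (bx + 1) * block):
                out[y][x] = source[y][x]
    return out
```

For example, with a base layer of all 5s and a previous VOP of all 9s, a lost block with motion vector `(1, 0)` is filled with 5s, while a lost block with a zero motion vector is filled with 9s.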

Three Dimensional Tracking of Road Signs based on Stereo Vision Technique (스테레오 비전 기술을 이용한 도로 표지판의 3차원 추적)

  • Choi, Chang-Won;Choi, Sung-In;Park, Soon-Yong
    • Journal of Institute of Control, Robotics and Systems / v.20 no.12 / pp.1259-1266 / 2014
  • Road signs provide drivers with important safety information about road and traffic conditions. They include not only common traffic signs but also warnings about unexpected obstacles and road construction. Accurate detection and identification of road signs is therefore one of the most important research topics related to safe driving. In this paper, we propose a 3-D vision technique that automatically detects and tracks road signs in a video sequence acquired from a stereo vision camera mounted on a vehicle. First, color information is used to detect the initial sign candidates. Second, an SVM (Support Vector Machine) is employed to separate true signs from the candidates. Once a road sign is detected in a video frame, it is tracked continuously from the next frame until it disappears. The 2-D position of a detected sign in the next frame is predicted from the 3-D motion of the vehicle, which is obtained from the 3-D pose information of the detected sign. Finally, the predicted 2-D position is corrected by matching a scaled template of the detected sign within a window around the predicted position. Experimental results show that the proposed method can successfully detect and track many types of road signs. Tracking comparisons with two other methods are also presented.

A Simplification Method of Intra Prediction Considering Importance of Subjective Interest Region (주관적 관심영역 중요도를 고려한 화면내 예측 간소화 방법)

  • Lee, Ho-Young;Kwon, Soon-Kak
    • Journal of Korea Multimedia Society / v.12 no.7 / pp.922-928 / 2009
  • In H.264, the newest video standard, nine modes are used in intra prediction to predict the signal values of a block of pixels. This process gives H.264 a high compression ratio in the encoded signal, but using all nine modes is inefficient because it increases complexity through the amount of computation and the number of searches needed to compare adjacent pixels. This paper proposes a method that simplifies the prediction modes for intra-picture coding by considering the subjective region of interest. Within a picture of a video sequence there are certain regions of interest, which require better subjective picture quality than the other regions. The proposed method increases the simplification of the prediction modes by providing only the essential modes out of the nine for regions of less interest, compared with the region of interest. Compared with the conventional method, which simplifies the prediction modes for the whole picture using the prediction characteristics only, the proposed method achieves an additional 11% to 15% simplification of the prediction modes.

Quantization Modeling of Intra Frame for Rate Control (비트율 제어를 위한 인트라 프레임 양자화 모델링)

  • Park, Sang-Hyun
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.9 no.10 / pp.1207-1214 / 2014
  • The first frame of a GOP is encoded in intra mode, which generates a large number of bits. In addition, the first frame is used for the inter mode encoding of the following frames. The encoding result of the intra frame therefore affects not only the first frame but also the following frames. Traditionally, the quantization parameter for an intra frame is determined from the bpp alone, without considering the characteristics of the intra frame. For accurate intra frame encoding, we should consider not only the bpp but also the complexity of the video sequence and the output bandwidth. In this paper, we propose a real-time quantization model that calculates the quantization parameter for intra frame encoding, based on an investigation of the characteristics of a GOP. Experimental results show that the proposed quantization model captures the characteristics of an intra frame effectively and that the proposed method for estimating the model parameters accurately estimates the real values.