• Title/Abstract/Keyword: Video-based

Search results: 5,489 items (processing time: 0.04 seconds)

Video Captioning with Visual and Semantic Features

  • Lee, Sujin;Kim, Incheol
    • Journal of Information Processing Systems / Vol. 14, No. 6 / pp. 1318-1330 / 2018
  • Video captioning refers to the process of extracting features from a video and generating video captions using the extracted features. This paper introduces a deep neural network model and its learning method for effective video captioning. In this study, both visual features and semantic features, which effectively express the video, are used. The visual features of the video are extracted using convolutional neural networks, such as C3D and ResNet, while the semantic features are extracted using a semantic feature extraction network proposed in this paper. Further, an attention-based caption generation network is proposed for effective generation of video captions using the extracted features. The performance and effectiveness of the proposed model are verified through various experiments on two large-scale video benchmarks, the Microsoft Video Description (MSVD) and the Microsoft Research Video-To-Text (MSR-VTT) datasets.
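
For illustration, here is a minimal PyTorch sketch of an attention-based caption decoder of the kind the abstract describes: it consumes pre-extracted per-frame visual features (e.g., ResNet/C3D vectors) plus a video-level semantic feature vector and attends over the frames at each decoding step. The module structure, names, and dimensions are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionCaptionDecoder(nn.Module):
    def __init__(self, vocab_size, feat_dim=2048, sem_dim=300, hid_dim=512, emb_dim=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # LSTM input: previous word embedding + attended visual feature + semantic feature
        self.lstm = nn.LSTMCell(emb_dim + feat_dim + sem_dim, hid_dim)
        self.att_v = nn.Linear(feat_dim, hid_dim)   # project frame features for attention
        self.att_h = nn.Linear(hid_dim, hid_dim)    # project decoder state for attention
        self.att_out = nn.Linear(hid_dim, 1)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, frame_feats, sem_feat, captions):
        # frame_feats: (B, T, feat_dim), sem_feat: (B, sem_dim), captions: (B, L) token ids
        B, T, _ = frame_feats.shape
        h = frame_feats.new_zeros(B, self.lstm.hidden_size)
        c = frame_feats.new_zeros(B, self.lstm.hidden_size)
        logits = []
        for t in range(captions.size(1) - 1):
            # temporal attention over frames, conditioned on the current decoder state
            scores = self.att_out(torch.tanh(self.att_v(frame_feats) + self.att_h(h).unsqueeze(1)))
            alpha = F.softmax(scores, dim=1)               # (B, T, 1)
            context = (alpha * frame_feats).sum(dim=1)     # (B, feat_dim)
            x = torch.cat([self.embed(captions[:, t]), context, sem_feat], dim=-1)
            h, c = self.lstm(x, (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)                  # (B, L-1, vocab_size)
```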

Versatile Video Coding을 활용한 Video based Point Cloud Compression 방법 (Video based Point Cloud Compression with Versatile Video Coding)

  • 권대혁;한희지;최해철
    • 한국방송∙미디어공학회:학술대회논문집 / 한국방송∙미디어공학회 2020년도 하계학술대회 / pp. 497-499 / 2020
  • A point cloud is one way of representing 3D data using a large number of 3D points, and it is a technology drawing attention in various fields as multimedia acquisition and processing technologies advance. In particular, point clouds have the advantage of being able to capture and represent 3D data precisely. However, because point clouds contain enormous amounts of data, efficient compression is essential. Accordingly, the international standardization body Moving Picture Experts Group is developing standards for Video based Point Cloud Compression (V-PCC) and Geometry based Point Cloud Coding for the efficient compression of point cloud data. Among these, V-PCC has the advantage of high usability because it compresses point clouds using the existing High Efficiency Video Coding (HEVC) standard. In this paper, we show that the compression performance of V-PCC can be further improved by replacing the HEVC codec used in V-PCC with Versatile Video Coding, whose standardization was scheduled to be completed in July 2020.
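
As a rough structural illustration of the V-PCC idea summarized above, the sketch below projects a point cloud into 2D geometry and attribute maps and hands them to a 2D video codec; `encode_with_codec` is a hypothetical placeholder (not a real HEVC/VVC binding), and the orthographic projection is a simplification of V-PCC's patch-based projection.

```python
import numpy as np

def project_to_maps(points, colors, resolution=256):
    """Naive orthographic projection of a point cloud onto the XY plane."""
    geometry = np.zeros((resolution, resolution), dtype=np.uint16)     # per-pixel depth (Z)
    attribute = np.zeros((resolution, resolution, 3), dtype=np.uint8)  # per-pixel RGB
    xy = np.clip((points[:, :2] * (resolution - 1)).astype(int), 0, resolution - 1)
    for (x, y), z, rgb in zip(xy, points[:, 2], colors):
        depth = int(z * 65535)
        if depth >= geometry[y, x]:          # keep the sample with the largest depth value
            geometry[y, x] = depth
            attribute[y, x] = rgb
    return geometry, attribute

def encode_with_codec(frames, codec="VVC"):
    """Hypothetical stand-in for an HEVC/VVC encoder call (e.g., an external binary)."""
    return {"codec": codec, "num_frames": len(frames)}

# One "frame" of a synthetic point cloud: 1000 random points with random colors.
pts = np.random.rand(1000, 3)
cols = (np.random.rand(1000, 3) * 255).astype(np.uint8)
geo_map, attr_map = project_to_maps(pts, cols)
bitstream = encode_with_codec([geo_map], codec="VVC"), encode_with_codec([attr_map], codec="VVC")
print(bitstream)
```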


Online Video Synopsis via Multiple Object Detection

  • Lee, JaeWon;Kim, DoHyeon;Kim, Yoon
    • 한국컴퓨터정보학회논문지 / Vol. 24, No. 8 / pp. 19-28 / 2019
  • In this paper, an online video summarization algorithm based on multiple object detection is proposed. As crime has been on the rise due to recent rapid urbanization, people's appetite for safety has been growing and the installation of surveillance cameras such as closed-circuit television (CCTV) has been increasing in many cities. However, it takes a great deal of time and labor to retrieve and analyze the huge amount of video data from numerous CCTVs. As a result, there is an increasing demand for intelligent video recognition systems that can automatically detect and summarize the various events occurring on CCTVs. Video summarization is a method of generating a synopsis video of a long original video so that users can watch it in a short time. The proposed video summarization method can be divided into two stages. The object extraction step detects objects in the video and extracts the specific objects desired by the user. The video summary step creates the final synopsis video based on the objects extracted in the object extraction step. While existing methods do not consider the interaction between objects of the original video when generating the synopsis video, the proposed method uses a new object clustering algorithm that effectively preserves the interactions between objects of the original video in the synopsis video. This paper also proposes an online optimization method that can efficiently summarize the large number of objects appearing in long videos. Finally, experimental results show that the performance of the proposed method is superior to that of existing video synopsis algorithms.
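
The core tube-shifting idea behind such synopsis methods can be sketched as below: each detected object is an activity tube, interacting objects are grouped, and each group is shifted in time as a unit so relative timing (and thus interaction) is preserved. This is a simplified illustration, not the paper's clustering or online optimization algorithm.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Tube:
    obj_id: int
    start: int      # first frame in the original video
    end: int        # last frame in the original video
    group: int      # objects that interact share a group id

def build_synopsis_schedule(tubes: List[Tube]) -> dict:
    """Assign each interaction group a new start frame, packed greedily from frame 0."""
    groups = {}
    for t in tubes:
        groups.setdefault(t.group, []).append(t)
    schedule, cursor = {}, 0
    for gid, members in sorted(groups.items()):
        g_start = min(m.start for m in members)
        g_len = max(m.end for m in members) - g_start
        for m in members:
            # shift every member by the same offset so relative timing (interaction) is kept
            schedule[m.obj_id] = cursor + (m.start - g_start)
        cursor += g_len + 1
    return schedule

tubes = [Tube(0, 100, 220, group=0), Tube(1, 150, 260, group=0), Tube(2, 900, 980, group=1)]
print(build_synopsis_schedule(tubes))   # {0: 0, 1: 50, 2: 161}
```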

DASH 기반의 다시점 비디오 서비스에서 시점전환 지연 최소화를 위한 비디오 전송 기법 (A Video Streaming Scheme for Minimizing Viewpoint Switching Delay in DASH-based Multi-view Video Services)

  • 김상욱;윤두열;정광수
    • 정보과학회 논문지 / Vol. 43, No. 5 / pp. 606-612 / 2016
  • A DASH (Dynamic Adaptive Streaming over HTTP) based multi-view video service switches the viewpoint to the video or object selected by the user from among multiple videos acquired by several cameras. However, existing DASH-based multi-view video services suffer from long viewpoint switching times, because when a viewpoint switching event occurs, the new video is played only after all buffered segments of the previous video have been consumed. In this paper, we propose a video streaming scheme for minimizing viewpoint switching delay in DASH-based multi-view video services. To minimize the switching delay, the proposed scheme additionally composes a switching video by adjusting the GoP (Group of Pictures) size, and controls the client buffer based on bandwidth estimation and playback buffer occupancy. Experimental results confirm that the proposed scheme reduces the viewpoint switching delay.
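
A minimal client-side sketch of the switching behavior described above is given below, assuming a hypothetical short-GoP "switching" representation: on a viewpoint switch the client discards buffered segments of the old view and temporarily requests short-GoP segments, choosing the representation from buffer occupancy and estimated bandwidth. Thresholds and names are assumptions, not the paper's implementation.

```python
from collections import deque

SWITCHING_REP = "short_gop"   # hypothetical representation with a small GoP for fast tune-in
NORMAL_REP = "normal_gop"

class MultiViewDashClient:
    def __init__(self, buffer_target_sec=6.0):
        self.buffer = deque()            # entries: (view_id, representation, duration_sec)
        self.buffer_target = buffer_target_sec
        self.current_view = 0

    def buffered_seconds(self):
        return sum(d for _, _, d in self.buffer)

    def on_viewpoint_switch(self, new_view):
        # Drop buffered segments of the old view instead of playing them out first.
        self.buffer = deque(s for s in self.buffer if s[0] == new_view)
        self.current_view = new_view

    def next_request(self, est_bandwidth_mbps):
        # Right after a switch the buffer is nearly empty: use the short-GoP representation.
        if self.buffered_seconds() < 2.0 or est_bandwidth_mbps < 2.0:
            rep = SWITCHING_REP
        else:
            rep = NORMAL_REP
        self.buffer.append((self.current_view, rep, 1.0))   # request a 1-second segment
        return (self.current_view, rep)

client = MultiViewDashClient()
client.on_viewpoint_switch(new_view=3)
print(client.next_request(est_bandwidth_mbps=8.0))   # short-GoP segment right after the switch
```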

Novel Intent based Dimension Reduction and Visual Features Semi-Supervised Learning for Automatic Visual Media Retrieval

  • Kunisetti, Subramanyam;Ravichandran, Suban
    • International Journal of Computer Science & Network Security / Vol. 22, No. 6 / pp. 230-240 / 2022
  • Sharing videos online via the Internet is an emerging and important concept in applications such as surveillance and mobile video search. There is therefore a need for a personalized web video retrieval system that can explore relevant videos and help people searching for videos related to specific big-data content. To this end, features with reduced dimensionality are computed from videos to capture discriminative aspects of the scene based on shape, histogram, texture, object annotation, coordinates, color, and contour data. Dimensionality reduction depends mainly on feature extraction and feature selection in multi-label retrieval from multimedia data. Many researchers have implemented techniques to reduce dimensionality based on the visual features of video data, but each technique has its own advantages and disadvantages for video retrieval. In this research, we present a Novel Intent based Dimension Reduction Semi-Supervised Learning Approach (NIDRSLA) that examines dimensionality reduction for exact and fast video retrieval based on different visual features. For dimensionality reduction, NIDRSLA learns the projection matrix by increasing the dependence between the enlarged data and the projected-space features. The proposed approach also addresses the aforementioned issue (i.e., segmentation of video with frame selection using low-level and high-level features) with efficient object annotation for video representation. Experiments performed on a synthetic dataset demonstrate the efficiency of the proposed approach compared with traditional state-of-the-art video retrieval methodologies.
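
As a generic stand-in for the retrieval pipeline sketched in the abstract (not NIDRSLA itself), the example below projects per-video visual descriptors to a low-dimensional space with plain PCA and answers queries by nearest-neighbor search; all dimensions and data are arbitrary.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
video_features = rng.normal(size=(500, 1024))    # 500 videos, 1024-dim visual descriptors

pca = PCA(n_components=64)                       # learn the projection matrix
reduced = pca.fit_transform(video_features)

index = NearestNeighbors(n_neighbors=5).fit(reduced)
query = pca.transform(rng.normal(size=(1, 1024)))  # project the query with the same matrix
distances, neighbors = index.kneighbors(query)
print(neighbors[0])                              # indices of the 5 most similar videos
```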

영상그래픽 직무에 따른 교과목운영의 사례분석 (Case Studies and Derivation of Course Profile in accordance with Video Graphics Job)

  • 박혜숙
    • 전기학회논문지P / Vol. 66, No. 3 / pp. 135-138 / 2017
  • This study analyzed, as case studies, a series of processes, from a job analysis survey and analysis of its results to academic achievement, in order to transform the existing curriculum into an NCS-based video broadcasting curriculum. The study also analyzed the existing curriculum and examined workforce trends and the needs of the broadcasting content industry. Through a needs analysis of industry, alumni, and students, video graphics, video editing, and video directing were selected as target jobs. This paper deals mainly with the video graphics job. Through job analysis, the competency units of modeling, animation, effects, and lighting were chosen, and on this basis two courses, Introduction of Graphics and Application of Graphics, were derived and their course profiles and performance criteria were selected. Students taking this NCS-based curriculum were then evaluated to investigate their job-related learning ability.

전기 안전 교육을 위한 모바일 에이전트 기반 비디오 검색 시스템 (A Video Retrieval System for Electric Safety Education based on Mobile Agent)

  • 조현섭;이근왕;김희숙
    • 대한전기학회:학술대회논문집 / 대한전기학회 2005년도 제36회 하계학술대회 논문집 D / pp. 2830-2832 / 2005
  • Recently, retrieval of various video data has become an important issue as more and more multimedia content services are being provided. To deal with video data effectively, a semantic-based retrieval scheme that allows diverse user queries to be processed and saved on the database is required. In this regard, this paper proposes a semantic-based video retrieval system that allows the user to search diverse meanings of video data for electrical safety-related educational purposes by means of automatic annotation processing. If the user inputs a keyword to search video data for electrical safety-related educational purposes, the mobile agent of the proposed system extracts the features of the video data, which are afterwards learned in a continuous manner, and detailed information on electrical safety education is saved on the database. The proposed system is designed to enhance video data retrieval efficiency for electrical safety-related educational purposes.
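
The keyword-driven lookup at the heart of such a system can be illustrated with a simple inverted index from annotation keywords to video segments, as sketched below; this is only an illustration of the annotation/retrieval idea, not the paper's mobile-agent system, and all data are made up.

```python
from collections import defaultdict

class AnnotationIndex:
    def __init__(self):
        self.index = defaultdict(set)          # keyword -> set of (video_id, segment_id)

    def annotate(self, video_id, segment_id, keywords):
        for kw in keywords:
            self.index[kw.lower()].add((video_id, segment_id))

    def search(self, query):
        return sorted(self.index.get(query.lower(), set()))

idx = AnnotationIndex()
idx.annotate("safety_101", 0, ["insulation", "grounding"])
idx.annotate("safety_101", 3, ["short circuit", "grounding"])
idx.annotate("lab_demo", 1, ["grounding"])
print(idx.search("grounding"))   # all segments annotated with the keyword "grounding"
```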


애니메이션을 이용한 전기 안전 교육용 모바일 에이전트 기반 비디오 검색 시스템 (A Video Retrieval System for Animation Using Electric Safety Education Based on Mobile Agent)

  • 조현섭;민진경;유인호
    • 한국산학기술학회:학술대회논문집 / 한국산학기술학회 2006년도 춘계학술발표논문집 / pp. 320-323 / 2006
  • Recently, retrieval of various video data has become an important issue as more and more multimedia content services are being provided. To deal with video data effectively, a semantic-based retrieval scheme that allows diverse user queries to be processed and saved on the database is required. In this regard, this paper proposes a semantic-based video retrieval system that allows the user to search diverse meanings of video data for electrical safety-related educational purposes by means of automatic annotation processing. If the user inputs a keyword to search video data for electrical safety-related educational purposes, the mobile agent of the proposed system extracts the features of the video data, which are afterwards learned in a continuous manner, and detailed information on electrical safety education is saved on the database. The proposed system is designed to enhance video data retrieval efficiency for electrical safety-related educational purposes.


Scalable Multi-view Video Coding based on HEVC

  • Lim, Woong;Nam, Junghak;Sim, Donggyu
    • IEIE Transactions on Smart Processing and Computing / Vol. 4, No. 6 / pp. 434-442 / 2015
  • In this paper, we propose an integrated spatial and view scalable video codec based on high efficiency video coding (HEVC). The proposed video codec is developed based on the similarities and unique aspects of the scalable extension and the 3D multi-view extension of HEVC. To improve compression efficiency with the proposed scalable multi-view video codec, inter-layer and inter-view predictions are jointly employed by using high-level syntax elements defined to identify view and layer information. For the inter-view and inter-layer predictions, a decoded picture buffer (DPB) management algorithm is also proposed. The inter-view and inter-layer motion predictions are integrated into a consolidated prediction by harmonizing them with the temporal motion prediction of HEVC. We found that the proposed scalable multi-view codec achieves bitrate reductions of 36.1%, 31.6%, and 15.8% compared with the ×2 parallel scalable codec, the ×1.5 parallel scalable codec, and the parallel multi-view codec, respectively.
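
The reference-picture selection idea can be illustrated as below: for a picture at a given (layer, view, POC), the decoded picture buffer is queried for temporal, inter-layer, and inter-view references, which are merged into one consolidated list. This is a conceptual sketch, not the HEVC extension syntax or the authors' DPB management algorithm.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class DecodedPicture:
    poc: int       # picture order count
    layer: int     # spatial scalability layer (0 = base)
    view: int      # view index

def build_reference_list(dpb: List[DecodedPicture], poc: int, layer: int, view: int):
    temporal = [p for p in dpb if p.layer == layer and p.view == view and p.poc < poc]
    inter_layer = [p for p in dpb if p.layer < layer and p.view == view and p.poc == poc]
    inter_view = [p for p in dpb if p.view != view and p.layer == layer and p.poc == poc]
    # temporal references first (closest POC first), then inter-layer, then inter-view
    return sorted(temporal, key=lambda p: poc - p.poc) + inter_layer + inter_view

dpb = [DecodedPicture(0, 1, 0), DecodedPicture(1, 1, 0), DecodedPicture(2, 0, 0),
       DecodedPicture(2, 1, 1)]
print(build_reference_list(dpb, poc=2, layer=1, view=0))
```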

저비트율 동영상 전송을 위한 움직임 기반 동영상 분할 (The Motion-Based Video Segmentation for Low Bit Rate Transmission)

  • 이범로;정진현
    • 한국정보처리학회논문지 / Vol. 6, No. 10 / pp. 2838-2844 / 1999
  • Motion-based video segmentation provides a powerful tool for video compression because it defines regions of similar motion, allowing a video compression system to describe motion video more efficiently. In this paper, we propose the Modified Fuzzy Competitive Learning Algorithm (MFCLA), which improves on the traditional K-means clustering algorithm, to implement motion-based video segmentation efficiently. Each segmented region is described with an affine model consisting of only six parameters. The affine model is estimated from the optical flow, which describes the motion of pixels between frames. This method can be applied to low bit rate video transmission, such as video conferencing systems.
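
The two ingredients mentioned in the abstract can be sketched as follows: a least-squares fit of the six-parameter affine motion model to optical-flow vectors, and clustering of pixels by motion to form regions. Standard k-means is used here as a stand-in for the paper's MFCLA, and the flow field is synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_affine(xy, flow):
    """Fit u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y by least squares."""
    A = np.column_stack([np.ones(len(xy)), xy[:, 0], xy[:, 1]])
    (a1, a2, a3), _, _, _ = np.linalg.lstsq(A, flow[:, 0], rcond=None)
    (a4, a5, a6), _, _, _ = np.linalg.lstsq(A, flow[:, 1], rcond=None)
    return np.array([a1, a2, a3, a4, a5, a6])

# Synthetic flow: the left half of the image translates right, the right half translates down.
ys, xs = np.mgrid[0:32, 0:32]
xy = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
flow = np.where(xy[:, [0]] < 16, np.array([2.0, 0.0]), np.array([0.0, 2.0]))

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(flow)  # motion-based regions
for k in range(2):
    print(f"region {k}: affine params =", np.round(fit_affine(xy[labels == k], flow[labels == k]), 2))
```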
