• Title/Summary/Keyword: frame-based indexing (프레임 기반 색인)

Design of Vision-based Interaction Tool for 3D Interaction in Desktop Environment (데스크탑 환경에서의 3차원 상호작용을 위한 비전기반 인터랙션 도구의 설계)

  • Choi, Yoo-Joo;Rhee, Seon-Min;You, Hyo-Sun;Roh, Young-Sub
    • The KIPS Transactions:PartB / v.15B no.5 / pp.421-434 / 2008
  • As computer graphics, virtual reality, and augmented reality technologies have advanced, many applications built on them require interaction in 3D space, such as selecting and manipulating a 3D object. In this paper, we propose a framework for vision-based 3D interaction that simulates the functions of an expensive 3D mouse in a desktop environment. The proposed framework includes a specially manufactured interaction device that uses a three-color LED. By recognizing the position and color of the LED in video sequences, the framework supports various mouse events and 6-DOF interaction. Because the proposed device is more intuitive and easier to use than an existing 3D mouse, which is expensive and requires skilled manipulation, it can be used without additional learning or training. We explain how to build the pointing device with a three-color LED, which is one component of the proposed framework, how to calculate the 3D position and orientation of the pointer, and how to analyze the color of the LED from video sequences. We verify the accuracy and usefulness of the proposed device by reporting the measured error of the 3D position and orientation.

Novel Kernel Design for Implementing Volume Rendering in the PyCUDA Framework (PyCUDA 프레임워크에서 볼륨 렌더링을 구현하기 위한 새로운 커널 디자인)

  • Lee, SooHo;Kim, Jong-Hyun
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.349-351 / 2022
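  • This paper introduces a Python-based CUDA (Compute Unified Device Architecture) kernel design for implementing computationally expensive volume rendering. Because Python is now used not only in artificial intelligence but also in servers, security, GUIs, data visualization, big-data processing, and many other fields, it has long since shed its image as a language only for interfaces. In this paper, we design kernels in a Python environment using NVIDIA's CUDA, a massively parallel processing technique, and show results in which computationally expensive volume rendering is computed quickly. As a result, we show through volume rendering that GPU (Graphics Processing Unit)-based applications can be developed not only with C-based CUDA but also in the Python environment, where development is comparatively efficient.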
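
The abstract above does not describe the kernel itself. As a rough illustration of the per-ray work that a volume rendering kernel typically performs, the following is a minimal CPU sketch of front-to-back alpha compositing in plain Python. It is not the authors' PyCUDA kernel: the function name `composite_ray`, the `(intensity, opacity)` sample format, and the `cutoff` value are assumptions made for illustration.

```python
def composite_ray(samples, cutoff=0.99):
    """Front-to-back alpha compositing along a single ray.

    `samples` is a list of (intensity, opacity) pairs ordered from the
    sample nearest the viewer to the farthest one (assumed format).
    Returns the accumulated (color, alpha) for the ray.
    """
    color = 0.0
    alpha = 0.0
    for intensity, opacity in samples:
        # Only the light not yet blocked by nearer samples contributes.
        weight = (1.0 - alpha) * opacity
        color += weight * intensity
        alpha += weight
        # Early ray termination: stop once the ray is nearly opaque.
        if alpha >= cutoff:
            break
    return color, alpha


print(composite_ray([(1.0, 0.5), (0.5, 0.5)]))  # prints (0.625, 0.75)
```

On a GPU, each thread would typically run a loop like this for one pixel's ray, which is what makes the workload a natural fit for CUDA.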

A Design and Implementation of algorithm choosing Context-based Image used Multimedia Communication (멀티미디어 통신을 이용한 내용기반 이미지 추출 알고리즘 설계 및 구현)

  • 안병규
    • Journal of the Korea Computer Industry Society / v.2 no.11 / pp.1421-1426 / 2001
  • Nowadays, as the quantity of multimedia information increases rapidly, efficient management of multimedia has become more important. In this paper, to index and search multimedia content efficiently, we design an algorithm that searches for a specific image and saves the extracted image using a content-based semantic information extraction scheme, which is one of the schemes for indexing and searching video data. After the RGB information is extracted from an input image, all frames of the video are inspected sequentially, and the specific image is saved by referring to the position and distribution of content obtained from the RGB-range collection scheme. With the proposed image extraction algorithm, search time can be reduced because only the saved video is searched instead of the whole video.
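
The sequential frame inspection described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: frames are nested lists of `(R, G, B)` tuples, a frame is kept when a minimum fraction of its pixels falls inside a given RGB range, and the names `fraction_in_range`, `select_frames`, and `min_fraction` are hypothetical. The position information mentioned in the abstract is not modeled here.

```python
def fraction_in_range(frame, lower, upper):
    """Fraction of pixels whose R, G, and B values all lie in [lower, upper]."""
    pixels = [pixel for row in frame for pixel in row]
    if not pixels:
        return 0.0
    inside = sum(
        all(lo <= channel <= hi for channel, lo, hi in zip(pixel, lower, upper))
        for pixel in pixels
    )
    return inside / len(pixels)


def select_frames(frames, lower, upper, min_fraction=0.5):
    """Inspect frames in order and return the indices of frames to save."""
    return [
        index
        for index, frame in enumerate(frames)
        if fraction_in_range(frame, lower, upper) >= min_fraction
    ]


RED, BLUE = (220, 20, 20), (20, 20, 220)
frames = [
    [[RED, RED], [RED, BLUE]],     # 75% of pixels are reddish
    [[BLUE, BLUE], [BLUE, BLUE]],  # no reddish pixels
    [[RED, BLUE], [BLUE, BLUE]],   # 25% of pixels are reddish
]
print(select_frames(frames, lower=(200, 0, 0), upper=(255, 60, 60)))  # prints [0]
```

Searching only the saved indices afterwards, rather than every frame, is what shortens the search time.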

A new approach for overlay text detection from complex video scene (새로운 비디오 자막 영역 검출 기법)

  • Kim, Won-Jun;Kim, Chang-Ick
    • Journal of Broadcast Engineering / v.13 no.4 / pp.544-553 / 2008
  • With the development of video editing technology, overlay text is increasingly inserted into video content to give viewers a better visual understanding. Since inserted text can represent the content of a scene or the editor's intention well, it is useful for video information retrieval and indexing. Most previous approaches are based on low-level features, such as edge, color, and texture information. However, existing methods have difficulty handling text with varying contrast or text inserted over a complex background. In this paper, we propose a novel framework to localize the overlay text in a video scene. Based on our observation that transient colors exist between inserted text and its adjacent background, a transition map is generated. Candidate regions are then extracted using the transition map, and overlay text is finally determined based on the density of state in each candidate. The proposed method is robust to the color, size, position, style, and contrast of overlay text. It is also language independent. Text region updates between frames are also exploited to reduce the processing time. Experiments are performed on diverse videos to confirm the efficiency of the proposed method.

VILODE : A Real-Time Visual Loop Closure Detector Using Key Frames and Bag of Words (VILODE : 키 프레임 영상과 시각 단어들을 이용한 실시간 시각 루프 결합 탐지기)

  • Kim, Hyesuk;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.4 no.5 / pp.225-230 / 2015
  • In this paper, we propose an effective real-time visual loop closure detector, VILODE, which makes use of key frames and a bag of visual words (BoW) based on SURF feature points. To determine whether the camera has revisited a previously visited place, a loop closure detector has to compare each incoming image with all previous images collected at every visited place. As the camera passes through new places, the number of images to be compared keeps growing. For this reason, it is difficult for a visual loop closure detector to meet both the real-time constraint and high detection accuracy. To address this problem, the proposed system adopts an effective key frame selection strategy that selects and compares only distinct, meaningful images from those continuously arriving during navigation, which greatly reduces the number of image comparisons needed for loop detection. Moreover, to improve detection accuracy and efficiency, the system represents each key frame image as a bag of visual words and maintains indexes for them using the DBoW database system. Experiments with the TUM benchmark datasets demonstrate the high performance of the proposed visual loop closure detector.
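
The key frame idea above can be sketched as follows. This is a minimal illustration, not the VILODE implementation or the DBoW API: it assumes each image has already been quantized into a list of visual-word ids, compares word histograms with plain cosine similarity, and uses hypothetical function names and an assumed threshold of 0.8.

```python
from collections import Counter
from math import sqrt


def bow_histogram(word_ids):
    """Count how often each visual-word id occurs in one image."""
    return Counter(word_ids)


def cosine_similarity(h1, h2):
    """Cosine similarity between two sparse word histograms."""
    dot = sum(count * h2.get(word, 0) for word, count in h1.items())
    norm1 = sqrt(sum(c * c for c in h1.values()))
    norm2 = sqrt(sum(c * c for c in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


def select_key_frames(frames, threshold=0.8):
    """Keep a frame only if it differs enough from the last key frame.

    `frames` is a list of visual-word id lists (assumed input format).
    Returns the indices of the selected key frames.
    """
    key_indices = []
    last_hist = None
    for index, word_ids in enumerate(frames):
        hist = bow_histogram(word_ids)
        if last_hist is None or cosine_similarity(hist, last_hist) < threshold:
            key_indices.append(index)
            last_hist = hist
    return key_indices


frames = [
    [1, 1, 2, 3],  # first view of place A
    [1, 1, 2, 3],  # same view again, skipped
    [7, 8, 8, 9],  # new place B
    [7, 8, 8, 9],  # same view again, skipped
]
print(select_key_frames(frames))  # prints [0, 2]
```

Only the selected key frames would then need to be compared against an incoming image for loop detection, which is where the reduction in comparisons comes from.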

A Study on Shot Segmentation and Indexing of Language Education Videos by Content-based Visual Feature Analysis (교육용 어학 영상의 내용 기반 특징 분석에 의한 샷 구분 및 색인에 대한 연구)

  • Han, Heejun
    • Journal of the Korean Society for Information Management / v.34 no.1 / pp.219-239 / 2017
  • As IT develops rapidly and personal smart devices become widespread, video has become a particularly common medium for conveying information among audiovisual materials. Video has become an indispensable form of information service content and is used in various ways, such as one-way delivery through TV, interactive services over the Internet, and borrowing from audiovisual libraries. In the Internet environment in particular, information providers try to reduce the effort and cost of processing the information they provide for video services on smart devices. In addition, users want to use only the parts they need because of the burden of excessive network usage and constraints of time and space. It is therefore necessary to improve the usability of video by automatically classifying, summarizing, and indexing similar parts of the content. In this paper, we propose a method that automatically segments the shots making up a video by analyzing the content and characteristics of language education videos, and indexes the detailed content information of those videos by combining visual features. The accuracy of the semantic-based shot segmentation is high, and the method can be effectively applied to summary services for language education videos.

An analysis of Scene Change Detection using HEVC coding additional information (HEVC 부호화 부가정보를 이용한 장면전환 검출 연구)

  • Eom, Yumi;Park, Sangil;Chung, Chang Woo
    • Journal of Broadcast Engineering / v.20 no.6 / pp.871-879 / 2015
  • As the volume of content data grows, scene change detection is required for analysis, indexing, and editing. Although many researchers are studying a variety of scene change detection methods, it remains difficult to accurately detect scene changes in the presence of various camera movements. Also, earlier scene change detection methods take too much time to apply to UHD video content, because UHD video with 4K (3840x2160) resolution or higher contains a much greater amount of data. Therefore, a method for detecting scene changes using the next-generation codec, HEVC, is required. In this paper, we propose four scene change detection methods that use the additional coding information of HEVC, and a new pixel-based scene change detection system. Furthermore, through the experimental results, we examine the feasibility of detecting scene changes in UHD videos encoded in HEVC format.
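
The abstract does not specify how its pixel-based detector works. For orientation only, the sketch below shows one common pixel-based approach, a gray-level histogram difference between consecutive frames. It is a generic technique swapped in for illustration, not the paper's system and not one of its HEVC-based methods; the frame format (rows of 8-bit gray values), the function names, and the 0.5 threshold are assumptions.

```python
def gray_histogram(frame, bins=8):
    """Histogram of 8-bit gray values for a frame given as rows of ints."""
    hist = [0] * bins
    bin_width = 256 // bins
    for row in frame:
        for value in row:
            hist[min(value // bin_width, bins - 1)] += 1
    return hist


def histogram_difference(h1, h2):
    """Normalized L1 distance in [0, 1]; assumes equal pixel counts."""
    total = sum(h1)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * total)


def detect_scene_changes(frames, threshold=0.5):
    """Return indices of frames that start a new scene."""
    changes = []
    previous = None
    for index, frame in enumerate(frames):
        hist = gray_histogram(frame)
        if previous is not None and histogram_difference(previous, hist) > threshold:
            changes.append(index)
        previous = hist
    return changes


dark = [[10, 10], [10, 10]]
bright = [[200, 200], [200, 200]]
print(detect_scene_changes([dark, dark, bright, bright]))  # prints [2]
```

Because this approach touches every pixel of every frame, its cost grows with resolution, which is consistent with the abstract's motivation for using information already present in the HEVC bitstream.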

Object Tracking in HEVC Bitstreams (HEVC 스트림 상에서의 객체 추적 방법)

  • Park, Dongmin;Lee, Dongkyu;Oh, Seoung-Jun
    • Journal of Broadcast Engineering / v.20 no.3 / pp.449-463 / 2015
  • Video object tracking is important for a variety of applications, such as security, video indexing and retrieval, video surveillance, communication, and compression. This paper proposes an object tracking method for HEVC bitstreams. Without pixel reconstruction, the motion vectors (MVs) and prediction unit sizes in the bitstream are used in a Spatio-Temporal Markov Random Field (ST-MRF) model, which represents the spatial and temporal aspects of the object's motion. Coefficient-based object shape adjustment is proposed to solve the over-segmentation and error propagation problems that occur in other methods. In the experimental results, the proposed method achieves an average precision of 86.4%, recall of 79.8%, and F-measure of 81.1%. It improves the F-measure by up to 9% on results that the other method over-segments, although its average F-measure improvement over that method is only 0.2%. The total processing time is 5.4 ms per frame, allowing the algorithm to be used in real-time applications.

An Efficient MapReduce-based Skyline Query Processing Method with Two-level Grid Blocks (2-계층 그리드 블록을 이용한 효과적인 맵리듀스 기반 스카이라인 질의 처리 기법)

  • Ryu, Hyeongcheol;Jung, Sungwon
    • Journal of KIISE / v.44 no.6 / pp.613-620 / 2017
  • Skyline queries are used extensively to solve various problems, such as in decision-making, because they find data that meet a variety of user criteria. Recent research has focused on skyline queries by using the MapReduce framework for large database processing, mainly in terms of applying existing index structures to MapReduce. In a skyline, data closer to the origin dominate more area. However, the existing index structure does not reflect such characteristics of the skyline. In this paper, we propose a grid-block structure that groups grid cells to match the characteristics of a skyline, and a two-level grid-block structure that can be used even when there are no data close to the origin. We also propose an efficient skyline-query algorithm that uses the two-level grid-block structure.
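
The dominance relation behind a skyline query can be illustrated with a small sketch. This is a naive quadratic-time skyline in plain Python, assuming smaller values are better in every dimension (consistent with points near the origin dominating more area). It is not the paper's MapReduce or two-level grid-block algorithm, and the function names are chosen here for illustration.

```python
def dominates(p, q):
    """True if p is no worse than q in every dimension and strictly
    better in at least one (smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


points = [(1, 9), (3, 3), (4, 4), (9, 1), (5, 8)]
print(skyline(points))  # prints [(1, 9), (3, 3), (9, 1)]
```

Here `(3, 3)` lies close to the origin and eliminates both `(4, 4)` and `(5, 8)`, which is the property a grid-based index can exploit to prune whole cells at once.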

Automatic Classification Technique of Offence Patterns using Neural Networks in Soccer Game (뉴럴네트워크를 이용한 축구경기 공격패턴 자동분류에 관한 연구)

  • Kim, Hyun-Sook;Yoon, Ho-Sub;Hwang, Chong-Sun;Yang, Young-Kyu
    • Proceedings of the Korea Information Processing Society Conference / 2001.10a / pp.727-730 / 2001
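  • With the rapid development of multimedia environments, image processing technology has become firmly established in sports-related fields, alongside applications to the human body such as face recognition and gesture recognition. However, extracting and tracking the motions of moving players from input video is known to be one of the hard problems in computer vision research. In TV broadcasts of soccer games, automatic extraction (automatic indexing) of highlight scenes is the most condensed representation of a game: a summary that lets viewers grasp the entire game at a glance, its intensive actions, and its essence. Therefore, if the highlight scenes of a game such as soccer, in which many players (22 across both teams) become intricately entangled over a relatively long time (roughly 1 hour 30 minutes), can be captured and presented effectively, viewers watching on TV can grasp the progress of the game at a glance and enjoy it more, and those running the game (managers, coaches, players, and so on) can be given high-level, scientific information effectively, so that they can develop more advanced techniques and establish scientific game strategies. For the automatic indexing of highlight scenes in soccer, a kind of team sport, this paper develops and verifies a technique that uses neural networks to automatically classify offense patterns within group formations. In this study, the frequently changing scenes in the soccer stadium are automatically segmented and representative frames are selected; based on the position information of the players and the ball in the representative frames, the players' group formations during the game are tracked to analyze group behavior; and the BP (back-propagation) algorithm of neural networks is used to automatically recognize and classify soccer offense patterns, laying a foundation for the automatic extraction of soccer highlight scenes. The experiments used a total of 297 data items extracted from video of various offense patterns in the '98 France World Cup soccer games: 60 left-side attacks, 74 right-side attacks, 72 central attacks, 39 corner kicks, and 52 free kicks. The experimental results showed very good recognition rates: 91.7% for left-side attacks, 100% for right-side attacks, 87.5% for central attacks, 97.4% for corner kicks, and 75% for free kicks.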
