• Title/Summary/Keyword: Key Object

Vergence Control of Binocular Stereoscopic Camera Using Disparity Information

  • Kwon, Ki-Chul;Lim, Young-Tae;Kim, Nam;Song, Young-Jun;Choi, Young-Soo
    • Journal of the Optical Society of Korea / v.13 no.3 / pp.379-385 / 2009
  • Vergence control of a binocular stereoscopic camera is the most essential factor in acquiring high-quality stereoscopic images. In this paper, we propose a vergence control method for a binocular stereoscopic camera that uses disparity information obtained by simple image processing, and we estimate the amount of vergence control with the Lagrange interpolation equation. Disparity information is extracted through image processing as follows: first, the key object in the left and right images is extracted by labeling the central area of each image; then a simple calculation yields the disparity value of that same key object between the labeled left and right images. The vergence control method uses this disparity information to keep the convergence distance of the left and right cameras equal to the distance of the key object. As the distance of the key object varies, applying the disparity computed from the captured left and right images to the quadratic Lagrange interpolation equation estimates the required amount of vergence control. Experiments on various key objects and convergence distances confirm that the proposed method simplifies stereoscopic camera vergence control.
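
To make the interpolation step concrete, here is a minimal sketch that fits a quadratic Lagrange polynomial through three hypothetical (disparity, vergence angle) calibration samples and evaluates it at a measured disparity. The sample values, units, and function name are assumptions for illustration, not taken from the paper.

```python
def lagrange_quadratic(x, xs, ys):
    """Evaluate the quadratic Lagrange polynomial through (xs[i], ys[i]) at x."""
    assert len(xs) == len(ys) == 3
    total = 0.0
    for i in range(3):
        term = ys[i]
        for j in range(3):
            if j != i:
                term *= (x - xs[j]) / (xs[i] - xs[j])
        total += term
    return total

# Hypothetical calibration samples: disparity (pixels) -> vergence angle (degrees).
disparities = [10.0, 40.0, 80.0]
angles = [1.2, 4.5, 8.1]

print(lagrange_quadratic(55.0, disparities, angles))  # estimated control quantity
```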

Study on a Robust Object Tracking Algorithm Based on Improved SURF Method with CamShift

  • Ahn, Hyochang;Shin, In-Kyoung
    • Journal of the Korea Society of Computer and Information / v.23 no.1 / pp.41-48 / 2018
  • Surveillance systems are now in wide use, and one of their key technologies is recognizing and tracking objects. To track a moving object robustly and efficiently in a complex environment, feature points must be extracted from the object of interest and used to track it. In this paper, we propose a method that tracks objects of interest in real time by eliminating unnecessary information from the object, generating feature-point descriptors from only the key feature points, and thereby reducing the computational complexity of object recognition. Experimental results show that the proposed method is faster and more robust than conventional methods and can track objects accurately in various environments.
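
The paper pairs SURF features with CamShift; a minimal OpenCV sketch of that combination follows. It assumes an opencv-contrib build with the non-free SURF module (cv2.ORB_create() is a free drop-in substitute), and the video path and initial ROI are placeholders.

```python
import cv2

cap = cv2.VideoCapture("video.mp4")    # placeholder input
ok, frame = cap.read()
x, y, w, h = 100, 100, 80, 80          # placeholder initial ROI
roi = frame[y:y + h, x:x + w]

# SURF keypoints inside the ROI (non-free contrib module).
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
gray_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
keypoints, descriptors = surf.detectAndCompute(gray_roi, None)

# Hue histogram of the ROI drives CamShift back-projection.
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

track_window = (x, y, w, h)
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    # CamShift adapts the window size and orientation every frame.
    rot_rect, track_window = cv2.CamShift(backproj, track_window, criteria)
```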

A Review of 3D Object Tracking Methods Using Deep Learning

  • Park, Hanhoon
    • Journal of the Institute of Convergence Signal Processing / v.22 no.1 / pp.30-37 / 2021
  • Accurate 3D object tracking from camera images is a key enabling technology for augmented reality applications. Motivated by the impressive success of convolutional neural networks (CNNs) in computer vision tasks such as image classification, object detection, and image segmentation, recent studies on 3D object tracking have focused on leveraging deep learning. In this paper, we review deep learning approaches to 3D object tracking, describe the key methods in the field, and discuss potential future research directions.

Tongue Image Segmentation via Thresholding and Gray Projection

  • Liu, Weixia;Hu, Jinmei;Li, Zuoyong;Zhang, Zuchang;Ma, Zhongli;Zhang, Daoqiang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.945-961 / 2019
  • Tongue diagnosis is one of the most important diagnostic methods in Traditional Chinese Medicine (TCM). Tongue image segmentation aims to extract the image object (i.e., the tongue body) and plays a key role in building an automated tongue diagnosis system. The task remains challenging because tongue appearance varies across individuals in size, shape, and color. This paper proposes a segmentation method that combines image thresholding, gray projection, and an active contour model (ACM). Specifically, an initial object region is first extracted by thresholding in the HSI (Hue, Saturation, Intensity) color space followed by morphological operations. A gray-projection technique is then used to determine the upper bound of the tongue root to refine the initial object region. Finally, the contour of the refined object region is smoothed by the ACM. Experimental results on a dataset of 100 color tongue images show that the proposed method obtains more accurate segmentation than other available state-of-the-art methods.
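
A minimal sketch of the first two stages (color thresholding plus morphology, then a row-wise gray projection) is given below. It substitutes OpenCV's HSV space for the paper's HSI, and every threshold value is an assumption for illustration.

```python
import cv2
import numpy as np

img = cv2.imread("tongue.png")               # placeholder input
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)   # HSV stands in for HSI here

# Stage 1: threshold reddish, saturated pixels (ranges are assumptions),
# then clean the mask with morphological opening and closing.
mask = cv2.inRange(hsv, (0, 40, 60), (25, 255, 255))
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (9, 9))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

# Stage 2: row-wise gray projection; the sharpest change in the profile
# approximates the upper bound of the tongue root.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
profile = (gray * (mask > 0)).sum(axis=1).astype(np.float64)
upper_bound = int(np.argmax(np.abs(np.diff(profile))))
mask[:upper_bound, :] = 0                    # discard rows above the root
```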

Content-based Video Information Retrieval and Streaming System using Viewpoint Invariant Regions

  • Park, Jong-an
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.2 no.1 / pp.43-50 / 2009
  • This paper addresses the need to retrieve the principal objects, characters, and scenes from a video in order to serve image-based queries. Movie frames are grouped into segments represented by 2D images called key frames. Regions within a key frame are marked as key objects according to their texture and shape. These key objects serve as a catalogue of regions to be searched for and matched in the rest of the movie using viewpoint-invariant region computation, yielding the location, size, and orientation of every occurrence of each object in the movie as a set of structures that together form a video profile. The profile records every frame in which each key object occurs, and this information can further ease streaming of objects at various network-dependent viewing qualities. The method thus provides an effective, compact profiling approach for automatic logging and retrieval through query by example (QBE), while addressing video streaming issues at the same time.
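
The video profile described above might be organized as in the hypothetical structure sketched below; all field names are illustrative, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class Occurrence:
    frame: int            # frame index where the key object appears
    x: float              # location of the matched region
    y: float
    size: float           # region scale
    orientation: float    # region orientation in radians

@dataclass
class KeyObjectProfile:
    object_id: int        # index into the key-object catalogue
    occurrences: list[Occurrence] = field(default_factory=list)

# A video profile maps each key object to every frame it occurs in;
# a query-by-example lookup consults exactly this mapping.
video_profile: dict[int, KeyObjectProfile] = {}
```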

Vehicle Detection in Aerial Images Based on Hyper Feature Map in Deep Convolutional Network

  • Shen, Jiaquan;Liu, Ningzhong;Sun, Han;Tao, Xiaoli;Li, Qiangyi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.1989-2011 / 2019
  • Vehicle detection in aerial images is an interesting and challenging research topic. Most traditional vehicle detection methods are based on sliding-window search, which extracts object features poorly and incurs heavy computational costs. Recent studies have shown that convolutional neural network algorithms have made significant progress in computer vision, especially Faster R-CNN. However, that algorithm mainly targets objects in natural scenes and is not well suited to detecting small objects in aerial views. In this paper, an accurate and effective vehicle detection algorithm based on Faster R-CNN is proposed. Our method fuses a hyper feature map network with Eltwise and Concat models, which is more conducive to extracting small-object features. Moreover, our model sets anchor boxes according to object size, which also effectively improves detection performance. We evaluate the detection performance of our method on the Munich dataset and on our own collected dataset, with improvements in accuracy and efficiency over other methods. Our model achieves a recall rate of 82.2% and an accuracy of 90.2% on the Munich dataset, exceeding the state-of-the-art methods by 2.5 and 1.3 percentage points, respectively.
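
As an illustration of the anchor-sizing idea, the sketch below generates Faster R-CNN-style anchors for one feature-map cell using small scales appropriate for aerial vehicles; the specific scales and ratios are assumptions, not the paper's values.

```python
import numpy as np

def make_anchors(base_size, scales, ratios):
    """Return (N, 4) anchors [x1, y1, x2, y2] centered on one cell.

    Each ratio is the width/height aspect of an anchor.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = base_size * s * np.sqrt(r)
            h = base_size * s / np.sqrt(r)
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors)

# Small scales suit vehicles that occupy few pixels in aerial imagery
# (values are illustrative).
anchors = make_anchors(base_size=16, scales=[0.5, 1.0, 2.0], ratios=[0.5, 1.0, 2.0])
print(anchors.shape)  # (9, 4)
```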

Object-Based Image Search Using Color and Texture Homogeneous Regions

  • 유헌우;장동식;서광규
    • Journal of Institute of Control, Robotics and Systems / v.8 no.6 / pp.455-461 / 2002
  • An object-based image retrieval method is addressed. A new image segmentation algorithm and a method for comparing segmented objects are proposed. For segmentation, color and texture features are extracted from each pixel of the image and used as inputs to VQ (vector quantization) clustering, which yields objects that are homogeneous in color and texture. In this procedure, colors are quantized into a few dominant colors for simple representation and efficient retrieval. For retrieval, two comparison schemes are proposed: comparing one query object against the multiple objects of a database image, and comparing multiple query objects against the multiple objects of a database image. For fast retrieval, dominant object colors are key-indexed in the database.
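
A minimal sketch of the segmentation step follows, using k-means (a standard vector-quantization scheme; the paper's exact VQ variant may differ) on per-pixel color features. The feature choice and cluster count are assumptions.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

img = cv2.imread("scene.png")                 # placeholder input
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)    # perceptually even color feature

# Per-pixel feature vectors; the paper additionally uses texture features.
features = lab.reshape(-1, 3).astype(np.float32)

# VQ clustering: a handful of codewords acts as the dominant colors.
k = 6                                         # illustrative cluster count
labels = KMeans(n_clusters=k, n_init=4).fit_predict(features)
segments = labels.reshape(img.shape[:2])      # homogeneous regions per pixel
```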

Dual-stream Co-enhanced Network for Unsupervised Video Object Segmentation

  • Hongliang Zhu;Hui Yin;Yanting Liu;Ning Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.4 / pp.938-958 / 2024
  • Unsupervised Video Object Segmentation (UVOS) is a highly challenging computer vision problem because no annotation of the target object in the test video is available. The main difficulty is to handle the complicated and changeable motion of the target object and the confusion caused by similar background objects in the video sequence. In this paper, we propose a novel deep Dual-stream Co-enhanced Network (DC-Net) for UVOS based on bidirectional motion-cue refinement and multi-level feature aggregation, which takes full advantage of motion cues and effectively integrates features from different levels to produce high-quality segmentation masks. DC-Net is a dual-stream architecture in which the two streams enhance each other. One is a motion stream with a Motion-cues Refine Module (MRM) that learns from bidirectional optical flow images and produces a fine-grained, complete, and distinctive motion saliency map; the other is an appearance stream with a Multi-level Feature Aggregation Module (MFAM) and a Context Attention Module (CAM) designed to integrate features from different levels effectively. Specifically, the motion saliency map from the motion stream is fused with each decoder stage of the appearance stream to improve segmentation, and in turn the segmentation loss of the appearance stream feeds back into the motion stream to enhance motion refinement. Experimental results on three datasets (Davis2016, VideoSD, SegTrack-v2) demonstrate that DC-Net achieves results comparable with state-of-the-art methods.
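
One simple way to realize the described saliency-decoder fusion is sketched below in PyTorch; the gating form and tensor sizes are assumptions, since the abstract does not specify the exact fusion operator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SaliencyFusion(nn.Module):
    """Fuse a 1-channel motion saliency map into one decoder stage's features."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feat, saliency):
        # Resize saliency to this decoder stage's spatial resolution.
        sal = F.interpolate(saliency, size=feat.shape[-2:],
                            mode="bilinear", align_corners=False)
        # Gate appearance features by motion saliency, keeping a residual path.
        return self.conv(feat * torch.sigmoid(sal) + feat)

feat = torch.randn(1, 64, 56, 56)        # decoder features (illustrative)
saliency = torch.randn(1, 1, 112, 112)   # motion saliency map (illustrative)
out = SaliencyFusion(64)(feat, saliency)
```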

Accurate Pig Detection for Video Monitoring Environment

  • Ahn, Hanse;Son, Seungwook;Yu, Seunghyun;Suh, Yooil;Son, Junhyung;Lee, Sejun;Chung, Yongwha;Park, Daihee
    • Journal of Korea Multimedia Society / v.24 no.7 / pp.890-902 / 2021
  • Although object detection accuracy on still images has improved significantly with advances in deep learning, object detection on video data remains challenging due to real-time requirements and the accuracy drop under occlusion. In this research, we propose a pig detection method for video monitoring environments. First, we estimate motion from video obtained by a tilted-down-view camera, based on the average size of each pig at each location in the training data, and extract key frames from this motion information. For each key frame, we then apply YOLO, which is known to have a superior trade-off between accuracy and execution speed among deep learning-based object detectors, to obtain pig bounding boxes. Finally, we merge the bounding boxes of consecutive key frames to reduce false positives and false negatives. Experiments with video data obtained from a pig farm confirmed that pigs can be detected with an accuracy of 97% at a processing speed of 37 fps.
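
The merging step could be implemented as sketched below: boxes from consecutive key frames that overlap strongly are averaged, which suppresses single-frame misses and spurious detections. The IoU threshold and averaging rule are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def merge_keyframe_boxes(prev_boxes, cur_boxes, thresh=0.5):
    """Average matched boxes across consecutive key frames; keep unmatched current ones."""
    merged, used = [], set()
    for p in prev_boxes:
        best, best_iou = None, thresh
        for i, c in enumerate(cur_boxes):
            overlap = iou(p, c)
            if i not in used and overlap > best_iou:
                best, best_iou = i, overlap
        if best is not None:
            used.add(best)
            c = cur_boxes[best]
            merged.append(tuple((pi + ci) / 2 for pi, ci in zip(p, c)))
    merged += [c for i, c in enumerate(cur_boxes) if i not in used]
    return merged
```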

Stereoscopic Video Conversion Based on Image Motion Classification and Key-Motion Detection from a Two-Dimensional Image Sequence

  • Lee, Kwan-Wook;Kim, Je-Dong;Kim, Man-Bae
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.10B / pp.1086-1092 / 2009
  • Stereoscopic conversion has been an important and challenging issue for many 3-D video applications. There are usually two stereoscopic conversion approaches: image motion-based conversion, which uses motion information, and object-based conversion, which partitions an image into moving or static foreground object(s) and background and then converts the foreground into a stereoscopic object. Moreover, when the input sequence is MPEG-1/2 compressed video, the motion data stored in the compressed bitstream are often unreliable, so image motion-based conversion may fail. To solve this problem, we introduce the notion of key-motion, a motion whose estimated or extracted motion information is most accurate. To deal with diverse motion types, a transform space built from motion vectors and color differences is introduced; a key-motion is determined in this transform space and its associated stereoscopic image is generated. Experimental results validate the effectiveness and robustness of the proposed method.
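
A minimal sketch of selecting a key-motion follows: each frame is scored in a two-axis transform space built from motion-vector magnitude and color difference, and the highest-scoring frame is chosen. The scoring function is an assumption, since the abstract does not define the selection criterion.

```python
import numpy as np

def key_motion_index(motion_vectors, color_diffs, alpha=0.5):
    """Pick the frame whose point in the (motion, color-difference) space scores highest.

    motion_vectors: (T, 2) mean motion vector per frame
    color_diffs:    (T,)   mean absolute color difference per frame
    """
    motion_mag = np.linalg.norm(motion_vectors, axis=1)
    # Normalize both axes of the transform space to [0, 1].
    m = motion_mag / (motion_mag.max() + 1e-9)
    c = color_diffs / (color_diffs.max() + 1e-9)
    # Illustrative reliability score: frames where motion magnitude and
    # color change agree are treated as the most trustworthy key-motions.
    score = alpha * m + (1 - alpha) * c
    return int(np.argmax(score))

mv = np.random.randn(30, 2)       # placeholder per-frame motion vectors
cd = np.abs(np.random.randn(30))  # placeholder per-frame color differences
print(key_motion_index(mv, cd))
```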