• Title/Summary/Keyword: Scene labeling

Efficient 3D Scene Labeling using Object Detectors & Location Prior Maps (물체 탐지기와 위치 사전 확률 지도를 이용한 효율적인 3차원 장면 레이블링)

  • Kim, Joo-Hee;Kim, In-Cheol
    • Journal of Institute of Control, Robotics and Systems / v.21 no.11 / pp.996-1002 / 2015
  • In this paper, we present an effective system for the 3D scene labeling of objects from RGB-D videos. Our system uses a Markov Random Field (MRF) over a voxel representation of the 3D scene. In order to estimate the correct label of each voxel, the probabilistic graphical model integrates both scores from sliding window-based object detectors and also from object location prior maps. Both the object detectors and the location prior maps are pre-trained from manually labeled RGB-D images. Additionally, the model integrates the scores from considering the geometric constraints between adjacent voxels in the label estimation. We show excellent experimental results for the RGB-D Scenes Dataset built by the University of Washington, in which each indoor scene contains tabletop objects.
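
The labeling idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact potentials or the inference method, so the negative-log unary cost, the Potts-style smoothness penalty between adjacent voxels, and the use of iterated conditional modes (ICM) are all assumptions made for the example, as are the function names.

```python
import math


def unary_costs(detector_scores, prior_probs, eps=1e-9):
    """Cost of each label for one voxel (lower is better).

    Assumption: detector score and location prior are combined as a
    negative log of their product; the paper may weight them differently.
    """
    return [-(math.log(d + eps) + math.log(p + eps))
            for d, p in zip(detector_scores, prior_probs)]


def icm_labeling(unary, edges, smoothness=1.0, sweeps=10):
    """Approximate MRF labeling with iterated conditional modes.

    unary[v][l] is the cost of label l at voxel v; edges lists pairs of
    adjacent voxels. A Potts penalty (assumed) charges `smoothness` for
    every neighbor carrying a different label.
    """
    n_labels = len(unary[0])
    # Start from the label with the lowest unary cost at each voxel.
    labels = [min(range(n_labels), key=lambda l: u[l]) for u in unary]
    neighbors = [[] for _ in unary]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    for _ in range(sweeps):
        changed = False
        for v, u in enumerate(unary):
            def cost(l):
                disagree = sum(1 for n in neighbors[v] if labels[n] != l)
                return u[l] + smoothness * disagree
            best = min(range(n_labels), key=cost)
            if best != labels[v]:
                labels[v] = best
                changed = True
        if not changed:  # converged to a local minimum
            break
    return labels
```

With three voxels in a chain, where the middle voxel weakly prefers a different label from its neighbors, the smoothness term pulls it into agreement; setting `smoothness=0` leaves each voxel at its own best unary label.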

Panoramic Scene Reconstruction using SURF Algorithm and Homography (SURF 알고리즘과 호모그래피을 이용한 파노라마 영상 재구성)

  • Jang, Hyun-Woo;Park, Chang-Hill;Kim, Kwang-Beak
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.10a / pp.203-205 / 2010
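  • Existing methods for reconstructing panoramic images use labeling to compare objects and then combine the images, but this is time-consuming, and mismatches between objects that arise while labeling each image can make it impossible to combine the images accurately. To improve processing speed, this paper therefore labels only 1/3 of the whole image and then compares the objects to combine the images. When the angles differ, the SURF algorithm, which finds feature points, is applied: one center point is computed from the four corner points of the rectangle labeled in each image, and the two images are registered smoothly using homography. To evaluate the proposed panoramic image reconstruction method, we ran experiments on a variety of images and confirmed that it reconstructs images more effectively than the existing method. Processing speed was also improved.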

Aerial Scene Labeling Based on Convolutional Neural Networks (Convolutional Neural Networks기반 항공영상 영역분할 및 분류)

  • Na, Jong-Pil;Hwang, Seung-Jun;Park, Seung-Je;Baek, Joong-Hwan
    • Journal of Advanced Navigation Technology / v.19 no.6 / pp.484-491 / 2015
  • The volume of aerial imagery has grown rapidly with advances in digital optical imaging technology and the development of UAVs. Aerial images are used for ground-property extraction, classification, change detection, image fusion, and mapping. In image analysis in particular, deep learning algorithms have introduced a new paradigm that overcomes limitations of conventional pattern recognition. This paper explores the potential for wider application across various fields through the segmentation and classification of aerial scenes based on deep learning (ConvNet). We built a four-class image database (Road, Building, Yard, Forest) totaling 3,000 images. Because each class has a characteristic pattern, the resulting feature vector maps differ between classes. Our system consists of feature extraction, classification, and training. Feature extraction uses two ConvNet-based layers, and classification is then performed with a multilayer perceptron and logistic regression.

Multiple People Labeling and Tracking Using Stereo

  • Setiawan, Nurul Arif;Hong, Seok-Ju;Lee, Chil-Woo
    • 한국HCI학회:학술대회논문집 / 2007.02a / pp.630-635 / 2007
  • In this paper, we propose a system for multiple people tracking using fragment-based histogram matching. The appearance model is based on an IHLS color histogram, which can be calculated efficiently using an integral histogram representation. Since histograms lose all spatial information, we define a fragment-based region representation that retains spatial information and, by using disparity information, is robust against occlusion and scale changes. Multiple people labeling is maintained by creating an online appearance representation for each person detected in the scene and calculating a fragment vote map. Initialization is performed automatically from the background segmentation step.
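
The integral histogram mentioned in this abstract can be sketched as below. This is a generic illustration of the data structure, assuming pixels have already been quantized to bin indices; it does not reproduce the paper's IHLS color model or fragment layout, and the function names are hypothetical.

```python
def integral_histogram(bins, n_bins):
    """Build an integral histogram from a 2D grid of bin indices.

    ih[y][x] holds the histogram of all pixels in rows [0, y) and
    columns [0, x), so the table is one row and one column larger
    than the input grid.
    """
    h, w = len(bins), len(bins[0])
    ih = [[[0] * n_bins for _ in range(w + 1)] for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            cell = ih[y + 1][x + 1]
            for b in range(n_bins):
                # Inclusion-exclusion over the three already-filled cells.
                cell[b] = ih[y][x + 1][b] + ih[y + 1][x][b] - ih[y][x][b]
            cell[bins[y][x]] += 1
    return ih


def region_histogram(ih, top, left, bottom, right):
    """Histogram of the half-open region rows [top, bottom), cols [left, right).

    Costs four table lookups per bin, independent of the region size.
    """
    n_bins = len(ih[0][0])
    return [ih[bottom][right][b] - ih[top][right][b]
            - ih[bottom][left][b] + ih[top][left][b]
            for b in range(n_bins)]
```

Once the table is built, the histogram of any rectangular fragment is available in time proportional to the number of bins, which is what makes comparing many candidate fragments per frame affordable.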

Panoramic Image Reconstruction using SURF Algorithm (SURF 알고리즘을 이용한 파노라마 영상 재구성)

  • Kim, Kwang-Baek
    • Journal of the Korea Society of Computer and Information / v.18 no.4 / pp.13-18 / 2013
  • Panoramic photography produces an elongated image by horizontally connecting multiple partly overlapping images taken while rotating and moving the camera. For hand-held photographs, however, it is difficult to align the overlapping parts because of tilted angles. An earlier study compared adjacent pictures using a labeling technique, but it was time-consuming and could not handle cases in which the angles did not match. In this paper, we propose a less time-consuming panoramic scene reconstruction method. Our method is also based on the labeling-and-comparing technique but labels only 1/3 of the whole image. Then, if the angles do not match, it finds feature points with the SURF algorithm and aligns them using homography. The efficacy of this method is verified experimentally on various images.
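
The homography step can be illustrated with a small sketch. It only shows how a 3x3 homography, once estimated, maps a point from one image into the other; estimating the matrix from SURF matches is not shown, and the function name and the example matrices are assumptions for illustration.

```python
def apply_homography(H, point):
    """Map an (x, y) point through a 3x3 homography given as nested lists.

    The point is lifted to homogeneous coordinates (x, y, 1), multiplied
    by H, and divided by the resulting third component.
    """
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (xh / w, yh / w)
```

A pure translation has the identity in its upper-left 2x2 block and the offset in the last column; a non-zero entry in the bottom row introduces the perspective division that lets a homography correct for a tilted camera.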

Automatic Detection of Highlights in Soccer videos based on analysis of scene structure (축구 동영상에서의 장면 구조 분석에 기반한 자동적인 하이라이트 장면 검출)

  • Park, Ki-Tae;Moon, Young-Shik
    • The KIPS Transactions:PartB / v.14B no.1 s.111 / pp.1-4 / 2007
  • In this paper, we propose an efficient scheme for automatically detecting highlight scenes in soccer videos. Highlights are defined as shooting scenes and goal scenes. Through the analysis of soccer videos, we observe that most highlight scenes are shown around the goal post area. We also observe that the TV camera zooms in on a soccer player or spectators after a highlight scene. Detection of highlight scenes in soccer videos consists of three steps. The first step is the extraction of the playing field using a statistical threshold. The second step is the detection of goal posts. In the final step, we detect zooming in on a soccer player or spectators by using connected component labeling of the non-playing field. To evaluate the performance of our method, precision and recall are computed. Experimental results show the effectiveness of the proposed method, with 95.2% precision and 85.4% recall.
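
Connected component labeling, used in the final step above, can be sketched as follows. This is a generic breadth-first version on a binary mask with 4-connectivity; the abstract does not say which connectivity or algorithm the authors used, so both are assumptions, and the function name is hypothetical.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected regions of truthy cells in a 2D mask.

    Returns (labels, count): labels has the same shape as mask, with 0
    for background and 1..count for each connected region.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or labels[sy][sx]:
                continue  # background, or already part of a region
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

Applied to a non-playing-field mask, the size of each labeled region could then be compared across frames to decide whether a player or the crowd fills the view, although the paper's actual decision rule is not given in the abstract.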

Geometric and Semantic Improvement for Unbiased Scene Graph Generation

  • Ruhui Zhang;Pengcheng Xu;Kang Kang;You Yang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.10 / pp.2643-2657 / 2023
  • Scene graphs are structured representations that can clearly convey objects and the relationships between them, but are often heavily biased due to the highly skewed, long-tailed relational labeling in the dataset. Indeed, the visual world itself and its descriptions are biased. Therefore, Unbiased Scene Graph Generation (USGG) prefers to train models to eliminate long-tail effects as much as possible, rather than altering the dataset directly. To this end, we propose Geometric and Semantic Improvement (GSI) for USGG to mitigate this issue. First, to fully exploit the feature information in the images, geometric dimension and semantic dimension enhancement modules are designed. The geometric module is designed from the perspective that the position information between neighboring object pairs will affect each other, which can improve the recall rate of the overall relationship in the dataset. The semantic module further processes the embedded word vector, which can enhance the acquisition of semantic information. Then, to improve the recall rate of the tail data, the Class Balanced Seesaw Loss (CBSLoss) is designed for the tail data. The recall rate of the prediction is improved by penalizing the body or tail relations that are judged incorrectly in the dataset. The experimental findings demonstrate that the GSI method performs better than mainstream models in terms of the mean Recall@K (mR@K) metric in three tasks. The long-tailed imbalance in the Visual Genome 150 (VG150) dataset is addressed better using the GSI method than by most of the existing methods.
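
The mean Recall@K metric named in this abstract averages recall over predicate classes rather than over all triples, so rare (tail) predicates count as much as frequent ones. The sketch below is a simplified single-image version written for illustration; benchmark implementations accumulate hits over a whole test set, and the function name and triple format are assumptions.

```python
from collections import defaultdict


def mean_recall_at_k(ground_truth, ranked_predictions, k):
    """Mean over predicate classes of the recall within the top-k predictions.

    ground_truth and ranked_predictions are lists of
    (subject, predicate, object) triples; predictions are sorted by
    descending confidence.
    """
    top_k = set(ranked_predictions[:k])
    hits = defaultdict(int)
    totals = defaultdict(int)
    for triple in ground_truth:
        predicate = triple[1]
        totals[predicate] += 1
        if triple in top_k:
            hits[predicate] += 1
    # Average the per-predicate recalls so each class has equal weight.
    return sum(hits[p] / totals[p] for p in totals) / len(totals)
```

For example, if two of three ground-truth triples are recovered and the missed one belongs to a frequent predicate, plain recall is 2/3 while the class-averaged value is higher, because the rare predicate was fully recovered.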

Text Region Detection using Edge and Regional Minima/Maxima Transformation from Natural Scene Images (에지 및 국부적 최소/최대 변환을 이용한 자연 이미지로부터 텍스트 영역 검출)

  • Park, Jong-Cheon;Lee, Keun-Wang
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.2 / pp.358-363 / 2009
  • Text region detection from natural scene images is used in a variety of applications, and much research is still needed in this field. Recent methods detect text regions by combining edge-based and connected component-based algorithms. This paper therefore proposes a text region detection method for natural scene images that uses edges and a regional minima/maxima transformation. The method detects the connected components of the edges and of the regional minima/maxima and labels them. It then analyzes the labeled regions to detect text candidate regions and combines the detected candidates into a single text candidate image. The final text regions are validated by comparing the similarity and adjacency of individual characters. Experimental results show that the proposed algorithm improves the accuracy of text region detection by combining edge and regional minima/maxima connected component detection.

Edge-based range image segmentation method using pseudo reflectance images (의사 밝기 영상을 이용한 에지 기반형 거리 영상 분할)

  • 송호근;김태은;최종수
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.4 / pp.111-123 / 1996
  • In this paper, a new edge-based segmentation algorithm for range images using pseudo reflectance images (PRIs) is proposed. A model of pseudo reflectance, which is useful in analyzing three-dimensional scenes and objects, is introduced, and three PRIs are generated by the model. To generate the three PRIs, Besl and Jain's differential window operator is selected and three different light source directions are determined. An edge image is extracted from each PRI, and the three edge images are fused by a logical OR to strengthen edge formation. The final segmentation results of the proposed algorithm are obtained after thinning, labeling, and correcting erroneous regions with the fused edge image. The good performance of edge detection and segmentation is confirmed via computer simulation with synthetic and real range images.

A Study on the Rotation Angle Estimation of HMD for the Tele-operated Vision System (원격 비전시스템을 위한 HMD의 방향각 측정 알고리즘에 관한 연구)

  • Ro, Young-Shick;Yoon, Seung-Jun;Kang, Hee-Jun;Suh, Young-Soo
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.3 / pp.605-613 / 2009
  • In this paper, we study real-time azimuth measurement of an HMD (Head Mounted Display) for controlling a tele-operated vision system on a mobile robot. In existing tele-operated vision systems, a joystick was used to control the pan-tilt unit of the remote camera system. To give the tele-operator a sense of presence, we used an HMD to display the remote scene, measured the rotation angle of the HMD in real time, and transmitted the measured rotation angles to the mobile robot controller to synchronize the pan-tilt angles of the remote camera with the HMD. We suggest an algorithm for real-time estimation of the HMD rotation angles using feature point extraction from a PC camera image. A simple experiment is conducted to demonstrate its feasibility.