• Title/Summary/Keyword: Monocular Estimation Method

Deep Learning Based Monocular Depth Estimation: Survey

  • Lee, Chungkeun;Shim, Dongseok;Kim, H. Jin
    • Journal of Positioning, Navigation, and Timing / v.10 no.4 / pp.297-305 / 2021
  • Monocular depth estimation helps a robot understand its surrounding environment in 3D. Deep-learning-based monocular depth estimation in particular has been widely researched because it may overcome the scale ambiguity problem, a main issue in classical methods. These learning-based methods fall mainly into three categories: supervised, unsupervised, and semi-supervised learning. Supervised learning trains the network from dense ground-truth depth, unsupervised learning trains it from image sequences, and semi-supervised learning trains it from stereo images and sparse ground-truth depth. We describe the basics of each method and then explain recent research efforts to enhance depth estimation performance.

Monocular Camera based Real-Time Object Detection and Distance Estimation Using Deep Learning (딥러닝을 활용한 단안 카메라 기반 실시간 물체 검출 및 거리 추정)

  • Kim, Hyunwoo;Park, Sanghyun
    • The Journal of Korea Robotics Society / v.14 no.4 / pp.357-362 / 2019
  • This paper proposes a model and training method that detect objects and estimate their distances in real time from a monocular camera using deep learning. It uses the YOLOv2 model, which is applied in autonomous vehicles and robots because of its fast image processing speed. We modified the loss function and retrained the model so that YOLOv2 detects objects and estimates distances at the same time. The YOLOv2 loss function was extended with a term for learning the bounding-box values x, y, w, h and the distance value z alongside the classification losses. In addition, the distance term was multiplied by a weighting parameter to balance learning. We trained the model on object locations and classes recognized by the camera and on distance data measured by lidar, so that the model can estimate objects and distances from a monocular camera even when the vehicle is going uphill or downhill. To evaluate object detection and distance estimation, mAP (mean Average Precision) and adjusted R-squared were used, and performance was compared with previous research. In addition, to measure speed, we compared the FPS (frames per second) of the original YOLOv2 model with that of our model.
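
The idea of adding a weighted distance term to a detection loss can be sketched as follows. This is a minimal illustration under stated assumptions, not the loss from the paper: the function name, the dictionary layout, the plain squared-error form, and the weights `lambda_coord` and `lambda_dist` are all hypothetical, and the confidence and classification terms of YOLOv2 are omitted.

```python
def detection_distance_loss(pred, target, lambda_coord=5.0, lambda_dist=1.0):
    """Squared-error box loss plus a weighted distance term (illustrative only).

    pred and target are dicts with keys x, y, w, h (bounding box) and z (distance).
    lambda_dist is the assumed balancing weight applied to the distance term.
    """
    # Bounding-box regression term over x, y, w, h.
    box_loss = sum((pred[k] - target[k]) ** 2 for k in ("x", "y", "w", "h"))
    # Extra term for the distance value z, e.g. with lidar-measured targets.
    dist_loss = (pred["z"] - target["z"]) ** 2
    return lambda_coord * box_loss + lambda_dist * dist_loss
```

Lowering `lambda_dist` keeps a distance error measured in meters from dominating box errors measured in normalized image units, which is one plausible reading of "balance of learning".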

Localization of a Monocular Camera using a Feature-based Probabilistic Map (특징점 기반 확률 맵을 이용한 단일 카메라의 위치 추정방법)

  • Kim, Hyungjin;Lee, Donghwa;Oh, Taekjun;Myung, Hyun
    • Journal of Institute of Control, Robotics and Systems / v.21 no.4 / pp.367-371 / 2015
  • In this paper, a novel localization method for a monocular camera is proposed that uses a feature-based probabilistic map. The pose of a camera is generally estimated from 3D-to-2D correspondences between a 3D map and an image plane through the PnP algorithm. In the computer vision community, an accurate 3D map for camera pose estimation is generated by optimization over a large image dataset. In the robotics community, a camera pose is estimated by probabilistic approaches with few features. Such approaches therefore need an extra system, because the camera system alone cannot estimate the full state of the robot pose. We therefore propose an accurate localization method for a monocular camera that uses a probabilistic approach when the image dataset is insufficient, without any extra system. In our system, features from a probabilistic map are projected onto an image plane using linear approximation. By minimizing the Mahalanobis distance between the features projected from the probabilistic map and the features extracted from a query image, the accurate pose of the monocular camera is estimated from an initial pose obtained by the PnP algorithm. The proposed algorithm is demonstrated through simulations in a 3D space.
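
The cost that such a method minimizes can be sketched as a sum of squared Mahalanobis distances between observed and projected image features. This is only a sketch under assumptions: the function names are hypothetical, each feature is assumed to carry a 2x2 image-plane covariance, and the projection step and the optimizer itself are left out.

```python
def mahalanobis_sq(observed, projected, cov):
    """Squared Mahalanobis distance between two 2D image points.

    cov is the assumed 2x2 covariance of the projected feature, given as
    ((a, b), (c, d)).
    """
    dx = observed[0] - projected[0]
    dy = observed[1] - projected[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # r^T * inv(cov) * r, with inv(cov) = (1/det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def pose_cost(observations, projections, covariances):
    """Total cost for one candidate pose: sum over all matched features."""
    return sum(
        mahalanobis_sq(z, p, s)
        for z, p, s in zip(observations, projections, covariances)
    )
```

A pose refinement loop would start from the PnP pose, re-project the map features for each candidate pose, and keep the pose with the lowest `pose_cost`. Features with large covariance contribute less, which is what distinguishes this from a plain reprojection error.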

Improving Detection Range for Short Baseline Stereo Cameras Using Convolutional Neural Networks and Keypoint Matching (컨볼루션 뉴럴 네트워크와 키포인트 매칭을 이용한 짧은 베이스라인 스테레오 카메라의 거리 센싱 능력 향상)

  • Byungjae Park
    • Journal of Sensor Science and Technology / v.33 no.2 / pp.98-104 / 2024
  • This study proposes a method to overcome the limited detection range of short-baseline stereo cameras (SBSCs). The proposed method includes two steps: (1) predicting an unscaled initial depth using monocular depth estimation (MDE) and (2) adjusting the unscaled initial depth by a scale factor. The scale factor is computed by triangulating the sparse visual keypoints extracted from the left and right images of the SBSC. The proposed method allows the use of any pre-trained MDE model without the need for additional training or data collection, making it efficient even when considering the computational constraints of small platforms. Using an open dataset, the performance of the proposed method was demonstrated by comparing it with other conventional stereo-based depth estimation methods.
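
The two-step idea (unscaled monocular depth, then a scale factor from triangulated stereo keypoints) can be sketched as below. The function names are hypothetical, the standard rectified-stereo relation Z = f·B/d is used for triangulation, and taking the median of the per-keypoint ratios is an assumption made here for robustness; the abstract does not say how the scale factor is aggregated.

```python
from statistics import median


def triangulate_depth(focal_px, baseline_m, disparity_px):
    """Metric depth of a keypoint from rectified stereo: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def estimate_scale(unscaled_depths, metric_depths):
    """Scale factor mapping unscaled MDE depth to metric depth.

    Both lists are sampled at the same sparse keypoints. The median of the
    ratios is an assumed aggregation choice, not taken from the paper.
    """
    ratios = [m / u for u, m in zip(unscaled_depths, metric_depths) if u > 0 and m > 0]
    if not ratios:
        raise ValueError("no valid keypoints to estimate the scale")
    return median(ratios)


def rescale(depth_map, scale):
    """Apply the scale factor to every pixel of the unscaled depth map."""
    return [[scale * d for d in row] for row in depth_map]
```

Because the scale is computed after inference, any pre-trained MDE model can be plugged in without retraining, which matches the claim in the abstract.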

Novel Backprojection Method for Monocular Head Pose Estimation

  • Ju, Kun;Shin, Bok-Suk;Klette, Reinhard
    • International Journal of Fuzzy Logic and Intelligent Systems / v.13 no.1 / pp.50-58 / 2013
  • Estimating a driver's head pose is an important task in driver-assistance systems because it can provide information about where a driver is looking, thereby giving useful cues about the status of the driver (e.g., paying proper attention or fatigued). This study proposes a system for estimating the head pose using monocular images, which includes a novel use of backprojection. The system can use a single image to estimate a driver's head pose at a particular time stamp, or an image sequence to support the analysis of a driver's status. Using our proposed system, we compared two previous pose estimation approaches. We introduced an approach for providing ground-truth reference data using a mannequin model. Our experimental results demonstrate that the proposed system provides relatively accurate estimates of the yaw, tilt, and roll angles. The results also show that, within our proposed system, one of the pose estimation approaches (perspective-n-point, PnP) provided a consistently better estimate than the other (pose from orthography and scaling with iterations, POSIT).

Single Image Depth Estimation With Integration of Parametric Learning and Non-Parametric Sampling

  • Jung, Hyungjoo;Sohn, Kwanghoon
    • Journal of Korea Multimedia Society / v.19 no.9 / pp.1659-1668 / 2016
  • Understanding the 3D structure of scenes is of great interest in various vision-related tasks. In this paper, we present a unified approach for estimating depth from a single monocular image. The key idea of our approach is to take advantage of both parametric learning and non-parametric sampling. Using a parametric convolutional network, our approach learns the relations among various monocular cues to make a coarse global prediction. We also leverage a local prediction, estimated in a non-parametric framework, to refine the global prediction. The local and global predictions are integrated by concatenating the feature maps of the global prediction with those of the local ones. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods both qualitatively and quantitatively.

AdaMM-DepthNet: Unsupervised Adaptive Depth Estimation Guided by Min and Max Depth Priors for Monocular Images

  • Bello, Juan Luis Gonzalez;Kim, Munchurl
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.252-255 / 2020
  • Unsupervised deep learning methods have shown impressive results for the challenging monocular depth estimation task, a field of study that has gained attention in recent years. A common approach for this task is to train a deep convolutional neural network (DCNN) via an image synthesis sub-task, where additional views are used during training to minimize a photometric reconstruction error. Previous unsupervised depth estimation networks are trained within a fixed depth estimation range, irrespective of the possible range for a given image, leading to suboptimal estimates. To overcome this limitation, we first propose an unsupervised adaptive depth estimation method guided by minimum and maximum (min-max) depth priors for a given input image. Incorporating min-max depth priors can drastically reduce the depth estimation complexity and produce depth estimates with higher accuracy. Moreover, we propose a novel network architecture for adaptive depth estimation, called AdaMM-DepthNet, which performs the min-max depth estimation at its front end. Extensive experimental results demonstrate that adaptive depth estimation can significantly boost accuracy with fewer parameters than conventional approaches that use a fixed minimum and maximum depth range.
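
To make the difference between a fixed and an adaptive depth range concrete, the sketch below maps a normalized network output onto a per-image range. It is an illustration only: the abstract does not specify how the priors enter the network, so the function name and the choice of interpolating linearly in inverse depth (a mapping commonly used in self-supervised depth networks) are assumptions.

```python
def scale_to_depth_range(s, d_min, d_max):
    """Map a normalized output s in [0, 1] to a depth in [d_min, d_max].

    Assumed mapping: linear interpolation in inverse depth, so s = 0 gives
    d_max and s = 1 gives d_min. Not taken from the paper.
    """
    if not 0.0 < d_min < d_max:
        raise ValueError("expected 0 < d_min < d_max")
    s = min(max(s, 0.0), 1.0)  # clamp to the valid output range
    inv_far = 1.0 / d_max
    inv_near = 1.0 / d_min
    return 1.0 / (inv_far + (inv_near - inv_far) * s)
```

With a fixed range such as 1 m to 100 m, the whole output interval must cover every scene. With per-image priors such as 2 m to 10 m, the same interval covers a much narrower band, so a given output resolution corresponds to finer depth steps.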

Unsupervised Monocular Depth Estimation Using Self-Attention for Autonomous Driving (자율주행을 위한 Self-Attention 기반 비지도 단안 카메라 영상 깊이 추정)

  • Seung-Jun Hwang;Sung-Jun Park;Joong-Hwan Baek
    • Journal of Advanced Navigation Technology / v.27 no.2 / pp.182-189 / 2023
  • Depth estimation is a key technology in 3D map generation for the autonomous driving of vehicles, robots, and drones. The existing sensor-based method has high accuracy but is expensive and has low resolution, while the camera-based method is more affordable and has higher resolution. In this study, we propose self-attention-based unsupervised monocular depth estimation for a UAV camera system. A self-attention operation is applied to the network to improve global feature extraction. In addition, we reduce the weight size of the self-attention operation to lower the computational cost. The estimated depth and camera pose are transformed into a point cloud, which is then mapped into a 3D map using an octree-structured occupancy grid. The proposed network is evaluated using synthesized images and depth sequences from the Mid-Air dataset. Our network demonstrates a 7.69% reduction in error compared to prior studies.
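
The step that turns an estimated depth map and camera pose into a point cloud can be sketched with the standard pinhole back-projection. The function name, the nested-list depth map, and the camera-to-world pose convention (`rotation` as a 3x3 matrix, `translation` as a 3-vector) are assumptions for illustration; the octree occupancy mapping is not shown.

```python
def backproject(depth, fx, fy, cx, cy, rotation, translation):
    """Convert a depth map into world-frame 3D points.

    depth[v][u] is the depth at pixel column u, row v. fx, fy, cx, cy are
    pinhole intrinsics. rotation and translation are an assumed
    camera-to-world pose.
    """
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d <= 0:
                continue  # skip pixels without a valid depth
            # Pinhole model: pixel (u, v) at depth d in the camera frame.
            cam = ((u - cx) * d / fx, (v - cy) * d / fy, d)
            # Rigid transform into the world frame: R * p + t.
            world = tuple(
                sum(rotation[i][j] * cam[j] for j in range(3)) + translation[i]
                for i in range(3)
            )
            points.append(world)
    return points
```

Each returned point could then be inserted into an occupancy structure by quantizing its coordinates to the chosen voxel resolution.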

Study of Accomodation-lag using Monocular Estimation Method(MEM) (단안평가법(MEM)을 이용한 조절지체에 대한 연구)

  • Park, Eun-kyoo;Seo, Jung-Ick
    • Journal of the Korea society of information convergence / v.6 no.2 / pp.51-55 / 2013
  • Accommodation occurs in order to see near objects, and its characteristics differ from individual to individual. A difference also arises between theoretical and actual accommodation; this is the accommodative lag. Depth of focus directly affects the accommodative lag. Depth of focus is affected by refractive power and pupil size: it becomes deeper as the pupil becomes smaller and as the refractive power increases. The deeper the depth of focus, the larger the accommodative lag. In this paper, we studied the relationship between accommodative lag and refractive power. The Monocular Estimation Method (MEM) was used to measure the accommodative lag. In the MEM measurements, the accommodative lag tended to increase as the refractive power increased. The accommodative lag was measured at 0.51 D; it was 0.52 D for men and 0.49 D for women. By gender, the accommodative lag likewise tended to increase as the refractive power increased.

Estimation of Angular Acceleration By a Monocular Vision Sensor

  • Lim, Joonhoo;Kim, Hee Sung;Lee, Je Young;Choi, Kwang Ho;Kang, Sung Jin;Chun, Sebum;Lee, Hyung Keun
    • Journal of Positioning, Navigation, and Timing / v.3 no.1 / pp.1-10 / 2014
  • Recently, monitoring two-body ground vehicles that carry extremely hazardous materials has been considered one of the most important national issues. This issue incurs a large cost in terms of the national economy and social benefit. To monitor and respond to accidents promptly, an efficient methodology is required. GPS can be used for accident monitoring in most cases. However, it is widely known that GPS cannot provide sufficient continuity in urban canyons and tunnels. To compensate for this weakness of GPS, this paper proposes an accident monitoring method based on a monocular vision sensor. The proposed method estimates angular acceleration from a sequence of image frames captured by a monocular vision sensor. The possibility of using angular acceleration to determine the occurrence of accidents such as jackknifing and rollover is investigated. The feasibility of the proposed method is evaluated in an experiment based on actual measurements.