• Title/Summary/Keyword: 3D video system

Search Result 405, Processing Time 0.03 seconds

A Method for 3D Human Pose Estimation based on 2D Keypoint Detection using RGB-D information (RGB-D 정보를 이용한 2차원 키포인트 탐지 기반 3차원 인간 자세 추정 방법)

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • Journal of Internet Computing and Services
    • /
    • v.19 no.6
    • /
    • pp.41-51
    • /
    • 2018
  • Recently, in the field of video surveillance, deep learning based learning method is applied to intelligent video surveillance system, and various events such as crime, fire, and abnormal phenomenon can be robustly detected. However, since occlusion occurs due to the loss of 3d information generated by projecting the 3d real-world in 2d image, it is need to consider the occlusion problem in order to accurately detect the object and to estimate the pose. Therefore, in this paper, we detect moving objects by solving the occlusion problem of object detection process by adding depth information to existing RGB information. Then, using the convolution neural network in the detected region, the positions of the 14 keypoints of the human joint region can be predicted. Finally, in order to solve the self-occlusion problem occurring in the pose estimation process, the method for 3d human pose estimation is described by extending the range of estimation to the 3d space using the predicted result of 2d keypoint and the deep neural network. In the future, the result of 2d and 3d pose estimation of this research can be used as easy data for future human behavior recognition and contribute to the development of industrial technology.

A performance Evaluation and Development of 3D Endoscopic Imaging system

  • Song, Chul-Gyo;Kim, Kyeong-Seop;Kim, Nam-Gyun;Lee, Myoung-Ho
    • Journal of KIEE
    • /
    • v.10 no.1
    • /
    • pp.1-6
    • /
    • 2000
  • This paper represents the design of 3D endoscopic video system in order to improve visualization and enhance the ability of the surgeon to perform delicate endoscopic surgery. In comparison of the polarized and electric shutter-type stereo imaging system, The former is superior in terms of accuracy and performance speed for knot-tying and loop pass test. The result of experiments show that the proposed 3D endoscopy system has a wide viewing angle and zone which is necessary for multi-view and it has better image quality and stability of the optical performances than the electric shutter-type does.

  • PDF

Depth Images-based Human Detection, Tracking and Activity Recognition Using Spatiotemporal Features and Modified HMM

  • Kamal, Shaharyar;Jalal, Ahmad;Kim, Daijin
    • Journal of Electrical Engineering and Technology
    • /
    • v.11 no.6
    • /
    • pp.1857-1862
    • /
    • 2016
  • Human activity recognition using depth information is an emerging and challenging technology in computer vision due to its considerable attention by many practical applications such as smart home/office system, personal health care and 3D video games. This paper presents a novel framework of 3D human body detection, tracking and recognition from depth video sequences using spatiotemporal features and modified HMM. To detect human silhouette, raw depth data is examined to extract human silhouette by considering spatial continuity and constraints of human motion information. While, frame differentiation is used to track human movements. Features extraction mechanism consists of spatial depth shape features and temporal joints features are used to improve classification performance. Both of these features are fused together to recognize different activities using the modified hidden Markov model (M-HMM). The proposed approach is evaluated on two challenging depth video datasets. Moreover, our system has significant abilities to handle subject's body parts rotation and body parts missing which provide major contributions in human activity recognition.

MPEG-DASH based 3D Point Cloud Content Configuration Method (MPEG-DASH 기반 3차원 포인트 클라우드 콘텐츠 구성 방안)

  • Kim, Doohwan;Im, Jiheon;Kim, Kyuheon
    • Journal of Broadcast Engineering
    • /
    • v.24 no.4
    • /
    • pp.660-669
    • /
    • 2019
  • Recently, with the development of three-dimensional scanning devices and multi-dimensional array cameras, research is continuously conducted on techniques for handling three-dimensional data in application fields such as AR (Augmented Reality) / VR (Virtual Reality) and autonomous traveling. In particular, in the AR / VR field, content that expresses 3D video as point data has appeared, but this requires a larger amount of data than conventional 2D images. Therefore, in order to serve 3D point cloud content to users, various technological developments such as highly efficient encoding / decoding and storage, transfer, etc. are required. In this paper, V-PCC bit stream created using V-PCC encoder proposed in MPEG-I (MPEG-Immersive) V-PCC (Video based Point Cloud Compression) group, It is defined by the MPEG-DASH (Dynamic Adaptive Streaming over HTTP) standard, and provides to be composed of segments. Also, in order to provide the user with the information of the 3D coordinate system, the depth information parameter of the signaling message is additionally defined. Then, we design a verification platform to verify the technology proposed in this paper, and confirm it in terms of the algorithm of the proposed technology.

Adaptive Multi-view Video Service Framework for Mobile Environments (이동 환경을 위한 적응형 다시점 비디오 서비스 프레임워크)

  • Kwon, Jun-Sup;Kim, Man-Bae;Choi, Chang-Yeol
    • Journal of Broadcast Engineering
    • /
    • v.13 no.5
    • /
    • pp.586-595
    • /
    • 2008
  • In this paper, we propose an adaptive multi-view video service framework suitable for mobile environments. The proposed framework generates intermediate views in near-realtime and overcomes the limitations of mobile services by adapting the multi-view video according to the processing capability of a mobile device as well as the user characteristics of a client. By implementing the most of adaptation processes at the server side, the load on a client can be reduced. H.264/AVC is adopted as a compression scheme. The framework could provide an interactive service with efficient video service to a mobile client. For this, we present a multi-view video DIA (Digital Item Adaptation) that adapts the multi-view video according to the MPEG-21 DIA multimedia framework. Experimental results show that our proposed system can support a frame rate of 13 fps for 320{\times}240 video and reduce the time of generating an intermediate view by 20 % compared with a conventional 3D projection method.

Development of a Portable Multi-sensor System for Geo-referenced Images and its Accuracy Evaluation (Geo-referenced 영상 획득을 위한 휴대용 멀티센서 시스템 구축 및 정확도 평가)

  • Lee, Ji-Hun;Choi, Kyoung-Ah;Lee, Im-Pyeong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.28 no.6
    • /
    • pp.637-643
    • /
    • 2010
  • In this study, we developed a Portable Multi-sensor System, which consists of a video camera, a GPS/MEMS IMU and a UMPC to acquire video images and position/attitude data. We performed image georeferencing based on the bundle adjustment without ground control points using the acquired data and then evaluated the effectiveness of our system through the accuracy verification. The experimental results showed that the RMSE of relative coordinates on the ground point coordinates obtained from our system was several centimeters. Our system can be efficiently utilized to obtain the 3D model of object and their relative coordinates. In future, we plan to improve the accuracy of absolute coordinates through the rigorous calibration of the system and camera.

Geocoding of the Free Stereo Mosaic Image Generated from Video Sequences (비디오 프레임 영상으로부터 제작된 자유 입체 모자이크 영상의 실좌표 등록)

  • Noh, Myoung-Jong;Cho, Woo-Sug;Park, Jun-Ku;Kim, Jung-Sub;Koh, Jin-Woo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.29 no.3
    • /
    • pp.249-255
    • /
    • 2011
  • The free-stereo mosaics image without GPS/INS and ground control data can be generated by using relative orientation parameters on the 3D model coordinate system. Its origin is located in one reference frame image. A 3D coordinate calculated by conjugate points on the free-stereo mosaic images is represented on the 3D model coordinate system. For determining 3D coordinate on the 3D absolute coordinate system utilizing conjugate points on the free-stereo mosaic images, transformation methodology is required for transforming 3D model coordinate into 3D absolute coordinate. Generally, the 3D similarity transformation is used for transforming each other 3D coordinates. Error of 3D model coordinates used in the free-stereo mosaic images is non-linearly increased according to distance from 3D model coordinate and origin point. For this reason, 3D model coordinates used in the free-stereo mosaic images are difficult to transform into 3D absolute coordinates by using linear transformation. Therefore, methodology for transforming nonlinear 3D model coordinate into 3D absolute coordinate is needed. Also methodology for resampling the free-stereo mosaic image to the geo-stereo mosaic image is needed for overlapping digital map on absolute coordinate and stereo mosaic images. In this paper, we propose a 3D non-linear transformation for converting 3D model coordinate in the free-stereo mosaic image to 3D absolute coordinate, and a 2D non-linear transformation based on 3D non-linear transformation converting the free-stereo mosaic image to the geo-stereo mosaic image.

Fast Extraction of Objects of Interest from Images with Low Depth of Field

  • Kim, Chang-Ick;Park, Jung-Woo;Lee, Jae-Ho;Hwang, Jenq-Neng
    • ETRI Journal
    • /
    • v.29 no.3
    • /
    • pp.353-362
    • /
    • 2007
  • In this paper, we propose a novel unsupervised video object extraction algorithm for individual images or image sequences with low depth of field (DOF). Low DOF is a popular photographic technique which enables the representation of the photographer's intention by giving a clear focus only on an object of interest (OOI). We first describe a fast and efficient scheme for extracting OOIs from individual low-DOF images and then extend it to deal with image sequences with low DOF in the next part. The basic algorithm unfolds into three modules. In the first module, a higher-order statistics map, which represents the spatial distribution of the high-frequency components, is obtained from an input low-DOF image. The second module locates the block-based OOI for further processing. Using the block-based OOI, the final OOI is obtained with pixel-level accuracy. We also present an algorithm to extend the extraction scheme to image sequences with low DOF. The proposed system does not require any user assistance to determine the initial OOI. This is possible due to the use of low-DOF images. The experimental results indicate that the proposed algorithm can serve as an effective tool for applications, such as 2D to 3D and photo-realistic video scene generation.

  • PDF

Realtime Facial Expression Data Tracking System using Color Information (컬러 정보를 이용한 실시간 표정 데이터 추적 시스템)

  • Lee, Yun-Jung;Kim, Young-Bong
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.7
    • /
    • pp.159-170
    • /
    • 2009
  • It is very important to extract the expression data and capture a face image from a video for online-based 3D face animation. In recently, there are many researches on vision-based approach that captures the expression of an actor in a video and applies them to 3D face model. In this paper, we propose an automatic data extraction system, which extracts and traces a face and expression data from realtime video inputs. The procedures of our system consist of three steps: face detection, face feature extraction, and face tracing. In face detection, we detect skin pixels using YCbCr skin color model and verifies the face area using Haar-based classifier. We use the brightness and color information for extracting the eyes and lips data related facial expression. We extract 10 feature points from eyes and lips area considering FAP defined in MPEG-4. Then, we trace the displacement of the extracted features from continuous frames using color probabilistic distribution model. The experiments showed that our system could trace the expression data to about 8fps.

Development of 3D Stereoscopic Image Generation System Using Real-time Preview Function in 3D Modeling Tools

  • Yun, Chang-Ok;Yun, Tae-Soo;Lee, Dong-Hoon
    • Journal of Korea Multimedia Society
    • /
    • v.11 no.6
    • /
    • pp.746-754
    • /
    • 2008
  • A 3D stereoscopic image is generated by interdigitating every scene with video editing tools that are rendered by two cameras' views in 3D modeling tools, like Autodesk MAX(R) and Autodesk MAYA(R). However, the depth of object from a static scene and the continuous stereo effect in the view of transformation, are not represented in a natural method. This is because after choosing the settings of arbitrary angle of convergence and the distance between the modeling and those two cameras, the user needs to render the view from both cameras. So, the user needs a process of controlling the camera's interval and rendering repetitively, which takes too much time. Therefore, in this paper, we will propose the 3D stereoscopic image editing system for solving such problems as well as exposing the system's inherent limitations. We can generate the view of two cameras and can confirm the stereo effect in real-time on 3D modeling tools. Then, we can intuitively determine immersion of 3D stereoscopic image in real-time, by using the 3D stereoscopic image preview function.

  • PDF