Motion Depth Generation Using MHI for 3D Video Conversion

  • Kim, Won Hoi (Dept. of Computer and Communications Engineering, Kangwon National University) ;
  • Gil, Jong In (Dept. of Computer and Communications Engineering, Kangwon National University) ;
  • Choi, Changyeol (Dept. of Computer and Communications Engineering, Kangwon National University) ;
  • Kim, Manbae (Dept. of Computer and Communications Engineering, Kangwon National University)
  • Received : 2017.01.02
  • Accepted : 2017.05.25
  • Published : 2017.07.30

Abstract

2D-to-3D conversion technology has been studied over the past decades and integrated into commercial 3D displays and 3DTVs. Generally, depth cues extracted from a static image are used to generate a depth map, and DIBR (Depth Image Based Rendering) is then applied to produce a stereoscopic image. Motion is also an important cue for depth estimation and is conventionally estimated by block-based motion estimation, optical flow, and similar techniques. This paper proposes a new method for motion depth generation that uses the Motion History Image (MHI) instead of these conventional motion extraction approaches, and evaluates the feasibility of utilizing the MHI. In the experiments, the proposed method was applied to eight video clips covering a variety of motion classes. A qualitative evaluation of the generated motion depth maps, together with a comparison of processing times, validates the practical feasibility of the proposed method.
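
The paper itself is not reproduced on this page, but the MHI update rule of Bobick and Davis [10] that the method builds on is simple enough to sketch. The following Python/NumPy snippet is a minimal illustration, not the authors' implementation: it derives a motion mask by frame differencing (an assumed stand-in for whatever motion detection the paper uses), updates the motion history image, and linearly rescales it to an 8-bit motion depth map so that more recently moving pixels appear nearer. The function names, the history length tau, and the difference threshold are illustrative choices.

    import numpy as np

    def update_mhi(mhi, motion_mask, timestamp, tau):
        """Bobick-Davis update: moving pixels take the current timestamp,
        entries older than tau frames are cleared."""
        mhi[motion_mask] = timestamp
        mhi[(~motion_mask) & (mhi < timestamp - tau)] = 0
        return mhi

    def mhi_to_depth(mhi, timestamp, tau):
        """Linearly map the MHI to an 8-bit depth map (recent motion -> 255)."""
        depth = np.clip((mhi - (timestamp - tau)) / float(tau), 0.0, 1.0)
        return (depth * 255).astype(np.uint8)

    def motion_depth_sequence(frames, tau=15, diff_thresh=20):
        """frames: iterable of grayscale frames as 2D uint8 arrays (assumed input)."""
        prev, mhi = None, None
        for t, frame in enumerate(frames, start=1):
            f = frame.astype(np.float32)
            if prev is None:
                prev, mhi = f, np.zeros(f.shape, dtype=np.float32)
                yield np.zeros(f.shape, dtype=np.uint8)
                continue
            motion_mask = np.abs(f - prev) > diff_thresh  # simple frame differencing
            mhi = update_mhi(mhi, motion_mask, t, tau)
            prev = f
            yield mhi_to_depth(mhi, t, tau)

A motion depth map produced this way could then be passed to a DIBR stage, as described in the abstract, to synthesize the second view of the stereoscopic pair.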

Keywords

References

  1. S. Kim and J. Yoo, "3D conversion of 2D video using depth layer partition," Journal of Broadcast Engineering, Vol. 15, No. 2, Jan. 2011.
  2. J. Jung, J. Lee, I. Shin, J. Moon, and Y. Ho, "Improved depth perception of single view images," ECTI Trans. on Electrical Engineering, Electronics and Communications, Vol. 8, No. 2, Aug. 2010.
  3. S. Battiato, A. Capra, S. Curti, and M. La Cascia, "3D stereoscopic image pairs by depth-map generation," Proceedings of 3DPVT, 2004.
  4. L. Zhang and W. Tam, "Stereoscopic image generation based on depth images for 3DTV," IEEE Trans. on Broadcasting, Vol. 51, No. 2, June 2005.
  5. I. Ideses, L. Yaroslavsky, and B. Fishbain, "Real-time 2D to 3D video conversion," Journal of Real-Time Image Processing, Vol. 2, No. 1, pp. 2-9, 2007.
  6. J. Konrad, F. M. Wang, P. Ishwar, C. Wu, and D. Mukherjee, "Learning-based, automatic 2D-to-3D image and video conversion," IEEE Trans. Image Processing, Vol. 22, No. 9, Sep. 2013.
  7. D. Kim, D. Min, and K. Sohn, "A stereoscopic video generation method using stereoscopic display characterization and motion analysis," IEEE Trans. Broadcasting, Vol. 54, No. 2, June 2008.
  8. F. Xu, G. Er, X. Xie, and Q. Dai, "2D-to-3D conversion based on motion and color mergence," 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video, pp. 205-208, May 2008.
  9. L. Po, X. Xu, Y. Zhu, S. Zhang, K. Cheung, and C. Ting, "Automatic 2D-to-3D video conversion technique based on depth-from-motion and color segmentation," IEEE ICSP, 2010.
  10. A. Bobick and J. Davis, "The recognition of human movement using temporal templates," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 23, No. 3, Mar. 2001.
  11. T. Xiang and S. Gong, "Beyond tracking: modelling activity and understanding behaviour," Int. J. Computer Vision, Vol. 67, No. 1, pp. 21-51, 2006. https://doi.org/10.1007/s11263-006-4329-6
  12. D. Chen and J. Yang, "Exploiting high dimensional video features using layered Gaussian mixture models," Proc. IEEE ICPR, 2006.
  13. A. R. Ahad, J. Tan, H. Kim, and S. Ishikawa, "Motion history image: its variants and applications," Machine Vision and Applications, Oct. 2010.
  14. D. Tsai, M. Flagg, and J. M. Rehg, "Motion coherent tracking with multi-label MRF optimization," Proc. Brit. Mach. Vis. Conf., 2010.
  15. K. Fukuchi, K. Miyazato, A. Kimura, S. Takagi, and J. Yamato, "Saliency-based video segmentation with graph cuts and sequentially updated priors," Proc. IEEE Int. Conf. Multimedia Expo, pp. 638-641, June-July 2009.
  16. D. Baltieri, R. Vezzani, and R. Cucchiara, "3DPeS: 3D People Dataset for Surveillance and Forensics," Proc. of the 1st International ACM Workshop on Multimedia Access to 3D Human Objects, Scottsdale, Arizona, USA, pp. 59-64, Nov.-Dec. 2011. (http://imagelab.ing.unimore.it/visor/3dpes.asp)