3D Visual Attention Model and its Application to No-reference Stereoscopic Video Quality Assessment

  • Kim, Donghyun (School of Electrical and Electronic Engineering, Yonsei University) ;
  • Sohn, Kwanghoon (School of Electrical and Electronic Engineering, Yonsei University)
  • Received: 2013.12.11
  • Accepted: 2014.03.26
  • Published: 2014.04.25

Abstract

As multimedia technologies develop, three-dimensional (3D) technologies are attracting increasing attention from researchers. In particular, video quality assessment (VQA) has become a critical issue in stereoscopic image/video processing applications. The human visual system (HVS) plays an important role in measuring stereoscopic video quality, yet existing VQA methods have done little to model it for stereoscopic video. We address this by proposing a 3D visual attention (3DVA) model that simulates the HVS for stereoscopic video by combining multiple perceptual stimuli such as depth, motion, color, intensity, and orientation contrast. We utilize the 3DVA model to pool quality scores over perceptually significant regions with severe quality degradation, and on this basis we propose a no-reference (NR) stereoscopic VQA (SVQA) method. We validated the proposed SVQA method using subjective test scores from our own experiments and from results reported by others. Our approach yields high correlation with the measured mean opinion score (MOS) as well as consistent performance under asymmetric coding conditions. Additionally, the 3DVA model is used to extract region-of-interest (ROI) information. Subjective evaluations of the extracted ROIs indicate that 3DVA-based ROI extraction outperforms the compared extraction methods that rely only on spatial and/or temporal features.
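
The abstract describes two computational steps: combining perceptual feature maps (depth, motion, color, intensity, and contrast) into a 3DVA map, and using that map to pool a per-pixel distortion measure over the most attended regions. The Python/NumPy sketch below illustrates the general idea only; the linear feature combination, the equal weights, and the function names are illustrative assumptions, not the authors' actual formulation.

    # Illustrative sketch only: attention-weighted pooling of a per-pixel
    # distortion map, in the spirit of the 3DVA-based pooling described in
    # the abstract. Feature maps, weights, and the linear combination are
    # assumed for demonstration.
    import numpy as np

    def normalize(m):
        # Scale a feature map to [0, 1]; a flat map becomes all zeros.
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    def attention_map(depth, motion, color, intensity, contrast,
                      weights=(0.2, 0.2, 0.2, 0.2, 0.2)):
        # Combine normalized perceptual feature maps into one attention map.
        maps = (depth, motion, color, intensity, contrast)
        combined = sum(w * normalize(m) for w, m in zip(weights, maps))
        return normalize(combined)

    def pooled_score(distortion_map, attention):
        # Heavily attended regions contribute more to the pooled score.
        return float((attention * distortion_map).sum() / (attention.sum() + 1e-12))

Under these assumptions, a per-frame score would come from calling pooled_score on each frame's distortion map and averaging over time; the paper's actual feature extraction, combination, and temporal pooling follow the authors' 3DVA formulation.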

Keywords

References

  1. H. Lee, "3D video and human factors," Journal of The Institute of Electronics Engineers of Korea, Vol. 37, no. 9, pp. 84-92, September 2010.
  2. Z. Wang and A. C. Bovik, "Mean squared error: Love it or leave it? A new look at signal fidelity measures," IEEE Signal Processing Magazine, Vol. 26, no. 1, pp. 98-117, January 2009.
  3. Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. on Image Processing, Vol. 13, no. 4, pp. 600-612, April 2004. https://doi.org/10.1109/TIP.2003.819861
  4. Z. Wang and X. Shang, "Spatial pooling strategies for perceptual image quality assessment," in Proc. of IEEE Int. Conf. on Image Processing, pp. 2945-2948, Atlanta, USA, October 2006.
  5. D. Min and K. Sohn, "Virtual view rendering for 2D/3D freeview video generation," Journal of The Institute of Electronics Engineers of Korea, Vol. 45, no. 4, pp. 22-31, July 2008.
  6. A. Benoit, P. Le Callet, P. Campisi and R. Cousseau, "Quality assessment of stereoscopic images," EURASIP Journal on Image and Video Processing, Vol. 2008, pp. 1-13, December 2008.
  7. M. Lambooij, W. IJsselsteijn, D. G. Bouwhuis and I. Heynderickx, "Evaluation of stereoscopic images: beyond 2D quality," IEEE Trans. on Broadcasting, Vol. 57, no. 2, pp. 432-444, June 2011. https://doi.org/10.1109/TBC.2011.2134590
  8. A. K. Moorthy, C. C. Su, A. Mittal and A. C. Bovik, "Subjective evaluation of stereoscopic image quality," Signal Processing: Image Communication, Vol. 28, no. 8, pp. 870-883, September 2012.
  9. Z. M. P. Sazzad, S. Yamanaka, Y. Kawayoke and Y. Horita, "Stereoscopic image quality prediction," in Proc. Int. Workshop on Quality of Multimedia Experience, pp. 180-185, San Diego, USA, July 2009.
  10. J. Seo, X. Liu, D. Kim and K. Sohn, "An objective video quality metric for compressed stereoscopic video," Circuits, Systems, and Signal Processing, Vol. 31, no. 3, pp. 1089-1107, July 2012. https://doi.org/10.1007/s00034-011-9369-7
  11. J. Yang, C. Hou, R. Xu and J. Lei, "New metric for stereo image quality assessment based on HVS," Int. Journal of Imaging Systems and Technology, Vol. 20, no. 4, pp. 301-307, November 2010. https://doi.org/10.1002/ima.20246
  12. L. Itti, C. Koch and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 20, no. 11, pp. 1254-1259, November 1998. https://doi.org/10.1109/34.730558
  13. D. Walther and C. Koch, "Modeling attention to salient proto-objects," Neural Networks, Vol. 19, no. 9, pp. 1395-1407, November 2006. https://doi.org/10.1016/j.neunet.2006.10.001
  14. U. Rajashekar, I. van der Linde, A. C. Bovik and L. K. Cormack, "GAFFE: A gaze-attentive fixation finding engine," IEEE Trans. on Image Processing, Vol. 17, no. 4, pp. 564-573, April 2008. https://doi.org/10.1109/TIP.2008.917218
  15. M. Barkowsky, J. Bialkowski, B. Eskofier, R. Bitto and A. Kaup, "Temporal trajectory aware video quality measure," IEEE Journal of Selected Topics in Signal Processing, Vol. 3, no. 2, pp. 266-279, April 2009. https://doi.org/10.1109/JSTSP.2009.2015375
  16. L. Itti and C. Koch, "Feature combination strategies for saliency-based visual attention systems," Journal of Electronic Imaging, Vol. 10, no. 1, pp. 161-169, January 2001. https://doi.org/10.1117/1.1333677
  17. A. Ninassi, O. Le Meur, P. Le Callet and D. Barba, "Considering temporal variations of spatial visual distortions in video quality assessment," IEEE Journal of Selected Topics in Signal Processing, Vol. 3, no. 2, pp. 253-265, April 2009. https://doi.org/10.1109/JSTSP.2009.2014806
  18. L. Itti and C. Koch, "Computational modeling of visual attention," Nature Reviews Neuroscience, Vol. 2, no. 3, pp. 194-203, March 2001. https://doi.org/10.1038/35058500
  19. W. Epstein and S. Rogers, Perception of Space and Motion, Academic Press, pp. 69-117, 1995.
  20. D. R. Proffitt, J. Stefanucci, T. Banton and W. Epstein, "The role of effort in perceiving distance," Psychological Science, Vol. 14, no. 2, pp. 106-112, March 2003. https://doi.org/10.1111/1467-9280.t01-1-01427
  21. R. Li, B. Zeng and M. L. Liou, "A new three-step search algorithm for block motion estimation," IEEE Trans. on Circuits and Systems for Video Technology, Vol. 4, no. 4, pp. 438-442, August 1994. https://doi.org/10.1109/76.313138
  22. J. Karathanasis, D. Kalivas and J. Vlontzos, "Disparity estimation using block matching and dynamic programming," in Proc. IEEE Int. Conf. on Electronics, Circuits, and Systems, Vol. 2, pp. 728-731, Rodos, Greece, October 1996.
  23. D. Min, S. Yea, Z. Arican and A. Vetro, "Disparity search range estimation: enforcing temporal consistency," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp. 2366-2369, Dallas, USA, March 2010.
  24. http://www.middlebury.edu/stereo
  25. S. Zeki, J. Watson, C. Lueck, K. Friston, C. Kennard and R. Frackowiak, "A direct demonstration of functional specialization in human visual cortex," The Journal of Neuroscience, Vol. 11, no. 3, pp. 641-649, March 1991.
  26. I. P. Howard and B. J. Rogers, Seeing in Depth, Oxford Univ. Press, 2008.
  27. S. J. Watt, K. Akeley, M. O. Ernst and M. S. Banks, "Focus cues affect perceived depth," Journal of Vision, Vol. 5, no. 10, pp. 834-862, December 2005. https://doi.org/10.1167/5.8.834
  28. D. Kim, S. Ryu and K. Sohn, "Depth perception and motion cue based 3D video quality assessment," in Proc. IEEE Int. Symposium on Broadband Multimedia Systems and Broadcasting, pp. 1-4, Seoul, Korea, June 2012.
  29. S. Ryu and K. Sohn, "No-reference sharpness metric based on inherent sharpness," Electronics Letters, Vol. 47, no. 21, pp. 1178-1180, October 2011. https://doi.org/10.1049/el.2011.2222
  30. S. Ryu and K. Sohn, "Blind blockiness measure based on marginal distribution of wavelet coefficient and saliency," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 1874-1878, Vancouver, Canada, May 2013.
  31. Video Quality Experts Group, "Final report from the video quality experts group on the validation of objective models of video quality assessment," 2000.
  32. http://3dtv.at/Movies/Heidelberg..en.aspx
  33. M. Domanski, T. Grajek, K. Klimaszewski, M. Kurc, O. Stankiewicz, J. Stankowski and K. Wegner, "Poznan multiview video test sequences and camera parameters," Tech. rep. MPEG/M17050, ISO/IEC JTC1/SC29/WG11.
  34. M. Domanski, T. Grajek, K. Klimaszewski, M. Kurc, O. Stankiewicz, J. Stankowski and K. Wegner, "Undo Dancer 3DV sequence for purposes of 3DV standardization," Tech. rep. MPEG/M17050, ISO/IEC JTC1/SC29/WG11.
  35. H. Schwarz, D. Marpe and T. Wiegand, "Description of exploration experiments in 3D video coding," Tech. rep. MPEG/N11274, ISO/IEC JTC1/SC29/WG11.
  36. A. Vetro, T. Wiegand and G. Sullivan, "Overview of the stereo and multiview video coding extensions of the H.264/MPEG-4 AVC standard," Proceedings of the IEEE, Vol. 99, no. 4, pp. 626-642, April 2011. https://doi.org/10.1109/JPROC.2010.2098830
  37. http://diml.yonsei.ac.kr/dimldb/stereo
  38. Methodology for Subjective Assessment of the Quality of Television Pictures, ITU-R Recommendation BT.500, Recommendation of the ITU, Radiocommunication Sector.
  39. Smart Eye, Smart-eye pro 5.5 user manual, Gothenburg, Sweden, 2009.