Acknowledgement
This work was supported by an Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government (22ZH1210, Fundamental media contents technologies for hyper-realistic media space).