Acknowledgement
This work was supported by the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2021-0-00891, Development of AI Service Integrated Framework for Autonomous Driving).