PathGAN: Local path planning with attentive generative adversarial networks

  • Dooseop Choi (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
  • Seung-Jun Han (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
  • Kyoung-Wook Min (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
  • Jeongdan Choi (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
  • Received : 2021.06.10
  • Accepted : 2022.07.18
  • Published : 2022.12.10

Abstract

For autonomous driving without high-definition maps, we present a model capable of generating multiple plausible paths from egocentric images for autonomous vehicles. Our generative model comprises two neural networks: a feature extraction network (FEN) and a path generation network (PGN). The FEN extracts meaningful features from an egocentric image, whereas the PGN generates multiple paths from those features, given a driving intention and speed. To ensure that the generated paths are plausible and consistent with the intention, we introduce an attentive discriminator and train it with the PGN under a generative adversarial network framework. Furthermore, we devise an interaction model between the positions in the paths and the intentions hidden in the positions, and we design a novel PGN architecture that reflects this interaction model to improve the accuracy and diversity of the generated paths. Finally, we introduce ETRIDriving, a dataset for autonomous driving in which the recorded sensor data are labeled with discrete high-level driving actions, and we demonstrate the state-of-the-art performance of the proposed model on ETRIDriving in terms of accuracy and diversity.
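For orientation, the sketch below outlines the generator layout described above: a feature extraction network (FEN) that encodes an egocentric image, and a path generation network (PGN) that decodes a path conditioned on the image features, a discrete driving intention, and the vehicle speed, with multiple paths obtained by drawing different noise vectors. This is a minimal illustration in PyTorch; every module name, layer size, the number of intention classes, and the noise-based sampling scheme are assumptions made for exposition, not the authors' implementation, and the attentive discriminator used for adversarial training is omitted.

# Minimal sketch of the FEN/PGN generator layout (illustrative assumptions,
# not the authors' released code).
import torch
import torch.nn as nn

class FeatureExtractionNetwork(nn.Module):
    # Encodes an egocentric image into a compact feature vector.
    def __init__(self, feat_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, feat_dim)

    def forward(self, image):                              # image: (B, 3, H, W)
        return self.proj(self.backbone(image).flatten(1))  # (B, feat_dim)

class PathGenerationNetwork(nn.Module):
    # Decodes a sequence of 2-D positions conditioned on image features,
    # a discrete driving intention, the current speed, and a noise vector.
    def __init__(self, feat_dim=256, n_intentions=9, noise_dim=64,
                 hidden=128, path_len=30):
        super().__init__()
        self.path_len = path_len
        self.embed_intention = nn.Embedding(n_intentions, 32)
        self.fuse = nn.Linear(feat_dim + 32 + 1 + noise_dim, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)                   # (x, y) per time step

    def forward(self, feat, intention, speed, noise):
        cond = torch.cat([feat, self.embed_intention(intention),
                          speed.unsqueeze(1), noise], dim=1)
        h = torch.relu(self.fuse(cond)).unsqueeze(1)       # (B, 1, hidden)
        out, _ = self.rnn(h.repeat(1, self.path_len, 1))   # (B, T, hidden)
        return self.head(out)                              # (B, T, 2)

# Drawing several noise samples yields multiple plausible paths for one image.
fen, pgn = FeatureExtractionNetwork(), PathGenerationNetwork()
image = torch.randn(1, 3, 224, 224)            # dummy egocentric image
intention = torch.tensor([2])                  # hypothetical action index
speed = torch.tensor([8.3])                    # ego speed (assumed m/s)
feat = fen(image)
paths = [pgn(feat, intention, speed, torch.randn(1, 64)) for _ in range(6)]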

Acknowledgement

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2021-0-00891, Development of AI Service Integrated Framework for Autonomous Driving).
