http://dx.doi.org/10.4218/etrij.2021-0192

PathGAN: Local path planning with attentive generative adversarial networks  

Dooseop Choi (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
Seung-Jun Han (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
Kyoung-Wook Min (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
Jeongdan Choi (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
Publication Information
ETRI Journal / v.44, no.6, 2022, pp. 1004-1019
Abstract
For autonomous driving without high-definition maps, we present a model capable of generating multiple plausible paths from egocentric images for autonomous vehicles. Our generative model comprises two neural networks: a feature extraction network (FEN) and a path generation network (PGN). The FEN extracts meaningful features from an egocentric image, whereas the PGN generates multiple paths from these features, given a driving intention and speed. To ensure that the generated paths are plausible and consistent with the intention, we introduce an attentive discriminator and train it with the PGN under a generative adversarial network framework. Furthermore, we devise an interaction model between the positions in the paths and the intentions hidden in those positions, and we design a novel PGN architecture that reflects this interaction model to improve the accuracy and diversity of the generated paths. Finally, we introduce ETRIDriving, a dataset for autonomous driving in which the recorded sensor data are labeled with discrete high-level driving actions, and we demonstrate the state-of-the-art performance of the proposed model on ETRIDriving in terms of accuracy and diversity.
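
For illustration only, below is a minimal PyTorch sketch of the two-network generator and attentive discriminator described in the abstract. The module interfaces, layer sizes, attention form, and tensor shapes (image feature dimension, number of intentions, path horizon, noise dimension) are assumptions made for this sketch and do not reproduce the authors' implementation.

    import torch
    import torch.nn as nn

    class FEN(nn.Module):
        """Feature extraction network: egocentric image -> feature vector."""
        def __init__(self, feat_dim=512):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(),
                nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
        def forward(self, image):
            return self.backbone(image).flatten(1)          # (B, feat_dim)

    class PGN(nn.Module):
        """Path generation network: features + intention + speed + noise -> path."""
        def __init__(self, feat_dim=512, n_intentions=4, horizon=20, z_dim=32):
            super().__init__()
            self.horizon = horizon
            self.mlp = nn.Sequential(
                nn.Linear(feat_dim + n_intentions + 1 + z_dim, 256), nn.ReLU(),
                nn.Linear(256, horizon * 2),                 # (x, y) per future step
            )
        def forward(self, feat, intention_onehot, speed, z):
            h = torch.cat([feat, intention_onehot, speed, z], dim=1)
            return self.mlp(h).view(-1, self.horizon, 2)     # (B, horizon, 2)

    class AttentiveDiscriminator(nn.Module):
        """Scores a (path, intention, image feature) triple, pooling path steps by attention."""
        def __init__(self, feat_dim=512, n_intentions=4):
            super().__init__()
            self.attn = nn.Linear(2, 1)                      # per-step attention score
            self.cls = nn.Sequential(
                nn.Linear(feat_dim + n_intentions + 2, 128), nn.ReLU(),
                nn.Linear(128, 1),
            )
        def forward(self, path, intention_onehot, feat):
            w = torch.softmax(self.attn(path), dim=1)        # (B, T, 1) weights over steps
            pooled = (w * path).sum(dim=1)                   # attention-pooled path summary
            return self.cls(torch.cat([feat, intention_onehot, pooled], dim=1))

    # Sampling several noise vectors z for one image yields multiple plausible paths,
    # which is the multi-modal behaviour the abstract refers to.
    feat = FEN()(torch.randn(1, 3, 224, 224))
    paths = PGN()(feat, torch.eye(4)[:1], torch.tensor([[10.0]]), torch.randn(1, 32))
    score = AttentiveDiscriminator()(paths, torch.eye(4)[:1], feat)

In an adversarial training loop, the discriminator would be trained to separate recorded paths from generated ones conditioned on the intention, while the PGN is trained to fool it; the specific losses and the interaction model are described in the paper itself.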
Keywords
autonomous driving dataset; deep learning; generative adversarial networks; imitation learning; path planning;