PathGAN: Local path planning with attentive generative adversarial networks

  • Dooseop Choi (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
  • Seung-Jun Han (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
  • Kyoung-Wook Min (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
  • Jeongdan Choi (Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute)
  • Received : 2021.06.10
  • Accepted : 2022.07.18
  • Published : 2022.12.10

Abstract

For autonomous driving without high-definition maps, we present a model capable of generating multiple plausible paths from egocentric images for autonomous vehicles. Our generative model comprises two neural networks: a feature extraction network (FEN) and a path generation network (PGN). The FEN extracts meaningful features from an egocentric image, whereas the PGN generates multiple paths from those features, given a driving intention and speed. To ensure that the generated paths are plausible and consistent with the intention, we introduce an attentive discriminator and train it with the PGN under a generative adversarial network framework. Furthermore, we devise an interaction model between the positions in the paths and the intentions hidden in the positions, and we design a novel PGN architecture that reflects this interaction model to improve the accuracy and diversity of the generated paths. Finally, we introduce ETRIDriving, a dataset for autonomous driving in which the recorded sensor data are labeled with discrete high-level driving actions, and we demonstrate the state-of-the-art performance of the proposed model on ETRIDriving in terms of accuracy and diversity.
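For orientation, the sketch below outlines the generator layout described above: a feature extraction network (FEN) that encodes an egocentric image, and a path generation network (PGN) that decodes a path conditioned on the image features, a discrete driving intention, and the vehicle speed, with multiple paths obtained by drawing different noise vectors. This is a minimal illustration in PyTorch; every module name, layer size, the number of intention classes, and the noise-based sampling scheme are assumptions made for exposition, not the authors' implementation, and the attentive discriminator used for adversarial training is omitted.

# Minimal sketch of the FEN/PGN generator layout (illustrative assumptions,
# not the authors' released code).
import torch
import torch.nn as nn

class FeatureExtractionNetwork(nn.Module):
    # Encodes an egocentric image into a compact feature vector.
    def __init__(self, feat_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, feat_dim)

    def forward(self, image):                              # image: (B, 3, H, W)
        return self.proj(self.backbone(image).flatten(1))  # (B, feat_dim)

class PathGenerationNetwork(nn.Module):
    # Decodes a sequence of 2-D positions conditioned on image features,
    # a discrete driving intention, the current speed, and a noise vector.
    def __init__(self, feat_dim=256, n_intentions=9, noise_dim=64,
                 hidden=128, path_len=30):
        super().__init__()
        self.path_len = path_len
        self.embed_intention = nn.Embedding(n_intentions, 32)
        self.fuse = nn.Linear(feat_dim + 32 + 1 + noise_dim, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)                   # (x, y) per time step

    def forward(self, feat, intention, speed, noise):
        cond = torch.cat([feat, self.embed_intention(intention),
                          speed.unsqueeze(1), noise], dim=1)
        h = torch.relu(self.fuse(cond)).unsqueeze(1)       # (B, 1, hidden)
        out, _ = self.rnn(h.repeat(1, self.path_len, 1))   # (B, T, hidden)
        return self.head(out)                              # (B, T, 2)

# Drawing several noise samples yields multiple plausible paths for one image.
fen, pgn = FeatureExtractionNetwork(), PathGenerationNetwork()
image = torch.randn(1, 3, 224, 224)            # dummy egocentric image
intention = torch.tensor([2])                  # hypothetical action index
speed = torch.tensor([8.3])                    # ego speed (assumed m/s)
feat = fen(image)
paths = [pgn(feat, intention, speed, torch.randn(1, 64)) for _ in range(6)]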

Acknowledgement

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2021-0-00891, Development of AI Service Integrated Framework for Autonomous Driving).
