In this work, we combine Curriculum Learning with Deep Reinforcement Learning to learn without any prior domain knowledge, an end-to-end competitive driving policy for the CARLA autonomous driving simulator. To our knowledge, we are the first to provide consistent results of our driving policy on all towns available in CARLA. Our approach divides the reinforcement learning phase into multiple stages of increasing difficulty, such that our agent is guided towards learning an increasingly better driving policy. The agent architecture comprises various neural networks that complements the main convolutional backbone, represented by a ShuffleNet V2. Further contributions are given by (i) the proposal of a novel value decomposition scheme for learning the value function in a stable way and (ii) an ad-hoc function for normalizing the growth in size of the gradients. We show both quantitative and qualitative results of the learned driving policy.
An End-to-End Curriculum Learning Approach for Autonomous Driving Scenarios / Anzalone, L; Barra, P; Barra, S; Castiglione, A; Nappi, M. - In: IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS. - ISSN 1524-9050. - ELETTRONICO. - 23:10(2022), pp. 19817-19826. [10.1109/TITS.2022.3160673]
An End-to-End Curriculum Learning Approach for Autonomous Driving Scenarios
Anzalone, L
Primo
Methodology
;
2022
Abstract
In this work, we combine Curriculum Learning with Deep Reinforcement Learning to learn without any prior domain knowledge, an end-to-end competitive driving policy for the CARLA autonomous driving simulator. To our knowledge, we are the first to provide consistent results of our driving policy on all towns available in CARLA. Our approach divides the reinforcement learning phase into multiple stages of increasing difficulty, such that our agent is guided towards learning an increasingly better driving policy. The agent architecture comprises various neural networks that complements the main convolutional backbone, represented by a ShuffleNet V2. Further contributions are given by (i) the proposal of a novel value decomposition scheme for learning the value function in a stable way and (ii) an ad-hoc function for normalizing the growth in size of the gradients. We show both quantitative and qualitative results of the learned driving policy.File | Dimensione | Formato | |
---|---|---|---|
Anzalone et al. - 2022 - An End-to-End Curriculum Learning Approach for Aut.pdf
accesso aperto
Tipo:
Versione (PDF) editoriale
Licenza:
Licenza per Accesso Aperto. Creative Commons Attribuzione (CCBY)
Dimensione
2.95 MB
Formato
Adobe PDF
|
2.95 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.