A Method for Accelerated Simulations of Reinforcement Learning Tasks of UAVs in AirSim

Alberto Musa, Luca Zanatta, Francesco Barchi, Andrea Bartolini, Andrea Acquaviva
2022

Abstract

Reinforcement Learning (RL) is widely used for training Unmanned Aerial Vehicles (UAVs) that rely on complex perception information (e.g., camera, lidar). The performance achievable with RL is limited by the time needed to learn through the agent's direct interaction with the environment. AirSim is a widely used simulator for autonomous UAV research, and its photorealism suits algorithms that use cameras to make or assist flight-control decisions. This work aims to reduce the RL training time by reducing the simulation time step. Doing so impairs simulation accuracy, so the impact on RL training must be quantitatively assessed. We characterise the impact of AirSim acceleration on RL training time and accuracy for an obstacle-avoidance task in a UAV application. We observed that with an AirSim acceleration factor of 5x, the RL task performance degrades by 95%. This degradation is due to the latencies present in the AirSim command chain. We overcome this limitation by proposing a new command approach that allows acceleration without performance degradation up to 10x. When the acceleration factor is pushed to the extreme (100x), the RL task performance degrades by 38%, with a measured speed-up of 15x.
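
The acceleration factor discussed in the abstract corresponds to AirSim's ClockSpeed setting in settings.json (e.g., "ClockSpeed": 5.0 for a 5x factor). To illustrate where the command-chain latency mentioned above enters an RL training loop, the following is a minimal Python sketch of an AirSim interaction loop; the action set, step duration, and random placeholder policy are illustrative assumptions and do not reproduce the authors' method.

# Minimal sketch of an AirSim RL interaction loop (illustrative, not the
# authors' implementation). Assumes the simulator is accelerated through the
# "ClockSpeed" entry in settings.json, e.g. "ClockSpeed": 5.0 for a 5x factor.
import airsim
import numpy as np

ACTIONS = [(2.0, 0.0), (2.0, 1.0), (2.0, -1.0)]  # candidate (vx, vy) in m/s (assumption)
STEP_DURATION = 0.5  # seconds of flight requested per command (assumption)

client = airsim.MultirotorClient()
client.confirmConnection()
client.enableApiControl(True)
client.armDisarm(True)
client.takeoffAsync().join()

for step in range(200):
    # Observation: front-camera image feeding the obstacle-avoidance policy.
    responses = client.simGetImages(
        [airsim.ImageRequest("0", airsim.ImageType.Scene, False, False)])
    obs = np.frombuffer(responses[0].image_data_uint8, dtype=np.uint8)

    # Placeholder policy: a trained agent would map `obs` to an action here.
    vx, vy = ACTIONS[np.random.randint(len(ACTIONS))]

    # Each command crosses the client/RPC/flight-controller chain; at high
    # ClockSpeed values this fixed wall-clock latency covers an increasing
    # share of simulated time, which is the degradation the paper measures.
    client.moveByVelocityAsync(vx, vy, 0.0, STEP_DURATION).join()

    if client.simGetCollisionInfo().has_collided:
        break  # episode ends on collision (reward bookkeeping omitted)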
SIMUL 2022, The Fourteenth International Conference on Advances in System Simulation, pp. 46-53, 2022.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/963214