Reinforcement Learning (RL) is widely used for training Unmanned Aerial Vehicles (UAVs) on tasks involving complex perception information (e.g., camera, lidar). The performance achievable with RL is affected by the time needed to learn from the direct interaction of the agent with the environment. AirSim is a widely used simulator for autonomous UAV research, and its photorealism makes it suitable for algorithms that use cameras to make or assist flight control decisions. This work aims to reduce the RL time by reducing the simulation time step. This impairs simulation accuracy, so the impact on RL training must be quantitatively assessed. We characterise the impact of AirSim acceleration on RL training time and accuracy while performing an obstacle avoidance task in a UAV application. We observed that with an AirSim acceleration factor of 5x, the RL task performance degrades by 95%. The observed performance degradation is due to the latencies present in the AirSim command chain. We overcome this limitation by proposing a new command approach which allows acceleration up to 10x without performance degradation. When pushing the acceleration factor to the extreme (100x), the RL task performance degrades by 38% with a measured speed-up of 15x.
Musa, A., Zanatta, L., Barchi, F., Bartolini, A., & Acquaviva, A. (2022). A Method for Accelerated Simulations of Reinforcement Learning Tasks of UAVs in AirSim. OTH Regensburg: Frank Herrmann.
A Method for Accelerated Simulations of Reinforcement Learning Tasks of UAVs in AirSim
Alberto Musa;Luca Zanatta;Francesco Barchi;Andrea Bartolini;Andrea Acquaviva
2022