Lee, Y., Mengozzi, S., Zanatta, L., Bartolini, A., Acquaviva, A., Barchi, F. (2025). Bio-Inspired Drone Control: A Reinforcement Learning-Trained Spiking Neural Networks for Agile Navigation in Dynamic Environment. Institute of Electrical and Electronics Engineers Inc. doi: 10.1109/COINS65080.2025.11125776.
Bio-Inspired Drone Control: A Reinforcement Learning-Trained Spiking Neural Networks for Agile Navigation in Dynamic Environment
Mengozzi, Sebastiano; Zanatta, Luca; Bartolini, Andrea; Acquaviva, Andrea; Barchi, Francesco
2025
Abstract
Controlling quadrotors autonomously in dynamic environments requires agile and robust flight policies to ensure rapid adaptation to environmental changes. Deep Reinforcement Learning (DRL) has been shown to be an effective method for training Artificial Neural Network (ANN) policies, outperforming optimal control algorithms while being more resource-efficient. Spiking Neural Networks (SNNs), biologically inspired neural networks, present a promising approach by natively processing temporal data through discrete spikes. Unlike ANNs, this property allows SNNs to incorporate the temporal dimension even within a feed-forward architecture, which is crucial in dynamic environments. Moreover, SNNs can be efficiently executed on neuromorphic hardware accelerators, making them well-suited for deployment on resource-constrained computing platforms. In this work, we trained an agile flight SNN policy using the state-of-the-art DRL algorithm Proximal Policy Optimization (PPO). The flight policy maps the system states to low-level control commands sent to the quadrotor. With simulation experiments, we demonstrate that, compared to ANN-based policies, SNN-based ones achieve a 2.5% improvement in success rate, a 40% increase in average flight speed, and a 28.6% reduction in the time required to reach the target. Our results suggest that neuromorphic computing approaches can be beneficial for dynamic state-based problems, providing valuable insights for designing lightweight and efficient controllers in time-sensitive applications.
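As a rough illustration of the kind of policy the abstract describes (not the authors' implementation), a feed-forward SNN of leaky integrate-and-fire (LIF) neurons can map a state vector to normalized low-level commands by simulating the network for a few timesteps and decoding output spike rates. All layer sizes, time constants, the threshold, and the rate-decoding scheme below are illustrative assumptions:

```python
import numpy as np

class LIFLayer:
    """Leaky integrate-and-fire layer: the membrane potential decays each
    step, integrates the weighted input, and emits a spike when it crosses
    a threshold (illustrative parameters, not taken from the paper)."""
    def __init__(self, n_in, n_out, beta=0.9, threshold=1.0, rng=None):
        rng = rng or np.random.default_rng(0)
        self.w = rng.normal(0, 1 / np.sqrt(n_in), (n_in, n_out))
        self.beta, self.threshold = beta, threshold
        self.v = np.zeros(n_out)

    def step(self, x):
        self.v = self.beta * self.v + x @ self.w   # leaky integration
        spikes = (self.v >= self.threshold).astype(float)
        self.v -= spikes * self.threshold          # soft reset after a spike
        return spikes

def snn_policy(state, layers, timesteps=16):
    """Run the feed-forward SNN on a constant (current-coded) input for
    several timesteps and decode output spike rates as commands in [0, 1]."""
    for layer in layers:
        layer.v = np.zeros_like(layer.v)           # reset membrane state
    counts = 0.0
    for _ in range(timesteps):
        x = state                                  # direct current input
        for layer in layers:
            x = layer.step(x)
        counts = counts + x                        # accumulate output spikes
    return counts / timesteps                      # spike-rate decoding

# Example: a hypothetical 12-D quadrotor state -> 4 normalized motor commands
layers = [LIFLayer(12, 64), LIFLayer(64, 4)]
commands = snn_policy(np.random.default_rng(1).normal(size=12), layers)
```

In a DRL setup such as PPO, the spike threshold is non-differentiable, so training typically relies on surrogate gradients; the sketch above covers only the forward pass from state to command.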


