Spampinato, L., Testi, E., Buratti, C., Marini, R. (2024). Leveraging Meta-DRL for UAV Trajectory Planning and Radio Resource Management. doi: 10.1109/PIMRC59610.2024.10817293
Leveraging Meta-DRL for UAV Trajectory Planning and Radio Resource Management
Spampinato L.; Testi E.; Buratti C.; Marini R.
2024
Abstract
Unmanned Aerial Vehicles (UAVs), functioning as Unmanned Aerial Base Stations (UABSs), hold considerable potential for augmenting vehicular network performance through on-demand enhanced radio coverage. A pivotal challenge lies in devising algorithms that efficiently optimize UABS trajectories under strict Radio Resource Management (RRM) constraints while discovering coverage gaps. This can be tackled using Deep Reinforcement Learning (DRL) models. However, their effectiveness relies on the relevance of the acquired knowledge to the current scenario, posing a challenge when the underlying dynamics or governing rules change. To address this issue, we propose a framework integrating a deep meta-learning algorithm to enhance the adaptability of our DRL-based trajectory design to newly encountered scenarios. Scenarios may differ in mobile users' movement profiles, UABS take-off zones, or service maps. Our numerical results demonstrate that an agent that leverages information from prior tasks achieves target performance in fewer episodes than a conventional DRL agent, while also ensuring superior long-term training proficiency.
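The abstract does not specify which meta-learning algorithm the framework integrates, so the sketch below is only a minimal illustration of the general idea of meta-learning over DRL tasks, assuming a Reptile-style outer loop in which each "task" stands in for a scenario with a different mobility profile, take-off zone, or service map. The toy grid dynamics, network sizes, and hyperparameters are hypothetical placeholders, not the authors' method.

```python
# Illustrative sketch only (not the paper's implementation): a Reptile-style
# meta-update over DRL "tasks", where each task is a hypothetical scenario
# that the meta-initialised policy must adapt to in a few inner-loop updates.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class Policy(nn.Module):
    """Tiny policy: maps a 2-D UABS position to a distribution over 4 moves."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 4))

    def forward(self, state):
        return F.softmax(self.net(state), dim=-1)

def sample_task():
    """Hypothetical task: a scenario-specific target cell on a unit grid,
    standing in for a new service map / take-off zone / mobility profile."""
    return torch.rand(2)

def episode_loss(policy, target, steps=20):
    """Roll out a short episode; return a REINFORCE loss (-return * log-probs)."""
    state = torch.zeros(2)
    moves = torch.tensor([[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]])
    log_probs, rewards = [], []
    for _ in range(steps):
        dist = torch.distributions.Categorical(policy(state))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        state = state + moves[action]
        rewards.append(-torch.norm(state - target))  # closer to target = higher reward
    ret = torch.stack(rewards).sum().detach()
    return -(ret * torch.stack(log_probs).sum())

meta_policy = Policy()
meta_lr, inner_lr, inner_steps = 0.1, 0.01, 5  # placeholder hyperparameters

for meta_iter in range(100):
    target = sample_task()                        # draw a new scenario
    adapted = copy.deepcopy(meta_policy)          # start from the meta-initialisation
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    for _ in range(inner_steps):                  # fast adaptation on the new task
        opt.zero_grad()
        episode_loss(adapted, target).backward()
        opt.step()
    # Reptile meta-update: move meta-parameters toward the task-adapted parameters
    with torch.no_grad():
        for p_meta, p_task in zip(meta_policy.parameters(), adapted.parameters()):
            p_meta += meta_lr * (p_task - p_meta)
```

Under this kind of scheme, the meta-initialisation is what lets the agent reach target performance on an unseen scenario in fewer episodes than training from scratch, which is the behaviour the abstract reports for the proposed framework.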