Structured-policy Q-learning: an LMI-based Design Strategy for Distributed Reinforcement Learning

Sforni L.;Camisa A.;Notarstefano G.
2022

Abstract

In this paper, we consider a Linear Quadratic optimal control problem under the assumptions that the system dynamics are unknown and that the designed feedback control has to comply with a desired sparsity pattern. An important application where this set-up arises is distributed control of network systems, where the aim is to find an optimal sparse controller matching the communication graph. To tackle the problem, we propose a Reinforcement Learning framework based on a Q-learning scheme that preserves a desired policy structure. At each time step, the performance of the current candidate feedback is first evaluated through the computation of its Q-function, and then a new sparse feedback matrix, improving on the previous one, is computed. We prove that the scheme produces at each iteration a stabilizing feedback control with the desired sparsity and with non-increasing cost, which in turn implies that every limit point of the computed feedback matrices is sparse and stabilizing. The algorithm is numerically tested on a distributed control scenario with a randomly generated graph and unstable dynamics.
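To convey the evaluate-then-improve structure described in the abstract, the sketch below implements a structured policy-iteration loop for a discrete-time LQ problem. It is a minimal, hypothetical illustration, not the paper's method: policy evaluation is done here with a model-based Lyapunov equation (the paper instead estimates the Q-function from data, since the dynamics are unknown), and the improvement step minimizes a convex surrogate (the trace of the closed-loop Q-function matrix) under the sparsity constraint via cvxpy, rather than the paper's LMI-based design. All symbols (A, B, Qx, R, K0, pattern) are illustrative assumptions.

```python
# Hypothetical sketch of a sparsity-preserving policy-iteration loop in the
# spirit of the scheme described above.  NOT the paper's LMI design: evaluation
# is model-based here, and improvement uses a generic convex surrogate.
import numpy as np
import cvxpy as cp
from scipy.linalg import solve_discrete_lyapunov


def q_function_matrices(A, B, Qx, R, K):
    """Blocks of the Q-function matrix of u = K x (A + B K assumed Schur stable)."""
    Acl = A + B @ K
    # Cost-to-go P_K solves the closed-loop Lyapunov equation
    #   P = Acl' P Acl + Qx + K' R K.
    P = solve_discrete_lyapunov(Acl.T, Qx + K.T @ R @ K)
    Hxx = Qx + A.T @ P @ A
    Hxu = A.T @ P @ B
    Huu = R + B.T @ P @ B
    return Hxx, Hxu, Huu


def structured_improvement(Hxx, Hxu, Huu, pattern):
    """Sparsity-constrained policy improvement (convex surrogate, not the paper's LMI)."""
    m, n = pattern.shape
    K = cp.Variable((m, n))
    L = np.linalg.cholesky(Huu)  # Huu = L L', so ||L' K||_F^2 = tr(K' Huu K)
    # Trace of the closed-loop Q-function matrix x -> Q_K(x, K x):
    #   tr(Hxx) + 2 tr(Hxu K) + tr(K' Huu K), convex in K since Huu > 0.
    objective = cp.trace(Hxx) + 2 * cp.trace(Hxu @ K) + cp.sum_squares(L.T @ K)
    constraints = [cp.multiply(1.0 - pattern, K) == 0]  # enforce the sparsity pattern
    cp.Problem(cp.Minimize(objective), constraints).solve()
    return K.value


def structured_policy_iteration(A, B, Qx, R, K0, pattern, iters=20):
    """Alternate Q-function evaluation and sparsity-preserving improvement."""
    K = K0.copy()
    for _ in range(iters):
        Hxx, Hxu, Huu = q_function_matrices(A, B, Qx, R, K)   # policy evaluation
        K = structured_improvement(Hxx, Hxu, Huu, pattern)    # policy improvement
    return K
```

In the paper, the improvement step is instead obtained from an LMI-based design that additionally guarantees closed-loop stability and non-increasing cost at every iteration; the surrogate above is only meant to illustrate how a desired sparsity pattern can be imposed on each new feedback matrix.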
Proceedings of the IEEE Conference on Decision and Control, 2022, pp. 4059-4064

Sforni L., Camisa A., Notarstefano G. (2022). Structured-policy Q-learning: an LMI-based Design Strategy for Distributed Reinforcement Learning. In Proceedings of the IEEE Conference on Decision and Control, pp. 4059-4064. Institute of Electrical and Electronics Engineers Inc. DOI: 10.1109/CDC51059.2022.9992584.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/970122