Extremum-Seeking Policy Iteration for Data-Driven LQR

Carnevale, Guido; Mimmo, Nicola; Notarstefano, Giuseppe
2024

Abstract

In this paper, we propose a data-driven strategy that iteratively computes the state-feedback gain matrix solving a Linear Quadratic Regulator (LQR) problem in a model-free fashion, i.e., with unknown system and cost matrices. We assume that, at each iteration, an oracle provides the LQR cost of the tentative policy, e.g., by running the system or a simulator. Using this information, we develop an Extremum-Seeking algorithm that iteratively refines the tentative solution without any additional knowledge of the system and cost models. Through a Lyapunov-based analysis that exploits averaging theory for time-varying systems, we show that the proposed algorithm converges exponentially to an arbitrarily small ball containing the optimal gain matrix. Numerical simulations corroborate the theoretical results.
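
As a rough illustration of the idea described in the abstract, the sketch below tunes a scalar feedback gain using only cost evaluations returned by an oracle. It is a generic sinusoidal-dither extremum-seeking loop with a washout filter, not the algorithm analyzed in the paper. The plant parameters `A` and `B`, the weights `Q` and `R`, the function names, and all tuning constants are assumptions chosen purely for illustration.

```python
import math

# Hypothetical scalar plant x+ = A x + B u and cost weights (assumed values).
# The tuning loop never reads these; only the oracle does.
A, B, Q, R = 0.5, 1.0, 1.0, 1.0


def cost_oracle(k, x0=1.0, horizon=100):
    """Truncated LQR cost of the policy u = -k x, obtained by simulation."""
    x, total = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        total += Q * x * x + R * u * u
        x = A * x + B * u
    return total


def extremum_seeking(k0, iters=2000, delta=0.05, omega=2 * math.pi / 10,
                     gamma=0.01, alpha=0.1):
    """Refine the gain using cost queries only (assumed tuning constants)."""
    k = k0
    eta = cost_oracle(k0)  # running estimate of the cost level (washout)
    for t in range(iters):
        s = math.sin(omega * t)                  # dither signal
        j = cost_oracle(k + delta * s)           # cost of the perturbed gain
        grad_est = (2.0 / delta) * s * (j - eta) # demodulated gradient estimate
        k -= gamma * grad_est                    # descent step on the estimate
        eta += alpha * (j - eta)                 # low-pass update of the level
    return k


def riccati_gain(iters=500):
    """Model-based reference gain via scalar Riccati recursion (check only)."""
    p = Q
    for _ in range(iters):
        p = Q + A * A * p - (A * B * p) ** 2 / (R + B * B * p)
    return A * B * p / (R + B * B * p)


k_es = extremum_seeking(k0=0.0)
k_star = riccati_gain()
print(f"extremum-seeking gain: {k_es:.3f}, Riccati gain: {k_star:.3f}")
```

The Riccati recursion uses the model and is included only to check the result; the extremum-seeking loop itself calls nothing but `cost_oracle`. The initial gain `k0=0.0` is assumed to be stabilizing, which holds here because the hypothetical plant is open-loop stable.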
Proceedings of the IEEE Conference on Decision and Control, 2024, pp. 6263-6267
Carnevale, G., Mimmo, N., Notarstefano, G. (2024). Extremum-Seeking Policy Iteration for Data-Driven LQR. Institute of Electrical and Electronics Engineers Inc. [10.1109/cdc56724.2024.10885851].

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1013619
Warning: the data displayed have not been validated by the university.
