On-Policy Data-Driven Linear Quadratic Regulator via Combined Policy Iteration and Recursive Least Squares / Sforni L.; Carnevale G.; Notarnicola I.; Notarstefano G. - Electronic. - (2023), pp. 5047-5052. (Paper presented at the 62nd IEEE Conference on Decision and Control, CDC 2023, held in Singapore in 2023) [10.1109/CDC49753.2023.10383604].

On-Policy Data-Driven Linear Quadratic Regulator via Combined Policy Iteration and Recursive Least Squares

Sforni L. (first author); Carnevale G. (second author); Notarnicola I. (penultimate author); Notarstefano G. (last author)
2023

Abstract

In this paper, we address infinite-horizon Linear Quadratic Regulator (LQR) problems for unknown discrete-time systems. As an additional challenge, we address an on-policy setup in which system matrices are identified while controlling the real system with a progressively optimized policy. Specifically, we consider a time-varying control policy that, while applied to the real unknown system, is iteratively refined (based on the most updated estimate of the system matrices) towards the optimal LQR solution. The overall learning procedure combines a recursive least squares method with a direct policy search based on the gradient method. By resorting to Lyapunov-based analysis tools in combination with averaging theory for nonlinear systems, exponential stability for the closed-loop scheme can be proven. Finally, a numerical example showing the effectiveness of the considered strategy corroborates the theoretical findings.
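
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the example system, the additive exploration dither, the short warm-up, the stability check before each policy update, the step size, and all function names (`dlyap`, `lqr_gradient`, `riccati_gain`, `run`) are assumptions made here for illustration. At every time step the current gain is applied to the true system, a recursive least squares update refines the estimate of `[A B]`, and the gain takes one gradient step on the LQR cost evaluated on the estimated model.

```python
import numpy as np

def dlyap(M, W):
    """Solve X = M X M^T + W by vectorization (adequate for small n)."""
    n = M.shape[0]
    x = np.linalg.solve(np.eye(n * n) - np.kron(M, M), W.reshape(-1))
    return x.reshape(n, n)

def lqr_gradient(A, B, K, Q, R, Sigma0):
    """Gradient of J(K) = trace(P_K Sigma0) for the policy u = K x."""
    Acl = A + B @ K
    P = dlyap(Acl.T, Q + K.T @ R @ K)   # cost matrix of the policy K
    Sigma = dlyap(Acl, Sigma0)          # accumulated closed-loop state correlation
    return 2.0 * ((R + B.T @ P @ B) @ K + B.T @ P @ A) @ Sigma

def riccati_gain(A, B, Q, R, iters=2000):
    """Reference optimal gain from a fixed-point Riccati iteration."""
    P = Q.copy()
    for _ in range(iters):
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ G
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def run(T=5000, gamma=1e-3, dither_std=1.0, warmup=10, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical open-loop stable system, unknown to the learner.
    A = np.array([[0.9, 0.2], [0.0, 0.8]])
    B = np.array([[0.0], [1.0]])
    n, m = B.shape
    Q, R, Sigma0 = np.eye(n), np.eye(m), np.eye(n)

    Theta = np.zeros((n, n + m))   # running estimate of [A B]
    Pz = 1e4 * np.eye(n + m)       # RLS covariance-like matrix
    K = np.zeros((m, n))           # initial gain (stabilizing because A is stable)
    x = np.ones(n)
    for t in range(T):
        # On-policy step: apply the current gain plus dither to the real system.
        u = K @ x + dither_std * rng.standard_normal(m)
        x_next = A @ x + B @ u
        # Recursive least squares update with regressor z = [x; u].
        z = np.concatenate([x, u])
        g = Pz @ z / (1.0 + z @ Pz @ z)
        Theta = Theta + np.outer(x_next - Theta @ z, g)
        Pz = Pz - np.outer(g, z @ Pz)
        # One gradient step on the estimated model (safeguards are assumptions).
        A_hat, B_hat = Theta[:, :n], Theta[:, n:]
        rho = np.max(np.abs(np.linalg.eigvals(A_hat + B_hat @ K)))
        if t >= warmup and rho < 1.0:
            K = K - gamma * lqr_gradient(A_hat, B_hat, K, Q, R, Sigma0)
        x = x_next
    return K, Theta, riccati_gain(A, B, Q, R), np.hstack([A, B])
```

Calling `K, Theta, K_opt, Theta_true = run()` returns the learned gain and model estimate together with reference values computed from the true matrices, so the two can be compared directly. The warm-up and the spectral-radius check only keep this toy loop well behaved while the model estimate is still poor; they say nothing about the conditions under which the paper proves exponential stability.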
Proceedings of the IEEE Conference on Decision and Control
pp. 5047-5052
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/963414