On-Policy Data-Driven Linear Quadratic Regulator via Combined Policy Iteration and Recursive Least Squares

Sforni, L.; Carnevale, G.; Notarnicola, I.; Notarstefano, G.

doi:10.1109/CDC49753.2023.10383604

In this paper, we address infinite-horizon Linear Quadratic Regulator (LQR) problems for unknown discrete-time systems. As an additional challenge, we address an on-policy setup in which system matrices are identified while controlling the real system with a progressively optimized policy. Specifically, we consider a time-varying control policy that, while applied to the real unknown system, is iteratively refined (based on the most updated estimate of the system matrices) towards the optimal LQR solution. The overall learning procedure combines a recursive least squares method with a direct policy search based on the gradient method. By resorting to Lyapunov-based analysis tools in combination with averaging theory for nonlinear systems, exponential stability for the closed-loop scheme can be proven. Finally, a numerical example showing the effectiveness of the considered strategy corroborates the theoretical findings.
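The interplay described in the abstract, recursive least squares identification running alongside gradient-based refinement of the applied policy, can be illustrated with a toy scalar sketch. This is not the authors' algorithm: the scalar system, the numerical values, the sinusoidal dither used for excitation, the step size, and the stability safeguard on the gain update are all assumptions made for illustration, and the function names (`riccati_gain`, `cost_gradient`, `run_sketch`) are hypothetical.

```python
import math

def riccati_gain(a, b, q, r, iters=1000):
    """Optimal scalar LQR gain (policy u = -k x) via fixed-point Riccati iteration."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)

def cost_gradient(a, b, k, q, r):
    """dJ/dk for J(k) = (q + r k^2) / (1 - (a - b k)^2), valid when |a - b k| < 1."""
    c = a - b * k                      # closed-loop pole under u = -k x
    d = 1.0 - c * c
    return (2.0 * r * k * d - (q + r * k * k) * 2.0 * b * c) / (d * d)

def run_sketch(a=0.9, b=0.5, q=1.0, r=1.0, step=0.01, steps=500):
    """Toy on-policy loop: identify (a, b) by RLS while refining the gain k."""
    theta = [0.0, 0.0]                 # estimates [a_hat, b_hat]
    P = [[1e3, 0.0], [0.0, 1e3]]       # RLS covariance, large initial uncertainty
    x, k = 1.0, 0.0                    # k = 0 is stabilizing here because |a| < 1
    for t in range(steps):
        # Assumed excitation signal so that the regressor [x, u] stays informative.
        dither = 0.5 * math.sin(0.7 * t) + 0.5 * math.sin(1.9 * t)
        u = -k * x + dither
        x_next = a * x + b * u         # true system, unknown to the learner

        # Recursive least squares update (no forgetting) with regressor [x, u].
        Pphi = [P[0][0] * x + P[0][1] * u, P[1][0] * x + P[1][1] * u]
        s = 1.0 + x * Pphi[0] + u * Pphi[1]
        err = x_next - (theta[0] * x + theta[1] * u)
        theta = [theta[0] + Pphi[0] * err / s, theta[1] + Pphi[1] * err / s]
        P = [[P[i][j] - Pphi[i] * Pphi[j] / s for j in range(2)] for i in range(2)]

        # One gradient step on the LQR cost of the *estimated* model.
        a_hat, b_hat = theta
        if abs(a_hat - b_hat * k) < 1.0:
            k_new = k - step * cost_gradient(a_hat, b_hat, k, q, r)
            # Assumed safeguard: keep the step only if the estimated loop stays stable.
            if abs(a_hat - b_hat * k_new) < 1.0:
                k = k_new
        x = x_next
    return k, riccati_gain(a, b, q, r), theta[0], theta[1]

k, k_star, a_hat, b_hat = run_sketch()
print(f"learned gain {k:.4f}, Riccati gain {k_star:.4f}, estimates ({a_hat:.4f}, {b_hat:.4f})")
```

In this sketch the gain is updated at every time step using the latest estimate, mirroring the on-policy idea of refining the controller while it is in use; the Riccati gain is computed from the true parameters only as a reference for comparison.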

Sforni L., Carnevale G., Notarnicola I., Notarstefano G. (2023). On-Policy Data-Driven Linear Quadratic Regulator via Combined Policy Iteration and Recursive Least Squares. Institute of Electrical and Electronics Engineers Inc. [10.1109/CDC49753.2023.10383604].



Year: 2023
Volume title: Proceedings of the IEEE Conference on Decision and Control
First page: 5047
Last page: 5052
Series: Proceedings of the IEEE Conference on Decision & Control
DOI: https://dx.doi.org/10.1109/CDC49753.2023.10383604


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/963414
