Extremum-Seeking Policy Iteration for Data-Driven LQR

Carnevale, Guido; Mimmo, Nicola; Notarstefano, Giuseppe

2024

Abstract

We propose a data-driven strategy that iteratively finds the state-feedback gain matrix solving a Linear Quadratic Regulator (LQR) problem in a model-free fashion, i.e., with unknown system and cost matrices. In our setup, we assume that, at each iteration, an oracle provides the LQR cost of the tentative policy, e.g., by running the system or a simulator. Using only this information, we develop an Extremum-Seeking algorithm that iteratively refines the tentative solution without any additional knowledge of the system and cost models. Through a Lyapunov-based analysis that exploits averaging theory for time-varying systems, we show that the proposed algorithm converges exponentially to an arbitrarily small ball containing the optimal gain matrix. We corroborate the theoretical results with numerical simulations.
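
The idea of refining a gain using only cost queries can be illustrated with a minimal sketch. It is not the algorithm from the paper, whose exact update law is not given in the abstract. It is a generic dither-based scheme in the spirit of extremum seeking that, for numerical simplicity, queries the oracle twice per iteration (at the gain perturbed by plus and minus a sinusoidal dither). Everything concrete here is an assumption made for illustration: the discrete-time setting, the example matrices `A`, `B`, `Q`, `R`, the cost definition J(K) = trace(P_K) (unit initial-state covariance), the function names, and the tuning values `gamma`, `delta`, and the dither frequencies. The model appears only inside the simulated oracle and in the reference check; the update itself sees cost values only.

```python
import numpy as np


def lqr_cost(K, A, B, Q, R):
    """Simulated cost oracle: J(K) = trace(P_K) for u = -K x (assumed cost definition)."""
    Acl = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0:
        raise ValueError("gain is not stabilizing, cost is unbounded")
    n = A.shape[0]
    Qk = Q + K.T @ R @ K
    # Solve the Lyapunov equation P = Acl^T P Acl + Qk in vectorized form.
    M = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    P = np.linalg.solve(M, Qk.reshape(-1)).reshape(n, n)
    return float(np.trace(P))


def riccati_gain(A, B, Q, R, iters=500):
    """Model-based reference gain (Riccati recursion), used only to check the result."""
    P = Q.copy()
    for _ in range(iters):
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ G)
    return G


def dither_policy_search(oracle, K0, steps=3000, gamma=0.01, delta=0.01):
    """Refine K from cost queries only, using sinusoidal dither and demodulation."""
    K = K0.copy()
    # One distinct dither frequency (rad per iteration) for each gain entry.
    omega = 1.0 + 0.7 * np.arange(K.size).reshape(K.shape)
    for t in range(steps):
        d = np.sin(omega * t)
        # Two-point cost difference along the dither direction.
        slope = (oracle(K + delta * d) - oracle(K - delta * d)) / (2.0 * delta)
        # Demodulate with the dither; the factor 2 offsets the mean of sin^2.
        K = K - gamma * 2.0 * slope * d
    return K


# Hypothetical example system (open-loop stable, so K0 = 0 is stabilizing).
A = np.array([[0.5, 0.2], [0.0, 0.4]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K0 = np.zeros((1, 2))
K_es = dither_policy_search(lambda K: lqr_cost(K, A, B, Q, R), K0)
K_star = riccati_gain(A, B, Q, R)

print("initial cost:", lqr_cost(K0, A, B, Q, R))
print("final cost:  ", lqr_cost(K_es, A, B, Q, R))
print("optimal cost:", lqr_cost(K_star, A, B, Q, R))
```

In this sketch the dither amplitude `delta` sets how close to the optimal gain the iterate can settle, while `gamma` trades convergence speed against stability of the iteration. A stabilizing initial gain is assumed, since the simulated oracle rejects gains with unbounded cost.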

Year: 2024
Volume title: Proceedings of the IEEE Conference on Decision and Control
First page: 6263
Last page: 6267
Series: PROCEEDINGS OF THE ... IEEE CONFERENCE ON DECISION & CONTROL.
DOI: https://dx.doi.org/10.1109/cdc56724.2024.10885851
Citation: Carnevale, G., Mimmo, N., Notarstefano, G. (2024). Extremum-Seeking Policy Iteration for Data-Driven LQR. Piscataway : Institute of Electrical and Electronics Engineers Inc. [10.1109/cdc56724.2024.10885851].
All authors: Carnevale, Guido; Mimmo, Nicola; Notarstefano, Giuseppe
Appears in types: 4.01 Contribution in conference proceedings

Files in this record:

File	Size	Format
main_RL_via_ES.pdf (open access; type: Postprint / Author's Accepted Manuscript (AAM), the version accepted for publication after peer review; license: free open-access license)	559.29 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1013619
