
Control-Tutored Reinforcement Learning: Towards the Integration of Data-Driven and Model-Based Control

Francesco De Lellis; Marco Coraggio; Giovanni Russo; Mirco Musolesi; Mario di Bernardo

2022

Abstract

We present an architecture in which a feedback controller, derived from an approximate model of the environment, assists the learning process to improve its data efficiency. This architecture, which we term Control-Tutored Q-Learning (CTQL), is presented in two alternative flavours. The first defines the reward function so that a Boolean condition determines when the control tutor policy is adopted. The second, termed probabilistic CTQL (pCTQL), instead calls the tutor with a certain probability during learning. Both approaches are validated, and thoroughly benchmarked against Q-Learning, on a representative problem: the stabilization of an inverted pendulum as defined in OpenAI Gym.
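To make the probabilistic variant concrete, the following is a minimal sketch of what a pCTQL-style learning loop might look like. It is not the authors' implementation. The abstract does not give the tutor's control law, the tutor probability, or the learning parameters, so everything named here is an assumption for illustration: the toy one-dimensional environment in `step` (used in place of the Gym pendulum so the example stays self-contained), the sign-based feedback law in `tutor_action`, and the parameters `beta`, `epsilon`, `alpha` and `gamma`.

```python
import random
from collections import defaultdict

ACTIONS = (-1, 0, 1)  # hypothetical discrete action set


def tutor_action(state):
    """Hypothetical control tutor: a crude feedback law pushing the state toward 0."""
    if state > 0:
        return -1
    if state < 0:
        return 1
    return 0


def select_action(q, state, beta, epsilon, rng):
    """With probability beta defer to the tutor; otherwise act epsilon-greedily on Q."""
    if rng.random() < beta:
        return tutor_action(state)
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update, applied whoever chose the action."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def step(state, action):
    """Toy 1-D environment (an assumption, not the Gym pendulum): reward is -|state|."""
    next_state = max(-5, min(5, state + action))
    return next_state, -abs(next_state)


def train(episodes=50, horizon=20, beta=0.3, epsilon=0.1, seed=0):
    """Run the tutored learning loop and return the learned Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = rng.randint(-5, 5)
        for _ in range(horizon):
            action = select_action(q, state, beta, epsilon, rng)
            next_state, reward = step(state, action)
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q
```

Calling `train()` returns the Q-table learned with tutor calls mixed in; setting `beta=0` reduces the loop to plain epsilon-greedy Q-learning, which is the kind of baseline the abstract benchmarks against. The Boolean-condition flavour would replace the random draw in `select_action` with a deterministic test, whose exact form the abstract does not specify.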


Year: 2022
Volume title: Proceedings of the 4th Annual Learning for Dynamics and Control Conference, Proceedings of Machine Learning Research
First page: 1048
Last page: 1059
Citation: Francesco De Lellis, Marco Coraggio, Giovanni Russo, Mirco Musolesi, Mario di Bernardo (2022). Control-Tutored Reinforcement Learning: Towards the Integration of Data-Driven and Model-Based Control.
All authors: Francesco De Lellis; Marco Coraggio; Giovanni Russo; Mirco Musolesi; Mario di Bernardo


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/904472
