Borghesi, M., Bosso, A., & Notarstefano, G. (2023). On-Policy Data-Driven Linear Quadratic Regulator via Model Reference Adaptive Reinforcement Learning. In Proceedings of the 62nd IEEE Conference on Decision and Control (CDC). IEEE. DOI: 10.1109/CDC49753.2023.10383516.
On-Policy Data-Driven Linear Quadratic Regulator via Model Reference Adaptive Reinforcement Learning
Borghesi M.; Bosso A.; Notarstefano G.
2023
Abstract
In this paper, we address a data-driven linear quadratic optimal control problem in which the regulator design is performed on-policy by resorting to approaches from reinforcement learning and model reference adaptive control. In particular, a continuous-time identifier of the value function is used to generate online a reference model for the adaptive stabilizer. By introducing a suitably selected dithering signal, the resulting policy is shown to achieve asymptotic convergence to the optimal gain, while the controlled plant asymptotically approaches the behavior of the optimal closed-loop system.
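
The abstract only names the ingredients of the scheme (on-policy data collection, online identification, a dithering signal for excitation, convergence to the optimal gain). As a rough illustration of that kind of on-policy, data-driven LQR loop, the Python/NumPy/SciPy sketch below simulates a simple second-order plant, excites it with a dither, and recovers a near-optimal gain online. This is a certainty-equivalence sketch (least-squares identification of the plant followed by a Riccati solve), not the authors' method, which identifies the value function directly and drives a model reference adaptive stabilizer; the plant matrices, initial gain, dither frequencies, and all other numerical values are illustrative assumptions.

import numpy as np
from scipy.linalg import solve_continuous_are

# True plant matrices: unknown to the learner, used only to generate data.
A = np.array([[0.0, 1.0], [-1.0, 0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                        # state weight of the quadratic cost
R = np.array([[1.0]])                # input weight of the quadratic cost

# Optimal gain, computed offline for comparison only.
P_star = solve_continuous_are(A, B, Q, R)
K_star = np.linalg.solve(R, B.T @ P_star)

dt, horizon = 1e-3, 10.0
x = np.array([1.0, -1.0])            # initial state (illustrative)
K = np.array([[0.0, 2.0]])           # initial stabilizing gain (assumed available)
X, U, dX = [], [], []

for k in range(int(horizon / dt)):
    t = k * dt
    dither = 0.5 * np.sin(3.0 * t) + 0.3 * np.sin(7.1 * t)  # excitation signal
    u = (-K @ x).item() + dither
    dx = A @ x + B[:, 0] * u
    X.append(x.copy()); U.append(u); dX.append(dx.copy())
    x = x + dt * dx                  # forward-Euler step of the true plant

    # Once per simulated second, re-identify (A, B) from the logged data and
    # update the gain by certainty equivalence (solve the ARE with the
    # identified model). Here dx is logged exactly; a real implementation
    # would use finite differences or an integral reformulation instead.
    if (k + 1) % 1000 == 0:
        Phi = np.hstack([np.array(X), np.array(U).reshape(-1, 1)])
        Theta, *_ = np.linalg.lstsq(Phi, np.array(dX), rcond=None)
        A_hat, B_hat = Theta[:2].T, Theta[2:].T
        P_hat = solve_continuous_are(A_hat, B_hat, Q, R)
        K = np.linalg.solve(R, B_hat.T @ P_hat)

print("identified gain:", np.round(K.flatten(), 4))
print("optimal gain:   ", np.round(K_star.flatten(), 4))

With persistent excitation from the dither, the identified gain converges to the optimal one; the paper instead establishes this convergence on-policy, without an explicit plant model, through the value-function identifier and the adaptive reference-model scheme described in the abstract.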


