Proceedings of ESCAPE 35
ISSN: 2818-4734
Volume: 4 (2025)
LAPSE:2025.0340
Published Article
Safe Reinforcement Learning with Lyapunov-Based Constraints for Control of an Unstable Reactor
José R. Torraca Neto, Bruno D. O. Capron, Argimiro R. Secchi, Antonio d.R. Chanona
June 27, 2025
Abstract
This work presents a Lyapunov-based framework for safe reinforcement learning (RL) applied to the control of an unstable reactor. The proposed method imposes stability constraints on the value and Q-functions through a Lyapunov candidate function defined as the negative of these functions, L(s) = -V(s) and L(s,a) = -Q(s,a). The constraints enforce positivity of the Lyapunov candidate function and non-positive time derivatives, promoting monotonic behavior aligned with Lyapunov stability conditions. The framework was tested on both on-policy (PPO) and off-policy (SAC, TD3, and DDPG) RL algorithms, with performance evaluated against their baseline versions and a nonlinear model predictive controller (NMPC). The stability constraints significantly improved control performance across all tested algorithms, yielding consistently higher cumulative rewards, reduced overshoot, and decreased variability. Derivative-based constraints mitigated abrupt changes and oscillatory behavior in the Q and value functions, most evidently in PPO, DDPG, and TD3. The proposed approach avoids the complexity of auxiliary optimization and offers a computationally efficient way to improve the safety and stability of RL algorithms in safety-critical control applications.
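The two conditions named in the abstract (a positive Lyapunov candidate and a non-positive time derivative) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the constraints are applied as a soft penalty on critic estimates along a sampled trajectory, with the time derivative approximated by a finite difference. The function name `lyapunov_penalty` and the parameters `dt` and `eps` are hypothetical. The same computation would apply to Q(s,a) estimates in place of V(s).

```python
import numpy as np

def lyapunov_penalty(values, dt=1.0, eps=0.0):
    """Soft penalty for violating Lyapunov-style conditions on L = -V.

    values: critic estimates V(s_t) along one trajectory, shape (T,).
    dt:     sampling interval for the finite difference (assumed).
    eps:    positivity margin on L (assumed).
    """
    values = np.asarray(values, dtype=float)
    lyap = -values  # Lyapunov candidate L(s) = -V(s)

    # Positivity condition: penalize time steps where L(s_t) falls below eps.
    positivity = np.maximum(0.0, eps - lyap)

    # Derivative condition: penalize time steps where L increases,
    # using (L(s_{t+1}) - L(s_t)) / dt as a stand-in for dL/dt.
    increase = np.maximum(0.0, np.diff(lyap) / dt)

    penalty = positivity.mean()
    if increase.size:  # a single-step trajectory has no differences
        penalty += increase.mean()
    return float(penalty)
```

In a training loop, a term like this would typically be scaled by a weight and added to the critic loss; that weighting is also an assumption, since the abstract does not say how the constraints enter the optimization.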
Keywords
Lyapunov functions, process control, safety-critical systems, unstable dynamics
Suggested Citation
Neto JRT, Capron BDO, Secchi AR, Chanona AD. Safe Reinforcement Learning with Lyapunov-Based Constraints for Control of an Unstable Reactor. Systems and Control Transactions 4:1169-1174 (2025) https://doi.org/10.69997/sct.137298
Author Affiliations
Neto JRT: Universidade Federal do Rio de Janeiro, Chemical and Biochemical Process Engineering – EQ, Rio de Janeiro, RJ, Brazil
Capron BDO: Universidade Federal do Rio de Janeiro, Chemical and Biochemical Process Engineering – EQ, Rio de Janeiro, RJ, Brazil
Secchi AR: Universidade Federal do Rio de Janeiro, Chemical and Biochemical Process Engineering – EQ, Rio de Janeiro, RJ, Brazil; Universidade Federal do Rio de Janeiro, Chemical Engineering Program – COPPE, Rio de Janeiro, RJ, Brazil
Chanona AD: Imperial College London, Sargent Centre for Process Systems Engineering, London, United Kingdom
Journal Name
Systems and Control Transactions
Volume
4
First Page
1169
Last Page
1174
Year
2025
Publication Date
2025-07-01
Version Comments
Original Submission
Other Meta
PII: 1169-1174-1562-SCT-4-2025, Publication Type: Journal Article
License
CC BY-SA 4.0
Version History
[v1] (Original Submission)
Jun 27, 2025
 
Verified by curator on
Jun 27, 2025
This Version Number
v1
Record URL
https://psecommunity.org/LAPSE:2025.0340
Record Owner
PSE Press
References Cited
  1. Badgwell TA, Lee JH, Liu KH. Reinforcement learning - overview of recent progress and implications for process control. In: 13th International Symposium on Process Systems Engineering (PSE 2018). Eds: Eden MR, Ierapetritou MG, Towler GP. Vol 44, pp 71-85. Elsevier (2018) https://doi.org/10.1016/B978-0-444-64241-7.50008-2
  2. Annaswamy AM. Adaptive control and intersections with reinforcement learning. Annu Rev Control Robot Autom Syst 6:65-93 (2023) https://doi.org/10.1146/annurev-control-062922-090153
  3. Brunke L, Greeff M, Hall AW, Yuan Z, Zhou S, Panerati J, Schoellig AP. Safe learning in robotics: from learning-based control to safe reinforcement learning. Annu Rev Control Robot Autom Syst 5:411-444 (2022) https://doi.org/10.1146/annurev-control-042920-020211
  4. Gu S, Yang L, Du Y, Chen G, Walter F, Wang J, Knoll A. A review of safe reinforcement learning: methods, theories, and applications. IEEE Trans Pattern Anal Mach Intell 46:11216-11235 (2024) https://doi.org/10.1109/TPAMI.2024.3457538
  5. Gu S, Kumar R. Robust optimal safe and stability guaranteeing reinforcement learning control for quadcopter. arXiv (2024) https://arxiv.org/abs/2412.14003
  6. Osinenko P, Beckenbach L, Göhrt T, Streif S. A reinforcement learning method with closed-loop stability guarantee. IFAC-PapersOnLine 53(2):8043-8048 (2020) https://doi.org/10.1016/j.ifacol.2020.12.2237
  7. Bo S, Agyeman BT, Yin X, Liu J. Control invariant set enhanced safe reinforcement learning: improved sampling efficiency, guaranteed stability and robustness. arXiv (2023) https://arxiv.org/abs/2305.15602 https://doi.org/10.1016/j.compchemeng.2023.108413
  8. Bloor M, Torraca J, Sandoval IO, Ahmed A, White M, Mercangöz M, Tsay C, Del Rio Chanona EA, Mowbray M. PC-Gym: benchmark environments for process control problems. arXiv (2024) https://arxiv.org/abs/2410.22093
  9. Akiba T, Sano S, Yanase T, Ohta T, Koyama M. Optuna: a next-generation hyperparameter optimization framework. arXiv (2019) https://arxiv.org/abs/1907.10902 https://doi.org/10.1145/3292500.3330701
