Proceedings of ESCAPE 36
ISSN: 2818-4734
Volume: 5 (2026)
LAPSE:2026.0465
Published Article
A Graph Reinforcement Learning Framework for Batch Process Scheduling in State-Task Networks
June 12, 2026
Abstract
Scheduling batch production resources to meet fluctuating product demand is a critical problem in the process industry. Existing optimisation approaches, whether heuristic or exact, trade off solution optimality against scalability to large problems. In this work, we investigate deep reinforcement learning as an alternative for learning batch scheduling heuristics. We formulate the batch scheduling problem as a Markov decision process operating on a state-task network representation, which is encoded with graph neural networks to capture relevant structural inductive biases. We propose a centralised training with decentralised execution architecture in which agents placed on machines individually choose which tasks to complete using a global view of the network, cooperating towards task schedules that optimise the final production quantity. Preliminary results demonstrate that the proposed end-to-end framework learns to construct task schedules comparable to the optimal solution on small instances unseen during training, indicating potential for extension to more general graph structures and for better scalability.
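To make the formulation in the abstract concrete, the following is a minimal sketch of a state-task network held as a graph, with one round of neighbour aggregation and per-machine task selection from a shared global view. It is an illustration under stated assumptions, not the authors' implementation: the `STN` class, the `message_pass` and `choose_tasks` helpers, the toy two-task network, and the fixed 0.5 mixing weights are all hypothetical, and a hand-written mean aggregation with greedy scoring stands in for the learned graph neural network and Q-values.

```python
from dataclasses import dataclass


@dataclass
class STN:
    """Toy state-task network (hypothetical structure, for illustration only)."""
    states: dict    # state name -> current inventory
    tasks: dict     # task name -> (input states, output states)
    machines: dict  # machine name -> tasks it can run

    def edges(self):
        """Directed edges: state -> task (consumption), task -> state (production)."""
        out = []
        for task, (inputs, outputs) in self.tasks.items():
            out += [(s, task) for s in inputs]
            out += [(task, s) for s in outputs]
        return out


def message_pass(stn, features):
    """One round of mean aggregation over in-neighbours.

    Assumption: this stands in for a single learned GNN layer.
    """
    incoming = {node: [] for node in features}
    for src, dst in stn.edges():
        incoming[dst].append(features[src])
    updated = {}
    for node, own in features.items():
        msgs = incoming[node]
        neighbour_mean = sum(msgs) / len(msgs) if msgs else 0.0
        updated[node] = 0.5 * own + 0.5 * neighbour_mean
    return updated


def choose_tasks(stn, embeddings):
    """Each machine picks its best feasible task from the shared embeddings.

    Assumption: greedy choice over embeddings replaces learned Q-values.
    A machine idles (None) when no task has all inputs in stock.
    """
    choice = {}
    for machine, runnable in stn.machines.items():
        feasible = [t for t in runnable
                    if all(stn.states[s] > 0 for s in stn.tasks[t][0])]
        choice[machine] = (max(feasible, key=lambda t: embeddings[t])
                           if feasible else None)
    return choice


stn = STN(
    states={"feed": 10.0, "intermediate": 0.0, "product": 0.0},
    tasks={"react": (["feed"], ["intermediate"]),
           "purify": (["intermediate"], ["product"])},
    machines={"reactor": ["react"], "column": ["purify"]},
)
# Node features: inventory for states, zero for tasks.
features = {**stn.states, **{t: 0.0 for t in stn.tasks}}
embeddings = message_pass(stn, features)
actions = choose_tasks(stn, embeddings)
```

In this toy instance the reactor selects `react` because feed is in stock, while the column idles until intermediate material exists. Every machine reads the same network-wide embeddings but decides on its own, which is the sense of decentralised execution sketched here.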
Keywords
Batch Process Scheduling, Deep-Q Networks, Graph Neural Networks, Markov Decision Process, Reinforcement Learning
Suggested Citation
Johnn S, Darvariu V, Charitopoulos VM. A Graph Reinforcement Learning Framework for Batch Process Scheduling in State-Task Networks. Systems and Control Transactions 5:2099-2106 (2026) https://doi.org/10.69997/sct.190792
Author Affiliations
Johnn S: Department of Chemical Engineering & The Sargent Centre for Process Systems Engineering, University College London, London, United Kingdom
Darvariu V: Oxford Robotics Institute, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
Charitopoulos VM: Department of Chemical Engineering & The Sargent Centre for Process Systems Engineering, University College London, London, United Kingdom
Journal Name
Systems and Control Transactions
Volume
5
First Page
2099
Last Page
2106
Year
2026
Publication Date
2026-06-12
Version Comments
Original Submission
Other Meta
PII: 2099-2106-250-SCT-5-2026, Publication Type: Journal Article
License
CC BY-SA 4.0
Version History
[v1] (Original Submission)
Jun 12, 2026
 
Verified by curator on
Jun 12, 2026
This Version Number
v1
Record URL
https://psecommunity.org/LAPSE:2026.0465
 
Record Owner
PSE Press