Proceedings of ESCAPE 35
ISSN: 2818-4734
Volume: 4 (2025)
LAPSE:2025.0409v1
Published Article
GRAPSE: Graph-Based Retrieval Augmentation for Process Systems Engineering
Daniel Ovalle, Arpan Seth, John R. Kitchin, Carl D. Laird, Ignacio E. Grossmann
June 27, 2025
Abstract
Large Language Models have demonstrated potential in accelerating scientific discovery, but they face challenges when making inferences in rapidly evolving and niche domains like Process Systems Engineering (PSE). To address this, we propose a Graph-based Retrieval-Augmented Generation (RAG) pipeline specifically designed for PSE papers. Our pipeline includes custom document parsing, knowledge graph construction, and refinement to enhance retrieval accuracy. We evaluate the effectiveness of our approach using an automatically generated benchmark consisting entirely of PSE-related questions. The results show that our pipeline outperforms both non-RAG and vanilla RAG implementations in terms of relevant document retrieval and overall answer quality. Additionally, our implementation is fully customizable, allowing users to select the papers most relevant to their specific tasks. This framework is openly available, providing a flexible solution for those working in PSE or similar domains.
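As a rough illustration of why graph structure can help retrieval, the sketch below contrasts a plain top-k lookup with the same lookup followed by one-hop expansion over an entity co-occurrence graph. It is a toy stand-in under stated assumptions, not the GRAPSE implementation: the corpus, the entity list, the word-overlap scoring, and all function names are hypothetical and do not come from the article.

```python
from collections import defaultdict

# Hypothetical toy corpus: chunk id -> text (illustrative only, not from the paper).
CHUNKS = {
    "c1": "Generalized disjunctive programming models discrete decisions with disjunctions.",
    "c2": "Big-M and hull reformulations convert disjunctions into mixed-integer constraints.",
    "c3": "Distillation column design often uses mixed-integer nonlinear programming.",
    "c4": "Retrieval-augmented generation supplies retrieved passages to a language model.",
}

# Hypothetical entity vocabulary used to link chunks (assumption, not the paper's schema).
ENTITIES = ["disjunctions", "mixed-integer", "distillation", "language model"]


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {w.strip(".,?").lower() for w in text.split()}


def build_graph(chunks, entities):
    """Connect two chunks whenever they mention the same entity."""
    by_entity = defaultdict(set)
    for cid, text in chunks.items():
        for ent in entities:
            if ent in text.lower():
                by_entity[ent].add(cid)
    graph = defaultdict(set)
    for cids in by_entity.values():
        for a in cids:
            graph[a] |= cids - {a}
    return graph


def vanilla_retrieve(query, chunks, k=1):
    """Rank chunks by word overlap with the query (a stand-in for embedding similarity)."""
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda cid: (-len(q & tokenize(chunks[cid])), cid))
    return ranked[:k]


def graph_retrieve(query, chunks, graph, k=1):
    """Take the top-k seed chunks, then append their one-hop graph neighbors."""
    seeds = vanilla_retrieve(query, chunks, k)
    expanded = list(seeds)
    for cid in seeds:
        for neighbor in sorted(graph[cid]):
            if neighbor not in expanded:
                expanded.append(neighbor)
    return expanded


graph = build_graph(CHUNKS, ENTITIES)
query = "How are disjunctions modeled in generalized disjunctive programming?"
print(vanilla_retrieve(query, CHUNKS))
print(graph_retrieve(query, CHUNKS, graph))
```

In this toy setup the plain lookup returns only the chunk that best matches the query wording, while the graph variant also pulls in the reformulation chunk that shares the entity "disjunctions". A production pipeline would replace the word-overlap score with embeddings and the hand-written entity list with extracted entities and relations.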
Keywords
Graph-based Retrieval, Large Language Model, Process Systems Engineering, Retrieval-Augmented Generation
Suggested Citation
Ovalle D, Seth A, Kitchin JR, Laird CD, Grossmann IE. GRAPSE: Graph-Based Retrieval Augmentation for Process Systems Engineering. Systems and Control Transactions 4:1598-1604 (2025) https://doi.org/10.69997/sct.198790
Author Affiliations
Ovalle D: Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, United States; These authors contributed equally to this work
Seth A: Evonik Corporation, Process Technology and Engineering, Trexlertown, PA 18087, United States; These authors contributed equally to this work
Kitchin JR: Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, United States
Laird CD: Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, United States
Grossmann IE: Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, United States
Journal Name: Systems and Control Transactions
Volume: 4
First Page: 1598
Last Page: 1604
Year: 2025
Publication Date: 2025-07-01
Version Comments: Original Submission
Other Meta: PII: 1598-1604-1696-SCT-4-2025; Publication Type: Journal Article
License: CC BY-SA 4.0
Version History: [v1] (Original Submission), Jun 27, 2025
Verified by curator on: Jun 27, 2025
This Version Number: v1
URL (this version): http://psecommunity.org/LAPSE:2025.0409v1
Record Owner: PSE Press