Proceedings of ESCAPE 36
ISSN: 2818-4734
Volume: 5 (2026)
LAPSE:2026.0437
Published Article
Evaluating and adapting modelling strategies for data-driven prediction of solvent effects on reaction barriers
June 12, 2026
Abstract
Predicting solvent effects on reaction activation barriers is central to understanding chemical reactivity and reaction kinetics and to guiding solvent selection. The solvent-induced change in activation free energy (ΔΔG‡_solv) provides a quantitative descriptor of this effect, but it remains costly to evaluate with quantum mechanical methods across vast reaction-solvent spaces. Recent data-driven models have enabled prediction of solvent effects. However, most rely on two-dimensional representations of reactions and do not explicitly encode sufficient reaction context, such as transition-state information or three-dimensional structural changes along the reaction, which limits their generalizability and predictive accuracy. In this study, a systematic evaluation of modelling strategies for predicting ΔΔG‡_solv is presented, with a focus on the role of reaction-state representation, input-geometry fidelity, and input modality. Using a large reaction-solvent dataset, models based on two-dimensional condensed reaction graphs are compared with models incorporating three-dimensional geometries of reactants, transition states, and products. The sensitivity of geometry-based models to structural accuracy is assessed by replacing quantum-chemically optimized transition states with structures predicted by a generative model. In addition, a dual-modality architecture combining two-dimensional graph-based and three-dimensional geometry-based representations is examined. The results show that explicitly including both reactant and transition-state geometries improves prediction accuracy relative to representations based on reaction endpoints or transition states alone. However, model performance depends strongly on the fidelity of the input geometries, and it degrades substantially when low-quality structures are used.
The dual-modality approach partially mitigates this sensitivity by adaptively reweighting two-dimensional and three-dimensional information, which recovers part of the lost performance under low-fidelity conditions.
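The two quantities central to the abstract can be illustrated with a minimal Python sketch. It rests on two assumptions that the abstract does not state. First, the solvent-induced change in activation free energy is taken to be the difference between the solvation free energies of the transition state and the reactant, which is the usual thermodynamic-cycle definition. Second, the "adaptive reweighting" of two-dimensional and three-dimensional information is represented by a single scalar sigmoid gate, which is only one of several possible fusion schemes and is not necessarily the architecture used in the article. All function and parameter names are hypothetical.

```python
import math


def ddg_solv_barrier(dg_solv_ts: float, dg_solv_reactant: float) -> float:
    """Solvent-induced change in the activation free energy.

    Assumed definition: solvation free energy of the transition state
    minus that of the reactant (same units as the inputs). A positive
    value means the solvent raises the barrier relative to the gas phase.
    """
    return dg_solv_ts - dg_solv_reactant


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h2d, h3d, gate_weights, gate_bias):
    """Blend a 2D-graph embedding and a 3D-geometry embedding.

    Hypothetical scheme: a scalar gate g in (0, 1) is computed from the
    concatenated embeddings, and the fused vector is g*h2d + (1-g)*h3d.
    A gate near 1 favours the 2D branch, e.g. when 3D inputs are unreliable.
    """
    if len(h2d) != len(h3d):
        raise ValueError("h2d and h3d must have the same length")
    features = list(h2d) + list(h3d)
    if len(gate_weights) != len(features):
        raise ValueError("gate_weights must match the concatenated length")
    logit = sum(w * x for w, x in zip(gate_weights, features)) + gate_bias
    gate = sigmoid(logit)
    fused = [gate * a + (1.0 - gate) * b for a, b in zip(h2d, h3d)]
    return fused, gate


# Reactant stabilised more strongly than the transition state:
# the barrier rises by 2.0 (in the units of the inputs).
print(ddg_solv_barrier(-3.0, -5.0))

# Zero gate parameters weight both modalities equally.
print(gated_fusion([1.0, 0.0], [0.0, 1.0], [0.0, 0.0, 0.0, 0.0], 0.0))
```

In a trained model the gate parameters would be learned, so the balance between the two branches could shift per sample. The scalar gate here is chosen only to keep the example short.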
Keywords
3D geometry, Multi-modality, Solvation free energy of reaction, Solvent effect, Transition state
Suggested Citation
Shin D, Gui L, Na J, Lee WB, Lee LYS. Evaluating and adapting modelling strategies for data-driven prediction of solvent effects on reaction barriers. Systems and Control Transactions 5:1876-1883 (2026) https://doi.org/10.69997/sct.112630
Author Affiliations
Shin D: Department of Chemical and Biological Engineering, Seoul National University, Seoul 08826, Republic of Korea; Department of Chemical Engineering, University College London, Torrington Place, London, WC1E 7JE, United Kingdom
Gui L: School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh, Scotland EH14 4AS, United Kingdom
Na J: Department of Chemical Engineering and Materials Science, Ewha Womans University, Seoul 03760, Republic of Korea
Lee WB: Department of Chemical and Biological Engineering, Seoul National University, Seoul 08826, Republic of Korea
Lee LYS: Department of Chemical Engineering, University College London, Torrington Place, London, WC1E 7JE, United Kingdom
Journal Name
Systems and Control Transactions
Volume
5
First Page
1876
Last Page
1883
Year
2026
Publication Date
2026-06-12
Version Comments
Original Submission
Other Meta
PII: 1876-1883-637-SCT-5-2026, Publication Type: Journal Article
License
CC BY-SA 4.0
Version History
[v1] (Original Submission)
Jun 12, 2026
 
Verified by curator on
Jun 12, 2026
This Version Number
v1
Citations
Most Recent
This Version
URL Here
https://psecommunity.org/LAPSE:2026.0437
 
Record Owner
PSE Press