Proceedings of ESCAPE 36
ISSN: 2818-4734
Volume: 5 (2026)
LAPSE:2026.0408
Published Article
An End-to-End Pure Component Property Prediction Framework Based on a Hierarchical Molecular Fragmentation Method
June 12, 2026
Abstract
Accurate prediction of pure component properties is a long-standing problem in chemical engineering, biomedicine, and environmental science. In recent years, end-to-end deep learning methods have shown significant improvement over traditional machine learning approaches because they automatically learn task-relevant representations from raw molecular data. Beyond accurate property prediction, researchers have increasingly focused on how specific fragment structures influence molecular properties. However, existing fragmentation methods rely on predefined rules and group libraries and therefore struggle to capture novel molecular structures, which hampers the development of new materials and drugs. To address these challenges, this work proposes a hierarchical molecular fragmentation method that automatically segments molecules into multiple fragments containing key functional groups. A three-branch graph attention network is then constructed to achieve multi-level representation. Finally, a multi-layer perceptron maps the molecular features to physical property values. Twenty datasets, grouped into four categories (thermodynamic properties, pharmacokinetics, toxicological properties, and industrial safety), were used for validation. The results show that the proposed framework achieves the best performance, with the average error reduced by 6.8% compared to existing research.
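The pipeline outlined in the abstract (fragment the molecule, then build atom-level, fragment-level, and molecule-level representations) can be illustrated with a minimal pure-Python sketch. This is not the authors' method: the abstract gives no implementation details, so the function names (`fragment`, `mean_pool`, `three_level_features`), the hand-picked cut bond, the one-hot atom features, and the use of simple mean and sum pooling in place of graph attention are all assumptions made for illustration. The final multi-layer perceptron is omitted.

```python
from collections import defaultdict


def fragment(n_atoms, bonds, cut_bonds):
    """Split a molecular graph into fragments by deleting the bonds in cut_bonds.

    bonds and cut_bonds are iterables of (i, j) atom-index pairs.
    Returns one sorted atom-index list per connected component.
    (Hypothetical helper; the cut bonds are supplied by hand here.)
    """
    cut = {frozenset(b) for b in cut_bonds}
    adj = defaultdict(set)
    for i, j in bonds:
        if frozenset((i, j)) not in cut:
            adj[i].add(j)
            adj[j].add(i)

    seen, fragments = set(), []
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for neighbour in adj[atom]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        fragments.append(sorted(component))
    return fragments


def mean_pool(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def three_level_features(atom_feats, fragments):
    """Concatenate atom-, fragment-, and molecule-level readouts.

    Assumption: mean/sum pooling stands in for the attention-based
    branches described in the abstract.
    """
    atom_level = mean_pool(atom_feats)
    frag_vectors = [mean_pool([atom_feats[i] for i in frag]) for frag in fragments]
    frag_level = mean_pool(frag_vectors)
    mol_level = [sum(column) for column in zip(*atom_feats)]
    return atom_level + frag_level + mol_level


# Toy example: heavy atoms of ethanol, C0-C1-O2, with features [is_C, is_O].
bonds = [(0, 1), (1, 2)]
atom_feats = [[1, 0], [1, 0], [0, 1]]
frags = fragment(3, bonds, cut_bonds=[(0, 1)])  # cut the C-C bond
features = three_level_features(atom_feats, frags)
print(frags)
print(features)
```

In a learned model, the concatenated vector would be passed to a regression head trained on the target property; here it only shows how the three levels of representation can be combined.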
Suggested Citation
Jiao J, Li J. An End-to-End Pure Component Property Prediction Framework Based on a Hierarchical Molecular Fragmentation Method. Systems and Control Transactions 5:1634-1642 (2026) https://doi.org/10.69997/sct.100427
Author Affiliations
Jiao J: The University of Manchester, Department of Chemical Engineering, Manchester, UK
Li J: The University of Manchester, Department of Chemical Engineering, Manchester, UK
Journal Name: Systems and Control Transactions
Volume: 5
First Page: 1634
Last Page: 1642
Year: 2026
Publication Date: 2026-06-12
Version Comments: Original Submission
Other Meta: PII: 1634-1642-227-SCT-5-2026, Publication Type: Journal Article
License: CC BY-SA 4.0
Version History: [v1] (Original Submission), Jun 12, 2026
Verified by curator on: Jun 12, 2026
This Version Number: v1
URL: https://psecommunity.org/LAPSE:2026.0408
Record Owner: PSE Press