LAPSE:2025.0367
Published Article
A Component Property Modeling Framework Utilizing Molecular Similarity for Accurate Predictions and Uncertainty Quantification
June 27, 2025
Abstract
A key step in developing high-performance industrial products lies in the design of their constituent molecules. Computer-aided molecular design (CAMD) has garnered significant attention for its potential to accelerate and improve the design process. The mainstream method involves using property prediction models to predict the properties of potential molecules and selecting the best candidates based on these predictions. However, prediction errors are inevitable, introducing unreliability into the design. To address this issue, this paper proposes a novel component property modeling framework based on a molecular similarity coefficient. By calculating the similarity between a target molecule and those in an existing database, the framework selects the most similar molecules to form a tailored training dataset. The similarity coefficient also quantifies the reliability of the property predictions. In tests across various properties, this framework not only provides a quantifiable evaluation of prediction reliability but also improves the prediction accuracy of molecules with high reliability, which has the potential to enhance the integrity of molecular design.
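The workflow described in the abstract (score database molecules against a target, keep the most similar ones as a tailored training set, and reuse the similarity as a reliability indicator) can be illustrated with a minimal sketch. The record does not specify the similarity coefficient or the property model, so everything below is an assumption: the Tanimoto coefficient on sets of functional-group labels stands in for the paper's coefficient, a similarity-weighted average stands in for the trained model, and the molecule names, group labels, and property values are arbitrary placeholders rather than real data.

```python
from statistics import mean


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of group labels.

    Assumption: a stand-in for the paper's similarity coefficient.
    """
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_training_set(target, database, k=2):
    """Return the k database entries most similar to the target.

    Each result is a (similarity, name, property_value) tuple.
    """
    scored = [
        (tanimoto(target, groups), name, value)
        for name, (groups, value) in database.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def predict_with_reliability(target, database, k=2):
    """Predict a property from the tailored set and report a reliability score.

    Assumption: the prediction is a similarity-weighted average (not the
    paper's model), and reliability is the mean similarity of the selected
    molecules.
    """
    neighbors = select_training_set(target, database, k)
    sims = [s for s, _, _ in neighbors]
    total = sum(sims)
    if total == 0:
        return None, 0.0  # nothing similar enough to support a prediction
    prediction = sum(s * v for s, _, v in neighbors) / total
    return prediction, mean(sims)


# Hypothetical toy database: name -> (group labels, placeholder property value)
toy_database = {
    "mol_a": (frozenset({"CH3", "CH2", "OH"}), 10.0),
    "mol_b": (frozenset({"CH3", "CH2"}), 20.0),
    "mol_c": (frozenset({"CH3", "COOH"}), 30.0),
    "mol_d": (frozenset({"aC", "aCH"}), 40.0),
}

target = frozenset({"CH3", "CH2", "OH"})
prediction, reliability = predict_with_reliability(target, toy_database, k=2)
print(f"prediction={prediction:.3f}, reliability={reliability:.3f}")
```

In a design loop, a candidate whose reliability falls below a chosen threshold could be flagged or discarded rather than ranked on an untrustworthy prediction; the threshold itself would have to be calibrated per property.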
Record ID
LAPSE:2025.0367
Keywords
Molecular design, Property prediction, Similarity coefficient
Suggested Citation
Xu Y, Shao Z, Tula AK. A Component Property Modeling Framework Utilizing Molecular Similarity for Accurate Predictions and Uncertainty Quantification. Systems and Control Transactions 4:1342-1347 (2025) https://doi.org/10.69997/sct.144140
Author Affiliations
Xu Y: State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
Shao Z: State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
Tula AK: State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
Journal Name
Systems and Control Transactions
Volume
4
First Page
1342
Last Page
1347
Year
2025
Publication Date
2025-07-01
Version Comments
Original Submission
Other Meta
PII: 1342-1347-1182-SCT-4-2025, Publication Type: Journal Article
Record Statistics
Record Views
926
Version History
[v1] (Original Submission)
Jun 27, 2025
Verified by curator on
Jun 27, 2025
This Version Number
v1
Citations
URL Here
https://psecommunity.org/LAPSE:2025.0367
Record Owner
PSE Press

