Records with Keyword: Machine Learning
Nonmyopic Bayesian process optimization with a finite budget
July 11, 2025 (v1)
Subject: Optimization
Optimization under uncertainty is inherent to many PSE applications ranging from process design to RTO. Reaching true process optima often involves learning from experimentation, but actual experiments involve a cost (economic, resources, time) and a budget limit usually exists. Finding the best trade-off between cumulative process performance and experimental cost over a finite budget is a Partially Observable Markov Decision Process (POMDP), known to be computationally intractable. This paper follows the nonmyopic Bayesian optimization (BO) approximation to POMDPs developed by the machine-learning community, which naturally enables the use of hybrid plant surrogate models formed by fundamental laws and Gaussian processes (GP). Although nonmyopic BO using GPs may look more tractable, evaluating multi-step decision trees to find the best first-stage candidate action to apply is still expensive with evolutionary or NLP optimizers. Hence, we propose modelling the value function of the first-st... [more]
Enhancing Predictive Maintenance in Used Oil Re-Refining: a Hybrid Machine Learning Approach
July 8, 2025 (v1)
Subject: Process Operations
Keywords: Algorithms, Artificial Intelligence, Distillation, Industry 4.0, Machine Learning, Modelling, Planning
Maintenance is critical for industrial plants to ensure operational reliability and worker safety. In process industries, fouling, the accumulation of solid residues in equipment, poses a significant challenge, causing inefficiencies and productivity losses. Effective modeling of fouling evolution over time is essential for maintenance planning to prevent equipment from operating under suboptimal conditions. Traditional approaches to fouling prediction include equation-based models, which offer high precision but may struggle with continuously changing process boundaries, and machine learning techniques, which are more adaptable but less effective at capturing rapidly evolving trends driven by complex underlying physics. This study introduces an innovative hybrid machine learning approach for predictive maintenance, combining the strengths of both methods. Pressure differential is modeled using an equation-based approach that links pressure data with fouling thickness, while the foul... [more]
Hybrid Models Identification and Training through Evolutionary Algorithms
July 2, 2025 (v2)
Subject: System Identification
Keywords: automatic identification, differential evolution, epistemic uncertainty, hybrid modelling, Machine Learning
Hybrid modelling is widely employed in chemical engineering to generate highly accurate predictions. Such an approach merges first-principle modelling with machine learning techniques to identify and model the epistemic uncertainty from experimental data. Despite its advantages, it still requires high human involvement and cross-domain competencies that are difficult to find in the chemical industry. Automating model identification and training would significantly benefit the widespread adoption of hybrid modelling methodology within the chemical industry. This work presents a novel algorithm for the automatic identification of hybrid models (HMs) starting from the first-principle representation of the system, described by differential equation sets. The methodology formulates the problem as mixed-integer programming, identifying the equation running under uncertainty, identifying the machine learning model hyperparameters, and training the latter. Th... [more]
Data-driven Digital Design of Pharmaceutical Crystallization Processes
June 27, 2025 (v1)
Subject: Process Design
Keywords: Artificial Intelligence, Machine Learning, Modelling and Simulations, Optimization, Process Design
Mechanistic population balance modeling (PBM) has advanced the design of pharmaceutical crystallization processes, enabling the production of active pharmaceutical ingredient (API) crystals with desired critical quality attributes (CQAs), such as purity and crystal size distribution. However, PBM development can sometimes be resource-intensive, requiring extensive design of experiments (DoE) and high-quality process data, making it impractical under fast-paced industrial development timelines. This study proposes a machine learning (ML)-based workflow for developing fit-for-purpose digital twins of crystallization processes, leveraging industrially available DoE data to link operating conditions with CQAs. Validated on industrial data for a commercial API with complex crystallization challenges, the workflow efficiently identifies optimal operating conditions, demonstrating the potential of data-driven digital twins to accelerate the development of pharmaceutical processes.
Data-driven Modeling of a Continuous Direct Compression Tableting Process using SINDy
June 27, 2025 (v1)
Subject: Modelling and Simulations
Understanding the complex dynamics of continuous processes in pharmaceutical manufacturing is essential to ensure product quality across the production line. This paper presents a data-driven modeling approach using Sparse Identification of Nonlinear Dynamics with Control (SINDYc) to capture the dynamics of a continuous direct compression (CDC) tableting line. By incorporating delayed control inputs into the candidate function library, the model effectively captures deviations from steady state in response to dynamic changes. The proposed model was developed by finding a balance between accuracy and sparsity, with focus on the ability to generalize to a wide range of operating conditions.
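As a rough illustration of the idea in this abstract (a sparse regression over a candidate library that includes delayed control inputs), the following minimal sketch may help. It is not the authors' model: the toy system dx/dt = -2x + 3u(t - d), the library terms, the threshold, and the function names `delayed`, `build_library`, `stlsq`, and `demo` are all assumptions made for this example, and sequentially thresholded least squares is used only because it is a common choice in SINDy-type methods.

```python
import numpy as np

def delayed(u, d):
    """Shift a 1-D input signal by d samples, padding the start with u[0]."""
    if d == 0:
        return u.copy()
    return np.concatenate([np.full(d, u[0]), u[:-d]])

def build_library(x, u, d):
    """Hypothetical candidate library: 1, x, u(t), u(t - d), x^2."""
    names = ["1", "x", "u", f"u(t-{d})", "x^2"]
    theta = np.column_stack([np.ones_like(x), x, u, delayed(u, d), x**2])
    return theta, names

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

def demo(n=400, dt=0.01, d=5, seed=0):
    """Simulate the assumed toy system with explicit Euler and recover its terms."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(-1.0, 1.0, size=n)
    u_del = delayed(u, d)
    x = np.zeros(n + 1)
    for k in range(n):
        x[k + 1] = x[k] + dt * (-2.0 * x[k] + 3.0 * u_del[k])
    # A forward difference matches the Euler right-hand side up to round-off.
    dxdt = (x[1:] - x[:-1]) / dt
    theta, names = build_library(x[:-1], u, d)
    return dict(zip(names, stlsq(theta, dxdt)))

if __name__ == "__main__":
    for name, coef in demo().items():
        print(f"{name:>8s}: {coef: .4f}")
```

On this noise-free toy data the regression keeps only the `x` and delayed-input terms; with measured plant data the derivative estimate, library, and threshold would all need tuning.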
Machine Learning Applications in Dairy Production
June 27, 2025 (v1)
Subject: Numerical Methods and Statistics
Keywords: Algorithms, Artificial Intelligence, Artificial Neural Network, Dairy Production, Machine Learning, Milk
The Fourth Industrial Revolution (Industry 4.0) opens a new chapter for the dairy sector. Dairy 4.0 technologies are based on Big Data Analysis, Internet of Things, Robotics and Machine Learning. The use of smart technologies to process and analyze massive, complex data has a significant impact on automation, optimization, functional costs and innovation. Artificial Intelligence tools are applied from dairy farms and production lines, including packaging, to the supply chain. The aim of this paper is to demonstrate the most used applications of Machine Learning in dairy production so as to enhance the sustainability and the quality of dairy products. The most significant Machine Learning applications integrate machine vision, smart environmental sensors, activity collars, thermal imaging cameras, and digitized supply chain systems to facilitate inventory management. Challenges like milk adulteration, animal diseases, mastitis, traceability and supply chain losses are also addressed... [more]
Integrated hybrid modelling of lignin bioconversion
June 27, 2025 (v1)
Subject: Modelling and Simulations
Keywords: Biosystems, Dynamic Modelling, Lignin Valorization, Machine Learning
Global biomanufacturing is projected to expand rapidly in the coming decade due to advancements in DNA sequencing and manipulation. However, the complexity of cellular behaviour introduces difficulty in modelling and optimizing biomanufacturing processes. Phenomenological models that represent the physics of the system in empirical equations suffer from poor robustness, while their machine learning (ML) counterparts suffer from poor extrapolative capability. On the other hand, hybrid models allow us to leverage both physical constraints and the flexibility of ML. This work describes a new approach for hybrid modeling that integrates the time-variant parameter estimation and ML model training into a single step. We implement this approach on a proposed scheme for the cell-mediated conversion of a lignin derivative into a bioplastic precursor and show that our integrated hybrid model outperforms the traditional two-step hybrid, phenomenological, and ML model counterparts. Lastly, we de... [more]
A Physics-based, Data-driven Numerical Framework for Anomalous Diffusion of Water in Soil
June 27, 2025 (v1)
Subject: Numerical Methods and Statistics
Keywords: Machine Learning, Modelling and Simulations, Numerical Methods, Renewable and Sustainable Energy, Water
Precision modeling and forecasting of soil moisture are essential for implementing smart irrigation systems and mitigating agricultural drought. Most agro-hydrological models are based on the standard Richards equation, a highly nonlinear, degenerate elliptic-parabolic partial differential equation (PDE) with a first-order time derivative. However, research has shown that the standard Richards equation is unable to model preferential flow in soil with fractal structure. In such a scenario, the soil exhibits anomalous non-Boltzmann scaling behavior. Incorporating the anomalous non-Boltzmann scaling behavior into the Richards equation leads to a generalized, time-fractional Richards equation based on fractional time derivatives. As expected, solving the time-fractional Richards equation for accurate modeling of water flow dynamics in soil faces extensive computational challenges. To target these challenges, we propose a novel numerical method that integrates the finite volume method (FVM), adaptiv... [more]
Machine Learning Models for Predicting the Amount of Nutrients Required in a Microalgae Cultivation System
June 27, 2025 (v1)
Subject: Modelling and Simulations
Keywords: Data Mining, Dunaliella carotenogenesis, Machine Learning, Microalgae Cultivation
Effective prediction of nutrient demands is crucial for optimising microalgae growth, maximising productivity and minimising the waste of resources. With the increasing amount of data related to microalgae cultivation systems, data mining and machine learning models to extract additional knowledge have gained popularity. In the development of such models, a data preprocessing stage is necessary due to the poor data quality. At this stage, cleaning and outlier removal techniques are employed to eliminate missing data and outliers, respectively. Afterwards, data splitting and cross-validation strategies are employed to ensure that the models are trained and evaluated with representative subsets of the data. Principal component analysis is also applied to simplify complex environmental datasets by reducing the number of features while retaining as much information as possible. To further improve prediction capabilities, ensemble methods are incorporated, leveraging multiple models to achi... [more]
10. LAPSE:2025.0521
Fed-batch bioprocess prediction and dynamic optimization from hybrid modelling and transfer learning
June 27, 2025 (v1)
Subject: Modelling and Simulations
Keywords: Biosystems, Dynamic Modelling, Dynamic Optimization, Hybrid Modelling, Machine Learning
Hybrid modelling utilizes advantageous aspects of both mechanistic (white box) and data-driven (black box) modelling. Combining the physical interpretability of kinetic modelling with the power of a data-driven Artificial Neural Network (ANN) yields a hybrid (grey box) model with superior accuracy when compared to a traditional mechanistic model, while requiring less data than a purely data-driven model. This study demonstrates the construction of a hybrid model with transfer learning for the predictive modelling and optimization of a high-cell-density microalgal fermentation process for lutein production. Dynamic optimization was conducted to identify a feeding strategy that maximized final lutein production. The results were then experimentally validated. Overall, this work presents a novel digital twin application that can be easily adapted to general bioprocesses for model predictive control and process optimization.
11. LAPSE:2025.0515
Novel PSE applications and knowledge transfer in joint industry - university energy-related postgraduate education
June 27, 2025 (v1)
Subject: Modelling and Simulations
Keywords: Artificial Intelligence, Education, Knowledge Transfer, Machine Learning, Oil and Gas
The field of Process Systems Engineering (PSE) is undergoing a renaissance through the integration of artificial intelligence (AI) and machine learning (ML). This transformation is driven by the vast availability of industrial data and advanced computing power, enabling the practical application of sophisticated ML models. These models enhance PSE capabilities in design, control, optimization, and safety. The progress of ML and ever-present data collection address previously intractable problems, particularly in system integration and life-cycle modeling. ML-powered predictive algorithms are augmenting traditional control systems, showing potential in supply chain optimization and increasing operational resilience. Additionally, ML-driven fault prediction and diagnostics are enhancing process safety systems, allowing for predictive maintenance and minimizing risks of accidents. A case study of the collaboration between the University of West Attica and Helleniq Energy through the MSc p... [more]
12. LAPSE:2025.0507
Beyond ChatGMP: Improving LLM generation through user preferences
June 27, 2025 (v1)
Subject: Intelligent Systems
Keywords: Artificial Intelligence, Education, Industry 4.0, Intelligent Systems, Machine Learning
Prompt engineering, that is, improving the command given to a large language model (LLM), is becoming increasingly useful in order to maximize the performance of the model and therefore the quality of the output. However, in certain instances, the user is not able to enrich the prompt with additional and personalized details, such as the preferred tone and length of the generated response. Therefore, it is useful to create models that learn these preferences and implement them directly in the prompt. Current state-of-the-art inductive logic programming (ILP) systems can play an important role in the development and advancement of digitalization strategies. For example, they can be used to learn personal preferences of users without sacrificing human interpretability of the learned outcomes. These systems have recently witnessed the development of data efficient, robust, and human interpretable algorithms and systems for learning predictive models from data and background knowledge. In this paper,... [more]
13. LAPSE:2025.0459
Physics-informed Data-driven control of Electrochemical Separation Processes
June 27, 2025 (v1)
Subject: Intelligent Systems
Keywords: Intelligent Systems, Machine Learning, Process Control, Reinforcement Learning, Separation
Optimizing the operational conditions of electrochemical separation systems to achieve higher separation efficiency remains a complex challenge due to their nonlinear and dynamic nature. In this work, we propose a Reinforcement Learning (RL)-based control framework to address this challenge. By applying various RL algorithms, we trained an RL-based controller that adapts to different system configurations and conditions. The trained model also learns the optimal trade-off between removal efficiency and energy consumption. Overall, this approach autonomously learns the optimal operational parameters, significantly improving ion removal efficiency. The proposed RL-based control system enhances the performance of electrochemical systems, providing a versatile and adaptive solution for optimizing separation across multiple electrochemical technologies. This work demonstrates the potential of RL in advancing the design and control of sustainable water purification systems.
14. LAPSE:2025.0458
Reinforcement learning for distillation process synthesis using transformer blocks
June 27, 2025 (v1)
Subject: Optimization
Keywords: Artificial Intelligence, Distillation, Machine Learning, Optimization, Process Synthesis, Reinforcement learning, Transformer Blocks
A reinforcement learning framework is developed for the synthesis of distillation trains. The rigorous Naphtali-Sandholm algorithm for equilibrium separation modeling was implemented in JAX and coupled with the benchmarking Jumanji RL library. The vanilla actor-critic agent was successfully trained to build distillation trains for a seven-component hydrocarbon mixture. A transformer encoder structure was used to apply self-attention over the agent's observation. The agent was trained on minimal data representation containing quantitative component flows and relative volatility parameters between present components. Training sessions involving 5·10⁴ episodes (3·10⁵ column designs) were typically run in under 60 minutes. While training was fast and reliable with appropriate tuning of the hyperparameters, further improvements are needed in the generalizability performance for similar separation problems.
15. LAPSE:2025.0457
Hybrid model development for Succinic Acid fermentation: relevance of ensemble learning for enhancing model prediction
June 27, 2025 (v1)
Subject: Energy Systems
Keywords: Fermentation, Hybrid modelling, Machine Learning, Modelling, Modelling and Simulations, Reaction Engineering, Succinic Acid Kinetics
Sustainable development goals have spurred advancements in bioprocess design, driven by improved process monitoring, data storage, and computational power. High-fidelity models are essential for advanced process system engineering, yet accurate parametric models for bioprocessing remain challenging due to overparameterization, often resulting in poor predictive accuracy. Hybrid modeling, combining parametric and non-parametric methods, offers a promising solution by enhancing accuracy while maintaining interpretability. This study explores hybrid models for succinic acid fermentation by Escherichia coli, a critical process for sustainable bio-based chemical production. The research presents a structured exploration of hybrid model architectures and their robustness under varying conditions. Experimental data were preprocessed to remove noise and outliers, and hybrid model structures were developed with differing levels of hybridization (from one to all reaction rates). Kinetic paramete... [more]
16. LAPSE:2025.0456
Predicting Surface Tension of Organic Molecules using COSMO-RS Theory and Machine Learning
June 27, 2025 (v1)
Subject: Modelling and Simulations
Keywords: COSMO-RS, First-Principle modeling, Hybrid Modeling, Machine Learning, Surface tension
Surface tension is a fundamental property at the liquid/gas interface, influencing phenomena such as capillary action, droplet formation, and interfacial behavior in chemical engineering processes. Despite its significance, experimental determination of surface tension is time-intensive and impractical for in silico-designed compounds. Predictive models are essential for bridging this gap. This study expands on Gaudin's COSMO-RS-based model, which assumes uniform molecular orientation at the surface, by testing its predictive capability across broader temperatures (5-50°C) and developing a hybrid model combining first-principle and machine learning insights to improve Gaudin's model predictions. The HM employs a serial configuration where COSMO-RS predictions serve as inputs alongside molecular descriptors, derived using the Mordred library. SHAP analysis guides feature selection, enhancing model interpretability. An artificial neural network refines predictions, optimized via Bayesian... [more]
17. LAPSE:2025.0454
A Comparative Analysis of Industrial MLOps prototype for ML Application Deployment at the edge devices
June 27, 2025 (v1)
Subject: Numerical Methods and Statistics
Keywords: Artificial Intelligence, Big Data, Edge Intelligence, Energy Efficiency, Industry 4.0, Machine Learning
This paper introduces a prototype for constructing an edge AI system utilizing the contemporary Machine Learning Operations (MLOps) concept. By employing microcontrollers such as the Raspberry Pi as hardware, our methodology includes data scrubbing and machine learning model deployment on edge devices. Crucially, the MLOps pipeline is fully developed within the ecoKI platform, a research platform for ML/AI applications. In this study, we thoroughly investigate the performance of our ecoKI platform by comparing it with the established Edge Impulse platform. We deployed the ML model with different weight quantization methods, such as FP32 and INT8, to compare accuracy variations and inference speed between these two platforms and quantization strategies on edge devices. In our experiments, we found that the average accuracy of the ecoKI platform is 3.61% better than that of Edge Impulse. Moreover, real-time AI processing on edge devices enables microcontrollers, even those w... [more]
18. LAPSE:2025.0453
A Novel Approach to Gradient Evaluation and Efficient Deep Learning: A Hybrid Method
June 27, 2025 (v1)
Subject: Modelling and Simulations
Deep learning faces significant challenges in efficiently training large-scale models. These issues are closely linked, as efficient training often depends on precise and computationally feasible gradient calculations. This work introduces innovative methodologies to improve deep learning network (DLN) training in complex systems. A novel approach to DLN training is proposed by adapting the block coordinate descent (BCD) method, which optimizes individual layers sequentially. This is combined with traditional batch-based training to create a hybrid method that harnesses the strengths of both techniques. Additionally, the study explores Iterated Control Random Search (ICRS) for initializing parameters and applies quasi-Newton methods like L-BFGS with restricted iterations to enhance optimization. By tackling DLN training efficiency, this contribution offers a comprehensive framework to address key challenges in modern machine learning. The proposed methods improve scalability and effect... [more]
19. LAPSE:2025.0452
Streamlining Catalyst Development through Machine Learning: Insights from Heterogeneous Catalysis and Photocatalysis
June 27, 2025 (v1)
Subject: Materials
Keywords: Alternative Fuels, Catalysis, Environment, Fischer-Tropsch Synthesis, Machine Learning, Modelling, Optimization, Photocatalysis
Catalysis design and reaction condition optimization are considered the heart of many chemical and petrochemical processes and industries; however, there are still significant challenges in these fields. Advances in machine learning (ML) have provided researchers with new tools to address some of these obstacles, offering the ability to predict catalyst behaviour, optimal reaction conditions, and product distributions without the need for extensive laboratory experimentation. In this contribution, the potential applications of ML in heterogeneous catalysis and photocatalysis are explored by analysing datasets from different reactions, including Fischer-Tropsch synthesis and photocatalytic pollutant degradation. First, datasets were collected from literature. After cleaning and preparing the datasets, they were employed to train and test several models. The best model for each dataset was selected and applied for optimization.
20. LAPSE:2025.0450
ML-based adsorption isotherm prediction of metal-organic frameworks for carbon dioxide and methane separation adsorbent screening
June 27, 2025 (v1)
Subject: Modelling and Simulations
The efficient separation of carbon dioxide (CO2) and methane (CH4) is crucial for chemical processes, including biogas upgrading and natural gas purification. Metal-organic frameworks (MOFs) have gained significant attention as promising adsorbents for these processes due to their high porosity and tunable structures. Estimating the adsorption capacity of MOFs is essential for screening high performing adsorbents. While molecular simulations are commonly used to estimate the adsorption capacities, their computational intensity acts as a bottleneck in screening MOF adsorbents. In this study, we propose a machine learning (ML)-based framework for the high-throughput prediction of adsorption isotherms for CO2 and CH4 in MOFs. A graph neural network (GNN) model was developed to predict adsorption capacities, effectively replacing the time-consuming molecular simulations. The GNN model processes the structural graphs of MOFs, capturing their spatial configurations, such as surface structure... [more]
21. LAPSE:2025.0448
Towards Self-Tuning PID Controllers: A Data-Driven, Reinforcement Learning Approach for Industrial Automation
June 27, 2025 (v1)
Subject: Intelligent Systems
Keywords: Industry 4.0, Intelligent Systems, Machine Learning, Process Control, Surrogate Model
As industries embrace the digitalization of Industry 4.0, the abundance of process data creates new opportunities to optimize industrial control systems. Traditional Proportional-Integral-Derivative (PID) controllers often require manual tuning to address changing conditions. This paper introduces an automated, adaptive PID tuning method using historical data and machine learning for a continuously evolving, data-driven approach. The method centers on training a surrogate model using historical process data to replicate real system behavior under various conditions. This enables safe exploration of control strategies without disrupting live operations. An RL (Reinforcement Learning) agent interacts with the surrogate model to learn optimal control policies, dynamically responding to the plant's state, defined by variables like operational conditions and measured disturbances. The agent adjusts PID parameters in real-time, optimizing metrics such as stability, response time, and energy... [more]
22. LAPSE:2025.0447
Selection of Fitness Criteria for Learning Interpretable PDE Solutions via Symbolic Regression
June 27, 2025 (v1)
Subject: Modelling and Simulations
Physics-Informed Symbolic Regression (PISR) offers a pathway to discover human-interpretable solutions to partial differential equations (PDEs). This work investigates three fitness metrics within a PISR framework: PDE fitness, Bayesian Information Criterion (BIC), and a fitness metric proportional to the probability of a model given the data. Through experiments with Laplace's equation, Burgers' equation, and a nonlinear wave equation, we demonstrate that incorporating information theoretic criteria like BIC can yield higher fidelity models while maintaining interpretability. Our results show that BIC-based PISR achieved the best performance, identifying an exact solution to Laplace's equation and finding solutions with R²-values of 0.998 for Burgers' equation and 0.957 for the nonlinear wave equation. The inclusion of the Bayes D-optimality criterion in estimating model probability strongly constrained solution complexity, limiting models to 3-4 parameters and reducing accuracy. Thes... [more]
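For readers unfamiliar with the criterion named in this abstract, the standard textbook definition of BIC is reproduced below. The symbols are not taken from the paper: k is the number of fitted parameters, n the number of data points, \hat{L} the maximized likelihood, and RSS the residual sum of squares. The second form assumes independent Gaussian residuals with unknown variance; how the authors evaluate the likelihood for PDE residuals is not stated in the visible text.

```latex
\mathrm{BIC} = k \ln n - 2 \ln \hat{L},
\qquad
\mathrm{BIC}_{\text{Gaussian}} = n \ln\!\left(\frac{\mathrm{RSS}}{n}\right) + k \ln n + \text{const.}
```

Lower values are preferred, so an extra parameter is only worthwhile when it reduces the misfit term by more than \ln n.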
23. LAPSE:2025.0446
On the role of artificial intelligence in feature oriented multi-criteria decision analysis
June 27, 2025 (v1)
Subject: Modelling and Simulations
Keywords: Artificial Intelligence, Key performance indicator, Machine Learning, Multi-Criteria Decision Analysis
Balancing economic and environmental goals in industrial applications is critical amid challenges like climate change. Multi-objective optimization (MOO) and multi-criteria decision analysis (MCDA) are key tools for addressing conflicting objectives. MOO generates viable solutions, while MCDA selects the optimal option based on key performance indicators such as profitability, environmental impact, safety, and efficiency. However, large datasets pose a challenge in selecting the preferred solution during the MCDA process. This study introduces a novel machine learning-enhanced MCDA framework and applies the method to analyze decarbonization solutions for a European refinery. A stage-wise dimensionality reduction method, combining AutoEncoders and Principal Component Analysis (PCA), is applied to simplify high-dimensional datasets while preserving key spatial features. Geometric analysis techniques, including Intrinsic Shape Signatures (ISS), are employed to refine the identification of... [more]
24. LAPSE:2025.0444
Optimization of Shell and Tube Heat Exchangers Using Reinforcement Learning
June 27, 2025 (v1)
Subject: Optimization
Keywords: design optimization, heat exchanger, Machine Learning, reinforcement learning
This work presents a model for optimizing shell-and-tube heat exchanger design using Q-learning, a reinforcement learning technique. An agent is trained to interact with a simulated environment of a heat exchange model, iteratively refining design configurations to maximize a reward function. This reward function balances heat exchanger effectiveness and pressure drop, emphasizing designs that minimize pressure drop. Results showed that simpler configurations consistently achieved higher rewards, despite complex designs offering better heat transfer efficiency.
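The Q-learning loop described in this abstract can be sketched on a toy problem. Everything numeric below is hypothetical and not taken from the paper: five design levels of increasing complexity, made-up effectiveness and pressure-drop values, a reward of effectiveness minus a weighted pressure drop, and the function names `reward`, `step`, `train`, and `greedy_design`. The sketch only shows how a tabular agent can end up preferring a simpler design when pressure drop is penalized.

```python
import numpy as np

# Hypothetical toy numbers, not from the paper: five design levels of
# increasing complexity with invented effectiveness and pressure drop.
EFFECTIVENESS = np.array([0.50, 0.62, 0.70, 0.75, 0.78])
PRESSURE_DROP = np.array([0.10, 0.25, 0.50, 0.90, 1.50])
ACTIONS = np.array([-1, 0, 1])  # simplify, keep, or complicate the design

def reward(state, weight=0.5):
    """Assumed reward: effectiveness minus a weighted pressure-drop penalty."""
    return EFFECTIVENESS[state] - weight * PRESSURE_DROP[state]

def step(state, action_idx):
    """Move to the neighbouring design level (clipped) and score it."""
    nxt = int(np.clip(state + ACTIONS[action_idx], 0, len(EFFECTIVENESS) - 1))
    return nxt, reward(nxt)

def train(episodes=3000, steps=20, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = np.random.default_rng(seed)
    q = np.zeros((len(EFFECTIVENESS), len(ACTIONS)))
    for _ in range(episodes):
        s = int(rng.integers(len(EFFECTIVENESS)))
        for _ in range(steps):
            if rng.random() < eps:
                a = int(rng.integers(len(ACTIONS)))  # explore
            else:
                a = int(np.argmax(q[s]))             # exploit
            s2, r = step(s, a)
            # Q-learning update: bootstrap from the best next-state value.
            q[s, a] += alpha * (r + gamma * q[s2].max() - q[s, a])
            s = s2
    return q

def greedy_design(q, start, steps=10):
    """Follow the learned greedy policy from a starting design level."""
    s = start
    for _ in range(steps):
        s, _ = step(s, int(np.argmax(q[s])))
    return s

if __name__ == "__main__":
    q = train()
    print([greedy_design(q, s) for s in range(len(EFFECTIVENESS))])
```

With these invented numbers the reward peaks at the second-simplest level, so the greedy policy settles there from any start; a real study would replace `step` with a thermal-hydraulic model of the exchanger.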
25. LAPSE:2025.0443
An Integrated Machine Learning Framework for Predicting HPNA Formation in Hydrocracking Units Using Forecasted Operational Parameters
June 27, 2025 (v1)
Subject: Modelling and Simulations
Keywords: Catalyst Deactivation, Heavy Polynuclear Aromatics (HPNAs), Hydrocracking Unit Optimization, LSTM, Machine Learning, Simulation
The accumulation of heavy polynuclear aromatics (HPNAs) in hydrocracking units (HCUs) poses significant challenges to catalyst performance and process efficiency. This study proposes an integrated machine learning framework that combines ridge regression, K-means, and long short-term memory (LSTM) neural networks to predict HPNA formation, enabling proactive process management. For the training phase, weighted average bed temperature (WABT), catalyst deactivation phase (clustered using unsupervised K-means clustering), and hydrocracker feed (HCU feed) parameters obtained from laboratory analyses are utilized to capture the complex nonlinear relationships influencing HPNA formation. In the simulation phase, forecasted WABT values are generated using a ridge regression model, and future HCU feed changes are derived from planned crude oil blend data provided by the planning department. These forecasted WABT values, predicted catalyst deactivation phases, and anticipated HCU feed parameters s... [more]


