Records with Keyword: Machine Learning
Showing records 26 to 50 of 842.
Physics-informed Graph Neural Networks to Predict Thermodynamically Consistent Activity Coefficients in Multicomponent Mixtures
Lifeng Zhang, Benoît Chachuat, Claire S. Adjiman
June 12, 2026 (v1)
Keywords: Activity Coefficients, Graph Neural Network, Machine Learning, Physics-informed, Thermodynamic consistency
Activity coefficients are key thermodynamic quantities for describing phase equilibria, but their experimental determination entails laborious and costly phase-equilibrium measurements, making predictive approaches highly desirable. The potential of machine learning for such predictions has received growing attention as an alternative to physics-based models that require experimental data or expensive calculations for parameterization. We propose a physics-informed edge-enhanced graph attention network (PEGAT) to predict activity coefficients in multicomponent mixtures, where each molecule is encoded as a graph in which the nodes correspond to atoms and the edges to chemical bonds. The excess Gibbs free energy of the mixture is predicted using the proposed model, including a nonlinear transformation in the final layer to ensure that the excess Gibbs free energy vanishes for pure components. To further enforce thermodynamic consistency, the relevant activity coefficients are obtained vi... [more]
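The abstract above is truncated, so the exact route used by PEGAT is not shown here. As a hedged illustration of what thermodynamic consistency means in this setting, the sketch below swaps in a one-parameter Margules model for the neural network (an assumption, with a hypothetical parameter `A`). Its excess Gibbs energy vanishes for pure components, and ln γ_i is obtained by differentiating the total excess Gibbs energy with respect to the mole numbers. Activity coefficients derived this way satisfy g^E/RT = Σ x_i ln γ_i by construction.

```python
# Illustrative only: a Margules model stands in for the learned g^E model.
A = 1.2  # hypothetical dimensionless Margules parameter, not taken from the paper


def total_excess_gibbs(n1, n2):
    """Total excess Gibbs energy n*g^E/(RT) of a binary one-parameter Margules mixture."""
    return A * n1 * n2 / (n1 + n2)


def ln_gammas(n1, n2, h=1e-6):
    """ln(gamma_i) as partial derivatives of n*g^E/(RT) w.r.t. n_i (central differences)."""
    d1 = (total_excess_gibbs(n1 + h, n2) - total_excess_gibbs(n1 - h, n2)) / (2 * h)
    d2 = (total_excess_gibbs(n1, n2 + h) - total_excess_gibbs(n1, n2 - h)) / (2 * h)
    return d1, d2


x1, x2 = 0.3, 0.7
ln_g1, ln_g2 = ln_gammas(x1, x2)

# Consistency check: summing x_i * ln(gamma_i) recovers g^E/(RT) = A*x1*x2.
g_excess_from_gammas = x1 * ln_g1 + x2 * ln_g2
```

For this model the derivatives reduce to the closed forms ln γ1 = A·x2² and ln γ2 = A·x1², which gives a quick way to check the numerical differentiation.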
A Neural Model of Pinch-Based Multicomponent Distillation for Applications in Flowsheet Synthesis
Alexander B. Wolf, Mirko Skiborowski, Jakob Burger
June 12, 2026 (v1)
Keywords: Distillation, Machine Learning, Modelling and Simulations, Process Design, Surrogate Model
This work presents a data-driven surrogate modeling framework for predicting distillation behavior assuming an infinite number of stages and distillation limits informed by residue-curve topology and pinch-point feasibility analysis. The framework provides a direct mapping from feed composition and distillate-to-feed ratio (D/F) to distillate and bottom product compositions, making it suitable for flowsheet synthesis and optimization applications. The approach combines three components: a classifier that identifies feasible singular-point splits, a boundary regression model that predicts D/F limits separating pure- and mixed-product operating regimes, and a neural network that interpolates product compositions in the intermediate regime. The method is demonstrated for the ternary system ethanol, benzene, and water at 1 atm using data generated from rigorous vapor-liquid-liquid equilibrium analysis. Results show that the framework provides reliable predictions for pure splits while reta... [more]
Exploiting Input-Space Separation in Kolmogorov-Arnold Networks to Prevent Catastrophic Forgetting in Industrial NIR Systems
Imam M. Iqbal, Isabell Viedt, Leon Urbas
June 12, 2026 (v1)
Keywords: Artificial Intelligence, Industry 4.0, Machine Learning, Modelling, Process Monitoring
Near-infrared (NIR) sorting systems in waste sorting plants operate under multiple settings, creating distinct input-output relationships that challenge predictive modeling. Conventional neural networks, such as multilayer perceptrons (MLPs), often suffer from catastrophic forgetting under continual training, limiting reliability across settings. This study evaluates Kolmogorov-Arnold Networks (KANs) for continual regression modeling of multi-setting NIR systems. KANs assign nonlinear transformations to network edges using localized spline grids, enabling structural isolation between input regions. We introduce controlled input-space manipulations (shifting successive settings to adjacent or non-overlapping grid regions) and compare KAN performance with MLPs of comparable parameter count. We also examine single-input versus multi-input configurations to assess dimensionality effects. Results show that KANs with sufficient input-space separation maintain previously learned knowledge with pe... [more]
Bayesian Optimization Framework for Agrochemical Formulation Design
Yipei Zhao, Robin Wesley, Joan Cordiner
June 12, 2026 (v1)
Keywords: Agrochemical Formulation, Bayesian Optimisation, Gaussian Processes, Machine Learning, Space-Filling Designs
Manufacturing kinetically stable products remains a challenge in the agrochemical industry. Current agrochemical formulation design relies on semi-empirical and trial-and-error methods. The inconsistency is caused by the lack of a mechanistic understanding of the formulation, making the design a black-box optimisation problem. In addition, validating the ground truth of the high-dimensional design space is expensive, driving chemists to explore possible solutions using data-driven methods. We proposed a Bayesian optimisation framework employing a Gaussian process as the surrogate model to intelligently guide the screening of the design space. The uniqueness of our framework is the application to the classification task to increase the number of hits of stable formulation recipes. The framework was tested on a provided industry dataset with a focus on emulsifiable concentrates. The performance reached a comparable accuracy with only ~25% of the data being sampled and hit more stable for... [more]
A Modeling Framework Integrating Data Trends and Reference Information for Predicting Temperature-Dependent Thermophysical Properties
Shuai Zhang, Abdulelah S. Alshehri, Mansour S. Alhoshan, Anjan Tula
June 12, 2026 (v1)
Keywords: Bias correction, Hybrid modeling, Machine learning, Mechanistic constraints, Slope-based correction, Temperature-dependent property prediction
The availability of temperature-dependent physicochemical property data forms the cornerstone of process simulation, optimization, and sustainable molecular and product design. However, a critical data gap persists, as experimental measurements are accessible for only a small subset of known chemicals. This renders experimental characterization resource-prohibitive, often compelling reliance on empirical estimation methods. Moreover, although many models offer single-point predictions at fixed temperatures, accurately modeling continuous temperature-dependent behavior remains challenging. Conventional methods frequently overlook intermediate variations, resulting in limited extrapolation capability. To overcome these limitations, we introduce a mechanism-guided hybrid modeling framework that integrates physical insights into data-driven models. This framework is built on two strategies. Strategy I targets trend correction by generating a continuous representation from discrete single-p... [more]
Dynamic Modelling of Renewable-driven CO2 Methanation using Recurrent Neural Networks
M. Andrea Pappagallo, Diego A. Romero Lombo, Mattia Vallerio, Emanuele Moioli
June 12, 2026 (v1)
Keywords: Chemical reaction engineering, Dynamic surrogate modelling, Energy systems analysis, Machine learning, Recurrent neural networks
A recurrent neural network (RNN) model for a CO2 methanation reactor was developed based on synthetic data generated from a validated mechanistic model of the same unit. The model was used to predict the main properties of the reactor - methane productivity and hotspot temperature - during a dynamic operation of the unit. The dynamic profile of feedstock availability was simulated taking into account the H2 flow that can be produced from PV-powered water electrolysis using solar irradiation profiles over a year in Milan, Italy. The dataset therefore consists of 366 data instances (one for each day), each composed of one datapoint per minute of sunlight. The best agreement between the predictions from the RNN and the target output values from the mechanistic model was found using a shallow RNN of 20 hidden-layer neurons, trained with a batch size of 10 and an 80/20 training-testing split. This showed that RNNs can constitute a reliable tool for dynamic surrogate modelling of energy conv... [more]
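As a rough illustration of the dataset bookkeeping described above, the sketch below partitions 366 daily instances with an 80/20 split and groups the training days into batches of 10. The chronological ordering of the split and the helper names `split_days` and `make_batches` are assumptions; the abstract does not say whether the split was chronological or random.

```python
N_DAYS = 366          # data instances reported in the abstract (one per day)
TRAIN_FRACTION = 0.8  # 80/20 training-testing split from the abstract
BATCH_SIZE = 10       # batch size from the abstract


def split_days(n_days, train_fraction):
    """Return (train, test) day indices. ASSUMPTION: chronological, not random."""
    n_train = int(n_days * train_fraction)
    days = list(range(n_days))
    return days[:n_train], days[n_train:]


def make_batches(indices, batch_size):
    """Group indices into consecutive batches; the last batch may be smaller."""
    return [indices[i:i + batch_size] for i in range(0, len(indices), batch_size)]


train_days, test_days = split_days(N_DAYS, TRAIN_FRACTION)
train_batches = make_batches(train_days, BATCH_SIZE)
```

Each day is itself a variable-length sequence (one datapoint per minute of sunlight), so a real training loop would also need padding or per-sequence handling, which is not shown here.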
A Data-Efficient Symbolic Regression Framework for Automated Interpretable Bioprocess Modelling
Luca Riezzo, Alexander Rogers, Harry Kay, Dongda Zhang
June 12, 2026 (v1)
Keywords: Augmented intelligence, Biochemical reaction kinetics, Data intelligence, Machine learning, Symbolic Regression
Bioprocess modelling, optimisation and scale-up are central to improving sustainable manufacturing within the pharmaceutical and chemical industries. However, developing accurate bioprocess digital twins remains challenging. Conventional mechanistic models are difficult to construct because of limited mechanistic understanding and the high complexity of cellular metabolism. While data-driven models have gained popularity, they often require large amounts of experimental data that are time-consuming to obtain, and they lack any quantitative description of the process. Hybrid modelling methods have emerged as promising alternatives; however, they fail to provide physical insight into the root cause of model error. This work therefore addresses these issues by developing a data-efficient symbolic regression (SR) based framework to enable the automated discovery of interpretable bioprocess models. A universal kinetic model backbone was used to capture overall process behaviour... [more]
Development of a Predictive Model for Microbial Growth under Variable Conditions Using a Multilayer Perceptron Neural Network: Application to Candida guilliermondii
Jazmín Cortez-González, Juan Gabriel Segovia-Hernández, Salvador Hernández, Varinia López-Ramírez, Arturo Hernández-Aguirre, Rodolfo Murrieta-Dueñas
June 12, 2026 (v1)
Keywords: Artificial Intelligence, Biomass, Machine Learning, microbial growth, Modelling and Simulations, Optimization
In the field of biochemical process design, the accurate modeling of microbial growth is essential for the development and optimization of biological reactors used in the production of high-value compounds. Achieving this objective requires a detailed understanding of how environmental factors, such as pH and nutrient availability, influence microbial dynamics across the four distinct growth phases: lag, exponential, stationary, and death. Traditionally, reactor design relies heavily on the Monod model, which provides a simplified representation of microbial growth, focusing primarily on the exponential phase under constant operating conditions (1). However, this model presents substantial limitations when applied to dynamic environments where key parameters vary over time. To overcome these constraints, the present study proposes a data-driven modeling approach using a multilayer perceptron (MLP) artificial neural network for the prediction of microbial growth trajectories under varying... [more]
Physics-Informed Neural Networks for NIR Spectroscopy Analysis of Pharmaceutical Tablet Properties
Xinle Zhang, Shumaiya Furdoush, Marcial Gonzalez, Gintaras V. Reklaitis
June 12, 2026 (v1)
Keywords: Industry 4.0, Machine Learning, Near Infrared Spectroscopy, Pharmaceutical Tablets, Physics-Informed Neural Networks
In pharmaceutical process engineering, accurate prediction of tablet properties is crucial for ensuring product quality, optimizing manufacturing efficiency, and advancing sustainable production practices. This study presents a physics-informed neural network (PINN) framework for predicting the physical properties of pharmaceutical tablets from near-infrared (NIR) spectra. The PINN framework integrates revised Kubelka-Munk theory and physical constraints to ensure physically consistent predictions while requiring less training data than conventional artificial neural networks. Tablets were manufactured using acetaminophen and microcrystalline cellulose formulations with varying compositions and compression settings. The PINN framework successfully predicts critical quality attributes, including tensile strength, porosity, and density. It offers a data-efficient, interpretable solution for pharmaceutical tablet quality control.
A Machine Learning Implementation for Fermentation Quality Prediction in Wine Manufacturing
Matthew A.J. Hill, Dimitrios I. Gerogiorgis
June 12, 2026 (v1)
Keywords: alcoholic fermentation, artificial neural network, efficacy, fermentation time, machine learning, random forest regression, secondary metabolite concentration, support vector regression
Wine consumers are increasingly health- and environmentally conscious. At the same time, white wine and rosé drinkers favour freshness and varietal aromas, which requires low-temperature regimes that extend fermentation time and increase energy demand. Additionally, global warming accelerates grape ripening, which increases the alcohol level in wine. To reduce cost and alcohol levels while maintaining quality, predictive tools that forecast how fermentation conditions impact fermentation time, and primary and secondary metabolite concentrations, can provide practical benefits to wineries by expediting oenological decision-making and in turn reducing energy demand. Additionally, the literature highlights that static models in smart manufacturing suffer from performance degradation under data drift. In light of this, we successfully developed and evaluated pipelines for the automated design and training of three ML methods - support vector regression, random forest and artificial neural networks - to... [more]
Chemical Additives in Plastics: Understanding the Reactions, Fate, and Releases during Pyrolysis
Ronald Borja-Roman, Andres Castellar-Freile, John D. Chea, Monica Rodriguez Morris, Gerardo J. Ruiz-Mercado, Kirti M. Yenkie
June 12, 2026 (v1)
Keywords: Environment, Machine Learning, Plastic Recycling, Reaction Engineering, Stochastic Simulations
Plastic pyrolysis is widely promoted as a techno-economic industrial scale recycling strategy. Nevertheless, the fate and reactivity of plastic chemical additives during pyrolysis are mostly overlooked in product quality and environmental release assessments. Here, we present an integrated modeling framework to elucidate the role of additives in plastic pyrolysis and evaluate the implications of their transformation products and environmental releases. Using high-density polyethylene (HDPE) as a case study, chemical additives of concern are selected based on occurrence, concentration data, and potential risk to human health and the environment. Bond dissociation energies are predicted using a machine learning model to identify dominant radical species formed under pyrolytic conditions. These additive-derived radicals are incorporated into an automatic chemical reaction mechanism generator that constructs kinetic models composed of elementary chemical reaction steps. These kinetic model... [more]
OpenAD-lib: Open-Source Framework for Uncertainty-Aware Anaerobic Digestion Digital Twins
Benaissa Dekhici, Rohit Murali, Michael Short
June 12, 2026 (v1)
Keywords: Anaerobic digestion, Bioenergy, Digital twins, Digitalisation, Machine learning, Model Predictive Control, Open-source framework, Uncertainty quantification
This paper presents OpenAD-lib, an open-source Python framework for anaerobic digestion (AD) digital twins, unifying mechanistic models, machine learning (ML) surrogates, and model predictive control (MPC) within a modular ecosystem. OpenAD-lib addresses the critical fragmentation in AD digitalisation by bridging mechanistic and data-driven paradigms under explicit uncertainty. By integrating uncertainty-aware feedstock characterisation with robust process control, the platform enables the transition from isolated research tools to fully integrated digital twins, delivering economic and environmental value in AD systems.
Hybrid Modeling of a Sewage-Sludge Gasifier using Flowsheet Simulation and Machine Learning
Malte Lutz, William Würpel, Fabian E. Habicht, Burcu Aker, Jan C. Schöneberger
June 12, 2026 (v1)
Keywords: data-driven, flowsheet simulation, gasification, hybrid model, machine learning, sewage sludge
This work presents a hybrid modelling approach for a downdraft sewage sludge gasifier within the Shit2Power (S2P) process. The gasifier is represented in CHEMCAD NXT by a series of four standard reactors that combine stoichiometric and equilibrium models with a data-driven correction step to account for deviations from ideal Gibbs equilibrium. Reaction conversions in the correction reactor are fitted to experimental synthesis gas compositions reported by Werle (2014) [7] for 30 operating points with varying equivalence ratios and reactor inlet air temperatures. The calibrated hybrid reactor model is evaluated against these data and shows conservative agreement for the combustible gas components of the synthesis gas. To overcome the limitations of linear interpolation between fitted operating points, several machine learning approaches are evaluated to predict the reaction conversions, and boosted neural networks are selected as a compromise between prediction accuracy and smooth behavi... [more]
Source Code (VBA): Data exchange between flow chart simulation (CHEMCAD) and machine learning model (Python)
Malte Lutz, William Würpel, Fabian E. Habicht, Burcu Arker, Jan C. Schöneberger
January 30, 2026 (v1)
Keywords: CHEMCAD, Flowsheet Simulation, Gasification, Machine Learning, VBA
Included in the CHEMCAD flowchart simulation model, this VBA source code extracts the inputs for the machine learning (ML) model from the simulation, passes them to the ML model (Python), and sends the ML model outputs back to the simulation model. This code is not provided for direct use, but rather to demonstrate the methodology and as an aid for interested parties to develop their own solutions.
Supporting Information for: Beyond Tennessee Eastman: Benchmarking Deep Anomaly Detection on Real-World Pilot-Scale Continuous Distillation Data
Fabian Hartung, Aparna Muraleedharan, Marius Kloft, Jakob Burger
February 2, 2026 (v1)
Keywords: Anomaly Detection, Continuous Distillation, Heteroazeotropic Distillation, Machine Learning, Pilot Plant Data, Tennessee Eastman Process
Anomaly detection is essential for keeping chemical plants safe and running efficiently. Although many deep-learning methods have been proposed, most are still tested mainly on synthetic benchmarks such as the Tennessee Eastman Process (TEP). While these simulators enable fair comparisons, they do not reflect the noise, complexity, and irregular fault behavior of real industrial plants. As a result, it remains unclear how well these models generalize in practice. In this work, we extend our earlier ESCAPE study and move beyond water systems to industrially relevant chemical processes. We analyze data from two continuously operated pilot plant scenarios at the Technical University of Munich: n-butanol/water heteroazeotropic distillation and poly(oxymethylene) ether purification. We published these datasets for the first time at NeurIPS 2025. In this work, 30 anomaly detection methods, including 26 deep-learning and 4 classical approaches, are benchmarked using the open-source TimeSeAD l... [more]
Utilizing Machine Learning for Phenomena-based Synthesis of Intensified Process Flowsheets: Supplementary Material
Omar Alqusair, Jie Li
January 31, 2026 (v1)
Supplementary material for the article "Utilizing Machine Learning for Phenomena-based Synthesis of Intensified Process Flowsheets", submitted to The 36th European Symposium on Computer Aided Process Engineering (ESCAPE 36). The document includes information about the heuristic and sampling logic rules used in generating the initial dataset, and the grid search results for hyperparameter optimization.
Nonmyopic Bayesian process optimization with a finite budget
Jose Luis Pitarch, Leopoldo Armesto, Antonio Sala
July 11, 2025 (v1)
Subject: Optimization
Optimization under uncertainty is inherent to many PSE applications ranging from process design to RTO. Reaching process true optima often involves learning from experimentation, but actual experiments involve a cost (economic, resources, time) and a budget limit usually exists. Finding the best trade-off on cumulative process performance and experimental cost over a finite budget is a Partially Observable Markov Decision Process (POMDP), known to be computationally intractable. This paper follows the nonmyopic Bayesian optimization (BO) approximation to POMDPs developed by the machine-learning community, that naturally enables the use of hybrid plant surrogate models formed by fundamental laws and Gaussian processes (GP). Although nonmyopic BO using GPs may look more tractable, evaluating multi-step decision trees to find the best first-stage candidate action to apply is still expensive with evolutionary or NLP optimizers. Hence, we propose modelling the value function of the first-st... [more]
Enhancing Predictive Maintenance in Used Oil Re-Refining: a Hybrid Machine Learning Approach
Francesco Negri, Andrea Galeazzi, Francesco Gallo, Flavio Manenti
July 8, 2025 (v1)
Maintenance is critical for industrial plants to ensure operational reliability and worker safety. In process industries, fouling, the accumulation of solid residues in equipment, poses a significant challenge, causing inefficiencies and productivity losses. Effective modeling of fouling evolution over time is essential for maintenance planning to prevent equipment from operating under suboptimal conditions. Traditional approaches to fouling prediction include equation-based models, which offer high precision but may struggle with continuously changing process boundaries, and machine learning techniques, which are more adaptable but less effective at capturing rapidly evolving trends driven by complex underlying physics. This study introduces an innovative hybrid machine learning approach for predictive maintenance, combining the strengths of both methods. Pressure differential is modeled using an equation-based approach that links pressure data with fouling thickness, while the foul... [more]
Hybrid Models Identification and Training through Evolutionary Algorithms
Ulderico Di Caprio, M. Enis Leblebici
July 2, 2025 (v2)
Keywords: automatic identification, differential evolution, epistemic uncertainty, hybrid modelling, Machine Learning
Hybrid modelling is widely employed in chemical engineering to generate highly accurate predictions. Such an approach merges first-principle modelling with machine learning techniques to identify and model the epistemic uncertainty from experimental data. Despite its advantages, this still requires cross-domain competencies that are difficult to find in the chemical industry and high human involvement. The possibility of automating the identification and training model would be significantly beneficial for the widespread adoption of hybrid modelling methodology within the chemical industry. This work presents a novel algorithm for the automatic identification of hybrid models (HMs) starting from the first-principle representation of the system, described by differential equation sets. The methodology formulates the problem as mixed-integer programming, identifying the equation running under uncertainty, identifying the machine learning model hyperparameters, and training the latter. Th... [more]
Data-driven Digital Design of Pharmaceutical Crystallization Processes
Yash Barhate, Yung Shun Kang, Neda Nazemifard, Ben Renner, Yihui Yang, Charles Papageorgiou, Zoltan K. Nagy
June 27, 2025 (v1)
Mechanistic population balance modeling (PBM) has advanced the design of pharmaceutical crystallization processes, enabling the production of active pharmaceutical ingredient (API) crystals with desired critical quality attributes (CQAs), such as purity and crystal size distribution. However, PBM development can sometimes be resource-intensive, requiring extensive design of experiments (DoE) and high-quality process data, making it impractical under fast-paced industrial development timelines. This study proposes a machine learning (ML)-based workflow for developing ‘fit-for-purpose’ digital twins of crystallization processes, leveraging industrially available DoE data to link operating conditions with CQAs. Validated on industrial data for a commercial API with complex crystallization challenges, the workflow efficiently identifies optimal operating conditions, demonstrating the potential of data-driven digital twins to accelerate the development of pharmaceutical processes.
Data-driven Modeling of a Continuous Direct Compression Tableting Process using SINDy
Pau Lapiedra Carrasquer, Satyajeet S. Bhonsale, Carlos André Muñoz López, Kristof Dockx, Jan F.M. Van Impe
June 27, 2025 (v1)
Keywords: Big Data, Dynamic Modelling, Industry 4.0, Machine Learning, Modelling, SINDy
Understanding the complex dynamics of continuous processes in pharmaceutical manufacturing is essential to ensure product quality across the production line. This paper presents a data-driven modeling approach using Sparse Identification of Nonlinear Dynamics with Control (SINDYc) to capture the dynamics of a continuous direct compression (CDC) tableting line. By incorporating delayed control inputs into the candidate function library, the model effectively captures deviations from steady state in response to dynamic changes. The proposed model was developed by finding a balance between accuracy and sparsity, with focus on the ability to generalize to a wide range of operating conditions.
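To make the idea of a candidate function library with delayed control inputs more concrete, here is a minimal sketch for a scalar state and a scalar input. The function name `build_library`, the choice of terms (constant, state, delayed inputs, and state-input products) and the delays used are all hypothetical; the abstract does not specify the actual library or delays used for the CDC line.

```python
def build_library(x, u, delays):
    """Build a SINDYc-style candidate matrix for scalar state x and scalar input u.

    Row k (for k >= max(delays)) holds
        [1, x[k], u[k-d] for each d in delays, x[k]*u[k-d] for each d in delays].
    `delays` must be a non-empty list of non-negative integer sample delays.
    Returns (rows, names, first_k), where first_k is the first usable sample index.
    """
    first_k = max(delays)
    names = (["1", "x"]
             + [f"u(k-{d})" for d in delays]
             + [f"x*u(k-{d})" for d in delays])
    rows = []
    for k in range(first_k, len(x)):
        delayed = [u[k - d] for d in delays]
        rows.append([1.0, x[k]] + delayed + [x[k] * ud for ud in delayed])
    return rows, names, first_k


# Toy signals (made up for illustration only).
x = [0.0, 0.1, 0.3, 0.6, 1.0]
u = [1.0, 0.0, 1.0, 0.0, 1.0]
rows, names, first_k = build_library(x, u, delays=[0, 2])
```

A sparse regression step (for example sequentially thresholded least squares) would then select the few columns needed to reproduce the measured state derivatives or next-step states; that step is omitted here.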
Machine Learning Applications in Dairy Production
Alexandra Petrokolou, Satyajeet Sheetal Bhonsale, Jan FM Van Impe, Efstathia Tsakali
June 27, 2025 (v1)
The Fourth Industrial Revolution (Industry 4.0) opens a new chapter for the dairy sector. Dairy 4.0 technologies are based on Big Data analysis, the Internet of Things, robotics and Machine Learning. The use of smart technologies to process and analyze complex, massive data has a significant impact on automation, optimization, operational costs and innovation. Artificial Intelligence tools are applied from dairy farms and production lines, including packaging, to the supply chain. The aim of this paper is to present the most widely used applications of Machine Learning in dairy production that enhance the sustainability and quality of dairy products. The most significant Machine Learning applications integrate machine vision, smart environmental sensors, activity collars, thermal imaging cameras, and digitized supply chain systems to facilitate inventory management. Challenges like milk adulteration, animal diseases, mastitis, traceability and supply chain losses are also addressed... [more]
Integrated hybrid modelling of lignin bioconversion
Sidharth Laxminarayan, Lily Cheung, Fani Boukouvala
June 27, 2025 (v1)
Keywords: Biosystems, Dynamic Modelling, Lignin Valorization, Machine Learning
Global biomanufacturing is projected to expand rapidly in the coming decade due to advancements in DNA sequencing and manipulation. However, the complexity of cellular behaviour introduces difficulty in modelling and optimizing biomanufacturing processes. Phenomenological models that represent the physics of the system in empirical equations suffer from poor robustness, while their machine learning (ML) counterparts suffer from poor extrapolative capability. On the other hand, hybrid models allow us to leverage both physical constraints and the flexibility of ML. This work describes a new approach for hybrid modeling that integrates the time-variant parameter estimation and ML model training into a singular step. We implement this approach on a proposed scheme for the cell-mediated conversion of a lignin derivative into a bioplastic precursor and show that our integrated hybrid model outperforms the traditional two-step hybrid, phenomenological, and ML model counterparts. Lastly, we de... [more]
A Physics-based, Data-driven Numerical Framework for Anomalous Diffusion of Water in Soil
Zeyuan Song, Zheyu Jiang
June 27, 2025 (v1)
Keywords: Machine Learning, Modelling and Simulations, Numerical Methods, Sustainability, Water
Precision modeling and forecasting of soil moisture are essential for implementing smart irrigation systems and mitigating agricultural drought. Most agro-hydrological models are based on the standard Richards equation, a highly nonlinear, degenerate elliptic-parabolic partial differential equation (PDE) with a first-order time derivative. However, research has shown that the standard Richards equation is unable to model preferential flow in soil with fractal structure. In such a scenario, the soil exhibits anomalous non-Boltzmann scaling behavior. Incorporating the anomalous non-Boltzmann scaling behavior into the Richards equation leads to a generalized, time-fractional Richards equation based on fractional time derivatives. As expected, solving the time-fractional Richards equation for accurate modeling of water flow dynamics in soil faces extensive computational challenges. To target these challenges, we propose a novel numerical method that integrates the finite volume method (FVM), adaptiv... [more]
Machine Learning Models for Predicting the Amount of Nutrients Required in a Microalgae Cultivation System
Geovani R. Freitas, Sara M. Badenes, Rui Oliveira, Fernando G. Martins
June 27, 2025 (v1)
Keywords: Data Mining, Dunaliella carotenogenesis, Machine Learning, Microalgae Cultivation
Effective prediction of nutrient demands is crucial for optimising microalgae growth, maximising productivity and minimising the waste of resources. With the increasing amount of data related to microalgae cultivation systems, data mining and machine learning models to extract additional knowledge have gained popularity. In the development of such models, a data preprocessing stage is necessary due to the poor data quality. At this stage, cleaning and outlier removal techniques are employed to eliminate missing data and outliers, respectively. Afterwards, data splitting and cross-validation strategies are employed to ensure that the models are trained and evaluated with representative subsets of the data. Principal component analysis is also applied to simplify complex environmental datasets by reducing the number of features while retaining as much information as possible. To further improve prediction capabilities, ensemble methods are incorporated, leveraging multiple models to achi... [more]