Records with Keyword: Machine Learning
Principal Component Analysis of Process Datasets with Missing Values
Kristen A. Severson, Mark C. Molaro, Richard D. Braatz
July 31, 2018 (v1)
Keywords: chemometrics, Machine Learning, missing data, multivariable statistical process control, principal component analysis, process data analytics, process monitoring, Tennessee Eastman problem
Datasets with missing values arising from causes such as sensor failure, inconsistent sampling rates, and merging data from different systems are common in the process industry. Methods for handling missing data typically operate during data pre-processing, but can also occur during model building. This article considers missing data within the context of principal component analysis (PCA), which is a method originally developed for complete data that has widespread industrial application in multivariate statistical process control. Due to the prevalence of missing data and the success of PCA for handling complete data, several PCA algorithms that can act on incomplete data have been proposed. Here, algorithms for applying PCA to datasets with missing values are reviewed. A case study is presented to demonstrate the performance of the algorithms and suggestions are made with respect to choosing which algorithm is most appropriate for particular settings. An alternating algorithm based...
On the Use of Multivariate Methods for Analysis of Data from Biological Networks
Troy Vargason, Daniel P. Howsmon, Deborah L. McGuinness, Juergen Hahn
July 31, 2018 (v1)
Keywords: autism spectrum disorder, classification, Fisher discriminant analysis, Machine Learning, Multivariate Statistics, one carbon metabolism, probability density function, transsulfuration, urine toxic metals
Data analysis used for biomedical research, particularly analysis involving metabolic or signaling pathways, is often based upon univariate statistical analysis. One common approach is to compute means and standard deviations individually for each variable or to determine where each variable falls between upper and lower bounds. Additionally, p-values are often computed to determine if there are differences between data taken from two groups. However, these approaches ignore that the collected data are often correlated in some form, which may be due to these measurements describing quantities that are connected by biological networks. Multivariate analysis approaches are more appropriate in these scenarios, as they can detect differences in datasets that the traditional univariate approaches may miss. This work presents three case studies that involve data from clinical studies of autism spectrum disorder that illustrate the need for and demonstrate the potential impact of multivariate...
Deterministic Global Optimization with Artificial Neural Networks Embedded
Global deterministische Optimierung von Optimierungsproblemen mit künstlichen neuronalen Netzwerken
Artur M Schweidtmann, Alexander Mitsos
October 15, 2018 (v2)
Subject: Optimization
Artificial neural networks (ANNs) are used in various applications for data-driven black-box modeling and subsequent optimization. Herein, we present an efficient method for deterministic global optimization of ANN embedded optimization problems. The proposed method is based on relaxations of algorithms using McCormick relaxations in a reduced-space [SIOPT, 20 (2009), pp. 573-601] including the convex and concave envelopes of the nonlinear activation function of ANNs. The optimization problem is solved using our in-house global deterministic solver MAiNGO. The performance of the proposed method is shown in four optimization examples: an illustrative function, a fermentation process, a compressor plant and a chemical process optimization. The results show that computational solution time is favorable compared to the global general-purpose optimization solver BARON.