LAPSE:2025.0383
Published Article

An Efficient Convex Training Algorithm for Artificial Neural Networks by Utilizing Piecewise Linear Approximations and Semi-Continuous Formulations
June 27, 2025
Abstract
Artificial neural networks are widely used as data-driven models for capturing complex, nonlinear systems. However, suboptimal training remains a significant challenge due to the nonlinearity of activation functions and the reliance on local solvers, which makes achieving global solutions difficult. One remedy is to reformulate the activation functions as piecewise linear approximations to convexify the problem, though this approach often requires substantial CPU time. This study demonstrates that a tailored branch-and-bound algorithm can effectively address these challenges by efficiently navigating the solution space using linear relaxations. The proposed method achieves minimal training error, offering a robust solution to the training bottleneck. Unlike traditional mixed-integer programming approaches, which often struggle to converge within reasonable CPU times, the proposed SOSX algorithm shows superior scalability, with computational demand growing almost linearly rather than exponentially as problem size increases. Experiments on multiple datasets of the same dimensionality confirm that this efficiency is not specific to a particular dataset, with consistent CPU-time trends observed throughout.
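The abstract's core idea, replacing a nonlinear activation function with a piecewise linear approximation so that the training problem can be handled with linear (SOS2/MILP-style) formulations, can be sketched in a few lines. This is an illustrative sketch only, not the authors' SOSX implementation: the choice of tanh as the activation, the breakpoint interval, and the breakpoint count are all assumptions for demonstration.

```python
import numpy as np

def pwl_approx(x, breakpoints):
    """Evaluate the piecewise linear interpolant of tanh at x.

    The activation is sampled at the breakpoints and connected by
    straight segments; an MILP/SOS2 formulation would encode exactly
    this interpolant with linear constraints and ordered-set variables.
    """
    y_bp = np.tanh(breakpoints)
    return np.interp(x, breakpoints, y_bp)

# 9 breakpoints on [-3, 3] give 8 linear segments (assumed values).
breakpoints = np.linspace(-3.0, 3.0, 9)
x = np.linspace(-3.0, 3.0, 1001)

# Worst-case gap between the true activation and its PWL surrogate;
# refining the breakpoint grid shrinks this gap.
err = np.max(np.abs(np.tanh(x) - pwl_approx(x, breakpoints)))
```

The trade-off the abstract alludes to is visible here: more breakpoints reduce the approximation error but enlarge the ordered sets the solver must branch over, which is why a tailored branch-and-bound scheme matters for CPU time.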
Record ID
LAPSE:2025.0383
Keywords
Artificial Neural Network, computational complexity, convex formulation, mixed-integer linear programming, piecewise linear functions
Subject
Suggested Citation
Köksal ES, Aydin E, Türkay M. An Efficient Convex Training Algorithm for Artificial Neural Networks by Utilizing Piecewise Linear Approximations and Semi-Continuous Formulations. Systems and Control Transactions 4:1438-1444 (2025) https://doi.org/10.69997/sct.125995
Author Affiliations
Köksal ES: Koc University, Department of Chemical and Biological Engineering, Istanbul, Sariyer, Turkey; Turkish Petroleum Refineries Corporation, Kocaeli, Korfez, Turkey
Aydin E: Koc University, Department of Chemical and Biological Engineering, Istanbul, Sariyer, Turkey
Türkay M: Koc University, Department of Industrial Engineering, Istanbul, Sariyer, Turkey; SmartOpt, Istanbul, Turkey
Journal Name
Systems and Control Transactions
Volume
4
First Page
1438
Last Page
1444
Year
2025
Publication Date
2025-07-01
Version Comments
Original Submission
Other Meta
PII: 1438-1444-1409-SCT-4-2025, Publication Type: Journal Article
Record Map
Published Article

LAPSE:2025.0383
This Record
External Link

https://doi.org/10.69997/sct.125995
Article DOI
Download
Meta
Record Statistics
Record Views
754
Version History
[v1] (Original Submission)
Jun 27, 2025
Verified by curator on
Jun 27, 2025
This Version Number
v1
Citations
Most Recent
This Version
URL Here
http://psecommunity.org/LAPSE:2025.0383
Record Owner
PSE Press
Links to Related Works
References Cited
- A. Thebelt, J. Wiebe, J. Kronqvist, C. Tsay, and R. Misener, "Maximizing information from chemical engineering data sets: Applications to machine learning," Chem Eng Sci, vol. 252, Apr. 2022, https://doi.org/10.1016/j.ces.2022.117469
- M. Bianchini and M. Gori, "Optimal learning in artificial neural networks: A review of theoretical results," Neurocomputing, 1996, https://doi.org/10.1016/0925-2312(95)00032-1
- J. P. Vielma, "Mixed integer linear programming formulation techniques," SIAM Review, vol. 57, no. 1, pp. 3-57, 2015, https://doi.org/10.1137/130915303
- E. M. L. Beale and J. A. Tomlin, "Special facilities in a general mathematical programming system for nonconvex problems using ordered sets of variables," Operational Research, vol. 69, pp. 447-454, 1970, [Online]. Available: https://www.researchgate.net/publication/313166553
- A. B. Keha, I. R. De Farias, and G. L. Nemhauser, "Models for representing piecewise linear cost functions," Operations Research Letters, vol. 32, no. 1, pp. 44-48, Jan. 2004, https://doi.org/10.1016/S0167-6377(03)00059-2
- C. Tsay, J. Kronqvist, A. Thebelt, and R. Misener, "Partition-based formulations for mixed-integer optimization of trained ReLU neural networks," Feb. 2021, [Online]. Available: http://arxiv.org/abs/2102.04373
- J. Katz, I. Pappas, S. Avraamidou, and E. N. Pistikopoulos, "Integrating deep learning models and multiparametric programming," Comput Chem Eng, vol. 136, May 2020, https://doi.org/10.1016/j.compchemeng.2020.106801
- B. Grimstad and H. Andersson, "ReLU networks as surrogate models in mixed-integer linear programs," Comput Chem Eng, vol. 131, Dec. 2019, https://doi.org/10.1016/j.compchemeng.2019.106580
- A. M. Schweidtmann and A. Mitsos, "Deterministic Global Optimization with Artificial Neural Networks Embedded," J Optim Theory Appl, vol. 180, no. 3, pp. 925-948, Mar. 2019, https://doi.org/10.1007/s10957-018-1396-0
- E. S. Koksal and E. Aydin, "Physics Informed Piecewise Linear Neural Networks for Process Optimization," Comput Chem Eng, vol. 174, Jun. 2023, https://doi.org/10.1016/j.compchemeng.2023.108244
- H. Sildir and E. Aydin, "A Mixed-Integer linear programming based training and feature selection method for artificial neural networks using piece-wise linear approximations," Chem Eng Sci, vol. 249, Feb. 2022, https://doi.org/10.1016/j.ces.2021.117273
- S. Ding, H. Zhao, Y. Zhang, X. Xu, and R. Nie, "Extreme learning machine: algorithm, theory and applications," Artif Intell Rev, vol. 44, no. 1, pp. 103-115, Jun. 2015, https://doi.org/10.1007/s10462-013-9405-z
- K. Dunn, "Distillation tower [Dataset]," OpenMV
- D. P. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization," Dec. 2014, [Online]. Available: http://arxiv.org/abs/1412.6980
- F. Chollet et al., "Keras." [Online]. Available: https://keras.io
- H. Vanhooren, K. Nguyen, P. Vanrolleghem, and H. Spanjers, "Development of a simulation protocol for evaluation of respirometry-based control strategies," 1996
- M. Otto and A. Fong, "Cancer data of United States of America [Dataset]," Kaggle, 2013. Released under MIT Licence. [Online]. Available: https://www.kaggle.com/datasets/varunraskar/cancer-regression/code
- D. Harrison and D. L. Rubinfeld, "Hedonic Housing Prices and the Demand for Clean Air," Journal of Environmental Economics and Management, vol. 5, pp. 81-102, 1978, https://doi.org/10.1016/0095-0696(78)90006-2

