LAPSE:2025.0424
Published Article

Exploring Industrial Text Data for Monitoring Chemical Manufacturing Processes
June 27, 2025
Abstract
To address the limitations of traditional sensing instrumentation in industrial processes, this work explores the use of industrial text data. Because current instrumentation often fails to capture the full scope of process-related information, text data generated during the operation of industrial facilities (for example, maintenance, inspection, and incident reports) can provide valuable insights. This study focuses on assessing the effectiveness of natural language processing (NLP) techniques in retrieving critical information from industrial text data. To this end, the classification of Process Safety and Containment Events (PSCE) was used as a case study. Overall, we found that NLP methods are effective for information retrieval from industrial text data. However, integrating the embeddings into machine learning (ML) approaches poses some challenges. The complexity of the information encoded in the embeddings makes them highly disparate, each effectively a unique sample of a larger domain, which makes training an ML model challenging.
Record ID
Keywords
chemical manufacturing industry, data mining, Industrial text data, natural language processing, process safety and containment events
Subject
Suggested Citation
Strelet E, Castillo I, Peng Y, Chin ST, Zink A, Rendall R, Reis MS. Exploring Industrial Text Data for Monitoring Chemical Manufacturing Processes. Systems and Control Transactions 4:1694-1699 (2025) https://doi.org/10.69997/sct.101096
Author Affiliations
Strelet E: The Dow Chemical Company; Univ Coimbra, CERES, Department of Chemical Engineering, Rua Sílvio Lima, Pólo II Pinhal de Marrocos, 3030-790 Coimbra, Portugal
Castillo I: The Dow Chemical Company
Peng Y: The Dow Chemical Company
Chin ST: The Dow Chemical Company
Zink A: The Dow Chemical Company
Rendall R: The Dow Chemical Company
Reis MS: Univ Coimbra, CERES, Department of Chemical Engineering, Rua Sílvio Lima, Pólo II Pinhal de Marrocos, 3030-790 Coimbra, Portugal
Journal Name
Systems and Control Transactions
Volume
4
First Page
1694
Last Page
1699
Year
2025
Publication Date
2025-07-01
Version Comments
Original Submission
Other Meta
PII: 1694-1699-1228-SCT-4-2025, Publication Type: Journal Article
Record Statistics
Record Views
746
Version History
[v1] (Original Submission)
Jun 27, 2025
Verified by curator on
Jun 27, 2025
This Version Number
v1
Citations
Most Recent
This Version
URL Here
https://psecommunity.org/LAPSE:2025.0424
Record Owner
PSE Press
Links to Related Works
References Cited
- Ye, Z., Yang, J., Zhong, N., Tu, X., Jia, J., & Wang, J. Tackling environmental challenges in pollution controls using artificial intelligence: A review. Science of The Total Environment, 699, 134279 (2020) https://doi.org/10.1016/j.scitotenv.2019.134279
- Wang, C., Nulty, P., & Lillis, D. A Comparative Study on Word Embeddings in Deep Learning for Text Classification. Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, 37-46 (2021) https://doi.org/10.1145/3443279.3443304
- Antons, D., Grünwald, E., Cichy, P., & Salge, T. O. The application of text mining methods in innovation research: Current state, evolution patterns, and development priorities. R&D Management, 50(3), 329-351 (2020) https://doi.org/10.1111/radm.12408
- Manning, C. D. Human Language Understanding & Reasoning. Daedalus, 151(2), 127-138 (2022) https://doi.org/10.1162/daed_a_01905
- Rato, T. J., & Reis, M. S. Real-time risk assessment and surveillance for early prediction of unplanned shutdown events. Chemical Engineering Science 282, 119364 (2023) https://doi.org/10.1016/j.ces.2023.119364
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. Distributed Representations of Words and Phrases and their Compositionality. Advances in Neural Information Processing Systems 26 (2013) https://papers.nips.cc/paper_files/paper/2013/hash/9aa42b31882ec039965f3c4923ce901b-Abstract.html
- Breiman, L. Random Forests. Machine Learning, 45(1), 5-32 (2001) https://doi.org/10.1023/A:1010933404324
- Reimers, N., & Gurevych, I. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) 3980-3990 (2019) https://doi.org/10.18653/v1/D19-1410
- Azure OpenAI. https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models
- McInnes, L., Healy, J., & Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv:1802.03426 [Cs, Stat] (2018) http://arxiv.org/abs/1802.03426
- Strelet, E., Peng, Y., Castillo, I., Rendall, R., Wang, Z., Joswiak, M., Braun, B., Chiang, L., & Reis, M. S. Multi-source and multimodal data fusion for improved management of a wastewater treatment plant. Journal of Environmental Chemical Engineering, 111530 (2023) https://doi.org/10.1016/j.jece.2023.111530

