http://dx.doi.org/10.9713/kcer.2021.59.4.532

Application of Time-series Cross Validation in Hyperparameter Tuning of a Predictive Model for 2,3-BDO Distillation Process  

An, Nahyeon (Green Materials and Processes R&D Group, Korea Institute of Industrial Technology)
Choi, Yeongryeol (Green Materials and Processes R&D Group, Korea Institute of Industrial Technology)
Cho, Hyungtae (Green Materials and Processes R&D Group, Korea Institute of Industrial Technology)
Kim, Junghwan (Green Materials and Processes R&D Group, Korea Institute of Industrial Technology)
Publication Information
Korean Chemical Engineering Research / v.59, no.4, 2021, pp. 532-541
Abstract
Research on applying artificial intelligence to chemical processes has been increasing rapidly. However, overfitting remains a significant problem: an overfitted model fits the observed training data well but fails to generalize to unseen test data. Cross validation is one way to mitigate overfitting. In this study, time-series cross validation was applied to optimize two hyperparameters, the number of batches and epochs, of a predictive model for the 2,3-BDO distillation process, and the result was compared with the commonly used K-fold cross validation. The model tuned with time-series cross validation had an RMSE 9.06% lower and a MAPE 0.61% higher than the model tuned with K-fold cross validation. Its calculation time was also 198.29 sec shorter than that of the K-fold cross validation method.
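The difference between the two validation schemes compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`time_series_splits`, `k_fold_splits`, `uses_future_data`, `rmse`, `mape`) are hypothetical, the expanding-window layout is one common form of time-series cross validation, and the K-fold variant is assumed to be unshuffled. RMSE and MAPE follow their standard definitions.

```python
import math


def time_series_splits(n_samples, n_splits):
    """Expanding-window splits: train on the past, validate on the next block.

    Assumption: samples are ordered in time. Leftover samples at the end
    (when n_samples is not divisible by n_splits + 1) are ignored here.
    """
    fold = n_samples // (n_splits + 1)
    if fold < 1:
        raise ValueError("not enough samples for the requested number of splits")
    splits = []
    for k in range(1, n_splits + 1):
        train = list(range(0, k * fold))             # everything before the block
        val = list(range(k * fold, (k + 1) * fold))  # the next block in time
        splits.append((train, val))
    return splits


def k_fold_splits(n_samples, n_splits):
    """Contiguous K-fold without shuffling: each block is validated once."""
    fold = n_samples // n_splits
    if fold < 1:
        raise ValueError("not enough samples for the requested number of splits")
    splits = []
    for k in range(n_splits):
        start = k * fold
        # The last fold absorbs any remainder.
        stop = n_samples if k == n_splits - 1 else start + fold
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        splits.append((train, val))
    return splits


def uses_future_data(train, val):
    """True if any training index comes after the start of the validation block."""
    return max(train) > min(val)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    for name, splits in (("time-series", time_series_splits(12, 3)),
                         ("k-fold", k_fold_splits(12, 3))):
        for train, val in splits:
            print(name, "train:", train, "val:", val,
                  "future data in training:", uses_future_data(train, val))
```

In this sketch, every time-series fold trains only on samples that precede its validation block, whereas some K-fold folds train on samples recorded after the block they validate on. A hyperparameter search would score each candidate setting by averaging a metric such as `rmse` or `mape` over the validation blocks of the chosen scheme.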
Keywords
Cross validation; Distillation process; Predictive model; Hyperparameter tuning; Time-series cross validation;