A Machine Learning Model for Predicting Silica Concentrations through Time Series Analysis of Mining Data

  • Lee, Seung Hoon (Department of Industrial and Management Engineering, Kyonggi University) ;
  • Yoon, Yeon Ah (Department of Industrial and Management Engineering, Kyonggi University Graduate School) ;
  • Jung, Jin Hyeong (Department of Industrial and Management Engineering, Kyonggi University Graduate School) ;
  • Sim, Hyun su (Department of Industrial and Management Engineering, Kyonggi University Graduate School) ;
  • Chang, Tai-Woo (Department of Industrial and Management Engineering, Kyonggi University) ;
  • Kim, Yong Soo (Department of Industrial and Management Engineering, Kyonggi University)
  • Received : 2020.09.13
  • Accepted : 2020.09.21
  • Published : 2020.09.30

Abstract

Purpose: This study aimed to develop an accurate machine learning model for predicting silica concentrations following the addition of impurities, using time series analysis of mining data. Methods: The mining data were preprocessed and analyzed as a time series with machine learning models. Correlation analysis was used to select valid variables and exclude uninformative ones. To reflect changes over time, the dependent variable measured at baseline was treated as an independent variable for later time points. The relationship between the independent variables and the dependent variable n time points later was examined with Pearson correlation analysis. Results: The correlation (R²) was strongest at a lag of 3 hours, so the value after 3 hours was adopted as the dependent variable. In terms of root mean square error (RMSE), the proposed method outperformed the other machine learning methods, and the XGBoost algorithm showed the best predictive performance. Conclusion: This study is significant given the current lack of machine learning studies on the domestic mining industry. Applying time series analysis to mining data is expected to yield further improvement. The proposed method requires data with time series characteristics before a predictive model is built; where that condition holds, it should also improve prediction accuracy in other domains.
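
The lag-selection step described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names (`pearson`, `lagged_correlation`, `best_lag`, `rmse`) and the synthetic series are hypothetical, and scoring each lag by the squared Pearson correlation is an assumed reading of the "correlation (R²)" in the abstract. The model fitting itself (XGBoost and the comparison methods) is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def lagged_correlation(feature, target, lag):
    """Correlate feature[t] with target[t + lag] (lag in time steps, lag >= 1)."""
    return pearson(feature[:-lag], target[lag:])


def best_lag(feature, target, max_lag):
    """Return the lag with the largest squared correlation, plus all scores."""
    scores = {
        lag: lagged_correlation(feature, target, lag) ** 2
        for lag in range(1, max_lag + 1)
    }
    return max(scores, key=scores.get), scores


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic hourly data (an assumption, not the mining data set):
# the target repeats a scaled copy of the feature three steps later.
feature = [math.sin(0.7 * t) + 0.1 * (t % 5) for t in range(60)]
target = [0.0, 0.0, 0.0] + [2.0 * v + 1.0 for v in feature[:-3]]

lag, scores = best_lag(feature, target, max_lag=6)
print("selected lag:", lag)
```

On real process data, the same comparison would be run for each candidate independent variable, and the selected lag would define the prediction target (the value n time points ahead) before any model is trained and scored with RMSE.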
