Compositional Feature Selection and Its Effects on Bandgap Prediction by Machine Learning

Korean title (translated): Bandgap Prediction Using Machine Learning and the Effects of Composition-Based Material Features

  • Chunghee Nam (남충희), Department of Electrical and Electronic Engineering, Hannam University
  • Received : 2023.03.09
  • Accepted : 2023.04.02
  • Published : 2023.04.27

Abstract

The bandgap of a semiconductor material is an important factor in determining its suitability for various applications. In this study, based on data provided by AFLOW (Automatic-FLOW for Materials Discovery), the bandgap of semiconductor materials was predicted using only their compositional features. The compositional features were generated using the Python modules Pymatgen and Matminer. Pearson's correlation coefficients (PCC) between the compositional features were calculated, and features with a correlation coefficient larger than 0.95 were removed to avoid overfitting. Bandgap prediction performance was compared using the R² score and the root-mean-squared error (RMSE). When the bandgap was predicted with random forest and XGBoost as representative ensemble algorithms, XGBoost gave better results after cross-validation and hyperparameter tuning. To investigate the effect of compositional feature selection on the bandgap prediction of the machine learning model, the prediction performance was studied as a function of the number of features, which were ranked by feature importance methods. Beyond an appropriate number of features, the prediction performance did not change significantly. Furthermore, artificial neural networks were employed to compare the prediction performance while adjusting the number of features guided by the PCC values, resulting in a best R² score of 0.811. By comparing and analyzing the bandgap distribution and prediction performance for material groups containing specific elements (F, N, Yb, Eu, Zn, B, Si, Ge, Fe, Al), various information useful for material design was obtained.
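
The correlation-based feature filtering and the two evaluation metrics described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the author's implementation. The feature names and toy values are hypothetical. The use of the absolute PCC value, and the rule of keeping the first feature of each highly correlated pair, are assumptions, since the abstract states only that features with a PCC larger than 0.95 were removed.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.95):
    """Return feature names in order, skipping any feature whose |PCC|
    with an already kept feature exceeds the threshold (assumed rule)."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical toy features (not from the AFLOW data set).
features = {
    "mean_Z": [1, 2, 3, 4, 5],
    "mean_Z_scaled": [2, 4, 6, 8, 10.1],  # nearly collinear with mean_Z
    "en_range": [0.5, 0.1, 0.9, 0.3, 0.2],
}
selected = drop_correlated(features, threshold=0.95)
print(selected)
```

In this toy example, `mean_Z_scaled` is discarded because it is almost perfectly correlated with `mean_Z`, while `en_range` is kept. In practice the same filter would be applied to the Matminer-generated feature table before model training.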

Acknowledgement

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Korea government (MSIT) (NRF-2021R1F1A1052971).
