Acknowledgement
Hyunjoong Kim's research was supported by the Ministry of Science and ICT and the Institute of Information & Communications Technology Planning & Evaluation (IITP) through the Integrated BS/MS ICT Core Talent Development Program (IITP-2023-00259934), and by a National Research Foundation of Korea (NRF) grant (No. 2016R1D1A1B02011696).
References
- Alcalá-Fdez J, Fernández A, Luengo J, Derrac J, García S, Sánchez L, and Herrera F (2011). KEEL data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework, Journal of Multiple-Valued Logic and Soft Computing, 17, 255-287.
- Alfaro E, Gamez M, and Garcia N (2013). adabag: An R package for classification with boosting and bagging, Journal of Statistical Software, 54, 1-35.
- Anand R, Mehrotra K, Mohan C, and Ranka S (1993). An improved algorithm for neural network classification of imbalanced training sets, IEEE Transactions on Neural Networks, 4, 962-969.
- Boyd K, Eng KH, and Page CD (2013). Area under the precision-recall curve: Point estimates and confidence intervals. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2013, Prague, Czech Republic, September 23-27, 2013, Proceedings, Part III 13 (pp. 451-466), Springer, Berlin.
- Buda M, Maki A, and Mazurowski MA (2018). A systematic study of the class imbalance problem in convolutional neural networks, Neural Networks, 106, 249-259.
- Chawla NV, Bowyer KW, Hall LO, and Kegelmeyer WP (2002). SMOTE: Synthetic minority over-sampling technique, Journal of Artificial Intelligence Research, 16, 321-357.
- Chen YC, Ha H, Kim H, and Ahn H (2013). Canonical forest, Computational Statistics, 29, 849-867.
- Cheng F, Zhang J, Wen C, Liu Z, and Li Z (2017). Large cost-sensitive margin distribution machine for imbalanced data classification, Neurocomputing, 224, 45-57.
- Fan W, Stolfo S, Zhang J, and Chan P (1999). AdaCost: Misclassification cost-sensitive boosting. In Proceedings of the Sixteenth International Conference on Machine Learning (ICML'99), San Francisco, CA, USA, 97-105.
- Fernández A, García S, Herrera F, and Chawla NV (2018). SMOTE for learning from imbalanced data: Progress and challenges, marking the 15-year anniversary, Journal of Artificial Intelligence Research, 61, 863-905.
- García V, Sánchez J, and Mollineda R (2007). An empirical study of the behavior of classifiers on imbalanced and overlapped data sets. In Rueda L, Mery D, and Kittler J (Eds), Progress in Pattern Recognition, Image Analysis and Applications (pp. 397-406), Springer Berlin Heidelberg, Berlin, Heidelberg.
- Gong J and Kim H (2017). RHSBoost: Improving classification performance in imbalance data, Computational Statistics and Data Analysis, 111, 1-13.
- He H, Bai Y, Garcia EA, and Li S (2008). ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In Proceedings of 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, 1322-1328.
- Huang J and Ling CX (2005). Using AUC and accuracy in evaluating learning algorithms, IEEE Transactions on Knowledge and Data Engineering, 17, 299-310.
- Japkowicz N (2003). Class imbalances: Are we focusing on the right issue?, Workshop on Learning from Imbalanced Data Sets II, 1723, 63.
- Jo T and Japkowicz N (2004). Class imbalances versus small disjuncts, SIGKDD Explorations Newsletter, 6, 40-49.
- Liaw A and Wiener M (2002). Classification and regression by randomForest, R News, 2, 18-22.
- Lichman M (2013). UCI machine learning repository, Available from: http://archive.ics.uci.edu/ml
- López V, Fernández A, García S, Palade V, and Herrera F (2013). An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics, Information Sciences, 250, 113-141.
- Lunardon N, Menardi G, and Torelli N (2014). ROSE: A package for binary imbalanced learning, R Journal, 6, 79-89.
- Manning CD, Raghavan P, and Schütze H (2008). Introduction to Information Retrieval, Cambridge University Press, Cambridge, England.
- Menardi G and Torelli N (2014). Training and assessing classification rules with imbalanced data, Data Mining and Knowledge Discovery, 28, 92-122.
- R Core Team (2022). R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria.
- Rayhan F, Ahmed S, Mahbub A, Jani R, Shatabda S, and Farid DM (2017). CUSBoost: Cluster-based under-sampling with boosting for imbalanced classification. In Proceedings of 2017 2nd International Conference on Computational Systems and Information Technology for Sustainable Solution (CSITSS), Bengaluru, 1-5.
- Ridgeway G and GBM Developers (2024). gbm: Generalized Boosted Regression Models. R package version 2.1.9, Available from: https://CRAN.R-project.org/package=gbm
- Seiffert C, Khoshgoftaar TM, Van Hulse J, and Napolitano A (2010). RUSBoost: A hybrid approach to alleviating class imbalance, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 40, 185-197.
- Sun Y, Wong AKC, and Kamel MS (2009). Classification of imbalanced data: A review, International Journal of Pattern Recognition and Artificial Intelligence, 23, 687-719.
- Tang B and He H (2017). GIR-based ensemble sampling approaches for imbalanced learning, Pattern Recognition, 71, 306-319.
- Therneau T, Atkinson B, and Ripley B (2015). rpart: Recursive partitioning and regression trees, R package version 4.1-15, Available from: https://cran.r-project.org/web/packages/rpart/rpart.pdf
- Vuttipittayamongkol P, Elyan E, and Petrovski A (2021). On the class overlap problem in imbalanced data classification, Knowledge-Based Systems, 212, 106631.