References
- Amin, A., Anwar, S., Adnan, A., Nawaz, M., Howard, N., Qadir, J., ... & Hussain, A. (2016). Comparing oversampling techniques to handle the class imbalance problem: A customer churn prediction case study. IEEE Access, 4, 7940-7957. https://doi.org/10.1109/ACCESS.2016.2619719
- Anwar, N., Jones, G., & Ganesh, S. (2014). Measurement of data complexity for classification problems with unbalanced data. Statistical Analysis and Data Mining: The ASA Data Science Journal, 7(3), 194-211. https://doi.org/10.1002/sam.11228
- Blagus, R., & Lusa, L. (2013). Improved shrunken centroid classifiers for high-dimensional class-imbalanced data. BMC bioinformatics, 14(1), 1-13. https://doi.org/10.1186/1471-2105-14-1
- Cano, J. R. (2013). Analysis of data complexity measures for classification. Expert systems with applications, 40(12), 4820-4831. https://doi.org/10.1016/j.eswa.2013.02.025
- Dogan, N., & Tanrikulu, Z. (2013). A comparative analysis of classification algorithms in data mining for accuracy, speed and robustness. Information Technology and Management, 14(2), 105-124. https://doi.org/10.1007/s10799-012-0135-8
- Feng, S., Keung, J., Yu, X., Xiao, Y., Bennin, K. E., Kabir, M. A., & Zhang, M. (2021). COSTE: Complexity-based OverSampling TEchnique to alleviate the class imbalance problem in software defect prediction. Information and Software Technology, 129, 106432. https://doi.org/10.1016/j.infsof.2020.106432
- George, G., Haas, M. R., & Pentland, A. (2014). Big data and management. Academy of management Journal, 57(2), 321-326. https://doi.org/10.5465/amj.2014.4002
- Ho, T. K. (2002). A data complexity analysis of comparative advantages of decision forest constructors. Pattern Analysis & Applications, 5(2), 102-112. https://doi.org/10.1007/s100440200009
- Ho, T. K., & Basu, M. (2002). Complexity measures of supervised classification problems. IEEE transactions on pattern analysis and machine intelligence, 24(3), 289-300. https://doi.org/10.1109/34.990132
- Huang, Y. M., Hung, C. M., & Jiau, H. C. (2006). Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem. Nonlinear Analysis: Real World Applications, 7(4), 720-747. https://doi.org/10.1016/j.nonrwa.2005.04.006
- Jo, T., & Japkowicz, N. (2004). Class imbalances versus small disjuncts. ACM Sigkdd Explorations Newsletter, 6(1), 40-49. https://doi.org/10.1145/1007730.1007737
- Khan, I., Zhang, X., Rehman, M., & Ali, R. (2020). A literature survey and empirical study of meta-learning for classifier selection. IEEE Access, 8, 10262-10281. https://doi.org/10.1109/ACCESS.2020.2964726
- Khoshgoftaar, T. M., Van Hulse, J., & Napolitano, A. (2010). Comparing boosting and bagging techniques with noisy and imbalanced data. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 41(3), 552-568. https://doi.org/10.1109/TSMCA.2010.2084081
- Kim, J., & Kwon, O. (2021). A model for rapid selection and covid-19 prediction with dynamic and imbalanced data. Sustainability, 13(6), 3099. https://doi.org/10.3390/su13063099
- Kim, E., & Hong, T. (2015). Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution. Journal of Intelligence and Information Systems, 21(1), 29-45. https://doi.org/10.13088/JIIS.2015.21.1.29
- Kim, J., Kim, M. Y., & Kwon, O. (2020). The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms. Journal of Intelligence and Information Systems, 26(1), 23-45. https://doi.org/10.13088/JIIS.2020.26.1.023
- Kotsiantis, S., & Kanellopoulos, D. (2006). Discretization techniques: A recent survey. GESTS International Transactions on Computer Science and Engineering, 32(1), 47-58.
- Krawczyk, B. (2016). Learning from imbalanced data: open challenges and future directions. Progress in Artificial Intelligence, 5(4), 221-232. https://doi.org/10.1007/s13748-016-0094-0
- Lee, S., & Shin, T. (2018). Development and application of prediction model of hyperlipidemia using SVM and meta-learning algorithm. Journal of Intelligence and Information Systems, 24(2), 111-124. https://doi.org/10.13088/JIIS.2018.24.2.111
- Leyva, E., Gonzalez, A., & Perez, R. (2014). A set of complexity measures designed for applying meta-learning to instance selection. IEEE Transactions on Knowledge and Data Engineering, 27(2), 354-367. https://doi.org/10.1109/TKDE.2014.2327034
- Lopez, V., Fernandez, A., Garcia, S., Palade, V., & Herrera, F. (2013). An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics. Information sciences, 250, 113-141. https://doi.org/10.1016/j.ins.2013.07.007
- Lorena, A. C., Maciel, A. I., de Miranda, P. B., Costa, I. G., & Prudencio, R. B. (2018). Data complexity meta-features for regression problems. Machine Learning, 107(1), 209-246. https://doi.org/10.1007/s10994-017-5681-1
- Lu, W. Z., & Wang, D. (2008). Ground-level ozone prediction by support vector machine approach with a cost-sensitive classification scheme. Science of the total environment, 395(2-3), 109-116. https://doi.org/10.1016/j.scitotenv.2008.01.035
- Matsumoto, A., Merlone, U., & Szidarovszky, F. (2012). Some notes on applying the Herfindahl-Hirschman Index. Applied Economics Letters, 19(2), 181-184. https://doi.org/10.1080/13504851.2011.570705
- Merz, P. (2004). Advanced fitness landscape analysis and the performance of memetic algorithms. Evolutionary Computation, 12(3), 303-325. https://doi.org/10.1162/1063656041774956
- Munoz, M. A., Sun, Y., Kirley, M., & Halgamuge, S. K. (2015). Algorithm selection for black-box continuous optimization problems: A survey on methods and challenges. Information Sciences, 317, 224-245. https://doi.org/10.1016/j.ins.2015.05.010
- Park, G. U., & Jung, I. (2019). Comparison of resampling methods for dealing with imbalanced data in binary classification problem. The Korean Journal of Applied Statistics, 32(3), 349-374. https://doi.org/10.5351/KJAS.2019.32.3.349
- Pascual-Triana, J. D., Charte, D., Andres Arroyo, M., Fernandez, A., & Herrera, F. (2021). Revisiting data complexity metrics based on morphology for overlap and imbalance: snapshot, new overlap number of balls metrics and singular problems prospect. Knowledge and Information Systems, 63(7), 1961-1989. https://doi.org/10.1007/s10115-021-01577-1
- Pasupa, K., Vatathanavaro, S., & Tungjitnob, S. (2020). Convolutional neural networks based focal loss for class imbalance problem: a case study of canine red blood cells morphology classification. Journal of Ambient Intelligence and Humanized Computing, 1-17.
- Pfahringer, B., Bensusan, H., & Giraud-Carrier, C. G. (2000, June). Meta-Learning by Landmarking Various Learning Algorithms. In ICML (pp. 743-750).
- Pimentel, B. A., & De Carvalho, A. C. (2019). A new data characterization for selecting clustering algorithms using meta-learning. Information Sciences, 477, 203-219. https://doi.org/10.1016/j.ins.2018.10.043
- Qureshi, S. R., & Gupta, A. (2014, March). Towards efficient Big Data and data analytics: A review. In 2014 Conference on IT in Business, Industry and Government (CSIBIG) (pp. 1-6). IEEE.
- Rossi, A. L. D., de Leon Ferreira, A. C. P., Soares, C., & De Souza, B. F. (2014). MetaStream: A meta-learning based method for periodic algorithm selection in time-changing data. Neurocomputing, 127, 52-64. https://doi.org/10.1016/j.neucom.2013.05.048
- Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. arXiv preprint arXiv:1906.02243.
- Sun, A., Lim, E. P., & Liu, Y. (2009). On strategies for imbalanced text classification using SVM: A comparative study. Decision Support Systems, 48(1), 191-201. https://doi.org/10.1016/j.dss.2009.07.011
- Van der Walt, C. M., & Barnard, E. (2007). Data characteristics that determine classifier performance. SAIEE Africa Research Journal, 98(3), 87-93. https://doi.org/10.23919/SAIEE.2007.9488132
- Weiss, G. M., & Provost, F. (2003). Learning when training data are costly: The effect of class distribution on tree induction. Journal of artificial intelligence research, 19, 315-354. https://doi.org/10.1613/jair.1199
- Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE transactions on evolutionary computation, 1(1), 67-82. https://doi.org/10.1109/4235.585893
- Zhang, X., Li, R., Zhang, B., Yang, Y., Guo, J., & Ji, X. (2019). An instance-based learning recommendation algorithm of imbalance handling methods. Applied Mathematics and Computation, 351, 204-218. https://doi.org/10.1016/j.amc.2018.12.020