http://dx.doi.org/10.7858/eamj.2021.023

COMPARATIVE STUDY OF THE PERFORMANCE OF SUPPORT VECTOR MACHINES WITH VARIOUS KERNELS  

Nam, Seong-Uk (Department of Mathematics, Pusan National University)
Kim, Sangil (Department of Mathematics, Pusan National University)
Kim, HyunMin (Department of Mathematics, Pusan National University)
Yu, YongBin (Department of Machine Learning Engineering, Silex)
Abstract
A support vector machine (SVM) is a state-of-the-art machine learning model rooted in structural risk minimization. The SVM is underestimated with regard to its application to real-world problems because of the difficulties associated with its use. We aim to show that the performance of an SVM depends strongly on the choice of kernel function. To this end, after summarizing support vector machines and kernel functions, we conducted experiments on various benchmark datasets to compare the performance of various kernel functions. To evaluate SVM performance, we used the F1-score and its standard deviation under 10-fold cross-validation. Furthermore, we used Taylor diagrams to reveal the differences between kernels. Finally, we provide Python code for all our experiments so that they can be reproduced.
Keywords
Machine Learning; Support Vector Machine; Kernel Functions; Taylor Diagram
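
The evaluation protocol named in the abstract (F1-score and its standard deviation under 10-fold cross-validation, compared across kernels) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn, its four built-in SVC kernels, default hyperparameters, feature standardization, and the breast cancer dataset bundled with scikit-learn as a stand-in benchmark; the helper name compare_kernels is hypothetical.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_kernels(X, y, kernels=("linear", "poly", "rbf", "sigmoid"),
                    n_splits=10, seed=0):
    """Return {kernel: (mean F1, std of F1)} over stratified k-fold CV.

    Hypothetical helper for illustration; the kernels, scaling step, and
    default SVC hyperparameters are assumptions, not the paper's setup.
    """
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for kernel in kernels:
        # Scaling is fitted inside each training fold to avoid leakage.
        model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
        scores = cross_val_score(model, X, y, cv=cv, scoring="f1")
        results[kernel] = (scores.mean(), scores.std())
    return results


# Binary benchmark shipped with scikit-learn (assumed stand-in dataset).
X, y = load_breast_cancer(return_X_y=True)
results = compare_kernels(X, y)
for kernel, (mean_f1, std_f1) in results.items():
    print(f"{kernel:8s} F1 = {mean_f1:.3f} +/- {std_f1:.3f}")
```

Reporting the standard deviation next to the mean shows how stable each kernel is across folds, not only how well it scores on average. For multiclass datasets, the scoring string would need to change (for example to "f1_macro").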