Fast Training of Structured SVM Using Fixed-Threshold Sequential Minimal Optimization

  • Received : 2008.05.13
  • Accepted : 2009.02.18
  • Published : 2009.04.30

Abstract

In this paper, we describe a fixed-threshold sequential minimal optimization (FSMO) algorithm for structured SVM problems. FSMO is conceptually simple, easy to implement, and faster than standard support vector machine (SVM) training algorithms on structured SVM problems. Because the structured SVM formulation has no bias term (that is, the threshold b is fixed at zero), FSMO can break the quadratic programming (QP) problem of structured SVM into a series of smallest possible QP sub-problems, each involving only one variable. With only one variable per sub-problem, FSMO requires no working-set (subset) selection. On various test sets, FSMO is as accurate as an existing structured SVM implementation (SVM-Struct) but is much faster on large data sets: the training time of FSMO empirically scales between $O(n)$ and $O(n^{1.2})$, while SVM-Struct scales between $O(n^{1.5})$ and $O(n^{1.8})$.
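The key observation in the abstract is that, with the threshold b fixed at zero, the dual problem loses its equality constraint, so each dual variable can be optimized in closed form on its own and no pairwise working-set selection is needed. The following is a minimal sketch of that single-variable update for the simplest setting, a binary linear SVM without bias, rather than the structured case treated in the paper; the function name, hyperparameters, and stopping rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fsmo_binary_linear(X, y, C=1.0, epochs=20, tol=1e-4):
    """Sketch: single-variable SMO updates for a binary, zero-bias
    (b = 0) linear SVM. A simplified stand-in for the structured
    FSMO described in the paper; all defaults are assumptions."""
    n, d = X.shape
    alpha = np.zeros(n)                    # dual variables, one per example
    w = np.zeros(d)                        # primal weights, kept in sync with alpha
    k_diag = np.einsum('ij,ij->i', X, X)   # K(x_i, x_i) for the linear kernel

    for _ in range(epochs):
        max_delta = 0.0
        for i in range(n):
            if k_diag[i] <= 0.0:
                continue                   # degenerate example, nothing to update
            # With b fixed at 0, the dual has no equality constraint,
            # so alpha_i can be optimized alone: a Newton step on the
            # dual objective, clipped to the box [0, C].
            grad = 1.0 - y[i] * (w @ X[i])               # dW/dalpha_i
            new_alpha = np.clip(alpha[i] + grad / k_diag[i], 0.0, C)
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                w += delta * y[i] * X[i]                 # incremental weight update
                alpha[i] = new_alpha
                max_delta = max(max_delta, abs(delta))
        if max_delta < tol:                # stop once no variable moves much
            break
    return w, alpha
```

Because each sub-problem touches a single alpha, the inner loop is a plain sweep over the training examples, which is what makes the per-iteration cost low and the empirical scaling close to linear in n.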

References

  1. V. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
  2. E. Osuna, R. Freund, and F. Girosi, “Training Support Vector Machines: An Application to Face Detection,” Proc. CVPR, 1997, pp. 130-136.
  3. J. Platt, “Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines,” Microsoft Research Technical Report MSR-TR-98-14, 1998.
  4. I. Tsochantaridis et al., “Support Vector Machine Learning for Interdependent and Structured Output Spaces,” Proc. ICML, 2004, p. 104.
  5. B. Scholkopf and A. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2001.
  6. T. Joachims, “A Statistical Learning Model of Text Classification with Support Vector Machines,” Proc. SIGIR, 2001, pp. 128-136.
  7. D. Sculley and G.M. Wachman, “Relaxed Online SVMs for Spam Filtering,” Proc. SIGIR, 2007, pp. 415-422.
  8. Y. Yue et al., “A Support Vector Method for Optimizing Average Precision,” Proc. SIGIR, 2007, pp. 271-278.
  9. D. Kim, J. Song, and B. Choi, “Support Vector Machine Learning for Region-Based Image Retrieval with Relevance Feedback,” ETRI J., vol. 29, no. 5, 2007, pp. 700-702. https://doi.org/10.4218/etrij.07.0207.0037
  10. C. Lee et al., “A Multi-Strategic Concept-Spotting Approach for Robust Understanding of Spoken Korean,” ETRI J., vol. 29, no. 2, 2007, pp. 179-188. https://doi.org/10.4218/etrij.07.0106.0204
  11. B. Taskar, C. Guestrin, and D. Koller, “Max-Margin Markov Networks,” NIPS, vol. 16, 2004.
  12. I.W. Tsang, J.T. Kwok, and P.M. Cheung, “Core Vector Machines: Fast SVM Training on Very Large Data Sets,” Journal of Machine Learning Research, vol. 6, 2005, pp. 363-392.
  13. T. Joachims, “Training Linear SVMs in Linear Time,” Proc. KDD, 2006.
  14. C. Lee et al., “Fine-Grained Named Entity Recognition Using Conditional Random Fields for Question Answering,” Proc. AIRS, 2006, pp. 581-587.
  15. C. Chang and C. Lin, “Training ν-Support Vector Classifiers,” Neural Computation, vol. 13, no. 9, 2001, pp. 2119-2147. https://doi.org/10.1162/089976601750399335
  16. K. Crammer and Y. Singer, “On the Learnability and Design of Output Codes for Multiclass Problems,” Machine Learning, vol. 47, 2002, pp. 201-233. https://doi.org/10.1023/A:1013637720281
  17. Y. Altun, I. Tsochantaridis, and T. Hofmann, “Hidden Markov Support Vector Machines,” Proc. ICML, 2003.