http://dx.doi.org/10.5391/IJFIS.2013.13.3.186

Semi-Supervised Recursive Learning of Discriminative Mixture Models for Time-Series Classification  

Kim, Minyoung (Department of Electronics & IT Media Engineering, Seoul National University of Science & Technology)
Publication Information
International Journal of Fuzzy Logic and Intelligent Systems, vol. 13, no. 3, 2013, pp. 186-199
Abstract
We pose pattern classification as a density estimation problem in which mixtures of generative models are learned from partially labeled data. Unlike traditional approaches that estimate the density everywhere in the data space, we focus on the density along the decision boundary, which can yield more discriminative models with superior classification performance. We extend our earlier work on the recursive estimation method for discriminative mixture models to semi-supervised learning setups where some of the data points lack class labels. Our model exploits the mixture structure in the functional gradient framework: it searches for the base mixture component model in a greedy fashion, maximizing the conditional class likelihoods for the labeled data while minimizing the uncertainty of class label prediction for the unlabeled data points. The objective can be effectively imposed as individual mixture component learning on weighted data, so our mixture learning typically becomes highly efficient for popular base generative models such as Gaussians or hidden Markov models. Moreover, unlike the expectation-maximization algorithm, the proposed recursive estimation has several advantages, including no need for a pre-determined mixture order and robustness to the choice of initial parameters. We demonstrate the benefits of the proposed approach on a comprehensive set of evaluations consisting of diverse time-series classification problems in semi-supervised scenarios.
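As a rough illustration of the kind of objective the abstract describes (conditional class likelihood on labeled points combined with a penalty on prediction uncertainty for unlabeled points), the following is a minimal sketch, not the paper's implementation. It assumes one 1-D Gaussian per class as a stand-in for the class-conditional mixtures, measures uncertainty by the entropy of the class posterior, and uses a trade-off weight `gamma`; the function names and these choices are hypothetical, and the recursive greedy component search itself is not shown.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian evaluated at each element of x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def class_posteriors(x, priors, means, variances):
    """P(c | x) by Bayes' rule; one Gaussian per class (an assumption)."""
    x = np.asarray(x, dtype=float)
    # joint[i, c] = P(c) * p(x_i | c)
    joint = np.stack(
        [p * gaussian_pdf(x, m, v) for p, m, v in zip(priors, means, variances)],
        axis=1,
    )
    return joint / joint.sum(axis=1, keepdims=True)


def semi_supervised_objective(x_lab, y_lab, x_unl, priors, means, variances,
                              gamma=1.0):
    """Conditional log-likelihood on labeled data minus gamma times the
    total posterior entropy on unlabeled data (higher is better)."""
    y_lab = np.asarray(y_lab, dtype=int)
    post_lab = class_posteriors(x_lab, priors, means, variances)
    # Log posterior of the true class for each labeled point.
    cond_loglik = np.log(post_lab[np.arange(len(y_lab)), y_lab]).sum()

    post_unl = class_posteriors(x_unl, priors, means, variances)
    # Small constant avoids log(0) for near-deterministic posteriors.
    entropy = -(post_unl * np.log(post_unl + 1e-12)).sum()
    return cond_loglik - gamma * entropy


# Toy data: two labeled points and two unlabeled points.
x_lab, y_lab = [-2.0, 2.0], [0, 1]
x_unl = [-1.5, 1.5]
priors, variances = [0.5, 0.5], [1.0, 1.0]

well_separated = semi_supervised_objective(
    x_lab, y_lab, x_unl, priors, [-2.0, 2.0], variances)
overlapping = semi_supervised_objective(
    x_lab, y_lab, x_unl, priors, [-0.5, 0.5], variances)
print(well_separated > overlapping)
```

In this toy setting, class models that place the decision boundary in a low-density region between the unlabeled points score higher than heavily overlapping ones, which is the behavior the entropy term is meant to encourage.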
Keywords
Mixture models; Bayesian networks; Semi-supervised learning; Functional gradient boosting; Time-series classification