Semiparametric mixture of experts with unspecified gate network

  • Jung, Dahai (Department of Statistics, Sungkyunkwan University) ;
  • Seo, Byungtae (Department of Statistics, Sungkyunkwan University)
  • Received : 2017.03.22
  • Accepted : 2017.05.18
  • Published : 2017.05.31

Abstract

The traditional mixture of experts (ME) models the gate network with a parametric function. If the assumed parametric function does not reflect the true structure of the data, the predictive performance of ME can deteriorate. For example, parametric ME commonly uses a logistic or multinomial logistic model for the gate network, which can be misleading when the data depart substantially from those models. Although more flexible parametric models can be developed by extending the model at hand, such extensions never fully remove the risk of misspecification. To alleviate this weakness of parametric ME, we propose a semiparametric mixture of experts (SME) in which the gate network is estimated nonparametrically. We compare the performance of SME with that of ME and neural networks through several simulation experiments and real data examples.
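
The contrast between a parametric and a nonparametric gate can be illustrated with a minimal sketch. The abstract does not state which nonparametric estimator is used, so the kernel-weighted average below is an assumption made purely for illustration, as are the function names, the Gaussian kernel, the bandwidth `h`, and the toy data. The sketch treats the component responsibilities (posterior membership probabilities, as an EM step would produce) as given and shows only how each type of gate turns them into a mixing probability at a new point.

```python
import math

def logistic_gate(x, a, b):
    """Parametric gate: P(expert 1 | x) under a logistic model (monotone in x)."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def kernel_gate(x0, xs, resp, h):
    """Nonparametric gate (assumed form): Gaussian-kernel-weighted average
    of the responsibilities resp[i] observed at xs[i], evaluated at x0."""
    weights = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    return sum(w * r for w, r in zip(weights, resp)) / sum(weights)

def me_predict(x, gate_prob, expert1, expert2):
    """Two-expert ME prediction: gate-weighted average of the expert means."""
    return gate_prob * expert1(x) + (1.0 - gate_prob) * expert2(x)

# Toy data (hypothetical): expert 1 is active only in the middle of the range,
# a non-monotone pattern that a single logistic curve cannot reproduce.
xs = [i * 0.25 - 3.0 for i in range(25)]          # grid from -3 to 3
resp = [1.0 if abs(x) < 1.0 else 0.0 for x in xs]  # responsibilities for expert 1

for x0 in (-2.5, 0.0, 2.5):
    p = kernel_gate(x0, xs, resp, h=0.3)
    y = me_predict(x0, p, lambda x: 2.0 * x + 1.0, lambda x: -x)
    print(f"x0={x0:+.1f}  gate={p:.3f}  prediction={y:.3f}")
```

In this toy setting the kernel gate is close to 1 near the center and close to 0 at both ends, whereas `logistic_gate` can only increase or decrease in `x`. This is the kind of misspecification the abstract describes.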
