http://dx.doi.org/10.9723/jksiis.2010.15.2.049

BPNN Algorithm with SVD Technique for Korean Document Categorization

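Li, Chenghua (Division of Electronics and Information Engineering, Chonbuk National University)
Byun, Dong-Ryul (Division of Electronics and Information Engineering, Chonbuk National University)
Park, Soon-Choel (Division of Electronics and Information Engineering, Chonbuk National University)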
Publication Information
Journal of Korea Society of Industrial Information Systems / v.15, no.2, 2010, pp. 49-57
Abstract
This paper proposes a Korean document categorization algorithm that combines a Back Propagation Neural Network (BPNN) with Singular Value Decomposition (SVD). BPNN builds a network through its learning process and uses that network to classify documents. The main difficulty in applying BPNN to document categorization is the high dimensionality of the feature space of the input documents. SVD projects the original high-dimensional vectors into a low-dimensional space, captures the important associative relationships between terms, and constructs a semantic vector space. The categorization algorithm is tested and compared on the HKIB-20000/HKIB-40075 Korean Text Categorization Test Collections. Experimental results show that the BPNN algorithm with SVD achieves high effectiveness for Korean document categorization.
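The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy term-document matrix, the reduced dimension k = 2, the hidden-layer size, the learning rate, and the function names `svd_project`, `train_bpnn`, and `predict` are all assumptions made for illustration, and no HKIB data is used. The sketch reduces term vectors with a truncated SVD and then trains a one-hidden-layer network by backpropagation on the reduced vectors.

```python
import numpy as np


def svd_project(term_doc, k):
    """Rank-k truncated SVD of a (terms x docs) matrix.

    Returns the k-dimensional document vectors and a function that folds
    new term vectors into the same reduced space.
    """
    U, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k = U[:, :k], s[:k]

    def fold_in(cols):
        # d_hat = S_k^{-1} U_k^T d for each column d (LSI-style folding-in)
        return (U_k.T @ cols).T / s_k

    return fold_in(term_doc), fold_in


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_bpnn(X, y, hidden=4, epochs=5000, lr=1.0, seed=0):
    """Train a one-hidden-layer sigmoid network with plain backpropagation."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    T = np.eye(n_classes)[y]  # one-hot targets
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, n_classes))
    b2 = np.zeros(n_classes)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)             # forward pass: hidden layer
        O = sigmoid(H @ W2 + b2)             # forward pass: output layer
        dO = (O - T) * O * (1.0 - O)         # squared-error delta at the output
        dH = (dO @ W2.T) * H * (1.0 - H)     # delta propagated back to hidden
        W2 -= lr * (H.T @ dO)
        b2 -= lr * dO.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2


def predict(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).argmax(axis=1)


# Hypothetical toy data: 6 terms x 6 documents, two categories
# that use disjoint vocabularies (terms 0-2 versus terms 3-5).
term_doc = np.array([
    [2, 3, 2, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [2, 2, 3, 0, 0, 0],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 1, 2, 2],
], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])

doc_vecs, fold_in = svd_project(term_doc, k=2)   # 6 documents x 2 dimensions
params = train_bpnn(doc_vecs, labels)

new_doc = np.array([[0, 0, 0, 1, 2, 1]], dtype=float).T  # one unseen column
print("training predictions:", predict(params, doc_vecs))
print("new document category:", predict(params, fold_in(new_doc)))
```

The point of the reduction is that the network's input layer scales with k rather than with the vocabulary size, which is the dimensionality problem the abstract identifies. In a real setting k would be chosen by validation, and the term weights would typically be something like TF-IDF rather than raw counts; both choices are outside what the abstract specifies.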
Keywords
Document categorization; Back Propagation Neural Network algorithm; SVD