http://dx.doi.org/10.6109/jicce.2018.16.2.106

Text Categorization with Improved Deep Learning Methods  

Wang, Xingfeng (Information Engineering College, Eastern Liaoning University)
Kim, Hee-Cheol (Department of Computer Engineering & Institute of Digital Anti-Aging Healthcare (IDA), Inje University)
Abstract
Although deep learning methods such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks are widely used for text categorization, they still have certain shortcomings. CNNs require the text to retain a fixed order and the pooling lengths to be identical, and they do not support collateral (parallel) analysis; LSTMs operate only unidirectionally, and their input/output gates are complex. To address these problems, we improved these traditional deep learning methods in the following ways: we built collateral CNNs that accept disordered text and use variable-length pooling, and we removed the input/output gates when building bidirectional LSTMs. We evaluated the proposed methods on four benchmark datasets for topic and sentiment classification. The best results were obtained by combining LSTM regional embeddings with data convolution. Our method outperforms all previous methods, including existing deep learning methods, in both topic and sentiment classification.
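To make the LSTM modification concrete, below is a minimal sketch, not the authors' code, of a bidirectional LSTM-style layer in which the input and output gates have been removed and only the forget gate is kept, as the abstract describes. The class names (GatelessLSTMCell, BiGatelessLSTM) and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (assumed implementation, not from the paper) of an LSTM-like
# cell without input/output gates, run bidirectionally over a sequence.
import torch
import torch.nn as nn


class GatelessLSTMCell(nn.Module):
    """LSTM-style cell keeping only the forget gate."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.forget = nn.Linear(input_size + hidden_size, hidden_size)
        self.candidate = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x, state):
        h, c = state
        z = torch.cat([x, h], dim=-1)
        f = torch.sigmoid(self.forget(z))           # forget gate
        c = f * c + torch.tanh(self.candidate(z))   # no input gate
        h = torch.tanh(c)                            # no output gate
        return h, (h, c)


class BiGatelessLSTM(nn.Module):
    """Runs the simplified cell forward and backward, concatenating outputs."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.fwd = GatelessLSTMCell(input_size, hidden_size)
        self.bwd = GatelessLSTMCell(input_size, hidden_size)
        self.hidden_size = hidden_size

    def forward(self, x):  # x: (batch, seq_len, input_size)
        batch, seq_len, _ = x.shape
        zeros = x.new_zeros(batch, self.hidden_size)
        state_f, state_b = (zeros, zeros), (zeros, zeros)
        outs_f, outs_b = [], []
        for t in range(seq_len):
            h_f, state_f = self.fwd(x[:, t], state_f)
            h_b, state_b = self.bwd(x[:, seq_len - 1 - t], state_b)
            outs_f.append(h_f)
            outs_b.append(h_b)
        outs_b.reverse()
        # (batch, seq_len, 2 * hidden_size): forward and backward features
        return torch.cat([torch.stack(outs_f, 1), torch.stack(outs_b, 1)], dim=-1)
```

A sequence representation produced this way could then be pooled and fed to a classifier for topic or sentiment labels; the paper's regional embeddings and convolution stage are not reproduced here.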
Keywords
CNN; Disorder; LSTM; Text categorization;