http://dx.doi.org/10.22937/IJCSNS.2022.22.7.43

Enhancing the Text Mining Process by Implementation of Average-Stochastic Gradient Descent Weight Dropped Long-Short Memory  

Annaluri, Sreenivasa Rao (VNR Vignana Jyothi Institute of Engineering and Technology)
Attili, Venkata Ramana (Sreenidhi Institute of Science and Technology)
Publication Information
International Journal of Computer Science & Network Security / v.22, no.7, 2022, pp. 352-358
Abstract
Text mining is an important process for analyzing data collected from sources such as video, audio, and social media. Tools such as natural language processing (NLP) are widely used in real-time applications. Earlier research implemented text mining with long short-term memory (LSTM) networks. In this paper, text mining is performed with the averaged stochastic gradient descent weight-dropped LSTM (AWD-LSTM) technique to obtain better accuracy and performance. The proposed model is demonstrated on Internet Movie Database (IMDB) reviews. It was implemented in Python because of the language's adaptability and flexibility when dealing with massive data sets and databases. The proposed model, which combines LSTM, weight dropping, and embedding, achieved an accuracy of 88.36%, compared with 85.64% for the previous AWD-LSTM model and 85.16% for a plain LSTM model, an improvement of 2.72 and 3.20 percentage points respectively. The loss also decreased from 0.341 to 0.299 with the proposed model.
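The two ingredients named in the abstract, weight dropping and averaged stochastic gradient descent, can be illustrated in isolation. The following is a minimal sketch in plain Python and not the authors' implementation. The function names, the toy weight matrix, the toy objective, and the rescaling of surviving weights by 1 / (1 - p) are all assumptions made for illustration. In AWD-LSTM as it is commonly described, the mask is applied to the hidden-to-hidden weights of the LSTM, and averaging is switched on late in training.

```python
import random


def weight_drop(matrix, p, rng):
    """DropConnect on a weight matrix: zero each weight with probability p.

    Survivors are scaled by 1 / (1 - p), as in inverted dropout (an assumption
    of this sketch), so the expected value of each weight is unchanged.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [[w / keep if rng.random() < keep else 0.0 for w in row]
            for row in matrix]


def averaged_sgd(grad, w0, lr, steps, avg_start):
    """Run SGD on a scalar parameter; return (last iterate, averaged iterate).

    The average covers the iterates produced from step `avg_start` onward.
    """
    if not 0 <= avg_start < steps:
        raise ValueError("avg_start must satisfy 0 <= avg_start < steps")
    w, total, count = w0, 0.0, 0
    for t in range(steps):
        w -= lr * grad(w)          # ordinary SGD update
        if t >= avg_start:         # accumulate iterates for the average
            total += w
            count += 1
    return w, total / count


def demo(seed=0):
    rng = random.Random(seed)

    # Hypothetical 2x3 recurrent weight matrix, masked once before use.
    hidden_to_hidden = [[0.4, -0.2, 0.1], [0.3, 0.5, -0.6]]
    masked = weight_drop(hidden_to_hidden, p=0.5, rng=rng)

    # Toy objective (w - 3)^2; Gaussian noise stands in for minibatch noise.
    def noisy_grad(w):
        return 2.0 * (w - 3.0) + rng.gauss(0.0, 1.0)

    last, avg = averaged_sgd(noisy_grad, w0=0.0, lr=0.1,
                             steps=2000, avg_start=500)
    return masked, last, avg
```

Calling `demo()` returns the masked matrix together with the last and the averaged SGD iterate. On this toy problem the averaged value tends to sit closer to the minimizer at 3 than a typical single iterate, because averaging smooths out gradient noise.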
Keywords
Text Mining; Python; LSTM; AWD-LSTM; IMDB; NLP