http://dx.doi.org/10.1633/JISTaP.2022.10.3.4

An Exploratory Analysis of Online Discussion of Library and Information Science Professionals in India using Text Mining  

Garg, Mohit (Central Library, Indian Institute of Technology Delhi, New Delhi, India; School of Social Science, Indira Gandhi National Open University)
Kanjilal, Uma (School of Social Science, Indira Gandhi National Open University)
Publication Information
Journal of Information Science Theory and Practice / v.10, no.3, 2022, pp. 40-56
Abstract
This paper applies a topic modeling technique to extract the topics of online discussions among library professionals in India. Topic modeling is an established text mining technique widely used for modeling text data from Twitter, Facebook, Yelp, and other social media platforms. The present study modeled the online discussions that Library and Information Science (LIS) professionals posted on Lis Links. The text of these posts was extracted with a program written in R using the "rvest" package. The data was pre-processed to remove blank posts, posts in non-English scripts, punctuation, URLs, emails, etc. Topic modeling with the Latent Dirichlet Allocation algorithm was then applied to the pre-processed corpus to identify the topics associated with the posts. Word frequencies in the corpus were also calculated. The most frequent words included library, information, university, librarian, book, professional, science, research, paper, question, answer, and management. This indicates that LIS professionals actively discussed exams, research, and library operations on the Lis Links forum. The study categorized the online discussions on Lis Links into ten topics, namely "LIS Recruitment," "LIS Issues," "Other Discussion," "LIS Education," "LIS Research," "LIS Exams," "General Information related to Library," "LIS Admission," "Library and Professional Activities," and "Information Communication Technology (ICT)." The majority of the posts belonged to "LIS Exams," followed by "Other Discussion" and "General Information related to Library."
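The pre-processing and word-frequency steps described in the abstract can be illustrated with a minimal sketch. The study itself used R with "rvest"; the Python below is only an illustration, not the authors' program. The sample posts, the stop-word list, the ASCII-only test for English text, and all function names are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical sample posts standing in for scraped forum text (not data from the study).
posts = [
    "Library science question paper: see http://example.com/paper.pdf",
    "",
    "Contact librarian@example.org about the university library job!",
    "पुस्तकालय विज्ञान",
]

URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")

# Tiny illustrative stop-word list (an assumption); a real run would use a fuller one.
STOP_WORDS = {"the", "about", "see", "a", "an", "of", "and"}


def is_english(text: str) -> bool:
    """Crude assumption: treat a post as English if every character is ASCII."""
    return text.isascii()


def clean(text: str) -> list[str]:
    """Lowercase, drop URLs and emails, strip punctuation and digits, remove stop words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def preprocess(raw_posts: list[str]) -> list[list[str]]:
    """Remove blank and non-English posts, then tokenize what remains."""
    kept = [p for p in raw_posts if p.strip() and is_english(p)]
    docs = [clean(p) for p in kept]
    return [d for d in docs if d]


def word_frequencies(docs: list[list[str]]) -> Counter:
    """Count how often each token occurs across the whole corpus."""
    return Counter(tok for doc in docs for tok in doc)


docs = preprocess(posts)
freq = word_frequencies(docs)
print(freq.most_common(3))
```

The token lists in `docs` are the kind of input a Latent Dirichlet Allocation implementation would take after conversion to a document-term matrix. That step is omitted here because it depends on an external library.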
Keywords
discussion forum; Lis Links; text mining; topic modelling; Latent Dirichlet Allocation