[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3743/KOSIM.2015.32.1.135

Investigation of Topic Trends in Computer and Information Science by Text Mining Techniques: From the Perspective of Conferences in DBLP

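Kim, Su Yeon (Yonsei University)
Song, Sung Jeon (Graduate School, Department of Library and Information Science, Yonsei University)
Song, Min (Department of Library and Information Science, Yonsei University)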

Publication Information

Journal of the Korean Society for Information Management / v.32, no.1, 2015, pp. 135-152

Abstract

This paper explores the field of Computer and Information Science by applying text mining techniques to conference data available in DBLP (Digital Bibliography & Library Project). Although bibliometric analysis is the most prevalent approach to investigating the dynamics of a research field, we instead examine those dynamics using Latent Dirichlet Allocation (LDA)-based multinomial topic modeling. For this study, we collect 236,170 documents from 353 Computer and Information Science conferences in DBLP, aiming to cover the field as broadly as possible. We analyze the topic modeling results together with the datasets collected for the period 2000 to 2011, including the top authors and top conferences per topic. We identify four patterns in topic trends in the field during this period: growing (network-related topics), shrinking (AI and data mining-related topics), continuing (web, text mining, information retrieval, and database-related topics), and fluctuating (HCI, information system, and multimedia system-related topics).
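The abstract does not say how a topic is assigned to one of the four patterns. As a rough illustration only, the sketch below labels a yearly topic-share series using a least-squares slope and the spread of the residuals around the fitted line. The function name `classify_trend`, the thresholds `slope_tol` and `wobble_tol`, and the rule itself are assumptions made for this example and are not taken from the paper.

```python
from statistics import mean, pstdev


def classify_trend(shares, slope_tol=0.002, wobble_tol=0.25):
    """Label a yearly topic-share series with one of four trend patterns.

    shares: topic proportion per year, oldest first (e.g. 2000..2011).
    slope_tol, wobble_tol: illustrative thresholds, not values from the paper.
    """
    years = range(len(shares))
    x_bar, y_bar = mean(years), mean(shares)

    # Ordinary least-squares slope of share against year index.
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, shares))
    slope = sxy / sxx

    # Spread of the residuals around the fitted line, relative to the mean share.
    residuals = [y - (y_bar + slope * (x - x_bar)) for x, y in zip(years, shares)]
    wobble = pstdev(residuals) / y_bar if y_bar else 0.0

    if wobble > wobble_tol:
        return "fluctuating"
    if slope > slope_tol:
        return "growing"
    if slope < -slope_tol:
        return "shrinking"
    return "continuing"


# Made-up twelve-year series, one per pattern (not data from the study).
examples = {
    "rising": [0.02 + 0.003 * i for i in range(12)],
    "falling": [0.06 - 0.003 * i for i in range(12)],
    "flat": [0.04] * 12,
    "zigzag": [0.02, 0.06] * 6,
}
for name, series in examples.items():
    print(name, classify_trend(series))
```

Checking the residual spread before the slope means that a series with large year-to-year swings is labeled fluctuating even when it also drifts slightly upward or downward. Both thresholds would need tuning against real per-year topic proportions.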

Keywords

text mining; topic modeling; DMR; topic dynamics; research trend

Citations & Related Records

References

1	Cutting, D., Karger, D., & Pedersen, J. (1993). Constant interaction-time scatter/gather browsing of very large document collections. In Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 126-134.
2	Frank, E., Paynter, G., Witten, I., Gutwin, C., & Nevill-Manning, C. (1999). Domain-specific keyphrase extraction. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, 668-673.
3	Glänzel, W. (2012). Bibliometric methods for detecting and analysing emerging research topics. El profesional de la información, 21(2), 194-201.
4	Griffiths, T., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(suppl. 1), 5228-5235.
5	HaCohen-Kerner, Y., Gross, Z., & Masa, A. (2005). Automatic extraction and learning of keyphrases from scientific articles. In Proceedings of the 6th International Conference on Computational Linguistics and Intelligent Text Processing, 657-669.
6	He, D., & Parker, S. (2010). Topic dynamics: an alternative model of 'bursts' in streams of topics. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, 443-452.
7	Hulth, A. (2003). Improved automatic keyword extraction given more linguistic knowledge. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, 216-223.
8	Janssens, F., Glänzel, W., & De Moor, B. (2008). A hybrid mapping of information science. Scientometrics, 75(3), 607-631.
9	Kleinberg, J. (2003). Bursty and hierarchical structure in streams. Data Mining and Knowledge Discovery, 7(4), 373-397.
10	Liu, F., Liu, F., & Liu, Y. (2008). Automatic keyword extraction for the meeting corpus using supervised approach and bigram expansion. In Proceedings of 2008 IEEE Workshop on Spoken Language Technology, 181-184.
11	Liu, Z., Huang, W., Zheng, Y., & Sun, M. (2010). Automatic keyphrase extraction via topic decomposition. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 366-376.
12	McCallum, A. (2002). MALLET: A Machine learning for language toolkit. Retrieved from http://mallet.cs.umass.edu
13	Matsuo, Y., & Ishizuka, M. (2004). Keyword extraction from a single document using word co-occurrence statistical information. International Journal on Artificial Intelligence Tools, 13(1), 157-169.
14	Merriam-Webster and American Heritage Dictionary. Retrieved from http://www.britannica.com/EBchecked/topic/19759/The-American-Heritage-Dictionary
15	Mimno, D., & McCallum, A. (2008). Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. Retrieved from http://arxiv.org/abs/1206.3278v1
16	Tang, X., Yang, C. C., & Song, M. (2013). Understanding the evolution of multiple scientific research domains using a content and network approach. Journal of the American Society for Information Science and Technology, 64(5), 1065-1075.
17	Treeratpituk, P., & Callan, J. (2006). Automatically labeling hierarchical clusters. In Proceedings of the 2006 International Conference on Digital Government Research, 167-176.
18	Wan, X., Yang, J., & Xiao, J. (2007). Towards an iterative reinforcement approach for simultaneous document summarization and keyword extraction. Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, 552-559.
19	Wang, X., Mohanty, N., & McCallum, A. (2005). Group and topic discovery from relations and text. The 11th ACM SIGKDD International conference on Knowledge Discovery and Data Mining Workshop on Link Discovery: Issues, Approaches & Applications, 28-35.
20	Wang, C., Blei, D., & Heckerman, D. (2012). Continuous time dynamic topic models. Retrieved from http://arxiv.org/abs/1206.3298v1
21	White, D., & McCain, W. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972-1995. Journal of the American Society for Information Science, 49(4), 327-355.
22	Xu, J., Marshall, B., Kaza, S., & Chen, H. (2004). Analyzing and visualizing criminal network dynamics: A case study. In H. Chen, R. Moore, D. D. Zeng, & J. Leavitt (Eds.), Lecture Notes in Computer Science, 3073: Intelligence and Security Informatics, 359-377. Berlin: Springer.
23	Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30, 107-117.
24	Adamic, L., & Adar, E. (2005). How to search a social network. Social Networks, 27(3), 187-203.
25	Blei, D., & Lafferty, J. (2006). Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning, 113-120.
26	Blei, D., Ng, A., & Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022.
27	Buckland, M. (2012). What kind of science can information science be? Journal of the American Society for Information Science and Technology, 63(1), 1-7.
28	Chen, C., & Carr, L. (1999). Visualizing the evolution of a subject domain: A case study. In Proceedings of the conference on Visualization '99: celebrating ten years, 449-452.

4	Yeong Jun Yoo. (2016). A Bibliographic Study on the Calvin Theological Journal. Journal of the Korean BIBLIA Society for Library and Information Science, 27(4), 125.
3	Eun-Gyoung Seo. (2015). Informetric Analysis of Research Trends in The Journal of Korean Biblio Society for Library and Information Science. Journal of the Korean BIBLIA Society for Library and Information Science, 26(3), 315.
2	Jo-Ah Kim. (2016). Analyzing the Research Fronts of Women's Studies in Korea Using Citation Image Makers Profiling. Journal of the Korean Society for Information Management, 33(2), 201.
6	Si Yeong Lim. (2016). A Text Mining Analysis for Research Trend about Information and Communication Technology in Construction Automation. Korean Journal of Construction Engineering and Management, 17(6), 13.
