http://dx.doi.org/10.13088/jiis.2014.20.2.093

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis  

Kim, Jieun (Graduate School of Business IT, Kookmin University)
Kim, Namgyu (Graduate School of Business IT, Kookmin University)
Cho, Yoonho (College of Business Administration, Kookmin University)
Publication Information
Journal of Intelligence and Information Systems / v.20, no.2, 2014, pp. 93-107
Abstract
In this paper, we report our findings on user-perspective issue clustering based on multi-layered two-mode network analysis. The work matters for companies that collect data about customer needs. Most companies have failed to uncover customers' needs for products or services from demographic data such as age and income level, or from purchase history. Because they rely excessively on limited internal data, most recommendation systems do not give decision makers business information suited to current business circumstances. Part of the problem is the increasing regulation of personal data gathering and privacy. Such regulation makes demographic or transaction data harder to collect and is a significant hurdle for traditional recommendation approaches, which demand a great deal of personal data or transaction logs. Our motivation is our belief, supported by evidence, that most customers' requirements for products can be analyzed effectively and efficiently from unstructured textual data such as Internet news text. To derive users' requirements from textual data obtained online, the proposed approach constructs two two-mode networks, a user-news network and a news-issue network, and integrates them into one quasi-network that serves as the input for issue clustering. One contribution of this research is a methodology that uses enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. Building multi-layered two-mode networks of news logs requires tools for text mining and topic analysis. We used SAS Enterprise Miner 12.1, which provides a text miner module and a cluster module for textual data analysis, and NetMiner 4 for network visualization and analysis.
Our approach to user-perspective issue clustering consists of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we use a crawler to collect visit logs for news sites. After the unstructured news article data is gathered, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed from access patterns derived from web transaction logs. The two two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence and compare the clustering results from statistical tools with those from network analysis. We performed an experiment on a large dataset to build a multi-layered two-mode network, and then compared the issue-clustering results from SAS with those from network analysis. The experimental dataset came from a web site ranking site and from the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles for 5,000 panelists over one year. User-article and article-issue networks were constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results, obtained with the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), are consistent with the results from SAS clustering. Despite extensive efforts to provide user information through recommendation systems, most projects succeed only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support for decision-making in companies because it derives additional user-related data from unstructured textual data.
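The network merging phase can be illustrated with a small sketch. The abstract does not state how NetMiner combines the two networks, so the matrix product used here is an assumption: it is one common way to chain two two-mode networks that share a node set (here, articles). The toy matrices and the function name `merge_two_mode` are hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical toy data, not from the paper.
# Rows: 4 users, columns: 5 articles (1 = the user viewed the article).
user_article = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
])

# Rows: the same 5 articles, columns: 3 issues (1 = the article covers the issue).
article_issue = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
])


def merge_two_mode(left, right):
    """Chain two two-mode networks through their shared node set.

    Assumption: merging is done with a matrix product, so entry (u, i)
    counts the viewed articles of user u that cover issue i.
    """
    if left.shape[1] != right.shape[0]:
        raise ValueError("the shared mode (articles) must have the same size")
    return left @ right


user_issue = merge_two_mode(user_article, article_issue)
print(user_issue)
```

Each row of `user_issue` is a user's interest profile over issues, and each column is an issue's profile over users. The column profiles are what an issue-clustering step would compare.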
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
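The clustering phase can be sketched as follows. This is a minimal illustration, not the authors' NetMiner or SAS procedure. It assumes that structural equivalence is measured as the Euclidean distance between issue profiles (the columns of a user-issue matrix), and it uses a small PAM-style k-medoids routine with a greedy build step and a first-improvement swap step. The toy matrix and all function names are hypothetical.

```python
import numpy as np
from itertools import product


def structural_equivalence_distance(user_issue):
    """Euclidean distance between issue profiles (columns of user_issue).

    Assumption: two issues are structurally equivalent to the degree that
    the same users are tied to them with similar weights.
    """
    profiles = user_issue.T.astype(float)  # one row per issue
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def total_cost(dist, medoids):
    """Sum of distances from every item to its nearest medoid."""
    return dist[:, medoids].min(axis=1).sum()


def pam(dist, k):
    """PAM-style k-medoids on a precomputed distance matrix."""
    n = dist.shape[0]
    # Build step: start from the most central item, then add greedily.
    medoids = [int(dist.sum(axis=1).argmin())]
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(min(candidates,
                           key=lambda c: total_cost(dist, medoids + [c])))
    # Swap step: accept any medoid/non-medoid swap that lowers the cost.
    improved = True
    while improved:
        improved = False
        current = total_cost(dist, medoids)
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids.copy()
            trial[pos] = cand
            cost = total_cost(dist, trial)
            if cost < current - 1e-12:
                medoids, current, improved = trial, cost, True
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels


# Hypothetical user-issue quasi-network: 4 users (rows) x 4 issues (columns).
user_issue = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [0, 0, 4, 3],
    [0, 1, 3, 4],
])

dist = structural_equivalence_distance(user_issue)
medoids, labels = pam(dist, k=2)
print("medoid issues:", medoids)
print("cluster label per issue:", labels)
```

In this toy example the first two issues draw the same users and fall into one cluster, and the last two fall into the other. The same distance matrix could also be passed to an MDS routine to plot the issues in two dimensions.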
Keywords
Data Mining; Issue Clustering; Social Network Analysis; Topic Analysis