
Extraction of Latent Topic-based Communities in Blogspace  

Shin, Jung-Hwan (Coding Inspector Team, 휴어테크)
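Yoon, Seok-Ho (Department of Electronics, Computer and Communication Engineering, Hanyang University)
Kim, Sang-Wook (Department of Electronics, Computer and Communication Engineering, Hanyang University)
Park, Sun-Ju (School of Business, Yonsei University)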
Abstract
In blogspace, some posts deal with a common topic, and some bloggers are interested in those posts. In this paper, we define a blog community as the group formed by such bloggers and posts. Identifying blog communities makes it possible to establish business policies for targeted marketing, sharing high-quality data, and stimulating activity in the blogspace. Unlike internet cafes, however, blog communities have no explicit membership, so it is not easy to identify their members. We propose an effective approach for extracting the blog community related to a given topic. First, we choose seed posts that are highly related to the topic and use them to select bloggers related to the topic. Next, we use the selected bloggers to select further posts related to the topic. By repeating these two steps, we find all the posts and bloggers in blogspace that belong to the community for the given topic. We verify the superiority of the proposed approach by analyzing the extracted blog communities.
Keywords
Blogosphere; Extraction of Blog Communities; Data Mining;
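
The alternating expansion described in the abstract (seed posts, then bloggers, then posts, repeated until nothing new is found) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how relatedness is measured, so the sketch assumes a hypothetical input of (blogger, post) interaction pairs and a hypothetical ratio threshold: a blogger joins when a large enough share of their posts is already in the community, and a post joins when a large enough share of its bloggers is. The names `extract_community`, `blogger_threshold`, and `post_threshold` are illustrative only.

```python
from collections import defaultdict


def extract_community(interactions, seed_posts,
                      blogger_threshold=0.5, post_threshold=0.5):
    """Grow a (bloggers, posts) community outward from seed posts.

    interactions: iterable of (blogger, post) pairs. This input format and
    both thresholds are assumptions made for illustration; they are not
    taken from the paper.
    """
    posts_of = defaultdict(set)     # blogger -> posts the blogger interacted with
    bloggers_of = defaultdict(set)  # post -> bloggers who interacted with it
    for blogger, post in interactions:
        posts_of[blogger].add(post)
        bloggers_of[post].add(blogger)

    posts = set(seed_posts)
    bloggers = set()
    while True:
        # Step 1: select bloggers related to the topic via the current posts.
        added_bloggers = {
            b for b, ps in posts_of.items()
            if b not in bloggers
            and len(ps & posts) / len(ps) >= blogger_threshold
        }
        bloggers |= added_bloggers

        # Step 2: select posts related to the topic via the selected bloggers.
        added_posts = {
            p for p, bs in bloggers_of.items()
            if p not in posts
            and len(bs & bloggers) / len(bs) >= post_threshold
        }
        posts |= added_posts

        # Both sets only ever grow, so the loop ends once a round adds nothing.
        if not added_bloggers and not added_posts:
            return bloggers, posts


# Toy data (made up): carol's posts are unrelated to the seed post p1.
interactions = [
    ("alice", "p1"), ("alice", "p2"),
    ("bob", "p2"), ("bob", "p3"),
    ("carol", "p4"), ("carol", "p5"),
]
bloggers, posts = extract_community(interactions, seed_posts={"p1"})
```

On this toy data the community grows from p1 to alice, then to p2, bob, and p3, while carol and her posts stay outside. Because members are only added and never removed, the loop is guaranteed to terminate; a real system would likely replace the simple ratio test with a topic-relevance score.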