• Title/Summary/Keyword: blog mining (블로그 마이닝)


An Analysis of Information Diffusion in the Blog World (블로그 월드에서 정보 파급 분석)

  • Kwon, Yong-Suk;Kim, Sang-Wook;Park, Sun-Ju
    • Proceedings of the Korea Information Processing Society Conference / 2008.05a / pp.223-226 / 2008
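  • With advances in Internet technology, social networks have emerged online as well. The blog world is a representative online social network. Bloggers, the members of the blog world, can create information and can also form explicit relationships with other bloggers in order to obtain information; through these relationships they make up a blog network, which is an online social network. Social network theory holds that information diffuses through a social network via the relationships among its members. However, when the blog network is compared with the information diffusion history actually recorded in the blog world, information is found to diffuse between members who have no relationship, contrary to social network theory. In addition, there are cases in which information diffuses explosively. In this paper, we show that these two phenomena are related and propose an analysis method for identifying their causes. The proposed method is as follows. First, we derive all candidate causes that could induce information diffusion between members who have no relationship. Next, we use clustering, a data mining technique, to derive the groups of information that exhibit explosive diffusion. We then use characterization, a data mining method, to obtain the correlation between the derived groups of information and the candidate causes. The blog world stores data on its members, the relationships among them, and the information diffusion history. Using data from a real blog world, this paper identifies the causes that induce explosive information diffusion in the blog world and describes the characteristics of those causes.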

An Approach for Determining Propensities of Blog Networks (블로그 연결망의 성향 판정 방안)

  • Yoon, Seok-Ho;Park, Sun-Ju;Kim, Sang-Wook
    • Journal of KIISE:Computing Practices and Letters / v.15 no.3 / pp.178-188 / 2009
  • A blog is a personal website where its owner publishes articles for others. A blog can have relationships with other blogs. In this paper, we define a blog network as a network composed of blogs connected by such relationships. Blog networks can have two different propensities, characterized by the articles published in the blogs: an information-valued propensity and a friendship-valued propensity. The degree of each propensity of a blog network plays an important role in deciding business policies for blog networks. In this paper, we address the problem of determining the degrees of the two propensities of a given blog network. First, we determine the degree of the propensity of every relationship, the basic unit of a blog network, by using classification, one of the data mining functionalities. Then, using the result thus obtained, we compute the degrees of the two propensities of the whole blog network. We also propose a method to solve the problem that the degree of the propensities depends on the size of the blog network. To verify the superiority of the proposed approach, we perform extensive experiments using a huge volume of real-world blog data. The results show that our approach provides a high accuracy of around 93% in determining the degrees of both propensities of relationships between two arbitrary blogs. We also verify the applicability of the proposed approach by showing that it determines the degrees of the information-valued and friendship-valued propensities correctly in real-world blog networks.

Research on Methods for Processing Nonstandard Korean Words on Social Network Services (소셜네트워크서비스에 활용할 비표준어 한글 처리 방법 연구)

  • Lee, Jong-Hwa;Le, Hoanh Su;Lee, Hyun-Kyu
    • Journal of Korea Society of Industrial Information Systems / v.21 no.3 / pp.35-46 / 2016
  • Social network services (SNS), which help users build relationship networks and freely share particular interests or activities by posting comments, photos, videos, etc. on online communities such as blogs, have been widely adopted and have developed into a social phenomenon. Several studies have explored patterns and valuable information in social network data via text mining techniques such as opinion mining and semantic analysis. To improve the efficiency of text mining, keyword-based approaches have been applied, but most researchers have pointed to limitations arising from the rules of Korean orthography. This research aims to construct a database of non-standard Korean words that are difficult to handle in data mining, such as abbreviations, slang, unusual expressions, and emoticons, in order to overcome the limitations of keyword-based text mining techniques. Based on a study of subjective opinions about specific topics on blogs, this research extracted non-standard words that were found to be useful in the text mining process.

The Blog Polarity Classification Technique using Opinion Mining (오피니언 마이닝을 활용한 블로그의 극성 분류 기법)

  • Lee, Jong-Hyuk;Lee, Won-Sang;Park, Jea-Won;Choi, Jae-Hyun
    • Journal of Digital Contents Society / v.15 no.4 / pp.559-568 / 2014
  • Previous polarity classification based on sentiment analysis relies on sentence rules derived from product reviews that carry rating points. This approach is difficult to apply to blogs, which have no product ratings. Moreover, product reviews can be fabricated by paid commenters and by managers who use the web site, so it is not easy to find reliable reviews of a product or store. Considering these problems, if we analyze blogs, which contain personal and frank opinions, and classify their polarity, it becomes possible to understand opinions about a product or store correctly. This paper proposes extracting high-frequency vocabulary from blogs in several domains and choosing topic words. We then apply a sentiment analysis technique and classify the polarity of the blog contents. To evaluate the performance of the sentiment analysis, we use Precision, Recall, and F-Score, measures from the information retrieval field. The evaluation shows that the proposed sentiment analysis classifies polarity better than previous techniques that use sentence rules based on product reviews.
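
As an illustration of the evaluation measures named in the abstract above, the following minimal sketch computes Precision, Recall, and F-Score for a binary polarity classifier. The function name, the label strings, and the toy data are assumptions made for this example; they do not come from the paper.

```python
def precision_recall_f1(predicted, actual, positive="positive"):
    """Compute precision, recall, and F1 for one target label.

    `predicted` and `actual` are equal-length sequences of labels.
    The label names are hypothetical; any hashable values work.
    """
    pairs = list(zip(predicted, actual))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    # False positives: predicted positive but actually something else.
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    # False negatives: actually positive but predicted something else.
    fn = sum(1 for p, a in pairs if p != positive and a == positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy example: four blog posts with predicted and true polarity.
predicted = ["positive", "positive", "negative", "negative"]
actual = ["positive", "negative", "positive", "negative"]
print(precision_recall_f1(predicted, actual))
```

With one true positive, one false positive, and one false negative, all three measures equal 0.5 in this toy case.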

A Decision Method for Propensities of Blog Networks (블로그 연결망의 성향 판정 방안)

  • Yoon, Seok-Ho;Kim, Sang-Wook;Park, Sun-Ju
    • Proceedings of the Korea Information Processing Society Conference / 2007.05a / pp.65-66 / 2007
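  • This paper discusses an approach for determining the degrees of the information-valued propensity and the friendship-valued propensity of a given blog network. First, we use classification, one of the data mining techniques, to determine the degree of the propensity of each relationship, the basic unit of a blog network, and then use the result to determine the degree of the overall propensity of the given network. We also propose a technique for solving the problem that the degree of propensity depends on the size of the blog network. We verify the superiority of the proposed approach through experiments with real blog data.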

Analysis of Perception on Happy Housing Using Blog Mining Technique (블로그 마이닝을 활용한 행복주택의 인식 분석)

  • Hwang, Ji Hyoun
    • The Journal of the Korea Contents Association / v.22 no.2 / pp.211-223 / 2022
  • This study examines whether blog mining can be used to collect public opinion in the field of housing policy. It collected blog posts containing the keyword 'Happy Housing', extracted the main keywords from them, and analyzed the public's perception through keyword analysis and word cluster analysis. A total of 137,002 blog posts were used as analysis data, covering May 2013, when social discussion about Happy Housing spread, to August 2021. The period was divided into three stages in consideration of major housing policies and data collection, and the words derived for each stage were analyzed. The results are as follows. In the keyword analysis, words related to the location, number, size, and occupancy conditions of Happy Housing are of high importance overall. By stage, government policy implementation is highly important in the first stage, the application process for Happy Housing in the second stage, and recruitment notices, occupancy qualifications, and rental conditions in the third stage. In the cluster analysis, project progress, application process, and project area emerged as main themes at all stages. In particular, policy implementation and implementation plans emerged as main themes in the first stage, occupancy qualifications and financial support in the second stage, and policy implementation and occupancy qualifications in the third stage. These results point to the potential of blog mining as a method of collecting public opinion: blogs share policy-related information and reflect social issues, and they can be used to evaluate whether policies are delivered and to infer the public's participation in policies.

Extraction of Latent Topic-based Communities in Blogspace (블로그 월드에서 주제 중심의 잠재적 커뮤니티 추출 방안)

  • Shin, Jung-Hwan;Yoon, Seok-Ho;Kim, Sang-Wook;Park, Sun-Ju
    • Journal of KIISE:Databases / v.37 no.1 / pp.56-69 / 2010
  • In blogspace, there are posts that deal with a common topic and bloggers who are interested in these posts. In this paper, we define a blog community as a group of these bloggers and posts. With a blog community, we can establish various business policies for target marketing, sharing high-quality data, and stimulating activity in the blogspace. Unlike in internet cafes, bloggers participate in blog communities without explicit membership, so it is not easy to identify the members of a community. In this paper, we propose an effective approach for extracting the blog community that is related to a given topic. First, we choose seed posts that are highly related to the given topic and, using the seed posts, select bloggers who are related to the topic. Then, using the selected bloggers, we select posts that are related to the topic. By repeating this process, we find all the posts and bloggers in blogspace that are members of the community related to the given topic. We verify the superiority of the proposed approach by analyzing the extracted blog communities.
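
The alternating expansion described in the abstract above (seed posts, then bloggers, then posts, repeated) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how relatedness is measured, so the count thresholds `min_posts` and `min_bloggers`, the `(blogger, post)` interaction pairs, and the function name are all hypothetical stand-ins, not the authors' method.

```python
from collections import Counter


def extract_community(seed_posts, interactions,
                      min_posts=2, min_bloggers=2, max_rounds=10):
    """Grow a topic community by alternating between posts and bloggers.

    `interactions` is a set of (blogger, post) pairs, meaning the blogger
    wrote or reacted to the post. The thresholds are assumed stand-ins
    for whatever relatedness test a real system would apply.
    """
    posts = set(seed_posts)
    bloggers = set()
    for _ in range(max_rounds):
        # Step 1: bloggers linked to enough posts already in the community.
        blogger_hits = Counter(b for b, p in interactions if p in posts)
        new_bloggers = bloggers | {
            b for b, n in blogger_hits.items() if n >= min_posts
        }
        # Step 2: posts linked to enough of the selected bloggers.
        post_hits = Counter(p for b, p in interactions if b in new_bloggers)
        new_posts = posts | {
            p for p, n in post_hits.items() if n >= min_bloggers
        }
        # Stop once a full round adds nothing new.
        if new_bloggers == bloggers and new_posts == posts:
            break
        bloggers, posts = new_bloggers, new_posts
    return bloggers, posts


# Toy data: bloggers "a" and "b" share three posts; "c" is only loosely linked.
interactions = {
    ("a", "p1"), ("a", "p2"), ("a", "p3"),
    ("b", "p1"), ("b", "p2"), ("b", "p3"),
    ("c", "p3"), ("c", "p4"),
}
community_bloggers, community_posts = extract_community({"p1", "p2"}, interactions)
print(sorted(community_bloggers), sorted(community_posts))
```

In the toy data, post "p3" joins the community through bloggers "a" and "b", while blogger "c" stays out because it touches only one community post.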

Link-Based Clustering in Blogosphere (블로그 공간에서의 링크 기반 클러스터링 방안)

  • Song, Suk-Soon;Yoon, Seok-Ho;Kim, Sang-Wook
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.3 / pp.42-49 / 2009
  • This paper addresses the clustering of blogs and posts in the blogosphere. First, we model the blogosphere as a social network in which blogs and posts correspond to nodes and interactions on posts by blogs correspond to links. Next, for clustering in the blogosphere, we employ LinkClus, a link-based algorithm that finds clusters of nodes in a network effectively and efficiently. For more accurate clustering, we propose two refinements: (1) changing the granularity from blogs to folders, and (2) removing blogs and posts that are highly likely to introduce noise. Finally, we verify the effectiveness of the proposed approach by showing how similar the posts and blogs in the same cluster are to one another in terms of their contents.

Automatic Classification of Advertising Restaurant Blogs Using Machine Learning Techniques (기계학습기법을 이용한 광고 외식 블로그의 자동분류)

  • Chang, Jae-Young;Lee, Byung-Jun;Cho, Se-Jin;Han, Da-Hye;Lee, Kyu-Hong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.16 no.2 / pp.55-62 / 2016
  • Recently, the number of users who choose a restaurant based on information provided by blogs has been increasing significantly. However, most of these blogs are unreliable because domestic restaurant blogs are dominated by advertising posts written by 'power bloggers'. Thus, to ensure the reliability of blogs, it is necessary to filter out advertising blogs, which are sometimes false or exaggerated. In this paper, we propose a method of distinguishing advertising blogs using an automatic classification technique. In the proposed technique, we first manually collected advertising restaurant blogs and then analyzed the features that are commonly found in those blogs. Using the extracted features, we determined whether a given blog is an advertising blog by applying automatic classification algorithms. Additionally, we selected the features and the algorithm that gave the best classification performance in comparative experiments.

A Technique for Extracting GeoSemantic Knowledge from Micro-blog (마이크로 블로그기반의 공간 지식 추출 기법연구)

  • Ha, Su-Wook;Nam, Kwang-Woo;Ryu, Keun-Ho
    • Spatial Information Research / v.20 no.2 / pp.129-136 / 2012
  • Recently, international organizations such as ISO/TC211, OGC, and INSPIRE (Infrastructure for Spatial Information in Europe) have been making efforts to share geospatial data using semantic web technologies. In addition, smartphones and social networking services give participants community-based opportunities to share issues about social phenomena tied to a geographic area, and many researchers are trying to find methods for extracting issues from this data. However, serviceable spatial ontologies are still insufficient at the application level, and studies of spatial information extraction from SNS have focused on finding a user's location or on geocoding through text mining. Therefore, a study on extracting spatial phenomena from social media information and converting them into geosemantic knowledge would be very useful. In this paper, we propose a framework for extracting keywords from a micro-blog, one of the social media services, finding their relationships using data mining techniques, and converting them into spatiotemporal knowledge. The results of this study could be used to implement a related system, serving as a procedure and ontology model for constructing geosemantic issues. This is expected to improve the effectiveness of finding, publishing, and analyzing spatial issues.