• Title/Summary/Keyword: Blog Search

Study for Blog Clustering Method Based on Similarity of Titles (주제 유사성 기반 클러스터링을 이용한 블로그 검색기법 연구)

  • Lee, Ki-Jun;Lee, Myung-Jin;Kim, Woo-Ju
    • Journal of Intelligence and Information Systems / v.15 no.2 / pp.61-74 / 2009
  • With the exponential growth of blogs, a great deal of important data now appears on them. However, because the main topics of blog pages differ considerably from those of general web pages, general search engines cannot solve some blog search problems. Many researchers have therefore studied blog-specific search methods to help users find useful information on blogs. We present a blog classification method based on the similarity of titles. First, we analyze blogs and blog search engines to identify the problems of current blog search and possible solutions. Second, by applying our similarity algorithm to blog titles, we discuss how to develop a clustering method specific to blogs. Finally, we build a prototype system to evaluate the effectiveness of our algorithm, and present conclusions and future work. We expect this algorithm to strengthen current search engines.
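
The abstract does not specify the similarity algorithm, so the following is only a minimal sketch of clustering blog posts by title similarity. It assumes Jaccard similarity over title word sets and a greedy single-pass grouping; the function names and the `threshold` value are hypothetical and not taken from the paper.

```python
import re


def title_tokens(title: str) -> set[str]:
    """Lowercase a title and split it into a set of word tokens."""
    return set(re.findall(r"\w+", title.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_titles(titles: list[str], threshold: float = 0.3) -> list[list[str]]:
    """Greedy clustering: a title joins the first cluster whose seed title
    is at least `threshold` similar; otherwise it starts a new cluster."""
    clusters: list[tuple[set[str], list[str]]] = []
    for title in titles:
        tokens = title_tokens(title)
        for seed_tokens, members in clusters:
            if jaccard(tokens, seed_tokens) >= threshold:
                members.append(title)
                break
        else:
            # No existing cluster was similar enough (assumed rule).
            clusters.append((tokens, [title]))
    return [members for _, members in clusters]
```

Comparing each title only against a cluster's seed keeps the sketch short; a real system would more likely compare against a centroid or use a language-aware tokenizer for Korean titles.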

The Topic-Rank Technique for Enhancing the Performance of Blog Retrieval (블로그 검색 성능 향상을 위한 주제-랭크 기법)

  • Shin, Hyeon-Il;Yun, Un-Il;Ryu, Keun-Ho
    • Journal of the Korea Society of Computer and Information / v.16 no.1 / pp.19-29 / 2011
  • As blogs have attracted growing attention as personal media, a variety of ranking algorithms have been proposed for blog search. These algorithms were adapted to the structural features of blogs, which differ from those of typical web sites; they measure a blog's reputation or popularity from interactions such as links, comments, and trackbacks, and reflect it in the search system. Actual blog search systems use not only blog ranks but also search words, a time factor, and so on. Nevertheless, they may not produce desirable results. In this paper, we suggest a topic-rank technique that finds blogs strongly associated with given topics. The technique ranks the relations between blogs and the indexed words of blog posts, as well as the topics representing those posts. With the proposed technique, blog rankings by correlation with search words can be computed effectively during blog retrieval. Comparing the precision and coverage ratios of our blog retrieval system, which applies the proposed topic-rank technique, we find that it performs better than other systems.

Efficient Blog Retrieval System by Topic-based Weighting (주제어 가중치 기법에 의한 효율적인 블로그 검색 시스템)

  • Shin, Hyeon-Il;Yun, Un-Il;Ryu, Keun-Ho
    • Journal of the Korea Society of Computer and Information / v.15 no.4 / pp.1-9 / 2010
  • In the new generation of the Web, commonly called "Web 2.0", blogging has made it easier for individuals to publish information and opinions on the web. Various blog retrieval algorithms have been proposed to search for blogs more effectively. In practice, however, keyword-based search and link-analysis blog ranking systems cannot satisfy users' requirements. In this paper, we suggest a topic-based weighting blog retrieval system that considers the links between blog posts and search words to improve search results. Our system extracts topics from each blog and weights them much higher than other guide words. In a comparison with other systems, the proposed topic-based system achieves a better recall rate.
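
The idea of weighting extracted topics above ordinary terms can be illustrated with a small scoring function. This is a sketch under stated assumptions, not the paper's system: topic extraction is assumed to have happened already, and `score_post` and the `topic_weight` value of 3.0 are hypothetical.

```python
from collections import Counter


def score_post(query: str, post_terms: list[str], topic_terms: set[str],
               topic_weight: float = 3.0) -> float:
    """Sum term frequencies of the query words in a post, counting a
    query word `topic_weight` times more if it is one of the post's topics."""
    counts = Counter(term.lower() for term in post_terms)
    topics = {term.lower() for term in topic_terms}
    score = 0.0
    for term in query.lower().split():
        # Assumed rule: topic terms get a fixed multiplier, others weight 1.
        weight = topic_weight if term in topics else 1.0
        score += weight * counts[term]
    return score
```

Posts can then be sorted by this score in descending order, so a post whose topic matches the query outranks one that merely mentions the query word.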

An Efficient Method for Detecting Duplicated Documents in a Blog Service System (블로그 서비스 시스템을 위한 효과적인 중복문서의 검출 기법)

  • Lee, Sang-Chul;Lee, Soon-Haeng;Kim, Sang-Wook
    • Journal of KIISE:Databases / v.37 no.1 / pp.50-55 / 2010
  • Duplicate documents in a blog service system are one cause of degraded quality and performance in blog search. Unlike in the WWW environment, the creation of every document is reported to the blog service system, which makes it possible to distinguish an original document from its duplicates. Based on this observation, this paper proposes a novel method for detecting duplicate documents in a blog service system. The method determines whether a document is original at the time it is stored in the blog service system. As a result, it solves the problem of duplicate documents appearing in search results by keeping them out of the index of the blog search engine. This paper also proposes three indexing methods that preserve the accuracy of the previous work, Min-hashing. Extensive experiments on real-life blog data show which indexing method is most effective.
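
For readers unfamiliar with Min-hashing, the sketch below shows the basic technique the abstract builds on: estimating the Jaccard similarity of two documents from short signatures. It does not reproduce the paper's three indexing methods. The word-shingle size, the number of hash functions, and the use of seeded SHA-1 as the hash family are assumptions made for illustration.

```python
import hashlib


def shingles(text: str, k: int = 3) -> set[str]:
    """Split text into overlapping k-word shingles."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed: int, item: str) -> int:
    """One member of a hash family, selected by `seed` (assumed construction)."""
    digest = hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items: set[str], num_hashes: int = 64) -> list[int]:
    """For each hash function, keep the minimum hash value over all items."""
    if not items:
        raise ValueError("cannot build a signature for an empty document")
    return [min(_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimated_similarity(sig_a: list[int], sig_b: list[int]) -> float:
    """The fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

A store-time check in the spirit of the abstract could compare a new post's signature against stored signatures and skip indexing when the estimate exceeds a chosen threshold.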

Associated Keyword Recommendation System for Keyword-based Blog Marketing (키워드 기반 블로그 마케팅을 위한 연관 키워드 추천 시스템)

  • Choi, Sung-Ja;Son, Min-Young;Kim, Young-Hak
    • KIISE Transactions on Computing Practices / v.22 no.5 / pp.246-251 / 2016
  • Recently, the influence of SNS and online media has grown rapidly, with a consequent increase in interest in marketing that uses these tools. Blog marketing can increase the ripple effect and reach of marketing information at low cost by ranking high in the keyword search results of influential portal sites. However, because competition for the top ranks of search results for specific keywords is tough, long-term and proactive effort is needed. We therefore propose a new method that recommends associated keyword groups likely to give a blog higher exposure. The proposed method first collects the blog documents in the search results for a target keyword, then extracts and filters highly associated keywords, considering the frequency and location of each word. Next, each associated keyword is compared with the target keyword, and an associated keyword group with a high possibility of exposure is recommended, considering information such as the strength of association, the monthly search volume of the associated keyword, the number of blogs included in the search results, and the average writing date of those blogs. The experimental results show that the proposed method recommends keyword groups with higher association.

The Effective Blog Search Algorithm based on the Structural Features in the Blogspace (블로그의 구조적 특성을 고려한 효율적인 블로그 검색 알고리즘)

  • Kim, Jung-Hoon;Yoon, Tae-Bok;Lee, Jee-Hyong
    • Journal of KIISE:Software and Applications / v.36 no.7 / pp.580-589 / 2009
  • Today, most web pages are created in the blogspace or are evolving into it. A blog entry (blog page) has features that traditional Web pages lack, such as trackback links, bloggers' authority, tags, and comments. Traditional ranking algorithms are therefore not suitable for evaluating blog entries, because they do not consider these blog-specific features. In this paper, a new algorithm called "Blog-Rank" is proposed. It ranks blog entries by calculating bloggers' reputation scores, trackback scores, and comment scores from the features of the blog entries. The algorithm is also applied to searching the blogspace for information related to users' queries. The experiment shows that it finds much more relevant information than traditional ranking algorithms.
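
The abstract names three score components but not how they are computed or combined. The sketch below assumes a simple weighted sum of a pre-computed reputation score and normalized trackback and comment counts; the `BlogEntry` fields, the weights, and the normalization are all hypothetical and should not be read as the Blog-Rank formula itself.

```python
from dataclasses import dataclass


@dataclass
class BlogEntry:
    url: str
    blogger_reputation: float  # assumed to be pre-computed and in [0, 1]
    trackback_count: int
    comment_count: int


def combined_score(entry: BlogEntry, max_trackbacks: int, max_comments: int,
                   weights: tuple[float, float, float] = (0.5, 0.3, 0.2)) -> float:
    """Weighted sum of reputation, trackback, and comment scores (assumed form)."""
    w_rep, w_tb, w_cm = weights
    trackback_score = entry.trackback_count / max_trackbacks if max_trackbacks else 0.0
    comment_score = entry.comment_count / max_comments if max_comments else 0.0
    return w_rep * entry.blogger_reputation + w_tb * trackback_score + w_cm * comment_score


def rank_entries(entries: list[BlogEntry]) -> list[BlogEntry]:
    """Sort entries by combined score, highest first."""
    max_tb = max((e.trackback_count for e in entries), default=0)
    max_cm = max((e.comment_count for e in entries), default=0)
    return sorted(entries, key=lambda e: combined_score(e, max_tb, max_cm), reverse=True)
```

Normalizing counts by the maximum in the candidate set keeps the three components on a comparable scale, which matters more than the particular weights chosen here.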

Splog Detection Using Post Structure Similarity and Daily Posting Count (포스트의 구조 유사성과 일일 발행수를 이용한 스플로그 탐지)

  • Beak, Jee-Hyun;Cho, Jung-Sik;Kim, Sung-Kwon
    • Journal of KIISE:Software and Applications / v.37 no.2 / pp.137-147 / 2010
  • A blog is a website, usually maintained by an individual, with regular entries of commentary, descriptions of events, or other material such as graphics or video. Entries are commonly displayed in reverse chronological order. Blog search engines, like web search engines, find information on blogs for searchers. They sometimes return unsatisfactory results, mainly because of spam blogs, or splogs. Splogs are blogs that host spam posts or plagiarized or auto-generated content for the sole purpose of hosting advertisements or raising the search rankings of target sites. This thesis focuses on splog detection and proposes a new detection method based on the structural similarity of blog posts and the number of posts per day. Experiments with the proposed method show excellent results on splog detection tasks, with over 90% accuracy.
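
The two signals named in the abstract, post structure similarity and daily posting count, can be sketched as follows. The abstract does not say how structure is represented or how the signals are combined, so this example assumes that structure is the sequence of opening HTML tags, that similarity is the mean pairwise `difflib.SequenceMatcher` ratio, and that a blog is flagged only when both thresholds are exceeded. The threshold values are hypothetical.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def tag_sequence(html: str) -> list[str]:
    """Reduce a post to its sequence of opening tag names."""
    return [tag.lower() for tag in re.findall(r"<\s*([a-zA-Z][a-zA-Z0-9]*)", html)]


def mean_structure_similarity(posts: list[str]) -> float:
    """Average pairwise similarity of the posts' tag sequences."""
    sequences = [tag_sequence(post) for post in posts]
    pairs = list(combinations(sequences, 2))
    if not pairs:
        return 0.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)


def looks_like_splog(posts: list[str], days_observed: int,
                     sim_threshold: float = 0.9,
                     daily_threshold: float = 20.0) -> bool:
    """Flag a blog whose posts are structurally near-identical AND very frequent
    (the AND combination is an assumption, not the paper's stated rule)."""
    posts_per_day = len(posts) / days_observed
    return (mean_structure_similarity(posts) >= sim_threshold
            and posts_per_day >= daily_threshold)
```

Comparing every pair of posts is quadratic in the number of posts, so a practical detector would likely sample posts or compare each post against a template.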

A Study on Moral Recognition of Blog Activity with China Netizen (중국 네티즌의 블로그 활동 윤리의식에 관한 연구)

  • Yu, Seung-Yeob
    • Journal of Digital Convergence / v.9 no.2 / pp.101-110 / 2011
  • This study surveyed Chinese netizens with blog usage experience. We examined how blog usage motive, usage type, blog involvement, and political-social disposition influence moral recognition of blog activity. The results were as follows. First, seven blog usage motives emerged, and the self-expressive and informational search motives were found to influence moral recognition of blog activity. Second, blog usage motives differed according to blog involvement, and blog involvement was found to influence moral recognition of blog activity. Third, the social involvement disposition of Chinese netizens influenced moral recognition of blog activity and had a significant effect on the social motivation for blog usage. The implications of this study are as follows. We verified that blog usage motive, usage type, blog involvement, and social involvement disposition are important elements for understanding blog activity. Finally, we confirmed that psychological and social motives are related to moral recognition of blog activity.

A Wikipedia-based Query Expansion Method for In-depth Blog Distillation (주제를 깊이 있게 다루는 블로그 피드 검색을 위한 위키피디아 기반 질의 확장 방법)

  • Song, Woo-Sang;Lee, Ye-Ha;Lee, Jong-Hyeok;Yang, Gi-Joo
    • Journal of KIISE:Computing Practices and Letters / v.16 no.11 / pp.1121-1125 / 2010
  • This paper proposes a Wikipedia-based feedback method for in-depth blog distillation, whose goal is to find blogs that present in-depth thoughts or analysis on a given query. The proposed method uses Wikipedia articles that are relevant to the query. The TREC Blogs08 collection, a large-scale blog corpus, and an English Wikipedia dump were used for the experiments. The proposed method significantly improved retrieval performance, including MAP, over the conventional post-based feedback method.
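
MAP (mean average precision), the metric cited above, is a standard retrieval measure. The sketch below shows how it is commonly computed from ranked result lists and sets of relevant items; the function names and the example identifiers are illustrative, and nothing here reflects the paper's actual evaluation data.

```python
def average_precision(ranked_ids: list[str], relevant_ids: set[str]) -> float:
    """Average of the precision values at each rank where a relevant item appears,
    divided by the total number of relevant items."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(runs: list[tuple[list[str], set[str]]]) -> float:
    """Mean of average precision over all queries; each run is (ranking, relevant set)."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)
```

For example, a ranking `["a", "b", "c"]` with relevant set `{"a", "c"}` has precision 1/1 at rank 1 and 2/3 at rank 3, giving an average precision of 5/6.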