• Title/Summary/Keyword: Web Search Engines

Splog Detection Using Post Structure Similarity and Daily Posting Count (포스트의 구조 유사성과 일일 발행수를 이용한 스플로그 탐지)

  • Beak, Jee-Hyun;Cho, Jung-Sik;Kim, Sung-Kwon
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.2
    • /
    • pp.137-147
    • /
    • 2010
  • A blog is a website, usually maintained by an individual, with regular entries of commentary, descriptions of events, or other material such as graphics or video. Entries are commonly displayed in reverse chronological order. Blog search engines, like web search engines, retrieve information from blogs for searchers. They sometimes return unsatisfactory results, mainly because of spam blogs, or splogs. Splogs are blogs that host spam posts or plagiarized or auto-generated content for the sole purpose of hosting advertisements or raising the search rankings of target sites. This paper focuses on splog detection and proposes a new detection method based on the structural similarity of blog posts and the number of posts per day. Experiments with the proposed method show excellent results on splog detection tasks, with over 90% accuracy.
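
As a rough illustration of the two signals named in the abstract, the following Python sketch scores a blog by the average pairwise similarity of its posts' HTML tag sequences and by its peak number of posts per day. The tag-sequence representation, the use of `difflib.SequenceMatcher`, the function names, and both thresholds are assumptions made for this example; the abstract does not say how the authors measure structure similarity or combine the two features.

```python
import re
from collections import Counter
from datetime import date
from difflib import SequenceMatcher
from itertools import combinations

# Matches opening tags only; closing tags such as </div> are skipped.
TAG_RE = re.compile(r"<\s*([a-zA-Z][a-zA-Z0-9]*)")


def tag_sequence(html):
    """Ordered list of opening tag names, used here as a proxy for post structure."""
    return [t.lower() for t in TAG_RE.findall(html)]


def mean_structure_similarity(posts_html):
    """Average pairwise similarity (0..1) of the posts' tag sequences."""
    seqs = [tag_sequence(p) for p in posts_html]
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)


def max_daily_posts(post_dates):
    """Largest number of posts published on any single day."""
    return max(Counter(post_dates).values(), default=0)


def looks_like_splog(posts_html, post_dates, sim_threshold=0.9, daily_threshold=20):
    """Hypothetical rule: near-identical post structure AND a high daily posting count."""
    return (mean_structure_similarity(posts_html) >= sim_threshold
            and max_daily_posts(post_dates) >= daily_threshold)
```

A blog whose posts all come from one template and that publishes dozens of posts on a single day would be flagged under these assumed thresholds; real thresholds would have to be tuned on labeled data.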

Appraising the Interface Features of Web Search Engines Based on User-defined Relevance Criteria (이용자정의형 적합성 기준을 토대로 한 웹검색엔진 인터페이스 평가)

  • Kim, Yang-Woo
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.22 no.1
    • /
    • pp.247-262
    • /
    • 2011
  • Although a significant amount of research has identified various dimensions of relevance, along with exhaustive lists of relevance criteria, there seems to have been less effort to apply the findings to improving actual systems design. Based on this assumption, this paper investigates to what extent those relevance criteria have been incorporated into the interface features of major commercial Web search engines, and suggests what more can or should be done. Before turning to the actual system features, this paper compares recent relevance research in Information Science with other human factor studies both in Information Science and in its neighboring discipline (HCI), in an attempt to identify studies that are conceptually similar to relevance research but are not named as such. Similarities and differences between these studies are presented. Recommendations for supporting applicable interface features include: 1) further personalization of interface designs; 2) author-supplied meta tags for Web content; and 3) extensions of beyond-topical representations based on link structure.

Design and Implementation of the Specialized Internet Search Engine for Ship′s Parts Using Method of Mining for the Association Rule Discovery (연관 규칙 탐사 기법을 이용한 선박 부품 전문 검색 엔진의 설계 및 구현)

  • 하창승;윤병수;성창규;김종화;류길수
    • Proceedings of the Korean Society of Marine Engineers Conference
    • /
    • 2002.05a
    • /
    • pp.225-231
    • /
    • 2002
  • A specialized web search engine is an Internet tool for finding information in a finite cyber world. It helps users retrieve necessary information from Internet sites quickly. In this paper, we design and implement a prototype search engine using an association rule mining method. It consists of a search engine part and a search robot part. The search engine uses a keyword method and takes various user-oriented interfaces into account. The search robot fetches information related to ship parts on the World Wide Web. The experiments show that our search engine (AISE) is superior to other search engines in collecting the necessary information.
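
The abstract names association rule discovery without giving details, so the following Python sketch only illustrates the general technique: it computes support and confidence for rules between pairs of items. Treating each page's keyword set as a transaction, the function name `pair_rules`, and the threshold values are assumptions for this example, not the authors' design.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) tuples for item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of transactions containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical keyword sets extracted from four crawled pages.
pages = [
    {"engine", "piston"},
    {"engine", "piston", "valve"},
    {"engine", "valve"},
    {"propeller"},
]
rules = pair_rules(pages)
```

With this toy data, a rule such as "piston implies engine" has support 0.5 and confidence 1.0, which a specialized engine could use to expand a query for one term with the other.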

Implementation of Search Engine to Minimize Traffic Using Blockchain-Based Web Usage History Management System

  • Yu, Sunghyun;Yeom, Cheolmin;Won, Yoojae
    • Journal of Information Processing Systems
    • /
    • v.17 no.5
    • /
    • pp.989-1003
    • /
    • 2021
  • With the recent increase in the types of services provided by Internet companies, collection of various types of data has become a necessity. Data collectors corresponding to web services profit by collecting users' data indiscriminately and providing it to the associated services. However, the data provider remains unaware of the manner in which the data are collected and used. Furthermore, the data collector of a web service consumes web resources by generating a large amount of web traffic. This traffic can damage servers by causing service outages. In this study, we propose a website search engine that employs a system that controls user information using blockchains and builds its database based on the recorded information. The system is divided into three parts: a collection section that uses a proxy, a management section that uses blockchains, and a search engine that uses a built-in database. This structure allows data sovereigns to manage their data more transparently. Search engines that use blockchains do not use internet bots, and instead use the data generated by user behavior. This avoids generation of traffic from internet bots and can, thereby, contribute to creating a better web ecosystem.
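
The idea of building a search index from recorded usage history rather than from crawler traffic can be sketched in Python as follows. This is a single-process hash chain, not a distributed ledger with consensus, and the record fields (`url`, `title`), the function names, and the whitespace tokenizer are all assumptions for illustration; the abstract does not describe the system at this level of detail.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def _block_hash(prev_hash, record):
    """SHA-256 over the previous hash and the record, serialized deterministically."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, record):
    """Append a usage record, linking it to the previous block by hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({"prev": prev_hash, "record": record,
                  "hash": _block_hash(prev_hash, record)})


def chain_is_valid(chain):
    """Recompute every hash to detect tampering with recorded history."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev"] != prev_hash or block["hash"] != _block_hash(prev_hash, block["record"]):
            return False
        prev_hash = block["hash"]
    return True


def build_index(chain):
    """Inverted index (word -> set of URLs) built only from recorded visits."""
    index = {}
    for block in chain:
        record = block["record"]
        for word in record["title"].lower().split():
            index.setdefault(word, set()).add(record["url"])
    return index


# Hypothetical visits captured by the collection proxy.
chain = []
add_block(chain, {"url": "https://example.com/a", "title": "Blockchain search basics"})
add_block(chain, {"url": "https://example.com/b", "title": "Search engine traffic"})
index = build_index(chain)
```

Because the index is derived from pages users actually visited, no bot has to fetch them again, which is the traffic saving the abstract describes.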

Construction of Local Document Management System based on Associative Search

  • Kasagi, Yoshimasa;Yamaguchi, Toru;Takama, Yasufumi
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2003.09a
    • /
    • pp.146-149
    • /
    • 2003
  • As the amount of information that can be collected from the Web into a local database increases, we propose a system that suggests related local documents when a new document arrives. We also propose constructing an association dictionary, using web search engines, for similarity calculation. A prototype system has also been developed and is described in detail.

Dynamic Classification of Categories in Web Search Environment (웹 검색 환경에서 범주의 동적인 분류)

  • Choi Bum-Ghi;Lee Ju-Hong;Park Sun
    • Journal of KIISE:Software and Applications
    • /
    • v.33 no.7
    • /
    • pp.646-654
    • /
    • 2006
  • Directory searching and index searching are the two main methods used in web search engines. Most well-known Internet search engines apply both, so that users who are not satisfied with the results of one method can switch to the other. Index searching tends to return too many results, while directory searching makes it difficult to select the proper categories and frequently leads users to the wrong ones. In this paper, we propose a novel method in which a category hierarchy is constructed dynamically. To do this, a category is regarded as a fuzzy set that includes keywords. Subcategories of a category that can be extended on the basis of similarity are then found using fuzzy relational products. The merit of this method is that it enhances the recall rate of directory search by expanding subcategories on the basis of similarity.
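
To make the fuzzy-set view of categories concrete, the Python sketch below represents each category as a mapping from keywords to membership degrees and estimates how strongly one category is contained in another. The abstract does not say which relational product or implication operator the authors use; the mean-based subset degree, the Łukasiewicz implication, the function names, and the threshold are assumptions chosen for this example.

```python
def lukasiewicz_implication(a, b):
    """Fuzzy implication a -> b, assumed here; other operators are possible."""
    return min(1.0, 1.0 - a + b)


def subset_degree(cat_a, cat_b, keywords):
    """Mean degree to which membership in cat_a implies membership in cat_b."""
    if not keywords:
        return 0.0
    total = sum(
        lukasiewicz_implication(cat_a.get(k, 0.0), cat_b.get(k, 0.0))
        for k in keywords
    )
    return total / len(keywords)


def candidate_subcategories(categories, parent, threshold=0.9):
    """Names of categories that are (fuzzily) contained in the parent category."""
    keywords = sorted({k for members in categories.values() for k in members})
    return [
        name
        for name, members in categories.items()
        if name != parent
        and subset_degree(members, categories[parent], keywords) >= threshold
    ]


# Hypothetical categories with keyword membership degrees.
categories = {
    "sports": {"ball": 1.0, "team": 0.9, "score": 0.8},
    "soccer": {"ball": 0.9, "team": 0.8},
    "cooking": {"recipe": 1.0, "oven": 0.9},
}
subcats = candidate_subcategories(categories, "sports")
```

Here "soccer" qualifies as a subcategory of "sports" because every keyword it contains belongs to "sports" at least as strongly, while "cooking" does not; recomputing this as keywords change is one way a hierarchy could be built dynamically.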

Intelligent Product Search Agent based on SWRL (시맨틱 웹 규칙 언어를 이용한 지능형 상품 정보 검색 에이전트 개발)

  • Kim, U-Ju;Kim, Jeong-Myeong;Choe, Dae-U
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2005.05a
    • /
    • pp.316-320
    • /
    • 2005
  • We developed an Intelligent Product Search Agent based on SWRL. The agent can search for product information using knowledge (facts and rules) on the Web and can compare prices of the retrieved products while taking delivery rates into account. Existing keyword-based product search engines are poor at finding the intended products even when a user already has perfect knowledge of them. Furthermore, if a user has insufficient knowledge, searching becomes impossible. Also, although existing price-comparison shopping malls offer a comparison service that considers total price (product prices, taxes, delivery rates), the service is valid only for a single product and is hard to extend and update because it is hard-coded rather than rule-based. If appropriate knowledge is available on the Semantic Web and makes product information retrieval possible, these problems can be solved. In this research, we developed an Intelligent Product Search Agent based on SWRL that searches product information efficiently by letting the agent handle facts and rules by itself.

A Usability Evaluation on the Visualization Techniques of Web Retrieval Results (웹 검색 결과 시각화 기법의 사용성 평가에 관한 연구)

  • Kim, Seong-Hee;Kim, Moon-Jeong
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.41 no.3
    • /
    • pp.181-199
    • /
    • 2007
  • This study examines the usefulness of visualization techniques for displaying web retrieval results. We describe the concept of visualization techniques and evaluate the usability of the SearchCrystal and KartOO search engines, both of which use visualization techniques to display retrieval results. SearchCrystal scored higher than KartOO on the usability checklists.

Design of Internet Search Engine by Intelligent Agents on WWW

  • Nakano, Ryota;Noto, Masato
    • Proceedings of the IEEK Conference
    • /
    • 2000.07b
    • /
    • pp.699-702
    • /
    • 2000
  • The Internet has become widely used in many countries. In particular, the WWW (World Wide Web), an emerging technology that has become a major application of the Internet, has developed rapidly. As a result, there are hundreds of millions of URLs (Uniform Resource Locators) on the WWW, and the total number of URLs is still increasing explosively. To get information from the WWW, we generally use Internet search engines. However, we cannot always get the information we actually want. Accordingly, we address this problem by constructing a prototype agent-based system, written in the Java programming language, as a step toward a more effective search engine. This so-called “intelligent agent system on WWW” deletes redundant HTML (HyperText Markup Language) files and exchanges information about the existence of URLs. We found that our prototype system is more powerful and effective than conventional search engines.

Layout Analysis for Calculation of Web Page Similarity as Image

  • Mitsuhashi, Noriaki;Yamaguchi, Toru;Takama, Yasufumi
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2003.09a
    • /
    • pp.142-145
    • /
    • 2003
  • When we search for information on the Web using search engines, they analyze only the text information collected from the source files of Web pages. However, there is a limit to how well the layout of a Web page can be analyzed from its source file alone, even though Web page design is the most important factor for a user in evaluating a page. In particular, it often happens on the Web that pages of similar design offer similar information. We propose a method that analyzes layout in order to compare the design of pages by treating the displayed page as an image.
