• Title/Summary/Keyword: Web search

Search Result 1,646, Processing Time 0.022 seconds

A Document Collection Method for More Accurate Search Engine (정확도 높은 검색 엔진을 위한 문서 수집 방법)

  • Ha, Eun-Yong;Gwon, Hui-Yong;Hwang, Ho-Yeong
    • The KIPS Transactions:PartA / v.10A no.5 / pp.469-478 / 2003
  • Internet search engines use web robots to visit servers connected to the Internet, periodically or non-periodically. They extract and classify the collected data according to their own methods and build the databases that underlie web information search. This procedure is repeated very frequently on the Web. Many search engine sites run this process strategically in order to become popular Internet portal sites that give users ways to find information on the web. A web search engine contacts thousands upon thousands of web servers, maintains its existing databases, and navigates to obtain data about newly connected web servers. But these jobs are decided and conducted by the search engines alone: they run web robots to collect data from web servers without knowing the state of those servers. Each search engine issues many requests and receives responses from web servers, which is one cause of increased Internet traffic. If each web server notified web robots with a summary of its public documents, and each web robot then collected only the corresponding documents using this summary, the unnecessary Internet traffic would be eliminated and the accuracy of search engine data would improve. The processing overhead of web-related jobs on web servers and search engines would also fall. In this paper, a monitoring system on the web server is designed and implemented; it monitors the state of documents on the web server, summarizes the changes to modified documents, and sends the summary to web robots that want documents from that server. An efficient web robot for the web search engine is also designed and implemented; it uses the notified summary to fetch the corresponding documents from the web servers, extracts the index, and updates its databases.
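
The notification idea in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not say how changes are detected or what the summary contains, so the content fingerprints, the `ChangeSummary` structure, and the sample paths below are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


def digest(content: bytes) -> str:
    """Fingerprint a document body so changes can be detected cheaply."""
    return hashlib.sha256(content).hexdigest()


@dataclass
class ChangeSummary:
    # Hypothetical summary format; the paper's actual format is not given.
    added: list = field(default_factory=list)
    modified: list = field(default_factory=list)
    deleted: list = field(default_factory=list)

    def to_fetch(self) -> list:
        """Paths a robot still needs to download (deleted ones need no request)."""
        return sorted(self.added + self.modified)


def summarize_changes(previous: dict, current: dict) -> ChangeSummary:
    """Compare two {path: digest} snapshots of a server's public documents."""
    summary = ChangeSummary()
    for path, fingerprint in current.items():
        if path not in previous:
            summary.added.append(path)
        elif previous[path] != fingerprint:
            summary.modified.append(path)
    summary.deleted = [path for path in previous if path not in current]
    return summary


# Server side: snapshots taken before and after some edits (made-up documents).
old = {
    "/index.html": digest(b"v1"),
    "/about.html": digest(b"about"),
    "/old.html": digest(b"x"),
}
new = {
    "/index.html": digest(b"v2"),
    "/about.html": digest(b"about"),
    "/news.html": digest(b"n"),
}

summary = summarize_changes(old, new)
# Robot side: request only what the summary lists instead of re-crawling everything.
print(summary.to_fetch())
print(summary.deleted)
```

In this sketch the unchanged `/about.html` is never requested again, which is the traffic saving the abstract argues for.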

A Study on the Effects of Search Language on Web Searching Behavior: Focused on the Differences of Web Searching Pattern (검색 언어가 웹 정보검색행위에 미치는 영향에 관한 연구 - 웹 정보검색행위의 양상 차이를 중심으로 -)

  • Byun, Jeayeon
    • Journal of the Korean Society for Library and Information Science / v.52 no.3 / pp.289-334 / 2018
  • Even though information in languages other than English is increasing quickly, English still plays the role of lingua franca and accounts for the largest proportion of content on the web. It is therefore necessary to investigate the key features of, and differences between, "information searching behavior using the mother tongue as the search language" and "information searching behavior using English as the search language" among users who are non-native speakers of English, so that they can acquire more diverse and abundant information. This study conducted a web searching experiment with the concurrent think-aloud method to examine information searching behavior and the cognitive process in Korean search and English search among twenty-four undergraduate students at a private university in South Korea. Based on the qualitative data, the study applied frequency analysis to web search patterns by search language. The results show that information searching behavior in Korean search is active, aggressive, and independent, while in English search it is passive, submissive, and dependent. In Korean search, the main features are query formulation by extracting and combining terms from various sources such as users, tasks, and the system; search range adjustment at diverse levels; smooth filtering of items selected on search engine results pages; exploration and comparison of many items; and browsing of the overall contents of web pages. In English search, by contrast, the main features are query formulation with terms extracted principally from the task; search range adjustment at a limited level; item selection that relies on the relevance between items such as categories or links; repeated exploration of the same item; browsing of partial contents of web pages; and frequent use of language support tools such as dictionaries or translators.

Personalized Search based on Community through Automatic Analysis of Query Patterns (질의어 패턴 자동분석을 통한 커뮤니티 기반 개인화 검색)

  • Park, Gun-Woo;Lee, Sang-Hoon
    • Journal of KIISE:Databases / v.36 no.4 / pp.321-326 / 2009
  • Because existing Web search engines do not sufficiently reflect the user's search intent, it is very difficult to find the accurate information that users want. Much research on personalized search is therefore in progress, aiming to improve satisfaction with Web search results by analyzing search patterns and applying them to search. Through personalized search, Web searchers can find information more efficiently and obtain appropriate information easily. In this paper, we propose a community-based personalized search built on an analysis of web users' query patterns and interests. When query frequency, interest, and community are applied to web search, we are able to confirm that the search results provided match the search intent of the individual.

A Study on the Implementation and Evaluation of a Semantic Search System (시맨틱 검색 시스템의 구현과 평가에 관한 연구)

  • Han, Dong-Il;Kwon, Hyeong-In;Choi, Ho-Joon
    • Journal of Information Technology Services / v.7 no.3 / pp.253-269 / 2008
  • In this paper, we present an application called Semantic Search, which is built on different supporting technologies and is designed to improve traditional web searching. Semantic Search is becoming a crucial challenge for the semantic web. Although the topic attracts great research interest, work on implementing and assessing Semantic Search is not yet full-fledged. There is also little research offering a commercial-use Semantic Search System, which should be taken into account when measuring the effectiveness of such a system. This paper proposes an implementation and an evaluation of a Semantic Search System. First, we built a Semantic Search System and describe a development case and its procedure. Second, we present a measurement of our Semantic Search System's effectiveness. Finally, the evaluation offers useful implications for researchers and practitioners seeking to raise the research to the level of commercial use.

Design of Advanced HITS Algorithm by Suitability for Importance-Evaluation of Web-Documents (웹 문서 중요도 평가를 위한 적합도 향상 HITS 알고리즘 설계)

  • 김분희;한상용;김영찬
    • The Journal of Society for e-Business Studies / v.8 no.2 / pp.23-31 / 2003
  • Link-based search engines generate rankings using the link information of related web documents. HITS (Hyperlink-Induced Topic Search), a representative ranking algorithm that exploits this link structure of web documents, evaluates the importance of related pages from link information and presents it as a ranking. The problem with the HITS algorithm is that it considers only the link frequency within documents and depends on the set of web documents given as input. In this paper, we design a search agent based on an improved HITS algorithm that raises the suitability between the query and the search results within the set of documents given by a link-based web search engine. The design thereby compensates for locality in search performance and results.
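
For orientation, the baseline HITS iteration that this abstract builds on can be sketched as follows. This is the standard hub/authority update only, not the improved algorithm the paper designs (the abstract gives no formula for its suitability measure); the three-link graph is made up for illustration.

```python
import math


def hits(links: dict, iterations: int = 50):
    """Baseline HITS. `links` maps each page to the list of pages it points to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize both vectors to unit length so scores stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Made-up graph: pages a and b both point to c, and c points to d.
graph = {"a": ["c"], "b": ["c"], "c": ["d"]}
hub, auth = hits(graph)
print(max(auth, key=auth.get))  # page with the highest authority score
```

The scores depend only on link counts within the input set, which is the limitation the abstract identifies in plain HITS.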

Improving the Performance of Web Search using Query Types (질의유형에 기반한 웹 검색의 성능 향상)

  • Kang, In-Ho;An, Dong-Un
    • The KIPS Transactions:PartB / v.11B no.5 / pp.537-544 / 2004
  • The Web is rich with various sources of information. Because web document collections are massive and heterogeneous, users look for various types of target pages. Each type of information for Web search has its designated queries; if a user query is not a designated query, good result documents cannot be obtained. A search engine needs different strategies to exploit the strengths of each type of information. If we know the properties of the information, we can refine candidate pages and rank them more finely. Various experiments are conducted to show the properties of each type of information. Based on them, we present an appropriate combining formula that utilizes the properties of each type of information. In addition, for the service finding task, we propose Service Link Information, which utilizes the existence of mechanisms for user interaction.

An Improved Combined Content-similarity Approach for Optimizing Web Query Disambiguation

  • Kamal, Shahid;Ibrahim, Roliana;Ghani, Imran
    • Journal of Internet Computing and Services / v.16 no.6 / pp.79-88 / 2015
  • Web search engines are exposed to uncertainty because ambiguous queries are entered to retrieve accurate results. Ambiguous queries constitute a significant fraction of such instances and pose real challenges to web search engines. Moreover, web search has attracted researchers' interest in handling search by considering context from a location perspective. Our proposed disambiguation approach is designed to improve user experience by using context in terms of location relevance together with document relevance. The premise is that providing the user with a comprehensive location perspective of a topic is more informative than retrieving a result that contains only temporal or context information. The capacity to use this information in a location-oriented manner can be, from a user perspective, potentially useful for several tasks, including user query understanding or clustering based on location. To carry out the approach, we developed a Java-based prototype that derives contextual information from the web results for queries taken from well-known datasets. Among those results, queries are further classified in order to perform search in a broad way. After results are provided to users and they make their selection, feedback is recorded implicitly to improve web search based on contextual information. The experimental results demonstrate the strong performance of our approach: precision 75%, accuracy 73%, recall 81%, and F-measure 78% when compared with a generic temporal evaluation approach, and precision 86%, accuracy 71%, recall 67%, and F-measure 75% when compared with a web document clustering approach.
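
The reported F-measure figures can be cross-checked against the reported precision and recall. The sketch below assumes the abstract means the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not state which variant was used.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, F-measure in percent as stated in the abstract)
reported = [(0.75, 0.81, 78), (0.86, 0.67, 75)]

computed = [round(100 * f_measure(p, r)) for p, r, _ in reported]
for (p, r, stated), value in zip(reported, computed):
    print(f"precision={p:.2f} recall={r:.2f} computed={value}% stated={stated}%")
```

Under that assumption both stated values agree with the computed ones after rounding to whole percentages.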

Semantic Web based DQL Search System (시멘틱 웹 기반 DQL 검색 시스템 설계)

  • Kim Je-Min;Park Young-Tack
    • The KIPS Transactions:PartB / v.12B no.1 s.97 / pp.91-100 / 2005
  • Diverse methods have been proposed for using web information efficiently as the amount of information grows. Most search systems use a keyword-based method that relies mostly on syntactic information. They cannot utilize the semantic information of documents and may therefore return irrelevant results to users. To overcome this shortcoming in document search, a technique using the Semantic Web has been suggested. The Semantic Web can find information relevant to users by employing metadata represented with standard ontologies. Each document is annotated with metadata that agents can reason over. In this paper, we propose a search system using Semantic Web technologies. Our semantic search system semantically analyzes the questions that users enter and retrieves the answer information that users want. To improve the efficiency and accuracy of semantic search systems, this paper proposes a DQL (DAML Query Language) engine that employs an inference engine to perform reasoning, and a DQL converter that changes the user's keyword-form question into DQL.

Implementation of a Parallel Web Crawler for the Odysseus Large-Scale Search Engine (오디세우스 대용량 검색 엔진을 위한 병렬 웹 크롤러의 구현)

  • Shin, Eun-Jeong;Kim, Yi-Reun;Heo, Jun-Seok;Whang, Kyu-Young
    • Journal of KIISE:Computing Practices and Letters / v.14 no.6 / pp.567-581 / 2008
  • As the size of the web grows explosively, search engines are becoming increasingly important as the primary means of retrieving information from the Internet. A search engine periodically downloads web pages and stores them in a database to provide users with up-to-date search results. The web crawler is the program that downloads and stores web pages for this purpose. A large-scale search engine uses a parallel web crawler to retrieve the collection of web pages while maximizing the download rate. However, the service architecture and experimental analysis of parallel web crawlers have not been fully discussed in the literature. In this paper, we propose an architecture for a parallel web crawler and discuss implementation issues in detail. The proposed parallel web crawler is based on the coordinator/agent model, which uses multiple machines to download web pages in parallel. The coordinator/agent model consists of multiple agent machines that collect web pages and a single coordinator machine that manages them. The parallel web crawler consists of three components: a crawling module for collecting web pages, a converting module for transforming the web pages into a database-friendly format, and a ranking module for rating web pages based on their relative importance. We explain each component of the parallel web crawler and its implementation methods in detail. Finally, we conduct extensive experiments to analyze the effectiveness of the parallel web crawler. The experimental results clarify the merit of our architecture: the proposed parallel web crawler scales with the number of web pages to crawl and the number of machines used.
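
A rough sketch of the coordinator/agent division of labor follows. It is not the Odysseus implementation: the abstract does not say how the coordinator assigns work, so the host-hash partitioning, the `Coordinator` class, and the example URLs are hypothetical, and the agents are simulated as in-memory queues rather than separate machines.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


class Coordinator:
    """Hands URLs to agents. Hypothetical design, for illustration only."""

    def __init__(self, agent_count: int):
        self.agent_count = agent_count
        self.seen = set()                # URLs already handed out
        self.queues = defaultdict(list)  # agent index -> pending URLs

    def agent_for(self, url: str) -> int:
        """Assumed policy: hash the host so one site always goes to one agent."""
        host = urlsplit(url).hostname or ""
        stable = hashlib.sha256(host.encode("utf-8")).hexdigest()
        return int(stable, 16) % self.agent_count

    def submit(self, url: str) -> bool:
        """Queue a URL for its agent; return False if it was already queued."""
        if url in self.seen:
            return False
        self.seen.add(url)
        self.queues[self.agent_for(url)].append(url)
        return True


coordinator = Coordinator(agent_count=3)
urls = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.org/",
    "http://example.com/a",  # duplicate, should be rejected
]
accepted = [coordinator.submit(u) for u in urls]
print(accepted)
```

Partitioning by host is one common choice because it lets a single agent apply per-site politeness limits; the paper may use a different assignment scheme.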

An Effective Mobile Web Object Navigation Based on the Steiner Tree Approach (스타이너트리 기반의 효과적인 모바일 웹 오브젝트 네비게이션)

  • Lee, Woo-Key;Song, Justin Jong-Su;Lee, James J.H.
    • Korean Management Science Review / v.28 no.1 / pp.1-10 / 2011
  • One of the fundamental roles of web object navigation is to deliver what the user wants, precisely and efficiently, from the enormous web database to the web browser. As long as the web search results are a list of individual items, it is acceptable for the web browser to display each web object one by one. However, when the search results are a collection of multiple interrelated web objects, a new mechanism is needed to present the linked web objects at one time. We define a unit of web objects derived from a Steiner tree, in which the web objects contain a set of specific keywords and solutions are extracted based on computed weights. Even if a single web object does not include all the keywords, the related hyperlinked web objects are derived and displayed on the mobile web browser with metadata in one shot. In this paper, the approach is applied to the mobile browser so that web contents can be displayed dynamically with Steiner trees until the next navigation request is issued. A new synchronized mobile browsing method is developed so that navigating time can be drastically reduced and web navigating efficiency dramatically enhanced without sacrificing memory consumption.