• Title/Summary/Keyword: Web Search Query

Search Result 198, Processing Time 0.028 seconds

An Analysis of Query Types and Topics Submitted to Navel (클릭 로그에 근거한 네이버 검색 질의의 형태 및 주제 분석)

  • Park Soyeon;Lee Joon-Ho;Kim Ji Seoung
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.39 no.1
    • /
    • pp.265-278
    • /
    • 2005
  • This study examines web query types and topics submitted to Naver during one year period by analyzing query logs and click logs. Query logs capture queries users submitted to the system, and click logs consist of documents users clicked and viewed. This study presents a methodology to classify query types and topics. A method for click log analysis is also suggested. When classified by query types, there are more site search queries than content search queries. Queries about computer/internet. entertainment, shopping. game, education rank hightest. The implications for system designers and web content providers are discussed.

A Brief Survey into the Field of Automatic Image Dataset Generation through Web Scraping and Query Expansion

  • Bart Dikmans;Dongwann Kang
    • Journal of Information Processing Systems
    • /
    • v.19 no.5
    • /
    • pp.602-613
    • /
    • 2023
  • High-quality image datasets are in high demand for various applications. With many online sources providing manually collected datasets, a persisting challenge is to fully automate the dataset collection process. In this study, we surveyed an automatic image dataset generation field through analyzing a collection of existing studies. Moreover, we examined fields that are closely related to automated dataset generation, such as query expansion, web scraping, and dataset quality. We assess how both noise and regional search engine differences can be addressed using an automated search query expansion focused on hypernyms, allowing for user-specific manual query expansion. Combining these aspects provides an outline of how a modern web scraping application can produce large-scale image datasets.

Improving Performance of Web Search using The User Preference in Query Word Senses (질의어 의미별 사용자 선호도를 이용한 웹 검색의 성능 향상)

  • 김형일;김준태
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.8
    • /
    • pp.1101-1112
    • /
    • 2004
  • In this paper, we propose a Web page weighting scheme using the user preference in each sense of query word to improve the performance of Web search. Generally search engines assign weights to a web page by using relevancy only, which is obtained by comparing the query word and the words in a web page. In the information retrieval from huge data such as the Web, simple word comparison cannot distinguish important documents because there exist too many documents with similar relevancy In this paper we implement a WordNet-based user interface that helps to distinguish different senses of query word, and constructed a search engine in which the implicit evaluations by multiple users are reflected in ranking by accumulating the number of clicks. In accumulating click counts, they are stored separately according to senses, so that more accurate search is possible. The experimental results with several keywords show that the precision of proposed system is improved compared to conventional search engines.

Keyword Selection for Visual Search based on Wikipedia (비주얼 검색을 위한 위키피디아 기반의 질의어 추출)

  • Kim, Jongwoo;Cho, Soosun
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.8
    • /
    • pp.960-968
    • /
    • 2018
  • The mobile visual search service uses a query image to acquire linkage information through pre-constructed DB search. From the standpoint of this purpose, it would be more useful if you could perform a search on a web-based keyword search system instead of a pre-built DB search. In this paper, we propose a representative query extraction algorithm to be used as a keyword on a web-based search system. To do this, we use image classification labels generated by the CNN (Convolutional Neural Network) algorithm based on Deep Learning, which has a remarkable performance in image recognition. In the query extraction algorithm, dictionary meaningful words are extracted using Wikipedia, and hierarchical categories are constructed using WordNet. The performance of the proposed algorithm is evaluated by measuring the system response time.

WebDBs : A User oriented Web Search Engine (WebDBs: 사용자 중심의 웹 검색 엔진)

  • 김홍일;임해철
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.7B
    • /
    • pp.1331-1341
    • /
    • 1999
  • This paper propose WebDBs(Web Database system) which retrieves information registered in web using query language similar to SQL. This proposed system automatically extracts information which is needed to retrieve from HTML documents dispersed in web. Also, it has an ability to process SQL based query intended for the extracted information. Web database system takes the most of query processing time for capturing documents going through network line. And so, the information previously retrieved is reused in similar applications after stored in cache in perceiving that most of the web retrieval depends on web locality. In this case, we propose cache mechanism adapted to user applications by storing cached information associated with retrieved query. And, Web search engine is implemented based on these concepts.

  • PDF

Personalized Web Search using Query based User Profile (질의기반 사용자 프로파일을 이용하는 개인화 웹 검색)

  • Yoon, Sung Hee
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.17 no.2
    • /
    • pp.690-696
    • /
    • 2016
  • Search engines that rely on morphological matching of user query and web document content do not support individual interests. This research proposes a personalized web search scheme that returns the results that reflect the users' query intent and personal preferences. The performance of the personalized search depends on using an effective user profiling strategy to accurately capture the users' personal interests. In this study, the user profiles are the databases of topic words and customized weights based on the recent user queries and the frequency of topic words in click history. To determine the precise meaning of ambiguous queries and topic words, this strategy uses WordNet to calculate the semantic relatedness to words in the user profile. The experiments were conducted by installing a query expansion and re-ranking modules on the general web search systems. The results showed that this method has 92% precision and 82% recall in the top 10 search results, proving the enhanced performance.

Personalized Search Service in Semantic Web (시멘틱 웹 환경에서의 개인화 검색)

  • Kim, Je-Min;Park, Young-Tack
    • The KIPS Transactions:PartB
    • /
    • v.13B no.5 s.108
    • /
    • pp.533-540
    • /
    • 2006
  • The semantic web environment promise semantic search of heterogeneous data from distributed web page. Semantic search would resuit in an overwhelming number of results for users is increased, therefore elevating the need for appropriate personalized ranking schemes. Culture Finder helps semantic web agents obtain personalized culture information. It extracts meta data for each web page(culture news, culture performance, culture exhibition), perform semantic search and compute result ranking point to base user profile. In order to work efficient, Culture Finder uses five major technique: Machine learning technique for generating user profile from user search behavior and meta data repository, an efficient semantic search system for semantic web agent, query analysis for representing query and query result, personalized ranking method to provide suitable search result to user, upper ontology for generating meta data. In this paper, we also present the structure used in the Culture Finder to support personalized search service.

Research on User's Query Processing in Search Engine for Ocean using the Association Rules (연관 규칙 탐사 기법을 이용한 해양 전문 검색 엔진에서의 질의어 처리에 관한 연구)

  • 하창승;윤병수;류길수
    • Journal of the Korea Society of Computer and Information
    • /
    • v.8 no.2
    • /
    • pp.8-15
    • /
    • 2003
  • Recently various of information suppliers provide information via WWW so the necessary of search engine grows larger. However the efficiency of most search engines is low comparatively because of using simple pattern match technique between user's query and web document. A specialized search engine returns the specialized information depend on each user's search goal. It is trend to develop specialized search engines in many countries. However, most such engines don't satisfy the user's needs. This paper proposes the specialized search engine for ocean information that uses user's query related with ocean and the association rules in web data mining can prove relation between web documents. So this search engine improved the recall of data and the precision in existent search method.

  • PDF

Pre-Processing of Query Logs in Web Usage Mining

  • Abdullah, Norhaiza Ya;Husin, Husna Sarirah;Ramadhani, Herny;Nadarajan, Shanmuga Vivekanada
    • Industrial Engineering and Management Systems
    • /
    • v.11 no.1
    • /
    • pp.82-86
    • /
    • 2012
  • In For the past few years, query log data has been collected to find user's behavior in using the site. Many researches have studied on the usage of query logs to extract user's preference, recommend personalization, improve caching and pre-fetching of Web objects, build better adaptive user interfaces, and also to improve Web search for a search engine application. A query log contain data such as the client's IP address, time and date of request, the resources or page requested, status of request HTTP method used and the type of browser and operating system. A query log can offer valuable insight into web site usage. A proper compilation and interpretation of query log can provide a baseline of statistics that indicate the usage levels of website and can be used as tool to assist decision making in management activities. In this paper we want to discuss on the tasks performed of query logs in pre-processing of web usage mining. We will use query logs from an online newspaper company. The query logs will undergo pre-processing stage, in which the clickstream data is cleaned and partitioned into a set of user interactions which will represent the activities of each user during their visits to the site. The query logs will undergo essential task in pre-processing which are data cleaning and user identification.

An Improved Combined Content-similarity Approach for Optimizing Web Query Disambiguation

  • Kamal, Shahid;Ibrahim, Roliana;Ghani, Imran
    • Journal of Internet Computing and Services
    • /
    • v.16 no.6
    • /
    • pp.79-88
    • /
    • 2015
  • The web search engines are exposed to the issue of uncertainty because of ambiguous queries, being input for retrieving the accurate results. Ambiguous queries constitute a significant fraction of such instances and pose real challenges to web search engines. Moreover, web search has created an interest for the researchers to deal with search by considering context in terms of location perspective. Our proposed disambiguation approach is designed to improve user experience by using context in terms of location relevance with the document relevance. The aim is that providing the user a comprehensive location perspective of a topic is informative than retrieving a result that only contains temporal or context information. The capacity to use this information in a location manner can be, from a user perspective, potentially useful for several tasks, including user query understanding or clustering based on location. In order to carry out the approach, we developed a Java based prototype to derive the contextual information from the web results based on the queries from the well-known datasets. Among those results, queries are further classified in order to perform search in a broad way. After the result provision to users and the selection made by them, feedback is recorded implicitly to improve the web search based on contextual information. The experiment results demonstrate the outstanding performance of our approach in terms of precision 75%, accuracy 73%; recall 81% and f-measure 78% when compared with generic temporal evaluation approach and furthermore achieved precision 86%, accuracy 71%; recall 67% and f-measure 75% when compared with web document clustering approach.