• Title/Summary/Keyword: Web Search Query

An Efficient Search Method of Product Reviews using Opinion Mining Techniques (오피니언 마이닝 기술을 이용한 효율적 상품평 검색 기법)

  • Yune, Hong-June;Kim, Han-Joon;Chang, Jae-Young
    • Journal of KIISE:Computing Practices and Letters / v.16 no.2 / pp.222-226 / 2010
  • With the continuously increasing volume of e-commerce transactions, it is now common to buy products and evaluate them on the World Wide Web. Product reviews are very useful to customers because they can make better decisions based on the indirect experience obtainable through these reviews. However, since online shopping malls do not provide ranked results, it is not easy for users to read all the relevant review documents effectively. Product reviews include subjective and emotional opinions, so review search differs from general web search in terms of ranking strategy. In this paper, we propose an effective method of ranking reviews that reflects the user's intention by using opinion mining techniques. The proposed method analyzes product reviews with respect to the query words and the sentiment polarity of subjective opinions. Through diverse experiments, we show that our proposed method outperforms conventional ones.
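The idea of ranking reviews by both query words and opinion polarity can be illustrated with a minimal sketch. This is not the authors' method: the tiny word lists, the `polarity` and `rank_reviews` helpers, and the scoring formula are all hypothetical, chosen only to show how a relevance score and a polarity score might be combined.

```python
# Hypothetical toy sentiment lexicon; a real system would use a much larger resource.
POSITIVE = {"great", "good", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "broken"}


def polarity(tokens):
    """Return a polarity in [-1, 1]: +1 all positive words, -1 all negative, 0 if none."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def rank_reviews(reviews, query_terms, wanted_polarity):
    """Rank reviews by query-term coverage, boosted when the review's polarity
    matches the polarity the user asked for (+1 or -1).

    Assumes query_terms is non-empty. The formula is an illustrative assumption.
    """
    scored = []
    for review in reviews:
        tokens = review.lower().split()
        # Fraction of query terms that occur in the review.
        relevance = sum(t in tokens for t in query_terms) / len(query_terms)
        # Agreement is positive only when review polarity has the wanted sign.
        agreement = max(0.0, wanted_polarity * polarity(tokens))
        scored.append((relevance * (1 + agreement), review))
    return [r for _, r in sorted(scored, key=lambda p: p[0], reverse=True)]


reviews = [
    "battery is great and screen is good",
    "battery is terrible",
    "nice box",
]
positive_first = rank_reviews(reviews, ["battery"], +1)
negative_first = rank_reviews(reviews, ["battery"], -1)
```

With the same query word, asking for positive opinions puts the first review on top, while asking for negative opinions puts the second one on top; a purely keyword-based ranking could not distinguish the two.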

Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis (FCA 기반 계층적 구조를 이용한 문서 통합 기법)

  • Kim, Tae-Hwan;Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.63-77 / 2011
  • The World Wide Web is a very large distributed digital information space. From its origins in 1991, the web has grown to encompass diverse information resources such as personal home pages, online digital libraries, and virtual museums. Some estimates suggest that the web currently includes over 500 billion pages in the deep web. The ability to search and retrieve information from the web efficiently and effectively is an enabling technology for realizing its full potential. With powerful workstations and parallel processing technology, efficiency is not a bottleneck. In fact, some existing search tools sift through gigabyte-size precompiled web indexes in a fraction of a second. But retrieval effectiveness is a different matter. Current search tools retrieve too many documents, of which only a small fraction are relevant to the user query. Furthermore, the most relevant documents do not necessarily appear at the top of the query output order. Also, current search tools cannot retrieve, from a gigantic collection, the documents related to an already retrieved document. The most important problem for many current search systems is improving search quality, that is, providing related documents while keeping the number of unrelated documents in the results as low as possible. To address this problem, CiteSeer proposed ACI (Autonomous Citation Indexing) of articles on the World Wide Web. A "citation index" indexes the links between articles that researchers make when they cite other articles. Citation indexes are very useful for a number of purposes, including literature search and analysis of the academic literature. References contained in academic articles give credit to previous work in the literature and provide a link between the "citing" and "cited" articles. A citation index indexes the citations that an article makes, linking the article with the cited works. Citation indexes were originally designed mainly for information retrieval. The citation links allow navigating the literature in unique ways. Papers can be located independent of language and of the words in the title, keywords, or document. A citation index allows navigation backward in time (the list of cited articles) and forward in time (which subsequent articles cite the current article?). However, CiteSeer cannot index links between articles that researchers do not make, because it indexes only the links that researchers create when they cite other articles. For the same reason, CiteSeer does not scale easily. These problems motivate the design of a more effective search system. This paper presents a method that extracts the subject and predicate of each sentence in a document. Each document is converted into a table that records, for each extracted predicate, which of the possible subjects and objects it applies to. We build a hierarchical graph of each document from its table and then integrate the graphs of the documents. From the integrated graph, we calculate the area that each document occupies relative to the integrated documents, and we mark the relations among documents by comparing these areas. We also propose a method for structural integration of documents that retrieves documents from the graph, which makes it easier for the user to find information. We compared the performance of the proposed approach with the Lucene search engine using the formulas for ranking. As a result, the F-measure is about 60%, which is about 15% better.
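The backward and forward navigation that a citation index allows can be sketched in a few lines. This is only an illustration of the data structure described in the abstract, not CiteSeer's implementation; the `CitationIndex` class and its method names are hypothetical.

```python
from collections import defaultdict


class CitationIndex:
    """Minimal citation index: stores each citation link in both directions."""

    def __init__(self):
        self._cites = defaultdict(set)     # article -> articles it cites
        self._cited_by = defaultdict(set)  # article -> articles that cite it

    def add_citation(self, citing, cited):
        """Record that `citing` contains a reference to `cited`."""
        self._cites[citing].add(cited)
        self._cited_by[cited].add(citing)

    def backward(self, article):
        """Navigate backward in time: the works this article cites."""
        return sorted(self._cites.get(article, ()))

    def forward(self, article):
        """Navigate forward in time: later articles that cite this one."""
        return sorted(self._cited_by.get(article, ()))


index = CitationIndex()
index.add_citation("B", "A")
index.add_citation("C", "A")
index.add_citation("C", "B")
```

Here `index.backward("C")` lists the articles that C cites, and `index.forward("A")` lists the articles citing A. The limitation noted in the abstract is visible in the structure: two related articles that never cite each other share no link, so neither direction of navigation connects them.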

Query Expansion and Term Weighting Method for Document Filtering (문서필터링을 위한 질의어 확장과 가중치 부여 기법)

  • Shin, Seung-Eun;Kang, Yu-Hwan;Oh, Hyo-Jung;Jang, Myung-Gil;Park, Sang-Kyu;Lee, Jae-Sung;Seo, Young-Hoon
    • The KIPS Transactions:PartB / v.10B no.7 / pp.743-750 / 2003
  • In this paper, we propose a query expansion and term weighting method for document filtering that increases the precision of Web search engine results. Query expansion for document filtering uses ConceptNet, an encyclopedia, and the top 10% most similar documents. The term weighting method is used to calculate query-document similarity. In the first step, we expand the initial query into the first expanded query using ConceptNet and the encyclopedia, weight the terms of the first expanded query, and calculate the similarity between it and each document. Next, we create the second expanded query using the top 10% most similar documents and calculate the similarity between it and each document. We then combine the similarities from the first and second steps, re-rank the documents according to the combined similarities, and filter out non-relevant documents whose similarity falls below a threshold. Our experiments showed that our document filtering method yields a notable improvement in retrieval effectiveness when measured using both precision-recall and F-measure.
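The two-step procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `related_terms` dictionary stands in for ConceptNet and the encyclopedia, cosine similarity over raw term counts stands in for the paper's term weighting, the two similarities are combined by a simple average, and `threshold` and `expansion_weight` are hypothetical parameters.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_documents(query, docs, related_terms, threshold=0.2, expansion_weight=0.5):
    """Re-rank docs with a two-step expanded query and drop low-similarity ones."""
    doc_vecs = [Counter(d.lower().split()) for d in docs]

    # Step 1: expand the initial query from the knowledge source (assumed: a dict),
    # giving expansion terms a lower weight than the original terms.
    q1 = Counter({t: 1.0 for t in query})
    for t in query:
        for r in related_terms.get(t, []):
            q1[r] = max(q1[r], expansion_weight)
    sim1 = [cosine(q1, v) for v in doc_vecs]

    # Step 2: expand again with terms from the top 10% most similar documents
    # (at least one document), then recompute the similarities.
    k = max(1, math.ceil(0.1 * len(docs)))
    top = sorted(range(len(docs)), key=lambda i: sim1[i], reverse=True)[:k]
    q2 = Counter(q1)
    for i in top:
        for t, count in doc_vecs[i].items():
            q2[t] += expansion_weight * count
    sim2 = [cosine(q2, v) for v in doc_vecs]

    # Combine both similarities (assumed: average), re-rank, and filter.
    combined = [(s1 + s2) / 2 for s1, s2 in zip(sim1, sim2)]
    ranked = sorted(zip(docs, combined), key=lambda p: p[1], reverse=True)
    return [(d, s) for d, s in ranked if s >= threshold]


docs = [
    "car engine repair guide",
    "automobile engine maintenance",
    "banana bread recipe",
]
kept = filter_documents(["car"], docs, {"car": ["automobile"]})
```

In this toy run the second document never mentions "car" but survives the filter because the expanded query contains "automobile" and, after the second step, "engine"; the unrelated third document scores zero and is filtered out.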

Contents Analysis and Synthesis Scheme for Music Album Cover Art

  • Moon, Dae-Jin;Rho, Seung-Min;Hwang, Een-Jun
    • Journal of IKEEE / v.14 no.4 / pp.305-311 / 2010
  • Most recent web search engines perform effective keyword-based multimedia content retrieval by investigating keywords associated with multimedia content on the Web and comparing them with query keywords. On the other hand, most music and compilation albums provide professional artwork as cover art that is displayed when the music is played. If the cover art is not available, the music player just displays a dummy or random image, which has been a source of dissatisfaction. In this paper, in order to automatically create cover art that matches the music content, we propose a music album cover art creation scheme based on music content analysis and result synthesis. We (i) analyze music content and lyrics and extract representative keywords, (ii) expand the keywords using WordNet and generate various queries, (iii) retrieve related images from the Web using those queries, and finally (iv) synthesize them into album cover art according to the user's preference. To show the effectiveness of our scheme, we developed a prototype system and report some results.

KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok;Hahn, Yoon-Soo;Kim, Nam-Soon;Yu, Ung-Sik;Woo, Hyun-Goo;Chu, In-Sun;Kim, Yong-Sung;Yoo, Hyang-Sook;Kim, Sang-Soo
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.407-411 / 2005
  • The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Koreans. A total of about 157,000 5'-EST high-throughput sequences collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database together with the information obtained by running BLAST against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interface: attribute-based search and sequence similarity search. For attribute-based search we use DBMS technology, while for similarity search we use BLAST, which supports various similarity search options. The search system allows not only multiple queries but also various query types. The results include: 1) information on clones and libraries, 2) accession keys, location on the genome, gene ontology, and pathways linked to public databases, 3) links to external programs, and 4) sequence information of the contig and the 5'-end of clones. We believe that the KUGI database and search system may provide very useful information for studies elucidating the causes of diseases that are prevalent in Koreans.

Building Intelligent User Interface Agent for Semantically Reformulating User Query in Medicine

  • Yang, Jung-Jin;Lim, Chae-Myung;Chu, Sung-Joon;Lee, Dong-Hoon;Park, Duck-Whan;Park, Tae-Yong
    • Journal of Intelligence and Information Systems / v.9 no.2 / pp.101-119 / 2003
  • Realizing the benefits of recent discoveries from the Human Genome Project still requires a way to retrieve and analyze exponentially expanding bio-related information. Research in bio-related fields naturally applies discovered knowledge to the current problem and makes inferences to extract new information, which requires shared concepts and data to be defined and used in a coherent way. In such a professional domain, although the need to help users reduce their work and to improve search results has emerged, methods for systematic retrieval and adequate exchange of relevant information are still in their infancy. The design of our system aims at improving the quality of information retrieval in a professional domain by utilizing both corpus-based and concept-based ontologies. Meta-rules that help users formulate an adequate query are formed into an ontology in the domain. Integrating this knowledge permits the system to retrieve relevant information in a more semantic and systematic fashion. This work mainly describes the query models, with details of the GUI and the secondary query generation of the system.

The Multimedia Searching Behavior of Korean Portal Users (국내 포털 이용자들의 멀티미디어 검색 행태 분석)

  • Park, So-Yeon
    • Journal of the Korean Society for Library and Information Science / v.44 no.1 / pp.101-115 / 2010
  • The main difference between web searching and traditional searching is that the web provides and supports multimedia searching. This study aims to investigate the multimedia searching behavior of users of NAVER, a major Korean search portal. In conducting this study, the query logs and click logs of a unified search service were analyzed. The results show that among the multimedia queries submitted by users, audio searches are the dominant media type, followed at similar levels by video and image searches. On the other hand, among the multimedia documents clicked on, video is the most popular collection type, followed by image and audio collections. Entertainment is the most popular topic in both multimedia queries and clicks. The results of this study can be applied to the portal's development of multimedia content and searching algorithms.

Modeling and Implementation of Multilingual Meta-search Service using Open APIs and Ajax (Open API와 Ajax를 이용한 다국어 메타검색 서비스의 모델링 및 구현)

  • Kim, Seon-Jin;Kang, Sin-Jae
    • Journal of Korea Society of Industrial Information Systems / v.14 no.5 / pp.11-18 / 2009
  • Ajax, which is based on JavaScript, is receiving attention as an alternative to ActiveX technology. Most portal sites in Korea tend to relaunch existing services using this technology, because it is supported by most web browsers and offers a polished interface, excellent speed, and reduced traffic through asynchronous interaction. This paper models and implements a multilingual meta-search service using Ajax and open APIs provided by internationally well-known sites. First, a Korean query is translated into one of the languages of 54 countries around the world by the Google translation API, and then the translated result is used to search social web sites such as Flickr, YouTube, Daum, and Naver. Search results are displayed quickly by dynamically loading only a portion of the screen using Ajax. Our system can reduce server traffic and per-packet communication charges by preventing redundant transmission of unnecessary information.

Issues of IPR Database Construction through Interdisciplinary Research (학제간 연구를 통한 IPR 데이터베이스 구축의 쟁점)

  • Kim, Dong Yong;Park, Young Chul
    • Journal of the Korea Convergence Society / v.8 no.8 / pp.59-69 / 2017
  • Humanities and social sciences researchers and database experts have teamed up to build a database of IPR materials prepared by the Institute of Pacific Relations (IPR). This paper presents the issues and solutions inherent in the database construction for ensuring the quality of IPR materials. For the accessibility of the database, we maintain the database on the Web so that researchers can access it via web browsers; for the convenience of the database construction, we provide an integrated interface that allows researchers to perform all tasks in it; for the completeness of IPR materials constructed, we support the responsible input and the responsible approval that identify responsibilities of each IPR material entered; and for the immediacy of the approval, we support an interactive approval process facilitating the input of researchers. We also use database design, query processing, transaction management, and search and sorting techniques to ensure the correctness of IPR materials entered. In particular, through concurrency control using existence dependency relationships between records, we ensure the correctness between the operating system files and their paths. Our future studies include content search, database download and upload, and copyright related work on IPR materials.

Ranked Web Service Retrieval by Keyword Search (키워드 질의를 이용한 순위화된 웹 서비스 검색 기법)

  • Lee, Kyong-Ha;Lee, Kyu-Chul;Kim, Kyong-Ok
    • The Journal of Society for e-Business Studies / v.13 no.2 / pp.213-223 / 2008
  • The efficient discovery of services from a large-scale collection of services has become an important issue [7, 24]. We studied a syntactic method for Web service discovery, rather than a semantic method. We regarded service discovery as a retrieval problem over the proprietary XML formats in which service descriptions are stored in a registry DB. We modeled services and queries as probabilistic values and devised similarity-based retrieval techniques. The benefits of our approach are as follows. First, our system supports ranked service retrieval by keyword search. Second, we consider both the UDDI data and the WSDL definitions of services at query evaluation time. Last, our technique can be easily implemented on an off-the-shelf DBMS and can also utilize the maintenance features of the DBMS.
