• Title/Summary/Keyword: query extraction

TAKES: Two-step Approach for Knowledge Extraction in Biomedical Digital Libraries

  • Song, Min
    • Journal of Information Science Theory and Practice
    • /
    • v.2 no.1
    • /
    • pp.6-21
    • /
    • 2014
  • This paper proposes a novel knowledge extraction system, TAKES (Two-step Approach for Knowledge Extraction System), which integrates advanced techniques from Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP). In particular, TAKES adopts a novel keyphrase extraction-based query expansion technique to collect promising documents. It also uses a Conditional Random Field-based machine learning technique to extract important biological entities and relations. TAKES is applied to biological knowledge extraction, particularly retrieving promising documents that contain Protein-Protein Interaction (PPI) and extracting PPI pairs. TAKES consists of two major components: DocSpotter, which is used to query and retrieve promising documents for extraction, and a Conditional Random Field (CRF)-based entity extraction component known as FCRF. The present paper investigated research problems addressing the issues with a knowledge extraction system and conducted a series of experiments to test our hypotheses. The findings from the experiments are as follows: First, the author verified, using three different test collections to measure the performance of our query expansion technique, that DocSpotter is robust and highly accurate when compared to Okapi BM25 and SLIPPER. Second, the author verified that our relation extraction algorithm, FCRF, is highly accurate in terms of F-Measure compared to four other competitive extraction algorithms: Support Vector Machine, Maximum Entropy, Single POS HMM, and Rapier.

The Schema Extraction Method using the frequency of Label Path in XML documents (XML 문서에서의 레이블 경로 발생 빈도수에 따른 스키마 추출 방법)

  • 김성림;윤용익
    • Journal of Internet Computing and Services
    • /
    • v.2 no.4
    • /
    • pp.11-24
    • /
    • 2001
  • XML documents found on the Internet are generally fairly irregular and have no fixed schema. SQL and OQL are not suitable for query processing over XML documents, so there has been much research on schema extraction and query languages for XML documents. We propose a schema extraction method that uses the frequency of label paths in XML documents. The proposed method produces multi-level schemas, which are useful for query processing.

A Study on Keyword Extraction and Expansion for Web Text Retrieval (웹 문서 검색을 위한 검색어 추출과 확장에 관한 연구)

  • Yoon, Sung-Hee
    • Journal of the Korea Computer Industry Society
    • /
    • v.5 no.9
    • /
    • pp.1111-1118
    • /
    • 2004
  • A natural language query is the best user interface for users of web text retrieval systems. This paper proposes a retrieval system that expands keywords from the syntactically analyzed structure of the user's natural language query, based on natural language processing techniques. By combining or splitting compound nouns through syntactic tree traversal, and by expanding variant or abbreviated keyword forms into multiple keywords, the system is shown to improve retrieval precision and correctness.

Intelligent Query Analysis using Fuzzy Association Rule (퍼지 연관규칙을 이용한 지능적 질의해석)

  • Kim, Mi-Hye
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.6
    • /
    • pp.2214-2218
    • /
    • 2010
  • Association rule mining is a meaningful and useful method for extracting knowledge from large amounts of data; it gives users information describing patterns or similarities among attributes in a database. Association rules have mainly been studied as rules about the presence or absence of items in Boolean databases. In this paper, we propose an intelligent query system using fuzzy association rules, which are extracted by converting quantitative attribute data into nominal attribute values.
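
The idea of turning a quantitative attribute into graded (fuzzy) nominal terms and then measuring rule strength can be sketched as follows. This is a minimal illustration and not the paper's system: the linear membership functions, the attribute names (`age`, `income`), the term boundaries, and the use of `min` to combine memberships are all assumptions chosen for the example.

```python
def ramp_up(x, lo, hi):
    """Linear membership rising from 0 at `lo` to 1 at `hi` (assumed shape)."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

# Hypothetical fuzzy terms: each maps a record to a membership degree in [0, 1].
TERMS = {
    ("age", "old"): lambda r: ramp_up(r["age"], 30, 60),
    ("age", "young"): lambda r: 1.0 - ramp_up(r["age"], 30, 60),
    ("income", "high"): lambda r: ramp_up(r["income"], 20, 80),
    ("income", "low"): lambda r: 1.0 - ramp_up(r["income"], 20, 80),
}

def fuzzy_support(records, items):
    """Average over records of the smallest membership among `items`."""
    total = sum(min(TERMS[item](r) for item in items) for r in records)
    return total / len(records)

def fuzzy_confidence(records, antecedent, consequent):
    """support(antecedent + consequent) / support(antecedent)."""
    base = fuzzy_support(records, antecedent)
    if base == 0.0:
        return 0.0
    return fuzzy_support(records, antecedent + consequent) / base

records = [
    {"age": 30, "income": 20},
    {"age": 60, "income": 80},
    {"age": 45, "income": 50},
]
rule_conf = fuzzy_confidence(records, [("age", "old")], [("income", "high")])
print(rule_conf)
```

A query system could then use rules whose support and confidence exceed chosen thresholds to interpret vague query terms such as "old" or "high"; how the paper does this is not stated in the abstract.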

Keyword Selection for Visual Search based on Wikipedia (비주얼 검색을 위한 위키피디아 기반의 질의어 추출)

  • Kim, Jongwoo;Cho, Soosun
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.8
    • /
    • pp.960-968
    • /
    • 2018
  • The mobile visual search service uses a query image to acquire linkage information through a search of a pre-constructed DB. For this purpose, it would be more useful to search a web-based keyword search system instead of a pre-built DB. In this paper, we propose a representative query extraction algorithm whose output can be used as a keyword on a web-based search system. To do this, we use image classification labels generated by a deep learning-based CNN (Convolutional Neural Network), which performs remarkably well in image recognition. In the query extraction algorithm, words with dictionary meanings are extracted using Wikipedia, and hierarchical categories are constructed using WordNet. The performance of the proposed algorithm is evaluated by measuring the system response time.

A Schema Extraction Method using Elements Information in XML Documents (XML 문서에서의 엘리먼트 정보를 이용한 스키마 추출방법)

  • Kim, Seong-Rim;Yun, Yong-Ik
    • The KIPS Transactions:PartD
    • /
    • v.9D no.3
    • /
    • pp.381-388
    • /
    • 2002
  • XML documents, which are becoming a new standard for expressing and exchanging data on the Internet, do not have a defined schema. Existing SQL or OQL cannot be applied directly to XML documents. Research on schema extraction and query languages for XML documents is actively under way. For a user's query, the results may be too many or too few, and it is important to give users adequate results. This paper proposes a way to extract multi-level schemas according to the frequency of element occurrence in XML documents. The schema can be reduced or extended to match users' queries more flexibly.
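
Frequency-based schema extraction of this kind can be illustrated with a short sketch. It is not the authors' algorithm: identifying each element by its label path from the root, counting in how many documents a path occurs, and the `min_ratio` threshold are assumptions made for the example. Raising the threshold reduces the schema; lowering it extends the schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def label_paths(root):
    """Return the set of distinct root-to-element label paths in one document."""
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths

def extract_schema(docs, min_ratio):
    """Keep the paths that occur in at least `min_ratio` of the documents."""
    counts = Counter()
    for xml_text in docs:
        counts.update(label_paths(ET.fromstring(xml_text)))
    return sorted(p for p, n in counts.items() if n / len(docs) >= min_ratio)

docs = [
    "<book><title/><author/></book>",
    "<book><title/><year/></book>",
    "<book><title/><author/><isbn/></book>",
]
strict_schema = extract_schema(docs, 1.0)   # only paths present in every document
medium_schema = extract_schema(docs, 0.5)   # also paths present in most documents
loose_schema = extract_schema(docs, 0.3)    # also rarely occurring paths
print(strict_schema, medium_schema, loose_schema, sep="\n")
```

Each threshold yields one schema level, so a query that returns too few results could be retried against a looser level, and one that returns too many against a stricter level.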

Query Extension of Retrieve System Using Hangul Word Embedding and Apriori (한글 워드임베딩과 아프리오리를 이용한 검색 시스템의 질의어 확장)

  • Shin, Dong-Ha;Kim, Chang-Bok
    • Journal of Advanced Navigation Technology
    • /
    • v.20 no.6
    • /
    • pp.617-624
    • /
    • 2016
  • Hangul word embedding must be preceded by noun extraction. Otherwise, unnecessary words are trained and efficient embedding results cannot be obtained. In this paper, we propose a model that retrieves more efficiently through query expansion using Hangul word embedding, Apriori, and text mining. Word embedding and Apriori form the step that expands the query by extracting words associated with it by meaning and context. Hangul text mining is the step that extracts a similar answer and responds to the user, using noun extraction, TF-IDF, and cosine similarity. The proposed model can improve answer accuracy by learning the answers of a specific domain and expanding the query with highly correlated terms. Future research should extract more correlated query terms by analyzing user queries stored in the database.
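
The answer-matching step with TF-IDF and cosine similarity can be sketched as below. This is a minimal illustration under stated assumptions: the inputs are already lists of extracted nouns (Korean noun extraction needs a morphological analyzer and is not shown), the smoothed IDF formula is one common variant rather than the one used in the paper, and the sample tokens and the function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(token_lists):
    """Build sparse TF-IDF vectors (dicts) and return them with the IDF table."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(tokens).items()}
               for tokens in token_lists]
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def best_answer(query_tokens, answers):
    """Return the index of the answer most similar to the query."""
    vectors, idf = tfidf_vectors(answers)
    # Query terms unseen in the answers get weight 0 and do not contribute.
    query = {t: c * idf.get(t, 0.0) for t, c in Counter(query_tokens).items()}
    scores = [cosine(query, v) for v in vectors]
    return max(range(len(answers)), key=scores.__getitem__)

answers = [
    ["ship", "route", "radar"],
    ["aircraft", "radar", "altitude"],
    ["ship", "harbor", "pilot"],
]
print(best_answer(["aircraft", "altitude"], answers))
```

In a pipeline like the one described, the query tokens passed to `best_answer` would be the expanded query, that is, the original nouns plus the associated words found by word embedding and Apriori.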

A Relation Analysis between NDSL User Queries and Technical Terms (NDSL 검색 질의어와 기술용어간의 관계에 대한 분석적 연구)

  • Kang, Nam-Gyu;Cho, Min-Hee;Kwon, Oh-Seok
    • Journal of Information Management
    • /
    • v.39 no.3
    • /
    • pp.163-177
    • /
    • 2008
  • In this paper, we analyzed the relationship between the user query keywords used to search NDSL and the technical terms extracted from NDSL journals. For the analysis, we extracted about 833,000 query keywords from nearly 17 months of NDSL search logs and approximately 41,000,000 technical terms from NDSL, INSPEC, and FSTA journals. We used only the English noun phrases among the extracted items and then carried out an equality analysis, a relationship analysis, and a frequency analysis.

Medical Image Retrieval Using Feature Extraction Based on Wavelet Transform (웨이블렛 변환 기반의 특징 검출을 이용한 의료영상 검색)

  • Lee, H.S.;Ma, K.Y.;Ahn, Y.B.
    • Proceedings of the KOSOMBE Conference
    • /
    • v.1998 no.11
    • /
    • pp.321-322
    • /
    • 1998
  • In this paper, a medical image retrieval method using feature extraction based on the wavelet transform is proposed. We use the energy of the coefficients produced by the wavelet transform. The proposed retrieval algorithm comprises two retrieval stages. First, we build an energy map of the wavelet coefficients of the query image and compare it to that of each database image. Then we use the edge information of the query image to search again among the images selected in the first stage. Finally, the retrieved images are displayed on screen.

Implementation of Image Retrieval System Using MPEG-7 Descriptors (MPEG-7 기술자를 이용한 영상 검색 시스템 구현)

  • 이희경;정용주;윤정현;강경옥;노용만
    • Proceedings of the IEEK Conference
    • /
    • 2000.11c
    • /
    • pp.129-132
    • /
    • 2000
  • In this paper, a multimedia database retrieval system using MPEG-7 metadata is proposed. A content-based multimedia retrieval system is implemented with MPEG-7 metadata extraction and matching techniques. MPEG-7 descriptors and descriptor schemes are stored in the database with other metadata. When a query image is given, its descriptors and descriptor schemes are extracted and compared with those in the database. Finally, the most similar images are retrieved.
