• Title/Summary/Keyword: query length

Implementation of Text Summarize Automation Using Document Length Normalization (문서 길이 정규화를 이용한 문서 요약 자동화 시스템 구현)

  • 이재훈;김영천;이성주
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2001.12a / pp.51-55 / 2001
  • With the rapid growth of the World Wide Web and electronic information services, information is becoming available online at an incredible rate. One result is the oft-decried information overload. No one has time to read everything, yet we often have to make critical decisions based on what we are able to assimilate. The technology of automatic text summarization is becoming indispensable for dealing with this problem. Text summarization is the process of distilling the most important information from a source to produce an abridged version for a particular user or task. Information retrieval (IR) is the task of searching a set of documents for query-relevant documents. Text summarization, by contrast, can be viewed as the task of searching a single document, treated as a set of sentences, for topic-relevant sentences. In this paper, we show that document length normalization, a technique taken from information retrieval, yields document information that is more reliable and better suited to the query. Experimental results of this system on newspaper articles show that the document length normalization method is superior to other methods that use the query itself.

An Efficient String Similarity Search Technique based on Generating Inverted Lists of Variable-Length Grams (가변길이 그램의 역리스트 생성을 이용한 효율적인 유사 문자열 검색 기법)

  • Kim, Jongik
    • Journal of KIISE / v.43 no.11 / pp.1275-1280 / 2016
  • Existing techniques for string similarity search first generate a set of candidate strings and then verify the candidates. The efficiency of string similarity search depends heavily on the candidate generation method. State-of-the-art techniques select fixed-length q-grams from a query string and generate candidates using the inverted lists of the selected q-grams. In this paper, we propose a technique that generates candidates using variable-length grams of a query string, and we develop a dynamic programming algorithm that selects an optimal combination of variable-length grams from the query string. Experimental results show that the proposed technique improves the performance of string similarity search compared with existing techniques.
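
The generate-then-verify pipeline described in this abstract can be illustrated with a minimal sketch. It implements only the fixed-length q-gram baseline with a count filter, not the variable-length gram selection the paper proposes, whose details the abstract does not give. The function names, the use of edit distance as the similarity measure, and the parameters `q` and `k` are all assumptions made for illustration.

```python
from collections import defaultdict


def qgrams(s, q):
    """Return every overlapping substring of length q, duplicates included."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def build_index(strings, q):
    """Inverted lists: map each q-gram to the ids of the strings containing it."""
    index = defaultdict(set)
    for sid, s in enumerate(strings):
        for gram in qgrams(s, q):
            index[gram].add(sid)
    return index


def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance (used for verification)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def search(query, strings, index, q, k):
    """Return the strings within edit distance k of the query, in id order."""
    grams = qgrams(query, q)
    # Count filter: one edit destroys at most q of the query's q-grams, so a
    # string within distance k must match at least this many of them.
    threshold = len(grams) - k * q
    if threshold <= 0:
        # The filter has no pruning power; every string is a candidate.
        candidates = range(len(strings))
    else:
        counts = defaultdict(int)
        for gram in grams:
            for sid in index.get(gram, ()):
                counts[sid] += 1
        candidates = sorted(sid for sid, c in counts.items() if c >= threshold)
    # Verification step: keep only candidates that really are within distance k.
    return [strings[sid] for sid in candidates
            if edit_distance(query, strings[sid]) <= k]


strings = ["kitten", "sitting", "mitten", "bitter"]
index = build_index(strings, q=2)
print(search("kitten", strings, index, q=2, k=1))
```

In this toy run, "sitting" is pruned by the count filter without any edit distance computation, while "bitter" survives the filter and is rejected only at verification. Reducing that second kind of wasted work is what better gram selection aims at.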

The Pragmatics of Automatic Query Expansion Based on Search Results of Natural Language Queries (탐색결과에 근거한 자연어질의 자동확장 및 응용에 관한 연구 고찰)

  • 노정순
    • Journal of the Korean Society for Information Management / v.16 no.2 / pp.49-80 / 1999
  • This study analyzes research on automatic query modification, expansion, and combination based on the search results of natural language queries, and gives a conceptual framework for the factors that affect the effectiveness of relevance feedback. Operational and experimental systems based on the vector space model, the binary independence model, and the inference net model are reviewed. The review finds that the effectiveness of query expansion is affected by the conceptual model; the algorithms for weighting terms and documents and for selecting the query terms to be added; the number of relevant and non-relevant documents used and the number of terms added in relevance feedback; query length; the type and size of the databases; and other factors.

RFID Tag Number Estimation and Query Time Optimization Methods (RFID 태그 개수 추정 방법 및 질의 시간 최소화 방안)

  • Woo, Kyung-Moon;Kim, Chong-Kwon
    • Journal of KIISE:Information Networking / v.33 no.6 / pp.420-427 / 2006
  • RFID is an important technology that could replace the traditional bar code system and change the paradigm of the manufacturing, distribution, and service industries. An RFID reader can recognize several hundred tags in one second. Tags are identified by randomly transmitting their IDs within a frame that the reader assigns at each round. To minimize tag identification time, the optimal frame size should be selected according to the number of tags. This paper presents new query optimization methods for RFID systems. Query optimization consists of a tag number estimation problem and a frame length determination problem. We propose a simple yet efficient tag estimation method and calculate the optimal frame lengths that minimize overall query time. We conducted rigorous performance studies. The results show that the new tag number estimation technique is more accurate than previous methods. We also observe that a simple greedy method is as efficient as the optimal method in minimizing the query time.
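
The link between the number of tags and the best frame size can be made concrete with a short sketch. It uses the standard framed slotted ALOHA model and assumes that every slot has the same duration and that each tag picks a slot uniformly at random. It is not the estimation or optimization method of the paper, which the abstract does not specify. The function names are hypothetical, and the estimator shown is only the well-known lower bound (each collision slot hides at least two tags).

```python
def expected_singletons(n_tags, frame_size):
    """Expected number of slots that hold exactly one tag reply."""
    return n_tags * (1 - 1 / frame_size) ** (n_tags - 1)


def slot_efficiency(n_tags, frame_size):
    """Expected fraction of slots in the frame that identify a tag."""
    return expected_singletons(n_tags, frame_size) / frame_size


def best_frame_size(n_tags, candidates):
    """Pick the candidate frame size with the highest slot efficiency."""
    return max(candidates, key=lambda size: slot_efficiency(n_tags, size))


def estimate_tags_lower_bound(singles, collisions):
    """Lower-bound tag estimate from one observed frame (assumed estimator)."""
    return singles + 2 * collisions


# Under these assumptions, efficiency peaks when the frame size equals the tag count.
print(best_frame_size(50, range(1, 201)))
print(estimate_tags_lower_bound(singles=10, collisions=5))
```

If empty, singleton, and collision slots take different amounts of time, which is common in real readers, the time-optimal frame size shifts away from the tag count, and that is presumably why the abstract optimizes query time rather than slot efficiency.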

Closest Pairs and e-distance Join Query Processing Algorithms using a POI-based Materialization Technique in Spatial Network Databases (공간 네트워크 데이터베이스에서 POI 기반 실체화 기법을 이용한 Closest Pairs 및 e-distance 조인 질의처리 알고리즘)

  • Kim, Yong-Ki;Chang, Jae-Woo
    • Journal of Korea Spatial Information System Society / v.9 no.3 / pp.67-80 / 2007
  • Recently, many studies on query processing algorithms have been conducted for spatial networks, such as roads and railways, rather than for Euclidean spaces, in order to efficiently support LBS (location-based service) and telematics applications. However, both a closest pairs query and an e-distance join query incur a very high processing cost because they are answered by processing a set of POIs instead of a single POI. Moreover, the query processing cost for closest pairs and e-distance join queries increases rapidly as the value of k (or the length of the radius) increases. Therefore, we propose a closest pairs query processing algorithm and an e-distance join query processing algorithm that use a POI-based materialization technique, so that both kinds of query can be processed efficiently. In addition, we demonstrate the retrieval efficiency of the proposed algorithms by comparing their performance with that of conventional algorithms.

A new method to predict the protein sequence alignment quality (단백질 서열정렬 정확도 예측을 위한 새로운 방법)

  • Lee, Min-Ho;Jeong, Chan-Seok;Kim, Dong-Seop
    • Bioinformatics and Biosystems / v.1 no.1 / pp.82-87 / 2006
  • The most popular protein structure prediction method is comparative modeling. To guarantee accurate comparative modeling, the sequence alignment between a query protein and a template should be accurate. Although choosing the best template based on the protein sequence alignments is critical for more accurate fold recognition in comparative modeling, the quality of the sequence alignment is even more critical. In contrast to the considerable attention paid to developing methods for choosing the best template, the prediction of alignment accuracy has attracted little interest. Here, we develop a method for predicting the shift score, a recently proposed measure of alignment quality. We apply support vector regression (SVR) to predict the shift score. The alignment between a query protein and a template protein of length n in our own library is transformed into an input vector of length n + 2. Structural alignments are assumed to be the best alignments, and the SVR is trained to predict the shift score between the structural alignment and the profile-profile alignment of a query protein to a template protein. Performance is assessed by the Pearson correlation coefficient. The trained SVR predicts the shift score with a correlation of 0.80 between observed and predicted values.

A Tuning Algorithm for the Multidimensional Type Inheritance Index of XML Databases (XML 데이터베이스 다차원 타입상속 색인구조의 조율 알고리즘)

  • Lee, Jong-Hak
    • Journal of Korea Multimedia Society / v.14 no.2 / pp.269-281 / 2011
  • For the MD-TIX (multidimensional type inheritance index), which supports query processing over the type inheritance concept in XML databases, this paper presents an index tuning algorithm that enhances XML query processing performance according to the query pattern. The MD-TIX uses a multidimensional index structure to support complex XML queries involving both nested elements and type inheritance hierarchies. In this index tuning algorithm, we first determine the shape of the index page regions from information about the user's query pattern, and then construct an optimal MD-TIX by applying a region splitting strategy that brings the page regions to the predetermined shape. The performance evaluation results indicate that the proposed tuning algorithm builds an optimal MD-TIX for a given query pattern, and that for three-dimensional query regions over nested predicates of path length 2, the performance gain grows with the degree of skew in the query region's shape.

Applying the Weight for Query Length and the Frequency of Query Term to Information Retrieval (정보 검색에서 질의문 길이에 대한 가중치와 질의어 출현 빈도 가중치 적용)

  • Kang, Seung-Shik;Chun, Young-Jin
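    • Annual Conference of KIPS / 2005.05a / pp.763-766 / 2005
  • We hypothesized that when a query arrives as a long sentence in an information retrieval system, more accurate results can be obtained by using two quantities as weights: the length of the query, and the frequency with which the query terms extracted by analyzing the query occur in the documents the system judged to be correct answers. Specifically, we propose a method that computes the similarity between documents and the query using the vector model, and then applies a weight for the query length together with a weight for the occurrence frequency of the extracted query terms in the result documents obtained from that similarity.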

An Efficient Continuous Range Query Processing Through Grid based Query Indexing (그리드 기반의 질의 색인을 통한 효율적인 연속 영역 질의 처리)

  • Park, Yong-Hun;Bok, Kyoung-Soo;Yoo, Jae-Soo
    • The KIPS Transactions:PartD / v.14D no.5 / pp.471-482 / 2007
  • In this paper, we propose an efficient continuous range query processing scheme that uses a modified grid-based query index to reduce storage space and shorten processing time. The proposed method has two major features. First, each query has a bit identifier, and each cell in the grid has a bit pattern that consists of the bit identifiers of the queries. The bit patterns represent the relationship between cells and queries. Using the bit patterns, we can quickly compute which queries overlap a cell in the grid, and we reduce the number of unnecessary operations by comparing bit patterns rather than query identifiers when computing the relationship between cells and queries. Second, managing the cells of the grid in groups prevents the wasted storage space and the higher comparison costs that would otherwise result from longer bit patterns. We show through a performance evaluation that the proposed method outperforms existing methods.
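
The first feature, bit identifiers per query and a bit pattern per cell, can be sketched as follows. This is a minimal illustration under assumptions: queries are axis-aligned rectangles, a Python integer serves as an arbitrarily long bit pattern, and the class and method names are hypothetical. The group-based cell management described as the second feature is not reproduced here.

```python
class GridQueryIndex:
    """Uniform grid in which each cell stores a bit pattern of overlapping queries."""

    def __init__(self, cell_size, nx, ny):
        self.cell_size = cell_size
        self.nx, self.ny = nx, ny
        # patterns[cy][cx] has bit i set when query i overlaps that cell.
        self.patterns = [[0] * nx for _ in range(ny)]
        self.rects = []  # rectangle of each registered query, indexed by query id

    def _cell(self, x, y):
        """Map a coordinate to its (column, row) cell, clamped to the grid."""
        cx = min(max(int(x // self.cell_size), 0), self.nx - 1)
        cy = min(max(int(y // self.cell_size), 0), self.ny - 1)
        return cx, cy

    def add_query(self, x1, y1, x2, y2):
        """Register a range query and set its bit in every cell it overlaps."""
        qid = len(self.rects)
        self.rects.append((x1, y1, x2, y2))
        cx1, cy1 = self._cell(x1, y1)
        cx2, cy2 = self._cell(x2, y2)
        for cy in range(cy1, cy2 + 1):
            for cx in range(cx1, cx2 + 1):
                self.patterns[cy][cx] |= 1 << qid
        return qid

    def queries_at(self, x, y):
        """Return the ids of the queries whose rectangle contains the point."""
        cx, cy = self._cell(x, y)
        pattern = self.patterns[cy][cx]
        hits, qid = [], 0
        while pattern:
            if pattern & 1:
                # The bit only says the query overlaps the cell; confirm exactly.
                x1, y1, x2, y2 = self.rects[qid]
                if x1 <= x <= x2 and y1 <= y <= y2:
                    hits.append(qid)
            pattern >>= 1
            qid += 1
        return hits


grid = GridQueryIndex(cell_size=10, nx=4, ny=4)
grid.add_query(0, 0, 15, 15)    # query 0
grid.add_query(12, 12, 35, 35)  # query 1
print(grid.queries_at(13, 13), grid.queries_at(18, 18))
```

Because each cell holds a single integer, testing whether two cells are covered by the same queries is one integer comparison. The pattern grows by one bit per registered query, which is the length growth that the abstract's cell grouping is meant to contain.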

Embeded-type Search Function with Feedback for Smartphone Applications (스마트폰 애플리케이션을 위한 임베디드형 피드백 지원 검색체)

  • Kang, Moonjoong;Hwang, Mintae
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.5 / pp.974-983 / 2017
  • In this paper, we discuss a search function that can be embedded in Android-based applications. We used BM25 to suppress insignificant and overly frequent words such as postpositions, the Pivoted Length Normalization technique to resolve the search priority problem related to each item's length, and Rocchio's method to pull items inferred to be related to the query closer to the query vector in the Vector Space Model, which supports an implicit feedback function. Indexing is divided into two methods: a simple index that supports offline operation and a complex index for online operation. The implementation uses a query inference function that guesses the user's future input by collating the current input with the indexed data, which also allows it to handle and correct user errors. The implementation could therefore be easily adopted into smartphone applications to improve their search functions.
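
The ranking and feedback components named in this abstract can be sketched in a few lines. The scoring function is the standard Okapi BM25 formula, whose `b` term normalizes for document length around the average length; it stands in for the paper's separate Pivoted Length Normalization step, whose exact form the abstract does not give. The Rocchio update uses only the positive (relevant-document) term. The parameter defaults and function names are assumptions.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: longer-than-average documents are penalized.
        norm = 1 - b + b * len(d) / avgdl
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            # Terms found in many documents get a small IDF weight.
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * norm)
        scores.append(score)
    return scores


def rocchio(query_vec, relevant_docs, alpha=1.0, beta=0.75):
    """Move the query vector toward the centroid of the relevant documents."""
    updated = {t: alpha * w for t, w in query_vec.items()}
    for doc in relevant_docs:
        for t, w in doc.items():
            updated[t] = updated.get(t, 0.0) + beta * w / len(relevant_docs)
    return updated


docs = [
    ["cheap", "flight"],
    ["cheap", "cheap", "hotel", "deal", "city", "center"],
    ["train", "ticket"],
]
print(bm25_scores(["cheap"], docs))
print(rocchio({"cheap": 1.0}, [{"cheap": 1.0, "flight": 2.0}], beta=0.5))
```

In the example, the short first document outranks the longer second one even though the second contains the query term twice, which is the effect of length normalization on search priority. For implicit feedback, `relevant_docs` would hold the term vectors of items the user opened or selected.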