• Title/Summary/Keyword: Indexing searching


A Study on the Indexing System Using a Controlled Vocabulary and Natural Language in the Secondary Legal Information Full-Text Databases : an Evaluation and Comparison of Retrieval Effectiveness (2차 법률정보 전문데이터베이스에 있어서 통제어 색인시스템과 자연어 색인시스템의 검색효율 평가에 관한 연구)

  • Roh Jeong-Ran
    • Journal of the Korean Society for Library and Information Science / v.32 no.4 / pp.69-86 / 1998
  • The purpose of this study is to develop an indexing algorithm for secondary legal information based on the characteristics of legal information, to compare a controlled-vocabulary indexing system with a natural-language indexing system in secondary legal information full-text databases, and to demonstrate the suitability and superiority of the controlled-vocabulary system. The results are as follows: 1) In secondary legal information full-text databases, the controlled-vocabulary indexing system is more effective than the natural-language indexing system in recall rate, precision rate, distribution of propriety, and the ability to find unique relevant records that the natural-language system fails to find. 2) An indexing system that adds more words to the controlled vocabulary does not improve the recall rate or the precision rate compared with the controlled-vocabulary system alone. 3) An indexing system that gives extra weight to the word-added controlled vocabulary does not improve the recall rate or the precision rate compared with the word-added controlled vocabulary without extra weight. This study indicates that, rather than approaching an information system in a purely linguistic, statistical, or structural way, it is necessary to build into the system the characteristic information that information experts recognize, that is, the experiential and inherent knowledge that only human beings have. Doing so can yield a more essential and intelligent information system.
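The recall and precision rates used in this comparison can be illustrated with a short calculation. This is a minimal sketch assuming the standard set-based definitions; the function name `precision_recall` and the record IDs are hypothetical and are not taken from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, using set-based definitions."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant records that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical record IDs for a single query (illustration only).
relevant_records = {1, 2, 3, 4}
controlled_result = [1, 2, 3, 9]      # assumed output of a controlled-vocabulary index
natural_result = [1, 2, 7, 8, 9]      # assumed output of a natural-language index

print(precision_recall(controlled_result, relevant_records))
print(precision_recall(natural_result, relevant_records))
```

Averaging these two values over a set of test queries gives one figure per indexing system, which is the kind of comparison the abstract reports.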


An Investigation of the Objectiveness of Image Indexing from Users' Perspectives (이용자 관점에서 본 이미지 색인의 객관성에 대한 연구)

  • 이지연
    • Journal of the Korean Society for Information Management / v.19 no.3 / pp.123-143 / 2002
  • Developing good methods for image description and indexing is fundamental to successful image retrieval, regardless of the content of the images. Researchers and practitioners in the field of image indexing have developed a variety of image indexing systems and methods that take into account the types of information images deliver. Such efforts include Panofsky's levels of image indexing and indexing systems that adopt different approaches, such as the thesaurus-based approach, the classification approach, the description element-based approach, and the categorization approach. This study investigated users' perception of the objectiveness of image indexing, especially the iconographical analysis of image information advocated by Panofsky. One of the best examples of the subjectiveness and condition-dependence of image information is emotion, so this study dealt with visual emotional information. Experiments were conducted in two phases: one measured the degree of agreement or disagreement about the emotional content of pictures among forty-eight participants, and the other examined inter-rater consistency, defined as the degree of users' agreement on indexing. The results showed that the participants interpreted the pictures fairly subjectively. It was also found that this subjective interpretation resulted from individual differences in educational or cultural background. The results emphasize the importance of developing new ways of indexing and/or searching for images that can alleviate the limitations on image access caused by the subjective interpretations of different users.

Color Image Query Using Hierarchical Search by Region of Interest with Color Indexing

  • Sombutkaew, Rattikorn;Chitsobhuk, Orachat
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2004.08a / pp.810-813 / 2004
  • Indexing and retrieving images from large and varied collections using image content as a key is a challenging and important problem in computer vision applications. In this paper, a color Content-Based Image Retrieval (CBIR) system using hierarchical Region of Interest (ROI) query and indexing is presented. During the indexing process, the ROIs in every image in the image database are first extracted using a region-based image segmentation technique; the JSEG approach is selected for this step in order to create color-texture regions. Color features in the form of histograms and correlograms are then extracted from each segmented region. Finally, the features are stored in the database as the key for retrieving the relevant images. In the retrieval system, users can select an ROI directly on a sample image or on an image they submit, and the query process then focuses on the content of the selected ROI in order to find images in the database that contain similar regions. A hierarchical region-of-interest query is performed to retrieve the similar images, using a two-level search. In the first level, the most important regions, usually the large regions at the center of the user's query, are used to retrieve images having similar regions using static search. This ensures that all images containing the most important regions are retrieved. In the second level, all the remaining regions in the user's query are used to search among the images retrieved in the first level. Experimental results using the indexing technique show good retrieval performance over a variety of image collections, as well as a large reduction in search time.
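The two-level search described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each image has already been segmented into regions (JSEG is not reproduced), it uses only quantized RGB histograms compared by histogram intersection (no correlogram), and all names (`color_histogram`, `two_level_search`, the `threshold` parameter) are hypothetical.

```python
def color_histogram(pixels, bins=4):
    """Normalized RGB histogram of a region, with `bins` levels per channel."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def best_match(query_hist, region_hists):
    """Similarity of the best-matching region of one image."""
    return max((intersection(query_hist, h) for h in region_hists), default=0.0)


def two_level_search(query_regions, database, threshold=0.5):
    """query_regions: region histograms, most important region first.
    database: dict mapping image id -> list of region histograms."""
    main, rest = query_regions[0], query_regions[1:]
    # Level 1: keep only images that contain a region similar to the main region.
    candidates = [img for img, regions in database.items()
                  if best_match(main, regions) >= threshold]

    # Level 2: rank the surviving images using the remaining query regions too.
    def score(img):
        regions = database[img]
        return best_match(main, regions) + sum(best_match(q, regions) for q in rest)

    return sorted(candidates, key=score, reverse=True)


# Toy regions made of a single color each (illustration only).
red = color_histogram([(250, 10, 10)] * 4)
green = color_histogram([(10, 250, 10)] * 4)
blue = color_histogram([(10, 10, 250)] * 4)
database = {"a": [red, blue], "b": [red, green], "c": [green]}

ranked = two_level_search([red, blue], database)
print(ranked)  # ['a', 'b']
```

Because the second level only scores images that survived the first, the number of region comparisons shrinks with the size of the candidate set, which is consistent with the reduced search time the abstract reports.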


An Efficient Index Scheme of XML Documents Using Node Range and Pre-Order List (노드 범위와 Pre-Order List를 이용한 XML문서의 효율적 색인기법)

  • Kim Young;Park Sang-Ho;Lee Ju-Hong
    • Journal of Internet Computing and Services / v.7 no.4 / pp.23-32 / 2006
  • In this paper, we propose an indexing method for managing large numbers of XML documents efficiently, using node ranges and a Pre-Order List. Most XML indexing methods are based on either paths or numbering schemes. Path-based indexing suffers from performance degradation in join operations on ancestor-descendant relationships and in searches for middle and lower nodes. Numbering-scheme-based indexing has to number every node of the XML documents, which increases search overhead and wastes disk space on indexes. We therefore propose a novel indexing method using node ranges and Pre-Order Lists to overcome these problems. The proposed method stores similarly structured XML documents more efficiently. In addition, it supports flexible insertion and deletion of XML documents.
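To illustrate why node ranges help with ancestor-descendant queries, the sketch below labels each element with its pre-order position and the last pre-order position in its subtree. This is a generic range-labeling example under that assumption, not the specific scheme proposed in the paper; `label_ranges` and `is_ancestor` are hypothetical names.

```python
import xml.etree.ElementTree as ET


def label_ranges(root):
    """Return a pre-order list of (tag, start, end) labels.

    start is the node's pre-order number; end is the largest pre-order
    number found in the node's subtree.
    """
    labels = []

    def visit(node):
        start = len(labels)
        labels.append(None)  # reserve this node's slot in pre-order position
        for child in node:
            visit(child)
        labels[start] = (node.tag, start, len(labels) - 1)

    visit(root)
    return labels


def is_ancestor(a, d):
    """True if label a is a proper ancestor of label d (range containment)."""
    return a[1] < d[1] <= a[2]


doc = ET.fromstring("<book><title/><chapter><section/><section/></chapter></book>")
labels = label_ranges(doc)
for label in labels:
    print(label)
```

With these labels, an ancestor-descendant test is two integer comparisons rather than a path traversal or a structural join over parent pointers.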


An Exploratory Investigation on Multimedia Information Needs and Searching Behavior among College Students (멀티미디어 정보요구와 검색행태에 관한 탐색적 연구)

  • Chung, Eun-Kyung
    • Journal of the Korean Society for Library and Information Science / v.46 no.3 / pp.251-270 / 2012
  • Multimedia needs and searching have become important in everyday life, especially among the younger generation. Multimedia needs and searching behaviors differ from textual information needs and searching behaviors in a wide variety of ways. By interviewing and observing college students from 20 areas in Seoul, this study aims to improve the understanding of users' multimedia needs and of how users search for multimedia. The findings are presented in terms of searching sources, multimedia needs, relevance criteria, and searching barriers. For multimedia, the primary searching sources are Naver and Google, and distinguishing features are presented for each multimedia type. When multimedia needs are categorized as generic, specific, or abstract, most are classified as specific rather than generic, although there are differences depending on the type of multimedia. In addition, relevance criteria and searching barriers reflect the characteristics of the individual multimedia types. The findings demonstrate that distinct indexing and searching environments for different types of multimedia might be necessary to improve the quality of multimedia searching.

Efficient Indexing for Large DNA Sequence Databases (대용량 DNA 시퀀스 데이타베이스를 위한 효율적인 인덱싱)

  • Won Jung-Im;Yoon Jee-Hee;Park Sang-Hyun;Kim Sang-Wook
    • Journal of KIISE:Databases / v.31 no.6 / pp.650-663 / 2004
  • In molecular biology, DNA sequence searching is one of the most crucial operations. Since DNA databases contain a huge volume of sequences, a fast indexing mechanism is essential for efficient processing of DNA sequence searches. In this paper, we first identify the problems of the suffix tree with respect to storage overhead, search performance, and integration with DBMSs. Then, we propose a new index structure that solves those problems. The proposed index consists of two parts: the primary part represents the trie as bit strings without any pointers, and the secondary part provides fast access to the leaf nodes of the trie that need to be accessed for post-processing. We also suggest an efficient algorithm based on that index for DNA sequence searching. To verify the superiority of the proposed approach, we conducted a performance evaluation via a series of experiments. The results revealed that the proposed approach, which requires less storage space, achieves a 13- to 29-fold performance improvement over the suffix tree.

A Study of Designing the Automatic Information Retrieval System based on Natural Language (자연어를 이용한 자동정보검색시스템 구축에 관한 연구)

  • Seo, Hwi
    • Journal of the Korean Society for Library and Information Science / v.35 no.4 / pp.141-160 / 2001
  • This study develops a new system for conducting information retrieval automatically. The system is programmed in Delphi 4.0 (Pascal) and consists of automatic indexing, a clustering technique, the establishment and expression of hierarchical term relations, and an automatic information retrieval technique. The resulting browser system can automatically control all the processes of information searching, such as the representation, generation, and extension of queries, the construction of search strategies, and feedback searching.


Indexing Algorithm Using Dynamic Threshold (동적임계값을 이용한 인덱싱 알고리즘)

  • 이문우;박종운;장종환
    • Journal of the Korea Computer Industry Society / v.2 no.3 / pp.389-396 / 2001
  • When detecting scene changes in moving pictures, which carry a massive amount of information, the temporal sampling method searches faster and misses fewer scene changes than sequential searching over the whole video; however, the search algorithm employed and the detection interval greatly affect the number of missed frames and the search precision. In this study, a primary retrieval over the whole video was performed with a threshold, using the scene change detection method based on the temporal difference of histograms. We suggest a dynamic threshold algorithm using the cut detection interval and derive an equation to determine the optimal primary retrieval threshold, which can reduce the cut detection interval computation. Experimental results show that the proposed dynamic threshold algorithm using the cut detection interval improves precision by about 30 percent over the sequential searching method.
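The general idea of temporal sampling with histogram differences can be sketched as below. This is a minimal illustration only: the paper's equation for the dynamic threshold is not given in the abstract and is not reproduced here, so `threshold` is simply passed in as a fixed, assumed value. The function names and the toy two-bin histograms are hypothetical.

```python
def hist_diff(h1, h2):
    """Sum of absolute bin differences between two frame histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(histograms, interval, threshold):
    """Temporal sampling: compare frames `interval` apart, then refine.

    Only when two sampled frames differ by more than `threshold` is the
    span between them scanned frame by frame to locate the cut.
    """
    cuts = []
    n = len(histograms)
    for start in range(0, n - 1, interval):
        end = min(start + interval, n - 1)
        if hist_diff(histograms[start], histograms[end]) > threshold:
            # Refinement pass inside the suspicious interval.
            for i in range(start + 1, end + 1):
                if hist_diff(histograms[i - 1], histograms[i]) > threshold:
                    cuts.append(i)
    return cuts


# Toy video: five frames of scene A followed by five frames of scene B.
scene_a, scene_b = [10, 0], [0, 10]
frames = [scene_a] * 5 + [scene_b] * 5

cuts = detect_cuts(frames, interval=4, threshold=10)
print(cuts)  # [5]
```

A scene that begins and ends between two sampled frames would go undetected by this sketch, which illustrates why the abstract stresses that the detection interval affects missed frames.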


A DNA Index Structure using Frequency and Position Information of Genetic Alphabet (염기문자의 빈도와 위치정보를 이용한 DNA 인덱스구조)

  • Kim Woo-Cheol;Park Sang-Hyun;Won Jung-Im;Kim Sang-Wook;Yoon Jee-Hee
    • Journal of KIISE:Databases / v.32 no.3 / pp.263-275 / 2005
  • In large DNA databases, indexing techniques are widely used for rapid approximate sequence searching. However, most indexing techniques require more space than the original database and are difficult to integrate seamlessly with a DBMS. In this paper, we suggest a space-efficient, disk-based indexing and query processing algorithm for approximate DNA sequence searching, specifically for exact match queries, wildcard match queries, and k-mismatch queries. Our indexing method places a sliding window at every possible location of a DNA sequence and extracts its signature by considering the occurrence frequency of each nucleotide. It then stores the set of signatures in a multi-dimensional index, such as an R*-tree. In particular, by assigning a weight to each position of a window, it prevents signatures from being concentrated around a few spots in the index space. Our query processing algorithm converts a query sequence into a multi-dimensional rectangle and searches the index for the signatures that overlap the rectangle. Experiments with real biological data sets revealed that the proposed method is at least three times, twice, and several orders of magnitude faster than the suffix-tree-based method for exact match, wildcard match, and k-mismatch queries, respectively.
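The frequency-signature idea can be sketched as follows for k-mismatch queries. This is a simplified illustration under stated assumptions: signatures are plain nucleotide counts (the paper's positional weights are omitted), the index is a Python list scanned linearly rather than an R*-tree, and the query length equals the window length. All function names are hypothetical.

```python
ALPHABET = "ACGT"


def signature(window):
    """Frequency signature: count of each nucleotide in the window."""
    return tuple(window.count(ch) for ch in ALPHABET)


def build_index(sequence, w):
    """Slide a window of length w over the sequence; store (signature, position)."""
    return [(signature(sequence[i:i + w]), i)
            for i in range(len(sequence) - w + 1)]


def k_mismatch_search(sequence, index, query, k):
    """Return start positions whose window differs from query in at most k places."""
    q = signature(query)
    # k substitutions can change any single count by at most k, so every
    # true match has a signature inside this 4-dimensional rectangle.
    low = [c - k for c in q]
    high = [c + k for c in q]
    hits = []
    for sig, pos in index:
        if all(l <= s <= h for l, s, h in zip(low, sig, high)):
            # Post-processing: verify the candidate with the real Hamming distance.
            window = sequence[pos:pos + len(query)]
            if sum(a != b for a, b in zip(window, query)) <= k:
                hits.append(pos)
    return hits


sequence = "ACGTACGAAC"
index = build_index(sequence, w=4)
one_mismatch = k_mismatch_search(sequence, index, "ACGA", k=1)
exact = k_mismatch_search(sequence, index, "ACGA", k=0)
print(one_mismatch, exact)  # [0, 4] [4]
```

The rectangle test never discards a true match; it only admits false candidates, which the verification step removes. Replacing the linear scan with a multi-dimensional index turns the rectangle test into a range query.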

k-Bitmap Clustering Method for XML Data based on Relational DBMS (관계형 DBMS 기반의 XML 데이터를 위한 k-비트맵 클러스터링 기법)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD / v.16D no.6 / pp.845-850 / 2009
  • The use of XML data has increased with the growth of the Web 2.0 environment. The advantages of XML are recognized through its use as the base technology of RSS and Atom for transferring information from blogs and news feeds. Bitmap clustering is a method, based on a relational DBMS, that keeps the index in main memory, and it performed better than other XML indexing methods in evaluation. However, the existing method generates too many clusters, which degrades the quality of search results. This paper proposes a k-Bitmap clustering method that generates a user-defined number k of clusters to solve this problem. The proposed method also keeps an additional inverted index for searching terms excluded from the representative bits of the k-Bitmap. Our evaluation shows that users can control the number of clusters. In addition, our method has a high recall value in single-term search, and by keeping two indices it guarantees that the search result includes all documents related to the query.