• Title/Summary/Keyword: Indexing System

Search Result 463, Processing Time 0.028 seconds

Automatic indexing as a subject analysis technique (주제분석기법으로서의 자동색인)

  • 이영자
    • Journal of Korean Library and Information Science Society
    • /
    • v.12
    • /
    • pp.61-96
    • /
    • 1985
  • The human subject analysis of a document has some critical problems. The method results in the inconsistency in analysis process and the contradiction of two objects of the subject analysis (one is the identification of the content for the retrieval of specific items and the other is to identify the content for the grouping of related materials). Since the subject analysis by mechanized has been recognized to be the possible way to aggregate the problems of manual analysis, various a n.0, pproaches of automatic indexing have been studied and experimented. This study is to examine the automatic indexing as one of the promising subject analysis techniques by statistical, syntactical and semantic a n.0, pproaches. In conclusion, the reasonable a n.0, pplication time of the automatic indexing should be made a decision based on the through investigation on the cost verse effectiveness, and automatic indexing system should be developed in the close relationship with the on-line search which is a good retrieval system for information explosion society. From now on, since the machine-readable document-text will be envisaged to be more and more available due to the rapid development of computer technology, the more substantial research on the automatic indexing will be also possible, which can bring about the increasing of practical automatic indexing systems.

  • PDF

An Indexing System for Retrieving Similar Paths in XML Documents (XML 문서의 유사 경로 검색을 위한 인덱싱 시스템)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD
    • /
    • v.15D no.2
    • /
    • pp.171-178
    • /
    • 2008
  • Since the XML standard was introduced by the W3C in 1998, documents that have been written in XML have been gradually increasing. Accordingly, several systems have been developed in order to efficiently manage and retrieve massive XML documents. BitCube-a bitmap indexing system-is a representative system for this field of research. Based on the bitmap indexing technique, the path bitmap indexing system(LH06), which performs the clustering of similar paths, improved the problem that the existing BitCube system could not solve, namely, determining similar paths. The path bitmap indexing system has the advantage of a higher retrieval speed in not only exactly matched path searching but also similar path searching. However, the similarity calculation algorithm of this system has a few particular problems. Consequently, it sometimes cannot calculate the similarity even though some of two paths have extremely similar relationships; further, it results in an increment in the number of meaningless clusters. In this paper, we have proposed a novel method that clustering, the similarity between the paths in order to solve these problems. The proposed system yields a stable result for clustering, and it obtains a high score in clustering precision during a performance evaluation against LH06.

An experiment in automatic indexing with korean texts : a comparison of syntactico-statistical and manual methods (구문 . 통계적 기법을 이용한 한국어 자동색인에 관한 연구)

  • 서은경
    • Journal of the Korean Society for information Management
    • /
    • v.10 no.1
    • /
    • pp.97-124
    • /
    • 1993
  • This study was undertaken in order to develop practical automatic indexing techniques suitable for Korean natural language texts. It has taken a modest step toward this goal by developing an automatic syntactico-statistical indexing method and evaluating the method by comparing the resutls with manual indexing. For this experimental study, the Korean text database was constructed manually based on 300 abstracts covering business subject. The experimental results showed that the performance of the automatic syntactico-statistical indexing system was comparable to that of other studies which have compared automatic indexing with manual indexing.

  • PDF

A Study on Distributed Indexing Technique for Digital Library (디지털 도서관을 위한 분산색인 기법에 대한 연구)

  • Yu, Chun-Sik;Lee, Jong-Deuk;Kim, Yong-Seong
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.2
    • /
    • pp.315-325
    • /
    • 1999
  • Indexing techniques for distributed resources have much effect on an information service system based on distributed environment like digital library. There is a centralized indexing technique, a distributed technique, and a mixed technique for distributed indexing techniques. In this paper, we propose new distributed indexing technique using EIF(extended Inverted File) structure that mix the centralized technique and t도 distributed technique. And we propose management techniques using EIF structure and retrieval technique using EIF structure. This distributed indexing technique proposed is able to fast process retrieval request and reduce network overload and select servers relevant to query terms. This paper investigated performance of a proposed distributed indexing technique.

  • PDF

A Study of Automatic Indexing Technique based on Logical Structure of SGML Hangul Document (SGML 한글문서의 논리적 구조에 근거한 색인기법에 관한 연구)

  • 유석종
    • Journal of the Korean Society for information Management
    • /
    • v.12 no.2
    • /
    • pp.85-101
    • /
    • 1995
  • Conventional indexing sytstems support only full-text indexing method for electronic documents and do not use logical structure of documents in retrieval. Most electronic documents are in different formats depending on various systems. Also, they only indicate physical style of the document without considering any logical structure. Thus, in the effort to standardize the exchange of documents. IS0 developed SGML(Stadard Generalized Markup Language) which contains information about logical structure of the documents. In this paper, to resolve the disadvantages of full-text indexing method and to use standard document format. indexing system for SGML document is designed and implemented. In this system, user can assign indexing domain on elements, thus the logical structure of document is reflected in retrieving information. Various retrieval methods can be implemented by using the structural information of the document. In addition, automatic indexing for SGML Hangul document is supported in this system

  • PDF

Automated Essay Grading: An Application For Historical Malay Text

  • Syed Mustapha, S.M.F.D;Idris, N.
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2001.01a
    • /
    • pp.237-245
    • /
    • 2001
  • Automated essay grading has been proposed for over thirty years. Only recently have practical implementations been constructed and tested. This paper investigated the role of the nearest-neighbour algorithm within the information retrieval as a way of grading the essay automatically called Automated Essay Grading System. It intended to offer teachers an individualized assistance in grading the student\`s essay. The system involved several processes, which are the indexing, the structuring of the model answer and the grade processing. The indexing process comprised the document indexing and query processing which are mainly used for representing the documents and the query. Structuring the model answer is actually preparing the marking scheme and the grade processing is the process of assessing the essay. To test the effectiveness of the developed algorithms, the algorithms are tested against the History text in Malay. The result showed that th information retrieval and the nearest-neighbour algorithm are practical combination that offer acceptable performance for grading the essay.

  • PDF

Protein Sequence Search based on N-gram Indexing

  • Hwang, Mi-Nyeong;Kim, Jin-Suk
    • Bioinformatics and Biosystems
    • /
    • v.1 no.1
    • /
    • pp.46-50
    • /
    • 2006
  • According to the advancement of experimental techniques in molecular biology, genomic and protein sequence databases are increasing in size exponentially, and mean sequence lengths are also increasing. Because the sizes of these databases become larger, it is difficult to search similar sequences in biological databases with significant homologies to a query sequence. In this paper, we present the N-gram indexing method to retrieve similar sequences fast, precisely and comparably. This method regards a protein sequence as a text written in language of 20 amino acid codes, adapts N-gram tokens of fixed-length as its indexing scheme for sequence strings. After such tokens are indexed for all the sequences in the database, sequences can be searched with information retrieval algorithms. Using this new method, we have developed a protein sequence search system named as ProSeS (PROtein Sequence Search). ProSeS is a protein sequence analysis system which provides overall analysis results such as similar sequences with significant homologies, predicted subcellular locations of the query sequence, and major keywords extracted from annotations of similar sequences. We show experimentally that the N-gram indexing approach saves the retrieval time significantly, and that it is as accurate as current popular search tool BLAST.

  • PDF

A Study on Semantic Based Indexing and Fuzzy Relevance Model (의미기반 인덱스 추출과 퍼지검색 모델에 관한 연구)

  • Kang, Bo-Yeong;Kim, Dae-Won;Gu, Sang-Ok;Lee, Sang-Jo
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2002.04b
    • /
    • pp.238-240
    • /
    • 2002
  • If there is an Information Retrieval system which comprehends the semantic content of documents and knows the preference of users. the system can search the information better on the Internet, or improve the IR performance. Therefore we propose the IR model which combines semantic based indexing and fuzzy relevance model. In addition to the statistical approach, we chose the semantic approach in indexing, lexical chains, because we assume it would improve the performance of the index term extraction. Furthermore, we combined the semantic based indexing with the fuzzy model, which finds out the exact relevance of the user preference and index terms. The proposed system works as follows: First, the presented system indexes documents by the efficient index term extraction method using lexical chains. And then, if a user tends to retrieve the information from the indexed document collection, the extended IR model calculates and ranks the relevance of user query. user preference and index terms by some metrics. When we experimented each module, semantic based indexing and extended fuzzy model. it gave noticeable results. The combination of these modules is expected to improve the information retrieval performance.

  • PDF

Implementation of Environmental Information Monitoring System using Multi-Query Indexing Technique and Wireless Sensor (다중 질의 색인기법과 무선 센서를 이용한 환경정보 모니터링 시스템 구현)

  • Kim, Jung-Yee;Lee, Kang-Ho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.12 no.6
    • /
    • pp.307-312
    • /
    • 2007
  • Wireless Sensor Network(WSN) is considered as a core technology necessary for Ubiquitous computing, with its numerous possible applications in many practical areas, is being researched and studied actively by many around the world. WSN utilizes wireless sensors spatially placed to gather information regarding temperature, light condition, motion and change in speed of the objects within their surrounding environment. This paper implements an environmental information monitoring and indexing system based on spatial indexing technique by constructing a WSN system. This Multi-Query Indexing Technique coupled with wireless sensors provides an output based on the pre-defined built-in data index and new input from the sensors. If environment data is occured, system have to perform a proper action after collecting and analyzing this data. This is the purpose of implementing environment data monitoring system. We constructed environmental application using TinyOS and built tested with MICAz sensor bords. We designed and implemented a monitoring system which detects and multi-indexing process environmental data from distributed sensors.

  • PDF

Text Partitioned Indexing Method for Educational Documents (교육용 문서의 텍스트분할 색인)

  • Kang, Mu-Yeong;Lee, Sang-Gu
    • Journal of The Korean Association of Information Education
    • /
    • v.3 no.2
    • /
    • pp.72-84
    • /
    • 2000
  • Information retrieval system plays a key role in the information society to store digital documents with efficiency and to provide user with the information through the retrieval very fast. Especially, indexing is a prerequisite function for the information retrieval system in order to retrieve the information of the documents effectively which are saved in database. In this paper, we propose an indexing method using text partition. This method can retrieve educational documents in short processing time. We applied the suggested indexing method to real information retrieval system, and proved its excellent functions through the demonstration.

  • PDF