• Title/Summary/Keyword: related document retrieval


A Study on the Depth-Oriented Decomposition Indexing Method for Creating and Searching Structured Documents Based-on XML (XML을 이용한 구조적 문서 생성 및 탐색을 위한 깊이중심분할 색인기법에 관한 연구)

  • Yang, Ok-Yul;Lee, Yong-Ju
    • The KIPS Transactions: Part D
    • /
    • v.9D no.6
    • /
    • pp.1025-1042
    • /
    • 2002
  • The goal of this study is to generate structured documents that improve the performance of an information retrieval system by using a thesaurus (information on the relations between terms), and to study a technique for searching these structured documents. To accomplish this goal, we propose a DODI (Depth-Oriented Decomposition Index) technique for structured documents and an algorithm that uses a thesaurus to search efficiently for related information through this index. We establish a storage system in which the structured documents generated by this index technique are saved in a database through OpenXML, and XML documents are generated through ForXML methods.

Retrieval methodology for similar NPP LCO cases based on domain specific NLP

  • No Kyu Seong ;Jae Hee Lee ;Jong Beom Lee;Poong Hyun Seong
    • Nuclear Engineering and Technology
    • /
    • v.55 no.2
    • /
    • pp.421-431
    • /
    • 2023
  • Nuclear power plants (NPPs) have technical specifications (Tech Specs) to ensure that the equipment and key operating parameters necessary for the safe operation of the power plant are maintained within limiting conditions for operation (LCO) determined by a safety analysis. The LCO of the Tech Specs, which identify the lowest functional capability of equipment required for the safe operation of a facility, must be complied with for the safe operation of an NPP. Previous studies have used rule-based expert systems to aid in compliance with LCO; however, expert systems face an obvious limit in implementing rules for the many situations related to LCO. Therefore, in this study, we present a retrieval methodology for similar LCO cases to help determine whether an LCO is met or not met. To reflect NPP features in the natural language processing, a domain dictionary was built, and the optimal term frequency-inverse document frequency variant was selected. The retrieval performance was improved by adding a Boolean retrieval model based on terms related to the LCO in addition to the vector space model. The developed domain dictionary and retrieval methodology are expected to be exceedingly useful in determining whether an LCO is met.
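
The combination described in this abstract, a Boolean filter on required terms followed by vector space ranking with TF-IDF weights, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, and the function names (`build_tfidf`, `search`) are all assumptions made for the example, and the paper's domain dictionary and chosen TF-IDF variant are not reproduced.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization, no domain dictionary.
    return text.lower().split()


def build_tfidf(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    # Assumption: smoothed IDF; the paper selects its own variant.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: f * idf[t] for t, f in Counter(toks).items()} for toks in tokenized]
    return vectors, idf


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, docs, required_terms=()):
    """Boolean step: keep documents containing every required term.
    Vector space step: rank the survivors by cosine similarity to the query."""
    vectors, idf = build_tfidf(docs)
    q_vec = {t: f * idf[t] for t, f in Counter(tokenize(query)).items() if t in idf}
    required = {t.lower() for t in required_terms}
    ranked = [
        (cosine(q_vec, vec), i)
        for i, vec in enumerate(vectors)
        if required.issubset(vec.keys())
    ]
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked


docs = [
    "pump a inoperable lco not met",
    "pump b operable lco met",
    "valve inoperable",
]
results = search("pump inoperable", docs, required_terms=["lco"])
```

In this toy run the third document is dropped by the Boolean step because it lacks the required term, and the remaining two are ordered by cosine similarity.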

A Study on Building Structures and Processes for Intelligent Web Document Classification (지능적인 웹문서 분류를 위한 구조 및 프로세스 설계 연구)

  • Jang, Young-Cheol
    • Journal of Digital Convergence
    • /
    • v.6 no.4
    • /
    • pp.177-183
    • /
    • 2008
  • This paper aims to offer a solution based on intelligent document classification to create a user-centric information retrieval system that allows user-centric linguistic expression. To this end, we propose structures that express user intention and a fine-grained document classification process that uses EBL, similarity, a knowledge base, and user intention. To overcome the need for large amounts of exact semantic information, a hybrid process is designed that integrates keyword, thesaurus, probability, and user intention information. A user intention tree hierarchy is built, and a method of extracting group intention between keywords and user intentions is proposed. These structures and processes are implemented in the HDCI (Hybrid Document Classification with Intention) system. HDCI consists of a user intention analysis stage and a web document classification stage. The classification stage is composed of a knowledge base process, a similarity process, and a hybrid coordinating process. With the help of the user intention structures and the hybrid coordinating process, HDCI can efficiently categorize web documents according to a user's complex linguistic expression with little prior information.

Analysis on User Interface in Information Retrieval Systems (정보검색시스템에서의 이용자 인터페이스 기능에 관한 분석적 고찰)

  • 서은경
    • Journal of the Korean Society for Information Management
    • /
    • v.16 no.4
    • /
    • pp.125-150
    • /
    • 1999
  • This study reviews various aspects of the design of user interfaces in interactive information retrieval systems. Specifically, the study examines 1) search-related interfaces such as query processing, search strategies, and multilingual processing, and 2) browsing-related interfaces such as document browsing and search result browsing. The main goals of this review are to characterize user interface techniques in information retrieval systems and to suggest potential future research directions and challenges.

The Document Clustering using LSI of IR (LSI를 이용한 문서 클러스터링)

  • 고지현;최영란;유준현;박순철
    • Proceedings of the Korea Society for Industrial Systems Conference
    • /
    • 2002.06a
    • /
    • pp.330-335
    • /
    • 2002
  • The most critical issue in an information retrieval system is returning results that adequately correspond to user requests. When all documents related to a user query are retrieved, it is not only difficult to find the exact document the user wants, but the usefulness of the results is also limited. Therefore, clustering methods that group corresponding documents have been widely used. In this paper, we cluster documents on the basis of meaning rather than index terms, and an LSI method is applied for this reason. Furthermore, we identify and analyze the differences from clustering with the widely used K-Means algorithm.

Clustering of Web Document Exploiting with the Co-link in Hypertext (동시링크를 이용한 웹 문서 클러스터링 실험)

  • 김영기;이원희;권혁철
    • Journal of Korean Library and Information Science Society
    • /
    • v.34 no.2
    • /
    • pp.233-253
    • /
    • 2003
  • Knowledge organization is the way we humans understand the world. Two types of information organization mechanisms are studied in information retrieval: classification and clustering. Classification organizes entities by pigeonholing them into predefined categories, whereas clustering organizes information by grouping similar or related entities together. Internet information resource systems extract keywords from the words that appear in a web document and build an inverted file. Term clustering based on grouping related terms, however, did not prove overly successful and was mostly abandoned for documents written in different languages or for doorway pages composed only of anchor text. This study examines informetric analysis and the possibility of clustering web documents based on the co-link topology of web pages.
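
One way to picture clustering on co-link topology: two pages count as co-linked each time some third page links to both, and pages whose co-link count reaches a threshold are grouped together. The sketch below is only an illustration under those assumptions; the abstract does not specify the similarity measure or the clustering algorithm, so the threshold `min_colinks`, the connected-component grouping, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def colink_counts(outlinks):
    """Count, for each pair of target pages, how many source pages link to both."""
    counts = defaultdict(int)
    for targets in outlinks.values():
        for a, b in combinations(sorted(set(targets)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def cluster_by_colink(outlinks, min_colinks=2):
    """Group pages into connected components of the thresholded co-link graph.
    Assumption: simple union-find grouping, not the study's actual method."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Register every linked page so isolated pages form singleton clusters.
    for targets in outlinks.values():
        for page in targets:
            find(page)

    for (a, b), count in colink_counts(outlinks).items():
        if count >= min_colinks:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for page in list(parent):
        groups[find(page)].add(page)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical link data: each key is a source page, each value its outgoing links.
outlinks = {
    "hub1": ["a", "b", "c"],
    "hub2": ["a", "b"],
    "hub3": ["c", "d"],
}
clusters = cluster_by_colink(outlinks, min_colinks=2)
```

Because the grouping uses only link structure, it does not depend on the language of the page text, which is the situation where the abstract says term clustering struggled.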

A Design and Implementation for Data Sharing Interface in based XML (XML 기반 데이터 공유 Interface 설계 및 구현)

  • 김철원;김상영;박종훈
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2004.05b
    • /
    • pp.424-428
    • /
    • 2004
  • Research on systems that store and search XML documents is active, and in many cases these systems emphasize functions for efficiently storing and searching XML documents. Such systems also have a table or storage structure designed specifically for storing XML documents, so that the structural information of a document can be stored together with its content, and content retrieval or structure search of XML documents can be performed efficiently. In this paper, we design and implement an interface that converts the data held in the many different kinds of databases currently in use into XML files so that the data can be reused and shared on the Web, and so that output can be produced in XML format through various interfaces.

Conceptual Object Grouping for Multimedia Document Management

  • Lee, Chong-Deuk;Jeong, Taeg-Won
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.9 no.3
    • /
    • pp.161-165
    • /
    • 2009
  • The increase of multimedia information on the Web requires a new method to manage and serve multimedia documents efficiently. This paper proposes a conceptual object grouping method based on fuzzy filtering, in which groups are constituted automatically as multimedia documents increase. The proposed method automatically composes subsumption relations between conceptual objects using fuzzy filtering of the document objects extracted from domains. The grouping of such conceptual objects is regarded as a subsumption relation decided by the $\mu$-cut. This paper proposes the $\mu$-cut, FAS (Fuzzy Average Similarity), and DSR (Direct Subsumption Relation) to decide the fuzzy filtering, which makes it easy to group related document objects. About 1,000 conceptual objects were used in the performance test of the proposed method. The simulation results showed that the proposed method had better retrieval performance than OGM (Optimistic Genealogy Method) and BGM (Balanced Genealogy Method).

Fast, Flexible Text Search Using Genomic Short-Read Mapping Model

  • Kim, Sung-Hwan;Cho, Hwan-Gue
    • ETRI Journal
    • /
    • v.38 no.3
    • /
    • pp.518-528
    • /
    • 2016
  • Searching an extensive document database for documents that are locally similar to a given query document, and subsequently detecting the similar regions between such documents, is considered an essential task in the fields of information retrieval and data management. In this paper, we present a framework for this task. The proposed framework employs the method of short-read mapping, which is used in bioinformatics to reveal similarities between genomic sequences. Documents are treated as biological objects; consequently, edit operations between locally similar documents are viewed as an evolutionary process. Accordingly, we are able to apply the method of evolution tracing to detect similar regions between documents. In addition, we propose heuristic methods to address issues associated with the different stages of the proposed framework, for example, a frequency-based fragment ordering method and a locality-aware interval aggregation method. Extensive experiments covering various search scenarios were conducted, and the results indicate that the proposed framework outperforms existing methods.
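
The general seed-based idea behind short-read mapping can be sketched on text: index short fragments of every database document, look up the fragments of the query, and merge overlapping hits into candidate similar regions. The code below is a simplified illustration only. It assumes word-level k-grams as fragments and a plain overlap merge, and it does not implement the paper's frequency-based fragment ordering, locality-aware interval aggregation, or evolution tracing; all function names are hypothetical.

```python
from collections import defaultdict


def kgrams(words, k):
    """All contiguous runs of k words, in order of their starting position."""
    return [tuple(words[i:i + k]) for i in range(len(words) - k + 1)]


def build_index(docs, k=3):
    """Map each k-gram to the (document id, word position) pairs where it occurs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for pos, gram in enumerate(kgrams(text.lower().split(), k)):
            index[gram].append((doc_id, pos))
    return index


def find_similar_regions(query, index, k=3):
    """Return {doc_id: [(start, end), ...]} with end exclusive, in word offsets.
    Overlapping or touching hits are merged into one region."""
    hits = defaultdict(set)
    for gram in kgrams(query.lower().split(), k):
        for doc_id, pos in index.get(gram, ()):
            hits[doc_id].add(pos)

    regions = {}
    for doc_id, positions in hits.items():
        merged = []
        for pos in sorted(positions):
            start, end = pos, pos + k
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        regions[doc_id] = merged
    return regions


docs = [
    "the quick brown fox jumps over the lazy dog",
    "a completely different sentence about retrieval",
]
index = build_index(docs, k=3)
regions = find_similar_regions("we saw a quick brown fox jumps today", index, k=3)
```

Here the query shares the run "quick brown fox jumps" with the first document, so that document gets a single merged region covering word offsets 1 through 4, and the second document has no hits.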

A Path Combining Strategy for Efficient Storing of XML Documents (XML 문서의 효율적인 저장을 위한 경로 통합 기법)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • Journal of Korea Multimedia Society
    • /
    • v.9 no.10
    • /
    • pp.1257-1265
    • /
    • 2006
  • As XML is increasingly used, the need for XML-related research in various fields is also growing. Many XML document management systems have been actively developed, especially for the storage, processing, and retrieval of XML documents. BitCube is a three-dimensional bitmap index system that can be manipulated efficiently and improves the performance of document retrieval. However, the size of the index increases rapidly when a new bit is added to an axis. This problem is caused by its three-dimensional memory structure of document, path, and word. In this paper, we suggest a path combining strategy for XML documents to solve this problem of BitCube. To reduce the size of the index, our approach combines sibling nodes that have the same ancestor path and transforms the word axis into a value axis. The method reduces the size of the index when the system composes the three-dimensional bitmap index. It also improves retrieval speed and uses storage space more efficiently.
