• Title/Summary/Keyword: Text Retrieval


Construction of Full-text Database by SGML (문서기술언어 SGML에 의한 전문 데이터베이스의 구축)

  • Kim, Chang-Bong
    • Journal of Information Management / v.27 no.4 / pp.35-56 / 1996
  • SGML (Standard Generalized Markup Language) and its application to full-text databases that include tables, figures, and pictures are explained. The structure of an SGML-based full-text database is defined by a DTD (document type definition) written in SGML, and the full text itself is described with generalized markup according to the DTD. This article explains how to represent document structure: hierarchical structure such as chapters, sections, and paragraphs, and non-hierarchical (referential) structure such as notes, tables, figures, and pictures. The merits of SGML, electronic publishing, retrieval systems, hypertext, and SGML tools are also described.


Semantic Image Retrieval Using RDF Metadata Based on the Representation of Spatial Relationships (공간관계 표현 기반 RDF 메타데이터를 이용한 의미적 이미지 검색)

  • Hwang, Myung-Gwun;Kong, Hyun-Jang;Kim, Pan-Koo
    • The KIPS Transactions:PartB / v.11B no.5 / pp.573-580 / 2004
  • As technology has advanced, people increasingly store and manage information on the web. Image data in particular accounts for a large share of that information, owing to advances in scanning and the spread of digital cameras and camera phones. However, most image retrieval systems still rely on text annotations, even as many images are created on the web every day. In this paper, we propose a new approach to semantic image retrieval that uses RDF metadata based on the representation of spatial relationships. First, we define new vocabularies to represent the spatial relationships between the objects in an image. Second, we write metadata about the image using RDF and the new vocabularies. Finally, we expect more accurate results from our image retrieval system.
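
A minimal sketch of the idea in this abstract, assuming image metadata is stored as subject-predicate-object triples in the style of RDF. The `sp:` predicate names, the file names, and both helper functions are hypothetical illustrations and are not the vocabulary or system defined in the paper. The sketch contrasts a plain keyword lookup with a lookup that also checks the spatial relationship.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical metadata: "sp:leftOf" and "sp:above" are illustrative
# predicate names, not the vocabulary defined in the paper.
METADATA: Dict[str, List[Triple]] = {
    "beach.jpg": [("person", "sp:leftOf", "boat"), ("sun", "sp:above", "sea")],
    "park.jpg": [("dog", "sp:leftOf", "tree"), ("bird", "sp:above", "tree")],
    "street.jpg": [("tree", "sp:leftOf", "dog")],
}


def keyword_search(terms: List[str], metadata: Dict[str, List[Triple]] = METADATA) -> List[str]:
    """Return images that mention every term, ignoring how objects are arranged."""
    hits = []
    for name, triples in metadata.items():
        objects = {t[0] for t in triples} | {t[2] for t in triples}
        if all(term in objects for term in terms):
            hits.append(name)
    return sorted(hits)


def find_images(subject: str, predicate: str, obj: str,
                metadata: Dict[str, List[Triple]] = METADATA) -> List[str]:
    """Return images whose metadata contains exactly the given triple."""
    wanted = (subject, predicate, obj)
    return sorted(name for name, triples in metadata.items() if wanted in triples)
```

With this toy data, `keyword_search(["dog", "tree"])` matches both `park.jpg` and `street.jpg`, while `find_images("dog", "sp:leftOf", "tree")` matches only `park.jpg`, which is the kind of precision gain the abstract anticipates.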

Learning Probabilistic Kernel from Latent Dirichlet Allocation

  • Lv, Qi;Pang, Lin;Li, Xiong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.6 / pp.2527-2545 / 2016
  • Measuring the similarity of given samples is a key problem in recognition, clustering, retrieval, and related applications. A number of approaches, e.g. kernel methods and metric learning, have been applied to this problem. The challenge of similarity learning is to find a similarity that is robust to intra-class variance and simultaneously selective to inter-class characteristics. We observe that the similarity measure can be improved if the data distribution and hidden semantic information are exploited in a more sophisticated way. In this paper, we propose a similarity learning approach for retrieval and recognition. The approach, termed LDA-FEK, derives a free energy kernel (FEK) from Latent Dirichlet Allocation (LDA). First, it trains LDA and constructs the kernel using the parameters and variables of the trained model. Then, the unknown kernel parameters are learned by a discriminative learning approach. The main contributions of the proposed method are twofold: (1) the method is computationally efficient and scalable, since the kernel parameters are determined in a staged way; (2) the method exploits the data distribution and semantic-level hidden information by means of LDA. To evaluate the performance of LDA-FEK, we apply it to image retrieval on two data sets and to text categorization on four popular data sets. The results show that our method performs competitively.

A Study of an Efficient Retrieval System Algorithm using a Text Mining (텍스트마이닝 기술을 이용한 효율적인 검색시스템 알고리즘에 대한 연구)

  • Kim, Je-Seok;Kim, Jang-Hyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.531-534 / 2005
  • Expanding network capacity and upgrading hardware to address network traffic and server processing speed currently present problems of their own, as do limited network resources and the growth of online information, which is exceeding the operating limits of existing information systems. This study proposes an architecture, an organically unified system of content optimized for retrieval, that adapts to users' varying points of view and to changes in the content of a document collection. It is based on the study of an algorithm that makes it easy to locate documents among large volumes of online data.


Survey of Automatic Query Expansion for Arabic Text Retrieval

  • Farhan, Yasir Hadi;Noah, Shahrul Azman Mohd;Mohd, Masnizah
    • Journal of Information Science Theory and Practice / v.8 no.4 / pp.67-86 / 2020
  • An information need is one of the main motivations for a person to use a search engine. Queries can represent very different information needs. Ironically, a query can be a poor representation of the information need, because users can find it difficult to express that need. Query Expansion (QE) is widely used to address this limitation. While QE can be considered a language-independent technique, recent findings have shown that in certain cases language plays an important role. Arabic has a particularly large vocabulary, rich in words with synonymous shades of meaning, and high morphological complexity. This paper therefore reviews QE for Arabic information retrieval, with the aim of identifying the current state of the art in this burgeoning area. In this review, we primarily discuss statistical QE approaches, which include document analysis, search and browse log analyses, and web knowledge analyses, in addition to semantic QE approaches, which use semantic knowledge structures to extract meaningful word relationships. Finally, we conclude that QE for the Arabic language requires further investigation and research because of the intricate nature of the language.
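
A minimal sketch of one statistical QE approach of the document-analysis kind mentioned in this abstract: terms that co-occur with the query in a set of feedback documents are appended to the query. The function name `expand_query`, its parameters, and the English toy data are assumptions made for illustration; the survey does not prescribe this procedure, and a real Arabic system would also need stop-word removal and morphological processing, which are omitted here.

```python
from collections import Counter
from typing import List


def expand_query(query: List[str], feedback_docs: List[List[str]], k: int = 2) -> List[str]:
    """Append the k terms that co-occur most often with the query terms."""
    query_set = set(query)
    counts: Counter = Counter()
    for doc in feedback_docs:
        # Only documents that share at least one term with the query contribute.
        if query_set & set(doc):
            counts.update(term for term in doc if term not in query_set)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return query + [term for term, _ in ranked[:k]]


# Toy, pre-tokenized documents (hypothetical data).
DOCS = [
    ["car", "engine", "repair"],
    ["car", "engine", "oil"],
    ["banana", "fruit"],
]
```

Calling `expand_query(["car"], DOCS, k=2)` adds the two terms that most often appear alongside "car" in the toy collection, while the unrelated third document contributes nothing.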

Music Lyrics Summarization Method using TextRank Algorithm (TextRank 알고리즘을 이용한 음악 가사 요약 기법)

  • Son, Jiyoung;Shin, Yongtae
    • Journal of Korea Multimedia Society / v.21 no.1 / pp.45-50 / 2018
  • This paper describes how to summarize music lyrics using the TextRank algorithm. The method condenses a song's lyrics to their most important lines. As a result, music can be recommended more effectively than by recommending on the basis of word counts.
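
A minimal sketch of sentence-level TextRank applied to lyric lines, assuming whitespace tokenization and the word-overlap similarity from the original TextRank formulation. The function names, the damping factor of 0.85, the fixed iteration count, and the sample lyrics are illustrative assumptions; the paper's own preprocessing (for example Korean morphological analysis) is not reproduced here.

```python
import math
from typing import List


def similarity(a: List[str], b: List[str]) -> float:
    """Shared words divided by the sum of log lengths, guarded for short lines."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:  # both lines have a single word
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(lines: List[str], d: float = 0.85, iters: int = 50) -> List[float]:
    """Score each line by weighted PageRank over the line-similarity graph."""
    tokens = [line.lower().split() for line in lines]
    n = len(tokens)
    w = [[similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out = [sum(row) for row in w]  # total edge weight leaving each line
    scores = [1.0] * n
    for _ in range(iters):
        scores = [
            (1 - d) + d * sum(w[j][i] / out[j] * scores[j]
                              for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(lines: List[str], top_k: int = 2) -> List[str]:
    """Return the top_k highest-scoring lines in their original order."""
    scores = textrank(lines)
    best = sorted(range(len(lines)), key=lambda i: -scores[i])[:top_k]
    return [lines[i] for i in sorted(best)]


# Hypothetical lyrics used only to exercise the sketch.
LYRICS = [
    "love me love me tonight",
    "dance with me tonight",
    "the rain falls down",
    "love me tonight",
]
```

In this toy input, the line that shares no words with the others receives only the baseline score `1 - d`, so `summarize(LYRICS, 2)` keeps lines from the repeated refrain instead.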

Document Structuring and Text Retrieval Using SGML (SGML을 이용한 문헌의 구조화 및 텍스트 검색에 관한 연구)

  • 오민경;정영미
    • Proceedings of the Korean Society for Information Management Conference / 1995.08a / pp.29-32 / 1995
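  • In this paper, we built a text retrieval system using SGML (Standard Generalized Markup Language). SGML is a generalized markup language that treats a document as a set of object units called document elements and expresses the relationships among those elements. Using SGML in a text retrieval system therefore makes it possible to structure documents and to organize and retrieve full text efficiently.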


Retrieval of Player Event in Golf Videos Using Spoken Content Analysis (음성정보 내용분석을 통한 골프 동영상에서의 선수별 이벤트 구간 검색)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.674-679 / 2009
  • This paper proposes a method of player event retrieval in golf videos that combines two functions: detection of player names in speech and detection of sound events in audio. The system consists of an indexing module and a retrieval module. At indexing time, audio segmentation and noise reduction are applied to the audio stream demultiplexed from the golf videos. The noise-reduced speech is then fed into a speech recognizer, which outputs spoken descriptors. The player names and sound events are indexed by the spoken descriptors. At search time, a text query is converted into phoneme sequences. The lists for each query term are retrieved through a description matcher to identify full and partial phrase hits. For the retrieval of player names, this paper compares the results of word-based, phoneme-based, and hybrid approaches.

Ontology based Retrieval System for Cultural Assets Using Hybrid Text-Sketch Queries (혼합형 질의 방법에 의한 온톨로지 기반 유물 검색 시스템)

  • Cheon Hyeon-Jae;Baek Seung-Jae;Lee Hong-Chul
    • Journal of the Korea Society of Computer and Information / v.10 no.5 s.37 / pp.17-26 / 2005
  • With the rapid growth of information, research on efficient information retrieval is increasing. Most retrieval systems for domestic cultural assets on the web have adopted a keyword-based search method. Those systems require users to know exact information about a cultural asset, such as its name or keywords. However, it is not easy to search for a cultural asset with little information or only a memory of its shape. In this paper, we propose a retrieval system for cultural assets that uses both ontology-based and sketch-based search methods to solve the problems of existing systems. Our retrieval system allows users to use both text and sketches in a query, regardless of the type of information they have about a cultural asset, and to search within results using the ontology.
