• Title/Summary/Keyword: Multimedia Document Retrieval

Search Result 32, Processing Time 0.059 seconds

Design and Implementation of a SGML/XML Document Retrieval System (SGML/XML 검색 시스템의 설계 및 구현)

  • Ko, Seung-Kyu;Cho, Seung-Ki;Choy, Yoon-Chul;Koh, Kyun
    • Proceedings of the Korea Multimedia Society Conference / 2000.11a / pp.99-102 / 2000
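  • SGML, proposed as a standard for document interchange between heterogeneous systems, is used in many fields, such as CALS (Commerce At Light Speed), EC (Electronic Commerce), EDI (Electronic Data Interchange), and digital libraries, because it can represent the structural information of documents. As SGML spreads across these fields, there is a growing need for a retrieval system that can efficiently find the desired documents among many SGML documents. Our laboratory previously developed an SGML document management system that supports basic structure retrieval. However, because that system does not support structure retrieval effectively, in this study we define the functions of structure retrieval and a new structure query language that supports them. We also define a structure index to support structure retrieval effectively. Finally, we implemented and tested three structure retrieval methods and built the retrieval system on the hybrid method, which performed best among them.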

XML Repository System Using DBMS and IRS

  • Kang, Hyung-Il;Yoo, Jae-Soo;Lee, Byoung-Yup
    • International Journal of Contents / v.3 no.3 / pp.6-14 / 2007
  • In this paper, we design and implement an XML Repository System (XRS) that exploits the advantages of both DBMSs and IRSs. Our scheme uses BRS to support full-text indexing and content-based queries efficiently, and ORACLE to store XML documents, multimedia data, DTDs, and structure information. We design databases to manage XML documents that include audio, video, and images as well as text. We employ the non-composition model when storing XML documents in ORACLE. We represent structure information as ETID (Element Type Id), SORD (Sibling ORDer), and SSORD (Same Sibling ORDer). ETID is a unique value assigned to each element of the DTD. SORD represents the order among sibling nodes, and SSORD represents the order among sibling nodes with the same element type. To show the superiority of XRS, we perform experiments on document loading time, document extraction time, and content retrieval time. The experiments show that XRS outperforms existing XML document management systems and that it supports various types of queries.
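
The ETID/SORD/SSORD labeling described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `label_elements` is hypothetical, and ETIDs are assigned in first-seen order from the document itself, whereas the abstract assigns them from the DTD.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def label_elements(xml_text):
    """Return (etid_table, rows); each row is (path, tag, ETID, SORD, SSORD)."""
    root = ET.fromstring(xml_text)
    etid_table = {}  # element type name -> ETID (assumption: first-seen order)
    rows = []

    def etid_of(tag):
        if tag not in etid_table:
            etid_table[tag] = len(etid_table) + 1
        return etid_table[tag]

    def visit(node, path, sord, ssord):
        rows.append((path, node.tag, etid_of(node.tag), sord, ssord))
        same_tag_count = defaultdict(int)
        for position, child in enumerate(node, start=1):
            same_tag_count[child.tag] += 1
            nth = same_tag_count[child.tag]
            # SORD = position among all siblings,
            # SSORD = position among siblings with the same tag.
            visit(child, f"{path}/{child.tag}[{nth}]", position, nth)

    visit(root, f"/{root.tag}[1]", 1, 1)
    return etid_table, rows
```

For example, in `<sec><p/><fig/><p/></sec>` the second `p` would get SORD 3 (third child overall) and SSORD 2 (second `p`), which is enough to rebuild sibling order when the elements are stored as separate rows.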

Data Model, Query Language, and Indexing Scheme for Structured Video Documents (구조화된 비디오 문서의 데이터 모델 및 질의어와 색인 기법)

  • 류은숙;이규철
    • Journal of Korea Multimedia Society / v.1 no.1 / pp.1-17 / 1998
  • Video information is an important component of multimedia systems such as digital libraries, the World-Wide Web (WWW), and Video-On-Demand (VOD) services. Video information inherently has a hierarchical document structure, so it is called a "structured video document" in this paper. This paper proposes a data model, a query language, and an indexing scheme for structured video documents in order to store, retrieve, and share video documents efficiently. To represent structured video documents, we use object-oriented data modeling, since the hierarchical structure information can be modeled as complex objects. We also define object types for the structure information. Our query language supports not only content-based retrieval but also structure queries, which are queries based on the structure of video documents, and queries on spatial/temporal relations in video documents. To perform structure queries efficiently while reducing the storage overhead of indices, an optimized inverted index structure is proposed.

XML-based Retrieval System for SCORM-based Virtual Learning Contents (SCORM 기반의 XML 학습 컨텐츠 검색 시스템)

  • Choi, Byung-Uk;Song, Mi-Sook;Cho, Jung-Won
    • The Journal of Korean Association of Computer Education / v.6 no.1 / pp.9-17 / 2003
  • XML (eXtensible Markup Language), the next-generation Internet standard language, is easy to reuse and restructure in other computing environments because it separates data, presentation, and structure. In this paper, we implement an efficient retrieval system for general users, restricting the XML documents to multimedia learning content for a virtual education system. The system design is based on the SCO metadata unit defined in SCORM, the proposed virtual education standard. Each XML document has three indexes: keyword, element, and attribute. The user interface provides an element retrieval screen, which makes it possible to retrieve data without prior knowledge of the DTD. The system also offers the user various result formats, such as XML and HTML, by restructuring the retrieval result through XML-QL and XSL, respectively.
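
As a rough illustration of the three per-document indexes named in this abstract (keyword, element, attribute), the sketch below builds three inverted indexes over a small collection. The function name `build_indexes` and the sample element names are hypothetical and do not follow the actual SCORM metadata schema or the authors' index layout.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_indexes(docs):
    """docs: dict of doc_id -> XML string. Returns three inverted indexes."""
    keyword_index = defaultdict(set)    # lowercased word -> doc ids
    element_index = defaultdict(set)    # element name -> doc ids
    attribute_index = defaultdict(set)  # (attribute name, value) -> doc ids
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        # Keyword index: every word in the document's text content.
        for word in re.findall(r"\w+", " ".join(root.itertext())):
            keyword_index[word.lower()].add(doc_id)
        # Element and attribute indexes: one entry per distinct name or pair.
        for node in root.iter():
            element_index[node.tag].add(doc_id)
            for name, value in node.attrib.items():
                attribute_index[(name, value)].add(doc_id)
    return keyword_index, element_index, attribute_index
```

Listing the keys of `element_index` is one simple way to populate an element selection screen, so a user can pick element names without knowing the DTD in advance.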

Path Combining System of XML Documents based on Relational DBMS (관계형 DBMS 기반의 XML 문서 경로 통합 시스템)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • Journal of Korea Multimedia Society / v.11 no.4 / pp.415-422 / 2008
  • With the increasing use of XML, considerable research is being conducted on XML document management systems for more efficient storage and searching of XML documents. Depending on the base system, these studies can be classified as object-oriented DBMS (OODBMS) based or relational DBMS (RDBMS) based. OODBMS-based systems are better suited to reflect the structure of XML documents than RDBMS-based ones. However, using an XML parser to map the contents of documents to relational tables is a better way to construct a stable and effective XML document management system. The proposed X-Binder system uses an RDBMS-based inverted index, which guarantees high search speed but wastes considerable storage space. To avoid this, the proposed system incorporates a path combining module agent that combines paths with sibling relations and stores them in a single row. Performance evaluation showed that the proposed system reduces storage waste and search time.
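
One possible reading of "combining paths with sibling relations into a single row" is sketched below: leaf paths that share a parent are merged into one row keyed by the parent path. This is an assumption made for illustration, not the X-Binder algorithm itself, and `combine_sibling_paths` is a hypothetical name.

```python
def combine_sibling_paths(paths):
    """Merge paths that share a parent into (parent_path, 'leaf1,leaf2,...') rows."""
    combined = {}  # parent path -> leaf names, in first-seen order
    for path in paths:
        parent, _, leaf = path.rpartition("/")
        leaves = combined.setdefault(parent, [])
        if leaf not in leaves:
            leaves.append(leaf)
    return [(parent, ",".join(leaves)) for parent, leaves in combined.items()]
```

With the input `["/book/title", "/book/author", "/book/chapter/title", "/book/chapter/para"]`, four path rows collapse into two, which shows where the storage saving would come from under this reading.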

Design and Implementation of a Structure and Content-based Multimedia Document Retrieval System (구조 및 내용-기반 멀티미디어 문서검색 시스템의 설계 및 구현)

  • Jin, Du-Seok;Lee, Jeong-Jae;Chang, Jae-Woo
    • The Transactions of the Korea Information Processing Society / v.7 no.11 / pp.3341-3355 / 2000
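  • As the number of multimedia documents has recently grown exponentially, it has become necessary to develop a multimedia document retrieval system that can store and retrieve the multimedia documents users need more effectively. In this paper, we design and implement a system that efficiently retrieves documents defined in XML based on both document structure and image content. To support efficient structure-based retrieval, we implement a structure index using the o2store storage system. To support content-based retrieval, we implement an efficient high-dimensional index structure based on the X-tree. Finally, we evaluate the performance of the implemented multimedia document retrieval system in terms of retrieval time, storage time, and additional storage space.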

Design of Multimedia Document Retrieval System Using Relations between Media (미디어간 상호 연관성을 이용한 멀티미디어 문서 검색 시스템의 설계)

  • 이성환;유채곤;이원호;황치정
    • Proceedings of the Korean Information Science Society Conference / 1998.10b / pp.274-276 / 1998
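  • Multimedia is widely used in many fields as a means of conveying information effectively. Research is therefore needed on techniques for efficiently storing, retrieving, and presenting multimedia documents. The media used within a multimedia document, such as audio, video, image, and text, are related not only spatially and temporally within the document but also in content. In this paper, we extract the features and relations of the media used in multimedia documents, apply segmentation techniques suited to the characteristics of each medium in order to manage the media efficiently, and design a system that stores, retrieves, and presents them while taking their content relations into account.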

Jointly Image Topic and Emotion Detection using Multi-Modal Hierarchical Latent Dirichlet Allocation

  • Ding, Wanying;Zhu, Junhuan;Guo, Lifan;Hu, Xiaohua;Luo, Jiebo;Wang, Haohong
    • Journal of Multimedia Information System / v.1 no.1 / pp.55-67 / 2014
  • Image topic and emotion analysis is an important component of online image retrieval, which has become very popular in the rapidly growing social media community. However, due to the gaps between images and texts, there is very limited work in the literature on detecting an image's topics and emotions in a unified framework, although topics and emotions are two levels of semantics that often work together to describe an image comprehensively. In this work, we propose a unified model, the Joint Topic/Emotion Multi-Modal Hierarchical Latent Dirichlet Allocation (JTE-MMHLDA) model, which extends the previous LDA, mmLDA, and JST models to capture topic and emotion information at the same time from heterogeneous data. Specifically, a two-level graphical structured model is built to share topics and emotions across the whole document collection. Experimental results on a Flickr dataset indicate that the proposed model efficiently discovers images' topics and emotions. It outperforms the text-only system by 4.4% and the vision-only system by 18.1% in topic detection, and outperforms the text-only system by 7.1% and the vision-only system by 39.7% in emotion detection.

Design and Implementation of Web Crawler utilizing Unstructured data

  • Tanvir, Ahmed Md.;Chung, Mokdong
    • Journal of Korea Multimedia Society / v.22 no.3 / pp.374-385 / 2019
  • A Web crawler is a program commonly used by search engines to find new content on the internet. The use of crawlers has made the web easier for users. In this paper, we structure unstructured data in order to collect data from web pages. Our system can select the words near a keyword across more than one document in an unstructured way. Neighboring data for the keyword were collected through word2vec. The system goal is filtered at the data acquisition level and for a large taxonomy. The main problem in text taxonomy is how to improve classification accuracy. To improve the accuracy, we propose a new TF-IDF weighting method. We modified the TF algorithm to calculate the accuracy of unstructured data. Finally, our system proposes a competent web page search crawling algorithm, derived from TF-IDF and the RL Web search algorithm, to enhance the efficiency of searching for relevant information. This paper also attempts to research and examine the working nature of crawlers and crawling algorithms in search engines for efficient information retrieval.
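
For readers unfamiliar with the baseline that this abstract modifies, here is a minimal sketch of standard TF-IDF weighting. It is the textbook formulation (relative term frequency times the natural log of inverse document frequency), not the new weighting method the authors propose, whose details the abstract does not give. The function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document gets weight 0 under this formulation, which is why rare terms near a keyword tend to dominate the ranking.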

Document Ranking of Web Document Retrieval Systems (웹 정보검색 시스템의 문서 순위 결정)

  • An, Dong-Un;Kang, In-Ho
    • Journal of Information Management / v.34 no.2 / pp.55-66 / 2003
  • The Web is rich in various sources of information. It contains document content, multimedia data, shopping materials, and so on. Because web document collections are massive and heterogeneous, users want to find various types of target pages. We can classify user queries into three categories according to the user's intent: content search, site search, and service search. In this paper, we show that different strategies are needed to meet the needs of a user. We also show the properties of content information, link information, and URL information for each class of user query. In content search, content information gave good results, but combining it with link information and URL information decreased performance. In site search, combining link information and URL information increased performance.
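
The idea of choosing a ranking strategy by query class can be sketched as a weighted combination of content, link, and URL evidence. Everything below is hypothetical: the names `WEIGHTS` and `rank` and all numeric weights are illustrative and not taken from the paper, and the service-search class is omitted because the abstract reports no result for it.

```python
# Hypothetical weights for (content, link, url) evidence per query class.
WEIGHTS = {
    "content": (1.0, 0.0, 0.0),  # content evidence alone, as the abstract suggests
    "site": (0.5, 0.3, 0.2),     # illustrative mix; values are not from the paper
}


def rank(pages, query_class):
    """pages: dict of page -> (content, link, url) scores. Returns pages, best first."""
    weights = WEIGHTS[query_class]

    def score(page):
        return sum(w * s for w, s in zip(weights, pages[page]))

    return sorted(pages, key=score, reverse=True)
```

Under these assumed weights, a deep page with strong text match wins a content search, while a site's entry page with strong link and URL evidence wins a site search.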