• Title/Summary/Keyword: Document Databases

A Digital Library Prototype - Digital Repository and Diverse Collections (디지털도서관 프로토타입의 구축 -디지털 리포지토리와 컬렉션을 중심으로)

  • Choi Won-Tae
    • Proceedings of the Korea Database Society Conference / 1998.09a / pp.383-394 / 1998
  • This article is an overview of the digital library project, indicating what roles Korea's diverse digital collections may play. Our digital library prototype has a simple architecture, consisting of digital repositories, filters, indexing and searching, and clients. Digital repositories include various types of materials and databases. The role of filters is to recognize the format of a document collection and mark the structural components of each of its documents. We are using a database management system (ORACLE and ConText) supporting user-defined functions and access methods that allows us to easily incorporate new object analysis, structuring, and indexing technology into a repository.

GUI-based HTML2XML Wrapper using Inductive Reasoning (학습 추론을 이용한 GUI 기반의 HTML2XML 래퍼)

  • Jang, Mun-Seong;Jeong, Jae-Mok;Choe, Il-Hwan;Kim, Hyeong-Ju
    • Journal of KIISE:Databases / v.29 no.4 / pp.311-320 / 2002
  • A 'wrapper' is a module that extracts and processes information from a specified data source according to a pre-composed extraction rule. An 'HTML wrapper for XML' extracts information from a web source in the form of an XML document. Since composing extraction rules is a repetitive and tedious job, it should be made as easy and fast as possible. This paper presents a method that minimizes the composition effort by integrating GUI-based training and scripting.

Secure Healthcare Management: Protecting Sensitive Information from Unauthorized Users

  • Ko, Hye-Kyeong
    • International Journal of Internet, Broadcasting and Communication / v.13 no.1 / pp.82-89 / 2021
  • Recently, the importance of security for published documents has been increasing across applications. This paper deals with data publishing in which publishers must state the sensitive information that they need to protect. If a document containing such sensitive information is accidentally posted, users can use common-sense reasoning to infer unauthorized information. Recent studies of peer-to-peer databases have examined data security for various distinct groups. In this paper, we propose a security framework that fundamentally blocks user inference about sensitive information that may be leaked through XML constraints and prevents sensitive information from leaking to general users. The proposed framework protects disclosed sensitive information through encryption technology. Moreover, the proposed framework is query-view secure without any of the three types of XML constraints. In the experiments, the proposed framework prevented leakage of user information through data inference more effectively than the existing method, a result that is also supported mathematically.

A Digital Library Prototype for Access to Diverse Collections (다양한 장서 접근을 위한 디지털 도서관의 프로토타입 구축)

  • Choi Won-Tae
    • Journal of the Korean Society for Library and Information Science / v.32 no.2 / pp.295-307 / 1998
  • This article is an overview of the digital library project, indicating what roles Korea's diverse digital collections may play. Our digital library prototype has a simple architecture, consisting of digital repositories, filters, indexing and searching, and clients. Digital repositories include various types of materials and databases. The role of filters is to recognize the format of a document collection and mark the structural components of each of its documents. We are using a database management system (ORACLE and ConText) supporting user-defined functions and access methods that allows us to easily incorporate new object analysis, structuring, and indexing technology into a repository. Clients can be considered browsers or viewers designed for different document data types, such as image, audio, video, SGML, PDF, and KORMARC. The combination of navigational tools supports a variety of approaches to identifying collections and browsing or searching for individual items. The search interface was implemented using HTML forms and the World Wide Web's CGI mechanism.

XPERT : An XML Query Processing System using Relational Databases (관계형 DBMS를 이용한 XML 질의 처리 시스템 XPERT의 개발)

  • Jung Min-Kyoung;Hong Dong-Kweon
    • The KIPS Transactions:PartD / v.13D no.1 s.104 / pp.1-10 / 2006
  • This paper introduces the development of XPERT (XML Query Processing Engine using Relational Technologies), which is based on the relational model. Our system uses a decomposed approach to store XML files in relational tables. XML queries are translated into SQL statements according to the table schema and then sent to the relational DBMS to retrieve the results. Our translation scheme first produces an AST (Abstract Syntax Tree) by analyzing XQuery expressions, and then generates the appropriate SQL statements while traversing the AST. The translated SQL reduces the number of joins by using path information and uses Dewey numbers to preserve the original document order among components in the XML. In addition, we propose efficient algorithms for XPath and XQuery translation. Finally, we present the implementation of our prototype system for functional evaluation.
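
The decomposed storage idea in this abstract can be illustrated with a minimal sketch. It is not XPERT's actual schema or translator: the `node` table, its columns, and the helper names `shred` and `dewey_key` are assumptions made for illustration. Each element becomes one row carrying its root-to-node path and a Dewey number, so a simple path query turns into a single selection on the path column (no joins), and the Dewey number restores document order.

```python
import sqlite3
import xml.etree.ElementTree as ET

def shred(xml_text):
    """Return (dewey, path, text) rows for every element, in document order."""
    root = ET.fromstring(xml_text)
    rows = []

    def visit(elem, dewey, parent_path):
        path = f"{parent_path}/{elem.tag}"
        label = ".".join(str(n) for n in dewey)
        rows.append((label, path, (elem.text or "").strip()))
        # The i-th child of a node labeled d gets the label d.i
        for i, child in enumerate(elem, start=1):
            visit(child, dewey + [i], path)

    visit(root, [1], "")
    return rows

def dewey_key(label):
    """Compare Dewey numbers component-wise, so 1.10 sorts after 1.2."""
    return tuple(int(part) for part in label.split("."))

doc = ("<lib><book><title>A</title><author>X</author></book>"
       "<book><title>B</title></book></lib>")

conn = sqlite3.connect(":memory:")
# Hypothetical single-table schema, chosen only for this sketch
conn.execute("CREATE TABLE node (dewey TEXT, path TEXT, value TEXT)")
conn.executemany("INSERT INTO node VALUES (?, ?, ?)", shred(doc))

# The path /lib/book/title becomes one selection on the path column
rows = conn.execute(
    "SELECT dewey, value FROM node WHERE path = ?", ("/lib/book/title",)
).fetchall()
titles = [value for _, value in sorted(rows, key=lambda r: dewey_key(r[0]))]
```

Storing the full path per row trades storage space for fewer joins; a real system would also need an ordering key that the DBMS itself can sort on, which this sketch leaves to the client.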

A Change Detection Technique Supporting Nested Blank Nodes of RDF Documents (내포된 공노드를 포함하는 RDF 문서의 변경 탐지 기법)

  • Lee, Dong-Hee;Im, Dong-Hyuk;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.34 no.6 / pp.518-527 / 2007
  • Finding the differences between RDF documents is an important issue because RDF documents change frequently. When RDF documents contain blank nodes, change detection needs a matching technique for blank nodes. Blank nodes have a nested form and are used in most RDF documents. An RDF document can be modeled as a graph that contains many subtrees, so change detection can be treated as a minimum-cost tree matching problem. In this paper, we propose a change detection technique for RDF documents that uses a labeling scheme for blank nodes. We also propose a method for improving the efficiency of general triple matching that uses predicate grouping and partitioning. In experiments, our approach was more accurate and faster than previous approaches.
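
A rough sketch of the two ideas named in this abstract, under stated assumptions: it is not the authors' algorithm. Triples are plain string tuples, blank nodes are strings starting with `_:`, and each blank node is relabeled by a signature built from its nested content (a simplification of a labeling scheme, not minimum-cost tree matching). Triples are then partitioned by predicate so that each group is compared only with its counterpart. The names `canonicalize` and `delta` are hypothetical.

```python
from collections import defaultdict

def is_blank(term):
    return term.startswith("_:")

def canonicalize(triples):
    """Replace each blank node with a label derived from its nested content."""
    outgoing = defaultdict(list)
    for s, p, o in triples:
        outgoing[s].append((p, o))

    def label(term, seen=()):
        if not is_blank(term):
            return term
        if term in seen:  # guard against cyclic blank-node structures
            return "_:cycle"
        parts = sorted((p, label(o, seen + (term,))) for p, o in outgoing[term])
        return "_:[" + ";".join(f"{p}={o}" for p, o in parts) + "]"

    return {(label(s), p, label(o)) for s, p, o in triples}

def delta(old, new):
    """Return (added, deleted) triples, compared one predicate group at a time."""
    groups_old, groups_new = defaultdict(set), defaultdict(set)
    for t in canonicalize(old):
        groups_old[t[1]].add(t)
    for t in canonicalize(new):
        groups_new[t[1]].add(t)
    added, deleted = set(), set()
    for predicate in groups_old.keys() | groups_new.keys():
        added |= groups_new[predicate] - groups_old[predicate]
        deleted |= groups_old[predicate] - groups_new[predicate]
    return added, deleted

old = [("ex:a", "ex:addr", "_:b1"), ("_:b1", "ex:city", "Seoul")]
new = [("ex:a", "ex:addr", "_:x9"), ("_:x9", "ex:city", "Seoul"),
       ("ex:a", "ex:name", "Kim")]
added, deleted = delta(old, new)
```

The blank node is renamed from `_:b1` to `_:x9` between versions, yet only the new `ex:name` triple is reported. One limitation of this simplification: two distinct blank nodes with identical content receive the same label and collapse into one.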

A Keyword-based Filtering Technique of Document-centric XML using NFA Representation (NFA 표현을 사용한 문서-중심적 XML의 키워드 기반 필터링 기법)

  • Lee, Kyoung-Han;Park, Seog
    • Journal of KIISE:Databases / v.33 no.5 / pp.437-452 / 2006
  • In this paper, we propose an extended XPath specification that includes the special matching character '%' used in the LIKE operation of SQL, in order to address the difficulty of writing queries that filter element contents with the existing XPath specification. We also present a novel technique, called Pfilter, for filtering a collection of document-centric XML documents; it exploits the extended XPath specification. By sharing the common prefix characters of the operands in value-based predicates, Pfilter improves the performance of processing those predicates. We report several performance studies that compare Pfilter with Yfilter in terms of efficiency and scalability using multi-query processing time (MQPT), and we report results for inserting, deleting, and processing value-based predicates. In conclusion, our approach provides a core algorithm for evaluating the contains() function of XPath queries in previous XML filtering research, and a foundation for building XML-based distributed information systems.
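
The prefix-sharing idea can be sketched with a character trie. This is a simplification for illustration, not Pfilter's NFA: the class name `PrefixTrie` and the query identifiers are assumptions, and only exact operands and operands with a single trailing '%' are handled. Operands that begin with the same characters share trie nodes, so one pass over an element's text evaluates all registered predicates at once.

```python
class PrefixTrie:
    """Trie over predicate operands; a trailing '%' means 'starts with'."""

    def __init__(self):
        self.children = {}
        self.prefix_ids = []  # queries whose operand is "<path to this node>%"
        self.exact_ids = []   # queries whose operand equals the path to this node

    def add(self, pattern, query_id):
        is_prefix = pattern.endswith("%")
        literal = pattern[:-1] if is_prefix else pattern
        node = self
        for ch in literal:
            node = node.children.setdefault(ch, PrefixTrie())
        (node.prefix_ids if is_prefix else node.exact_ids).append(query_id)

    def match(self, text):
        """Return the ids of all queries whose operand matches text."""
        node = self
        matched = list(node.prefix_ids)
        for ch in text:
            node = node.children.get(ch)
            if node is None:
                return matched  # no registered operand continues this way
            matched.extend(node.prefix_ids)
        matched.extend(node.exact_ids)
        return matched

trie = PrefixTrie()
trie.add("data%", "q1")
trie.add("database%", "q2")
trie.add("database", "q3")
trie.add("xml%", "q4")
```

With these four predicates, `trie.match("database systems")` walks the shared `data` prefix once and reports both `q1` and `q2`. Leading or embedded '%' wildcards would need a richer automaton than this trie.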

XML Document Filtering based on Segments (세그먼트 기반의 XML 문서 필터링)

  • Kwon, Joon-Ho;Rao, Praveen;Moon, Bong-Ki;Lee, Suk-Ho
    • Journal of KIISE:Databases / v.35 no.4 / pp.368-378 / 2008
  • In recent years, publish-subscribe (pub-sub) systems based on XML document filtering have received much attention. In a typical pub-sub system, subscribed users specify their interests in profiles expressed in the XPath language, and each new piece of content is matched against the user profiles so that it is delivered only to the interested subscribers. As the number of subscribed users and their profiles can grow very large, the scalability of the system is critical to the success of pub-sub services. In this paper, we propose a fast and scalable XML filtering system called SFiST, which is an extension of the FiST system. Sharable segments are extracted from twig patterns and stored in the hash-based Segment Table of the SFiST system. Segments are used to represent user profiles as Terse Sequences, which are stored in the Compact Segment Index during filtering. Our experimental study shows that the SFiST system performs better than the FiST system in terms of filtering time and memory usage.
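
The matching step of a pub-sub system, as described in this abstract, can be shown in its naive form. This sketch is not SFiST or FiST: it evaluates every profile separately with the limited XPath subset of Python's standard `xml.etree.ElementTree`, and the function name, subscriber names, and profiles are made up for illustration.

```python
import xml.etree.ElementTree as ET

def interested_subscribers(xml_text, profiles):
    """Return, in sorted order, the subscribers whose profile matches the document."""
    root = ET.fromstring(xml_text)
    return sorted(
        user for user, path in profiles.items()
        if root.find(path) is not None  # one separate evaluation per profile
    )

# Hypothetical profiles, written in ElementTree's XPath subset
profiles = {
    "alice": ".//book/title",
    "bob": ".//cd",
    "carol": ".//book[@lang='ko']",
}

doc = "<catalog><book lang='en'><title>T</title></book></catalog>"
matched = interested_subscribers(doc, profiles)
```

Because each profile is evaluated on its own, the cost grows linearly with the number of profiles; sharing common parts of the profiles, as the abstract describes, is aimed at exactly that cost.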

A Study on Processing XML Documents (XML 문서 처리에 관한 연구)

  • Kim, Tae Gwon
    • Journal of KIISE / v.43 no.4 / pp.489-496 / 2016
  • XML can effectively express structured or semi-structured data as well as relational databases. XQuery is a query language for retrieving information from such XML documents. In this paper, an XQuery composer is designed and implemented; it provides an API for XQuery processors, through which a suitable processor is registered. The composer shows the query results immediately after the processor evaluates them. Because the composer contains an XQuery parser, it can compose XQuery effectively using a variety of dialog boxes designed around the XQuery grammar. A dialog box is associated with a clause region, which is the region of the parse tree on which an algebra operation acts. The composer makes it easy to compose path expressions for an XML document because it displays the element tree from the DTD graphically. Path expressions are composed automatically by marking elements in the structural hierarchy and by partially specifying the predicate of an element.

A Circle Labeling Scheme without Re-labeling for Dynamically Updatable XML Data (동적으로 갱신가능한 XML 데이터에서 레이블 재작성하지 않는 원형 레이블링 방법)

  • Kim, Jin-Young;Park, Seog
    • Journal of KIISE:Databases / v.36 no.2 / pp.150-167 / 2009
  • XML has become the new standard for storing, exchanging, and publishing data over both the internet and ubiquitous data stream environments. As demand for efficient handling of XML documents grows, labeling schemes have become an important topic in data storage. Recently proposed labeling schemes reflect the dynamic XML environment, which itself motivates the search for an efficient labeling scheme. However, previously proposed labeling schemes have several problems: 1) inserting a new node into the XML document triggers re-labeling of pre-existing nodes, and 2) they need more memory to store the complete labels. In this paper, we introduce a new labeling scheme called the Circle Labeling Scheme (CLS). In CLS, XML documents are represented in a circular form, and labels are stored efficiently using the concepts of Rotation Number and Parent Circle/Child Circle. The concept of Radius is applied to support insertion of new nodes at arbitrary positions in the tree. This eliminates the need to re-label existing nodes or to increase label length, and mitigates conflicts with existing labels. A detailed experimental study demonstrates the efficiency of CLS.