• Title/Summary/Keyword: Relational schema

An Efficient Reasoning Method for OWL Properties using Relational Databases (관계형 데이터베이스를 이용한 효율적인 OWL 속성 추론 기법)

  • Lin, Jiexi;Lee, Ji-Hyun;Chung, Chin-Wan
    • Journal of KIISE:Databases / v.37 no.2 / pp.92-103 / 2010
  • The Web Ontology Language (OWL) has become the W3C recommendation for publishing and sharing ontologies on the Semantic Web. To derive hidden information from OWL data, a number of OWL reasoners have been proposed. Since OWL reasoners are memory-based, they cannot handle large-sized OWL data. To overcome this scalability problem, RDBMS-based systems have been proposed that store OWL data in a database and perform reasoning with the help of the database. However, they do not consider complete reasoning on all types of properties defined in OWL, and the database schemas they use are ineffective for reasoning. In addition, they do not manage updates to OWL data, which occur frequently in real applications. In this paper, we compare the database schemas used by RDBMS-based systems and propose an improved schema for efficient reasoning. Also, to support reasoning for all the types of properties defined in OWL, we propose a complete and efficient reasoning algorithm. Furthermore, we suggest efficient approaches to managing the updates that may occur on OWL data. Experimental results show that our schema improves performance in OWL data storage and reasoning, and that our approaches to managing updates are more efficient than the existing approaches.
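
As a concrete illustration of the RDBMS-based reasoning the abstract describes, here is a minimal sketch (not the paper's algorithm): a single triple table and an iterated self-join that computes the closure of a transitive OWL property. The table layout and the property name are assumptions.

```python
import sqlite3

# Triple table inside the RDBMS; UNIQUE lets INSERT OR IGNORE skip duplicates.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT, UNIQUE(s, p, o))")
cur.executemany("INSERT INTO triples VALUES (?, ?, ?)",
                [("A", "partOf", "B"), ("B", "partOf", "C"),
                 ("C", "partOf", "D")])

# 'partOf' stands in for a transitive OWL property: join the table with
# itself until no new triples appear (a naive fixpoint, for illustration).
while True:
    cur.execute("""
        INSERT OR IGNORE INTO triples
        SELECT t1.s, t1.p, t2.o
        FROM triples t1 JOIN triples t2
          ON t1.o = t2.s AND t1.p = t2.p
        WHERE t1.p = 'partOf'
    """)
    if cur.rowcount == 0:   # fixpoint: the last pass inserted nothing new
        break

print(cur.execute("SELECT * FROM triples ORDER BY s, o").fetchall())
```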

Storage and Retrieval of XML Documents Without Redundant Path Information (경로정보의 중복을 제거한 XML 문서의 저장 및 질의처리 기법)

  • Lee Hiye-Ja;Jeong Byeong-Soo;Kim Dae-Ho;Lee Young-Koo
    • The KIPS Transactions:PartD / v.12D no.5 s.101 / pp.663-672 / 2005
  • This paper proposes an approach that removes the redundancy of path information and uses an inverted index as an efficient way to store a large volume of XML documents and retrieve the desired information from them. An XML document is decomposed into nodes based on its tree structure and stored in relational tables according to node type, together with the path information from the root to each node. Existing methods using path information store data for all element paths, which degrades retrieval performance as the data volume grows. Our approach stores data only for leaf element paths, excluding internal element paths. Because the inverted index is built from leaf element paths only, the number of posting lists per keyword is smaller than in the existing methods. For the storage and retrieval of XML data, our approach requires neither the XML schema information of the documents nor any extension of the relational database. We demonstrate the better performance of our approach over the existing approaches within the scope of our experiment.
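
A minimal sketch of the leaf-path idea described above, with illustrative names (not the paper's implementation): only leaf element paths are stored, and the inverted index is keyed on those paths.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

doc = "<lib><book><title>XML Storage</title><year>2005</year></book></lib>"
root = ET.fromstring(doc)

rows = []                      # (leaf path, value) pairs only
def walk(elem, path):
    children = list(elem)
    if not children:           # leaf element: record its full path and value
        rows.append((path, (elem.text or "").strip()))
    for child in children:
        walk(child, path + "/" + child.tag)

walk(root, "/" + root.tag)

# Inverted index: leaf element path -> posting list of row ids.
index = defaultdict(list)
for rid, (path, _) in enumerate(rows):
    index[path].append(rid)

print(rows)   # [('/lib/book/title', 'XML Storage'), ('/lib/book/year', '2005')]
print(dict(index))
```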

Development of Big Data System for Energy Big Data (에너지 빅데이터를 수용하는 빅데이터 시스템 개발)

  • Song, Mingoo
    • KIISE Transactions on Computing Practices / v.24 no.1 / pp.24-32 / 2018
  • This paper proposes a Big Data system for energy Big Data aggregated in real time from industrial and public sources. The constructed system is based on Hadoop, with the Spark framework applied on top for Big Data processing, which supports in-memory distributed computing. The paper focuses on Big Data in the form of heat energy for district heating and deals with methodologies for storing, managing, processing, and analyzing the aggregated Big Data in real time while considering the properties of energy input and output. The Big Data influx is stored and managed according to the relational database schema designed for the system, and the stored Big Data is processed and analyzed according to the set objectives. The paper presents a number of heat demand plants concerned with district heating as examples of industrial sources of heat energy Big Data gathered in real time by the proposed system.
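
A minimal sketch, assuming (not from the paper) a Spark job over heat-energy readings; the column names and the aggregation are illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("energy-bigdata-sketch").getOrCreate()

# Illustrative district-heating readings: (plant, timestamp, heat in MWh).
readings = spark.createDataFrame(
    [("plant-1", "2018-01-01 00:00", 13.2),
     ("plant-1", "2018-01-01 01:00", 14.8),
     ("plant-2", "2018-01-01 00:00", 9.6)],
    ["plant_id", "ts", "heat_mwh"],
)

# Total heat demand per plant, cached in memory for repeated analysis passes.
demand = (readings
          .groupBy("plant_id")
          .agg(F.sum("heat_mwh").alias("total_heat_mwh"))
          .cache())
demand.show()
```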

New Inlining Method for Effective Creation of Relations and Preservation of Constraints (효율적인 릴레이션 생성과 제약조건 보존을 위한 새로운 Inlining 기법)

  • An, Sung-Chul;Kim, Yeong-Ung
    • Journal of Korea Multimedia Society / v.9 no.7 / pp.773-781 / 2006
  • XML is a standard language for expressing and exchanging data over the Web. Recently, techniques for storing XML documents in an RDBMS and managing them there have been studied. These studies take a DTD document as input and generate a relational schema from it. Existing approaches, however, do not consider semantic preservation because they simplify the DTD. Further, because they focus only on storing information such as content and structure, stored procedures or triggers must be used to preserve data integrity while storing XML documents. This paper proposes an improved inlining technique that creates effective relations and preserves the semantics that can be inferred from a DTD.
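
A minimal sketch of the general inlining idea the abstract builds on (the paper's refinements are not reproduced): once-occurring DTD children are inlined as columns of the parent relation, while repeating children get their own relation with a foreign key. The DTD and all names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DTD (illustrative): <!ELEMENT book (title, year, author+)>
# title and year occur exactly once -> inlined into the book relation;
# author may repeat               -> separate relation referencing book.
cur.execute("""
    CREATE TABLE book (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,      -- inlined: (title) occurs exactly once
        year  TEXT NOT NULL       -- inlined: (year) occurs exactly once
    )""")
cur.execute("""
    CREATE TABLE author (
        id      INTEGER PRIMARY KEY,
        book_id INTEGER NOT NULL REFERENCES book(id),  -- preserves structure
        name    TEXT NOT NULL
    )""")
```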

A Design of XML-Based Distributed MDR Retrieval System for Data Preparation (데이터준비를 위한 XML 기반의 분산 MDR 검색 시스템 설계)

  • Ko Sucbum;Youn Sungdae
    • Journal of Korea Multimedia Society / v.7 no.9 / pp.1329-1338 / 2004
  • The purpose of data mining is to extract multi-dimensional information from a large database. Often the only information we can extract from a large database is the column name, data type, or simple comments attached to the columns of database tables. With such unstructured and scarce information, it is very difficult and time-consuming to collect and cleanse data by analyzing the purpose, characteristics, and schema of each column during the data preparation step. To solve this problem, in this paper we propose ways to reduce the time spent on the data preparation step in a relational database environment. That is, we identify useful elements to be considered during data preparation and organize them into an MDR (Metadata Registry), which follows the emerging international standard ISO/IEC 11179. Finally, we propose an XML-based distributed MDR retrieval system that is interoperable among heterogeneous systems and heterogeneous DBMSs.
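
A minimal sketch, assuming (not taken from the paper) an ISO/IEC 11179-style data element serialized as XML for exchange among heterogeneous systems; the tag names are simplified stand-ins for the standard's attributes.

```python
import xml.etree.ElementTree as ET

# One registry entry describing a database column for data preparation:
# name, definition, data type, and unit (simplified 11179-style attributes).
elem = ET.Element("DataElement")
ET.SubElement(elem, "Name").text = "customer_age"
ET.SubElement(elem, "Definition").text = "Age of the customer in whole years"
ET.SubElement(elem, "DataType").text = "INTEGER"
ET.SubElement(elem, "Unit").text = "years"

print(ET.tostring(elem, encoding="unicode"))
```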

Design and Implementation of Middleware supporting translation of EDI using XML (XML기반의 EDI 문서교환을 위한 미들웨어 설계 및 구현)

  • Choi, Gwang-Mi;Park, Su-Young;Jung, Chai-Yeoung
    • The KIPS Transactions:PartB / v.9B no.6 / pp.845-852 / 2002
  • Electronic document processing using EDI (Electronic Data Interchange) has traditionally exchanged documents over a VAN (Value Added Network). However, dedicated software must be modified for every new document type, and using a VAN incurs high costs for document exchange and maintenance. Due to these problems, existing EDI is turning into Web-based EDI. This paper suggests techniques that convert EDI messages residing in two relational databases into XML (eXtensible Markup Language) using a JDBC bridge. It also proposes a method that recovers the schema from the converted XML file and inserts the original records into the declared table. This removes the limitation of the original method, which required the same database management system, and overcomes the problem that EDI exchange does not work in certain circumstances.
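
A minimal sketch of the relational-to-XML direction (the paper uses a JDBC bridge between two relational databases; sqlite3 stands in here, and all names are illustrative): each record of an EDI message table becomes an XML element.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE edi_order (order_no TEXT, buyer TEXT, amount REAL)")
cur.execute("INSERT INTO edi_order VALUES ('PO-001', 'ACME', 1250.0)")

cur.execute("SELECT * FROM edi_order")
cols = [d[0] for d in cur.description]         # column names from the cursor

root = ET.Element("edi_order")                 # table name -> root element
for row in cur.fetchall():
    rec = ET.SubElement(root, "record")        # one element per record
    for col, val in zip(cols, row):            # one child element per column
        ET.SubElement(rec, col).text = str(val)

print(ET.tostring(root, encoding="unicode"))
```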

A Study on Flexible Attribute Tree and Partial Result Matrix for Content-based Retrieval and Browsing of Video Data (비디오 데이터의 내용 기반 검색과 브라우징을 위한 유동 속성 트리 및 부분 결과 행렬의 이용 방법 연구)

  • 성인용;이원석
    • Journal of Korea Multimedia Society / v.3 no.1 / pp.1-13 / 2000
  • While various types of information can be mixed in a continuous video stream without any clear boundary, the meaning of a video scene can be interpreted at multiple levels of abstraction, and its description can vary among users. Therefore, for content-based retrieval in video data, it is important that a user can describe a scene flexibly while descriptions given by different users remain consistent. This paper proposes an effective way to represent the different types of video information in conventional database models such as the relational and object-oriented models. Flexibly defined attributes and their values are organized as tree-structured dictionaries, while the description of video data is stored in a fixed database schema. We also introduce several browsing methods to assist the user: the dictionary browser simplifies both the annotation process and the querying process, while the result browser helps a user analyze the results of a query in terms of various combinations of query conditions.
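
A minimal sketch of the data structure described above (illustrative, not the paper's exact design): a tree-structured attribute dictionary whose flexibly defined values describe scenes, with the scene descriptions themselves kept in a fixed-schema table.

```python
# Attribute dictionary organized as a tree; values are illustrative.
attribute_tree = {
    "event": {
        "sports": {"soccer": {}, "baseball": {}},
        "news": {"politics": {}, "weather": {}},
    },
}

def paths(tree, prefix=()):
    """Enumerate every attribute path in the dictionary tree."""
    for key, sub in tree.items():
        yield prefix + (key,)
        yield from paths(sub, prefix + (key,))

# Fixed-schema description table: (video_id, scene_no, attribute_path).
scene_table = [("v1", 3, "event/sports/soccer")]

print(["/".join(p) for p in paths(attribute_tree)])
```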

Automatic Construction of SHACL Schemas for RDF Knowledge Graphs Generated by R2RML Mappings

  • Choi, Ji-Woong
    • Journal of the Korea Society of Computer and Information / v.25 no.8 / pp.9-21 / 2020
  • With the proliferation of RDF knowledge graphs (KGs), a need arose for a standardized schema representation of the graph model for effective data interchangeability and interoperability. The need resulted in W3C's development of the SHACL specification for describing and validating the structure of RDF graphs. Relational databases (RDBs) are one of the major sources for acquiring structured knowledge. The standard for automatic generation of RDF KGs from RDBs is R2RML, also developed by W3C. Since R2RML is designed to generate only RDF data graphs from RDBs, additional manual work is required to create schemas for the graphs. In this paper we propose an approach to automatically generate SHACL schemas for RDF KGs populated by R2RML mappings. The key of our approach is that the SHACL schemas are built from R2RML documents alone. We describe an implementation of our approach and then show its validity using the R2RML test cases designed by W3C.
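
A minimal sketch with rdflib, assuming (not reproducing the paper's algorithm) that one R2RML triples map yields one SHACL node shape: rr:class becomes sh:targetClass, and a predicate-object map over a typed column becomes a sh:property constraint. All names are illustrative.

```python
from rdflib import Graph, Namespace, BNode, Literal
from rdflib.namespace import RDF, XSD

SH = Namespace("http://www.w3.org/ns/shacl#")
EX = Namespace("http://example.org/")

g = Graph()
shape = EX["EmployeeShape"]
g.add((shape, RDF.type, SH["NodeShape"]))
g.add((shape, SH["targetClass"], EX["Employee"]))  # from rr:class ex:Employee

# From one predicate-object map: rr:predicate ex:name; rr:column "NAME".
prop = BNode()
g.add((shape, SH["property"], prop))
g.add((prop, SH["path"], EX["name"]))
g.add((prop, SH["datatype"], XSD.string))
g.add((prop, SH["maxCount"], Literal(1)))

print(g.serialize(format="turtle"))
```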

Object-Oriented Database Schemata and Query Processing for XML Data (XML 데이타를 위한 객체지향 데이터베이스 스키마 및 질의 처리)

  • Jeong, Tae-Seon;Park, Sang-Won;Han, Sang-Yeong;Kim, Hyeong-Ju
    • Journal of KIISE:Databases / v.29 no.2 / pp.89-98 / 2002
  • As XML has become an emerging standard for information exchange on the World Wide Web, it has gained attention in the database community as a database model from which to extract information. Recently, many researchers have addressed the problem of storing XML data and processing XML queries using traditional database engines, and most of them have used relational database systems. In this paper, we show that OODBSs can be another solution. Our technique generates an OODB schema from DTDs and processes XML queries. In particular, we show that the semi-structured part of XML data can be represented by 'inheritance' and that this can be used to improve query processing.
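
A minimal sketch of the inheritance idea (illustrative, not the authors' schema-generation rules): an optional part of a DTD content model is represented by a subclass, so queries can address only the variant that actually carries the optional field.

```python
class Book:                     # <!ELEMENT book (title, author)>
    def __init__(self, title, author):
        self.title = title
        self.author = author

class BookWithYear(Book):       # variant for documents where (year?) occurs
    def __init__(self, title, author, year):
        super().__init__(title, author)
        self.year = year

shelf = [Book("XML", "Kim"), BookWithYear("OODB", "Park", 2002)]

# Query only the subclass extent that actually has 'year'.
print([b.title for b in shelf if isinstance(b, BookWithYear)])
```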

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from computer system inspection and process optimization to customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, in existing computing environments it is difficult to realize the flexible storage expansion needed to process a massive amount of unstructured log data and to execute the many functions required to categorize and analyze the stored data. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for unstructured log data that are difficult to process with the existing infrastructure's analysis tools and management systems. The proposed system uses an IaaS (Infrastructure as a Service) cloud environment to provide flexible expansion of computing resources, such as storage space and memory, under conditions such as extended storage or a rapid increase in log data. Moreover, to overcome the processing limits of existing analysis tools when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions so that it can continue operating after recovering from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods for effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data, and their strict schemas make it hard to expand nodes when rapidly growing data must be distributed across nodes. NoSQL does not provide the complex computations that relational databases offer, but it can easily expand through node dispersion when the amount of data increases rapidly; it is a non-relational database with a structure appropriate for processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, the representative document-oriented data model, MongoDB, which has a free schema structure, is used in the proposed system. MongoDB is adopted because its flexible schema makes it easy to process unstructured log data, it facilitates flexible node expansion when the amount of data grows rapidly, and it provides an Auto-Sharding function that automatically expands storage.
The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies the data according to log type and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis performed by the MongoDB module, the Hadoop-based analysis module, and the MySQL module, per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and served in real time by the log graph generator module. The log data aggregated per unit time are stored in the MongoDB module and plotted in graphs according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation against a log data processing system that uses only MySQL, covering log insertion and query performance, demonstrates the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
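
A minimal sketch (field names assumed, not from the paper) of the MongoDB side: schema-free log documents are inserted and then grouped by type, the kind of query the log graph generator module might issue. Assumes a local mongod instance.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
logs = client["bank_logs"]["transactions"]     # database / collection

# Unstructured log entries: documents need not share the same fields.
logs.insert_many([
    {"ts": "2013-06-01T09:00:00", "type": "login", "branch": "Seoul"},
    {"ts": "2013-06-01T09:00:02", "type": "transfer", "amount": 50000,
     "channel": "mobile"},
])

# Aggregate counts per log type for plotting.
for row in logs.aggregate([{"$group": {"_id": "$type", "n": {"$sum": 1}}}]):
    print(row)
```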