• Title/Summary/Keyword: data warehouse (데이터웨어하우스)

Ubiquitous Data Warehouse: Integrating RFID with Multidimensional Online Analysis (유비쿼터스 데이터 웨어하우스: RFID와 다차원 온라인 분석의 통합)

  • Cho, Dai-Yon;Lee, Seung-Pyo
    • Journal of Information Technology Services / v.4 no.2 / pp.61-69 / 2005
  • RFID is now used in tracking systems across many business fields, and these systems have brought considerable efficiencies and cost savings to companies. Real-time information acquired through RFID devices could be a valuable input for decision making if it is combined with decision support tools such as the OLAP layer of a data warehouse, which was originally designed for analyzing static, historical data. To extend the data sources of a data warehouse, this research combines RFID with a data warehouse and designs OLAP to analyze the dynamic, real-time information gathered through RFID devices. The implemented prototype shows that ubiquitous computing technology such as RFID can be a valuable data source for a data warehouse and is very useful for decision making when combined with online analysis. A system architecture for such a system is also suggested.
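
The idea of running OLAP over RFID reads can be illustrated with a minimal roll-up sketch. It is not the prototype from the paper: the record fields (`tag`, `location`, `hour`) and the `roll_up` helper are hypothetical names chosen for illustration, and a real system would load the reads into warehouse tables rather than a Python list.

```python
from collections import defaultdict


def roll_up(reads, dims):
    """Count RFID reads grouped by the chosen dimensions (a simple OLAP roll-up)."""
    totals = defaultdict(int)
    for read in reads:
        # The grouping key is the tuple of values for the selected dimensions.
        key = tuple(read[d] for d in dims)
        totals[key] += 1
    return dict(totals)


# Hypothetical reads, as they might arrive from RFID readers.
reads = [
    {"tag": "T1", "location": "dock", "hour": 9},
    {"tag": "T2", "location": "dock", "hour": 9},
    {"tag": "T1", "location": "shelf", "hour": 10},
]

by_location = roll_up(reads, ["location"])
by_location_hour = roll_up(reads, ["location", "hour"])
```

Passing fewer dimensions gives a coarser summary (roll-up), and passing more gives a finer one (drill-down).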

Data Cleaning System using XMDR-DAI in Cloud (클라우드 환경에서 XMDR-DAI를 이용한 데이터 정제 시스템)

  • Moon, Seok-Jae;Jeong, Kye-Dong;Lee, Jong-Yong;Cho, Young-Keun
    • Journal of Digital Convergence / v.12 no.2 / pp.263-270 / 2014
  • In a cloud environment, a business intelligence data warehouse is used for decision making and enterprise policy. However, when a new system is added to the cloud environment, data integration requires considerable cost and time because of the heterogeneous characteristics of the systems. This paper proposes a data cleaning system for business intelligence in a cloud environment. The proposed system uses XMDR-DAI to minimize the effect on local systems when it integrates distributed systems, and it provides standardized information for generating data warehouse information in real time. It also saves cost and time by integrating the data without changing the existing systems, and it can improve information quality by generating coherent information through real-time data extraction and cleaning.

Importance of Selecting the Characterized Housekeeping Genes as Reference Genes in Various Species (다양한 종에서 하우스키핑 유전자 선택의 중요성)

  • Chai, Han-Ha;Noh, Yun Jeong;Roh, Hee-Jong;Lim, Dajeong
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.8 / pp.417-428 / 2020
  • Housekeeping genes are expressed in cells of all organisms and perform basic cellular functions such as energy generation, substance synthesis, cell death, and cell defense. Accordingly, the expression levels of housekeeping genes are relatively constant, and thus they are used as reference genes in gene expression studies, such as protein expression and mRNA expression analysis of target genes. However, the expression levels of these genes may differ among tissues or cells and may change under certain circumstances. Therefore, it is important to select the best reference gene for a specific gene expression study by exploring the stability of housekeeping gene expression. This review summarizes housekeeping genes found in humans, chickens, pigs, and rats in the literature and estimates expression stability using the geNorm, NormFinder, and BestKeeper software. The most suitable reference housekeeping gene can be selected based on expression stability according to the experimental conditions of the gene expression study and can thus be applied to data normalization.

An Approach for Integrated Modeling of Protein Data using a Fact Constellation Schema and a Tree based XML Model (Fact constellation 스키마와 트리 기반 XML 모델을 적용한 실험실 레벨의 단백질 데이터 통합 기법)

  • Park, Sung-Hee;Li, Rong-Hua;Ryu, Keun-Ho
    • The KIPS Transactions: Part D / v.11D no.3 / pp.519-532 / 2004
  • With the explosion of bioinformatics data such as proteins and genes, biologists need an integrated system to analyze and organize large datasets that interact with heterogeneous types of biological data. In this paper, we propose an integration system based on a mediated data warehouse architecture using an XML model in order to combine protein-related data at biology laboratories. A fact constellation model is used in this system as the common model for integration, and the integrated schema is translated to an XML schema. In addition, to track source changes and the provenance of data in the integrated database, the system employs incremental update and sequence version management. This paper shows integration modeling for protein structures, sequences, and structure classification using the proposed system.

A Method for Extraction and Loading of Massive Traffic Data using Commercial Tools (상용 도구를 이용한 대용량 교통 데이터의 추출 및 적재 방안)

  • Woo, Chan-Il;Jeon, Se-Gil
    • Journal of Advanced Navigation Technology / v.12 no.1 / pp.46-53 / 2008
  • The ITS (Intelligent Transport System) provides solutions to traffic problems while maximizing the safety and efficiency of road and transportation systems by combining technologies from information and communication, electrical engineering, electronics, mechanics, and control and instrumentation with transportation systems. An integration system for massive traffic data sources faces issues that stem from several factors: the variety and amount of data available, the representational heterogeneity of the data in the different sources, and the autonomy and differing capabilities of the sources. In this paper, we describe how to extract and load heterogeneous massive traffic data from operational databases such as FTMS and ARTIS using commercial tools. We also experiment on traffic data warehouses with integrated quality management techniques for providing high-quality data.

A Standard Way of Constructing a Data Warehouse based on a Neutral Model for Sharing Product Data of Nuclear Power Plants (원자력 발전소 제품 데이터의 공유를 위한 중립 모델 기반의 데이터 웨어하우스의 구축)

  • Mun, D.H.;Cheon, S.U.;Choi, Y.J.;Han, S.H.
    • Korean Journal of Computational Design and Engineering / v.12 no.1 / pp.74-85 / 2007
  • In Korea, many organizations are involved during the lifecycle of a nuclear power plant. Korea Power Engineering Co. (KOPEC) participates in the design stage, Korea Hydro & Nuclear Power (KHNP) operates and manages all nuclear power plants in Korea, Doosan Heavy Industries manufactures the main equipment, and a construction company constructs the plant. Even though each organization has its own internal digital data management system and has achieved a certain level of automation, data sharing among organizations is poor. KHNP receives drawings and technical specifications from KOPEC on paper, which leads to manual re-entry of definitions and to potential errors in the process. A data warehouse based on a neutral model has been constructed to form an information bridge between the design and O&M phases. GPM (generic product model), a data model from Hitachi, Japan, is adopted and extended in this study. GPM has an architecture similar to that of ISO 15926, "life cycle data for process plant". The extension is oriented to nuclear power plants. This paper introduces some of the implementation results: 1) exchange and visualization of 2D piping and instrument diagrams (P&ID) and 3D CAD models; 2) an interface between the GPM-based data warehouse and the KHNP ERP system.

Design of Database Integration System and Query System based on Global View Generation Tool (전역 스키마 생성 도구를 이용한 데이터베이스 통합 및 질의 시스템)

  • Park, U-Chang
    • Journal of Internet Computing and Services / v.8 no.3 / pp.65-74 / 2007
  • Database integration is a common and growing challenge with the proliferation of database systems, data warehouses, data marts, and other OLAP systems in organizations. Although there are many methods of sharing data between databases, true interoperability remains difficult to achieve. This paper presents a database integration system that improves on the database federation architecture by allowing domain administrators to capture database semantics simply and efficiently. The semantic information is combined using a tool for producing a global view. Building the global view is the bottleneck in integration because few tools support its construction, and those tools often require sophisticated knowledge and experience to operate properly. The technique and tool presented are simple and powerful enough to be used by all database administrators, yet expressive enough to support the majority of integration queries.

Performance Comparison of DW System Tajo Based on Hadoop and Relational DBMS (하둡 기반 DW시스템 타조와 관계형 DBMS의 성능 비교)

  • Liu, Chen;Ko, Junghyun;Yeo, Jeongmo
    • KIPS Transactions on Software and Data Engineering / v.3 no.9 / pp.349-354 / 2014
  • Since the announcement of Hadoop, the big data processing platform, SQL-on-Hadoop has been in the spotlight as a technique for analyzing data on Hadoop using SQL. Tajo, created by Korean programmers, was recently promoted to Top-Level Project status by Apache in April and has attracted attention around the world. Despite the significant change that Hadoop's appearance has caused in the DW market, research on the performance of these systems is insufficient. This study therefore compares and analyzes an RDBMS and Tajo to help in choosing a DW solution based on SQL-on-Hadoop. The results show that Hadoop-based Tajo outperforms the RDBMS when it is used with an appropriate strategy. In addition, the open-source project Tajo is expected not only to improve technically through the active participation of many developers but also to play an important DW role in the field of data analysis.

Hilbert Cube for Spatio-Temporal Data Warehouses (시공간 데이타웨어하우스를 위한 힐버트큐브)

  • 최원익;이석호
    • Journal of KIISE: Databases / v.30 no.5 / pp.451-463 / 2003
  • Recently, there have been various research efforts to develop strategies for accelerating OLAP operations on huge amounts of spatio-temporal data. Most of the work is based on multi-tree structures, which consist of a single R-tree variant for the spatial dimension and numerous B-trees for the temporal dimension. The multi-tree based frameworks, however, are hardly applicable to spatio-temporal OLAP in practice, mainly because of high management cost and low query efficiency. To overcome the limitations of such multi-tree based frameworks, we propose a new approach called Hilbert Cube (H-Cube), which employs fractals to impose a total order on cells. In addition, the H-Cube takes advantage of the traditional prefix-sum approach to improve query efficiency significantly. The H-Cube partitions an embedding space into a set of cells that are clustered on disk by Hilbert ordering, and then composes a cube by arranging the grid cells in chronological order. The H-Cube refines cells adaptively to handle regional data skew, whose location may change over time. The H-Cube is an adaptive, totally ordered, and prefix-summed cube for spatio-temporal data warehouses. Our approach focuses on indexing dynamic point objects in static spatial dimensions. Through extensive performance studies, we observed that the H-Cube consumed at most 20% of the space required by multi-tree based frameworks and achieved higher query performance than multi-tree structures.
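
Two ingredients named in this abstract, Hilbert ordering of grid cells and prefix sums along the time axis, can be sketched as follows. This is a minimal illustration under stated assumptions, not the H-Cube itself: it uses a fixed 2^order by 2^order grid, keeps everything in memory, and omits the adaptive cell refinement and disk clustering the paper describes. The function names (`hilbert_index`, `build_cube`, `range_count`) and the `(x, y, t)` event format are hypothetical.

```python
from itertools import accumulate


def hilbert_index(order, x, y):
    """Map cell (x, y) of a 2**order x 2**order grid to its position on the Hilbert curve."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is traversed in curve order.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def build_cube(order, events, num_steps):
    """Build per-cell prefix sums over time; rows are indexed by Hilbert position."""
    n_cells = (1 << order) ** 2
    counts = [[0] * num_steps for _ in range(n_cells)]
    for x, y, t in events:
        counts[hilbert_index(order, x, y)][t] += 1
    # Each row becomes a running total, so a time-range count needs two lookups.
    return [list(accumulate(row)) for row in counts]


def range_count(cube, cell, t_start, t_end):
    """Number of events in `cell` during the inclusive time range [t_start, t_end]."""
    row = cube[cell]
    before = row[t_start - 1] if t_start > 0 else 0
    return row[t_end] - before


# Hypothetical point events (x, y, time step) on a 2 x 2 grid with 4 time steps.
events = [(0, 0, 0), (0, 0, 2), (1, 1, 1), (0, 0, 3)]
cube = build_cube(1, events, 4)
```

Because consecutive Hilbert positions are always neighbouring cells, storing rows in this order tends to keep spatially close cells close together in storage, which is the motivation for the total order on cells.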

Asymmetric Index Management Scheme for High-capacity Compressed Databases (대용량 압축 데이터베이스를 위한 비대칭 색인 관리 기법)

  • Byun, Si-Woo;Jang, Seok-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.7 / pp.293-300 / 2016
  • Traditional databases exploit a record-based model, where the attributes of a record are placed contiguously on a slow hard disk to achieve high performance. On the other hand, for read-intensive data analysis systems, the column-based compressed database has become a suitable model because of its superior read performance. Currently, flash memory SSD is widely recognized as the preferred storage medium for high-speed analysis systems. This paper introduces a compressed column-storage model and proposes a new index and its data management scheme for a high-capacity data warehouse system. The proposed index management scheme is based on asymmetric index duplication and achieves superior search performance using a master index and a compact index, particularly for large read-mostly databases. In addition, the data management scheme contributes to read performance and high reliability by compressing the related columns and replicating them on two mirrored SSDs. Based on the results of the performance evaluation under high workload conditions, the data management scheme outperforms the traditional scheme in terms of search throughput and response time.