• Title/Summary/Keyword: research data management

Development of Climate & Environment Data System for Big Data from Climate Model Simulations (대용량 기후모델자료를 위한 통합관리시스템 구축)

  • Lee, Jae-Hee;Sung, Hyun Min;Won, Sangho;Lee, Johan;Byu, Young-Hwa
    • Atmosphere / v.29 no.1 / pp.75-86 / 2019
  • In this paper, we introduce a novel Climate & Environment Database System (CEDS). The CEDS was developed by the National Institute of Meteorological Sciences (NIMS) to provide easy and efficient user interfaces and storage management for climate model data, thereby improving work efficiency. When data files are uploaded, the CEDS provides an option to automatically run the international standard data conversion (CMORization) and quality assurance (QA) processes required for submitting CMIP6 variable data. Because this option eliminates manual operation of the CMORization and QA processes, it improves system performance, removes user mistakes, and increases reliability. The uploaded raw files are saved in NAS storage, and a Cassandra database stores the metadata used for efficient data access and storage management. The metadata is generated automatically when a file is uploaded, or from user input. Using the metadata, the CEDS categorizes data files to support effective storage management. This allows fast, easy, and reliable data access, even when a novice user searches with simple keywords. Moreover, the CEDS supports parallel and distributed computing to increase overall system performance and balance the load. This provides high availability, since multiple users can use the system at the same time with fast response. Additionally, it deduplicates redundant data to reduce storage space.

A Study on the Development of Internet Purchase Support Systems Based on Data Mining and Case-Based Reasoning (데이터마이닝과 사례기반추론 기법에 기반한 인터넷 구매지원 시스템 구축에 관한 연구)

  • 김진성
    • Journal of the Korean Operations Research and Management Science Society / v.28 no.3 / pp.135-148 / 2003
  • In this paper, we introduce Internet-based purchase support systems using data mining and case-based reasoning (CBR). Internet business activity that involves the end user is undergoing a significant revolution. The ability to track users' browsing behavior has brought vendors and end customers closer than ever before. It is now possible for a vendor to personalize its product message for individual customers at massive scale. Most previous researchers in this area used data mining techniques to predict customers' future behavior and to improve the frequency of repurchase. Data mining can be defined as efficiently discovering association rules from large collections of data. However, the basic association rule-based data mining technique is not flexible. If there are no inference rules for tracking a customer's future behavior, association rule-based data mining systems cannot present any further information. To resolve this problem, we combined association rule-based data mining with a CBR mechanism. CBR is used to reason about and search for customers' preferences, and to learn from past cases. The resulting hybrid purchase support mechanism reflects both association rule-based logical inference and case-based information reuse. Web-log data gathered from a real-world Internet shopping mall are used to illustrate the quality of the proposed systems.
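
    The association rules mentioned in this abstract are conventionally scored by support and confidence. The following is a minimal sketch of those two standard measures, not the paper's system: the function names, thresholds, and the sample shopping sessions are all hypothetical, and the CBR component is not modeled.

    ```python
    from itertools import combinations


    def support(transactions, itemset):
        """Fraction of transactions that contain every item in itemset."""
        itemset = frozenset(itemset)
        hits = sum(1 for t in transactions if itemset <= t)
        return hits / len(transactions)


    def confidence(transactions, antecedent, consequent):
        """Estimated P(consequent | antecedent) over the transactions."""
        joint = support(transactions, set(antecedent) | set(consequent))
        base = support(transactions, antecedent)
        return joint / base if base else 0.0


    def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
        """Enumerate single-item rules lhs -> rhs that pass both thresholds."""
        items = sorted(set().union(*transactions))
        rules = []
        for a, b in combinations(items, 2):
            for lhs, rhs in ((a, b), (b, a)):
                s = support(transactions, {lhs, rhs})
                c = confidence(transactions, {lhs}, {rhs})
                if s >= min_support and c >= min_confidence:
                    rules.append((lhs, rhs, s, c))
        return rules


    # Hypothetical shopping sessions (illustration only, not data from the paper).
    sessions = [
        frozenset({"laptop", "mouse"}),
        frozenset({"laptop", "mouse", "bag"}),
        frozenset({"laptop", "bag"}),
        frozenset({"mouse"}),
    ]

    for lhs, rhs, s, c in pair_rules(sessions):
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
    ```

    A rule-only system has nothing to say about a session that matches no rule above the thresholds, which is the inflexibility the abstract attributes to basic association rule mining.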

A Fundamental Study on Data Item occurred in EPC Stage of Pipeline in Extreme Cold Weather (극한지 자원이송망 EPC단계에서 발생되는 데이터 항목에 관한 기초연구)

  • Kim, Chang-Han;Won, Seo-Kyung;Lee, Jun-Bok;Han, Choong-Hee
    • Proceedings of the Korean Institute of Building Construction Conference / 2014.05a / pp.18-19 / 2014
  • As the development of energy resources has become an issue, IT-supported EPC work processes are essential for efficient business management, and the data generated in these processes need to be managed systematically. In Korea, research on systems for collecting and managing construction data from the field has been carried out continuously, but almost no studies to date have addressed long-distance pipeline projects in extreme cold weather. Therefore, this research aims to derive the data items needed for efficient management in the EPC stage of pipeline projects in extreme cold weather. The WBS of EPC work is classified simply at two levels, and the data items are divided by document type. The results are expected to serve as a foundation for the systematic management of data generated at each EPC step of pipeline projects.

A Data-driven Approach for Computational Simulation: Trend, Requirement and Technology

  • Lee, Sunghee;Ahn, Sunil;Joo, Wonkyun;Yang, Myungseok;Yu, Eunji
    • Journal of Internet Computing and Services / v.19 no.1 / pp.123-130 / 2018
  • With the emergence of the new paradigms of Open Science and Big Data, the need for data sharing and collaboration is also emerging in the computational science field. In this paper, we analyzed data-driven research cases in computational science by field: material design, bioinformatics, and high energy physics. We also studied the characteristics of computational science data and the associated data management issues. Managing computational science data effectively requires data quality management, increased data reliability, flexibility to support a variety of data types, and tools for analysis and linkage to the computing infrastructure. In addition, we analyzed trends in platform technology for efficient sharing and management of computational science data. The main contribution of this paper is to review the various computational science data repositories and related platform technologies, to analyze the characteristics of computational science data and the problems of data management, and to present design considerations for building a future computational science data platform.

Dynamic Location Area Management Scheme Using the Historical Data of a Mobile User (이동통신 사용자의 이력 자료를 고려한 동적 위치영역 관리 기법)

  • Lee, J.S.;Chang, I.K.;Hong, J.W.;Lie, C.H.
    • Proceedings of the Korean Operations and Management Science Society Conference / 2004.05a / pp.119-126 / 2004
  • Location management is a very important issue in wireless communication systems for tracing mobile users' exact locations. In this study, we propose a dynamic location area management scheme that determines the size of the dynamic location area according to each user's characteristics. In determining the optimal location area size, we consider the measurement data as well as the historical data, which contain the call arrival rate and average speed of each mobile user. In this mixture of data, the weight of the historical data is derived by a linear search method that guarantees the minimal cost of location management. We also introduce a regularity index, which can be calculated from the autocorrelation of the historical data itself. Statistical validation shows that the regularity index is the same as the weight of the measurement data. As a result, the regularity index is used to incorporate the historical data into the measurement data. Applying the proposed scheme is shown to decrease the location management cost. Numerical examples illustrate this aspect of the proposed scheme.
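
    The ingredients named in this abstract (an autocorrelation-based index, a weighted mixture of historical and measured values, and a linear search over the weight) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's formulas, so the lag-1 sample autocorrelation, the convex blend, the grid of candidate weights, and every function name here are hypothetical choices, and the cost function must be supplied by the caller.

    ```python
    def autocorrelation(series, lag=1):
        """Sample autocorrelation of a series at the given lag (assumed definition)."""
        n = len(series)
        mean = sum(series) / n
        var = sum((x - mean) ** 2 for x in series)
        if var == 0 or lag >= n:
            return 0.0
        cov = sum((series[i] - mean) * (series[i + lag] - mean)
                  for i in range(n - lag))
        return cov / var


    def blend(historical, measured, weight_measured):
        """Convex mixture of a historical value and a measured value."""
        w = min(max(weight_measured, 0.0), 1.0)  # clamp the weight to [0, 1]
        return w * measured + (1.0 - w) * historical


    def best_weight(cost, steps=100):
        """Linear search over evenly spaced weights in [0, 1] for the lowest cost."""
        candidates = [i / steps for i in range(steps + 1)]
        return min(candidates, key=cost)


    # Hypothetical usage: blend a historical speed estimate with a fresh measurement.
    print(blend(historical=10.0, measured=20.0, weight_measured=0.25))
    ```

    How the index maps onto the weight, and what the location management cost looks like, are details of the paper that this sketch does not attempt to reproduce.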

Data-Compression-Based Resource Management in Cloud Computing for Biology and Medicine

  • Zhu, Changming
    • Journal of Computing Science and Engineering / v.10 no.1 / pp.21-31 / 2016
  • With the application and development of biomedical techniques such as next-generation sequencing, mass spectrometry, and medical imaging, the amount of biomedical data has been growing explosively. In processing such data, we face the problems of big data, highly intensive computation, and high-dimensional data. Fortunately, cloud computing offers significant advantages in resource allocation, data storage, computation, and sharing, and provides a solution to the big data problems of biomedical research. To improve the efficiency of resource management in cloud computing, this paper proposes a clustering method and adopts the Radial Basis Function to compress comprehensive biological and medical data sets with high quality, and stores these data under cloud resource management. Experiments have validated that, with such data-compression-based resource management in cloud computing, large data sets from biology and medicine can be stored in less capacity. Furthermore, by applying the reverse operation of the Radial Basis Function, the compressed data can be reconstructed with high accuracy.

A Study on Factors Affecting the Reuse of Research Data by Academic Researchers in the Social Sciences (사회과학분야 학술 연구자의 연구데이터 재이용 영향요인 연구)

  • Bak, Ji Won;Chang, Woo Kwon
    • Journal of the Korean Society for Information Management / v.38 no.4 / pp.199-230 / 2021
  • This study analyzes the factors that affect the reuse of research data, based on a survey of researchers and the data they reuse, and proposes a plan to promote reuse. To this end, 178 responses were analyzed from a survey of academic researchers in the social sciences in Korea who have produced new research results by reusing research data. The results are as follows. 1) Most researchers acquire data for reuse through systems such as data repositories, data management systems, and research data DBs, and mainly reuse analysis data produced through experiments and observations. In addition, even researchers who had successfully reused research data had low awareness of research data sharing and, faced with various problems, did not share their own data. 2) The reliability and validity of 10 factors derived through literature review and factor analysis (academic usefulness, research efficiency, researcher concerns, data vulnerability, direct effort, indirect effort, suitability for reuse, data completeness, data usefulness, and social conditions) were verified. 3) In the correlation analysis, research efficiency and social conditions showed a positive correlation with research data reuse intention, while researcher concerns, data vulnerability, and direct effort showed a negative correlation. In the regression analysis, all of these factors had a significant effect on the intention to reuse research data, in the order of research efficiency, social conditions, direct effort, researcher concerns, and data vulnerability. Based on these findings, a plan to revitalize the reuse of research data was proposed.

Advanced Resource Management with Access Control for Multitenant Hadoop

  • Won, Heesun;Nguyen, Minh Chau;Gil, Myeong-Seon;Moon, Yang-Sae
    • Journal of Communications and Networks / v.17 no.6 / pp.592-601 / 2015
  • Multitenancy has gained growing importance with the development and evolution of cloud computing technology. In a multitenant environment, multiple tenants with different demands can share a variety of computing resources (e.g., CPU, memory, storage, network, and data) within a single system, while each tenant remains logically isolated. This multitenancy concept offers highly efficient and cost-effective systems, without wasted computing resources, to enterprises that require similar environments for data processing and management. In this paper, we propose a novel approach supporting multitenancy features for Apache Hadoop, a large-scale distributed system commonly used for processing big data. We first analyze the Hadoop framework, focusing on "yet another resource negotiator (YARN)", which is responsible for managing resources, application runtime, and access control in the latest version of Hadoop. We then define the problems in supporting multitenancy and formally derive the requirements for solving them. Based on these requirements, we design the details of multitenant Hadoop. We also present experimental results to validate the data access control and to evaluate the performance enhancement of multitenant Hadoop.

The Confirmation of the Validity and Reliability of the UIS Model Toward the Public Management Information System (행정정보시스템에 대한 UIS모형의 타당성 및 유효성 검증)

    • Journal of the Korean Operations Research and Management Science Society / v.22 no.1 / pp.141-157 / 1997
  • The structure and dimensionality of the User Information Satisfaction (UIS) construct is an important theoretical issue that has received considerable attention. The acceptance of UIS as a standardized instrument requires confirmation that it explains and measures the user information satisfaction construct and its components. Based on a sample of 670 respondents who worked with the Public Management Information System (PMIS), this research used confirmatory factor analysis to test alternative models of the underlying factor structure and assessed the reliability and validity of these factors and items in the PMIS. The results supported a revised UIS model with four first-order factors and one second-order (higher-order) factor in the PMIS. To cross-validate these results, the author reexamined two prior data sets. The results showed that the revised model provides a better model-data fit in all three data sets.

Deriving the Determining Factor for the Management of Oceanographic Data (해양관측데이터 관리를 위한 결정요소 도출)

  • Kim, Sun-Tae;Lee, Tae-Young;Kim, Yong
    • Journal of Information Management / v.43 no.3 / pp.97-115 / 2012
  • This paper derives the determining factors for the management of oceanographic data in two ways. 1) The types of oceanographic observation and the raw data collected in the areas of marine physics, marine chemistry, marine biology, and marine geology were analyzed. 2) The services of the KODC (Korea Oceanographic Data Center), NFRDI (National Fisheries Research & Development Institute), and KHOA (Korea Hydrographic and Oceanographic Administration) were analyzed to derive metadata elements for retrieval. From this analysis, 42 determining factors were derived in 9 areas (general, observer, satellites, observation instruments, observatories, space, information, projects, and observational data, data processing).