A Data-driven Approach for Computational Simulation: Trend, Requirement and Technology

  • Lee, Sunghee;Ahn, Sunil;Joo, Wonkyun;Yang, Myungseok;Yu, Eunji
    • Journal of Internet Computing and Services
    • /
    • Vol.19 No.1
    • /
    • pp.123-130
    • /
    • 2018
  • With the emergence of new paradigms such as Open Science and Big Data, the need for data sharing and collaboration is also growing in the computational science field. In this paper, we analyze data-driven research cases in computational science by field: materials design, bioinformatics, and high-energy physics. We also study the characteristics of computational science data and the associated data management issues. Managing computational science data effectively requires data quality management, improved data reliability, the flexibility to support a variety of data types, and tools for analysis and linkage to computing infrastructure. In addition, we analyze trends in platform technology for the efficient sharing and management of computational science data. The main contribution of this paper is to review various computational science data repositories and related platform technologies, to analyze the characteristics of computational science data and the problems of data management, and to present design considerations for building a future computational science data platform.

An Empirical Study on the Effects of Source Data Quality on the Usefulness and Utilization of Big Data Analytics Results

  • 박소현;이국희;이아연
    • Journal of Information Technology Applications and Management
    • /
    • Vol.24 No.4
    • /
    • pp.197-214
    • /
    • 2017
  • This study sheds light on source data quality in big data systems. Previous studies on big data success have called for further examination of quality factors and of the importance of source data. This study extracted the quality factors of source data from the user's viewpoint and empirically tested the effects of source data quality on the usefulness and utilization of big data analytics results. Based on previous research and a focus group evaluation, four quality factors were established: accuracy, completeness, timeliness, and consistency. After setting up 11 hypotheses on how the quality of source data contributes to the usefulness, utilization, and ongoing use of big data analytics results, an e-mail survey was conducted at the level of independent departments using big data in domestic firms. The results of the hypothesis tests identified the characteristics and impact of source data quality in big data systems and yielded some meaningful findings about big data characteristics.
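
As a rough, hedged illustration of the hypothesized structure (the four quality factors driving usefulness), a regression of the following form could be fitted to survey scores; the variable names and data below are hypothetical stand-ins, not the authors' measures or dataset.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical 5-point survey responses; illustrative values only.
df = pd.DataFrame({
    "accuracy":     [5, 4, 3, 5, 2, 4, 3, 5, 4, 2],
    "completeness": [4, 4, 2, 5, 3, 4, 2, 5, 3, 2],
    "timeliness":   [3, 5, 2, 4, 2, 5, 3, 4, 4, 1],
    "consistency":  [4, 3, 3, 5, 2, 4, 2, 5, 4, 2],
    "usefulness":   [4, 4, 2, 5, 2, 4, 3, 5, 4, 2],
})

# Regress perceived usefulness on the four source-data quality factors.
model = smf.ols("usefulness ~ accuracy + completeness + timeliness"
                " + consistency", data=df).fit()
print(model.params.round(2))
```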

A Study on the Role and Security Enhancement of the Expert Data Processing Agency: Focusing on a Comparison of Data Brokers in Vermont

  • 김수한;권헌영
    • Journal of Information Technology Services
    • /
    • Vol.22 No.3
    • /
    • pp.29-47
    • /
    • 2023
  • With the recent advancement of information and communication technologies such as artificial intelligence, big data, cloud computing, and 5G, data is being produced and digitized in unprecedented amounts. As a result, data has emerged as a critical resource for the future economy, and countries overseas have been revising laws for data protection and utilization. In Korea, the 'Data 3 Act' was revised in 2020 to introduce institutional measures that classify personal information, pseudonymized information, and anonymous information for research, statistics, and the preservation of public records. Combining pseudonymized personal information is expected to increase the added value of data, and to this end the 'Expert Data Combination Agency' and 'Expert Data Agency' (hereinafter collectively the 'Expert Data Processing Agency') systems were introduced. To compare these domestic systems with similar overseas systems, we examine the 'Data Broker Act' recently enacted by the state of Vermont, the first of its kind in the United States, as a measure to protect personal information held by data brokers. In this study, we compare and analyze the roles and functions of the Expert Data Processing Agency and data brokers, and identify differences in designation standards, security measures, and so on, in order to present ways to contribute to the activation of the data economy and to enhance information protection.

UEPF: A blockchain-based Uniform Encoding and Parsing Framework in multi-cloud environments

  • Tao, Dehao;Yang, Zhen;Qin, Xuanmei;Li, Qi;Huang, Yongfeng;Luo, Yubo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol.15 No.8
    • /
    • pp.2849-2864
    • /
    • 2021
  • The emergence of cloud data sharing can create great value, especially in multi-cloud environments. However, the "data island" problem between different cloud service providers (CSPs) has created trust issues in data sharing, which conflict with cloud data users' growing need to share. How to ensure the value of data for both the data owner and the data user before sharing is another challenge limiting large-scale data sharing in multi-cloud environments. To solve these problems, we propose a Uniform Encoding and Parsing Framework (UEPF) with blockchain to support trustworthy and valuable data sharing. We design a namespace-based unique identifier pair to describe data consistently across multiple clouds, and build a blockchain-based data encoding protocol to manage the metadata and identifier pairs in the blockchain ledger. To share data across clouds, we build a data parsing protocol with smart contracts to query and retrieve shared cloud data efficiently. We also build an identifier updating protocol to handle the dynamicity of data, and a data check protocol to ensure data validity. Theoretical analysis and experimental results show that UEPF is highly efficient.
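
A minimal sketch of what a namespace-based identifier pair and a ledger-ready metadata encoding might look like; the field names, schema, and hashing choice are assumptions for illustration, not UEPF's actual protocol.

```python
import hashlib, json
from dataclasses import dataclass, asdict

# Hypothetical identifier pair: a CSP-scoped namespace plus a local id
# unique within that namespace.
@dataclass
class IdentifierPair:
    namespace: str
    local_id: str

@dataclass
class MetadataRecord:
    ident: IdentifierPair
    csp: str         # cloud service provider hosting the object
    uri: str         # where the shared data object lives
    checksum: str    # digest a data-check step could verify against

def encode_for_ledger(record: MetadataRecord) -> str:
    """Serialize the metadata deterministically and hash it for the ledger."""
    payload = json.dumps(asdict(record), sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

rec = MetadataRecord(
    ident=IdentifierPair("csp-a/projects", "dataset-42"),
    csp="csp-a",
    uri="https://csp-a.example/dataset-42",
    checksum=hashlib.sha256(b"object bytes").hexdigest(),
)
print(encode_for_ledger(rec))  # value a smart contract could store and query
```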

A Study on Priorities of the Components of Big Data Information Security Service by AHP

  • 수브르더 비스워스;유진호;정철용
    • The Journal of Society for e-Business Studies
    • /
    • Vol.18 No.4
    • /
    • pp.301-314
    • /
    • 2013
  • Advances in IT are making people's lives more convenient through the existing computing environment as well as numerous mobile and Internet-of-Things environments. With the emergence of these mobile and Internet environments, data is growing explosively, and Big Data environments and services that can exploit such data as an economic asset have appeared. However, while services using Big Data are increasing, discussion of the security of Big Data itself remains insufficient, despite the security problems inherent in the large volumes of data generated for these services. Existing security research on Big Data has also focused mainly on the security of services that use Big Data rather than on the security of Big Data itself. Accordingly, this study investigates Big Data security in order to activate the Big Data service industry. Specifically, we identify the components of security management in a Big Data environment and derive their priorities using the AHP technique.
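
For readers unfamiliar with AHP, the core computation, deriving priority weights from a pairwise comparison matrix via its principal eigenvector and checking consistency, can be sketched as follows; the comparison values are illustrative, not the study's survey data.

```python
import numpy as np

# Hypothetical pairwise comparison matrix for four Big Data security
# components (Saaty 1-9 scale); values are illustrative only.
A = np.array([
    [1,   3,   5,   7],
    [1/3, 1,   3,   5],
    [1/5, 1/3, 1,   3],
    [1/7, 1/5, 1/3, 1],
])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)              # principal eigenvalue index
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                             # normalized priority weights

n = A.shape[0]
ci = (eigvals.real[k] - n) / (n - 1)     # consistency index
ri = 0.90                                # Saaty's random index for n = 4
print("weights:", w.round(3), "CR:", round(ci / ri, 3))  # CR < 0.1 is acceptable
```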

Deriving Basic Living Service Items and Establishing Spatial Data in Rural Areas

  • 김수연;김상범
    • Journal of the Korean Institute of Rural Architecture
    • /
    • Vol.24 No.3
    • /
    • pp.39-46
    • /
    • 2022
  • This study aims to derive basic living service facility items in rural areas and to construct related spatial data. To this end, a literature review was conducted on the laws and systems related to the residential environment and services in rural areas, on rural spatial planning, and on the 2021 'Rural Convention' strategic plan reports for the Jeolla and Gyeongsang regions. Primary data collection and review of the list of rural basic living service items derived from the analysis were then carried out. After data collection, 12 sectors and 44 types of rural basic living service items were derived; data selection was based on the clarity of the data-managing body and on whether the data were established nationwide, publicly disclosed and provided, periodically updated, and grounded in law. Afterwards, spatial data on the derived rural basic living service items were constructed. Because open data provided by various institutions were employed, data structures such as attribute values and code names had to be unified, and abnormal data such as address errors and omissions were refined. The data provided in text form were then converted into spatial data through geocoding, and through a comparative review of the distribution of the converted data against the provided addresses, spatial data covering about 540,000 records related to rural basic living services were finally constructed. Finally, implications for data construction for diagnosing rural living areas were derived from the collection and construction process: data unification, establishment of a data update system, establishment of the attribute values needed for diagnosing rural living areas and for spatial planning, data construction plans for facilities that provide various services, and development of rural living area analysis methods and diagnostic indices. This study is meaningful in that it lays the foundation for data-based rural area diagnosis and rural planning by selecting basic rural living service items and constructing spatial data on the selected items.
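
The geocoding step described above might look roughly like the sketch below; the geopy/Nominatim geocoder, facility names, and addresses are illustrative placeholders, since the study does not specify its tooling.

```python
from geopy.geocoders import Nominatim  # illustrative choice of geocoder

geolocator = Nominatim(user_agent="rural-living-services-demo")

# Hypothetical facility records with text-form addresses.
facilities = [
    {"name": "Village Health Post", "address": "Gaeksa-gil, Wansan-gu, Jeonju"},
    {"name": "Rural Bus Stop",      "address": "not-a-real-address-xyz"},
]

for f in facilities:
    loc = geolocator.geocode(f["address"])
    if loc is None:
        # abnormal data (address errors or omissions) flagged for refinement
        print("needs refinement:", f["name"])
    else:
        f["lat"], f["lon"] = loc.latitude, loc.longitude

print([f for f in facilities if "lat" in f])  # records now usable as spatial data
```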

Analysis of the Current Status of Data Repositories in the Field of Ecological Research

  • Kim, Suntae
    • Proceedings of the National Institute of Ecology of the Republic of Korea
    • /
    • Vol.2 No.2
    • /
    • pp.139-143
    • /
    • 2021
  • In this study, data repository information registered in re3data (re3data.org), a research data registry, was collected. Based on the collected data, the current status of 354 repositories (approximately 14% of all registered repositories) was analyzed, identified using ecological keywords suggested by two experts. The major metadata formats used to describe data in ecological research data repositories include the Federal Geographic Data Committee Content Standard for Digital Geospatial Metadata (FGDC/CSDGM), Dublin Core, ISO 19115, Ecological Metadata Language (EML), Directory Interchange Format (DIF), Darwin Core, Data Documentation Initiative (DDI), and the DataCite Metadata Schema. By country, there are 102 ecological repositories in the US, 34 in Germany, 31 in Canada, and one in Korea. A total of 771 non-profit organizations and 12 for-profit organizations are involved in building ecological research data repositories. The proportion of ecological repositories in re3data that provide data version control (86.6%) is somewhat higher than the proportion across all repositories (83.9%). The results of this study can be used to establish policies for building and operating research data repositories in the ecological field.
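
Harvesting the repository list from re3data can be sketched along these lines; the keyword list is a hypothetical stand-in for the experts' terms, and the XML element names are assumptions about the shape of the API response.

```python
import requests
import xml.etree.ElementTree as ET

# Fetch the public re3data v1 repository list (returns XML).
resp = requests.get("https://www.re3data.org/api/v1/repositories", timeout=30)
root = ET.fromstring(resp.content)

# Hypothetical ecological keyword filter applied to repository names.
KEYWORDS = ("ecolog", "ecosystem", "biodiversity")
hits = [r.findtext("name") for r in root.iter("repository")
        if any(k in (r.findtext("name") or "").lower() for k in KEYWORDS)]
print(len(hits), "candidate ecological repositories by name")
```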

Detection and Correction Method of Erroneous Data Using Quantile Pattern and LSTM

  • Hwang, Chulhyun;Kim, Hosung;Jung, Hoekyung
    • Journal of information and communication convergence engineering
    • /
    • Vol.16 No.4
    • /
    • pp.242-247
    • /
    • 2018
  • Data from K-Water waterworks are collected from various sensors and used as basic data for the operation and analysis of various devices. The sensor data are thus very important, yet they contain erroneous values due to sensor characteristics and the external environment. However, existing cleansing methods concentrate on predicting missing values, and research on methods for detecting and correcting erroneous data is limited. This study detects erroneous data by converting the collected data into quintiles and patterning them. We confirm that the accuracy of detecting erroneous data intentionally injected into real data is higher than that of the conventional method in all cases. In future research, we will verify the proposed system's efficiency and accuracy in various environments.
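
A hedged sketch of the two ingredients named in the title, quintile patterning followed by an LSTM scorer over symbol windows; the window size, architecture, and training setup are assumptions, not the authors' model.

```python
import numpy as np
import pandas as pd
import torch
import torch.nn as nn

# Stand-in sensor series; real input would be K-Water sensor readings.
values = pd.Series(np.random.randn(1000))
symbols = pd.qcut(values, q=5, labels=False)  # quintile pattern, codes 0..4

class Detector(nn.Module):
    """Score a window of quintile symbols for the presence of erroneous data."""
    def __init__(self, n_symbols=5, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(n_symbols, 8)
        self.lstm = nn.LSTM(8, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, window)
        out, _ = self.lstm(self.emb(x))
        return torch.sigmoid(self.head(out[:, -1]))  # P(window has errors)

model = Detector()  # untrained; training on labeled windows is omitted
window = torch.tensor(symbols.values[:50], dtype=torch.long).unsqueeze(0)
print(model(window).item())
```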

Obtaining bootstrap data for the joint distribution of bivariate survival times

  • Kwon, Se-Hyug
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol.20 No.5
    • /
    • pp.933-939
    • /
    • 2009
  • Bivariate data in clinical research often involve two types of failure times: a mark variable for the first failure time and the final failure time. This paper shows how to generate bootstrap data for Bayesian estimation of the joint distribution of bivariate survival times. The observed data were generated from Frank's family of copulas, and fake data were simulated with a Gamma prior on the survival times. The bootstrap data were obtained by combining the observed data with the fake data simulated from them.
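
One way to reproduce the data-generation step is conditional-inversion sampling from Frank's copula with Gamma marginals, then resampling with replacement; the dependence parameter and Gamma shapes below are illustrative, not the paper's settings.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(0)

def frank_sample(theta, n, rng):
    """Draw (u, v) from a Frank copula by inverting the conditional CDF."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    a = np.exp(-theta * u)
    b = w * (np.exp(-theta) - 1.0) / (a - w * (a - 1.0))
    v = -np.log1p(b) / theta
    return u, v

# Couple the two failure times through Frank's family, then map the
# uniforms to illustrative Gamma marginals.
u, v = frank_sample(theta=5.0, n=200, rng=rng)
t1 = gamma.ppf(u, a=2.0, scale=1.5)  # first failure time
t2 = gamma.ppf(v, a=2.0, scale=2.0)  # final failure time

# Bootstrap: resample the combined bivariate pairs with replacement.
pairs = np.column_stack([t1, t2])
boot = pairs[rng.integers(0, len(pairs), size=len(pairs))]
print(boot[:3])
```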

Verification Control Algorithm of Data Integrity Verification in Remote Data sharing

  • Xu, Guangwei;Li, Shan;Lai, Miaolin;Gan, Yanglan;Feng, Xiangyang;Huang, Qiubo;Li, Li;Li, Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol.16 No.2
    • /
    • pp.565-586
    • /
    • 2022
  • The elastic scalability of cloud storage not only provides flexible services for data owners to store their data remotely, but also reduces the operation and management costs of sharing that data. However, data outsourced to the storage space of a cloud service provider raises security concerns about data integrity, and data integrity verification has become an important technology for detecting the integrity of remotely shared data. Yet integrity verification by users without data access rights causes unnecessary overhead for the data owner and the cloud service provider; in particular, malicious users who repeatedly launch integrity verification greatly waste service resources. Since the data owner is a consumer purchasing cloud services, the owner bears both the cost of data storage and that of data verification. This paper proposes a verification control algorithm for the integrity verification of remotely outsourced data. It designs an attribute-based encryption verification control scheme for multiple verifiers. Moreover, the data owner and the cloud service provider jointly construct a common access structure and generate a verification sentinel to check the authority of verifiers against that access structure. Finally, since the cloud service provider cannot know the access structure or the sentinel generation operation, it can authenticate only verifiers who satisfy the access policy to verify the integrity of the corresponding outsourced data. Theoretical analysis and experimental results show that the proposed algorithm achieves fine-grained access control over multiple verifiers for data integrity verification.
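
Stripped of the actual attribute-based cryptography, the control flow reduces to checking a verifier's attributes against the jointly constructed access structure before any integrity check runs; the sketch below is a plain-Python simplification, not the paper's scheme.

```python
from typing import FrozenSet

# Simplification: the access structure is a single AND-clause of attributes;
# the real scheme uses attribute-based encryption and a verification sentinel.
AccessStructure = FrozenSet[str]

def satisfies(verifier_attrs: set[str], policy: AccessStructure) -> bool:
    """A verifier qualifies only if it holds every attribute in the policy."""
    return policy <= verifier_attrs

# Hypothetical policy constructed jointly by the data owner and the CSP.
policy: AccessStructure = frozenset({"auditor", "project-x"})

def verify_integrity(verifier_attrs: set[str], data_ok: bool) -> str:
    if not satisfies(verifier_attrs, policy):
        return "rejected: access structure not satisfied"   # no overhead incurred
    return "data intact" if data_ok else "integrity violation"

print(verify_integrity({"auditor", "project-x"}, data_ok=True))  # authorized
print(verify_integrity({"guest"}, data_ok=True))                 # blocked early
```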