A Fast and Exact Verification of Inter-Domain Data Transfer based on PKI

  • Jung, Im-Y.;Eom, Hyeon-Sang;Yeom, Heon-Y.
    • Journal of Information Technology Applications and Management / v.18 no.3 / pp.61-72 / 2011
  • Trust in data created, processed, and transferred in e-Science environments can be estimated from provenance. The information that forms provenance, which records how the data was created and reached its current state, grows as the data evolves. Tracing and verifying this massive provenance in order to trust the data is a heavy burden. How to trust the provenance-based verification itself is a further issue. This paper proposes a fast and exact verification of inter-domain data transfer and data origin for e-Science environments based on PKI. The verification, called two-way verification, cuts down the overhead of tracking data along the causality represented in the Open Provenance Model by exploiting the domain characteristics of e-Science environments supported by the Grid Security Infrastructure (GSI). The proposed scheme is easy to apply without extra infrastructure, scalable irrespective of the number of provenance records, transparent, cryptographically secure, and low in overhead.

Improving Utilization of GPS Data for Urban Traffic Applications

  • Nguyen, Duc Hai;Nguyen, Tan Phuc;Doan, Khue;Ta, Ho Thai Hai;Pham, Tran Vu;Huynh, Nam;Le, Thanh Van
    • International Journal of Internet, Broadcasting and Communication / v.7 no.1 / pp.6-9 / 2015
  • Intelligent Transportation Systems (ITS) promise better solutions for managing city traffic. Such a system combines many advanced technologies, such as the Global Positioning System (GPS) and Geographic Information Systems (GIS). The foundation of ITS applications is effective data collection and data integration tools. The purpose of our research is to propose solutions that use GPS time series data collected from GPS devices to improve the quality of output traffic data. In this study, GPS data is collected from devices attached to vehicles travelling on routes in Ho Chi Minh City (HCMC). The GPS data is then stored in a database system to serve many transportation applications. The proposed method combines the data usage level and data coverage to improve the quality of traffic data.

Changes in Research Paradigms in Data Intensive Environments

  • Minsoo Park
    • International Journal of Advanced Smart Convergence / v.12 no.4 / pp.98-103 / 2023
  • As technology advanced dramatically in the late 20th century, a new era of science arrived. The emerging era of scientific discovery, variously described as e-Science, cyberscience, and the fourth paradigm, uses technologies required for computation, data curation, analysis, and visualization. The emergence of the fourth research paradigm will shake the foundations of science and will also have a major impact on the role of data-information infrastructure. In the digital age, the roles of data-information professionals are becoming more diverse. As e-Science emerges as a sustainable and growing part of research, data-information professionals and centers are exploring new roles to address the issues that arise from new forms of research. The functions that data-information professionals and centers can fundamentally provide in the e-Science area are data curation, preservation, access, and metadata. At a basic level, this involves discovering and using available technical infrastructure and tools, finding relevant data, establishing a data management plan, and developing tools to support research. A more advanced service is archiving and curating relevant data for long-term preservation, integrating datasets, and providing curation and data management services as part of a data management plan. Adapting to the new information environment of the 21st century requires strong, future-responsive leadership. There is a strong need to respond effectively to future challenges by exploring the role and function of data-information professionals in the future environment. Understanding what types of data-information professionals and skills will be needed is essential to developing the talent that will lead the transformation. The new values and roles of data-information professionals and centers for 21st century researchers in STEAM are discussed.

A Flexible and Expandable Representation Framework for Computational Science Data

  • Kim, Jaesung;Ahn, Sunil;Lee, Jeongchoel;Lee, Jongsuk Ruth
    • Journal of Internet Computing and Services / v.21 no.3 / pp.41-51 / 2020
  • EDISON is a web-based platform that provides easy and convenient use of simulation software on high-performance computers. One of the most important roles of a computational science platform such as EDISON is to post-process and represent simulation result data so that the user can easily understand it. We interviewed EDISON users and collected requirements for post-processing and representation of simulation results, which included i) flexible data representation, ii) support for various data representation components, and iii) flexible and easy development of view templates. In previous studies, it was difficult to develop or contribute data representation components, and view templates could not be shared or reused. This makes it difficult to build an ecosystem for developing representation tools for the many simulation software packages. EDISON-VIEW is a framework for post-processing and representing simulation results produced by the EDISON platform. This paper proposes various methods used in the design and development of the EDISON-VIEW framework to address the above requirements and problems. We have verified its usefulness by applying it to simulation software in various fields such as materials, computational fluid dynamics, computational structural dynamics, and computational chemistry.

LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud

  • Xu, Hua;Liu, Weiqing;Shu, Guansheng;Li, Jing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.204-226 / 2018
  • Big data processing applications have gradually been migrated to the cloud because of the advantages of cloud computing. Hadoop Distributed File System (HDFS) is one of the fundamental support systems for big data processing on MapReduce-like frameworks such as Hadoop and Spark. Since HDFS is not aware of the co-location of virtual machines in the cloud, its default block allocation scheme fits cloud environments poorly in two respects: loss of data reliability and performance degradation. In this paper, we present a novel location-aware data block allocation strategy (LDBAS). LDBAS jointly optimizes data reliability and performance for upper-layer applications by allocating data blocks according to the locations and the differing processing capacities of virtual nodes in the cloud. We apply LDBAS to two stages of data allocation of HDFS in the cloud (the initial data allocation and data recovery) and design the corresponding algorithms. Finally, we implement LDBAS in an actual Hadoop cluster and evaluate its performance with the benchmark suite BigDataBench. The experimental results show that LDBAS can guarantee the designed data reliability while reducing the job execution time of I/O-intensive applications in Hadoop by 8.9% on average and up to 11.2% compared with the original Hadoop in the cloud.
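
As a rough illustration of the location-aware idea in the abstract above, the following Python sketch places the replicas of one block on virtual nodes so that no two replicas share a physical host, preferring nodes with higher processing capacity. It is not the LDBAS algorithm from the paper: the `VirtualNode` fields, the `place_replicas` function, the capacity scores, and the greedy selection rule are all assumptions made for this example.

```python
from collections import namedtuple

# Hypothetical description of a virtual node: its name, the physical host
# it runs on, and a relative processing-capacity score (assumed values).
VirtualNode = namedtuple("VirtualNode", ["name", "host", "capacity"])


def place_replicas(nodes, replication=3):
    """Pick `replication` virtual nodes for the replicas of one block.

    Higher-capacity nodes are preferred, but two replicas are never put on
    virtual nodes that share a physical host, so a single host failure
    cannot destroy every copy of the block.
    """
    chosen = []
    used_hosts = set()
    # Sort by capacity (descending); ties are broken by name for determinism.
    for node in sorted(nodes, key=lambda n: (-n.capacity, n.name)):
        if node.host in used_hosts:
            continue  # a replica already lives on this physical host
        chosen.append(node)
        used_hosts.add(node.host)
        if len(chosen) == replication:
            return chosen
    raise ValueError("not enough distinct physical hosts for the replication factor")


# Illustrative cluster: vm1 and vm2 are co-located on the same physical host.
nodes = [
    VirtualNode("vm1", "hostA", 4),
    VirtualNode("vm2", "hostA", 3),
    VirtualNode("vm3", "hostB", 2),
    VirtualNode("vm4", "hostC", 1),
]
placement = [node.name for node in place_replicas(nodes)]
print(placement)
```

A placement policy that ignores co-location could pick vm1 and vm2 together because they have the highest capacity; the sketch skips vm2 because it shares hostA with vm1.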

Survival Analysis using SRC-Stat Statistical Package (SRC-Stat 통계패키지를 이용한 생존분석)

  • Ha, Il Do;Noh, Maengseok;Lee, Youngjo;Lim, Johan;Lee, Jaeyong;Oh, Heeseok;Shin, Dongwan;Lee, Sanggoo;Seo, Jinuk;Park, Yonhtae;Cho, Sungzoon;Park, Jonghun;Kim, Youkyung;You, Kyungsang
    • The Korean Journal of Applied Statistics / v.28 no.2 / pp.309-324 / 2015
  • In this paper we introduce how to analyze survival data with the SRC-Stat statistical package. The package provides classical survival analysis (e.g. Cox's proportional hazards models for univariate survival data) as well as advanced survival analysis such as shared and nested frailty models for multivariate survival data. We illustrate the use of the package with practical data sets.

Secure Data Sharing in The Cloud Through Enhanced RSA

  • Islam Abdalla Mohamed;Loay F. Hussein;Anis Ben Aissa;Tarak Kallel
    • International Journal of Computer Science & Network Security / v.23 no.2 / pp.89-95 / 2023
  • Cloud computing today provides huge computational resources, storage capacity, and many kinds of data services. Data sharing in the cloud is the practice of exchanging files between various users via cloud technology. The main difficulty with file sharing in the public cloud is maintaining privacy and integrity through data encryption. To address this issue, this paper proposes an Enhanced RSA encryption schema (ERSA) for data sharing in the public cloud that protects privacy and strengthens data integrity. The data owners store their files in the cloud after encrypting the data using the ERSA which combines the RSA algorithm, XOR operation, and SHA-512. This approach can preserve the confidentiality and integrity of a file in any cloud system while data owners are authorized with their unique identities for data access. Furthermore, analysis and experimental results are presented to verify the efficiency and security of the proposed schema.

Component Development and Importance Weight Analysis of Data Governance (Data Governance 구성요소 개발과 중요도 분석)

  • Jang, Kyoung-Ae;Kim, Woo-Je
    • Journal of the Korean Operations Research and Management Science Society / v.41 no.3 / pp.45-58 / 2016
  • Data are important in an organization because they are used in making decisions and obtaining insights. Furthermore, given the increasing importance of data in modern society, data governance is needed to increase an organization's competitive power. However, data governance concepts have caused confusion because of the myriad guidelines proposed by related institutions and researchers. In this study, we clarified the ambiguous concept of data governance and derived its top-level components by analyzing previous research. This study identified the components of data governance and quantitatively analyzed the relations between these components by using DEMATEL and context analysis techniques, which are often used to solve complex problems. Three higher components (data compliance management, data quality management, and data organization management) and 13 lower components are derived as data governance components. Furthermore, importance analysis shows that data quality management, data compliance management, and data organization management are the top components of data governance, in that order of priority. This study can be used as a basis for presenting standards or establishing concepts of data governance.
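
For readers unfamiliar with DEMATEL, here is a minimal Python sketch of the standard calculation: normalise a direct-influence matrix to N, compute the total-relation matrix T = N(I - N)^-1, and derive prominence (D + R) and relation (D - R) from the row sums D and column sums R of T. The function name `dematel`, the normalisation by the largest row or column sum (one common variant), and the sample scores are assumptions for illustration; none of the numbers come from the study.

```python
def dematel(direct):
    """Return (total-relation matrix, prominence, relation) for a square
    direct-influence matrix given as a list of lists."""
    n = len(direct)
    row_sums = [sum(row) for row in direct]
    col_sums = [sum(direct[i][j] for i in range(n)) for j in range(n)]
    scale = max(row_sums + col_sums)
    norm = [[value / scale for value in row] for row in direct]

    # Solve (I - N) T = N with Gauss-Jordan elimination on [(I - N) | N].
    # This equals N (I - N)^-1 because N commutes with (I - N)^-1.
    aug = [
        [(1.0 if i == j else 0.0) - norm[i][j] for j in range(n)] + norm[i][:]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("I - N is singular; the influence series does not converge")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for k in range(n):
            if k != col:
                factor = aug[k][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]

    total = [row[n:] for row in aug]
    given = [sum(row) for row in total]                                  # D
    received = [sum(total[i][j] for i in range(n)) for j in range(n)]    # R
    prominence = [given[i] + received[i] for i in range(n)]
    relation = [given[i] - received[i] for i in range(n)]
    return total, prominence, relation


# Made-up pairwise influence scores among three unnamed components.
scores = [
    [0, 3, 2],
    [1, 0, 3],
    [1, 2, 0],
]
total, prominence, relation = dematel(scores)
for name, p, r in zip("ABC", prominence, relation):
    print(name, round(p, 3), round(r, 3))
```

In the usual reading, a component with a positive relation value is a net cause and one with a negative value is a net effect, while prominence indicates how strongly the component is involved overall.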

Concept and Characteristics of Intelligent Science Lab (지능형 과학실의 개념과 특징)

  • Hong, Oksu;Kim, Kyoung Mi;Lee, Jae Young;Kim, Yool
    • Journal of The Korean Association For Science Education / v.42 no.2 / pp.177-184 / 2022
  • This article aims to explain the concept and characteristics of the 'Intelligent Science Lab', which has been promoted nationwide in Korea since 2021. The Korean Ministry of Education creates a master plan containing a vision for science education every five years. The most recently announced '4th Master Plan for Science Education (2020-2024)' emphasizes the policy of setting up an 'intelligent science lab' in all elementary and secondary schools as an online and offline space for scientific inquiry using advanced technologies, such as the Internet of Things and Augmented and Virtual Reality. The 'Intelligent Science Lab' project is being pursued in two main directions: (1) developing an online platform named 'Intelligent Science Lab-ON' that supports science inquiry classes, and (2) building a science lab space in schools that encourages active student participation while utilizing the online platform. This article presents the key features of the 'Intelligent Science Lab-ON' and the characteristics of intelligent science lab spaces newly built in schools. Furthermore, it introduces inquiry-based science learning programs developed for intelligent science labs. These programs include scientific inquiry activities in which students generate and collect data ('data generation' type), utilize datasets provided by the online platform ('data utilization' type), or utilize open and public data sources ('open data source' type). The Intelligent Science Lab project is expected not only to encourage students to engage in scientific inquiry that solves individual and social problems based on real data, but also to contribute a model of linked online and offline scientific inquiry lessons required in the post-COVID-19 era.