• Title/Summary/Keyword: Data journal

An Exploratory Study on the Influencing Factor on Utilization of Public Data (공공데이터 활용에 미치는 영향 요인에 관한 탐색적 연구)

  • Junyoung Jeong;Keuntae Cho
    • Journal of Information Technology Services / v.23 no.2 / pp.49-62 / 2024
  • The purpose of this study is to empirically identify which factors affect the utilization of public data from the users' perspective. Drawing on previous studies, it proposes four crucial factors: understandability, processing convenience, linkage, and timeliness. The results show that understandability and linkage have a significant impact on the utilization of public data, and that this impact does not differ across industry classifications. The implication is that, to activate the utilization of public data, it is important to provide sufficient information so that open data users can easily understand what kind of data it is, and to facilitate the linkage of open data with other data.

Utilizing Agricultural Data with the Prosumer Concept

  • Se-Yun Tak;Neung-Hoe Kim
    • International Journal of Internet, Broadcasting and Communication / v.16 no.3 / pp.329-333 / 2024
  • With the increasing application of technologies developed in the Fourth Industrial Revolution, data have come to replace important knowledge and experience in the agricultural field. Although data-based smart agriculture is growing at an average annual rate of 8.57%, research on ways to utilize the data produced alongside it remains insufficient. Because such data may considerably help stakeholders involved in agricultural activities, we deployed the prosumer concept to revitalize agricultural data. We systematically structured and defined three relevant entities: the prosumer, which produces and consumes agricultural data; the database, which systematically processes and integrates agricultural data; and the consumer, which utilizes agricultural data in various ways. Our framework is designed to help stakeholders use agricultural data to improve the quality of crops, minimize the failure of agricultural activities, quickly adapt to new environments and methods of crop production, and find effective solutions to relevant issues.

Fabricator based on B+Tree for Metadata Management in Distributed Environment

  • Chae-Yeon Yun;Seok-Jae Moon
    • International Journal of Advanced Smart Convergence / v.13 no.3 / pp.125-134 / 2024
  • In a distributed environment, data fabric refers to the technology and architecture that provide data management, integration, and access in a consistent and unified manner. To build a data fabric, it is necessary to maintain data consistency, establish a data governance system, reduce structural differences between data sources, and provide a unified view. In this paper, we propose the Fabricator system, a technology that provides data management and access in a consistent and unified manner by building a metadata registry. Fabricator manages the addition and modification of metadata schemas and the matching process through a matching tool called MetaSB Manager, which applies a B+Tree index. This allows real-time integration of various data sources in a distributed environment, maximizing the flexibility and usability of data.
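
The abstract describes MetaSB Manager only at the architecture level and does not publish code; the sketch below is a minimal, assumed illustration of a B+Tree-style metadata registry (the names `MetaSBManager`, `register_schema`, and `range_lookup` are hypothetical), with Python's `bisect` module standing in for the ordered index that an actual B+Tree would provide on disk.

```python
import bisect
from dataclasses import dataclass, field

@dataclass
class MetaSBManager:
    """Minimal stand-in for the MetaSB Manager described in the paper.

    A real implementation would use a disk-backed B+Tree so that schema
    lookups stay O(log n) as the registry grows; here the ordered-key
    behaviour is emulated with a sorted list and bisect.
    """
    _keys: list = field(default_factory=list)      # sorted schema names
    _schemas: dict = field(default_factory=dict)   # name -> schema definition

    def register_schema(self, name: str, schema: dict) -> None:
        """Add or modify a metadata schema."""
        if name not in self._schemas:
            bisect.insort(self._keys, name)        # keep keys ordered, as a B+Tree would
        self._schemas[name] = schema

    def lookup(self, name: str):
        """Point lookup by schema name."""
        return self._schemas.get(name)

    def range_lookup(self, lo: str, hi: str) -> list:
        """Ordered range scan, the operation a B+Tree makes cheap."""
        i, j = bisect.bisect_left(self._keys, lo), bisect.bisect_right(self._keys, hi)
        return [(k, self._schemas[k]) for k in self._keys[i:j]]

# Example: register schemas from two sources and scan them by name prefix.
mgr = MetaSBManager()
mgr.register_schema("sales.orders", {"order_id": "int", "amount": "decimal"})
mgr.register_schema("sales.refunds", {"order_id": "int", "reason": "string"})
print(mgr.range_lookup("sales.", "sales.~"))
```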

A Survey on the Mobile Crowdsensing System life cycle: Task Allocation, Data Collection, and Data Aggregation

  • Xia Zhuoyue;Azween Abdullah;S.H. Kok
    • International Journal of Computer Science & Network Security / v.23 no.3 / pp.31-48 / 2023
  • The popularization of smart devices and the subsequent optimization of their sensing capacity have resulted in a novel mobile crowdsensing (MCS) pattern, which employs smart devices as sensing nodes by recruiting users to develop a sensing network for multiple-task performance. This technique has garnered much scholarly interest in terms of sensing range, cost, and integration, and MCS is prevalent in various fields, including environmental monitoring, noise monitoring, and road monitoring. A complete MCS life cycle entails task allocation, data collection, and data aggregation. Nevertheless, despite extensive research on this life cycle, specific drawbacks remain unresolved. This article summarizes single-task allocation, multi-task allocation, and spatio-temporal multi-task allocation at the task allocation stage. At the data collection stage, the quality, safety, and efficiency of data collection are discussed, and edge computing, which provides a novel way to derive data from the MCS system, is also highlighted. At the data aggregation stage, data aggregation security and quality are summarized, and the development of multi-modal data aggregation is outlined, following the diversity of data obtained from MCS. Overall, this article summarizes the three stages of the MCS life cycle, analyzes their open issues, and offers developmental directions for future scholars' reference.
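
As the survey itself contains no code, the following sketch illustrates only the simplest case of the task allocation stage: a greedy single-task allocation that assigns each sensing task to the nearest participant with spare capacity. All names (`Task`, `Worker`, `greedy_allocate`) and the distance-based rule are illustrative assumptions, not a method taken from the surveyed papers.

```python
import math
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    lat: float
    lon: float

@dataclass
class Worker:
    worker_id: str
    lat: float
    lon: float
    capacity: int = 1   # how many tasks this participant will accept

def greedy_allocate(tasks, workers):
    """Assign each sensing task to the nearest worker with spare capacity.

    This is the simplest single-task allocation baseline; the surveyed
    literature extends it with incentives, quality estimates, and
    spatio-temporal constraints.
    """
    load = {w.worker_id: 0 for w in workers}
    assignment = {}
    for t in tasks:
        candidates = [w for w in workers if load[w.worker_id] < w.capacity]
        if not candidates:
            break
        best = min(candidates,
                   key=lambda w: math.hypot(w.lat - t.lat, w.lon - t.lon))
        assignment[t.task_id] = best.worker_id
        load[best.worker_id] += 1
    return assignment

tasks = [Task("noise-1", 37.50, 127.02), Task("air-2", 37.51, 127.04)]
workers = [Worker("u1", 37.49, 127.01), Worker("u2", 37.52, 127.05)]
print(greedy_allocate(tasks, workers))
```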

Performance Analysis of Perturbation-based Privacy Preserving Techniques: An Experimental Perspective

  • Ritu Ratra;Preeti Gulia;Nasib Singh Gill
    • International Journal of Computer Science & Network Security / v.23 no.10 / pp.81-88 / 2023
  • Enormous amounts of data are produced every second, and these data contain private information from sources including media platforms, the banking sector, finance, healthcare, and criminal histories. Data mining is a method for searching through and analyzing massive volumes of data to find usable information, but preserving personal data during data mining has become difficult, so privacy-preserving data mining (PPDM) is used to do so. Data perturbation is one of several tactics used by the PPDM data privacy protection mechanism: datasets are perturbed in order to preserve personal information, addressing both data accuracy and data privacy. This paper explores and compares several perturbation strategies that may be used to protect data privacy. For this experiment, two perturbation techniques based on random projection and principal component analysis were used: Improved Random Projection Perturbation (IRPP) and an Enhanced Principal Component Analysis based Technique (EPCAT). The Naive Bayes classification algorithm is used for the data mining step, and the methods are evaluated in terms of precision, accuracy, and run time. For both the cardiovascular and hypothyroid datasets, the random projection-based technique (IRPP) is found to be the best perturbation method under Naive Bayes classification.
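
The abstract does not specify the exact IRPP or EPCAT procedures, so the sketch below only demonstrates the general idea being compared: perturb the data by random projection, then check how much Naive Bayes accuracy is retained. It uses scikit-learn with a synthetic dataset, and the projection dimension and other parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the cardiovascular / hypothyroid datasets used in the paper.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8, random_state=0)

def random_projection_perturb(X, out_dim, seed=0):
    """Perturb the data by projecting it onto a random lower-dimensional basis.

    Individual attribute values are no longer directly recoverable, but pairwise
    distances are approximately preserved, so a classifier trained on the
    perturbed data can stay close to the original accuracy.
    """
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], out_dim)) / np.sqrt(out_dim)
    return X @ R

for name, data in [("original", X), ("perturbed", random_projection_perturb(X, out_dim=10))]:
    X_tr, X_te, y_tr, y_te = train_test_split(data, y, test_size=0.3, random_state=1)
    model = GaussianNB().fit(X_tr, y_tr)
    print(name, "accuracy:", round(accuracy_score(y_te, model.predict(X_te)), 3))
```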

A Study on Priorities of the Components of Big Data Information Security Service by AHP (AHP 기법을 활용한 Big Data 보안관리 요소들의 우선순위 분석에 관한 연구)

  • Biswas, Subrata;Yoo, Jin Ho;Jung, Chul Yong
    • The Journal of Society for e-Business Studies / v.18 no.4 / pp.301-314 / 2013
  • The development of IT technology across computing, mobile, and internet environments has made human life easier, and with the spread of mobile and internet services, data is growing rapidly. Organizations can take advantage of these data as economic assets, which has given rise to the Big Data environment and Big Data security services. Although Big Data services are increasing, security for these services remains insufficient. In terms of Big Data security, studies on security by Big Data are increasing, which creates value for "security by Big Data" rather than "security for Big Data". Accordingly, this paper shows how security for Big Data can vitalize Big Data services for organizations. In detail, it derives the priorities of the components of a Big Data information security service by AHP.
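
The abstract does not reproduce the pairwise comparison values, so the sketch below only shows how AHP priorities are typically computed: take a pairwise comparison matrix on Saaty's 1-9 scale, extract its principal eigenvector as the weight vector, and check the consistency ratio. The three "security components" and their comparison values are hypothetical.

```python
import numpy as np

# Hypothetical pairwise comparison matrix for three security components
# (e.g. access control, encryption, monitoring); a_ij states how much more
# important component i is than component j on Saaty's 1-9 scale.
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

# Priorities = normalized principal eigenvector of the comparison matrix.
eigvals, eigvecs = np.linalg.eig(A)
principal = eigvecs[:, np.argmax(eigvals.real)].real
weights = principal / principal.sum()

# Consistency ratio: CR = ((lambda_max - n) / (n - 1)) / RI, with RI = 0.58 for n = 3.
lambda_max = eigvals.real.max()
n = A.shape[0]
ci = (lambda_max - n) / (n - 1)
cr = ci / 0.58
print("weights:", np.round(weights, 3), "CR:", round(cr, 3))
```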

Draft Design of DataLake Framework based on Abyss Storage Cluster (Abyss Storage Cluster 기반의 DataLake Framework의 설계)

  • Cha, ByungRae;Park, Sun;Shin, Byeong-Chun;Kim, JongWon
    • Smart Media Journal / v.7 no.1 / pp.9-15 / 2018
  • As an organization grows in size, many different types of data are generated in different systems, and there is a need to improve efficiency by processing the data in those systems more intelligently. Like a DataLake, the goal is to create a single domain model that accurately describes the data and can represent the most important data for the entire business. In order to realize the benefits of a DataLake, it is important to know how a DataLake may be expected to work and which architectural components may help to build a fully functional DataLake. DataLake components have a life cycle that follows the data flow: as data flows into a DataLake from the point of acquisition, its metadata is captured and managed along with data traceability, data lineage, and security aspects based on data sensitivity across its life cycle. For this reason, we have designed the DataLake Framework based on the Abyss Storage Cluster.
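
The paper describes the framework at the architecture level only; as a minimal sketch of the idea that metadata (traceability, lineage, sensitivity) is captured at the point of acquisition and follows the data through its life cycle, the snippet below records such metadata for a file entering a data lake. The `ingest` function, its fields, and the JSON-lines catalog are assumptions for illustration, not part of the Abyss Storage Cluster implementation.

```python
import hashlib
import json
import time
from pathlib import Path

def ingest(path: str, source_system: str, sensitivity: str, catalog: str = "catalog.jsonl"):
    """Capture descriptive metadata, lineage, and sensitivity for a file
    entering the data lake, so the metadata can follow the data across
    its life cycle."""
    p = Path(path)
    record = {
        "name": p.name,
        "size_bytes": p.stat().st_size,
        "sha256": hashlib.sha256(p.read_bytes()).hexdigest(),  # traceability
        "source_system": source_system,                        # lineage
        "sensitivity": sensitivity,                            # drives access control
        "ingested_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
    }
    with open(catalog, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record

# Example: register a sensor dump arriving from an upstream system.
# ingest("sensor_dump.csv", source_system="plant-A", sensitivity="internal")
```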

Development and Lessons Learned of Clinical Data Warehouse based on Common Data Model for Drug Surveillance (약물부작용 감시를 위한 공통데이터모델 기반 임상데이터웨어하우스 구축)

  • Mi Jung Rho
    • Korea Journal of Hospital Management / v.28 no.3 / pp.1-14 / 2023
  • Purposes: Establishing a clinical data warehouse based on a common data model is very important for offsetting the different data characteristics of each medical institution and for drug surveillance. This study attempted to establish a clinical data warehouse for Dankook University Hospital for drug surveillance and to derive the main items necessary for development. Methodology/Approach: This study extracted nine years of electronic medical record data from Dankook University Hospital (2013.01.01 to 2021.12.31) to build a clinical data warehouse. The extracted data were converted into the Observational Medical Outcomes Partnership Common Data Model (Version 5.4). Data term mapping was performed using the electronic medical record data of Dankook University Hospital and the standard term mapping guide. To verify the clinical data warehouse, the use of angiotensin receptor blockers and the incidence of liver toxicity were analyzed, and the results were compared with an analysis of the hospital's raw data. Findings: This study used a total of 670,933 electronic medical records for the Dankook University clinical data warehouse. Excluding overlapping cases, the target data were mapped to standard terms: diagnoses (100% of cases), drugs (92.1%), and measurements (94.5%) were standardized, while for treatments and surgeries the insurance EDI (electronic data interchange) codes were used as is. Extraction, conversion, and loading were completed; R language-based conversion and loading software was developed for the process, and construction of the clinical data warehouse was completed through data verification. Practical Implications: In this study, a clinical data warehouse for Dankook University Hospital based on a common data model supporting drug surveillance research was established and verified. By deriving the key points necessary for building a clinical data warehouse, the results provide guidelines for institutions that want to build one in the future.
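
The paper's ETL was implemented in R against OMOP CDM v5.4 and its mapping tables are not published; the Python sketch below only illustrates the term mapping step, converting hypothetical local EMR drug codes into DRUG_EXPOSURE-style rows keyed by standard concept IDs. All local codes and concept IDs are placeholders.

```python
from datetime import date

# Placeholder local-code -> standard-concept-ID map; in the actual ETL this
# comes from the hospital's term mapping table against OMOP standard vocabularies.
DRUG_CONCEPT_MAP = {
    "DKU-ARB-001": 40000001,   # hypothetical local code for an ARB product
    "DKU-ARB-002": 40000002,
}

def to_drug_exposure(emr_rows):
    """Convert local EMR prescription rows into OMOP DRUG_EXPOSURE-style rows.

    Rows whose local code has no standard mapping are skipped and counted, which
    is how a mapping rate such as the paper's 92.1% for drugs would be measured.
    """
    mapped, unmapped = [], 0
    for i, row in enumerate(emr_rows, start=1):
        concept_id = DRUG_CONCEPT_MAP.get(row["local_drug_code"])
        if concept_id is None:
            unmapped += 1
            continue
        mapped.append({
            "drug_exposure_id": i,
            "person_id": row["patient_id"],
            "drug_concept_id": concept_id,
            "drug_exposure_start_date": row["start_date"],
        })
    return mapped, unmapped

rows = [{"patient_id": 1, "local_drug_code": "DKU-ARB-001", "start_date": date(2021, 3, 2)}]
print(to_drug_exposure(rows))
```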

Development of a National Research Data Platform for Sharing and Utilizing Research Data

  • Shin, Youngho;Um, Jungho;Seo, Dongmin;Shin, Sungho
    • Journal of Information Science Theory and Practice / v.10 no.spc / pp.25-38 / 2022
  • Research data means data used or created in the course of research or experiments. Research data is very important for validating the research conducted and for use in future research and projects. Recently, convergence research between fields and international cooperation have been carried out continuously, driven by the explosive increase of research data and the growing complexity of science and technology. Developed countries are actively promoting open science policies that share research results and processes to create new knowledge and value through convergence research. Communities that promote the sharing and utilization of research data, such as RDA (Research Data Alliance) and COAR (Confederation of Open Access Repositories), are active, and various platforms for managing and sharing research data are being developed and used. OpenAIRE (Open Access Infrastructure for Research In Europe), a research data platform in Europe, ARDC (Australian Research Data Commons) in Australia, and IRDB (Institutional Repositories DataBase) in Japan provide research data or related services. Korea has been establishing and implementing a research data sharing and utilization strategy to promote the sharing and utilization of research data at the national level, led by the central government. Based on this strategy, KISTI has been building the Korean research data platform DataON since 2018 and has been providing research data sharing and utilization services to users since January 2020. This paper reviews the characteristics of DataON and how it is used for research by showing its applications.

Linking Bibliographic Data and Public Library Service Data Using Bibliographic Framework (서지프레임워크를 활용한 공공도서관 서지데이터와 서비스 데이터의 연계)

  • Park, Zi-young
    • Journal of the Korean Society for Information Management / v.33 no.1 / pp.293-316 / 2016
  • This study aims to improve the bibliographic data of public libraries by linking service data, which are produced by library service programs. Service data collected from seven award-winning public libraries were selected and analyzed, and a bibliographic framework is used for linking bibliographic data and service data. Interfaces are also suggested for two-way data linking. The results can be used to obtain 1) selective and value-added bibliographic data, 2) bibliographic data updated continuously throughout the lifecycle, 3) structured service data for preservation and sharing, and 4) bibliographic data linked to additional external linked data.
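
The study's mapping tables are not reproduced in the abstract; assuming the framework referenced is the Library of Congress BIBFRAME vocabulary, the sketch below uses rdflib to show the general pattern of describing a catalog Work and linking it to a locally modeled service-program record. The local namespace and the `usedInProgram` linking property are assumptions for illustration.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

BF = Namespace("http://id.loc.gov/ontologies/bibframe/")
LIB = Namespace("http://example.org/library/")   # assumed local namespace

g = Graph()
g.bind("bf", BF)
g.bind("lib", LIB)

# A bibliographic Work from the library catalog.
work = LIB["work/123"]
g.add((work, RDF.type, BF.Work))
g.add((work, RDFS.label, Literal("Example title")))

# A service-program record (e.g. a reading program that used this work),
# linked back to the Work via a locally defined property, since BIBFRAME
# itself does not model library service programs.
program = LIB["program/2016-reading-club"]
g.add((program, RDFS.label, Literal("2016 reading club session")))
g.add((work, LIB["usedInProgram"], program))     # assumed linking property

print(g.serialize(format="turtle"))
```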