
Data Governance 구성요소 개발과 중요도 분석 (Component Development and Importance Weight Analysis of Data Governance)

  • 장경애;김우제
    • 한국경영과학회지 / Vol. 41, No. 3 / pp.45-58 / 2016
  • Data are important in an organization because they are used in making decisions and obtaining insights. Given the increasing importance of data in modern society, data governance is required to strengthen an organization's competitiveness. However, the concept of data governance has caused confusion because of the myriad guidelines proposed by related institutions and researchers. In this study, we re-established the ambiguous concept of data governance and derived its top-level components by analyzing previous research. We identified the components of data governance and quantitatively analyzed the relations between them using DEMATEL and context analysis, techniques often used to solve complex problems. Three higher components (data compliance management, data quality management, and data organization management) and 13 lower components were derived as data governance components. Furthermore, importance analysis shows that data quality management, data compliance management, and data organization management are, in order of priority, the top components of data governance. This study can serve as a basis for presenting standards or establishing the concept of data governance.
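
The DEMATEL step described above can be sketched numerically: from a direct-influence matrix one computes the total-relation matrix and reads off each component's prominence (overall importance) and net cause/effect role. The 3×3 matrix below uses made-up values for the three top-level components, not the paper's survey data.

```python
import numpy as np

# Hypothetical direct-influence matrix among the three top-level
# components: compliance, quality, and organization management.
# (Illustrative values only -- not the survey data from the paper.)
D = np.array([
    [0.0, 3.0, 2.0],
    [4.0, 0.0, 3.0],
    [2.0, 2.0, 0.0],
])

# Normalize by the largest row sum, then compute the total-relation
# matrix T = N (I - N)^(-1), the core DEMATEL step.
N = D / D.sum(axis=1).max()
T = N @ np.linalg.inv(np.eye(3) - N)

# Row sums (r) = influence given; column sums (c) = influence received.
r, c = T.sum(axis=1), T.sum(axis=0)
prominence = r + c   # overall importance of each component
relation = r - c     # net cause (+) vs. net effect (-)
print(prominence, relation)
```

Ranking the components by `prominence` reproduces the kind of importance ordering the paper reports; the sign of `relation` separates driving components from driven ones.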

Data Mining for High Dimensional Data in Drug Discovery and Development

  • Lee, Kwan R.;Park, Daniel C.;Lin, Xiwu;Eslava, Sergio
    • Genomics & Informatics / Vol. 1, No. 2 / pp.65-74 / 2003
  • Data mining differs from traditional data analysis primarily on one important dimension, namely the scale of the data. That is why not only statistical but also computer science principles are needed to extract information from large data sets. In this paper we briefly review data mining, its characteristics, typical data mining algorithms, and potential and ongoing applications of data mining in the biopharmaceutical industry. The distinguishing characteristics of data mining lie in its understandability, scalability, problem-driven nature, and its analysis of retrospective or observational data, in contrast to experimentally designed data. At a high level, one can identify three types of problems for which data mining is useful: description, prediction, and search. Our brief review of data mining algorithms includes decision trees and rules, nonlinear classification methods, memory-based methods, model-based clustering, and graphical dependency models. Application areas covered are discovery compound libraries, clinical trial and disease management data, genomics and proteomics, structural databases for candidate drug compounds, and other applications of pharmaceutical relevance.

Environmental Survey Data Analysis by Data Fusion Techniques

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 4 / pp.1201-1208 / 2006
  • Data fusion is generally defined as the use of techniques that combine data from multiple sources and gather the information in order to draw inferences; it is also called data combination or data matching. Data fusion is divided into five types: exact matching, judgmental matching, probability matching, statistical matching, and data linking. Gyeongnam province currently conducts a social survey of its residents every year, but the analysis is limited because different survey topics are rotated on a three-year cycle. In this paper, we fuse environmental survey data using a SAS macro. The fused outputs can be used for environmental preservation and environmental improvement.

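
The statistical-matching branch of data fusion mentioned above can be illustrated with a tiny hot-deck sketch: records from a recipient survey borrow a variable from the nearest donor record on the variables both surveys share. All values, variable names, and the distance scaling here are hypothetical.

```python
# Nearest-neighbour statistical matching (a "hot deck" data-fusion
# sketch): two surveys share common variables (age, income); the donor
# survey also measured an environmental-awareness score, which we
# transplant to each recipient record. Data values are hypothetical.

recipients = [  # common variables only
    {"id": 1, "age": 34, "income": 3200},
    {"id": 2, "age": 61, "income": 1800},
]
donors = [      # common variables + the variable to be fused
    {"age": 30, "income": 3000, "env_score": 72},
    {"age": 65, "income": 2000, "env_score": 55},
    {"age": 45, "income": 5000, "env_score": 80},
]

def distance(a, b):
    # Scaled Euclidean distance on the common variables (scales chosen
    # ad hoc so age and income contribute comparably).
    return ((a["age"] - b["age"]) / 10) ** 2 + ((a["income"] - b["income"]) / 1000) ** 2

fused = []
for r in recipients:
    nearest = min(donors, key=lambda d: distance(r, d))
    fused.append({**r, "env_score": nearest["env_score"]})

print(fused)
```

The same idea scales to the survey files in the paper, where the SAS macro performs the matching over many common variables.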

Data Reduction Method in Massive Data Sets

  • Namo, Gecynth Torre;Yun, Hong-Won
    • Journal of information and communication convergence engineering / Vol. 7, No. 1 / pp.35-40 / 2009
  • Many researchers are investigating ways to improve the performance of RFID systems, and many papers have been written to address one of the major drawbacks of this potent technology: data management. Because an RFID system captures billions of records, dirty data and sheer data volume cause problems that the RFID community is working to resolve. Effective data management is therefore essential for handling such large volumes of data. This paper surveys data reduction techniques that attempt to address these issues and introduces a new data reduction algorithm that may serve as an alternative for reducing data in RFID systems. A process for extracting data from the reduced database is also presented. A performance study is conducted to analyze the new algorithm, and our analysis shows the utility and feasibility of the categorization reduction algorithm.
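
The general idea of RFID data reduction can be sketched as duplicate-read smoothing: readers report the same tag many times per second, so only one record per (tag, reader) pair is kept within a time window. This is a generic illustration of the problem the abstract describes, not the paper's categorization reduction algorithm, whose details the abstract does not give.

```python
# RFID read streams are dominated by duplicates: the same tag is
# reported many times while it sits in a reader's field. A basic
# reduction step keeps one record per (tag, reader) per time window.

def reduce_reads(reads, window=5):
    """Keep a read only if the same (tag, reader) pair has not been
    seen within the last `window` seconds."""
    last_seen = {}
    kept = []
    for t, tag, reader in sorted(reads):   # (timestamp, tag_id, reader_id)
        key = (tag, reader)
        if key not in last_seen or t - last_seen[key] >= window:
            kept.append((t, tag, reader))
        last_seen[key] = t                 # sliding window from last read
    return kept

raw = [(0, "EPC1", "R1"), (1, "EPC1", "R1"), (2, "EPC1", "R1"),
       (0, "EPC2", "R1"), (7, "EPC1", "R1")]
print(reduce_reads(raw))
```

Here five raw reads reduce to three: the repeated EPC1 reads at t=1 and t=2 are dropped as duplicates, while the read at t=7 survives because the window has elapsed.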

데이터 가치에 대한 탐색적 연구: 공공데이터를 중심으로 (A Study on the Data Value: In Public Data)

  • 이상은;이정훈;최현진
    • 한국IT서비스학회지 / Vol. 21, No. 1 / pp.145-161 / 2022
  • Data is a key catalyst for the development of the fourth industrial revolution and has been viewed as an essential element of new industries, converging with technologies such as artificial intelligence, augmented/virtual reality, autonomous driving, and 5G. The price and value of data are now determined by how users apply it in context, rather than by the data itself, as in the past supplier-centric view. This study began with the question of which factors increase the value of data from a user perspective rather than a supplier perspective. The study was limited to public data, and its subjects were users who work with data, for example through analysis or development based on data. It was designed to gauge the value of data, which had not been studied from the user's perspective, and it contributes to raising the value of data among the institutions that supply and manage it.

AHP 기법을 활용한 Big Data 보안관리 요소들의 우선순위 분석에 관한 연구 (A Study on Priorities of the Components of Big Data Information Security Service by AHP)

  • 수브르더 비스워스;유진호;정철용
    • 한국전자거래학회지 / Vol. 18, No. 4 / pp.301-314 / 2013
  • Advances in IT are making people's lives more convenient through the traditional computing environment as well as numerous mobile and Internet-of-Things environments. With the emergence of these mobile and Internet environments, data is growing explosively, and Big Data environments and services that can exploit such data as an economic asset are appearing. Although services using Big Data are increasing, discussion of Big Data security remains insufficient even though the large volumes of data generated for these services raise security problems. Moreover, existing security research on Big Data has focused mainly on the security of services that use Big Data rather than on the security of Big Data itself. Accordingly, this study examines Big Data security in order to invigorate the Big Data service industry. Specifically, we identify the components of security management in a Big Data environment and derive their priorities using the AHP method.
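
The AHP step described in the abstract derives priority weights from a pairwise-comparison matrix via its principal eigenvector and checks the consistency of the judgments. The 3×3 comparison matrix below is hypothetical, not the paper's survey data.

```python
import numpy as np

# AHP priority derivation: a reciprocal pairwise-comparison matrix on
# the 1-9 scale (hypothetical judgments over three security components).
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

# Priority weights = normalized principal eigenvector of A.
eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w = w / w.sum()                       # weights sum to 1

# Consistency check: CI = (lambda_max - n) / (n - 1), CR = CI / RI.
n = A.shape[0]
ci = (eigvals.real[k] - n) / (n - 1)
cr = ci / 0.58                        # random index RI = 0.58 for n = 3
print(w, cr)                          # CR < 0.1 means acceptable consistency
```

For the full study, the same computation is repeated for each expert's matrix and each level of the hierarchy, and the weights are aggregated into the final priority ranking.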

Abyss Storage Cluster 기반의 DataLake Framework의 설계 (Draft Design of DataLake Framework based on Abyss Storage Cluster)

  • 차병래;박선;신병춘;김종원
    • 스마트미디어저널 / Vol. 7, No. 1 / pp.9-15 / 2018
  • As the scale of their business systems grows, institutions and organizations generate large volumes of diverse data across the different systems involved. A method is therefore needed to process data from these disparate systems more intelligently and raise efficiency in the business environment. One of the most basic approaches is to create a single domain model, such as a DataLake, that describes the data accurately and represents the most important data of the entire business. To realize the benefits of a DataLake, it is important to define its components, that is, what structure the required functions take and how they operate; these components give the data a life cycle that follows its flow. In addition, from the point of data acquisition and while data flows into the DataLake, metadata must be captured and managed together with security aspects based on data traceability, data lineage, and data sensitivity across the life cycle. For these reasons, we designed a DataLake Framework based on Abyss Storage Cluster.

약물부작용 감시를 위한 공통데이터모델 기반 임상데이터웨어하우스 구축 (Development and Lessons Learned of Clinical Data Warehouse based on Common Data Model for Drug Surveillance)

  • 노미정
    • 한국병원경영학회지 / Vol. 28, No. 3 / pp.1-14 / 2023
  • Purposes: Establishing a clinical data warehouse based on a common data model is very important for offsetting the differing data characteristics of individual medical institutions and for drug surveillance. This study aimed to establish a clinical data warehouse for Dankook University Hospital for drug surveillance and to derive the main items necessary for its development. Methodology/Approach: We extracted nine years of electronic medical record data from Dankook University Hospital (2013.01.01 to 2021.12.31) to build the clinical data warehouse and converted the extracted data into the Observational Medical Outcomes Partnership Common Data Model (version 5.4). Data term mapping was performed using the hospital's electronic medical record data and the standard term mapping guide. To verify the clinical data warehouse, the use of angiotensin receptor blockers and the incidence of liver toxicity were analyzed, and the results were compared with an analysis of the hospital's raw data. Findings: A total of 670,933 electronic medical records were used for the Dankook University clinical data warehouse. After excluding duplicate cases, the target data were mapped to standard terms: diagnoses (100% of cases), drugs (92.1%), and measurements (94.5%) were standardized, while for treatments and surgeries the insurance EDI (electronic data interchange) codes were used as-is. Extraction, conversion, and loading were completed; R-based conversion and loading software was developed for the process, and construction of the clinical data warehouse was completed through data verification. Practical Implications: This study established and verified a common-data-model-based clinical data warehouse for Dankook University Hospital supporting drug surveillance research. By deriving the key points necessary for building such a warehouse, the results provide guidelines for institutions that plan to build a clinical data warehouse in the future.

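
The term-mapping step of such a CDM ETL can be sketched as a lookup from local EMR codes to standard concept IDs, with unmapped rows routed to a review queue; the mapping rate corresponds to figures like the 92.1% reported for drugs. All codes, concept IDs, and field subsets below are invented for illustration; a real ETL would use the OHDSI vocabulary tables and the full OMOP table schemas.

```python
# Minimal sketch of OMOP CDM term mapping: local drug codes are mapped
# to standard concept IDs before loading into DRUG_EXPOSURE.
# Codes and concept IDs are made up for illustration.

local_to_standard = {        # local code -> (standard concept_id, name)
    "DKU-A001": (1111111, "drug A"),
    "DKU-A002": (2222222, "drug B"),
}

emr_rows = [
    {"patient": 1, "local_code": "DKU-A001", "date": "2015-03-02"},
    {"patient": 2, "local_code": "DKU-XXXX", "date": "2016-07-11"},
]

drug_exposure, unmapped = [], []
for row in emr_rows:
    hit = local_to_standard.get(row["local_code"])
    if hit is None:
        unmapped.append(row)             # goes to a manual-review queue
        continue
    concept_id, _name = hit
    drug_exposure.append({
        "person_id": row["patient"],
        "drug_concept_id": concept_id,
        "drug_exposure_start_date": row["date"],
    })

mapping_rate = len(drug_exposure) / len(emr_rows)
print(drug_exposure, unmapped, mapping_rate)
```

Tracking `mapping_rate` per domain (diagnosis, drug, measurement) is exactly how standardization coverage like 100% / 92.1% / 94.5% is reported.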

데이터간 의미 분석을 위한 R기반의 데이터 가중치 및 신경망기반의 데이터 예측 모형에 관한 연구 (A Novel Data Prediction Model using Data Weights and Neural Network based on R for Meaning Analysis between Data)

  • 정세훈;김종찬;심춘보
    • 한국멀티미디어학회논문지 / Vol. 18, No. 4 / pp.524-532 / 2015
  • All data created in the Big Data era potentially contains meaning and correlations with other data. A wide variety of data is created and stored every day across all sectors of society, and research on analyzing and grasping the meaning between data is proceeding briskly. In particular, the accuracy of meaning prediction and the data imbalance problem are important issues in the field of data analysis. In this paper, we propose an R-based data prediction model using data weights and a neural network for analyzing the meaning between data. The proposed model consists of a classification model and an analysis model. The classification model applies weights based on the normal distribution and selects optimal independent variables through multiple regression analysis. The analysis model increases the prediction accuracy of the output variable through a neural network. In the performance evaluation, the proposed model achieved a prediction accuracy of 87.475% on the raw data, confirming its superiority.
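
The normal-distribution weighting used by the classification model can be sketched in one variable: each observation is weighted by the normal density at its value, so records near the mean count more than outliers. The data and single-variable setup are illustrative assumptions; the paper combines this with regression-based variable selection and a neural network, and uses R rather than the Python shown here.

```python
import math

# Weight each observation by the normal density of its value, estimated
# from the sample itself. Hypothetical measurements; the fourth value
# (9.7) is an outlier and should receive the smallest weight.
values = [4.8, 5.1, 5.0, 9.7, 5.2]

mean = sum(values) / len(values)
var = sum((v - mean) ** 2 for v in values) / len(values)
std = math.sqrt(var)

def normal_weight(v):
    # Normal (Gaussian) probability density at v.
    return math.exp(-((v - mean) ** 2) / (2 * var)) / (std * math.sqrt(2 * math.pi))

weights = [normal_weight(v) for v in values]
print(weights)
```

Downweighting outliers this way is one plausible reading of "weights application of normal distribution"; the weighted records would then feed the regression and neural-network stages.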

EPCIS Event 데이터 크기의 정량적 모델링에 관한 연구 (A Study on Quantitative Modeling for EPCIS Event Data)

  • 이창호;조용철
    • 대한안전경영과학회지 / Vol. 11, No. 4 / pp.221-228 / 2009
  • Electronic Product Code Information Services (EPCIS) is an EPCglobal standard for sharing EPC-related information between trading partners. EPCIS provides an important new capability for improving efficiency, security, and visibility in the global supply chain. EPCIS data fall into two categories: master data (static data) and event data (dynamic data). Master data are static and constant for objects, for example the name and code of a product and its manufacturer. Event data describe things that happen dynamically with the passing of time, for example the date of manufacture, the period and route of circulation, and the date of storage in a warehouse. There are four kinds of event data: Object Events, Aggregation Events, Quantity Events, and Transaction Events. In this paper we propose an event-based data model for an EPC Information Service repository in an RFID-based integrated logistics center. This data model can reduce data volume and handle all kinds of entity relationships. From the standpoint of data quantity, we propose a formula model that explains how many EPCIS event records are created per business activity. Using this formula model, we can estimate the size of one day's EPCIS event data for an RFID-based integrated logistics center under an assumed scenario.
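
A formula model of this kind can be sketched as arithmetic over the assumed scenario: so many ObjectEvents per item observation and so many AggregationEvents per packing step, summed over a day. The per-step event counts and the packaging hierarchy below are assumptions for illustration, not the paper's actual model.

```python
# Rough daily EPCIS event-volume estimate for a logistics center.
# Assumptions (illustrative): every item is read at each observation
# point (one ObjectEvent record per read), and one AggregationEvent is
# created per case packed and per pallet built.

def daily_event_count(items_per_day, items_per_case, cases_per_pallet,
                      observation_points):
    cases = items_per_day // items_per_case
    pallets = cases // cases_per_pallet
    object_events = items_per_day * observation_points
    aggregation_events = cases + pallets
    return object_events + aggregation_events

# 10,000 items/day, 50 items per case, 20 cases per pallet, 3 read points:
print(daily_event_count(10_000, 50, 20, 3))
```

Plugging scenario parameters into such a formula is how the daily repository size of an RFID-based integrated logistics center can be estimated before deployment.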