• Title/Abstract/Keywords: Data journal

빅데이터 처리 프로세스에 따른 빅데이터 위험요인 분석 (The Analyzing Risk Factor of Big Data : Big Data Processing Perspective)

  • 이지은;김창재;이남용
    • 한국IT서비스학회지 / Vol. 13, No. 2 / pp. 185-194 / 2014
  • Recently, as the practical value of big data has come to be recognized, a growing number of companies and organizations are creating benefit and profit by applying it. However, specific, theoretical study of the risk factors that may arise when big data is introduced has not been conducted. Accordingly, this study extracts possible risk factors of introducing big data from literature reviews and classifies them by big data processing stage: data collection, data storage, data analysis, visualization of analysis results, and application. The risk factors are then prioritized by degree of risk based on a survey of experts. This study offers a way to avoid risks at each stage of big data processing and to prepare for them in order of risk grade.

A Temporal Data model and a Query Language Based on the OO data model

  • 서용무
    • 한국경영과학회지 / Vol. 14, No. 1 / pp. 87-87 / 1989
  • There has been a lot of research on temporal data management over the past two decades. Most of it is based on a logical data model, especially the relational data model, although some conceptual data models are independent of logical data models. Many properties and issues of temporal data models and temporal query languages have also been studied. However, some of these properties were shown to be incompatible, which means that no complete temporal data model can satisfy all the desired properties at the same time. Many modeling issues discussed in those papers need not arise if an object-oriented data model is taken as the base model. Therefore, this paper proposes a temporal data model based on the object-oriented data model, mainly discussing the most essential issues common to many temporal data models. The new temporal data model and query language are illustrated with a small database created by a set of sample transactions.

A Study of Association Rule Mining by Clustering through Data Fusion

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / Vol. 18, No. 4 / pp. 927-935 / 2007
  • Currently, Gyeongnam province conducts a social index survey of its residents every year. However, analysis of this survey is limited because different surveys are conducted on a three-year cycle. Data fusion offers a solution to this problem. Data fusion is the process of combining multiple data sets in order to provide information of tactical value to the user. However, data fusion is not itself the final result, so efficient analysis of the fused data is also important. In this study, we present a data fusion method for statistical survey data. We also suggest a methodology for applying association rule mining by clustering through data fusion of statistical survey data.

Design and evaluation of a fuzzy cooperative caching scheme for MANETs

  • Bae, Ihn-Han
    • Journal of the Korean Data and Information Science Society / Vol. 21, No. 3 / pp. 605-619 / 2010
  • Caching of frequently accessed data in a multi-hop ad hoc environment is a technique that can improve data access performance and availability. Cooperative caching, which allows sharing and coordination of cached data among several clients, can further enhance the potential of caching techniques. In this paper, we propose a fuzzy cooperative caching scheme for mobile ad hoc networks. The cache management of the proposed scheme not only adaptively uses CacheData or CachePath based on data similarity and data utility, but also uses a replacement manager based on data profit. The proposed scheme also uses a prefetch manager. When the TTL of cached data expires, the prefetch manager evaluates the popularity index of the data. If the popularity index is larger than a threshold, the data is prefetched. Otherwise, its space is released. The performance of the proposed scheme is evaluated analytically and compared to that of other cooperative caching schemes.
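
The prefetch rule described in this abstract (on TTL expiry, prefetch if the popularity index exceeds a threshold, otherwise release the space) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation: the names `CacheEntry` and `on_ttl_check`, the numeric popularity field, and the `fetch` callback are all assumptions introduced here, and the abstract does not say how the popularity index is computed.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CacheEntry:
    """Hypothetical cache record; field names are assumptions for illustration."""
    key: str
    value: bytes
    expires_at: float   # absolute time at which the TTL runs out
    popularity: float   # popularity index (its computation is not given in the abstract)


def on_ttl_check(cache: Dict[str, CacheEntry], key: str, now: float,
                 threshold: float, fetch: Callable[[str], bytes],
                 ttl: float) -> str:
    """Apply the prefetch rule to one entry.

    Returns 'valid' if the TTL has not expired, 'prefetched' if the entry
    was refreshed, or 'released' if its space was freed.
    """
    entry = cache[key]
    if now < entry.expires_at:
        return "valid"
    if entry.popularity > threshold:
        # Popular enough: fetch a fresh copy and restart the TTL.
        entry.value = fetch(key)
        entry.expires_at = now + ttl
        return "prefetched"
    # Not popular enough: release the cache space.
    del cache[key]
    return "released"
```

A caller would run `on_ttl_check` for each cached key on a timer or on access; the threshold and TTL values are tuning parameters that the abstract does not specify.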

지중 송전케이블 자산데이터의 자동 정제 알고리즘 개발연구 (Automatic Cleaning Algorithm of Asset Data for Transmission Cable)

  • Hwang, Jae-Sang;Mun, Sung-Duk;Kim, Tae-Joon;Kim, Kang-Sik
    • KEPCO Journal on Electric Power and Energy / Vol. 7, No. 1 / pp. 79-84 / 2021
  • The fundamental requirement for big data analysis, artificial intelligence technologies, and asset management systems is data quality, which can directly affect the reliability of the entire system. For this reason, data cleaning work has recently gained momentum, and data cleaning methods are being investigated around the world. In the field of electric power, however, asset data cleaning methods have not been fully established. Therefore, this paper studies an automatic cleaning algorithm for asset data of transmission cables. The cleaning algorithm consists of missing data treatment and outlier data treatment. Rule-based and expert-opinion-based cleaning methods are combined and applied to these dirty data.

Brief Paper: An Analysis of Curricula for Data Science Undergraduate Programs

  • Cho, Soosun
    • Journal of Multimedia Information System / Vol. 9, No. 2 / pp. 171-176 / 2022
  • Today, it is imperative to educate students on how best to prepare themselves for the new data-driven era. Undergraduate education plays an important role in providing students with more Data Science opportunities and expanding the supply of Data Science talent. This paper surveys and analyzes the curricula of Data Science-related bachelor's degree programs in the United States. The 'required' and 'elective' courses in each curriculum for obtaining a B.S. degree were evaluated by course weight to indicate their necessity. As a result, it was possible to find out which courses were important in Data Science programs and which areas were emphasized for B.S. degrees in Data Science. We found that courses belonging to the Data Science area, such as data management, data visualization, and data modeling, were more often required for Data Science B.S. degrees in the United States.

데이터간 의미 분석을 위한 R기반의 데이터 가중치 및 신경망기반의 데이터 예측 모형에 관한 연구 (A Novel Data Prediction Model using Data Weights and Neural Network based on R for Meaning Analysis between Data)

  • 정세훈;김종찬;심춘보
    • 한국멀티미디어학회논문지 / Vol. 18, No. 4 / pp. 524-532 / 2015
  • All data created in the big data era potentially contain meaning and correlations. A wide variety of data is created and stored every day in all sectors of society, and research on analyzing and understanding the meaning between data is actively under way. In particular, the accuracy of meaning prediction and the data imbalance problem are important issues in the data analysis field. In this paper, we propose a data prediction model based on data weights and a neural network using R for meaning analysis between data. The proposed model is composed of a classification model and an analysis model. The classification model applies weights based on the normal distribution and selects the optimal independent variables through multiple regression analysis. The analysis model increases the prediction accuracy of the output variable through a neural network. In the performance evaluation, the proposed model was confirmed to be superior: its prediction performance on the primitive data was measured at 87.475%.

EPCIS Event 데이터 크기의 정량적 모델링에 관한 연구 (A Study on Quantitative Modeling for EPCIS Event Data)

  • 이창호;조용철
    • 대한안전경영과학회지 / Vol. 11, No. 4 / pp. 221-228 / 2009
  • Electronic Product Code Information Services (EPCIS) is an EPCglobal standard for sharing EPC-related information between trading partners. EPCIS provides an important new capability to improve efficiency, security, and visibility in the global supply chain. EPCIS data are classified into two categories: master data (static data) and event data (dynamic data). Master data are static and constant for objects, for example, the name and code of a product and its manufacturer. Event data refer to things that happen dynamically with the passing of time, for example, the date of manufacture, the period and route of circulation, and the date of storage in a warehouse. There are four kinds of event data: Object Event data, Aggregation Event data, Quantity Event data, and Transaction Event data. In this paper, we propose an event-based data model for an EPC Information Service repository in an RFID-based integrated logistics center. This data model can reduce the data volume and handle all kinds of entity relationships well. From the viewpoint of data quantity, we propose a formula model that explains how many EPCIS event data are created per business activity. Using this formula model, we can estimate the size of the EPCIS event data of an RFID-based integrated logistics center for one day under an assumed scenario.

고층기상관측자료를 이용한 바람장 개선 효과 연구 (The Effects of Data Assimilation on Simulated Wind Fields Using Upper-Air Observations)

  • 정주희;권지혜;김유근
    • 한국환경과학회지 / Vol. 16, No. 10 / pp. 1127-1137 / 2007
  • We focused on the effects of data assimilation on simulated wind fields using upper-air observations (wind profiler and sonde data). The Local Analysis and Prediction System (LAPS), a type of data assimilation system, was used for wind field modeling. Five simulation experiments were performed for sensitivity analysis: EXP0) no data assimilation, EXP1) surface data, EXP2) surface data and sonde data, EXP3) surface data and wind profiler data, and EXP4) surface data, sonde data, and wind profiler data. These were compared with observation data. The results showed that the effect of data assimilation with wind profiler data was greater than that with sonde data. The fine-scale wind fields in the complex coastal area were simulated well in EXP3. EXP3 and EXP4, which used wind profiler data with high vertical resolution, represented subtle differences in wind speed better than EXP1 and EXP2, because the assimilation of wind profiler data adjusted the first-guess field more sensitively than that of sonde observations.

공공데이터 융합역량 수준에 따른 데이터 기반 조직 역량의 연구 (A Study on the Data-Based Organizational Capabilities by Convergence Capabilities Level of Public Data)

  • 정병호;주형근
    • 디지털산업정보학회논문지 / Vol. 18, No. 4 / pp. 97-110 / 2022
  • The purpose of this study is to analyze the level of public data convergence capabilities of administrative organizations and to explore important variables in data-based organizational capabilities. The theoretical background summarizes public data and the activation of its use, joint use, convergence, administrative organizations, and convergence constraints, explained with reference to the Public Data Act, the Electronic Government Act, and the Data-Based Administrative Act. The research model examines the effect of data-based administrative capability, public data operation capabilities, and public data operation constraints on data-based organizational capabilities. It also examines whether data-based organizational operation capabilities differ by level of data convergence capability. The analysis was conducted with hierarchical cluster analysis and multiple regression analysis. The results are as follows. First, hierarchical cluster analysis classified the organizations into three groups: a group that uses only public data and structured data, a group that uses public data in both structured and unstructured form, and a group that uses both public and private data. Second, the critical variables of data-based organizational operation capabilities were found in data-based administrative planning and administrative technology, supervisory organizations and technical systems for public data convergence, and data sharing and market transaction constraints. Finally, the essential independent variables for data-based organizational competencies differ by group. As a theoretical implication, this research updates the management information systems literature by explaining the Public Data Act, the Electronic Government Act, and the Data-Based Administrative Act. As a practical implication, activating the use of public data requires promoting data standardization and search convenience and eliminating lukewarm attitudes and selfish behavior toward data sharing.