• Title/Summary/Keyword: 데이터 논문 (data paper)

A study on DID metadata processing method according to distance learning data weight (원격교육 학습데이터 가중치에 따른 DID 메타데이터 처리방법 연구)

  • Youn-A Min
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.567-568 / 2023
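  • This paper proposes a DID metadata management method that takes learning-data weights into account, as a way to efficiently manage the learning data generated in distance education using blockchain-based DID technology. The data weight is looked up at a specific position of the metadata identifier, and the processing method can be varied according to that weight. The metadata is processed by differentiating the blockchain's Zero Knowledge Proof processing, which can improve data processing speed and data management efficiency.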

A Study on Building a Big Data Platform for Smart Aids-to-Navigation Services (스마트 항로표지 서비스를 위한 빅데이터 플랫폼 구축 연구)

  • 김경원;박종빈
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2021.11a / pp.57-59 / 2021
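  • Various information on marine conditions is currently collected and managed through aids to navigation installed in Korean waters, and linking this information with other sources, such as weather data provided by the Korea Meteorological Administration, is producing data from which useful services could be developed. However, because the parties and systems that manage each data set are dispersed, the data are difficult to use efficiently. This paper therefore proposes a technique for building a smart aids-to-navigation big data platform that links and analyzes aids-to-navigation data with data collected and managed in other systems, so that it can be used to develop a variety of services based on aids-to-navigation data.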

Exploring the Possibilities of Operation Data Use for Data-Driven Management in National R&D API Management System (데이터 기반 경영을 위한 국가R&D API관리시스템의 운영 데이터 활용 가능성 탐색)

  • Na, Hye-In;Lee, Jun-Young;Lee, Byeong-Hee;Choi, Kwang-Nam
    • The Journal of the Korea Contents Association / v.20 no.4 / pp.14-24 / 2020
  • This paper aims to establish an efficient national R&D Application Programming Interface (API) management system for data-driven management of national R&D and to explore the possibility of using its operational data, in line with recent global policies on data openness and sharing. Following the trend toward opening and sharing national R&D data, we plan to improve management efficiency by analyzing the operational data of the national R&D API service. To this end, we standardized the parameters of the national R&D APIs, which had been distributed separately, and integrated the individual APIs to build a national R&D API management system. The results show that service call traffic for the national R&D API grew 554.5% compared with 2015, when measurement started. In addition, this paper evaluates the possibility of using operational data through data preparation, analysis, and prediction based on service operations management data from the actual operation of the integrated national R&D API management system.

Improving Data Availability by Data Partitioning and Partial Overlapping on Multiple Cloud Storages (다수 클라우드 스토리지로의 데이터 분할 및 부분 중복을 통한 데이터 가용성 향상)

  • Park, Jun-Cheol
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.12B / pp.1498-1508 / 2011
  • When data is lost or temporarily inaccessible because of a cloud service provider's system failure, cracking attempt, malfunction, or outage, the customer has no option but to wait for the provider to recover it. We consider a solution to this problem that can be implemented in the cloud client's domain rather than in the provider's domain. We propose a high-level architecture and scheme for successfully retrieving data units even when several cloud storages are inaccessible at the same time. The scheme is based on partitioning data and partially overlapping the partitions when storing them on multiple cloud storages. In addition to providing a high level of data availability, the scheme makes it possible to re-encrypt data units with new keys in a user-transparent way, and it can produce a complete log of every user's accessed data units for assessing data disclosure, if needed.
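
The idea of partitioning data and partially overlapping it across several storages can be illustrated with a minimal sketch. This is not the scheme from the paper, whose details the abstract does not give. It assumes a simple ring placement in which each chunk is stored on `copies` consecutive storages, and the names `place_chunks` and `recover` are hypothetical.

```python
def place_chunks(data: bytes, n_storages: int, copies: int = 2):
    """Split data into n_storages chunks and store chunk i on `copies`
    consecutive storages (ring placement, an assumption of this sketch)."""
    if not 1 <= copies <= n_storages:
        raise ValueError("copies must be between 1 and n_storages")
    size = -(-len(data) // n_storages)  # ceiling division
    chunks = [data[i * size:(i + 1) * size] for i in range(n_storages)]
    # Each storage is modelled as a dict: chunk index -> chunk bytes.
    storages = [dict() for _ in range(n_storages)]
    for i, chunk in enumerate(chunks):
        for k in range(copies):
            storages[(i + k) % n_storages][i] = chunk
    return storages


def recover(storages, available, n_chunks):
    """Reassemble the data from the reachable storages only.
    Returns None when some chunk has no reachable copy."""
    found = {}
    for s in available:
        found.update(storages[s])
    if len(found) < n_chunks:
        return None
    return b"".join(found[i] for i in range(n_chunks))


if __name__ == "__main__":
    data = b"hello cloud storage!"
    storages = place_chunks(data, n_storages=4, copies=2)
    # Storage 0 is down, the other three still cover every chunk.
    print(recover(storages, available=[1, 2, 3], n_chunks=4))
```

With `copies=2`, this placement survives the loss of any single storage, and of some (but not all) pairs of storages. Raising `copies` trades storage space for tolerance of more simultaneous outages.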

Noise Averaging Effect on Privacy-Preserving Clustering of Time-Series Data (시계열 데이터의 프라이버시 보호 클러스터링에서 노이즈 평준화 효과)

  • Moon, Yang-Sae;Kim, Hea-Suk
    • Journal of KIISE:Computing Practices and Letters / v.16 no.3 / pp.356-360 / 2010
  • Recently, there have been many research efforts on privacy-preserving data mining. In privacy-preserving data mining, preserving the accuracy of mining results is as important as preserving privacy. The random perturbation technique is known to preserve privacy well. However, it destroys the distance orders among time-series. In this paper, we propose the notion of the noise averaging effect of piecewise aggregate approximation (PAA), which can preserve clustering accuracy as much as possible in time-series data clustering. Based on the noise averaging effect, we define the PAA distance for computing distances, and we show that our PAA distance can alleviate the problem of destroyed distance orders in randomly perturbed time-series.
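
The noise averaging effect can be seen in a small sketch: PAA replaces each equal-length segment with its mean, so zero-mean random noise added to a series is largely averaged away in the PAA representation. The function names below are hypothetical. `paa_distance` is taken here to be the plain Euclidean distance between PAA representations, which is an assumption, since the abstract does not give the paper's exact definition.

```python
import numpy as np


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    series = np.asarray(series, dtype=float)
    if len(series) % n_segments != 0:
        raise ValueError("series length must be divisible by n_segments")
    return series.reshape(n_segments, -1).mean(axis=1)


def paa_distance(a, b, n_segments):
    """Euclidean distance between the PAA representations of two series
    (an assumed definition for illustration)."""
    return float(np.linalg.norm(paa(a, n_segments) - paa(b, n_segments)))


def demo(seed=0, length=256, n_segments=16, noise_std=1.0):
    """Compare the spread of raw random noise with its spread after PAA."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_std, size=length)
    raw_std = float(noise.std())
    paa_std = float(paa(noise, n_segments).std())
    return raw_std, paa_std


if __name__ == "__main__":
    raw_std, paa_std = demo()
    print(f"noise std before PAA: {raw_std:.3f}, after PAA: {paa_std:.3f}")
```

For independent noise, the standard deviation of a segment mean shrinks by roughly the square root of the segment length, which is why distances computed on PAA values are less disturbed by the perturbation than distances computed on raw values.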

An Efficient Recovery Technique using Global Buffer on SAN Environments (SAN 환경에서의 전역 버퍼를 이용한 효율적인 회복 기법)

  • Park, Chun-Seo;Kim, Gyeong-Bae;Lee, Yong-Ju;Park, Seon-Yeong;Sin, Beom-Ju
    • The KIPS Transactions:PartA / v.8A no.4 / pp.375-384 / 2001
  • Shared disk file systems use a technique known as file system journaling to support metadata recovery on a SAN (Storage Area Network). In the existing journaling technique, metadata dirtied by one host must be written to disk before other hosts can access it. System performance decreases because the number of disk accesses increases. In this paper, we describe a new recovery technique that uses a global buffer to reduce disk I/O. It transmits the dirtied metadata to the other hosts over the Fibre Channel network on the SAN instead of going through disk I/O, and it supports recovery of critical data by journaling data as well as metadata.

Predictive Optimization Adjusted With Pseudo Data From A Missing Data Imputation Technique (결측 데이터 보정법에 의한 의사 데이터로 조정된 예측 최적화 방법)

  • Kim, Jeong-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.2 / pp.200-209 / 2019
  • When forecasting future values, a model estimated by minimizing training errors can yield test errors higher than the training errors. This is the over-fitting problem, caused by an increase in model complexity when the model is focused only on a given dataset. Some regularization and resampling methods have been introduced to reduce test errors by alleviating this problem, but they have been designed for use with only a given dataset. In this paper, we propose a new optimization approach that reduces test errors by transforming a test error minimization problem into a training error minimization problem. To carry out this transformation, we needed additional data for the given dataset, termed pseudo data. To make proper use of pseudo data, we used three types of missing data imputation techniques. As an optimization tool, we chose the least squares method and combined it with an extra pseudo data instance. Furthermore, we present numerical results supporting our proposed approach, which yielded lower test errors than the ordinary least squares method.

Anomaly Detection In Real Power Plant Vibration Data by MSCRED Base Model Improved By Subset Sampling Validation (Subset 샘플링 검증 기법을 활용한 MSCRED 모델 기반 발전소 진동 데이터의 이상 진단)

  • Hong, Su-Woong;Kwon, Jang-Woo
    • Journal of Convergence for Information Technology / v.12 no.1 / pp.31-38 / 2022
  • This paper applies MSCRED (Multi-Scale Convolutional Recurrent Encoder-Decoder), an expert-independent multivariate time-series analysis model based on unsupervised neural network learning. Because MSCRED is built on an autoencoder model, its training data must not be contaminated. To overcome this limitation, the paper uses a training-data sampling technique called Subset Sampling Validation. Using labeled vibration data from power plant equipment, the classification performance of MSCRED is evaluated with the anomaly score in two cases: 1) abnormal data is mixed into the training data, and 2) the abnormal data is removed from the training data of case 1. On this basis, the paper presents an expert-independent anomaly diagnosis framework that is robust to erroneous data, and offers a concise and accurate solution for multivariate time-series data in various fields.

Visualizing Article Material using a Big Data Analytical Tool R Language (빅데이터 분석 도구 R 언어를 이용한 논문 데이터 시각화)

  • Nam, Soo-Tai;Shin, Seong-Yoon;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.326-327 / 2021
  • Recently, the use of big data has attracted wide interest across a variety of industrial fields. Big data analysis is the process of discovering meaningful new correlations, patterns, and trends in large volumes of data stored in data stores and creating new value. Most big data analysis methods therefore include data mining, machine learning, natural language processing, and pattern recognition, as used in existing statistics and computer science. Using the R language, a big data tool, analysis results can also be expressed through various visualization functions applied to pre-processed text data. This study analyzed data from 29 papers in a specific journal. In the final analysis results, the most frequently mentioned keyword was "Research", which ranked first with 743 occurrences. Based on the results of the analysis, the limitations of the study and its theoretical implications are suggested.

An Investigation on Characteristics and Intellectual Structure of Sociology by Analyzing Cited Data (사회학 분야의 연구데이터 특성과 지적구조 규명에 관한 연구)

  • Choi, Hyung Wook;Chung, EunKyung
    • Journal of the Korean Society for Information Management / v.34 no.3 / pp.109-124 / 2017
  • Across a wide variety of disciplines, data access and reuse practices have increased recently. Researchers increasingly use data sets produced by other researchers and give scholarly credit through citation. In response to this practice, Thomson Reuters launched the Data Citation Index (DCI) in 2012. With the DCI, citations to research data published by researchers are collected and analyzed in a way similar to citations to journal articles. The purpose of this study is to identify the characteristics and intellectual structure of sociology, one of the fields that actively cites data, based on research data. To accomplish this purpose, two data sets were collected and analyzed. First, a total of 8,365 records were collected from the DCI in the field of sociology. Second, a total of 12,132 records were collected from Web of Science through a topic search for 'Sociology'. Co-word analysis of the author-provided keywords in both data sets showed that the intellectual structure of research data-based sociology consisted of two areas and 15 clusters, while that of article-based sociology consisted of three areas and 17 clusters. More importantly, the medical science area was found to be actively studied in research data-based sociology, and public health and psychology were identified as central areas in data citation.