• Title/Summary/Keyword: data scientist (데이터과학자)

A Data Protection Scheme based on Hilbert Curve for Data Aggregation in Wireless Sensor Network (센서 네트워크에서 데이터 집계를 위한 힐버트 커브 기반 데이터 보호 기법)

  • Yoon, Min; Kim, Yong-Ki; Chang, Jae-Woo
    • Journal of KIISE: Computing Practices and Letters / v.16 no.11 / pp.1071-1075 / 2010
  • Because a sensor node in a wireless sensor network (WSN) has limited resources, such as battery capacity and memory, data aggregation techniques have been studied to manage these resources efficiently. Because sensor networks use wireless communication, data can be disclosed to an attacker. Thus, the study of data protection schemes for data aggregation is essential in WSNs. However, the existing data aggregation methods require a large amount of both computation and communication during network construction and data aggregation processing. To solve this problem, we propose a data protection scheme based on the Hilbert curve for data aggregation. Our scheme minimizes communication among neighboring sensor nodes by using tree-based routing. Moreover, it protects the data from attackers by encrypting it with a Hilbert curve technique based on a private seed. Finally, we show that our scheme outperforms the existing methods in terms of message transmission and average sensor node lifetime.
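
As a rough illustration of the Hilbert curve that the scheme above builds on, the sketch below maps a grid cell (x, y) to its position along a Hilbert curve using the widely known iterative conversion. The function name `xy_to_hilbert` and the grid sizes are assumptions chosen for illustration. The abstract does not describe how the private seed parameterizes the curve, so that step is not shown and this should not be read as the authors' encryption method.

```python
def xy_to_hilbert(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert curve index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next, smaller sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# On a 2x2 grid the curve visits (0,0), (0,1), (1,1), (1,0) in that order.
order = sorted(((x, y) for x in range(2) for y in range(2)),
               key=lambda c: xy_to_hilbert(2, c[0], c[1]))
print(order)
```

Cells that are consecutive along the curve are always grid neighbors, which is the locality property that makes the curve attractive for encoding spatial or numeric values.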

The value and sharing of medical research data (의학연구데이터의 가치와 공유의 의미)

  • Kim, Na Won
    • Proceedings of the Korean Society for Information Management Conference / 2017.08a / pp.104-104 / 2017
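  • Research data are factual records, such as numbers, text, images, and audio, comprising the primary materials used in scientific research and the research results produced directly by researchers. Medicine, the subject area of this study, is a discipline with high potential value and usability and a public-interest character; this study seeks to identify the value of medical research data and the meaning of sharing them by examining the types of such data and the need to manage them. It also reviews the current state of publication and sharing of clinical trial records and research papers, which are representative forms of research data, and considers what role libraries can play. Medical research data include patient medical records, health checkup records, clinical records, death records, clinical trial records, genomic information, and research papers; they are diverse in type and form and are often large in volume. Medical research involves many difficulties in the research process, such as personal information protection and ethical issues, but because it is a discipline directly related to the treatment and prevention of disease and, ultimately, to human health, managing medical research data for preservation, disclosure, and sharing is highly significant. Managing medical research data not only provides a foundation for new research but also gives researchers in developing and underdeveloped countries significant opportunities, contributing to the global advancement of medicine. It also helps prevent the concealment of clinical trial results and fraudulent research, so data used in published scholarly research papers are required to be registered, not only in the United States but worldwide. Recognized clinical trial registries include NIH's ClinicalTrials.gov and the Primary Registries of ICTRP, and in Korea there is CRIS, managed by the Korea National Institute of Health under the Korea Centers for Disease Control and Prevention. Medical researchers should consider the collection, use, preservation, and sharing of research data from the start of a study, but they face difficulties due to time and physical constraints; services that support them are an area of growing interest among libraries and are being tried at the Virginia Commonwealth University library and the Emory University library, among others. This service is a new opportunity for librarians to provide support during the research process: by organizing step-by-step stories for researchers' data management and by supporting and teaching the writing of DMPs, librarians can establish themselves as new actors in scholarly communication.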

Probability of default validation in a corporate credit rating model (국내모회사와 해외자회사 신용평가모형의 적합성 검증 연구)

  • Lee, Woosik; Kim, Dong-Yung
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.605-615 / 2017
  • Recently, the financial supervisory authority of Korea and international credit rating agencies have been concerned about stand-alone ratings, which are calculated without incorporating the guaranteed support of parent companies. Because they are guaranteed by their parent companies, most foreign subsidiaries keep good credit ratings in spite of weak financial status. However, if the parent companies stop supporting them, the foreign subsidiaries could go bankrupt. In this paper, we validate a credit rating model for Korean companies owning foreign subsidiaries using statistical measures of performance, calibration, and stability.

Noise Averaging Effect on Privacy-Preserving Clustering of Time-Series Data (시계열 데이터의 프라이버시 보호 클러스터링에서 노이즈 평준화 효과)

  • Moon, Yang-Sae; Kim, Hea-Suk
    • Journal of KIISE: Computing Practices and Letters / v.16 no.3 / pp.356-360 / 2010
  • Recently, there have been many research efforts on privacy-preserving data mining. In privacy-preserving data mining, preserving the accuracy of mining results is as important as preserving privacy. The random perturbation technique is known to preserve privacy well. However, it destroys the distance orders among time series. In this paper, we propose the notion of the noise averaging effect of piecewise aggregate approximation (PAA), which can preserve clustering accuracy as much as possible in time-series data clustering. Based on the noise averaging effect, we define the PAA distance for computing distances. We then show that the PAA distance can alleviate the problem of destroyed distance orders in randomly perturbed time series.
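
The noise averaging idea above can be sketched as follows. PAA replaces each equal-length segment of a series with its mean, so zero-mean perturbation noise tends to cancel within a segment. The function names `paa` and `paa_distance`, and the scaled Euclidean form of the distance, are assumptions based on the commonly used PAA definition; the abstract does not give the authors' exact PAA distance.

```python
import math


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    if segments <= 0 or len(series) % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    width = len(series) // segments
    return [sum(series[i:i + width]) / width
            for i in range(0, len(series), width)]


def paa_distance(a, b, segments):
    """Euclidean distance between PAA representations, scaled by segment width
    (an assumed, commonly used form, not necessarily the paper's definition)."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    width = len(a) // segments
    pa, pb = paa(a, segments), paa(b, segments)
    return math.sqrt(width * sum((x - y) ** 2 for x, y in zip(pa, pb)))


# Alternating +1/-1 noise cancels inside each two-point segment.
original = [1, 1, 3, 3]
perturbed = [2, 0, 4, 2]
print(paa(original, 2), paa(perturbed, 2))
```

In this toy case the perturbed and original series have identical PAA values, so distances computed on the PAA representation are unaffected by the noise. Real random noise only cancels approximately, and more so with wider segments.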

AI Model-Based Automated Data Cleaning for Reliable Autonomous Driving Image Datasets (자율주행 영상데이터의 신뢰도 향상을 위한 AI모델 기반 데이터 자동 정제)

  • Kana Kim; Hakil Kim
    • Journal of Broadcast Engineering / v.28 no.3 / pp.302-313 / 2023
  • This paper aims to develop a framework that can fully automate the quality management of training data used in large-scale Artificial Intelligence (AI) models built by the Ministry of Science and ICT (MSIT) in the 'AI Hub Data Dam' project, which has invested more than 1 trillion won since 2017. Autonomous driving technology using AI has achieved excellent performance through many studies, but it requires a large amount of high-quality data to train the model. Moreover, it is still difficult for humans to directly inspect the processed data and prove it is valid, and a model trained with erroneous data can cause fatal problems in real life. This paper presents a dataset reconstruction framework that removes abnormal data from the constructed dataset and introduces strategies to improve the performance of AI models by reconstructing them into a reliable dataset to increase the efficiency of model training. The framework's validity was verified through an experiment on the autonomous driving dataset published through the AI Hub of the National Information Society Agency (NIA). As a result, it was confirmed that it could be rebuilt as a reliable dataset from which abnormal data has been removed.

Transaction Pattern Discrimination of Malicious Supply Chain using Tariff-Structured Big Data (관세 정형 빅데이터를 활용한 우범공급망 거래패턴 선별)

  • Kim, Seongchan; Song, Sa-Kwang; Cho, Minhee; Shin, Su-Hyun
    • The Journal of the Korea Contents Association / v.21 no.2 / pp.121-129 / 2021
  • In this study, we try to minimize tariff risk by constructing a hazardous cargo screening model based on association rule mining, a data mining technique. The risk level between supply chains is calculated with the Apriori algorithm, an association analysis algorithm, using the big data of import declaration forms from the Korea Customs Service (KCS). We perform data preprocessing and association rule mining to generate a model for screening supply chains. In the preprocessing step, we remove errors and then extract the attributes required for rule generation from the import declaration data. We then generate the rules by using the extracted attributes as inputs to the Apriori algorithm. The generated association rule model is loaded into the KCS screening system. When an import declaration to be checked is received, the screening system refers to the model and returns a confidence value based on the supply chain information in the import declaration data. The result is used to determine whether to inspect the import case. In 5-fold cross-validation, with import declaration data covering 2 years and 6 months divided into training data and test data, the model achieved 16.6% precision and 33.8% recall. This is about 3.4 times higher in precision and 1.5 times higher in recall than frequency-based methods, which confirms that the proposed method is an effective way to reduce tariff risk.
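
The confidence value that the screening system returns can be illustrated with a minimal sketch. For a rule A → B, confidence is the share of records containing A that also contain B. The attribute names (`exporter:...`, `importer:...`, `violation`) and the toy records are hypothetical; the abstract does not list the attributes actually extracted from the declarations, and a full Apriori implementation would additionally enumerate frequent itemsets.

```python
def support_count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support_count(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent was never observed
    return support_count(transactions, antecedent | consequent) / base


# Hypothetical import declarations reduced to sets of attribute values.
declarations = [
    frozenset({"exporter:A", "importer:X", "violation"}),
    frozenset({"exporter:A", "importer:X"}),
    frozenset({"exporter:A", "importer:Y", "violation"}),
    frozenset({"exporter:B", "importer:X"}),
]

chain = frozenset({"exporter:A", "importer:X"})
print(confidence(declarations, chain, frozenset({"violation"})))
```

Here the supply chain (exporter A, importer X) appears in two declarations, one of which is marked as a violation, so the rule's confidence is 0.5.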

A study on the effect of cognitive types on EEG laterality in judgmental time series forecasting (인지유형에 따른 시계열 예측에 있어 뇌파의 편측성에 대한 연구)

  • 박흥국; 황민철; 임좌상
    • Science of Emotion and Sensibility / v.2 no.1 / pp.121-128 / 1999
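  • This study hypothesized that the accuracy of time series forecasting would differ between analytic and intuitive people according to cognitive type, and tested the hypothesis in an experiment with 44 university students. Subjects were divided into an analytic group and an intuitive group based on the MBTI and were asked to make forecasts for given time series data. To analyze EEG laterality by cognitive type, EEG was measured at the frontal lobe (F3, F4). The results showed no significant difference in EEG laterality between the cognitive types, and no significant difference in forecasting accuracy (MAPE).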
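
For reference, the accuracy measure named above, the mean absolute percentage error (MAPE), can be computed as in the sketch below. The function name `mape` and the sample numbers are illustrative assumptions and are not data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Two forecasts, each off by 10% of the actual value.
print(mape([100, 200], [110, 180]))
```

A lower MAPE means more accurate forecasts, so comparing the two groups amounts to comparing their MAPE values on the same series.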

Epistemological Implications of Scientific Reasoning Designed by Preservice Elementary Teachers during Their Simulation Teaching: Evidence-Explanation Continuum Perspective (초등 예비교사가 모의수업 시연에서 구성한 과학적 추론의 인식론적 의미 - 증거-설명 연속선의 관점 -)

  • Maeng, Seungho
    • Journal of Korean Elementary Science Education / v.42 no.1 / pp.109-126 / 2023
  • In this study, I took the evidence-explanation (E-E) continuum perspective to examine the epistemological implications of scientific reasoning cases designed by preservice elementary teachers during their simulation teaching. The participants were four preservice teachers who conducted simulation instruction on the seasons and high/low air pressure and wind. The selected discourse episodes, which included cases of inductive, deductive, or abductive reasoning, were analyzed for their epistemological implications, specifically the role played by the reasoning cases in the E-E continuum. The two preservice teachers conducting seasons classes used hypothetical-deductive reasoning when they identified evidence by comparing student-group data and tested a hypothesis by comparing the evidence with the hypothetical statement. However, they did not adopt explicit reasoning for creating the hypothesis or constructing a model from the evidence. The two preservice teachers conducting air pressure and wind classes applied inductive reasoning to find evidence by summarizing the student-group data and adopted linear logic-structured deductive reasoning to construct the final explanation. In teaching similar topics, the preservice teachers showed similar epistemic processes in their scientific reasoning cases. However, the epistemological implications of the instruction were not similar in terms of the E-E continuum. In addition, except in one case, the teachers were neither good at abductive reasoning for creating a hypothesis or an explanatory model, nor good at using reasoning to construct a model from the evidence. The E-E continuum helps in examining the epistemological implications of scientific reasoning and can be an alternative way of transmitting scientific reasoning.

A Study on Secure Cluster Based Routing Protocol considering Distributed PKI Mechanisms (분산 PKI 메커니즘을 고려한 안전한 클러스터 기반 라우팅 프로토콜에 관한 연구)

  • Hahn, Gene-Beck; Nyang, Dae-Hun; Kim, Sin-Kyu; Seo, Sung-Hoon; Song, Joo-Seok
    • Proceedings of the Korea Information Processing Society Conference / 2004.05a / pp.1299-1302 / 2004
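  • This study proposes a method for applying a distributed PKI (Public Key Infrastructure) mechanism to a routing protocol in a MANET (Mobile Ad Hoc Network). To this end, CBRP (Cluster Based Routing Protocol) is considered as the routing protocol used by the MANET. The proposed protocol uses the functions of CBRP and the distributed PKI mechanism to find certificate chains efficiently, and thereby supports session key establishment between communicating nodes and encryption of the data to be sent and received. In addition, for the secure operation of the routing protocol, the proposed protocol exchanges digitally signed HELLO messages to provide trustworthiness against malicious attackers and to enable secure routing.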

Deriving Personalized Context-aware Services from Activities of Daily Living (생활 데이터 분석을 통한 개인화된 상황인식서비스 생성)

  • Park, Jeong-Kyu; Lee, Keung-Hae
    • Journal of KIISE: Computing Practices and Letters / v.16 no.5 / pp.525-530 / 2010
  • Currently, most context-aware services are built by developers. Some researchers have argued that services should be defined by end users, who understand their own needs best. We believe that the significance of enabling users to define their own personalized services will multiply as our living spaces grow smarter. This paper introduces a novel method called CASPER, which is capable of deriving personalized services from a log of the user's activities of daily living. By mining the causality of events in the log, CASPER can generate useful services that even the user may not have thought of. We present the CASPER algorithm in detail and discuss the results of a proof-of-concept experiment.