• Title/Summary/Keyword: Secure Machine Learning

Search Result 73, Processing Time 0.025 seconds

Application of Machine Learning Techniques for the Classification of Source Code Vulnerability (소스코드 취약성 분류를 위한 기계학습 기법의 적용)

  • Lee, Won-Kyung;Lee, Min-Ju;Seo, DongSu
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.4
    • /
    • pp.735-743
    • /
    • 2020
  • Secure coding is a technique that detects malicious attack or unexpected errors to make software systems resilient against such circumstances. In many cases secure coding relies on static analysis tools to find vulnerable patterns and contaminated data in advance. However, secure coding has the disadvantage of being dependent on rule-sets, and accurate diagnosis is difficult as the complexity of static analysis tools increases. In order to support secure coding, we apply machine learning techniques, such as DNN, CNN and RNN to investigate into finding major weakness patterns shown in secure development coding guides and present machine learning models and experimental results. We believe that machine learning techniques can support detecting security weakness along with static analysis techniques.

Centralized Machine Learning Versus Federated Averaging: A Comparison using MNIST Dataset

  • Peng, Sony;Yang, Yixuan;Mao, Makara;Park, Doo-Soon
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.2
    • /
    • pp.742-756
    • /
    • 2022
  • A flood of information has occurred with the rise of the internet and digital devices in the fourth industrial revolution era. Every millisecond, massive amounts of structured and unstructured data are generated; smartphones, wearable devices, sensors, and self-driving cars are just a few examples of devices that currently generate massive amounts of data in our daily. Machine learning has been considered an approach to support and recognize patterns in data in many areas to provide a convenient way to other sectors, including the healthcare sector, government sector, banks, military sector, and more. However, the conventional machine learning model requires the data owner to upload their information to train the model in one central location to perform the model training. This classical model has caused data owners to worry about the risks of transferring private information because traditional machine learning is required to push their data to the cloud to process the model training. Furthermore, the training of machine learning and deep learning models requires massive computing resources. Thus, many researchers have jumped to a new model known as "Federated Learning". Federated learning is emerging to train Artificial Intelligence models over distributed clients, and it provides secure privacy information to the data owner. Hence, this paper implements Federated Averaging with a Deep Neural Network to classify the handwriting image and protect the sensitive data. Moreover, we compare the centralized machine learning model with federated averaging. The result shows the centralized machine learning model outperforms federated learning in terms of accuracy, but this classical model produces another risk, like privacy concern, due to the data being stored in the data center. The MNIST dataset was used in this experiment.

An Integrated Accurate-Secure Heart Disease Prediction (IAS) Model using Cryptographic and Machine Learning Methods

  • Syed Anwar Hussainy F;Senthil Kumar Thillaigovindan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.2
    • /
    • pp.504-519
    • /
    • 2023
  • Heart disease is becoming the top reason of death all around the world. Diagnosing cardiac illness is a difficult endeavor that necessitates both expertise and extensive knowledge. Machine learning (ML) is becoming gradually more important in the medical field. Most of the works have concentrated on the prediction of cardiac disease, however the precision of the results is minimal, and data integrity is uncertain. To solve these difficulties, this research creates an Integrated Accurate-Secure Heart Disease Prediction (IAS) Model based on Deep Convolutional Neural Networks. Heart-related medical data is collected and pre-processed. Secondly, feature extraction is processed with two factors, from signals and acquired data, which are further trained for classification. The Deep Convolutional Neural Networks (DCNN) is used to categorize received sensor data as normal or abnormal. Furthermore, the results are safeguarded by implementing an integrity validation mechanism based on the hash algorithm. The system's performance is evaluated by comparing the proposed to existing models. The results explain that the proposed model-based cardiac disease diagnosis model surpasses previous techniques. The proposed method demonstrates that it attains accuracy of 98.5 % for the maximum amount of records, which is higher than available classifiers.

Application of the machine learning technique for the development of a condensation heat transfer model for a passive containment cooling system

  • Lee, Dong Hyun;Yoo, Jee Min;Kim, Hui Yung;Hong, Dong Jin;Yun, Byong Jo;Jeong, Jae Jun
    • Nuclear Engineering and Technology
    • /
    • v.54 no.6
    • /
    • pp.2297-2310
    • /
    • 2022
  • A condensation heat transfer model is essential to accurately predict the performance of the passive containment cooling system (PCCS) during an accident in an advanced light water reactor. However, most of existing models tend to predict condensation heat transfer very well for a specific range of thermal-hydraulic conditions. In this study, a new correlation for condensation heat transfer coefficient (HTC) is presented using machine learning technique. To secure sufficient training data, a large number of pseudo data were produced by using ten existing condensation models. Then, a neural network model was developed, consisting of a fully connected layer and a convolutional neural network (CNN) algorithm, DenseNet. Based on the hold-out cross-validation, the neural network was trained and validated against the pseudo data. Thereafter, it was evaluated using the experimental data, which were not used for training. The machine learning model predicted better results than the existing models. It was also confirmed through a parametric study that the machine learning model presents continuous and physical HTCs for various thermal-hydraulic conditions. By reflecting the effects of individual variables obtained from the parametric analysis, a new correlation was proposed. It yielded better results for almost all experimental conditions than the ten existing models.

A Study on Automatic Recommendation of Keywords for Sub-Classification of National Science and Technology Standard Classification System Using AttentionMesh (AttentionMesh를 활용한 국가과학기술표준분류체계 소분류 키워드 자동추천에 관한 연구)

  • Park, Jin Ho;Song, Min Sun
    • Journal of Korean Library and Information Science Society
    • /
    • v.53 no.2
    • /
    • pp.95-115
    • /
    • 2022
  • The purpose of this study is to transform the sub-categorization terms of the National Science and Technology Standards Classification System into technical keywords by applying a machine learning algorithm. For this purpose, AttentionMeSH was used as a learning algorithm suitable for topic word recommendation. For source data, four-year research status files from 2017 to 2020, refined by the Korea Institute of Science and Technology Planning and Evaluation, were used. For learning, four attributes that well express the research content were used: task name, research goal, research abstract, and expected effect. As a result, it was confirmed that the result of MiF 0.6377 was derived when the threshold was 0.5. In order to utilize machine learning in actual work in the future and to secure technical keywords, it is expected that it will be necessary to establish a term management system and secure data of various attributes.

Study on the Improvement of Machine Learning Ability through Data Augmentation (데이터 증강을 통한 기계학습 능력 개선 방법 연구)

  • Kim, Tae-woo;Shin, Kwang-seong
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.346-347
    • /
    • 2021
  • For pattern recognition for machine learning, the larger the amount of learning data, the better its performance. However, it is not always possible to secure a large amount of learning data with the types and information of patterns that must be detected in daily life. Therefore, it is necessary to significantly inflate a small data set for general machine learning. In this study, we study techniques to augment data so that machine learning can be performed. A representative method of performing machine learning using a small data set is the transfer learning technique. Transfer learning is a method of obtaining a result by performing basic learning with a general-purpose data set and then substituting the target data set into the final stage. In this study, a learning model trained with a general-purpose data set such as ImageNet is used as a feature extraction set using augmented data to detect a desired pattern.

  • PDF

Anomaly-Based Network Intrusion Detection: An Approach Using Ensemble-Based Machine Learning Algorithm

  • Kashif Gul Chachar;Syed Nadeem Ahsan
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.1
    • /
    • pp.107-118
    • /
    • 2024
  • With the seamless growth of the technology, network usage requirements are expanding day by day. The majority of electronic devices are capable of communication, which strongly requires a secure and reliable network. Network-based intrusion detection systems (NIDS) is a new method for preventing and alerting computers and networks from attacks. Machine Learning is an emerging field that provides a variety of ways to implement effective network intrusion detection systems (NIDS). Bagging and Boosting are two ensemble ML techniques, renowned for better performance in the learning and classification process. In this paper, the study provides a detailed literature review of the past work done and proposed a novel ensemble approach to develop a NIDS system based on the voting method using bagging and boosting ensemble techniques. The test results demonstrate that the ensemble of bagging and boosting through voting exhibits the highest classification accuracy of 99.98% and a minimum false positive rate (FPR) on both datasets. Although the model building time is average which can be a tradeoff by processor speed.

Method for improving video/image data quality for AI learning of unstructured data (비정형데이터의 AI학습을 위한 영상/이미지 데이터 품질 향상 방법)

  • Kim Seung Hee;Dongju Ryu
    • Convergence Security Journal
    • /
    • v.23 no.2
    • /
    • pp.55-66
    • /
    • 2023
  • Recently, there is an increasing movement to increase the value of AI learning data and to secure high-quality data based on previous research on AI learning data in all areas of society. Therefore, quality management is very important in construction projects to secure high-quality data. In this paper, quality management to secure high-quality data when building AI learning data and improvement plans for each construction process are presented. In particular, more than 80% of the data quality of unstructured data built for AI learning is determined during the construction process. In this paper, we performed quality inspection of image/video data. In addition, we identified inspection procedures and problem elements that occurred in the construction phases of acquisition, data cleaning, labeling, and models, and suggested ways to secure high-quality data by solving them. Through this, it is expected that it will be an alternative to overcome the quality deviation of data for research groups and operators participating in the construction of AI learning data.

A Survey on Privacy Vulnerabilities through Logit Inversion in Distillation-based Federated Learning (증류 기반 연합 학습에서 로짓 역전을 통한 개인 정보 취약성에 관한 연구)

  • Subin Yun;Yungi Cho;Yunheung Paek
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2024.05a
    • /
    • pp.711-714
    • /
    • 2024
  • In the dynamic landscape of modern machine learning, Federated Learning (FL) has emerged as a compelling paradigm designed to enhance privacy by enabling participants to collaboratively train models without sharing their private data. Specifically, Distillation-based Federated Learning, like Federated Learning with Model Distillation (FedMD), Federated Gradient Encryption and Model Sharing (FedGEMS), and Differentially Secure Federated Learning (DS-FL), has arisen as a novel approach aimed at addressing Non-IID data challenges by leveraging Federated Learning. These methods refine the standard FL framework by distilling insights from public dataset predictions, securing data transmissions through gradient encryption, and applying differential privacy to mask individual contributions. Despite these innovations, our survey identifies persistent vulnerabilities, particularly concerning the susceptibility to logit inversion attacks where malicious actors could reconstruct private data from shared public predictions. This exploration reveals that even advanced Distillation-based Federated Learning systems harbor significant privacy risks, challenging the prevailing assumptions about their security and underscoring the need for continued advancements in secure Federated Learning methodologies.

Research on the application of Machine Learning to threat assessment of combat systems

  • Seung-Joon Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.7
    • /
    • pp.47-55
    • /
    • 2023
  • This paper presents a method for predicting the threat index of combat systems using Gradient Boosting Regressors and Support Vector Regressors among machine learning models. Currently, combat systems are software that emphasizes safety and reliability, so the application of AI technology that is not guaranteed to be reliable is restricted by policy, and as a result, the electrified domestic combat systems are not equipped with AI technology. However, in order to respond to the policy direction of the Ministry of National Defense, which aims to electrify AI, we conducted a study to secure the basic technology required for the application of machine learning in combat systems. After collecting the data required for threat index evaluation, the study determined the prediction accuracy of the trained model by processing and refining the data, selecting the machine learning model, and selecting the optimal hyper-parameters. As a result, the model score for the test data was over 99 points, confirming the applicability of machine learning models to combat systems.