• Title/Summary/Keyword: Machine Learning #2


Systematic Research on Privacy-Preserving Distributed Machine Learning (프라이버시를 보호하는 분산 기계 학습 연구 동향)

  • Min Seob Lee;Young Ah Shin;Ji Young Chun
    • The Transactions of the Korea Information Processing Society / v.13 no.2 / pp.76-90 / 2024
  • Although artificial intelligence (AI) can be applied in various domains such as smart cities and healthcare, its use is limited by concerns about the exposure of personal and sensitive information. In response, the concept of distributed machine learning has emerged, wherein learning occurs locally before a global model is trained, mitigating the concentration of data on a central server. However, a learning process carried out collaboratively among multiple participants still poses threats to data privacy. In this paper, we systematically analyze recent trends in privacy protection within the realm of distributed machine learning, considering factors such as the presence of a central server, the distribution environment of the training datasets, and performance variations among participants. In particular, we focus on key distributed machine learning techniques, including horizontal federated learning, vertical federated learning, and swarm learning. We examine privacy protection mechanisms within these techniques and explore potential directions for future research.
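
The abstract does not say how local results are combined into a global model. As an illustration only, the sketch below uses federated averaging, one common aggregation rule in horizontal federated learning. The function name `federated_average` and its parameters are hypothetical, and no privacy mechanism (such as secure aggregation or differential privacy) is included.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine locally trained parameter vectors into one global vector.

    Each client's parameters are weighted by its share of the total
    number of training examples. Only parameters are exchanged here;
    the raw training data stays with each client.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need exactly one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share one parameter shape")
        for i, w in enumerate(weights):
            # Weighted contribution of this client to parameter i.
            global_weights[i] += w * size / total
    return global_weights
```

Calling `federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])` weights the second client three times as heavily as the first because it holds three times as many examples.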

Machine Learning-based Prediction of Relative Regional Air Volume Change from Healthy Human Lung CTs

  • Eunchan Kim;YongHyun Lee;Jiwoong Choi;Byungjoon Yoo;Kum Ju Chae;Chang Hyun Lee
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.2 / pp.576-590 / 2023
  • Machine learning is widely used in various academic fields and has recently been actively applied in medical research. In the medical field, machine learning is used in a variety of ways, such as speeding up diagnosis, discovering new biomarkers, or discovering latent traits of a disease. In the respiratory field, a relative regional air volume change (RRAVC) map based on quantitative inspiratory and expiratory computed tomography (CT) imaging can serve as a functional imaging biomarker for characterizing regional ventilation. In this study, we seek to predict RRAVC using various regular machine learning models such as extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and multi-layer perceptron (MLP). We experimentally show that MLP performs best, followed by XGBoost. We also propose several relative coordinate systems to minimize inter-subject variability. We confirm a significant experimental performance improvement when we apply a subject's relative proportion coordinates instead of conventional absolute coordinates.

Parameterization of the Company's Business Model for Machine Learning-Based Marketing Stress Testing

  • Menkova, Krystyna;Zozulov, Oleksandr
    • International Journal of Computer Science & Network Security / v.22 no.2 / pp.318-326 / 2022
  • Marketing stress testing is a new method of identifying a company's strengths and weaknesses in a turbulent environment. Technically, this is a complex procedure, so it involves artificial intelligence and machine learning. The main problem at present is developing methodological approaches to building the company's digital model, which will provide a framework for machine learning. The aim of the study was to identify and develop an author's approach to the parameterization of the company's business processes for machine learning-based marketing stress testing. This aim required the company's activities to be considered as a set of elements (business processes, products) and factors that affect them (marketing environment). The proposed approach includes four main elements that are subject to parameterization: elements of the company's internal environment, factors of the marketing environment, the company's core competency, and factors impacting the company. Matrices were developed for evaluating the results of the work of expert groups in determining the degree of influence of the marketing environment factors. It is proposed to distinguish between mega-level, macro-level, meso-level, and micro-level factors depending on the degree of impact on the company. The methodological limitation of the study is that it relies on the modelling method as the only one possible at this stage of the study. The implementation limitation is that the proposed approach can only be used if the company plans to use machine learning for marketing stress testing.

An Extended Function Point Model for Estimating the Implementing Cost of Machine Learning Applications (머신러닝 애플리케이션 구현 비용 평가를 위한 확장형 기능 포인트 모델)

  • Seokjin Im
    • The Journal of the Convergence on Culture Technology / v.9 no.2 / pp.475-481 / 2023
  • Software, especially machine learning applications, greatly affects people's lifestyles. Accordingly, the importance of cost models for software is increasing rapidly. As cost models, LOC (Lines of Code) and M/M (Man-Month) estimate the quantitative aspects of software. In contrast, FP (Function Point) focuses on estimating the functional characteristics of software. FP is efficient in that it estimates qualitative characteristics. FP, however, is limited in evaluating machine learning software because it does not evaluate the critical factors of such software. In this paper, we propose an extended function point (ExFP) that extends FP to incorporate hyperparameters and the complexity of their optimization as characteristics of machine learning applications. In an evaluation reflecting the characteristics of machine learning applications, we demonstrate the effectiveness of the proposed ExFP.

RFA: Recursive Feature Addition Algorithm for Machine Learning-Based Malware Classification

  • Byeon, Ji-Yun;Kim, Dae-Ho;Kim, Hee-Chul;Choi, Sang-Yong
    • Journal of the Korea Society of Computer and Information / v.26 no.2 / pp.61-68 / 2021
  • Recently, various technologies that use machine learning to classify malicious code have been studied. To enhance the effectiveness of machine learning, it is most important to extract properties that distinguish malicious code from normal binaries. In this paper, we propose a recursive feature extraction method for use in machine learning. The proposed method selects the final features by recursively evaluating individual features to maximize the performance of machine learning. In detail, at each stage we extract the best-performing features among the individual features and then combine the extracted features. We extract features with the proposed method and apply them to machine learning algorithms such as Decision Tree, SVM, Random Forest, and KNN to validate that machine learning performance improves as the steps continue.
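
The stage-wise selection described in the abstract resembles greedy forward feature addition. The sketch below is one plausible reading, not the authors' RFA implementation: the function name, the `score` callback (for example, cross-validated accuracy of a classifier trained on the given feature subset), and the stopping rule are all assumptions.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def recursive_feature_addition(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Greedily grow a feature set one feature per stage.

    At each stage, every remaining feature is tried together with the
    features already selected, and the best-scoring one is kept.
    Selection stops when no candidate improves the score (assumed rule).
    """
    selected: List[str] = []
    remaining = list(features)
    best_score = float("-inf")
    while remaining:
        # Score each candidate combined with the current selection.
        trials = [(score(frozenset(selected + [f])), f) for f in remaining]
        stage_score, stage_feature = max(trials, key=lambda t: t[0])
        if stage_score <= best_score:
            break  # adding any further feature does not help
        selected.append(stage_feature)
        remaining.remove(stage_feature)
        best_score = stage_score
    return selected, best_score
```

In practice, `score` would wrap model training and validation, so each stage costs one model fit per remaining feature.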

Research on Forecasting Framework for System Marginal Price based on Deep Recurrent Neural Networks and Statistical Analysis Models

  • Kim, Taehyun;Lee, Yoonjae;Hwangbo, Soonho
    • Clean Technology / v.28 no.2 / pp.138-146 / 2022
  • Electricity has become a factor that dramatically affects the market economy. The day-ahead system marginal price determines electricity prices, and system marginal price forecasting is critical in maintaining energy management systems. There have been several studies using mathematics and machine learning models to forecast the system marginal price, but few studies have been conducted to develop, compare, and analyze various machine learning and deep learning models based on a data-driven framework. Therefore, in this study, different machine learning algorithms (i.e., autoregressive-based models such as the autoregressive integrated moving average model) and deep learning networks (i.e., recurrent neural network-based models such as the long short-term memory and gated recurrent unit model) are considered and integrated evaluation metrics including a forecasting test and information criteria are proposed to discern the optimal forecasting model. A case study of South Korea using long-term time-series system marginal price data from 2016 to 2021 was applied to the developed framework. The results of the study indicate that the autoregressive integrated moving average model (R-squared score: 0.97) and the gated recurrent unit model (R-squared score: 0.94) are appropriate for system marginal price forecasting. This study is expected to contribute significantly to energy management systems and the suggested framework can be explicitly applied for renewable energy networks.
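
For reference, the R-squared score quoted in the abstract is conventionally computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows that standard definition. The function name `r_squared` is hypothetical, and the paper's exact evaluation procedure is not given in the abstract.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Squared error of the forecasts.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean of the actual values.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when actual values are constant")
    return 1.0 - ss_res / ss_tot
```

A score of 1.0 means the forecasts match the observed prices exactly, while 0.0 means they do no better than predicting the mean.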

A Study on Machine Learning Algorithms based on Embedded Processors Using Genetic Algorithm (유전 알고리즘을 이용한 임베디드 프로세서 기반의 머신러닝 알고리즘에 관한 연구)

  • So-Haeng Lee;Gyeong-Hyu Seok
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.2 / pp.417-426 / 2024
  • In general, the implementation of machine learning requires prior knowledge and experience with deep learning models, and substantial computational resources and time are necessary for data processing. As a result, machine learning encounters several limitations when deployed on embedded processors. To address these challenges, this paper introduces a novel approach where a genetic algorithm is applied to the convolution operation within the machine learning process, specifically for performing a selective convolution operation. In the selective convolution operation, the convolution is executed exclusively on pixels identified by a genetic algorithm. This method selects and computes pixels based on a ratio determined by the genetic algorithm, effectively reducing the computational workload by the specified ratio. The paper thoroughly explores the integration of genetic algorithms into machine learning computations, monitoring the fitness of each generation to ascertain if it reaches the target value. This approach is then compared with the computational requirements of existing methods. The learning process involves iteratively training generations to ensure that the fitness adequately converges.
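
A minimal sketch of the selective convolution idea follows, assuming the genetic algorithm's output can be represented as a boolean mask over pixels. The mask representation, the function names, the zero padding, and leaving skipped pixels at 0.0 are all assumptions. The genetic search itself is not shown.

```python
from typing import List

Matrix = List[List[float]]
Mask = List[List[bool]]


def selective_convolution(image: Matrix, kernel: Matrix, mask: Mask) -> Matrix:
    """Apply `kernel` only at pixels where `mask` is True.

    Uses zero padding at the borders and the unflipped-kernel
    (cross-correlation) convention common in CNNs. Pixels that are
    not selected stay at 0.0 and cost no multiply-adds.
    """
    height, width = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    off_y, off_x = k_h // 2, k_w // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue  # skipped pixel: no computation
            acc = 0.0
            for ky in range(k_h):
                for kx in range(k_w):
                    iy, ix = y + ky - off_y, x + kx - off_x
                    if 0 <= iy < height and 0 <= ix < width:
                        acc += image[iy][ix] * kernel[ky][kx]
            output[y][x] = acc
    return output


def selected_ratio(mask: Mask) -> float:
    """Fraction of pixels on which the convolution is actually computed."""
    total = len(mask) * len(mask[0])
    return sum(cell for row in mask for cell in row) / total
```

Under these assumptions, the number of multiply-adds scales with `selected_ratio(mask)`, which is the quantity a genetic search would trade off against fitness.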

A Development of Fixture Planning Module using Machine Learning (기계 학습을 이용한 치구 공정 계획 모듈의 개발)

  • 김선우;이수홍
    • Korean Journal of Computational Design and Engineering / v.2 no.2 / pp.111-121 / 1997
  • This study intends to develop a fixture planning module as part of a planning system for cutting. The fixture module uses a machine learning method to reuse previous failure results so that the system can reduce repeated failures. Machine learning is one of the efforts to incorporate human reasoning ability into a computerized system. A human expert designs better than a novice does because the expert has wide experience in a specific area. This study implements a machine learning algorithm so that the system accumulates wide experience in the fixture planning area, as a human expert does. When the fixture planner finds a setup failure for the operations suggested by a process planner, it makes the process planner store the attributes and other information of the failed setup. The process planner then applies the learned knowledge when it meets a similar case, so that it can reduce the possibility of setup failure. The system can also teach a novice user by showing a failed setup together with a modified setup.


Fault Prediction Using Statistical and Machine Learning Methods for Improving Software Quality

  • Malhotra, Ruchika;Jain, Ankita
    • Journal of Information Processing Systems / v.8 no.2 / pp.241-262 / 2012
  • An understanding of quality attributes is relevant for a software organization to deliver high software reliability. An empirical assessment of metrics to predict the quality attributes is essential in order to gain insight into the quality of software in the early phases of software development and to ensure corrective actions. In this paper, we build models to estimate fault proneness using Object Oriented CK metrics and QMOOD metrics. We apply one statistical method and six machine learning methods to build the models. The proposed models are validated using a dataset collected from Open Source software. The results are analyzed using the Area Under the Curve (AUC) obtained from Receiver Operating Characteristics (ROC) analysis. The results show that the models built using the random forest and bagging methods outperformed all the other models. Hence, based on these results, it is reasonable to claim that quality models are significantly related to Object Oriented metrics and that machine learning methods have performance comparable to statistical methods.
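
The AUC used to compare the models can be read as the probability that a randomly chosen faulty class receives a higher predicted score than a randomly chosen non-faulty one. The sketch below computes it from that pairwise definition. The function name `roc_auc` and the 0/1 label encoding are assumptions, and the paper's own tooling is not stated in the abstract.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    `labels` uses 1 for the positive (fault-prone) class and 0 for the
    negative class. Tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

This quadratic-time form is meant for clarity. A rank-based computation gives the same value more efficiently on large datasets.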

PubMiner: Machine Learning-based Text Mining for Biomedical Information Analysis

  • Eom, Jae-Hong;Zhang, Byoung-Tak
    • Genomics & Informatics / v.2 no.2 / pp.99-106 / 2004
  • In this paper, we introduce PubMiner, an intelligent machine learning-based text mining system for mining biological information from the literature. PubMiner employs natural language processing techniques and machine learning-based data mining techniques to mine useful biological information, such as protein-protein interactions, from the massive literature. The system recognizes biological terms such as genes, proteins, and enzymes and extracts their interactions described in the document through natural language processing. The extracted interactions are further analyzed with a set of features of each entity, collected from the related public databases, to infer more interactions from the original interactions. Inferred interactions from the interaction analysis and the native interactions are provided to the user with links to the literature sources. The performance of entity and interaction extraction was tested with selected MEDLINE abstracts. The evaluation of inference was carried out using the protein interaction data of S. cerevisiae (baker's yeast) from MIPS and SGD.