• Title/Summary/Keyword: machine learning model

Predicting Reports of Theft in Businesses via Machine Learning

  • JungIn, Seo;JeongHyeon, Chang
    • International Journal of Advanced Culture Technology / v.10 no.4 / pp.499-510 / 2022
  • This study examines the factors behind the reporting of crime against businesses in Korea and proposes a corresponding predictive model using machine learning. While many previous studies focused on the individual factors of theft victims, there is a lack of evidence on the reporting factors of crime against a business that serves the public good as opposed to those that protect private property. Therefore, we proposed a crime prevention model for the willingness factor of theft reporting in businesses. This study used data collected through the 2015 Commercial Crime Damage Survey conducted by the Korea Institute for Criminal Policy. It analyzed data from 834 businesses that had experienced theft during a 2016 crime investigation. The data were class-imbalanced. To address this, we jointly applied the Synthetic Minority Over-sampling Technique (SMOTE) and Tomek links to the training data. Two kinds of prediction model were implemented. One was a statistical model using logistic regression and elastic net. The other comprised a support vector machine model, tree-based machine learning models (e.g., random forest, extreme gradient boosting), and a stacking model. As a result, the features of theft price, invasion, and remedy, which are known to have significant effects on reporting theft offences, can be predicted as determinants of such offences in companies. Finally, we verified and compared the proposed predictive models using several popular metrics. Based on our evaluation of the importance of the features used in each model, we suggest a more accurate criterion for predicting var.
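The resampling step named in this abstract (SMOTE followed by Tomek link removal, applied to the training data only) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names, the toy points, and the choice to drop both members of each Tomek link are assumptions made here for clarity.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    minority sample and one of its k nearest minority neighbours.
    Assumes at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: euclidean(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

def nearest_index(i, points):
    """Index of the nearest neighbour of points[i], excluding itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: euclidean(points[i], points[j]))

def remove_tomek_links(points, labels):
    """Drop every sample that is part of a Tomek link: a pair of mutual
    nearest neighbours carrying different labels. (Assumption: both
    members are removed; some variants remove only the majority one.)"""
    nn = [nearest_index(i, points) for i in range(len(points))]
    drop = {i for i, j in enumerate(nn) if nn[j] == i and labels[i] != labels[j]}
    keep = [i for i in range(len(points)) if i not in drop]
    return [points[i] for i in keep], [labels[i] for i in keep]

# Toy data (hypothetical, not from the survey).
minority = [(1.0, 1.0), (1.2, 0.9), (1.1, 1.3)]
synthetic = smote(minority, n_new=3, k=2)

points = [(0.0,), (0.1,), (0.5,), (0.55,), (1.0,), (1.1,)]
labels = [0, 0, 0, 1, 1, 1]
clean_points, clean_labels = remove_tomek_links(points, labels)
print(clean_points, clean_labels)
```

In the one-dimensional toy set, the points at 0.5 and 0.55 are mutual nearest neighbours with different labels, so they form the only Tomek link and are removed. Resampling is applied to the training split only so that evaluation data keeps the original class balance.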

Big Data Based Urban Transportation Analysis for Smart Cities - Machine Learning Based Traffic Prediction by Using Urban Environment Data - (도시 빅데이터를 활용한 스마트시티의 교통 예측 모델 - 환경 데이터와의 상관관계 기계 학습을 통한 예측 모델의 구축 및 검증 -)

  • Jang, Sun-Young;Shin, Dong-Youn
    • Journal of KIBIM / v.8 no.3 / pp.12-19 / 2018
  • This research explores how machine learning and urban big data can be used to build a flexible transportation network for a smart city that responds to changes in the urban context. It addresses the problem that the existing bus headway model cannot easily respond to urban situations in real time. Using urban big data on weather, traffic, and bus status together with a machine learning prototyping tool, the research presents a flexible headway model that predicts bus delay, and analyzes the results. The prototype model is built from real-time bus data gathered through public data portals and real-time application programming interfaces (APIs) provided by the government. These data are the basis for organizing interval pattern models of bus operations from traffic environment factors (road speeds, station conditions, weather, and real-time information on operating buses). The prototype was implemented in a machine learning tool (RapidMiner Studio), and several tests of bus delay prediction under specific circumstances were conducted. Finally, the potential of such a transportation system to improve urban efficiency and citizens' convenience by responding to urban conditions is discussed.

Study on predictive model and mechanism analysis for martensite transformation temperatures through explainable artificial intelligence (설명가능한 인공지능을 통한 마르텐사이트 변태 온도 예측 모델 및 거동 분석 연구)

  • Junhyub Jeon;Seung Bae Son;Jae-Gil Jung;Seok-Jae Lee
    • Journal of the Korean Society for Heat Treatment / v.37 no.3 / pp.103-113 / 2024
  • Martensite volume fraction significantly affects the mechanical properties of alloy steels. The martensite start temperature (Ms), the transformation temperature for 50 vol.% martensite (M50), and the transformation temperature for 90 vol.% martensite (M90) are important transformation temperatures for controlling the martensite phase fraction. Several researchers have proposed empirical equations and machine learning models to predict the Ms temperature. These numerical approaches can easily predict the Ms temperature without additional experiments or cost. However, to control the martensite phase fraction more precisely, we need to reduce the prediction error of the Ms model and propose prediction models for the other martensite transformation temperatures (M50, M90). In the present study, machine learning models were applied to build predictive models for the Ms, M50, and M90 temperatures. Explainable artificial intelligence (XAI) is employed to explain the prediction mechanisms of the machine learning models and to rank feature importance for the martensite transformation temperatures. Among the machine learning models compared, random forest regression (RFR) showed the best performance in predicting the Ms, M50, and M90 temperatures. The feature importance was proposed and the prediction mechanisms were discussed using XAI.

Transfer Learning based DNN-SVM Hybrid Model for Breast Cancer Classification

  • Gui Rae Jo;Beomsu Baek;Young Soon Kim;Dong Hoon Lim
    • Journal of the Korea Society of Computer and Information / v.28 no.11 / pp.1-11 / 2023
  • Breast cancer is the disease that affects women the most worldwide. Due to the development of computer technology, the efficiency of machine learning has increased, and it now plays an important role in cancer detection and diagnosis. Deep learning is a branch of machine learning based on artificial neural networks; its performance has improved rapidly in recent years, and its range of application is expanding. In this paper, we propose a DNN-SVM hybrid model that combines the structure of a deep neural network (DNN) based on transfer learning with a support vector machine (SVM) for breast cancer classification. The proposed transfer learning-based model is effective with small training data, learns quickly, and can improve model performance by combining the advantages of the single models, that is, DNN and SVM. In performance tests on the WOBC and WDBC breast cancer data provided by the UCI machine learning repository, the proposed DNN-SVM hybrid model was superior to single models such as logistic regression, DNN, and SVM, and to ensemble models such as random forest, in various performance measures.

Stress Identification and Analysis using Observed Heart Beat Data from Smart HRM Sensor Device

  • Pramanta, SPL Aditya;Kim, Myonghee;Park, Man-Gon
    • Journal of Korea Multimedia Society / v.20 no.8 / pp.1395-1405 / 2017
  • In this paper, we analyze heartbeat data to identify a subject's stress state (binary) using heart rate variability (HRV) features extracted from that data, and apply supervised machine learning techniques to build a mental stress classifier. Four steps are needed before performance measurement: data acquisition, data processing (HRV analysis), feature selection, and machine learning. The HRV analysis module generates 56 features, several of which are selected (using our own algorithm) after computing the Pearson correlation matrix (p-values). The selected features are compared with the full feature set by model error after training with several machine learning techniques: support vector machine (SVM), decision tree, and discriminant analysis. The SVM and decision tree models trained on the selected features give results within 1% of those trained on all features, whereas discriminant analysis differs by about 5%. All machine learning methods used in this work reach a maximum average accuracy of 90%.
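Correlation-based feature selection of the kind mentioned in this abstract can be sketched as below. The authors' own selection algorithm is not described in the abstract, so this is only an assumed stand-in: it keeps features whose absolute Pearson correlation with the binary stress label reaches a hypothetical threshold. The feature names and values are toy data, not taken from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.
    Returns 0.0 when either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def select_features(features, labels, threshold=0.5):
    """Keep features with |r| >= threshold against the label.
    The threshold is a hypothetical choice for illustration."""
    scores = {name: pearson_r(values, labels) for name, values in features.items()}
    selected = [name for name, r in scores.items() if abs(r) >= threshold]
    return selected, scores

# Toy data: 0 = relaxed, 1 = stressed (hypothetical values).
labels = [0, 0, 0, 1, 1, 1]
features = {
    "mean_hr": [62, 65, 63, 80, 84, 82],  # rises under stress
    "rmssd": [48, 52, 50, 25, 22, 27],    # falls under stress
    "noise": [5, 1, 4, 2, 5, 3],          # unrelated to the label
}
selected, scores = select_features(features, labels)
print(selected)
```

A strongly negative correlation is as informative as a strongly positive one, which is why the sketch thresholds on the absolute value of r.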

Application of Multi-Layer Perceptron and Random Forest Method for Cylinder Plate Forming (Multi-Layer Perceptron과 Random Forest를 이용한 실린더 판재의 성형 조건 예측)

  • Kim, Seong-Kyeom;Hwang, Se-Yun;Lee, Jang-Hyun
    • Journal of the Society of Naval Architects of Korea / v.57 no.5 / pp.297-304 / 2020
  • In this study, a method for predicting the forming conditions of a cylindrical plate on roll bending equipment was reviewed, using machine learning as a data-driven approach. The forming variables were calculated from an analysis of the mechanical relationship between the material properties and the roll bending machine during the bending process. Then, the accuracy of the deformation prediction model was reviewed using the finite element analysis method, and the finite element analysis model was used to create a large data set for machine learning. Applying the machine learning model confirmed that its calculation is slightly higher than that of the linear regression method. Applicable results were confirmed through the machine learning method.

Machine learning application to seismic site classification prediction model using Horizontal-to-Vertical Spectral Ratio (HVSR) of strong-ground motions

  • Francis G. Phi;Bumsu Cho;Jungeun Kim;Hyungik Cho;Yun Wook Choo;Dookie Kim;Inhi Kim
    • Geomechanics and Engineering / v.37 no.6 / pp.539-554 / 2024
  • This study explores the development of a prediction model for seismic site classification through the integration of machine learning techniques with horizontal-to-vertical spectral ratio (HVSR) methodologies. To improve model accuracy, the research employs outlier detection methods and the synthetic minority over-sampling technique (SMOTE) for data balancing, and evaluates seven machine learning models on seismic data from KiK-net. Notably, the light gradient boosting method (LGBM), gradient boosting, and decision tree models exhibit improved performance when coupled with SMOTE, while multiple linear regression (MLR) and support vector machine (SVM) models show reduced efficacy. Outlier detection techniques significantly enhance accuracy, particularly for LGBM, gradient boosting, and voting boosting. The combination of LGBM with the isolation forest and SMOTE achieves the highest accuracy of 0.91, while LGBM with the local outlier factor yields the highest F1-score of 0.79. Consistently outperforming other models, LGBM proves most efficient for seismic site classification when supported by appropriate preprocessing procedures. These findings show the significance of outlier detection and data balancing for precise seismic soil classification prediction, offering insights and highlighting the potential of machine learning in optimizing site classification accuracy.
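The abstract reports accuracy and F1-score separately, and the two can diverge on imbalanced classes. The sketch below shows how each is computed from predicted and true labels. The macro averaging used here is an assumption, since the abstract does not say which F1 averaging the authors used, and the site class labels "C" and "D" are toy values.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, cls):
    """F1-score (harmonic mean of precision and recall) for one class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

# Toy imbalanced example: eight samples of class "C", two of class "D".
y_true = ["C"] * 8 + ["D"] * 2
y_pred = ["C"] * 8 + ["C", "D"]  # one "D" sample is misclassified as "C"

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(acc, round(f1, 3))
```

Here a single error on the minority class barely moves accuracy but pulls the macro F1 down noticeably, because the minority class counts as much as the majority class in the average.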

A Study on the Insider Behavior Analysis Framework for Detecting Information Leakage Using Network Traffic Collection and Restoration (네트워크 트래픽 수집 및 복원을 통한 내부자 행위 분석 프레임워크 연구)

  • Kauh, Janghyuk;Lee, Dongho
    • Journal of Korea Society of Digital Industry and Information Management / v.13 no.4 / pp.125-139 / 2017
  • In this paper, we developed a framework to detect and predict insider information leakage by collecting and restoring network traffic. For automated behavior analysis, a large amount of meta information and behavior information obtained through network traffic collection is used as machine learning features. Using these features, we built and trained a behavior model, a network model, and protocol-specific models. In addition, an ensemble model was developed by digitizing and summing the results of the various models. We developed a function that presents information leakage candidates and lets the user view meta information and behavior information from various perspectives through visual analysis. This supports both rule-based threat detection and machine learning based threat detection. In the future, we plan to build an ensemble model that applies a regression model to the results of the models, and to develop a model with deep learning technology.
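One possible reading of "digitizing and summing the results of various models" is a simple vote count, sketched below. This interpretation is an assumption: the abstract does not define how results are digitized, and the model names, thresholds, session identifiers, and scores here are all hypothetical.

```python
def digitize(score, threshold):
    """Turn a continuous model score into a 0/1 vote."""
    return 1 if score >= threshold else 0

def ensemble_votes(scores, thresholds):
    """Sum the 0/1 votes of every model for one session."""
    return sum(digitize(scores[name], thresholds[name]) for name in scores)

def leakage_candidates(sessions, thresholds, min_votes=2):
    """Return the session ids flagged by at least min_votes models."""
    return [sid for sid, scores in sessions.items()
            if ensemble_votes(scores, thresholds) >= min_votes]

# Hypothetical per-model thresholds and per-session anomaly scores.
thresholds = {"behavior": 0.7, "network": 0.6, "protocol": 0.8}
sessions = {
    "s1": {"behavior": 0.90, "network": 0.65, "protocol": 0.30},  # 2 votes
    "s2": {"behavior": 0.20, "network": 0.10, "protocol": 0.85},  # 1 vote
    "s3": {"behavior": 0.75, "network": 0.70, "protocol": 0.90},  # 3 votes
}
candidates = leakage_candidates(sessions, thresholds)
print(candidates)
```

The regression-based ensemble the authors mention as future work would replace the fixed vote count with learned weights over the individual model outputs.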

1D CNN and Machine Learning Methods for Fall Detection (1D CNN과 기계 학습을 사용한 낙상 검출)

  • Kim, Inkyung;Kim, Daehee;Noh, Song;Lee, Jaekoo
    • KIPS Transactions on Software and Data Engineering / v.10 no.3 / pp.85-90 / 2021
  • In this paper, fall detection using individual wearable devices for older people is considered. To design a low-cost wearable device for reliable fall detection, we present a comprehensive analysis of two representative models. One is a machine learning model composed of a decision tree, random forest, and Support Vector Machine (SVM). The other is a deep learning model relying on a one-dimensional (1D) Convolutional Neural Network (CNN). By considering data segmentation, preprocessing, and feature extraction methods applied to the input data, we also evaluate the considered models' validity. Simulation results verify the efficacy of the deep learning model showing improved overall performance.
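The data segmentation and feature extraction steps mentioned in this abstract are commonly done with sliding windows over the sensor signal. The sketch below illustrates that general idea only. The window size, step, feature set, and acceleration values are assumptions, not details from the paper.

```python
import math

def sliding_windows(signal, size, step):
    """Split a 1D signal into fixed-size windows that advance by `step`."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]

def window_features(window):
    """Hand-crafted features for one window: mean, population standard
    deviation, and peak value."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    return {"mean": mean, "std": std, "peak": max(window)}

# Toy acceleration magnitudes in g (hypothetical); the spike mimics an impact.
signal = [1.0, 1.0, 1.1, 0.9, 1.0, 3.2, 0.4, 1.0, 1.0, 1.0]
windows = sliding_windows(signal, size=4, step=2)
features = [window_features(w) for w in windows]
peaks = [f["peak"] for f in features]
print(peaks)
```

Feature vectors like these would feed a classical model such as a decision tree or SVM, whereas a 1D CNN would typically take the raw windows as input and learn its own features.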

A Study on the Factors Influencing a Company's Selection of Machine Learning: From the Perspective of Expanded Algorithm Selection Problem (기업의 머신러닝 선정에 영향을 미치는 요인 연구: 확장된 알고리즘 선택 문제의 관점으로)

  • Yi, Youngsoo;Kwon, Min Soo;Kwon, Ohbyung
    • The Journal of Society for e-Business Studies / v.27 no.2 / pp.37-64 / 2022
  • As the social acceptance of artificial intelligence increases, the number of cases of applying machine learning methods in companies is also increasing. Technical factors such as accuracy and interpretability have been the main criteria for selecting machine learning methods. However, the success of implementing machine learning is also affected by management factors such as IT departments, operation departments, leadership, and organizational culture. Unfortunately, there are few integrated studies that examine the success factors of machine learning selection with technical and management factors considered together. Therefore, the purpose of this paper is to propose and empirically analyze a technology-management integrated model that combines task-tech fit, IS Success Model theory, and John Rice's algorithm selection process model to understand machine learning selection within the company. A survey of 240 companies that implemented machine learning found that the higher the algorithm quality and data quality, the higher the perceived algorithm-problem fit. It was also verified that algorithm-problem fit had a significant impact on the organization's innovation and productivity. In addition, it was confirmed that outsourcing and management support had a positive impact on the quality of the machine learning system and organizational cultural factors such as data-driven management and motivation. Data-driven management and motivation were highly perceived in companies' performance.