• Title/Summary/Keyword: k-Nearest neighbors

Search Result 209, Processing Time 0.025 seconds

Application of Explainable Artificial Intelligence for Predicting Hardness of AlSi10Mg Alloy Manufactured by Laser Powder Bed Fusion (레이저 분말 베드 용융법으로 제조된 AlSi10Mg 합금의 경도 예측을 위한 설명 가능한 인공지능 활용)

  • Junhyub Jeon;Namhyuk Seo;Min-Su Kim;Seung Bae Son;Jae-Gil Jung;Seok-Jae Lee
    • Journal of Powder Materials
    • /
    • v.30 no.3
    • /
    • pp.210-216
    • /
    • 2023
  • In this study, machine learning models are proposed to predict the Vickers hardness of AlSi10Mg alloys fabricated by laser powder bed fusion (LPBF). A total of 113 utilizable datasets were collected from the literature. The hyperparameters of the machine-learning models were adjusted to select an accurate predictive model. The random forest regression (RFR) model showed the best performance compared to support vector regression, artificial neural networks, and k-nearest neighbors. The variable importance and prediction mechanisms of the RFR were discussed by Shapley additive explanation (SHAP). Aging time had the greatest influence on the Vickers hardness, followed by solution time, solution temperature, layer thickness, scan speed, power, aging temperature, average particle size, and hatching distance. Detailed prediction mechanisms for RFR are analyzed using SHAP dependence plots.

Explainable analysis of the Relationship between Hypertension with Gas leakages (설명 가능한 인공지능 기술을 활용한 가스누출과 고혈압의 연관 분석)

  • Dashdondov, Khongorzul;Jo, Kyuri;Kim, Mi-Hye
    • Annual Conference of KIPS
    • /
    • 2022.11a
    • /
    • pp.55-56
    • /
    • 2022
  • Hypertension is a severe health problem and increases the risk of other health issues, such as heart disease, heart attack, and stroke. In this research, we propose a machine learning-based prediction method for the risk of chronic hypertension. The proposed method consists of four main modules. In the first module, the linear interpolation method fills missing values of the integration of gas and meteorological datasets. In the second module, the OrdinalEncoder-based normalization is followed by the Decision tree algorithm to select important features. The prediction analysis module builds three models based on k-Nearest Neighbors, Decision Tree, and Random Forest to predict hypertension levels. Finally, the features used in the prediction model are explained by the DeepSHAP approach. The proposed method is evaluated by integrating the Korean meteorological agency dataset, natural gas leakage dataset, and Korean National Health and Nutrition Examination Survey dataset. The experimental results showed important global features for the hypertension of the entire population and local components for particular patients. Based on the local explanation results for a randomly selected 65-year-old male, the effect of hypertension increased from 0.694 to 1.249 when age increased by 0.37 and gas loss increased by 0.17. Therefore, it is concluded that gas loss is the cause of high blood pressure.

Assessment of wall convergence for tunnels using machine learning techniques

  • Mahmoodzadeh, Arsalan;Nejati, Hamid Reza;Mohammadi, Mokhtar;Ibrahim, Hawkar Hashim;Mohammed, Adil Hussein;Rashidi, Shima
    • Geomechanics and Engineering
    • /
    • v.31 no.3
    • /
    • pp.265-279
    • /
    • 2022
  • Tunnel convergence prediction is essential for the safe construction and design of tunnels. This study proposes five machine learning models of deep neural network (DNN), K-nearest neighbors (KNN), Gaussian process regression (GPR), support vector regression (SVR), and decision trees (DT) to predict the convergence phenomenon during or shortly after the excavation of tunnels. In this respect, a database including 650 datasets (440 for training, 110 for validation, and 100 for test) was gathered from the previously constructed tunnels. In the database, 12 effective parameters on the tunnel convergence and a target of tunnel wall convergence were considered. Both 5-fold and hold-out cross validation methods were used to analyze the predicted outcomes in the ML models. Finally, the DNN method was proposed as the most robust model. Also, to assess each parameter's contribution to the prediction problem, the backward selection method was used. The results showed that the highest and lowest impact parameters for tunnel convergence are tunnel depth and tunnel width, respectively.

Comparative Evaluation of Machine Learning Models for Predicting Soccer Injury Types

  • Davronbek Malikov;Jaeho Kim;Jung Kyu Park
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.27 no.2_1
    • /
    • pp.257-268
    • /
    • 2024
  • Soccer is type of sport that carries a high risk of injury. Injury is not only cause in the unlucky soccer carrier and also team performance as well as financial effects can be worse since soccer is a team-based game. The duration of recovery from a soccer injury typically relies on its type and severity. Therefore, we conduct this research in order to predict the probability of players injury type using machine learning technologies in this paper. Furthermore, we compare different machine learning models to find the best fit model. This paper utilizes various supervised classification machine learning models, including Decision Tree, Random Forest, K-Nearest Neighbors (KNN), and Naive Bayes. Moreover, based on our finding the KNN and Decision models achieved the highest accuracy rates at 70%, surpassing other models. The Random Forest model followed closely with an accuracy score of 62%. Among the evaluated models, the Naive Bayes model demonstrated the lowest accuracy at 56%. We gathered information about 54 professional soccer players who are playing in the top five European leagues based on their career history. We gathered information about 54 professional soccer players who are playing in the top five European leagues based on their career history.

Analysis of disc cutter replacement based on wear patterns using artificial intelligence classification models

  • Yunhee Kim;Jaewoo Shin;Bumjoo Kim
    • Geomechanics and Engineering
    • /
    • v.38 no.6
    • /
    • pp.633-645
    • /
    • 2024
  • Disc cutters, used as excavation tools for rocks in a Tunnel Boring Machine (TBM), naturally undergo wear during the tunneling process, involving crushing and cutting through the ground, leading to various wear types. When disc cutters reach their wear limits, they must be replaced at the appropriate time to ensure efficient excavation. General disc cutter life prediction models are typically used during the design phase to predict the total required quantity and replacement locations for construction. However, disc cutters are replaced more frequently during tunneling than initially planned. Unpredictable disc cutter replacements can easily diminish tunneling efficiency, and abnormal wear is a common cause during tunneling in complex ground conditions. This study aims to overcome the limitations of existing disc cutter life prediction models by utilizing machine data generated during tunneling to predict disc cutter wear patterns and determine the need for replacements in real-time. Artificial intelligence classification algorithms, including K-nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree (DT), and Stacking, are employed to assess the need for disc cutter replacement. Binary classification models are developed to predict which disc cutters require replacement, while multi-class classification models are fine-tuned to identify three categories: no replacement required, replacement due to normal wear, and replacement due to abnormal wear during tunneling. The performance of these models is thoroughly assessed, demonstrating that the proposed approach effectively manages disc cutter wear and replacements in shield TBM tunnel projects.

A Study on the Optimization of Metalloid Contents of Fe-Si-B-C Based Amorphous Soft Magnetic Materials Using Artificial Intelligence Method

  • Young-Sin Choi;Do-Hun Kwon;Min-Woo Lee;Eun-Ji Cha;Junhyup Jeon;Seok-Jae Lee;Jongryoul Kim;Hwi-Jun Kim
    • Archives of Metallurgy and Materials
    • /
    • v.67 no.4
    • /
    • pp.1459-1463
    • /
    • 2022
  • The soft magnetic properties of Fe-based amorphous alloys can be controlled by their compositions through alloy design. Experimental data on these alloys show some discrepancy, however, with predicted values. For further improvement of the soft magnetic properties, machine learning processes such as random forest regression, k-nearest neighbors regression and support vector regression can be helpful to optimize the composition. In this study, the random forest regression method was used to find the optimum compositions of Fe-Si-B-C alloys. As a result, the lowest coercivity was observed in Fe80.5Si3.63B13.54C2.33 at.% and the highest saturation magnetization was obtained Fe81.83Si3.63B12.63C1.91 at.% with R2 values of 0.74 and 0.878, respectively.

Machine-learning Approaches with Multi-temporal Remotely Sensed Data for Estimation of Forest Biomass and Forest Reference Emission Levels (시계열 위성영상과 머신러닝 기법을 이용한 산림 바이오매스 및 배출기준선 추정)

  • Yong-Kyu, Lee;Jung-Soo, Lee
    • Journal of Korean Society of Forest Science
    • /
    • v.111 no.4
    • /
    • pp.603-612
    • /
    • 2022
  • The study aims were to evaluate a machine-learning, algorithm-based, forest biomass-estimation model to estimate subnational forest biomass and to comparatively analyze REDD+ forest reference emission levels. Time-series Landsat satellite imagery and ESA Biomass Climate Change Initiative information were used to build a machine-learning-based biomass estimation model. The k-nearest neighbors algorithm (kNN), which is a non-parametric learning model, and the tree-based random forest (RF) model were applied to the machine-learning algorithm, and the estimated biomasses were compared with the forest reference emission levels (FREL) data, which was provided by the Paraguayan government. The root mean square error (RMSE), which was the optimum parameter of the kNN model, was 35.9, and the RMSE of the RF model was lower at 34.41, showing that the RF model was superior. As a result of separately using the FREL, kNN, and RF methods to set the reference emission levels, the gradient was set to approximately -33,000 tons, -253,000 tons, and -92,000 tons, respectively. These results showed that the machine learning-based estimation model was more suitable than the existing methods for setting reference emission levels.

Improving minority prediction performance of support vector machine for imbalanced text data via feature selection and SMOTE (단어선택과 SMOTE 알고리즘을 이용한 불균형 텍스트 데이터의 소수 범주 예측성능 향상 기법)

  • Jongchan Kim;Seong Jun Chang;Won Son
    • The Korean Journal of Applied Statistics
    • /
    • v.37 no.4
    • /
    • pp.395-410
    • /
    • 2024
  • Text data is usually made up of a wide variety of unique words. Even in standard text data, it is common to find tens of thousands of different words. In text data analysis, usually, each unique word is treated as a variable. Thus, text data can be regarded as a dataset with a large number of variables. On the other hand, in text data classification, we often encounter class label imbalance problems. In the cases of substantial imbalances, the performance of conventional classification models can be severely degraded. To improve the classification performance of support vector machines (SVM) for imbalanced data, algorithms such as the Synthetic Minority Over-sampling Technique (SMOTE) can be used. The SMOTE algorithm synthetically generates new observations for the minority class based on the k-Nearest Neighbors (kNN) algorithm. However, in datasets with a large number of variables, such as text data, errors may accumulate. This can potentially impact the performance of the kNN algorithm. In this study, we propose a method for enhancing prediction performance for the minority class of imbalanced text data. Our approach involves employing variable selection to generate new synthetic observations in a reduced space, thereby improving the overall classification performance of SVM.

CNN-based Adaptive K for Improving Positioning Accuracy in W-kNN-based LTE Fingerprint Positioning

  • Kwon, Jae Uk;Chae, Myeong Seok;Cho, Seong Yun
    • Journal of Positioning, Navigation, and Timing
    • /
    • v.11 no.3
    • /
    • pp.217-227
    • /
    • 2022
  • In order to provide a location-based services regardless of indoor or outdoor space, it is important to provide position information of the terminal regardless of location. Among the wireless/mobile communication resources used for this purpose, Long Term Evolution (LTE) signal is a representative infrastructure that can overcome spatial limitations, but the positioning method based on the location of the base station has a disadvantage in that the accuracy is low. Therefore, a fingerprinting technique, which is a pattern recognition technology, has been widely used. The simplest yet widely applied algorithm among Fingerprint positioning technologies is k-Nearest Neighbors (kNN). However, in the kNN algorithm, it is difficult to find the optimal K value with the lowest positioning error for each location to be estimated, so it is generally fixed to an appropriate K value and used. Since the optimal K value cannot be applied to each estimated location, therefore, there is a problem in that the accuracy of the overall estimated location information is lowered. Considering this problem, this paper proposes a technique for adaptively varying the K value by using a Convolutional Neural Network (CNN) model among Artificial Neural Network (ANN) techniques. First, by using the signal information of the measured values obtained in the service area, an image is created according to the Physical Cell Identity (PCI) and Band combination, and an answer label for supervised learning is created. Then, the structure of the CNN is modeled to classify K values through the image information of the measurements. The performance of the proposed technique is verified based on actual data measured in the testbed. As a result, it can be seen that the proposed technique improves the positioning performance compared to using a fixed K value.

A Study on Application of Machine Learning Algorithms to Visitor Marketing in Sports Stadium (기계학습 알고리즘을 사용한 스포츠 경기장 방문객 마케팅 적용 방안)

  • Park, So-Hyun;Ihm, Sun-Young;Park, Young-Ho
    • Journal of Digital Contents Society
    • /
    • v.19 no.1
    • /
    • pp.27-33
    • /
    • 2018
  • In this study, we analyze the big data of visitors who are looking for a sports stadium in marketing field and conduct research to provide customized marketing service to consumers. For this purpose, we intend to derive a similar visitor group by using the K-means clustering method. Also, we will use the K-nearest neighbors method to predict the store of interest for new visitors. As a result of the experiment, it was possible to provide a marketing service suitable for each group attribute by deriving a group of similar visitors through the above two algorithms, and it was possible to recommend products and events for new visitors.