• Title/Summary/Keyword: Robust SVM

Machine learning techniques for reinforced concrete's tensile strength assessment under different wetting and drying cycles

  • Ibrahim Albaijan;Danial Fakhri;Adil Hussein Mohammed;Arsalan Mahmoodzadeh;Hawkar Hashim Ibrahim;Khaled Mohamed Elhadi;Shima Rashidi
    • Steel and Composite Structures / v.49 no.3 / pp.337-348 / 2023
  • Successive wetting and drying cycles of concrete due to weather changes can endanger the safety of engineering structures over time. Considering wetting and drying cycles in concrete tests can lead to a more accurate and reliable design of engineering structures. This study aims to provide a model that can be used to estimate the resistance properties of concrete under different wetting and drying cycles. Complex sample preparation methods, the need for highly accurate and sensitive instruments, early sample failure, and brittle samples all make it difficult to measure the strength of concrete in the laboratory. To address these problems, this study investigated the ability of six machine learning techniques (ANN, SVM, RF, KNN, XGBoost, and NB) to predict the tensile strength of concrete, using 240 data samples obtained with the Brazilian test (80% for training and 20% for testing). The tests examined the effect of additives such as glass and polypropylene, as well as the effect of wetting and drying cycles, on the tensile strength of concrete. The statistical analysis showed that the XGBoost model was the most robust in predicting the concrete tensile strength, with R² = 0.9155, mean absolute error (MAE) = 0.1080 MPa, and variance accounted for (VAF) = 91.54%. The significance of this work is that it allows civil engineers to accurately estimate the tensile strength of different types of concrete. In this way, the substantial time and cost required for laboratory tests can be avoided.
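The three scores quoted in this abstract (R², MAE, VAF) can be computed as in the sketch below. It assumes the commonly used definitions, which the abstract does not spell out, in particular VAF = (1 - var(y - ŷ) / var(y)) × 100. The function names and the sample values are illustrative, not data from the paper.

```python
from statistics import fmean, pvariance


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference, in the units of the target (e.g. MPa)."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


def variance_accounted_for(y_true, y_pred):
    """VAF in percent, assuming VAF = (1 - var(residuals) / var(y)) * 100."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return (1.0 - pvariance(residuals) / pvariance(y_true)) * 100.0


# Made-up tensile strengths (MPa) and predictions, for illustration only.
measured = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 3.0, 3.5, 5.0]

print(r2_score(measured, predicted))
print(mean_absolute_error(measured, predicted))
print(variance_accounted_for(measured, predicted))
```

Unlike R², VAF ignores a constant bias in the predictions, so the two scores can differ when a model is systematically offset.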

Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution (불균형 데이터 환경에서 변수가중치를 적용한 사례기반추론 기반의 고객반응 예측)

  • Kim, Eunmi;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.29-45 / 2015
  • Response modeling is a well-studied problem in which researchers seek better performance in predicting customers' responses to marketing promotions. A response model reduces marketing cost by identifying prospective customers in a very large customer database and predicting the purchase intention of the selected customers, whereas a promotion derived from an undifferentiated marketing strategy incurs unnecessary cost. In addition, the big data environment has accelerated the development of response models based on data mining techniques such as case-based reasoning (CBR), neural networks, and support vector machines. CBR is one of the major tools in business because it is known to be simple and robust when applied to response modeling. It remains an attractive technique for business data mining applications even though it has not shown high performance compared with other machine learning techniques. Thus, many studies have tried to improve CBR and apply it to business data mining with enhanced algorithms or with the support of other techniques such as genetic algorithms, decision trees, and AHP (Analytic Hierarchy Process). Ahn and Kim (2008) used logit, neural networks, and CBR to predict which customers would purchase the items promoted by a marketing department, and optimized the number k of nearest neighbors with a genetic algorithm to improve the performance of the integrated model. Hong and Park (2009) noted that an integrated approach combining CBR with logit, neural networks, and Support Vector Machine (SVM) predicted customers' responses to marketing promotions better than the individual logit, neural network, and SVM models. This paper presents an approach to predicting customers' responses to marketing promotions with case-based reasoning. The proposed model was developed by applying a different weight to each feature. We fitted a logit model to a database containing promotion and purchasing data for bath soap, and then used its coefficients as the feature weights of CBR. We empirically compared the proposed weighted CBR model with neural networks and a pure CBR model and found that the weighted CBR model outperformed the pure CBR model. Imbalanced data is a common problem when building a data mining model to classify real data, as in bankruptcy prediction, intrusion detection, fraud detection, churn management, and response modeling. Imbalanced data means that the number of instances in one class is remarkably small or large compared with the number of instances in the other classes. A classification model such as a response model has difficulty recognizing patterns from such data through learning because it tends to ignore the minority class while classifying the majority class correctly. Sampling is one of the most representative approaches to the problem caused by an imbalanced data distribution, and sampling methods can be categorized into undersampling and oversampling. However, CBR is not sensitive to data distribution because, unlike machine learning algorithms, it does not learn from data. In this study, we investigated the robustness of the proposed model while varying the ratio of response customers to nonresponse customers, because in the real world the customers who respond to a promotion are always a small fraction of those who do not. We ran the proposed model 100 times to validate its robustness under different ratios of response customers to nonresponse customers in the imbalanced data distribution. Finally, we found that the proposed CBR-based model outperformed the compared models on the imbalanced data sets. Our study is expected to improve the performance of response models for promotion programs with CBR under imbalanced data distributions in the real world.
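The idea of weighting CBR features with logit coefficients can be illustrated with a weighted nearest-neighbor retrieval. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the absolute value of each coefficient is used as a weight in a Euclidean distance, and the coefficients, case base, and function names are all hypothetical.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def predict_response(case_base, query, weights, k=3):
    """Retrieve the k most similar past cases and return their majority label."""
    ranked = sorted(
        case_base,
        key=lambda case: weighted_distance(case[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical logit coefficients; their absolute values act as feature weights.
logit_coefficients = [1.8, -0.2]
weights = [abs(c) for c in logit_coefficients]

# Hypothetical case base of (features, responded) pairs, features scaled to [0, 1].
case_base = [
    ([0.9, 0.1], 1),
    ([0.8, 0.9], 1),
    ([0.7, 0.2], 1),
    ([0.2, 0.1], 0),
    ([0.1, 0.8], 0),
    ([0.3, 0.3], 0),
]

prediction = predict_response(case_base, [0.75, 0.15], weights, k=3)
print(prediction)
```

Because the first feature carries a much larger weight, cases that differ mainly in the second feature still count as close neighbors; that is the effect the feature weighting is meant to produce.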

Face Identification Using a Near-Infrared Camera in a Nonrestrictive In-Vehicle Environment (적외선 카메라를 이용한 비제약적 환경에서의 얼굴 인증)

  • Ki, Min Song;Choi, Yeong Woo
    • KIPS Transactions on Software and Data Engineering / v.10 no.3 / pp.99-108 / 2021
  • A driver's face inside a vehicle is subject to unconstrained conditions, such as changes in lighting, partial occlusion, and various changes in the driver's state. In this paper, we propose a face identification system for an unconstrained vehicle environment. The proposed method uses a near-infrared (NIR) camera to minimize the changes in facial images caused by illumination changes inside and outside the vehicle. To handle a face exposed to extreme light, the normal face image is converted into a simulated overexposed image, using its mean and variance, for training. Thus, facial classifiers are generated for both normal and extreme illumination conditions. Our method identifies a face by detecting facial landmarks and aggregating the confidence score of each landmark for the final decision. Because each landmark is recognized separately, the method is robust to partial occlusion, and the performance improvement is highest in the class where the driver wears glasses or sunglasses: the driver can still be recognized from the scores of the remaining visible landmarks. We also propose a novel robust rejection method and a new evaluation method that considers the relations between registered and unregistered drivers. Experimental results on our own dataset and on the PolyU and ORL datasets demonstrate the effectiveness of the proposed method.

Wildfire Severity Mapping Using Sentinel Satellite Data Based on Machine Learning Approaches (Sentinel 위성영상과 기계학습을 이용한 국내산불 피해강도 탐지)

  • Sim, Seongmun;Kim, Woohyeok;Lee, Jaese;Kang, Yoojin;Im, Jungho;Kwon, Chunguen;Kim, Sungyong
    • Korean Journal of Remote Sensing / v.36 no.5_3 / pp.1109-1123 / 2020
  • In South Korea, where forest is a major land cover class (over 60% of the country), many wildfires occur every year. Wildfires weaken the shear strength of the soil, forming a soil layer that is vulnerable to landslides. To manage forests sustainably, it is important to identify the severity of a wildfire as well as the burned area. Although satellite remote sensing has been widely used to map wildfire severity, it is often difficult to determine the severity using only the temporal change of satellite-derived indices such as the Normalized Difference Vegetation Index (NDVI) and the Normalized Burn Ratio (NBR). In this study, we proposed an approach for determining wildfire severity based on machine learning through the synergistic use of Sentinel-1A C-band Synthetic Aperture Radar data and Sentinel-2A Multi Spectral Instrument data. Three wildfire cases (Samcheok in May 2017, Gangreung·Donghae in April 2019, and Gosung·Sokcho in April 2019) were used to develop wildfire severity mapping models with three machine learning algorithms (Random Forest, Logistic Regression, and Support Vector Machine). The random forest model yielded the best performance, with an overall accuracy of 82.3%. Cross-site validation, carried out to examine the spatiotemporal transferability of the machine learning models, showed that the models were highly sensitive to temporal differences between the training and validation sites, especially in the early growing season. This implies that a more robust model with high spatiotemporal transferability can be developed when more wildfire cases from different seasons and areas are added in the future.
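The two indices named in this abstract are simple band ratios, and their "temporal change" is usually a pre-fire minus post-fire difference. The sketch below uses the standard definitions NDVI = (NIR - Red) / (NIR + Red) and NBR = (NIR - SWIR) / (NIR + SWIR). The reflectance values are made up, and the choice of sensor bands is left to the reader.

```python
def normalized_difference(band_a, band_b):
    """(a - b) / (a + b), returning 0.0 when both reflectances are zero."""
    total = band_a + band_b
    return 0.0 if total == 0 else (band_a - band_b) / total


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def nbr(nir, swir):
    """Normalized Burn Ratio."""
    return normalized_difference(nir, swir)


def dnbr(pre_nir, pre_swir, post_nir, post_swir):
    """Temporal change of NBR: pre-fire NBR minus post-fire NBR."""
    return nbr(pre_nir, pre_swir) - nbr(post_nir, post_swir)


# Hypothetical surface reflectances for one pixel before and after a fire.
burn_change = dnbr(pre_nir=0.5, pre_swir=0.1, post_nir=0.2, post_swir=0.3)
print(burn_change)
```

A larger positive dNBR indicates a stronger loss of vegetation signal. The abstract's point is that thresholds on such a difference alone are often not enough, which motivates feeding several inputs to a classifier.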

Bankruptcy Type Prediction Using A Hybrid Artificial Neural Networks Model (하이브리드 인공신경망 모형을 이용한 부도 유형 예측)

  • Jo, Nam-ok;Kim, Hyun-jung;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.79-99 / 2015
  • The prediction of bankruptcy has been extensively studied in the accounting and finance field. It can have an important impact on lending decisions and on the profitability of financial institutions in terms of risk management. Many researchers have focused on constructing a more robust bankruptcy prediction model. Early studies primarily used statistical techniques such as multiple discriminant analysis (MDA) and logit analysis for bankruptcy prediction. However, many studies have demonstrated that artificial intelligence (AI) approaches, such as artificial neural networks (ANN), decision trees, case-based reasoning (CBR), and support vector machines (SVM), have outperformed statistical techniques since the 1990s for business classification problems, because statistical methods rely on rigid assumptions in their application. In previous studies on corporate bankruptcy, many researchers have focused on developing a bankruptcy prediction model using financial ratios. However, few studies have addressed the specific types of bankruptcy. Previous bankruptcy prediction models have generally been concerned with predicting whether or not firms will become bankrupt, and most of the studies on bankruptcy types have focused on reviewing the previous literature or performing a case study. This study therefore develops a model using data mining techniques to predict the specific types of bankruptcy, as well as the occurrence of bankruptcy, in Korean small- and medium-sized construction firms in terms of profitability, stability, and activity indices. With such a model, firms can take steps to prevent bankruptcy in advance. We propose a hybrid approach using two artificial neural networks (ANNs) for the prediction of bankruptcy types. The first is a back-propagation neural network (BPN) model using supervised learning for bankruptcy prediction, and the second is a self-organizing map (SOM) model using unsupervised learning to classify bankruptcy data into several types. Based on the constructed model, we predict the bankruptcy of companies by applying the BPN model to a validation set that was not used in developing the model. The bankruptcy data predicted by the BPN model are then used to identify the specific types of bankruptcy. To interpret the characteristics of the clusters derived by the SOM model, we calculated the average of the input variables selected through statistical tests for each cluster. Each cluster represents a bankruptcy type derived from the data of bankrupt firms, and the input variables, which are financial ratios, are used to interpret the meaning of each cluster. The experimental results show that each of the five bankruptcy types has different characteristics according to its financial ratios. Type 1 (severe bankruptcy) has inferior financial statements except for EBITDA (earnings before interest, taxes, depreciation, and amortization) to sales. Type 2 (lack of stability) has a low quick ratio, low stockholders' equity to total assets, and high total borrowings to total assets. Type 3 (lack of activity) has a slightly low total asset turnover and fixed asset turnover. Type 4 (lack of profitability) has low retained earnings to total assets and low EBITDA to sales, which are indices of profitability. Type 5 (recoverable bankruptcy) includes firms that are in relatively good financial condition compared with the other bankruptcy types even though they are bankrupt. Based on these findings, researchers and practitioners engaged in the credit evaluation field can obtain more useful information about the types of corporate bankruptcy. In this paper, we used the financial ratios of firms to classify bankruptcy types. It is important to select input variables that correctly predict bankruptcy and meaningfully classify the type of bankruptcy. In a further study, we will include non-financial factors such as the size, industry, and age of the firms, so that more realistic clustering results for bankruptcy types can be obtained by combining qualitative factors and reflecting the domain knowledge of experts.
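The clustering stage described in this abstract (a SOM groups firms predicted as bankrupt, then each cluster is read through its average financial ratios) can be sketched as follows. This is a toy one-dimensional SOM written for illustration only, not the authors' model. The map size, learning-rate and radius schedules, initialization from the first samples, and the two scaled ratios are all assumptions.

```python
import math


def best_matching_unit(nodes, x):
    """Index of the node whose weight vector is closest to x."""
    return min(
        range(len(nodes)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(nodes[i], x)),
    )


def train_som(data, n_nodes, epochs=50, lr0=0.5, radius0=1.0):
    """Train a 1-D SOM; nodes start at the first n_nodes samples (assumption)."""
    nodes = [list(sample) for sample in data[:n_nodes]]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # linear decay from 1 towards 0
        lr = lr0 * decay
        radius = radius0 * decay
        for x in data:
            bmu = best_matching_unit(nodes, x)
            for i, node in enumerate(nodes):
                # Gaussian neighbourhood over node indices on the 1-D map.
                influence = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(len(node)):
                    node[d] += lr * influence * (x[d] - node[d])
    return nodes


def cluster_means(data, nodes):
    """Average each input variable over the samples assigned to each node."""
    groups = {}
    for x in data:
        groups.setdefault(best_matching_unit(nodes, x), []).append(x)
    return {
        i: [sum(col) / len(col) for col in zip(*members)]
        for i, members in groups.items()
    }


# Hypothetical scaled ratios for firms already predicted bankrupt:
# (quick ratio, total asset turnover).
bankrupt_firms = [
    [0.0, 0.0], [1.0, 1.0], [0.1, 0.0],
    [0.9, 1.0], [0.0, 0.1], [1.0, 0.9],
]
nodes = train_som(bankrupt_firms, n_nodes=2)
profiles = cluster_means(bankrupt_firms, nodes)
print(profiles)
```

Each entry of `profiles` plays the role of a cluster's average financial ratios, which is the basis on which the abstract names its bankruptcy types.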