• Title/Summary/Keyword: Feature Importance

Truck Weight Estimation using Operational Statistics at 3rd Party Logistics Environment (운영 데이터를 활용한 제3자 물류 환경에서의 배송 트럭 무게 예측)

  • Yu-jin Lee;Kyung Min Choi;Song-eun Kim;Kyungsu Park;Seung Hwan Jung
    • Journal of Korean Society of Industrial and Systems Engineering, v.45 no.4, pp.127-133, 2022
  • Many manufacturers that use third-party logistics (3PL) providers face challenges in increasing their logistics efficiency. This study introduces an effort to estimate the weight of the delivery trucks provided by 3PL providers, which allows the manufacturer to package and load products in trailers in advance to reduce delivery time. The accuracy of the weight estimation is all the more important because of the total weight regulation. This study uses not only data from the company but also many general prediction variables such as weather, oil prices, and population of destinations. In addition, operational statistics variables are developed to indicate the availability of trucks in a specific weight category for each 3PL provider. The prediction model, which uses an XGBoost regressor and the permutation feature importance method, provides highly acceptable performance with a MAPE of 2.785% and shows the effectiveness of the developed operational statistics variables.
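
The abstract above names permutation feature importance scored by MAPE. The following is a minimal sketch of that idea in plain Python, not the authors' pipeline: `toy_model` is a hypothetical stand-in for a fitted regressor (the paper uses XGBoost), and the data are synthetic. Importance is taken as the average increase in MAPE after one feature column is shuffled.

```python
import random

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MAPE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mape(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by its shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mape(y, [predict(r) for r in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical toy data: feature 0 drives the target, feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(10, 20), data_rng.uniform(0, 1)] for _ in range(200)]
y = [3.0 * row[0] + 5.0 for row in X]

def toy_model(row):
    # Assumed stand-in for a trained model; it only looks at feature 0.
    return 3.0 * row[0] + 5.0

importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the stand-in model ignores feature 1, shuffling it leaves the error unchanged, while shuffling feature 0 raises the MAPE.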

Study on predictive model and mechanism analysis for martensite transformation temperatures through explainable artificial intelligence (설명가능한 인공지능을 통한 마르텐사이트 변태 온도 예측 모델 및 거동 분석 연구)

  • Junhyub Jeon;Seung Bae Son;Jae-Gil Jung;Seok-Jae Lee
    • Journal of the Korean Society for Heat Treatment, v.37 no.3, pp.103-113, 2024
  • Martensite volume fraction significantly affects the mechanical properties of alloy steels. The martensite start temperature (Ms), the transformation temperature for 50 vol.% martensite (M50), and the transformation temperature for 90 vol.% martensite (M90) are important transformation temperatures for controlling the martensite phase fraction. Several researchers have proposed empirical equations and machine learning models to predict the Ms temperature. These numerical approaches can easily predict the Ms temperature without additional experiments or cost. However, to control the martensite phase fraction more precisely, the prediction error of the Ms model must be reduced and prediction models are needed for the other martensite transformation temperatures (M50, M90). In the present study, machine learning models were applied to build predictive models for the Ms, M50, and M90 temperatures. Explainable artificial intelligence (XAI) is employed to explain the prediction mechanisms of the machine learning models and to quantify feature importance for the martensite transformation temperatures. Among the machine learning models compared, random forest regression (RFR) showed the best performance in predicting the Ms, M50, and M90 temperatures. The feature importance is presented and the prediction mechanisms are discussed using XAI.

Application of Random Forests to Assessment of Importance of Variables in Multi-sensor Data Fusion for Land-cover Classification

  • Park No-Wook;Chi Kwang-Hoon
    • Korean Journal of Remote Sensing, v.22 no.3, pp.211-219, 2006
  • A random forests classifier is applied to multi-sensor data fusion for supervised land-cover classification in order to account for the importance of variables. The random forests approach is a non-parametric ensemble classifier based on CART-like trees. Its distinguishing feature is that the importance of a variable can be estimated by randomly permuting the variable of interest in all the out-of-bag samples for each classifier. Two different multi-sensor data sets for supervised classification were used to illustrate the applicability of random forests: one with optical and polarimetric SAR data and the other with multi-temporal Radarsat-1 and ENVISAT ASAR data sets. In the experiments, the random forests approach was able to extract important variables or bands for land-cover discrimination and showed reasonably good performance in terms of classification accuracy.
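
The out-of-bag permutation idea described in this abstract can be sketched in plain Python. This is an illustration under simplifying assumptions, not the paper's method: bagged one-split decision stumps stand in for CART-like trees, the data are synthetic, and all function names are hypothetical. For each bootstrap classifier, accuracy on its out-of-bag samples is compared before and after permuting one variable, and the drops are averaged over the ensemble.

```python
import random

def fit_stump(X, y):
    """Find the rule 'predict 1 if x[feature] > threshold' with the best training accuracy."""
    best = (0, 0.0, -1.0)  # (feature, threshold, accuracy)
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            acc = sum((row[j] > thr) == bool(label) for row, label in zip(X, y)) / len(y)
            if acc > best[2]:
                best = (j, thr, acc)
    return best[0], best[1]

def accuracy(stump, X, y):
    j, thr = stump
    return sum((row[j] > thr) == bool(label) for row, label in zip(X, y)) / len(y)

def oob_permutation_importance(X, y, n_estimators=25, seed=0):
    """Mean drop in out-of-bag accuracy when one variable is permuted."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    totals = [0.0] * d
    for _ in range(n_estimators):
        in_bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        bag_set = set(in_bag)
        oob = [i for i in range(n) if i not in bag_set]
        if not oob:
            continue  # no out-of-bag samples for this classifier
        stump = fit_stump([X[i] for i in in_bag], [y[i] for i in in_bag])
        X_oob = [X[i] for i in oob]
        y_oob = [y[i] for i in oob]
        base = accuracy(stump, X_oob, y_oob)
        for j in range(d):
            column = [row[j] for row in X_oob]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_oob, column)]
            totals[j] += base - accuracy(stump, X_perm, y_oob)
    return [t / n_estimators for t in totals]

# Hypothetical two-band data: band 0 determines the class, band 1 is noise.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(100)]
y = [1 if row[0] > 0.5 else 0 for row in X]

importances = oob_permutation_importance(X, y)
print(importances)
```

On this toy data the informative band receives a clearly positive score and the noise band scores zero, because no stump ever splits on it.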

Machine-Learning-Based Link Adaptation for Energy-Efficient MIMO-OFDM Systems (MIMO-OFDM 시스템에서 에너지 효율성을 위한 기계 학습 기반 적응형 전송 기술 및 Feature Space 연구)

  • Oh, Myeung Suk;Kim, Gibum;Park, Hyuncheol
    • The Journal of Korean Institute of Electromagnetic Engineering and Science, v.27 no.5, pp.407-415, 2016
  • Recent wireless communication trends have emphasized the importance of energy-efficient transmission. This paper considers link adaptation with a machine learning mechanism for maximum energy efficiency in a multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) wireless system. To reflect frequency-selective MIMO-OFDM channels, a two-dimensional capacity (2D-CAP) feature space is proposed. In addition, a machine-learning-based bit and power adaptation (ML-BPA) algorithm that performs classification-based link adaptation is presented. Simulation results show that the 2D-CAP feature space can represent channel conditions accurately and bring a noticeable improvement in link adaptation performance. Compared with other feature spaces, including the ordered postprocessing signal-to-noise ratio (ordSNR) feature space, 2D-CAP has distinct advantages in either efficiency performance or computational complexity.

Operating Voltage Prediction in Mobile Semiconductor Manufacturing Process Using Machine Learning (기계학습을 활용한 모바일 반도체 제조 공정에서 동작 전압 예측)

  • Inhwan Baek;Seungwoo Jang;Kwangsu Kim
    • Journal of the Semiconductor & Display Technology, v.22 no.1, pp.124-128, 2023
  • Semiconductor engineers have long sought to enhance the energy efficiency of mobile semiconductors by reducing their voltage. During the final stages of the semiconductor manufacturing process, the screening and evaluation of voltage is crucial. However, determining the optimal test start voltage presents a significant challenge, as it can increase testing time. In the semiconductor manufacturing process, a wealth of test element group information is collected. If this information can be used to predict the test voltage, it could reduce testing time and increase the probability of identifying the optimal voltage. To achieve this, this paper explores machine learning techniques, such as linear regression and ensemble models, that can leverage large amounts of information for voltage prediction. The outcomes of these machine learning methods not only demonstrate high consistency but can also be used for feature engineering to enhance accuracy in future processes.

Integration of History-based Parametric CAD Model Translators Using Automation API (오토메이션 API를 사용한 설계 이력 기반 파라메트릭 CAD 모델 번역기의 통합)

  • Kim B.;Han S.
    • Korean Journal of Computational Design and Engineering, v.11 no.3, pp.164-171, 2006
  • As collaborative design and configuration design are of increasing importance in product development, it becomes essential to exchange feature and parametric CAD models among participants. A history-based parametric method has been proposed and implemented. However, each translator that exchanges the feature and parametric information tends to be heavy because it must implement duplicated functions, such as identification of the selected geometries and mapping between features that have different attributes. Furthermore, because the history-based parametric translator uses a procedural model, the XML macro file, as the neutral format, the history-based parametric translators need a geometric modeling kernel to generate an internal explicit geometric model. To ease the problem, we implemented a shared integration platform, the TransCAD. The TransCAD separates translators from the XML macro files. The translators for various CAD systems need to communicate only with the TransCAD. To support communication with the TransCAD, we exposed the functions of the TransCAD through Automation APIs, a technology developed by Microsoft. The Automation APIs of the TransCAD consist of the part modeling functions, the data extraction functions, and the utility functions. Each translator uses these functions to translate a parametric CAD model from the sending CAD system into the XML format, or from the XML format into the model of the receiving CAD system. This paper introduces what the TransCAD is and how it works for the exchange of feature and parametric models.

An Improved method of Two Stage Linear Discriminant Analysis

  • Chen, Yarui;Tao, Xin;Xiong, Congcong;Yang, Jucheng
    • KSII Transactions on Internet and Information Systems (TIIS), v.12 no.3, pp.1243-1263, 2018
  • Two-stage linear discriminant analysis (TSLDA) is a feature extraction technique for solving the small sample size problem in the field of image recognition. TSLDA retains all subspace information of the between-class scatter and within-class scatter. However, the feature information in the four subspaces may not be entirely beneficial for classification, and the regularization procedure for eliminating singular matrices in TSLDA has high time complexity. To address these drawbacks, this paper proposes an improved two-stage linear discriminant analysis (Improved TSLDA). The Improved TSLDA uses a selection and compression method to extract superior feature information from the four subspaces to constitute the optimal projection space, defining a single Fisher criterion to measure the importance of a single feature vector. Meanwhile, the Improved TSLDA also applies an approximation matrix method to eliminate the singular matrices and reduce the time complexity. This paper presents comparative experiments on five face databases and one handwritten digit database to validate the effectiveness of the Improved TSLDA.
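
The abstract does not give the exact form of its "single Fisher criterion", so the sketch below assumes the classical Fisher ratio: between-class scatter divided by within-class scatter of the data projected onto one candidate vector `w`. The function name and the toy points are hypothetical.

```python
def fisher_criterion(w, classes):
    """Classical Fisher ratio of the data projected onto w.

    Computing scatter on the projected values is equivalent to
    (w^T S_b w) / (w^T S_w w) with the usual scatter matrices.
    """
    projected = [[sum(wi * xi for wi, xi in zip(w, x)) for x in pts] for pts in classes]
    all_vals = [v for vals in projected for v in vals]
    grand_mean = sum(all_vals) / len(all_vals)
    between = within = 0.0
    for vals in projected:
        mean = sum(vals) / len(vals)
        between += len(vals) * (mean - grand_mean) ** 2
        within += sum((v - mean) ** 2 for v in vals)
    return between / within

# Hypothetical two-class toy data, separated along the first axis only.
class_a = [(0, 0), (0, 2), (1, 1), (-1, 1)]
class_b = [(4, 0), (4, 2), (5, 1), (3, 1)]

score_x = fisher_criterion((1, 0), [class_a, class_b])  # direction that separates the classes
score_y = fisher_criterion((0, 1), [class_a, class_b])  # direction that does not
print(score_x, score_y)
```

A vector with a higher score separates the classes better along its direction, which is the sense in which such a criterion can rank individual feature vectors.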

Enhancing prediction accuracy of concrete compressive strength using stacking ensemble machine learning

  • Yunpeng Zhao;Dimitrios Goulias;Setare Saremi
    • Computers and Concrete, v.32 no.3, pp.233-246, 2023
  • Accurate prediction of concrete compressive strength can minimize the need for extensive, time-consuming, and costly mixture optimization testing and analysis. This study attempts to enhance the prediction accuracy of compressive strength using stacking ensemble machine learning (ML) with feature engineering techniques. Seven alternative ML models of increasing complexity were implemented and compared: linear regression, SVM, decision tree, multilayer perceptron, random forest, XGBoost, and AdaBoost. To further improve the prediction accuracy, an ML pipeline was proposed in which the feature engineering technique was implemented, and a two-layer stacked model was developed. The k-fold cross-validation approach was employed to optimize model parameters and train the stacked model. The stacked model showed superior performance in predicting concrete compressive strength, with a coefficient of determination (R²) of 0.985. Feature (i.e., variable) importance was determined to demonstrate how useful the synthetic features are in prediction and to provide better interpretability of the data and the model. The methodology in this study promotes a more thorough assessment of alternative ML algorithms rather than a focus on any single ML model type for concrete compressive strength prediction.

A machine learning informed prediction of severe accident progressions in nuclear power plants

  • JinHo Song;SungJoong Kim
    • Nuclear Engineering and Technology, v.56 no.6, pp.2266-2273, 2024
  • A machine learning platform is proposed for the diagnosis of a severe accident progression in a nuclear power plant. To predict the key parameters for accident management, including lost signals, a long short-term memory (LSTM) network is proposed, where multiple accident scenarios are used for training. Training and test data were produced by MELCOR simulation of the Fukushima Daiichi Nuclear Power Plant (FDNPP) accident at unit 3. Feature variables were selected among plant parameters, where the importance ranking was determined by a recursive feature elimination technique using RandomForestRegressor. To answer the question of whether a reduced-order ML model could predict the complex transient response, we performed a systematic sensitivity study for the choices of target variables, the combination of training and test data, the number of feature variables, and the number of neurons to evaluate the performance of the proposed ML platform. The number of sensitivity cases was chosen to guarantee a 95% tolerance limit with a 95% confidence level based on Wilks' formula to quantify the uncertainty of predictions. The results of the investigations indicate that the proposed ML platform consistently predicts the target variable. The median and mean predictions were close to the true value.
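
The abstract says the number of sensitivity cases follows Wilks' formula for a 95% tolerance limit at 95% confidence, but it does not state the order or whether the limit is one-sided. As a hedged illustration, the sketch below assumes the first-order, one-sided form, where the smallest sample size n satisfies 1 - coverage^n >= confidence. The function name is hypothetical.

```python
def wilks_one_sided_sample_size(coverage=0.95, confidence=0.95):
    """Smallest n with 1 - coverage**n >= confidence (first-order, one-sided Wilks)."""
    n = 1
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n

n_cases = wilks_one_sided_sample_size()
print(n_cases)  # 59 for the 95%/95% case under the stated assumption
```

A two-sided or higher-order criterion would require more cases, so this number should not be read as the count used in the paper.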

Analysis of Feature Importance of Ship's Berthing Velocity Using Classification Algorithms of Machine Learning (머신러닝 분류 알고리즘을 활용한 선박 접안속도 영향요소의 중요도 분석)

  • Lee, Hyeong-Tak;Lee, Sang-Won;Cho, Jang-Won;Cho, Ik-Soon
    • Journal of the Korean Society of Marine Environment & Safety, v.26 no.2, pp.139-148, 2020
  • The most important factor affecting the berthing energy generated when a ship berths is the berthing velocity. Thus, an accident may occur if the berthing velocity is extremely high. Several ship features influence the determination of the berthing velocity. However, previous studies have mostly focused on the size of the vessel. Therefore, the aim of this study is to analyze various features that influence berthing velocity and determine their respective importance. The data used in the analysis was based on the berthing velocity of a ship on a jetty in Korea. Using the collected data, machine learning classification algorithms were compared and analyzed, such as decision tree, random forest, logistic regression, and perceptron. As an algorithm evaluation method, indexes according to the confusion matrix were used. Consequently, perceptron demonstrated the best performance, and the feature importance was in the following order: DWT, jetty number, and state. Hence, when berthing a ship, the berthing velocity should be determined in consideration of various features, such as the size of the ship, position of the jetty, and loading condition of the cargo.