• Title/Summary/Keyword: Correlation based Feature Selection

Several models for tunnel boring machine performance prediction based on machine learning

  • Mahmoodzadeh, Arsalan;Nejati, Hamid Reza;Ibrahim, Hawkar Hashim;Ali, Hunar Farid Hama;Mohammed, Adil Hussein;Rashidi, Shima;Majeed, Mohammed Kamal
    • Geomechanics and Engineering / v.30 no.1 / pp.75-91 / 2022
  • This paper shows how several machine learning (ML) methods can be used to systematically estimate the tunnel boring machine penetration rate (TBM-PR). To this end, a database of 1125 data sets was established, comprising uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), punch slope index (PSI), distance between the planes of weakness (DPW), orientation of discontinuities (alpha angle, α), rock fracture class (RFC), and the actual (measured) TBM-PRs. The ML methods were evaluated with 5-fold cross-validation. Comparing the ML outcomes with the TBM monitoring data indicated that the ML methods have very good potential for predicting TBM-PR. Among them, the long short-term memory model, with a correlation coefficient of 0.9932 and a root mean square error of 2.68E-6, outperformed the remaining six ML algorithms. The backward selection method showed that PSI and RFC were, respectively, the most and least significant parameters for TBM-PR.
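
The evaluation described in this abstract (5-fold cross-validation scored by correlation coefficient and root mean square error) can be sketched in a few lines. This is a minimal illustration only: the function names `pearson_r`, `rmse`, and `k_fold_indices` are hypothetical, the folds are assumed to be contiguous and unshuffled, and nothing here reproduces the authors' models or data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous folds (assumption: no shuffling)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# With 1125 samples and k=5, each fold holds out 225 samples for testing.
folds = list(k_fold_indices(1125, k=5))
```

In practice each model would be fitted on `train_idx` and scored on `test_idx` with `pearson_r` and `rmse`, and the five scores averaged.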

Personal Biometric Identification based on ECG Features (ECG 특징추출 기반 개인 바이오 인식)

  • Yoon, Seok-Joo;Kim, Gwang-Jun
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.4 / pp.521-526 / 2015
  • Research on using human biological characteristics to confirm an individual's identity is being actively conducted. An electrocardiogram (ECG) based biometric system is difficult to counterfeit and does not cause skin irritation in the subject. It can easily be combined with conventional biometrics such as fingerprint and face recognition to form multimodal biometric systems. In this paper, a biometric identification method that analyzes ECG waveform characteristics from Discrete Wavelet Transform (DWT) coefficients is proposed. Feature selection is performed on the 9 DWT coefficients using correlation analysis. Verification is carried out with an error back-propagation neural network. Applying the proposed approach to 24 subjects from the MIT-BIH QT Database yielded a verification rate of 98.88%.

Proposing the Method for Improving the Forecast Accuracy of Loan Underwriting (대출심사의 예측 정확도 향상을 위한 방법 제안)

  • Yang, Yu-Young;Park, Sang-Sung;Shin, Young-Geun;Jang, Dong-Sik
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.4 / pp.1419-1429 / 2010
  • The industry structure and environment of domestic banks changed with the influx of large foreign banks and advanced financial products when the currency crisis erupted in Korea. In a competitive environment, accurate forecasts of changes and trends are essential for survival and development. Forecasting whether to approve a customer's loan application is important because it bears on the bank's profit generation and risk management. This paper therefore proposes a method to improve the forecast accuracy of loan underwriting. The experimental process is as follows. First, we select the predictor variables that significantly affect the loan underwriting result using correlation analysis and a feature selection technique, and then cluster the customers with the 2-Step clustering technique based on the selected variables. Second, we find the most accurate forecasting model for each cluster by applying LR, NN, and SVM. Finally, we compare the forecasting accuracy of the proposed method with that of the existing approach.

Music Exploring Interface using Emotional Model (감성모델을 이용한 음악 탐색 인터페이스)

  • Yoo, Min-Joon;Kim, Hyun-Ju;Lee, In-Kwon
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2009.02a / pp.707-710 / 2009
  • In this paper, we introduce an interface for exploring music using an emotional model. First, we survey the arousal-valence (AV) factors of various pieces of music and calculate the correlation between the audio features of the music and the arousal-valence factors to build an AV model. The music is then aligned and arranged using the AV model, and the user can explore it in this interface. To make selecting the desired music more intuitive, we introduce a new fade in/out function based on the location of the user's mouse pointer. We also offer several selection modes so that the user can explore music in the most suitable mode. With our interface, the user can more easily find music that matches the desired emotion.

Study for Classification of Facial Expression using Distance Features of Facial Landmarks (얼굴 랜드마크 거리 특징을 이용한 표정 분류에 대한 연구)

  • Bae, Jin Hee;Wang, Bo Hyeon;Lim, Joon S.
    • Journal of IKEEE / v.25 no.4 / pp.613-618 / 2021
  • Facial expression recognition has long been a subject of continuous research in various fields. In this paper, the relationships between facial landmarks are analyzed using features obtained by calculating the distances between the landmarks in an image, and five facial expressions are classified. We increased data and label reliability through labeling work with multiple observers. In addition, faces were recognized in the original data, and landmark coordinates were extracted and used as features. A genetic algorithm was used to select the features that are relatively more helpful for classification. We performed facial expression classification and analysis with the proposed method, and the results show its validity and effectiveness.

Assessment of Landslide Susceptibility in Jecheon Using Deep Learning Based on Exploratory Data Analysis (데이터 탐색을 활용한 딥러닝 기반 제천 지역 산사태 취약성 분석)

  • Sang-A Ahn;Jung-Hyun Lee;Hyuck-Jin Park
    • The Journal of Engineering Geology / v.33 no.4 / pp.673-687 / 2023
  • Exploratory data analysis is the process of observing and understanding data collected from various sources to identify their distributions and correlations through their structures and characterization. This process can be used to identify correlations among conditioning factors and select the most effective factors for analysis. This can help the assessment of landslide susceptibility, because landslides are usually triggered by multiple factors, and the impacts of these factors vary by region. This study compared two stages of exploratory data analysis to examine the impact of the data exploration procedure on the landslide prediction model's performance with respect to factor selection. Deep-learning-based landslide susceptibility analysis used either a combination of selected factors or all 23 factors. During the data exploration phase, we used a Pearson correlation coefficient heat map and a histogram of random forest feature importance. We then assessed the accuracy of our deep-learning-based analysis of landslide susceptibility using a confusion matrix. Finally, a landslide susceptibility map was generated using the landslide susceptibility index derived from the proposed analysis. The analysis revealed that using all 23 factors resulted in low accuracy (55.90%), but using the 13 factors selected in one step of exploration improved the accuracy to 81.25%. This was further improved to 92.80% using only the nine conditioning factors selected during both steps of the data exploration. Therefore, exploratory data analysis selected the conditioning factors most suitable for landslide susceptibility analysis, thereby improving the performance of the analysis.
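
The idea of using Pearson correlations to prune conditioning factors can be illustrated with a simple greedy filter. This is a sketch under stated assumptions, not the study's procedure: the 0.9 threshold, the function name `correlation_filter`, and the toy factor names are all hypothetical, and the random forest importance step is not modeled.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def correlation_filter(features, target, max_inter_corr=0.9):
    """Greedy filter: rank features by |r| with the target, then keep a feature
    only if its |r| with every already-kept feature is below max_inter_corr."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson_r(features[name], target)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if all(abs(pearson_r(features[name], features[k])) < max_inter_corr
               for k in kept):
            kept.append(name)
    return kept

# Toy data (invented for illustration): "slope_copy" nearly duplicates "slope".
features = {
    "slope": [1, 2, 3, 4, 5, 6],
    "slope_copy": [2, 4, 6, 8, 10, 12.5],
    "rainfall": [3, 1, 4, 1, 5, 9],
}
target = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
selected = correlation_filter(features, target)
```

On this toy input the filter keeps one of the two near-duplicate slope columns plus `rainfall`, which is the kind of redundancy a correlation heat map makes visible.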

Improving Field Crop Classification Accuracy Using GLCM and SVM with UAV-Acquired Images

  • Seung-Hwan Go;Jong-Hwa Park
    • Korean Journal of Remote Sensing / v.40 no.1 / pp.93-101 / 2024
  • Accurate field crop classification is essential for various agricultural applications, yet existing methods face challenges due to diverse crop types and complex field conditions. This study aimed to address these issues by combining support vector machine (SVM) models with multi-seasonal unmanned aerial vehicle (UAV) images, texture information extracted from Gray Level Co-occurrence Matrix (GLCM), and RGB spectral data. Twelve high-resolution UAV image captures spanned March-October 2021, while field surveys on three dates provided ground truth data. We focused on data from August (-A), September (-S), and October (-O) images and trained four support vector classifier (SVC) models (SVC-A, SVC-S, SVC-O, SVC-AS) using visual bands and eight GLCM features. Farm maps provided by the Ministry of Agriculture, Food and Rural Affairs proved efficient for open-field crop identification and served as a reference for accuracy comparison. Our analysis showcased the significant impact of hyperparameter tuning (C and gamma) on SVM model performance, requiring careful optimization for each scenario. Importantly, we identified models exhibiting distinct high-accuracy zones, with SVC-O trained on October data achieving the highest overall and individual crop classification accuracy. This success likely stems from its ability to capture distinct texture information from mature crops. Incorporating GLCM features proved highly effective for all models, significantly boosting classification accuracy. Among these features, homogeneity, entropy, and correlation consistently demonstrated the most impactful contribution. However, balancing accuracy with computational efficiency and feature selection remains crucial for practical application. Performance analysis revealed that SVC-O achieved exceptional results in overall and individual crop classification, while soybeans and rice were consistently classified well by all models. Challenges were encountered with cabbage due to its early growth stage and low field cover density. The study demonstrates the potential of utilizing farm maps and GLCM features in conjunction with SVM models for accurate field crop classification. Careful parameter tuning and model selection based on specific scenarios are key for optimizing performance in real-world applications.
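
The three GLCM texture features singled out in this abstract (homogeneity, entropy, correlation) can be computed from a small quantized image as follows. This is a minimal sketch, assuming a single horizontal pixel offset, a symmetric normalized matrix, and one common definition of each feature; the study's gray-level count, offsets, window size, and exact feature formulas are not given here, and the function names are hypothetical.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions so the matrix is symmetric
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Homogeneity, entropy (base 2), and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    if var_i == 0 or var_j == 0:
        correlation = 0.0  # undefined for a constant image; 0.0 is this sketch's choice
    else:
        cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
        correlation = cov / math.sqrt(var_i * var_j)
    return {"homogeneity": homogeneity, "entropy": entropy, "correlation": correlation}

# A 4x4 image quantized to 4 gray levels (toy data for illustration).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = glcm_features(glcm(image, levels=4))
```

In a classification pipeline, features like these would typically be computed per window or per field polygon and appended to the spectral bands before training the SVM.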

Design of Automatic Document Classifier for IT documents based on SVM (SVM을 이용한 디렉토리 기반 기술정보 문서 자동 분류시스템 설계)

  • Kang, Yun-Hee;Park, Young-B.
    • Journal of IKEEE / v.8 no.2 s.15 / pp.186-194 / 2004
  • Due to the exponential growth of information on the internet, it is becoming difficult to find and organize relevant information. To reduce the heavy overload of information access, automatic text classification for handling enormous numbers of documents is necessary. In this paper, we describe the structure and implementation of a document classification system for web documents. We use an SVM as the document classification model, constructed from a training set and its representative terms in a directory. In our system, the SVM is trained and used for document classification with a word set extracted from web documents related to information and communication. In addition, we use the vector-space model to represent document characteristics based on TF-IDF, and the training data consist of positive and negative classes represented by weighted characteristic sets. Experiments show the categorization results and the correlation of vector length.
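
A TF-IDF vector-space representation of the kind mentioned in this abstract can be sketched as below. The weighting variant is an assumption (term frequency normalized by document length, IDF as the natural log of N over document frequency, whitespace tokenization); the abstract does not specify which variant the system uses, and `tfidf_vectors` is a hypothetical name.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one TF-IDF vector per non-empty document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocab = sorted({term for doc in tokenized for term in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocab}
    vectors = []
    for doc in tokenized:
        term_freq = Counter(doc)
        vectors.append([term_freq[t] / len(doc) * idf[t] for t in vocab])
    return vocab, vectors

# Toy documents (invented): "network" occurs everywhere, so its weight is zero.
docs = [
    "network router protocol",
    "network stock market",
    "network protocol packet",
]
vocab, vectors = tfidf_vectors(docs)
```

Vectors like these, labeled as positive or negative examples for a directory, are the usual input to an SVM text classifier.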

Evaluation of Firmness and Sweetness Index of Tomatoes using Hyperspectral Imaging

  • Rahman, Anisur;Faqeerzada, Mohammad Akbar;Joshi, Rahul;Cho, Byoung-Kwan
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2017.04a / pp.44-44 / 2017
  • The objective of this study was to evaluate the firmness and sweetness index (SI) of tomatoes (Lycopersicum esculentum) using hyperspectral imaging (HSI) in the range of 1000-1400 nm. The mean spectra of 95 mature tomato samples were extracted from the hyperspectral images, and the reference firmness and sweetness index of the same samples were measured and calibrated against their corresponding spectral data by partial least squares (PLS) regression with different preprocessing methods. The PLS regression model based on Savitzky-Golay (S-G) second-derivative preprocessed spectra performed better for firmness and SI than models developed with other preprocessing methods, with correlation coefficients (rpred) of 0.82 and 0.74 and standard errors of prediction (SEP) of 0.86 N and 0.63, respectively. The feature wavelengths were then identified using a model-based variable selection method, variable importance in projection (VIP), resulting from the PLS regression analyses, and finally chemical images were derived by applying the respective regression coefficients to the spectral image in a pixel-wise manner. The resulting chemical images provided detailed information on the firmness and sweetness index of tomatoes. This research therefore demonstrated that the HSI technique has potential for rapid and non-destructive evaluation of the firmness and sweetness index of tomatoes.

An analysis of satisfaction index on computer education of university using kernel machine (커널머신을 이용한 대학의 컴퓨터교육 만족도 분석)

  • Pi, Su-Young;Park, Hye-Jung;Ryu, Kyung-Hyun
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.921-929 / 2011
  • In the information age, liberal arts computer education courses at universities set goals of promoting computer literacy, developing the ability to cope actively with the information society, and improving productivity and national competitiveness. In this paper, we identify the decisive attributes that influence university students' satisfaction with computer education and analyze the satisfaction index. As preprocessing, the proposed method selects the optimal attributes using the correlation-based feature selection of a Java-based machine learning tool, and then applies a multiclass least squares support vector machine based on statistical learning theory. Comparing the multiclass support vector machine with the multiclass least squares support vector machine, we find that the proposed method performs as well as the multiclass support vector machine in analyzing the satisfaction index data for university liberal arts computer education.
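
If the "correlation feature selection" step refers to the widely used CFS merit heuristic (an assumption; the abstract does not say which criterion the tool applies), a subset of k features is scored by how strongly its features correlate with the class and how weakly they correlate with each other. A minimal sketch, with the hypothetical function name `cfs_merit`:

```python
import math

def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """CFS merit of a k-feature subset:
    k * r_cf / sqrt(k + k * (k - 1) * r_ff),
    where r_cf is the mean feature-class correlation and
    r_ff is the mean feature-feature correlation within the subset."""
    numerator = k * mean_feature_class_corr
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return numerator / denominator

# Four features with the same relevance: uncorrelated features score higher
# than fully redundant ones.
independent = cfs_merit(4, 0.5, 0.0)
redundant = cfs_merit(4, 0.5, 1.0)
```

A search procedure (for example, forward selection) would then pick the subset with the highest merit before the selected attributes are passed to the classifier.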