• Title/Summary/Keyword: Classification algorithms

Method for Assessing Landslide Susceptibility Using SMOTE and Classification Algorithms (SMOTE와 분류 기법을 활용한 산사태 위험 지역 결정 방법)

  • Yoon, Hyung-Koo
    • Journal of the Korean Geotechnical Society / v.39 no.6 / pp.5-12 / 2023
  • Proactive assessment of landslide susceptibility is necessary to minimize casualties. This study proposes a methodology for classifying the landslide safety factor using machine learning classification algorithms. The high-risk area model is adopted to perform the classification, with eight geotechnical parameters as inputs. Four classification algorithms (decision tree, k-nearest neighbor, logistic regression, and random forest) are compared in terms of classification accuracy for safety factors ranging between 1.2 and 2.0. Accuracy is high in the safety factor range of 1.2 to 1.7 but relatively low in the range of 1.8 to 2.0. To overcome this issue, the synthetic minority over-sampling technique (SMOTE) is adopted to generate additional data. Applying SMOTE improves the average accuracy by ~250% in the safety factor range of 1.8 to 2.0. The results demonstrate that SMOTE improves the accuracy of classification algorithms when applied to geotechnical data.
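
The over-sampling step described in this abstract can be illustrated with a minimal sketch of the core SMOTE idea: a synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the sample values are assumptions made for illustration; they are not taken from the paper, and a real study would more likely use an established library implementation.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class samples (two made-up features per sample)
minority = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.1]]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point is an interpolation between two existing samples, the new data stay inside the region already covered by the minority class.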

Performance Improvement of Feature Selection Methods based on Bio-Inspired Algorithms (생태계 모방 알고리즘 기반 특징 선택 방법의 성능 개선 방안)

  • Yun, Chul-Min;Yang, Ji-Hoon
    • The KIPS Transactions:PartB / v.15B no.4 / pp.331-340 / 2008
  • Feature selection is one of the methods used to improve classification accuracy in machine learning. Many feature selection algorithms have been proposed over the years, but finding the optimal feature subset from the full data remains a difficult problem. Bio-inspired algorithms are well-known evolutionary algorithms based on the behavior of organisms, and they are useful for finding optimal solutions in optimization problems, including feature selection. In this paper we proposed improved bio-inspired algorithms for feature selection. We used two well-known bio-inspired algorithms, Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), to find the subset of features that gives the best classification accuracy. In addition, we modified these algorithms to take into account the prior importance (prior relevance) of each feature. We chose the mRMR method, which can measure the goodness of a single feature, to set the prior importance of each feature, and we modified the evolution operators of GA and PSO accordingly. We verified the performance of the proposed methods experimentally on datasets. Feature selection methods using GA and PSO produced better classification accuracy, and the modified methods using prior importance further improved both evolution speed and classification accuracy.
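
The abstract does not say how the evolution operators were modified, so the following is only one plausible sketch of a GA mutation biased by prior importance. It assumes each feature has a prior score in [0, 1] (for example a normalised mRMR score); the function name `biased_mutation` and the scaling rule are hypothetical and are not the authors' operator.

```python
import random

def biased_mutation(chromosome, prior, rate=0.1, rng=None):
    """Flip bits of a feature mask; features with high prior importance are
    more likely to be switched on and less likely to be switched off."""
    rng = rng or random.Random(0)
    child = []
    for bit, p in zip(chromosome, prior):
        # p in [0, 1]: hypothetical prior importance of this feature
        flip_prob = rate * (p if bit == 0 else 1.0 - p) * 2
        child.append(1 - bit if rng.random() < flip_prob else bit)
    return child

prior = [1.0, 0.0, 0.5]   # made-up importance scores
mask = [0, 1, 0]          # 1 = feature selected, 0 = feature dropped
child = biased_mutation(mask, prior, rate=0.5, rng=random.Random(42))
```

With a neutral prior of 0.5 the flip probability equals the plain mutation rate, so the operator reduces to ordinary bit-flip mutation when no prior information is available.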

Monitoring of Graveyards in Mountainous Areas with Simulated KOMPSAT-2 imagery

  • Chang, Eun-Mi;Kim, Min-Ho;Lee, Byung-Whan;Heo, Min
    • Proceedings of the KSRS Conference / 2003.11a / pp.1409-1411 / 2003
  • This study develops an application of simulated KOMPSAT-2 imagery to graveyard monitoring. Positions calculated from the image were compared with those obtained from the Global Positioning System (GPS). With 24 checkpoints, graveyard positions fell within a 5-meter range. Unsupervised classification, supervised classification, and object-oriented classification algorithms were used to extract the graveyards. Unsupervised classification with masking processes based on national topographic data gave the best result. Graveyards were categorized into four types in field studies, whereas descriptive statistics showed two types. Cluster analysis and discriminant analysis were consistent with two types of tombs. It was hard to obtain a specific spectral signature for graveyards, as they are covered with grass at different levels and shaded by the surrounding trees. The slope and aspect of graveyard locations made no difference to the spectral signatures. This study gives basic spectral characteristics for the further development of object-oriented classification algorithms and shows the plausibility of KOMPSAT-2 images for the management of mountainous areas in terms of position accuracy and classification accuracy.

Comparison of Three Land Cover Classification Algorithms -ISODATA, SMA, and SOM - for the Monitoring of North Korea with MODIS Multi-temporal Data

  • Kim, Do-Hyung;Jeong, Seung-Gyu;Park, Chong-Hwa
    • Korean Journal of Remote Sensing / v.23 no.3 / pp.181-188 / 2007
  • The objective of this research was to investigate the optimal land cover classification algorithm for monitoring North Korea with MODIS multi-temporal data based on monthly phenological characteristics. Three frequently used land cover classification algorithms, ISODATA, SMA, and SOM, were employed for this study; the land cover categories were forest, grass, agricultural, wetland, barren, built-up, and water body. The outcomes of the study can be summarized as follows. First, the overall classification accuracy of ISODATA, SMA, and SOM was 69.03%, 64.28%, and 73.57%, respectively. Second, ISODATA and SMA gave higher classification accuracy for the forest and agricultural categories, but SOM performed better for built-up area, bare soil, grassland, and water. A possible explanation for this difference is a difference in sensitivity to vegetation activity. This may be related to the capability of SOM to express all values without loss of data by maintaining the topology between pixels of the primitive data after classification, while ISODATA and SMA retain a limited amount of data after the normalization process. Third, we conclude that SOM is the best algorithm for monitoring the land cover change of North Korea.

Gene selection method using neural networks and genetic algorithm and its applications to classification of cancers (신경회로망과 유전 알고리즘을 이용한 유전자 추출법과 이의 암 분류법에의 적용)

  • Cho, Hyun-Sung;Kim, Tae-Seon;Jeon, Sung-Mo;Wee, Jae-Woo;Lee, Chong-Ho
    • Proceedings of the KIEE Conference / 2002.07d / pp.2815-2817 / 2002
  • A method for classifying cancers from cDNA microarray data was developed using genetic algorithms and neural networks. For gene selection, 2308 genes were ranked using genetic algorithms and selected by how frequently they were chosen across 1000 iterative genetic runs. Artificial neural networks were used as the classifier to calculate fitness values. Small, round blue cell tumors (SRBCTs), which are difficult to distinguish with a single pathological test, were used as the test diseases for classification, and the method correctly classified 96% of 25 test samples.
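
The frequency-based ranking mentioned in this abstract can be sketched as a simple count over runs. The sketch assumes each genetic run returns the indices of the genes in its best subset; the function name `rank_by_selection_frequency` and the run data are made up for illustration and do not come from the paper.

```python
from collections import Counter

def rank_by_selection_frequency(selected_per_run, top_n):
    """Count how often each gene index appears across runs and return the
    top_n most frequently selected genes with their counts."""
    counts = Counter(g for run in selected_per_run for g in set(run))
    return counts.most_common(top_n)

# Made-up output of three GA runs (gene indices chosen in each run)
runs = [[3, 7, 12], [7, 12, 40], [7, 3, 99]]
top = rank_by_selection_frequency(runs, top_n=2)
```

Here gene 7 is chosen in all three runs and therefore ranks first; genes that appear in only one run fall to the bottom of the ranking.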

Case based Reasoning System with Two Dimensional Reduction Technique for Customer Classification Model

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.383-386 / 2005
  • This study proposes a case-based reasoning system with two-dimensional reduction techniques. The vertical and horizontal dimensions of the research data are reduced through a hybrid feature and instance selection process using genetic algorithms. We applied the proposed model to a customer classification model that uses customers' demographic characteristics as inputs to predict their buying behavior for a specific product. Experimental results show that the proposed technique may improve classification accuracy and outperform various optimized models of a typical CBR system.

Hybridized Decision Tree methods for Detecting Generic Attack on Ciphertext

  • Alsariera, Yazan Ahmad
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.56-62 / 2021
  • The surge in generic attacks against ciphertext on computer networks has led to the continuous advancement of mechanisms to protect information integrity and confidentiality. An explicit decision tree machine learning algorithm is reported to classify generic attacks more accurately than some multi-classification algorithms, as the multi-classification method suffers from detection oversight. However, there is a need to improve the accuracy and reduce the false alarm rate. Therefore, this study aims to improve generic attack classification by implementing two hybridized decision tree algorithms, namely Naïve Bayes Decision tree (NBTree) and Logistic Model tree (LMT). The proposed hybridized methods were developed using 10-fold cross-validation to avoid overfitting. The generic attack detector achieved 99.8% accuracy, an FPR of 0.002, and an MCC of 0.995. The proposed methods performed better than the existing decision tree method and also outperformed multi-classification methods for detecting generic attacks. Hence, hybridized decision tree methods are recommended for detecting generic attacks on a computer network.
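
For readers unfamiliar with the metrics reported here, the sketch below computes accuracy, false positive rate (FPR), and the Matthews correlation coefficient (MCC) from a binary confusion matrix using their standard definitions. The function name `binary_metrics` and the counts are illustrative assumptions; they are not the paper's data.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, false positive rate, and Matthews correlation coefficient
    from the four cells of a binary confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    fpr = fp / (fp + tn)  # share of true negatives wrongly flagged as attacks
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, fpr, mcc

# Made-up counts, not the paper's data
acc, fpr, mcc = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it a more informative summary when attack and normal traffic are imbalanced.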

Transfer Learning Using Convolutional Neural Network Architectures for Glioma Classification from MRI Images

  • Kulkarni, Sunita M.;Sundari, G.
    • International Journal of Computer Science & Network Security / v.21 no.2 / pp.198-204 / 2021
  • Glioma is one of the common types of brain tumor and starts in the brain's glial cells. These tumors are classified as low-grade or high-grade. Physicians analyze the stage of a brain tumor and suggest treatment to the patient, so the status of the tumor is important for treatment. Nowadays, computerized systems are used to analyze and classify brain tumors, and accurate grading of the tumor supports treatment. This paper aims to develop a classifier for low-grade and high-grade glioma using deep learning. The system uses four transfer learning architectures (AlexNet, GoogLeNet, ResNet18, and ResNet50) for classification. Among them, ResNet18 shows the highest classification accuracy, 97.19%.

An Application of Support Vector Machines to Customer Loyalty Classification of Korean Retailing Company Using R Language

  • Nguyen, Phu-Thien;Lee, Young-Chan
    • The Journal of Information Systems / v.26 no.4 / pp.17-37 / 2017
  • Purpose: Customer loyalty is the most important factor in customer relationship management (CRM), especially in the retailing industry, where customers have many options for where to spend their money. Classifying loyal customers from customer data can help retailing companies build more efficient marketing strategies and gain competitive advantages. This study aims to construct classification models that distinguish the loyal customers of a Korean retailing company using data mining techniques in the R language. Design/methodology/approach: To classify retailing customers, we used a combination of support vector machines (SVMs) and other machine learning (ML) classification algorithms, supported by recursive feature elimination (RFE). We first cleaned the dataset to remove outliers and impute missing values. We then used an RFE framework to select the most significant predictors. Finally, we constructed models with the classification algorithms, tuned their parameters, and compared their performance. Findings: The results reveal that ML classification techniques can work well with CRM data in the Korean retailing industry. Moreover, customer loyalty is affected not only by a single factor such as net promoter score but also by purchase habits such as a preference for expensive goods or visits to multiple branches. We also show that, on the retailing customer dataset, the model constructed with the SVM algorithm performs better than the others. We expect that the models in this study can be used by other retailing companies to classify their customers so that they can focus on serving this potential VIP group. We also hope that the results of this ML analysis in the R language will be useful to other researchers in selecting appropriate ML algorithms.

Evaluation of Classification Algorithm Performance of Sentiment Analysis Using Entropy Score (엔트로피 점수를 이용한 감성분석 분류알고리즘의 수행도 평가)

  • Park, Man-Hee
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.9 / pp.1153-1158 / 2018
  • Among the many available information sources, online customer evaluations and social media information are critical for businesses because they influence customers' decision making. Surveys that aim to identify customers' varied needs and complaints are limited by time and cost. Customer review data from online shopping malls provide an ideal source for analyzing customer sentiment about products. In this study, we collected Amazon product reviews of Samsung and Apple smartphones. We applied five classification algorithms that have been used as representative sentiment analysis techniques in previous studies: support vector machines, bagging, random forest, classification and regression tree, and maximum entropy. We proposed an entropy score that can comprehensively evaluate the performance of a classification algorithm. When the five algorithms were evaluated using the entropy score, the SVM algorithm ranked highest.