• Title/Summary/Keyword: Ensemble-based algorithm

Search Result 140, Processing Time 0.026 seconds

A new Ensemble Clustering Algorithm using a Reconstructed Mapping Coefficient

  • Cao, Tuoqia;Chang, Dongxia;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.7
    • /
    • pp.2957-2980
    • /
    • 2020
  • Ensemble clustering commonly integrates multiple basic partitions to obtain a more accurate clustering result than a single partition. Specifically, it exists an inevitable problem that the incomplete transformation from the original space to the integrated space. In this paper, a novel ensemble clustering algorithm using a newly reconstructed mapping coefficient (ECRMC) is proposed. In the algorithm, a newly reconstructed mapping coefficient between objects and micro-clusters is designed based on the principle of increasing information entropy to enhance effective information. This can reduce the information loss in the transformation from micro-clusters to the original space. Then the correlation of the micro-clusters is creatively calculated by the Spearman coefficient. Therefore, the revised co-association graph between objects can be built more accurately because the supplementary information can well ensure the completeness of the whole conversion process. Experiment results demonstrate that the ECRMC clustering algorithm has high performance, effectiveness, and feasibility.

Anomaly-Based Network Intrusion Detection: An Approach Using Ensemble-Based Machine Learning Algorithm

  • Kashif Gul Chachar;Syed Nadeem Ahsan
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.1
    • /
    • pp.107-118
    • /
    • 2024
  • With the seamless growth of the technology, network usage requirements are expanding day by day. The majority of electronic devices are capable of communication, which strongly requires a secure and reliable network. Network-based intrusion detection systems (NIDS) is a new method for preventing and alerting computers and networks from attacks. Machine Learning is an emerging field that provides a variety of ways to implement effective network intrusion detection systems (NIDS). Bagging and Boosting are two ensemble ML techniques, renowned for better performance in the learning and classification process. In this paper, the study provides a detailed literature review of the past work done and proposed a novel ensemble approach to develop a NIDS system based on the voting method using bagging and boosting ensemble techniques. The test results demonstrate that the ensemble of bagging and boosting through voting exhibits the highest classification accuracy of 99.98% and a minimum false positive rate (FPR) on both datasets. Although the model building time is average which can be a tradeoff by processor speed.

Performance Improvement of Classifier by Combining Disjunctive Normal Form features

  • Min, Hyeon-Gyu;Kang, Dong-Joong
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.10 no.4
    • /
    • pp.50-64
    • /
    • 2018
  • This paper describes a visual object detection approach utilizing ensemble based machine learning. Object detection methods employing 1D features have the benefit of fast calculation speed. However, for real image with complex background, detection accuracy and performance are degraded. In this paper, we propose an ensemble learning algorithm that combines a 1D feature classifier and 2D DNF (Disjunctive Normal Form) classifier to improve the object detection performance in a single input image. Also, to improve the computing efficiency and accuracy, we propose a feature selecting method to reduce the computing time and ensemble algorithm by combining the 1D features and 2D DNF features. In the verification experiments, we selected the Haar-like feature as the 1D image descriptor, and demonstrated the performance of the algorithm on a few datasets such as face and vehicle.

Comparison of AT1- and Kalman Filter-Based Ensemble Time Scale Algorithms

  • Lee, Ho Seong;Kwon, Taeg Yong;Lee, Young Kyu;Yang, Sung-hoon;Yu, Dai-Hyuk;Park, Sang Eon;Heo, Myoung-Sun
    • Journal of Positioning, Navigation, and Timing
    • /
    • v.10 no.3
    • /
    • pp.197-206
    • /
    • 2021
  • We compared two typical ensemble time scale algorithms; AT1 and Kalman filter. Four commercial atomic clocks composed of two hydrogen masers and two cesium atomic clocks provided measurement data to the algorithms. The allocation of relative weights to the clocks is important to generate a stable ensemble time. A 30 day-average-weight model, which was obtained from the average Allan variance of each clock, was applied to the AT1 algorithm. For the reduced Kalman filter (Kred) algorithm, we gave the same weights to the two hydrogen masers. We also compared the frequency stabilities of the outcome from the algorithms when the frequency offsets and/or the frequency drift offsets estimated by the algorithms were corrected or not corrected by the KRISS-made primary frequency standard, KRISS-F1. We found that the Kred algorithm is more effective to generate a stable ensemble time scale in the long-term, and the algorithm also generates much enhanced short-term stability when the frequency offset is used for the calculation of the Allan deviation instead of the phase offset.

Ensemble Classification Method for Efficient Medical Diagnostic (효율적인 의료진단을 위한 앙상블 분류 기법)

  • Jung, Yong-Gyu;Heo, Go-Eun
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.10 no.3
    • /
    • pp.97-102
    • /
    • 2010
  • The purpose of medical data mining for efficient algorithms and techniques throughout the various diseases is to increase the reliability of estimates to classify. Previous studies, an algorithm based on a single model, and even the existence of the model to better predict the classification accuracy of multi-model ensemble-based research techniques are being applied. In this paper, the higher the medical data to predict the reliability of the existing scope of the ensemble technique applied to the I-ENSEMBLE offers. Data for the diagnosis of hypothyroidism is the result of applying the experimental technique, a representative ensemble Bagging, Boosting, Stacking technique significantly improved accuracy compared to all existing, respectively. In addition, compared to traditional single-model techniques and ensemble techniques Multi modeling when applied to represent the effects were more pronounced.

Ensemble Based Optimal Feature Selection Algorithm for Efficient Intrusion Detection in Wireless Sensor Network

  • Shyam Sundar S;R.S. Bhuvaneswaran;SaiRamesh L
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.8
    • /
    • pp.2214-2229
    • /
    • 2024
  • Wireless sensor network (WSN) consists of large number of sensor nodes that are deployed in geographical locations to collect sensed information, process data and communicate it to the control station for further processing. Due the unfriendly environment where the sensors are deployed, there exist many possibilities of malicious nodes which performs malicious activities in the network. Therefore, the security threats affect performance and life time of sensor networks, whereas various security aspects are there to address security issues in WSN namely Cryptography, Trust Management, Intrusion Detection System (IDS) and Intrusion Prevention Systems (IPS). However, IDS detect the malicious activities and produce an alarm. These malicious activities exploit vulnerabilities in the network layer and affect all layers in the network. Existing feature selection methods such as filter-based methods are not considering the redundancy of the selected features and wrapper method has high risk of overfitting the classification of intrusion. Due to overfitting, the classification algorithm fails to detect the intrusion in better manner. The main objective of this paper is to provide the efficient feature selection algorithm which was suitable for any type classification algorithm to detect the intrusion in an effective manner. This paper, the security of the network is addressed by proposing Feature Selection Algorithm using Chi Squared with Ensemble Method (FSChE). The proposed scheme employs the combination of decision tree along with the random forest classification algorithm to form ensemble classifier. The experimental results justify the feasibility of the proposed scheme in terms of attack detection, packet delivery ratio and time analysis by employing NSL KDD cup data Set. The obtained results shows that the proposed ensemble method increases the overall performance by 10% to 25% with respect to mentioned parameters.

Ensemble-By-Session Method on Keystroke Dynamics based User Authentication

  • Ho, Jiacang;Kang, Dae-Ki
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.8 no.4
    • /
    • pp.19-25
    • /
    • 2016
  • There are many free applications that need users to sign up before they can use the applications nowadays. It is difficult to choose a suitable password for your account. If the password is too complicated, then it is hard to remember it. However, it is easy to be intruded by other users if we use a very simple password. Therefore, biometric-based approach is one of the solutions to solve the issue. The biometric-based approach includes keystroke dynamics on keyboard, mice, or mobile devices, gait analysis and many more. The approach can integrate with any appropriate machine learning algorithm to learn a user typing behavior for authentication system. Preprocessing phase is one the important role to increase the performance of the algorithm. In this paper, we have proposed ensemble-by-session (EBS) method which to operate the preprocessing phase before the training phase. EBS distributes the dataset into multiple sub-datasets based on the session. In other words, we split the dataset into session by session instead of assemble them all into one dataset. If a session is considered as one day, then the sub-dataset has all the information on the particular day. Each sub-dataset will have different information for different day. The sub-datasets are then trained by a machine learning algorithm. From the experimental result, we have shown the improvement of the performance for each base algorithm after the preprocessing phase.

Ensemble variable selection using genetic algorithm

  • Seogyoung, Lee;Martin Seunghwan, Yang;Jongkyeong, Kang;Seung Jun, Shin
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.6
    • /
    • pp.629-640
    • /
    • 2022
  • Variable selection is one of the most crucial tasks in supervised learning, such as regression and classification. The best subset selection is straightforward and optimal but not practically applicable unless the number of predictors is small. In this article, we propose directly solving the best subset selection via the genetic algorithm (GA), a popular stochastic optimization algorithm based on the principle of Darwinian evolution. To further improve the variable selection performance, we propose to run multiple GA to solve the best subset selection and then synthesize the results, which we call ensemble GA (EGA). The EGA significantly improves variable selection performance. In addition, the proposed method is essentially the best subset selection and hence applicable to a variety of models with different selection criteria. We compare the proposed EGA to existing variable selection methods under various models, including linear regression, Poisson regression, and Cox regression for survival data. Both simulation and real data analysis demonstrate the promising performance of the proposed method.

Ensemble of Classifiers Constructed on Class-Oriented Attribute Reduction

  • Li, Min;Deng, Shaobo;Wang, Lei
    • Journal of Information Processing Systems
    • /
    • v.16 no.2
    • /
    • pp.360-376
    • /
    • 2020
  • Many heuristic attribute reduction algorithms have been proposed to find a single reduct that functions as the entire set of original attributes without loss of classification capability; however, the proposed reducts are not always perfect for these multiclass datasets. In this study, based on a probabilistic rough set model, we propose the class-oriented attribute reduction (COAR) algorithm, which separately finds a reduct for each target class. Thus, there is a strong dependence between a reduct and its target class. Consequently, we propose a type of ensemble constructed on a group of classifiers based on class-oriented reducts with a customized weighted majority voting strategy. We evaluated the performance of our proposed algorithm based on five real multiclass datasets. Experimental results confirm the superiority of the proposed method in terms of four general evaluation metrics.

A Novel Method for Inserting an MPEG-2 TS into Ensemble in a DMB Transmission System

  • Lee, Gwang-Soon;Bae, Byung-Jun;Hahm, Young-Kwon;Lee, Soo-In
    • ETRI Journal
    • /
    • v.26 no.6
    • /
    • pp.653-656
    • /
    • 2004
  • This paper presents an effective algorithm for inserting an MPEG-2 transport stream (TS) into a Digital Audio Broadcasting (DAB) ensemble without any bandwidth waste in a Digital Multimedia Broadcasting (DMB) transmission system. The key technologies of this algorithm include packet rate control and program clock reference correction, which are important for TS processing. The proposed algorithms are applied to the various DMB transmission systems based on Eureka-147, and the performance of the proposed algorithm is confirmed through the experimental DMB broadcasting.

  • PDF