• Title/Summary/Keyword: clustering algorithms

A Differential Evolution based Support Vector Clustering (차분진화 기반의 Support Vector Clustering)

  • Jun, Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.5 / pp.679-683 / 2007
  • Statistical learning theory by Vapnik comprises the support vector machine (SVM), support vector regression (SVR), and support vector clustering (SVC) for classification, regression, and clustering, respectively. Among these algorithms, SVC is a good clustering algorithm that uses support vectors based on a Gaussian kernel function. However, like SVM and SVR, SVC requires the kernel parameters and the regularization constant to be determined optimally. In general, these parameters have been chosen by the researcher's experience or by grid search, which demands heavy computing time. In this paper, we propose a differential evolution based SVC (DESVC), which incorporates differential evolution into SVC for efficient selection of the kernel parameters and the regularization constant. To verify the improved performance of DESVC, we conduct experiments using data sets from the UCI machine learning repository and simulation.
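
The parameter search described in this abstract can be illustrated with a generic differential evolution loop. The sketch below is not the paper's DESVC: it is a plain DE/rand/1/bin minimizer, and `toy_cost` is a hypothetical stand-in for whatever clustering-quality score an SVC run would return for a kernel width `q` and a regularization constant `c`. All function names, bounds, and default settings are assumptions made for illustration.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, mutation=0.8,
                           crossover=0.9, generations=200, seed=0):
    """Minimize `objective` over the box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for j in range(dim):
                if rng.random() < crossover or j == j_rand:
                    value = pop[a][j] + mutation * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][j]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_cost(params):
    """Hypothetical stand-in for an SVC quality score; minimum at q=2.0, c=0.5."""
    q, c = params
    return (q - 2.0) ** 2 + (c - 0.5) ** 2


best_params, best_cost = differential_evolution(
    toy_cost, bounds=[(0.01, 10.0), (0.01, 1.0)]
)
```

In a real setting `toy_cost` would be replaced by a function that fits the clustering model with the candidate parameters and returns a validity score to minimize. Compared with grid search, the population concentrates evaluations in promising regions of the parameter box.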

Design and Development of Clustering Algorithm Considering Influences of Spatial Objects (공간객체의 영향력을 고려한 클러스터링 알고리즘의 설계와 구현)

  • Kim, Byung-Cheol
    • The Journal of the Korea Contents Association / v.6 no.12 / pp.113-120 / 2006
  • This paper proposes DBSCAN-SI, a clustering algorithm that takes the influence of spatial objects into account. DBSCAN-SI, which extends the existing DBSCAN and DBSCAN-W, converts non-spatial properties into the influence of spatial objects during spatial clustering. The higher the influence derived from the properties used in clustering, the higher the probability that an object is included in a cluster, so clustering respects not only spatial distance but also the magnitude of influence. From a property-centered perspective, the proposed technique can make up for a disadvantage of existing algorithms, which exclude an object from a cluster despite its high influence because too few objects lie close to it around the cluster.
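
One way to picture "influence" entering a density test is a weighted core-point check, sketched below. This is an assumption-laden illustration, not the DBSCAN-SI definition: the function `is_core`, the `influence` weights, and the threshold `min_influence` are all hypothetical names, and the paper's actual conversion of non-spatial properties into influence is not given in the abstract.

```python
import math


def is_core(index, points, influence, eps, min_influence):
    """Weighted core test: sum the influence of every point within eps.

    Plain DBSCAN counts neighbors; here each neighbor contributes its
    (hypothetical) influence weight instead of 1.
    """
    total = sum(
        weight
        for point, weight in zip(points, influence)
        if math.dist(points[index], point) <= eps
    )
    return total >= min_influence


points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0)]
influence = [1.0, 4.0, 1.0]  # the second object has high influence

core_near_influential = is_core(0, points, influence, eps=1.0, min_influence=3.0)
core_isolated = is_core(2, points, influence, eps=1.0, min_influence=3.0)
```

With a pure neighbor count and a minimum of three points, the first object would not qualify as a core point because it has only two objects in its neighborhood; under the weighted test its highly influential neighbor is enough.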

ACCELERATION OF MACHINE LEARNING ALGORITHMS BY TCHEBYCHEV ITERATION TECHNIQUE

  • LEVIN, MIKHAIL P.
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.22 no.1 / pp.15-28 / 2018
  • Machine learning algorithms are now widely used to process big data in various applications, and many of these applications are executed at run time. The speed of machine learning algorithms is therefore a critical issue in these applications. However, most modern iterative machine learning algorithms use the successive iteration technique well known in numerical linear algebra. This technique converges very slowly and needs many iterations to solve the problems considered, and therefore a lot of processing time even on modern multi-core computers and clusters. The Tchebychev iteration technique is well known in numerical linear algebra as an attractive candidate for decreasing the number of iterations in iterative machine learning algorithms and, with it, their running time, which is especially important in run-time applications. In this paper we consider the use of Tchebychev iterations to accelerate the well-known K-Means and SVM (Support Vector Machine) clustering algorithms in machine learning. Some examples of using our approach on modern multi-core computers under the Apache Spark framework are considered and discussed.

Artificial Intelligence and Pattern Recognition Using Data Mining Algorithms

  • Al-Shamiri, Abdulkawi Yahya Radman
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.221-232 / 2021
  • In recent years, as huge amounts of data have accumulated in large databases, the need for accurate tools that analyze data and extract information and knowledge from large, multi-source databases has increased. Hence, new techniques have emerged that will contribute to the development of all other sciences and fields. Knowledge discovery techniques are among them; one popular knowledge discovery technique is data mining, which aims to discover knowledge from huge amounts of data. Data mining is an important and interesting technique with many different algorithms. This paper therefore presents an overview of data mining and clarifies the most important of those algorithms and their uses.

A Data-Centric Clustering Algorithm for Reducing Network Traffic in Wireless Sensor Networks (무선 센서 네트워크에서 네트워크 트래픽 감소를 위한 데이타 중심 클러스터링 알고리즘)

  • Yeo, Myung-Ho;Lee, Mi-Sook;Park, Jong-Guk;Lee, Seok-Jae;Yoo, Jae-Soo
    • Journal of KIISE:Information Networking / v.35 no.2 / pp.139-148 / 2008
  • Many types of sensor data exhibit strong correlation in both space and time. Suppression, both temporal and spatial, provides opportunities for reducing the energy cost of sensor data collection. Unfortunately, existing clustering algorithms have difficulty exploiting these spatial or temporal opportunities, because they organize clusters based only on the distribution of sensor nodes or the network topology, not on the correlation of sensor data. In this paper, we propose a novel clustering algorithm with suppression techniques. To guarantee independent communication among clusters, we allocate multiple channels based on sensor data. We also propose a spatio-temporal suppression technique to reduce network traffic. To show the superiority of our clustering algorithm, we compare it with existing suppression algorithms in terms of the lifetime of the sensor network and the size of the data collected at the base station. Our experimental results show that the size of the data was reduced by $4{\sim}40\%$ and the whole network lifetime was prolonged by $20{\sim}30\%$.
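
The temporal half of suppression can be sketched in a few lines: a node transmits a reading only when it differs from the last value it reported by more than a tolerance. This is a generic illustration of the idea, not the paper's spatio-temporal scheme; the function name, the tolerance `epsilon`, and the sample readings are assumptions.

```python
def temporal_suppression(readings, epsilon):
    """Return the (time index, value) pairs a node would actually transmit.

    A reading is suppressed when it lies within `epsilon` of the value
    most recently reported to the base station.
    """
    sent = []
    last_reported = None
    for t, value in enumerate(readings):
        if last_reported is None or abs(value - last_reported) > epsilon:
            sent.append((t, value))
            last_reported = value
    return sent


readings = [20.0, 20.1, 20.2, 21.0, 21.1, 25.0, 25.05]
sent = temporal_suppression(readings, epsilon=0.5)
reduction = 1 - len(sent) / len(readings)  # fraction of messages saved
```

Here three of the seven readings are transmitted. Spatial suppression applies the same test between correlated neighboring nodes rather than between consecutive readings of one node.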

Orthogonal Nonnegative Matrix Factorization: Multiplicative Updates on Stiefel Manifolds (Stiefel 다양체에서 곱셈의 업데이트를 이용한 비음수 행렬의 직교 분해)

  • Yoo, Ji-Ho;Choi, Seung-Jin
    • Journal of KIISE:Software and Applications / v.36 no.5 / pp.347-352 / 2009
  • Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data, the goal of which is to decompose a data matrix into a product of two factor matrices whose entries are all restricted to be nonnegative. NMF has been shown to be useful for clustering, especially document clustering. In this paper we present an algorithm for orthogonal nonnegative matrix factorization, where an orthogonality constraint is imposed on the nonnegative decomposition of a term-document matrix. We develop multiplicative updates directly from the true gradient on the Stiefel manifold, whereas existing algorithms consider additive orthogonality constraints. Experiments on several document data sets show that our orthogonal NMF algorithm performs better at clustering than standard NMF and an existing orthogonal NMF.
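
For orientation, the standard (non-orthogonal) NMF baseline mentioned in this abstract is usually fitted with the following multiplicative updates for the squared-error objective. The notation is assumed here rather than taken from the paper: $X \approx UV^{\top}$ with $X \in \mathbb{R}_{+}^{m \times n}$, $U \in \mathbb{R}_{+}^{m \times k}$, $V \in \mathbb{R}_{+}^{n \times k}$, where $\odot$ and the fraction bar denote element-wise product and division. The paper's Stiefel-manifold updates differ from these and are not reproduced.

$$
U \leftarrow U \odot \frac{X V}{U V^{\top} V},
\qquad
V \leftarrow V \odot \frac{X^{\top} U}{V U^{\top} U}
$$

Because each factor is only ever multiplied by a ratio of nonnegative terms, entries that start nonnegative stay nonnegative. An orthogonal variant additionally asks for $V^{\top} V = I$, which drives each row of $V$ toward a single nonzero entry and so makes the factor readable as a cluster assignment.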

An Algorithm For Load-Sharing and Fault-Tolerance In Internet-Based Clustering Systems (인터넷 기반 클러스터 시스템 환경에서 부하공유 및 결함허용 알고리즘)

  • Choi, In-Bok;Lee, Jae-Dong
    • The KIPS Transactions:PartA / v.10A no.3 / pp.215-224 / 2003
  • Because the Internet contains various networks and heterogeneous nodes, existing load-sharing algorithms are hard to adapt to Internet-based clustering systems. A load-sharing algorithm for such systems must therefore consider conditions such as the heterogeneity of nodes, the characteristics of the network, and load imbalance. This paper proposes an expanded-WF algorithm, based on the WF (Weighted Factoring) algorithm, for load sharing in Internet-based clustering systems. The proposed algorithm uses an adaptive granularity strategy for load sharing and duplicate execution of partial jobs for fault tolerance. For the simulation, matrix multiplication using PVM is performed in a heterogeneous clustering environment consisting of two different networks. Compared with other algorithms such as Send, GSS, and Weighted Factoring, the proposed algorithm improves performance by 55%, 63%, and 20%, respectively. This paper also shows that it can handle fault tolerance.
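
The base technique named here, weighted factoring, can be sketched as follows under its commonly described form: work is handed out in batches, each batch is about half of what remains, and a batch is split among workers in proportion to their weights. This is not the paper's expanded-WF, which adds adaptive granularity and duplicate execution; the function name, the integer rounding rule, and the example weights are assumptions.

```python
import math


def weighted_factoring_schedule(total_tasks, weights):
    """Return a list of (worker, chunk_size) pairs in dispatch order."""
    remaining = total_tasks
    total_weight = sum(weights)
    schedule = []
    while remaining > 0:
        batch = math.ceil(remaining / 2)  # each batch covers half of what is left
        for worker, weight in enumerate(weights):
            if remaining == 0:
                break
            # Share of the batch proportional to the worker's weight,
            # at least one task and never more than what remains.
            chunk = max(1, int(batch * weight / total_weight))
            chunk = min(chunk, remaining)
            schedule.append((worker, chunk))
            remaining -= chunk
    return schedule


# Hypothetical cluster: worker 2 is twice as fast as workers 0 and 1.
schedule = weighted_factoring_schedule(100, weights=[1, 1, 2])
```

Early chunks are large, which keeps scheduling overhead low, and later chunks shrink, which lets slower or more heavily loaded nodes finish at about the same time as faster ones.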

Clustering Algorithm with using Road Side Unit(RSU) for Cluster Head(CH) Selection in VANET (차량 네트워크 환경에서 도로 기반 시설을 이용한 클러스터 헤드 선택 알고리즘)

  • Kwon, Hyuk-joon;Kwon, Yong-ho;Rhee, Byung-ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.620-623 / 2014
  • The network topology for communication between vehicles changes quickly because vehicles have distinctive movement patterns that vary rapidly with velocity and road conditions. Because of this, it is not easy to apply reliable routing in a VANET (Vehicular Ad-hoc Network). Clustering is one of the alternatives suggested for overcoming the weaknesses of routing algorithms. Clustering communicates with and manages vehicles by grouping them around a cluster head. The choice of cluster head among the vehicles therefore has a decisive effect on reducing clustering overhead and on the stability and efficiency of the network. This paper introduces a new cluster head selection algorithm that, unlike existing algorithms, uses an RSU (Road Side Unit). We suggest an algorithm, more stable and efficient than existing ones, that decides cluster head priority by calculating vehicles' velocity and distance through the RSU.

Identification of Fuzzy System Driven to Parallel Genetic Algorithm (병렬유전자 알고리즘을 기반으로한 퍼지 시스템의 동정)

  • Choi, Jeoung-Nae;Oh, Sung-Kwun
    • Proceedings of the KIEE Conference / 2007.04a / pp.201-203 / 2007
  • The paper concerns the successive optimization of the structure and parameters of fuzzy inference systems based on parallel genetic algorithms (PGA) and information data granulation (IG). PGA is a multi-population based genetic algorithm, and it is used to optimize the structure and parameters of the fuzzy model simultaneously. The granulation is realized with the aid of C-means clustering. The concept of information granulation is applied to the fuzzy model in order to enhance its structural optimization abilities. By doing so, we divide the input space to form the premise part of the fuzzy rules, and the consequence part of each fuzzy rule is newly organized based on the center points of the data groups extracted by C-means clustering. The optimization concerns fuzzy-model parameters such as the number of input variables to be used in the fuzzy model, the specific subset of input variables selected, the number of membership functions for each variable used, and the polynomial type of the consequence part of the fuzzy rules. The simultaneous optimization mechanism is explored: it can find optimal values for the structure and parameters of the fuzzy model via PGA, C-means clustering, and the standard least squares method at once. A comparative analysis demonstrates that the proposed algorithm is superior to conventional methods.

The Design of Multi-FNN Model Using HCM Clustering and Genetic Algorithms and Its Applications to Nonlinear Process (HCM 클러스터링과 유전자 알고리즘을 이용한 다중 FNN 모델 설계와 비선형 공정으로의 응용)

  • Park, Ho-Sung;Oh, Sung-Kwun;Kim, Hyun-Ki
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2000.05a / pp.47-50 / 2000
  • In this paper, an optimal identification method using a Multi-FNN (Fuzzy-Neural Network) is proposed for modeling nonlinear complex systems. To control a nonlinear process with complex and uncertain data, the proposed model uses the HCM clustering algorithm, which performs input-output data preprocessing, and a genetic algorithm, which performs model optimization. The proposed Multi-FNN is based on Yamakawa's FNN; it uses simplified inference as the fuzzy inference method and the error back-propagation algorithm as the learning rule. The HCM clustering method, which preprocesses the data for system modeling, is used to determine the structure of the Multi-FNN by dividing the input-output space. The parameters of the Multi-FNN model, such as the apexes of the membership functions, learning rates, and momentum coefficients, are adjusted using genetic algorithms. In addition, a performance index with a weighting factor is presented to achieve a sound balance between the approximation and generalization abilities of the model. To evaluate the performance of the proposed model, we use time series data for a gas furnace and numerical data from a nonlinear function.
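
The clustering step used for preprocessing can be sketched as below, assuming HCM denotes hard C-means, in which every point belongs to exactly one cluster (the same alternating scheme as k-means). The function name, the fixed initial centers, and the toy data are illustrative assumptions; how the paper maps clusters onto the Multi-FNN structure is not shown.

```python
import math


def hard_c_means(points, centers, max_iter=100):
    """Alternate nearest-center assignment and center update until stable."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its single nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(old)  # leave an empty cluster where it was
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = hard_c_means(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

The resulting partition of the input-output space is the kind of division that could then assign data to separate sub-models, one per cluster.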
