• Title/Summary/Keyword: Modified k-means algorithm

Clustering Gene Expression Data by MCL Algorithm (MCL 알고리즘을 사용한 유전자 발현 데이터 클러스터링)

  • Shon, Ho-Sun;Ryu, Keun-Ho
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.4 / pp.27-33 / 2008
  • The clustering of gene expression data is used to analyze the results of microarray studies and is one of the most frequently used methods for understanding degrees of biological change and gene expression. The MCL algorithm clusters the nodes of a graph and is quick and efficient. We modified the existing MCL algorithm and applied it to microarray data. In applying the MCL algorithm, we ran a simulation that adjusts two factors, inflation and the diagonal term, and transforms them using a Markov matrix. Furthermore, to distinguish classes more clearly in the modified MCL algorithm, we took the average of each row and used it as a threshold. The improved algorithm therefore achieves higher accuracy than the existing one: in the actual experiment, it showed an average accuracy of 70% when compared with an existing class. We also compared the MCL algorithm with self-organizing map (SOM) clustering, K-means clustering, and hierarchical clustering (HC). The MCL algorithm gave better results than hierarchical clustering and the K-means method.
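
The loop described in this abstract (a Markov matrix, a diagonal term, inflation, and a row-average threshold) can be illustrated with a minimal sketch. It assumes NumPy, and the function names, default parameters, and the reading of the row-average threshold (keep the entries of each non-empty row that are at least the row mean) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def mcl(adjacency, inflation=2.0, self_loop=1.0, max_iter=100, tol=1e-8):
    """Run a basic Markov clustering iteration and return the final matrix."""
    a = np.asarray(adjacency, dtype=float)
    a = a + self_loop * np.eye(a.shape[0])      # diagonal term (self-loops)
    m = a / a.sum(axis=0, keepdims=True)        # column-stochastic Markov matrix
    for _ in range(max_iter):
        previous = m
        m = m @ m                               # expansion: two-step random walk
        m = m ** inflation                      # inflation: elementwise power
        m = m / m.sum(axis=0, keepdims=True)    # renormalize each column
        if np.abs(m - previous).max() < tol:    # stop once the matrix is stable
            break
    return m

def clusters_by_row_mean(m, eps=1e-12):
    """Read clusters off the matrix using each row's mean as a threshold.

    Assumed interpretation: rows that are (numerically) all zero are skipped,
    and identical member sets from different rows are reported once.
    """
    found = set()
    for row in m:
        if row.max() <= eps:
            continue
        members = np.flatnonzero(row >= row.mean())
        found.add(tuple(int(j) for j in members))
    return sorted(found)
```

For a graph made of two disconnected triangles, `clusters_by_row_mean(mcl(adj))` groups the nodes of each triangle together. The inflation exponent controls how aggressively weak connections are suppressed, so larger values tend to give more, smaller clusters.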

Fuzzy c-Means Clustering Algorithm with Pseudo Mahalanobis Distances

  • ICHIHASHI, Hidetomo;OHUE, Masayuki;MIYOSHI, Tetsuya
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1998.06a / pp.148-152 / 1998
  • Gustafson and Kessel proposed a modified fuzzy c-means algorithm based on the Mahalanobis distance. Though the algorithm appears more natural through its use of a fuzzy covariance matrix, it needs to calculate determinants and inverses of the c fuzzy scatter matrices. This paper proposes a fuzzy clustering algorithm using a pseudo-Mahalanobis distance, which is easier to use and more flexible than Gustafson and Kessel's fuzzy c-means.
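
To show why the Gustafson and Kessel variant needs a determinant and an inverse for every cluster, the following is a minimal sketch of its distance computation, assuming NumPy. The function name, argument layout, and the defaults `m=2.0` and `rho=1.0` are assumptions for illustration; the pseudo-Mahalanobis distance proposed in the paper is not reproduced here because the abstract does not define it.

```python
import numpy as np

def gk_squared_distances(X, V, U, m=2.0, rho=1.0):
    """Squared Gustafson-Kessel distances between points and cluster centers.

    X: (N, n) data, V: (c, n) centers, U: (c, N) fuzzy memberships.
    Returns an array of shape (c, N).
    """
    n = X.shape[1]
    d2 = np.empty((V.shape[0], X.shape[0]))
    for i, v in enumerate(V):
        w = U[i] ** m                                   # fuzzified memberships
        diff = X - v
        F = (w[:, None] * diff).T @ diff / w.sum()      # fuzzy covariance matrix
        # Norm-inducing matrix: needs det(F) and inv(F) for every cluster.
        A = (rho * np.linalg.det(F)) ** (1.0 / n) * np.linalg.inv(F)
        d2[i] = np.einsum("kj,jl,kl->k", diff, A, diff)
    return d2
```

When a cluster's fuzzy covariance is isotropic, the norm-inducing matrix reduces to the identity and the result equals the ordinary squared Euclidean distance. A nearly singular covariance makes the inverse unstable, which is one practical cost of this formulation.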

An Implementation of the Baseline Recognizer Using the Segmental K-means Algorithm for the Noisy Speech Recognition Using the Aurora DB (Aurora DB를 이용한 잡음 음성 인식실험을 위한 Segmental K-means 훈련 방식의 기반인식기의 구현)

  • Kim Hee-Keun;Chung Young-Joo
    • MALSORI / no.57 / pp.113-122 / 2006
  • Recently, many studies have addressed speech recognition in noisy environments. In particular, the Aurora DB was built as a common database for comparing various feature extraction schemes. In general, however, the recognition models as well as the features have to be modified for effective noisy speech recognition. Because the structure of the HTK is very complex, its recognition engine is not easy to modify. In this paper, we implemented a baseline recognizer based on the segmental K-means algorithm whose performance is comparable to that of the HTK despite its simple implementation.

Fast Outlier Removal for Image Registration based on Modified K-means Clustering

  • Soh, Young-Sung;Qadir, Mudasar;Kim, In-Taek
    • Journal of the Institute of Convergence Signal Processing / v.16 no.1 / pp.9-14 / 2015
  • Outlier detection and removal is a crucial step in various image processing applications such as image registration. Random Sample Consensus (RANSAC) is known to be the best algorithm so far for outlier detection and removal. However, RANSAC requires considerable computation time. To drastically reduce the computation time while preserving comparable quality, an outlier detection and removal method based on modified K-means is proposed. The original K-means is first run on the matching point pairs, and cluster merging and member exclusion steps are then performed in the modification step. We applied the method to various images with highly repetitive patterns under several geometric distortions and obtained successful results. We compared the proposed method with RANSAC and showed that it runs 3 to 10 times faster than RANSAC.
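
The three stages named in this abstract (K-means on matched point pairs, cluster merging, member exclusion) can be sketched roughly as follows, assuming NumPy. Clustering the displacement vectors of the matches, the merging rule, the per-coordinate median used in the exclusion step, and all thresholds and names are assumptions chosen for illustration; the abstract does not give the authors' actual rules.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center; an empty cluster keeps its old center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def inlier_mask(src, dst, k=3, merge_dist=2.0, max_dev=3.0):
    """Flag matches whose displacement agrees with the dominant motion."""
    disp = dst - src                                  # one vector per match
    labels, centers = kmeans(disp, k)
    main = np.bincount(labels, minlength=k).argmax()  # most populated cluster
    # Cluster merging (assumed rule): join clusters centered near the main one.
    merged = np.linalg.norm(centers - centers[main], axis=1) <= merge_dist
    candidate = merged[labels]
    # Member exclusion (assumed rule): drop members far from the median motion.
    center = np.median(disp[candidate], axis=0)
    return candidate & (np.linalg.norm(disp - center, axis=1) <= max_dev)
```

Here `src` and `dst` are `(N, 2)` arrays of matched coordinates. Unlike RANSAC, nothing in this sketch fits a geometric model repeatedly, which is consistent with the speed advantage the abstract reports, although this simplified version only handles motion that is close to a pure translation.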

Extensions of X-means with Efficient Learning the Number of Clusters (X-means 확장을 통한 효율적인 집단 개수의 결정)

  • Heo, Gyeong-Yong;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.4 / pp.772-780 / 2008
  • K-means is one of the simplest unsupervised learning algorithms for solving the clustering problem. However, K-means has a basic shortcoming: the number of clusters k has to be known in advance. In this paper, we propose extensions of X-means, which can estimate the number of clusters using the Bayesian information criterion (BIC). We introduce two versions of the algorithm, modified X-means (MX-means) and generalized X-means (GX-means), which employ one full covariance matrix per cluster and can therefore estimate the number of clusters efficiently without the severe over-fitting that X-means suffers from due to its spherical cluster assumption. The algorithms start with one cluster and iteratively try to split a cluster to maximize the BIC score. MX-means uses the K-means algorithm to find a set of optimal clusters for the current k, which makes it simple and fast; however, it generates wrongly estimated centers when the clusters overlap. GX-means uses the EM algorithm to estimate the parameters and generates more stable clusters even when the clusters overlap. Experiments with synthetic data show that the proposed methods provide a robust estimate of the number of clusters and cluster parameters compared to other existing top-down algorithms.
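
For reference, a BIC score in the "larger is better" form commonly used for X-means-style splitting can be written as below. The symbols are assumptions introduced here: n is the number of data points, d the dimension, k the number of clusters, \hat{L} the maximized likelihood, and p_k the number of free parameters. The abstract does not state the exact scoring used by MX-means and GX-means, so this is a sketch of the general criterion only.

```latex
\mathrm{BIC}(M_k) = \log \hat{L}(M_k) - \frac{p_k}{2}\,\log n

% Free parameters with one full covariance matrix per cluster:
p_k^{\text{full}} = (k-1) + k\,d + k\,\frac{d(d+1)}{2}

% Free parameters with a single shared spherical variance:
p_k^{\text{spherical}} = (k-1) + k\,d + 1
```

A split is accepted only if the gain in log-likelihood outweighs the extra penalty. The full-covariance penalty grows quadratically with d, so each additional cluster must be justified by a larger likelihood gain than under the spherical model.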

Radial basis function network design for chaotic time series prediction (혼돈 시계열의 예측을 위한 Radial Basis 함수 회로망 설계)

  • 신창용;김택수;최윤호;박상희
    • The Transactions of the Korean Institute of Electrical Engineers / v.45 no.4 / pp.602-611 / 1996
  • In this paper, radial basis function networks with two hidden layers, which employ the K-means clustering method and hierarchical training, are proposed to improve the short-term predictability of chaotic time series. Furthermore, a recursive training method for radial basis function networks using the recursive modified Gram-Schmidt algorithm is proposed for the same purpose. In addition, the radial basis function networks trained by the proposed methods are compared with the model of X.D. He and A. Lapedes and with a radial basis function network trained by a nonrecursive method. Through this comparison, an improved radial basis function network for predicting chaotic time series is presented. (author). 17 refs., 8 figs., 3 tabs.

Blind linear/nonlinear equalization for heavy noise-corrupted channels

  • Han, Soo-Whan;Park, Sung-Dae
    • Journal of information and communication convergence engineering / v.7 no.3 / pp.383-391 / 2009
  • In this paper, blind equalization using a modified Fuzzy C-Means algorithm with Gaussian Weights (MFCM_GW) is applied to heavily noise-corrupted channels. The proposed algorithm can deal with both linear and nonlinear channels because it searches for the optimal output states of a channel instead of estimating the channel parameters directly. In contrast to the common Euclidean distance in Fuzzy C-Means (FCM), its search procedure uses a Bayesian likelihood fitness function and a Gaussian weighted partition matrix. The channel states selected by MFCM_GW are always close to the optimal set for a channel, even when the channel is heavily corrupted by additive white Gaussian noise (AWGN). Simulation studies demonstrate that the proposed method outperforms existing genetic algorithm (GA) and conventional FCM based methods in terms of accuracy and speed.

An improved algorithm for the exchange heuristic for solving multi-project multi-resource constrained scheduling with variable-intensity activities

  • Yu, Jai-Keon;Kim, Won-Kyung
    • Proceedings of the Korean Operations and Management Science Society Conference / 1993.04a / pp.343-352 / 1993
  • In this study, a modified algorithm for the exchange heuristic is developed and applied to a resource-constrained scheduling problem. The problem involves multiple projects and multiple resource categories and allows flexible resource allocation to each activity. The objective is to minimize the maximum completion time. The exchange heuristic is a multiple-pass algorithm that improves upon a given initial feasible schedule. Four different modified algorithms are proposed. The original algorithm and the new algorithms were compared through an experimental investigation. All the proposed algorithms reduce the maximum completion time much more effectively than the original algorithm. In particular, one of the four proposed algorithms clearly outperforms the other three. This best-performing algorithm produces significantly shorter schedules than the original algorithm, though it requires up to three times more computation time. However, in most situations, a reduction in schedule length means a significant reduction in the total cost.

Optimization Design for Dynamic Characters of Electromagnetic Apparatus Based on Niche Sorting Multi-objective Particle Swarm Algorithm

  • Xu, Le;You, Jiaxin;Yu, Haidan;Liang, Huimin
    • Journal of Magnetics / v.21 no.4 / pp.660-665 / 2016
  • The electromagnetic apparatus plays an important role in high-power electrical systems, so an effective approach to optimizing high-power electromagnetic apparatus is of great importance. However, premature convergence and a sparse Pareto solution set often occur when optimizing electromagnetic apparatus. This paper proposes a modified multi-objective particle swarm optimization algorithm based on a niche sorting strategy. With the modified algorithm, a better Pareto optimal front with an improved distribution is obtained. To address the closing bounce and slow breaking velocity of electromagnetic apparatus, a multi-objective optimization model was established on the basis of the traditional optimization. The model was then solved with the improved multi-objective particle swarm optimization algorithm, yielding a series of optimized parameters (decision variables). Compared with other classical algorithms, the modified algorithm performs satisfactorily on multi-objective optimization problems for electromagnetic apparatus.
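
Two ingredients that any Pareto-based swarm optimizer of this kind relies on are a dominance test and a crowding measure over the current front. The sketch below shows generic versions of both for minimization problems; the function names and the simple radius-based niche count are assumptions for illustration, and the paper's niche sorting strategy itself is not specified in the abstract.

```python
import math

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def niche_counts(front, radius):
    """For each front member, count the other members within `radius`.

    A low count marks a sparsely populated region; preferring such members
    as swarm leaders is one common way to spread solutions along the front.
    """
    return [
        sum(1 for j, q in enumerate(front) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(front)
    ]
```

For the objective vectors `(1, 5)`, `(2, 2)`, `(3, 3)`, and `(5, 1)`, the point `(3, 3)` is dominated by `(2, 2)` and drops out of the front. Both helpers are quadratic in the number of points, which is usually acceptable for the small external archives used in particle swarm methods.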

Speaker-Independent Isolated Word Recognition Using A Modified ISODATA Method (Modified ISODATA 방법을 이용한 불특정화자 단독어 인식)

  • Hwang, U-Geun;An, Tae-Ok;Lee, Hyeong-Jun
    • The Journal of the Acoustical Society of Korea / v.6 no.4 / pp.31-43 / 1987
  • As a study on speaker-independent isolated word recognition, a modified ISODATA clustering method is proposed. This method simplifies the outlier processing and the splitting procedure of the conventional ISODATA algorithm and eliminates the lumping procedure. With this method, cluster centers could be found precisely and automatically. When the method was applied to 11 digits spoken by 10 males and 4 females, its recognition rate of $84.42\%$ for K=4 was better than the $82.5\%$ of the latest modified K-means. Judging from these results, we consider this method the best for finding cluster centers precisely.
