• Title/Summary/Keyword: Clustering Technique

Recommendation of Optimal Treatment Method for Heart Disease using EM Clustering Technique

  • Jung, Yong Gyu;Kim, Hee Wan
    • International Journal of Advanced Culture Technology / v.5 no.3 / pp.40-45 / 2017
  • Data mining techniques were used to extract useful information from percutaneous coronary intervention data obtained from the US public data homepage. The experiment extracted data on the area, the frequency of operations, and the number of deaths, and meaningful correlations, patterns, and trends were found using various algorithms, pattern techniques, and statistical techniques. In this paper, information for predicting the incidence of percutaneous coronary intervention and the associated mortality is obtained through decision tree and cluster analysis. In the cluster analysis, the experimental data were classified using the EM algorithm, the suitability of the algorithm for each situation was evaluated through performance tests and verification of the results, and the models were compared to determine which are more effective. Using data mining, the study identified which areas had effective treatment techniques and which areas were vulnerable, and it can predict the frequency and mortality of percutaneous coronary intervention for heart disease.
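
The abstract above names the EM algorithm but does not give its configuration. As a rough illustration only, the sketch below fits a one-dimensional Gaussian mixture with EM. The function name `em_gmm_1d`, the initial guesses, and the toy values are assumptions made for this example and are not data or code from the paper.

```python
import math

def em_gmm_1d(data, mu, sigma, weights, iterations=50):
    """Fit a 1-D Gaussian mixture with EM; mu, sigma, weights are initial guesses."""
    k = len(mu)
    mu, sigma, weights = list(mu), list(sigma), list(weights)
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [
                weights[j] * math.exp(-0.5 * ((x - mu[j]) / sigma[j]) ** 2)
                / (sigma[j] * math.sqrt(2 * math.pi))
                for j in range(k)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            mu[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mu[j]) ** 2 for r, x in zip(resp, data)) / nj
            sigma[j] = max(math.sqrt(var), 1e-6)  # guard against collapse
    return mu, sigma, weights

# Made-up toy values forming two well-separated groups.
toy = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
means, spreads, shares = em_gmm_1d(toy, [0.5, 3.5], [1.0, 1.0], [0.5, 0.5])
```

Each point can then be assigned to the component with the highest responsibility, which is one common way to read cluster labels off an EM fit.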

Iterative LBG Clustering for SIMO Channel Identification

  • Daneshgaran, Fred;Laddomada, Massimiliano
    • Journal of Communications and Networks / v.5 no.2 / pp.157-166 / 2003
  • This paper deals with the problem of channel identification for Single Input Multiple Output (SIMO) slow fading channels using clustering algorithms. Due to the intrinsic memory of the discrete-time model of the channel, over short observation periods, the received data vectors of the SIMO model are spread in clusters because of the AWGN noise. Each cluster is practically centered around the ideal channel output labels without noise, and the noisy received vectors are distributed according to a multivariate Gaussian distribution. Starting from the Markov SIMO channel model, simultaneous maximum likelihood estimation of the input vector and the channel coefficients reduces to obtaining the values of this pair that minimize the sum of the Euclidean norms between the received and the estimated output vectors. The Viterbi algorithm can be used for this purpose provided the trellis diagram of the Markov model can be labeled with the noiseless channel outputs. The problem of identifying the ideal channel outputs, which is the focus of this paper, is then equivalent to designing a Vector Quantizer (VQ) from a training set corresponding to the observed noisy channel outputs. The Linde-Buzo-Gray (LBG)-type clustering algorithms [1] could be used to obtain the noiseless channel output labels from the noisy received vectors. One problem with the use of such algorithms for blind time-varying channel identification is the codebook initialization. This paper looks at two critical issues with regard to the use of VQ for channel identification. The first deals with the applicability of this technique in general; we present theoretical results for the conditions under which the technique may be applicable. The second aims at overcoming the codebook initialization problem by proposing a novel approach which attempts to make the first phase of the channel estimation faster than the classical codebook initialization methods. Sample simulation results are provided confirming the effectiveness of the proposed initialization technique.
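
For orientation, a classical splitting-based LBG codebook design can be sketched as follows. This is the textbook procedure, not the initialization method proposed in the paper. The helper names, the additive perturbation `eps`, and the toy training set (four noiseless levels with small offsets) are assumptions chosen for illustration.

```python
def nearest(codebook, v):
    """Index of the codeword closest to v in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def lbg(training, size, eps=0.01, iterations=20):
    """Grow a codebook by splitting and Lloyd refinement (size: a power of 2)."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every codeword into two slightly perturbed copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iterations):
            cells = [[] for _ in codebook]
            for v in training:
                cells[nearest(codebook, v)].append(v)
            # Move each codeword to its cell centroid; keep it if the cell is empty.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook

# Toy noisy observations around the hypothetical noiseless outputs -3, -1, 1, 3.
received = [[x] for x in [-3.1, -2.9, -1.1, -0.9, 0.9, 1.1, 2.9, 3.1]]
labels = sorted(cw[0] for cw in lbg(received, 4))
```

The codewords play the role of the noiseless channel output labels; the vectors here have one component only to keep the example short, and the same code accepts longer vectors.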

A Computational Intelligence Based Online Data Imputation Method: An Application For Banking

  • Nishanth, Kancherla Jonah;Ravi, Vadlamani
    • Journal of Information Processing Systems / v.9 no.4 / pp.633-650 / 2013
  • All the imputation techniques proposed so far in the literature are offline techniques, as they require a number of iterations to learn the characteristics of the data during training and they also consume a lot of computational time. Hence, these techniques are not suitable for applications that require the imputation to be performed on demand and in near real time. The paper proposes a computational intelligence based architecture for online data imputation, as well as extended versions of an existing offline data imputation method. The proposed online imputation technique has 2 stages. In stage 1, the Evolving Clustering Method (ECM) is used to replace the missing values with cluster centers, as part of the local learning strategy. Stage 2 refines the resultant approximate values using a General Regression Neural Network (GRNN) as part of the global approximation strategy. The extended offline imputation techniques employ K-Means or K-Medoids in stage 1 and a Multi Layer Perceptron (MLP) or GRNN in stage 2. Several experiments were conducted on 8 benchmark datasets and 4 bank related datasets to assess the effectiveness of the proposed online and offline imputation techniques. In terms of Mean Absolute Percentage Error (MAPE), the results indicate that the difference between the proposed best offline imputation method, viz., K-Medoids+GRNN, and the proposed online imputation method, viz., ECM+GRNN, is statistically insignificant at a 1% level of significance. Consequently, the proposed online technique, being less expensive and faster, can be employed for imputation instead of the existing and proposed offline imputation techniques. This is the significant outcome of the study. Furthermore, GRNN in stage 2 uniformly reduced MAPE values in both offline and online imputation methods on all datasets.
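
The stage 1 idea (fill a missing value from a cluster center) and the MAPE metric can be sketched briefly. The sketch assumes the cluster centers are already available; it does not implement ECM or the GRNN refinement, and the function names and toy numbers are hypothetical.

```python
def impute_with_centers(record, centers):
    """Fill None entries from the center nearest on the observed fields."""
    observed = [i for i, v in enumerate(record) if v is not None]
    best = min(centers,
               key=lambda c: sum((record[i] - c[i]) ** 2 for i in observed))
    return [best[i] if v is None else v for i, v in enumerate(record)]

def mape(actual, predicted):
    """Mean Absolute Percentage Error; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical centers over two features; the second field of the record is missing.
centers = [[1.0, 10.0], [5.0, 50.0]]
filled = impute_with_centers([4.6, None], centers)
```

In a two-stage design such as the one described, the value taken from the center would only be a first approximation that a second model then refines.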

A Novel Clustering Method for Increasing Connection Durability in Sensor Network Environment (센서 네트워크에서 연결 지속성 향상 가능한 새로운 클러스터링 기법에 관한 연구)

  • Kim, Dae-Hyun;Kim, Jin-Mook;Lee, Kyung-Oh
    • The KIPS Transactions:PartC / v.15C no.2 / pp.119-124 / 2008
  • LEACH is a representative clustering-based method among the many routing techniques proposed for sensor network environments. It was proposed to manage efficiently a sensor network group consisting of many sensors. However, it does not consider the remaining energy of the cluster header that manages the cluster group, so the cluster group can break down in the middle of a data transmission. In this paper, we propose a new clustering technique for managing cluster groups that addresses this problem. Test results show that the proposed technique improves the connection durability of the cluster group during data transfer and control, and also reduces the lag time.

The Alcock-Paczynski effect via clustering shells

  • Sabiu, Cristiano G.;Lee, Seokcheon;Park, Changbom
    • The Bulletin of The Korean Astronomical Society / v.38 no.2 / pp.58.2-58.2 / 2013
  • Both peculiar velocities and errors in the assumed redshift-distance relation ("Alcock-Paczynski effect") generate correlations between clustering amplitude and orientation with respect to the line-of-sight. In this talk we propose a novel technique to extract the Alcock-Paczynski, geometric, distortion information from the anisotropic clustering of galaxies in 3-dimensional redshift space while minimizing non-linear clustering and peculiar velocity effects. We capitalize on the recent, large dataset from the Sloan Digital Sky Survey III (SDSS-III), which provides a large comoving sample of the universe out to high redshift. We focus our analysis on the Baryon Oscillation Spectroscopic Survey (BOSS) constant mass (CMASS) sample of 549,005 bright galaxies in the redshift range 0.43

Applying Particle Swarm Optimization for Enhanced Clustering of DNA Chip Data (DNA Chip 데이터의 군집화 성능 향상을 위한 Particle Swarm Optimization 알고리즘의 적용기법)

  • Lee, Min-Soo
    • The KIPS Transactions:PartD / v.17D no.3 / pp.175-184 / 2010
  • Experiments and research on genes have become very convenient by using DNA chips, which provide large amounts of data from various experiments. The data provided by the DNA chips could be represented as a two dimensional matrix, in which one axis represents genes and the other represents samples. By performing an efficient and good quality clustering on such data, the classification work which follows could be more efficient and accurate. In this paper, we use a bio-inspired algorithm called the Particle Swarm Optimization algorithm to propose an efficient clustering mechanism for large amounts of DNA chip data, and show through experimental results that the clustering technique using the PSO algorithm provides a faster yet good quality result compared with other existing clustering solutions.
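
To make the idea of PSO-driven clustering concrete, here is a minimal global-best PSO sketch in which each particle encodes `k` candidate centroids and the fitness is the mean squared distance of each point to its nearest centroid. The encoding, the fitness function, the coefficients (0.7, 1.5, 1.5), and the scalar toy data are all assumptions; the abstract does not state the formulation used in the paper.

```python
import random

def quantization_error(centroids, data):
    """Mean squared distance from each scalar point to its nearest centroid."""
    return sum(min((x - c) ** 2 for c in centroids) for x in data) / len(data)

def pso_cluster(data, k, particles=10, steps=50, seed=0):
    """Global-best PSO over k scalar centroids; returns (centroids, fitness)."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    pos = [[rng.uniform(lo, hi) for _ in range(k)] for _ in range(particles)]
    vel = [[0.0] * k for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [quantization_error(p, data) for p in pos]
    g = min(range(particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(steps):
        for i in range(particles):
            for d in range(k):
                # Inertia plus pulls toward the personal and global bests.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fit = quantization_error(pos[i], data)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Made-up scalar values standing in for expression levels.
toy = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
best_centroids, best_fit = pso_cluster(toy, 2)
```

Real DNA chip data is a gene-by-sample matrix, so each centroid would be a vector rather than a scalar; the update rule stays the same per coordinate.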

Efficient and Secure Routing Protocol for Wireless Sensor Networks through SNR Based Dynamic Clustering Mechanisms

  • Ganesh, Subramanian;Amutha, Ramachandran
    • Journal of Communications and Networks / v.15 no.4 / pp.422-429 / 2013
  • Advances in wireless sensor network (WSN) technology have enabled small and low-cost sensors with the capability of sensing various types of physical and environmental conditions, data processing, and wireless communication. In the WSN, the sensor nodes have a limited transmission range, and their processing and storage capabilities as well as their energy resources are limited. A triple umpiring system has already been shown to perform better in WSNs. The clustering technique is effective in prolonging the lifetime of the WSN. In this study, we have modified the ad-hoc on demand distance vector routing by incorporating signal-to-noise ratio (SNR) based dynamic clustering. The proposed scheme, which is an efficient and secure routing protocol for wireless sensor networks through SNR-based dynamic clustering (ESRPSDC) mechanisms, can partition the nodes into clusters and select the cluster head (CH) among the nodes based on the energy, and non-CH nodes join a specific CH based on the SNR values. Error recovery has been implemented during the inter-cluster routing in order to avoid end-to-end error recovery. Security has been achieved by isolating the malicious nodes using sink-based routing pattern analysis. Extensive investigation studies using a global mobile simulator have shown that this hybrid ESRP significantly improves the energy efficiency and packet reception rate as compared with the SNR-unaware routing algorithms such as the low energy aware adaptive clustering hierarchy and power efficient gathering in sensor information systems.
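
The two selection rules in the abstract (cluster heads chosen by energy, other nodes joining by SNR) can be sketched as below. The data layout, the function name `form_clusters`, and the choice of simply taking the highest-energy nodes as heads are assumptions; the routing, error recovery, and security parts of the protocol are not modeled.

```python
def form_clusters(energy, snr, num_heads):
    """energy: {node: residual energy}; snr: {(node, head): SNR value}."""
    # Assumed rule: the nodes with the most residual energy become cluster heads.
    heads = sorted(energy, key=energy.get, reverse=True)[:num_heads]
    clusters = {h: [] for h in heads}
    for node in energy:
        if node not in clusters:
            # A non-CH node joins the head with which it has the best SNR.
            best = max(heads, key=lambda h: snr[(node, h)])
            clusters[best].append(node)
    return clusters

# Hypothetical four-node network.
energy = {"a": 0.9, "b": 0.4, "c": 0.8, "d": 0.3}
snr = {("b", "a"): 12.0, ("b", "c"): 7.0, ("d", "a"): 5.0, ("d", "c"): 9.0}
clusters = form_clusters(energy, snr, num_heads=2)
```

Re-running the function as energy levels change gives a simple form of dynamic re-clustering.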

K-Means Clustering in the PCA Subspace using an Unified Measure (통합 측도를 사용한 주성분해석 부공간에서의 k-평균 군집화 방법)

  • Yoo, Jae-Hung
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.4 / pp.703-708 / 2022
  • K-means clustering is a representative clustering technique. However, it has the limitation that the performance evaluation scale and the method of determining the minimum number of clusters cannot be integrated. In this paper, a method for numerically determining the minimum number of clusters is introduced. The explained variance is presented as an integrated measure. We propose that k-means clustering be performed in the PCA subspace in order to satisfy the minimum number of clusters and the threshold of the explained variance simultaneously. The paper aims to explain in principle why principal component analysis and k-means clustering are performed sequentially in pattern recognition and machine learning.
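
The abstract does not define its explained variance measure. One common reading, assumed here, is the share of total variance accounted for by the clustering, that is, 1 minus the within-cluster sum of squares divided by the total sum of squares. The sketch computes that quantity for a given labelling; the function name and toy points are illustrative only.

```python
def explained_variance(points, labels):
    """Share of total variance explained by a clustering: 1 - SSW / SST."""
    def mean(vs):
        return [sum(col) / len(vs) for col in zip(*vs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    sst = sum(sq_dist(p, overall) for p in points)  # total sum of squares
    ssw = 0.0                                       # within-cluster sum of squares
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        center = mean(members)
        ssw += sum(sq_dist(p, center) for p in members)
    return 1.0 - ssw / sst

# Two tight toy groups: most of the variance lies between the groups.
ratio = explained_variance([[0.0], [2.0], [10.0], [12.0]], [0, 0, 1, 1])
```

Under this reading, the smallest number of clusters whose ratio reaches a chosen threshold would be a candidate for the minimum number of clusters.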

Prediction of Energy Consumption in a Smart Home Using Coherent Weighted K-Means Clustering ARIMA Model

  • Magdalene, J. Jasmine Christina;Zoraida, B.S.E.
    • International Journal of Computer Science & Network Security / v.22 no.10 / pp.177-182 / 2022
  • Technology is progressing with every passing day, and the enormous usage of electricity is becoming a necessity. One way to benefit from a smart home is to manage electric energy efficiently. When electric energy is managed appropriately, enough power is saved to be used even in hard times, such as after natural calamities. To accomplish this, prediction of energy consumption plays a very important role. The proposed prediction model, Coherent Weighted K-Means Clustering ARIMA (CWKMCA), enhances the weighted k-means clustering technique by adding weights to the cluster points. Forecasting is done using the ARIMA model based on the centroids of the clusters produced. The dataset for this proposed work is taken from the Pecan Project in Texas, USA. The accuracy of this model is compared with the traditional ARIMA model and the Weighted K-Means Clustering ARIMA model. When measures such as RMSE, MAPE, AIC, and AICC are analysed, the proposed model shows lower values than the ARIMA and Weighted K-Means Clustering ARIMA models. The model also has a greater log-likelihood, indicating that it outperforms the ARIMA model for time series forecasting.
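
Two building blocks mentioned above are easy to state in code: a centroid computed with per-point weights, and the RMSE error measure. The weighting scheme in the sketch is a plain weighted mean and is an assumption; the abstract does not specify how CWKMCA assigns its weights, and the ARIMA step is not shown.

```python
def weighted_centroid(points, weights):
    """Weighted mean of equal-length vectors; weights must not sum to zero."""
    total = sum(weights)
    return [sum(w * x for w, x in zip(weights, col)) / total
            for col in zip(*points)]

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return (sum((a - p) ** 2
                for a, p in zip(actual, predicted)) / len(actual)) ** 0.5

# Hypothetical readings: the second point counts three times as much as the first.
center = weighted_centroid([[1.0, 2.0], [3.0, 6.0]], [1.0, 3.0])
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

With equal weights the function reduces to the ordinary k-means centroid.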

Application of k-means Clustering for Association Rule Using Measure of Association

  • Lee, Keun-Woo;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.19 no.3 / pp.925-936 / 2008
  • Association rule mining finds relations among items in a massive database. In generating association rules, the researcher arbitrarily specifies thresholds for measures such as support, confidence, and lift, and then produces the rules. A rule is not produced if it fails any one of the given threshold conditions. For example, a rule whose confidence is slightly below the specified value but whose support and lift are very high is a meaningful rule, yet association rule mining cannot produce it because it does not satisfy the given condition. Consequently, meaningful rules can be wrongly left unselected. In this paper, we suggest applying a clustering technique to association rule measures in order to find effective association rules using measures of association.
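
The three measures named in the abstract have standard definitions, sketched below for a rule of the form antecedent → consequent. The function name and the toy transactions are made up for illustration; the clustering of rules by these measures, which is the paper's contribution, is not shown.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    n_a = sum(a <= t for t in transactions)         # baskets containing A
    n_c = sum(c <= t for t in transactions)         # baskets containing C
    n_ac = sum((a | c) <= t for t in transactions)  # baskets containing both
    support = n_ac / n
    confidence = n_ac / n_a
    lift = confidence / (n_c / n)
    return support, confidence, lift

# Toy baskets, each a set of items.
baskets = [{"milk", "bread"}, {"milk"}, {"bread"}, {"milk", "bread", "eggs"}]
support, confidence, lift = rule_measures(baskets, ["milk"], ["bread"])
```

Each rule then becomes a point (support, confidence, lift), which is the kind of input a clustering method such as k-means could group instead of applying hard thresholds.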