• Title/Summary/Keyword: model-based clustering

Path based K-means Clustering for RFID Data Sets

  • Yun, Hong-Won
    • Journal of information and communication convergence engineering / v.6 no.4 / pp.434-438 / 2008
  • Massive data are produced continuously, at rates exceeding several terabytes per day. Applications that handle such data need effective clustering algorithms to achieve high overall computational performance. In this paper, we propose a clustering approach that uses an ancestor as the cluster center: a modified K-means algorithm based on ancestors. We present a clustering architecture and a clustering algorithm that minimize the number of I/Os and perform well. Our experimental evaluation shows that the algorithm can improve I/O speed and query processing time.

Nonlinear structural finite element model updating with a focus on model uncertainty

  • Mehrdad, Ebrahimi;Reza Karami, Mohammadi;Elnaz, Nobahar;Ehsan Noroozinejad, Farsangi
    • Earthquakes and Structures / v.23 no.6 / pp.549-580 / 2022
  • This paper assesses the influence of modeling assumptions and uncertainties on the performance of the nonlinear finite element (FE) model updating procedure and the model clustering method. The results of a shaking table test on a four-story steel moment-resisting frame are used for both calibration and clustering of the FE models. In the first part, nonlinear FE models of the test frame, ranging from simple to detailed, are calibrated to minimize the difference between various data features of the models and of the structure. To investigate the effect of the chosen data feature, four features of the responses are considered: acceleration, displacement, hysteretic energy, and instantaneous features. In the last part of the work, a model-based clustering approach is introduced that groups models of the four-story frame with similar behavior in order to detect abnormal ones. The approach combines property derivation, outlier removal based on k-nearest neighbors, and K-means clustering using the specified data features. The clustering results showed correlations among similar models and also helped to identify the best strategy for modeling different structural components.

Clustering-based identification for the prediction of splitting tensile strength of concrete

  • Tutmez, Bulent
    • Computers and Concrete / v.6 no.2 / pp.155-165 / 2009
  • Splitting tensile strength (STS) of high-performance concrete (HPC) is one of the important mechanical properties for structural design. This property is related to compressive strength (CS), water/binder (W/B) ratio, and concrete age. This paper presents a clustering-based fuzzy model for predicting STS from CS and W/B ratio at a fixed age (28 days). The data-driven fuzzy model consists of three main steps: fuzzy clustering, inference system, and prediction. The model can analyze the system directly from measured data. The performance evaluations showed that the fuzzy model is more accurate than the other prediction models considered.

Speaker Adaptation Using i-Vector Based Clustering

  • Kim, Minsoo;Jang, Gil-Jin;Kim, Ji-Hwan;Lee, Minho
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.7 / pp.2785-2799 / 2020
  • We propose a novel speaker adaptation method using acoustic model clustering. The similarity of different speakers is defined by the cosine distance between their i-vectors (intermediate vectors), and various efficient clustering algorithms are applied to obtain a number of speaker subsets with different characteristics. The speaker-independent model is then retrained with the training data of the individual speaker subsets grouped by the clustering results, and an unknown speech is recognized by the retrained model of the closest cluster. The proposed method is applied to a large-scale speech recognition system implemented with a hybrid hidden Markov model and deep neural network framework. An experiment was conducted to evaluate the word error rates using the Resource Management database. When the proposed speaker adaptation method using i-vector based clustering was applied, performance improved over the conventional speaker-independent speech recognition model by as much as 12.2% in relative terms for the conventional fully neural network, and by as much as 10.5% for the bidirectional long short-term memory.
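
The similarity measure and the "closest cluster" selection described here can be illustrated with a short sketch. It assumes each cluster is represented by a single centroid vector, which the abstract does not state; the vectors, cluster names, and function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def closest_cluster(ivector, centroids):
    """Return the name of the cluster whose centroid is nearest in cosine distance."""
    return min(centroids, key=lambda name: cosine_distance(ivector, centroids[name]))


# Hypothetical 3-dimensional stand-ins for i-vectors (real ones are much longer).
centroids = {
    "cluster_a": [1.0, 0.0, 0.2],
    "cluster_b": [0.0, 1.0, 0.1],
}
unknown = [0.9, 0.1, 0.3]

chosen = closest_cluster(unknown, centroids)
print(chosen)
```

In the setting the abstract describes, the name returned would select which retrained acoustic model decodes the utterance.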

Design and Implementation of the Ensemble-based Classification Model by Using k-means Clustering

  • Song, Sung-Yeol;Khil, A-Ra
    • Journal of the Korea Society of Computer and Information / v.20 no.10 / pp.31-38 / 2015
  • In this paper, we propose an ensemble-based classification model that uses clustering to extract only new data patterns from streaming data and generates new classification models to add to the ensemble, reducing the amount of data labeling while maintaining the accuracy of the existing system. The proposed technique clusters similarly patterned data from the stream and labels each cluster once a certain amount of data has been gathered. It applies the K-NN technique to each classification model unit so that accuracy is maintained while only a small amount of data is used. Simulation results on benchmarks show that, by using clustering, the proposed technique uses about 3% less data than the existing technique.

Bayesian Curve Clustering in Microarray

  • Lee, Kyeong-Eun;Mallick, Bani K.
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집) / 2006.04a / pp.39-42 / 2006
  • We propose a Bayesian model-based approach, using a mixture of Dirichlet processes model with a discrete wavelet transform, for curve clustering of microarray data with time-course gene expression.

The Application of an HMM-based Clustering Method to Speaker Independent Word Recognition (HMM을 기본으로한 집단화 방법의 불특정화자 단어 인식에 응용)

  • Lim, H.;Park, S.-Y.;Park, M.-W.
    • The Journal of the Acoustical Society of Korea / v.14 no.5 / pp.5-10 / 1995
  • In this paper we present a clustering procedure based on HMMs that yields multiple statistical models able to absorb the variation among speakers who say words in different ways. The HMM-clustered models obtained with this technique are applied to speaker-independent isolated word recognition. The HMM clustering method splits off from the training set all observation sequences whose likelihood scores fall below a threshold and creates a new model from the observation sequences in the new cluster. Clustering is iterated by classifying each observation sequence as belonging to the cluster whose model gives the maximum likelihood score. If any cluster has changed from the previous iteration, the model in that cluster is reestimated using the Baum-Welch reestimation procedure. Because the clustering procedure and the parameter estimation are integrated, this method is more efficient than the conventional template-based clustering technique. Experimental data show that the HMM-based clustering procedure leads to a $1.43\%$ performance improvement over the conventional template-based clustering method and a $2.08\%$ improvement over the single HMM method for recognition of isolated Korean digits.

Normal Mixture Model with General Linear Regressive Restriction: Applied to Microarray Gene Clustering

  • Kim, Seung-Gu
    • Communications for Statistical Applications and Methods / v.14 no.1 / pp.205-213 / 2007
  • In this paper, a normal mixture model whose component means are subject to a general linear restriction based on linear regression is proposed, and a fitting method using the EM algorithm and Lagrange multipliers is provided. The model is applied to gene clustering of microarray expression data, where it performs very well on a real data set. The model also lets an analyst obtain the clusters of interest by expressing the hypothesis about the component means through the design matrices and the linear restriction matrices.
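
For orientation, the baseline that this paper extends is ordinary EM fitting of a normal mixture. The sketch below is that unrestricted baseline in one dimension only; it does not implement the linear restriction on the component means or the Lagrange multiplier step. The data, starting means, and function names are assumptions for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_normal_mixture(data, init_means, iters=100):
    """Unrestricted EM for a 1-D normal mixture.
    Returns (weights, means, variances, responsibilities)."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: weighted updates of mixing weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances, resp


# Hypothetical expression values forming two well-separated groups.
data = [-2.1, -1.9, -2.0, -2.2, -1.8, 3.0, 3.1, 2.9, 3.2, 2.8]
weights, means, variances, resp = em_normal_mixture(data, init_means=[-1.0, 1.0])

# Hard cluster labels: the component with the largest responsibility.
clusters = [max(range(len(r)), key=lambda j: r[j]) for r in resp]
print(means, clusters)
```

In the restricted model the abstract describes, the M-step for the means would additionally have to satisfy the linear constraints, which is where the Lagrange multipliers enter.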

Genetically Optimized Information Granules-based FIS (유전자적 최적 정보 입자 기반 퍼지 추론 시스템)

  • Park, Keon-Jun;Oh, Sung-Kwun;Lee, Young-Il
    • Proceedings of the KIEE Conference / 2005.10b / pp.146-148 / 2005
  • In this paper, we propose a genetically optimized identification of an information granulation (IG)-based fuzzy model. To design the IG-based fuzzy model optimally, we exploit a hybrid identification through genetic algorithms (GAs) and Hard C-Means (HCM) clustering. The initial structure of the fuzzy model is identified by using GAs to determine the number of inputs, the selected input variables, the number of membership functions, and the conclusion inference type. Granulation of the information data with the HCM clustering algorithm helps determine the initial parameters of the fuzzy model, such as the initial apexes of the membership functions and the initial values of the polynomial functions used in the premise and consequence parts of the fuzzy rules. The initial parameters are then tuned effectively with the genetic algorithms and the least squares method. We also exploit consecutive identification of the fuzzy model's structure and parameters. A numerical example is included to evaluate the performance of the proposed model.

Prediction of Energy Consumption in a Smart Home Using Coherent Weighted K-Means Clustering ARIMA Model

  • Magdalene, J. Jasmine Christina;Zoraida, B.S.E.
    • International Journal of Computer Science & Network Security / v.22 no.10 / pp.177-182 / 2022
  • Technology is progressing with every passing day, and heavy use of electricity is becoming a necessity. One way to enjoy the benefits of a smart home is to manage electric energy efficiently. When electric energy is managed appropriately, enough power is saved to be used even during hard times, such as after natural calamities. To accomplish this, prediction of energy consumption plays a very important role. The proposed prediction model, Coherent Weighted K-Means Clustering ARIMA (CWKMCA), enhances the weighted k-means clustering technique by adding weights to the cluster points. Forecasting is done using the ARIMA model based on the centroids of the clusters produced. The dataset for this work is taken from the Pecan Project in Texas, USA. The accuracy of this model is compared with the traditional ARIMA model and the Weighted K-Means Clustering ARIMA model. When error and model-selection measures such as RMSE, MAPE, AIC, and AICc are analysed, the proposed model yields lower values than the ARIMA and Weighted K-Means Clustering ARIMA models. It also has a greater log-likelihood, indicating that it outperforms the ARIMA model for time series forecasting.
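
The comparison measures named in this abstract have standard definitions, sketched below for reference. The function names and the sample numbers are hypothetical and are not taken from the paper; `n_params` and `n_obs` stand for the number of fitted parameters and the number of observations.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def aicc(log_likelihood, n_params, n_obs):
    """AIC with the small-sample correction 2k(k + 1) / (n - k - 1)."""
    correction = 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic(log_likelihood, n_params) + correction


# Hypothetical consumption readings and forecasts.
actual = [2.0, 4.0, 5.0]
predicted = [1.0, 4.0, 7.0]

print(rmse(actual, predicted))
print(mape(actual, predicted))
print(aic(-10.0, 3), aicc(-10.0, 3, 20))
```

For all four measures lower is better, while for the log-likelihood higher is better, which is why the abstract reports lower error values together with a greater log-likelihood.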