• Title/Summary/Keyword: Kullback-Leibler method


A Simple Tandem Method for Clustering of Multimodal Dataset

  • Cho C.;Lee J.W.;Lee J.W.
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 2003.05a
    • /
    • pp.729-733
    • /
    • 2003
  • Local features within clusters, caused by the multi-modal nature of the data, prevent many conventional clustering techniques from working properly. In particular, clustering datasets with non-Gaussian distributions within a cluster can be problematic when the technique implicitly assumes a Gaussian distribution. The current study proposes a simple tandem clustering method, composed of a k-means type algorithm and a hierarchical method, to solve such problems. The multi-modal dataset is first divided into many small pre-clusters by the k-means or fuzzy k-means algorithm. The pre-clusters found in the first step are then clustered again using an agglomerative hierarchical clustering method with Kullback-Leibler divergence as the measure of dissimilarity. This method is not only effective at extracting multi-modal clusters but also fast and simple in terms of computational complexity and relatively robust in the presence of outliers. The performance of the proposed method was evaluated on three generated datasets and six publicly known real-world datasets.
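The second stage described above needs a dissimilarity between pre-clusters. The abstract does not say how each pre-cluster is modelled, so the following is only a minimal sketch under stated assumptions: each pre-cluster is summarised by a Gaussian with diagonal covariance, and the KL divergence is symmetrised by summing both directions. The names `diag_gaussian_kl`, `symmetric_kl` and `closest_pair`, and the example summaries, are hypothetical.

```python
import math


def diag_gaussian_kl(mean_p, var_p, mean_q, var_q):
    """KL(P || Q) in nats for two Gaussians with diagonal covariance."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        # Closed-form KL between two univariate normals, summed over dimensions.
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_kl(p, q):
    """Symmetrised KL between two (means, variances) summaries (an assumption)."""
    return diag_gaussian_kl(*p, *q) + diag_gaussian_kl(*q, *p)


def closest_pair(pre_clusters):
    """Return (i, j, dissimilarity) for the pair an agglomerative step would merge first."""
    best = None
    for i in range(len(pre_clusters)):
        for j in range(i + 1, len(pre_clusters)):
            d = symmetric_kl(pre_clusters[i], pre_clusters[j])
            if best is None or d < best[2]:
                best = (i, j, d)
    return best


# Hypothetical pre-cluster summaries: (per-dimension means, per-dimension variances).
pre_clusters = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([0.5, 0.0], [1.0, 1.0]),
    ([5.0, 5.0], [1.0, 1.0]),
]
print(closest_pair(pre_clusters))
```

Repeating the closest-pair search after each merge would give a full agglomerative pass. How the merged cluster is re-summarised is a separate design choice that the abstract does not specify.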


An Analysis of Fuzzy Survey Data Based on the Maximum Entropy Principle (최대 엔트로피 분포를 이용한 퍼지 관측데이터의 분석법에 관한 연구)

  • 유재휘;유동일
    • Journal of the Korea Society of Computer and Information
    • /
    • v.3 no.2
    • /
    • pp.131-138
    • /
    • 1998
  • In conventional statistical data analysis, data are described by exact values. However, in modern complex and large-scale systems, it is difficult to treat the systems using only exact data. In this paper, we define such data as fuzzy data (i.e., linguistic variables are used to construct the membership function) and propose a new method for analyzing fuzzy survey data based on the maximum entropy principle. We also propose a new discrimination method that measures the distance between the stable-state distribution and the estimated present-state distribution using the Kullback-Leibler information. Furthermore, we investigate the validity of our method by computer simulations under realistic situations.
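As an illustration of the distance mentioned above, the Kullback-Leibler information between two discrete distributions can be computed as follows. This is a generic sketch, not the paper's procedure: the function name `kl_information` and the example probabilities are assumptions, and the paper's maximum-entropy estimation step is not reproduced.

```python
import math


def kl_information(p, q):
    """KL(p || q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical example: estimated present-state vs. stable-state distribution.
present = [0.5, 0.3, 0.2]
stable = [0.4, 0.4, 0.2]
print(kl_information(present, stable))
```

The measure is asymmetric, so the order of the arguments matters. The abstract does not state which direction the authors use.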


Performance Evaluation of Nonkeyword Modeling and Postprocessing for Vocabulary-independent Keyword Spotting (가변어휘 핵심어 검출을 위한 비핵심어 모델링 및 후처리 성능평가)

  • Kim, Hyung-Soon;Kim, Young-Kuk;Shin, Young-Wook
    • Speech Sciences
    • /
    • v.10 no.3
    • /
    • pp.225-239
    • /
    • 2003
  • In this paper, we develop a keyword spotting system using a vocabulary-independent speech recognition technique, and investigate several non-keyword modeling and post-processing methods to improve its performance. To model non-keyword speech segments, monophone clustering and the Gaussian mixture model (GMM) are considered. We employ likelihood ratio scoring in the post-processing schemes to verify the recognition results, and filler models, anti-subword models and N-best decoding results are considered as alternative hypotheses for likelihood ratio scoring. We also examine different methods of constructing anti-subword models. We evaluate the performance of our system on an automatic telephone exchange service task. The results show that GMM-based non-keyword modeling yields better performance than monophone clustering. In the post-processing experiments, the method using an anti-keyword model based on Kullback-Leibler distance and the N-best decoding method perform better than the other methods, and we could reduce keyword recognition errors by more than 50% at a keyword rejection rate of 5%.


Gaussian Approximation of Stochastic Lanchester Model for Heterogeneous Forces (혼합 군에 대한 확률적 란체스터 모형의 정규근사)

  • Park, Donghyun;Kim, Donghyun;Moon, Hyungil;Shin, Hayong
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.42 no.2
    • /
    • pp.86-95
    • /
    • 2016
  • We propose a new approach to the stochastic version of the Lanchester model. The commonly used approach to the stochastic Lanchester model is the Markov-chain method. The Markov-chain approach, however, is not appropriate for the high-dimensional heterogeneous force case because of its large computational cost. In this paper, we propose an approximation method for the stochastic Lanchester model. By matching the first and second moments, the distribution of each unit's strength can be approximated by a multivariate normal distribution. We evaluate the approximation of the discrete Markov-chain model by measuring the Kullback-Leibler divergence. We confirm the high accuracy of the approximation method, and show that this accuracy and the low computational cost are maintained in the high-dimensional heterogeneous force case.

A New Distance Measure for a Variable-Sized Acoustic Model Based on MDL Technique

  • Cho, Hoon-Young;Kim, Sang-Hun
    • ETRI Journal
    • /
    • v.32 no.5
    • /
    • pp.795-800
    • /
    • 2010
  • Embedding a large vocabulary speech recognition system in mobile devices requires a reduced acoustic model obtained by eliminating redundant model parameters. In conventional optimization methods based on the minimum description length (MDL) criterion, a binary Gaussian tree is built at each state of a hidden Markov model by iteratively finding and merging similar mixture components. An optimal subset of the tree nodes is then selected to generate a downsized acoustic model. To obtain a better binary Gaussian tree by improving the process of finding the most similar Gaussian components, this paper proposes a new distance measure that exploits the difference in likelihood values for cases before and after two components are combined. The mixture weight of Gaussian components is also introduced in the component merging step. Experimental results show that the proposed method outperforms MDL-based optimization using either a Kullback-Leibler (KL) divergence or weighted KL divergence measure. The proposed method could also reduce the acoustic model size by 50% with less than a 1.5% increase in error rate compared to a baseline system.
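The abstract states that mixture weights enter the component merging step but does not give the merge formula. The sketch below therefore assumes the standard moment-preserving merge of two weighted univariate Gaussians. It is not the paper's proposed distance measure, and `Component` and `merge` are hypothetical names.

```python
from typing import NamedTuple


class Component(NamedTuple):
    weight: float
    mean: float
    var: float


def merge(a: Component, b: Component) -> Component:
    """Merge two weighted Gaussians, preserving total weight, mean and variance."""
    w = a.weight + b.weight
    mean = (a.weight * a.mean + b.weight * b.mean) / w
    # Each component contributes its own variance plus its squared offset
    # from the merged mean, weighted by its share of the total weight.
    var = (a.weight * (a.var + (a.mean - mean) ** 2)
           + b.weight * (b.var + (b.mean - mean) ** 2)) / w
    return Component(w, mean, var)


merged = merge(Component(0.5, 0.0, 1.0), Component(0.5, 2.0, 1.0))
print(merged)
```

In a binary Gaussian tree, each internal node would hold such a merged component. A distance measure such as KL divergence, or the likelihood-based measure the paper proposes, decides which pair is merged next.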

Region-based Multi-level Thresholding for Color Image Segmentation (영역 기반의 Multi-level Thresholding에 의한 컬러 영상 분할)

  • Oh, Jun-Taek;Kim, Wook-Hyun
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.43 no.6 s.312
    • /
    • pp.20-27
    • /
    • 2006
  • Multi-level thresholding is widely used in image segmentation. However, most existing methods are not suited to direct use in applications and, moreover, have not been extended to a complete image segmentation step. This paper proposes region-based multi-level thresholding as an image segmentation method. First, we classify the pixels of each color channel into two clusters using the EWFCM (Entropy-based Weighted Fuzzy C-Means) algorithm, an improved FCM algorithm that uses spatial information between pixels. To obtain better segmentation results, the number of clusters is then reduced by a region-based reclassification step based on the similarity between the regions in a cluster and the other clusters. The clusters are created using the per-channel classification information of the pixels. Finally, because many regions still remain in the image, we perform region merging as a post-processing step using a Bayesian algorithm based on the Kullback-Leibler distance between a region and its neighboring regions. Experiments show that region-based multi-level thresholding is superior to cluster-based and pixel-based multi-level thresholding and to the existing methods, and that the post-processing method yields much better segmentation results.

An improved fuzzy c-means method based on multivariate skew-normal distribution for brain MR image segmentation

  • Guiyuan Zhu;Shengyang Liao;Tianming Zhan;Yunjie Chen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.8
    • /
    • pp.2082-2102
    • /
    • 2024
  • Accurate segmentation of magnetic resonance (MR) images is crucial for providing doctors with effective quantitative information for diagnosis. However, the presence of weak boundaries, intensity inhomogeneity, and noise in the images poses challenges for segmentation models to achieve optimal results. While deep learning models can offer relatively accurate results, the scarcity of labeled medical imaging data increases the risk of overfitting. To tackle this issue, this paper proposes a novel fuzzy c-means (FCM) model that integrates a deep learning approach. To address the limited accuracy of traditional FCM models, which employ Euclidean distance as a distance measure, we introduce a measurement function based on the skewed normal distribution. This function enables us to capture more precise information about the distribution of the image. Additionally, we construct a regularization term based on the Kullback-Leibler (KL) divergence of high-confidence deep learning results. This regularization term helps enhance the final segmentation accuracy of the model. Moreover, we incorporate orthogonal basis functions to estimate the bias field and integrate it into the improved FCM method. This integration allows our method to simultaneously segment the image and estimate the bias field. The experimental results on both simulated and real brain MR images demonstrate the robustness of our method, highlighting its superiority over other advanced segmentation algorithms.

Discrimination of Out-of-Control Condition Using AIC in (x, s) Control Chart

  • Takemoto, Yasuhiko;Arizono, Ikuo;Satoh, Takanori
    • Industrial Engineering and Management Systems
    • /
    • v.12 no.2
    • /
    • pp.112-117
    • /
    • 2013
  • The $\overline{x}$ control chart for the process mean and either the R or s control chart for the process dispersion have been used together to monitor manufacturing processes. However, it has been pointed out that this procedure has a flaw: because each characteristic is monitored in parallel on an individual control chart, it is difficult to grasp the process condition visually in terms of the relationship between the shift in the process mean and the change in the process dispersion. The ($\overline{x}$, s) control chart was therefore proposed to enable process managers to monitor changes in the process mean, the process dispersion, or both. Meanwhile, identifying which process parameters are responsible for an out-of-control condition is an important issue in process management. It is especially important in the ($\overline{x}$, s) control chart, where several parameters are monitored on a single plane. Previous literature has proposed a multiple decision method based on statistical hypothesis tests to identify the parameters responsible for the out-of-control condition. In this paper, we propose a way to identify the parameters responsible for the out-of-control condition using an information criterion. The effectiveness of the proposed method is then shown through numerical experiments.
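One way to read the information-criterion idea is as model selection among four normal models for a subgroup: in control, mean shifted, dispersion changed, or both. The sketch below assumes that reading and uses AIC = -2 log L + 2k, where k is the number of parameters estimated from the sample. The candidate set, the function names and the example data are assumptions, not details taken from the paper.

```python
import math


def normal_loglik(sample, mu, var):
    """Log-likelihood of an i.i.d. normal sample with mean mu and variance var."""
    n = len(sample)
    ss = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def diagnose(sample, mu0, var0):
    """Return (best_model, aic_by_model). Assumes the sample is not constant."""
    n = len(sample)
    mean = sum(sample) / n
    var_about_mu0 = sum((x - mu0) ** 2 for x in sample) / n    # MLE with mean fixed at mu0
    var_about_mean = sum((x - mean) ** 2 for x in sample) / n  # MLE with mean estimated
    # (log-likelihood, number of estimated parameters) for each candidate model
    candidates = {
        "in_control": (normal_loglik(sample, mu0, var0), 0),
        "mean_shift": (normal_loglik(sample, mean, var0), 1),
        "dispersion_change": (normal_loglik(sample, mu0, var_about_mu0), 1),
        "both": (normal_loglik(sample, mean, var_about_mean), 2),
    }
    aic = {name: -2.0 * ll + 2.0 * k for name, (ll, k) in candidates.items()}
    return min(aic, key=aic.get), aic


# Hypothetical subgroup centred near 3 while the in-control mean is 0.
deviations = [-1.5, -1.0, -1.0, -0.5, 0.5, 1.0, 1.0, 1.5]
best, scores = diagnose([3.0 + d for d in deviations], mu0=0.0, var0=1.0)
print(best)
```

The penalty term 2k is what keeps the most flexible model from always winning: "both" fits at least as well as any other candidate, but it has to improve the likelihood enough to justify its two extra parameters.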

Performance Improvement in the Multi-Model Based Speech Recognizer for Continuous Noisy Speech Recognition (연속 잡음 음성 인식을 위한 다 모델 기반 인식기의 성능 향상에 대한 연구)

  • Chung, Yong-Joo
    • Speech Sciences
    • /
    • v.15 no.2
    • /
    • pp.55-65
    • /
    • 2008
  • Recently, the multi-model based speech recognizer has been used quite successfully for noisy speech recognition. To select the reference HMM (hidden Markov model) that best matches the noise type and SNR (signal-to-noise ratio) of the input test speech, the multi-model framework has estimated the SNR value using a VAD (voice activity detection) algorithm and classified the noise type using a GMM (Gaussian mixture model) as separate steps. As the SNR estimation process is vulnerable to errors, we propose an efficient method that classifies the SNR values and noise types simultaneously. The KL (Kullback-Leibler) distance between the single Gaussian distributions of the noise signal in training and testing is used for the classification. Recognition experiments on the Aurora 2 database show the usefulness of the model compensation method in the multi-model based speech recognizer. We also found that further performance improvement was achievable by combining the probability density function of MCT (multi-condition training) with that of the reference HMM compensated by D-JA (data-driven Jacobian adaptation) in the multi-model based speech recognizer.


Secure and Robust Clustering for Quantized Target Tracking in Wireless Sensor Networks

  • Mansouri, Majdi;Khoukhi, Lyes;Nounou, Hazem;Nounou, Mohamed
    • Journal of Communications and Networks
    • /
    • v.15 no.2
    • /
    • pp.164-172
    • /
    • 2013
  • We consider the problem of secure and robust clustering for quantized target tracking in wireless sensor networks (WSN), where the observed system is assumed to evolve according to a probabilistic state space model. We propose a new method for jointly activating the best group of candidate sensors that participate in data aggregation, detecting malicious sensors, and estimating the target position. First, we select the appropriate group in order to balance the energy dissipation and to provide the required target data in the WSN. This selection is also based on the transmission power between a sensor node and a cluster head. Second, we detect malicious sensor nodes based on the information relevance of their measurements. Then, we estimate the target position using the quantized variational filtering (QVF) algorithm. The selection of the candidate sensor group is based on a multi-criteria function, which is computed using the predicted target position provided by the QVF algorithm, while the detection of malicious sensor nodes is based on the Kullback-Leibler distance between the current target position distribution and the predicted sensor observation. The performance of the proposed method is validated by simulation results for target tracking in WSN.