Search | Korea Science

The Application of an HMM-based Clustering Method to Speaker Independent Word Recognition (HMM을 기본으로한 집단화 방법의 불특정화자 단어 인식에 응용)

Lim, H.; Park, S.-Y.; Park, M.-W.
- The Journal of the Acoustical Society of Korea / v.14 no.5 / pp.5-10 / 1995
In this paper we present a clustering procedure based on HMMs that yields multiple statistical models capable of absorbing the variation among speakers who pronounce words in different ways. The HMM-clustered models obtained with this technique are applied to speaker-independent isolated word recognition. The HMM clustering method splits off from the training set all observation sequences whose likelihood scores fall below a threshold and creates a new model from the observation sequences in the new cluster. Clustering is iterated by assigning each observation sequence to the cluster whose model gives the maximum likelihood score. If any cluster has changed from the previous iteration, the model in that cluster is re-estimated using the Baum-Welch re-estimation procedure. This method is therefore more efficient than the conventional template-based clustering technique, because it integrates the clustering procedure with parameter estimation. Experimental data show that the HMM-based clustering procedure leads to a $1.43\%$ performance improvement over the conventional template-based clustering method and a $2.08\%$ improvement over the single-HMM method for recognition of isolated Korean digits.
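The split-and-reassign loop described in the abstract above can be sketched roughly as follows. This is only an illustration under strong assumptions: a 1-D Gaussian stands in for each HMM (so `fit_gaussian` replaces Baum-Welch re-estimation), only a single split is performed, and the function names, data, and threshold value are hypothetical rather than taken from the paper.

```python
import math

def fit_gaussian(values):
    """Fit a 1-D Gaussian (assumed stand-in for re-estimating an HMM)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-3)  # variance floor keeps the log-likelihood finite

def log_likelihood(value, model):
    """Log-density of one observation under a (mean, variance) model."""
    mean, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (value - mean) ** 2 / var)

def split_and_recluster(data, threshold, max_iter=20):
    """Split off poorly scored observations once, then reassign by maximum likelihood."""
    models = [fit_gaussian(data)]
    labels = [0] * len(data)
    poor = [i for i, x in enumerate(data) if log_likelihood(x, models[0]) < threshold]
    if not poor or len(poor) == len(data):
        return labels, models  # nothing to split off
    for i in poor:
        labels[i] = 1
    for _ in range(max_iter):
        groups = [[x for x, lab in zip(data, labels) if lab == k] for k in (0, 1)]
        if not all(groups):
            break  # a cluster emptied; keep the last valid models
        # Re-estimate each cluster's model from its current members.
        models = [fit_gaussian(g) for g in groups]
        # Assign every observation to the model that scores it highest.
        new_labels = [max((0, 1), key=lambda k: log_likelihood(x, models[k])) for x in data]
        if new_labels == labels:
            break  # no cluster changed, so the procedure has converged
        labels = new_labels
    return labels, models

# Hypothetical data: two well-separated groups of scalar observations.
data = [0.0, 0.1, -0.1, 0.05, 10.0, 10.1, 9.9]
labels, models = split_and_recluster(data, threshold=-3.0)
```

With real speech data, each observation would be a feature-vector sequence and each model an HMM; the control flow of the loop is the part this sketch is meant to show.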

Kernel Pattern Recognition using K-means Clustering Method (K-평균 군집방법을 이용한 가중커널분류기)

백장선; 심정욱
- The Korean Journal of Applied Statistics / v.13 no.2 / pp.447-455 / 2000
We propose a weighted kernel pattern recognition method using the K-means clustering algorithm to reduce the computation and storage required by the full kernel classifier. This technique finds a set of reference vectors and weights that are used to approximate the kernel classifier. Since the hierarchical clustering method implemented in the Weighted Parzen Window (WPW) classifier cannot rearrange clusters properly, we adopt the K-means algorithm to find reference vectors and weights from more properly rearranged clusters. We find that the proposed method outperforms the WPW method in the representativeness of the reference vectors and in data reduction.

Gene Screening and Clustering of Yeast Microarray Gene Expression Data (효모 마이크로어레이 유전자 발현 데이터에 대한 유전자 선별 및 군집분석)

Lee, Kyung-A; Kim, Tae-Houn; Kim, Jae-Hee
- The Korean Journal of Applied Statistics / v.24 no.6 / pp.1077-1094 / 2011
We perform cluster analyses of yeast cell cycle microarray expression data. To reflect the characteristics of time-course data, we screen the genes using test statistics based on Fourier coefficients together with an FDR procedure. We compare the results of model-based clustering, K-means, PAM, SOM, the hierarchical Ward method, and the fuzzy method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index, and silhouette values are computed and compared. A biological interpretation with GO analysis is also included.
https://doi.org/10.5351/KJAS.2011.24.6.1077
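One of the validity measures named in the abstract above, the Dunn index, can be illustrated with a short sketch. Several variants of the index exist; the version below assumes the common one (smallest distance between points in different clusters divided by the largest within-cluster diameter, with Euclidean distance). The function name and the toy data are hypothetical.

```python
import math
from itertools import combinations

def dunn_index(points, labels):
    """Minimum between-cluster distance divided by maximum within-cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")
    # Largest distance between two points that share a cluster.
    max_diameter = max(
        (math.dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points in different clusters.
    min_separation = min(
        math.dist(a, b)
        for first, second in combinations(clusters.values(), 2)
        for a in first
        for b in second
    )
    if max_diameter == 0.0:
        return math.inf  # every cluster is a single point (or duplicates)
    return min_separation / max_diameter

# Toy example: two compact clusters five units apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]
score = dunn_index(points, labels)
```

Larger values indicate clusters that are compact relative to their separation, which is how such an index would be used to compare the clustering methods listed above.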

Comparison Study of Time Series Clustering Methods (시계열자료 군집방법의 비교연구)

Hong, Han-Woom; Park, Min-Jeong; Cho, Sin-Sup
- The Korean Journal of Applied Statistics / v.22 no.6 / pp.1203-1214 / 2009
In this paper we introduce time series clustering methods in the time and frequency domains and discuss the merits and demerits of each method. We analyze the daily prices of 15 KOSPI 200 stocks, and the nonparametric method using wavelets shows the best clustering results. For clustering nonstationary time series using the spectral density, the EMD method removes the trend more effectively than differencing.
https://doi.org/10.5351/KJAS.2009.22.6.1203

Compositional data analysis by the square-root transformation: Application to NBA USG% data

Jeseok Lee; Byungwon Kim
- Communications for Statistical Applications and Methods / v.31 no.3 / pp.349-363 / 2024
Compositional data are data in which the values of the components sum to a constant, so the sample space is a simplex, which makes it impossible to apply statistical methods developed for the usual Euclidean vector space. A natural way to overcome this restriction is to apply a transformation that moves the sample space onto Euclidean space, and log-ratio type transformations, such as the additive log-ratio (ALR), the centered log-ratio (CLR), and the isometric log-ratio (ILR) transformations, have been the most widely used. However, in scenarios with sparsity, where certain components take exact zero values, these log-ratio type transformations may not be effective. In this work, we suggest an alternative, the square-root transformation, which moves the original sample space onto the directional space. We compare the square-root transformation with the log-ratio type transformations through a simulation study and a real data example. In the real data example, we apply both types of transformations to USG% data obtained from the NBA and use a density-based clustering method, DBSCAN (density-based spatial clustering of applications with noise), to show the result.
https://doi.org/10.29220/CSAM.2024.31.3.349
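The contrast drawn in the abstract above can be made concrete with a small sketch. A composition whose parts sum to 1 is sent by the elementwise square root to a point of unit Euclidean norm (a point on the sphere), and exact zeros pose no difficulty, whereas the centered log-ratio is undefined for a zero part. The function names and the sample values are illustrative assumptions, not taken from the paper.

```python
import math

def close(parts):
    """Rescale nonnegative parts so that they sum to 1 (a composition)."""
    total = sum(parts)
    if total <= 0:
        raise ValueError("parts must have a positive sum")
    return [p / total for p in parts]

def sqrt_transform(composition):
    """Elementwise square root: maps the simplex onto part of the unit sphere."""
    return [math.sqrt(p) for p in composition]

def clr_transform(composition):
    """Centered log-ratio; undefined when any part is exactly zero."""
    if any(p <= 0 for p in composition):
        raise ValueError("CLR is undefined for zero parts")
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical usage shares with one exact zero.
usage = close([30.0, 0.0, 50.0, 20.0])
point = sqrt_transform(usage)
norm = math.sqrt(sum(v * v for v in point))  # equals 1 up to rounding

try:
    clr_transform(usage)
    clr_failed = False
except ValueError:
    clr_failed = True  # the zero part breaks the log-ratio transform
```

Because the squared coordinates of `point` are the original parts, its norm is 1 by construction, which is why the transformed data can be treated as directional data.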

Tree-structured Clustering for Mixed Data (혼합형 데이터에 대한 나무형 군집화)

Yang Kyung-Sook; Huh Myung-Hoe
- The Korean Journal of Applied Statistics / v.19 no.2 / pp.271-282 / 2006
The aim of this study is to propose a tree-structured clustering method for mixed data. We suggest a scaling method to reduce the variable selection bias among categorical variables. In numerical examples, including credit data and German credit data, we note several differences between tree-structured clustering and K-means clustering.
https://doi.org/10.5351/KJAS.2006.19.2.271

Bootstrapping and DNA Marker Mining of ILSTS098 Microsatellite Locus in Hanwoo Chromosome 2

Lee, Jea-Young; Kwon, Jae-Chul
- Communications for Statistical Applications and Methods / v.13 no.3 / pp.525-535 / 2006
We describe tests for detecting and locating quantitative trait loci (QTL) for traits in Hanwoo. LOD scores and a permutation test are described. From the results of a permutation test to detect QTL, we select major DNA markers of the ILSTS098 microsatellite locus in Hanwoo chromosome 2 for further analysis. K-means clustering analysis applied to four traits and eight DNA markers in ILSTS098 resulted in three cluster groups. We conclude that the major DNA markers of BMS1167 microsatellite locus in Hanwoo chromosome 2 are markers 105bp, 113bp and 115bp. Finally, a bootstrap testing method is adapted to calculate confidence intervals and to find major DNA markers.
https://doi.org/10.5351/CKSS.2006.13.3.525

L1-penalized AUC-optimization with a surrogate loss

Hyungwoo Kim; Seung Jun Shin
- Communications for Statistical Applications and Methods / v.31 no.2 / pp.203-212 / 2024
The area under the ROC curve (AUC) is one of the most common criteria used to measure the overall performance of binary classifiers across a wide range of machine learning problems. In this article, we propose an L₁-penalized AUC-optimization classifier that directly maximizes the AUC for high-dimensional data. To this end, we employ an AUC-consistent surrogate loss function and combine it with the L₁-norm penalty, which enables us to estimate coefficients and select informative variables simultaneously. In addition, we develop an efficient optimization algorithm that adopts k-means clustering and proximal gradient descent, which offers computational advantages in obtaining solutions for the proposed method. Numerical simulation studies demonstrate that the proposed method shows promising performance in terms of prediction accuracy, variable selectivity, and computational cost.
https://doi.org/10.29220/CSAM.2024.31.2.203
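Two standard building blocks behind the abstract above can be sketched in a few lines: the empirical AUC, computed as the fraction of positive-negative pairs that a score ranks correctly, and the soft-thresholding step that proximal gradient descent applies for an L₁ penalty. The abstract does not state which surrogate loss or step sizes the authors use, so none is implemented here; the function names and sample numbers are hypothetical.

```python
import math

def empirical_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

def soft_threshold(weights, step):
    """Proximal operator of step * ||w||_1: shrink each weight toward zero by `step`."""
    return [math.copysign(max(abs(w) - step, 0.0), w) for w in weights]

# Hypothetical classifier scores and binary labels.
auc = empirical_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])

# Hypothetical coefficients after a gradient step, shrunk with step 0.5.
shrunk = soft_threshold([1.5, -0.2, 0.7], 0.5)
```

Coefficients whose magnitude is below the step are set exactly to zero, which is the mechanism by which an L₁ penalty performs variable selection. The number of pairs grows with the product of the class sizes, which gives some intuition for why the authors look for computational savings.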

Design of Hierarchically Structured Clustering Algorithm and its Application (계층 구조 클러스터링 알고리즘 설계 및 그 응용)

Bang, Young-Keun; Park, Ha-Yong; Lee, Chul-Heui
- Journal of Industrial Technology / v.29 no.B / pp.17-23 / 2009
In many cases, clustering algorithms have been used to extract and discover useful information from non-linear data, and they have greatly affected the performance of systems that deal with such data. This paper therefore presents a new approach, called the hierarchically structured clustering algorithm, and applies it to a prediction system for non-linear time series data. The proposed algorithm (called HCKA: Hierarchical Cross-correlation and K-means clustering Algorithms), which combines cross-correlation and k-means clustering, can capture the correlation of non-linear time series as well as their statistical characteristics. First, the optimal differences of the data are generated, which can suitably reveal the characteristics of non-linear time series. Second, the generated differences are classified into upper clusters for their predictors by the cross-correlation clustering algorithm, and then each group of classified differences is classified again into lower fuzzy sets by the k-means clustering algorithm. As a result, the proposed method gives an efficient classification and improves prediction performance. Finally, we demonstrate the effectiveness of the proposed HCKA on typical time series examples.

Load Forecasting using Hierarchical Clustering Method for Building (계층적 군집분석방법을 활용한 건물 부하의 전력수요예측)

Hwang, Hye-Mi; Lee, Sung-Hee; Park, Jong-Bae; Park, Yong-Gi; Son, Sung-Yong
- The Transactions of The Korean Institute of Electrical Engineers / v.64 no.1 / pp.41-47 / 2015
In recent years, cases of energy supply that take advantage of an EMS (Energy Management System) have been increasing, driven by strong interest in energy efficiency. An important factor in efficient and economical EMS operation is the supply and demand plan. This study forecasts the hourly power demand of a building load using the hierarchical clustering method, one of a variety of statistical techniques, and uses real historical data of the target load. The reliability of the estimated results is confirmed through separate validity tests.
https://doi.org/10.5370/KIEE.2015.64.1.041

Search Result 231, Processing Time 0.033 seconds
