• Title/Summary/Keyword: spectral similarity measure


A Max-Flow-Based Similarity Measure for Spectral Clustering

  • Cao, Jiangzhong;Chen, Pei;Zheng, Yun;Dai, Qingyun
    • ETRI Journal
    • /
    • v.35 no.2
    • /
    • pp.311-320
    • /
    • 2013
  • In most spectral clustering approaches, the Gaussian kernel-based similarity measure is used to construct the affinity matrix. However, such a similarity measure does not work well on a dataset with a nonlinear and elongated structure. In this paper, we present a new similarity measure to deal with the nonlinearity issue. The maximum flow between data points is computed as the new similarity, which satisfies the requirements that the clustering method places on a similarity measure. Additionally, the new similarity carries both the global and local relations between data points. We apply it to spectral clustering and compare the proposed similarity measure with other state-of-the-art methods on both synthetic and real-world data. The experimental results show the superiority of the new similarity: 1) the max-flow-based similarity measure can significantly improve the performance of spectral clustering; 2) it is robust and not sensitive to the parameters.
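
The idea of using the maximum flow between two points as their similarity can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the graph or its edge capacities are built, so the capacity matrix below is hypothetical, and the Edmonds-Karp algorithm is used simply because it is a standard way to compute a maximum flow.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dense capacity matrix (list of lists)."""
    if source == sink:
        raise ValueError("source and sink must differ")
    n = len(capacity)
    residual = [row[:] for row in capacity]  # residual capacities, updated in place
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            return total  # no augmenting path left
        # Bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck flow along the path.
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical symmetric capacities on a 4-node neighborhood graph.
capacity = [
    [0, 3, 2, 0],
    [3, 0, 1, 2],
    [2, 1, 0, 3],
    [0, 2, 3, 0],
]
similarity_0_3 = max_flow(capacity, 0, 3)
```

Because the flow can travel along several paths at once, two points joined by many strong chains of neighbors receive a high score even when they are far apart in Euclidean terms, which is the behavior one would want on elongated clusters.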

Robust Similarity Measure for Spectral Clustering Based on Shared Neighbors

  • Ye, Xiucai;Sakurai, Tetsuya
    • ETRI Journal
    • /
    • v.38 no.3
    • /
    • pp.540-550
    • /
    • 2016
  • Spectral clustering is a powerful tool for exploratory data analysis. Many existing spectral clustering algorithms typically measure the similarity by using a Gaussian kernel function or an undirected k-nearest neighbor (kNN) graph, which cannot reveal the real clusters when the data are not well separated. In this paper, to improve spectral clustering, we consider a robust similarity measure based on the shared nearest neighbors in a directed kNN graph. We propose two novel algorithms for spectral clustering: one based on the number of shared nearest neighbors, and one based on their closeness. The proposed algorithms are able to explore the underlying similarity relationships between data points, and are robust to datasets that are not well separated. Moreover, the proposed algorithms have only one parameter, k. We evaluated the proposed algorithms using synthetic and real-world datasets. The experimental results demonstrate that the proposed algorithms not only achieve a good level of performance but also outperform the traditional spectral clustering algorithms.
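
A count-based shared-neighbor similarity on a directed kNN graph might look like the sketch below. The normalization by k and the function names are assumptions made for illustration; the abstract does not give the exact weighting the authors use, and the closeness-based variant is not shown.

```python
import math


def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbors (directed kNN)."""
    neighbors = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in ranked[:k]})
    return neighbors


def shared_neighbor_similarity(points, k):
    """Similarity = number of shared nearest neighbors, divided by k (assumed scaling)."""
    nbrs = knn_sets(points, k)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(nbrs[i] & nbrs[j])
            sim[i][j] = sim[j][i] = shared / k
    return sim


# Two small, well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sim = shared_neighbor_similarity(points, k=2)
```

The only parameter is k, in line with the abstract; points in different groups share no neighbors and therefore get a similarity of zero.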

Spectral clustering based on the local similarity measure of shared neighbors

  • Cao, Zongqi;Chen, Hongjia;Wang, Xiang
    • ETRI Journal
    • /
    • v.44 no.5
    • /
    • pp.769-779
    • /
    • 2022
  • Spectral clustering has become a typical and efficient clustering method used in a variety of applications. The critical step of spectral clustering is the similarity measurement, which largely determines the performance of the spectral clustering method. In this paper, we propose a novel spectral clustering algorithm based on the local similarity measure of shared neighbors. This similarity measurement exploits the local density information between data points based on the weight of the shared neighbors in a directed k-nearest neighbor graph with only one parameter k, that is, the number of nearest neighbors. Numerical experiments on synthetic and real-world datasets demonstrate that our proposed algorithm outperforms other existing spectral clustering algorithms in terms of the clustering performance measured via the normalized mutual information, clustering accuracy, and F-measure. As an example, the proposed method can provide an improvement of 15.82% in the clustering performance for the Soybean dataset.

Identification Performance of Low-Molecular Compounds by Searching Tandem Mass Spectral Libraries with Simple Peak Matching

  • Milman, Boris L.;Zhurkovich, Inna K.
    • Mass Spectrometry Letters
    • /
    • v.9 no.3
    • /
    • pp.73-76
    • /
    • 2018
  • The number of matched peaks (NMP) is estimated as the spectral similarity measure in tandem mass spectral library searches of small molecules. In the high resolution mode, NMP provides the same reliable identification as in the case of a common dot-product function. Corresponding true positive rates are (94 ± 3)% and (96 ± 3)%, respectively.
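
To make the two scores concrete, the sketch below counts matched peaks and computes a cosine-type dot product for spectra given as lists of (m/z, intensity) pairs. The greedy matching, the tolerance value, and the unweighted cosine form are assumptions for illustration; the abstract does not specify the matching rule or any intensity weighting.

```python
import math


def match_peaks(spec_a, spec_b, tol=0.01):
    """Greedily pair peaks whose m/z values differ by at most tol; each peak is used once."""
    used = set()
    pairs = []
    for mz_a, int_a in spec_a:
        for idx, (mz_b, int_b) in enumerate(spec_b):
            if idx not in used and abs(mz_a - mz_b) <= tol:
                used.add(idx)
                pairs.append((int_a, int_b))
                break
    return pairs


def number_of_matched_peaks(spec_a, spec_b, tol=0.01):
    """NMP score: how many peaks the two spectra have in common."""
    return len(match_peaks(spec_a, spec_b, tol))


def dot_product_score(spec_a, spec_b, tol=0.01):
    """Cosine of the angle between intensity vectors, using matched peaks only."""
    numerator = sum(a * b for a, b in match_peaks(spec_a, spec_b, tol))
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    return numerator / (norm_a * norm_b)


# Hypothetical query and library spectra.
query = [(100.0, 10.0), (150.0, 5.0), (200.0, 1.0)]
library = [(100.005, 8.0), (150.2, 4.0), (200.0, 2.0)]
nmp = number_of_matched_peaks(query, library)
```

With a tight tolerance, as in high-resolution data, accidental matches become rare, which is one plausible reason a simple peak count can perform close to the intensity-based score.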

A Study on the Unsupervised Change Detection for Hyperspectral Data Using Similarity Measure Techniques (화소간 유사도 측정 기법을 이용한 하이퍼스펙트럴 데이터의 무감독 변화탐지에 관한 연구)

  • Kim Dae-Sung;Kim Yong-Il
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference
    • /
    • 2006.04a
    • /
    • pp.243-248
    • /
    • 2006
  • In this paper, we propose an unsupervised change detection algorithm that applies similarity measure techniques to hyperspectral images. General similarity measures, including Euclidean distance and spectral angle, were compared. The spectral similarity scale algorithm, which reduces the problems of those techniques, was studied and tested with Hyperion data. The thresholds for detecting changed areas were estimated with the EM (Expectation-Maximization) algorithm. The experimental results show that the similarity measure techniques and the EM algorithm can be applied effectively to unsupervised change detection in hyperspectral data.
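
The two baseline measures named in the abstract can be written in a few lines. The sketch below uses the standard definitions of Euclidean distance and spectral angle; the sample pixel spectra are made up, and the spectral similarity scale and the EM thresholding step are not shown.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two pixel spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spectral_angle(a, b):
    """Angle (radians) between two pixel spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.acos(cosine)


# Hypothetical spectra: the second is the first under doubled illumination.
before = [1.0, 2.0, 3.0]
after = [2.0, 4.0, 6.0]
distance = euclidean_distance(before, after)
angle = spectral_angle(before, after)
```

Here the Euclidean distance is large while the spectral angle is essentially zero, since the angle ignores overall brightness and the distance ignores spectral shape. Each measure therefore misses a kind of change the other detects.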


Evaluating the Contribution of Spectral Features to Image Classification Using Class Separability

  • Ye, Chul-Soo
    • Korean Journal of Remote Sensing
    • /
    • v.36 no.1
    • /
    • pp.55-65
    • /
    • 2020
  • Image classification requires comparing the spectral similarity between the spectral features of each pixel and the representative spectral features of each class. The spectral similarity is obtained by computing the spectral feature vector distance between the pixel and the class. Each spectral feature contributes differently to image classification depending on its class separability, which is computed using a suitable vector distance measure such as the Bhattacharyya distance. We propose a method to determine the weight of each spectral feature in the computation of the feature vector distance for the similarity measurement. The weight is determined by the ratio of each feature's separability value to the total separability of all the spectral features. We created ten spectral features consisting of seven bands of a Landsat-8 OLI image and three indices: NDVI, NDWI, and NDBI. For three experimental test sites, we obtained overall accuracies between 95.0% and 97.5% and kappa coefficients between 90.43% and 94.47%.
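
The weighting rule described above (each feature's separability divided by the total) can be sketched as follows. The univariate Gaussian form of the Bhattacharyya distance is standard, but applying it per feature and using the weights inside a weighted Euclidean distance are assumptions here; the abstract does not state the exact distance used.

```python
import math


def bhattacharyya_1d(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two univariate Gaussian class models."""
    mean_term = 0.25 * (mean1 - mean2) ** 2 / (var1 + var2)
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def separability_weights(separabilities):
    """Weight of each feature = its separability / total separability."""
    total = sum(separabilities)
    return [s / total for s in separabilities]


def weighted_distance(pixel, class_mean, weights):
    """Weighted Euclidean distance between a pixel and a class representative (assumed form)."""
    return math.sqrt(sum(w * (p - m) ** 2 for w, p, m in zip(weights, pixel, class_mean)))


# Hypothetical per-feature separability values for two features.
weights = separability_weights([1.0, 3.0])
d = weighted_distance([0.2, 0.8], [0.2, 0.4], weights)
```

Features that separate the classes well dominate the distance, while features with little discriminating power contribute almost nothing.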

Estimating Amino Acid Composition of Protein Sequences Using Position-Dependent Similarity Spectrum (위치 종속 유사도 스펙트럼을 이용한 단백질 서열의 아미노산 조성 추정)

  • Chi, Sang-Mun
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.1
    • /
    • pp.74-79
    • /
    • 2010
  • The amino acid composition of a protein provides basic information for solving many problems in bioinformatics. We propose a new method that uses biologically relevant similarity between amino acids to determine the amino acid composition, where the BLOSUM matrix is exploited to define a similarity measure between amino acids. Furthermore, to extract more information from a protein sequence than conventional methods for determining amino acid composition, we exploit concepts from the spectral analysis of signals such as radar and speech signals, namely time-dependent analysis, time resolution, and frequency resolution. The proposed method was applied to predict the subcellular localization of proteins, and showed significantly improved performance over previous methods for amino acid composition estimation.

Classification of Hyperspectral Images Using Spectral Mutual Information (분광 상호정보를 이용한 하이퍼스펙트럴 영상분류)

  • Byun, Young-Gi;Eo, Yang-Dam;Yu, Ki-Yun
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.15 no.3
    • /
    • pp.33-39
    • /
    • 2007
  • Hyperspectral remote sensing data contain plenty of information about objects, which makes object classification more precise. In this paper, we propose a new spectral similarity measure, called Spectral Mutual Information (SMI), for the hyperspectral image classification problem. It is derived from the concept of mutual information arising in information theory and can be used to measure the statistical dependency between spectra. SMI views each pixel spectrum as a random variable and classifies an image by measuring the similarity between two spectra by analogy with mutual information. The proposed SMI was tested to evaluate its effectiveness by comparing its results with those of preexisting classification methods (SAM, SSV). The evaluation results showed that the proposed approach has good potential for the classification of hyperspectral images.
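
One way to treat two spectra as random variables and measure their statistical dependency is a histogram estimate of mutual information, sketched below. The equal-width binning and the bin count are assumptions for illustration; the abstract does not describe how SMI is actually estimated.

```python
import math
from collections import Counter


def quantize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # a constant spectrum falls into a single bin
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(spec_a, spec_b, bins=8):
    """Histogram estimate of mutual information (in nats) between two spectra."""
    qa, qb = quantize(spec_a, bins), quantize(spec_b, bins)
    n = len(qa)
    joint = Counter(zip(qa, qb))
    count_a, count_b = Counter(qa), Counter(qb)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i, j) * log( p(i, j) / (p(i) * p(j)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_a[i] * count_b[j]))
    return mi


# Hypothetical 4-band spectra.
spectrum = [0.0, 1.0, 2.0, 3.0]
flat = [5.0, 5.0, 5.0, 5.0]
mi_self = mutual_information(spectrum, spectrum, bins=4)
mi_flat = mutual_information(spectrum, flat, bins=4)
```

A spectrum compared with itself yields its own entropy, while a flat spectrum carries no information about any other spectrum and yields zero.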


Cloud Removal Using Gaussian Process Regression for Optical Image Reconstruction

  • Park, Soyeon;Park, No-Wook
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.4
    • /
    • pp.327-341
    • /
    • 2022
  • Cloud removal is often required to construct time-series sets of optical images for environmental monitoring. In regression-based cloud removal, the selection of an appropriate regression model and the impact analysis of the input images significantly affect the prediction performance. This study evaluates the potential of Gaussian process (GP) regression for cloud removal and also analyzes the effects of cloud-free optical images and spectral bands on prediction performance. Unlike other machine learning-based regression models, GP regression provides uncertainty information and automatically optimizes hyperparameters. An experiment using Sentinel-2 multi-spectral images was conducted for cloud removal in the two agricultural regions. The prediction performance of GP regression was compared with that of random forest (RF) regression. Various combinations of input images and multi-spectral bands were considered for quantitative evaluations. The experimental results showed that using multi-temporal images with multi-spectral bands as inputs achieved the best prediction accuracy. Highly correlated adjacent multi-spectral bands and temporally correlated multi-temporal images resulted in an improved prediction accuracy. The prediction performance of GP regression was significantly improved in predicting the near-infrared band compared to that of RF regression. Estimating the distribution function of input data in GP regression could reflect the variations in the considered spectral band with a broader range. In particular, GP regression was superior to RF regression for reproducing structural patterns at both sites in terms of structural similarity. In addition, uncertainty information provided by GP regression showed a reasonable similarity to prediction errors for some sub-areas, indicating that uncertainty estimates may be used to measure the prediction result quality. 
    These findings suggest that GP regression could be beneficial for cloud removal and optical image reconstruction. In addition, the impact analysis results of the input images provide guidelines for selecting optimal images for regression-based cloud removal.

Classification of Time-Series Data Based on Several Lag Windows

  • Kim, Hee-Young;Park, Man-Sik
    • Communications for Statistical Applications and Methods
    • /
    • v.17 no.3
    • /
    • pp.377-390
    • /
    • 2010
  • In time-series analysis, it is often more convenient to work in the frequency domain than in the time domain. Spectral density is the core of frequency-domain analysis and describes the autocorrelation structure of a time-series process. Possible ways to estimate spectral density are to compute a periodogram or to average the periodogram over some frequencies with (un)equal weights. This can be an attractive tool to measure the similarity between time-series processes. We employ the metrics based on a smoothed periodogram proposed by Park and Kim (2008) for the classification of different classes of time-series processes. We consider several lag windows with unequal weights instead of the modified Daniell window used in Park and Kim (2008). We evaluate the performance under various simulation scenarios. Simulation results reveal that the metrics used in this study split the time series into the preassigned clusters better than do the raw-periodogram-based ones proposed by Caiado et al. (2006). Our metrics are applied to an economic time-series dataset.
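
The two estimation routes mentioned above, a raw periodogram and a weighted average of it over neighboring frequencies, can be sketched as follows. The 1/n scaling, the handling of the edges, and the example weights are assumptions for illustration; the specific lag windows and metrics of the cited papers are not reproduced here.

```python
import cmath
import math


def periodogram(series):
    """Raw periodogram at Fourier frequencies k/n for k = 1..n//2 (1/n scaling assumed)."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    values = []
    for k in range(1, n // 2 + 1):
        dft = sum(c * cmath.exp(-2j * math.pi * k * t / n) for t, c in enumerate(centered))
        values.append(abs(dft) ** 2 / n)
    return values


def smooth(values, weights):
    """Weighted moving average; the window is truncated and renormalized at the edges."""
    half = len(weights) // 2
    smoothed = []
    for i in range(len(values)):
        num = den = 0.0
        for offset, w in enumerate(weights):
            j = i + offset - half
            if 0 <= j < len(values):
                num += w * values[j]
                den += w
        smoothed.append(num / den)
    return smoothed


# Hypothetical series: a pure cosine completing two cycles over eight samples.
series = [math.cos(2 * math.pi * 2 * t / 8) for t in range(8)]
raw = periodogram(series)
smoothed = smooth(raw, [1.0, 2.0, 1.0])  # unequal weights, heavier at the center
```

Smoothing trades frequency resolution for lower variance. A distance between two smoothed periodograms, for example a Euclidean distance, can then serve as a dissimilarity between the underlying time series.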