• Title/Summary/Keyword: Cluster Validity Index

VS-FCM: Validity-guided Spatial Fuzzy c-Means Clustering for Image Segmentation

  • Kang, Bo-Yeong;Kim, Dae-Won
    • International Journal of Fuzzy Logic and Intelligent Systems, v.10 no.1, pp.89-93, 2010
  • This paper proposes a new fuzzy clustering approach to the color clustering problem. To address the limitations of the traditional FCM algorithm, we propose a spatial homogeneity-based FCM algorithm. Moreover, a cluster validity index is employed to automatically determine the number of clusters for a given image. We refer to this method as the VS-FCM algorithm. The effectiveness of the proposed method is demonstrated through various clustering examples.
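
  The idea of letting a validity index choose the number of clusters can be sketched as follows. This is only an illustration under stated assumptions: it runs plain (non-spatial) fuzzy c-means on 1-D data and scores each candidate count with the Xie-Beni index, which the abstract does not name. The function names `fcm_1d`, `xie_beni`, and `pick_cluster_count` are hypothetical and are not taken from the paper.

```python
def fcm_1d(data, c, m=2.0, iters=100, tol=1e-9):
    """Plain fuzzy c-means on 1-D data (c >= 2); returns (centers, memberships)."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic start: centers spread evenly over the sorted data.
    centers = [xs[i * (n - 1) // (c - 1)] for i in range(c)]
    u = []
    for _ in range(iters):
        u = []
        for x in data:
            d2 = [(x - v) ** 2 for v in centers]
            if 0.0 in d2:
                # Point coincides with a center: share full membership there.
                hits = d2.count(0.0)
                row = [1.0 / hits if d == 0.0 else 0.0 for d in d2]
            else:
                # Standard update: membership proportional to d^(-2/(m-1)).
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(inv)
                row = [w / total for w in inv]
            u.append(row)
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            new_centers.append(sum(wk * x for wk, x in zip(w, data)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def xie_beni(data, centers, u, m=2.0):
    """Xie-Beni index (assumed choice): compactness / separation, lower is better."""
    compact = sum(row[i] ** m * (x - v) ** 2
                  for x, row in zip(data, u)
                  for i, v in enumerate(centers))
    sep = min((a - b) ** 2
              for i, a in enumerate(centers) for b in centers[i + 1:])
    return float("inf") if sep == 0.0 else compact / (len(data) * sep)


def pick_cluster_count(data, candidates, m=2.0):
    """Run FCM for each candidate count and keep the one with the lowest index."""
    scores = {c: xie_beni(data, *fcm_1d(data, c, m), m) for c in candidates}
    return min(scores, key=scores.get), scores


data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
best, scores = pick_cluster_count(data, range(2, 5))
print(best)
```

  On this toy data, with three well-separated groups, the scan over 2 to 4 clusters should favor three. A real image segmentation pipeline would work on pixel color vectors and add the spatial term the abstract describes.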

Comparison of the Cluster Validation Techniques using Gene Expression Data (유전자 발현 자료를 이용한 군집 타당성분석 기법 비교)

  • Jeong, Yun-Kyoung;Baek, Jang-Sun
    • 한국데이터정보과학회:학술대회논문집, 2006.04a, pp.63-76, 2006
  • Several clustering algorithms for analyzing gene expression data, and cluster validation techniques that assess the quality of their outcomes, have been suggested, but these validation techniques have seldom been evaluated. In this paper we compare various cluster validity indices on simulated data and real genomic data, and find that Dunn's index is more effective and robust in small simulations and with real gene expression data.
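
  For readers unfamiliar with Dunn's index, a minimal sketch follows. It assumes the common definition (smallest distance between points in different clusters divided by the largest cluster diameter, with Euclidean distance); the paper may use a variant. The function name `dunn_index` is hypothetical.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Dunn index: min inter-cluster distance / max cluster diameter (higher is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    # Largest pairwise distance inside any single cluster (the diameter).
    max_diam = max(
        (math.dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points that belong to different clusters.
    min_sep = min(
        math.dist(a, b)
        for c1, c2 in combinations(list(clusters.values()), 2)
        for a in c1
        for b in c2
    )
    return math.inf if max_diam == 0.0 else min_sep / max_diam


points = [(0, 0), (0, 1), (5, 0), (5, 1)]
print(dunn_index(points, [0, 0, 1, 1]))  # compact, well-separated partition
print(dunn_index(points, [0, 1, 0, 1]))  # poor partition of the same points
```

  The first labeling scores higher than the second because its clusters are tight and far apart, which is the behavior a validity index is meant to reward.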

A Cluster Validity Index for Fuzzy Clustering (퍼지 클러스터링의 타당성 평가 기준)

  • 권순학
    • Proceedings of the Korean Institute of Intelligent Systems Conference, 1998.10a, pp.83-89, 1998
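  • In this paper, we propose a new cluster validity index for fuzzy clustering that suppresses the monotonically decreasing tendency that validity indices exhibit as the number of fuzzy clusters increases. We also examine the properties of the proposed index and discuss how it differs from existing fuzzy cluster validity indices. Finally, we demonstrate the usefulness of the proposed index through simulation experiments on several typical data sets that are frequently cited in the fuzzy clustering literature.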

Comparison of time series clustering methods and application to power consumption pattern clustering

  • Kim, Jaehwi;Kim, Jaehee
    • Communications for Statistical Applications and Methods, v.27 no.6, pp.589-602, 2020
  • The development of smart grids has enabled the easy collection of a large amount of power data. Power consumption data share common patterns, which makes clustering useful when analyzing power big data. In this paper, clustering analysis is based on distance functions for time series and on clustering algorithms that discover patterns in power consumption data. We use 10 distance measures to find clusters that reflect the characteristics of time series data. A simulation study is carried out to compare the distance measures for clustering. Cluster validity measures such as error rate, similarity index, Dunn index, and silhouette values are also calculated and compared. Real power consumption data are then clustered with the five distance measures that performed better than the others in the simulation.

A Method of Data Hiding in a File System by Modifying Directory Information

  • Cho, Gyu-Sang
    • Journal of the Korea Society of Computer and Information, v.23 no.8, pp.85-93, 2018
  • This research proposes a method to hide data by modifying directory index entry information. It consists of two methods: directory list hiding and file contents hiding. The directory list hiding method prevents the list of files from appearing in the file explorer window or the command prompt window. The file names of several index entries are modified so that they are duplicated; if the duplicated files are deleted, only the original file is deleted, and the modified files are retained intact in the MFT entry. The fact that these files are hidden is therefore not exposed. The file contents hiding method allocates the data to be hidden on an empty, unused index record page. If many files are created in a directory, several 4KB index records are allocated, and NTFS leaves the empty index records unchanged after the files are deleted. By modifying the run-list of the index record with the cluster number of the file-to-hide, the contents of the file-to-hide are hidden in the index record. When the proposed method is applied to a case of hiding two files, the file lists are not exposed in the file explorer or the command prompt window, and the contents of the file-to-hide are hidden in the empty index record. The results show that the proposed method is effective and valid.

Non-destructive evaluation and pattern recognition for SCRC columns using the AE technique

  • Du, Fangzhu;Li, Dongsheng
    • Structural Monitoring and Maintenance, v.6 no.3, pp.173-190, 2019
  • Steel-confined reinforced concrete (SCRC) columns feature highly complex and invisible mechanisms that make damage evaluation and pattern recognition difficult. In the present article, the prevailing acoustic emission (AE) technique was applied to monitor and evaluate the damage process of steel-confined RC columns in a quasi-static test. AE energy-based indicators, such as index of damage and relax ratio, were proposed to trace the damage progress and quantitatively evaluate the damage state. The fuzzy C-means algorithm successfully discriminated the AE data of different patterns, validity analysis guaranteed cluster accuracy, and principal component analysis simplified the datasets. A detailed statistical investigation on typical AE features was conducted to relate the clustered AE signals to micro mechanisms and the observed damage patterns, and differences between steel-confined and unconfined RC columns were compared and illustrated.

The Development of A Standardized Test of Science Inquiry Skills : Interpreting and Analyzing Data for Eighth Grade Students (과학 탐구능력 측정을 위한 표준화 검사지 개발 - 중학교 2학년의 자료 분석과 해석 능력을 중심으로 -)

  • Lee, Youne-Woo;Woo, Jong-Ok
    • Journal of The Korean Association For Science Education, v.11 no.1, pp.59-72, 1991
  • This study formed a clear definition of the elements of inquiry skills (inference, determining relationships, causal explanation, and prediction) and created the assessment goals and assessment items. Six professors of science education checked the validity, objectivity, and clarity of the items. Two field trials were then carried out to check the discriminating power, the difficulty index, and the effectiveness of the distracters, and the items were modified accordingly. The test developed in this way was administered to 1060 eighth-grade students, randomly cluster-sampled from the population, and standardized. The test is an aptitude test as well as a norm-referenced test. It has twenty items, and the testing time is thirty minutes. The content validity is 85%, the objectivity of the answer keys 91.7%, the mean item difficulty 68.8%, the mean discriminating power 0.39, the standard deviation 3.31, and the reliability (K-R 20) 0.69. Because it is a standardized test, it can diagnose students' well-developed and poorly developed skills and monitor the development of those skills.
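
  The K-R 20 reliability reported above is the Kuder-Richardson Formula 20 for tests with right/wrong items. A minimal sketch follows, assuming the population variance of total scores (some texts use the sample variance instead) and a small made-up response matrix that is not from the study. The function name `kr20` is hypothetical.

```python
def kr20(responses):
    """Kuder-Richardson 20 for a matrix of 0/1 item scores (rows = examinees)."""
    n = len(responses)          # number of examinees
    k = len(responses[0])       # number of items
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Population variance of total scores (an assumed convention).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum over items of p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


# Illustrative data only: 4 examinees, 3 items.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(toy))
```

  For the toy matrix the total-score variance is 1.25 and the item p*q terms sum to 0.625, so the coefficient works out to 0.75.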

Clustering load patterns recorded from advanced metering infrastructure (AMI로부터 측정된 전력사용데이터에 대한 군집 분석)

  • Ann, Hyojung;Lim, Yaeji
    • The Korean Journal of Applied Statistics, v.34 no.6, pp.969-977, 2021
  • We cluster the electricity consumption of households in A-apartment in Seoul, Korea, using the Hierarchical K-means clustering algorithm. The data are recorded from the advanced metering infrastructure (AMI), and we focus on electricity consumption during weekday evenings in summer. Compared to conventional clustering algorithms, the Hierarchical K-means clustering algorithm has only recently been applied to electricity usage data, and it can identify usage patterns while reducing dimension. We apply the Hierarchical K-means algorithm to the AMI data and compare the results using various clustering validity indexes. The results show that the electricity usage patterns are well identified, and the approach is expected to serve as a major basis for future applications in various fields.

Clustering Validity of Social Network Subgroup Using Attribute Similarity (속성유사도에 따른 사회연결망 서브그룹의 군집유효성)

  • Yoon, Han-Seong
    • Journal of Korea Society of Digital Industry and Information Management, v.17 no.1, pp.75-84, 2021
  • In big data analysis, social networks are increasingly utilized through relational data, that is, the connection characteristics between entities such as people and objects. When relational data do not exist directly, a social network can be configured by calculating relational data such as attribute similarity from the attribute data of entities and using it as links. This paper suggests a method for composing a social network that uses the attribute similarity between entities as the connection relationship, together with a clustering method that uses subgroups of the configured network, and evaluates the clustering effectiveness of the results. The analysis results can vary depending on the type and characteristics of the data to be analyzed, the type of attribute similarity selected, and the criterion value. In addition, the clustering effectiveness may not be consistent across evaluation methods. Selection and experimentation are therefore necessary to obtain better analysis results. This dependence on the analysis target and on the clustering options is a limitation of the study. Further work is also needed to compare the performance of the method in this paper with conventional methods such as k-means.

Development of Korean Patient Classification System for Neonatal Care Nurses (한국형 신생아중환자간호 분류도구 개발)

  • Yu, Mi;Kim, Dong Yeon;Yoo, Cheong Suk
    • Journal of Korean Clinical Nursing Research, v.22 no.2, pp.205-216, 2016
  • Purpose: This study was performed to develop a valid and reliable Korean Patient Classification System for Neonatal care nurses (KPCSN). Methods: The study was conducted in tertiary and general hospitals graded 1 to 2 under the nursing fee differentiation policy for NICU (neonatal intensive care unit) nurse staffing. Reliability was evaluated on the classification of 218 patients by 10 nurse managers and 56 staff nurses working in NICUs at 10 hospitals. To verify construct validity, 208 patients were classified and compared by type of stay, gestational age, birth weight, and current body weight. Nursing time was measured by nurses, nurse managers, and nurse aides. To calculate the conversion index (total nursing time divided by the KPCSN score), 426 patients were classified using the KPCSN. Data were collected from September 5 to October 28, 2015, and analyzed using t-test, ANOVA, intraclass correlation coefficient, and non-hierarchical cluster analysis. Results: The final KPCSN consisted of 11 nursing categories, 71 nursing activities, and 111 criteria. The reliability of the KPCSN was r=.83 (p<.001). The construct validity was established. The KPCSN score was classified into four groups: group 1: ≤57 points, group 2: 58~80 points, group 3: 81~108 points, and group 4: ≥109 points. The conversion index was calculated as 7.45 minutes per classification score point. Conclusion: The KPCSN can be utilized to measure specific and complex nursing demands for infants receiving care in NICUs.
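
  The score bands and conversion index reported above can be turned into a small worked example. This is a sketch only: the thresholds and the 7.45 minutes-per-point figure come from the abstract, while the function names `kpcsn_group` and `estimated_nursing_minutes` and the assumption of whole-number scores are illustrative.

```python
# Conversion index reported in the abstract: minutes of nursing time per score point.
CONVERSION_INDEX_MIN_PER_POINT = 7.45


def kpcsn_group(score):
    """Map a KPCSN score to its group (1-4) using the bands in the abstract."""
    if score <= 57:
        return 1
    if score <= 80:
        return 2
    if score <= 108:
        return 3
    return 4


def estimated_nursing_minutes(score):
    """Estimate nursing time as score times the conversion index."""
    return score * CONVERSION_INDEX_MIN_PER_POINT


for s in (57, 58, 108, 109):
    print(s, kpcsn_group(s), round(estimated_nursing_minutes(s), 2))
```

  Because the conversion index is defined as total nursing time divided by the KPCSN score, multiplying a patient's score by it gives a rough time estimate, not a measured value.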