• Title/Abstract/Keywords: View Clustering

100 search results (processing time: 0.028 s)

Recovering Module View of Software Architecture using Community Detection Algorithm

  • 김정민;이찬근
    • Journal of Software Engineering Society / Vol. 25, No. 4 / pp. 69-74 / 2012
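  • This paper compares software clustering techniques with community detection techniques to show that community detection algorithms can be applied to the architecture module recovery process. It also analyzes the correlations and differences between the values produced by representative clustering algorithms and community detection algorithms and the modules into which each divides the system. On this basis, we present several grounds for using community detection algorithms to recover the module view of a software architecture and, by comparing existing clustering results with the results of community detection algorithms, show how the two sets of results relate to each other.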

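The abstract above does not say which community detection algorithm or quality measure was used. As an illustration only, the sketch below computes Newman modularity, a score that many community detection algorithms optimize, for a candidate module partition of a small dependency graph. The function name `modularity`, the graph `EDGES`, and the two partitions are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, partition):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, one per dependency.
    partition: dict mapping each node to a community (module) label.
    """
    edges = list(edges)
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if partition[u] == partition[v]:
            internal[partition[u]] += 1
    total_degree = defaultdict(int)  # sum of node degrees per community
    for node, d in degree.items():
        total_degree[partition[node]] += d
    return sum(internal[c] / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)


# Hypothetical dependency graph: two tightly coupled groups joined by one edge.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
TWO_MODULES = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
ONE_MODULE = {node: 0 for node in "abcdef"}

print(modularity(EDGES, TWO_MODULES))
print(modularity(EDGES, ONE_MODULE))
```

A partition that follows the two dense groups scores higher than putting every file into one module, which is the intuition behind treating recovered modules as communities.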

Transactions Clustering based on Item Similarity

  • 이상욱;김재련
    • Korea Intelligent Information Systems Society: Conference Proceedings / Korea Intelligent Information Systems Society 2002 Fall Regular Conference / pp. 250-257 / 2002
  • Clustering is a data mining method that discovers interesting data distributions in very large databases. In traditional data clustering, the similarity of a cluster of objects is measured by the pairwise similarity of the objects in that cluster. In view of the nature of clustering transactions, we devise in this paper a novel measurement called item similarity and use it to perform clustering. With this item similarity measurement, we develop an efficient clustering algorithm for target marketing in each group.


Multi-view Clustering by Spectral Structure Fusion and Novel Low-rank Approximation

  • Long, Yin;Liu, Xiaobo;Murphy, Simon
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 3 / pp. 813-829 / 2022
  • In multi-view subspace clustering, a critical problem is how to integrate the complementary information across views into a unified representation. Existing works usually construct the unified representation in the original data space. However, when the data representations of the views are very diverse, a unified representation derived directly in the original data domain may incur a large information loss. To address this issue, and unlike existing works, we construct the unified representation in the spectral embedding domain, motivated by the recent finding that the data in all views share a very similar spectral block structure. Because this structure is highly consistent across views, the complementary information can be fused into a unified representation with little information loss. In addition, to capture the global structure of the data in each view both accurately and robustly, we propose a novel low-rank approximation based on a tight lower bound on the rank function. Finally, experimental results show that the proposed method is both effective and robust compared with state-of-the-art approaches.

ASVMRT: Materialized View Selection Algorithm in Data Warehouse

  • Yang, Jin-Hyuk;Chung, In-Jeong
    • Journal of Information Processing Systems / Vol. 2, No. 2 / pp. 67-75 / 2006
  • To obtain a precise and quick response to an analytical query, proper selection of the views to materialize in the data warehouse is crucial. In traditional view selection algorithms, all relations are considered for selection as materialized views. However, materializing all relations rather than a subset results in much worse performance in terms of time and space costs. Therefore, we present an improved algorithm that uses clustering to select the views to materialize and so overcomes this problem of conventional view selection algorithms. In the presented algorithm, ASVMRT (Algorithm for Selection of Views to Materialize using Reduced Table), we first generate reduced tables in the data warehouse using clustering based on attribute-value density, and then consider combinations of reduced tables as materialized views instead of combinations of the original base relations. To justify the proposed algorithm, we report experimental results in which both time and space costs are approximately 1.8 times better than those of conventional algorithms.

Customer Load Pattern Analysis using Clustering Techniques

  • 유승형;김홍석;오도은;노재구
    • KEPCO Journal on Electric Power and Energy / Vol. 2, No. 1 / pp. 61-69 / 2016
  • Understanding load patterns and classifying customers are basic steps in analyzing the behavior of electricity consumers. To that end, there has been much research on clustering customers' daily load data. The deployment of advanced metering infrastructure (AMI) and big-data technologies now makes it easier to study customers' load data. In this paper, we study load clustering from the viewpoint of yearly and daily load patterns. We compare four clustering methods: K-means clustering, hierarchical clustering (average linkage and Ward's method), and DBSCAN (Density-Based Spatial Clustering of Applications with Noise). We also discuss the relationship between the clustering results and the Korean Standard Industrial Classification (KSIC), which is one possible label for customers' load data. We find that hierarchical clustering with Ward's method is suitable for clustering load data, and that KSIC is well characterized by the daily load pattern but not by the yearly load pattern.
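To make the idea of clustering daily load patterns concrete, here is a minimal sketch of K-means, one of the four methods the abstract compares, applied to made-up load profiles. The data, the peak normalization step, and the choice of initial centroids are all assumptions for illustration; the abstract does not describe the paper's preprocessing, and it reports that Ward's method, not K-means, suited the real data best.

```python
def normalize(profile):
    """Scale a load profile by its peak so that shape, not size, is compared."""
    peak = max(profile)
    return [x / peak for x in profile]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, iterations=20):
    """Plain K-means (Lloyd's algorithm) from caller-supplied initial centroids."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda k: sq_dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical loads (kW) at four times of day: night, morning, afternoon, evening.
LOADS = [
    [10, 40, 45, 12],   # daytime-heavy customer
    [20, 85, 90, 25],   # daytime-heavy customer
    [5, 22, 20, 6],     # daytime-heavy customer
    [30, 10, 12, 35],   # night-heavy customer
    [60, 18, 20, 66],   # night-heavy customer
    [9, 3, 4, 10],      # night-heavy customer
]
POINTS = [normalize(p) for p in LOADS]
LABELS, CENTROIDS = kmeans(POINTS, [POINTS[0], POINTS[3]])
print(LABELS)
```

Normalizing by peak lets customers of very different sizes fall into the same cluster when their daily shapes match, which is one common reason to preprocess load data before clustering.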

Linear Discriminant Clustering in Pattern Recognition

  • Sun, Zhaojia;Choi, Mi-Seon;Kim, Young-Kuk
    • The Institute of Electronics Engineers of Korea: Conference Proceedings / The Institute of Electronics Engineers of Korea 2008 Summer Conference / pp. 717-718 / 2008
  • Fisher Linear Discriminant (FLD) is a simple and intuitive linear feature extraction method in pattern recognition. However, FLD does not work well in some special cases, such as when the classes are not separable or when the data of one class are dispersed into several clusters. In this paper, we propose a new discriminant, the K-means Fisher Linear Discriminant, which combines FLD with K-means clustering. It handles such cases efficiently, possessing not only FLD's global-view merit but also K-means' local-view property. Finally, simulation results demonstrate its advantage over K-means and FLD used individually.


Recovery of Software Module-View using Dependency and Author Entropy of Modules

  • 김정민;이찬근;이기성
    • Journal of KIISE / Vol. 44, No. 3 / pp. 275-286 / 2017
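  • In this study, we propose a new software clustering technique that recovers the software module view using module dependencies and author entropy. The technique first clusters software modules based on structural and logical dependency information, and then uses the author entropy of each module to relocate some selected modules from the clustering result. To evaluate the proposed technique, we applied it to open-source projects whose ground-truth module views are known and computed MoJoFM values. Comparing these with the MoJoFM values of previously studied module-view recovery techniques, we showed that the proposed technique is more effective for recovering the software module view.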
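The abstract does not define author entropy precisely. A minimal sketch, assuming it is the Shannon entropy of how a module's changes are distributed among its authors, could look like this. The function name `author_entropy` and the sample commit lists are hypothetical, and the rule the paper uses to pick which modules to relocate is not shown.

```python
import math
from collections import Counter


def author_entropy(authors):
    """Shannon entropy (in bits) of the author distribution of one module.

    authors: list with one author name per change (e.g. per commit).
    """
    counts = Counter(authors)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical change histories for three modules.
SINGLE_OWNER = ["kim"] * 4
TWO_OWNERS = ["kim", "lee", "kim", "lee"]
FOUR_OWNERS = ["kim", "lee", "park", "choi"]

print(author_entropy(SINGLE_OWNER))
print(author_entropy(TWO_OWNERS))
print(author_entropy(FOUR_OWNERS))
```

Under this assumption, a module maintained by one person has zero entropy, while a module touched evenly by many people has high entropy.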

Comparisons on Clustering Methods: Use of LMS Log Variables on Academic Courses

  • Jo, Il-Hyun;PARK, Yeonjeong;SONG, Jongwoo
    • Educational Technology International / Vol. 18, No. 2 / pp. 159-191 / 2017
  • Academic analytics guides university decision-makers in assigning limited resources more effectively. In particular, clustering diverse academic courses by their usage patterns and levels on a Learning Management System (LMS) helps in understanding instructors' pedagogical approaches and the level of technology integration. Further, the clustering results can help decide the proper range and level of financial and technical support. However, among the diverse analytic methodologies, clustering analysis methods often provide different results. The purpose of this study is to present implications from using three different clustering analyses: the Gaussian Mixture Model, K-means clustering, and hierarchical clustering. As a case, we clustered academic courses in higher education based on their LMS usage levels and patterns using these three techniques. We analyzed 2,639 courses opened during the fall 2013 semester at a large private university in South Korea, using 13 observation variables that represent the characteristics of academic courses. The results show the strengths and weaknesses of each clustering analysis and suggest that academic leaders and university staff should examine LMS usage levels and patterns in more detail and take an integrated approach, using different analytic methods, to their strategic decisions on LMS development.

Analysis of Document Clustering Varing Cluster Centroid Decisions

  • 오형진;변동률;이신원;박순철;정성종;안동언
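    • The Institute of Electronics Engineers of Korea: Conference Proceedings / The Institute of Electronics Engineers of Korea 2002 Summer Conference Proceedings (3) / pp. 99-102 / 2002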
  • The K-means algorithm is a very popular clustering technique used in information retrieval. In this paper, we address the K-means algorithm from the viewpoint of centroid creation and suggest a method that reflects document features and considers the context of each document when determining new centroids. For the experiment, we used an automatic document summarizer to summarize the Reuters-21578 newswire test dataset and achieved a 20% improvement in recall.


A Design of Fuzzy Classifier with Hierarchical Structure

  • 안태천;노석범;김용수
    • Journal of Korean Institute of Intelligent Systems / Vol. 24, No. 4 / pp. 355-359 / 2014
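  • This paper proposes a fuzzy pattern classifier that combines fuzzy models with a simple consequent structure into a hierarchical structure. The hierarchical fuzzy pattern classifier is built from fuzzy models with a simple consequent structure so that the structural complexity of the overall classifier does not increase. To partition the input space hierarchically, we use Fuzzy C-Means clustering, a representative fuzzy clustering algorithm. To analyze the substructure of the partitioned fuzzy input space, we use conditional Fuzzy C-Means clustering. A hierarchical pattern classifier is then designed by applying simply structured fuzzy pattern classifiers to the hierarchically partitioned fuzzy input space. Combining the fuzzy models hierarchically makes it possible to analyze the information in the input space starting from a macroscopic view and proceeding to the details. Various machine learning datasets were used to evaluate the performance of the proposed fuzzy pattern classifier.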
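For readers unfamiliar with Fuzzy C-Means, the sketch below shows the standard membership computation that gives each point a degree of belonging to every cluster center. It covers only this one step of ordinary Fuzzy C-Means; the conditional variant and the hierarchical classifier described in the abstract are not shown. The function name `fcm_memberships`, the fuzzifier default `m=2.0`, and the sample centers are assumptions for illustration.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard Fuzzy C-Means membership of one point in each cluster.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the distance
    from the point to center i and m > 1 is the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centers]
    zeros = [d == 0 for d in dists]
    if any(zeros):
        # The point sits exactly on one or more centers: share membership there.
        n = sum(zeros)
        return [1.0 / n if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
            for d_i in dists]


# Hypothetical 2-D example: the point is three times closer to the first center.
CENTERS = [(0.0, 0.0), (4.0, 0.0)]
U = fcm_memberships((1.0, 0.0), CENTERS)
print(U)
```

Unlike K-means, which assigns each point to exactly one cluster, the memberships here sum to 1 across clusters, so a point near a boundary contributes to several partitions of the input space.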