• Title/Summary/Keyword: Co-Clustering

Researcher Clustering Technique based on Weighted Researcher Network (가중치 정보를 가진 연구자 네트워크 기반의 연구자 클러스터링 기법)

  • Mun, Hyeon Jeong;Lee, Sang Min;Woo, Yong Tae
    • Journal of Korea Society of Digital Industry and Information Management / v.5 no.2 / pp.1-11 / 2009
  • This study presents the HCWS algorithm for grouping researchers on a weighted researcher network. The weights represent the intensity of connections among researchers, based on the number of co-authors and the number of co-authored research papers. To confirm the validity of the proposed technique, an experiment was conducted on about 80 research papers. The results show that the HCWS algorithm produces more realistic clusters than the HCS algorithm, which represents semantic relations among researchers as simple connections. The HCWS algorithm also addresses a problem of the existing HCS algorithm: researchers are disconnected because their connections are classified as weak even though they are strong, and vice versa. The technique described in this paper can be applied to efficiently establish social networks of researchers that take into account relations such as collaboration histories, or to create communities of researchers.

PathTalk: Interpretation of Microarray Gene-Expression Clusters in Association with Biological Pathways

  • Chung, Tae-Su;Chung, Hee-Joon;Kim, Ju-Han
    • Genomics & Informatics / v.5 no.3 / pp.124-128 / 2007
  • Microarray technology enables us to measure the expression of tens of thousands of genes simultaneously under various experimental conditions. Clustering analysis is one of the most successful methods for analyzing microarray data, based on the assumption that co-expressed genes may be co-regulated. It is important to extract meaningful clusters from a long unordered list of clusters and to evaluate the functional homogeneity and heterogeneity of clusters. Many quality measures for clustering results have been suggested for different conditions. In the present study, we consider biological pathways as a collection of biological knowledge and use them as a reference for measuring the quality of clustering results and functional homogeneities. PathTalk visualizes and evaluates functional relationships between gene clusters and biological pathways.

Cluster Analysis-based Approach for Manufacturing Cell Formation (제조 셀 구현을 위한 군집분석 기반 방법론)

  • Shim, Young Hak;Hwang, Jung Yoon
    • Journal of Korean Society of Industrial and Systems Engineering / v.36 no.1 / pp.24-35 / 2013
  • A cell formation approach based on cluster analysis is developed for the configuration of manufacturing cells. Cell formation, which groups machines and parts into machine cells and their associated part families, is implemented to add flexibility and efficiency to manufacturing systems. To obtain an efficient clustering procedure, this paper proposes a cluster analysis-based approach that incorporates and modifies two cluster analysis methods: a hierarchical clustering method and a non-hierarchical clustering method. The objective of the proposed approach is to minimize intercellular movements and maximize machine utilization within clusters. The proposed approach is tested on cell formation problems and compared with other well-known methodologies available in the literature. The results show that the proposed approach is efficient enough to yield a good-quality solution regardless of the difficulty of the data set, whether ill-structured or well-structured.

Moving Object Tracking Using Co-occurrence Features of Objects (이동 물체의 상호 발생 특징정보를 이용한 동영상에서의 이동물체 추적)

  • Kim, Seongdong;Chin, Seongah;Choo, Moonwon
    • Journal of Intelligence and Information Systems / v.8 no.2 / pp.1-13 / 2002
  • In this paper, we propose an object tracking system that identifies the moving regions corresponding to objects in sequences of color images and determines the moving directions of pedestrians or vehicles. For a static camera, we suggest a new evaluation method that extracts co-occurrence matrices with RGB feature vectors after analyzing and blocking the difference images of motion entering the camera's field of view. The features are energy, entropy, contrast, maximum probability, inverse difference moment, and correlation of the RGB color vectors. We describe how to analyze and compute the correspondence of objects between adjacent frames. In the clustering phase, we apply the fuzzy c-means (FCM) algorithm to the matching and clustering problems of adjacent frames, using the energy and entropy feature vectors obtained in the previous phase. In the matching phase, we also propose a method for determining the correspondence relation that tracks the motion of each object: it clusters similar areas, computes object centers, and clusters around them when the objects are the same, based on the membership function of the motion areas of adjacent frames.
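
The co-occurrence features named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a single quantized channel (each RGB channel would be handled separately), a hypothetical pixel offset `(dx, dy)`, and the common textbook definitions of each feature. Correlation is omitted for brevity, and the function names are made up for this example.

```python
import math

def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy).

    `image` is a 2D list of integer gray levels in range(levels).
    Assumes at least one valid pixel pair exists for the offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]

def texture_features(p):
    """Energy, entropy, contrast, maximum probability, inverse difference moment."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "max_probability": max(v for _, _, v in cells),
        "inverse_difference_moment": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }
```

Feature vectors built this way for each motion block could then be passed to a clustering step such as fuzzy c-means, which is not shown here.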

Nearest neighbor and validity-based clustering

  • Son, Seo H.;Seo, Suk T.;Kwon, Soon H.
    • International Journal of Fuzzy Logic and Intelligent Systems / v.4 no.3 / pp.337-340 / 2004
  • The clustering problem can be formulated as the problem of finding the number of clusters and a partition matrix for a given data set using iterative or non-iterative algorithms. The authors propose a nearest neighbor and validity-based clustering algorithm in which each data point is linked with its nearest neighbor to form initial clusters, and each initial cluster is then linked with its nearest neighbor cluster to form a new cluster. Linking between clusters continues until no more linking is possible. An optimal set of clusters is identified using a conventional cluster validity index. Experimental results on well-known data sets are provided to show the effectiveness of the proposed clustering algorithm.
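
The first step described in this abstract, linking each point to its nearest neighbor to form initial clusters, can be sketched as below. This is only an illustration under stated assumptions: squared Euclidean distance, ties broken by lowest index, and a union-find structure to collect linked points. The later cluster-to-cluster linking and the validity index are not reproduced, and the function name is hypothetical.

```python
def nearest_neighbor_clusters(points):
    """Label points by the groups formed when each point links to its nearest neighbor.

    `points` is a list of equal-length numeric tuples with at least two entries.
    Returns a list of integer cluster labels, numbered in order of first appearance.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    for i in range(n):
        # Nearest other point; min() keeps the lowest index on ties.
        j = min((k for k in range(n) if k != i),
                key=lambda k: sq_dist(points[i], points[k]))
        parent[find(i)] = find(j)

    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]
```

Because every point links to exactly one neighbor, each initial cluster contains at least two points, so the number of initial clusters is at most half the number of points.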

Projection Pursuit K-Means Visual Clustering

  • Kim, Mi-Kyung;Huh, Myung-Hoe
    • Journal of the Korean Statistical Society / v.31 no.4 / pp.519-532 / 2002
  • K-means clustering is a well-known method for partitioning multivariate observations. The method has recently been widely implemented in data mining software because of its computational efficiency in handling large data sets. However, it does not yield a suitable visual display of multivariate observations, which is especially important in the exploratory stage of data analysis. The aim of this study is to develop a K-means clustering method that enables visual display of multivariate observations in a low-dimensional space, for which the projection pursuit method is adopted. We propose a computationally inexpensive and reliable algorithm and provide two numerical examples.

Reorganizing Social Issues from R&D Perspective Using Social Network Analysis

  • Shun Wong, William Xiu;Kim, Namgyu
    • Journal of Information Technology Applications and Management / v.22 no.3 / pp.83-103 / 2015
  • The rapid development of internet technologies and social media over the last few years has generated a huge amount of unstructured text data, which contains a great deal of valuable information and issues. As a result, text mining, the extraction of meaningful information from unstructured text data, has gained attention from many researchers in various fields. Topic analysis is a text mining application used to determine the main issues in a large volume of text documents. However, it is difficult to identify related issues or meaningful insights because the number of issues derived through topic analysis is too large. Furthermore, traditional issue-clustering methods can only be performed based on the co-occurrence frequency of issue keywords across many documents. Consequently, an association between issues with a low co-occurrence frequency cannot be recognized by traditional issue-clustering methods, even if those issues are strongly related from other perspectives. This research therefore proposes a methodology to reorganize social issues from a research and development (R&D) perspective using social network analysis. Using an R&D perspective lexicon, issues that consistently share the same R&D keywords can be further identified through social network analysis. In this study, the R&D keywords associated with a particular issue imply the key technology elements needed to solve that issue. Issue clustering can then be performed based on the analysis results. Furthermore, the relationships between issues that share the same R&D keywords can be reorganized more systematically by grouping them into clusters according to the R&D perspective lexicon. We expect that our methodology will contribute to establishing efficient R&D investment policies at the national level by enhancing the reusability of R&D knowledge, based on issue clustering using the R&D perspective lexicon. In addition, companies could use the results to align their R&D with their business strategy plans, helping them develop innovative products and new technologies that sustain innovative business models.

Clustering Based Adaptive Power Control for Interference Mitigation in Two-Tier Femtocell Networks

  • Wang, Hong;Song, Rongfang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.4 / pp.1424-1441 / 2014
  • Two-tier femtocell networks, consisting of a conventional cellular network underlaid with femtocell hotspots, play an important role in the indoor coverage and capacity of cellular networks. However, with universal frequency reuse, cross-tier and co-tier interference causes an unacceptable quality of service (QoS) for users. In this paper, we propose a novel downlink interference mitigation strategy for spectrum-shared two-tier femtocell networks. The proposed solution is composed of three parts. The first is femtocell clustering, which maximizes the distance between femtocells using the same slot resource to mitigate co-tier interference. The second is the assignment of macrocell users (MUEs) to clusters by a max-min criterion, by which each MUE can avoid using the same resource as the nearest femtocell. The third is a novel adaptive power control scheme in which the femtocell downlink transmit power is adjusted adaptively based on the signal to interference plus noise ratio (SINR) level of neighboring users. Simulation results show that the proposed scheme can effectively increase the successful transmission ratio and ergodic capacity of femtocells while guaranteeing the QoS of the macrocell.

A Study on Technology Forecasting based on Co-occurrence Network of Keyword in Multidisciplinary Journals (다학제 분야 학술지의 주제어 동시발생 네트워크를 활용한 기술예측 연구)

  • Kim, Hyunuk;Ahn, Sang-Jin;Jung, Woo-Sung
    • Journal of the Korean Operations Research and Management Science Society / v.40 no.4 / pp.49-63 / 2015
  • Keywords indexed in multidisciplinary journals show trends in science and technology innovation. Nature and Science were selected as the multidisciplinary journals for our analysis. To reduce the effect of plural keyword forms, a stemming algorithm was applied. We then fitted the growth curve of each keyword (stem) to the Bass model, a well-known model of diffusion processes. The Bass model is useful for expressing growth patterns by assuming innovative and imitative activities in the spread of innovation. In addition, we constructed a keyword co-occurrence network and calculated network measures such as centrality indices and the local clustering coefficient. Based on the network metrics and the yearly frequency of keywords, time series analysis was conducted to test for statistical causality between these measures. In some cases, the local clustering coefficient seems to Granger-cause the yearly frequency of a keyword. We expect that the local clustering coefficient could be a supportive indicator of emerging science and technology.
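
The local clustering coefficient on a keyword co-occurrence network, as mentioned in this abstract, can be sketched as follows. The sketch assumes an unweighted, undirected network in which two keywords are linked if they appear on the same paper, and it uses the standard definition (links among a node's neighbors divided by the number of possible links). The function names and the example keywords in the usage note are hypothetical.

```python
from itertools import combinations

def cooccurrence_graph(keyword_sets):
    """Build an undirected adjacency map: keywords sharing a paper are linked."""
    adj = {}
    for keywords in keyword_sets:
        unique = sorted(set(keywords))
        for k in unique:
            adj.setdefault(k, set())
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def local_clustering(adj, node):
    """Fraction of pairs of the node's neighbors that are themselves linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # Convention: undefined cases are reported as 0.
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))
```

For example, with papers tagged `{"graphene", "carbon", "nanotube"}` and `{"graphene", "laser"}`, the keyword `graphene` has three neighbors of which only one pair is linked, so its coefficient is 1/3.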

Empirical Comparison of Word Similarity Measures Based on Co-Occurrence, Context, and a Vector Space Model

  • Kadowaki, Natsuki;Kishida, Kazuaki
    • Journal of Information Science Theory and Practice / v.8 no.2 / pp.6-17 / 2020
  • Word similarity is often measured to enhance system performance in information retrieval and related areas. This paper reports on an experimental comparison of values for word similarity measures computed for 50 intentionally selected words from a Reuters corpus. Three types of measure were targeted: (1) co-occurrence-based similarity measures (for which the co-occurrence frequency is counted as the number of documents or sentences), (2) context-based distributional similarity measures obtained from latent Dirichlet allocation (LDA), nonnegative matrix factorization (NMF), and the Word2Vec algorithm, and (3) similarity measures computed from the tf-idf weights of each word according to a vector space model (VSM). The Pearson correlation coefficient was highest for the pair consisting of the VSM-based similarity measures and the co-occurrence-based similarity measures counted by number of documents. Group-average agglomerative hierarchical clustering was also applied to the similarity matrices computed by the individual measures. An evaluation of the cluster sets against an answer set revealed that the VSM-based and LDA-based similarity measures performed best.
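
Two of the measure families compared in this abstract can be illustrated with a small sketch. The abstract does not state the exact formulas, so the choices here are assumptions: the Jaccard coefficient over document sets stands in for a document-level co-occurrence measure, and cosine similarity over raw term-frequency vectors stands in for the tf-idf-weighted VSM measure. Function names are hypothetical.

```python
import math

def jaccard_similarity(docs, w1, w2):
    """Co-occurrence similarity: shared documents over documents containing either word."""
    d1 = {i for i, doc in enumerate(docs) if w1 in doc}
    d2 = {i for i, doc in enumerate(docs) if w2 in doc}
    union = d1 | d2
    return len(d1 & d2) / len(union) if union else 0.0

def cosine_similarity(docs, w1, w2):
    """VSM-style similarity: cosine between the words' per-document count vectors."""
    v1 = [doc.count(w1) for doc in docs]
    v2 = [doc.count(w2) for doc in docs]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    return dot / norm if norm else 0.0
```

Here `docs` is a list of token lists, one per document. Computing either function for every word pair gives a similarity matrix of the kind to which hierarchical clustering could then be applied.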