• Title/Summary/Keyword: Clustering Coefficient

Social Network Analysis for the Effective Adoption of Recommender Systems (추천시스템의 효과적 도입을 위한 소셜네트워크 분석)

  • Park, Jong-Hak;Cho, Yoon-Ho
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.305-316
    • /
    • 2011
  • A recommender system uses automated information filtering technology to recommend products or services to customers who are likely to be interested in them. Such systems are widely used by Web retailers such as Amazon.com, Netflix.com, and CDNow.com. Various recommender systems have been developed. Among them, Collaborative Filtering (CF) is known as the most successful and commonly used approach. CF identifies customers whose tastes are similar to those of a given customer, and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. However, the relative performance of CF algorithms is known to be domain and data dependent. Implementing and launching a CF recommender system is time-consuming and expensive, and a system unsuited to the given domain gives customers poor-quality recommendations that easily annoy them. Therefore, predicting in advance whether the performance of a CF recommender system will be acceptable is of practical importance. In this study, we propose a decision-making guideline that helps decide whether CF is adoptable for a given application with certain transaction data characteristics. Several previous studies reported that sparsity, gray sheep, cold-start, coverage, and serendipity could affect the performance of CF, but theoretical and empirical justification of such factors is lacking. Recently, many studies have paid attention to Social Network Analysis (SNA) as a method for analyzing social relationships among people. SNA measures and visualizes linkage structure and status, focusing on interaction among objects within a communication group. CF analyzes the similarity among previous ratings or purchases of each customer, finds the relationships among customers who have similarities, and then uses those relationships for recommendations.
Thus CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. Under the assumption that SNA can facilitate exploration of the topological properties of the network structure that are implicit in transaction data for CF recommendations, we focus on density, clustering coefficient, and centralization, which are among the most commonly used measures of the topological properties of a social network structure. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. We explore how these SNA measures affect the performance of CF and how they interact with each other. Our experiments used sales transaction data from H department store, one of the well-known department stores in Korea. A total of 396 data sets were sampled to construct various types of social networks. The process of measuring the independent variables consists of three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used UCINET 6.0 for SNA. The experiments used a 3-way ANOVA with the three SNA measures as independent variables and the recommendation accuracy, measured by F1-measure, as the dependent variable. The experiments show that 1) each of the three SNA measures affects the recommendation accuracy, 2) the effect of density on performance overrides those of clustering coefficient and centralization (i.e., CF adoption is not a good decision if the density is low), and 3) even when the density is low, the performance of CF is comparatively good when the clustering coefficient is low.
We expect that these experimental results will help firms decide whether a CF recommender system is adoptable for their business domain with certain transaction data characteristics.
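
The three network measures named in this abstract can be illustrated with a minimal Python sketch. It is not the UCINET 6.0 computation used in the study. It assumes an undirected, unweighted customer network stored as a dictionary of neighbour sets, and it assumes the average local clustering coefficient and Freeman's degree centralization as the concrete definitions; the function names are hypothetical.

```python
from itertools import combinations


def density(adj):
    """Number of links divided by the maximum possible number of links."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # each link is stored twice
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


def degree_centralization(adj):
    """Freeman degree centralization: 1 for a star, 0 when all degrees are equal."""
    n = len(adj)
    if n < 3:
        return 0.0
    degrees = [len(nbrs) for nbrs in adj.values()]
    d_max = max(degrees)
    return sum(d_max - d for d in degrees) / ((n - 1) * (n - 2))


# A star network: links are concentrated in node 0 and no triangles exist.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(density(star), average_clustering(star), degree_centralization(star))
```

A star is maximally centralized but has no local clustering, while a triangle is the opposite, which is why the three measures are treated as separate factors.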

Identification Methodology of FCM-based Fuzzy Model Using Particle Swarm Optimization (입자 군집 최적화를 이용한 FCM 기반 퍼지 모델의 동정 방법론)

  • Oh, Sung-Kwun;Kim, Wook-Dong;Park, Ho-Sung;Son, Myung-Hee
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.60 no.1
    • /
    • pp.184-192
    • /
    • 2011
  • In this study, we introduce an identification methodology for an FCM-based fuzzy model. The two underlying design mechanisms of such networks are the Fuzzy C-Means (FCM) clustering method and Particle Swarm Optimization (PSO). The proposed algorithm uses FCM clustering for efficient processing of data, and the model is optimized using PSO. Because the premise part of the fuzzy rules is built using FCM, it does not rely on any fixed membership functions such as triangular, Gaussian, or ellipsoidal ones. As a result, the proposed model can lead to a compact network architecture. For the consequent part of the fuzzy rules, four types of polynomials can be used: simplified, linear, quadratic, and modified quadratic. In addition, Weighted Least Square Estimation, used to estimate the coefficients of the polynomials that form the consequent parts of the fuzzy model, can decouple each fuzzy rule from the others. Therefore, the local learning capability and interpretability of the proposed fuzzy model are improved. The parameters of the proposed fuzzy model, such as the fuzzification coefficient of FCM clustering, the number of clusters, and the polynomial type of the consequent part of the fuzzy rules, are adjusted using PSO. The proposed model is illustrated using the Automobile Miles per Gallon (MPG) and Boston housing data, known as Machine Learning datasets. A comparative analysis reveals that the proposed FCM-based fuzzy model exhibits higher accuracy and superior predictive capability compared with some previous models available in the literature.
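
To make the roles of the fuzzification coefficient and the number of clusters concrete, here is a minimal sketch of the standard FCM update equations in plain Python. It is not the authors' PSO-tuned model: the fuzzification coefficient `m` and the initial centers are simply passed in as assumptions, whereas the abstract describes PSO selecting them. All function names are hypothetical.

```python
def fcm_memberships(points, centers, m=2.0):
    """Membership u[i][k] of point i in cluster k (each row sums to 1)."""
    u = []
    for x in points:
        d = [sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5 for c in centers]
        if any(dk == 0.0 for dk in d):
            # The point sits exactly on one or more centers: share membership among them.
            hits = [dk == 0.0 for dk in d]
            row = [1.0 / sum(hits) if h else 0.0 for h in hits]
        else:
            p = 2.0 / (m - 1.0)
            row = [1.0 / sum((dk / dj) ** p for dj in d) for dk in d]
        u.append(row)
    return u


def fcm_centers(points, u, m=2.0):
    """Each center is the mean of the points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for k in range(len(u[0])):
        w = [row[k] ** m for row in u]
        total = sum(w)
        centers.append([sum(wi * x[j] for wi, x in zip(w, points)) / total
                        for j in range(dim)])
    return centers


def fcm(points, centers, m=2.0, iters=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iters):
        u = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, u, m)
    return centers, fcm_memberships(points, centers, m)


points = [[0.0], [0.1], [5.0], [5.1]]
centers, u = fcm(points, centers=[[0.0], [5.0]])
print(centers)
```

The rows of `u` play the part of the premise (activation) values of the fuzzy rules, which is why no fixed triangular or Gaussian membership function is needed.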

Fuzzy Inference Systems Based on FCM Clustering Algorithm for Nonlinear Process (비선형 공정을 위한 FCM 클러스터링 알고리즘 기반 퍼지 추론 시스템)

  • Park, Keon-Jun;Kang, Hyung-Kil;Kim, Yong-Kab
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.5 no.4
    • /
    • pp.224-231
    • /
    • 2012
  • In this paper, we introduce a fuzzy inference system based on the fuzzy c-means (FCM) clustering algorithm for fuzzy modeling of nonlinear processes. Typically, the generation of fuzzy rules for nonlinear processes suffers from the problem that the number of fuzzy rules increases exponentially. To solve this problem, the fuzzy rules of the fuzzy model are generated by partitioning the input space in scatter form using the FCM clustering algorithm. The premise parameters of the fuzzy rules are determined by the membership matrix obtained from the FCM clustering algorithm. The consequent part of the rules is expressed in the form of polynomial functions, and the coefficient parameters of each rule are determined by the standard least-squares method. Finally, we evaluate the performance and the nonlinear characteristics using data widely used for nonlinear processes.
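
As an illustration of how premise memberships and polynomial consequents combine, the following sketch evaluates one common form of such a model: the output is the membership-weighted sum of linear rule consequents. The memberships and coefficients are assumed to be given (for example from FCM and least squares); the function name and the numbers are hypothetical and not taken from the paper.

```python
def fuzzy_model_output(x, memberships, coeffs):
    """Weighted sum of linear consequents.

    memberships[k] is the activation of rule k for input x (summing to 1),
    coeffs[k] = [a0, a1, ..., an] defines the consequent a0 + a1*x1 + ... + an*xn.
    """
    y = 0.0
    for u_k, a in zip(memberships, coeffs):
        consequent = a[0] + sum(ai * xi for ai, xi in zip(a[1:], x))
        y += u_k * consequent
    return y


# Two rules, one input variable.
print(fuzzy_model_output([2.0], [0.25, 0.75], [[1.0, 1.0], [0.0, 2.0]]))
```

With a scatter partition the number of rules equals the number of clusters, so it does not grow with the number of input variables the way a grid partition does.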

An Enhanced Spatial Fuzzy C-Means Algorithm for Image Segmentation (영상 분할을 위한 개선된 공간적 퍼지 클러스터링 알고리즘)

  • Truong, Tung X.;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.2
    • /
    • pp.49-57
    • /
    • 2012
  • Conventional fuzzy c-means (FCM) algorithms have achieved good clustering performance. However, they do not fully utilize the spatial information in the image, which results in lower clustering performance for images that have low contrast, vague boundaries, and noise. To overcome this issue, we propose an enhanced spatial fuzzy c-means (ESFCM) algorithm that takes into account the influence of neighboring pixels on the center pixel by assigning weights to the neighbors in a $3 \times 3$ square window. To compare the proposed ESFCM with various FCM-based segmentation algorithms, we used clustering validity functions such as the partition coefficient ($V_{pc}$), partition entropy ($V_{pe}$), and Xie-Beni function ($V_{xb}$). Experimental results show that the proposed ESFCM outperforms other FCM-based algorithms in terms of these clustering validity functions.
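
The three validity functions can be sketched as follows, assuming their usual textbook definitions (the abstract does not spell them out): $V_{pc}$ is the mean of squared memberships, $V_{pe}$ is the mean membership entropy, and $V_{xb}$ is the ratio of compactness to the smallest squared distance between centers. The membership matrix `u`, with one row per pixel or data point, is assumed to come from any FCM variant.

```python
import math


def partition_coefficient(u):
    """V_pc: 1.0 for a crisp partition, 1/c for a completely fuzzy one (higher is better)."""
    return sum(v * v for row in u for v in row) / len(u)


def partition_entropy(u):
    """V_pe: 0.0 for a crisp partition, log(c) for a completely fuzzy one (lower is better)."""
    return -sum(v * math.log(v) for row in u for v in row if v > 0) / len(u)


def xie_beni(points, centers, u):
    """V_xb: membership-weighted compactness over minimum center separation (lower is better)."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    n = len(points)
    compactness = sum(u[i][k] ** 2 * sqdist(points[i], centers[k])
                      for i in range(n) for k in range(len(centers)))
    separation = min(sqdist(a, b)
                     for i, a in enumerate(centers)
                     for j, b in enumerate(centers) if i != j)
    return compactness / (n * separation)


crisp = [[1.0, 0.0], [0.0, 1.0]]
print(partition_coefficient(crisp), partition_entropy(crisp))
print(xie_beni([[1.0], [3.0]], [[0.0], [4.0]], crisp))
```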

Threshold based User-centric Clustering for Cell-free MIMO Network (셀프리 다중안테나 네트워크를 위한 임계값 기반 사용자 중심 클러스터링)

  • Ryu, Jong Yeol;Lee, Woongsup;Ban, Tae-Won
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.1
    • /
    • pp.114-121
    • /
    • 2022
  • In this paper, we consider user-centric clustering in order to guarantee the performance of users in a cell-free multiple-input multiple-output (MIMO) network. In the user-centric clustering scheme, each user uses the large-scale fading coefficients of the connected access points (APs) to form its own cluster from the APs whose large-scale fading coefficients, relative to the highest large-scale fading coefficient, exceed a threshold value. Within the resulting user-centric clusters, the APs design the beamformers and power allocations in a distributed manner and cooperatively transmit data to the users using those beamformers and power allocations. In the simulation results, we verify the performance of user-centric clustering in terms of spectral efficiency and also find the optimal threshold value for the given configuration.
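
The clustering rule can be sketched in a few lines. This is an interpretation of the abstract rather than the paper's exact criterion: it assumes the threshold is a ratio between 0 and 1 applied to each user's strongest large-scale fading coefficient, and the names `beta` and `user_centric_clusters` are hypothetical.

```python
def user_centric_clusters(beta, threshold):
    """Select, for each user, the APs that are strong enough relative to its best AP.

    beta[k][m] is the large-scale fading coefficient between user k and AP m.
    AP m joins user k's cluster when beta[k][m] >= threshold * max(beta[k]).
    """
    clusters = []
    for row in beta:
        strongest = max(row)
        clusters.append([m for m, b in enumerate(row) if b >= threshold * strongest])
    return clusters


beta = [
    [1.0, 0.5, 0.05],  # user 0: AP 2 is far weaker than the best AP
    [0.2, 0.9, 0.6],   # user 1: AP 0 falls below the ratio
]
print(user_centric_clusters(beta, threshold=0.5))  # [[0, 1], [1, 2]]
```

A threshold near 1 leaves each user with only its strongest AP, while a threshold near 0 approaches full cooperation among all APs, which is the trade-off behind searching for an optimal value.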

Similarity of Sampling Sites by Water Quality (수질 관측지점 유사성 측정방법 연구)

  • Kwon, Se-Hyug;Lee, Yo-Sang
    • Communications for Statistical Applications and Methods
    • /
    • v.17 no.1
    • /
    • pp.39-45
    • /
    • 2010
  • As the value placed on the environment increases, water quality has become a matter of interest to the nation and its people. Water quality has been widely studied, but the research has focused on geographical and river characteristics such as inflow, outflow, and the quantity and speed of water. In this paper, two approaches to measuring the similarity of sampling sites using water quality data are discussed and compared on two years of empirical data from Yongdam Dam. The existing method calculates similarities from principal component scores. The approach proposed in this paper uses the correlation matrix of water-quality-related variables and MDS to measure similarity, and it is shown to be better in the sense that it yields a clustering identical to the geographical clustering, since it can take the time series pattern of water quality into account.
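
A rough sketch of the correlation-based idea follows. It assumes each sampling site contributes one time series of a single water-quality variable and converts the Pearson correlation r into a dissimilarity 1 - r; the MDS step that would embed these dissimilarities is omitted, and the site names and values are invented purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_dissimilarity(series):
    """Pairwise dissimilarity 1 - r between sites, keyed by (site, site)."""
    sites = list(series)
    return {(a, b): 1.0 - pearson(series[a], series[b]) for a in sites for b in sites}


# Hypothetical measurements at three sites over four sampling dates.
series = {
    "site_A": [1.0, 2.0, 3.0, 4.0],
    "site_B": [2.0, 4.0, 6.0, 8.0],  # same temporal pattern as site_A
    "site_C": [4.0, 3.0, 2.0, 1.0],  # opposite temporal pattern
}
d = correlation_dissimilarity(series)
print(d[("site_A", "site_B")], d[("site_A", "site_C")])
```

Sites A and B differ in level but share a temporal pattern, so they come out as similar, which is the sense in which a correlation-based measure reflects the time series pattern.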

Genetic Diversity and Population Genetic Structure of Black-spotted Pond Frog (Pelophylax nigromaculatus) Distributed in South Korean River Basins

  • Park, Jun-Kyu;Yoo, Nakyung;Do, Yuno
    • Proceedings of the National Institute of Ecology of the Republic of Korea
    • /
    • v.2 no.2
    • /
    • pp.120-128
    • /
    • 2021
  • The objective of this study was to analyze the genotype of black-spotted pond frog (Pelophylax nigromaculatus) using seven microsatellite loci to quantify its genetic diversity and population structure throughout the spatial scale of basins of Han, Geum, Yeongsan, and Nakdong Rivers in South Korea. Genetic diversities in these four areas were compared using diversity index and inbreeding coefficient obtained from the number and frequency of alleles as well as heterozygosity. Additionally, the population structure was confirmed with population differentiation, Nei's genetic distance, multivariate analysis, and Bayesian clustering analysis. Interestingly, a negative genetic diversity pattern was observed in the Han River basin, indicating possible recent habitat disturbances or population declines. In contrast, a positive genetic diversity pattern was found for the population in the Nakdong River basin that had remained the most stable. Results of population structure suggested that populations of black-spotted pond frogs distributed in these four river basins were genetically independent. In particular, the population of the Nakdong River basin had the greatest genetic distance, indicating that it might have originated from an independent population. These results support the use of genetics in addition to designations strictly based on geographic stream areas to define the spatial scale of populations for management and conservation practices.
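
For readers unfamiliar with how heterozygosity and the inbreeding coefficient follow from allele frequencies, here is a minimal sketch using the standard single-locus definitions. These are assumptions, since the abstract does not state which estimators were used: expected heterozygosity is one minus the sum of squared allele frequencies, and the inbreeding coefficient is one minus the ratio of observed to expected heterozygosity. The numbers are illustrative only.

```python
def expected_heterozygosity(allele_freqs):
    """He = 1 - sum(p_i ** 2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in allele_freqs)


def inbreeding_coefficient(observed_het, expected_het):
    """F = 1 - Ho / He; positive values indicate a deficit of heterozygotes."""
    return 1.0 - observed_het / expected_het


he = expected_heterozygosity([0.5, 0.5])  # two equally frequent alleles
print(he, inbreeding_coefficient(0.4, he))
```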

Community Detection using Closeness Similarity based on Common Neighbor Node Clustering Entropy

  • Jiang, Wanchang;Zhang, Xiaoxi;Zhu, Weihua
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.8
    • /
    • pp.2587-2605
    • /
    • 2022
  • To efficiently detect community structure in complex networks, community detection algorithms can be designed from the perspective of node similarity. However, appropriate parameters must be chosen to achieve community division, and existing algorithms based on the similarity of common neighbors discriminate poorly between node pairs. To solve these problems, a novel community detection algorithm using closeness similarity based on common neighbor node clustering entropy is proposed, abbreviated as CSCDA. Firstly, to improve detection accuracy, common neighbors and clustering coefficient are combined in the form of entropy, and a new closeness similarity measure is proposed. Through the designed similarity measure, the closeness similar node set of each node can be identified more accurately. Secondly, to reduce the randomness of the community detection result, node leadership is used, based on the closeness similar node set, to determine the most closeness similar first-order neighbor node for merging to create the initial communities. Thirdly, to address the difficult problem of parameter selection in existing algorithms, two levels of merging are used to iteratively detect the final communities with the idea of modularity optimization. Finally, experiments show that the normalized mutual information values are increased by an average of 8.06% and 5.94% on two scales of synthetic networks and real-world networks with real communities, and modularity is increased by an average of 0.80% on the real-world networks without real communities.
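
The modularity score that guides the merging step and is reported in the experiments can be computed as in the sketch below, which assumes the standard Newman modularity for an undirected, unweighted graph stored as a dictionary of neighbour sets. It does not reproduce the closeness similarity measure of CSCDA, whose formula is not given in the abstract.

```python
def modularity(adj, communities):
    """Q = sum over communities of (internal links / m) - (degree sum / 2m) ** 2."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # total number of links
    q = 0.0
    for community in communities:
        members = set(community)
        # Every internal link is seen from both of its endpoints, hence the division by 2.
        internal = sum(1 for v in members for w in adj[v] if w in members) / 2
        degree_sum = sum(len(adj[v]) for v in members)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Two triangles joined by the single link 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(modularity(adj, [[0, 1, 2], [3, 4, 5]]))   # the natural split scores well
print(modularity(adj, [[0, 1, 2, 3, 4, 5]]))     # a single community scores 0
```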

Facial Expression Recognition with Fuzzy C-Means Clustering Algorithm and Neural Network Based on Gabor Wavelets

  • Youngsuk Shin;Chansup Chung;Lee, Yillbyung
    • Proceedings of the Korean Society for Emotion and Sensibility Conference
    • /
    • 2000.04a
    • /
    • pp.126-132
    • /
    • 2000
  • This paper presents a facial expression recognition method based on Gabor wavelets that uses a fuzzy C-means (FCM) clustering algorithm and a neural network. Features of facial expressions are extracted in two steps. In the first step, the Gabor wavelet representation provides edge extraction of the major face components using the average value of the image's 2-D Gabor wavelet coefficient histogram. In the next step, we extract sparse features of facial expressions from the extracted edge information using the FCM clustering algorithm. The result of facial expression recognition is compared with dimensional values of internal states derived from semantic ratings of words related to emotion. The dimensional model can recognize not only the six facial expressions related to Ekman's basic emotions, but also expressions of various internal states.

Generalized Network Generation Method for Small-World Network and Scale-Free Network (Small-World 망과 Scale-Free 망을 위한 일반적인 망 생성 방법)

  • Lee, Kang-won;Lee, Jae-hoon;Choe, Hye-zin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.41 no.7
    • /
    • pp.754-764
    • /
    • 2016
  • To understand and analyze SNS (Social Network Service), two important classes of networks, small-world and scale-free networks, have attracted considerable research interest. In this study, a generalized network generation method is developed that can produce a small-world network, a scale-free network, or a network with the properties of both by controlling two input parameters. Tuning one parameter yields the small-world property, and tuning the other yields both scale-free and small-world properties. Clustering coefficient, average shortest path distance, and the power-law property are used as the network measures that represent small-world and scale-free properties. The model proposed in this study gives a clearer understanding of the relationship between small-world and scale-free networks. Using numerical examples, we verify the effects of the two parameters on clustering coefficient, average shortest path distance, and the power-law property. This investigation shows that a small-world network, a scale-free network, or both can be generated by tuning the two input parameters properly.
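
The average shortest path distance used as a small-world measure can be illustrated with a breadth-first search. The sketch below does not implement the paper's two-parameter generator, whose details are not given in the abstract. Instead it assumes a plain ring lattice as the starting network and adds one long-range link by hand to show how a shortcut reduces the average distance; both function names are hypothetical.

```python
from collections import deque


def ring_lattice(n, k):
    """Undirected ring in which each node is linked to its k nearest neighbours on each side."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def average_shortest_path(adj):
    """Mean hop count over all ordered pairs of mutually reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


ring = ring_lattice(6, 1)
before = average_shortest_path(ring)
ring[0].add(3)  # add one long-range shortcut across the ring
ring[3].add(0)
print(before, average_shortest_path(ring))
```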