• Title/Summary/Keyword: Network clustering analysis


An Investigation of the Relationship between Revenue Water Ratio and the Operating and Maintenance Cost of Water Supply Network (상수관망 유수율과 유지관리 비용의 관계 분석)

  • Kim, Jaehee; Yoo, Kwangtae; Jun, Hwandon; Jang, Jaesun
    • Journal of Korean Society on Water Environment / v.28 no.2 / pp.202-212 / 2012
  • Because water supply networks are deteriorating and raw water is scarce, the water utilities of local governments have carried out various projects to improve their revenue water ratio. However, it is very difficult to estimate the cost of keeping the revenue water ratio at the higher level after a project is completed, because the conditions that affect the operating and maintenance cost of a water supply network differ among local governments. The purpose of this study is to present a procedure for estimating the operating and maintenance cost required to maintain the target revenue water ratio of a water supply network. For this purpose, we estimated the cost used only for operation and maintenance of the water supply networks of 164 local governments with the aid of k-means clustering analysis and data from 40 representative local governments. Regression analysis was then performed to find the relationship between the revenue water ratio and the operating and maintenance cost, using two data sets generated by two classification methods: the first classifies the local governments by k-means clustering, and the second classifies them according to an index standardized as the operating and maintenance cost per unit length of water mains per revenue water ratio. The results show that the index-based method produces more reliable regression equations between the revenue water ratio and the operating and maintenance cost of the water supply network alone. The estimated regression equations for each group can be used to estimate the cost required to keep a local government's target revenue water ratio.
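The index and the per-group regression described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the function names, the straight-line model, and all numbers are made up for the example, and the grouping step is assumed to have been done already.

```python
from statistics import mean

def standardized_index(cost, length_km, rwr):
    """O&M cost per unit length of water mains per unit revenue water ratio."""
    return cost / length_km / rwr

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical records for ONE group of local governments:
# (revenue water ratio in %, O&M cost per km of water main).
group = [(60.0, 10.0), (70.0, 14.0), (80.0, 18.0), (90.0, 22.0)]

a, b = fit_line([r for r, _ in group], [c for _, c in group])

# Estimated cost needed to hold a target revenue water ratio of 85 %.
target = 85.0
estimated_cost = a + b * target
```

In practice one such regression would be fitted per group, and the group itself would come from either k-means clustering or from binning on `standardized_index`.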

Design of Fuzzy Clustering-based Neural Networks Classifier for Sorting Black Plastics with the Aid of Raman Spectroscopy (라만분광법에 의한 흑색 플라스틱 선별을 위한 퍼지 클러스터링기반 신경회로망 분류기 설계)

  • Kim, Eun-Hu; Bae, Jong-Soo; Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.7 / pp.1131-1140 / 2017
  • This study is concerned with a design methodology for an optimized fuzzy clustering-based neural network classifier for sorting black plastics. Because the amount of waste plastic increases every year, techniques for recycling waste plastic are receiving more attention. The proposed classifier is based on the architecture of a radial basis function neural network. Its hidden layer is built from FCM (fuzzy c-means) clustering instead of activation functions, while the connection weights are linear functions whose coefficients are estimated by local least squares estimator (LLSE)-based learning. Because the raw dataset collected by Raman spectroscopy contains more than about three thousand variables, principal component analysis (PCA) is applied for dimensionality reduction. In addition, artificial bee colony (ABC), one of the evolutionary algorithms, is used to identify the architecture and parameters of the proposed network. In the experiments, the proposed classifier sorts the three kinds of plastics that are discarded in the largest quantities in the real world. The effectiveness of the proposed classifier is demonstrated by comparing its performance on a dataset obtained from chemical analysis with its performance on the entire dataset extracted directly from Raman spectroscopy.
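To make the role of FCM in the hidden layer more concrete, the sketch below computes standard fuzzy c-means membership degrees of one input for fixed cluster centers. It is only an illustration of the textbook FCM membership formula; the centers, the fuzzifier `m`, and the function name are assumptions, and the paper's PCA, LLSE, and ABC steps are not shown.

```python
import math

def fcm_memberships(x, centers, m=2.0):
    """Membership of point x in each cluster (textbook FCM formula).

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from x to center i and m > 1 is the fuzzifier.
    """
    dists = [math.dist(x, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # x coincides with one or more centers: share membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]

# Hypothetical 2-D input (e.g. two principal components) and two centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

The memberships always sum to 1, so they can play the role that radial basis activations play in an ordinary RBF network.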

Social Network Analysis for the Effective Adoption of Recommender Systems (추천시스템의 효과적 도입을 위한 소셜네트워크 분석)

  • Park, Jong-Hak; Cho, Yoon-Ho
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.305-316 / 2011
  • A recommender system uses automated information filtering technology to recommend products or services to customers who are likely to be interested in them. Such systems are widely used by Web retailers such as Amazon.com, Netflix.com, and CDNow.com. Various recommender systems have been developed. Among them, Collaborative Filtering (CF) is known as the most successful and commonly used approach. CF identifies customers whose tastes are similar to those of a given customer and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. However, the relative performance of CF algorithms is known to be domain and data dependent. Implementing and launching a CF recommender system is very time-consuming and expensive, and a system unsuited to the given domain provides poor-quality recommendations that easily annoy customers. Therefore, predicting in advance whether the performance of a CF recommender system will be acceptable is of practical importance. In this study, we propose a decision-making guideline that helps decide whether CF is adoptable for a given application with certain transaction data characteristics. Several previous studies reported that sparsity, gray sheep, cold-start, coverage, and serendipity could affect the performance of CF, but theoretical and empirical justification of these factors is lacking. Recently, many studies have paid attention to Social Network Analysis (SNA) as a method for analyzing social relationships among people. SNA measures and visualizes the linkage structure and status, focusing on interaction among objects within a communication group. CF analyzes the similarity among the previous ratings or purchases of each customer, finds the relationships among customers who have similarities, and then uses those relationships for recommendations. 
Thus CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. Under the assumption that SNA could facilitate an exploration of the topological properties of the network structure that are implicit in transaction data for CF recommendations, we focus on density, clustering coefficient, and centralization, which are among the most commonly used measures of the topological properties of a social network structure. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. We explore how these SNA measures affect the performance of CF and how they interact with each other. Our experiments used sales transaction data from H department store, one of the well-known department stores in Korea. A total of 396 data sets were sampled to construct various types of social networks. The SNA measures were obtained in three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used UCINET 6.0 for SNA. The experiments used a three-way ANOVA with the three SNA measures as independent variables and the recommendation accuracy, measured by the F1-measure, as the dependent variable. The experiments report that 1) each of the three SNA measures affects the recommendation accuracy, 2) the effect of density on performance overrides those of the clustering coefficient and centralization (i.e., CF adoption is not a good decision if the density is low), and 3) even when the density is low, the performance of CF is comparatively good when the clustering coefficient is low. 
We expect that these experimental results will help firms decide whether a CF recommender system is adoptable for their business domain with certain transaction data characteristics.
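The three SNA measures named in this abstract can be computed on a small undirected customer network as in the sketch below. The study itself used UCINET 6.0; the formulas here are the common textbook ones (average local clustering coefficient, Freeman degree centralization), and the exact variants used by the authors are an assumption. The graph is a made-up example stored as a dict of neighbor sets and must have at least three nodes.

```python
from itertools import combinations

def density(adj):
    """Number of links divided by the maximum possible number of links."""
    n = len(adj)
    edges = sum(len(nb) for nb in adj.values()) / 2
    return 2 * edges / (n * (n - 1))

def average_clustering(adj):
    """Mean local clustering coefficient (nodes with degree < 2 count as 0)."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue
        closed = sum(1 for u, v in combinations(nb, 2) if v in adj[u])
        total += closed / (k * (k - 1) / 2)
    return total / len(adj)

def degree_centralization(adj):
    """Freeman degree centralization: 1 for a star, 0 when all degrees are equal."""
    n = len(adj)
    degrees = [len(nb) for nb in adj.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

# Hypothetical networks: a star (one hub customer) and a triangle.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```

The star is maximally centralized with no clustering, while the triangle is fully dense and fully clustered with no centralization, which shows that the three measures capture different aspects of the same network.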

Link Analysis on Institutional Repository web Network of Indian Institute of Technologies Registered in open DOAR-uncovering Patterns and Trends Hidden in the Network

  • Kumar, Kutty
    • International Journal of Knowledge Content Development & Technology / v.8 no.2 / pp.23-36 / 2018
  • Institutional repositories (IR) promise to be extremely advantageous to scholars, especially in developing countries. IR initiatives started in India during the late nineties, and the concept is rapidly gaining popularity in higher education and research institutions as a way to disseminate newly emerging knowledge and expertise. The purpose of this paper is to critically analyze the network links of the IR websites of four IITs that are registered in the OpenDOAR (Directory of Open Access Repositories) web portal. The institutional repositories chosen for the study are those of IIT Delhi, IIT Hyderabad, IIT Bombay, and IIT Kanpur. The analysis focused on standard graph and network cohesion metrics, such as density, diameter, eccentricity and distances, and clustering coefficient; for a more detailed analysis, advanced centrality measures and fast algorithms such as clique census are used.
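Of the cohesion metrics listed in this abstract, eccentricity and diameter are both derived from shortest-path distances. The sketch below computes them by breadth-first search on a small made-up graph. It assumes an undirected, connected, unweighted graph stored as a dict of neighbor lists; real web link networks are directed and may be disconnected, which this sketch does not handle.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def eccentricities(adj):
    """Eccentricity of a node = distance to the node farthest from it."""
    return {u: max(bfs_distances(adj, u).values()) for u in adj}

def diameter(adj):
    """Diameter = largest eccentricity in the graph."""
    return max(eccentricities(adj).values())

# Hypothetical chain of four pages: 0 - 1 - 2 - 3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ecc = eccentricities(chain)
```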

Efficient and Secure Routing Protocol for Wireless Sensor Networks through SNR Based Dynamic Clustering Mechanisms

  • Ganesh, Subramanian; Amutha, Ramachandran
    • Journal of Communications and Networks / v.15 no.4 / pp.422-429 / 2013
  • Advances in wireless sensor network (WSN) technology have enabled small, low-cost sensors capable of sensing various physical and environmental conditions, processing data, and communicating wirelessly. In a WSN, the sensor nodes have a limited transmission range, and their processing and storage capabilities as well as their energy resources are limited. A triple umpiring system has already been shown to perform better in WSNs. The clustering technique is effective in prolonging the lifetime of the WSN. In this study, we have modified ad-hoc on-demand distance vector routing by incorporating signal-to-noise ratio (SNR) based dynamic clustering. The proposed scheme, an efficient and secure routing protocol for wireless sensor networks through SNR-based dynamic clustering (ESRPSDC) mechanisms, partitions the nodes into clusters and selects the cluster head (CH) among the nodes based on energy, while non-CH nodes join a specific CH based on the SNR values. Error recovery is implemented during inter-cluster routing in order to avoid end-to-end error recovery. Security is achieved by isolating malicious nodes using sink-based routing pattern analysis. Extensive simulation studies using a global mobile simulator have shown that this hybrid ESRP significantly improves the energy efficiency and packet reception rate compared with SNR-unaware routing algorithms such as the low energy aware adaptive clustering hierarchy and power efficient gathering in sensor information systems.
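The clustering step described in this abstract (cluster heads chosen by energy, other nodes joining a head by SNR) might be sketched as follows. This is a simplified illustration only: picking the `k` highest-energy nodes, the dictionary layout, and all values are assumptions, and the routing, error recovery, and security parts of the protocol are not modeled.

```python
def select_cluster_heads(energy, k):
    """Pick the k nodes with the highest residual energy as cluster heads."""
    return sorted(energy, key=energy.get, reverse=True)[:k]

def assign_members(energy, snr, heads):
    """Each non-head node joins the head it hears with the highest SNR.

    snr maps (node, head) pairs to a measured SNR value in dB.
    """
    clusters = {h: [] for h in heads}
    for node in energy:
        if node in clusters:
            continue  # cluster heads do not join another cluster
        best = max(heads, key=lambda h: snr[(node, h)])
        clusters[best].append(node)
    return clusters

# Hypothetical residual energies (joules) and SNR measurements (dB).
energy = {"a": 0.9, "b": 0.4, "c": 0.8, "d": 0.3}
snr = {("b", "a"): 12.0, ("b", "c"): 7.0, ("d", "a"): 5.0, ("d", "c"): 9.0}

heads = select_cluster_heads(energy, k=2)
clusters = assign_members(energy, snr, heads)
```

Because the clustering is dynamic, a real protocol would repeat both steps periodically as residual energy and link quality change.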

Keyword Network Analysis for Technology Forecasting (기술예측을 위한 특허 키워드 네트워크 분석)

  • Choi, Jin-Ho; Kim, Hee-Su; Im, Nam-Gyu
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.227-240 / 2011
  • New concepts and ideas often result from extensive recombination of existing concepts or ideas. Both researchers and developers build on existing concepts and ideas in published papers or registered patents to develop new theories and technologies that in turn serve as a basis for further development. As the importance of patents increases, so does that of patent analysis. Patent analysis is largely divided into network-based and keyword-based analyses. The former lacks the ability to analyze technology information in detail, while the latter is unable to identify the relationships between such technologies. In order to overcome the limitations of network-based and keyword-based analyses, this study blends the two methods and proposes a keyword network based analysis methodology. In this study, we collected significant technology information from each patent related to Light Emitting Diodes (LED) through text mining, built a keyword network, and then executed a community network analysis on the collected data. The results of the analysis are as follows. First, the patent keyword network showed very low density and an exceptionally high clustering coefficient. Technically, density is obtained by dividing the number of ties in a network by the number of all possible ties. The value ranges between 0 and 1, with higher values indicating denser networks and lower values indicating sparser networks. In real-world networks, the density varies with the size of the network; increasing the size of a network generally leads to a decrease in density. The clustering coefficient is a network-level measure that illustrates the tendency of nodes to cluster in densely interconnected modules. This measure shows the small-world property, in which a network can be highly clustered even though it has a small average distance between nodes despite the large number of nodes. 
Therefore, the low density of the patent keyword network means that its nodes are connected sporadically, and the high clustering coefficient shows that nodes in the network are closely connected to one another. Second, the cumulative degree distribution of the patent keyword network, like that of other knowledge networks such as citation or collaboration networks, followed a clear power-law distribution. A well-known mechanism behind this pattern is preferential attachment, whereby a node with more links is likely to attain further new links as the network evolves. Unlike normal distributions, the power-law distribution does not have a representative scale. This means that one cannot pick a representative or average value, because there is always a considerable probability of finding much larger values. Networks with power-law distributions are therefore often referred to as scale-free networks. The presence of a heavy-tailed scale-free distribution is the fundamental signature of an emergent collective behavior of the actors who contribute to forming the network. In our context, the more frequently a patent keyword is used, the more often it is selected by researchers and associated with other keywords or concepts to constitute and convey new patents or technologies. The evidence of a power-law distribution implies that the preferential attachment mechanism explains the origin of heavy-tailed distributions in a wide range of growing patent keyword networks. Third, we found that among the keywords that flowed into a particular field, the vast majority of keywords with new links join existing keywords in the associated community in forming the concept of a new patent. This finding held for both the short-term (4-year) and long-term (10-year) analyses. 
Furthermore, using the keyword combination information that was derived from the methodology suggested by our study enables one to forecast which concepts combine to form a new patent dimension and refer to those concepts when developing a new patent.
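As a small worked illustration of two quantities used in this abstract, the sketch below computes network density (ties divided by possible ties) and the empirical cumulative degree distribution, i.e. the fraction of nodes with degree at least k. The function names and the toy numbers are made up; fitting or testing a power law on the resulting distribution is a separate step that is not shown.

```python
from collections import Counter

def network_density(n_nodes, n_ties):
    """Ties divided by all possible ties in an undirected network."""
    return n_ties / (n_nodes * (n_nodes - 1) / 2)

def cumulative_degree_distribution(degrees):
    """Map each observed degree k to P(K >= k)."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf = {}
    remaining = n  # number of nodes with degree >= current k
    for k in sorted(counts):
        ccdf[k] = remaining / n
        remaining -= counts[k]
    return ccdf

# Hypothetical keyword degrees: many weakly linked keywords, one hub.
degrees = [1, 1, 1, 2, 2, 5]
ccdf = cumulative_degree_distribution(degrees)

# A sparse toy network: 100 keywords joined by only 99 ties.
sparse = network_density(100, 99)
```

On a log-log plot, a power-law degree distribution would make the points of `ccdf` fall roughly on a straight line.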

Data prediction Strategy for Sensor Network Clustering Scheme (센서 네트워크 클러스터링 기법의 데이터 예측 전략)

  • Choi, Dong-Min; Shen, Jian; Moh, Sang-Man; Chung, Il-Yong
    • Journal of Korea Multimedia Society / v.14 no.9 / pp.1138-1151 / 2011
  • Sensor network clustering is an efficient method for prolonging network lifetime. However, when it is applied to an environment in which the data collected by the sensor nodes easily overlap, the sensor nodes consume energy unnecessarily. Accordingly, we propose a data prediction scheme in which a sensor node predicts the current data in order to exclude redundant data transmission and to minimize data transmission between the cluster head node and the member nodes. Our scheme excludes redundant data collection by neighboring nodes, which makes energy-efficient data transmission possible. Moreover, to reduce unnecessary data transmission, we introduce a data prediction graph that determines whether or not to transmit by comparing the predicted data with the current data. According to the performance analysis, our method consumes less energy than the existing clustering method. Even so, transmission efficiency and data accuracy are improved. Consequently, network lifetime is prolonged.
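The idea of suppressing transmissions when the current reading matches a prediction can be sketched as below. The abstract does not specify the predictor, so this example assumes the simplest possible one (the cluster head keeps the last value it received) and a fixed tolerance; both, together with the function name and the readings, are illustrative assumptions rather than the authors' scheme.

```python
def transmit_with_prediction(readings, tolerance):
    """Return the readings a member node actually sends to its cluster head.

    The node and the cluster head are assumed to share the same predictor:
    the predicted value is the last transmitted reading. A reading is sent
    only when it deviates from the prediction by more than the tolerance.
    """
    sent = [readings[0]]        # the first reading is always transmitted
    predicted = readings[0]
    for value in readings[1:]:
        if abs(value - predicted) > tolerance:
            sent.append(value)  # prediction failed: transmit and resync
            predicted = value
        # otherwise the cluster head keeps using the predicted value
    return sent

# Hypothetical temperature readings with one real change.
readings = [20.0, 20.1, 20.2, 23.0, 23.1]
sent = transmit_with_prediction(readings, tolerance=0.5)
```

Here only two of the five readings are transmitted, and the cluster head's view of the data is never off by more than the tolerance.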

Hierarchical and Incremental Clustering for Semi Real-time Issue Analysis on News Articles (준 실시간 뉴스 이슈 분석을 위한 계층적·점증적 군집화)

  • Kim, Hoyong; Lee, SeungWoo; Jang, Hong-Jun; Seo, DongMin
    • The Journal of the Korea Contents Association / v.20 no.6 / pp.556-578 / 2020
  • There have been many studies on how to analyze issues based on real-time news streams. However, few studies analyze issues hierarchically from news articles, and even in a previous study of hierarchical issue analysis, clustering becomes slower as the number of news articles grows. In this paper, we propose hierarchical and incremental clustering for semi-real-time issue analysis of news articles. We trained a weighted cosine similarity model based on a Siamese neural network, applied this model to the k-means algorithm to build word clusters, and converted news articles to document vectors by using these word clusters. Finally, we initialized an issue cluster tree from the document vectors, updated this tree whenever new news articles arrived, and analyzed issues in semi-real time. Through experiments and evaluation, we showed that performance improved by up to about 0.26 in terms of NMI. We also showed that incremental clustering was about 10 times faster than before.
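A weighted cosine similarity of the kind mentioned in this abstract can be written as in the sketch below. In the paper the weights are learned by a Siamese neural network; here they are simply given as a fixed, made-up list, and the specific weighting form (one non-negative weight per dimension) is an assumption.

```python
import math

def weighted_cosine(a, b, w):
    """Cosine similarity where dimension i contributes with weight w[i] >= 0."""
    num = sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))
    norm_a = math.sqrt(sum(wi * ai * ai for wi, ai in zip(w, a)))
    norm_b = math.sqrt(sum(wi * bi * bi for wi, bi in zip(w, b)))
    return num / (norm_a * norm_b)

# Hypothetical word vectors and per-dimension weights.
weights = [0.5, 2.0]
parallel = weighted_cosine([1.0, 2.0], [2.0, 4.0], weights)
orthogonal = weighted_cosine([1.0, 0.0], [0.0, 1.0], weights)
```

With all weights equal to 1 this reduces to the ordinary cosine similarity, so the weights only change how much each dimension matters when k-means compares words.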

CLUSTERING DNA MICROARRAY DATA BY STOCHASTIC ALGORITHM

  • Shon, Ho-Sun; Kim, Sun-Shin; Wang, Ling; Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2007.10a / pp.438-441 / 2007
  • Thanks to recent advances in molecular biology and engineering, DNA microarrays make it possible to observe thousands of genes and their states of variation in tissue samples from living organisms. With DNA microarrays, it is possible to construct groups of genes that have similar expression patterns and to follow the progression and variation of genes. This paper applies cluster analysis, which aims to discover biological subgroups or classes from gene expression information. The purpose of this paper is therefore to predict new, previously unknown classes; publicly available leukaemia data are used for the experiments, and the MCL (Markov CLustering) algorithm is applied as the analysis method. The MCL algorithm is based on probability and graph flow theory. MCL simulates random walks on a graph using Markov matrices to determine the transition probabilities among the nodes of the graph. In more detail, the method first computes pairwise Euclidean distances and then applies the MCL algorithm; next, the inflation and diagonal factors, which are tuning parameters, are tuned; finally, a threshold based on the average of each column is computed to distinguish one class from another. Our method improves the accuracy by using this threshold, namely the average of each column. Our experimental results show an average accuracy of about 70% against the previously known classes. For comparison with other algorithms, the proposed method was also compared with neural-network-based SOM (Self-Organizing Map) clustering and with hierarchical clustering. The method shows better results than hierarchical clustering. Further study is needed to determine whether similar results are obtained when the inflation parameter found in our experiments is applied to other gene expression data. We are also working on a systematic method to improve the accuracy by regulating the factors mentioned above.
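The core loop of MCL (alternating expansion and inflation of a column-stochastic matrix) can be sketched in plain Python as follows. This is the basic textbook form of the algorithm, not the authors' variant: the Euclidean-distance preprocessing and the column-average threshold from the abstract are omitted, the self-loop weight of 1.0 stands in for the "diagonal factor", and the toy graph is made up.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def expand(m):
    """Expansion: square the matrix (one more step of the random walk)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inflate(m, r):
    """Inflation: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[v ** r for v in row] for row in m])

def mcl(adjacency, inflation=2.0, iterations=20):
    """Run basic MCL and return clusters as sorted lists of node indices."""
    n = len(adjacency)
    # Add self-loops (assumed weight 1.0) so every column has a nonzero sum.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Each row with nonzero entries attracts the nodes in those columns.
    clusters = {frozenset(j for j in range(n) if row[j] > 1e-6) for row in m}
    return sorted(sorted(c) for c in clusters if c)

# Hypothetical similarity graph: two separate triangles of samples.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(graph)
```

A larger inflation value generally yields more, smaller clusters, which is why the abstract treats it as a parameter to be tuned per dataset.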


A Model for Developing Urban Innovation Clusters

  • Morse, Sidney
    • World Technopolis Review / v.2 no.2 / pp.81-95 / 2013
  • This paper seeks to build on previous work conducted by Porter, Devol, Florida, Bahrami and Evans, Wennberg and Lindqvist, and others contained in the literature, to construct a new way of looking at innovation cluster development. It seeks to describe the key elements contained in the research that serve as building blocks for innovation clustering, adding analysis dimensions that aim to further illuminate understanding of this process. It compares those building block characteristics to the innovation topography of U.S. urban centers, to shed light on a new framework through which urban innovation cluster formation can be considered. It identifies three building block analysis categories: 1) Technological Capability and Capacity (TCC); 2) Intellectual Propulsion Capacity (IPC); and 3) Structural Creative Inspiration (SCI). These three pillars form the architecture for creation of a Strategic Innovation Network (SIN), upon which clustering can be systematically analysed and built. The purpose of the SIN is to optimally organize and connect all available resources that include physical, financial, and human, such that innovation clustering is inspired, encouraged, nurtured, and ultimately constructed as fully functioning socio-economic organisms that provide both local and regional benefits. It is designed to aid both private enterprise and public policy leaders in their strategic planning considerations, and to enhance urban economic development opportunities.