• Title/Summary/Keyword: Community detection algorithm

Author Graph Generation based on Author Disambiguation (저자 식별에 기반한 저자 그래프 생성)

  • Kang, In-Su
    • Journal of Information Management / v.42 no.1 / pp.47-62 / 2011
  • While the nodes of an ideal author graph should represent authors, automatically generated author graphs mostly use author names as nodes because resolving author names into individuals is difficult. However, using author names as nodes merges namesakes, who should be separate nodes, into a single node, which may distort the characteristics of the author graph. This study proposes an algorithm that resolves author ambiguity based on co-authorship and then yields an author graph whose nodes are authors rather than author names. The scientific collaboration relationships on which the algorithm depends tend to produce clustering results that minimize over-clustering error at the expense of under-clustering error. In experiments, the algorithm is applied to real citation records in which Korean namesakes occur, and the results are discussed.
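
The co-authorship idea in this abstract can be illustrated with a minimal sketch. It assumes, which the abstract does not state, that two records carrying the same author name belong to the same person whenever they share at least one co-author name, applied transitively. The function name `disambiguate` and the sample records are hypothetical and are not the paper's algorithm or data.

```python
def disambiguate(records, name):
    """Group the records that list `name` into presumed individuals.

    records: list of author-name lists, one list per paper.
    Returns a sorted list of clusters, each a list of record indices.
    Assumption (not from the paper): sharing one co-author is enough to merge.
    """
    idx = [i for i, authors in enumerate(records) if name in authors]
    parent = {i: i for i in idx}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # co-author name -> first record index where it appeared
    for i in idx:
        for co in records[i]:
            if co == name:
                continue
            if co in first_seen:
                # Shared co-author: merge the two records' clusters.
                parent[find(i)] = find(first_seen[co])
            else:
                first_seen[co] = i

    clusters = {}
    for i in idx:
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical citation records (author lists only).
records = [
    ["Kim, J", "Lee, S"],
    ["Kim, J", "Lee, S", "Park, H"],
    ["Kim, J", "Choi, M"],
    ["Kim, J", "Park, H"],
    ["Lee, S", "Park, H"],
]
print(disambiguate(records, "Kim, J"))
```

Records with no shared co-author stay in separate clusters, so a rule like this splits one person into several nodes more readily than it merges two people into one, which matches the error trade-off the abstract describes.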

A Bottom-up Algorithm to Find the Densest Subgraphs Based on MapReduce (맵리듀스 기반 상향식 최대 밀도 부분그래프 탐색 알고리즘)

  • Lee, Woonghee;Kim, Younghoon
    • Journal of KIISE / v.44 no.1 / pp.78-83 / 2017
  • Finding the densest subgraphs in social networks, such that the people in a subgraph belong to a particular community or share common interests, has been a recurring problem in numerous studies. However, existing algorithms have focused only on finding the single densest subgraph. We suggest a bottom-up heuristic algorithm that finds the densest subgraph by growing it from a given starting node, repeatedly adding the adjacent node with the maximum degree. Furthermore, since this approach is well suited to parallel processing, we implement a parallel algorithm on the MapReduce framework. In experiments using various graph data, we confirmed that the proposed algorithm finds the densest subgraphs in fewer steps than related studies. It also scales efficiently for many given starting nodes.
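
The bottom-up growth described in this abstract can be sketched on a single machine as follows. This is an illustration under stated assumptions, not the authors' MapReduce implementation: "maximum degree" is read here as the number of neighbours a candidate already has inside the growing subgraph, density is taken as edges divided by nodes, and the names `build_adj` and `grow_dense_subgraph` are hypothetical.

```python
def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def grow_dense_subgraph(adj, start):
    """Grow a subgraph from `start`; return (best_nodes, best_density)."""
    current = {start}
    edges_inside = 0
    best_nodes, best_density = {start}, 0.0
    while True:
        # Frontier: nodes adjacent to the current subgraph but not yet in it.
        frontier = {v for u in current for v in adj[u]} - current
        if not frontier:
            break
        # Assumption: pick the candidate with the most neighbours inside the
        # current subgraph; ties go to the smallest label for determinism.
        pick = max(sorted(frontier), key=lambda v: len(adj[v] & current))
        edges_inside += len(adj[pick] & current)
        current.add(pick)
        density = edges_inside / len(current)
        # Remember the densest prefix seen while growing.
        if density > best_density:
            best_nodes, best_density = set(current), density
    return best_nodes, best_density


# Toy graph: a 4-clique (a, b, c, d) with a two-node tail (d-e, e-f).
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("d", "e"), ("e", "f")]
nodes, density = grow_dense_subgraph(build_adj(edges), "a")
print(sorted(nodes), density)  # ['a', 'b', 'c', 'd'] 1.5
```

Because each run depends only on its starting node, runs for different starting nodes are independent of one another, which is consistent with the abstract's remark that the approach suits parallel processing.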

Multi-scale and Interactive Visual Analysis of Public Bicycle System

  • Shi, Xiaoying;Wang, Yang;Lv, Fanshun;Yang, Xiaohang;Fang, Qiming;Zhang, Li
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3037-3054 / 2019
  • The public bicycle system (PBS) is an emerging and popular mode of public transportation. PBS data can be used to analyze human movement patterns. Previous work usually focused on specific scales and ignored the relationships between different levels of the hierarchy. In this paper, we introduce a multi-scale and interactive visual analytics system to investigate human cycling movement and PBS usage conditions. The system supports level-of-detail exploratory analysis of spatio-temporal characteristics in PBS. Visual views are designed at global, regional and microcosmic scales. For the regional scale, a bicycle network is constructed to model PBS data, and a flow-based community detection algorithm is applied to the bicycle network to determine station clusters. In contrast to the previously used Louvain algorithm, our method avoids producing super-communities and generates better results. We provide two cases to demonstrate how our system can help analysts explore the overall cycling condition in the city and the spatio-temporal aggregation of stations.

Movie Recommendation Algorithm Using Social Network Analysis to Alleviate Cold-Start Problem

  • Xinchang, Khamphaphone;Vilakone, Phonexay;Park, Doo-Soon
    • Journal of Information Processing Systems / v.15 no.3 / pp.616-631 / 2019
  • With the rapid increase of information on the World Wide Web, finding useful information on the internet has become a major problem. Recommendation systems help users make decisions in complex domains where the amount of available data is large. Many methods have been proposed for recommender systems, and collaborative filtering is a popular and widely used one. However, collaborative filtering methods still have some problems, notably the cold-start problem. In this paper, we propose a movie recommendation system that uses social network analysis and collaborative filtering to address this problem. We use personal attributes of users, such as age, gender, and occupation, to build a relationship matrix between users, and the relationship matrix is used to cluster users by community detection based on edge betweenness centrality. The recommendation system then suggests to new users the movies that previously interested users in the same group. We show, using mean absolute error, that the proposed method is very efficient.
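
The evaluation metric named in this abstract, mean absolute error (MAE), is the average absolute difference between actual and predicted ratings. The sketch below shows the standard definition only; the ratings are made-up illustrative values, not results from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical ratings on a 1-5 scale for three movies.
actual = [4, 3, 5]
predicted = [3, 3, 3]
print(mean_absolute_error(actual, predicted))  # 1.0
```

Lower values mean predictions closer to the users' actual ratings.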

Design of an Leader Election Protocol in Mobile Ad Hoc Distributed Systems (분산 이동 시스템에서 선출 프로토콜의 설계)

  • Park, Sung-Hoon
    • The Journal of the Korea Contents Association / v.8 no.12 / pp.53-62 / 2008
  • The election paradigm can be used as a building block in many practical problems, such as group communication, atomic commit, and replicated data management, where a protocol coordinator might be useful. The problem has been widely studied in the research community; one reason for this wide interest is that many distributed protocols need an election protocol. However, despite its usefulness, to our knowledge no work has been devoted to this problem in a mobile ad hoc computing environment. Mobile ad hoc systems are more prone to failures than conventional distributed systems. Solving election in such an environment requires a set of mobile nodes to choose a unique node as leader based on its priority, despite failures or disconnections of mobile nodes. In this paper, we describe a solution to the election problem for mobile ad hoc computing systems. This solution is based on the Group Membership Detection algorithm.

Data anomaly detection for structural health monitoring of bridges using shapelet transform

  • Arul, Monica;Kareem, Ahsan
    • Smart Structures and Systems / v.29 no.1 / pp.93-103 / 2022
  • With the wider availability of sensor technology through easily affordable sensor devices, several Structural Health Monitoring (SHM) systems are deployed to monitor vital civil infrastructure. The continuous monitoring provides valuable information about the health of the structure that can help provide a decision support system for retrofits and other structural modifications. However, when the sensors are exposed to harsh environmental conditions, the data measured by the SHM systems tend to be affected by multiple anomalies caused by faulty or broken sensors. Given a deluge of high-dimensional data collected continuously over time, research into using machine learning methods to detect anomalies is a topic of great interest to the SHM community. This paper contributes to this effort by proposing a relatively new time series representation named "Shapelet Transform" in combination with a Random Forest classifier to autonomously identify anomalies in SHM data. The shapelet transform is a unique time series representation based solely on the shape of the time series data. Considering the individual characteristics unique to every anomaly, the application of this transform yields a new shape-based feature representation that can be combined with any standard machine learning algorithm to detect anomalous data with no manual intervention. For the present study, the anomaly detection framework consists of three steps: identifying unique shapes from anomalous data, using these shapes to transform the SHM data into a local-shape space, and training machine learning algorithms on this transformed data to identify anomalies. The efficacy of this method is demonstrated by the identification of anomalies in acceleration data from an SHM system installed on a long-span bridge in China. The results show that multiple data anomalies in SHM data can be automatically detected with high accuracy using the proposed method.
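
The second step in this abstract, transforming series into a shape-based feature space, can be sketched as follows. The sketch uses the common definition of the distance between a shapelet and a series (the minimum Euclidean distance over all sliding windows) and omits the per-window normalization that many implementations apply. Shapelet discovery and the Random Forest classifier are not shown, and the shapelet and series values are made up for illustration.

```python
import math


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between `shapelet` and any window of `series`."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        d = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, d)
    return best


def shapelet_transform(dataset, shapelets):
    """Map each series to a feature vector of distances, one per shapelet."""
    return [[shapelet_distance(series, s) for s in shapelets] for series in dataset]


# Hypothetical "spike" shapelet and two short series.
spike = [0, 5, 0]
dataset = [
    [0, 0, 0, 0, 0],  # no spike
    [0, 0, 5, 0, 0],  # contains the spike
]
print(shapelet_transform(dataset, [spike]))  # [[5.0], [0.0]]
```

A small distance means the shape occurs somewhere in the series, so the resulting feature vectors can be passed to any standard classifier.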

Improved Sentence Boundary Detection Method for Web Documents (웹 문서를 위한 개선된 문장경계인식 방법)

  • Lee, Chung-Hee;Jang, Myung-Gil;Seo, Young-Hoon
    • Journal of KIISE:Software and Applications / v.37 no.6 / pp.455-463 / 2010
  • In this paper, we present an approach to sentence boundary detection for web documents that builds on statistics-based methods and uses rule-based correction. The proposed system uses a classification model learned offline from a training set of human-labeled web documents. Web documents have many word-spacing errors and frequently lack a punctuation mark that indicates a sentence boundary. As sentence boundary candidates, the proposed method considers every ending Eomi as well as punctuation marks. We optimize engine performance by selecting the best feature, the best training data, and the best classification algorithm. For evaluation, we made two test sets: Set1, consisting of articles and blog documents, and Set2, consisting of web community documents. We use F-measure to compare results on a large variety of tasks. Detecting only periods as sentence boundaries, our basis engine showed 96.5% in Set1 and 56.7% in Set2. We improved our basis engine by adapting features and the boundary search algorithm. For the final evaluation, we compared our adaptation engine with our basis engine in Set2. As a result, the adaptation engine improved on the basis engine by 39.6%. These results demonstrate the effectiveness of the proposed method in sentence boundary detection.

Movie Recommendation System using Community Detection and Parallel Programming (커뮤니티 탐지 및 병렬 프로그래밍을 이용한 영화 추천 시스템)

  • Sadriddinov Ilkhomjon;Yixuan Yang;Sony Peng;Sophort Siet;Dae-Young Kim;Doo-Soon Park
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.389-391 / 2023
  • In the era of Big Data, humanity is facing a huge overflow of information. To overcome this obstacle, many new cutting-edge technologies are being introduced, and the movie recommendation system is one of them. To date, much theoretical and practical research has been conducted. Our research also focuses on the movie recommendation system, implementing methods from Social Network Analysis (SNA) and parallel programming. We applied the Girvan-Newman algorithm to detect communities of users, and the future package to perform the parallelization. This approach not only tries to improve the accuracy of the system but also reduces the execution time. For our experiment, we used the MovieLens dataset.
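
The Girvan-Newman algorithm named in this abstract repeatedly removes the edge with the highest edge betweenness until the graph splits into more components. The sketch below is a plain, sequential illustration for small unweighted, undirected graphs without self-loops. It is not the authors' code and does not include their parallelization, and the function names and toy graph are hypothetical.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist, sigma = {s: 0}, {v: 0 for v in adj}
        sigma[s] = 1
        preds, order, queue = {v: [] for v in adj}, [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Accumulate path dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                share = sigma[u] / sigma[w] * (1.0 + delta[w])
                score[frozenset((u, w))] += share
                delta[u] += share
    # Each node pair was counted from both endpoints, so halve the totals.
    return {e: b / 2.0 for e, b in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        comp, stack = set(), [s]
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u] - seen:
                seen.add(v)
                stack.append(v)
        comps.append(comp)
    return comps


def girvan_newman_split(adj):
    """Remove highest-betweenness edges until the component count increases."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)
        if not scores:
            break  # no edges left to remove
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy graph: two triangles (1-2-3 and 4-5-6) joined by the bridge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(sorted(sorted(c) for c in girvan_newman_split(adj)))  # [[1, 2, 3], [4, 5, 6]]
```

Recomputing betweenness after every removal is the costly part of the algorithm, which is consistent with the abstract's interest in reducing execution time.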

Analysis of Geographic Network Structure by Business Relationship between Companies of the Korean Automobile Industry (한국 자동차산업의 기업간 거래관계에 의한 지리적 네트워크 구조 분석)

  • KIM, Hye-Lim;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies / v.24 no.3 / pp.58-72 / 2021
  • In July 2021, UNCTAD classified Korea as a developed country. After the Korean War in the 1950s, economic development was promoted despite difficult conditions, resulting in epoch-making national growth. However, in order to respond to the rapidly changing global economy, it is necessary to continuously study the domestic industrial ecosystem and prepare strategies for continuous change and growth. This study analyzed the industrial ecosystem of the automobile industry, for which transaction data between companies can be obtained, by applying complexity spatial network analysis. The data comprised 295 companies (node data) and 607 transactions (link data). Geocoding the company addresses to check their spatial distribution showed that automobile industry-related companies were concentrated in the Seoul metropolitan area and the Southeastern (Dongnam) region. Node importance was measured through degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality, and the network structure was examined through density, distance, community detection, and assortativity and disassortativity. As a result, among the automakers, Hyundai Motor, Kia Motors, and GM Korea were included in the top 15 in all 4 indicators of node centrality. In terms of company location, companies located in the Seoul metropolitan area were included in the top 15. In terms of company size, most of the large companies with more than 1,000 employees were included in the top 15 for degree centrality and betweenness centrality. Regarding closeness centrality and eigenvector centrality, most of the companies with 500 or fewer employees were included in the top 15, except for automakers. In the structure of the network, the density was 0.01390522 and the average distance was 3.422481. Community detection using the fast greedy algorithm finally yielded 11 communities.
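
Two of the measures named in this abstract, density and degree centrality, can be sketched with their standard definitions for a simple undirected graph. The abstract does not say which definitions or preprocessing the authors used, so this is an assumption, and the toy graph is hypothetical.

```python
def density(num_nodes, num_edges):
    """Density of a simple undirected graph: 2m / (n * (n - 1))."""
    if num_nodes < 2:
        raise ValueError("density needs at least two nodes")
    return 2 * num_edges / (num_nodes * (num_nodes - 1))


def degree_centrality(adj):
    """Normalized degree centrality: degree / (n - 1) for each node."""
    n = len(adj)
    return {u: len(neighbours) / (n - 1) for u, neighbours in adj.items()}


# Hypothetical supplier network: one hub company linked to three suppliers.
adj = {"hub": {"s1", "s2", "s3"}, "s1": {"hub"}, "s2": {"hub"}, "s3": {"hub"}}
num_edges = sum(len(v) for v in adj.values()) // 2
print(density(len(adj), num_edges))      # 0.5
print(degree_centrality(adj)["hub"])     # 1.0

# With the counts reported in the abstract (295 nodes, 607 links):
print(round(density(295, 607), 7))
```

Under this definition the reported counts give a density of about 0.0140, close to but not identical to the reported 0.01390522. The abstract does not explain how its figure was computed, so the small gap is left unexplained here.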

Mapping Studies on Visual Search, Eye Movement, and Eye track by Bibliometric Analysis

  • Rhie, Ye Lim;Lim, Ji Hyoun;Yun, Myung Hwan
    • Journal of the Ergonomics Society of Korea / v.34 no.5 / pp.377-399 / 2015
  • Objective: The aim of this study is to understand and identify the critical issues in the vision research area using content analysis and network analysis. Background: Vision, the most influential factor in information processing, has been studied in a wide range of areas. As studies on vision are dispersed across a broad area of research and the number of published studies is ever increasing, a bibliometric analysis of the literature would assist researchers in understanding and identifying critical issues in their research. Method: In this study, content and network analysis were applied to the metadata of literature collected using three search keywords: 'visual search', 'eye movement', and 'eye tracking'. Results: Content analysis focuses on extracting meaningful information from the text, deriving seven categories of research area: 'stimuli and task', 'condition', 'measures', 'participants', 'eye movement behavior', 'biological system', and 'cognitive process'. Network analysis extracts the relational aspects of research areas, presenting characteristics of sub-groups identified by a community detection algorithm. Conclusion: Using these methods, studies on vision were quantitatively analyzed, and the results helped clarify the overall relation between concepts and keywords. Application: The results of this study suggest that the use of content and network analysis helps identify not only trends in specific research areas but also the relational aspects of each research issue while minimizing researchers' bias. Moreover, the investigated structural relationship would help identify interrelated subjects from a macroscopic view.