• Title/Summary/Keyword: Code Clustering

Comparison of graph clustering methods for analyzing the mathematical subject classification codes

  • Choi, Kwangju;Lee, June-Yub;Kim, Younjin;Lee, Donghwan
    • Communications for Statistical Applications and Methods / v.27 no.5 / pp.569-578 / 2020
  • Various graph clustering methods have been introduced to identify communities in social or biological networks. This paper studies entropy-based and Markov chain-based methods for clustering undirected graphs. We compare the performance of these two methods with that of conventional methods using clustering quality measures. As a real application, we collect the mathematical subject classification (MSC) codes of research papers from published mathematical databases and construct a weighted code-to-document matrix to which the graph clustering methods are applied. We aim to group MSC codes into the same cluster if they appear together in many papers. We compare the MSC clustering results using several assessment measures and conclude that the Markov chain-based method is suitable for clustering the MSC codes.
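
The co-occurrence idea in this abstract (codes belong together when they appear in many of the same papers) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cooccurrence_weights`, the toy paper list, and the use of plain unweighted counts are assumptions made for illustration, whereas the paper itself builds a weighted code-to-document matrix. The resulting edge weights would be the input to a graph clustering method.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(doc_codes):
    """Count, for each unordered pair of codes, the documents that list both."""
    weights = Counter()
    for codes in doc_codes:
        # Sort and deduplicate so each pair has one canonical key per document.
        for a, b in combinations(sorted(set(codes)), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical toy data: each inner list holds the MSC codes of one paper.
papers = [
    ["05C50", "68R10", "60J10"],
    ["05C50", "68R10"],
    ["05C50", "60J10"],
    ["35Q30", "76D05"],
    ["35Q30", "76D05", "60J10"],
]

edges = cooccurrence_weights(papers)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Pairs that never share a paper simply have no entry, so the edge list stays sparse even when the code vocabulary is large.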

Typical Daily Load Profile Generation using Load Profile of Automatic Meter Reading Customer (자동검침 고객의 부하패턴을 이용한 일일 대표 부하패턴 생성)

  • Kim, Young-Il;Shin, Jin-Ho;Yi, Bong-Jae;Yang, Il-Kwon
    • The Transactions of The Korean Institute of Electrical Engineers / v.57 no.9 / pp.1516-1521 / 2008
  • Electric utilities have recently been studying distribution load analysis based on AMR (Automatic Meter Reading) data. A load analysis method based on an AMR system generates typical load profiles from the load data of AMR customers, estimates the load profiles of non-AMR customers, and analyzes the peak load and load profile of distribution circuits and sectors at intervals of 15 minutes, an hour, a day, a week, or a month. A typical load profile is generated by an algorithm that calculates the average power consumption of each group of customers with similar load patterns. The traditional customer clustering mechanism uses only the contract type code as a key. This mechanism has low accuracy because many customers with the same contract code have different load patterns. In this research, we propose a customer clustering mechanism that applies the k-means algorithm to the contract type code and AMR data.
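
As a rough illustration of the approach described in this abstract, the sketch below groups customers by load pattern with a plain k-means loop and takes each cluster's interval-by-interval average as its typical load profile. It is a sketch under stated assumptions, not the authors' method: the abstract does not say how the contract type code is combined with the AMR data, so only the readings are clustered here, and the function names and toy readings are hypothetical.

```python
import random


def mean_profile(profiles):
    """Average several equal-length load profiles interval by interval."""
    return [sum(values) / len(values) for values in zip(*profiles)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(profile, centers):
    """Index of the center closest to the given profile."""
    return min(range(len(centers)),
               key=lambda c: squared_distance(profile, centers[c]))


def kmeans(profiles, k, iterations=50, seed=0):
    """Plain k-means; returns (labels, centers)."""
    centers = random.Random(seed).sample(profiles, k)
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in profiles]
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(mean_profile(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return [nearest(p, centers) for p in profiles], centers


# Hypothetical readings: four customers, four metering intervals each.
readings = [
    [1.0, 5.0, 5.0, 1.0],   # daytime-heavy
    [1.2, 4.8, 5.1, 0.9],   # daytime-heavy
    [5.0, 1.0, 1.0, 5.0],   # night-heavy
    [4.9, 1.1, 0.8, 5.2],   # night-heavy
]

labels, typical_profiles = kmeans(readings, k=2)
print(labels)
print(typical_profiles)
```

Each entry of `typical_profiles` is the mean of the profiles assigned to that cluster, which matches the abstract's description of a typical load profile as the average consumption of a group with similar load patterns.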

A Code Clustering Technique for Unifying Method Full Path of Reusable Cloned Code Sets of a Product Family (제품군의 재사용 가능한 클론 코드의 메소드 경로 통일을 위한 코드 클러스터링 방법)

  • Kim, Taeyoung;Lee, Jihyun;Kim, Eunmi
    • KIPS Transactions on Software and Data Engineering / v.12 no.1 / pp.1-18 / 2023
  • Similar software is often developed with the Clone-And-Own (CAO) approach, which copies and modifies existing artifacts. The CAO approach is considered bad practice because it makes maintenance difficult as the number of cloned products increases. Software product line engineering is a methodology that can solve this problem by developing a product family through systematic reuse. Migrating product families developed with the CAO approach to product line engineering begins with finding clones, integrating them, and building them into reusable assets. However, cloning occurs at various levels, from directories to code lines, and the structures of the clones can change. This makes it difficult to build a product line code base simply by finding clones. Successful migration thus requires unifying the source code's file path, class name, and method signature. This paper proposes a clustering method that identifies sets of similar code scattered across product variants; because some of their full method paths differ, path unification is necessary. To show the effectiveness of the proposed method, we conducted an experiment using the Apo Games product line, which has evolved with the CAO approach. Clustering performed without preprocessing achieved an average precision of 0.91 and identified 0 common clusters, whereas our method achieved 0.98 and 15, respectively.

Increase of Binary CDMA transmission range by using Clustering technique (Clustering을 통한 Binary CDMA 전송거리 확보)

  • Choi, Hyeon-Seok;Ji, Choong-Won;Kim, Jung-Sun
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2008.02a / pp.679-682 / 2008
  • Growing interest in wireless networks is driving research on applying the related technologies to everyday life. Among these technologies, Binary CDMA (Code Division Multiple Access) is a local area wireless networking method that transfers data over the 2.4 GHz RF band. Binary CDMA has a longer transmission distance than Bluetooth. It is also inexpensive, because its circuitry is simple while its performance is similar to that of existing CDMA. Despite these benefits, Binary CDMA has two problems: frequency overlap, and sections in which the transmission distance is shorter. To solve these problems, we propose a clustering method that can cover a wide area.

A clustered cyclic product code for the burst error correction in the DVCR systems (DVCR 시스템의 연집 오류 정정을 위한 클러스터 순환 프러덕트 부호)

  • 이종화;유철우;강창언;홍대식
    • Journal of the Korean Institute of Telematics and Electronics S / v.34S no.2 / pp.1-10 / 1997
  • In this paper, an improved lower bound on the burst-error correcting capability of the cyclic product code is presented, and through the analysis of this new bound the clustered cyclic product (CCP) code is proposed. To improve the burst-error correcting capability, the CCP code combines the idea of clustering with the transmission method of the cyclic product code. That is, a cluster, defined in this paper as a group of consecutive code symbols, is employed as a new transmission unit in the code array transmission of the cyclic product code. The burst-error correcting capability of the CCP code is improved without a loss in the random-error correcting capability, and a performance comparison in the digital video cassette recorder (DVCR) system shows the superiority of the proposed CCP code over conventional product codes.

Parallel Processing of k-Means Clustering Algorithm for Unsupervised Classification of Large Satellite Images: A Hybrid Method Using Multicores and a PC-Cluster (대용량 위성영상의 무감독 분류를 위한 k-Means Clustering 알고리즘의 병렬처리: 다중코어와 PC-Cluster를 이용한 Hybrid 방식)

  • Han, Soohee;Song, Jeong Heon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.6 / pp.445-452 / 2019
  • In this study, parallel processing codes for the k-means clustering algorithm were developed and implemented on a PC-cluster for unsupervised classification of large satellite images. We implemented intra-node code using the multiple cores of a CPU (Central Processing Unit) based on OpenMP (Open Multi-Processing), inter-node code using a PC-cluster based on the message passing interface, and hybrid code using both. The PC-cluster consists of one master node and eight slave nodes, and each node is equipped with eight cores. Two operating systems, Microsoft Windows and Canonical Ubuntu, were installed on the PC-cluster in turn and tested to compare parallel processing performance. Two multispectral satellite images were tested: a medium-capacity LANDSAT 8 OLI (Operational Land Imager) image and a high-capacity Sentinel 2A image. To evaluate the performance of parallel processing, speedup and efficiency were measured. Overall, the speedup was over N/2 and the efficiency was over 0.5. In the comparison of the two operating systems, the Ubuntu system was two to three times faster. To confirm that the results of sequential and parallel processing coincide with each other, the center value of each band and the number of classified pixels were compared, and the result images were examined by pixel-to-pixel comparison. It was found that care should be taken to avoid false sharing in OpenMP in the intra-node implementation. To process large satellite images on a PC-cluster, code and hardware should be designed to reduce the performance degradation caused by file I/O. It was also found that performance can differ depending on the operating system installed on a PC-cluster.
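
The two metrics reported in this abstract can be computed as follows, assuming the conventional definitions: speedup is sequential time divided by parallel time, and efficiency is speedup divided by the number of processing units. The abstract does not state its formulas or define N, so reading N as the number of processing units is an assumption, and the timings below are hypothetical.

```python
def speedup(sequential_seconds, parallel_seconds):
    """How many times faster the parallel run is than the sequential run."""
    return sequential_seconds / parallel_seconds


def efficiency(sequential_seconds, parallel_seconds, workers):
    """Speedup per processing unit; 1.0 would be ideal linear scaling."""
    return speedup(sequential_seconds, parallel_seconds) / workers


# Hypothetical timings: 120 s sequentially, 25 s on 8 processing units.
s = speedup(120.0, 25.0)
e = efficiency(120.0, 25.0, workers=8)
print(f"speedup={s:.2f}, efficiency={e:.2f}")
```

Under these definitions, a speedup above N/2 and an efficiency above 0.5 are the same condition, which is consistent with the two figures being reported together.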

Development of a Company-Tailored Part Classification & Coding System Using fuzzy clustering Techniques (Fuzzy 밀집기법을 이용한 맞춤형 부픔 분류법의 개발)

  • 박진우
    • Journal of the Korean Operations Research and Management Science Society / v.13 no.1 / pp.31-38 / 1988
  • This paper presents a methodology for developing a part classification and coding system suited to an individual company. When a group of parts for a specific company is coded with a general-purpose part classification and coding system such as the OPITZ system, it is frequently observed that only a small subset of the available code numbers is used. Such sparsity in the actual occurrences of code numbers implies that a better system can be designed that uses the digits of the system more parsimoniously. A 2-dimensional fuzzy ISODATA algorithm is developed to extract the characteristics that are important for classification from the set of given parts. Based on the extracted characteristics and the distances between fuzzy clustering centroids, a company-unique classification and coding system can be developed. An example case study for a medium-sized machine shop is presented.

Shape Design of Passages for Turbine Blade Using Design Optimization System (최적화설계시스템을 이용한 터빈블레이드 냉각통로의 형상설계)

  • Jeong Min-Joong;Lee Joon-Seong
    • Transactions of the Korean Society of Mechanical Engineers A / v.29 no.7 s.238 / pp.1013-1021 / 2005
  • In this paper, we developed an automatic design optimization system for parametric shape optimization of cooling passages inside axial turbine blades. A parallel three-dimensional thermoelasticity finite element analysis code from an open source system was used to perform automatic thermal and stress analysis of different blade configurations. The developed code was connected to an evolutionary optimizer and built into a design optimization system. Using the optimization system, 279 feasible and optimal solutions were found. The system provides not only the single best of these solutions but also information on the variation structure and correlation of the 279 solutions in the function, variable, and real design spaces. To explore this design information, we propose a new interpretation approach based on evolutionary clustering and principal component analysis. The interpretation approach might be applicable to the increasing demands in the general area of design optimization.

A Method of Object Identification from Procedural Programs (절차적 프로그램으로부터의 객체 추출 방법론)

  • Jin, Yun-Suk;Ma, Pyeong-Su;Sin, Gyu-Sang
    • The Transactions of the Korea Information Processing Society / v.6 no.10 / pp.2693-2706 / 1999
  • Reengineering to an object-oriented system is needed to maintain a system and to satisfy requirements for structural change. Target systems that should be reengineered are difficult to change because they have no design documents or their design documents are inconsistent with the source code, so using design documents to identify objects in these systems is inappropriate. Several studies identify objects through analysis of procedural source code. In this paper, we propose an automatic object identification method based on clustering of the VTFG (Variable-Type-Function Graph), which represents relations among variables, types, and functions. The VTFG includes the relations among variables, types, and functions that may form the basis of objects, together with the weights of these relations. By clustering related variables, types, and functions using their weights, our method overcomes the limitations of existing studies, which identify objects that are too big or objects that exclude many functions. The proposed method minimizes user interaction through automatic object identification and makes it easy to reengineer a procedural system into an object-oriented system.

Clustering Performance Analysis of Autoencoder with Skip Connection (스킵연결이 적용된 오토인코더 모델의 클러스터링 성능 분석)

  • Jo, In-su;Kang, Yunhee;Choi, Dong-bin;Park, Young B.
    • KIPS Transactions on Software and Data Engineering / v.9 no.12 / pp.403-410 / 2020
  • In addition to research on noise removal and super-resolution using the data restoration (output result) function of the autoencoder, research on improving clustering performance using the dimension reduction function of the autoencoder is actively being conducted. The clustering function and the data restoration function of an autoencoder have in common that both improve through the same learning. Based on this characteristic, this study conducted an experiment to see whether an autoencoder model designed for excellent data recovery performance is also superior in clustering performance. The skip connection technique was used to design an autoencoder with excellent data recovery performance. The output result performance and clustering performance of an autoencoder model with skip connections and a model without skip connections were shown as graphs and visual extracts. The output result performance increased, but the clustering performance decreased. This result indicates that, for neural network models such as autoencoders, a good output result does not guarantee that each layer has learned the characteristics of the data well. Lastly, the degradation in clustering performance was compensated for by using both the latent code and the skip connection. This study is a preliminary study toward solving the Hanja Unicode problem by clustering.