• Title/Summary/Keyword: software clustering

A K-Means-Based Clustering Algorithm for Traffic Prediction in a Bike-Sharing System (공유자전거 시스템의 이용 예측을 위한 K-Means 기반의 군집 알고리즘)

  • Kim, Kyoungok;Lee, Chang Hwan
    • KIPS Transactions on Software and Data Engineering / v.10 no.5 / pp.169-178 / 2021
  • Bike-sharing systems (BSS) have recently become popular as a convenient "last mile" mode of transportation. Rebalancing bikes is a critical issue in managing a BSS because rentals and returns are not balanced across stations and time periods. Efficient and effective rebalancing requires accurate traffic prediction. Cluster-based traffic prediction has recently been used to improve station-level prediction accuracy, and the clustering step is crucial in this approach. In this paper, we propose a k-means-based clustering algorithm that overcomes two drawbacks of existing clustering methods for BSS: they are non-deterministic and converge poorly. By employing centroid initialization and using the temporal proportions of each station's rentals and returns as the clustering input, the proposed algorithm is deterministic and fast.
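The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how centroids are initialized, so the farthest-first seeding below is an assumed stand-in that simply avoids random starts. The function names, the three time periods, and the station counts are all hypothetical.

```python
from math import dist

def temporal_proportions(counts):
    """Normalize a station's per-period counts so that they sum to 1."""
    total = sum(counts)
    if total == 0:
        return [1.0 / len(counts)] * len(counts)
    return [c / total for c in counts]

def farthest_first_init(points, k):
    """Deterministic seeding (assumed, not taken from the paper): start at the
    point nearest the overall mean, then keep adding the point that is
    farthest from the centroids chosen so far."""
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    centroids = [min(points, key=lambda p: dist(p, mean))]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return [list(c) for c in centroids]

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations starting from the deterministic seeds."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical stations: (rentals, returns) for morning, midday, evening.
stations = {
    "A": ([80, 10, 10], [10, 10, 80]),
    "B": ([70, 20, 10], [15, 15, 70]),
    "C": ([10, 10, 80], [80, 10, 10]),
    "D": ([5, 15, 80], [75, 15, 10]),
}
features = [
    temporal_proportions(rents) + temporal_proportions(rets)
    for rents, rets in stations.values()
]
labels, centroids = kmeans(features, 2)
```

Because no random numbers are involved, repeated runs on the same input give the same labels, which is the property the abstract emphasizes.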

Defect Severity-based Defect Prediction Model using CL

  • Lee, Na-Young;Kwon, Ki-Tae
    • Journal of the Korea Society of Computer and Information / v.23 no.9 / pp.81-86 / 2018
  • Software defect severity is very important in new projects and in projects with limited historical data. However, in general software defect prediction it is very difficult to collect label information for the training set, and cross-project defect prediction requires a large amount of data. In this paper, an unclassified data set with defect severity is clustered according to the distribution ratio, and a defect severity-based prediction model is proposed through labeling. The proposed model applies CLAMI to JM1 and PC4, the defect severity-based NASA data sets with the least ambiguity, and is evaluated by its ACC value relative to the original data. In the experiments, the proposed model improves on existing defect severity-based prediction models by 0.15 (15%) on JM1 and 0.12 (12%) on PC4.

Similarity Measure and Clustering Technique for XML Documents by a Parent-Child Matrix (부모-자식 행렬을 사용한 XML 문서 유사도 측정과 군집 기법)

  • Lee, Yun-Gu;Kim, Woosaeng
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.7 / pp.1599-1607 / 2015
  • Researchers have recently been developing efficient techniques for accessing, querying, and managing XML documents, which are frequently used on the Internet. In this paper, we propose a parent-child matrix to cluster XML documents efficiently. A parent-child matrix captures both the content and the structural features of an XML document. Each cell of a parent-child matrix holds either the value of a node in the XML tree or the value of a child node where a parent-child relationship exists in the XML tree. The similarity between two XML documents can then be measured by the similarity between their corresponding parent-child matrices. The experiments show that the proposed method performs well.
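As a rough illustration of the parent-child idea, the sketch below stores a sparse "matrix" as a set of (parent tag, child tag, child text) cells and compares two documents with Jaccard similarity. Both the cell layout and the use of Jaccard similarity are assumptions made for this example; the abstract does not specify the matrix layout or the similarity measure the authors use.

```python
import xml.etree.ElementTree as ET

def parent_child_cells(xml_text):
    """Collect one cell per parent-child edge in the XML tree.

    Each cell records structure (parent tag, child tag) and content
    (the child's text), so the set acts as a sparse parent-child matrix.
    """
    root = ET.fromstring(xml_text)
    cells = set()
    for parent in root.iter():
        for child in parent:
            cells.add((parent.tag, child.tag, (child.text or "").strip()))
    return cells

def similarity(xml_a, xml_b):
    """Jaccard similarity between the two documents' cell sets (assumed measure)."""
    a, b = parent_child_cells(xml_a), parent_child_cells(xml_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical documents for illustration.
doc1 = "<book><title>XML</title><author>Kim</author></book>"
doc2 = "<book><title>XML</title><author>Lee</author></book>"
doc3 = "<cd><artist>Kim</artist></cd>"

sim_12 = similarity(doc1, doc2)  # one shared cell out of three distinct cells
sim_13 = similarity(doc1, doc3)  # no shared structure or content
```

A pairwise similarity of this kind could then feed any standard clustering method that accepts a similarity or distance matrix.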

A Sentiment Classification Approach of Sentences Clustering in Webcast Barrages

  • Li, Jun;Huang, Guimin;Zhou, Ya
    • Journal of Information Processing Systems / v.16 no.3 / pp.718-732 / 2020
  • Sentiment analysis and opinion mining are challenging tasks in natural language processing. Many sentiment analysis and opinion mining applications focus on product reviews, social media reviews, forums, and microblogs, whose reviews are topic-similar and opinion-rich. In this paper, we analyze the sentiments of sentences from online webcast reviews that scroll across the screen, which we call live barrages. Unlike social media comments or product reviews, the topics in live barrages are more fragmented, and there are many invalid comments that must be removed in the preprocessing phase. To extract evaluative sentiment sentences, we propose a novel approach that clusters the barrages from the same commenter to solve the problem of information being scattered across individual barrages. The method contains two subtasks: in the data preprocessing phase, we cluster the sentences from the same commenter and remove unavailable sentences; then we use a semi-supervised machine learning approach, the naïve Bayes algorithm, to analyze the sentiment of the barrage. Our experimental results show that this method performs well in analyzing the sentiment of online webcast barrages.

A design of the PSDG based semantic slicing model for software maintenance (소프트웨어의 유지보수를 위한 PSDG기반 의미분할모형의 설계)

  • Yeo, Ho-Young;Lee, Kee-O;Rhew, Sung-Yul
    • The Transactions of the Korea Information Processing Society / v.5 no.8 / pp.2041-2049 / 1998
  • This paper suggests a technique for program segmentation and maintenance using a PSDG (Post-State Dependency Graph) that improves software quality by identifying and detecting defects in already fixed source code. Program segmentation is performed through source code analysis that combines the measures of static, dynamic, and semantic slicing when the defects in a program must be understood for corrective maintenance. It provides users with a segmentation principle for splitting a program by tracing the state dependencies of the source code with the graph, and by clustering and highlighting. Through modeling of the PSDG, elimination of ineffective dead code and generalization of related program segments are possible. Additionally, it can be correlated with other design models such as the STD (State Transition Diagram) and can also be used as design documentation.

Integrating Ant Colony Clustering Method to a Multi-Robot System Using Mobile Agents

  • Kambayashi, Yasushi;Ugajin, Masataka;Sato, Osamu;Tsujimura, Yasuhiro;Yamachi, Hidemi;Takimoto, Munehiro;Yamamoto, Hisashi
    • Industrial Engineering and Management Systems / v.8 no.3 / pp.181-193 / 2009
  • This paper presents a framework for controlling multiple mobile robots connected by communication networks. The framework provides novel methods for controlling coordinated systems using mobile agents. The combination of mobile agents and multiple mobile robots opens a new horizon for the efficient use of mobile robot resources. Instead of physically moving multiple robots, mobile software agents can migrate from one robot to another so that energy consumption during aggregation is minimized. The envisioned application is making "carts," such as those found in large airports, intelligent. Travelers pick up carts at designated points but leave them in arbitrary places, and re-collecting them is a considerable task. It is therefore desirable that intelligent carts (intelligent robots) gather themselves automatically. A simple implementation would give each cart a designated assembly point to which it automatically returns when it is free. This is easy to implement, but some carts must travel a long way back to their own assembly point even though they are located close to other assembly points. This consumes so much unnecessary energy that the carts require expensive batteries. To ameliorate the situation, we employ mobile software agents to locate robots scattered in a field, e.g. an airport, and make them autonomously determine their moving behaviors using a clustering algorithm based on Ant Colony Optimization (ACO). ACO is a swarm intelligence-based method and a multi-agent system that exploits artificial stigmergy to solve combinatorial optimization problems. Preliminary experiments have yielded favorable results. In this paper, we focus on the implementation of the mechanism for controlling the multiple robots using the mobile agents.

Unsupervised Learning Model for Fault Prediction Using Representative Clustering Algorithms (대표적인 클러스터링 알고리즘을 사용한 비감독형 결함 예측 모델)

  • Hong, Euyseok;Park, Mikyeong
    • KIPS Transactions on Software and Data Engineering / v.3 no.2 / pp.57-64 / 2014
  • Most previous studies on software fault prediction models, which determine the fault-proneness of input modules, have focused on supervised learning models that use training data sets. However, an unsupervised learning model is needed when a supervised model cannot be applied: either no past training data set exists, or a data set exists but the current project type has changed. Building an unsupervised learning model is extremely difficult, which is why only a few studies exist. In this paper, we build unsupervised models using representative clustering algorithms, EM and DBSCAN, that have not been used in prior studies, and compare them with the previous model based on the K-means algorithm. The results show that the EM model performs slightly better than the K-means model in terms of error rate, and that these two models significantly outperform the DBSCAN model.

Daily Behavior Pattern Extraction using Time-Series Behavioral Data of Dairy Cows and k-Means Clustering (행동 시계열 데이터와 k-평균 군집화를 통한 젖소의 일일 행동패턴 검출)

  • Lee, Seonghun;Park, Gicheol;Park, Jaehwa
    • Journal of Software Assessment and Valuation / v.17 no.1 / pp.83-92 / 2021
  • There have been continuous and extensive attempts to apply various sensor systems and ICT to dairy science for data accumulation and improvement of dairy productivity. However, these efforts concern only factors that directly affect dairy productivity, such as the number of animals and the amount of milk produced, while research on the physiological aspects of dairy cows, which fundamentally underlie dairy productivity, remains insufficient. This paper proposes a basic approach for extracting daily behavior patterns from hourly behavioral data of dairy cows in order to identify their health status and stress. A total of four clusters were formed by k-means clustering, and their reasonableness was confirmed by visualizing the data in each group and each group's representative. We hope that the results will lead to further research on detecting abnormalities and signs of disease in dairy cows.

Discovering Association Rules using Item Clustering on Frequent Pattern Network (빈발 패턴 네트워크에서 아이템 클러스터링을 통한 연관규칙 발견)

  • Oh, Kyeong-Jin;Jung, Jin-Guk;Ha, In-Ay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.14 no.1 / pp.1-17 / 2008
  • Data mining is defined as the process of discovering meaningful and useful patterns in large volumes of data. In particular, finding association rules between items in a database of customer transactions has become an important task. Since the Apriori algorithm, several data structures and algorithms have been proposed for storing meaningful information compressed from an original database in order to find frequent itemsets. Although existing methods find all association rules, analyzing them takes considerable effort because there are too many rules. In this paper, we propose a new data structure, called a Frequent Pattern Network (FPN), which represents items as vertices and 2-itemsets as edges of the network. We construct the FPN using item frequencies, and then use a clustering method to group the vertices of the network into clusters so that intracluster similarity is maximized and intercluster similarity is minimized. We generate association rules based on the clusters. Our experiments measured the accuracy of clustering items on the network using the confidence, correlation, and edge-weight similarity methods. We also generated association rules from the clusters and compared the traditional method with ours. The results show that confidence similarity had a stronger influence than the other measures on the frequent pattern network, and that the FPN was flexible with respect to the minimum support value.
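The Frequent Pattern Network described in this abstract can be sketched in a few lines. This is an illustrative reading only: it builds vertex weights from item frequencies and edge weights from 2-itemset frequencies, then derives the standard association-rule confidence from them. The function names and the sample transactions are hypothetical, and the clustering step of the paper is not reproduced.

```python
from collections import Counter
from itertools import combinations

def build_fpn(transactions):
    """Count items (vertices) and co-occurring item pairs (edges)."""
    vertex = Counter()
    edge = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))       # ignore duplicates within a basket
        vertex.update(items)
        edge.update(combinations(items, 2))    # pairs come out in sorted order
    return vertex, edge

def confidence(vertex, edge, a, b):
    """Confidence of the rule a -> b: support(a, b) / support(a)."""
    if vertex[a] == 0:
        return 0.0
    return edge[tuple(sorted((a, b)))] / vertex[a]

# Hypothetical customer transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk"],
]
vertex, edge = build_fpn(transactions)
conf_butter_bread = confidence(vertex, edge, "butter", "bread")
conf_bread_butter = confidence(vertex, edge, "bread", "butter")
```

Confidence is asymmetric, so a similarity built on it treats the two directions of an edge differently, unlike a raw edge-weight similarity.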

A Method of Detecting the Aggressive Driving of Elderly Driver (노인 운전자의 공격적인 운전 상태 검출 기법)

  • Koh, Dong-Woo;Kang, Hang-Bong
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.537-542 / 2017
  • Aggressive driving is a major cause of car accidents. Previous studies have mainly analyzed the aggressive driving tendencies of young drivers, and only through pure clustering or classification techniques from machine learning. However, since elderly people have different driving habits due to their fragile physical condition, a new method, such as one that enhances the characteristics of the driving data, is needed to properly analyze the aggressive driving of elderly drivers. In this study, acceleration data collected from a smartphone in a moving vehicle is analyzed by a newly proposed ECA (Enhanced Clustering method for Acceleration data) technique, coupled with conventional clustering techniques (K-means clustering and the expectation-maximization algorithm). ECA selects high-intensity data from the cluster groups detected through K-means and EM across all subjects' data and models the characteristic data through scaled values. Using this method, aggressive driving data was collected for all young and elderly participants in the experiment, which was not the case with the pure clustering method. We further found that K-means clustering has higher detection efficiency than the EM method. The K-means results also show that a young driver's driving strength is 1.29 times that of an elderly driver. In conclusion, the proposed method can detect aggressive driving maneuvers in data from elderly drivers with low operating intensity, and can be used to construct a customized safe driving system for elderly drivers. In the future, it will be possible to detect abnormal driving conditions and to use the collected data to give drivers early warnings.