• Title/Summary/Keyword: cluster-then-predict


A On-Line Pattern Clustering Technique Using Fuzzy Neural Networks (퍼지 신경망을 이용한 온라인 클러스터링 방법)

  • 김재현;서일홍
    • Journal of the Korean Institute of Telematics and Electronics B, v.31B no.7, pp.199-210, 1994
  • Most clustering methods assign input data to a cluster using the cluster's center or a predefined cluster shape. When nothing is known about the data set, it is impossible to predict how many clusters there will be or what shapes they will take, and a cluster's shape often cannot be represented well by a center or a predefined shape. It is therefore difficult to assign input data to the proper cluster using previous methods. To overcome this difficulty, this paper represents a cluster as a collection of several subclusters that describe the boundary of the cluster. Membership functions represent the degree to which input data belongs to each subcluster. A competitive learning neural network then adaptively corrects the position of the nearest subcluster so that the cluster to which that subcluster belongs can expand. To show the validity of the proposed method, a numerical example compares it with the FMMC (Fuzzy Min-Max Clustering) algorithm.
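
The adaptive correction of the nearest subcluster can be illustrated with a generic winner-take-all competitive learning step. This is only a minimal sketch under assumptions: the function names, the learning rate `lr`, and the toy prototypes are hypothetical, and the paper's membership functions and boundary representation are not reproduced here.

```python
import math

def nearest_index(x, prototypes):
    """Return the index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda i: math.dist(x, prototypes[i]))

def competitive_update(x, prototypes, lr=0.5):
    """Move only the winning (nearest) prototype a fraction lr toward x."""
    w = nearest_index(x, prototypes)
    prototypes[w] = [p + lr * (xi - p) for p, xi in zip(prototypes[w], x)]
    return w

# Hypothetical subcluster positions; only the nearest one is corrected.
subclusters = [[0.0, 0.0], [10.0, 10.0]]
winner = competitive_update([2.0, 0.0], subclusters, lr=0.5)
```

With these toy values the first subcluster wins and moves halfway toward the input, while the second is left unchanged.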


Dynamic Load Balancing Algorithm using Execution Time Prediction on Cluster Systems

  • Yoon, Wan-Oh;Jung, Jin-Ha;Park, Sang-Bang
    • Proceedings of the IEEK Conference, 2002.07a, pp.176-179, 2002
  • In recent years, a growing body of computer network research has focused on cluster systems as a way to achieve higher performance at lower cost. Load imbalance is the major defect that reduces the performance of a cluster system running parallel programs in SPMD (Single Program Multiple Data) form. Load imbalance is also a problem for MPP (Massively Parallel Processor) and distributed systems. Because a cluster system is a loosely coupled distributed system, it has higher communication overhead than an MPP. Dynamic load balancing can solve the load imbalance problem of a cluster system and reduce its communication cost. The cluster systems considered in this paper consist of P heterogeneous nodes connected by a switch-based network. The master node predicts the average task execution time of each slave node from information reported by that node. It then redistributes the remaining tasks among the nodes, taking into account the predicted execution time and the communication overhead of task migration. The proposed dynamic load balancing uses execution time prediction to optimize task redistribution. Performance factors such as the number of nodes, the number of tasks, and communication cost are considered in order to improve cluster performance. Simulation results verify the effectiveness of the proposed dynamic load balancing algorithm.
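
One simple way to picture redistribution driven by predicted execution time is to give each node a share of the remaining tasks proportional to its predicted speed. The sketch below is an assumption-laden illustration, not the paper's algorithm: the function `redistribute`, the node names, and the timings are hypothetical, and the communication overhead of task migration is ignored.

```python
def redistribute(remaining_tasks, predicted_time):
    """Split remaining tasks across nodes in proportion to predicted speed.

    predicted_time maps node name -> predicted average seconds per task.
    """
    speed = {n: 1.0 / t for n, t in predicted_time.items()}
    total = sum(speed.values())
    # Floor of each node's proportional share.
    share = {n: int(remaining_tasks * s / total) for n, s in speed.items()}
    # Hand any leftover tasks to the fastest nodes first.
    leftover = remaining_tasks - sum(share.values())
    for n in sorted(speed, key=speed.get, reverse=True)[:leftover]:
        share[n] += 1
    return share

# Hypothetical heterogeneous nodes: node_a is four times faster than node_c.
plan = redistribute(100, {"node_a": 1.0, "node_b": 2.0, "node_c": 4.0})
```

A fuller model would subtract a per-task migration cost before deciding whether moving a task is worthwhile.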


An Energy-Efficient Periodic Data Collection using Dynamic Cluster Management Method in Wireless Sensor Network (무선 센서 네트워크에서 동적 클러스터 유지 관리 방법을 이용한 에너지 효율적인 주기적 데이터 수집)

  • Yun, SangHun;Cho, Haengrae
    • IEMEK Journal of Embedded Systems and Applications, v.5 no.4, pp.206-216, 2010
  • Wireless sensor networks (WSNs) are used to collect various data in environment monitoring applications. Spatial clustering can reduce the energy consumed by data collection by partitioning the WSN into a set of spatial clusters whose nodes sense similar data. In each cluster, only a few sensor nodes (samplers) report their sensing data to a base station (BS). The BS can predict the missing data of non-samplers using the spatial correlations between sensor nodes. ASAP is a representative data collection algorithm based on spatial clustering. It periodically reconstructs the entire network into new clusters to accommodate changes in spatial correlations, which results in high message overhead. In this paper, we propose a new data collection algorithm named EPDC (Energy-efficient Periodic Data Collection). Unlike ASAP, EPDC identifies a specific cluster that contains many dissimilar sensor nodes and reconstructs only that cluster into subclusters, each of which includes strongly correlated sensor nodes. EPDC also tries to reduce message overhead by incorporating a judicious probabilistic model transfer method. We evaluate the performance of EPDC and ASAP using a simulation model. The experimental results show that EPDC improves performance by up to 84% compared to ASAP.

Predicting Learning Achievement Using Big Data Cluster Analysis - Focusing on Longitudinal Study (빅데이터 군집 분석을 이용한 학습성취도 예측 - 종단 연구를 중심으로)

  • Ko, Sujeong
    • Journal of Digital Contents Society, v.19 no.9, pp.1769-1778, 2018
  • As the value of big data increases, research that applies big data analysis technology is being carried out in education as well as in business. In this paper, we propose a method to predict learning achievement using big data cluster analysis. In the proposed method, students in the Korea Children and Youth Panel Survey (KCYPS) are grouped by the K-means algorithm according to their learning habits in the first year of middle school, and the features of each group are extracted. Next, using the extracted group features, first-year middle school students in the test group are assigned to the group with the most similar learning habits by cosine similarity; neighbors are then selected and learning achievement is predicted. The results show that learning habits at middle school are closely related to later outcomes at university, making it possible to predict learning achievement at high school and satisfaction with university and major.
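
The assignment-then-prediction step can be sketched as follows. This is a simplified illustration under assumptions: the group feature vectors stand in for K-means centroids, the group names and numbers are hypothetical toy values, and the prediction is just the matched group's mean achievement rather than the neighbor selection described in the abstract.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical group features (e.g. K-means centroids of learning-habit scores)
# and the mean achievement observed in each group.
group_features = {"g1": [4.0, 1.0, 1.0], "g2": [1.0, 4.0, 4.0]}
group_achievement = {"g1": 62.0, "g2": 81.0}

def predict(student):
    """Assign the student to the most similar group and return its mean score."""
    best = max(group_features, key=lambda g: cosine(student, group_features[g]))
    return best, group_achievement[best]

group, score = predict([1.0, 3.0, 5.0])
```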

Prediction of High Level Ozone Concentration in Seoul by Using Multivariate Statistical Analyses (다변량 통계분석을 이용한 서울시 고농도 오존의 예측에 관한 연구)

  • 허정숙;김동술
    • Journal of Korean Society for Atmospheric Environment, v.9 no.3, pp.207-215, 1993
  • To statistically predict $O_3$ levels in Seoul, the study used TMS (telemetered air monitoring system) data from the Department of Environment, collected at 20 sites in 1989 and 1990. The data at each site consisted of 6 major criteria pollutants ($SO_2$, TSP, CO, $NO_2$, THC, and $O_3$) and 2 meteorological parameters, wind speed and wind direction. To select proper variables and to determine each pollutant's behavior, univariate statistical analyses were carried out first, and then applied statistical techniques such as cluster analysis, regression analysis, and an expert system were examined. For the initial study of high-level $O_3$ prediction, the raw data set at each site was separated into two groups at the 60 ppb $O_3$ level. A hierarchical cluster analysis was applied to divide each group into smaller classes. Each class at each site has its own pattern. Next, multiple regression was applied repeatedly to each class to determine an $O_3$ prediction submodel and to identify outliers in the class based on a certain level of standardized residual. In this way, a prediction submodel was obtained for each homogeneous class. The study was extended to model $O_3$ prediction on both an on-time basis and a 1-hour-ahead basis. Finally, an expert system was used to build a unified classification rule from examples of the homogeneous classes at all sites. Thus, a concept for a high-level $O_3$ prediction model was developed for use in $O_3$ alert systems.
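
The idea of one regression submodel per class can be sketched in a few lines. The example below is a deliberately reduced illustration under assumptions: it splits hypothetical toy data only at the 60 ppb threshold (the hierarchical clustering into smaller classes is omitted), and it fits a single-predictor least-squares line instead of a multiple regression.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical (predictor, O3 in ppb) pairs; not data from the study.
records = [(1, 20), (2, 30), (3, 40), (4, 70), (5, 90), (6, 110)]

# Separate the records at the 60 ppb O3 level, then fit one submodel per group.
groups = {"low": [], "high": []}
for x, o3 in records:
    groups["high" if o3 >= 60 else "low"].append((x, o3))

submodels = {
    name: fit_line([x for x, _ in rows], [y for _, y in rows])
    for name, rows in groups.items()
}
```

At prediction time a separate rule is still needed to decide which submodel applies to a new observation, which is the role the abstract gives to the expert system.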


A Novel Thresholding for Prediction Analytics with Machine Learning Techniques

  • Shakir, Khan;Reemiah Muneer, Alotaibi
    • International Journal of Computer Science & Network Security, v.23 no.1, pp.33-40, 2023
  • Machine-learning techniques are performing effectively in data analytics. Classification and regression support prediction on different kinds of data, and various classification techniques are used depending on the nature of the data. Determining a threshold is essential to building a better model for unlabelled data. In this paper, threshold values are applied as ranges, based on the min-max normalization technique, to create labels, and multiclass classification is performed on rainfall data. Binary classification is applied to autism data, and classification techniques are applied to child abuse data. The performance of each technique is analysed with evaluation metrics.
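
Creating labels from threshold ranges on min-max normalized values might look like the sketch below. The cut points, label names, and rainfall values are hypothetical choices for illustration; the abstract does not state the ranges actually used.

```python
def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def label(norm, cuts=(1 / 3, 2 / 3), names=("low", "medium", "high")):
    """Map a normalized value to the name of the first range it falls below."""
    for cut, name in zip(cuts, names):
        if norm < cut:
            return name
    return names[-1]

# Hypothetical unlabelled rainfall readings (mm).
rainfall = [0.0, 12.0, 30.0, 45.0, 60.0]
labels = [label(v) for v in min_max(rainfall)]
```

The resulting labels can then serve as targets for any multiclass classifier.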

Weak Lensing Mass Map Reconstruction of Merging Clusters with Convolutional Neural Network

  • Park, Sangnam;Jee, James M.;Hong, Sungwook E.;Bak, Dongsu
    • The Bulletin of The Korean Astronomical Society, v.44 no.2, pp.75.1-75.1, 2019
  • We introduce a novel method for reconstructing the projected dark matter mass maps of merging galaxy clusters by applying a convolutional neural network (CNN) to their weak lensing maps. We generate synthesized grayscale images from given weak lensing maps that preserve their averaged galaxy ellipticity. We then feed them to a multi-layered CNN whose architecture alternates convolution and trans-convolution filters to predict the mass maps. We train our architecture with 1,000 Subaru/Suprime-Cam mock weak lensing maps, and our method predicts mass maps better than the Kaiser-Squires method in three respects: (1) better pixel-to-pixel correlation, (2) more accurate location of the density peak position, and (3) freedom from mass-sheet degeneracy. We also apply our method to the HST weak lensing map of the El Gordo cluster and compare our result with previous studies.


Spatiotemporal Impact Assessments of Highway Construction: Autonomous SWAT Modeling

  • Choi, Kunhee;Bae, Junseo
    • International conference on construction engineering and project management, 2015.10a, pp.294-298, 2015
  • In the United States, Construction Work Zone (CWZ) impact assessments are mandated for all federally funded highway infrastructure improvement projects, yet state transportation agencies regard them as a daunting task because there are no standardized analytical methods for developing sounder Transportation Management Plans (TMPs). To address these issues, this study aims to create a spatiotemporal modeling framework, dubbed "SWAT" (Spatiotemporal Work zone Assessment for TMPs). The study drew on a total of 43,795 traffic sensor readings collected from heavily trafficked highways in U.S. metropolitan areas. A multilevel-cluster-driven analysis characterized traffic patterns and was verified using a measurement system analysis. An artificial neural network model was created to predict potential 24/7 traffic demand automatically, and its predictive power was statistically validated. It is proposed that the predicted traffic patterns then be incorporated into a what-if scenario analysis that evaluates the impact of numerous alternative construction plans. This study will yield a breakthrough in automating CWZ impact assessments with the first view of a systematic estimation method.


Reclassification of the vulnerability group of wartime equipment (군집분석을 이용한 전시장비의 취약성 그룹 재분류)

  • Lee, Hanwoo;Kim, Suhwan;Joo, Kyungsik
    • Journal of the Korean Data and Information Science Society, v.26 no.3, pp.581-592, 2015
  • In GORRAM, the estimation of resource requirements for wartime equipment is based on the ELCON of the USA. ELCON has 22 vulnerability groups, but it is hard to determine how those 22 groups were classified. In this research, we therefore collected 505 types of basic items used in wartime and classified them into new vulnerability groups using AHP and cluster analysis. We selected 11 variables through AHP for use in the cluster analysis. Next, we decided the number of vulnerability groups through hierarchical clustering and then classified the 505 types of basic items into the new vulnerability groups through K-means clustering. This paper presents new vulnerability groups of 505 types of basic items fitted to Korean weapon systems. Furthermore, our approach can be applied to a new weapon system that needs to be classified into a vulnerability group. We believe that our approach will provide military practitioners with a reliable and rational method for classifying wartime equipment and, consequently, for estimating resource requirements in wartime accurately.

Chlorophyll-a Forcasting using PLS Based c-Fuzzy Model Tree (PLS기반 c-퍼지 모델트리를 이용한 클로로필-a 농도 예측)

  • Lee, Dae-Jong;Park, Sang-Young;Jung, Nahm-Chung;Lee, Hye-Keun;Park, Jin-Il;Chun, Meung-Geun
    • Journal of the Korean Institute of Intelligent Systems, v.16 no.6, pp.777-784, 2006
  • This paper proposes a c-fuzzy model tree that uses the partial least squares method to predict the Chlorophyll-a concentration in each zone. First, cluster centers are calculated by a fuzzy clustering method using all input and output attributes. Each internal node is then produced according to the fuzzy membership values between the centers and the input attributes. Linear models are constructed by the partial least squares method from the input-output pairs remaining in each internal node. Whether an internal node is expanded is determined by comparing the error calculated in the parent node with the errors in its child nodes. Prediction, on the other hand, is performed with the linear model that has the highest fuzzy membership value between the input attributes and the cluster centers in the leaf nodes. To show the effectiveness of the proposed method, we applied it to a water quality data set measured at several stations. In various experiments, the proposed method shows better performance than the conventional least squares based model tree method.
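
The prediction step, choosing the linear model whose cluster center has the highest fuzzy membership for the input, can be sketched as below. This is an illustration under assumptions: memberships use the standard fuzzy c-means formula with fuzzifier `m`, the centers and linear coefficients are hypothetical toy values rather than PLS fits, and the tree structure is flattened to a single level of leaves.

```python
import math

def memberships(x, centers, m=2.0):
    """Fuzzy c-means style membership of x in each cluster center."""
    d = [math.dist(x, c) for c in centers]
    if 0.0 in d:  # x coincides with a center: full membership there
        return [1.0 if di == 0.0 else 0.0 for di in d]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** exp for dj in d) for di in d]

def predict(x, centers, models):
    """Apply the linear model of the leaf with the highest membership."""
    u = memberships(x, centers)
    k = max(range(len(u)), key=u.__getitem__)
    coef, intercept = models[k]
    return k, intercept + sum(c * xi for c, xi in zip(coef, x))

# Hypothetical leaf centers and per-leaf linear models (coefficients, intercept).
centers = [[0.0, 0.0], [4.0, 4.0]]
models = [([1.0, 2.0], 0.5), ([0.0, 0.0], 10.0)]
leaf, y_hat = predict([1.0, 1.0], centers, models)
```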