• Title/Summary/Keyword: Clustering test

Classification of Traffic Flows into QoS Classes by Unsupervised Learning and KNN Clustering

  • Zeng, Yi;Chen, Thomas M.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.3 no.2 / pp.134-146 / 2009
  • Traffic classification seeks to assign packet flows to an appropriate quality of service (QoS) class based on flow statistics, without the need to examine packet payloads. Classification proceeds in two steps: classification rules are first built by analyzing traffic traces, and the rules are then evaluated using test data. In this paper, we use a self-organizing map and K-means clustering as unsupervised machine learning methods to identify the inherent classes in traffic traces. Three clusters were discovered, corresponding to transactional, bulk data transfer, and interactive applications. The K-nearest neighbor classifier was found to be highly accurate for the traffic data and significantly better than a minimum mean distance classifier.
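
As a rough illustration of the comparison described in this abstract, the following Python sketch contrasts a K-nearest neighbor classifier with a minimum mean distance classifier. The two-dimensional feature vectors, the toy values, and all function names are assumptions made for illustration; the abstract does not state which flow statistics were used.

```python
import math
from collections import Counter

def knn_predict(train, labels, query, k=3):
    """Majority vote among the k training flows closest to the query."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def class_means(train, labels):
    """Mean feature vector of each class."""
    grouped = {}
    for point, label in zip(train, labels):
        grouped.setdefault(label, []).append(point)
    return {label: [sum(col) / len(pts) for col in zip(*pts)]
            for label, pts in grouped.items()}

def min_mean_distance_predict(means, query):
    """Assign the query to the class whose mean vector is closest."""
    return min(means, key=lambda label: math.dist(means[label], query))

# Hypothetical, normalized flow statistics (not taken from the paper).
train = [(0.1, 0.1), (0.2, 0.1), (0.9, 0.8), (1.0, 0.9), (0.2, 0.9), (0.1, 1.0)]
labels = ["transactional", "transactional", "bulk", "bulk",
          "interactive", "interactive"]

query = (0.15, 0.2)
knn_label = knn_predict(train, labels, query, k=3)
mean_label = min_mean_distance_predict(class_means(train, labels), query)
print(knn_label, mean_label)
```

On well-separated toy data like this the two classifiers agree; differences would be expected mainly when classes have irregular shapes that a single mean vector summarizes poorly.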

Improved Classification Algorithm using Extended Fuzzy Clustering and Maximum Likelihood Method

  • Jeon Young-Joon;Kim Jin-Il
    • Proceedings of the IEEK Conference / summer / pp.447-450 / 2004
  • This paper proposes a classification method for remotely sensed images based on the fuzzy c-means clustering algorithm and the average intra-cluster distance. The average intra-cluster distance is the average over the set of vectors belonging to each cluster and is proportional to the cluster's size and density. We classify each pixel according to its membership grade with respect to the cluster centers of fuzzy c-means clustering, which are obtained from the mean values of the training data for each class. The fuzzy c-means algorithm takes into account the inter-cluster membership degree of each class. We then evaluate the degree of overlap between clusters. Pixels with a high degree of overlap are passed to the maximum likelihood classification method. Finally, we decide the category by comparing the fuzzy membership degree with the likelihood rate. The proposed method is verified on an IKONOS remote sensing satellite image.
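
To make the membership-grade step more concrete, here is a minimal Python sketch of the standard fuzzy c-means membership formula, in which the membership of a point in cluster i is 1 / sum over k of (d_i / d_k)^(2/(m-1)). The `is_ambiguous` helper and its `margin` threshold are hypothetical stand-ins for the overlap test; the abstract does not say how the authors measure overlap.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership grades of one point (they sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with one or more centers: share full membership.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

def is_ambiguous(memberships, margin=0.2):
    """Hypothetical overlap test: the two largest grades are close together."""
    top, second = sorted(memberships, reverse=True)[:2]
    return (top - second) < margin

centers = [(0.0, 0.0), (2.0, 0.0)]            # assumed class mean vectors
clear = fcm_memberships((0.5, 0.0), centers)  # much closer to the first center
mixed = fcm_memberships((1.0, 0.0), centers)  # halfway between both centers
print(clear, is_ambiguous(clear))
print(mixed, is_ambiguous(mixed))
```

In a pipeline like the one described, pixels flagged as ambiguous would be handed to a second classifier (maximum likelihood in the abstract), while the rest keep their fuzzy c-means label.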

Automatic Extraction of Blood Flow Area in Brachial Artery for Suspicious Hypertension Patients from Color Doppler Sonography with Fuzzy C-Means Clustering

  • Kim, Kwang Baek;Song, Doo Heon;Yun, Sang-Seok
    • Journal of information and communication convergence engineering / v.16 no.4 / pp.258-263 / 2018
  • Color Doppler sonography is a useful tool for examining blood flow and related indices. However, it must be performed by a well-trained operator, so the results are subject to operator subjectivity. In this paper, we propose an automatic method for extracting the blood flow area of the brachial artery, which would be an essential building block of a computer-aided color Doppler analyzer. Specifically, our concern is to examine patients with suspected hypertension (prehypertension), whose symptoms might develop into established hypertension in the future. The proposed method uses fuzzy C-means clustering as a quantization engine, with careful seeding of the number of clusters from histogram analysis. The experiment verifies that the proposed method is feasible: the successful extraction rate is 96% (48 out of 50 test cases), and the method outperforms a K-means based method in specificity and sensitivity analysis. However, the retrospective analysis indicates that the proposed method should be refined further.

Toward precise and accurate modeling of matter clustering in redshift space

  • Oh, Minji
    • The Bulletin of The Korean Astronomical Society / v.43 no.2 / pp.40.3-40.3 / 2018
  • This dissertation presents results from two-dimensional redshift-space distortion (hereafter RSD) analyses of the large-scale structure of the universe using spectroscopic data, and from improvements to the modeling of the RSD effect. RSD is the effect of galaxies' peculiar velocities on their observed clustering along the line of sight. It is thus intimately connected to the growth rate of structure in the universe, from which we can ultimately test the origin of cosmic acceleration and Einstein's theory of gravity at cosmic scales. However, modeling the RSD effect precisely and accurately poses several challenges, such as non-linearities and the existence of an exotic component, e.g. massive neutrinos. As part of the effort to model galaxy clustering in redshift space more precisely and accurately, this dissertation includes a series of works addressing this issue. (More detailed descriptions were omitted.)

Intelligent LoRa-Based Positioning System

  • Chen, Jiann-Liang;Chen, Hsin-Yun;Ma, Yi-Wei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.9 / pp.2961-2975 / 2022
  • The Location-Based Service (LBS) is one of the most well-known services on the Internet, and positioning is its primary function. This study proposes an intelligent LoRa-based positioning system, called AI@LBS, to provide accurate location data. The fingerprint mechanism, combined with an unsupervised clustering algorithm, filters out signal noise and improves computing stability and accuracy. In this study, data noise is filtered using the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm, increasing the positioning accuracy from 95.37% to 97.38%. The problem of data imbalance is addressed using SMOTE (Synthetic Minority Over-sampling Technique), increasing the positioning accuracy from 97.38% to 99.17%. A field test on the NTUST campus (www.ntust.edu.tw) revealed that the AI@LBS system can reduce the average distance error to 0.48 m.
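
The noise-filtering idea can be sketched with a small, unoptimized DBSCAN in plain Python. The signal-strength fingerprints, `eps`, and `min_pts` below are assumed values for illustration only; the abstract does not give the system's actual features or parameters.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE if it is in no dense region."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be relabeled as a border point
            continue
        labels[i] = cluster            # i is a core point: start a new cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
        cluster += 1
    return labels

# Hypothetical fingerprints: signal strength (dBm) seen at two gateways.
readings = [(-70, -80), (-71, -79), (-69, -81), (-70, -79), (-40, -120)]
labels = dbscan(readings, eps=3.0, min_pts=3)
clean = [r for r, lab in zip(readings, labels) if lab != NOISE]
print(labels, clean)
```

Readings labeled as noise are dropped before the fingerprint model is trained; the remaining points keep their cluster ids.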

Spatial Analysis of Common Gastrointestinal Tract Cancers in Counties of Iran

  • Soleimani, Ali;Hassanzadeh, Jafar;Motlagh, Ali Ghanbari;Tabatabaee, Hamidreza;Partovipour, Elham;Keshavarzi, Sareh;Hossein, Mohammad
    • Asian Pacific Journal of Cancer Prevention / v.16 no.9 / pp.4025-4029 / 2015
  • Background: Gastrointestinal tract cancers are among the most common cancers in Iran and comprise approximately 38% of all the reported cases of cancer. This study aimed to describe the epidemiology and to investigate spatial clustering of common cancers of the gastrointestinal tract across the counties of Iran using full Bayesian smoothing and Moran's I statistics. Materials and Methods: Data from the national cancer registry were used in this study. Indirectly standardized rates were calculated for 371 counties of Iran and smoothed using WinBUGS 1.4 software with a full Bayesian method. Global and local Moran's I were also used to investigate clustering. Results: According to the results, 75,644 new cases of cancer were nationally registered in Iran, among which 18,019 cases (23.8%) were esophagus, gastric, colorectal, and liver cancers. The results of the global Moran's I test were 0.60 (P=0.001), 0.47 (P=0.001), 0.29 (P=0.001), and 0.40 (P=0.001) for esophagus, gastric, colorectal, and liver cancers, respectively. This shows clustering of the four studied cancers in Iran at the national level. Conclusions: High-level clustering of cases was seen in northern, northwestern, western, and northeastern areas for esophagus, gastric, and colorectal cancers. For liver cancer, high clustering was observed in some counties in central, northeastern, and southern areas.
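
For readers unfamiliar with the statistic, the global Moran's I is I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where w_ij are spatial weights and W is their total. The Python sketch below computes it for a toy chain of four regions; the rates and the adjacency weights are invented for illustration and are unrelated to the study's data.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed on regions with spatial weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_total) * cross / sum(d * d for d in dev)

# Four hypothetical regions in a row; neighbors share a border (weight 1).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([10, 9, 2, 1], adjacency)     # similar rates sit together
alternating = morans_i([10, 1, 10, 1], adjacency)  # high and low rates alternate
print(clustered, alternating)
```

Positive values indicate that similar rates tend to be neighbors (spatial clustering), while negative values indicate a checkerboard-like pattern; significance is usually assessed separately, for example by permutation.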

Variable Clustering Management for Multiple Streaming of Distributed Mobile Service (분산 모바일 서비스의 다중 스트리밍을 위한 가변 클러스터링 관리)

  • Jeong, Taeg-Won;Lee, Chong-Deuk
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.4 / pp.485-492 / 2009
  • In the mobile service environment, patterns generated by temporal synchronization are streamed with different instance values. This paper proposes a variable clustering management method, which manages multiple data streams dynamically, to support flexible clustering. Unlike conventional streaming methods, the method manages synchronization effectively in a data streaming environment and manages clustered streaming according to the structural presentation level and the fitness presentation level. In the structural presentation level, the stream structure is presented using level matching and accumulation matching, and clustering management is carried out by managing dynamic segments and static segments. The performance of the proposed method is compared in simulation with the k-means method, the C/S server method, and the CDN method. The test results showed that the proposed method performs better than the other methods.

Genetic Clustering with Semantic Vector Expansion (의미 벡터 확장을 통한 유전자 클러스터링)

  • Song, Wei;Park, Soon-Cheol
    • The Journal of the Korea Contents Association / v.9 no.3 / pp.1-8 / 2009
  • This paper proposes a new document clustering system using a fuzzy logic-based genetic algorithm (GA) and semantic vector expansion technology. Many GA papers have shown that success depends on two factors: the diversity of the population and the capability to converge. We use fuzzy logic-based operators to adaptively adjust the balance between these two factors. In traditional document clustering, the most popular and straightforward way to represent a document is the vector space model (VSM). However, this approach not only leads to a high-dimensional feature space, but also ignores the semantic relationships between some important words, which affects the accuracy of clustering. In this paper, we use latent semantic analysis (LSA) to expand the documents conceptually into corresponding semantic vectors, rather than individual terms. At the same time, the sizes of the vectors can be reduced drastically. We test our clustering algorithm on the 20 Newsgroups and Reuters collection data sets. The results show that our method outperforms the conventional GA in various document representation environments.

Improving Clustering Performance Using Gene Ontology (유전자 온톨로지를 활용한 클러스터링 성능 향상 기법)

  • Ko, Song;Kang, Bo-Yeong;Kim, Dae-Won
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.6 / pp.802-808 / 2009
  • Recently, many studies have sought to improve the clustering performance of gene expression data by incorporating Gene Ontology (GO) into the clustering process. In particular, Kustra et al. showed a larger performance improvement by exploiting Biological Process Ontology compared to typical expression-based clustering. This paper extends the work of Kustra et al. by performing extensive experiments on ways of incorporating GO structures. To this end, we used three ontological distance measures (Lin's, Resnik's, Jiang's) and three GO structures (BP, CC, MF) for the yeast expression data. Across all test cases, we found that clustering performance was remarkably improved by incorporating GO; in particular, Resnik's distance measure based on Biological Process Ontology performed best.

Development of an unsupervised learning-based ESG evaluation process for Korean public institutions without label annotation

  • Do Hyeok Yoo;SuJin Bak
    • Journal of the Korea Society of Computer and Information / v.29 no.5 / pp.155-164 / 2024
  • This study proposes an unsupervised learning-based clustering model to estimate the ESG ratings of domestic public institutions. To achieve this, the optimal number of clusters was determined by comparing spectral clustering and k-means clustering. These results are supported by the Davies-Bouldin Index (DBI), a model performance index. The DBI values were 0.734 for spectral clustering and 1.715 for k-means clustering; since lower values indicate better performance, the superiority of spectral clustering was confirmed. Furthermore, the t-test and ANOVA were used to reveal statistically significant differences between ESG non-financial data, and correlation coefficients were used to confirm the relationships between ESG indicators. Based on these results, this study suggests the possibility of estimating the ESG performance ranking of each public institution without existing ESG ratings. This is achieved by calculating the optimal number of clusters and then determining the sum of averages of the ESG data within each cluster. Therefore, the proposed model can be employed to evaluate the ESG ratings of various domestic public institutions, and it is expected to be useful in domestic sustainable management practice and performance management.
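
To show why a lower Davies-Bouldin Index indicates a better partition, the following Python sketch computes the index in its common form: for each cluster, take the worst ratio of summed within-cluster scatter to centroid distance, then average over clusters. The two toy partitions are invented for illustration and are not the study's ESG data.

```python
import math

def centroid(points):
    """Component-wise mean of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def davies_bouldin(clusters):
    """Davies-Bouldin Index of a partition given as a list of point lists."""
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance from each point to its own cluster centroid.
    scatter = [sum(math.dist(p, ctr) for p in c) / len(c)
               for c, ctr in zip(clusters, centers)]
    k = len(clusters)
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k

# Same within-cluster spread, but different separation between clusters.
far_apart = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
close_by = [[(0, 0), (0, 2)], [(3, 0), (3, 2)]]

dbi_far = davies_bouldin(far_apart)
dbi_close = davies_bouldin(close_by)
print(dbi_far, dbi_close)
```

The well-separated partition scores lower than the crowded one, which matches the reading of the index used in the abstract to prefer spectral clustering over k-means.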