• Title/Summary/Keyword: index clustering

Image compression using K-mean clustering algorithm

  • Munshi, Amani;Alshehri, Asma;Alharbi, Bayan;AlGhamdi, Eman;Banajjar, Esraa;Albogami, Meznah;Alshanbari, Hanan S.
    • International Journal of Computer Science & Network Security / v.21 no.9 / pp.275-280 / 2021
  • With the development of communication networks, the exchange and transmission of information have developed rapidly. Millions of images are sent via social media every day, and wireless sensor networks are now used in all kinds of applications to capture images, for example at traffic lights, on roads, and in malls. There is therefore a need to reduce the size of these images while maintaining an acceptable degree of quality. In this paper, we use Python to apply the K-means clustering algorithm to compress RGB images. PSNR, MSE, and SSIM are used to measure image quality after compression. With k = 64, compression reduced the images to nearly half their original size. On the SSIM measure, the higher the value of k, the greater the similarity between the two images, which is a good indicator given the significant reduction in image size. The proposed compression technique, based on the K-means clustering algorithm, is useful for compressing images and reducing their size.
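The idea in this abstract, replacing every pixel with the nearest of k cluster colors and then scoring the result, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names (`kmeans_quantize`, `mse`, `psnr`), the plain Lloyd iteration, the fixed iteration count, and the flat list of RGB tuples are all assumptions made for the example. SSIM is omitted because it needs windowed image statistics.

```python
import math
import random

def nearest(pixel, centroids):
    """Index of the centroid closest to `pixel` (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(pixel, centroids[j])))

def kmeans_quantize(pixels, k, iters=10, seed=0):
    """Hypothetical helper: cluster RGB tuples and map each pixel to its centroid."""
    rng = random.Random(seed)
    centroids = [tuple(map(float, p)) for p in rng.sample(pixels, k)]
    for _ in range(iters):
        # Assignment step: label every pixel with its nearest centroid.
        labels = [nearest(p, centroids) for p in pixels]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    labels = [nearest(p, centroids) for p in pixels]
    return [centroids[lab] for lab in labels], centroids

def mse(original, compressed):
    """Mean squared error over all pixels and all three channels."""
    total = sum((a - b) ** 2
                for p, q in zip(original, compressed)
                for a, b in zip(p, q))
    return total / (3 * len(original))

def psnr(original, compressed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for a perfect reconstruction."""
    err = mse(original, compressed)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)
```

Storing one small palette index per pixel plus the k centroid colors, instead of three full channels, is where the size reduction would come from. A larger k lowers the MSE and raises the PSNR at the cost of a larger palette.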

Opera Clustering: K-means on librettos datasets

  • Jeong, Harim;Yoo, Joo Hun
    • Journal of Internet Computing and Services / v.23 no.2 / pp.45-52 / 2022
  • With the development of artificial intelligence analysis methods, especially machine learning, various fields are widely expanding their range of applications. In classical music, however, some difficulties remain in applying machine learning techniques. Genre classification and music recommendation systems built with deep learning algorithms are actively used for general music, but not for classical music. In this paper, we attempt to classify opera within classical music. To this end, we conducted an experiment to determine which criterion is most suitable among composer, period of composition, and emotional atmosphere, which are basic features of music. To generate emotional labels, we adopted zero-shot classification with four basic emotions: 'happiness', 'sadness', 'anger', and 'fear'. After embedding the opera librettos with the doc2vec model, the optimal number of clusters was computed using the elbow method. The resulting four centroids were then used in k-means clustering to group the unlabeled libretto dataset. We obtained an optimized clustering based on adjusted Rand index scores and compared it with the annotated variables of the music. The four clusters computed by the machine were most similar to the grouping by period. Additionally, we verified that emotional similarity between composer and period did not appear significantly. Having found that period is the right criterion, we hope this makes it easier for listeners to find music that suits their tastes.
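The comparison step described here, scoring how well a set of cluster assignments agrees with a reference grouping such as period or composer, relies on the adjusted Rand index. A minimal sketch of the standard pair-counting formula follows. The function name and the toy labels in the usage note are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same length (at least 2 items)")
    # Pairs of items grouped together in both labelings (contingency cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together within each labeling separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)   # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:               # degenerate case, e.g. one cluster each
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

As a usage example, `adjusted_rand_index(cluster_ids, period_labels)` would return a value close to 1 when the clusters match the period grouping and a value near 0 when agreement is no better than chance. The score does not depend on how the cluster ids are numbered.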

K-Means Clustering with Deep Learning for Fingerprint Class Type Prediction

  • Mukoya, Esther;Rimiru, Richard;Kimwele, Michael;Mashava, Destine
    • International Journal of Computer Science & Network Security / v.22 no.3 / pp.29-36 / 2022
  • In deep learning classification tasks, most models assume that all labels are available for the training datasets. As a result, strategies for learning new concepts from unlabeled datasets are scarce. In fingerprint classification, most fingerprint datasets are labelled by subject/individual, and datasets labelled with fingerprint class types are scarce. In this paper, the authors develop approaches for classifying fingerprint images into the commonly known fingerprint classes. Our study provides a flexible method for learning new classes of fingerprints. Our classifier model combines clustering and deep learning to cluster, and hence label, the fingerprint images into appropriate classes. The K-means clustering strategy explores the label uncertainty and high-density regions of the unlabeled data to be clustered. Using a similarity index, five clusters are created. Deep learning is then used to train a model on a publicly known fingerprint dataset with known fingerprint class types. A prediction technique is then employed to predict the classes of the clusters from the trained model. Our proposed model performs better and has lower computational cost in learning new classes, and hence significantly reduces the labelling cost of fingerprint images.

Multiscale Analysis on Expectation of Mechanical Behavior of Polymer Nanocomposites using Nanoparticulate Agglomeration Density Index (나노 입자의 군집밀도를 이용한 고분자 나노복합재의 기계적 거동 예측에 대한 멀티스케일 연구)

  • Baek, Kyungmin;Shin, Hyunseong;Han, Jin-Gyu;Cho, Maenghyo
    • Composites Research / v.30 no.5 / pp.323-330 / 2017
  • In this study, a multiscale analysis, in which information obtained from molecular dynamics simulation is applied at the continuum mechanics level, is conducted to investigate how clustering of silicon carbide nanoparticles reinforcing a polypropylene matrix affects the mechanical behavior of the nanocomposites. The elastic behavior of the polymer nanocomposites is observed for various states of nanoparticulate agglomeration using a model that reflects the degradation of interphase properties. In addition, the factors that mainly affect the mechanical behavior of the nanocomposites are identified, and a new index, 'clustering density', is defined. The correlation between the clustering density and the elastic modulus of the nanocomposites is examined. As the clustering density increases, the interfacial effect decreases, and the improvement in mechanical properties is ultimately suppressed. By considering the random distribution of the nanoparticles, the range of the elastic modulus for a given value of clustering density can be investigated. The correlation can be expressed as an exponential function, and the mechanical behavior of the polymer nanocomposites can be effectively predicted using the nanoparticulate clustering density.
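The abstract states only that the relation between clustering density and elastic modulus takes an exponential form. As an illustration of how such a relation could be fitted, the sketch below assumes the specific form E = a * exp(b * rho) and fits it by linear least squares on ln(E). The functional form, the function name, and the variable names are assumptions, and no values from the paper are used.

```python
import math

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by ordinary least squares on ln(y).

    Assumes every y value is positive. Returns (a, b).
    """
    n = len(x)
    ln_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_ln = sum(ln_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ln) for xi, li in zip(x, ln_y))
    b = sxy / sxx                       # slope of ln(y) against x
    a = math.exp(mean_ln - b * mean_x)  # intercept, mapped back from log space
    return a, b
```

Fitting in log space weights relative rather than absolute errors. A nonlinear least-squares fit on the raw values would be the alternative if absolute errors in the modulus matter more.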

A Study on the Musical Theme Clustering for Searching Note Sequences (음렬 탐색을 위한 주제소절 자동분류에 관한 연구)

  • 심지영;김태수
    • Journal of the Korean Society for Information Management / v.19 no.3 / pp.5-30 / 2002
  • In this paper, classification features are selected with a focus on musical content, namely note sequence patterns. Similarity between note sequences is measured, and clusters of similar note sequences are constructed, which makes searching easier for users by showing similar note sequences alongside the search results in a CBMR system. The experimental material was "A Dictionary of Musical Themes", an index of theme bars focused on classical music, obtained as kern-type files. Humdrum Toolkit version 1.0 was used as the note sequence processing tool. Hierarchical clustering was applied in stages to four types of similarity matrices, which differ in whether the note sequences are segmented and where the starting point lies. To evaluate the results, the WACS standard was used where a manual classification existed, and for note sequences starting from any point, the distribution of common feature patterns within the resulting clusters was used. According to the results, clustering with segmented features, regardless of the starting point, performed distinctly better than clustering with non-segmented features.

Spatial Analysis of Common Gastrointestinal Tract Cancers in Counties of Iran

  • Soleimani, Ali;Hassanzadeh, Jafar;Motlagh, Ali Ghanbari;Tabatabaee, Hamidreza;Partovipour, Elham;Keshavarzi, Sareh;Hossein, Mohammad
    • Asian Pacific Journal of Cancer Prevention / v.16 no.9 / pp.4025-4029 / 2015
  • Background: Gastrointestinal tract cancers are among the most common cancers in Iran and comprise approximately 38% of all reported cancer cases. This study aimed to describe the epidemiology and investigate the spatial clustering of common gastrointestinal tract cancers across the counties of Iran using full Bayesian smoothing and Moran's I statistics. Materials and Methods: Data from the national cancer registry were used in this study. Indirectly standardized rates were calculated for 371 counties of Iran and smoothed with a full Bayesian method using WinBUGS 1.4 software. Global and local Moran's I were also used to investigate clustering. Results: According to the results, 75,644 new cases of cancer were nationally registered in Iran, of which 18,019 cases (23.8%) were esophageal, gastric, colorectal, and liver cancers. The Global Moran's I results were 0.60 (P=0.001), 0.47 (P=0.001), 0.29 (P=0.001), and 0.40 (P=0.001) for esophageal, gastric, colorectal, and liver cancers, respectively. This shows clustering of the four studied cancers in Iran at the national level. Conclusions: A high level of clustering was seen in northern, northwestern, western, and northeastern areas for esophageal, gastric, and colorectal cancers. For liver cancer, high clustering was observed in some counties in central, northeastern, and southern areas.
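The global Moran's I values reported in this abstract measure spatial autocorrelation: positive values mean neighboring counties tend to have similar rates. A minimal sketch of the standard statistic follows. The function name and the list-of-lists weight matrix are assumptions, and the abstract does not say how county neighbors were weighted.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a spatial weight matrix.

    `weights[i][j]` is the (hypothetical) spatial weight between regions
    i and j, for example 1 for neighbors and 0 otherwise, with a zero diagonal.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations over all ordered pairs of regions.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / w_total) * cross / variance_sum
```

Values near +1 indicate that similar rates cluster together in space, values near -1 indicate a checkerboard pattern, and values near the expected value of -1/(n-1) indicate no spatial structure. Significance is usually assessed separately, for instance by permutation.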

A Cell-based Clustering Method for Large High-dimensional Data in Data Mining (데이타마이닝에서 고차원 대용량 데이타를 위한 셀-기반 클러스터 링 방법)

  • Jin, Du-Seok;Chang, Jae-Woo
    • Journal of KIISE:Databases / v.28 no.4 / pp.558-567 / 2001
  • Recently, data mining applications have come to require large amounts of high-dimensional data. Most algorithms for data mining applications, however, do not work efficiently on large high-dimensional data because of the so-called curse of dimensionality [1] and the limitation of available memory. To overcome these problems, this paper proposes a new cell-based clustering method that is more efficient than existing algorithms for large high-dimensional data. Our clustering method provides a cell construction algorithm for dealing with large high-dimensional data and an index structure based on filtering. We compare the performance of our cell-based clustering method with the CLIQUE method in terms of clustering time, precision, and retrieval time. Finally, the results of our experiment show that our cell-based clustering method outperforms the CLIQUE method.

Study on clustering of satellite images by K-means algorithm

  • 설상동;김정선
    • Proceedings of the Korean Institute of Communication Sciences Conference / 1987.04a / pp.9-13 / 1987
  • The K-means algorithm was used to classify cloud types, namely low, mixed, and cumulonimbus. In this paper, the initial cluster centers and the K parameter are obtained by coarse computing and Fisher's algorithm. Results indicate that the performance index is minimized and that mixed cloud is well classified.

A Cluster Validity Index for Fuzzy Clustering (퍼지 클러스터링의 타당성 평가 기준)

  • 권순학
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1998.10a / pp.83-89 / 1998
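  • In this paper, we propose a new cluster validity index for fuzzy clustering that suppresses the monotonically decreasing tendency that such indices exhibit as the number of fuzzy clusters increases. We also examine the properties of the proposed index and discuss how it differs from existing fuzzy clustering validity indices. Finally, we demonstrate the usefulness of the proposed index through simulations on several typical data sets that are frequently cited in fuzzy clustering.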

A Study on the Pattern Distribution of Yin-Yang Ren [음양인] (Used on Questionnaire) (음양인 유형분류에 관한 연구 (설문지를 중심으로))

  • 이상범;최경미;박영배
    • The Journal of Korean Medicine / v.25 no.1 / pp.1-20 / 2004
  • Objectives: Based on an analysis of Yin-Yang [음양] characteristics and symptoms, each person is classified as Yin or Yang, and the validity of the result is analyzed statistically. Methods: From February to May 2003, data were collected through a questionnaire given to 690 patients. The questionnaire was composed of 34 items covering personality, habit, sweat, response to coldness, thirst, bowel, urine, physical shape, and, for women only, menstruation. The semantic differential (SD) technique was used, with each item measured as a contrast between two opposite symptoms. Reliability analysis was used to select items and categories. The Yin-Yang index was developed based on the means of the items in each category. The validity of the Yin-Yang index was investigated using classification and clustering analysis. SPSS V10.0.7 PC was used for the statistical analysis. Results: The results are summarized as follows. 1) We constructed the Yin-Yang index based on the midpoint of the sum of the categorical means, and then classified each person as Yin or Yang. 2) To investigate the validity of the distribution of personal Yin-Yang degree, a crosstabulation of the clustering and classification results was used. The hit ratio for classification was much higher than the maximum chance criterion ($C_{max}$), and concurrence in the crosstabulation was successful. We can therefore infer that the distribution of Yin-Yang was valid. Conclusions: Based on Yin-Yang characteristics and symptoms, we analyzed each person's degree of Yin-Yang and confirmed the validity of its distribution. This index can therefore be used further for Bian-Zheng [변증] and classification of the constitution.
