• Title/Summary/Keyword: clustering by unsupervised learning


Clustering In Tied Mixture HMM Using Homogeneous Centroid Neural Network (Homogeneous Centroid Neural Network에 의한 Tied Mixture HMM의 군집화)

  • Park Dong-Chul;Kim Woo-Sung
    • The Journal of Korean Institute of Communications and Information Sciences, v.31 no.9C, pp.853-858, 2006
  • The tied mixture hidden Markov model (TMHMM) is an important approach to reducing the number of free parameters in speech recognition. However, its recognition accuracy degrades because of errors in clustering the Gaussian probability density functions (GPDFs). This paper proposes a clustering algorithm, called the Homogeneous Centroid Neural Network (HCNN), to cluster acoustic feature vectors in TMHMM. The HCNN uses a heterogeneous distance measure to allocate more code vectors to the heterogeneous areas where the probability densities of different states overlap. When applied to isolated-word recognition of Korean digits, the HCNN reduces the error rate by 9.39% compared with CNN clustering and by 14.63% compared with traditional K-means clustering.

Characterization of Premature Ventricular Contraction by K-Means Clustering Learning Algorithm with Mean-Reverting Heart Rate Variability Analysis (평균회귀 심박변이도의 K-평균 군집화 학습을 통한 심실조기수축 부정맥 신호의 특성분석)

  • Kim, Jeong-Hwan;Kim, Dong-Jun;Lee, Jeong-Whan;Kim, Kyeong-Seop
    • The Transactions of The Korean Institute of Electrical Engineers, v.66 no.7, pp.1072-1077, 2017
  • Mean-reverting analysis is a way of estimating the underlying tendency of a system after new data has perturbed its equilibrium state. In this paper, we propose a new method for interpreting the characteristic patterns of premature ventricular contraction (PVC) arrhythmia by applying the K-means unsupervised learning algorithm to electrocardiogram (ECG) data. To this end, we apply a mean-reverting model to analyse heart rate variability (HRV) on a modified Poincaré plot, treating the PVC rhythm as the component that disrupts homeostasis. Experiments on the MIT-BIH ECG database show that the patterns produced by K-means clustering of mean-reverting HRV data are more clearly visible, and that the relative Euclidean distance between cluster centroids can be used to distinguish normal sinus rhythm from PVC beats.
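The idea of clustering a Poincaré plot and comparing centroid distances can be sketched as below. This is a minimal illustration, not the authors' method: it uses a plain (unmodified) Poincaré plot of successive RR intervals, no mean-reverting model, made-up RR values, and hand-picked initial centroids. All names (`poincare_points`, `kmeans`, `rr`, `initial`) are hypothetical.

```python
import math

def poincare_points(rr):
    """Pair each RR interval with the next one: (RR_n, RR_{n+1})."""
    return list(zip(rr[:-1], rr[1:]))

def assign(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for every point."""
    return [min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points]

def kmeans(points, centroids, max_iter=50):
    """Lloyd's K-means with caller-supplied initial centroids."""
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else old)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Hypothetical RR intervals in seconds (assumption, not MIT-BIH data):
# steady beats near 0.8 s, with two early beats each followed by a long pause.
rr = [0.80, 0.82, 0.79, 0.81, 0.50, 1.10, 0.80,
      0.81, 0.80, 0.52, 1.08, 0.79, 0.80]
points = poincare_points(rr)

# Hand-picked starting centroids: one for steady rhythm, three for the
# off-diagonal points that an early beat produces.
initial = [(0.8, 0.8), (0.8, 0.5), (0.5, 1.1), (1.1, 0.8)]
centroids, labels = kmeans(points, initial)

# Euclidean distance from the steady-rhythm centroid to each other centroid.
distances = [math.dist(centroids[0], c) for c in centroids[1:]]
print(centroids)
print(distances)
```

In this toy data the steady-rhythm points stay close to the diagonal, while the irregular beats form clusters whose centroids lie well away from it, which is the kind of separation the abstract measures with the Euclidean metric.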

Preliminary Study on Image Processing Method for Concrete Temperature Monitoring using Thermal Imaging Camera (열화상카메라 기반 콘크리트 온도 측정을 위한 이미지 프로세싱 적용 기초 연구)

  • Mun, Seong-Hwan;Kim, Tae-Hoon;Cho, Kyu-Man
    • Proceedings of the Korean Institute of Building Construction Conference, 2020.06a, pp.206-207, 2020
  • Accurate estimation of concrete strength development at early ages is critical both for securing structural stability and for speeding up the construction process. The temperature generated by the heat of hydration is considered a key parameter in predicting early-age strength. Conventionally, concrete temperature has been measured by temperature sensors installed inside the concrete. However, for building structures with multiple floors, this method requires reinstalling and repositioning hardware such as sensors, data loggers, and routers for data transfer, which makes temperature monitoring cumbersome and inefficient. Concrete temperature monitoring using thermal remote sensing can be an effective alternative that addresses those shortcomings. In this study, image processing was carried out using the K-means clustering technique, an unsupervised learning method, and the classification results were analyzed accordingly. Future research will examine how to automatically recognize concrete among various objects by using deep learning techniques.


Velocity Dispersion Bias of Galaxy Groups classified by Machine Learning Algorithm

  • Lee, Youngdae;Jeong, Hyunjin;Ko, Jongwan;Lee, Joon Hyeop;Lee, Jong Chul;Lee, Hye-Ran;Yang, Yujin;Rey, Soo-Chang
    • The Bulletin of The Korean Astronomical Society, v.44 no.2, pp.74.2-74.2, 2019
  • We present a possible bias in the estimation of velocity dispersions for galaxy groups due to the contribution of subgroups that are infalling into the groups. We carry out a systematic search for flux-limited galaxy groups and subgroups among the spectroscopic galaxies with r < 17.77 mag in SDSS Data Release 12, using DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and hierarchical clustering, both well-known unsupervised machine learning algorithms. A total of 2042 groups with at least 10 members are found, and ~20% of them have subgroups. We find that the velocity dispersions of groups estimated from all galaxies, including those in subgroups, are underestimated by ~10% compared with estimates that use only galaxies in the main groups. This result suggests that subgroups should be properly accounted for when measuring the masses of galaxy groups from their velocity dispersions.

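The DBSCAN step named in the abstract above can be illustrated with a minimal pure-Python sketch. The abstract does not say which coordinates or parameters were used for the group search, so the 2-D toy points, `eps`, and `min_pts` below are assumptions chosen only to show how dense regions become clusters and isolated points become noise.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)      # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:   # not a core point
            labels[i] = NOISE
            continue
        cluster_id += 1                # start a new cluster from this core point
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:     # noise reachable from a core point
                labels[j] = cluster_id # becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

# Toy positions (assumption): two compact clumps and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike K-means, this approach does not need the number of clusters in advance, which suits a blind search for groups of unknown number and size.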

Detection of Traffic Anomalities using Mining : An Empirical Approach (마이닝을 이용한 이상트래픽 탐지: 사례 분석을 통한 접근)

  • Kim Jung-Hyun;Ahn Soo-Han;Won You-Jip;Lee Jong-Moon;Lee Eun-Young
    • Journal of KIISE:Information Networking, v.33 no.3, pp.201-217, 2006
  • In this paper, we collect physical traces from high-speed Internet backbone traffic and analyze various characteristics of the underlying packet traces. In particular, our work focuses on the characteristics of anomalous traffic. We find that the anomalous traffic in our data is caused by UDP session traffic, and we determine that it is a Denial of Service attack. We adopt an unsupervised machine learning algorithm to classify the network flows, applying the k-means clustering algorithm to train the learner. Using the Cramér-von Mises test, we confirm that the proposed classification method, which is able to detect anomalous traffic within 1 second, can accurately predict the class of a flow and can be used effectively to identify anomalous flows.

Unsupervised Motion Learning for Abnormal Behavior Detection in Visual Surveillance (영상감시시스템에서 움직임의 비교사학습을 통한 비정상행동탐지)

  • Jeong, Ha-Wook;Chang, Hyung-Jin;Choi, Jin-Young
    • Journal of the Institute of Electronics Engineers of Korea SC, v.48 no.5, pp.45-51, 2011
  • In this paper, we propose an unsupervised learning method for effectively modeling motion trajectory patterns. In our approach, observations of an object along a trajectory are treated as words in a document for the latent Dirichlet allocation algorithm, which is used in natural language processing to cluster words by topic. This allows topics (e.g. go straight, turn left, turn right) to be clustered effectively in complex scenes such as crossroads. We then learn the patterns of word sequences in each cluster using the Baum-Welch algorithm, which finds the unknown parameters of a hidden Markov model. Abnormality can be evaluated with the forward algorithm by comparing an input sequence against the learned sequences. Experimental results show that the modeling of semantic regions is robust against noise in various scenes.
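The last step, scoring an input sequence with the forward algorithm, can be sketched as follows. This is only an illustration under assumptions: the two-state model, its probabilities, and the two-symbol "word" vocabulary are invented here rather than learned with Baum-Welch, and the names `forward_likelihood`, `typical`, and `odd` are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm."""
    n_states = len(start)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)

# Hypothetical model of a "go straight, then turn" pattern.
# Symbols: 0 = moving east, 1 = moving north (assumed vocabulary).
start = [1.0, 0.0]                 # always begin in state 0
trans = [[0.7, 0.3],               # state 0 may hand over to state 1
         [0.0, 1.0]]               # state 1 never goes back
emit = [[0.9, 0.1],                # state 0 mostly emits symbol 0
        [0.1, 0.9]]                # state 1 mostly emits symbol 1

typical = [0, 0, 1, 1]             # east, east, north, north
odd = [1, 0, 1, 0]                 # zig-zag that the model rarely produces

p_typical = forward_likelihood(typical, start, trans, emit)
p_odd = forward_likelihood(odd, start, trans, emit)
print(p_typical, p_odd)
```

A sequence whose likelihood under every learned model falls below some threshold could then be flagged as abnormal; the threshold itself would have to be tuned on data. For long sequences the raw product underflows, so a practical version would rescale `alpha` at each step or work in log space.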

An Improved Clustering Method with Cluster Density Independence (클러스터 밀도에 무관한 향상된 클러스터링 기법)

  • Yoo, Byeong-Hyeon;Kim, Wan-Woo;Heo, Gyeongyong
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference, 2015.10a, pp.248-249, 2015
  • Clustering is one of the most important unsupervised learning methods; it partitions data into homogeneous groups. However, cluster centers tend to lean toward high-density clusters because clustering is based on the distances between data points and cluster centers. This paper introduces a modified clustering method that forces cluster centers apart by adding a center-scattering term to the Fuzzy C-Means objective function. The proposed method converges closer to the real centers in fewer iterations than the original method. These strengths are verified by experimental results.

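For reference, the standard Fuzzy C-Means formulation that the abstract above modifies can be written as follows. The center-scattering term is not given in the abstract, so it is not reproduced here. The notation is the conventional one and is an assumption about the paper's symbols: $c$ clusters, $n$ data points $x_k$, centers $v_i$, memberships $u_{ik}$, and fuzzifier $m > 1$.

```latex
\begin{align}
  J_m(U, V) &= \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \, \lVert x_k - v_i \rVert^2,
  \qquad \text{subject to } \sum_{i=1}^{c} u_{ik} = 1 \ \text{for every } k, \\
  v_i &= \frac{\sum_{k=1}^{n} u_{ik}^{m} \, x_k}{\sum_{k=1}^{n} u_{ik}^{m}}, \\
  u_{ik} &= \left[ \sum_{j=1}^{c}
      \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_j \rVert} \right)^{\frac{2}{m-1}}
    \right]^{-1}.
\end{align}
```

Because each center $v_i$ is a membership-weighted mean of all points, a dense cluster contributes many large weights and pulls nearby centers toward itself, which is the bias the abstract describes.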

Design and Implentation of Body Fat Percentage Analysis Model using K-means and CNN (K-means와 CNN을 활용한 체지방율 분석 모델 설계 및 구현)

  • Lee, Taejun;Park, Chanmyeong;Kim, Changsu;Jung, Heokyung
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference, 2021.10a, pp.329-331, 2021
  • Recently, as deep learning is increasingly used in the health-care field, functions such as electrocardiogram examination and body composition analysis can be provided through wearable devices to support rational decision-making and processes tailored to the individual. To make use of deep learning, it is most important to secure refined data, and such data is produced either through human intervention or through unsupervised learning. In this paper, we propose a model that performs unsupervised learning by clustering according to gender and age, using human body data that are easy to measure, such as chest and waist circumferences, and then classifies the results with a CNN. For data, we used the 7th human body data provided by the Korean Agency for Technology and Standards. We expect that the model can be applied to a range of applications such as personalized body shape management services and obesity analysis.


A Study on Preprocessing Method in Deep Learning for ICS Cyber Attack Detection (ICS 사이버 공격 탐지를 위한 딥러닝 전처리 방법 연구)

  • Seonghwan Park;Minseok Kim;Eunseo Baek;Junghoon Park
    • Smart Media Journal, v.12 no.11, pp.36-47, 2023
  • Industrial Control Systems (ICS), which control facilities at major industrial sites, are increasingly connected to other systems through networks. With this integration, and with the development of intelligent attacks in which a single external intrusion can paralyze a whole system, the security risks to industrial control systems and their impact are increasing. As a result, research on protecting against and detecting cyber attacks is actively underway, deep learning models based on unsupervised learning have achieved a great deal, and many deep-learning-based anomaly detection technologies are being introduced. In this study, we focus on applying preprocessing methodologies to enhance the anomaly detection performance of deep learning models on time series data. The results demonstrate the effectiveness of Wavelet Transform (WT)-based noise reduction as a preprocessing technique for deep learning-based anomaly detection. In particular, differentially applying the Dual-Tree Complex Wavelet Transform according to sensor characteristics identified through clustering proves to be the most effective approach to improving the detection performance for cyber attacks.
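Wavelet-based noise reduction of a sensor signal can be sketched in a few lines. Note the substitution: the study uses the Dual-Tree Complex Wavelet Transform, whereas this sketch uses a single-level Haar transform with soft thresholding, which is the simplest member of the same family. The signal values and the threshold are made up for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """Single-level Haar transform of an even-length signal."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / SQRT2 for a, b in pairs]   # smooth trend
    detail = [(a - b) / SQRT2 for a, b in pairs]   # fast fluctuations
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the signal from approximation and detail coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def soft_threshold(values, threshold):
    """Shrink each coefficient toward zero; small ones become zero."""
    return [math.copysign(max(abs(v) - threshold, 0.0), v) for v in values]

def denoise(signal, threshold):
    """Suppress small detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))

# Hypothetical sensor trace: a step from 1.0 to 3.0 plus small alternating noise.
clean = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]
noisy = [1.1, 0.9, 1.1, 0.9, 3.1, 2.9, 3.1, 2.9]

denoised = denoise(noisy, threshold=0.2)
print(denoised)
```

One way to read the abstract's "differential application" is that sensors grouped into different clusters would receive different transform settings, for example a different threshold per cluster; that reading is an assumption, and the sketch above applies a single threshold.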

K-Means Clustering with Content Based Doctor Recommendation for Cancer

  • Kumar, Rethina;Ganapathy, Gopinath;Kang, Jeong-Jin
    • International Journal of Advanced Culture Technology, v.8 no.4, pp.167-176, 2020
  • Recommendation systems are in high demand among people and researchers who need suitable suggestions matched to their personal needs, such as sorting and suggesting doctors for a patient. Most rating predictions in recommendation systems are based on patients' feedback and on information about their treatment. A patient's preferences are predicted from the historical behaviour of similar patients. The similarity between patients is generally measured from their feedback, together with information about the doctor, the treatment methods, and their success rates. This paper presents a new method of predicting top-ranked doctors in recommendation systems. The proposed recommendation system starts by identifying similar doctors based on the patients' health requirements and clusters them using K-Means Efficient Clustering. Our proposed K-Means Clustering with Content Based Doctor Recommendation for Cancer (KMC-CBD) helps users find an optimal solution. The core component of the KMC-CBD system suggests top recommended doctors to a patient by drawing on similar patients who have already been treated by those doctors, and supports the choice of doctor and hospital for the patient's requirements and health condition. The system first applies K-Means clustering, an unsupervised learning method, to the doctors according to their profiles and lists the doctors according to their medical profiles. The content-based doctor recommendation system then generates a top-rated list of doctors for the given patient profile by exploiting health data shared by the Internet crowd community. Patients can find the most similar patients, analyze how they were treated for similar diseases, and send and receive suggestions to address their health issues. To improve the efficiency of the recommendation system, patients can express their health information in a natural-language sentence. The system analyzes the sentence, identifies the most relevant medical area for that specific case, and uses this information, provided by users as well as by the system itself, to suggest the right doctors for a specific health problem. Our proposed system is implemented in Python with the necessary functions and dataset.
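The content-based ranking step can be illustrated with a small sketch. The abstract does not describe the profile features or the similarity measure, so everything here is an assumption: profiles are binary feature vectors, similarity is cosine similarity, and the doctor names and feature meanings are invented. The K-means grouping of doctors is omitted for brevity.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_doctors(doctors, patient):
    """Doctor names sorted from most to least similar to the patient profile."""
    return sorted(doctors, key=lambda name: cosine(doctors[name], patient),
                  reverse=True)

# Hypothetical profiles. Assumed feature order:
# [oncology, cardiology, chemotherapy experience, surgery experience]
doctors = {
    "dr_a": [1, 0, 1, 0],
    "dr_b": [0, 1, 0, 1],
    "dr_c": [1, 1, 1, 0],
}
patient = [1, 0, 1, 0]   # needs an oncologist with chemotherapy experience

ranked = rank_doctors(doctors, patient)
print(ranked)
```

In a pipeline like the one described, the clustering step would first narrow the candidates to the cluster of doctors whose profiles are closest to the patient's needs, and a ranking of this kind would then order the doctors within that cluster.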