• Title/Summary/Keyword: K-means clustering (K-means 군집화)

Patterning Zooplankton Dynamics in the Regulated Nakdong River by Means of the Self-Organizing Map (자가조직화 지도 방법을 이용한 조절된 낙동강 내 동물플랑크톤 역동성의 모형화)

  • Kim, Dong-Kyun;Joo, Gea-Jae;Jeong, Kwang-Seuk;Chang, Kwang-Hyson;Kim, Hyun-Woo
    • Korean Journal of Ecology and Environment / v.39 no.1 s.115 / pp.52-61 / 2006
  • The aim of this study was to analyze the seasonal patterns of zooplankton community dynamics in the lower Nakdong River (Mulgum, RK 27; RK: river kilometer, i.e., 27 km from the estuarine barrage) with a Self-Organizing Map (SOM), based on weekly samples collected over ten years (1994-2003). It is well known that zooplankton play an important role in the food web of freshwater ecosystems; however, less attention has been paid to this group than to other community constituents. The non-linear patterning algorithm of the SOM was applied to discover the relationship between river environments and zooplankton community dynamics. Limnological variables (water temperature, dissolved oxygen, pH, Secchi transparency, turbidity, chlorophyll a, discharge, etc.) were taken into account to pattern seasonal changes in zooplankton community structure (rotifers, cladocerans, and copepods). The trained SOM model arranged the zooplankton groups on the map plane together with the limnological parameters. The three zooplankton groups showed highly similar seasonal patterns. Among the limnological variables, water temperature was highly related to zooplankton community dynamics (especially for cladocerans). The SOM model illustrated the suppression of zooplankton by increased river discharge, particularly in summer. Chlorophyll a concentrations were separated from the zooplankton data on the map plane, which suggests herbivorous activity by the dominant grazers. This study describes zooplankton dynamics in relation to limnological parameters using a nonlinear method, and the information will be useful for managing the river ecosystem with respect to food web interactions.

Online VQ Codebook Generation using a Triangle Inequality (삼각 부등식을 이용한 온라인 VQ 코드북 생성 방법)

  • Lee, Hyunjin
    • Journal of Digital Contents Society / v.16 no.3 / pp.373-379 / 2015
  • In this paper, we propose an online VQ codebook generation method that updates an existing VQ codebook in real time, adding newly created text data (newspapers, web pages, blogs, and tweets) and IoT data (such as sensor and machine data) to existing clusters. Without degrading the performance of the batch VQ codebook on the existing data, the method takes advantage of the newly added data by using the triangle inequality to modify the VQ codebook progressively, which yields a high degree of accuracy and speed. Results on test data showed that the performance is similar to that of the batch method.
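The abstract does not give the algorithm, so the following is only a minimal sketch of how a triangle inequality is commonly used to speed up nearest-codeword search during an online update. It is not the author's method. The function names (`nearest_codeword`, `online_update`), the Euclidean distance, and the running-mean update rule are all assumptions made for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def pairwise(codebook):
    """Precompute codeword-to-codeword distances."""
    return [[dist(a, b) for b in codebook] for a in codebook]


def nearest_codeword(x, codebook, cc_dist):
    """Return (index, distance, number of distance evaluations)."""
    best, best_d, evals = 0, dist(x, codebook[0]), 1
    for j in range(1, len(codebook)):
        # Triangle inequality: d(x, c_j) >= d(c_best, c_j) - d(x, c_best).
        # If d(c_best, c_j) >= 2 * d(x, c_best), c_j cannot be closer.
        if cc_dist[best][j] >= 2.0 * best_d:
            continue
        d = dist(x, codebook[j])
        evals += 1
        if d < best_d:
            best, best_d = j, d
    return best, best_d, evals


def online_update(x, codebook, counts, cc_dist):
    """Assign x to its nearest codeword and move that codeword toward x."""
    k, _, _ = nearest_codeword(x, codebook, cc_dist)
    counts[k] += 1
    lr = 1.0 / counts[k]  # running-mean step size (an assumption)
    codebook[k] = [c + lr * (xi - c) for c, xi in zip(codebook[k], x)]
    # Only row and column k of the distance table need refreshing.
    for j in range(len(codebook)):
        d = dist(codebook[k], codebook[j])
        cc_dist[k][j] = d
        cc_dist[j][k] = d
    return k
```

With well-separated codewords, most candidates are skipped without computing a distance, which is where the speed-up comes from; the result is identical to an exhaustive search.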

A Study on Degradation Pattern of GIS Using Clustering Methode (군집화 기법을 이용한 GIS 열화 패턴 연구)

  • Lee, Deok Jin
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.31 no.4 / pp.255-260 / 2018
  • In recent years, increasing electricity use has led to considerable interest in green energy. To supply, cut off, and operate an electric power system effectively, many electric power facilities with higher densities, such as gas-insulated switchgear (GIS), cables, and large substation facilities, are being developed to meet demand. However, because of the increased use of aging electric power facilities, safety problems are emerging. Electromagnetic wave detection and leakage current detection are the main sensing methods for detecting live-line partial discharges. Electromagnetic sensors are very reliable and excellent for initial diagnosis, but they make it difficult to locate the fault point precisely, whereas leakage current sensors require a connection to the ground line and are very vulnerable to line noise. Partial discharge characteristics in particular are accompanied by statistical irregularity, and it has been reported that proper statistical processing of the data is very important. Therefore, in this paper, we present the results of analyzing $\Phi$-$q$-$n$ cluster distributions of partial discharge characteristics using K-means clustering, with the aim of developing an expert system for diagnosing partial discharges generated in a GIS facility.
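As a rough illustration of the K-means step mentioned in this abstract, the sketch below runs Lloyd's algorithm on a handful of (phase, charge) pairs. The data values, the initial centers, and the function name `kmeans` are hypothetical, and the paper's actual features and preprocessing are not stated in the abstract.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, centers, n_iter=20):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j, pt=pt: sq_dist(pt, centers[j]))
            for pt in points
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


# Hypothetical (phase in degrees, charge magnitude) samples.
pd_points = [(45, 1.0), (50, 1.2), (55, 0.9), (225, 3.0), (230, 3.2), (235, 2.9)]
pd_labels, pd_centers = kmeans(pd_points, centers=[(0, 0), (360, 5)])
```

In practice the features would usually be scaled before clustering, since phase and charge have very different ranges, and the circular nature of the phase angle may need special handling.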

Classification of the Aged Distribution and the Occupational-Demographic Characteristics in the Seoul Metropolitan Area (수도권 고령층 분포지역의 유형화와 유형별 거주 및 고용 특성 분석)

  • Park, So Hyun;Lee, Keumsook
    • Journal of the Korean Regional Science Association / v.33 no.3 / pp.79-100 / 2017
  • This study provides insight into the issue of providing employment for the aged in an era of aging and low growth. For this purpose, we first analyze national trends in age demographics and occupational employment. We then investigate the spatial characteristics of employment of the aged in the Seoul Metropolitan Area, which has the largest elderly population, using location quotients, factor analysis, and K-means cluster analysis. We found that the spatial distributions of the residences and workplaces of the elderly nearly coincide. Furthermore, five clusters of aged distribution were identified according to industrial, occupational, and demographic attributes. The results reveal clear spatial segmentation: most of the elderly population in the study area is engaged in low-level service jobs, while the elderly employed in high-level, education- and knowledge-based jobs are concentrated in a few cores. The results could be applied to regional employment planning for the aged.
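For readers unfamiliar with the location quotient used above, here is a small sketch of the standard definition: a region's share of employment in a sector divided by the reference area's share in the same sector. The function name and the example figures are hypothetical and are not taken from the paper.

```python
def location_quotient(local_sector, local_total, ref_sector, ref_total):
    """LQ = (local sector share) / (reference sector share)."""
    return (local_sector / local_total) / (ref_sector / ref_total)


# Hypothetical counts: 200 of 1,000 elderly workers in a district work in a
# sector that employs 1,000 of 10,000 elderly workers in the reference area.
lq = location_quotient(200, 1000, 1000, 10000)
```

A value above 1 indicates that the sector is more concentrated in the district than in the reference area; a value below 1 indicates the opposite.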

Performance Improvement of Radial Basis Function Neural Networks Using Adaptive Feature Extraction (적응적 특징추출을 이용한 Radial Basis Function 신경망의 성능개선)

  • 조용현
    • Journal of Korea Multimedia Society / v.3 no.3 / pp.253-262 / 2000
  • This paper proposes a new RBF neural network that determines the number and centers of its hidden neurons through adaptive feature extraction from the input data. Principal component analysis is applied to extract the features adaptively by reducing the dimension of the given input data. The network thus combines the advantage of principal component analysis, which maps the input data onto a set of statistically independent features, with that of RBF neural networks. The proposed neural network was applied to classifying 200 breast cancer data samples into two classes. The simulation results show that the proposed neural network achieves better learning time and better classification of test data than networks using the k-means clustering algorithm. It is also less affected than the k-means clustering algorithm by the initial weight setting and the range of the smoothing factor.

A study on the ordering of PIM family similarity measures without marginal probability (주변 확률을 고려하지 않는 확률적 흥미도 측도 계열 유사성 측도의 서열화)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.367-376 / 2015
  • Today, big data has become a hot keyword; it may be defined as a collection of data sets so large and complex that it becomes difficult to process with traditional methods. Clustering identifies information in a large database by assigning a set of objects to clusters so that objects in the same cluster are more similar to each other than to objects in other clusters. The similarity measures used in cluster analysis may be classified into various types depending on the nature of the data. In this paper, we computed upper and lower limits for similarity measures based on probabilistic interestingness measures without marginal probability, such as the Yule I and II, Michael, Digby, Baulieu, and Dispersion measures. We compared these measures using real data and a simulation experiment. According to Warrens (2008), coefficients that have the same quantities in the numerator and denominator, that are bounded, and that are close to each other in the ordering are likely to be more similar. Thus, results on bounds provide a means of classifying various measures. Also, knowing which coefficients are similar provides insight into the stability of a given algorithm.
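To make the idea of ordering such measures concrete, the sketch below computes two coefficients from a 2x2 contingency table with cell counts a, b, c, d. It assumes that "Yule I" and "Yule II" in the abstract refer to Yule's Q and Yule's Y; the abstract itself does not give the formulas, and the function names and counts are hypothetical.

```python
import math


def yule_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc); assumes ad + bc > 0."""
    return (a * d - b * c) / (a * d + b * c)


def yule_y(a, b, c, d):
    """Yule's Y = (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc))."""
    s_ad, s_bc = math.sqrt(a * d), math.sqrt(b * c)
    return (s_ad - s_bc) / (s_ad + s_bc)


# Hypothetical co-occurrence counts for two binary attributes.
q = yule_q(40, 10, 10, 40)
y = yule_y(40, 10, 10, 40)
```

Both coefficients depend only on the products ad and bc, and since Q = 2Y / (1 + Y^2), the inequality |Y| <= |Q| always holds. This is one example of the kind of ordering between bounded measures that the abstract discusses.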

Detecting outliers in multivariate data and visualization-R scripts (다변량 자료에서 특이점 검출 및 시각화 - R 스크립트)

  • Kim, Sung-Soo
    • The Korean Journal of Applied Statistics / v.31 no.4 / pp.517-528 / 2018
  • We provide R scripts to detect and visualize outliers in multivariate data. Outliers are detected using three approaches: 1) robust Mahalanobis distance, 2) methods for high-dimensional data, and 3) density-based methods. We visualize the detected potential outliers with the following techniques: 1) multidimensional scaling (MDS) and minimal spanning tree (MST) with k-means clustering, 2) MDS with fviz_cluster, and 3) principal component analysis (PCA) with fviz_cluster. As real data, we use MLB pitching data from 2013 and 2014, including that of Ryu, Hyun-jin. The developed R scripts can be downloaded at "http://www.knou.ac.kr/~sskim/ddpoutlier.html" (the R package can also be downloaded there).

Performance Improvement of Continuous Digits Speech Recognition Using the Transformed Successive State Splitting and Demi-syllable Pair (반음절쌍과 변형된 연쇄 상태 분할을 이용한 연속 숫자 음 인식의 성능 향상)

  • Seo Eun-Kyoung;Choi Gab-Keun;Kim Soon-Hyob;Lee Soo-Jeong
    • Journal of Korea Multimedia Society / v.9 no.1 / pp.23-32 / 2006
  • This paper describes the optimization of a language model and an acoustic model to improve speech recognition of Korean unit digits. Since the language model is composed of a finite state network (FSN) with disyllables, its recognition errors were reduced by analyzing the grammatical features of Korean unit digits. The acoustic model uses demi-syllable pairs to decrease recognition errors caused by inaccurate division of a phone or monosyllable due to short pronunciation time and articulation. We used the K-means clustering algorithm with transformed successive state splitting at the feature level to model the features of the recognition unit efficiently. In the experiments, the proposed language model raised the recognition rate by 10.5%. The demi-syllable pair in the acoustic model increased the recognition rate by 12.5%, and transformed successive state splitting improved it by 1.5%.

Camera and LiDAR Sensor Fusion for Improving Object Detection (카메라와 라이다의 객체 검출 성능 향상을 위한 Sensor Fusion)

  • Lee, Jongseo;Kim, Mangyu;Kim, Hakil
    • Journal of Broadcast Engineering / v.24 no.4 / pp.580-591 / 2019
  • This paper focuses on improving object detection performance using a camera and LiDAR on autonomous vehicle platforms by fusing the objects detected by the individual sensors through a late fusion approach. For object detection with the camera sensor, the YOLOv3 model was employed as a one-stage detection process, and the distance to each detected object was estimated from the formulation of the perspective matrix. Object detection with LiDAR, on the other hand, is based on the K-means clustering method. The camera and LiDAR were calibrated with PnP-RANSAC to calculate the rotation and translation matrices between the two sensors. For sensor fusion, the intersection over union (IoU) on the image plane and the distance and angle in world coordinates were estimated. The three attributes, i.e., IoU, distance, and angle, were then fused using logistic regression. The performance evaluation in the sensor fusion scenario showed a 5% improvement in object detection performance compared with the use of a single sensor.
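The IoU attribute mentioned in this abstract can be illustrated with a short sketch for axis-aligned boxes on the image plane. The `(x1, y1, x2, y2)` box format and the function name `iou` are assumptions; the paper's own implementation is not described in the abstract.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical camera box and projected LiDAR box.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

In a late fusion setting, a score like this could be computed for each camera/LiDAR detection pair and then combined with the distance and angle attributes, as the abstract describes.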

Determination of coagulant input rate in water purification plant using K-means algorithm and GBR algorithm (K-means 알고리즘과 GBR 알고리즘을 이용한 정수장 응집제 투입률 결정 기법)

  • Kim, Jinyoung;Kang, Bokseon;Jung, Hoekyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.6 / pp.792-798 / 2021
  • In this paper, an algorithm for determining the coagulant input rate in the drug-injection tank of a water purification plant was derived through big data analysis and prediction based on artificial intelligence. In addition, big data technologies, methods of applying AI algorithms, and existing academic and technical material were reviewed in order to examine application cases in similar fields. The goal was to develop an algorithm for determining the coagulant input rate and to present the optimal input rate through an autonomous driving simulator and pilot operation of the coagulant input process. In this study, the coagulant injection rate, the output variable, is determined from various input variables, and the algorithm is developed to model the relationship between the input variables and the output variable and to apply the learned pattern to the decision-making of water plant operators.