• Title/Summary/Keyword: Cluster and Outlier Analysis

The Classification of Forest Cover Types by Consecutive Application of Multivariate Statistical Analysis in the Natural Forest of Western Mt. Jiri (다변량 통계 분석법의 연속 적용에 의한 서부 지리산 천연림의 산림 피복형 분류)

  • Chung, Sang Hoon;Kim, Ji Hong
    • Journal of Korean Society of Forest Science / v.102 no.3 / pp.407-414 / 2013
  • This study classified forest cover types in the natural forest of western Mt. Jiri using multivariate statistical analysis. Based on vegetation data collected by point-quarter sampling, the analytical methods applied were the species-area curve (SAC), hierarchical cluster analysis (HCA), indicator species analysis (ISA), and multiple discriminant analysis (MDA). SAC identified outlier tree species that were unlikely to influence the classification of forest cover types, and these were excluded from all subsequent analyses. Based on the vegetation data, HCA partitioned the study area into 2 to 10 clusters, and ISA indicated that the optimal number of clusters was seven. MDA was then used to test the clusters obtained with HCA and ISA. The seven clusters were classified appropriately, with an overall classification success rate of 91.3%. The forest cover types were named according to the proportion of dominant species in the upper layer of each cluster: (1) Quercus mongolica pure forest, (2) Mixed mesophytic forest, (3) Q. mongolica - Q. serrata forest, (4) Abies koreana - Q. mongolica forest, (5) Fraxinus mandshurica forest, (6) Q. serrata forest, and (7) Carpinus laxiflora forest.
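
The HCA step in the abstract above, which produces candidate partitions of 2 to 10 clusters, can be illustrated with a minimal sketch. The abstract does not state the distance measure or linkage rule, so Euclidean distance and average linkage are assumptions here, and the plot data and the name `average_linkage_partitions` are hypothetical. Choosing among the partitions, which the study does with ISA, is not shown.

```python
from itertools import combinations
from math import dist


def average_linkage_partitions(points, k_min=2, k_max=10):
    """Merge clusters bottom-up and record the partition at each k in [k_min, k_max].

    Returns {k: clusters}, where each cluster is a sorted list of row indices.
    Euclidean distance and average linkage are assumptions, not the paper's settings.
    """
    clusters = [[i] for i in range(len(points))]
    partitions = {}
    while True:
        k = len(clusters)
        if k_min <= k <= k_max:
            partitions[k] = sorted(sorted(c) for c in clusters)
        if k <= max(k_min, 1):
            break
        # Find the pair of clusters with the smallest average pairwise distance.
        best = None
        for a, b in combinations(range(k), 2):
            total = sum(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            avg = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or avg < best[0]:
                best = (avg, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return partitions


# Hypothetical sample plots: each row holds importance values for two species.
plots = [(0, 0), (0, 1), (1, 0), (1, 1),
         (10, 0), (10, 1), (11, 0), (11, 1),
         (0, 10), (0, 11), (1, 10), (1, 11)]
partitions = average_linkage_partitions(plots, k_min=2, k_max=10)
for k in sorted(partitions):
    print(k, partitions[k])
```

Each candidate partition could then be scored by a separate criterion to pick the number of clusters; the naive pairwise search above is only practical for small numbers of plots.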

Moving Object Detection and Tracking Techniques for Error Reduction (오인식률 감소를 위한 이동 물체 검출 및 추적 기법)

  • Hwang, Seung-Jun;Ko, Ha-Yoon;Baek, Joong-Hwan
    • Journal of Advanced Navigation Technology / v.22 no.1 / pp.20-26 / 2018
  • Existing studies of moving object detection suffer from detection errors and limited tracking speed. To reduce false positives, this paper proposes a moving object detection and tracking algorithm based on multi-frame feature point tracking information. We first compute corner feature points and their optical flow over multiple frames for camera motion compensation and object tracking. Next, multi-frame forward-backward tracking reduces the tracking error of the optical flow, and the tracked feature points are separated into background and moving object candidates using a homography estimated with the RANSAC algorithm for camera motion compensation. Among the transformed corner feature points, the outliers rejected by RANSAC are clustered, and outlier clusters of a certain size are classified as moving object candidates. These candidates are then tracked using label-tracking-based data association. Detection and tracking experiments on quadrotor images show that the proposed algorithm improves both precision and recall compared with existing algorithms.
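
The forward-backward tracking check mentioned in the abstract above can be sketched as follows. The idea is that a point tracked forward to a later frame and then tracked back should land near where it started; tracks that do not are treated as unreliable. This is only an illustration of that check on already-tracked coordinates: the function name `forward_backward_filter`, the sample points, and the 1.0 pixel threshold are hypothetical, and the optical flow, homography, and RANSAC steps are not shown.

```python
from math import hypot


def forward_backward_filter(start_pts, back_pts, max_error=1.0):
    """Return indices of tracks whose forward-backward error is within max_error.

    start_pts: (x, y) positions in the first frame.
    back_pts:  positions obtained by tracking forward, then back to the first frame.
    max_error: distance threshold in pixels (an assumed value, not from the paper).
    """
    keep = []
    for i, ((x0, y0), (x1, y1)) in enumerate(zip(start_pts, back_pts)):
        # Forward-backward error: distance between the start and the round-trip point.
        if hypot(x1 - x0, y1 - y0) <= max_error:
            keep.append(i)
    return keep


# Hypothetical corner points and their round-trip positions.
start = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
back = [(10.2, 9.9), (25.0, 20.0), (30.0, 30.5)]
reliable = forward_backward_filter(start, back)
print(reliable)
```

Here the second track drifts by 5 pixels on the round trip and is dropped, while the other two are kept for later stages.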

Characteristics for the Distribution of Elderly Population by Utilizing the Census Data (센서스 데이터를 활용한 고령인구 분포 특성)

  • Nam, Kwang-Woo;Gwon, Il-Hwa
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.1 / pp.464-469 / 2013
  • Busan became an aging society in 2000, and by 2011 it had the highest aging rate among 7 representative cities. Moreover, while the total population and the average household size are decreasing, the elderly population aged 65 and over is rapidly increasing. The city may therefore become a super-aged society, with an aging rate of about 20%, after 2020. Previous housing-related analyses of the elderly were conducted at the dong (neighborhood) level, so a finer-grained analysis has become necessary. Using census surveys from 2000 through 2010, this study conducted spatial analysis at the census aggregate (output area) level. On this basis, regions of interest such as areas with large elderly populations, areas of rapid increase, and high-density areas were first extracted, and their fine-scale locations and spatial distribution patterns were analyzed. The analysis showed that the elderly population is concentrated in the city center and adjacent areas and in the highlands, and that in certain output areas it increased more than 30-fold over the 10 years. In particular, the concentration of the elderly population was most pronounced, and intensifying, in Busan's original city center. In addition, from 2000 to 2010 the overall distribution pattern shows the elderly population becoming increasingly dispersed across Busan. This result contrasts with previous research. Preparing for a super-aged society involves mitigating social costs and improving the quality of life of the elderly, and these findings can serve as basic information for spatially informed responses.

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Preventing failures and maintaining ICT infrastructure through anomaly detection is becoming increasingly important. System monitoring data is multidimensional time series data, and it is difficult to account for the characteristics of both multidimensional data and time series data at once. With multidimensional data, the correlation between variables must be considered. Existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. In addition, time series data is preprocessed with sliding windows and time series decomposition for autocorrelation analysis. These techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-established research field in which statistical methods and regression analysis were used in the early days; machine learning and artificial neural networks are now being actively applied. Statistical methods are difficult to apply when the data is non-homogeneous and do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and then detect anomalies by comparing predicted and actual values. Their performance drops when the model is not robust or when the data contains noise or outliers, which imposes the restriction that the training data must be free of noise and outliers. An autoencoder built from artificial neural networks is trained to produce output as similar as possible to its input. It has several advantages over existing probabilistic and linear models, cluster analysis, and supervised learning. It can be applied to data that does not satisfy distributional or linear assumptions. 
In addition, unsupervised learning is possible without labeled data. However, anomaly detection still has difficulty identifying local outliers in multidimensional data, and the characteristics of time series data greatly increase the dimension of the data. In this study, we propose a Conditional Multimodal Autoencoder (CMAE) that improves anomaly detection performance by taking local outliers and time series characteristics into account. First, we applied a Multimodal Autoencoder (MAE) to address the limitation in identifying local outliers in multidimensional data. Multimodal models are commonly used to learn from different types of input, such as voice and image. The different modalities share the autoencoder's bottleneck, through which correlations among them are learned. In addition, a Conditional Autoencoder (CAE) was used to learn the characteristics of time series data effectively without increasing the dimension of the data. Conditional inputs are usually categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE model was validated by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance of the autoencoders on 41 variables was checked for the proposed model and the comparison models. Reconstruction performance differs by variable: the Memory, Disk, and Network modalities are reconstructed well, with small loss values, in all three autoencoder models. The Process modality showed no significant difference among the three models, and the CPU modality showed excellent performance with CMAE. ROC curves were plotted to evaluate anomaly detection performance for the proposed and comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, and AE. 
In particular, recall was 0.9828 for CMAE, which confirms that it detects nearly all of the anomalies. The accuracy of the model also improved to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. In practical terms, the proposed model has an advantage beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the resulting increase in dimensionality can slow computation during inference. The proposed model is easy to apply to practical tasks in terms of inference speed and model management.
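
The dimensionality argument in the abstract above can be made concrete with a small sketch. A sliding window over d variables with width w yields inputs of size d × w, whereas attaching time as a condition adds only a few values per sample. The abstract does not say how time is encoded, so the sine/cosine encoding of the hour of day below is an assumption, and the function names and sample data are hypothetical. No autoencoder is trained here; only the input shapes are compared.

```python
from math import sin, cos, pi


def sliding_window_inputs(rows, width):
    """Flatten each run of `width` consecutive rows into one input vector."""
    return [[v for row in rows[i:i + width] for v in row]
            for i in range(len(rows) - width + 1)]


def time_condition(hour, period=24):
    """Encode an hour as a point on a circle so that hour 23 sits next to hour 0.

    The cyclic sine/cosine encoding is an assumed choice, not the paper's.
    """
    angle = 2 * pi * (hour % period) / period
    return (sin(angle), cos(angle))


def conditioned_inputs(rows, hours):
    """Append the time condition to each single-timestep row."""
    return [list(row) + list(time_condition(h)) for row, h in zip(rows, hours)]


# Hypothetical monitoring data: 6 timesteps of 3 variables (e.g. CPU, memory, disk).
rows = [(0.2, 0.5, 0.1), (0.3, 0.5, 0.1), (0.9, 0.6, 0.1),
        (0.4, 0.5, 0.2), (0.2, 0.5, 0.2), (0.3, 0.4, 0.2)]
hours = [0, 1, 2, 3, 4, 5]

windowed = sliding_window_inputs(rows, width=4)
conditioned = conditioned_inputs(rows, hours)
print(len(windowed[0]), len(conditioned[0]))  # input size per sample for each approach
```

With 3 variables and a window of 4, each windowed input has 12 values, while each conditioned input has 5 regardless of how much history the periodicity spans.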