• Title/Summary/Keyword: Unsupervised-Classification

Search Result 276, Processing Time 0.034 seconds

Anomaly Detection In Real Power Plant Vibration Data by MSCRED Base Model Improved By Subset Sampling Validation (Subset 샘플링 검증 기법을 활용한 MSCRED 모델 기반 발전소 진동 데이터의 이상 진단)

  • Hong, Su-Woong;Kwon, Jang-Woo
    • Journal of Convergence for Information Technology
    • /
    • v.12 no.1
    • /
    • pp.31-38
    • /
    • 2022
  • This paper applies an expert independent unsupervised neural network learning-based multivariate time series data analysis model, MSCRED(Multi-Scale Convolutional Recurrent Encoder-Decoder), and to overcome the limitation, because the MCRED is based on Auto-encoder model, that train data must not to be contaminated, by using learning data sampling technique, called Subset Sampling Validation. By using the vibration data of power plant equipment that has been labeled, the classification performance of MSCRED is evaluated with the Anomaly Score in many cases, 1) the abnormal data is mixed with the training data 2) when the abnormal data is removed from the training data in case 1. Through this, this paper presents an expert-independent anomaly diagnosis framework that is strong against error data, and presents a concise and accurate solution in various fields of multivariate time series data.

Classification of hysteretic loop feature for runoff generation through a unsupervised machine learning algorithm (비지도 기계학습을 통한 유출 발생 내 이력 현상 구분)

  • Lee, Eunhyung;Jeon, Hangtak;Kim, Dahong;Friday, Bassey Bassey;Kim, Sanghyun
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2022.05a
    • /
    • pp.360-360
    • /
    • 2022
  • 토양수분과 유출 간 관계를 정량화하는 것은 수문 기작 및 유출 발생 과정의 이해를 위한 중요한 정보를 제공한다. 특히, 유출과정의 특성화는 수문 사상에 따른 불포화대 내 토양수 및 토사 손실 제어와 산사태 및 비점오염원 발생 예측을 위해 필수적이다. 유출과정과 관련된 비선형성과 복잡성을 확인하기 위해 토양수분과 유출 사이의 이력 거동이 조사되었다. 특히, 수문 과정 내 이력 현상 구체화를 위해 정성적인 시각적 분류 및 정량적 평가를 위한 이력 지수들이 개발되었다. 정성적인 시각적 분류는 시간에 따라 시계 및 반시계방향으로 다중 루프 형상을 나누는 방식으로 진행되었고, 정량적 평가의 경우 이력 고리(Hysteretic loop) 내 상승 고리(Rising limb)와 하강 고리(Falling limb)의 차이를 기준으로 한 지수로 이력 현상을 특성화하였다. 이전에 제안된 방법론들은 연구자의 판단이 들어가기 때문에 보편적이지 않고 이력 현상을 개발된 지수에 맞춤에 따라 자료 손실이 나타나는 한계가 존재한다. 자료의 손실 없이 불포화대 내 발생 가능한 대표 이력 현상을 자동으로 추출하기 위해 적합한 비지도 학습기반 기계학습 방법론의 제안이 필요하다. 우리 연구에서는 국내 산지 사면에서 강우 사상 동안 다중 깊이(10, 30, 60cm)로 56개의 토양수분 측정지점에서 확보된 토양수분 시계열 자료와 산지 사면 내 위어를 통해 확보된 유출 시계열 자료를 사용하였다. 먼저, 기존에 분류 방법을 기반으로 계절 및 공간특성에 따라 지배적으로 발생하는 토양수분-유출 간 이력 현상을 특성화하였다. 다음으로, 토양수분-유출 간 이력 패턴을 자료 손실 없이 형상화하여 자동으로 데이터베이스화하는 알고리즘을 개발하였다. 마지막으로, 비지도 학습방법을 이용하여 데이터베이스화된 실제 발현 이력 현상 내 확률분포를 최대한 가깝게 추정하는 은닉층을 반복적인 재구성 학습을 통해 구현함으로써 대표 이력 현상 패턴을 추출하였다.

  • PDF

Application and Potential of Artificial Intelligence in Heart Failure: Past, Present, and Future

  • Minjae Yoon;Jin Joo Park;Taeho Hur;Cam-Hao Hua;Musarrat Hussain;Sungyoung Lee;Dong-Ju Choi
    • International Journal of Heart Failure
    • /
    • v.6 no.1
    • /
    • pp.11-19
    • /
    • 2024
  • The prevalence of heart failure (HF) is increasing, necessitating accurate diagnosis and tailored treatment. The accumulation of clinical information from patients with HF generates big data, which poses challenges for traditional analytical methods. To address this, big data approaches and artificial intelligence (AI) have been developed that can effectively predict future observations and outcomes, enabling precise diagnoses and personalized treatments of patients with HF. Machine learning (ML) is a subfield of AI that allows computers to analyze data, find patterns, and make predictions without explicit instructions. ML can be supervised, unsupervised, or semi-supervised. Deep learning is a branch of ML that uses artificial neural networks with multiple layers to find complex patterns. These AI technologies have shown significant potential in various aspects of HF research, including diagnosis, outcome prediction, classification of HF phenotypes, and optimization of treatment strategies. In addition, integrating multiple data sources, such as electrocardiography, electronic health records, and imaging data, can enhance the diagnostic accuracy of AI algorithms. Currently, wearable devices and remote monitoring aided by AI enable the earlier detection of HF and improved patient care. This review focuses on the rationale behind utilizing AI in HF and explores its various applications.

INVESTIGATION OF BAIKDU-SAN VOLCANO WITH SPACE-BORNE SAR SYSTEM

  • Kim, Duk-Jin;Feng, Lanying;Moon, Wooil-M.
    • Proceedings of the KSRS Conference
    • /
    • 1999.11a
    • /
    • pp.148-153
    • /
    • 1999
  • Baikdu-san was a very active volcano during the Cenozoic era and is believed to be formed in late Cenozoic era. Recently it was also reported that there was a major eruption in or around 1002 A.D. and there are evidences which indicate that it is still an active volcano and a potential volcanic hazard. Remote sensing techniques have been widely used to monitor various natural hazards, including volcanic hazards. However, during an active volcanic eruption, volcanic ash can basically cover the sky and often blocks the solar radiation preventing any use of optical sensors. Synthetic aperture radar(SAR) is an ideal tool to monitor the volcanic activities and lava flows, because the wavelength of the microwave signal is considerably longer that the average volcanic ash particle size. In this study we have utilized several sets of SAR data to evaluate the utility of the space-borne SAR system. The data sets include JERS-1(L-band) SAR, and RADARSAT(C-band) data which included both standard mode and the ScanSAR mode data sets. We also utilized several sets of auxiliary data such as local geological maps and JERS-1 OPS data. The routine preprocessing and image processing steps were applied to these data sets before any attempts of classifying and mapping surface geological features. Although we computed sigma nought ($\sigma$$^{0}$) values far the standard mode RADARSAT data, the utility of sigma nought image was minimal in this study. Application of various types of classification algorithms to identify and map several stages of volcanic flows was not very successful. Although this research is still in progress, the following preliminary conclusions could be made: (1) sigma nought (RADARSAT standard mode data) and DN (JERS-1 SAR and RADARSAT ScanSAR data) have limited usefulness for distinguishing early basalt lava flows from late trachyte flows or later trachyte flows from the old basement granitic rocks around Baikdu-san volcano, (2) surface geological structure features such as several faults and volcanic lava flow channels can easily be identified and mapped, and (3) routine application of unsupervised classification methods cannot be used for mapping any types of surface lava flow patterns.

  • PDF

Improved Focused Sampling for Class Imbalance Problem (클래스 불균형 문제를 해결하기 위한 개선된 집중 샘플링)

  • Kim, Man-Sun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Cheah, Wooi Ping
    • The KIPS Transactions:PartB
    • /
    • v.14B no.4
    • /
    • pp.287-294
    • /
    • 2007
  • Many classification algorithms for real world data suffer from a data class imbalance problem. To solve this problem, various methods have been proposed such as altering the training balance and designing better sampling strategies. The previous methods are not satisfy in the distribution of the input data and the constraint. In this paper, we propose a focused sampling method which is more superior than previous methods. To solve the problem, we must select some useful data set from all training sets. To get useful data set, the proposed method devide the region according to scores which are computed based on the distribution of SOM over the input data. The scores are sorted in ascending order. They represent the distribution or the input data, which may in turn represent the characteristics or the whole data. A new training dataset is obtained by eliminating unuseful data which are located in the region between an upper bound and a lower bound. The proposed method gives a better or at least similar performance compare to classification accuracy of previous approaches. Besides, it also gives several benefits : ratio reduction of class imbalance; size reduction of training sets; prevention of over-fitting. The proposed method has been tested with kNN classifier. An experimental result in ecoli data set shows that this method achieves the precision up to 2.27 times than the other methods.

Applying of SOM for Automatic Recognition of Tension and Relaxation (긴장과 이완상태의 자동인식을 위한 SOM의 적용)

  • Jeong, Chan-Soon;Ham, Jun-Seok;Ko, Il-Ju;Jang, Dae-Sik
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.2
    • /
    • pp.65-74
    • /
    • 2010
  • We propose a system that automatically recognizes the tense or relaxed condition of scrolling-shooting game subject that plays. Existing study compares the changed values of source of stimulation to the player by suggesting the source, and thus involves limitation in automatic classification. This study applies SOM of unsupervised learning for automatic classification and recognition of player's condition change. Application of SOM for automatic recognition of tense and relaxed condition is composed of two steps. First, ECG measurement and analysis, is to extract characteristic vector through HRV analysis by measuring ECG after having the player play the game. Secondly, SOM learning and recognition, is to classify and recognize the tense and relaxed conditions of player through SOM learning of the input vectors of heart beat signals that the characteristic extracted. Experiment results are divided into three groups. The first is HRV frequency change and the second the SOM learning results of heart beat signal. The third is the analysis of match rate to identify SOM learning performance. As a result of matching the LF/HF ratio of HRV frequency analysis to the distance of winner neuron of SOM based on 1.5, a match rate of 72% performance in average was shown.

Extraction of Agricultural Land Use and Crop Growth Information using KOMPSAT-3 Resolution Satellite Image (KOMPSAT-3급 위성영상을 이용한 농업 토지이용 및 작물 생육정보 추출)

  • Lee, Mi-Seon;Kim, Seong-Joon;Shin, Hyoung-Sub;Park, Jin-Ki;Park, Jong-Hwa
    • Korean Journal of Remote Sensing
    • /
    • v.25 no.5
    • /
    • pp.411-421
    • /
    • 2009
  • This study refers to develop a semi-automatic extraction of agricultural land use and vegetation information using high resolution satellite images. Data of IKONOS-2 satellite images (May 25 of 2001, December 25 of 2001, and October 23 of 2003), QuickBird-2 satellite images (May 1 of 2006 and November 17 of 2004) and KOMPSAT-2 satellite image (September 17 of 2007) which resemble with the spatial resolution and spectral characteristics of KOMPSAT-3 were used. The precise agricultural land use classification was tried using ISODATA unsupervised classification technique, and the result was compared with on-screen digitizing land use accompanying with field investigation. For the extraction of crop growth information, three crops of paddy, com and red pepper were selected, and the spectral characteristics were collected during each growing period using ground spectroradiometer. The vegetation indices viz. RVI, NDVI, ARVI, and SAVI for the crops were evaluated. The evaluation process was developed using the ERDAS IMAGINE Spatial Modeler Tool.

Leision Detection in Chest X-ray Images based on Coreset of Patch Feature (패치 특징 코어세트 기반의 흉부 X-Ray 영상에서의 병변 유무 감지)

  • Kim, Hyun-bin;Chun, Jun-Chul
    • Journal of Internet Computing and Services
    • /
    • v.23 no.3
    • /
    • pp.35-45
    • /
    • 2022
  • Even in recent years, treatment of first-aid patients is still often delayed due to a shortage of medical resources in marginalized areas. Research on automating the analysis of medical data to solve the problems of inaccessibility for medical services and shortage of medical personnel is ongoing. Computer vision-based medical inspection automation requires a lot of cost in data collection and labeling for training purposes. These problems stand out in the works of classifying lesion that are rare, or pathological features and pathogenesis that are difficult to clearly define visually. Anomaly detection is attracting as a method that can significantly reduce the cost of data collection by adopting an unsupervised learning strategy. In this paper, we propose methods for detecting abnormal images on chest X-RAY images as follows based on existing anomaly detection techniques. (1) Normalize the brightness range of medical images resampled as optimal resolution. (2) Some feature vectors with high representative power are selected in set of patch features extracted as intermediate-level from lesion-free images. (3) Measure the difference from the feature vectors of lesion-free data selected based on the nearest neighbor search algorithm. The proposed system can simultaneously perform anomaly classification and localization for each image. In this paper, the anomaly detection performance of the proposed system for chest X-RAY images of PA projection is measured and presented by detailed conditions. We demonstrate effect of anomaly detection for medical images by showing 0.705 classification AUROC for random subset extracted from the PadChest dataset. The proposed system can be usefully used to improve the clinical diagnosis workflow of medical institutions, and can effectively support early diagnosis in medically poor area.

Abbreviation Disambiguation using Topic Modeling (토픽모델링을 이용한 약어 중의성 해소)

  • Woon-Kyo Lee;Ja-Hee Kim;Junki Yang
    • Journal of the Korea Society for Simulation
    • /
    • v.32 no.1
    • /
    • pp.35-44
    • /
    • 2023
  • In recent, there are many research cases that analyze trends or research trends with text analysis. When collecting documents by searching for keywords in abbreviations for data analysis, it is necessary to disambiguate abbreviations. In many studies, documents are classified by hand-work reading the data one by one to find the data necessary for the study. Most of the studies to disambiguate abbreviations are studies that clarify the meaning of words and use supervised learning. The previous method to disambiguate abbreviation is not suitable for classification studies of documents looking for research data from abbreviation search documents, and related studies are also insufficient. This paper proposes a method of semi-automatically classifying documents collected by abbreviations by going topic modeling with Non-Negative Matrix Factorization, an unsupervised learning method, in the data pre-processing step. To verify the proposed method, papers were collected from academic DB with the abbreviation 'MSA'. The proposed method found 316 papers related to Micro Services Architecture in 1,401 papers. The document classification accuracy of the proposed method was measured at 92.36%. It is expected that the proposed method can reduce the researcher's time and cost due to hand work.

Ecoclimatic Map over North-East Asia Using SPOT/VEGETATION 10-day Synthesis Data (SPOT/VEGETATION NDVI 자료를 이용한 동북아시아의 생태기후지도)

  • Park Youn-Young;Han Kyung-Soo
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.8 no.2
    • /
    • pp.86-96
    • /
    • 2006
  • Ecoclimap-1, a new complete surface parameter global database at a 1-km resolution, was previously presented. It is intended to be used to initialize the soil-vegetation- atmosphere transfer schemes in meteorological and climate models. Surface parameters in the Ecoclimap-1 database are provided in the form of a per-class value by an ecoclimatic base map from a simple merging of land cover and climate maps. The principal objective of this ecoclimatic map is to consider intra-class variability of life cycle that the usual land cover map cannot describe. Although the ecoclimatic map considering land cover and climate is used, the intra-class variability was still too high inside some classes. In this study, a new strategy is defined; the idea is to use the information contained in S10 NDVI SPOT/VEGETATION profiles to split a land cover into more homogeneous sub-classes. This utilizes an intra-class unsupervised sub-clustering methodology instead of simple merging. This study was performed to provide a new ecolimatic map over Northeast Asia in the framework of Ecoclimap-2 global database construction for surface parameters. We used the University of Maryland's 1km Global Land Cover Database (UMD) and a climate map to determine the initial number of clusters for intra-class sub-clustering. An unsupervised classification process using six years of NDVI profiles allows the discrimination of different behavior for each land cover class. We checked the spatial coherence of the classes and, if necessary, carried out an aggregation step of the clusters having a similar NDVI time series profile. From the mapping system, 29 ecosystems resulted for the study area. In terms of climate-related studies, this new ecosystem map may be useful as a base map to construct an Ecoclimap-2 database and to improve the surface climatology quality in the climate model.