• Title/Summary/Keyword: PCA algorithm

A Novel of Data Clustering Architecture for Outlier Detection to Electric Power Data Analysis (전력데이터 분석에서 이상점 추출을 위한 데이터 클러스터링 아키텍처에 관한 연구)

  • Jung, Se Hoon;Shin, Chang Sun;Cho, Young Yun;Park, Jang Woo;Park, Myung Hye;Kim, Young Hyun;Lee, Seung Bae;Sim, Chun Bo
    • KIPS Transactions on Software and Data Engineering, v.6 no.10, pp.465-472, 2017
  • In the past, researchers mainly used supervised machine learning techniques to analyze power data and studied pattern identification through data mining. However, as real-time data provision has increased the size of electric power data, the older classification and analysis techniques have reached their limits. This study therefore proposes a clustering architecture for analyzing large-scale electric power data. The proposed clustering process addresses the weaknesses of the K-means algorithm, an unsupervised learning technique, and can automate the entire process from the collection of electric power data to their analysis. Power data were categorized and analyzed at three levels: the raw data level, the clustering level, and the user interface level. In addition, the authors identified K, the ideal number of clusters, based on principal component analysis and the normal distribution, and proposed a modified K-means algorithm that reduces the data that would be categorized as outliers in order to increase clustering efficiency.

Detection of Toluene Hazardous and Noxious Substances (HNS) Based on Hyperspectral Remote Sensing (초분광 원격탐사 기반 위험·유해물질 톨루엔 탐지)

  • Park, Jae-Jin;Park, Kyung-Ae;Foucher, Pierre-Yves;Kim, Tae-Sung;Lee, Moonjin
    • Journal of the Korean Earth Science Society, v.42 no.6, pp.623-631, 2021
  • The increased transport of marine hazardous and noxious substances (HNS) has resulted in frequent HNS spill accidents domestically and internationally. There are about 6,000 types of HNS internationally, and most of them are toxic. An accidental HNS spill can destroy the marine ecosystem and can damage life and property through explosion and fire. Constructing a spectral library of HNS by wavelength and developing a detection algorithm would help prepare for such accidents. In this study, a ground HNS spill experiment was conducted in France. The toluene spectrum was determined through hyperspectral sensor measurements. HNS present in the hyperspectral images were detected by applying a spectral mixture algorithm. As preprocessing, principal component analysis (PCA) was used to remove noise and reduce dimensionality. The endmember spectra of toluene and seawater were extracted with the N-FINDR technique. By calculating the abundance fractions of toluene and seawater from the spectrum, the detection accuracy of HNS in all pixels was presented as a probability. The probability was compared with radiance images at a wavelength of 418.15 nm to select the abundance fractions with maximum detection accuracy. The accuracy exceeded 99% at a ratio of approximately 42%. Response to marine HNS spills is presently impeded by restricted access to the site because of the high risk of exposure to toxic compounds. The present experimental and detection results could help estimate the area contaminated with HNS based on hyperspectral remote sensing.
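
The abundance-fraction step in the abstract above can be illustrated with a two-endmember linear mixture model. This is a minimal sketch, not the authors' implementation: the function name `toluene_abundance`, the sum-to-one constraint, the clipping to [0, 1], and the four-band spectra are all illustrative assumptions.

```python
def toluene_abundance(pixel, toluene, seawater):
    """Least-squares fraction a such that pixel ~ a*toluene + (1 - a)*seawater.

    All spectra are equal-length lists of radiance values (hypothetical data).
    The result is clipped to [0, 1] so it can be read as a fraction.
    """
    diff = [t - s for t, s in zip(toluene, seawater)]
    num = sum((p - s) * d for p, s, d in zip(pixel, seawater, diff))
    den = sum(d * d for d in diff)
    if den == 0:
        raise ValueError("endmember spectra are identical")
    return min(1.0, max(0.0, num / den))


# Hypothetical 4-band endmember spectra (not values from the paper).
toluene = [0.30, 0.42, 0.55, 0.61]
seawater = [0.10, 0.12, 0.08, 0.05]
# Synthetic pixel that is 40% toluene and 60% seawater.
mixed = [0.4 * t + 0.6 * s for t, s in zip(toluene, seawater)]
print(round(toluene_abundance(mixed, toluene, seawater), 3))
```

With only two endmembers the constrained least-squares solution has a closed form, which is why no solver is needed here; more endmembers would need a proper constrained solver.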

Statistical Techniques to Detect Sensor Drifts (센서드리프트 판별을 위한 통계적 탐지기술 고찰)

  • Seo, In-Yong;Shin, Ho-Cheol;Park, Moon-Ghu;Kim, Seong-Jun
    • Journal of the Korea Society for Simulation, v.18 no.3, pp.103-112, 2009
  • In a nuclear power plant (NPP), periodic sensor calibrations are required to ensure that sensors are operating correctly. However, only a few sensors are found to be faulty and in need of calibration. For the safe operation of an NPP and the reduction of unnecessary calibration, on-line calibration monitoring is needed. In this paper, principal component-based auto-associative support vector regression (PCSVR) is proposed for sensor signal validation in the NPP. It combines the merits of principal component analysis (PCA), which extracts the predominant feature vectors, with those of AASVR, which easily represents complicated processes that are difficult to capture with analytical and mechanistic models. Using real plant startup data from the Kori Nuclear Power Plant Unit 3, the SVR hyperparameters were optimized by response surface methodology (RSM). Moreover, statistical techniques are integrated with PCSVR for failure detection. The residuals between the estimated and measured signals are tested with the Shewhart control chart, the exponentially weighted moving average (EWMA), the cumulative sum (CUSUM), and the generalized likelihood ratio test (GLRT) to detect whether the sensors have failed. This study shows that the GLRT can be a candidate for the detection of sensor drift.
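
Two of the residual tests named above, EWMA and CUSUM, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's procedure: the function names, the smoothing weight `lam`, the control limits, the slack value, and the synthetic drifting residuals are all hypothetical.

```python
def ewma_alarm(residuals, lam=0.2, limit=0.5):
    """Return the first index where |EWMA| exceeds `limit`, or None."""
    z = 0.0
    for i, r in enumerate(residuals):
        z = lam * r + (1.0 - lam) * z  # exponentially weighted moving average
        if abs(z) > limit:
            return i
    return None


def cusum_alarm(residuals, slack=0.25, threshold=2.0):
    """Two-sided tabular CUSUM; return the first alarm index, or None."""
    hi = lo = 0.0
    for i, r in enumerate(residuals):
        hi = max(0.0, hi + r - slack)  # accumulates upward shifts
        lo = max(0.0, lo - r - slack)  # accumulates downward shifts
        if hi > threshold or lo > threshold:
            return i
    return None


# Hypothetical residuals: zero until sample 20, then a slow upward drift.
residuals = [0.0] * 20 + [0.05 * k for k in range(1, 41)]
print(ewma_alarm(residuals), cusum_alarm(residuals))
```

Both detectors stay silent on the drift-free prefix and raise an alarm only after the drift has accumulated, which is the behavior that makes them suitable for slow sensor drift rather than abrupt failures.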

A study on the classification of research topics based on COVID-19 academic research using Topic modeling (토픽모델링을 활용한 COVID-19 학술 연구 기반 연구 주제 분류에 관한 연구)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems, v.28 no.1, pp.155-174, 2022
  • From January 2020 to October 2021, more than 500,000 academic studies related to COVID-19 (Coronavirus-2, a fatal respiratory syndrome) were published. The rapid increase in the number of COVID-19 papers places time and technical constraints on healthcare professionals and policy makers who need to find important research quickly. Therefore, in this study, we propose a method of extracting useful information from the text of an extensive body of literature using the LDA and Word2vec algorithms. Papers related to the search keywords were extracted from the COVID-19 papers, and detailed topics were identified. The data came from the CORD-19 data set on Kaggle, a free academic resource prepared by major research groups and the White House in response to the COVID-19 pandemic and updated weekly. The research method has two main parts. First, 41,062 articles were collected through data filtering and pre-processing of the abstracts of 47,110 academic papers with full text. For this purpose, the number of COVID-19 publications by year was analyzed through exploratory data analysis in Python, and the top 10 journals with active research were identified. The LDA and Word2vec algorithms were used to derive COVID-19 research topics, and similarity was measured after analyzing related words. Second, papers containing 'vaccine' and 'treatment' were extracted from among the topics derived from all papers, yielding 4,555 papers related to 'vaccine' and 5,971 papers related to 'treatment'. For each collected paper, detailed topics were analyzed using the LDA and Word2vec algorithms, and a clustering method based on PCA dimension reduction was applied to visualize groups of papers with similar themes using the t-SNE algorithm. A noteworthy result of this study is that topics that did not appear in the topic modeling over all COVID-19 papers were found in the topic modeling results for each research topic. For example, topic modeling of the papers related to 'vaccine' extracted a new topic, Topic 05 'neutralizing antibodies'. A neutralizing antibody is an antibody that protects cells from infection when a virus enters the body, and is said to play an important role in the production of therapeutic agents and in vaccine development. In addition, extracting topics from the papers related to 'treatment' revealed a new topic, Topic 05 'cytokine'. A cytokine storm occurs when the body's immune cells attack normal cells instead of defending against attacks. Hidden topics that could not be found over the entire set of papers were classified by keyword, and topic modeling was performed to find detailed topics. In this study, we proposed a method of extracting topics from a large amount of literature using the LDA algorithm and extracting similar words using Skip-gram, the Word2vec model that predicts similar (surrounding) words from the central word. Combining the LDA model with the Word2vec model was intended to improve performance by identifying both the relationship between documents and LDA topics and the relationships captured by Word2vec. In addition, as a clustering method based on PCA dimension reduction, we presented a way of intuitively grouping documents with similar themes using the t-SNE technique and organizing the groups into a structured collection of documents. In a situation where the efforts of many researchers to overcome COVID-19 cannot keep up with the rapid publication of COVID-19 papers, this approach should save healthcare professionals and policy makers precious time and effort and help them rapidly gain new insights. It is also expected to serve as basic data for researchers exploring new research directions.

    Development of a Storage Level and Capacity Monitoring and Forecasting Techniques in Yongdam Dam Basin Using High Resolution Satellite Image (고해상도 위성자료를 이용한 용담댐 유역 저수위/저수량 모니터링 및 예측 기술 개발)

    • Yoon, Sunkwon;Lee, Seongkyu;Park, Kyungwon;Jang, Sangmin;Rhee, Jinyung
      • Korean Journal of Remote Sensing, v.34 no.6_1, pp.1041-1053, 2018
    • In this study, a real-time storage level and capacity monitoring and forecasting system for the Yongdam Dam watershed was developed using high-resolution satellite images. Drought indices such as the Standardized Precipitation Index (SPI) derived from satellite data were used for storage level monitoring during drought. To predict storage volume, we used a statistical method based on Principal Component Analysis (PCA) of Singular Spectrum Analysis (SSA). The correlation coefficient between storage level and SPI (3) was high (CC=0.78), and the monitoring and predictability of the storage level were assessed using the drought index calculated from satellite data. In the principal component analysis by SSA, SPI (3) and each of the Reconstructed Components (RCs) were highly correlated, with CC=0.87 to 0.99. The RC data were also highly correlated with the Normalized Water Surface Level (N-W.S.L.), with CC=0.83 to 0.97. For the high-resolution satellite images, we developed a water detection algorithm that applies an exponential method to monitor changes in storage level using the Multi-Spectral Instrument (MSI) sensor of the Sentinel-2 satellite. Satellite images from 2016 to 2018 were used for water surface area detection in the Yongdam Dam watershed. Based on this, we showed the feasibility of a real-time drought monitoring system using high-resolution water surface area detection from Sentinel-2 satellite images. The results of this study can be applied to estimating reservoir volume from various satellite observations, which can be used for monitoring and estimating hydrological droughts in ungauged areas.
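
The CC values quoted above are correlation coefficients between two time series. Assuming they are Pearson coefficients (the abstract does not say which coefficient was used), a minimal sketch looks like this; the function name `pearson_cc` and the sample series are hypothetical.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (sx * sy)


# Hypothetical monthly values (not data from the paper).
spi3 = [-1.2, -0.8, -0.1, 0.4, 0.9, 1.3]
storage_level = [251.0, 252.5, 255.1, 256.0, 258.2, 259.9]
print(round(pearson_cc(spi3, storage_level), 3))
```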

    Implementation of A Safe Driving Assistance System and Doze Detection (졸음 인식과 안전운전 보조시스템 구현)

    • Song, Hyok;Choi, Jin-Mo;Lee, Chul-Dong;Choi, Byeong-Ho;Yoo, Ji-Sang
      • Journal of the Institute of Electronics Engineers of Korea SP, v.49 no.3, pp.30-39, 2012
    • In this paper, a safe driving assistance system is proposed that detects driver drowsiness based on face and eye detection. Depending on the level of fatigue, the system sounds an alarm or vibrates the seatbelt. To reduce the effect of backlight and overly strong sunlight, which lower the face and eye detection rate and cause false fatigue detection, post-processing techniques such as image equalization are used. The Haar transform and PCA are used for face detection. Using statistics on the structural ratio of face and eyes in typical Korean faces, we reduce the eye candidate area within the face, which reduces the computational load. We also propose a new eye status detection algorithm based on the Hough transform and the eye width-height ratio; it detects the blinking status of the eyes and determines the doze level by measuring the blinking period. When the driver's doze level is detected, the system sounds an alarm and vibrates the seatbelt through the controller area network (CAN). Four algorithms are implemented, and the proposed algorithm, which is built on a probability model, achieves a correct detection rate of 84.88% in indoor and in-car experiments. We also achieve a detection rate of 69.81%, a better result than that of other algorithms using an IR camera.

    A New Face Tracking and Recognition Method Adapted to the Environment (환경에 적응적인 얼굴 추적 및 인식 방법)

    • Ju, Myung-Ho;Kang, Hang-Bong
      • The KIPS Transactions:PartB, v.16B no.5, pp.385-394, 2009
    • Face tracking and recognition are difficult problems because the face is a non-rigid object. The main reasons for failure to track and recognize faces are changes in face pose and environmental illumination. To solve these problems, we propose a nonlinear manifold framework for face pose and face illumination normalization. Specifically, to track and recognize a face in video with various pose variations, we approximate each face pose density by a single Gaussian density using PCA (Principal Component Analysis) on images sampled from training video sequences, and then construct a GMM (Gaussian Mixture Model) for each person. To solve the illumination problem, we decompose the face images into reflectance and illuminance using the SSR (Single Scale Retinex) model. To obtain the normalized reflectance, the reflectance is rescaled by histogram equalization over a defined range. We approximate the illuminance with the trained manifold, since the illuminance carries most of the variation caused by illumination. By combining these two features in our manifold framework, we obtained efficient face tracking and recognition results on indoor and outdoor video. To improve the video-based tracking results, we update the weight of each face pose density at every frame from the tracking result of the previous frame using the EM algorithm. Our experimental results show that our method is more efficient than other methods.

    Dimensionality Reduction Methods Analysis of Hyperspectral Imagery for Unsupervised Change Detection of Multi-sensor Images (이종 영상 간의 무감독 변화탐지를 위한 초분광 영상의 차원 축소 방법 분석)

    • PARK, Hong-Lyun;PARK, Wan-Yong;PARK, Hyun-Chun;CHOI, Seok-Keun;CHOI, Jae-Wan;IM, Hon-Ryang
      • Journal of the Korean Association of Geographic Information Studies, v.22 no.4, pp.1-11, 2019
    • With the development of remote sensing sensor technology, it has become possible to acquire satellite images with various spectral information. In particular, since a hyperspectral image is composed of continuous, narrow spectral bands, it can be used effectively in fields such as land cover classification, target detection, and environmental monitoring. Change detection techniques using remote sensing data generally operate on differences between data with the same dimensions, so they are difficult to apply to heterogeneous sensors with different dimensions. In this study, we developed a change detection method applicable to a hyperspectral image and a high spatial resolution satellite image with different dimensions, and confirmed its applicability to heterogeneous images. To apply the method, the dimension of the hyperspectral image was reduced using correlation analysis and principal component analysis, and CVA was used as the change detection algorithm. The ROC curve and the AUC were calculated against reference data to evaluate change detection performance. Experimental results show that change detection performance is higher when using the image generated by adequate dimensionality reduction than when using the original hyperspectral image.
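
CVA in the abstract above presumably stands for change vector analysis, which scores each pixel by the length of the difference between its feature vectors at two dates. The sketch below shows only that magnitude step, assuming both images have already been reduced to the same number of bands; the function names, the threshold, and the pixel values are hypothetical.

```python
import math


def cva_magnitude(before, after):
    """Per-pixel Euclidean norm of the spectral change vector.

    `before` and `after` are lists of pixels; each pixel is a list of band
    values with the same length in both images.
    """
    return [
        math.sqrt(sum((a - b) ** 2 for b, a in zip(pb, pa)))
        for pb, pa in zip(before, after)
    ]


def change_mask(before, after, threshold):
    """Flag pixels whose change magnitude exceeds `threshold`."""
    return [m > threshold for m in cva_magnitude(before, after)]


# Two hypothetical pixels with three (dimension-reduced) bands each.
before = [[0.2, 0.3, 0.1], [0.5, 0.5, 0.5]]
after = [[0.2, 0.3, 0.1], [0.8, 0.1, 0.5]]
print(change_mask(before, after, threshold=0.25))
```

Because the difference is taken band by band, both inputs must share the same dimensionality, which is the constraint that motivates reducing the hyperspectral image first.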

    Hand Gesture Recognition Regardless of Sensor Misplacement for Circular EMG Sensor Array System (원형 근전도 센서 어레이 시스템의 센서 틀어짐에 강인한 손 제스쳐 인식)

    • Joo, SeongSoo;Park, HoonKi;Kim, InYoung;Lee, JongShill
      • Journal of Rehabilitation Welfare Engineering & Assistive Technology, v.11 no.4, pp.371-376, 2017
    • In this paper, we propose an algorithm that can recognize patterns regardless of sensor position when performing EMG pattern recognition with a circular EMG system. Fourteen features were extracted from eight-channel EMG signals measured for 1 second for each of six motions. The resulting 112 features from the 8 channels were then analyzed with principal component analysis, and only the most influential data were retained, reducing the input to 8 signals. All experiments were performed using a k-NN classifier, and the data were verified using 5-fold cross validation. In machine learning, results vary greatly depending on the training data. An accuracy of 99.3% was confirmed when using the training data from previous studies. However, when the sensor position was shifted by only 22.5 degrees, the accuracy dropped sharply to 67.28%. The proposed method achieves an accuracy of about 98% and maintains it even when the sensor position is changed. These results are expected to greatly increase convenience for users of the circular EMG system.
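
The reduction from 112 features to 8 inputs described above is a standard PCA projection. The sketch below is one simple way to compute it without external libraries, using power iteration with deflation on the covariance matrix; it is not the authors' code, and the function name `pca`, the iteration count, and the tiny 3-feature data set (a stand-in for the 112-feature vectors) are assumptions.

```python
import math


def pca(rows, n_components, iters=200):
    """Return (axes, scores) using power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered features (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    axes = []
    for _component in range(n_components):
        # Deterministic, non-uniform start vector, normalized to unit length.
        v = [1.0 + j for j in range(d)]
        scale = math.sqrt(sum(x * x for x in v))
        v = [x / scale for x in v]
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left in any direction
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * cov[i][j] * v[j]
                         for i in range(d) for j in range(d))
        axes.append(v)
        # Deflate: remove the found component so the next pass finds the next one.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    # Project every centered sample onto the retained axes.
    scores = [[sum(c[j] * a[j] for j in range(d)) for a in axes]
              for c in centered]
    return axes, scores


# Hypothetical 3-feature samples reduced to 2 components.
samples = [[2.0, 4.1, 0.5], [3.0, 6.0, 0.4], [4.0, 7.9, 0.6], [5.0, 10.2, 0.5]]
axes, scores = pca(samples, n_components=2)
print(len(scores), len(scores[0]))
```

Power iteration is chosen here only to keep the example dependency-free; for 112-dimensional data a library eigendecomposition or SVD routine would be the usual choice.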

    Development of Driver's Emotion and Attention Recognition System using Multi-modal Sensor Fusion Algorithm (다중 센서 융합 알고리즘을 이용한 운전자의 감정 및 주의력 인식 기술 개발)

    • Han, Cheol-Hun;Sim, Kwee-Bo
      • Journal of the Korean Institute of Intelligent Systems, v.18 no.6, pp.754-761, 2008
    • As the automobile industry and its technologies develop, drivers tend to be more concerned with service matters than with mechanical matters. For this reason, interest in recognizing human knowledge and emotion to create a safe and convenient driving environment is steadily increasing. Recognition of human knowledge and emotion belongs to emotion engineering, a technology that has been studied since the late 1980s to provide people with human-friendly services. Emotion engineering analyzes people's emotions through their faces, voices, and gestures, so applying it to automobiles makes it possible to supply drivers with services suited to each driver's situation and to help them drive safely. Furthermore, by recognizing the driver's gestures, we can prevent accidents caused by careless driving or dozing off at the wheel. The purpose of this paper is to develop a system that can recognize the driver's emotional state and attention for safe driving. First, we detect signals of the driver's emotion using bio-motion signals, sleepiness, and attention, and then build several types of databases. By analyzing these databases, we find characteristic features of drivers' emotion, sleepiness, and attention, and fuse the results through a multi-modal method so that the system can be developed.

