• Title/Summary/Keyword: Multivariate Data

Development of Typhoon Damage Forecasting Function of Southern Inland Area By Multivariate Analysis Technique (다변량 통계분석을 이용한 남부 내륙지역 태풍피해예측모형 개발)

  • Kim, Yonsoo;Kim, Taegyun
    • Journal of Wetlands Research / v.21 no.4 / pp.281-289 / 2019
  • In this study, a typhoon damage forecasting model was developed for the southern inland region. Typhoon damage in inland areas is caused by heavy rain and strong winds, so the candidate variables are many and varied, but the damage data available for inland areas are insufficient for developing a model. The hydrological data related to typhoon damage were the maximum hourly rainfall accumulated over 3-hour intervals, the total rainfall, the 1-5 day anticipated rainfall, the maximum wind speed, and the typhoon's central pressure at latitude 33° near Jeju Island. Two multivariate analysis techniques were adopted for the damage forecasting model: cluster analysis, to compensate for the lack of damage data, and principal component analysis, to remove multicollinearity in the rainfall data. When the developed model was applied, the estimated and observed typhoon damage values differed by a factor of up to 2.2. This is because damage caused by strong winds is difficult to estimate, and it is assumed that local rainfall characteristics are not properly captured by the measurements from the 69 ASOS stations.

Multivariate Outlier Removing for the Risk Prediction of Gas Leakage based Methane Gas (메탄 가스 기반 가스 누출 위험 예측을 위한 다변량 특이치 제거)

  • Dashdondov, Khongorzul;Kim, Mi-Hye
    • Journal of the Korea Convergence Society / v.11 no.12 / pp.23-30 / 2020
  • In this study, the relationship between natural gas (NG) data and gas-related environmental elements was analyzed using machine learning algorithms to predict the level of gas leakage risk without directly measuring gas leakage data. The study was based on open data provided by a server using the IoT-based remote control Picarro gas sensor specification. When natural gas leaks into the air, it poses a serious problem for air pollution, the environment, and health. The proposed method is a multivariate outlier removal method based on Random Forest (RF) classification for predicting the risk of NG leaks. After unsupervised k-means clustering, the experimental dataset was imbalanced. Therefore, we focused on how well the proposed models predict medium and high risk. We compared the receiver operating characteristic (ROC) curve, accuracy, area under the ROC curve (AUC), and mean standard error (MSE) for each classification model. In our experiments, the accuracy, AUC, and MSE for MOL_RF were 99.71%, 99.57%, and 0.0016, respectively.

Cryptanalysis of LILI-128 with Overdefined Systems of Equations (과포화(Overdefined) 연립방정식을 이용한 LILI-128 스트림 암호에 대한 분석)

  • 문덕재;홍석희;이상진;임종인;은희천
    • Journal of the Korea Institute of Information Security & Cryptology / v.13 no.1 / pp.139-146 / 2003
  • In this paper we demonstrate a cryptanalysis of the stream cipher LILI-128. Our approach is to solve an overdefined system of multivariate equations. The LILI-128 keystream generator$^{[8]}$ is an LFSR-based synchronous stream cipher with a 128-bit key. The cipher consists of two parts, the “CLOCK CONTROL” part and the “DATA GENERATION” part. We focus on the “DATA GENERATION” part. This part uses the function $f_d$, which satisfies third-order correlation immunity, high nonlinearity, and balancedness. However, this function does not have a high nonlinear order (i.e., a high degree in its algebraic normal form). We exploit this property of $f_d$. We reduce the problem of recovering the secret key of LILI-128 to the problem of solving a largely overdefined system of multivariate equations of degree K=6. In our best version of the XL-based cryptanalysis, the parameter is D=7. Our fastest cryptanalysis of LILI-128 requires $2^{110.7}$ CPU clocks. This complexity can be achieved using only $2^{26.3}$ keystream bits.

Analysis of the Necessity of Introducing the Obligation to Take Safety and Health Measures for Construction Orderers using Multivariate Analysis (다변량 분석을 이용한 건설업 발주자의 안전보건조치 의무 도입 필요성 분석)

  • Lim, Se Jong;Seo, Jae Min;Won, Jeong-Hun;Kim, Chang-Won
    • Journal of the Korean Society of Safety / v.37 no.1 / pp.20-29 / 2022
  • To stem the ever-prevalent occurrence of industrial accidents in the construction industry, which is emerging as a social problem, efforts must be invested by various stakeholders. Specifically, among stakeholders, the orderer is at the top of a project's decision-making structure. Therefore, the orderer's awareness of safety and health directly affects the process of securing the safety of the overall construction site. In this light, the present study aims to identify differences in the perceptions of each stakeholder regarding the obligatory safety and health measures for orderers that have recently been introduced. In addition, it suggests specific implementation plans in the Korean context. The data used for analysis were collected through a survey targeting stakeholders such as orderers, safety managers, and site managers, and the collected data were quantitatively reviewed by using multivariate analysis methods such as analysis of variance. As a result of the analysis, the introduction of safety and health obligations for the orderer was found to be necessary, and the designation and operation of safety and health experts as an action plan was deemed reasonable. The authors expect that the results of this study can be used as basic data for revising the related regulations in Korea. Moreover, as a further study, a review of the effectiveness after improving regulations would contribute strongly to the domain.

Short-term Construction Investment Forecasting Model in Korea (건설투자(建設投資)의 단기예측모형(短期豫測模型) 비교(比較))

  • Kim, Kwan-young;Lee, Chang-soo
    • KDI Journal of Economic Policy / v.14 no.1 / pp.121-145 / 1992
  • This paper examines the characteristics of time series data related to construction investment (stationarity and time series components such as secular trend, cyclical fluctuation, seasonal variation, and random change) and surveys the predictability, fitness, and explanatory power of the independent variables of various models in order to build a short-term construction investment forecasting model suitable for current economic circumstances. Unit root tests, autocorrelation coefficients, and spectral density function analysis show that the related time series do not have unit roots, fluctuate cyclically, and are largely explained by lagged variables. Moreover, for short-term construction investment forecasting it is very important to grasp the time-lag relation between the construction investment series and leading indicators such as building construction permits and the value of construction orders received. In chapter 3, we explain 7 forecasting models: univariate time series models (ARIMA and a multiplicative linear trend model), multivariate time series models using leading indicators (a 1st-order autoregressive model, a vector autoregressive model, and an error correction model), and multivariate time series models using National Accounts data (a simple reduced-form model disconnected from the simultaneous macroeconomic model, and a VAR model). These models are evaluated with 4 statistical tools: average absolute error, root mean square error, adjusted coefficient of determination, and the Durbin-Watson statistic. The analysis shows two things. First, multivariate models are more suitable than univariate models, in that the forecasting errors of multivariate models tend to decrease, unlike those of univariate models. Second, the VAR model is superior to the other multivariate models: its average absolute prediction error and root mean square error are quite low, and its adjusted coefficient of determination is higher. This conclusion is reasonable considering that current construction investment has sustained overheated growth above its secular trend.
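
As a rough illustration of the four evaluation statistics named in this abstract, here is a minimal Python sketch. The function names and the short `actual`/`predicted` series are hypothetical and made up for illustration; they are not data or code from the paper, and the paper's exact definitions may differ.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average absolute forecasting error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_square_error(actual, predicted):
    """Square root of the mean squared forecasting error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def adjusted_r_squared(actual, predicted, n_regressors):
    """Coefficient of determination, penalized for the number of regressors."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_regressors - 1)

def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical series, for illustration only.
actual = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
predicted = [9.0, 12.5, 11.5, 12.0, 15.5, 14.5]
residuals = [a - p for a, p in zip(actual, predicted)]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
adj_r2 = adjusted_r_squared(actual, predicted, n_regressors=1)
dw = durbin_watson(residuals)
print(mae, rmse, adj_r2, dw)
```

Lower average absolute error and RMSE together with a higher adjusted coefficient of determination is the pattern the abstract reports for the VAR model.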

A Study on Generation of Stochastic Rainfall Variation using Multivariate Monte Carlo method (다변량 Monte Carlo 기법을 이용한 추계학적 강우 변동 생성기법에 관한 연구)

  • Ahn, Ki-Hong;Han, Kun-Yeun
    • Journal of the Korean Society of Hazard Mitigation / v.9 no.3 / pp.127-133 / 2009
  • In this study, dimensionless cumulative rainfall curves were generated by a multivariate Monte Carlo method. To generate the rainfall curves, rainfall storms were separated and made dimensionless, since it was necessary to remove spatial and temporal variances as well as differences in the rainfall data. The dimensionless rainfall curves were divided into 4 types, and a log-ratio method was introduced to overcome the constraints that the elements of a dimensionless cumulative rainfall curve must always be greater than zero and must sum to one. An orthogonal transformation by the Johnson system and a constrained non-normal multivariate Monte Carlo simulation were introduced to analyse the rainfall characteristics. This technique for generating stochastic rainfall variation with the multivariate Monte Carlo method will contribute to the design and evaluation of hydrosystems and can be used to establish flood disaster prevention systems.
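
The idea behind a log-ratio method can be sketched as follows. The abstract does not say which log-ratio variant was used, so this sketch assumes the additive log-ratio (alr) transform; the function names and the sample increments are hypothetical. The point is that values sampled freely in the transformed space always map back to positive elements that sum to one.

```python
import math

def alr(composition):
    """Additive log-ratio: map D positive parts summing to 1 to D-1 unconstrained reals."""
    last = composition[-1]
    return [math.log(x / last) for x in composition[:-1]]

def alr_inverse(values):
    """Inverse transform: any real vector maps back to positive parts that sum to 1."""
    exps = [math.exp(v) for v in values]
    total = 1.0 + sum(exps)
    return [e / total for e in exps] + [1.0 / total]

# Hypothetical dimensionless rainfall increments (positive, summing to 1).
increments = [0.1, 0.3, 0.4, 0.2]
transformed = alr(increments)
recovered = alr_inverse(transformed)

# Any unconstrained sample, e.g. one drawn in a Monte Carlo step, stays valid
# after back-transformation.
sampled = alr_inverse([-5.0, 3.0, 0.2])
print(recovered, sampled)
```

Simulating in the transformed space and then inverting avoids having to reject samples that violate the positivity or unit-sum constraints.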

A Survey on Unsupervised Anomaly Detection for Multivariate Time Series (다변량 시계열 이상 탐지 과업에서 비지도 학습 모델의 성능 비교)

  • Juwan Lim;Jaekoo Lee
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.1 / pp.1-12 / 2023
  • Obtaining labeled data for anomaly detection tasks on multivariate time series is very time-intensive. Therefore, several studies have been conducted on unsupervised learning, which does not require any labels. However, no thorough integrative survey has provided an in-depth discussion of learning architectures and their properties for multivariate time series anomaly detection. This study aims to explore the characteristics of well-known architectures for anomaly detection in multivariate time series. Additionally, the architectures were categorized using top-down and bottom-up approaches. In order to consider real-world anomaly detection situations, we trained models with datasets, such as those from power grids or Cyber Physical Systems, that contain realistic anomalies. From the experimental results, we compared and analyzed the comprehensive performance of each architecture. Quantitative performance was measured using precision, recall, and F1 scores.
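
For reference, the three scores named in this abstract can be computed from binary anomaly labels as in the minimal sketch below. It assumes plain point-wise scoring (each time step counted independently); the survey may use a different protocol, and the function name and label vectors are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Point-wise precision, recall, and F1 for binary labels (1 = anomaly)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when nothing is predicted or nothing is anomalous.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical ground truth and detector output over ten time steps.
y_true = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 0, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```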

Shelf-life prediction of fresh ginseng packaged with plastic films based on a kinetic model and multivariate accelerated shelf-life testing

  • Jong-Jin Park;Jeong-Hee Choi;Kee-Jai Park;Jeong-Seok Cho;Dae-Yong Yun;Jeong-Ho Lim
    • Food Science and Preservation / v.30 no.4 / pp.573-588 / 2023
  • The purpose of this study was to monitor changes in the quality of ginseng and predict its shelf-life. As the storage period of ginseng increased, some quality indicators, such as water-soluble pectin (WSP), CDTA-soluble pectin (CSP), cellulose, weight loss, and microbial growth increased, while others (Na$_2$CO$_3$-soluble pectin/NSP, hemicellulose, starch, and firmness) decreased. Principal component analysis (PCA) was performed using the quality attribute data and the principal component 1 (PC1) scores extracted from the PCA results were applied to the multivariate analysis. The reaction rate at different temperatures and the temperature dependence of the reaction rate were determined using kinetic and Arrhenius models, respectively. Among the kinetic models, zeroth-order models with cellulose and a PC1 score provided an adequate fit for reaction rate estimation. Hence, the prediction model was constructed by applying the cellulose and PC1 scores to the zeroth-order kinetic and Arrhenius models. The prediction model with PC1 score showed higher $R^2$ values (0.877-0.919) than those of cellulose (0.797-0.863), indicating that multivariate analysis using PC1 score is more accurate for the shelf-life prediction of ginseng. The predicted shelf-life using the multivariate accelerated shelf-life test at 5, 20, and 35℃ was 40, 16, and 7 days, respectively.
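
The link between a zeroth-order kinetic model and the Arrhenius model can be illustrated with a back-of-the-envelope Python sketch using only the three predicted shelf-lives quoted in this abstract. It assumes the same quality limit applies at every temperature, so that the zeroth-order rate constant is inversely proportional to shelf-life and ln(1/t) is linear in 1/T. This is not the authors' fitted model; the variable names and the 10℃ interpolation are hypothetical.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol*K)

# Predicted shelf-lives from the abstract: temperature in deg C -> days.
shelf_life_days = {5.0: 40.0, 20.0: 16.0, 35.0: 7.0}

# Arrhenius form: ln k = ln A - Ea / (R * T). With k proportional to 1/t
# (zeroth-order kinetics, fixed quality limit), regress ln(1/t) on 1/T.
xs = [1.0 / (c + 273.15) for c in shelf_life_days]
ys = [math.log(1.0 / t) for t in shelf_life_days.values()]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
intercept = y_mean - slope * x_mean

# The slope equals -Ea / R, giving an apparent activation energy.
activation_energy_kj = -slope * R_GAS / 1000.0

def predicted_shelf_life(temp_c):
    """Shelf-life in days implied by the fitted line at a given temperature."""
    return 1.0 / math.exp(intercept + slope / (temp_c + 273.15))

shelf_life_at_10c = predicted_shelf_life(10.0)
print(activation_energy_kj, shelf_life_at_10c)
```

Because only three points are used, the fit is a consistency check on the reported values rather than an independent estimate.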

An Analysis of Engine Failures Using Multivariate Data Analysis Method (다변량해석법을 이용한 기관고장분석)

  • 윤석훈
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.23 no.4 / pp.198-203 / 1987
  • The basis of all approaches to improving the reliability of marine engines lies in analyzing field data on marine engine troubles and failures. This paper analyses such data using Principal Component Analysis, one of the multivariate data analysis methods. A total of 211 data records were investigated, covering an observation period of 9 years. The analyzed factors are categorized into five groups: electric/automatic control equipment, auxiliary machinery, piping, refrigerators/air conditioners, and main engine. Failures in the main engine are discovered through definite evidence of malfunction, whereas failures in auxiliary machinery, refrigerators, and air conditioners are discovered through the sensible judgement of the operators.

Application of metabolic profiling for biomarker discovery

  • Hwang, Geum-Sook
    • Proceedings of the Korean Society of Applied Pharmacology / 2007.11a / pp.19-27 / 2007
  • An important potential of the metabolomics-based approach is the possibility of developing fingerprints of diseases or of cellular responses to classes of compounds with a known common biological effect. Such fingerprints have the potential to allow classification of disease states or compounds, to provide mechanistic information on cellular perturbations and pathways, and to identify biomarkers specific to disease severity and drug efficacy. Metabolic profiles of biological fluids contain a vast array of endogenous metabolites. Changes in those profiles resulting from perturbations of the system can be observed using analytical techniques such as NMR and MS. $^1H$ NMR was used to generate a molecular fingerprint of a serum or urine sample, and a pattern recognition technique was then applied to identify molecular signatures associated with specific diseases or drug efficiency. Several metabolites that differentiate disease samples from the control were thoroughly characterized by NMR spectroscopy. We investigated the metabolic changes in human normal and clinical samples using $^1H$ NMR. Targeted profiling and spectral binning methods were applied to the spectral data, and multivariate statistical data analysis (MVDA) was then used to examine in detail the modulation of small-molecule candidate biomarkers. We show that targeted profiling produces robust models, generates accurate metabolite concentration data, and provides data that can be used to help understand metabolic differences between healthy and diseased populations. Such metabolic signatures could provide diagnostic markers for a disease state or biomarkers for drug response phenotypes.
