• Title/Summary/Keyword: Multivariate Data


Functional ARCH analysis for a choice of time interval in intraday return via multivariate volatility (함수형 ARCH 분석 및 다변량 변동성을 통한 일중 로그 수익률 시간 간격 선택)

  • Kim, D.H.; Yoon, J.E.; Hwang, S.Y.
    • The Korean Journal of Applied Statistics / v.33 no.3 / pp.297-308 / 2020
  • We focus on functional autoregressive conditional heteroscedasticity (fARCH) modelling to analyze intraday volatilities based on high-frequency financial time series. Multivariate volatility models are investigated to approximate fARCH(1). A formula for multi-step-ahead volatilities of the fARCH(1) model is derived. As an application, the choice of an appropriate time interval for the intraday return when implementing fARCH(1) is discussed. An analysis of high-frequency KOSPI data illustrates the main contributions of the article.
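
The abstract does not reproduce the fARCH(1) multi-step-ahead formula, so the following is only a scalar analogue: a minimal sketch of multi-step-ahead variance forecasts for an ordinary ARCH(1) model, sigma2_{t+1} = omega + alpha * eps_t^2. The function name and the parameter values are illustrative assumptions, not taken from the paper.

```python
def arch1_forecast(omega, alpha, last_sq_return, horizon):
    """Variance forecasts [sigma2_{t+1|t}, ..., sigma2_{t+horizon|t}] for a scalar ARCH(1).

    Assumed model (not the paper's functional version):
        sigma2_{t+1} = omega + alpha * eps_t**2
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    forecasts = [omega + alpha * last_sq_return]  # one-step-ahead variance
    for _ in range(horizon - 1):
        # The conditional expectation of a future squared return equals the
        # variance forecast for that step, so the recursion closes on itself.
        forecasts.append(omega + alpha * forecasts[-1])
    return forecasts


# Hypothetical parameters: the forecasts decay towards omega / (1 - alpha) = 0.2.
path = arch1_forecast(omega=0.1, alpha=0.5, last_sq_return=0.4, horizon=3)
```

When alpha is below 1 the forecasts converge geometrically to the unconditional variance omega / (1 - alpha), which is why longer horizons carry less information about the most recent return.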

Prediction Model of Final Project Cost using Multivariate Probabilistic Analysis (MPA) and Bayes' Theorem

  • Yoo, Wi Sung; Hadipriono, Fabian C.
    • Korean Journal of Construction Engineering and Management / v.8 no.5 / pp.191-200 / 2007
  • This paper introduces a tool for predicting potential cost overrun during project execution and for quantifying the uncertainty in the expected project cost, which occasionally changes because of unknown effects resulting from project complications and unforeseen environments. The model proposed in this study is useful for diagnosing cost performance as a project progresses and for monitoring changes in the uncertainty as indicators for a warning signal. The model is intended for project managers who forecast the change in the uncertainty and its magnitude. The paper presents a mathematical approach for modifying the costs of incomplete work packages and the project cost, and for quantifying the reduced uncertainties at a consistent confidence level as actual cost information from an ongoing project is obtained. Furthermore, the approach addresses the effects of actual data from completed work packages on the re-estimates of incomplete work packages, and describes the impact on the variation of the uncertainty in the expected project cost, incorporating Multivariate Probabilistic Analysis (MPA) and Bayes' Theorem. For illustration, the introduced model is applied to an example construction project. The results are analyzed to demonstrate the use of the model and illustrate its capabilities.
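
The abstract does not give the MPA equations. As a rough illustration of how an observed cost can revise the estimate of a remaining work package, the sketch below conditions a bivariate normal cost model on the actual cost of a completed package. The joint-normal assumption, the function name, and all numbers are hypothetical.

```python
import math


def update_remaining_cost(mu_done, sd_done, mu_left, sd_left, rho, actual_done):
    """Condition a bivariate normal cost model on an observed package cost.

    Returns the revised (mean, standard deviation) of the remaining package.
    Assumes the two package costs are jointly normal with correlation rho.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Conditional mean shifts in proportion to the observed deviation.
    mean = mu_left + rho * (sd_left / sd_done) * (actual_done - mu_done)
    # Conditional spread shrinks whenever the packages are correlated.
    sd = sd_left * math.sqrt(1.0 - rho ** 2)
    return mean, sd


# Hypothetical figures: the finished package came in 10 over its estimate of 100.
revised_mean, revised_sd = update_remaining_cost(100.0, 10.0, 200.0, 20.0, 0.6, 110.0)
```

With these made-up inputs the remaining estimate rises from 200 to 212 while its standard deviation falls from 20 to 16, the kind of uncertainty reduction the abstract describes as actual cost information arrives.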

Evaluating seismic liquefaction potential using multivariate adaptive regression splines and logistic regression

  • Zhang, Wengang; Goh, Anthony T.C.
    • Geomechanics and Engineering / v.10 no.3 / pp.269-284 / 2016
  • Simplified techniques based on in situ testing methods are commonly used to assess seismic liquefaction potential. Many of these simplified methods were developed by analyzing liquefaction case histories from which the liquefaction boundary (limit state) separating two categories (the occurrence or non-occurrence of liquefaction) is determined. As the liquefaction classification problem is highly nonlinear in nature, it is difficult to develop a comprehensive model using conventional modeling techniques that take into consideration all the independent variables, such as the seismic and soil properties. In this study, a modification of the Multivariate Adaptive Regression Splines (MARS) approach based on Logistic Regression (LR), termed LR_MARS, is used to evaluate seismic liquefaction potential based on actual field records. Three different LR_MARS models were used to analyze three different field liquefaction databases, and the results are compared with those of neural network approaches. The developed spline functions and the limit state functions obtained reveal that the LR_MARS models can capture and describe the intrinsic, complex relationship between seismic parameters, soil parameters, and the liquefaction potential without having to make any assumptions about the underlying relationship between the variables. Considering its computational efficiency, simplicity of interpretation, predictive accuracy, data-driven and adaptive nature, and ability to map the interaction between variables, the LR_MARS model is promising for assessing seismic liquefaction potential.
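
To make the combination of spline terms and a logistic link concrete, here is a minimal single-predictor sketch: MARS-style hinge functions feed a linear score, which a logistic function maps to a probability. The knots, coefficients, and function names are hypothetical and are not the fitted LR_MARS models from the paper.

```python
import math


def hinge(x, knot, direction=1):
    """MARS basis function: max(0, x - knot) for direction +1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def liquefaction_probability(x, intercept, terms):
    """Logistic link applied to a sum of hinge terms.

    terms is a list of (coefficient, knot, direction) tuples; every value
    used below is a made-up placeholder.
    """
    score = intercept + sum(c * hinge(x, k, d) for c, k, d in terms)
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical model: probability falls as the predictor rises above a knot at 15.
example_terms = [(-0.4, 15.0, 1), (0.2, 15.0, -1)]
p_low = liquefaction_probability(10.0, 0.0, example_terms)
p_high = liquefaction_probability(25.0, 0.0, example_terms)
```

Because each hinge is zero on one side of its knot, the fitted relationship is piecewise linear in the score, which is what lets the approach avoid assuming a single global functional form.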

Differentiation of Roots of Glycyrrhiza Species by 1H Nuclear Magnetic Resonance Spectroscopy and Multivariate Statistical Analysis

  • Yang, Seung-Ok; Hyun, Sun-Hee; Kim, So-Hyun; Kim, Hee-Su; Lee, Jae-Hwi; Whang, Wan-Kyun; Lee, Min-Won; Choi, Hyung-Kyoon
    • Bulletin of the Korean Chemical Society / v.31 no.4 / pp.825-828 / 2010
  • To classify Glycyrrhiza species, samples of different species were analyzed by a $^1$H NMR-based metabolomics technique. Partial least squares discriminant analysis (PLS-DA) was used for the multivariate statistical analysis of the $^1$H NMR data sets. There was a clear separation between the Glycyrrhiza species in the PLS-DA derived score plots. The PLS-DA model was validated, and the key metabolites contributing to the separation in the score plots were lactic acid, alanine, arginine, proline, malic acid, asparagine, choline, glycine, glucose, sucrose, 4-hydroxyphenylacetic acid, and formic acid. The compounds present at relatively high levels were glucose and 4-hydroxyphenylacetic acid in G. glabra; lactic acid, alanine, and proline in G. inflata; and arginine, malic acid, and sucrose in G. uralensis. This is the first study to perform global metabolomic profiling and differentiation of Glycyrrhiza species using $^1$H NMR and multivariate statistical analysis.

An evolutionary hybrid optimization of MARS model in predicting settlement of shallow foundations on sandy soils

  • Luat, Nguyen-Vu; Nguyen, Van-Quang; Lee, Seunghye; Woo, Sungwoo; Lee, Kihak
    • Geomechanics and Engineering / v.21 no.6 / pp.583-598 / 2020
  • This study proposes a new hybrid artificial intelligence model, an integrative genetic algorithm with multivariate adaptive regression splines (GA-MARS), for settlement prediction of shallow foundations on sandy soils. In this hybrid model, an evolutionary algorithm, the genetic algorithm (GA), was used to search for and optimize the hyperparameters of multivariate adaptive regression splines (MARS). For this purpose, a total of 180 experimental records were collected from the available literature and analyzed, with five input variables: the breadth of the foundation (B), length-to-width ratio (L/B), embedment ratio (Df/B), foundation net applied pressure (qnet), and average SPT blow count (NSPT). In further analysis, a new explicit formulation was derived from MARS and its accuracy was compared with four available formulae. The results indicated that the proposed GA-MARS model was more robust and performed better than the available methods.

Pan evaporation modeling using multivariate adaptive regression splines (다변량 적응 회귀 스플라인을 이용한 증발접시 증발량 모델링)

  • Seo, Youngmin; Kim, Sungwon
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.351-354 / 2018
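  • This study evaluated the performance of a multivariate adaptive regression splines (MARS) model for modeling daily pan evaporation. The model input sets were built from combinations of variables highly correlated with pan evaporation, using meteorological data collected at the Busan station (Korea Meteorological Administration); three input sets were determined from combinations of solar radiation, sunshine duration, mean ground temperature, and maximum air temperature. The performance of the MARS models was quantified with four model performance indices, and the results were compared with those of artificial neural network (ANN) models. For Set 1, with solar radiation and sunshine duration as inputs, the MARS1 model outperformed the ANN1 model; for Set 2 (solar radiation, sunshine duration, mean ground temperature) the ANN2 model performed relatively better; and for Set 3 (solar radiation, sunshine duration, mean ground temperature, maximum air temperature) the MARS3 model performed relatively better. Across all models, performance ranked in the order MARS3, ANN2, ANN3, MARS2, MARS1, ANN1; in particular, the MARS3 model gave the best daily pan evaporation modeling performance, with CE = 0.790, $r^2=0.800$, RMSE = 0.762, and MAE = 0.587. The MARS model applied in this study is therefore judged to be an effective alternative for modeling daily pan evaporation from surface meteorological observations.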
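
The abstract reports four performance indices without defining them. The sketch below computes them under common definitions, assuming CE denotes the Nash-Sutcliffe coefficient of efficiency and $r^2$ the squared Pearson correlation; the function name and the sample values are illustrative only.

```python
import math


def performance_metrics(observed, predicted):
    """Return CE, r2, RMSE and MAE for paired observed and predicted values.

    CE is taken here to be the Nash-Sutcliffe efficiency (an assumption;
    the abstract does not spell out the definition).
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return {
        "CE": 1.0 - sse / sst,
        "r2": cov ** 2 / (sst * var_p),
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
    }


# Made-up values: predictions are biased high by a constant 0.5.
scores = performance_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

In this toy case $r^2$ is 1 because the predictions track the observations perfectly up to an offset, while CE drops to 0.8 because it penalizes the bias; reporting both, as the study does, separates correlation from accuracy.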


MBRDR: R-package for response dimension reduction in multivariate regression

  • Heesung Ahn; Jae Keun Yoo
    • Communications for Statistical Applications and Methods / v.31 no.2 / pp.179-189 / 2024
  • In multivariate regression with a high-dimensional response $Y \in \mathbb{R}^r$ (where $r \geq 2$) and a relatively low-dimensional predictor $X \in \mathbb{R}^p$, the statistical analysis of such data presents significant challenges due to the exponential increase in the number of parameters as the dimension of the response grows. Most existing dimension reduction techniques focus on reducing the dimension of the predictors (X), not the dimension of the response variable (Y). Yoo and Cook (2008) introduced a response dimension reduction method that preserves information about the conditional mean E(Y | X). Building upon this foundational work, Yoo (2018) proposed two semi-parametric methods, principal response reduction (PRR) and principal fitted response reduction (PFRR), then expanded these methods to unstructured principal fitted response reduction (UPFRR) (Yoo, 2019). This paper reviews the four response dimension reduction methodologies mentioned above. In addition, it introduces the implementation of the mbrdr package in R. The mbrdr package is a unique tool in the R community, as it is specifically designed for response dimension reduction, setting it apart from existing dimension reduction packages that focus solely on predictors.

Estimation of Brain Connectivity during Motor Imagery Tasks using Noise-Assisted Multivariate Empirical Mode Decomposition

  • Lee, Ki-Baek; Kim, Ko Keun; Song, Jaeseung; Ryu, Jiwoo; Kim, Youngjoo; Park, Cheolsoo
    • Journal of Electrical Engineering and Technology / v.11 no.6 / pp.1812-1824 / 2016
  • The neural dynamics underlying the causal network during motor planning or imagery in the human brain are not well understood. The lack of signal processing tools suitable for the analysis of nonlinear and nonstationary electroencephalographic (EEG) signals hinders such analyses. In this study, noise-assisted multivariate empirical mode decomposition (NA-MEMD) is used to estimate causal inference in the frequency domain, i.e., partial directed coherence (PDC). Natural and intrinsic oscillations corresponding to the motor imagery tasks can be extracted owing to the data-driven approach of NA-MEMD, which does not employ predefined basis functions. Simulations based on synthetic data with a time delay between two signals demonstrated that NA-MEMD was the optimal method for estimating the delay. Furthermore, classification analysis of the motor imagery responses of 29 subjects revealed that NA-MEMD is a prerequisite process for estimating the causal network across multichannel EEG data during mental tasks.

A Study on Measuring the Similarity Among Sampling Sites in Lake Yongdam with Water Quality Data Using Multivariate Techniques (다변량기법을 활용한 용담호 수질측정지점 유사성 연구)

  • Lee, Yosang; Kwon, Sehyug
    • Journal of Environmental Impact Assessment / v.18 no.6 / pp.401-409 / 2009
  • Multivariate statistical approaches are discussed for designing an optimal water quality monitoring network: sampling sites are classified by measuring their similarity from water quality data, and the characteristics of the resulting clusters are examined. For the empirical study, two years of data (2005, 2006) were collected in Yongdam reservoir at 9 sampling sites, combining 2 depth levels and 7 important water quality variables. The similarity among sampling sites is measured with Euclidean distances between the water quality variables, and the sites are classified by a hierarchical clustering method. The clustered sites are discussed using principal component variables, in view of their geographical characteristics and of reducing the number of measuring sites. The nine sampling sites are clustered as follows. Sites 5, 6, and 7 form one cluster characterized by low water depth and the main stream. Sites 2 and 4 are clustered into the same group by hydraulic characteristics that derive from the main stream, but their patterns of water quality change appear to differ because site 2 is near the dam. Sites 3, 8, and 9 are each positioned individually because they lie on different tributaries.
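
The similarity-then-clustering procedure can be sketched in a few lines. The example below measures Euclidean distances between sites and merges the closest clusters agglomeratively. Single linkage is an assumption (the abstract does not name the linkage rule), and the site labels and values are hypothetical; real water quality variables would normally be standardized first.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(points, n_clusters):
    """Agglomerative clustering of named points down to n_clusters groups.

    points maps a site label to its vector of (ideally standardized) variables.
    The distance between clusters is the smallest pairwise member distance.
    """
    clusters = [[name] for name in points]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters whose closest members are nearest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                euclidean(points[a], points[b])
                for a in clusters[ij[0]]
                for b in clusters[ij[1]]
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return [sorted(c) for c in clusters]


# Hypothetical two-variable site profiles.
sites = {"s1": (0.0, 0.0), "s2": (0.0, 1.0), "s3": (10.0, 10.0),
         "s4": (10.0, 11.0), "s5": (50.0, 50.0)}
groups = single_linkage(sites, 3)
```

A site that ends up alone in its cluster, like the hypothetical s5 here, plays the role of the individually positioned sites in the abstract, while sites grouped together are candidates for consolidation when reducing the number of measuring points.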

EXPERIMENTAL ANALYSIS OF DRIVING PATTERNS AND FUEL ECONOMY FOR PASSENGER CARS IN SEOUL

  • Sa, J.-S.; Chung, N.-H.; Sunwoo, M.-H.
    • International Journal of Automotive Technology / v.4 no.2 / pp.101-108 / 2003
  • Many factors influence automotive fuel economy, such as average trip time per kilometer, average trip speed, and the number of vehicle stops. These factors depend on road conditions and the traffic environment. In this study, various driving data were measured and recorded during road tests in Seoul. The accumulated road test mileage is around 1,300 kilometers. The objective of the study is to identify the driving patterns of the Seoul metropolitan area and to analyze fuel economy based on these driving patterns. The driving data acquired through the road tests were analyzed statistically to obtain the driving characteristics via modal analysis, speed analysis, and speed-acceleration analysis. Moreover, the driving data were analyzed by multivariate statistical techniques, including correlation analysis, principal component analysis, and multiple linear regression analysis, to obtain the relationships among the factors influencing fuel economy. The results show that the average speed is around 29.2 km/h and the average fuel economy is 10.23 km/L. Vehicle speed in the Seoul metropolitan area is lower, and stop-and-go operation more frequent, than in the FTP-75 test mode, which is used for emission and fuel economy tests. The average trip time per kilometer is one of the most important factors in fuel consumption, and increasing the average speed is desirable for reducing emissions and fuel consumption.