• Title/Summary/Keyword: multivariate data analysis

Nonlinear structural modeling using multivariate adaptive regression splines

  • Zhang, Wengang;Goh, A.T.C.
    • Computers and Concrete / v.16 no.4 / pp.569-585 / 2015
  • Various computational tools are available for modeling highly nonlinear structural engineering problems that lack a precise analytical theory or understanding of the phenomena involved. This paper adopts a fairly simple nonparametric adaptive regression algorithm known as multivariate adaptive regression splines (MARS) to model the nonlinear interactions between variables. The MARS method makes no specific assumptions about the underlying functional relationship between the input variables and the response. Details of the MARS methodology and its associated procedures are introduced first, followed by a number of examples including three practical structural engineering problems. These examples indicate the accuracy of the MARS prediction approach. Additionally, MARS is able to assess the relative importance of the design variables. As MARS explicitly defines the intervals for the input variables, the model gives engineers insight into where significant changes in the data may occur. An example is also presented to demonstrate how the MARS-developed model can be used to carry out structural reliability analysis.
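The interval structure mentioned in the abstract comes from the hinge basis functions that MARS models are built from. The following is a minimal sketch of evaluating such a model for one input variable; the coefficients, the knot location, and the function names `hinge` and `mars_predict` are hypothetical and are not taken from the paper.

```python
def hinge(x, knot, direction=1):
    """MARS hinge basis: max(0, x - knot) for direction=+1,
    or max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-style model at a scalar input x.

    terms is a list of (coefficient, knot, direction) tuples.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical fitted model with a single knot at x = 3:
# slope 2.0 to the right of the knot, slope 0.5 to the left.
terms = [(2.0, 3.0, 1), (-0.5, 3.0, -1)]

right_of_knot = mars_predict(5.0, 1.0, terms)  # 1 + 2.0 * (5 - 3)
left_of_knot = mars_predict(1.0, 1.0, terms)   # 1 - 0.5 * (3 - 1)
```

Because each hinge is zero on one side of its knot, the knot marks a point where the fitted slope changes, which is why the model exposes the intervals of each input.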

A rolling analysis on the prediction of value at risk with multivariate GARCH and copula

  • Bai, Yang;Dang, Yibo;Park, Cheolwoo;Lee, Taewook
    • Communications for Statistical Applications and Methods / v.25 no.6 / pp.605-618 / 2018
  • Risk management has been a crucial part of the daily operations of the financial industry over the past two decades. Value at Risk (VaR), a quantitative measure introduced by JP Morgan in 1995, is the most popular and simplest quantitative measure of risk. VaR has been widely applied to risk evaluation across all types of financial activities, including portfolio management and asset allocation. This paper uses implementations of multivariate GARCH models and copula methods to illustrate the performance of a one-day-ahead VaR prediction modeling process for high-dimensional portfolios. Many factors, such as the interaction among the included assets, are incorporated in the modeling process. Additionally, empirical data analyses and backtesting results are demonstrated through a rolling analysis, which helps capture the instability of parameter estimates. We find that our way of modeling is relatively robust and flexible.
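The rolling, one-day-ahead backtesting idea can be sketched as follows. This sketch deliberately swaps in a plain historical-simulation quantile for the multivariate GARCH and copula models of the paper, so it only illustrates the rolling-window mechanics; the return series, window length, and function names are hypothetical.

```python
import math


def historical_var(returns, alpha):
    """Empirical VaR: the negated lower alpha-quantile of past returns,
    reported as a positive loss (nearest-rank quantile)."""
    ordered = sorted(returns)
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    return -ordered[k]


def rolling_backtest(returns, window, alpha):
    """For each day t, forecast VaR from the previous `window` returns
    and count a violation when the realized loss exceeds the forecast."""
    forecasts, violations = [], 0
    for t in range(window, len(returns)):
        var_t = historical_var(returns[t - window:t], alpha)
        forecasts.append(var_t)
        if returns[t] < -var_t:
            violations += 1
    return forecasts, violations


# Hypothetical daily portfolio returns.
returns = [0.01, -0.02, 0.03, -0.01, -0.05, 0.02]
forecasts, violations = rolling_backtest(returns, window=4, alpha=0.25)
```

In a backtest, the observed violation rate is compared with the nominal level `alpha`; re-estimating on each window is what lets the forecast track unstable parameters.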

A Review of Time Series Analysis for Environmental and Ecological Data (환경생태 자료 분석을 위한 시계열 분석 방법 연구)

  • Mo, Hyoung-ho;Cho, Kijong;Shin, Key-Il
    • Korean Journal of Environmental Biology / v.34 no.4 / pp.365-373 / 2016
  • Much of the data used in environmental and ecological analysis is collected over time. If the number of time points is small, the data carry too little information, so repeated measurements or data from multiple survey points should be used for a comprehensive analysis. Longitudinal data analysis or mixed model analysis is used in that case. However, if a large number of time points provides sufficient information, repeated data are not needed and the data are analyzed using time series techniques. In particular, when many data points are available and we want to predict how the variables affect each other or what future trends to expect, time series analysis techniques should be used. In this study, we introduce univariate time series analysis, the intervention time series model, the transfer function model, and the multivariate time series model, and review research papers published in Korea. We also introduce an error correction model, which can be used to analyze environmental and ecological data.

Classification of Groundwater Level Fluctuation Characteristics Using Statistical Analysis (통계분석을 이용한 지하수위 변동 특성 분류)

  • 문상기;우남칠
    • Proceedings of the Korean Society of Soil and Groundwater Environment Conference / 2001.09a / pp.155-159 / 2001
  • A study on multivariate statistical classification of groundwater hydrographs was conducted. Extensive data from the national groundwater monitoring network (78 alluvial sites) were used. Six factors were selected to classify groundwater level changes. Factor analysis proved to be a useful tool for classifying large hydrogeological datasets.

Selection probability of multivariate regularization to identify pleiotropic variants in genetic association studies

  • Kim, Kipoong;Sun, Hokeun
    • Communications for Statistical Applications and Methods / v.27 no.5 / pp.535-546 / 2020
  • In genetic association studies, pleiotropy is a phenomenon where a variant or a genetic region affects multiple traits or diseases. There have been many studies identifying cross-phenotype genetic associations. However, most statistical approaches for detecting pleiotropy are based on individual tests, where the association of a single variant with multiple traits is tested one variant at a time. These approaches fail to account for relations among correlated variants. Recently, multivariate regularization methods have been proposed to detect pleiotropy in the analysis of high-dimensional genomic data. However, they suffer from the problem of tuning parameter selection, which often results in either too many false positives or too few true positives. In this article, we applied selection probability to multivariate regularization methods in order to identify pleiotropic variants associated with multiple phenotypes. Selection probability was applied to individual elastic-net, unified elastic-net, and multi-response elastic-net regularization methods. In simulation studies, the selection performance of the three multivariate regularization methods was evaluated under different settings for the total number of phenotypes, the number of phenotypes associated with a variant, and the correlations among phenotypes. We also applied the regularization methods to a wild bean dataset consisting of 169,028 variants and 17 phenotypes.

Multivariate assessment of the occurrence of compound Hazards at the pan-Asian region

  • Davy Jean Abella;Kuk-Hyun Ahn
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.166-166 / 2023
  • Compound hazards (CHs) are two or more extreme climate events that occur simultaneously in the same region. Compared to individual hazards, the combination of hazards that causes CHs can result in greater economic losses and deaths. While several extreme climate events have been recorded across Asia over the past decades, many studies have focused on only a single hazard. In this study, we assess the spatiotemporal pattern of dry compound hazards, which include drought, heatwave, fire, and wind, across Asia for the last 42 years (1980-2021) using historical data from the ERA5 Reanalysis dataset. We use daily spatial data for each climate event to assess the occurrence of such compound hazards on a daily basis. Heatwave, fire, and wind hazard occurrences are analyzed using daily percentile-based thresholds, while a pre-defined threshold for SPI is applied for drought occurrence. Then, the occurrence of each type of compound hazard is obtained by overlapping the maps of daily occurrences of the single hazards. Lastly, a multivariate assessment is conducted to quantify the occurrence frequency, hotspots, and trends of each type of compound hazard across Asia. By conducting a multivariate analysis of the occurrence of these compound hazards, we identify the relationships and interactions in dry compound hazards including droughts, heatwaves, fires, and winds, ultimately leading to better-informed decisions and strategies in natural risk management.
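The thresholding and overlapping steps described in the abstract can be sketched for a single grid cell as follows. The daily values, the 70th-percentile level, and the SPI cutoff of -1.0 are hypothetical choices for illustration and are not the thresholds used in the study.

```python
import math


def percentile_threshold(values, pct):
    """Nearest-rank percentile of a list, with pct an integer in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def exceedance_flags(values, pct):
    """Flag the days whose value exceeds the percentile-based threshold."""
    threshold = percentile_threshold(values, pct)
    return [v > threshold for v in values]


def compound_days(*flag_series):
    """A compound hazard occurs on days when every single hazard occurs."""
    return [all(day) for day in zip(*flag_series)]


# Hypothetical daily series for one grid cell (10 days).
temperature = [30, 31, 35, 40, 29, 41, 33, 42, 28, 32]
wind_speed = [5, 6, 7, 12, 4, 13, 6, 3, 5, 14]
spi = [0.2, -1.2, -0.5, -1.5, 0.1, -0.3, -1.1, -2.0, 0.0, -1.3]

heat = exceedance_flags(temperature, 70)
wind = exceedance_flags(wind_speed, 70)
drought = [s <= -1.0 for s in spi]  # assumed pre-defined SPI cutoff

heat_wind = compound_days(heat, wind)
heat_wind_drought = compound_days(heat, wind, drought)
```

Applying the same logic to every grid cell and counting the flagged days per year would give the occurrence frequencies from which hotspots and trends are derived.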

Applications of Cluster Analysis in Biplots (행렬도에서 군집분석의 활용)

  • Choi, Yong-Seok;Kim, Hyoung-Young
    • Communications for Statistical Applications and Methods / v.15 no.1 / pp.65-76 / 2008
  • Biplots are the multivariate analogue of scatter plots. They approximate the multivariate distribution of a sample in a few dimensions, typically two, and they superimpose on this display representations of the variables on which the samples are measured (Gower and Hand, 1996, Chapter 1). The relationships between the observations and variables can then be easily seen. Thus, biplots are useful for giving a graphical description of the data. However, this method does not give a concise interpretation of the relationships between variables and observations when the number of observations is large. Therefore, in this study, we suggest interpreting the biplot by applying K-means cluster analysis. We show that the relationships between the clusters and variables can be easily interpreted. This method is therefore more useful for giving a graphical description of the data than using the raw data.

Comparison study of modeling covariance matrix for multivariate longitudinal data (다변량 경시적 자료 분석을 위한 공분산 행렬의 모형화 비교 연구)

  • Kwak, Na Young;Lee, Keunbaik
    • The Korean Journal of Applied Statistics / v.33 no.3 / pp.281-296 / 2020
  • Repeated outcomes from the same subjects are referred to as longitudinal data. Unlike cross-sectional data, such data require specialized methods of analysis. It is important to model the covariance matrix because the correlation between the repeated outcomes must be considered when estimating the effects of covariates on the mean response. However, modeling the covariance matrix is tricky because there are many parameters to be estimated, and the estimated covariance matrix should be positive definite. In this paper, we consider the analysis of multivariate longitudinal data via two methodologies for modeling the covariance matrix. Both methods describe serial correlations of multivariate longitudinal outcomes using a modified Cholesky decomposition. However, the two methods use different decompositions to explain the correlation between simultaneous responses. The first method uses enhanced linear covariance models so that the covariance matrix satisfies a positive definiteness condition; in addition, principal component analysis and the maximization-minimization algorithm (MM algorithm) are used to estimate model parameters. The second method uses variance-correlation decomposition and hypersphere decomposition to model the covariance matrix. Simulations are used to compare the performance of the two methodologies.
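The reason a modified Cholesky decomposition helps with positive definiteness can be sketched in the scalar (single-outcome) case: each measurement is regressed on its predecessors, and the covariance matrix is rebuilt from the unconstrained regression coefficients and log innovation variances. This is a simplified illustration under that assumption, not the block formulation for multivariate outcomes used in the paper; the parameter values and the name `covariance_from_mcd` are hypothetical.

```python
import math


def covariance_from_mcd(phi, log_innov_var):
    """Rebuild a covariance matrix from modified Cholesky parameters.

    Model: y_t = sum_{j<t} phi[t][j] * y_j + e_t, Var(e_t) = exp(log_innov_var[t]).
    Writing y = L e with L unit lower triangular gives Sigma = L D L^T,
    which is positive definite for any real phi and log variances.
    """
    n = len(log_innov_var)
    d = [math.exp(v) for v in log_innov_var]
    L = [[0.0] * n for _ in range(n)]
    for t in range(n):
        L[t][t] = 1.0
        for j in range(t):
            for k in range(j + 1):
                L[t][k] += phi[t][j] * L[j][k]
    return [
        [sum(L[a][k] * d[k] * L[b][k] for k in range(n)) for b in range(n)]
        for a in range(n)
    ]


# Hypothetical parameters for two time points: y_2 = 0.5 * y_1 + e_2.
phi = [[], [0.5]]
log_innov_var = [0.0, 0.0]
sigma = covariance_from_mcd(phi, log_innov_var)
determinant = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
```

Since the parameters are unconstrained, they can be modeled with ordinary regression on covariates without any risk of producing an invalid covariance matrix.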

The Contribution of Social Media Value to Company's Financial Performance: Empirical Evidence from Indonesia

  • MIQDAD, Muhammad;OKTAVIANI, Siska Aprilia
    • The Journal of Asian Finance, Economics and Business / v.8 no.1 / pp.305-315 / 2021
  • This article aims to explore the contribution of social media value to a company's financial performance in a digital economy, since the awareness of companies and investors regarding the use of social media opens up new mechanisms for disseminating information. A quantitative method is used in this study, with Multivariate Analysis of Variance as the analysis tool. The data are secondary data gathered from the Indonesia Stock Exchange (IDX), using 308 companies as samples. In the multivariate test, four multivariate significance tests were carried out, namely Pillai's Trace, Wilks' Lambda, Hotelling's Trace, and Roy's Largest Root. It was found that social media value makes a small contribution to differences in the level of profitability and the value of companies in Indonesia, but it does not contribute to differences in the level of liquidity. The contribution was an implication of online Word of Mouth (WOM) motives, which are interrelated with signal theory, and serves as additional information for investors in relation to single-person decision theory. This study provides insight into the importance of social media management, considering that the digital economy will continue to develop, so companies in Indonesia need to take advantage of these opportunities.
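The four test statistics named in the abstract are all functions of the eigenvalues of E⁻¹H, where H is the hypothesis sum-of-squares-and-cross-products matrix and E is the error one. The sketch below assumes the eigenvalues are already available; the values used are hypothetical and unrelated to the study's data. Note that some software reports Roy's statistic as λ_max / (1 + λ_max) instead of λ_max.

```python
import math


def manova_statistics(eigenvalues):
    """Compute the four classical MANOVA statistics from the
    eigenvalues of E^{-1} H (all assumed non-negative)."""
    return {
        "wilks_lambda": math.prod(1.0 / (1.0 + lam) for lam in eigenvalues),
        "pillai_trace": sum(lam / (1.0 + lam) for lam in eigenvalues),
        "hotelling_lawley_trace": sum(eigenvalues),
        "roy_largest_root": max(eigenvalues),
    }


# Hypothetical eigenvalues of E^{-1} H.
stats = manova_statistics([1.0, 0.25])
```

Small values of Wilks' Lambda and large values of the other three statistics indicate group differences; the four can disagree when the effect is concentrated in one eigenvalue.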

Application of metabolic profiling for biomarker discovery

  • Hwang, Geum-Sook
    • Proceedings of the Korean Society of Applied Pharmacology / 2007.11a / pp.19-27 / 2007
  • An important potential of the metabolomics-based approach is the possibility of developing fingerprints of diseases or of cellular responses to classes of compounds with a known common biological effect. Such fingerprints have the potential to allow classification of disease states or compounds, to provide mechanistic information on cellular perturbations and pathways, and to identify biomarkers specific for disease severity and drug efficacy. Metabolic profiles of biological fluids contain a vast array of endogenous metabolites. Changes in those profiles resulting from perturbations of the system can be observed using analytical techniques such as NMR and MS. $^1$H NMR was used to generate a molecular fingerprint of a serum or urine sample, and then a pattern recognition technique was applied to identify molecular signatures associated with specific diseases or drug efficacy. Several metabolites that differentiate disease samples from controls were thoroughly characterized by NMR spectroscopy. We investigated the metabolic changes in normal and clinical human samples using $^1$H NMR. Targeted profiling and spectral binning were applied to the spectral data, and then multivariate statistical data analysis (MVDA) was used to examine in detail the modulation of small-molecule candidate biomarkers. We show that targeted profiling produces robust models, generates accurate metabolite concentration data, and provides data that can help explain metabolic differences between healthy and diseased populations. Such metabolic signatures could provide diagnostic markers for a disease state or biomarkers for drug response phenotypes.
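Spectral binning, one of the two preprocessing routes mentioned in the abstract, can be sketched as summing intensities over fixed-width chemical-shift intervals before multivariate analysis. The bin width, the spectrum values, and the name `bin_spectrum` are hypothetical; real workflows typically use much narrower bins and exclude solvent regions.

```python
import math


def bin_spectrum(ppm, intensity, width):
    """Sum intensities into fixed-width chemical-shift bins.

    Returns a dict mapping bin index k (covering [k*width, (k+1)*width))
    to the total intensity in that bin.
    """
    bins = {}
    for shift, value in zip(ppm, intensity):
        key = math.floor(shift / width)
        bins[key] = bins.get(key, 0.0) + value
    return dict(sorted(bins.items()))


# Hypothetical spectrum: chemical shifts (ppm) and intensities.
ppm = [0.1, 0.3, 0.6, 0.9, 1.2]
intensity = [1.0, 2.0, 3.0, 4.0, 5.0]
binned = bin_spectrum(ppm, intensity, width=0.5)
```

Each sample's binned vector then becomes one row of the data matrix passed to the multivariate analysis, which makes spectra comparable despite small peak-position shifts.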