• Title/Summary/Keyword: Multivariate statistical models

Choice of frequency via principal component in high-frequency multivariate volatility models (주성분을 이용한 다변량 고빈도 실현 변동성의 주기 선택)

  • Jin, M.K.;Yoon, J.E.;Hwang, S.Y.
    • The Korean Journal of Applied Statistics / v.30 no.5 / pp.747-757 / 2017
  • We investigate multivariate volatilities based on high-frequency time series. Principal component analysis (PCA) is employed to reduce the dimension of the multivariate volatility. Multivariate realized volatilities (RV) at various frequencies are calculated from high-frequency data, and an "optimum" frequency is suggested using PCA. Specifically, after dimension reduction via PCA, RVs at various frequencies are compared with existing daily volatility measures such as Cholesky, EWMA and BEKK. The approach is illustrated with an analysis of high-frequency stock prices for KOSPI, Samsung Electronics and Hyundai Motor Company.
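
The dependence of realized volatility on the sampling frequency can be illustrated with a minimal sketch. It assumes the standard definition of the realized covariance matrix as the sum of outer products of intraday log-return vectors, and it uses power iteration to estimate the share of total variance on the first principal component. The function names, the toy prices, and the use of that share as a summary are assumptions for illustration; the abstract does not state the paper's actual selection criterion.

```python
import math

def log_returns(prices, step):
    """Log returns of one price series sampled every `step` observations."""
    sampled = prices[::step]
    return [math.log(b / a) for a, b in zip(sampled, sampled[1:])]

def realized_covariance(price_series, step):
    """Realized covariance matrix: sum over intraday intervals of r_t r_t^T."""
    returns = [log_returns(p, step) for p in price_series]
    k, n = len(returns), len(returns[0])
    return [[sum(returns[i][t] * returns[j][t] for t in range(n))
             for j in range(k)] for i in range(k)]

def first_pc_share(cov, iters=200):
    """Share of total variance on the first principal component.

    Uses plain power iteration from a uniform start vector, which is
    adequate for a small illustrative matrix but is not a robust solver.
    """
    k = len(cov)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    top = sum(v[i] * cov[i][j] * v[j] for i in range(k) for j in range(k))
    return top / sum(cov[i][i] for i in range(k))

# Hypothetical intraday prices for two assets (not data from the paper).
asset_a = [100, 101, 100, 102, 101, 103, 102]
asset_b = [50, 50.2, 50.1, 50.6, 50.3, 50.9, 50.7]

for step in (1, 2, 3):
    cov = realized_covariance([asset_a, asset_b], step)
    print(step, cov[0][0], first_pc_share(cov))
```

A coarser `step` uses fewer intraday returns, so the realized measures change with the sampling frequency; comparing them across steps is the kind of exercise the abstract describes.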

Parallelism Test of Slope in Simple Linear Regression Models (회귀모형의 기울기에 대한 품행성 검정)

  • Park, Hyun-Wook;Kim, Dong-Jae
    • Communications for Statistical Applications and Methods / v.16 no.1 / pp.75-83 / 2009
  • Parallelism tests are proposed for the slopes of simple linear regression models. We suggest a parametric test using the HSD method (Tukey, 1953) and a distribution-free test using the Kruskal-Wallis test (1952) for more than three slopes. A Monte Carlo simulation study is conducted to compare the power of the proposed methods with that of Wilks' Lambda multivariate procedure.
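
As a rough illustration of the distribution-free ingredient, the sketch below computes the standard Kruskal-Wallis H statistic from pooled ranks. The abstract does not say how the test is applied to regression slopes, so the function name and the input groups are hypothetical, and ties receive mid-ranks without the usual tie correction.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic from pooled ranks (no tie correction)."""
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    # Sum of (rank sum)^2 / group size over all groups.
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical per-group values, for illustration only.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Under the null hypothesis, H is approximately chi-squared distributed with (number of groups - 1) degrees of freedom when the groups are not too small.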

Multiple imputation for competing risks survival data via pseudo-observations

  • Han, Seungbong;Andrei, Adin-Cristian;Tsui, Kam-Wah
    • Communications for Statistical Applications and Methods / v.25 no.4 / pp.385-396 / 2018
  • Competing risks are commonly encountered in biomedical research. Regression models for competing risks data can be developed from data routinely collected in hospitals or general practices. However, these data sets usually contain missing covariate values. To overcome this problem, multiple imputation is often used to fit regression models under a missing at random (MAR) assumption. Here, we introduce a multivariate imputation by chained equations algorithm to deal with competing risks survival data. Using pseudo-observations, we make use of the available outcome information by accommodating the competing risk structure. Lastly, we illustrate the practical advantages of our approach using simulations and two data examples, one on coronary artery disease and one on hepatocellular carcinoma.

Model Classification of Quality Statistics Using Block Repeated Measures (블록 반복측정을 이용한 품질통계 모형의 유형화)

  • Choi, Sung-Woon
    • Journal of the Korea Safety Management & Science / v.9 no.3 / pp.165-171 / 2007
  • Dependent models in quality statistics are classified as serially autocorrelated models, multivariate models and dependent sample models. Among these, the dependent sample model is the most efficient in terms of the time and cost of obtaining samples. This paper proposes implementing parametric and nonparametric models in a production system depending on the demand pattern. Nonparametric models include distribution-free and asymptotically distribution-free techniques. Quality statistical models are classified along two dimensions: the number of dependent samples and the type of data. The type of data is nominal, ordinal, interval or ratio. The number of dependent samples is either two or three or more.

Bayesian mixed models for longitudinal genetic data: theory, concepts, and simulation studies

  • Chung, Wonil;Cho, Youngkwang
    • Genomics & Informatics / v.20 no.1 / pp.8.1-8.14 / 2022
  • Despite the success of recent genome-wide association studies investigating longitudinal traits, a large fraction of overall heritability remains unexplained. This suggests that some of the missing heritability may be accounted for by gene-gene and gene-time/environment interactions. In this paper, we develop a Bayesian variable selection method for longitudinal genetic data based on mixed models. The method jointly models the main effects and interactions of all candidate genetic variants and non-genetic factors and has higher statistical power than previous approaches. To account for the within-subject dependence structure, we propose a grid-based approach that models only one fixed-dimensional covariance matrix and is therefore applicable to data in which subjects have different numbers of time points. We provide the theoretical basis of our Bayesian method and then illustrate its performance using data from the 1000 Genomes Project under various simulation settings. Several simulation studies show that our multivariate method has greater statistical power than the corresponding univariate method and can detect gene-time/environment interactions well. We further evaluate our method with different numbers of individuals, variants, and causal variants, as well as different trait heritabilities, and conclude that it performs reasonably well across the simulation settings.

Application of machine learning models for estimating house price (단독주택가격 추정을 위한 기계학습 모형의 응용)

  • Lee, Chang Ro;Park, Key Ho
    • Journal of the Korean Geographical Society / v.51 no.2 / pp.219-233 / 2016
  • In the social sciences, statistical models are used almost exclusively for causal explanation, and explanatory modeling has been the mainstream until now. In contrast, predictive modeling has been rare in these fields. Hence, we focus on constructing a predictive non-parametric model instead of an explanatory model. Gangnam-gu, Seoul was chosen as the study area, and we collected sales data for single-family houses sold between 2011 and 2014. We applied non-parametric models proposed in the machine learning literature, including the generalized additive model (GAM), random forest, multivariate adaptive regression splines (MARS) and support vector machines (SVM). Recently developed models such as MARS and SVM were found to be superior in predictive power for house price estimation. Finally, spatial autocorrelation was additionally accounted for in the non-parametric models, and the results showed that their predictive power was enhanced further. We hope that this study will prompt property price estimation methodology to be extended from traditional parametric models to non-parametric ones.

Application of metabolic profiling for biomarker discovery

  • Hwang, Geum-Sook
    • Proceedings of the Korean Society of Applied Pharmacology / 2007.11a / pp.19-27 / 2007
  • An important potential of the metabolomics-based approach is the possibility of developing fingerprints of diseases or of cellular responses to classes of compounds with a known common biological effect. Such fingerprints have the potential to allow classification of disease states or compounds, to provide mechanistic information on cellular perturbations and pathways, and to identify biomarkers specific for disease severity and drug efficacy. Metabolic profiles of biological fluids contain a vast array of endogenous metabolites. Changes in those profiles resulting from perturbations of the system can be observed using analytical techniques such as NMR and MS. $^1$H NMR was used to generate a molecular fingerprint of serum or urine samples, and then a pattern recognition technique was applied to identify molecular signatures associated with specific diseases or drug efficacy. Several metabolites that differentiate disease samples from controls were thoroughly characterized by NMR spectroscopy. We investigated the metabolic changes in normal and clinical human samples using $^1$H NMR. Targeted profiling and spectral binning were applied to the spectral data, and then multivariate statistical data analysis (MVDA) was used to examine in detail the modulation of small-molecule candidate biomarkers. We show that targeted profiling produces robust models, generates accurate metabolite concentration data, and provides data that can help explain metabolic differences between healthy and diseased populations. Such metabolic signatures could provide diagnostic markers for a disease state or biomarkers for drug response phenotypes.

PROCESS ANALYSIS OF AUTOMOTIVE PARTS USING GRAPHICAL MODELLING

  • IRIKURA Norio;KUZUYA Kazuyoshi;NISHINA Ken
    • Proceedings of the Korean Society for Quality Management Conference / 1998.11a / pp.295-300 / 1998
  • Graphical modelling has recently been studied as a useful process analysis tool for exploratory causal analysis. Graphical modelling is a presentation method that uses graphs to describe statistical models of the structure of multivariate data. This paper describes an application of graphical modelling to two cases from the automotive parts industry. The first case is the unbalance problem of the pulley, an automotive generator part. Multivariate data on the product are available from each of the processes, which are connected in series. Through exploratory causal analysis between the variables using graphical modelling, the key processes that cause the variation in the final characteristics, and the mechanism of the causal relationships, have become clear. The second case is also an unbalance problem, this time in an automotive starter part that consists of many components and is manufactured by complex machining and assembly processes. Using a similar technique, the key processes are obtained easily, and the results are reasonable in light of technical knowledge.

Corporate Image Strategy of Corporate Ethics and Customer Satisfaction through Quality Improvement -Discriminant Models based on the Utilization of a Small Number of Observed Values- (품질향상을 통한 고객만족과 기업윤리차원의 기업이미지 전략 -소수의 관측치들의 활용을 위한 모형들 중심으로-)

  • Kim, Jong Soon
    • Journal of Korean Society for Quality Management / v.24 no.4 / pp.168-189 / 1996
  • For a corporation to earn a good image among its customers, it should consider several variables, of which corporate ethics and customer satisfaction through quality improvement are especially important. Standard multivariate data analysis can be applied to determine the importance of customer satisfaction and corporate ethics as influence factors in corporate competitive strategy. When applying this methodology, the assumptions of a multivariate normal density and identical covariance between groups have to be satisfied. If a company uses the evaluation results from a small number of specialists to decide on the strategic factors that will create a better image than its competitors, and chooses statistical discriminant analysis to do so, it is difficult to satisfy these two assumptions. This paper introduces a discriminant analysis method that makes effective use of LP/GP and is applicable to this situation.

A Bayesian Approach to Dependent Paired Comparison Rankings

  • Kim, Hea-Jung;Kim, Dae-Hwang
    • Proceedings of the Korean Statistical Society Conference / 2003.05a / pp.85-90 / 2003
  • In this paper we develop a method for finding the optimal ordering of K statistical models. The method is based on a dependent paired comparison experimental arrangement whose results can naturally be represented by a completely oriented graph (also called a tournament graph). By introducing preference probabilities, strong transitivity conditions, and an optimality criterion for the graph, we show that a Hamiltonian path obtained from row sum ranking is the optimal ordering. The theory and computation required by the method are provided. As an application, the generalized variances of K multivariate normal populations are compared using a Bayesian approach.
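
The row sum ranking idea can be sketched on a small tournament. The example assumes a 0/1 win matrix in which `wins[i][j] == 1` means model i is preferred to model j; the matrix, the function names, and the transitive toy data are hypothetical. The paper's preference probabilities and optimality criterion are not reproduced here, and ordering by row sums is only guaranteed to give a Hamiltonian path when the tournament is transitive.

```python
def row_sum_ranking(wins):
    """Order the models by descending row sum (number of pairwise wins)."""
    scores = [sum(row) for row in wins]
    return sorted(range(len(wins)), key=lambda i: -scores[i])

def is_hamiltonian_path(wins, order):
    """Check that each model in `order` beats the one directly after it."""
    return all(wins[a][b] == 1 for a, b in zip(order, order[1:]))

# Hypothetical transitive tournament on four models: 2 > 0 > 3 > 1.
wins = [
    [0, 1, 0, 1],  # model 0 beats models 1 and 3
    [0, 0, 0, 0],  # model 1 beats no one
    [1, 1, 0, 1],  # model 2 beats models 0, 1 and 3
    [0, 1, 0, 0],  # model 3 beats model 1
]

order = row_sum_ranking(wins)
print(order, is_hamiltonian_path(wins, order))
```

In a transitive tournament the row sums are all distinct, so the ordering is unique and visits every model exactly once along winning edges.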