• Title/Summary/Keyword: longitudinal binary data

Bayesian Pattern Mixture Model for Longitudinal Binary Data with Nonignorable Missingness

  • Kyoung, Yujung;Lee, Keunbaik
    • Communications for Statistical Applications and Methods / v.22 no.6 / pp.589-598 / 2015
  • In longitudinal studies, missing data are common and complicate the analysis. Two popular modeling frameworks for analyzing missing data are pattern mixture models (PMMs) and selection models (SMs). We focus on the PMM and propose Bayesian pattern mixture models using generalized linear mixed models (GLMMs) for longitudinal binary data. Sensitivity analysis is used under the missing not at random assumption.

Confounding of Time Trend with Dropout Process in Longitudinal Data Analysis

  • Kim, Ji-Hyun;Choi, Hye-Hyun
    • Communications for Statistical Applications and Methods / v.9 no.3 / pp.703-713 / 2002
  • In longitudinal studies, outcomes are repeatedly measured over time for each subject. It is common to have missing values or dropouts in longitudinal data. This study is concerned with the time trend in longitudinal data with dropouts. The confounding of the time trend with the dropout process is investigated through simulation studies. Simulation results are reported for binary as well as continuous responses under varying dropout patterns. It has been found that the time trend is not confounded with a random dropout process for binary responses when it is estimated using GEE.

A Study on Decision Tree for Multiple Binary Responses

  • Lee, Seong-Keon
    • Communications for Statistical Applications and Methods / v.10 no.3 / pp.971-980 / 2003
  • The tree method can be extended to multivariate responses, such as repeated measures and longitudinal data, by modifying the split function to accommodate multiple responses. Decision trees for multiple responses have been constructed by Segal (1992) and Zhang (1998). Segal suggested a tree that can analyze continuous longitudinal responses using the Mahalanobis distance as a within-node homogeneity measure, and Zhang suggested a tree that can analyze multiple binary responses using a generalized entropy criterion, which is proportional to the maximum likelihood of the joint distribution of the multiple binary responses. In this paper, we modify the CART procedure and suggest a new tree-based method that can analyze multiple binary responses using similarity measures.

Semiparametric kernel logistic regression with longitudinal data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.385-392 / 2012
  • Logistic regression is a well-known binary classification method in the field of statistical learning. Mixed-effect regression models are widely used for the analysis of correlated data such as those found in longitudinal studies. We consider kernel extensions of logistic regression with semiparametric fixed effects and parametric random effects. Estimation is performed through the penalized likelihood method based on the kernel trick, and our focus is on efficient computation and effective hyperparameter selection. Cross-validation techniques are employed to select the optimal hyperparameters. Numerical results are then presented to indicate the performance of the proposed procedure.

Building credit scoring models with various types of target variables (목표변수의 형태에 따른 신용평점 모형 구축)

  • Woo, Hyun Seok;Lee, Seok Hyung;Cho, HyungJun
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.85-94 / 2013
  • As the financial market grows, losses increase due to failures of credit risk management caused by poor management of customer information or poor decision-making. Credit risk management therefore becomes more important, and it is essential to develop a credit scoring model, which is a fundamental tool used to minimize credit risk. Credit scoring models have been studied and developed only for binary target variables. In this paper, we consider other types of target variables, such as ordinal multinomial data or longitudinal binary data, and suggest credit scoring models for them. We then apply the developed models to real data and random data, and investigate their performance using the Kolmogorov-Smirnov statistic.
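
The Kolmogorov-Smirnov statistic mentioned in this abstract is commonly computed in credit scoring as the largest gap between the empirical score distributions of good and bad accounts. The following is a minimal sketch of that standard two-sample form, not the authors' implementation; the function name `ks_statistic` and the toy scores are hypothetical.

```python
def ks_statistic(scores_good, scores_bad):
    """Two-sample KS statistic: max |F_good(t) - F_bad(t)| over all observed scores."""
    # The empirical CDFs only change at observed scores, so checking those is enough.
    thresholds = sorted(set(scores_good) | set(scores_bad))
    ks = 0.0
    for t in thresholds:
        cdf_good = sum(s <= t for s in scores_good) / len(scores_good)
        cdf_bad = sum(s <= t for s in scores_bad) / len(scores_bad)
        ks = max(ks, abs(cdf_good - cdf_bad))
    return ks


if __name__ == "__main__":
    # Hypothetical scores: a higher score is assumed to mean lower credit risk.
    good = [0.62, 0.71, 0.80, 0.85, 0.93]
    bad = [0.20, 0.35, 0.41, 0.66, 0.74]
    print(ks_statistic(good, bad))
```

A value near 1 indicates that the scores separate the two groups almost completely, while a value near 0 indicates that the score distributions are nearly identical.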

Bayesian inference of longitudinal Markov binary regression models with t-link function (t-링크를 갖는 마코프 이항 회귀 모형을 이용한 인도네시아 어린이 종단 자료에 대한 베이지안 분석)

  • Sim, Bohyun;Chung, Younshik
    • The Korean Journal of Applied Statistics / v.33 no.1 / pp.47-59 / 2020
  • In this paper, we present the longitudinal Markov binary regression model with a t-link function when its transition order is known or unknown. Logit or probit models are commonly considered in binary regression. Here, the t-link function can be used instead of the probit link for more flexibility, since the t distribution approaches the normal distribution as the degrees of freedom go to infinity. A Markov regression model is considered because each individual contributes longitudinal data. We propose a Bayesian method to determine the transition order of the Markov regression model. In particular, we use the deviance information criterion (DIC) (Spiegelhalter et al., 2002) of the possible models to determine the transition order of the Markov binary regression model if the transition order is known; if it is unknown, we compute and compare their posterior probabilities. To overcome the complicated Bayesian computation, the proposed model is reconstructed using the ideas of Albert and Chib (1993), Kuo and Mallick (1998), and Erkanli et al. (2001). The proposed method is applied to simulated data and to the real data examined by Sommer et al. (1984). Markov chain Monte Carlo methods are used to determine the optimal model, assuming that the transition order of the Markov regression model is known or unknown. Gelman and Rubin's method (1992) is also employed to check the convergence of the Metropolis-Hastings algorithm.
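
The claim that the t-link generalizes the probit link can be checked numerically. The sketch below is an illustration only, not the authors' code: it evaluates the Student t CDF by Simpson's rule (the helper names `t_pdf`, `t_cdf`, and `normal_cdf` are hypothetical) and compares it with the standard normal CDF for increasing degrees of freedom.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, n=2000):
    """t CDF via Simpson's rule on [0, |x|] plus symmetry about zero (n must be even)."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / n
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def normal_cdf(x):
    """Standard normal CDF, i.e. the inverse probit link."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


if __name__ == "__main__":
    eta = 1.0  # a hypothetical linear predictor value
    for df in (1, 5, 30, 1000):
        print(df, t_cdf(eta, df), normal_cdf(eta))
```

With few degrees of freedom the t-link has heavier tails than the probit link, so extreme linear predictors map to less extreme success probabilities; as the degrees of freedom grow, the two links become numerically indistinguishable.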

Property of regression estimators in GEE models for ordinal responses

  • Lee, Hyun-Yung
    • Journal of the Korean Data and Information Science Society / v.23 no.1 / pp.209-218 / 2012
  • The method of generalized estimating equations (GEEs) provides consistent estimates of the regression parameters in a marginal regression model for longitudinal data, even when the working correlation model is misspecified (Liang and Zeger, 1986). In this paper, we compare the parameter estimators in the GEE approach. We consider two aspects: coverage probabilities and efficiency. We adapt the results derived for binary outcomes to ordinal responses.

Semiparametric Approach to Logistic Model with Random Intercept (준모수적 방법을 이용한 랜덤 절편 로지스틱 모형 분석)

  • Kim, Mijeong
    • The Korean Journal of Applied Statistics / v.28 no.6 / pp.1121-1131 / 2015
  • Logistic models with a random intercept are useful for analyzing longitudinal binary data. Traditionally, the random intercept of the logistic model is assumed to follow a parametric distribution (such as the normal distribution) and to be independent of the covariates. Such assumptions are very strong and restrictive for application to real data. Recently, Garcia and Ma (2015) derived semiparametric efficient estimators for the logistic model with a random intercept without these assumptions. Their estimator is consistent even when no parametric form is assumed for the random intercept. In addition, the method is computationally simple. In this paper, we apply this method to analyze toenail infection data. We compare the semiparametric estimator with the maximum likelihood estimator, the penalized quasi-likelihood estimator, and the hierarchical generalized linear estimator.

Factors Related to Cognitive Function Decline by Socio-demographic and Health-related Characteristics : Based on Korean Longitudinal Study of Ageing(KLoSA) Panel Data (인구사회학적 요인 및 건강관련 특성에 따른 인지기능저하 관련 요인 연구 -고령화연구패널 조사 자료를 이용하여-)

  • Kim, Kyeong-Na;Lee, Hyo-Young;Kim, Soo-Jeong
    • The Korean Journal of Health Service Management / v.14 no.1 / pp.137-146 / 2020
  • Objectives: The aim of this study was to investigate cognitive function decline by socio-demographic and health-related characteristics (health behaviors and health status) using the 5th Korean Longitudinal Study of Aging panel data. Methods: The subjects were 4,440 community-dwelling people aged over 57 years. The data were analyzed with descriptive statistics, frequency analysis, the χ²-test, and binary logistic regression analysis using SPSS ver. 25.0. Results: The findings revealed that socio-demographic characteristics (gender, age, area of residence, educational level, marital status, number of children, number of grandchildren) and health-related characteristics (smoking, drinking, regular exercise, weight category by body mass index, hypertension, and diabetes mellitus) were factors that influenced cognitive function decline (p<.05). Conclusions: Cognitive function decline was closely related to health behaviors and disease types. Future studies must examine related constructs to accurately determine these relationships among various populations. The present study could be used as a tool for the development and implementation of health promotion and prevention strategies.

Hurdle Model for Longitudinal Zero-Inflated Count Data Analysis (영과잉 경시적 가산자료 분석을 위한 허들모형)

  • Jin, Iktae;Lee, Keunbaik
    • The Korean Journal of Applied Statistics / v.27 no.6 / pp.923-932 / 2014
  • The hurdle model can be used to analyze zero-inflated count data. This model is a mixture of a logit model for a binary component and a truncated Poisson model for a truncated count component. We propose a new hurdle model with a general heterogeneous random effects covariance matrix to analyze longitudinal zero-inflated count data using the modified Cholesky decomposition. This decomposition factors the random effects covariance matrix into generalized autoregressive parameters and innovation variances. The parameters are modeled using (generalized) linear models and estimated with a Bayesian method. We use these methods to analyze a real dataset.
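
The two-part structure described in this abstract (a logit model for zero versus positive, and a zero-truncated Poisson for the positive counts) can be written down directly as a probability mass function. The sketch below shows only this basic cross-sectional hurdle form, without the random effects or the covariance modeling that the paper proposes; the names `hurdle_pmf`, `eta`, and `lam` are hypothetical.

```python
import math


def hurdle_pmf(k, eta, lam):
    """P(Y = k) under a logit-Poisson hurdle model.

    eta: linear predictor of the logit part, so P(Y > 0) = 1 / (1 + exp(-eta)).
    lam: rate of the Poisson distribution that is truncated at zero.
    """
    if k < 0:
        return 0.0
    p_positive = 1.0 / (1.0 + math.exp(-eta))
    if k == 0:
        return 1.0 - p_positive
    # Poisson log-probability of k, renormalized by P(Poisson > 0) = 1 - exp(-lam).
    log_pois = -lam + k * math.log(lam) - math.lgamma(k + 1)
    return p_positive * math.exp(log_pois) / (-math.expm1(-lam))


if __name__ == "__main__":
    # Hypothetical parameter values for illustration.
    probs = [hurdle_pmf(k, eta=-0.5, lam=2.0) for k in range(6)]
    print(probs)
```

Because the zero probability is controlled entirely by the logit part, the model can accommodate more (or fewer) zeros than a plain Poisson distribution with the same rate would produce.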