Rank regression inferences on doubly interval-censored data

A study on rank-based regression estimation for doubly interval-censored data

  • Received : 2024.08.23
  • Accepted : 2024.10.09
  • Published : 2024.12.31

Abstract

In many biomedical fields, especially in studies of disease progression, we frequently encounter two sequential events, both of which are interval-censored because subjects are examined only at regular visits. Such a structure is called doubly interval-censoring (DIC), and the primary interest is the elapsed time between the two consecutive events. In this paper, we propose a weighted rank regression approach for DIC data under the semiparametric accelerated failure time model. We first transform the DIC data into singly interval-censored data, that is, intervals within which the true elapsed times lie, and then develop estimation procedures with a Gehan-type weight based on all comparable pairs of observed residuals from the transformed data. We further generalize this approach to data-dependent weights and extend it to clustered DIC data in which the cluster size is potentially informative, using an inverse weighting strategy. As an alternative to resampling, we consider an efficient technique for variance estimation. We establish asymptotic properties and conduct numerical studies to assess finite-sample performance. Finally, we illustrate the method with a real clustered DIC dataset.

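In many medical studies, particularly studies of disease progression, two related sequential events are commonly observed, and both event times tend to be interval-censored because of regular examinations. Such data are said to be doubly interval-censored (DIC), and the primary interest of this study is the elapsed time between the two consecutive events. In this paper, we propose a rank-based regression method for analyzing DIC data under the semiparametric accelerated failure time model. The DIC data are transformed into singly interval-censored data containing the elapsed time, and the regression coefficients are estimated from estimating equations with Gehan and log-rank weights that compare the comparable pairs of observed residuals. The method is extended from univariate DIC data to clustered DIC data, including the case in which the cluster size is informative, by adding inverse cluster-size weights. An efficient method is used to estimate the variance of the estimator, and various simulation studies are conducted to evaluate finite-sample performance. Finally, to examine its applicability, the proposed method is applied to real data with a clustered DIC structure and the results are presented.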
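The two steps named in the abstract, reducing DIC data to a single interval for the elapsed time and then comparing pairs of residuals, can be illustrated with a minimal Python sketch. This is not the authors' estimator. It assumes the first event lies in [l1, r1] and the second in [l2, r2], so the elapsed time lies in [max(l2 - r1, 0), r2 - l1]. It also assumes that a pair (i, j) is "comparable" when the upper residual of i falls below the lower residual of j. The function names `elapsed_interval` and `gehan_type_loss`, and the unweighted convex loss itself, are hypothetical choices made for illustration only.

```python
import numpy as np


def elapsed_interval(l1, r1, l2, r2):
    """Bounds on the elapsed time T = (second event) - (first event).

    Assumes the first event lies in [l1, r1] and the second in [l2, r2];
    r2 may be np.inf when the second event is right-censored.
    """
    l1, r1, l2, r2 = (np.asarray(a, dtype=float) for a in (l1, r1, l2, r2))
    lower = np.maximum(l2 - r1, 0.0)  # shortest possible gap, never negative
    upper = r2 - l1                   # longest possible gap
    return lower, upper


def gehan_type_loss(beta, x, lower, upper):
    """Illustrative Gehan-type loss on log-scale residual bounds.

    Residual bounds are e_lo = log(lower) - x @ beta and
    e_hi = log(upper) - x @ beta. The pair (i, j) contributes
    max(e_lo[j] - e_hi[i], 0), which is positive only when subject i's
    residual is certainly smaller than subject j's (an assumed notion of
    "comparable pair"). Assumes upper > 0 for every subject.
    """
    x = np.asarray(x, dtype=float)
    lin = x @ np.asarray(beta, dtype=float)
    with np.errstate(divide="ignore"):  # log(0) = -inf is intended here
        e_lo = np.log(np.asarray(lower, dtype=float)) - lin
        e_hi = np.log(np.asarray(upper, dtype=float)) - lin
    diff = e_lo[None, :] - e_hi[:, None]  # diff[i, j] = e_lo[j] - e_hi[i]
    return float(np.sum(np.maximum(diff, 0.0)))


if __name__ == "__main__":
    # Made-up toy data: one covariate, four subjects.
    lo, hi = elapsed_interval([0, 1, 0, 2], [2, 3, 4, 3],
                              [5, 9, 3, 20], [8, 12, 6, np.inf])
    x = np.array([[0.0], [1.0], [0.0], [1.0]])
    grid = np.linspace(-2.0, 2.0, 41)
    losses = [gehan_type_loss([b], x, lo, hi) for b in grid]
    print("grid minimizer of the toy loss:", grid[int(np.argmin(losses))])
```

The loss is piecewise linear and convex in `beta`, so a grid search or a linear-programming solver would suffice for a toy problem like this one. The data-dependent weights, cluster-size weighting, and variance estimation described in the abstract are not covered by this sketch.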

Acknowledgement

This research was supported by the National Research Foundation of Korea (NRF), funded by the Korean government (Ministry of Science and ICT) (NRF-2022M3J6A1063595, 2022R1A2C1008514).
