Methods of Combining P-values for Multiple Endpoints of Various Data Types

  • Kim, Su-Young (Dept. of Biostatistics, The Catholic University of Korea)
  • Song, Hae-Hiang (Dept. of Biostatistics, The Catholic University of Korea)
  • Published: 2008.02.29

Abstract

Comparative studies in Phase III clinical trials quite often measure the treatment effect with two or more equally important endpoints, so that no single primary endpoint can be selected. For endpoints measured as continuous data, O'Brien (1984) proposed the Ordinary Least Squares (OLS) and Generalized Least Squares (GLS) statistics as multivariate, one-tailed test statistics that combine all of the endpoints into a single test of the treatment effect. Pocock et al. (1987) mentioned the possibility of analyzing a mixture of data types, such as continuous, binary, and survival data, with the OLS and GLS statistics, but they did not explore the problems that arise in combining endpoints of different types, nor did they perform a simulation study to assess the efficiency of the OLS and GLS statistics for such a mixture. In this paper, we propose methods of combining P-values for the analysis of multiple endpoints of various data types, where the P-values obtained from the tests of the individual endpoints are correlated with one another, and we compare the efficiency of these methods with that of the OLS and GLS statistics in a simulation study. Among the P-value combining methods, which have advantages over the OLS and GLS statistics, method B controls the type I error rate at the nominal significance level and is highly efficient, whereas methods F and G can have type I error rates larger than the specified significance level, which may lead to wrong conclusions.
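As a rough illustration of the two kinds of statistics discussed above, the following Python sketch computes Fisher's combined P-value (Fisher, 1950) and an OLS-type statistic of the kind associated with O'Brien (1984) and Pocock et al. (1987). It is not the authors' implementation: the abstract does not define methods B, F, and G, so none of them is reproduced here. The function names are hypothetical, and Fisher's method as written assumes independent P-values, which is exactly the assumption the paper relaxes; correlated P-values would need an adjustment such as the one in Brown (1975). The OLS-type form used here (sum of the per-endpoint test statistics divided by the square root of the sum of all entries of their correlation matrix) is also stated as an assumption.

```python
import math


def fisher_combined_p(p_values):
    """Fisher's combined P-value; valid only for independent tests."""
    k = len(p_values)
    if k == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("need at least one P-value in (0, 1]")
    # Fisher's statistic is -2 * sum(log p_i), chi-square with 2k df under H0.
    # half_stat holds half of that statistic.
    half_stat = -sum(math.log(p) for p in p_values)
    # Closed-form chi-square survival function for even df = 2k:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def ols_type_statistic(test_stats, corr):
    """Sum of per-endpoint statistics scaled by sqrt of the sum of corr entries."""
    denom = math.sqrt(sum(sum(row) for row in corr))
    return sum(test_stats) / denom


if __name__ == "__main__":
    # Two hypothetical one-sided P-values from two endpoints.
    print(fisher_combined_p([0.05, 0.05]))
    # Two hypothetical test statistics with an assumed correlation of 0.5.
    print(ols_type_statistic([2.0, 2.0], [[1.0, 0.5], [0.5, 1.0]]))
```

With a single P-value, Fisher's method returns that P-value unchanged, which is a convenient sanity check. The OLS-type statistic shrinks as the assumed correlation between endpoints grows, reflecting that positively correlated endpoints carry less independent evidence.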

References

  1. Brown, M. B. (1975). A method for combining non-independent, one-sided tests of significance, Biometrics, 31, 987-992 https://doi.org/10.2307/2529826
  2. Chow, S. C. and Liu, J. P. (2003). Design and Analysis of Clinical Trials: Concepts and Methodologies, 2nd ed., John Wiley & Sons, New Jersey
  3. Fisher, R. A. (1950). Statistical Methods for Research Workers, 11th ed., Oliver and Boyd, London
  4. Gehan, E. A. (1965). A generalized Wilcoxon test for comparing arbitrarily singly censored samples, Biometrika, 52, 203-223 https://doi.org/10.1093/biomet/52.1-2.203
  5. Gertman, P. M. and Restuccia, J. D. (1981). The appropriateness evaluation protocol: A technique for assessing unnecessary days of hospital care, Medical Care, 19, 855-871
  6. Good, I. J. (1958). Significance tests in parallel and in series, Journal of the American Statistical Association, 53, 799-813 https://doi.org/10.2307/2281953
  7. Gupta, S. S. (1963). Probability integrals of multivariate normal and multivariate t, The Annals of Mathematical Statistics, 34, 792-828 https://doi.org/10.1214/aoms/1177704004
  8. Hollander, M. and Wolfe, D. A. (1999). Nonparametric Statistical Methods, 2nd ed., John Wiley & Sons, New York
  9. Kim, J. H. (2004). Assessing Inappropriate Hospital Days with Random Coefficient Regression Models, Catholic University, Seoul, Korea
  10. Lehmacher, W., Wassmer, G. and Reitmeir, P. (1991). Procedures for two-sample comparisons with multiple endpoints controlling the experimentwise error rate, Biometrics, 47, 511-521 https://doi.org/10.2307/2532142
  11. Littell, R. C. and Folks, J. L. (1971). Asymptotic optimality of Fisher's method of combining independent tests, Journal of the American Statistical Association, 66, 802-806 https://doi.org/10.2307/2284230
  12. Littell, R. C. and Folks, J. L. (1973). Asymptotic optimality of Fisher's method of combining independent tests II, Journal of the American Statistical Association, 68, 193-194 https://doi.org/10.2307/2284167
  13. Mantel, N. (1966). Evaluation of survival data and two new rank order statistics arising in its consideration, Cancer Chemotherapy Reports, 50, 163-170
  14. Meier, P. (1975). Statistics and medical experimentation, Biometrics, 31, 511-529 https://doi.org/10.2307/2529434
  15. O'Brien, P. C. (1984). Procedures for comparing samples with multiple endpoints, Biometrics, 40, 1079-1087 https://doi.org/10.2307/2531158
  16. Pocock, S. J., Geller, N. L. and Tsiatis, A. A. (1987). The analysis of multiple endpoints in clinical trials, Biometrics, 43, 487-498 https://doi.org/10.2307/2531989
  17. Wilkinson, B. (1951). A statistical consideration in psychological research, Psychological Bulletin, 48, 156-158 https://doi.org/10.1037/h0059111
  18. Zaykin, D. V., Zhivotovsky, L. A., Westfall, P. H. and Weir, B. S. (2002). Truncated product method for combining P-values, Genetic Epidemiology, 22, 170-185 https://doi.org/10.1002/gepi.0042