http://dx.doi.org/10.3346/jkms.2018.33.e343

Building Linked Big Data for Stroke in Korea: Linkage of Stroke Registry and National Health Insurance Claims Data  

Kim, Tae Jung (Department of Neurology, Seoul National University Hospital)
Lee, Ji Sung (Clinical Research Center, Asan Medical Center, University of Ulsan)
Kim, Ji-Woo (Department of Bigdata, Health Insurance Review and Assessment Service)
Oh, Mi Sun (Department of Neurology, Hallym University Sacred Heart Hospital)
Mo, Heejung (Department of Neurology, Seoul National University Hospital)
Lee, Chan-Hyuk (Department of Neurology, Seoul National University Hospital)
Jeong, Han-Young (Department of Neurology, Seoul National University Hospital)
Jung, Keun-Hwa (Department of Neurology, Seoul National University Hospital)
Lim, Jae-Sung (Department of Neurology, Hallym University Sacred Heart Hospital)
Ko, Sang-Bae (Department of Neurology, Seoul National University Hospital)
Yu, Kyung-Ho (Department of Neurology, Hallym University Sacred Heart Hospital)
Lee, Byung-Chul (Department of Neurology, Hallym University Sacred Heart Hospital)
Yoon, Byung-Woo (Department of Neurology, Seoul National University Hospital)
Publication Information
Journal of Korean Medical Science / v.33, no.53, 2018, pp. 343.1-343.8
Abstract
Background: Linkage of public healthcare data is useful in stroke research because patients may visit different sectors of the health system before, during, and after stroke. Therefore, we aimed to establish high-quality big data on stroke in Korea by linking acute stroke registry and national health claim databases.
Methods: Acute stroke patients (n = 65,311) with claim data suitable for linkage were included in the Clinical Research Center for Stroke (CRCS) registry during 2006-2014. We linked the CRCS registry with national health claim databases in the Health Insurance Review and Assessment Service (HIRA). Linkage was performed using 6 common variables: birth date, gender, provider identification, receiving year and number, and statement serial number in the benefit claim statement. For matched records, linkage accuracy was evaluated using differences between hospital visiting date in the CRCS registry and the commencement date for health insurance care in HIRA.
Results: Of 65,311 CRCS cases, 64,634 were matched to HIRA cases (match rate, 99.0%). The proportion of true matches was 94.4% (n = 61,017) in the matched data. Among true matches (mean age 66.4 years; men 58.4%), the median National Institutes of Health Stroke Scale score was 3 (interquartile range 1-7). When comparing baseline characteristics between true matches and false matches, no substantial difference was observed for any variable.
Conclusion: We could establish big data on stroke by linking CRCS registry and HIRA records, using claims data without personal identifiers. We plan to conduct national stroke research and improve stroke care using the linked big database.
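The linkage and accuracy check described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names (`birth_date`, `provider_id`, `visit_date`, `care_start_date`, and so on), the choice of the first candidate when several claims share a key, and the `tolerance_days` threshold are all assumptions made for illustration, since the abstract does not state how large a date difference still counted as a true match. The last two lines only recompute the percentages from the counts reported in the Results.

```python
from datetime import date

# The six linkage variables named in the abstract; these field names are hypothetical.
KEYS = (
    "birth_date",
    "gender",
    "provider_id",
    "receiving_year",
    "receiving_number",
    "statement_serial",
)


def link_records(registry, claims, tolerance_days):
    """Deterministically link registry rows to claim rows on all six KEYS.

    A linked pair is flagged as a true match when the registry visit date and
    the claim's care commencement date differ by at most `tolerance_days`
    (an assumed threshold, not taken from the paper).
    """
    index = {}
    for claim in claims:
        index.setdefault(tuple(claim[k] for k in KEYS), []).append(claim)

    linked = []
    for row in registry:
        candidates = index.get(tuple(row[k] for k in KEYS), [])
        if not candidates:
            continue  # no claim shares all six keys: unmatched registry record
        claim = candidates[0]  # assumption: take the first candidate
        gap = abs((row["visit_date"] - claim["care_start_date"]).days)
        linked.append(
            {"registry": row, "claim": claim, "date_gap_days": gap,
             "true_match": gap <= tolerance_days}
        )
    return linked


def rate(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


# Synthetic records, invented purely to exercise the functions.
shared = {"birth_date": date(1950, 3, 1), "gender": "M", "provider_id": "H001",
          "receiving_year": 2010, "receiving_number": 17, "statement_serial": 5}
registry = [
    {**shared, "visit_date": date(2010, 6, 2)},
    {**shared, "statement_serial": 6, "visit_date": date(2010, 7, 1)},
    {**shared, "statement_serial": 9, "visit_date": date(2010, 8, 1)},
]
claims = [
    {**shared, "care_start_date": date(2010, 6, 2)},
    {**shared, "statement_serial": 6, "care_start_date": date(2010, 9, 15)},
]
pairs = link_records(registry, claims, tolerance_days=1)

# Percentages recomputed from the counts reported in the Results.
match_rate = rate(64634, 65311)       # 99.0
true_match_rate = rate(61017, 64634)  # 94.4
```

In the synthetic example, two of the three registry records find a claim with identical keys, and only the first pair has dates close enough to be flagged as a true match under the assumed one-day tolerance.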
Keywords
Big Data; Data Linkage; Stroke Registry; National Health Claim Data