Interpretive Performance and Inter-Observer Agreement on Digital Mammography Test Sets

  • Kim, Sung Hun (Department of Radiology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea) ;
  • Lee, Eun Hye (Department of Radiology, Soonchunhyang University Hospital Bucheon, Soonchunhyang University College of Medicine) ;
  • Jun, Jae Kwan (National Cancer Control Institute, National Cancer Center) ;
  • Kim, You Me (Department of Radiology, Dankook University Hospital, Dankook University College of Medicine) ;
  • Chang, Yun-Woo (Department of Radiology, Soonchunhyang University Hospital, Soonchunhyang University College of Medicine) ;
  • Lee, Jin Hwa (Department of Radiology, Dong-A University Hospital) ;
  • Kim, Hye-Won (Department of Radiology, Wonkwang University Hospital, Wonkwang University School of Medicine) ;
  • Choi, Eun Jung (Department of Radiology, Chonbuk National University Hospital)
  • Received : 2018.03.28
  • Accepted : 2018.10.09
  • Published : 2019.02.01

Abstract

Objective: To evaluate the interpretive performance of and inter-observer agreement among radiologists interpreting digital mammograms, and to investigate whether radiologist characteristics affect performance and agreement.
Materials and Methods: The test sets consisted of full-field digital mammograms and contained 12 cancer cases among 1000 total cases. Twelve radiologists independently interpreted all mammograms. Performance indicators included the recall rate, cancer detection rate (CDR), positive predictive value (PPV), sensitivity, specificity, false-positive rate (FPR), and area under the receiver operating characteristic curve (AUC). Inter-radiologist agreement was measured. The radiologist characteristics assessed were the number of years of experience interpreting mammography, fellowship training in breast imaging, and annual volume of mammography interpretation.
Results: The means and ranges of the interpretive performance indicators were as follows: recall rate, 7.5% (3.3-10.2%); CDR, 10.6 per 1000 examinations (8.0-12.0); PPV, 15.9% (8.8-33.3%); sensitivity, 88.2% (66.7-100%); specificity, 93.5% (90.6-97.8%); FPR, 6.5% (2.2-9.4%); and AUC, 0.93 (0.82-0.99). Radiologists who annually interpreted more than 3000 screening mammograms tended to exhibit higher CDRs and sensitivities than those who interpreted fewer (p = 0.064). Inter-radiologist agreement showed a percent agreement of 77.2-88.8% and a kappa value of 0.27-0.34. Radiologist characteristics did not affect agreement.
Conclusion: The interpretive performance of the radiologists fulfilled the mammography screening goals of the American College of Radiology, although there was inter-observer variability. Radiologists who interpreted more than 3000 screening mammograms annually tended to perform better than those who did not.
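
The indicators above all derive from a reader's 2 x 2 table of recall decisions against cancer status, and the agreement figures from pairwise comparison of readers' decisions. The following minimal sketch (Python; not the study's analysis code, and all counts and reader decisions in it are hypothetical) shows how such indicators and an unweighted Cohen's kappa for two readers' binary recall decisions can be computed:

    # Minimal sketch (not the study's analysis code): screening performance
    # indicators and Cohen's kappa. All counts below are hypothetical.

    def screening_metrics(tp, fp, tn, fn):
        """Indicators from one reader's 2x2 confusion matrix on a test set."""
        total = tp + fp + tn + fn
        return {
            "recall_rate": (tp + fp) / total,   # fraction of cases recalled
            "cdr_per_1000": 1000 * tp / total,  # cancer detection rate
            "ppv": tp / (tp + fp),              # cancers among recalled cases
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "fpr": fp / (fp + tn),
        }

    def cohen_kappa(reader_a, reader_b):
        """Unweighted Cohen's kappa for two readers' binary recall decisions."""
        n = len(reader_a)
        p_o = sum(a == b for a, b in zip(reader_a, reader_b)) / n  # observed agreement
        pa, pb = sum(reader_a) / n, sum(reader_b) / n              # marginal recall rates
        p_e = pa * pb + (1 - pa) * (1 - pb)                        # chance agreement
        return (p_o - p_e) / (1 - p_e)

    # A hypothetical reader recalling 75 of 1000 cases, detecting 11 of 12 cancers:
    print(screening_metrics(tp=11, fp=64, tn=924, fn=1))

For the hypothetical reader shown (75 of 1000 cases recalled, 11 of 12 cancers detected), this gives a recall rate of 7.5%, a CDR of 11 per 1000, a sensitivity of 91.7%, a specificity of 93.5%, and an FPR of 6.5%, values close to the study's reported means. On the Landis and Koch scale (reference 16 below), the study's kappa of 0.27-0.34 corresponds to only fair agreement.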

Keywords

Acknowledgments

We would like to thank Seung Hoon Song and Hye-Mi Jo for data collection and management.

References

  1. Youlden DR, Cramb SM, Yip CH, Baade PD. Incidence and mortality of female breast cancer in the Asia-Pacific region. Cancer Biol Med 2014;11:101-115
  2. Leong SP, Shen ZZ, Liu TJ, Agarwal G, Tajima T, Paik NS, et al. Is breast cancer the same disease in Asian and Western countries? World J Surg 2010;34:2308-2324 https://doi.org/10.1007/s00268-010-0683-1
  3. Ohuchi N, Suzuki A, Sobue T, Kawai M, Yamamoto S, Zheng YF, et al.; J-START investigator groups. Sensitivity and specificity of mammography and adjunctive ultrasonography to screen for breast cancer in the Japan Strategic Anti-cancer Randomized Trial (J-START): a randomised controlled trial. Lancet 2016;387:341-348 https://doi.org/10.1016/S0140-6736(15)00774-6
  4. American College of Radiology. ACR BI-RADS Atlas®, 5th ed. Reston, VA: American College of Radiology, 2013
  5. Lee EH, Kim KW, Kim YJ, Shin DR, Park YM, Lim HS, et al. Performance of screening mammography: a report of the alliance for breast cancer screening in Korea. Korean J Radiol 2016;17:489-496 https://doi.org/10.3348/kjr.2016.17.4.489
  6. Baker JA, Kornguth PJ, Floyd CE Jr. Breast Imaging Reporting and Data System standardized mammography lexicon: observer variability in lesion description. AJR Am J Roentgenol 1996;166:773-778 https://doi.org/10.2214/ajr.166.4.8610547
  7. Lazarus E, Mainiero MB, Schepps B, Koelliker SL, Livingston LS. BI-RADS lexicon for US and mammography: interobserver variability and positive predictive value. Radiology 2006;239:385-391 https://doi.org/10.1148/radiol.2392042127
  8. Berg WA, Campassi C, Langenberg P, Sexton MJ. Breast Imaging Reporting and Data System: inter- and intraobserver variability in feature analysis and final assessment. AJR Am J Roentgenol 2000;174:1769-1777 https://doi.org/10.2214/ajr.174.6.1741769
  9. Timmers JM, van Doorne-Nagtegaal HJ, Zonderland HM, van Tinteren H, Visser O, Verbeek AL, et al. The Breast Imaging Reporting and Data System (BI-RADS) in the Dutch breast cancer screening programme: its role as an assessment and stratification tool. Eur Radiol 2012;22:1717-1723 https://doi.org/10.1007/s00330-012-2409-2
  10. Duijm LE, Louwman MW, Groenewoud JH, van de Poll-Franse LV, Fracheboud J, Coebergh JW. Inter-observer variability in mammography screening and effect of type and number of readers on screening outcome. Br J Cancer 2009;100:901-907 https://doi.org/10.1038/sj.bjc.6604954
  11. Elmore JG, Jackson SL, Abraham L, Miglioretti DL, Carney PA, Geller BM, et al. Variability in interpretive performance at screening mammography and radiologists' characteristics associated with accuracy. Radiology 2009;253:641-651 https://doi.org/10.1148/radiol.2533082308
  12. Barlow WE, Chi C, Carney PA, Taplin SH, D'Orsi C, Cutter G, et al. Accuracy of screening mammography interpretation by characteristics of radiologists. J Natl Cancer Inst 2004;96:1840-1850 https://doi.org/10.1093/jnci/djh333
  13. Kim YJ, Lee EH, Jun JK, Shin DR, Park YM, Kim HW, et al. Analysis of participant factors that affect the diagnostic performance of screening mammography: a report of the Alliance for Breast Cancer Screening in Korea. Korean J Radiol 2017;18:624-631 https://doi.org/10.3348/kjr.2017.18.4.624
  14. Pisano ED, Gatsonis C, Hendrick E, Yaffe M, Baum JK, Acharyya S, et al.; Digital Mammographic Imaging Screening Trial (DMIST) Investigators Group. Diagnostic performance of digital versus film mammography for breast-cancer screening. N Engl J Med 2005;353:1773-1783 https://doi.org/10.1056/NEJMoa052911
  15. Hayes AF, Krippendorff K. Answering the call for a standard reliability measure for coding data. Communication Methods and Measures 2007;1:77-89 https://doi.org/10.1080/19312450709336664
  16. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159-174 https://doi.org/10.2307/2529310
  17. Lehman CD, Arao RF, Sprague BL, Lee JM, Buist DS, Kerlikowske K, et al. National performance benchmarks for modern screening digital mammography: update from the Breast Cancer Surveillance Consortium. Radiology 2017;283:49-58 https://doi.org/10.1148/radiol.2016161174
  18. Rickard M, Taylor R, Page A, Estoesta J. Cancer detection and mammogram volume of radiologists in a population-based screening programme. Breast 2006;15:39-43 https://doi.org/10.1016/j.breast.2005.04.005
  19. Albert US, Altland H, Duda V, Engel J, Geraedts M, Heywang-Köbrunner S, et al. 2008 update of the guideline: early detection of breast cancer in Germany. J Cancer Res Clin Oncol 2009;135:339-354 https://doi.org/10.1007/s00432-008-0450-y
  20. National Cancer Center. Ministry of Health & Welfare. Quality guidelines of breast cancer screening, 2nd ed. Goyang: National Cancer Center, 2018:43
  21. Haneuse S, Buist DS, Miglioretti DL, Anderson ML, Carney PA, Onega T, et al. Mammographic interpretive volume and diagnostic mammogram interpretation performance in community practice. Radiology 2012;262:69-79 https://doi.org/10.1148/radiol.11111026
  22. Berg WA, D'Orsi CJ, Jackson VP, Bassett LW, Beam CA, Lewis RS, et al. Does training in the Breast Imaging Reporting and Data System (BI-RADS) improve biopsy recommendations or feature analysis agreement with experienced breast imagers at mammography? Radiology 2002;224:871-880 https://doi.org/10.1148/radiol.2243011626
  23. Nelson HD, Pappas M, Cantor A, Griffin J, Daeges M, Humphrey L. Harms of breast cancer screening: systematic review to update the 2009 U.S. Preventive Services Task Force recommendation. Ann Intern Med 2016;164:256-267 https://doi.org/10.7326/M15-0970
  24. Lee EH, Jun JK, Jung SE, Kim YM, Choi N. The efficacy of mammography boot camp to improve the performance of radiologists. Korean J Radiol 2014;15:578-585 https://doi.org/10.3348/kjr.2014.15.5.578
  25. Elmore JG, Miglioretti DL, Reisch LM, Barton MB, Kreuter W, Christiansen CL, et al. Screening mammograms by community radiologists: variability in false-positive rates. J Natl Cancer Inst 2002;94:1373-1380 https://doi.org/10.1093/jnci/94.18.1373

Cited by

  1. More than interobserver agreement is required for comparisons of categorization systems vol.38, pp.4, 2019, https://doi.org/10.14366/usg.19021
  2. Characteristics of Recent Articles Published in the Korean Journal of Radiology Based on the Citation Frequency vol.21, pp.12, 2020, https://doi.org/10.3348/kjr.2020.1322