Breast Cancer Statistics and Prediction Methodology: A Systematic Review and Analysis

  • Dubey, Ashutosh Kumar (Department of Computer Science & Engineering, JK Lakshmipat University)
  • Gupta, Umesh (Department of Computer Science & Engineering, JK Lakshmipat University)
  • Jain, Sonal (Department of Computer Science & Engineering, JK Lakshmipat University)
  • Published: 2015.06.03

Abstract

Breast cancer is a serious cancer that primarily affects women. Research on early detection is ongoing because the chance of cure is considerably better when the disease is found at an early stage. This study has two objectives: first, to establish statistics for breast cancer, and second, to identify, from previous studies, methodologies that can help detect breast cancer at an early stage. Breast cancer incidence and mortality statistics for the UK, US, India and Egypt were considered. The findings indicate that overall mortality rates in the UK and US have improved because of awareness, improved medical technology and screening, whereas in India and Egypt the situation is less positive because of a lack of awareness. The methodological findings suggest a combined framework based on data mining and evolutionary algorithms, which provides a strong basis for improving the classification and detection accuracy of breast cancer data.
