• Title/Summary/Keyword: Lung cancer prediction


Early Detection of Lung Cancer Risk Using Data Mining

  • Ahmed, Kawsar;Abdullah-Al-Emran, Abdullah-Al-Emran;Jesmin, Tasnuba;Mukti, Roushney Fatima;Rahman, Md. Zamilur;Ahmed, Farzana
    • Asian Pacific Journal of Cancer Prevention / v.14 no.1 / pp.595-598 / 2013
  • Background: Lung cancer is the leading cause of cancer death worldwide. Therefore, identification of genetic as well as environmental factors is very important in developing novel methods of lung cancer prevention. However, this is a multi-layered problem. A lung cancer risk prediction system is therefore proposed here that is easy to use, cost-effective and time-saving. Materials and Methods: Initially, data from 400 cancer and non-cancer patients were collected from different diagnostic centres, pre-processed and clustered using a K-means clustering algorithm to separate relevant from non-relevant data. Next, significant frequent patterns were discovered using AprioriTid and a decision tree algorithm. Results: Finally, using the significant patterns, prediction tools for a lung cancer prediction system were developed. This lung cancer risk prediction system should prove helpful in detecting a person's predisposition to lung cancer. Conclusions: Most people in Bangladesh who have lung cancer do not even know it, and the majority of cases are diagnosed at late stages when cure is impossible. Therefore, early prediction of lung cancer should play a pivotal role in the diagnosis process and in an effective preventive strategy.
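
The frequent-pattern step mentioned in this abstract can be illustrated with a minimal sketch. It is a simplified level-wise support count in the spirit of Apriori, not the AprioriTid variant the authors used, and the patient records and the `min_support` threshold are hypothetical values chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    Level-wise search: size-k candidates are built only from items that
    survived in frequent (k-1)-itemsets, as in Apriori.
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    k = 1
    while items:
        level = {}
        for candidate in combinations(items, k):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                level[candidate] = count / n
        if not level:
            break
        frequent.update(level)
        # Only items that still occur in a frequent itemset can extend one.
        items = sorted({item for itemset in level for item in itemset})
        k += 1
    return frequent


# Hypothetical risk-factor records, one set per patient.
records = [
    {"smoker", "cough", "age>60"},
    {"smoker", "cough"},
    {"smoker", "age>60"},
    {"cough"},
]
patterns = frequent_itemsets(records, min_support=0.5)
for itemset, support in sorted(patterns.items()):
    print(itemset, support)
```

In this toy data the pairs `("cough", "smoker")` and `("age>60", "smoker")` reach the 50% threshold, while the three-factor combination does not.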

Lung Cancer Risk Prediction Method Based on Feature Selection and Artificial Neural Network

  • Xie, Nan-Nan;Hu, Liang;Li, Tai-Hui
    • Asian Pacific Journal of Cancer Prevention / v.15 no.23 / pp.10539-10542 / 2015
  • A method to predict the risk of lung cancer is proposed, based on two feature selection algorithms, Fisher and ReliefF, and BP neural networks. An appropriate number of risk factors was chosen for lung cancer risk prediction. The process has two steps: first, the risk factors are chosen by combining the two feature selection algorithms; then the predictive value is produced by the neural network. Based on this framework, an algorithm, LCRP (lung cancer risk prediction), is presented to reduce the number of risk factors that must be collected in practical applications. The proposed method is suitable for health monitoring and self-testing. Experiments showed that it provides satisfactory accuracy with a low-dimensional set of risk factors.
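
As a rough illustration of the feature selection step, the sketch below ranks risk factors by a Fisher score (between-class variance over within-class variance). It assumes that "Fisher" in the abstract refers to this score; ReliefF and the BP neural network are not shown, and the feature names and values are hypothetical.

```python
from statistics import fmean, pvariance


def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall = fmean(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        between += len(group) * (fmean(group) - overall) ** 2
        within += len(group) * pvariance(group)
    # A feature that is constant inside every class has zero within-class scatter.
    return between / within if within > 0 else float("inf")


# Hypothetical data: 0 = no lung cancer, 1 = lung cancer.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "pack_years": [0, 5, 10, 30, 35, 40],
    "height_cm": [170, 160, 180, 165, 175, 170],
}

scores = {name: fisher_score(vals, labels) for name, vals in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

A higher score means the class means are far apart relative to the spread inside each class, so only the top-ranked factors would need to be collected.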

Characteristics and Prediction of Lung Cancer Mortality in China from 1991 to 2013

  • Fang, Jia-Ying;Dong, Hong-Li;Wu, Ku-Sheng;Du, Pei-Ling;Xu, Zhen-Xi;Lin, Kun
    • Asian Pacific Journal of Cancer Prevention / v.16 no.14 / pp.5829-5834 / 2015
  • Objective: To describe and analyze the epidemiological characteristics of lung cancer mortality in China from 1991 to 2013, forecast the future five-year trend and provide scientific evidence for prevention and management of lung cancer. Materials and Methods: Mortality data for lung cancer in China from 1991 to 2013 were used to describe epidemiological characteristics. Trend surface analysis was applied to analyze the geographical distribution of lung cancer. Four models (curve estimation, time series modeling, gray modeling (GM) and joinpoint regression) were used to forecast the future trend. Results: Since 1991 the mortality rate of lung cancer has increased yearly. The rate for males was higher than that for females, and rates in urban areas were higher than in rural areas. In addition, our results showed that the trend will continue to increase in the ensuing 5 years. The mortality rate increased from age 45-50 and peaked in the 85-year-old group. Geographical analysis indicated that people living in the northeastern provinces and the coastal provinces of eastern China had a higher mortality rate for lung cancer than those living in the central or western provinces. Conclusions: The standardized mortality rate of lung cancer increased steadily from 1991 to 2013 and is predicted to keep increasing over the ensuing 5 years. Further efforts should be concentrated on education of the general public to increase prevention and early detection. Much better prevention and management are needed in high-mortality areas (northeastern and eastern parts of China) and high-risk populations (45-50-year-olds).
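
Of the four forecasting models listed, gray modeling is the least familiar, so a minimal sketch follows. It assumes the common GM(1,1) form, which the abstract does not specify, and the input series is a hypothetical rate sequence rather than the study's data.

```python
import math


def gm11_forecast(series, steps):
    """Forecast `steps` future values with a GM(1,1) gray model.

    Assumes a positive series that is not constant (a constant series
    gives a = 0 and a division by zero below).
    """
    n = len(series)
    # 1. Accumulate the series (first-order accumulated generating operation).
    cumulative = []
    total = 0.0
    for value in series:
        total += value
        cumulative.append(total)
    # 2. Background values: means of consecutive cumulative sums.
    z = [0.5 * (cumulative[k] + cumulative[k - 1]) for k in range(1, n)]
    y = series[1:]
    # 3. Least-squares fit of y = -a * z + b.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(v * w for v, w in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    a = -slope
    b = (sy - slope * sz) / m

    # 4. Fitted cumulative curve; differences give the original-scale values.
    def fitted_cumulative(k):
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    return [fitted_cumulative(k) - fitted_cumulative(k - 1)
            for k in range(n, n + steps)]


# Hypothetical mortality rates growing about 10% per year.
rates = [10.0, 11.0, 12.1, 13.31]
forecast = gm11_forecast(rates, steps=2)
print(forecast)
```

GM(1,1) needs only a handful of observations, which is one reason it is often paired with conventional time series models when the historical record is short.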

Accessing the Clustering of TNM Stages on Survival Analysis of Lung Cancer Patient (폐암환자 생존분석에 대한 TNM 병기 군집분석 평가)

  • Choi, Chulwoong;Kim, Kyungbaek
    • Smart Media Journal / v.9 no.4 / pp.126-133 / 2020
  • Treatment policy and prognosis are determined by the final stage of a lung cancer patient. The final stage is determined from the T, N, and M stage classification table provided by the American Joint Committee on Cancer (AJCC). However, the AJCC final stage has limitations when used for purposes such as patient treatment, prognosis and prediction of survival days. In this paper, a clustering algorithm, one of the unsupervised learning algorithms, was assessed in order to check whether using only the T, N, M stages with a data science method is effective for grouping patients with respect to survival days. The final stage groups and the T, N, M stage clustering groups of lung cancer patients were compared using the Cox proportional hazards model. It was confirmed that survival days are predicted more accurately with only the T, N, M stages than with the final stages of patients. In particular, the accuracy of predicting survival days with clustering of the T, N, M stages improves when more or fewer clusters are used than seven, the number of AJCC final stages.

A Study on Predicting Lung Cancer Using RNA-Sequencing Data with Ensemble Learning (앙상블 기법을 활용한 RNA-Sequencing 데이터의 폐암 예측 연구)

  • Geon AN;JooYong PARK
    • Journal of Korean Artificial Intelligence Association / v.2 no.1 / pp.7-14 / 2024
  • In this paper, we explore the application of RNA-sequencing data and ensemble machine learning to predict lung cancer and treatment strategies for lung cancer, a leading cause of cancer mortality worldwide. The research utilizes Random Forest, XGBoost, and LightGBM models to analyze gene expression profiles from extensive datasets, aiming to enhance predictive accuracy for lung cancer prognosis. The methodology focuses on preprocessing RNA-seq data to standardize expression levels across samples and applying ensemble algorithms to maximize prediction stability and reduce model overfitting. Key findings indicate that ensemble models, especially XGBoost, substantially outperform traditional predictive models. Significant genetic markers such as ADGRF5 are identified as crucial for predicting lung cancer outcomes. In conclusion, ensemble learning using RNA-seq data proves highly effective in predicting lung cancer, suggesting a potential shift towards more precise and personalized treatment approaches. The results advocate for further integration of molecular and clinical data to refine diagnostic models and improve clinical outcomes, underscoring the critical role of advanced molecular diagnostics in enhancing patient survival rates and quality of life. This study lays the groundwork for future research in the application of RNA-sequencing data and ensemble machine learning techniques in clinical settings.

Prediction of Lung Cancer Susceptibility using an Importance Evaluation of SNP Data and SVM Learning (SNP 데이터의 중요도 평가와 SVM 학습법을 이용한 폐암 감수성 예측)

  • Ryoo, Myung-Chun;Kim, Sang-Jin;Park, Chang-Hyeon
    • The Journal of the Korea Contents Association / v.8 no.10 / pp.11-19 / 2008
  • In this paper, we propose a method for predicting lung cancer susceptibility using an importance evaluation of SNP data, a type of genetic data associated with developing lung cancer, together with SVM learning. Since the number of negative samples is much larger than that of the positive samples to be used in the SVM learning, for each positive sample we first search for the negative sample that has the same sex and the minimum age difference. That negative sample is then paired with the positive sample. For the importance evaluation of each SNP, an equation that calculates the influence of the SNP on the prediction of developing the disease is adopted. The SNPs are sorted according to the evaluated importance. In experiments, we observed how the prediction accuracy varies with the number of sorted SNPs used in the learning. LOOCV test results showed that the proposed method yields a maximum prediction accuracy of 65.0% on test data.
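
The pairing of positive and negative samples described above can be sketched as follows. The abstract does not say whether a negative sample may be reused, so this sketch assumes each one is paired at most once; the record fields (`id`, `sex`, `age`) and the sample values are hypothetical.

```python
def match_controls(positives, negatives):
    """Pair each positive sample with the closest-aged negative of the same sex.

    Assumption: each negative sample is used at most once (greedy, in the
    order the positives are given).
    """
    available = list(negatives)
    pairs = []
    for case in positives:
        same_sex = [c for c in available if c["sex"] == case["sex"]]
        if not same_sex:
            continue  # no eligible negative sample is left for this case
        best = min(same_sex, key=lambda c: abs(c["age"] - case["age"]))
        available.remove(best)
        pairs.append((case["id"], best["id"]))
    return pairs


# Hypothetical samples.
positives = [
    {"id": "P1", "sex": "M", "age": 62},
    {"id": "P2", "sex": "F", "age": 55},
]
negatives = [
    {"id": "N1", "sex": "M", "age": 40},
    {"id": "N2", "sex": "M", "age": 60},
    {"id": "N3", "sex": "F", "age": 58},
    {"id": "N4", "sex": "F", "age": 70},
]

pairs = match_controls(positives, negatives)
print(pairs)
```

Matching this way balances the two classes before SVM training and removes sex and age as easy but uninformative separators.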

A Study on Comparison of Lung Cancer Prediction Using Ensemble Machine Learning

  • NAM, Yu-Jin;SHIN, Won-Ji
    • Korean Journal of Artificial Intelligence / v.7 no.2 / pp.19-24 / 2019
  • Lung cancer is a chronic disease that ranks fourth in cancer incidence in Korea, accounting for 11 percent of total cancer incidence. To address this, the usefulness and utilization of Clinical Decision Support Systems (CDSS) based on machine learning are being actively studied. This study reviews existing work on artificial intelligence technology that can be used in diagnosing lung cancer, and examines the applicability of machine learning to lung cancer diagnosis through comparison and analysis using Azure ML provided by Microsoft. The results show the different predictions yielded by three algorithms: Support Vector Machine (SVM), Two-Class Support Decision Jungle and Multiclass Decision Jungle. This study is limited by the size of the data used for machine learning. Although the data provided by Kaggle were the most suitable for this study, the models presumably could not learn from them fully because the absolute number of records was small. Therefore, if cooperation from the relevant agencies can be obtained in subsequent research to compare and analyze algorithms beyond those used in this study, a more accurate lung cancer screening tool could be created.

Prediction of Lung Cancer Based on Serum Biomarkers by Gene Expression Programming Methods

  • Yu, Zhuang;Chen, Xiao-Zheng;Cui, Lian-Hua;Si, Hong-Zong;Lu, Hai-Jiao;Liu, Shi-Hai
    • Asian Pacific Journal of Cancer Prevention / v.15 no.21 / pp.9367-9373 / 2014
  • In the diagnosis of lung cancer, rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important. Serum markers, including lactate dehydrogenase (LDH), C-reactive protein (CRP), carcino-embryonic antigen (CEA), neurone specific enolase (NSE) and Cyfra21-1, are reported to reflect lung cancer characteristics. In this study, lung tumors were classified based on biomarkers (measured in 120 NSCLC and 60 SCLC patients) by setting up optimal biomarker joint models with a powerful computerized tool, gene expression programming (GEP). GEP is a learning algorithm that combines the advantages of genetic programming (GP) and genetic algorithms (GA). It specifically focuses on relationships between variables in sets of data and then builds models to explain these relationships, and has been successfully used in formula finding and function mining. As a basis for defining a GEP environment for SCLC and NSCLC prediction, three explicit predictive models were constructed. CEA and NSE are frequently used lung cancer markers in clinical trials, and CRP, LDH and Cyfra21-1 are significant in lung cancer; on the basis of CEA and NSE we set up three GEP models: GEP1 (CEA, NSE, Cyfra21-1), GEP2 (CEA, NSE, LDH) and GEP3 (CEA, NSE, CRP). The best GEP classification result was obtained when CEA, NSE and Cyfra21-1 were combined: 128 of 135 subjects in the training set and 40 of 45 subjects in the test set were classified correctly, giving an accuracy of 94.8% in the training set and 88.9% in the test set. With GEP2, the accuracy was significantly decreased by 1.5% and 6.6% in the training and test sets, respectively; with GEP3, the decreases were 0.82% and 4.45%. Serum Cyfra21-1 is a useful and sensitive serum biomarker for discriminating between NSCLC and SCLC. GEP modeling is a promising tool in the diagnosis of lung cancer.
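
The accuracy figures reported for the best model follow directly from the counts in the abstract. The short check below uses only those counts; the helper name `accuracy_percent` is introduced here for illustration.

```python
def accuracy_percent(correct, total):
    """Share of correctly classified subjects, as a percentage with one decimal."""
    return round(100 * correct / total, 1)


# Counts reported for GEP1 (CEA, NSE, Cyfra21-1).
train_accuracy = accuracy_percent(128, 135)
test_accuracy = accuracy_percent(40, 45)

# The two sets together cover all 180 patients (120 NSCLC + 60 SCLC).
total_subjects = 135 + 45

print(train_accuracy, test_accuracy, total_subjects)
```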

Determinants of Willingness to Undergo Lung Cancer Screening among High-Risk Current and Ex-smokers in Sabah, Malaysia: A Cross-Sectional Pilot Study

  • Larry Ellee Nyanti;Chia Zhen Chua;Han Chuan Loo;Cheng Zhi Khor;Emilia Sheau Yuin Toh;Rasvinder Singh Gill;Eng Tat Chan;Ker Yin Tan;Taufiq Rosli;Muhammad Aklil Abd Rahim;Arfian Ibrahim;Nai Chien Huan;Hema Yamini Devi Ramarmuty;Kunji Kannan Sivaraman Kannan
    • Tuberculosis and Respiratory Diseases / v.86 no.4 / pp.284-293 / 2023
  • Background: Attitudes towards smoking, lung cancer screening, and perceived risk of lung cancer have not been widely studied in Malaysia. The primary objective of this study was to describe the factors affecting the willingness of high-risk current smokers and ex-smokers to undergo low-dose computed tomography (LDCT) screening for lung cancer. Methods: A prospective, cross-sectional questionnaire study was conducted in current smokers or ex-smokers aged between 55 and 80 years at three hospitals in Kota Kinabalu, Sabah, Malaysia. The questionnaire recorded the following parameters: perceived lung cancer risk; Prostate Lung Colon Ovarian Cancer 2012 risk prediction model excluding race and ethnicity predictor (PLCOm2012norace); demographic characteristics; psychosocial characteristics; and attitudes towards lung cancer and lung cancer screening. Results: A vast majority of the 95 respondents (94.7%) indicated their willingness to undergo screening. Stigma of lung cancer, low levels of knowledge about lung cancer symptoms, concerns about financial constraints, and a preference for traditional medication were still prevalent among the respondents, and they may represent potential barriers to lung cancer screening uptake. A desire to have an early diagnosis (odds ratio [OR], 11.33; 95% confidence interval [CI], 1.53 to 84.05; p=0.02), perceived time constraints (OR, 3.94; 95% CI, 1.32 to 11.73; p=0.01), and proximity of LDCT screening facilities (OR, 14.33; 95% CI, 1.84 to 111.4; p=0.01) were associated with significantly higher odds of willingness to undergo screening. Conclusion: Although high-risk current smokers and ex-smokers are likely to undergo screening for lung cancer, several psychosocial barriers persist. The results of this study may guide policymakers and clinicians regarding the need to improve lung cancer awareness in our population.

Prediction of Postpneumonectomy Pulmonary Function by Lung Scan in Lung Cancer Patient (폐관류스캔을 이용한 폐암환자의 일측 전폐절제술후의 폐기능예측)

  • Hur, Jin;Jang, Bong-Hyun;Lee, Jong-Tae;Kim, Kyu-Tae
    • Journal of Chest Surgery / v.24 no.4 / pp.338-344 / 1991
  • If postoperative pulmonary function can be predicted preoperatively in patients undergoing pneumonectomy for lung cancer, it will help identify them as high- or low-risk candidates. Perfusion lung scan and spirometry were performed preoperatively in 12 patients with lung cancer, and the predicted postoperative vital capacity (VC), FVC, FEV1.0, FEF25-75% and MVV were estimated by multiplying the preoperative values by the percentage of perfusion of the nonsurgical lung. Three months after the operation the patients were re-examined with spirometry, and the obtained values were compared with the predicted values. The linear regression lines derived from the correlation between predicted values [X] and observed values [Y] were as follows: VC: R=0.532, Y=0.48X+1.28, P=0.075; FVC: R=0.566, Y=0.54X+1.15, P=0.055; FEV1.0: R=0.832, Y=0.68X+0.70, P=0.001; FEF25-75%: R=0.781, Y=0.68X+0.54, P=0.003; MVV: R=0.718, Y=0.67X+34.75, P=0.009. The prediction of postoperative FEV1.0, FEF25-75% and MVV in lung cancer patients undergoing pneumonectomy appears to be valid for preoperative evaluation of pulmonary function.
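
The prediction rule in this abstract is a single multiplication, and the reported FEV1.0 regression line can then be applied to the result. The sketch below assumes a hypothetical patient (preoperative FEV1.0 of 2.5 with 60% of perfusion in the nonsurgical lung) and assumes the regression was fitted in litres, which the abstract does not state.

```python
def predicted_postoperative(preoperative_value, nonsurgical_perfusion_percent):
    """Preoperative value scaled by the perfusion share of the remaining lung."""
    return preoperative_value * nonsurgical_perfusion_percent / 100.0


def regression_estimate_fev1(predicted):
    """Reported FEV1.0 regression line, Y = 0.68X + 0.70 (units assumed: litres)."""
    return 0.68 * predicted + 0.70


# Hypothetical patient, not taken from the study.
predicted_fev1 = predicted_postoperative(2.5, 60.0)
expected_observed_fev1 = regression_estimate_fev1(predicted_fev1)

print(predicted_fev1, round(expected_observed_fev1, 2))
```

For this hypothetical patient the regression line returns a value above the simple prediction, so the multiplication rule alone would understate the function measured three months after surgery.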
