• Title/Abstract/Keywords: Lung cancer prediction

Search results: 61 items; processing time: 0.032 s

Early Detection of Lung Cancer Risk Using Data Mining

  • Ahmed, Kawsar;Abdullah-Al-Emran, Abdullah-Al-Emran;Jesmin, Tasnuba;Mukti, Roushney Fatima;Rahman, Md. Zamilur;Ahmed, Farzana
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 14, No. 1
    • /
    • pp.595-598
    • /
    • 2013
  • Background: Lung cancer is the leading cause of cancer death worldwide. Therefore, identification of genetic as well as environmental factors is very important in developing novel methods of lung cancer prevention. However, this is a multi-layered problem. A lung cancer risk prediction system is therefore proposed that is easy to use, cost-effective and time-saving. Materials and Methods: Initially, data on 400 cancer and non-cancer patients were collected from different diagnostic centres, pre-processed and clustered using a K-means clustering algorithm to identify relevant and non-relevant data. Next, significant frequent patterns were discovered using AprioriTid and a decision tree algorithm. Results: Finally, using the significant patterns, prediction tools for a lung cancer prediction system were developed. This lung cancer risk prediction system should prove helpful in detecting a person's predisposition for lung cancer. Conclusions: Most people in Bangladesh do not even know they have lung cancer, and the majority of cases are diagnosed at late stages when cure is impossible. Therefore, early prediction of lung cancer should play a pivotal role in the diagnosis process and in an effective preventive strategy.

Lung Cancer Risk Prediction Method Based on Feature Selection and Artificial Neural Network

  • Xie, Nan-Nan;Hu, Liang;Li, Tai-Hui
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 15, No. 23
    • /
    • pp.10539-10542
    • /
    • 2015
  • A method to predict the risk of lung cancer is proposed, based on two feature selection algorithms, Fisher and ReliefF, and BP neural networks. An appropriate number of risk factors was chosen for lung cancer risk prediction. The process has two steps: first choosing the risk factors by combining the two feature selection algorithms, then producing the predictive value with the neural network. Based on this framework, an algorithm called LCRP (lung cancer risk prediction) is presented to reduce the number of risk factors that must be collected in practical applications. The proposed method is suitable for health monitoring and self-testing. Experiments showed that it provides satisfactory accuracy with a low-dimensional set of risk factors.
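The abstract does not give the authors' formulas, so the following is only a minimal sketch of one common definition of the Fisher score (between-class scatter divided by within-class scatter, per feature) used to rank risk factors. The function name `fisher_scores` and the toy data are hypothetical, and the ReliefF and neural network steps are not shown.

```python
from statistics import fmean, pvariance

def fisher_scores(rows, labels):
    """Return one Fisher score per feature column (higher = more discriminative)."""
    n_features = len(rows[0])
    classes = sorted(set(labels))
    scores = []
    for j in range(n_features):
        overall_mean = fmean(r[j] for r in rows)
        between = 0.0  # class-size-weighted squared distance of class means from the overall mean
        within = 0.0   # class-size-weighted variance inside each class
        for c in classes:
            values = [r[j] for r, y in zip(rows, labels) if y == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        scores.append(between / within if within > 0 else float("inf"))
    return scores

# Hypothetical toy data: two risk factors, two classes (0 = no cancer, 1 = cancer).
rows = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 6.0]]
labels = [0, 0, 1, 1]
scores = fisher_scores(rows, labels)
# Rank feature indices from most to least discriminative.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

In a pipeline like the one described, only the top-ranked factors would then be passed to the classifier, which is how the number of collected risk factors is kept low.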

Characteristics and Prediction of Lung Cancer Mortality in China from 1991 to 2013

  • Fang, Jia-Ying;Dong, Hong-Li;Wu, Ku-Sheng;Du, Pei-Ling;Xu, Zhen-Xi;Lin, Kun
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 16, No. 14
    • /
    • pp.5829-5834
    • /
    • 2015
  • Objective: To describe and analyze the epidemiological characteristics of lung cancer mortality in China from 1991 to 2013, forecast the trend for the following five years and provide scientific evidence for prevention and management of lung cancer. Materials and Methods: Mortality data for lung cancer in China from 1991 to 2013 were used to describe epidemiological characteristics. Trend surface analysis was applied to analyze the geographical distribution of lung cancer. Four models (curve estimation, time series modeling, gray modeling (GM) and joinpoint regression) were used to forecast the future trend. Results: Since 1991 the mortality rate of lung cancer has increased yearly. The rate for males was higher than that for females, and rates in urban areas were higher than in rural areas. In addition, our results showed that the trend will continue to increase in the ensuing 5 years. The mortality rate increased from age 45-50 and peaked in the 85-year-old group. Geographical analysis indicated that people living in the northeastern provinces and the eastern coastal provinces of China had a higher mortality rate for lung cancer than those living in the central or western provinces. Conclusions: The standardized mortality rate of lung cancer increased constantly from 1991 to 2013 and is predicted to continue increasing in the ensuing 5 years. Further efforts should be concentrated on education of the general public to increase prevention and early detection. Much better prevention and management are needed in high mortality areas (northeastern and eastern parts of China) and high risk populations (45-50-year-olds).
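Of the four forecasting models named above, gray modeling is the least widely known. As a rough illustration only, the sketch below implements the textbook GM(1,1) model in plain Python; the abstract does not say which GM variant or settings the authors used, and the function name `gm11_forecast` and the input series are hypothetical, not the paper's mortality data.

```python
from math import exp

def gm11_forecast(series, steps):
    """Fit a textbook GM(1,1) model to a positive series and forecast `steps` further values."""
    n = len(series)
    # Accumulated series x1 and the mean of each adjacent pair (background values z).
    x1, total = [], 0.0
    for v in series:
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(series[1:])
    # Least squares for y = -a*z + b, i.e. an ordinary regression of y on z.
    z_mean = sum(z) / len(z)
    y_mean = sum(y) / len(y)
    slope = (sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
             / sum((zi - z_mean) ** 2 for zi in z))
    a = -slope                      # development coefficient (assumed non-zero here)
    b = y_mean - slope * z_mean     # gray input

    def x1_hat(k):
        # Time response of the accumulated series.
        return (series[0] - b / a) * exp(-a * k) + b / a

    # Difference the accumulated predictions to get back to the original scale.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]

# Hypothetical yearly rates growing about 10% per year; forecast the next five years.
rates = [10.0, 11.0, 12.1, 13.31, 14.641]
forecast = gm11_forecast(rates, 5)
print([round(v, 2) for v in forecast])
```

GM(1,1) is often chosen for short series because it needs only a handful of observations, which is one plausible reason to include it alongside time series and joinpoint models.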

Accessing the Clustering of TNM Stages on Survival Analysis of Lung Cancer Patient

  • 최철웅;김경백
    • Smart Media Journal
    • /
    • Vol. 9, No. 4
    • /
    • pp.126-133
    • /
    • 2020
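  • Hospitals decide the treatment policy and prognosis for lung cancer patients based on the final stage. The final stage of a lung cancer patient is diagnosed in seven levels based on the TNM classification provided by the American Joint Committee on Cancer (AJCC). This approach has limitations when used in areas such as patient treatment, prognosis and prediction of survival days. In this paper, to find out whether patient groups can be divided by survival days using the T, N and M stages through a data-science approach, we performed clustering, an unsupervised learning technique, and then compared the clustering results using the Cox proportional hazards model. We confirmed that survival time prediction accuracy was higher when only the T, N and M stage information was used, without the patients' final stage. In particular, we confirmed that survival time prediction accuracy through T, N, M stage clustering improved when the number of clusters was reduced or expanded, compared with setting it to 7 as in the seven AJCC final stages.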

A Study on Predicting Lung Cancer Using RNA-Sequencing Data with Ensemble Learning

  • Geon AN;JooYong PARK
    • Journal of Korea Artificial Intelligence Association
    • /
    • Vol. 2, No. 1
    • /
    • pp.7-14
    • /
    • 2024
  • In this paper, we explore the application of RNA-sequencing data and ensemble machine learning to predict lung cancer and treatment strategies for lung cancer, a leading cause of cancer mortality worldwide. The research utilizes Random Forest, XGBoost, and LightGBM models to analyze gene expression profiles from extensive datasets, aiming to enhance predictive accuracy for lung cancer prognosis. The methodology focuses on preprocessing RNA-seq data to standardize expression levels across samples and applying ensemble algorithms to maximize prediction stability and reduce model overfitting. Key findings indicate that ensemble models, especially XGBoost, substantially outperform traditional predictive models. Significant genetic markers such as ADGRF5 are identified as crucial for predicting lung cancer outcomes. In conclusion, ensemble learning using RNA-seq data proves highly effective in predicting lung cancer, suggesting a potential shift towards more precise and personalized treatment approaches. The results advocate for further integration of molecular and clinical data to refine diagnostic models and improve clinical outcomes, underscoring the critical role of advanced molecular diagnostics in enhancing patient survival rates and quality of life. This study lays the groundwork for future research in the application of RNA-sequencing data and ensemble machine learning techniques in clinical settings.

Prediction of Lung Cancer Susceptibility using an Importance Evaluation of SNP Data and SVM Learning

  • 류명춘;김상진;박창현
    • The Journal of the Korea Contents Association
    • /
    • Vol. 8, No. 10
    • /
    • pp.11-19
    • /
    • 2008
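  • This paper proposes a method for predicting lung cancer susceptibility using an importance evaluation of SNP data, which are genetic data involved in the development of lung cancer, together with SVM learning. Because the negative data far outnumber the lung-cancer-related positive data available for training, each positive sample is paired with a negative sample of the same sex and a small age difference. In addition, a formula that computes the influence of each SNP on disease prediction is introduced to evaluate the importance of each SNP, and the SNPs are ranked by importance. In the experiments, we observed how the prediction rate changes with the number of ranked SNPs used for training; in LOOCV tests, the proposed method showed a prediction accuracy of up to 65.0% on the experimental data.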

A Study on Comparison of Lung Cancer Prediction Using Ensemble Machine Learning

  • NAM, Yu-Jin;SHIN, Won-Ji
    • Korean Journal of Artificial Intelligence
    • /
    • Vol. 7, No. 2
    • /
    • pp.19-24
    • /
    • 2019
  • Lung cancer is a chronic disease that ranks fourth in cancer incidence in Korea, accounting for 11 percent of total cancer incidence. To deal with such issues, the usefulness and utilization of Clinical Decision Support Systems (CDSS) based on machine learning are being actively studied. This study therefore reviews existing studies on artificial intelligence technology that can be used in determining lung cancer, and examines the applicability of machine learning to determining lung cancer through comparison and analysis using Azure ML provided by Microsoft. The results show different predictions yielded by three algorithms: Support Vector Machine (SVM), Two-Class Support Decision Jungle and Multiclass Decision Jungle. The study is limited by the size of the data used for machine learning. Although the data provided by Kaggle are the most suitable for this study, the models are assumed to be limited in how fully they can learn from the data because of the small absolute number of records. Therefore, if subsequent research, with the cooperation of agencies, compares and analyzes various algorithms beyond those used in this study, a more accurate screening tool for lung cancer could be created.

Prediction of Lung Cancer Based on Serum Biomarkers by Gene Expression Programming Methods

  • Yu, Zhuang;Chen, Xiao-Zheng;Cui, Lian-Hua;Si, Hong-Zong;Lu, Hai-Jiao;Liu, Shi-Hai
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 15, No. 21
    • /
    • pp.9367-9373
    • /
    • 2014
  • In diagnosis of lung cancer, rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important. Serum markers, including lactate dehydrogenase (LDH), C-reactive protein (CRP), carcino-embryonic antigen (CEA), neurone specific enolase (NSE) and Cyfra21-1, are reported to reflect lung cancer characteristics. In this study, lung tumors were classified based on biomarkers (measured in 120 NSCLC and 60 SCLC patients) by setting up optimal biomarker joint models with a powerful computerized tool, gene expression programming (GEP). GEP is a learning algorithm that combines the advantages of genetic programming (GP) and genetic algorithms (GA). It specifically focuses on relationships between variables in sets of data and then builds models to explain these relationships, and has been successfully used in formula finding and function mining. As a basis for defining a GEP environment for SCLC and NSCLC prediction, three explicit predictive models were constructed. CEA and NSE are frequently used lung cancer markers in clinical trials, and CRP, LDH and Cyfra21-1 have significant meaning in lung cancer; on the basis of CEA and NSE we set up three GEP models: GEP1 (CEA, NSE, Cyfra21-1), GEP2 (CEA, NSE, LDH) and GEP3 (CEA, NSE, CRP). The best classification result was obtained when CEA, NSE and Cyfra21-1 were combined: 128 of 135 subjects in the training set and 40 of 45 subjects in the test set were classified correctly, giving accuracy rates of 94.8% on the training set and 88.9% on the test set. With GEP2, the accuracy decreased significantly, by 1.5% and 6.6% in the training and test sets, and with GEP3 by 0.82% and 4.45%, respectively. Serum Cyfra21-1 is a useful and sensitive serum biomarker in discriminating between NSCLC and SCLC. GEP modeling is a promising and excellent tool in diagnosis of lung cancer.
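The accuracy figures in this abstract follow directly from the reported counts, and the training and test sets together account for all 180 patients (120 NSCLC plus 60 SCLC). A short check, where the helper name `accuracy_pct` is only illustrative:

```python
def accuracy_pct(correct: int, total: int) -> float:
    """Share of correctly classified subjects, as a percentage."""
    return 100.0 * correct / total

# Counts reported in the abstract for the CEA + NSE + Cyfra21-1 model.
train_acc = accuracy_pct(128, 135)
test_acc = accuracy_pct(40, 45)
total_subjects = 135 + 45  # should equal 120 NSCLC + 60 SCLC patients

print(f"training: {train_acc:.1f}%  test: {test_acc:.1f}%  subjects: {total_subjects}")
```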

Determinants of Willingness to Undergo Lung Cancer Screening among High-Risk Current and Ex-smokers in Sabah, Malaysia: A Cross-Sectional Pilot Study

  • Larry Ellee Nyanti;Chia Zhen Chua;Han Chuan Loo;Cheng Zhi Khor;Emilia Sheau Yuin Toh;Rasvinder Singh Gill;Eng Tat Chan;Ker Yin Tan;Taufiq Rosli;Muhammad Aklil Abd Rahim;Arfian Ibrahim;Nai Chien Huan;Hema Yamini Devi Ramarmuty;Kunji Kannan Sivaraman Kannan
    • Tuberculosis and Respiratory Diseases
    • /
    • Vol. 86, No. 4
    • /
    • pp.284-293
    • /
    • 2023
  • Background: Attitudes towards smoking, lung cancer screening, and perceived risk of lung cancer have not been widely studied in Malaysia. The primary objective of this study was to describe the factors affecting the willingness of high-risk current smokers and ex-smokers to undergo low-dose computed tomography (LDCT) screening for lung cancer. Methods: A prospective, cross-sectional questionnaire study was conducted in current smokers or ex-smokers aged between 55 and 80 years at three hospitals in Kota Kinabalu, Sabah, Malaysia. The questionnaire recorded the following parameters: perceived lung cancer risk; Prostate Lung Colon Ovarian Cancer 2012 risk prediction model excluding race and ethnicity predictor (PLCOm2012norace); demographic characteristics; psychosocial characteristics; and attitudes towards lung cancer and lung cancer screening. Results: A vast majority of the 95 respondents (94.7%) indicated their willingness to undergo screening. Stigma of lung cancer, low levels of knowledge about lung cancer symptoms, concerns about financial constraints, and a preference for traditional medication were still prevalent among the respondents, and they may represent potential barriers to lung cancer screening uptake. A desire to have an early diagnosis (odds ratio [OR], 11.33; 95% confidence interval [CI], 1.53 to 84.05; p=0.02), perceived time constraints (OR, 3.94; 95% CI, 1.32 to 11.73; p=0.01), and proximity of LDCT screening facilities (OR, 14.33; 95% CI, 1.84 to 111.4; p=0.01) had significantly higher odds of willingness to undergo screening. Conclusion: Although high-risk current smokers and ex-smokers are likely to undergo screening for lung cancer, several psychosocial barriers persist. The results of this study may guide the policymakers and clinicians regarding the need to improve lung cancer awareness in our population.

Prediction of Postpneumonectomy Pulmonary Function by Lung Scan in Lung Cancer Patient

  • 허진;장봉현;이종태;김규태
    • Journal of Chest Surgery
    • /
    • Vol. 24, No. 4
    • /
    • pp.338-344
    • /
    • 1991
  • If postoperative pulmonary function can be predicted preoperatively in patients undergoing pneumonectomy for lung cancer, it will help identify them as high- or low-risk candidates. Perfusion lung scan and spirometry were performed preoperatively in 12 patients with lung cancer, and the predicted postoperative vital capacity (VC), FVC, FEV1.0, FEF25-75% and MVV were estimated by multiplying the preoperative values by the percentage of perfusion of the nonsurgical lung. Three months after the operation the patients were re-examined with spirometry, and the obtained values were compared with the predicted values. The linear regression lines derived from the correlation between predicted values [X] and observed values [Y] were as follows: VC: R=0.532, Y=0.48X+1.28, P=0.075; FVC: R=0.566, Y=0.54X+1.15, P=0.055; FEV1.0: R=0.832, Y=0.68X+0.70, P=0.001; FEF25-75%: R=0.781, Y=0.68X+0.54, P=0.003; MVV: R=0.718, Y=0.67X+34.75, P=0.009. The prediction of postoperative FEV1.0, FEF25-75% and MVV in lung cancer patients undergoing pneumonectomy appears to be valid for evaluating preoperative pulmonary function.
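The prediction rule described in this abstract is a single multiplication, sketched below. The function names and the example patient are hypothetical, and the abstract does not state units, so litres are assumed for FEV1.0 when the reported regression line is applied.

```python
def predicted_postoperative(preop_value: float, nonsurgical_perfusion_pct: float) -> float:
    """Preoperative value scaled by the perfusion share (%) of the lung that will remain."""
    if not 0.0 <= nonsurgical_perfusion_pct <= 100.0:
        raise ValueError("perfusion percentage must be between 0 and 100")
    return preop_value * nonsurgical_perfusion_pct / 100.0

def observed_fev1_from_regression(predicted: float) -> float:
    """Regression line reported for FEV1.0 in the abstract: Y = 0.68X + 0.70 (units assumed to be litres)."""
    return 0.68 * predicted + 0.70

# Hypothetical patient: preoperative FEV1.0 of 2.40 L, with 60% of perfusion in the nonsurgical lung.
ppo_fev1 = predicted_postoperative(2.40, 60.0)
print(f"predicted postoperative FEV1.0: {ppo_fev1:.2f} L")
print(f"regression-adjusted estimate:   {observed_fev1_from_regression(ppo_fev1):.2f} L")
```

The reported slopes below 1 with positive intercepts mean that, within this small sample, lower predicted values tended to underestimate the function actually observed three months after surgery.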
