• Title/Summary/Keyword: microsoft excel

Optimal Selection of Classifier Ensemble Using Genetic Algorithms (유전자 알고리즘을 이용한 분류자 앙상블의 최적 선택)

  • Kim, Myung-Jong
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.99-112 / 2010
  • Ensemble learning is a method for improving the performance of classification and prediction algorithms. It finds a highly accurate classifier on the training set by constructing and combining an ensemble of weak classifiers, each of which needs only to be moderately accurate on the training set. Ensemble learning has received considerable attention in the machine learning and artificial intelligence fields because of its remarkable performance improvement and flexible integration with traditional learning algorithms such as decision trees (DT), neural networks (NN), and support vector machines (SVM). In that research, DT ensemble studies have consistently demonstrated impressive improvements in the generalization behavior of DT, while NN and SVM ensemble studies have not shown performance gains as remarkable as those of DT ensembles. Recently, several works have reported that the performance of an ensemble can be degraded when its classifiers are highly correlated with one another, which results in a multicollinearity problem. They have also proposed differentiated learning strategies to cope with this performance degradation problem. Hansen and Salamon (1990) argued that it is necessary and sufficient for the performance enhancement of an ensemble that the ensemble contain diverse classifiers. Breiman (1996) showed that ensemble learning can increase the performance of unstable learning algorithms, but does not yield remarkable performance improvement for stable learning algorithms. Unstable learning algorithms such as decision tree learners are sensitive to changes in the training data, and thus small changes in the training data can yield large changes in the generated classifiers. Therefore, an ensemble of unstable learning algorithms can guarantee some diversity among the classifiers. 
In contrast, stable learning algorithms such as NN and SVM generate similar classifiers in spite of small changes in the training data, and thus the correlation among the resulting classifiers is very high. This high correlation results in a multicollinearity problem, which leads to performance degradation of the ensemble. Kim's work (2009) compared the performance of traditional prediction algorithms such as NN, DT, and SVM in bankruptcy prediction for Korean firms. It reports that stable learning algorithms such as NN and SVM have higher predictability than the unstable DT. Meanwhile, with respect to ensemble learning, the DT ensemble shows a greater performance improvement than the NN and SVM ensembles. Further analysis using the variance inflation factor (VIF) provides empirical evidence that the performance degradation of the ensemble is due to the multicollinearity problem. It also proposes that optimization of the ensemble is needed to cope with such a problem. This paper proposes a hybrid system for coverage optimization of NN ensembles (CO-NN) in order to improve the performance of the NN ensemble. Coverage optimization is a technique for choosing a sub-ensemble from an original ensemble so as to guarantee the diversity of its classifiers. CO-NN uses a genetic algorithm (GA), which has been widely used for various optimization problems, to deal with the coverage optimization problem. The GA chromosomes for coverage optimization are encoded as binary strings, each bit of which corresponds to an individual classifier. The fitness function is defined as the maximization of error reduction, and a constraint on the variance inflation factor (VIF), one of the most commonly used measures of multicollinearity, is added to ensure the diversity of classifiers by removing high correlation among them. We use Microsoft Excel and the GA software package Evolver. 
Experiments on company failure prediction have shown that CO-NN effectively achieves stable performance enhancement of NN ensembles by choosing classifiers with the correlations within the ensemble taken into account. Classifiers with a potential multicollinearity problem are removed by the coverage optimization process of CO-NN, and CO-NN thereby shows higher performance than a single NN classifier and the NN ensemble at the 1% significance level, and than the DT ensemble at the 5% significance level. However, further research issues remain. First, a decision optimization process for finding the optimal combination function should be considered. Second, various learning strategies for dealing with data noise should be introduced in future research.
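
The coverage optimization idea described in this abstract can be sketched in a few lines. The following is a minimal illustration only, not the authors' Excel/Evolver implementation: the function names, the synthetic data, the majority-vote combiner, the GA settings, and the VIF limit of 10 (a common rule of thumb) are all assumptions made for the example. Each chromosome is a bit mask over the classifiers, and a sub-ensemble whose largest VIF exceeds the limit receives zero fitness.

```python
import random

def correlation_matrix(cols):
    """Pearson correlation matrix of equal-length columns."""
    n = len(cols[0])
    centered = []
    for c in cols:
        mean = sum(c) / n
        centered.append([x - mean for x in c])
    norms = [sum(x * x for x in c) ** 0.5 for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def max_vif(cols):
    """Largest VIF among the columns; VIF_j is the j-th diagonal entry
    of the inverse correlation matrix (Gauss-Jordan elimination)."""
    k = len(cols)
    if k < 2:
        return 1.0
    try:
        r = correlation_matrix(cols)
    except ZeroDivisionError:          # a constant column has no correlation
        return float("inf")
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(r)]
    for c in range(k):
        piv = max(range(c, k), key=lambda i: abs(aug[i][c]))
        if abs(aug[piv][c]) < 1e-10:   # singular: perfectly collinear outputs
            return float("inf")
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for i in range(k):
            if i != c:
                f = aug[i][c]
                aug[i] = [vi - f * vc for vi, vc in zip(aug[i], aug[c])]
    return max(aug[j][k + j] for j in range(k))

def vote_accuracy(chosen, labels):
    """Accuracy of a simple majority vote over 0/1 predictions."""
    hits = 0
    for i, y in enumerate(labels):
        ones = sum(p[i] for p in chosen)
        hits += (1 if 2 * ones > len(chosen) else 0) == y
    return hits / len(labels)

def select_sub_ensemble(preds, labels, vif_limit=10.0,
                        pop_size=20, generations=15, seed=0):
    """Tiny GA over bit masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n = len(preds)
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            chosen = [p for p, bit in zip(preds, mask) if bit]
            if not chosen or max_vif(chosen) > vif_limit:
                cache[key] = 0.0       # constraint violated
            else:
                cache[key] = vote_accuracy(chosen, labels)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = [best[:]]                # elitism: keep the best mask
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)   # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)                  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < 0.05) for bit in child]  # mutation
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best, fitness(best)

def demo_data(n_samples=120, seed=1):
    """Synthetic 0/1 outputs: three near-copies of one classifier, three independent ones."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_samples)]
    base = [y if rng.random() < 0.75 else 1 - y for y in labels]
    preds = []
    for k in range(6):
        if k < 3:
            preds.append([b if rng.random() < 0.97 else 1 - b for b in base])
        else:
            preds.append([y if rng.random() < 0.70 else 1 - y for y in labels])
    return preds, labels

if __name__ == "__main__":
    preds, labels = demo_data()
    mask, fit = select_sub_ensemble(preds, labels)
    print("selected mask:", mask, "fitness:", round(fit, 3))
```

Treating the VIF limit as a hard constraint keeps the sketch short; a penalty term subtracted from the accuracy would be an equally plausible reading of "a constraint of VIF".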

The comparison of Patient Hygiene Performance(PHP) Index according to the number of Oral Health Care worker with Disabled (장애인 구강건강관리인력에 따른 구강환경관리능력 지수 비교)

  • Kim, So-Yeon;Kim, Su-ji;Kim, Yeon-seon;Kim, Ji-Hong;Kim, Hyo-Jin;Jung, Seung-min;Hong, Ji-Hee
    • Journal of the Korean Academy of Esthetic Dentistry / v.28 no.2 / pp.116-126 / 2019
  • Objectives: Currently, the oral health of the disabled is taken care of by social workers, not by dental hygienists, who are the oral health professionals in this area. Therefore, we aim to enhance equity in oral health for the disabled by teaching correct oral health care methods to social workers residing in welfare facilities for the disabled. Methods: Four dental hygienists and four social workers were assigned to people with class I intellectual disabilities living in the 'o' welfare facility for disabled people in Songpa-gu, Seoul, from April 13, 2019 to April 20, 2019. The Patient Hygiene Performance (PHP) Index was measured and compared. In advance, the social workers were taught toothbrushing (rolling method), and the brushing method and measuring tools were unified. Results: The dental hygienists and social workers each performed toothbrushing (rolling method) twice on the people with class I intellectual disabilities. When the Patient Hygiene Performance (PHP) Index was compared after the second round, the dental hygienists' PHP Index was lower in both the first and second rounds. Conclusions: Comparing the oral health knowledge level and PHP Index of dental hygienists and social workers, the results show that dental hygienists have higher oral health care ability. Therefore, dental hygienists should be placed in welfare facilities for the disabled as experts in oral health management to create an environment in which the disabled and social workers can be trained. In addition, the curriculum of colleges that train dental hygienists should include a course on understanding the characteristics of disabled people in order to enhance the professionalism of dental hygienists.

Classification of submitted KSNMT dissertation (대한핵의학기술학회 투고 논문 분류)

  • Han, Dong-Chan;Lee, Hyuk;Hong, Gun-Chul;Ahn, Byeong-Ho;Choi, Seong-Wook
    • The Korean Journal of Nuclear Medicine Technology / v.21 no.1 / pp.65-69 / 2017
  • Purpose: KSNMT (Korea Society of Nuclear Medicine Technology), which took its first step in 1997, published the first journal related to nuclear medicine technology in 1985. By classifying the in vivo session dissertations reported across the entire journal, this study examines dissertation trends. Materials and Methods: Dissertations published in the journal from 1985 to the first half of 2016 were classified by presentation form and by scanner, and all the data were organized in Excel. From these data, the number of dissertations published each year, the number published in each detailed category, and the keyword distribution in each period were analyzed. Results: The number of in vivo section dissertations was 1151, and the number of in vivo section dissertations sharing a subject with the in vitro section was 28. The number of in vivo section dissertations was 46 in the 1980s, 149 in the 1990s, 467 in the 2000s, and 517 from 2010 to the first half of 2016. By presentation form, there were 571 original articles, 529 abstracts, 31 symposium presentations, 25 special lectures, 11 reviews, 7 interesting images, 3 posters, and 2 case reports. With symposium presentations and special lectures excluded, which number 56, the number of dissertations on PET was 319, on planar imaging 302, on SPECT 172, on radiopharmaceuticals 113, on guard and safety management 103, on BMD 28, and on other topics 86. The number of dissertations about oncology was 201, about scanners was 179, about the cardiovascular and circulatory system was 102, about safe environment was 82, about the musculoskeletal system was 76, about nervous nuclear medicine was 66, about quality assurance was 61, about the genitourinary system was 56, about the endocrine system was 49, about the digestive system was 44, about Therapy, about industrial safety was 24, about molecular imaging was 15, about infection and inflammation was 9, about the respiratory system was 8, and about other subjects was 108. 
The most frequently used keyword from 1999 to 2005 was PET, and from 2006 to 2016 it was PET/CT. Conclusion: To encourage the submission of a variety of dissertations, the Korea Society of Nuclear Medicine should analyze data not only on dissertations that are already published but also on various research materials. Moreover, the Korea Society of Nuclear Medicine should also provide technical support, such as sharing big data through its homepage, and systematic support to help its members publish dissertations with a high impact factor. Continued effort by each individual researcher is important, as is cooperation among organizations.
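
The counts reported in this abstract can be cross-checked with a short tally. This is only a consistency sketch: it assumes the 28 dissertations shared with the in vitro section are counted in addition to the 1151, and that the "2000" figure refers to the 2000s. The dictionary keys are labels chosen for the example.

```python
# Counts copied from the abstract; the grouping labels are our own.
in_vivo_only = 1151
shared_with_in_vitro = 28
total = in_vivo_only + shared_with_in_vitro

by_period = {"1980s": 46, "1990s": 149, "2000s": 467, "2010-2016H1": 517}
by_form = {
    "original article": 571, "abstract": 529, "symposium": 31,
    "special lecture": 25, "review": 11, "interesting image": 7,
    "poster": 3, "case report": 2,
}
by_modality = {
    "PET": 319, "Planar": 302, "SPECT": 172, "radiopharmaceutical": 113,
    "guard and safety management": 103, "BMD": 28, "etc.": 86,
}

# Symposium and special lecture entries are excluded from the modality breakdown.
excluded = by_form["symposium"] + by_form["special lecture"]

checks = {
    "by period": sum(by_period.values()) == total,
    "by form": sum(by_form.values()) == total,
    "excluded count": excluded == 56,
    "by modality": sum(by_modality.values()) == total - excluded,
}

if __name__ == "__main__":
    for name, ok in checks.items():
        print(name, "consistent" if ok else "inconsistent")
```

The subject breakdown is left out of the tally because the count for Therapy is missing from the abstract.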

The variability of 6-D Skull Tracking(6DST) in Cyberknife for Bone metastasis patients (사이버나이프 6-D Skull Tracking의 유용성 평가)

  • Lee, Geon Ho;Bae, Sun Myeong;Song, Heung Kwon;Baek, Geum Mun
    • The Journal of Korean Society for Radiation Therapy / v.30 no.1_2 / pp.41-47 / 2018
  • Purpose: The purpose of this study is to evaluate the usefulness of 6-Dimensional Skull Tracking (6DST) in Cyberknife Stereotactic Body Radiation Therapy (SBRT) for metastasis of the first and second cervical vertebrae (C1 and C2). Materials and methods: A computed tomography scanner (Lightspeed VCT 64, General Electric Co., Waukesha, WI, USA) was used to acquire CT images of 9 patients with cervical vertebrae (C1 and C2) metastasis. Treatment plans for Xsight spine tracking (XST) and 6-dimensional skull tracking were established with the planning system (Multiplan system Version 4.6, Accuray, US). The results of XST and 6DST for each patient were analyzed with Microsoft Excel 2010. Results: The maximum offsets of XST for C1 were 0.9 mm in the Y (supero-inferior), 0.9 mm in the Z (antero-posterior), and 0.7 mm in the X (left-right) direction, and the rotations were 1.0 degrees roll, 1.0 degrees pitch, and 1.2 degrees yaw. The maximum offsets of 6DST for C1 were 0.7 mm, 0.7 mm, 0.9 mm and $1.0^{\circ}$, $1.0^{\circ}$, $1.2^{\circ}$ for Y, Z, X and roll, pitch, yaw. The maximum offsets of XST and 6DST for C2 were 0.7 mm, 0.7 mm, 0.8 mm and $0.9^{\circ}$, $1.0^{\circ}$, $1.8^{\circ}$, and 0.9 mm, 0.7 mm, 0.9 mm and $0.9^{\circ}$, $0.9^{\circ}$, $1.0^{\circ}$ for Y, Z, X and roll, pitch, yaw, respectively. Conclusion: XST and 6DST showed identical results for translations and rotations within the tolerance. Using 6DST can shorten the treatment time and simplify the procedure. Therefore, 6DST, along with XST, is a very useful method among the various tracking methods in Cyberknife for patients with C1 and C2 vertebral metastasis.
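
To make the XST versus 6DST comparison easier to read, the reported maximum offsets can be placed side by side. The snippet below is a small sketch using only the values quoted in the abstract; the data layout and function names are assumptions for illustration, and the tolerance itself is not stated in the abstract, so none is applied.

```python
# Maximum offsets as reported: (Y, Z, X) in mm, then (roll, pitch, yaw) in degrees.
reported = {
    "C1": {"XST": ((0.9, 0.9, 0.7), (1.0, 1.0, 1.2)),
           "6DST": ((0.7, 0.7, 0.9), (1.0, 1.0, 1.2))},
    "C2": {"XST": ((0.7, 0.7, 0.8), (0.9, 1.0, 1.8)),
           "6DST": ((0.9, 0.7, 0.9), (0.9, 0.9, 1.0))},
}

def max_abs_difference(a, b):
    """Largest per-axis absolute difference, rounded to the reported precision."""
    return round(max(abs(x - y) for x, y in zip(a, b)), 1)

def compare(level):
    """Return (max translation gap in mm, max rotation gap in degrees)."""
    xst_t, xst_r = reported[level]["XST"]
    dst_t, dst_r = reported[level]["6DST"]
    return max_abs_difference(xst_t, dst_t), max_abs_difference(xst_r, dst_r)

if __name__ == "__main__":
    for level in reported:
        gap_mm, gap_deg = compare(level)
        print(level, "translation gap (mm):", gap_mm, "rotation gap (deg):", gap_deg)
```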

Gene Expression Profiles in Cervical Cancer with Radiation Therapy Alone and Chemo-radiation Therapy (자궁경부암의 방사선치료 및 방사선항암화학 병용치료에 따른 유전자발현 조절양상)

  • Lee Kyu Chan;Kim Meyoung-kon;Kim Jooyoung;Hwang You Jin;Choi Myung Sun;Kim Chul Yong
    • Radiation Oncology Journal / v.21 no.1 / pp.54-65 / 2003
  • Purpose: To analyze the gene expression profiles of uterine cervical cancer, and their variation after radiation therapy with or without concurrent chemotherapy, using a cDNA microarray. Materials and Methods: Sixteen patients, 8 with squamous cell carcinomas of the uterine cervix who were treated with radiation alone and the other 8 treated with concurrent chemo-radiation, were included in the study. Before the start of the treatment, tumor biopsies were carried out, and second biopsies were performed after a radiation dose of 16.2 to 27 Gy. Three normal cervix tissues were used as a control group. The microarray experiments were performed with 5 groups of total RNAs, extracted individually and then admixed: control, pre-radiation therapy alone, during-radiation therapy alone, pre-chemoradiation therapy, and during-chemoradiation therapy. The 33P-labeled cDNAs were synthesized from the total RNAs of each group by reverse transcription and then hybridized to the cDNA microarray membrane. The gene expression of each microarray was captured as the intensity of each spot produced by the radioactive isotopes. The pixels per spot were counted with an Arrayguage and exported to Microsoft Excel. The data were normalized by the Z transformation, and the comparisons were performed on the calculated Z-ratio values. Results: The expressions of 15 genes, including integrin linked kinase (ILK), CDC28 protein kinase 2, Spry 2, and ERK 3, were increased, with Z-ratio values of over 2.0, in the cervix cancer tissues compared to the normal controls. Those genes were involved in cell growth and proliferation, cell cycle control, or signal transduction. The expressions of another 6 genes, including G protein coupled receptor kinase 5, were decreased, with Z-ratio values of below -2.0. 
After the radiation therapy, most of the genes with previously increased expression showed decreased expression profiles, and the genes with Z-ratio values of over 2.0 were cyclic nucleotide gated channel and 3 expressed sequence tags (ESTs). In the concurrent chemo-radiation group, the genes involved in cell growth and proliferation, cell cycle control, and signal transduction showed increased expression compared to the radiation therapy alone group. The expressions of genes involved in angiogenesis (angiopoietin-2), immune reactions (formyl peptide receptor-like 1), and DNA repair (cAMP phosphodiesterase) were increased; however, the expression of a gene involved in apoptosis (death associated protein kinase) was decreased. Conclusion: The different kinds of genes involved in the development and progression of cervical cancer were identified with the cDNA microarray, and the proposed theory is that the proliferation signal starts with ILK and is amplified by Spry 2 and MAPK signaling, and that cellular mitoses are increased with the increased expression of Cdc 2 and cell division kinases. After the radiation therapy, the expression profiles demonstrated evidence of decreased cancer cell proliferation. There was no significant difference in the morphological findings of cell death between the radiation therapy alone and the chemo-radiation groups in the second biopsy specimens; however, the gene expression profiles were markedly different, and the mechanism at the molecular level needs further study.
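
The abstract does not give the formulas behind the Z transformation and the Z-ratio, so the following is a sketch under an assumed, commonly used definition: spot intensities are log10-transformed and standardized within each array, and the Z-ratio of a gene is the difference of its two Z scores divided by the standard deviation of all such differences. The intensity values and function names are hypothetical; only the 2.0 cutoff comes from the abstract.

```python
import math
import statistics

def z_transform(intensities):
    """Standardize log10 intensities within one array (assumed definition)."""
    logs = [math.log10(x) for x in intensities]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)
    return [(v - mean) / sd for v in logs]

def z_ratios(sample, reference):
    """Z-score difference per gene, scaled by the SD of all differences."""
    diffs = [a - b for a, b in zip(z_transform(sample), z_transform(reference))]
    sd = statistics.stdev(diffs)
    return [d / sd for d in diffs]

def flag_genes(ratios, cutoff=2.0):
    """Indices of genes whose |Z-ratio| exceeds the cutoff."""
    return [i for i, r in enumerate(ratios) if abs(r) > cutoff]

# Hypothetical spot intensities for ten genes; gene 3 is raised 20-fold in the sample.
reference = [100, 120, 150, 180, 220, 260, 310, 370, 440, 520]
sample = [100, 120, 150, 3600, 220, 260, 310, 370, 440, 520]

if __name__ == "__main__":
    ratios = z_ratios(sample, reference)
    print("flagged gene indices:", flag_genes(ratios))
```

Because each array is standardized separately, a single strongly changed gene also shifts the Z scores of the unchanged genes slightly, which is why the cutoff is applied to the scaled differences rather than to raw fold changes.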

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model, and apply it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies like Lorena and de Carvalho (2008), and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results from studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the biological evolution phenomenon. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. 
In particular, the mutation operator prevents GA from falling into local optima, so we can find the globally optimal or near-optimal solution using it. GA has been widely applied to search for optimal parameters or feature subset selections of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including the one-way ANOVA and the stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1(A1); 2(A2); 3(A3); 4(B and C). Eighty percent of the total data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented with several comparative models including MDA, MLOGIT, CBR, ANN and MSVM. In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that enables GA. The other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models. 
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting the corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than those of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
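
The McNemar test mentioned above compares two classifiers on the same validation cases by looking only at the cases where exactly one of them is correct. The sketch below assumes the continuity-corrected chi-square form of the test with one degree of freedom; the abstract does not say which variant the authors used, and the function name and toy predictions are hypothetical.

```python
import math

def mcnemar(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar statistic and p-value (1 degree of freedom)."""
    # b: cases model A gets right and model B gets wrong; c: the reverse.
    b = sum(1 for y, a, m in zip(y_true, pred_a, pred_b) if a == y and m != y)
    c = sum(1 for y, a, m in zip(y_true, pred_a, pred_b) if a != y and m == y)
    if b + c == 0:
        return 0.0, 1.0                     # the two models never disagree
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

if __name__ == "__main__":
    # Toy example: 20 cases, model A always correct, model B wrong on the first 10.
    y_true = [1] * 20
    pred_a = [1] * 20
    pred_b = [0] * 10 + [1] * 10
    stat, p = mcnemar(y_true, pred_a, pred_b)
    print("statistic:", round(stat, 3), "p-value:", round(p, 4))
```

Because the test uses paired predictions on identical cases, it suits comparisons such as GAMSVM against each baseline on the same validation folds.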