Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.121-139 / 2014
  • Predicting corporate failure has long been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is of great importance to financial institutions. Many researchers have addressed bankruptcy prediction over the past three decades. This study uses ensemble models to improve the performance of bankruptcy prediction. Ensemble classification combines individually trained classifiers to obtain more accurate predictions than individual models. Ensemble techniques have been shown to be very useful for improving the generalization ability of a classifier. Bagging is one of the most commonly used methods for constructing ensemble classifiers. In bagging, different training data subsets are randomly drawn with replacement from the original training dataset, and base classifiers are trained on the resulting bootstrap samples. Instance selection keeps critical instances while removing irrelevant and harmful instances from the original set. Instance selection and bagging are both well known in data mining. However, few studies have dealt with their integration. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) to improve the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. GA uses the idea of survival of the fittest by progressively accepting better solutions to the problem. GA searches by maintaining a population of solutions from which better solutions are created, rather than making incremental changes to a single solution. The initial solution population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover and mutation. The solutions, coded as strings, are evaluated by the fitness function.
The proposed model consists of two phases: GA-based instance selection and instance-based bagging. In the first phase, GA is used to select an optimal instance subset that serves as the input data of the bagging model. The chromosome is encoded as a binary string representing the instance subset. In this phase, the population size was set to 100 and the maximum number of generations to 150. The crossover rate and mutation rate were set to 0.7 and 0.1, respectively. The prediction accuracy of the model was used as the fitness function of the GA. An SVM model is trained on the training data set using the selected instance subset, and its prediction accuracy on the test data set is used as the fitness value in order to avoid overfitting. In the second phase, the optimal instance subset selected in the first phase is used as the input data of the bagging model, with SVM as the base classifier. Majority voting was used as the combining method. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data contain 1832 externally non-audited firms: 916 that filed for bankruptcy and 916 that did not. Financial ratios categorized as stability, profitability, growth, activity and cash flow were investigated through a literature review and basic statistical methods, and 8 financial ratios were selected as the final input variables. The whole data set was separated into three subsets: training, test and validation. The proposed model was compared with several comparative models, including the simple individual SVM model, the simple bagging model and the instance-selection-based SVM model. McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.
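
The two phases described above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the abstract gives only the binary encoding, the population size (100), the number of generations (150), and the crossover and mutation rates (0.7 and 0.1). Binary tournament selection, one-point crossover, per-bit mutation, elitism, and the function names `select_instances`, `bootstrap_sample` and `majority_vote` are all assumptions made for the example. The `fitness` callable stands in for "train an SVM on the masked training set and return its accuracy on the test set".

```python
import random
from collections import Counter


def select_instances(n_instances, fitness, pop_size=100, generations=150,
                     crossover_rate=0.7, mutation_rate=0.1, seed=0):
    """Evolve a binary mask over training instances (1 = keep, 0 = drop)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_instances)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Assumption: binary tournament selection (not stated in the abstract).
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_gen = [best[:]]  # Assumption: elitism keeps the best mask so far.
        while len(next_gen) < pop_size:
            p1, p2 = tournament(), tournament()
            child = p1[:]
            if n_instances > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_instances - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            # Assumption: the mutation rate is applied independently to each bit.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best


def bootstrap_sample(indices, rng):
    """Draw len(indices) items with replacement, as bagging does."""
    return [rng.choice(indices) for _ in indices]


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]
```

In the second phase, each base classifier would be trained on `bootstrap_sample(kept_indices, rng)`, where `kept_indices` are the positions set to 1 in the selected mask, and the per-classifier predictions would be combined with `majority_vote`.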

Evaluation of Energy Consumption in Heat Treatment of Pine Log (소나무 원목의 열처리 소요 에너지 평가)

  • Eom, Chang-Deuk;Park, Jun-Ho;Han, Yeon Jung;Shin, Sang-Chul;Chung, YoungJin;Jung, Chan-Sik;Yeo, Hwanmyeong
    • Journal of the Korean Wood Science and Technology / v.36 no.6 / pp.41-48 / 2008
  • This study evaluated the energy required for the heat treatment of pine logs. Proper heat treatment of pine logs infected by the pinewood nematode (Bursaphelenchus xylophilus) can prevent the infection from spreading and make the infected pinewood valuable again. The FAO (Food and Agriculture Organization of the United Nations) heat treatment standard for various types of infected wood, which requires heat treatment of the core of the wood, is 30 minutes at $56^{\circ}C$, in accordance with the International Standards for Phytosanitary Measures (ISPM No. 15). In this study, the energy consumption during heat treatment was separated into two kinds: the initial energy needed to heat the kiln dryer and reach the set-point temperature and relative humidity, and the energy needed to compensate for heat loss. The initial energy required per unit time is greater than that required during the treatment. The energy consumption per unit time varied little during the heat treatment. The results indicate that the set-point relative humidity at a given dry-bulb temperature and the density of the wood, which depends on moisture content, are very important factors affecting energy consumption in the experiment. Heat treatment at higher temperature and humidity levels consumes more energy but requires less treatment time. It is expected that a more effective energy program can be planned for the heat treatment of pine logs on the basis of this study.

A New Method of Registering the XML-based Clinical Document Architecture Supporting Pseudonymization in Clinical Document Registry Framework (익명화 방법을 적용한 임상진료문서 등록 기법 연구)

  • Kim, Il-Kwang;Lee, Jae-Young;Kim, Il-Kon;Kwak, Yun-Sik
    • Journal of KIISE:Software and Applications / v.34 no.10 / pp.918-928 / 2007
  • The goal of this paper is to propose a new way to register CDA documents in the CDR (Clinical Document Repository) proposed earlier by the authors. One method is to use manifest archiving for seamless reference and visualization of CDA-related files. The other is to raise the CDA security level by supporting pseudonymization of CDA. The former is useful for supporting the bundled registration of CDA-related files as a set. It can also provide a seamless presentation view to end users, once downloaded, without a separate HTTP connection for each file. The latter is a new method of CDA registration that can support de-identification of a patient. Usually, the CDA header contains patient identification information, and the CDA body contains diagnosis or treatment data. Detaching the two therefore offers clear advantages for privacy protection: even if someone obtains a separated CDA body, he or she cannot know whose clinical data it is. Conversely, even if someone obtains a separated CDA header, he or she does not know what kind of treatment has been given. Privacy is thus protected by breaking the association between related pieces of information, which reduces the possibility of leaking private information. To achieve this goal, the proposed method separates the CDA into two parts and stores them in different repositories.
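
The header/body separation described above might look something like the following sketch. It is purely illustrative: the abstract does not say how the two parts are linked or stored, so the random linking token, the wrapper element names `Header` and `Body`, the function name `split_cda` and the simplified sample document are all assumptions. Real CDA documents use the HL7 v3 namespace (`urn:hl7-org:v3`), which is omitted here for brevity.

```python
import uuid
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for a CDA document (hypothetical data).
SAMPLE = """<ClinicalDocument>
  <recordTarget><patient><name>Hong Gildong</name></patient></recordTarget>
  <component><structuredBody><text>Diagnosis: gastritis</text></structuredBody></component>
</ClinicalDocument>"""


def split_cda(xml_text):
    """Split a document into identifying header and clinical body parts.

    Returns (token, header_xml, body_xml). The token is a random value that
    carries no patient information; it is the only link between the parts.
    """
    root = ET.fromstring(xml_text)
    token = uuid.uuid4().hex  # Assumption: a random token links the two parts.
    header = ET.Element("Header", {"ref": token})
    body = ET.Element("Body", {"ref": token})
    for child in list(root):
        # Assumption: everything under <component> is clinical content,
        # everything else is header (identifying) content.
        target = body if child.tag == "component" else header
        target.append(child)
    return (token,
            ET.tostring(header, encoding="unicode"),
            ET.tostring(body, encoding="unicode"))
```

The two returned strings would then be written to different repositories, so that obtaining one of them alone reveals either who the patient is or what was done, but not both.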

Two-dimensional gel Electrophoresis of Helicobacter pylori for Proteomic Analysis

  • Jung, Tae-Sung;Kang, Seung-Chul;Choi, Yeo-Jeong;Jeon, Beong-Sam;Park, Jeong-Won;Jung, Sun-Ae;Song, Jae-Young;Choi, Sang-Haeng;Park, Seong-Gyu;Choe, Mi-Young;Lee, Byung-Sang;Byun, Eun-Young;Baik, Seung-Chul
    • The Journal of the Korean Society for Microbiology / v.35 no.2 / pp.97-108 / 2000
  • Two-dimensional gel electrophoresis (2-DE) is an essential proteomics tool for analysing the entire set of proteins of an organism and its variation between organisms. 2-DE was applied to Helicobacter pylori to identify differences between strains. As the first step, whole H. pylori cells were lysed using a lysis buffer containing a high concentration of urea [9.5 M urea, 4% CHAPS, 35 mM Tris, 65 mM DTT, 0.01% SDS and 0.5% Ampholite (Bio-Rad, pH 3-10)]. The extract ($10\;{\mu}g$) was loaded by rehydration onto commercially available immobilised pH gradient (IPG) strips, and the proteins were separated according to their charge in the first dimension. The IPG strips were then placed on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gels to separate the proteins according to molecular mass in the second dimension. The separated protein spots were visualised by silver staining in order to compare protein expression between strains. Approximately 120 spots were identified in each mini-protein electrophoresis gel, and about 65 to 75 spots were regarded as identical proteins between the strains used in terms of pI value and molecular weight. In addition, distinct differences were found between strains, such as 219-1, Y7 and Y14, CH150. Two representative strains were examined using strips with a pH range of 4 to 7. These strips revealed a number of isoforms that had appeared as large spots on the pH 3-10 range. Furthermore, the remaining spots on the pH 4-7 IPG strips appeared very distinct compared with those on the broad-range IPG strips. 2-DE seems to be an excellent tool for analysing and identifying variations between H. pylori strains.

Separation of Egg White Using HPLC with Change of Mobile Phase and Temperature (HPLC에서 이동상 변화와 온도에 따른 난백의 분리)

  • Do, Jin-Sun;Song, Shin-Young;Cho, Ki-Jung;Kim, In-Ho
    • Korean Chemical Engineering Research / v.49 no.6 / pp.829-834 / 2011
  • Lysozyme in egg white functions as a bacteriolytic agent, and ovalbumin acts as an antigen in the immune system. Egg white analysis methods usually include electrophoresis, gel permeation chromatography and reversed-phase HPLC (RP-HPLC). Among them, RP-HPLC was selected for rapid analysis, and a C18 column (Agilent, USA) was used as the HPLC column. Optimum conditions were sought by changing the mobile phase and temperature. The capacity factor and resolution were calculated and compared for the various elution conditions. In the isocratic elution, the mobile phase volume ratio (acetonitrile (ACN)/distilled water (DW)/trifluoroacetic acid (TFA)) was changed from 30/70/0.1 to 60/40/0.1, increasing the ACN composition in steps of 10%, and the temperature was set to $20^{\circ}C$. In the gradient elution, the ACN/DW ratio was changed from 10/90 to 60/40 over 20 minutes, and the temperature was varied among 20, 30 and $40^{\circ}C$. In the isocratic elution, three peaks were separated at 50/50/0.1, and lysozyme and ovalbumin were confirmed as the first and third of the three peaks, respectively. In the gradient elution, four peaks were separated at $30^{\circ}C$, and lysozyme and ovalbumin were confirmed as the first and third of the four peaks, respectively.
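
The capacity factor and resolution mentioned above are not defined in the abstract; the sketch below uses the standard textbook definitions, k' = (t_R - t_0) / t_0 and R_s = 2 (t_R2 - t_R1) / (w_1 + w_2) with baseline peak widths, on the assumption that the study used them. The function names and any numbers passed to them are hypothetical, not values from the paper.

```python
def capacity_factor(t_r, t_0):
    """Capacity (retention) factor k' = (t_R - t_0) / t_0.

    t_r: retention time of the analyte peak.
    t_0: dead time (retention time of an unretained compound), same units.
    """
    if t_0 <= 0:
        raise ValueError("dead time t_0 must be positive")
    return (t_r - t_0) / t_0


def resolution(t_r1, t_r2, w_1, w_2):
    """Resolution R_s = 2 (t_R2 - t_R1) / (w_1 + w_2) for two adjacent peaks.

    t_r1, t_r2: retention times of the earlier and later peak.
    w_1, w_2: baseline peak widths, in the same time units.
    """
    if w_1 + w_2 <= 0:
        raise ValueError("peak widths must sum to a positive value")
    return 2.0 * (t_r2 - t_r1) / (w_1 + w_2)
```

Comparing these two quantities across mobile phase ratios and temperatures is one way to rank elution conditions: a resolution of about 1.5 or more is commonly taken to indicate baseline separation of two peaks.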

Research on Changing of Renal Relative Uptake Depending on the Setting of Background ROI in Kidney MAG3 Study of Hydronephrosis Patients (Hydronephrosis 환자의 Kidney MAG3 검사 시 Background ROI 설정에 따른 신장 상대 섭취율 변화에 관한 연구)

  • Noh, Ik Sang;Ahn, Byung Ho;Kim, Soo Yung;Choi, Sung Wook
    • The Korean Journal of Nuclear Medicine Technology / v.17 no.2 / pp.25-30 / 2013
  • Purpose: Renal relative uptake is very important for evaluating kidney function, and it is affected by the settings of the kidney region of interest (ROI) and the background ROI. In patients with hydronephrosis in particular, the size, position and shape of the kidney can be difficult to identify with the naked eye, so the results differ considerably depending on the ROI set by the user. This study holds the kidney ROI constant and analyzes how changes in the background ROI affect renal relative uptake. Materials and Methods: We analyzed 27 patients with hydronephrosis who underwent a MAG3 study in the nuclear medicine department of Samsung Medical Center from January 2012 to February 2013. Data were acquired after intravenous injection of 185 MBq (5 mCi) of $^{99m}Tc-MAG3$. While reconstructing the patients' images, we changed the background ROI during ROI setup. First, in renal processing, the automatically set background ROI was compared with the manually set background ROI. Second, the background ROI was placed above, lateral to and below the kidney. Third, background setting times of 1-2 min and 2-3 min were compared. Results: In the automatic versus manual ROI comparison, the relative uptake differed by 3.7 percentage points on average. In the comparison of background ROI positions, the lower position gave the more accurate results: the values for the above, lateral and lower positions were 74.6%, 67.6% and 62.0%, respectively, against a standard value of 59.9%. Finally, the split function range test showed no significant difference. Conclusion: The study shows that the relative uptake of the kidney is affected by the background ROI. Therefore, the background ROI should be set with various dependent factors in mind.
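
To see why the background ROI matters, the sketch below shows one common way relative uptake is computed from ROI counts: the background counts are scaled to the kidney ROI area and subtracted before the left/right split is taken. The abstract does not give the formula used by the processing software, so this area-normalized subtraction, the function names and all numbers are assumptions for illustration only.

```python
def background_corrected_counts(kidney_counts, kidney_pixels,
                                background_counts, background_pixels):
    """Subtract area-scaled background counts from the kidney ROI counts."""
    if background_pixels <= 0:
        raise ValueError("background ROI must contain at least one pixel")
    background_per_pixel = background_counts / background_pixels
    return kidney_counts - background_per_pixel * kidney_pixels


def relative_uptake(left_net, right_net):
    """Return (left %, right %) of the total background-corrected counts."""
    total = left_net + right_net
    if total <= 0:
        raise ValueError("total corrected counts must be positive")
    left_pct = 100.0 * left_net / total
    return left_pct, 100.0 - left_pct
```

Because the background term is multiplied by the kidney ROI area, a background ROI placed over a region with higher or lower activity shifts the corrected counts of that kidney, and therefore the relative uptake, even when the kidney ROI itself is unchanged.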

A Study on Traffic Line Efficiency of Health Examination Centers Based on Space Syntax - Focused on the Spatial Cognition of the Testee Taking the National Examination Program (공간구문론에 기초한 건강검진센터 동선효율성 분석 연구 - 국가검진프로그램에 대한 수검자의 공간인지를 중심으로)

  • Song, Seungeon;Kim, Suktae
    • Journal of The Korea Institute of Healthcare Architecture / v.18 no.4 / pp.53-65 / 2012
  • Purpose: With growing national interest in health, the number of health examination centers is increasing rapidly, and they are developing into independent medical institutes separate from hospitals. As the functions and size of health examination institutes have grown, consideration for testees, who are the most important users of health examination centers, has taken a back seat. In particular, for health examination programs that follow a sequential traffic line, it is important that testees be aware of the location of each examination room, but the lack of a scientific evaluation method has resulted in great discomfort for testees using health examination centers. Method: This study therefore proposes risk evaluation indices (RCF TCF, RC3, RR, ARR) and sets a standard health examination program based on the national health examination program. These were applied to 11 health examination centers of different sizes to find their features, and, by identifying the trends of the indices, the following results were deduced. Result: 1) ARR showed a wide-range feature as the number of unit spaces increased, whereas RR was found regardless of size, thus displaying local features. 2) The increase in ARR is affected more by factors inside the health examination center than by outside factors. 3) When the basic health examination fields were separated by gender, the connective relation of the comprehensive health examination fields had a large effect on ARR. 4) As centers become larger, the functional fields become independent and the resulting waiting spaces increase the total number of movements, so there is room for improvement here.

Sensory Profiling of Commercial Korean Distilled Soju (시판 증류식 소주의 관능특성 분석)

  • Lee, Seung-Joo;Park, Cheon-Soo;Kim, Ho-Kyung
    • Korean Journal of Food Science and Technology / v.44 no.5 / pp.648-652 / 2012
  • The sensory characteristics of nine commercially distilled soju samples were determined by sensory descriptive analysis. Eight aroma attributes, as well as four flavor/taste attributes, and six mouth-feel related attributes were evaluated by 9 judges. The descriptive data set was initially analyzed for a significant overall product effect by employing a three-way mixed model analysis of variance (judges, samples, and replications) as well as two-way interactions, with judges treated as random. In addition, correlations between mean attribute ratings were calculated, and a principal component analysis (PCA) of the mean attribute ratings employing the covariance matrix was conducted. Based on the PCA, distilled soju samples were primarily separated along the first principal component, which accounted for 66% of the total variance between the samples, with high intensities of 'alcohol taste' and 'alcohol aroma' versus 'yeast aroma'. The second principal component accounted for 14% of the total variance. Soju containing high alcohol showed stronger intensities of 'bitterness', 'alcohol taste', 'alcohol aroma', as well as all mouth-feel attributes.

Contribution analysis of Hanwoo carcass traits on unit price in national slaughter house

  • Eum, Seung-Hoon;Park, Hu-Rak;Seo, Jakyeom;Cho, Seong-Keun;Kim, Byeong-Woo
    • Korean Journal of Agricultural Science / v.43 no.4 / pp.603-611 / 2016
  • The aim of this study was to analyze the contribution of factors (backfat thickness, eye muscle area, carcass weight, marbling score, and feeding period) affecting meat unit price (South Korean won per kg of meat). The best slaughtering age to maximize unit price was also assumed. All data used in this study were acquired from the Korea Institute for Animal Products Quality Evaluation from 2010 to 2014. Contributions to the estimated unit price of cows by backfat thickness, eye muscle area, carcass weight, feeding period, and marbling score were 2.65%, 0.04%, 1.58%, 1.58%, and 95.72%, respectively. Contributions to the estimated unit price of steers by the same factors (backfat thickness, eye muscle area, carcass weight, feeding period, and marbling score) were 7.88%, 1.24%, 0.07%, 90.81%, and 95.72%, respectively. Slaughtering ages ranged from 26 to 36 months, and the data were grouped by month over this 11-month period. The unit price of meat from Hanwoo slaughtered at 30 months was the highest among the groups. The lowest unit price was observed in the group slaughtered at 36 months. In conclusion, of all contributing factors, marbling score affected unit price the most. Based on our results, it is recommended that the optimal slaughtering age be set at 30 months to maximize unit price. Moreover, feeding beef cattle past 30 months of age is not recommended because of the increase in feeding costs.

Quantitative Morphology of High-Redshift Galaxies Using GALEX Ultraviolet Images of Nearby Galaxies

  • Yeom, Bum-Suk;Rey, Soo-Chang;Kim, Youngkwang;Lee, Youngdae;Chung, Jiwon;Kim, Suk;Lee, Woong
    • Journal of Astronomy and Space Sciences / v.34 no.3 / pp.183-197 / 2017
  • We present simulations of the optical-band images of high-redshift galaxies utilizing 845 near-ultraviolet (NUV) images of nearby galaxies obtained through the Galaxy Evolution Explorer (GALEX). We compute the concentration (C), asymmetry (A), Gini (G), and $M_{20}$ parameters of the GALEX NUV/Sloan Digital Sky Survey r-band images at z ~ 0 and their artificially redshifted optical images at z = 0.9 and 1.6 in order to quantify the morphology of galaxies at local and high redshifts. The morphological properties of nearby galaxies in the NUV are presented using a combination of morphological parameters, in which early-type galaxies are well separated from late-type galaxies in the $G-M_{20}$, $C-M_{20}$, A-C, and $A-M_{20}$ planes. Based on the distribution of galaxies in the A-C and $G-M_{20}$ planes, we examine the morphological K-correction (i.e., cosmological distance effect and bandshift effect). The cosmological distance effect on the quantitative morphological parameters is found to be significant for early-type galaxies, while late-type galaxies are more greatly affected by the bandshift effect. Knowledge of the morphological K-correction will set the foundation for forthcoming studies on understanding the quantitative assessment of galaxy evolution.
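
Two of the morphological parameters named above can be sketched in a few lines. The abstract does not state the definitions used, so this example assumes the forms commonly used in the literature: the Gini coefficient of the sorted absolute pixel fluxes, G = Σ (2i - n - 1)|X_i| / (mean|X| · n(n - 1)), and the concentration index C = 5 log10(r80 / r20), where r80 and r20 are the radii enclosing 80% and 20% of the total flux. The function names and inputs are hypothetical.

```python
import math


def gini(pixel_values):
    """Gini coefficient of a galaxy's pixel fluxes (0 = uniform, 1 = one pixel)."""
    values = sorted(abs(v) for v in pixel_values)  # ascending |X_i|
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("all pixel values are zero")
    # i runs from 1 to n over the sorted values.
    total = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return total / (mean * n * (n - 1))


def concentration(r80, r20):
    """Concentration index C = 5 * log10(r80 / r20)."""
    if r80 <= 0 or r20 <= 0:
        raise ValueError("radii must be positive")
    return 5.0 * math.log10(r80 / r20)
```

A perfectly uniform light distribution gives G = 0, while a galaxy whose flux sits in a single pixel gives G = 1; this is why the $G-M_{20}$ plane can separate concentrated early-type galaxies from more diffuse late-type ones.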