• Title/Summary/Keyword: 이상치 판별 (outlier discrimination)

Search Result 62, Processing Time 0.025 seconds

A Hybrid Under-sampling Approach for Better Bankruptcy Prediction (부도예측 개선을 위한 하이브리드 언더샘플링 접근법)

  • Kim, Taehoon;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.173-190 / 2015
  • The purpose of this study is to improve bankruptcy prediction models by using a novel hybrid under-sampling approach. Most prior studies have tried to enhance the accuracy of bankruptcy prediction models by improving the classification methods involved. In contrast, we focus on appropriate data preprocessing as a means of enhancing accuracy. In particular, we aim to develop an effective sampling approach for bankruptcy prediction, since most prediction models suffer from class imbalance problems. The approach proposed in this study is a hybrid under-sampling method that combines the k-Reverse Nearest Neighbor (k-RNN) and one-class support vector machine (OCSVM) approaches. k-RNN can effectively eliminate outliers, while OCSVM contributes to the selection of informative training samples from majority class data. To validate our proposed approach, we have applied it to data from H Bank's non-external auditing companies in Korea, and compared the performances of the classifiers with the proposed under-sampling and random sampling data. The empirical results show that the proposed under-sampling approach generally improves the accuracy of classifiers, such as logistic regression, discriminant analysis, decision tree, and support vector machines. They also show that the proposed under-sampling approach reduces the risk of false negative errors, which lead to higher misclassification costs.
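The outlier-elimination role the abstract assigns to k-RNN can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a majority-class sample counts as an outlier when too few other samples list it among their k nearest neighbors. The function names, the `min_count` cutoff, and the toy points are all hypothetical, and the OCSVM selection step is not shown.

```python
from math import dist


def reverse_neighbor_counts(points, k):
    """Count, for each point, how many other points have it among their k nearest neighbors."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Indices of all other points, sorted by distance from point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(points[i], points[j]),
        )
        for j in others[:k]:
            counts[j] += 1
    return counts


def drop_krnn_outliers(points, k=2, min_count=1):
    """Keep only points that appear in at least min_count neighbor lists (assumed rule)."""
    counts = reverse_neighbor_counts(points, k)
    return [p for p, c in zip(points, counts) if c >= min_count]


# Hypothetical majority-class samples: four clustered points and one isolated point.
majority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
kept = drop_krnn_outliers(majority, k=2, min_count=1)
print(kept)
```

In this toy data the isolated point is nobody's near neighbor, so it is the only sample dropped before any further under-sampling would take place.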

Genesis of the acidic metavolcanic rocks distributed around the Chungju iron deposit in the Gyemyeongsan Formation (계명산층 내의 충주 철광상 주변에 분포하는 산성 변성화산암의 성인)

  • Park Maeng-Eon;Kim Gun-Soo;Park Kye-Hun
    • The Journal of the Petrological Society of Korea / v.14 no.3 s.41 / pp.169-179 / 2005
  • Acidic metavolcanic rocks distributed around the Chungju iron deposit show markedly high abundances of rare earth elements and high field strength elements. Relatively high ${\epsilon}_{Nd}(0)$ values and the lack of a negative Nb anomaly suggest that assimilation of crustal material was not involved in their generation. They plot in the within-plate field on tectonic discrimination diagrams. These geochemical characteristics are very similar to those of the acidic metavolcanic rocks of the Munjuri Formation. They also show the geochemical characteristics of the A1-type magma of Eby (1992). All of these diagnostic characteristics indicate differentiation of mantle-derived magma produced in a rift environment related to continental breakup. In contrast to the alkali granites and the rare metal deposit, both dated at c. 330 Ma, the Sm-Nd isotopic data of the acidic metavolcanic rocks do not form a well-defined isochron. However, the alkali granites have low ${\epsilon}_{Nd}(0)$ values, while the acidic metavolcanic rocks and the rare metal deposit both have significantly higher ${\epsilon}_{Nd}(0)$ values. Considering these differences, we propose the following hypothesis for their generation: the acidic metavolcanic rocks around the Chungju iron deposit were erupted at 750 Ma, like the rest of the acidic metavolcanic rocks of the Gyemyeongsan and Munjuri Formations. About 330 Ma ago, partial melting of existing A1-type igneous materials and some old crustal materials produced the alkali granite. The rare metal deposit was also produced at that time, by redistribution of related materials within the acidic volcanics through hydrothermal activity. The Sm-Nd isotopic systematics of the acidic metavolcanic rocks were disturbed during the regional metamorphic event at ca. 280 Ma.
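The abstract does not define the $\epsilon_{Nd}(0)$ values it relies on. For orientation, the conventional definition is given below. The chondritic uniform reservoir (CHUR) reference ratio is an assumption here, since the abstract does not state which value the authors used (0.512638 is a commonly adopted present-day value).

$$\epsilon_{Nd}(0) = \left[ \frac{\left(^{143}Nd/^{144}Nd\right)_{sample}}{\left(^{143}Nd/^{144}Nd\right)_{CHUR}} - 1 \right] \times 10^{4}$$

The "(0)" indicates that both ratios are present-day values, with no age correction. Positive values indicate a source that is more radiogenic than CHUR.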

Use of Real-Time PCR and Internal Standard Addition Method for Identifying Mixed Ratio of Chicken Meat in Sausages (Real-Time PCR과 Internal Standard Addition법을 이용한 돼지고기 소시지에 혼합된 닭고기의 정량)

  • Lee, Namrye;Joo, Jae-Young;Yeo, Yong-Heon
    • Journal of the Korean Society of Food Science and Nutrition / v.46 no.9 / pp.1097-1105 / 2017
  • This study examined how much chicken meat was present in sausage made with pork, using both real-time polymerase chain reaction (PCR) and internal standard addition. Fifty ng of chicken DNA was added to the sausages as an internal standard. The addition of standard DNA increased the amplification efficiency of PCR and confirmed the possibility of quantitative analysis. A QIAamp DNA Micro Kit was used to improve DNA recovery and amplification efficiency. Suitable amounts of template DNA and primer were $3.0{\sim}5.0{\mu}L$ and $0.5{\mu}L$, respectively. Pig and chicken DNA were each serially diluted in 10-fold steps from 50 ng to 0.05 ng. The detection limit for both pig and chicken meat was more than 0.05 ng, and the correlation coefficient of the standard curve was at least 0.98. Quantitative analysis after heat treatment of 3 samples of pig and chicken mixed at 70:30 showed a 5.7% difference (64.3:35.7) between the expected and measured values. The heat treatment ($70^{\circ}C$, 10 min) affected the DNA and thereby changed the quantitative value. An analysis of the pork and chicken content in sausages showed that chicken meat was difficult to detect and that the quantitative value of DNA derived from the Ct value was very low. On the other hand, when standard material (50 ng of chicken DNA) was added to the sausages, the Ct value decreased gradually with increasing chicken mixing ratio. Thus, the mixing ratio of chicken in sausages could be estimated.
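The standard curve mentioned above relates the Ct value to the logarithm of the DNA amount. The sketch below shows how such a curve is typically fitted and inverted. Only the dilution steps (50 ng to 0.05 ng) come from the abstract. The Ct values are synthetic, generated from an assumed ideal slope and intercept rather than measured, and all names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# 10-fold dilution series from the abstract.
amounts_ng = [50.0, 5.0, 0.5, 0.05]

# ASSUMPTION: synthetic Ct values on an ideal curve (template doubles every cycle).
ASSUMED_SLOPE = -1.0 / math.log10(2.0)   # about -3.32 cycles per 10-fold dilution
ASSUMED_INTERCEPT = 30.0                 # hypothetical Ct at 1 ng

log_amounts = [math.log10(a) for a in amounts_ng]
ct_values = [ASSUMED_INTERCEPT + ASSUMED_SLOPE * x for x in log_amounts]

slope, intercept = fit_line(log_amounts, ct_values)

# Amplification efficiency implied by the slope (1.0 means perfect doubling).
efficiency = 10 ** (-1.0 / slope) - 1.0


def amount_from_ct(ct):
    """Invert the standard curve: estimate the DNA amount (ng) from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


print(round(slope, 3), round(efficiency, 3))
```

With real measurements the fitted points scatter around the line, which is why the abstract reports a correlation coefficient for the curve.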

A Review of Statistical Methods in the Korean Journal of Orthodontics and the American Journal of Orthodontics and Dentofacial Orthopedics (대한치과교정학회지(KJO)와 미국교정학회지(AJODO)에서 사용된 통계기법의 비교분석 및 고찰(1999-2003))

  • Lim, Hoi-Jeong
    • The Korean Journal of Orthodontics / v.34 no.5 s.106 / pp.371-379 / 2004
  • The purpose of this study was to investigate the changes in and types of statistical methods used in the Korean Journal of Orthodontics (KJO) and the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO) from 1999 to 2003. The frequency of use, transitions, and assumption checks of statistical methods, as well as the types of advanced statistical methods, were examined for each journal. The study consisted of 247 articles published in the KJO and 50 randomly chosen articles per year, all of which were original articles that used statistical methods. In the KJO, the most frequently used statistical methods were, in order, the t-test, analysis of variance (ANOVA), correlation analysis, nonparametric analysis, regression analysis, the chi-square test, and factor analysis, while in the AJODO the order was the t-test, ANOVA, nonparametric analysis, correlation analysis, regression analysis, the chi-square test, and factor analysis. The changes in statistical methods observed in the KJO were not significant $(\chi^2=17.4,\;p=0.5881)$, but the changes observed in the AJODO were significant $(\chi^2=42.4,\;p=0.0397)$. Some of the studies examined had overlooked the assumptions of the statistical methods employed. Data screening, such as outlier detection, should be performed before analysis, and alternative statistical approaches should be applied for small sample sizes. The advanced statistical methods were factor analysis and discriminant analysis in the KJO, and Intention-To-Treat (ITT) analysis in multi-center clinical trials, survival analysis, and Generalized Estimating Equations (GEE) in the AJODO. Appropriate analysis approaches and interpretations should be applied to the correlated and repeated measurements of orthodontic data sets.

Forecasting the Precipitation of the Next Day Using Deep Learning (딥러닝 기법을 이용한 내일강수 예측)

  • Ha, Ji-Hun;Lee, Yong Hee;Kim, Yong-Hyuk
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.2 / pp.93-98 / 2016
  • For accurate precipitation forecasts, the choice of weather factors and prediction method is very important. Recently, machine learning has been widely used for forecasting precipitation, and artificial neural networks, one family of machine learning techniques, have shown good performance. In this paper, we propose a new method for forecasting precipitation using a deep belief network (DBN), a deep learning technique. A DBN has the advantage that its initial weights are set by unsupervised learning, which compensates for a weakness of conventional artificial neural networks. We used past precipitation, temperature, and the parameters of the sun's and moon's motion as features for forecasting precipitation. The dataset consists of observation data measured over 40 years by the AWS in Seoul. Experiments were based on 8-fold cross validation. The model outputs precipitation probabilities for the test dataset, so a threshold was used to decide whether precipitation would occur. CSI and Bias were used to measure forecast accuracy. Our experimental results showed that the DBN performed better than an MLP.
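The thresholding step and the two scores can be sketched as below. The code assumes the standard forecast-verification definitions, CSI = hits / (hits + misses + false alarms) and Bias = (hits + false alarms) / (hits + misses). The abstract does not spell these out, and the probabilities, observations, and threshold shown are hypothetical.

```python
def contingency(probs, observed, threshold):
    """Turn probabilities into yes/no forecasts and count hits, misses, false alarms."""
    hits = misses = false_alarms = 0
    for p, obs in zip(probs, observed):
        forecast = p >= threshold
        if forecast and obs:
            hits += 1
        elif obs:
            misses += 1
        elif forecast:
            false_alarms += 1
    return hits, misses, false_alarms


def csi(hits, misses, false_alarms):
    """Critical success index; assumes at least one forecast or observed event."""
    return hits / (hits + misses + false_alarms)


def bias(hits, misses, false_alarms):
    """Frequency bias; assumes at least one observed event."""
    return (hits + false_alarms) / (hits + misses)


# Hypothetical model outputs and observations (True = precipitation occurred).
probs = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1]
observed = [True, False, True, False, True, False]

h, m, f = contingency(probs, observed, threshold=0.5)
print(csi(h, m, f), bias(h, m, f))
```

A Bias above 1 means precipitation is forecast more often than it is observed, so the threshold trades Bias against CSI.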

A STUDY OF THE DEVELOPMENT AND STANDARDIZATION OF ADHD DIAGNOSTIC SYSTEM (전산화된 주의력장애 진단시스템의 개발 및 표준화 연구)

  • Cho, Sung-Zoon;Chun, Sun-Young;Hong, Kang-E;Shin, Min-Sup
    • Journal of the Korean Academy of Child and Adolescent Psychiatry / v.11 no.1 / pp.91-99 / 2000
  • Objectives: The present study developed the computerized ADHD Diagnostic System (ADS) to diagnose ADHD and evaluate its treatment effects, and conducted a standardization study for the ADS. Methods: The normative group was composed of 847 children and adolescents between the ages of 5 and 15 (429 boys, 418 girls) living in Seoul, Kyunggi-do, and Kangwon-do. Thirty children with ADHD, aged 7 to 9 years, participated in the study to evaluate the validity of the ADS. To establish the norms for diagnosing ADHD, the means and standard deviations of the normative group were used to calculate T-scores for each age group. Results: The reliability coefficient of the ADS (Cronbach's ${\alpha}$) was .85. There were significant differences between the normal and ADHD groups in all ADS measures except commission error. Three factors were extracted through factor analysis of the ADS and were labelled 'inattention', 'slow information processing' and 'impulsivity'. Discriminant analysis showed that the ADS significantly discriminated between the normal and ADHD groups. The percentage of correct classification by the ADS variables was 96.7%. Conclusion: Taken together, these results strongly support the reliability and validity of the ADS as a diagnostic instrument for ADHD.
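The norm-based scoring can be illustrated with the conventional T-score transformation (mean 50, standard deviation 10). The abstract is assumed to use this standard definition. The raw score and the normative mean and standard deviation below are hypothetical, not values from the study.

```python
def t_score(raw, norm_mean, norm_sd):
    """Conventional T-score: 50 plus 10 times the z-score within the age-group norm."""
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


# Hypothetical example: a raw score one standard deviation above the age-group mean.
example = t_score(raw=12.0, norm_mean=8.0, norm_sd=4.0)
print(example)
```

Computing the score separately for each age group, as the abstract describes, makes scores comparable across ages because each child is compared only with same-age norms.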


Assessment of Hydroureteronephrosis in Children Using Diuretic Radionuclide Ureterography (동위원소 이뇨 요관그람을 이용한 소아 요관폐쇄의 평가)

  • Kim, Jong-Ho;Lee, Dong-Soo;Kwark, Cheol-Eun;Lee, Kyung-Han;Choi, Chang-Woon;Chung, June-Key;Lee, Myung-Chul;Koh, Chang-Soon;Choi, Yong;Choi, Hwang
    • The Korean Journal of Nuclear Medicine / v.28 no.1 / pp.75-84 / 1994
  • The need to assess ureteric function in patients with an obviously dilated ureter has increased, particularly with the growing number of asymptomatic patients presenting with hydronephrosis and hydroureter on antenatal and perinatal ultrasound. To assess the influence of ureteral status on kidney washout during $^{99m}Tc$-DTPA diuretic renography, ureteral images were reviewed in 80 children referred for hydronephrosis. A scintigraphically abnormal ureter was defined as an intense and continuous image lasting > 10 min during diuretic renography. Among them, a total of 16 nephroureteral systems in 12 children with a scintigraphically abnormal ureter were analyzed. A diuretic washout index, the response half-time (t1/2) obtained by linear fitting after Lasix injection, was determined on the renal (Kt1/2) and ureteral (Ut1/2) curves (diuretic renogram vs. diuretic ureterogram). Diuretic ureterogram curve patterns corresponding to normal (type I), obstructive (type II) and non-obstructive (type III) cases were described. Compared with X-ray data, diuretic renography was highly sensitive (88%) and specific (99%) for detecting any ureteral abnormality. Despite an obstructive Kt1/2 (>20 min), no patient with an abnormal ureter underwent therapy at the ureteropelvic junction, because the hydronephrosis regressed after surgery at the lower level. Our data indicate that abnormal ureter findings during diuretic renography have to be recognized before therapy in children with hydronephrosis.
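One possible reading of "half-time by linear fitting" is sketched below. It fits a straight line to the counts recorded after injection and reports the time at which the fitted line falls to half its value at the moment of injection. This interpretation is an assumption, since the abstract does not give the fitting details, and the count data are synthetic.

```python
def fit_line(ts, ys):
    """Ordinary least-squares fit y = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sty / stt
    return slope, mean_y - slope * mean_t


def half_time_linear(ts, counts):
    """Minutes after injection (t = 0) until the fitted line halves; None if no washout."""
    slope, intercept = fit_line(ts, counts)
    if slope >= 0:
        return None  # activity is not decreasing, so no half-time exists
    # Solve intercept + slope * t = intercept / 2 for t.
    return -intercept / (2.0 * slope)


# Synthetic curve: counts fall by 25 per minute from 1000 at injection.
minutes = list(range(0, 11))
counts = [1000.0 - 25.0 * t for t in minutes]
t_half = half_time_linear(minutes, counts)
print(t_half)
```

The same routine would be applied once to the renal curve and once to the ureteral curve to obtain Kt1/2 and Ut1/2.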


Studies on Development of Prediction Model of Landslide Hazard and Its Utilization (산지사면(山地斜面)의 붕괴위험도(崩壞危險度) 예측(豫測)모델의 개발(開發) 및 실용화(實用化) 방안(方案))

  • Ma, Ho-Seop
    • Journal of Korean Society of Forest Science / v.83 no.2 / pp.175-190 / 1994
  • To obtain fundamental information for predicting landslide hazard, both forest and site factors affecting slope stability were investigated in many areas of active landslides. Twelve descriptors were identified and quantified to develop the prediction model by multivariate statistical analysis. The main results are summarized as follows. The main factors influencing large-scale landslides were, in order, precipitation, age group of forest trees, altitude, soil texture, slope gradient, position of slope, vegetation, stream order, vertical slope, bed rock, soil depth and aspect. According to the partial correlation coefficients, the order was age group of forest trees, precipitation, soil texture, bed rock, slope gradient, position of slope, altitude, vertical slope, stream order, vegetation, soil depth and aspect. The main factors influencing landslide occurrence were, in order, age group of forest trees, altitude, soil texture, slope gradient, precipitation, vertical slope, stream order, bed rock and soil depth. Two prediction models were developed, one for the magnitude and one for the frequency of landslides. In particular, the scores of the magnitude-based prediction method were converted for convenience of use. If the total score of the various factors exceeds 9.1636, the area is evaluated as very dangerous. The mean scores of the landslide and non-landslide groups were 0.1977 and -0.1977, and the variances were 0.1100 and 0.1250, respectively. The boundary value between the two groups with respect to slope stability was -0.02, and the predicted rate of discrimination was 73%. In the score ranges for the degree of landslide hazard based on the discrimination boundary value, class A was 0.3132 and over, class B was 0.3132 to -0.1050, class C was -0.1050 to -0.4196, and class D was -0.4195 and below. The landslide hazard could thus be ranked into classes A, B, C and D by the boundary values. By number of slopes, class A had 68, class B 115, class C 65, and class D 52. The rate of landslide occurrence in classes A and B was predicted at a high 83%. Therefore, dangerous areas selected by the landslide prediction method could be mapped for land-use planning and used as a criterion for designating disaster districts. The method could also be applied as an administrative index for disaster prevention.
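The class boundaries and the discrimination boundary reported above can be written as a small lookup. The thresholds are taken from the abstract. Three things are assumptions: how values exactly on a boundary are treated, the use of -0.4196 as the C/D cut (the abstract gives both -0.4196 and -0.4195), and the function names.

```python
def hazard_class(score):
    """Map a discriminant score to hazard class A-D using the abstract's boundaries."""
    if score >= 0.3132:
        return "A"
    if score >= -0.1050:
        return "B"
    if score >= -0.4196:  # assumed C/D cut; the source also lists -0.4195
        return "C"
    return "D"


def predicted_landslide(score, boundary=-0.02):
    """Scores above the boundary fall on the landslide side (group mean 0.1977)."""
    return score > boundary


# The two reported group means land in different classes and on opposite sides.
print(hazard_class(0.1977), predicted_landslide(0.1977))
print(hazard_class(-0.1977), predicted_landslide(-0.1977))
```

Note that these boundaries apply to the discriminant score, not to the converted total score for which the abstract gives the 9.1636 cutoff.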


Differences in Grip Strength by Living Conditions and Living Area among Men and Women in Middle and Later Life (독거여부와 거주지역에 따른 중년기와 노년기 남성과 여성의 악력 차이)

  • Joo, Susanna;Jun, Hey Jung;Park, Hayoung
    • 한국노년학 (Journal of the Korea Gerontological Society) / v.38 no.3 / pp.551-567 / 2018
  • Demographic and socio-structural information is useful for identifying potential welfare recipients who need disease-prevention and intervention services. Thus, the present study explores differences in grip strength among middle-aged and older adults by living conditions and living area. The 5th wave data of the Korean Longitudinal Study of Aging were used. The dependent variable was grip strength, and the independent variables were living conditions (living alone or not) and living area (city or non-city). Covariates were age, education, log-transformed household income, presence of a spouse, body mass index, self-rated health, depressive symptoms, cognitive function, smoking, regular exercise, frequency of meeting with friends, and number of social participation activities. Regression analysis was performed separately for middle-aged men, middle-aged women, older men, and older women. ANOVA and the chi-square test were additionally used to examine the significant results in detail. Cross-sectional weights were applied to all analyses. According to the results, living alone and living area did not have significant effects on grip strength among middle-aged men, older men, or older women. Among middle-aged women, however, living alone and living area were significantly associated with grip strength. Specifically, middle-aged women who lived alone in rural areas had the lowest grip strength among middle-aged women. Additional analysis showed that middle-aged women who lived alone in rural areas had risk factors such as low education level, low income, or high depressive symptoms. This implies that middle-aged women living alone in rural areas may face physical health risks and may therefore need disease-prevention services. This study is meaningful in that it provides reliable information on latent welfare recipients by using representative panel data and applying weights.

Relationships on Magnitude and Frequency of Freshwater Discharge and Rainfall in the Altered Yeongsan Estuary (영산강 하구의 방류와 강우의 규모 및 빈도 상관성 분석)

  • Rhew, Ho-Sang;Lee, Guan-Hong
    • The Sea: Journal of the Korean Society of Oceanography / v.16 no.4 / pp.223-237 / 2011
  • Intermittent freshwater discharge has a critical influence on the biophysical environment and ecosystems of the Yeongsan Estuary, where the estuary dam has altered the continuous mixing of saltwater and freshwater. Though freshwater discharge is controlled by humans, extreme events are mainly driven by heavy rainfall in the river basin and have various impacts depending on their magnitude and frequency. This research aims to evaluate the magnitude and frequency of extreme freshwater discharges and to establish the magnitude-frequency relationships between basin-wide rainfall and freshwater inflow. Daily discharge and daily basin-averaged rainfall from Jan 1, 1997 to Aug 31, 2010 were used to determine the relations between discharge and rainfall. Consecutive daily discharges were grouped into independent events using a well-defined event-separation algorithm. Partial duration series were extracted to obtain the proper probability distribution function for extreme discharges and the corresponding rainfall events. There were 46 extreme discharge events exceeding the threshold of 133,656,000 $m^3$ over 13.7 years, and they follow a Weibull distribution with k=1.4. The 3-day accumulated rainfall ending one day before peak discharge (1day-before-3day-sum rainfall) was chosen as the control variable for discharge, because its magnitude is best correlated with that of the extreme discharge events. The minimum value of the corresponding 1day-before-3day-sum rainfall, 50.98 mm, was initially set as the threshold for selecting discharge-inducing rainfall cases. The number of 1day-before-3day-sum rainfall groups after selection, however, exceeds that of the extreme discharge events. Canonical discriminant analysis indicates that water level above the target level (-1.35 m EL.) can be used to divide the 1day-before-3day-sum rainfall groups into discharge-inducing and non-discharge ones. It also shows that the newly set threshold, 104 mm, separates these two cases without error. The magnitude-frequency relationships between rainfall and discharge were established with the newly selected 1day-before-3day-sum rainfalls: $D=1.111{\times}10^8+1.677{\times}10^6\,\overline{r_{3day}}$ ($\overline{r_{3day}}{\geq}104$, $R^2=0.459$), $T_d=1.326\,T_{r3}^{0.683}$, and $T_d=0.117\exp[0.0155\,\overline{r_{3day}}]$, where $D$ is the quantity of discharge, $\overline{r_{3day}}$ is the 1day-before-3day-sum rainfall, and $T_{r3}$ and $T_d$ are the return periods of the 1day-before-3day-sum rainfall and of freshwater discharge, respectively. These relations provide a framework for evaluating the effect of freshwater discharge on estuarine flow structure, water quality, and ecosystem responses from the perspective of magnitude and frequency.
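The three fitted relations can be evaluated directly, as in the sketch below. The coefficients and the 104 mm validity limit come from the abstract. The units are assumptions (discharge in m^3, matching the stated discharge threshold; rainfall in mm), the unit of the return periods is not stated in the abstract, and the function names are hypothetical.

```python
import math

RAIN_THRESHOLD_MM = 104.0  # lower validity limit of the relations, from the abstract


def discharge_m3(rain_3day_mm):
    """D = 1.111e8 + 1.677e6 * r (R^2 = 0.459); valid only for r >= 104 mm."""
    if rain_3day_mm < RAIN_THRESHOLD_MM:
        raise ValueError("relation was fitted only for rainfall >= 104 mm")
    return 1.111e8 + 1.677e6 * rain_3day_mm


def discharge_period_from_rain_period(t_r3):
    """T_d = 1.326 * T_r3 ** 0.683."""
    return 1.326 * t_r3 ** 0.683


def discharge_period_from_rain(rain_3day_mm):
    """T_d = 0.117 * exp(0.0155 * r)."""
    return 0.117 * math.exp(0.0155 * rain_3day_mm)


# Evaluate the relations at the threshold rainfall.
print(discharge_m3(104.0))
print(discharge_period_from_rain(104.0))
```

Given the modest $R^2$ of the discharge relation, its output is better read as a central estimate than as a precise prediction.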