• Title/Summary/Keyword: Multivariate statistical models


A study on multiple imputation modeling for Korean EAPS (경제활동인구조사 자료를 위한 다중대체 방식 연구)

  • Park, Min-Jeong;Bae, Yoonjong;Kim, Joungyoun
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.5
    • /
    • pp.685-696
    • /
    • 2021
  • The Korean Economically Active Population Survey (KEAPS) is a national survey that produces employment-related statistics. Its main purpose is to determine the economic activity status (employed/unemployed/non-employed) of the population. KEAPS has unique characteristics that stem from its survey method. In this study, drawing on an understanding of structural non-response and on the use of past data, we present an improved imputation model. The performance of the proposed model is compared with that of the existing model through simulation and is evaluated by matching/non-matching rates. For this, we use the KEAPS data from November 2019. For randomly selected respondents among the total of 59,996, the six explanatory variables that are critical in determining economic activity status are treated as non-response. The proposed model includes an industry variable and a job status variable in addition to the explanatory variables used in previous research; these additions come from linking and using past data. The simulation results confirm that the proposed model with the additional variables outperforms the existing model from previous research. In addition, we consider various scenarios for the number of non-respondents by economic activity status.
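The evaluation described in this abstract (mask known answers, impute them, then count agreements) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the record layout, and the choice to compare imputed economic activity status against the true status are all assumptions.

```python
import random

def mask_values(records, variables, fraction, seed=0):
    """Return copies of `records` with `variables` blanked for a random subset.

    `records` is a list of dicts (hypothetical layout). The second return
    value lists the indices of the records treated as non-response.
    """
    rng = random.Random(seed)
    n_masked = round(len(records) * fraction)
    masked_idx = sorted(rng.sample(range(len(records)), n_masked))
    masked = [dict(r) for r in records]  # copy so the originals stay intact
    for i in masked_idx:
        for v in variables:
            masked[i][v] = None
    return masked, masked_idx

def matching_rate(true_status, imputed_status):
    """Share of cases where the imputed status equals the true status."""
    if len(true_status) != len(imputed_status):
        raise ValueError("sequences must have the same length")
    if not true_status:
        raise ValueError("at least one case is required")
    matches = sum(t == p for t, p in zip(true_status, imputed_status))
    return matches / len(true_status)
```

The non-matching rate is one minus the value returned by `matching_rate`. Comparing two imputation models then amounts to running both on the same masked records and comparing their rates.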

Forecasting the flap: predictors for pediatric lower extremity trauma reconstruction

  • Fallah, Kasra N.;Konty, Logan A.;Anderson, Brady J.;Cepeda, Alfredo Jr.;Lamaris, Grigorios A.;Nguyen, Phuong D.;Greives, Matthew R.
    • Archives of Plastic Surgery
    • /
    • v.49 no.1
    • /
    • pp.91-98
    • /
    • 2022
  • Background: Predicting the need for post-traumatic reconstruction of lower extremity injuries remains a challenge. Due to the larger volume of cases in adults than in children, the majority of the medical literature has focused on adult lower extremity reconstruction. This study evaluates predictive risk factors associated with the need for free flap reconstruction in pediatric patients following lower extremity trauma. Methods: An IRB-approved retrospective chart analysis over a 5-year period (January 1, 2012 to December 31, 2017) was performed, including all pediatric patients (<18 years old) diagnosed with one or more lower extremity wounds. Patient demographics, trauma information, and operative information were reviewed. The statistical analysis consisted of univariate and multivariate regression models to identify predictor variables associated with free flap reconstruction. Results: In total, 1,821 patients were identified who fit our search criteria, of whom 41 patients (2.25%) required free flap reconstruction, 65 patients (3.57%) required local flap reconstruction, and 19 patients (1.04%) required skin graft reconstruction. We determined that older age (odds ratio [OR], 1.134; P=0.002), all-terrain vehicle accidents (OR, 6.698; P<0.001), and trauma team activation (OR, 2.443; P=0.034) were associated with the need for free flap reconstruction following lower extremity trauma in our pediatric population. Conclusions: Our study demonstrates a higher likelihood of free flap reconstruction in older pediatric patients, those involved in all-terrain vehicle accidents, and cases involving activation of the trauma team. This information can be implemented to help develop an early risk calculator that defines the need for complex lower extremity reconstruction in the pediatric population.
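The reported percentages and odds ratios can be checked and interpreted with a few lines of arithmetic. The sketch below assumes the age odds ratio is expressed per year of age, which the abstract does not state; the helper names are hypothetical.

```python
import math

def percent(count, total):
    """Percentage of `count` in `total`, rounded to two decimals."""
    return round(100 * count / total, 2)

def compounded_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units` in a predictor.

    In a logistic regression the log-odds are additive, so the odds ratio
    for several units is the per-unit odds ratio raised to that power.
    """
    return math.exp(units * math.log(or_per_unit))

TOTAL_PATIENTS = 1821
free_flap_pct = percent(41, TOTAL_PATIENTS)
local_flap_pct = percent(65, TOTAL_PATIENTS)
skin_graft_pct = percent(19, TOTAL_PATIENTS)

# Assumption: OR 1.134 applies per additional year of age.
five_year_age_or = compounded_odds_ratio(1.134, 5)
```

Under that assumption, a five-year age difference corresponds to odds of free flap reconstruction that are a little under twice as high. Odds ratios describe odds, not probabilities, so this is not the same as doubling the risk.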

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.147-168
    • /
    • 2017
  • Stock market forecasting has long been studied in academia, and forecasting models now draw on a wide range of techniques. Recently, many attempts have been made to predict the stock index using machine learning methods, including deep learning. Although both fundamental analysis and technical analysis are used in traditional stock investment, technical analysis is better suited to short-term trading prediction and to the application of statistical and mathematical techniques. Most studies using technical indicators have modeled stock price prediction as a binary classification (rising or falling) of future market movement, usually for the next trading day. However, binary classification has many drawbacks for predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we extend the existing binary scheme to a multi-class system of stock index trends (upward trend, boxed, downward trend). Techniques such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA), and Artificial Neural Networks (ANN) can be applied to this multi-class problem; we propose an optimization model that uses a Genetic Algorithm as a wrapper to improve the performance of Multi-classification Support Vector Machines (MSVM), which have proved superior in prediction performance. In particular, the proposed model, named GA-MSVM, is designed to maximize performance by optimizing not only the kernel function parameters of MSVM but also the selection of input variables (feature selection) and of training cases (instance selection).
To verify the performance of the proposed model, we applied it to real data. The results show that the proposed method is more effective than the conventional MSVM, which has been known to show the best prediction performance so far, as well as existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and CBR. In particular, instance selection was found to play a very important role in predicting the stock index trend, contributing more to model improvement than the other factors. To verify the usefulness of GA-MSVM, we applied it to forecasting the trend of Korea's KOSPI200 stock index. Our research primarily aims to predict trend segments in order to capture trading signals or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index of the KOSPI200 stock index (2004 to 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.). Using a variety of statistical methods, including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, takes three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data was used for training and the remaining 30% for validation. To verify the performance of the proposed model, comparative experiments were conducted with MDA, MLOGIT, CBR, ANN, and MSVM. MSVM adopted the One-Against-One (OAO) approach, which is known as the most accurate among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.
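One way to picture the simultaneous optimization described in this abstract is a chromosome that carries a feature mask, an instance mask, and kernel parameters together. The encoding below is a hypothetical sketch, not the authors' design: the abstract does not give the chromosome layout, the kernel type, or the parameter ranges, so the RBF-style `C` and `gamma` genes and their ranges are assumptions.

```python
import random

N_FEATURES = 15  # candidate indicators mentioned in the abstract

def random_chromosome(n_instances, rng):
    """Draw one candidate solution (hypothetical encoding)."""
    return {
        "features": [rng.randint(0, 1) for _ in range(N_FEATURES)],
        "instances": [rng.randint(0, 1) for _ in range(n_instances)],
        "log2_C": rng.uniform(-5, 15),      # assumed search range
        "log2_gamma": rng.uniform(-15, 3),  # assumed search range
    }

def decode(chromosome, X, y):
    """Turn a chromosome into a reduced training set and kernel parameters."""
    cols = [j for j, bit in enumerate(chromosome["features"]) if bit]
    rows = [i for i, bit in enumerate(chromosome["instances"]) if bit]
    X_sub = [[X[i][j] for j in cols] for i in rows]
    y_sub = [y[i] for i in rows]
    params = {
        "C": 2.0 ** chromosome["log2_C"],
        "gamma": 2.0 ** chromosome["log2_gamma"],
    }
    return X_sub, y_sub, params

def n_pairwise_classifiers(n_classes):
    """Number of binary SVMs a One-Against-One scheme trains."""
    return n_classes * (n_classes - 1) // 2
```

In a wrapper setup, the fitness of each chromosome would be the validation accuracy of a multi-class SVM trained on the decoded subset, and the genetic algorithm would evolve the population toward higher fitness. With the three trend classes used here, a One-Against-One scheme trains three pairwise classifiers.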

Estimation of Genetic Parameters for Reproductive Traits in Yorkshire (요크셔종의 번식형질에 대한 유전모수 추정)

  • Song, Kwang-Lim;Kim, Byeong-Woo;Roh, Seung-Hee;Sun, Du-Won;Kim, Hyo-Sun;Lee, Deuk-Hwan;Jeon, Jin-Tae;Lee, Jung-Gyu
    • Journal of agriculture & life science
    • /
    • v.44 no.5
    • /
    • pp.55-64
    • /
    • 2010
  • This study was conducted to estimate genetic parameters for reproductive traits in the Yorkshire breed using multivariate animal models. For the study, 4,989 litter trait records collected between 2001 and 2005 from Yorkshire pigs in K GGP were used. The effects of environmental factors such as farrowing year, parity, weaning-to-estrus interval (WEI), and suckling period on reproductive traits were statistically significant (p<0.05), but the effect of farrowing season was not. Total number born and number of suckling were highly correlated, both genetically and phenotypically, and the genetic correlations were higher than the phenotypic correlations. The heritability estimates for reproductive traits from the model that considers permanent environment effects (PE) were much lower than those from the model that does not (NPE). The heritability estimates were 0.240 and 0.076 for total number born and 0.187 and 0.096 for number of suckling in NPE and PE, respectively. These results indicate that PE should be included in the statistical model to estimate more accurate breeding values.
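The gap between the NPE and PE estimates is easier to read against the usual definition of heritability in an animal model with repeated records. The notation below is the standard textbook form and is an assumption here, since the abstract does not write out its model: $\sigma_a^2$ is the additive genetic variance, $\sigma_{pe}^2$ the permanent environment variance, and $\sigma_e^2$ the residual variance.

```latex
h^2_{\mathrm{NPE}} = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2},
\qquad
h^2_{\mathrm{PE}} = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_{pe}^2 + \sigma_e^2}
```

The variance components are estimated separately under each model, so the two ratios do not share the same $\sigma_a^2$. When the permanent environment term is omitted, covariance among repeated litters of the same sow can be absorbed into the additive genetic variance, which would be consistent with the higher NPE estimates reported above.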

Estimation of Genetic Parameters for Growth Traits in Yorkshire (요크셔종의 산육형질에 대한 유전모수 추정)

  • Song, Kwang-Lim;Kim, Byeong-Woo;Roh, Seung-Hee;Sun, Du-Won;Kim, Hyo-Sun;Lee, Deuk-Hwan;Jeon, Jin-Tae;Lee, Jung-Gyu
    • Journal of agriculture & life science
    • /
    • v.44 no.3
    • /
    • pp.41-52
    • /
    • 2010
  • This study was conducted to estimate genetic parameters for growth traits in the Yorkshire breed using multivariate animal models. For the study, 16,202 growth trait records collected between 1999 and 2005 from Yorkshire pigs in K GGP were used. Environmental factors such as sex, birth year, birth season, parity, and birth weight group affected growth traits significantly (p<0.01). Birth weight tended to be positively correlated with average daily gain (ADG) and lean percent, but it appeared to negatively affect age at 90 kg, average adjusted backfat thickness (BF), and eye muscle area (EMA). For average pig suckling weight (ASW) and total weight at suckling (TWS), higher birth weight gave better performance, whereas the opposite held for total number born and number of suckling. Heritability estimates for growth traits were approximately 10~30% lower under the model that includes common litter effects (CL) than under the model that ignores them (NCL). The heritability estimates were 0.468 and 0.328 for ADG, 0.474 and 0.326 for age at 90 kg, 0.452 and 0.396 for BF, 0.240 and 0.200 for EMA, and 0.458 and 0.380 for lean percent in NCL and CL, respectively. Therefore, it can be inferred that a statistical model that considers litter effects should be applied to estimate genetic parameters accurately.
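The "approximately 10~30% lower" statement can be checked directly from the reported estimates. The sketch below uses only the numbers in the abstract; the dictionary and function names are illustrative.

```python
# Heritability estimates from the abstract: (NCL, CL) per trait.
HERITABILITY = {
    "ADG": (0.468, 0.328),
    "age at 90 kg": (0.474, 0.326),
    "BF": (0.452, 0.396),
    "EMA": (0.240, 0.200),
    "lean percent": (0.458, 0.380),
}

def relative_reduction(ncl, cl):
    """Percent by which the CL estimate is lower than the NCL estimate."""
    return 100 * (ncl - cl) / ncl

reductions = {
    trait: relative_reduction(ncl, cl)
    for trait, (ncl, cl) in HERITABILITY.items()
}

for trait, pct in reductions.items():
    print(f"{trait}: {pct:.1f}% lower with common litter effects")
```

The reductions range from roughly 12% (BF) to roughly 31% (age at 90 kg), which agrees with the range quoted in the abstract.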

Studies on Development of Prediction Model of Landslide Hazard and Its Utilization (산지사면(山地斜面)의 붕괴위험도(崩壞危險度) 예측(豫測)모델의 개발(開發) 및 실용화(實用化) 방안(方案))

  • Ma, Ho-Seop
    • Journal of Korean Society of Forest Science
    • /
    • v.83 no.2
    • /
    • pp.175-190
    • /
    • 1994
  • To obtain fundamental information for predicting landslide hazard, both forest and site factors affecting slope stability were investigated in many areas of active landslides. Twelve descriptors were identified and quantified to develop the prediction model by multivariate statistical analysis. The main results can be summarized as follows. The main factors influencing large-scale landslides were, in order: precipitation, age group of forest trees, altitude, soil texture, slope gradient, position of slope, vegetation, stream order, vertical slope, bed rock, soil depth, and aspect. By partial correlation coefficient, the order was: age group of forest trees, precipitation, soil texture, bed rock, slope gradient, position of slope, altitude, vertical slope, stream order, vegetation, soil depth, and aspect. The main factors influencing landslide occurrence were, in order: age group of forest trees, altitude, soil texture, slope gradient, precipitation, vertical slope, stream order, bed rock, and soil depth. Two prediction models were developed, one for the magnitude and one for the frequency of landslides. In particular, the prediction method by landslide magnitude was converted to a score for convenience of use. If the total score of the various factors exceeds 9.1636, the area is evaluated as very dangerous. The mean scores of the landslide and non-landslide groups were 0.1977 and -0.1977, and the variances were 0.1100 and 0.1250, respectively. The boundary value between the two groups related to slope stability was -0.02, and the predicted rate of discrimination was 73%. In the score ranges for the degree of landslide hazard based on the boundary value of discrimination, class A was over 0.3132, class B was 0.3132 to -0.1050, class C was -0.1050 to -0.4196, and class D was below -0.4195. The rank of landslide hazard could thus be divided into classes A, B, C, and D by these boundary values.
By number of slopes, class A had 68, class B had 115, class C had 65, and class D had 52. The rate of landslide occurrence in classes A and B showed a high prediction rate of 83%. Therefore, dangerous areas selected by the landslide prediction method could be mapped for land-use planning and used as a criterion for designating disaster districts. The method could also be applied as an administrative index for disaster prevention.
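The class boundaries reported in this abstract translate into a simple lookup. The sketch below is illustrative only: the treatment of scores that fall exactly on a boundary is an assumption, and the lower bound for class C (-0.4196) is used as the C/D cut even though the abstract gives -0.4195 for class D.

```python
# Score thresholds taken from the abstract.
CLASS_A_MIN = 0.3132
CLASS_B_MIN = -0.1050
CLASS_C_MIN = -0.4196  # the abstract also quotes -0.4195 for class D

# Number of slopes per class, as reported.
SLOPE_COUNTS = {"A": 68, "B": 115, "C": 65, "D": 52}

def hazard_class(score):
    """Map a discriminant score to a landslide hazard class (A is most hazardous)."""
    if score > CLASS_A_MIN:
        return "A"
    if score > CLASS_B_MIN:
        return "B"
    if score > CLASS_C_MIN:
        return "C"
    return "D"

def class_shares(counts):
    """Percentage of slopes in each class."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}
```

The reported counts sum to 300 slopes, so classes A and B together cover 183 of them, or 61%.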
