• Title/Abstract/Keywords: Robust regression analysis

Search Result 75, Processing Time 0.028 seconds

A Confirmation of Identified Multiple Outliers and Leverage Points in Linear Model (다중 선형 모형에서 식별된 다중 이상점과 다중 지렛점의 재확인 방법에 대한 연구)

  • 유종영;안기수
    • The Korean Journal of Applied Statistics / v.15 no.2 / pp.269-279 / 2002
  • We consider the problem of confirming multiple outliers and leverage points. Identifying multiple outliers and leverage points is difficult because of the masking and swamping effects. Rousseeuw and van Zomeren (1990) identified multiple outliers and leverage points using the Least Median of Squares and the Minimum Volume Ellipsoid, which are high-breakdown robust estimators. However, their methods tend to declare too many observations as extreme. Atkinson (1987) suggested a method for confirming outliers, and Fung (1993) pointed out the limitations of Atkinson's method and proposed another method based on the add-back model. Our analysis indicates, however, that Fung's method is affected by the adjacent effect. In this study, we propose a procedure for confirming outliers and leverage points and compare it with Fung's method on three examples.
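
As a rough illustration of the ingredients named in this abstract (not the authors' confirmation procedure), the sketch below approximates a Least Median of Squares line for a single predictor by searching all two-point fits, and computes hat values to flag leverage points. The data, the residual cutoff of 5.0, and the 2p/n leverage rule of thumb are assumptions chosen for the example.

```python
from itertools import combinations
from statistics import median


def lms_line(xs, ys):
    """Approximate LMS fit: try every two-point line and keep the one with the
    smallest median squared residual. Returns (criterion, intercept, slope)."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        crit = median((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    return best


def hat_values(xs):
    """Diagonal of the hat matrix for simple regression with an intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]


# Synthetic data (assumption): y = 2x, except index 6 is a vertical outlier;
# index 7 lies on the line but far out in x (a leverage point).
xs = [1, 2, 3, 4, 5, 6, 7, 20]
ys = [2, 4, 6, 8, 10, 12, 30, 40]

_, intercept, slope = lms_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
outliers = [i for i, r in enumerate(residuals) if abs(r) > 5.0]  # arbitrary cutoff

p = 2  # number of fitted coefficients (intercept and slope)
leverage = hat_values(xs)
high_leverage = [i for i, h in enumerate(leverage) if h > 2 * p / len(xs)]

print(outliers, high_leverage)
```

On this toy data the robust fit recovers the line y = 2x, so index 6 is flagged as an outlier and index 7 as a high-leverage point. An ordinary least-squares fit would be pulled toward the outlier, which is the masking problem the abstract refers to.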

Identifying the Chickens-Eggs Statistical Lead-Lag Dilemma (닭-달걀 간 통계적 인과성 논란의 판별)

  • Kim, Tae Ho;Kim, Min Jeong;Lee, Jeen Woan
    • The Korean Journal of Applied Statistics / v.26 no.3 / pp.401-411 / 2013
  • This study investigates the controversial chickens-eggs dilemma and empirically performs statistical tests to examine whether causality exists between the two. Granger and Hsiao tests are applied to both level and stationary variables to identify the lead-lag relationships. Each test yields the robust result that causality runs from eggs to chickens; in addition, the explanatory power of one variable for variations in the other appears to remain time-invariant. The outcome is shown to be valid, as the hypothesis of no structural change in their relationship cannot be rejected.
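
A minimal sketch of the idea behind a Granger test follows, assuming a single lag and synthetic data (not the study's series). It compares a restricted regression of the effect on its own lag with an unrestricted one that adds the lagged candidate cause, and returns the F statistic. The helper names `ols_rss` and `granger_f` are hypothetical. The Hsiao procedure and p-values are not covered here.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, targets)
    )


def granger_f(cause, effect, lag=1):
    """F statistic for 'lagged cause adds nothing beyond lagged effect'."""
    rows_r, rows_u, targets = [], [], []
    for t in range(lag, len(effect)):
        own = [effect[t - k] for k in range(1, lag + 1)]
        other = [cause[t - k] for k in range(1, lag + 1)]
        rows_r.append([1.0] + own)          # restricted model
        rows_u.append([1.0] + own + other)  # unrestricted model
        targets.append(effect[t])
    rss_r = ols_rss(rows_r, targets)
    rss_u = ols_rss(rows_u, targets)
    dof = len(targets) - len(rows_u[0])
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic series (assumption): "chickens" follows last period's "eggs" plus noise.
random.seed(0)
eggs = [random.gauss(0, 1) for _ in range(200)]
chickens = [0.0] + [0.8 * eggs[t - 1] + random.gauss(0, 0.1) for t in range(1, 200)]

f_eggs_to_chickens = granger_f(eggs, chickens)
f_chickens_to_eggs = granger_f(chickens, eggs)
print(f_eggs_to_chickens, f_chickens_to_eggs)
```

A large F in one direction and a small F in the other is the pattern that would be read as a one-way lead-lag relationship. In practice each F would be compared against a critical value of the F distribution.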

A Exploratory Study on The Determinants of Youth Facilities Visits (청소년시설이용에 영향을 미치는 요인에 대한 탐색적 연구)

  • Kim, Sin-Young
    • The Journal of the Convergence on Culture Technology / v.9 no.1 / pp.129-134 / 2023
  • This study investigates potential factors at various levels that affect respondents' use of youth facilities. These levels are the individual, the family, and the school. Data from the 「2021 Youth Survey on Human Right Conditions」 are analyzed. Hierarchical multiple regression analysis yields several results. First, respondents' age and level of human rights-related information strongly influence their use of youth facilities. Second, subjective well-being, abusive language and physical punishment from school faculty, and experience of human rights violations in school also affect the level of use. The significant variables, in order of effect size, are: respondents' age, level of human rights-related information, subjective well-being, abusive language and physical punishment from school faculty, and experience of human rights violations in school. The independent variables in the model explain roughly 20 percent of the total variation in the dependent variable.

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae;Kang, Jungseok
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.1-32 / 2018
  • Corporate defaults have a ripple effect not only on the stakeholders of bankrupt companies, including managers, employees, creditors, and investors, but also on the local and national economy. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a single default prediction model rather than developing a variety of corporate default models. As a result, even large corporations, the so-called 'chaebol enterprises', went bankrupt. Even after that, the analysis of past corporate defaults focused on specific variables, and when the government carried out restructuring immediately after the global financial crisis, it focused only on certain main variables such as the 'debt ratio'. A multifaceted study of corporate default prediction models is essential to protect diverse interests and to avoid a sudden total collapse like the 'Lehman Brothers case' of the global financial crisis. The key variables associated with corporate defaults vary over time. This is confirmed by Deakin's (1972) study, which, following the analyses of Beaver (1967, 1968) and Altman (1968), shows that the major factors affecting corporate failure have changed. In Grice's (2001) study, the importance of predictive variables was also examined through the models of Zmijewski (1984) and Ohlson (1980). However, past studies use static models, and most of them do not consider changes that occur over time. Therefore, to construct consistent prediction models, it is necessary to compensate for time-dependent bias by means of a time series analysis algorithm that reflects dynamic change. Centered on the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data from 2000 to 2009. The data are divided into training, validation, and test sets covering 7, 2, and 1 years, respectively. 
In order to construct a bankruptcy model that is consistent over time, we first train a time series deep learning model using data from before the financial crisis (2000~2006). The parameters of the existing models and of the deep learning time series algorithm are tuned with validation data that include the financial crisis period (2007~2008). As a result, we obtain a model that shows a pattern similar to the results on the training data and has excellent predictive power. After that, each bankruptcy prediction model is retrained on the combined training and validation data (2000~2008), applying the optimal parameters found in the previous validation. Finally, each corporate default prediction model, trained on nine years of data, is evaluated and compared using the test data (2009). In this way, the usefulness of the corporate default prediction model based on the deep learning time series algorithm is demonstrated. In addition, by adding Lasso regression to the existing variable selection methods (multiple discriminant analysis and the logit model), it is shown that the deep learning time series model based on the three bundles of variables is useful for robust corporate default prediction. The definition of bankruptcy is the same as that of Lee (2015). Independent variables include financial information such as the financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable group. The multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time series machine learning algorithms, and deep learning time series algorithms are compared. Corporate data suffer from the limitations of 'nonlinear variables', 'multi-collinearity' among variables, and 'lack of data'. 
The logit model handles nonlinearity, the Lasso regression model addresses the multi-collinearity problem, and the deep learning time series algorithm, using the variable data generation method, compensates for the lack of data. Big data technology, a leading technology of the future, is moving from simple human analysis to automated AI analysis and, ultimately, toward intertwined AI applications. Although the study of corporate default prediction models using time series algorithms is still in its early stages, the deep learning algorithm is much faster than regression analysis for corporate default prediction modeling and also has better predictive power. With the Fourth Industrial Revolution, the current government and overseas governments are working hard to integrate such systems into the everyday life of their nations and societies. Yet deep learning time series research for the financial industry remains insufficient. This is an initial study of deep learning time series analysis of corporate defaults. It is therefore hoped that it will serve as comparative material for non-specialists beginning research that combines financial data and deep learning time series algorithms.
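
The chronological split described in this abstract (train on 2000~2006, validate on 2007~2008, test on 2009, then refit on 2000~2008) can be sketched as below. The record layout and field names are hypothetical; only the year boundaries come from the abstract.

```python
def split_by_year(records, train_end=2006, valid_end=2008):
    """Split firm-year records chronologically so that no future data leaks
    into training. Boundaries default to the periods given in the abstract."""
    train = [r for r in records if r["year"] <= train_end]
    valid = [r for r in records if train_end < r["year"] <= valid_end]
    test = [r for r in records if r["year"] > valid_end]
    return train, valid, test


# Hypothetical records: one firm observed annually from 2000 to 2009.
records = [{"firm": "A", "year": year, "default": 0} for year in range(2000, 2010)]

train, valid, test = split_by_year(records)
refit = train + valid  # 2000~2008, used for the final refit before testing

print(len(train), len(valid), len(test), len(refit))
```

Splitting by time rather than at random matters here because the stated goal is a model that stays consistent as conditions change; a random split would mix crisis-period observations into the training data.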

Effects of Open Innovation on Export Performance: Moderation of Innovation Speed (개방형 혁신이 수출성과에 미치는 영향: 혁신속도의 조절효과를 중심으로)

  • Roh, Taewoo;Park, Kwangmin;Seo, Jeongeun;Kim, Gyunhwan;Kim, Hwayoung;Kang, Minah
    • Journal of Digital Convergence / v.16 no.12 / pp.207-215 / 2018
  • This study starts from the premise that SMEs, the most important engine of Korea's economic growth, are positioned to grow through innovation. Research on the open innovation of SMEs has continued since external knowledge search became an important concept, but it has focused mainly on enterprise performance. The purpose of this study is to examine the moderating effect of innovation speed on the export performance of Korean SMEs. The hypotheses concern the depth and breadth of external knowledge search, the two modes of open innovation emphasized in previous studies, and the moderating effect of innovation speed on their relationship with export performance. Robust regression analysis was applied to a valid sample of 1,357 SMEs. The hypothesis test for the moderation effect was performed by comparing F-values between models. The proposed hypothesis was supported, and the moderation effect was verified.
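
One common way to compare nested models when testing moderation is the F statistic for the change in R² after adding interaction terms. The sketch below assumes that form; the abstract does not give the exact computation, and the R² values and predictor counts shown are hypothetical. Only the sample size of 1,357 comes from the abstract.

```python
def f_change(r2_base, r2_full, n, k_full, q):
    """F statistic for the R-squared gain from adding q terms to a base model.

    n      : number of observations
    k_full : number of predictors in the full model (excluding the intercept)
    q      : number of added terms (for example, interaction terms)
    """
    numerator = (r2_full - r2_base) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Hypothetical values: a base model with main effects only, and a full model
# that adds two interaction terms (depth x speed, breadth x speed).
f_value = f_change(r2_base=0.20, r2_full=0.22, n=1357, k_full=5, q=2)
print(round(f_value, 2))
```

The statistic is compared against an F distribution with q and n - k_full - 1 degrees of freedom; a significant value indicates that the interaction terms improve the fit.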

Analysis of Structural and Thermal Parameters for Evaluating Fire Resistance of Steel Beams (철골보의 내화시간 평가를 위한 구조 및 열적 변수해석)

  • Park, Han Na;Ahn, Jae Kwon;Lee, Cheol Ho
    • Journal of Korean Society of Steel Construction / v.21 no.6 / pp.609-618 / 2009
  • This paper proposes a versatile formula that can be used to evaluate the fire resistance time of steel beams under various design conditions. To this end, the key parameters that affect the fire performance of steel beams were first determined through thermo-mechanical considerations and classified into two groups: structural parameters and thermal parameters. The degree of influence of each parameter on fire performance was then investigated through a fully coupled thermo-mechanical analysis up to the occurrence of run-away deflection. The accuracy of the numerical model was verified against an available full-scale fire test before an extensive parametric analysis was conducted. Multiple linear regression analysis was performed to obtain a formula for predicting the fire resistance time of steel beams under various design conditions. The statistical analysis showed that the proposed formula is very robust. The application of the formula in practical fire design under the current code was illustrated in detail. The economy and other advantages of the proposed formula were clearly shown.

Prediction on the Quality of Forage Crop by Near Infrared Reflectance Spectroscopy (근적외선 분광법에 의한 사초의 성분추정)

  • Lee, Hyo-Won;Kim, Jong-Duk;Kim, Won-Ho;Lee, Joung-Kyong
    • Journal of The Korean Society of Grassland and Forage Science / v.29 no.1 / pp.31-36 / 2009
  • This study was conducted to find an alternative method for rapid and accurate analysis of forage quality. Near infrared reflectance spectroscopy (NIRS) was used to evaluate its potential for forage analysis, and 258 samples, including barley for whole crop silage, forage corn, and sudangrass, were collected from 2002 to 2007. The samples were analyzed for CP (crude protein), CF (crude fiber), ADF (acid detergent fiber), NDF (neutral detergent fiber), and IVTD (in vitro true digestibility), and were also scanned using an NIRSystem instrument over wavelengths of $400{\sim}2,400$ nm. Multiple linear regression was used with wet analysis data to develop the calibration model and to validate it on unknown samples. The important indices in this experiment were SEC and SEP. The $r^2$ values for CF, CP, NDF, ADF, and IVTD were 0.70, 0.86, 0.94, 0.94, and 0.89 in the calibration set and 0.47, 0.39, 0.89, 0.90, and 0.61 in the validation set, respectively. The results indicate that NIRS is a reliable analytical method for assessing forage quality, especially for CF, ADF, and IVTD, and that samples of each respective forage should be included to obtain accurate results. More robust calibrations covering all forage samples can be developed if a representative sample set is added.
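
The validation statistics mentioned in this abstract can be illustrated with a small sketch. It assumes $r^2$ is the squared Pearson correlation between reference (wet chemistry) and NIRS-predicted values, and that SEP is the bias-corrected standard deviation of the prediction errors; definitions vary between software packages, and the numbers below are made up for the example.

```python
from math import sqrt


def r_squared(reference, predicted):
    """Squared Pearson correlation between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_r * var_p)


def sep(reference, predicted):
    """Standard error of prediction, corrected for the mean bias."""
    errors = [p - r for r, p in zip(reference, predicted)]
    bias = sum(errors) / len(errors)
    return sqrt(sum((e - bias) ** 2 for e in errors) / (len(errors) - 1))


# Hypothetical crude protein values (% dry matter) for four validation samples.
reference = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 15.0, 16.0]

print(round(r_squared(reference, predicted), 3), round(sep(reference, predicted), 3))
```

The same two functions applied to the calibration samples would give the calibration $r^2$ and a value analogous to SEC, although SEC usually also adjusts the denominator for the number of model terms.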

Empirical Analyses on the Financial Profile of Korean Chaebols in Corporate Research & Development Intensity (국내 자본시장에서의 재벌 계열사들의 연구개발비 비중에 대한 재무적 실증분석)

  • Kim, Hanjoon
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.4 / pp.232-241 / 2019
  • This study examines one of the conventional and controversial issues in modern finance. Specifically, it identifies financial determinants of corporate R&D intensity for firms belonging to Korean Chaebols. Several empirical estimation procedures are applied to obtain more robust results for each hypothesis test. Static panel data, Tobit regression, and stepwise regression models are employed to identify significant financial factors of R&D expenditures, while logit, probit, and complementary log-log regression models are used to detect financial differences between Chaebol firms and their counterparts not classified as Chaebols. In the first hypothesis test, the level of R&D intensity in the prior fiscal year, the market-value-based leverage ratio, and firm size were found to be significant in accounting for corporate R&D intensity, whereas the majority of explanatory variables had important power on a relative basis. Assuming that current circumstances in the domestic capital market may necessitate gradual changes in the socio-economic function of Korean Chaebols, the results of this study are expected to help identify financial antecedents that can be beneficial for attaining an optimal level of corporate R&D expenditures for Chaebol firms in a virtuous cycle.

A Study on Spoken Digits Analysis and Recognition (숫자음 분석과 인식에 관한 연구)

  • 김득수;황철준
    • Journal of Korea Society of Industrial Information Systems / v.6 no.3 / pp.107-114 / 2001
  • This paper describes connected digit recognition in Korean that takes acoustic features into account. The recognition rate for connected digits is usually lower than that for word recognition. Therefore, speech feature parameters and acoustic features are employed to build a robust model for digits, and we confirmed the effect of considering acoustic features through recognition experiments. We used the KLE 4 connected digit data as the database and 19 continuous-distribution HMMs as PLUs (Phoneme Like Units) based on phonetic rules. For the recognition experiments, we tested two cases. In the first case, we used the usual method, with Mel-Cepstrum and regressive coefficients, to construct the phoneme model. In the second case, we used expanded feature parameters and acoustic features to construct the phoneme model. In both cases, we employed OPDP (One Pass Dynamic Programming) and FSA (Finite State Automata) for the recognition tests. When applying FSN for recognition, we applied various acoustic features. As a result, we obtained a recognition rate of 55.4% for Mel-Cepstrum and 67.4% for Mel-Cepstrum with regressive coefficients. We also obtained a recognition rate of 74.3% for the expanded feature parameters and 75.4% when applying acoustic features. Since applying acoustic features gave better results than the former method, we conclude that the suggested method is effective for connected digit recognition in Korean.

Implementing an Adaptive Neuro-Fuzzy Model for Emotion Prediction Based on Heart Rate Variability(HRV) (심박변이도를 이용한 적응적 뉴로 퍼지 감정예측 모형에 관한 연구)

  • Park, Sung Soo;Lee, Kun Chang
    • Journal of Digital Convergence / v.17 no.1 / pp.239-247 / 2019
  • Accurate prediction of emotion is a very important issue for patient-centered medical device development and emotion-related fields of psychology. Although there have been many studies on emotion prediction, no studies have applied heart rate variability and a neuro-fuzzy approach to emotion prediction. We propose ANFEP (Adaptive Neuro Fuzzy System for Emotion Prediction) based on HRV. ANFEP bases its core functions on an ANFIS (Adaptive Neuro-Fuzzy Inference System), which integrates neural networks with fuzzy systems as a vehicle for training predictive models. To validate the proposed model, 50 participants were invited to join the experiment, and their heart rate variability was measured and used as input to the ANFEP model. The ANFEP model with STDRR and RMSSD as inputs and two membership functions per input variable showed the best results. The results of applying ANFEP to the HRV metrics proved significantly robust compared with benchmark methods such as linear regression, support vector regression, neural networks, and random forests. The results show that reliable prediction of emotion is possible with fewer inputs and that it is necessary to develop a more accurate and reliable emotion recognition system.
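
The two HRV inputs named in this abstract are standard time-domain metrics and can be computed from a series of RR intervals as sketched below. This assumes STDRR denotes the sample standard deviation of the RR intervals; the interval values are made up, and the ANFEP model itself is not reproduced here.

```python
from math import sqrt
from statistics import stdev


def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs))


def sdrr(rr_ms):
    """Sample standard deviation of the RR intervals (ms)."""
    return stdev(rr_ms)


# Hypothetical RR intervals in milliseconds.
rr = [800, 810, 790, 805, 795]

print(round(rmssd(rr), 2), round(sdrr(rr), 2))
```

RMSSD reflects beat-to-beat (short-term) variability, while the standard deviation reflects overall variability over the recording, which is one reason the two are often used together as model inputs.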