• Title/Summary/Keyword: robust regression

Effects of Open Innovation on Export Performance: Moderation of Innovation Speed (개방형 혁신이 수출성과에 미치는 영향: 혁신속도의 조절효과를 중심으로)

  • Roh, Taewoo;Park, Kwangmin;Seo, Jeongeun;Kim, Gyunhwan;Kim, Hwayoung;Kang, Minah
    • Journal of Digital Convergence / v.16 no.12 / pp.207-215 / 2018
  • This study starts from the premise that SMEs, a key engine of Korea's economic growth, are positioned to grow through innovation. Since external knowledge search became an important concept, studies of open innovation in SMEs have continued, but they have mainly focused on overall enterprise performance. The purpose of this study is to examine the moderating effect of innovation speed on the export performance of Korean SMEs. The hypotheses address the depth and breadth of external knowledge search, the two modes of open innovation emphasized in previous studies, and then test innovation speed as a moderator of their effect on export performance. Robust regression analysis was applied to a sample of 1,357 valid SME observations. The moderation hypotheses were tested by comparing F-values between models. The proposed hypotheses were supported, and the moderation effect was verified.
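
The abstract says the moderation hypotheses were tested by comparing F-values between models but does not give the formula. A minimal sketch, assuming the standard partial F-test for nested least-squares models (a model without and a model with the interaction terms); the function name and the numbers are illustrative and are not taken from the paper:

```python
def nested_f_statistic(rss_restricted, rss_full, extra_terms, resid_df_full):
    """Partial F statistic for a restricted model nested in a full model.

    rss_restricted: residual sum of squares without the interaction terms
    rss_full:       residual sum of squares with the interaction terms
    extra_terms:    number of coefficients added in the full model
    resid_df_full:  sample size minus the number of coefficients in the full model
    """
    if extra_terms <= 0 or resid_df_full <= 0:
        raise ValueError("degrees of freedom must be positive")
    if rss_full <= 0:
        raise ValueError("rss_full must be positive")
    # Improvement in fit per added term, relative to the residual variance
    # of the full model.
    numerator = (rss_restricted - rss_full) / extra_terms
    denominator = rss_full / resid_df_full
    return numerator / denominator


# Illustrative numbers only (hypothetical, not from the study).
f_value = nested_f_statistic(rss_restricted=120.0, rss_full=100.0,
                             extra_terms=2, resid_df_full=50)
print(f_value)
```

A large F value relative to the F distribution with (extra_terms, resid_df_full) degrees of freedom would suggest that the interaction terms improve the fit.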

A comparison of imputation methods using nonlinear models (비선형 모델을 이용한 결측 대체 방법 비교)

  • Kim, Hyein;Song, Juwon
    • The Korean Journal of Applied Statistics / v.32 no.4 / pp.543-559 / 2019
  • Data often include missing values for various reasons. If the missing data mechanism is not MCAR, analysis based on fully observed cases may cause estimation bias and decrease the precision of the estimates, since partially observed cases are excluded. Missing values cause more serious problems when the data include many variables. Many imputation techniques have been suggested to overcome this difficulty. However, imputation methods using parametric models may not fit well with real data that do not satisfy the model assumptions. In this study, we review imputation methods using nonlinear models, such as kernel, resampling, and spline methods, which are robust to model assumptions. In addition, we suggest utilizing imputation classes to improve imputation accuracy, or adding random errors to correctly estimate the variance of the estimates in nonlinear imputation models. The performance of imputation methods using nonlinear models is compared under various simulated data settings. Simulation results indicate that the performance of the imputation methods varies as the data settings change. However, imputation based on kernel regression or the penalized spline performs better in most situations. Utilizing imputation classes or adding random errors improves the performance of imputation methods using nonlinear models.
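
To make the kernel regression idea concrete, here is a minimal sketch that assumes a Nadaraya-Watson estimator with a Gaussian kernel and a single covariate. The abstract does not specify the kernel or bandwidth used in the paper, so the function names, the bandwidth, and the data below are illustrative assumptions:

```python
import math


def gaussian_kernel(u):
    """Unnormalised Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def kernel_impute(x_obs, y_obs, x_new, bandwidth):
    """Impute y at x_new as a kernel-weighted mean of the observed y values."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((x_new - x) / bandwidth) for x in x_obs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, y_obs)) / total


# Hypothetical complete cases and one case whose y is missing at x = 1.0.
x_complete = [0.0, 2.0]
y_complete = [1.0, 3.0]
imputed = kernel_impute(x_complete, y_complete, x_new=1.0, bandwidth=1.0)
print(imputed)
```

This produces a deterministic imputation. The abstract's suggestion of adding random errors would correspond to adding a residual draw to this value, which the sketch leaves out.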

An Exploration of Somatization among Korean Older Immigrants in the U.S. (신체증후군에 대한 탐색적 연구: 한인 노인 이민자를 중심으로)

  • Ahn, Joonhee
    • Journal of the Korea Gerontological Society (한국노년학) / v.28 no.4 / pp.1179-1200 / 2008
  • Knowledge about somatization (the somatic manifestation of psychological distress symptoms) among immigrant populations is limited. While several studies have recognized somatization as a culturally distinctive expression of depression among older Korean immigrants, somatization has not been incorporated into a comprehensive empirical model of depression for this population. To improve the general understanding of the phenomenon, this study empirically investigates the principal contributing factors of somatization as well as the inter-relationships among them. Data were collected from a cross-sectional community survey of 234 older Korean immigrants (age ≥ 55) in the New York metropolitan area. The statistical methodology employed a robust hierarchical regression procedure that iteratively downweights outliers. The results indicated that living arrangement, a greater number of physical illnesses, and depression were significant explanatory factors of somatization. Furthermore, physical illness had a significant joint effect with perception of health on somatization, which confirms that a positive perception of health moderates the relationship between physical illness and somatization. The knowledge obtained from this study will contribute toward extending our understanding of somatization and implementing more culturally sensitive mental health services for this population.
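
The abstract describes a robust regression procedure that iteratively downweights outliers but does not name the weight function. A minimal sketch, assuming iteratively reweighted least squares with Huber weights and a MAD-based scale for a single predictor; the tuning constant 1.345, the function names, and the data are assumptions made for illustration:

```python
import statistics


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def huber_irls(x, y, c=1.345, max_iter=50, tol=1e-8):
    """Huber-type IRLS; returns (intercept, slope, final weights).

    With max_iter=0 the result is the ordinary least-squares fit.
    """
    w = [1.0] * len(x)
    a, b = weighted_line_fit(x, y, w)  # start from ordinary least squares
    for _ in range(max_iter):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        med = statistics.median(resid)
        # Robust scale: median absolute deviation, rescaled for normal errors.
        scale = statistics.median([abs(r - med) for r in resid]) / 0.6745
        if scale < 1e-12:
            break  # residuals are essentially zero for most points
        cutoff = c * scale
        # Full weight inside the cutoff, shrinking weight beyond it.
        w = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in resid]
        a_new, b_new = weighted_line_fit(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b, w


# Hypothetical data: an exact line y = 2x + 1 with one gross outlier at the end.
xs = list(range(10))
ys = [2 * xi + 1 for xi in xs]
ys[9] = 100
a_ols, b_ols, _ = huber_irls(xs, ys, max_iter=0)
a_rob, b_rob, weights = huber_irls(xs, ys)
print(b_ols, b_rob, weights[9])
```

In this toy data the outlier ends up with a weight below 1, so the robust slope sits closer to the underlying slope of 2 than the least-squares slope does.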

Relationship between Urban Identity and Time and Space - Focusing on Ode to the Goose, Zhang Lu's Film (도시 정체성과 시공간 구조의 관계 -장률(張律)의 영화 <군산: 거위를 노래하다>를 중심으로)

  • Cho, Myung-Ki
    • Journal of Popular Narrative / v.27 no.3 / pp.151-191 / 2021
  • This paper examines the content of Gunsan's urban identity as represented by the film Ode to the Goose, and how the content and aspects of this identity interact with the structure of the film's discourse. Ode to the Goose weaves Gunsan and Seoul into continuously reorganized cities based on an interactive relation, rather than literal ones. Seoul, where the time of the film narrative closes, is converted into the starting point for the trip to Gunsan. The two points at which the audience's ex post return occurs are the starting point of the time of the film discourse and the point at which the title is presented. The journey-type narrative structure of this film is a 3-dimensional spiral rather than a 2-dimensional circular regression. Ode to the Goose embodies the characteristics, identity, and apriority of the two cities on the basis of this spiral temporal and spatial structure. Seoul severs the relation between grand narrative/collective memory and small narrative/individual memory as an agnostic one; in other words, it is a city that cuts off cities, relations, and memory and rejects the continuity of memory. On the other hand, Gunsan is a city in which grand and small narratives and collective and individual memory coexist, and in which split and isolated minds are cured and mutually consoled. The film describes Gunsan as a surplus space, a being for others, while expressing its identity as a robust and literal thing. It presents Gunsan as a field in which oppositional concepts, such as historical interruption and continuity, and spatial being-for-others and originality, become 3-dimensional spirals through the reciprocity between the narrative and the discourse structure. The implication of this paper is that it examines how the temporal and spatial relationships constituting urban identity interact with the structure of the film narrative.

A Study on Antecedents of Suicidal Ideation among Korean Older Adults: A test of the Stress-diathesis Model (노인의 자살 생각에 영향을 미치는 선행요인에 관한 연구: 스트레스 소질 모델(Stress-diathesis Model)을 중심으로)

  • Ahn, Joonhee;Chun, Miae
    • Journal of the Korea Gerontological Society (한국노년학) / v.29 no.2 / pp.489-511 / 2009
  • While late-life suicide has been increasing and has become an important public health issue in Korea, little is known about the phenomenon and the contributing risk factors on which effective preventive measures can be based. Since suicidal ideation is a major precursor to attempted and completed suicide, the objective of the present study was to reveal the primary contributors to suicidal ideation. Data were collected from a cross-sectional survey of 247 community-dwelling Korean older adults (age ≥ 60) in a mid-size city in Korea. The statistical methodology employed a robust hierarchical regression procedure that iteratively downweights outliers. Based on the stress-diathesis model, the study examined the major diatheses and stressors directly explaining suicidal ideation. The study also explored the significant interactions among these factors. The findings revealed that living alone and depression were significant main antecedents of suicidal ideation. In addition, neuroticism × life events and neuroticism × depression were significant interaction terms with the strongest explanatory power, which provides empirical evidence supporting the stress-diathesis model in explaining suicidal phenomena among the Korean elderly. The results have theoretical implications as well as practical implications for developing and implementing late-life suicide prevention strategies. Limitations and directions for future research are discussed.

Wildfire Severity Mapping Using Sentinel Satellite Data Based on Machine Learning Approaches (Sentinel 위성영상과 기계학습을 이용한 국내산불 피해강도 탐지)

  • Sim, Seongmun;Kim, Woohyeok;Lee, Jaese;Kang, Yoojin;Im, Jungho;Kwon, Chunguen;Kim, Sungyong
    • Korean Journal of Remote Sensing / v.36 no.5_3 / pp.1109-1123 / 2020
  • In South Korea, where forest is the major land cover class (over 60% of the country), many wildfires occur every year. Wildfires weaken the shear strength of the soil, forming a layer of soil that is vulnerable to landslides. It is important to identify the severity of a wildfire as well as the burned area in order to manage the forest sustainably. Although satellite remote sensing has been widely used to map wildfire severity, it is often difficult to determine the severity using only the temporal change of satellite-derived indices such as the Normalized Difference Vegetation Index (NDVI) and the Normalized Burn Ratio (NBR). In this study, we proposed an approach for determining wildfire severity based on machine learning through the synergistic use of Sentinel-1A Synthetic Aperture Radar-C data and Sentinel-2A Multi Spectral Instrument data. Three wildfire cases (Samcheok in May 2017, Gangreung·Donghae in April 2019, and Gosung·Sokcho in April 2019) were used for developing wildfire severity mapping models with three machine learning algorithms (i.e., Random Forest, Logistic Regression, and Support Vector Machine). The results showed that the random forest model yielded the best performance, with an overall accuracy of 82.3%. The cross-site validation, which examined the spatiotemporal transferability of the machine learning models, showed that the models were highly sensitive to temporal differences between the training and validation sites, especially in the early growing season. This implies that a more robust model with high spatiotemporal transferability can be developed when more wildfire cases from different seasons and areas are added in the future.
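
The indices named in the abstract are normalized band differences: NDVI = (NIR - Red) / (NIR + Red) and NBR = (NIR - SWIR) / (NIR + SWIR). The temporal change the abstract refers to is commonly computed as dNBR = NBR before the fire minus NBR after it. A minimal sketch with hypothetical reflectance values; the abstract does not state which Sentinel-2 bands the authors used, so the inputs are simply labelled by spectral region:

```python
def normalized_difference(band_a, band_b):
    """(a - b) / (a + b), returning 0.0 when both reflectances are zero."""
    total = band_a + band_b
    if total == 0:
        return 0.0
    return (band_a - band_b) / total


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def nbr(nir, swir):
    """Normalized Burn Ratio."""
    return normalized_difference(nir, swir)


def dnbr(pre_nir, pre_swir, post_nir, post_swir):
    """Pre-fire NBR minus post-fire NBR; larger values suggest more severe burn."""
    return nbr(pre_nir, pre_swir) - nbr(post_nir, post_swir)


# Hypothetical surface reflectances for one pixel before and after a fire.
pre_fire_nbr = nbr(nir=0.5, swir=0.1)
burn_change = dnbr(pre_nir=0.5, pre_swir=0.1, post_nir=0.2, post_swir=0.3)
print(pre_fire_nbr, burn_change)
```

The study's point is that thresholds on such index changes alone are often insufficient, which is why the index values would serve as input features to a classifier rather than as the final severity label.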

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rely on strict assumptions. Such assumptions include linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. Furthermore, SVM does not require many data samples for training, since it builds prediction models using only a few representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance.
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance in multi-class classification problems as much as SVM does in binary-class classification. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they can deteriorate classification performance. Third, a difficulty in multi-class prediction problems lies in the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built because of a skewed boundary, which reduces the classification accuracy of the classifier. SVM ensemble learning is one of the machine learning methods for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Thus, boosting attempts to produce new classifiers that are better able to predict the examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem.
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation was performed three times with different random seeds to ensure that the comparison among the three classifiers does not arise by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is used in turn as the test set while the classifier trains on the other nine sets. That is, the cross-validated folds were tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results suggest that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
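
The abstract reports both arithmetic and geometric mean-based accuracy without defining the latter. A minimal sketch, assuming the common definition of the geometric mean of per-class recall (an assumption, not the paper's stated formula), shows why the geometric figure falls sharply when a minority class is predicted poorly; the labels and data are hypothetical:

```python
import math
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def arithmetic_accuracy(y_true, y_pred):
    """Share of all observations predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def geometric_mean_accuracy(y_true, y_pred):
    """Geometric mean of the per-class recalls."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))


# Hypothetical imbalanced ratings: class "A" dominates, "B" and "C" are rare.
truth = ["A"] * 6 + ["B"] * 2 + ["C"] * 2
guess = ["A"] * 6 + ["B", "A"] + ["C", "C"]
print(arithmetic_accuracy(truth, guess))      # overall hit rate
print(geometric_mean_accuracy(truth, guess))  # penalises the weak class "B"
```

Because the geometric mean is a product, a single class with zero recall drives it to zero, which is one reason such a measure is sensitive to minority-class errors in a way overall accuracy is not.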

A PLS Path Modeling Approach on the Cause-and-Effect Relationships among BSC Critical Success Factors for IT Organizations (PLS 경로모형을 이용한 IT 조직의 BSC 성공요인간의 인과관계 분석)

  • Lee, Jung-Hoon;Shin, Taek-Soo;Lim, Jong-Ho
    • Asia pacific journal of information systems / v.17 no.4 / pp.207-228 / 2007
  • Measurement of Information Technology (IT) organizations' activities was long limited mainly to financial indicators. However, given the multifarious functions of information systems, a number of studies have examined new measurement methodologies that combine financial measurement with new measurement methods. In particular, research on the IT Balanced Scorecard (BSC), a concept derived from the BSC for measuring IT activities, has also been carried out in recent years. BSC provides more advantages than merely integrating non-financial measures into a performance measurement system. The core of BSC rests on the cause-and-effect relationships between measures, which allow prediction of value chain performance measures, communication and realization of the corporate strategy, and incentive-controlled actions. More recently, BSC proponents have focused on the need to tie measures together into a causal chain of performance, and to test the validity of these hypothesized effects to guide the development of strategy. Kaplan and Norton[2001] argue that one of the primary benefits of the balanced scorecard is its use in gauging the success of strategy. Norreklit[2000] insists that the cause-and-effect chain is central to the balanced scorecard. The cause-and-effect chain is also central to the IT BSC. However, the relationship between information systems and enterprise strategies, as well as the connections between various IT performance measurement indicators, has not been studied much in prior research. Ittner et al.[2003] report that 77% of all surveyed companies with an implemented BSC place little or no interest in soundly modeled cause-and-effect relationships, despite the importance of cause-and-effect chains as an integral part of BSC. This shortcoming can be explained by one theoretical and one practical reason[Blumenberg and Hinz, 2006].
From a theoretical point of view, causalities within the BSC method and their application are only vaguely described by Kaplan and Norton. From a practical point of view, modeling corporate causalities is a complex task because of tedious data acquisition and the subsequent maintenance of reliability. However, cause-and-effect relationships are an essential part of BSCs because they differentiate performance measurement systems like BSCs from simple key performance indicator (KPI) lists. KPI lists present an ad-hoc collection of measures to managers but do not allow for a comprehensive view of corporate performance. Instead, performance measurement systems like BSCs try to model the relationships of the underlying value chain as cause-and-effect relationships. Therefore, to overcome the deficiencies of causal modeling in IT BSC, sound and robust causal modeling approaches are required in theory as well as in practice. The purpose of this study is to suggest critical success factors (CSFs) and KPIs for measuring the performance of IT organizations and to empirically validate the causal relationships between those CSFs. For this purpose, we define four perspectives of BSC for IT organizations according to Van Grembergen's study[2000] as follows. The Future Orientation perspective represents the human and technology resources needed by IT to deliver its services. The Operational Excellence perspective represents the IT processes employed to develop and deliver the applications. The User Orientation perspective represents the user evaluation of IT. The Business Contribution perspective captures the business value of the IT investments. Each of these perspectives has to be translated into corresponding metrics and measures that assess the current situation. This study suggests 12 CSFs for IT BSC based on previous IT BSC studies and COBIT 4.1. These CSFs consist of 51 KPIs.
We define the cause-and-effect relationships among BSC CSFs for IT organizations as follows. The Future Orientation perspective will have positive effects on the Operational Excellence perspective. Then the Operational Excellence perspective will have positive effects on the User Orientation perspective. Finally, the User Orientation perspective will have positive effects on the Business Contribution perspective. This research tests the validity of these hypothesized causal effects and the sub-hypothesized causal relationships. For this purpose, we used the Partial Least Squares approach to Structural Equation Modeling (or PLS Path Modeling) for analyzing multiple IT BSC CSFs. PLS path modeling has special abilities that make it more appropriate than other techniques, such as multiple regression and LISREL, when analyzing small sample sizes. In recent years, the use of PLS path modeling has been gaining interest among IS researchers because of its ability to model latent constructs under conditions of non-normality and with small to medium sample sizes (Chin et al., 2003). The empirical results of our study using PLS path modeling show that the causal effects in IT BSC hypothesized in this study are significant for part of the hypotheses.

Comparison of Forest Carbon Stocks Estimation Methods Using Forest Type Map and Landsat TM Satellite Imagery (임상도와 Landsat TM 위성영상을 이용한 산림탄소저장량 추정 방법 비교 연구)

  • Kim, Kyoung-Min;Lee, Jung-Bin;Jung, Jaehoon
    • Korean Journal of Remote Sensing / v.31 no.5 / pp.449-459 / 2015
  • The conventional National Forest Inventory (NFI)-based forest carbon stock estimation method is suitable for national-scale estimation, but not for regional-scale estimation, because of the lack of NFI plots. In this study, for the purpose of regional-scale carbon stock estimation, we created grid-based forest carbon stock maps using spatial ancillary data and two types of up-scaling methods. Chungnam province was chosen as the study area, and the 5th NFI (2006-2009) data were collected for it. The first method (method 1) uses a forest type map as ancillary data and a regression model for forest carbon stock estimation, whereas the second method (method 2) uses satellite imagery and the k-Nearest Neighbor (k-NN) algorithm. Additionally, in order to consider uncertainty effects, the final AGB carbon stock maps were generated by performing 200 iterations of Monte Carlo simulation. As a result, compared to the NFI-based estimate (21,136,911 tonC), the total carbon stock was over-estimated by method 1 (22,948,151 tonC) but under-estimated by method 2 (19,750,315 tonC). In a paired t-test with 186 independent data points, the average carbon stock estimated by the NFI-based method was statistically different from that of method 2 (p<0.01), but not different from that of method 1 (p>0.01). In particular, the Monte Carlo simulation showed that the smoothing effect of the k-NN algorithm and the mis-registration error between NFI plots and the satellite image can lead to large uncertainty in carbon stock estimation. Although method 1 was found suitable for carbon stock estimation of forest stands that feature heterogeneous trees in Korea, a satellite-based method is still needed to provide periodic estimates for large, un-investigated forest areas. In these respects, future work will focus on extending the spatial and temporal extent of the study area and on robust carbon stock estimation with various satellite images and estimation methods.
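
The k-NN step of method 2 can be sketched as follows, assuming an unweighted mean of the k field plots that are nearest in spectral feature space. The abstract does not state the value of k, the distance metric, or the bands, so everything below (names, feature vectors, carbon values) is hypothetical:

```python
import math


def knn_estimate(plots, pixel_features, k=3):
    """Estimate a pixel's carbon stock as the mean of its k nearest plots.

    plots: list of (feature_vector, carbon_value) pairs from field plots
    pixel_features: spectral feature vector of the pixel to be estimated
    """
    if k <= 0 or k > len(plots):
        raise ValueError("k must be between 1 and the number of plots")
    # Rank plots by Euclidean distance in feature space.
    ranked = sorted(plots, key=lambda plot: math.dist(plot[0], pixel_features))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / k


# Hypothetical reference plots: (two-band reflectance, carbon stock in tonC/ha).
reference_plots = [
    ((0.10, 0.20), 10.0),
    ((0.20, 0.20), 20.0),
    ((0.90, 0.80), 90.0),
    ((0.15, 0.25), 30.0),
]
single = knn_estimate(reference_plots, (0.10, 0.20), k=1)
smoothed = knn_estimate(reference_plots, (0.10, 0.20), k=3)
print(single, smoothed)
```

Averaging over several neighbours pulls estimates toward the local mean, which is one way to picture the smoothing effect the abstract mentions. A Monte Carlo run in the spirit of the study would repeat such an estimate with perturbed inputs and summarize the spread, which this sketch does not attempt.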

Association between seafood intake and frailty according to gender in Korean elderly: data procured from the Seventh (2016-2018) Korea National Health and Nutrition Examination Survey (한국 노인의 성별에 따른 수산물 섭취 수준과 노쇠 위험성의 상관성 연구: 제 7기 (2016-2018) 국민건강영양조사 자료를 이용하여)

  • Won Jang;Yeji Choi;Jung Hee Cho;Donglim Lee;Yangha Kim
    • Journal of Nutrition and Health / v.56 no.2 / pp.155-167 / 2023
  • Purpose: This study investigates the association between seafood consumption and frailty according to gender in the Korean elderly. Methods: Cross-sectional data from the Seventh (2016-2018) Korea National Health and Nutrition Examination Survey were procured for this study. Data from 3,675 subjects (1,643 men and 2,032 women) aged ≥ 65 years were analyzed. Levels of seafood intake were assessed by a one-day 24-hour dietary recall, and subjects were classified by gender into tertiles of seafood intake and by frailty phenotype as robust, pre-frail, or frail. Multinomial logistic regression analysis was performed to clarify the association between seafood consumption and frailty for each gender. Results: The prevalence of frailty was 13.4% for men and 29.7% for women. Participants with a higher seafood intake had higher intakes of grains, fruits, and vegetables, while their intake of meat was significantly lower. In both men and women, the group with higher seafood intake showed higher energy and micronutrient intakes. The prevalence of frailty and the frailty score were significantly lower in the highest tertile of seafood consumption than in the lowest tertile in both men and women (p < 0.001). After adjusting for confounders, the highest tertile of seafood consumption showed a decreased risk of frailty compared to the lowest tertile only in women (hazard ratio [HR], 0.50; 95% confidence interval [CI], 0.32-0.78; p-trend = 0.008 vs. HR, 0.52; 95% CI, 0.32-0.83; p-trend = 0.008; respectively). Conclusion: The results of this study suggest that seafood consumption potentially decreases the risk of frailty in the elderly.