• Title/Summary/Keyword: 지적정보 (cadastral information)

Search Results: 2,431

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul;Kim, Jaeseong;Choi, Sangok
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.105-129
    • /
    • 2020
  • This study uses corporate data from 2012 to 2018, when K-IFRS was applied in earnest, to predict default risk. The data used in the analysis totaled 10,545 rows and 160 columns, including 38 from the statement of financial position, 26 from the statement of comprehensive income, 11 from the statement of cash flows, and 76 financial-ratio indices. Unlike most prior studies, which used default events as the basis for learning default risk, this study calculated default risk from each company's market capitalization and stock price volatility based on the Merton model. This resolves the data imbalance problem caused by the scarcity of default events, which has been pointed out as a limitation of existing methodologies, and allows differences in default risk that exist among ordinary companies to be reflected. Because learning was conducted using only corporate information that is also available for unlisted companies, the default risk of unlisted companies without stock price information can be derived appropriately. This makes it possible to provide stable default risk assessment services to companies for which traditional credit rating models have difficulty determining proper default risk, such as small and medium-sized companies and startups. Although predicting corporate default risk with machine learning has recently been studied actively, model bias issues exist because most studies make predictions with a single model. A stable and reliable valuation methodology is required for calculating default risk, given that a company's default risk information is used very widely in the market and sensitivity to differences in default risk is high; strict standards are also required for the calculation method. The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for the preparation of evaluation methods, including verification of their adequacy, in consideration of past statistical data and experience with credit ratings as well as changes in future market conditions. This study reduced the bias of individual models by utilizing a stacking ensemble technique that synthesizes various machine learning models. This allows us to capture complex nonlinear relationships between default risk and various corporate information and to maximize the advantages of machine learning-based default risk prediction models, which take less time to compute. To calculate the sub-model forecasts used as input data for the stacking ensemble model, the training data were divided into seven pieces, and the sub-models were trained on the divided sets to produce forecasts. To compare predictive power with the stacking ensemble model, Random Forest, MLP, and CNN models were trained on the full training data, and the predictive power of each model was verified on the test set. The analysis showed that the stacking ensemble model exceeded the predictive power of the Random Forest model, which performed best among the single models. Next, to check for statistically significant differences between the stacking ensemble model and each individual model, pairs of forecasts between the stacking ensemble model and each individual model were constructed. Because the Shapiro-Wilk normality test showed that none of the pairs followed normality, the nonparametric Wilcoxon rank-sum test was used to check whether the two model forecasts in each pair differed significantly. The analysis showed that the forecasts of the stacking ensemble model differed significantly from those of the MLP and CNN models. In addition, this study provides a methodology that allows existing credit rating agencies to apply machine learning-based default risk prediction, given that traditional credit rating models can also be incorporated as sub-models when calculating the final default probability. The stacking ensemble technique proposed in this study can also help designs meet the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope this research will be used as a resource to increase practical adoption by overcoming the limitations of existing machine learning-based models.
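
A minimal sketch of the workflow described in this abstract, assuming scikit-learn and SciPy; the feature matrix, the Merton-style risk target, and the sub-model settings are placeholders, not the authors' actual data or configuration:

```python
# Illustrative only: stacking with 7-fold sub-model forecasts, plus the
# Shapiro-Wilk / Wilcoxon rank-sum checks mentioned in the abstract.
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 160))   # 160 financial columns (placeholder data)
y = rng.uniform(size=1000)         # Merton-based default risk in [0, 1] (placeholder)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
                ("mlp", MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500,
                                     random_state=0))],
    final_estimator=LinearRegression(),
    cv=7,  # the abstract divides the training data into seven pieces
)
stack.fit(X_tr, y_tr)
pred_stack = stack.predict(X_te)

rf_alone = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred_rf = rf_alone.predict(X_te)

# Pairwise comparison: Shapiro-Wilk on the paired differences, then a
# nonparametric Wilcoxon rank-sum test if normality is rejected.
diff = pred_stack - pred_rf
print("Shapiro-Wilk p-value:", stats.shapiro(diff).pvalue)
print("Wilcoxon rank-sum p-value:", stats.ranksums(pred_stack, pred_rf).pvalue)
```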

Fast Join Mechanism that considers the switching of the tree in Overlay Multicast (오버레이 멀티캐스팅에서 트리의 스위칭을 고려한 빠른 멤버 가입 방안에 관한 연구)

  • Cho, Sung-Yean;Rho, Kyung-Taeg;Park, Myong-Soon
    • The KIPS Transactions:PartC
    • /
    • v.10C no.5
    • /
    • pp.625-634
    • /
    • 2003
  • More than a decade after its initial proposal, deployment of IP multicast has been limited due to problems such as traffic control in multicast routing, multicast address allocation on the global Internet, and reliable multicast transport techniques. Lately, with the increase of multicast application services such as Internet broadcasting and real-time security information services, overlay multicast has been developed as a new Internet multicast technology. In this paper, we describe an overlay multicast protocol and propose a fast join mechanism that considers switching of the tree. To find a potential parent, the existing search algorithm descends the tree from the root one level at a time, which causes long joining latency. It also tries to select the nearest node as the potential parent; however, it cannot always do so because of the degree limit of the node, so the generated tree has low efficiency. To reduce the long joining latency and improve the efficiency of the tree, we propose searching two levels of the tree at a time. This method forwards the joining request message to a node's own children, so at ordinary times there is no overhead to maintain the tree; when a joining request arrives, the increased number of search messages reduces the joining latency, and searching more nodes helps construct more efficient trees. To evaluate the performance of our fast join mechanism, we measure metrics such as the search latency, the number of searched nodes, and the number of switchings as functions of the number of members and the degree limit. The simulation results show that the performance of our mechanism is superior to that of the existing mechanism.
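
A minimal sketch of the two-level parent search idea described above, not the paper's protocol implementation; the node structure, the distance() function, and the degree limit are assumptions for demonstration:

```python
# Toy overlay-multicast join: each probed node reports itself, its children,
# and its grandchildren as parent candidates (two levels at a time).
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    pos: tuple                       # (x, y) stand-in for network distance
    degree_limit: int = 3
    children: list = field(default_factory=list)

    def has_spare_degree(self):
        return len(self.children) < self.degree_limit

def distance(a, b):
    return ((a.pos[0] - b.pos[0]) ** 2 + (a.pos[1] - b.pos[1]) ** 2) ** 0.5

def find_parent_two_level(root, newcomer):
    """Descend the tree two levels at a time toward the nearest feasible parent."""
    current = root
    while True:
        candidates = ([current] + current.children
                      + [g for c in current.children for g in c.children])
        feasible = [n for n in candidates if n.has_spare_degree()]
        best = min(feasible or candidates, key=lambda n: distance(n, newcomer))
        if best is current:          # no closer candidate below: join here (switch later if needed)
            return best
        current = best               # keep descending from the best candidate
```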

The Impact of Organizational Internal IT Capability on Agility and Performance: The Moderating Effect of Managerial IT Capability and Top Management Championship (기업 내적 IT 자원이 기업 민첩성과 성과에 미치는 영향: 관리적 IT 능력과 경영진 존재의 조절효과)

  • Kim, Geuna;Kim, Sanghyun
    • Information Systems Review
    • /
    • v.15 no.3
    • /
    • pp.39-69
    • /
    • 2013
  • The business value of information technology has been of major interest to practitioners and scholars alike for decades. Information technology is considered a driving force or success factor of firm agility. The general assumption is that organizations making considerable efforts in IT investment are more agile than organizations that are not. However, IT that should support a firm's strategy can occasionally hinder business or impede the firm's agility. In other words, it is still unclear whether IT helps or hinders firm agility. We therefore note that these contrary aspects of IT, promoting and hindering firm agility, have both been observed frequently, and we theorize the relationships between them. Specifically, we propose that firms need to develop a superior firm-wide IT capability in order to manage IT resources successfully and realize agility. Thus, this paper theorizes two IT capabilities, technical IT capability and managerial IT capability, as key factors affecting firm agility and firm performance. Further, we operationalize firm agility into two sub-types: operational adjustment agility and market capitalizing agility. Data from 171 firms were analyzed using the PLS approach. The results showed that technical IT capability has a positive impact on firm agility and that managerial IT capability positively moderates the relationship between technical IT capability and firm agility. In addition, top management championship positively moderates the relationship between agility and firm performance. Finally, firm agility was found to be a very important causal variable of firm performance. Our study provides more refined and practical empirical evidence on the relationship between IT capability and firm agility by proposing an applicable solution, even though IT has some contradictory effects on firm agility. Our findings suggest many useful implications for agility research, which is still at a relatively early stage, and for working-level officers in organizations.
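
A minimal sketch of how a moderating effect can be tested with a mean-centered interaction term; note this uses OLS via statsmodels for illustration, whereas the paper itself uses a PLS approach, and the variable names and synthetic data are assumptions:

```python
# Moderation as an interaction term: managerial IT capability moderating the
# effect of technical IT capability on agility (synthetic data, OLS sketch).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 171
tech_it = rng.normal(size=n)                 # technical IT capability
mgr_it = rng.normal(size=n)                  # managerial IT capability (moderator)
agility = 0.5 * tech_it + 0.3 * tech_it * mgr_it + rng.normal(scale=0.5, size=n)

interaction = (tech_it - tech_it.mean()) * (mgr_it - mgr_it.mean())
X = sm.add_constant(np.column_stack([tech_it, mgr_it, interaction]))
model = sm.OLS(agility, X).fit()
print(model.summary())   # a significant interaction coefficient indicates moderation
```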


A Comparative Study on the Acceptability and the Consumption Attitude for Soy Foods between Korean and Canadian University Students (한국과 캐나다 대학생들의 콩가공식품에 대한 수응도 및 소비실태 비교 연구)

  • Ahn Tae-Hyun;Paliyath Gopinadhan
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.51 no.5
    • /
    • pp.466-476
    • /
    • 2006
  • The objective of this study was to compare and analyze the acceptability of and consumption attitudes toward soy foods between Korean and Canadian university students as young consumers. The survey was carried out by questionnaire, with n=516 subjects in Korea and n=502 in Canada. Regarding general knowledge of soy foods, respondents held that soy foods are healthy (86.5% of Korean and 53.4% of Canadian students) or neutral (11.6% of Korean and 42.8% of Canadian students), that dairy foods can be substituted by soy foods (51.9% of Korean and 41.8% of Canadian students), and that soy foods are not only for vegetarians and milk allergy patients but also for ordinary people (94.2% of Korean and 87.6% of Canadian students). As for the main sources of information about soy foods, commercials on TV, radio, or in magazines ranked highest for Korean students (58.0%), while family or friends ranked highest for Canadian students (35.7%). In terms of consumption attitudes, all of the Korean students had purchased soy foods, but only 55.4% of Canadian students had; soymilk was the most widely recognized and consumed, followed by soy beverage and margarine. 76.4% of Korean students and 65.1% of Canadian students thought soy foods were common, popular, and easy to purchase. In terms of price, however, soy foods were perceived as expensive: 'more expensive than dairy foods' was chosen by 59.1% (Korean) and 54.7% (Canadian), and 'similar to dairy foods' by 36.8% (Korean) and 39.9% (Canadian). The major reasons for rare consumption were 'I am not interested in soy foods' among Korean students (27.3%) and 'I prefer dairy foods to soy foods' among Canadian students (51.7%). Overall, however, attitudes toward soy food consumption in both countries were very positive, and consumption is expected to increase.

Target Word Selection Disambiguation using Untagged Text Data in English-Korean Machine Translation (영한 기계 번역에서 미가공 텍스트 데이터를 이용한 대역어 선택 중의성 해소)

  • Kim Yu-Seop;Chang Jeong-Ho
    • The KIPS Transactions:PartB
    • /
    • v.11B no.6
    • /
    • pp.749-758
    • /
    • 2004
  • In this paper, we propose a new method utilizing only a raw corpus, without additional human effort, for disambiguating target word selection in English-Korean machine translation. We use two data-driven techniques: Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). These techniques can represent complex semantic structures in given contexts such as text passages. We construct linguistic semantic knowledge using the two techniques and use this knowledge for target word selection in English-Korean machine translation. For target word selection, we utilize grammatical relationships stored in a dictionary. We use the k-nearest neighbor learning algorithm to resolve the data sparseness problem in target word selection and estimate the distance between instances based on these models. In the experiments, we use TREC data from AP news to construct the latent semantic space and the Wall Street Journal corpus to evaluate target word selection. With the latent semantic analysis methods, the accuracy of target word selection improved by over 10%, and PLSA showed better accuracy than LSA. Finally, we show the relationship between accuracy and two important factors, the dimensionality of the latent space and the k value of k-NN learning, using correlation analysis.
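
A minimal LSA sketch in the spirit of this abstract, assuming scikit-learn; the tiny corpus, the context sentence, and the candidate sense words are placeholders, and the paper's actual procedure additionally uses dictionary grammatical relations and k-NN:

```python
# Build a latent semantic space from raw text, then pick the candidate word
# whose vector is closest to the context vector (word sense disambiguation toy).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "the bank approved the loan and the interest rate",
    "the river bank was covered with sand and reeds",
    "the loan interest was paid to the bank account",
    "children played on the sandy bank of the river",
]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)           # document-term matrix
svd = TruncatedSVD(n_components=2, random_state=0)
svd.fit(X)
term_vecs = svd.components_.T                  # term vectors in the latent space

def term_vector(word):
    return term_vecs[vectorizer.vocabulary_[word]].reshape(1, -1)

# Disambiguate "bank" in a financial context by comparing candidate sense words.
context = svd.transform(vectorizer.transform(["loan interest rate"]))
for candidate in ["loan", "river"]:
    sim = cosine_similarity(context, term_vector(candidate))[0, 0]
    print(candidate, round(sim, 3))
```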

A Study on Developing Sensibility Model for Visual Display (시각 디스플레이에서의 감성 모형 개발 -움직임과 색을 중심으로-)

  • 임은영;조경자;한광희
    • Korean Journal of Cognitive Science
    • /
    • v.15 no.2
    • /
    • pp.1-15
    • /
    • 2004
  • A structure of sensibility from motion was developed for the purpose of understanding the relationship between sensibilities and physical factors, so as to apply it to dynamic visual displays. Seventy adjectives were collected by assessing their adequacy for expressing sensibilities from motion and by reporting sensibilities recalled from dynamic displays in achromatic color. Various motion displays with a single moving dot were rated according to the degree of sensibility corresponding to each adjective, on the basis of the Semantic Differential (SD) method. The assessment results were analyzed by factor analysis to reduce the 70 words to 19 fundamental sensibilities from motion. The Multidimensional Scaling (MDS) technique constructed a motion sensibility space in which the 19 sensibilities were distributed along two dimensions, active-passive and bright-dark. Motion types systematically varied in kinematic factors were placed on the two-dimensional space of motion sensibility in order to analyze important variables affecting sensibility from motion. The patterns of placement indicate that speed, together with the cycle and amplitude of the trajectories, tends to partially determine sensibility. Although color and motion both affected sensibility along these dimensions, their combination appeared to let each have a dominant effect in a particular sensibility dimension: motion on active-passive and color on bright-dark.
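
A minimal sketch of the SD-rating, factor analysis, and MDS pipeline described above, assuming scikit-learn; the rating matrix is a random placeholder, not the study's data:

```python
# Reduce a displays-by-adjectives SD rating matrix with factor analysis, then
# embed the adjectives in two dimensions with MDS on correlation distances.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(2)
ratings = rng.integers(1, 8, size=(40, 70)).astype(float)  # 40 displays x 70 adjectives, 7-point scale

# Reduce the 70 adjectives to a smaller set of latent sensibility factors.
fa = FactorAnalysis(n_components=19, random_state=0)
factor_scores = fa.fit_transform(ratings)

# Two-dimensional embedding of the adjectives from their correlation distances.
dissim = pairwise_distances(ratings.T, metric="correlation")
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)
print(coords.shape)   # (70, 2): e.g. an active-passive and a bright-dark axis
```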


A Hybrid Forecasting Framework based on Case-based Reasoning and Artificial Neural Network (사례기반 추론기법과 인공신경망을 이용한 서비스 수요예측 프레임워크)

  • Hwang, Yousub
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.4
    • /
    • pp.43-57
    • /
    • 2012
  • To enhance competitive advantage in a constantly changing business environment, enterprise management must make the right decisions in many business activities based on both internal and external information, so providing accurate information plays a prominent role in management's decision making. Intuitively, historical data can provide a feasible estimate through forecasting models. If the service department can estimate the service quantity for the next period, it can effectively control the inventory of service-related resources such as people, parts, and other facilities, and the production department can create a load map for improving product quality. Obtaining an accurate service forecast therefore appears to be critical for manufacturing companies. Numerous investigations addressing this problem have generally employed statistical methods such as regression or autoregressive moving average models. However, these methods are efficient only for data that are seasonal or cyclical; if the data are influenced by the special characteristics of a product, they are not feasible. In our research, we propose a forecasting framework that predicts the service demand of a manufacturing organization by combining case-based reasoning (CBR) with an unsupervised artificial neural network-based clustering analysis (Self-Organizing Maps, SOM). We believe this is one of the first attempts to apply unsupervised artificial neural network-based machine learning techniques in the service forecasting domain. Our proposed approach has several appealing features: (1) we apply CBR and SOM in a new forecasting domain, service demand forecasting; (2) we combine CBR and SOM to overcome the limitations of traditional statistical forecasting methods, and we developed a service forecasting tool based on the proposed approach using an unsupervised artificial neural network and case-based reasoning. In this research, we conducted an empirical study on a real digital TV manufacturer (Company A) and empirically evaluated the proposed approach and tool using real sales and service-related data from that manufacturer. In our experiments, we explore the performance of the proposed service forecasting framework compared with two other service forecasting methods: a traditional CBR-based forecasting model and the existing service forecasting model used by Company A. We ran each service forecasting method 144 times; each time, input data were randomly sampled for each framework. To evaluate the accuracy of the forecasting results, we used the Mean Absolute Percentage Error (MAPE) as the primary performance measure. We conducted a one-way ANOVA test on the 144 MAPE measurements for the three service forecasting approaches: the F-ratio is 67.25 and the p-value is 0.000, which means the difference among the MAPEs of the three approaches is significant at the 0.000 level. Since there is a significant difference among the approaches, we conducted Tukey's HSD post hoc test to determine exactly which means of MAPE differ significantly from which others. In terms of MAPE, Tukey's HSD post hoc test grouped the three service forecasting approaches into three different subsets in the following order: our proposed approach > the traditional CBR-based service forecasting approach > the existing forecasting approach used by Company A. Consequently, our empirical experiments show that the proposed approach outperformed both the traditional CBR-based forecasting model and the existing service forecasting model used by Company A. The rest of this paper is organized as follows. Section 2 provides research background, including a summary of CBR and SOM. Section 3 presents a hybrid service forecasting framework based on case-based reasoning and Self-Organizing Maps, while the empirical evaluation results are summarized in Section 4. Conclusions and future research directions are discussed in Section 5.
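
A minimal sketch of the evaluation procedure described above (MAPE over repeated runs, one-way ANOVA, then Tukey's HSD), assuming SciPy and statsmodels; the three sets of 144 MAPE measurements are simulated placeholders, not the study's results:

```python
# MAPE metric plus one-way ANOVA and Tukey's HSD across three forecasting approaches.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def mape(actual, forecast):
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.mean(np.abs((actual - forecast) / actual)) * 100

rng = np.random.default_rng(3)
mape_hybrid  = rng.normal(8, 1.5, 144)    # proposed CBR + SOM framework (placeholder)
mape_cbr     = rng.normal(11, 1.5, 144)   # traditional CBR-based forecasting (placeholder)
mape_company = rng.normal(14, 1.5, 144)   # Company A's existing method (placeholder)

f_stat, p_val = stats.f_oneway(mape_hybrid, mape_cbr, mape_company)
print(f"F = {f_stat:.2f}, p = {p_val:.4f}")

scores = np.concatenate([mape_hybrid, mape_cbr, mape_company])
groups = ["hybrid"] * 144 + ["cbr"] * 144 + ["company"] * 144
print(pairwise_tukeyhsd(scores, groups))   # pairwise differences in mean MAPE
```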

Development of an Approach for Analysing Vegetation Community Mosaic Using Landscape Metrics (경관지수를 활용한 식생군락 모자이크화 분석법)

  • Lee, Peter Sang-Hoon;Jeong, Jong-Chul
    • Journal of Cadastre & Land InformatiX
    • /
    • v.47 no.1
    • /
    • pp.161-178
    • /
    • 2017
  • Whereas demand for the development of forested areas, which cover more than 60% of Korean territory, has continued, permission for forest development has still been given from the perspective of effective land utilization rather than conservation. As the assessment of large forested areas usually focuses on forest structure, it is limited in its ability to observe and analyze interior changes in the forest. This study aimed to compute landscape metrics using a presence vegetation map and FRAGSTATS 4.2 and to analyze vegetation mosaics. Colonies in the native vegetation were classified into a series of major groups and sub-groups based on the native species within the colonies. The colonies were investigated by analyzing a suite of landscape metrics: Core Area, Percentage of Landscape, Number of Patches, Patch Density, Largest Patch Index, Total Edge, Edge Density, Landscape Shape Index, Mean Patch Area, and Euclidean Nearest Neighbor. In Chungnam Province, the major groups and sub-groups of colonies were classified based on the proportion of pine and oak species, and pine species were the principal ones in terms of distribution area. As for the competition between pines and oaks, while the coverage of pine-centered colonies was three times larger than that of oak-centered ones, pine colonies showed a greater number of patches and therefore higher fragmentation than oaks at the major group level. For the sub-groups, the colonies with the largest coverage, namely Pinus densiflora-Quercus mongolica colonies among the P. densiflora-centered colonies, Q. acutissima colonies and Q. acutissima-P. densiflora colonies among the Q. acutissima-centered ones, Q. mongolica colonies among the Q. mongolica-centered ones, P. thunbergii colonies among the P. thunbergii-centered ones, and Q. serrata-Q. acutissima colonies among the Q. serrata-centered ones, were also more severely mosaicked than other, smaller colonies. The overall degree of mosaicking estimated by landscape metrics was considered useful for monitoring and investigating vegetation. However, in order to develop a management strategy based on analyzing the causes of the mosaicking process and anticipating trends in vegetation succession, further study of the ecological characteristics of each colony in the vegetation is essential.
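
A minimal sketch of how two of the listed landscape metrics (Number of Patches and Patch Density) can be computed from a categorical raster, assuming SciPy; the study itself used FRAGSTATS 4.2 on a presence vegetation map, and the raster, classes, and cell size here are placeholders:

```python
# Count patches per class with connected-component labeling and derive
# Percentage of Landscape (PLAND) and Patch Density (PD, patches per 100 ha).
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(4)
raster = rng.integers(0, 3, size=(200, 200))   # 0/1/2 = hypothetical colony classes
cell_area_ha = 0.09                             # e.g. 30 m x 30 m cells
landscape_area_ha = raster.size * cell_area_ha

for cls in np.unique(raster):
    mask = raster == cls
    _, n_patches = ndimage.label(mask)          # 4-connected patches of this class
    class_area_ha = mask.sum() * cell_area_ha
    patch_density = n_patches / landscape_area_ha * 100
    print(f"class {cls}: NP={n_patches}, "
          f"PLAND={class_area_ha / landscape_area_ha:.1%}, PD={patch_density:.2f}/100ha")
```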

Sensitivity Analysis of Meteorology-based Wildfire Risk Indices and Satellite-based Surface Dryness Indices against Wildfire Cases in South Korea (기상기반 산불위험지수와 위성기반 지면건조지수의 우리나라 산불발생에 대한 민감도분석)

  • Kong, Inhak;Kim, Kwangjin;Lee, Yangwon
    • Journal of Cadastre & Land InformatiX
    • /
    • v.47 no.2
    • /
    • pp.107-120
    • /
    • 2017
  • There are many wildfire risk indices worldwide, but objective comparisons among such wildfire risk indices and surface dryness indices have not been conducted for wildfire cases in Korea. This paper describes a sensitivity analysis of wildfire risk indices and surface dryness indices for Korea using the LDAPS (Local Analysis and Prediction System) meteorological dataset on a 1.5-km grid and MODIS (Moderate-resolution Imaging Spectroradiometer) satellite images on a 1-km grid. We analyzed meteorology-based wildfire risk indices such as the Australian FFDI (forest fire danger index), the Canadian FFMC (fine fuel moisture code), the American HI (Haines index), and the academically presented MNI (modified Nesterov index). We also examined satellite-based surface dryness indices such as the NDDI (normalized difference drought index) and the TVDI (temperature vegetation dryness index). Comparing the six indices against 120 wildfire cases with a damaged area of over 1 ha during the period from January 2013 to May 2017, we found that the FFDI and FFMC showed good predictability for most wildfire cases, whereas the MNI and TVDI were not suitable for Korea. The NDDI can be used as a proxy parameter for wildfire risk because its average CDF (cumulative distribution function) scores were stably high irrespective of fire size. The indices tested in this paper should be chosen carefully and used in an integrated way so that they can contribute to wildfire forecasting in Korea.
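
A minimal sketch of two of the indices compared above, using commonly cited formulations (the FFDI form attributed to Noble et al. 1980 and the NDDI as an NDVI/NDWI ratio); the exact constants and inputs used in the paper may differ, and the example values are assumptions:

```python
# Commonly cited forms of the FFDI and NDDI, for illustration only.
import numpy as np

def ffdi(temp_c, rel_hum, wind_kmh, drought_factor):
    """McArthur Mark 5 Forest Fire Danger Index (Noble et al. 1980 form)."""
    return 2.0 * np.exp(-0.450 + 0.987 * np.log(drought_factor)
                        - 0.0345 * rel_hum + 0.0338 * temp_c + 0.0234 * wind_kmh)

def nddi(ndvi, ndwi):
    """Normalized Difference Drought Index from NDVI and NDWI composites."""
    return (ndvi - ndwi) / (ndvi + ndwi)

# Example: a warm, dry, windy spring day (hypothetical grid-cell values).
print(round(ffdi(temp_c=25, rel_hum=20, wind_kmh=30, drought_factor=8), 1))
print(round(nddi(ndvi=0.45, ndwi=0.10), 2))
```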

Analysis of Traffic Accidents Injury Severity in Seoul using Decision Trees and Spatiotemporal Data Visualization (의사결정나무와 시공간 시각화를 통한 서울시 교통사고 심각도 요인 분석)

  • Kang, Youngok;Son, Serin;Cho, Nahye
    • Journal of Cadastre & Land InformatiX
    • /
    • v.47 no.2
    • /
    • pp.233-254
    • /
    • 2017
  • The purpose of this study is to analyze the main factors influencing the severity of traffic accidents and to visualize the spatiotemporal characteristics of traffic accidents in Seoul. To do this, we collected traffic accident data that occurred in Seoul over four years, from 2012 to 2015, and classified the accidents as slight, serious, or fatal according to severity. The spatiotemporal characteristics of traffic accidents were analyzed with kernel density analysis, hotspot analysis, space-time cube analysis, and emerging hotspot analysis. The factors affecting the severity of traffic accidents were analyzed using a decision tree model. The results show that traffic accidents in Seoul are more frequent in the suburbs than in central areas. In particular, traffic accidents were concentrated in some commercial and entertainment areas in Seocho and Gangnam, and these accidents became increasingly intense over time. For fatal traffic accidents, there were statistically significant hotspot areas in Yeongdeungpo-gu, Guro-gu, Jongno-gu, Jung-gu, and Seongbuk-gu; however, hotspots of fatal accidents by time of day showed different patterns. In terms of traffic accident severity, the type of accident is the most important factor, followed by the type of road, the type of vehicle, the time of the accident, and the type of regulation violated, in order of importance. Regarding the decision rules that lead to serious traffic accidents, for vans or trucks there is a high probability of a serious accident where the road is wide and the vehicle speed is high, while for bicycles, cars, motorcycles, and other vehicles there is a high probability of a serious accident under the same circumstances at dawn.
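
A minimal sketch of severity classification with a decision tree, assuming scikit-learn; the feature encoding and the synthetic records are placeholders, not the Seoul accident dataset:

```python
# Fit a small decision tree on encoded accident attributes and inspect
# feature importances and the learned rules.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(5)
n = 500
feature_names = ["accident_type", "road_type", "vehicle_type", "hour", "violation_type"]
X = np.column_stack([
    rng.integers(0, 5, n),    # accident type (encoded)
    rng.integers(0, 4, n),    # road type
    rng.integers(0, 6, n),    # vehicle type (e.g. 3 = van/truck, 4 = bicycle)
    rng.integers(0, 24, n),   # hour of day
    rng.integers(0, 7, n),    # violation type
])
y = rng.choice(["slight", "serious", "fatal"], size=n, p=[0.7, 0.25, 0.05])

tree = DecisionTreeClassifier(max_depth=4, class_weight="balanced",
                              random_state=0).fit(X, y)
print(dict(zip(feature_names, tree.feature_importances_.round(3))))
print(export_text(tree, feature_names=feature_names))
```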