• Title/Summary/Keyword: 최적회귀모형 (optimal regression model)

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review
    • /
    • v.16 no.3
    • /
    • pp.161-177
    • /
    • 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the phenomenon of biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. 
In particular, the mutation operator helps keep GA from falling into local optima, so we can use it to find the globally optimal or a near-optimal solution. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e., credit rating, was labeled as four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). For each class, 80 percent of the data was used for training and the remaining 20 percent for validation. To overcome the small sample size, we applied five-fold cross-validation to our dataset. In order to examine the competitiveness of the proposed model, we also tested several comparative models including MDA, MLOGIT, CBR, ANN, and MSVM. For MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that enables GA. The other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the comparative models. 
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, X7 (total debt), X9 (sales per employee), X13 (years since founding), X15 (accumulated earnings to total assets), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting the corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same across the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
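
The McNemar comparison mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes the exact (binomial) form of the test applied to per-case correctness of two classifiers on the same validation cases; the abstract does not say which variant was used. The function name `mcnemar_exact` and the toy data are hypothetical and not taken from the paper.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar test on paired correctness flags.

    correct_a[i] / correct_b[i] state whether model A / model B classified
    validation case i correctly. Only discordant pairs carry information.
    Returns (b, c, p_value).
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("inputs must be paired")
    # b: A right and B wrong; c: A wrong and B right
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if not x and y)
    n = b + c
    if n == 0:
        return b, c, 1.0
    k = min(b, c)
    # Under H0 the discordant pairs split 50/50, so b ~ Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data (hypothetical): 14 validation cases, models A and B.
correct_a = [True] * 8 + [False] + [True] * 5
correct_b = [False] * 8 + [True] + [True] * 5
b, c, p = mcnemar_exact(correct_a, correct_b)
print(f"discordant pairs: b={b}, c={c}, p={p:.4f}")
```

Because the test uses only the cases on which the two models disagree, it suits comparisons like the one above, where every model is scored on the same validation set.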

Impacts of Climate Change and Follow-up Cropping Season Shift on Growing Period and Temperature in Different Rice Maturity Types (미래 기후변화 및 그에 따른 재배시기 조정이 벼 생태형별 생육기간과 생육온도에 미치는 영향)

  • Lee, Chung-Kuen;Kwak, Kang-Su;Kim, Jun-Hwan;Son, Ji-Young;Yang, Won-Ha
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.56 no.3
    • /
    • pp.233-243
    • /
    • 2011
  • This experiment was conducted to investigate the effect of future climate change on growing period and temperature in different rice maturity types as global warming progresses. Odaebyeo, Hwaseongbyeo, and Ilpumbyeo were used as representative cultivars of the early, medium, and medium-late maturity types, respectively, and the A1B scenario was applied to weather data for future climate change at 57 sites in Korea. When the cropping season was not adjusted to climate change, the entire growing period shortened and the growing temperature rose as global warming progressed. On the other hand, when the cropping season was adjusted to climate change, the growing period and temperature after heading did not change, whereas the growing period before heading shortened and the growing temperature before heading rose more severely than with an unadjusted cropping season. This suggests that adjusting the cropping season to climate change can alleviate rice yield reduction and quality deterioration to some degree by improving temperature conditions during the grain-filling period, but that it still has limits, such as a severely shortened growing period, indicating a need to actively develop new rice cultivation methods and varieties for future climate change.

Impacts assessment of Climate changes in North Korea based on RCP climate change scenarios II. Impacts assessment of hydrologic cycle changes in Yalu River (RCP 기후변화시나리오를 이용한 미래 북한지역의 수문순환 변화 영향 평가 II. 압록강유역의 미래 수문순환 변화 영향 평가)

  • Jeung, Se Jin;Kang, Dong Ho;Kim, Byung Sik
    • Journal of Wetlands Research
    • /
    • v.21 no.spc
    • /
    • pp.39-50
    • /
    • 2019
  • This study aims to assess the influence of climate change on the hydrological cycle at a basin level in North Korea. The model selected for this study is MRI-CGCM 3, which was used for the Coupled Model Intercomparison Project Phase 5 (CMIP5). Moreover, this study adopted Spatial Disaggregation-Quantile Delta Mapping (SDQDM), one of the statistical downscaling techniques, to conduct the bias correction for climate change scenarios. The validity of the technique was reviewed by comparing results before and after applying SDQDM. In addition, since this study determined the influence of climate change on the hydrological cycle, it also examined runoff in North Korea. To predict such influence, the parameters of the runoff model used for the analysis should be optimized. However, North Korea is classified as an ungauged region because of its political circumstances, and it was difficult to collect the country's runoff observation data. Hence, the study selected 16 basins with high-quality runoff data and calculated optimized parameters of the M-RAT model for them. The study also analyzed the correlation among the basin characteristic variables to account for multicollinearity. Then, based on a stepwise regression analysis, the study developed an equation to calculate parameters for ungauged basins. To verify the equation, the study treated the Osipcheon River, Namdaecheon Stream, Yongdang Reservoir, and Yonggang Stream as ungauged basins and conducted cross-validation. As a result, high efficiency was confirmed for all four basins, with efficiency coefficients of 0.8 or higher. The study used climate change scenarios and the estimated runoff model parameters to assess the changes in hydrological cycle processes caused by climate change at a basin level in the Amnokgang River of North Korea. 
The results showed that climate change would lead to an increase in precipitation, and that the corresponding rise in temperature is predicted to increase evapotranspiration. However, the storage capacity in the basin was found to decrease. The flow duration analysis indicated a decrease in the flow on the 95th day; an increase in the drought flow during the periods of Future 1 and Future 2; and an increase in both flows for the period of Future 3.
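
The cross-validation check described above ("efficiency coefficients of 0.8 or higher") can be sketched as below. The abstract does not name the coefficient; this example assumes it is the Nash-Sutcliffe efficiency, which is a common choice in runoff modelling. The function name and the runoff values are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - residual / variance


# Hypothetical runoff values (m^3/s), for illustration only.
observed = [12.0, 30.0, 55.0, 41.0, 20.0]
simulated = [14.0, 27.0, 50.0, 44.0, 22.0]
nse = nash_sutcliffe(observed, simulated)
print(f"NSE = {nse:.3f}, meets 0.8 threshold: {nse >= 0.8}")
```

In a check of this kind, each basin treated as ungauged would get its own score, computed from runoff simulated with the regression-derived parameters against the runoff actually observed there.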

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.147-168
    • /
    • 2017
  • Accurate stock market forecasting has long been studied in academia, and there are now various forecasting models using a variety of techniques. Recently, many attempts have been made to predict the stock index using machine learning methods, including deep learning. Although both fundamental analysis and technical analysis are used in traditional stock investment, technical analysis is more useful for short-term trading prediction and for applying statistical and mathematical techniques. Most studies using technical indicators have modeled stock price prediction as a binary classification, rising or falling, of future market movement (usually the next trading day). However, this binary classification has many shortcomings for predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we predict the stock index trend by extending the existing binary scheme to a multi-class system (upward trend, boxed, downward trend). To solve this multi-class classification problem, we use the multiclass support vector machine (MSVM), which has proved superior in prediction performance to techniques such as multinomial logistic regression analysis (MLOGIT), multiple discriminant analysis (MDA), or artificial neural networks (ANN), and propose an optimization model that uses a genetic algorithm as a wrapper to improve its performance. In particular, the proposed model, named GA-MSVM, is designed to maximize model performance by optimizing not only the kernel function parameters of MSVM but also the selection of input variables (feature selection) and of training instances (instance selection). 
To verify the performance of the proposed model, we applied it to real data. The results show that the proposed method is more effective than the conventional MSVM, which has been known to show the best prediction performance so far, as well as existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and CBR. In particular, instance selection was confirmed to play a very important role in predicting the stock index trend, contributing more to the improvement of the model than the other factors. To verify the usefulness of GA-MSVM, we applied it to forecasting the trend of Korea's KOSPI200 stock index. Our research primarily aims to predict trend segments in order to capture trading signals or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index of the KOSPI200 stock index in Korea (2004 ~ 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.). Using a variety of statistical methods including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, took three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data was used for training and the remaining 30% for validation. To verify the performance of the proposed model, comparative experiments were conducted with several models, including MDA, MLOGIT, CBR, ANN, and MSVM. For MSVM, we adopted the One-Against-One (OAO) approach, which is known as the most accurate among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.
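
One way to picture the simultaneous optimization described above is as a single chromosome that a genetic algorithm evolves and a wrapper decodes before each SVM training run. The sketch below is an assumption about the encoding, not the authors' implementation: it assumes an RBF-style kernel with two parameters (`c`, `gamma`), 8 bits per parameter, and illustrative log-scale ranges. All names are hypothetical.

```python
def decode(chromosome, n_features, n_instances, bits_per_param=8):
    """Split a flat 0/1 chromosome into kernel parameters and two selection masks."""
    expected = 2 * bits_per_param + n_features + n_instances
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} genes, got {len(chromosome)}")

    def to_unit(bits):
        # Map a bit string to a fraction in [0, 1].
        value = int("".join(map(str, bits)), 2)
        return value / (2 ** len(bits) - 1)

    p = bits_per_param
    # Assumed search ranges: C in [1e-2, 1e3], gamma in [1e-4, 1e1], both on a log scale.
    c = 10 ** (-2 + 5 * to_unit(chromosome[:p]))
    gamma = 10 ** (-4 + 5 * to_unit(chromosome[p:2 * p]))
    feature_mask = chromosome[2 * p:2 * p + n_features]
    instance_mask = chromosome[2 * p + n_features:]
    return c, gamma, feature_mask, instance_mask


def apply_masks(rows, feature_mask, instance_mask):
    """Keep only the selected training rows (instances) and columns (features)."""
    return [
        [v for v, keep in zip(row, feature_mask) if keep]
        for row, keep_row in zip(rows, instance_mask) if keep_row
    ]


# Toy example: 3 candidate features, 4 training instances.
genes = [0] * 8 + [1] * 8 + [1, 0, 1] + [1, 1, 0, 1]
c, gamma, f_mask, i_mask = decode(genes, n_features=3, n_instances=4)
train = apply_masks([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]], f_mask, i_mask)
print(c, gamma, train)
```

A fitness function would then train an MSVM on `train` with the decoded parameters and return its validation accuracy; that step is omitted here because it depends on the SVM library used.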

Estimation of Willingness to pay for Realtime Route Guidance Information by Contingent Valuation Method (조건부가치측정법(CVM)을 이용한 실시간 경로안내시스템의 지불의사액 산정)

  • Do, Myung-Sik;Kim, Yoon-Sik
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.11 no.5
    • /
    • pp.46-55
    • /
    • 2012
  • This study proposes a method for estimating willingness to pay (WTP) for real-time route guidance systems using the contingent valuation method (CVM) with double-bounded dichotomous choice questions (DBDCQ), and analyzes the factors that affect the WTP estimate. The real-time traffic information service was assumed to provide the optimal route for the traffic conditions between origin and destination. The analysis targets were classified into two groups, short-distance paths and middle-distance paths, and the annual WTP for the real-time route guidance system was estimated using survival analysis and a regression model with personal information, actual usage of and satisfaction with the information, and users' awareness and usage of facilities. As a result, the mean WTP for the real-time route guidance system is 4,034 won/year for short-distance paths and 4,884 won/year for middle-distance paths. Real-time route guidance is therefore valued more highly for longer paths than for shorter ones. Moreover, the higher-income group expressed a greater need for the information, and higher WTP was estimated for vehicle owners and for those less familiar with the route.

An Empirical Test of the Dynamic Optimality Condition for Exhaustible Resources -An Input Distance Function- (투입물거리함수를 통한 고갈자원의 동태적 최적이용 여부 검증)

  • Lee, Myunghun
    • Environmental and Resource Economics Review
    • /
    • v.15 no.4
    • /
    • pp.673-692
    • /
    • 2006
  • In order to test the dynamic optimality condition for the use of a nonrenewable resource, it is necessary to estimate the shadow value of the resource in situ. In the previous literature, a time series for the in situ price has been derived either as the difference between marginal revenue and marginal cost or by differentiating the restricted cost function, in which the quantity of ore is quasi-fixed, with respect to the quantity of ore extracted. However, not only are inconsistent estimates likely to be generated due to the nonmalleability of capital, but the estimate of marginal revenue will also be affected by market power. Since firms will likely fail to minimize the cost of the reproducible inputs subject to market prices under realistic circumstances where imperfect factor markets, strikes, or government regulations are present, the shadow in situ values obtained by estimating the restricted cost function can be biased. This paper provides a valid methodology for checking the dynamic optimality condition for a nonrenewable resource by using the input distance function. Our methodology has some advantages over previous ones: only data on quantities of inputs and outputs are required; the maintained hypothesis of cost minimization is not required; and the adoption of linear programming enables us to circumvent the autocorrelated errors problem caused by the use of time series or panel data. The dynamic optimality condition for domestic coal mining does not hold for constant discount rates ranging from 2 to 20 percent over the period 1970~1993. The dynamic optimality condition also does not hold for variable rates ranging from one-fourth to four times the real interest rate.
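
For orientation, the dynamic optimality condition for an exhaustible resource is commonly written as Hotelling's rule. The form below is a sketch under the simplest assumptions (no stock effect on extraction cost); the symbols $\mu_t$ (in situ shadow price in period $t$) and $r$ (discount rate) are notation introduced here, not taken from the paper.

$$\mu_{t+1} = (1 + r)\,\mu_t \quad\Longleftrightarrow\quad \frac{\mu_{t+1} - \mu_t}{\mu_t} = r$$

Read this way, the test asks whether the estimated shadow price series grows at the discount rate, and the abstract reports that it does not for the rates considered.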

Extractions of Surface-Active Substances from Defatted Rapeseed Meal (Brassica napus L.) by Supercritical Carbon Dioxide (초임계 CO2 유체 추출법을 이용한 탈지 유채박 중 표면활성물질 추출의 최적화)

  • Kim, Jeong-Won;Jeong, Yong-Seon;Gil, Na-Young;Lee, Eui-Seok;Lee, Yong-Hwa;Jang, Young-Seok;Lee, Ki-Teak;Hong, Soon-Taek
    • The Korean Journal of Food And Nutrition
    • /
    • v.26 no.4
    • /
    • pp.831-840
    • /
    • 2013
  • In this study, surface-active substances were extracted from defatted rapeseed cake using supercritical carbon dioxide. The independent variables of the extraction process, formulated by a D-optimal design, were pressure (150~350 bar), temperature ($33{\sim}65^{\circ}C$), and co-solvent (ethanol, 50~250 g). The dependent variables, namely the extraction yield and the contents of neutral lipids, phospholipids, and glycolipids in the extracts, were analyzed by response surface methodology. The extraction yield increased with increasing values of the independent variables, among which the co-solvent proved to be the major influencing parameter. Similar trends were found for the contents of surface-active substances (i.e., phospholipids and glycolipids) in the extracts, but not for the content of neutral lipids. The suggested regression equations agreed well with the experimental results. The extraction conditions that maximized the extraction yield and the contents of phospholipids and glycolipids were 350 bar (pressure), $65^{\circ}C$ (temperature), and 228.55 g (co-solvent).

A Study on the Derivation of the Unit Hydrograph using Multiple Regression Model (다중회귀모형으로 추정된 모수에 의한 최적단위유량도의 유도에 관한 연구)

  • 이종남;김채원;황창현
    • Water for future
    • /
    • v.25 no.1
    • /
    • pp.93-100
    • /
    • 1992
  • The purpose of this study is to derive an optimal unit hydrograph using the multiple regression model, particularly when only a small amount of data is available. The presence of multicollinearity among the input data can cause serious oscillations in the derived unit hydrograph. In this case, the oscillations in the unit hydrograph ordinates are eliminated by combining the data. The data used in this study are based on the collection and arrangement of rainfall-runoff data (1977-1989) at the Soyang-river Dam site. With the matrix X as the rainfall series, the condition number and the reciprocal of the minimum eigenvalue of $X^TX$ are calculated by the Jacobi method and compared with the oscillation in the unit hydrograph. The optimal unit hydrograph is derived by combining the numerous rainfall-runoff data. The conclusions are as follows: 1) The oscillations in the derived unit hydrograph are reduced by combining the data from each flood event. 2) The reciprocal of the minimum eigenvalue of $X^TX$, 1/k, and the condition number CN increase when the oscillations are active in the derived unit hydrograph. 3) The parameter estimates are validated by extending the model to the Soyang-river Dam site with elimination of the autocorrelation in the disturbances. Finally, this paper illustrates the application of the multiple regression model to derive an optimal unit hydrograph while dealing with the multicollinearity and autocorrelation that cause problems.
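
The multicollinearity diagnostics described above can be sketched in a few lines. This example assumes the usual linear unit hydrograph model, in which runoff equals a convolution matrix X of rainfall times the unit hydrograph ordinates, and it assumes the condition number is the ratio of the largest to the smallest eigenvalue of $X^TX$; the abstract does not define CN. The eigenvalues are found with cyclic Jacobi rotations. Function names and the rainfall values are hypothetical.

```python
import math


def rainfall_matrix(rain, n_ordinates):
    """Convolution matrix X with X[i][j] = rain[i - j], so that runoff = X u."""
    n_rows = len(rain) + n_ordinates - 1
    return [[rain[i - j] if 0 <= i - j < len(rain) else 0.0
             for j in range(n_ordinates)] for i in range(n_rows)]


def gram(x):
    """Return X^T X as a list of lists."""
    cols = list(zip(*x))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def jacobi_eigenvalues(a, tol=1e-12, max_sweeps=100):
    """Eigenvalues (ascending) of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in a]
    n = len(a)
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q]; |theta| <= pi/4.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q (A J)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q (J^T A J)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def collinearity_diagnostics(rain, n_ordinates):
    """Return (condition number, 1 / minimum eigenvalue) of X^T X."""
    eig = jacobi_eigenvalues(gram(rainfall_matrix(rain, n_ordinates)))
    return eig[-1] / eig[0], 1.0 / eig[0]


# Hypothetical rainfall series (mm per time step), for illustration only.
cn, inv_min = collinearity_diagnostics([1.0, 2.0], n_ordinates=2)
print(f"condition number = {cn:.4f}, 1/lambda_min = {inv_min:.4f}")
```

Larger values of either diagnostic indicate a more ill-conditioned least-squares problem, which is the situation the abstract associates with oscillating unit hydrograph ordinates.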

Optimization for Maillard Reaction Substrate Conditions of Ribose and Hydrolyzed Wheat Gluten Solution Using Response Surface Methodology (반응표면분석법을 이용한 Ribose와 소맥 글루텐 산 가수분해물의 마이얄 반응기질 조건 최적화)

  • Moon, Ji-Hye;Choi, Hee-Don;Choi, In-Wook;Kim, Yoon-Sook
    • Journal of the Korean Society of Food Science and Nutrition
    • /
    • v.40 no.3
    • /
    • pp.458-465
    • /
    • 2011
  • Response surface methodology (RSM) was applied to optimize the substrate conditions of ribose and hydrolyzed wheat gluten solution for the Maillard reaction. The independent variables were NaCl concentration of hydrolyzed wheat gluten ($X_1$), concentration of ribose ($X_2$), and concentration of hydrolyzed wheat gluten ($X_3$), while the dependent variables of the central composite design (CCD) were browning index (absorbance at 420 nm), DPPH radical scavenging activity (DF), and sensory preference (score). The optimum substrate conditions for a 30 min reaction at $140^{\circ}C$ were 3% NaCl concentration of hydrolyzed wheat gluten, 6.2% ribose, and 13.27% hydrolyzed wheat gluten. The coefficients of determination ($R^2$) were 0.975, 0.960, and 0.854, and the model fit was highly significant (p<0.001). DPPH radical scavenging activity and sensory preference were predicted as 700 (DF) and 8.42 (score), respectively. Browning and DPPH radical scavenging activity of the model solution increased with increasing ribose and hydrolyzed wheat gluten concentrations. In particular, hydrolyzed wheat gluten concentration was the most influential factor, while NaCl concentration of hydrolyzed wheat gluten hardly affected the responses. Sensory preference increased with rising wheat gluten concentration and decreasing NaCl concentration of hydrolyzed wheat gluten.

Optimization of Manufacturing Condition and Physicochemical Properties for Mixing Beverage added Extract of Elaeagnus multiflora Thunb. Fruits (뜰보리수 추출물을 첨가한 혼합음료 이화학적 특성과 제조조건의 최적화)

  • Hong, Ju-Yeon;Cha, Hyun-Shik;Shin, Seung-Ryeul;Jeong, Yong-Jin;Youn, Kwang-Sup;Kim, Mi-Hyun;Kim, Nam-Woo
    • Food Science and Preservation
    • /
    • v.14 no.3
    • /
    • pp.269-275
    • /
    • 2007
  • This study aimed to develop an extract of Elaeagnus multiflora as a beverage component, as part of a broader research project on the development of processed foods using Elaeagnus multiflora extract. Acceptable mixing properties of the beverage were significantly related to brix values, pH, total acidity, and total phenol contents. When brown rice vinegar was used as a supplement, the vinegar contributed only 1% of total acidity content, and the brix was below 5% of acceptable level. The maximal total acidity of the mixed beverage was attained by adding 19.2% (v/v) of Elaeagnus multiflora extract and 7.6% (v/v) of brown rice vinegar. The mixed beverage contributed 0.88% of the total acidity content. The maximum brix (11.5) of the mixed beverage was reached with 24.7% (v/v) of Elaeagnus multiflora extract and 4.9% (v/v) of brown rice vinegar. The maximum polyphenol content of the beverage (14.47 mg%) was achieved by adding 25.0% (v/v) of Elaeagnus multiflora extract and 4.3% (v/v) of brown rice vinegar.