• Title/Summary/Keyword: Linear Fitting

Corporate Bond Rating Using Various Multiclass Support Vector Machines (다양한 다분류 SVM을 적용한 기업채권평가)

  • Ahn, Hyun-Chul;Kim, Kyoung-Jae
    • Asia pacific journal of information systems
    • /
    • v.19 no.2
    • /
    • pp.157-178
    • /
    • 2009
  • Corporate credit rating is a very important factor in the market for corporate debt. Information concerning corporate operations is often disseminated to market participants through the changes in credit ratings that are published by professional rating agencies, such as Standard and Poor's (S&P) and Moody's Investors Service. Since these agencies generally require a large fee for the service, and the periodically provided ratings sometimes do not reflect the default risk of the company at the time, it may be advantageous for bond-market participants to be able to classify credit ratings before the agencies actually publish them. As a result, it is very important for companies (especially financial companies) to develop a proper model of credit rating. From a technical perspective, credit rating constitutes a typical multiclass classification problem because rating agencies generally have ten or more categories of ratings. For example, S&P's ratings range from AAA for the highest-quality bonds to D for the lowest-quality bonds. The professional rating agencies emphasize the importance of analysts' subjective judgments in the determination of credit ratings. However, in practice, a mathematical model that uses the financial variables of companies plays an important role in determining credit ratings, since it is convenient to apply and cost-efficient. These financial variables include the ratios that represent a company's leverage status, liquidity status, and profitability status. Several statistical and artificial intelligence (AI) techniques have been applied as tools for predicting credit ratings. Among them, artificial neural networks are most prevalent in the area of finance because of their broad applicability to many business problems and their preeminent ability to adapt. 
However, artificial neural networks also have many defects, including the difficulty of determining the values of the control parameters and the number of processing elements in the layer, as well as the risk of over-fitting. Of late, because of their robustness and high accuracy, support vector machines (SVMs) have become popular as a solution for problems that require accurate prediction. An SVM's solution may be globally optimal because SVMs seek to minimize structural risk. On the other hand, artificial neural network models may tend to find locally optimal solutions because they seek to minimize empirical risk. In addition, no parameters need to be tuned in SVMs, barring the upper bound for non-separable cases in linear SVMs. However, since SVMs were originally devised for binary classification, they are not intrinsically geared for multiclass classification problems such as credit rating. Thus, researchers have tried to extend the original SVM to multiclass classification. Hitherto, a variety of techniques to extend standard SVMs to multiclass SVMs (MSVMs) have been proposed in the literature. However, only a few types of MSVM have been tested in prior studies that apply MSVMs to credit rating. In this study, we examined six different techniques of MSVMs: (1) One-Against-One, (2) One-Against-All, (3) DAGSVM, (4) ECOC, (5) Method of Weston and Watkins, and (6) Method of Crammer and Singer. In addition, we examined the prediction accuracy of some modified versions of conventional MSVM techniques. To find the most appropriate technique of MSVMs for corporate bond rating, we applied all the techniques of MSVMs to a real-world case of credit rating in Korea. The application is corporate bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. For our study, the research data were collected from National Information and Credit Evaluation, Inc., a major bond-rating company in Korea. 
The data set comprises the bond ratings for the year 2002 and various financial variables for 1,295 companies from the manufacturing industry in Korea. We compared the results of these techniques with one another, and with those of traditional methods for credit ratings, such as multiple discriminant analysis (MDA), multinomial logistic regression (MLOGIT), and artificial neural networks (ANNs). As a result, we found that DAGSVM with an ordered list was the best approach for the prediction of bond rating. In addition, we found that the modified version of the ECOC approach can yield higher prediction accuracy for the cases showing clear patterns.
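
To make two of the decomposition schemes named above concrete, the following minimal sketch contrasts One-Against-One voting with DAGSVM-style elimination over an ordered class list. It is an illustration only: `pairwise_winner` is a hypothetical stand-in for a trained binary SVM (it just picks the nearer of two invented class centers), and the four-class list and center values are assumptions, not the study's model or data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical ordered rating classes (best to worst) with invented 1-D centers.
CLASSES = ["AAA", "AA", "A", "BBB"]
CENTERS = {"AAA": 0.0, "AA": 1.0, "A": 2.0, "BBB": 3.0}


def pairwise_winner(x, a, b):
    """Stand-in for a binary SVM trained on classes a and b: nearer center wins."""
    return a if abs(x - CENTERS[a]) <= abs(x - CENTERS[b]) else b


def one_against_one(x, classes):
    """Evaluate all k(k-1)/2 pairwise classifiers and return the majority vote."""
    votes = Counter(pairwise_winner(x, a, b) for a, b in combinations(classes, 2))
    # Ties are broken by position in the class list.
    return max(classes, key=lambda c: votes[c])


def dagsvm(x, ordered_classes):
    """Compare the first and last remaining classes and drop the loser each step.

    Only k-1 binary evaluations are needed for k classes.
    """
    remaining = list(ordered_classes)
    evaluations = 0
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        evaluations += 1
        if pairwise_winner(x, first, last) == first:
            remaining.pop()       # eliminate the last class
        else:
            remaining.pop(0)      # eliminate the first class
    return remaining[0], evaluations


sample = 1.2
print(one_against_one(sample, CLASSES))
print(dagsvm(sample, CLASSES))
```

Both schemes reuse the same pairwise classifiers; the DAG variant simply evaluates fewer of them per prediction, and its path depends on how the class list is ordered.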

Analysis of Cancer Incidence in Zhejiang Cancer Registry in China during 2000 to 2009

  • Du, Ling-Bin;Li, Hui-Zhang;Wang, Xiang-Hui;Zhu, Chen;Liu, Qing-Min;Li, Qi-Long;Li, Xue-Qin;Shen, Yong-Zhou;Zhang, Xin-Pei;Ying, Jiang-Wei;Yu, Chuan-Ding;Mao, Wei-Min
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.15 no.14
    • /
    • pp.5839-5843
    • /
    • 2014
  • Objective: The Zhejiang Provincial Cancer Prevention and Control Office collected cancer registration data during 2000 to 2009 from 6 cancer registries in Zhejiang province of China in order to analyze the cancer incidence. Methods: Descriptive analysis included cancer incidence stratified by sex, age and cancer site group. The proportions and cumulative rates of 10 common cancers in different groups were also calculated. The Chinese population census of 1982 and Segi's population were used for calculating age-standardized incidence rates. A log-linear model was fitted to calculate the incidence trends. Results: The 6 cancer registries in Zhejiang province in China covered a total of 60,087,888 person-years during 2000 to 2009 (males 30,445,904, females 29,641,984). The total number of new cancer cases was 163,104 (males 92,982, females 70,122). Morphologically verified cases accounted for 69.7%, and new cases verified only by information from death certification accounted for 1.23%. The crude incidence rate in Zhejiang cancer registration areas was $271.5/10^5$ during 2000 to 2009 (male $305.41/10^5$, female $236.58/10^5$); the age-standardized incidence rates by Chinese standard population (ASIRC) and by world standard population (ASIRW) were $147.1/10^5$ and $188.2/10^5$, and the cumulative incidence rate (aged from 0 to 74) was 21.7%. The crude incidence rate was $209.6/10^5$ in 2000, and it increased to $320.20/10^5$ in 2009 (an increase of 52.8%), with an annual percent change (APC) of 4.51% (95% confidence interval, 3.25%-5.79%). The age-specific incidence rate peaked in the 80-84 age group. Cancer incidence differed across age groups: liver cancer had the highest incidence in males aged 15-44, breast cancer had the highest incidence in females aged 15-64, and lung cancer had the highest incidence in both males and females over the age of 65 years. 
Conclusions: Lung cancer, digestive system malignancies and breast cancer are the most common cancers in Zhejiang province in China and require special focus. The incidences of thyroid cancer, prostate cancer, cervical cancer and lymphoma have increased rapidly. Prevention and control measures should be implemented for these cancers.
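
An annual percent change of this kind is commonly obtained from a log-linear model: the natural log of the rate is regressed on calendar year, and the slope b is converted as APC = (e^b - 1) x 100. The sketch below assumes ordinary least squares and uses a synthetic rate series that grows by exactly 5% per year; the yearly values are invented for illustration and are not the registry's data.

```python
import math


def annual_percent_change(years, rates):
    """Fit ln(rate) = a + b * year by least squares and return (exp(b) - 1) * 100."""
    n = len(years)
    logs = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return (math.exp(slope) - 1.0) * 100.0


# Synthetic series (assumption): a rate that rises 5% every year from 2000 to 2009.
years = list(range(2000, 2010))
rates = [200.0 * 1.05 ** (y - 2000) for y in years]

apc = annual_percent_change(years, rates)
print(round(apc, 2))
```

Because the fit uses every year rather than only the endpoints, an APC estimated this way need not match the simple first-to-last percentage increase.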

Estimation of $T_2{^*}$ Relaxation Times for the Glandular Tissue and Fat of Breast at 3T MRI System (3테슬러 자기공명영상기기에서 유방의 유선조직과 지방조직의 $T_2{^*}$이완시간 측정)

  • Ryu, Jung Kyu;Oh, Jang-Hoon;Kim, Hyug-Gi;Rhee, Sun Jung;Seo, Mirinae;Jahng, Geon-Ho
    • Investigative Magnetic Resonance Imaging
    • /
    • v.18 no.1
    • /
    • pp.1-6
    • /
    • 2014
  • Purpose: The $T_2{^*}$ relaxation time, which includes susceptibility information, represents a unique feature of tissue. The objective of this study was to investigate $T_2{^*}$ relaxation times of the normal glandular tissue and fat of the breast using a 3T MRI system. Materials and Methods: Seven-echo MR images were acquired from 52 female subjects (age $49{\pm}12$ years; range, 25 to 75) using a three-dimensional (3D) gradient-echo sequence. Echo times ranged from 2.28 ms to 25.72 ms in 3.91 ms steps. Voxel-based $T_2{^*}$ relaxation time and $R_2{^*}$ relaxation rate maps were calculated using linear curve fitting for each subject. The 3D regions-of-interest (ROI) of the normal glandular tissue and fat were drawn on the longest echo-time image to obtain $T_2{^*}$ and $R_2{^*}$ values. Mean values of those parameters were calculated over all subjects. Results: The 3D ROI sizes were $4818{\pm}4679$ voxels and $1455{\pm}785$ voxels for the normal glandular tissue and fat, respectively. The mean $T_2{^*}$ values were $22.40{\pm}5.61$ ms and $36.36{\pm}8.77$ ms for normal glandular tissue and fat, respectively. The mean $R_2{^*}$ values were $0.0524{\pm}0.0134$/ms and $0.0297{\pm}0.0069$/ms for the normal glandular tissue and fat, respectively. Conclusion: $T_2{^*}$ and $R_2{^*}$ values were measured from human breast tissues. $T_2{^*}$ of the normal glandular tissue was shorter than that of fat. Measurement of the $T_2{^*}$ relaxation time could be important for understanding susceptibility effects in breast cancer and normal tissue.
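
The abstract does not spell out the fitting procedure. One common reading of "linear curve fitting" is to assume a mono-exponential decay S(TE) = S0 exp(-TE / T2*), take the natural log of the signal, and fit a straight line whose slope is -R2*. The sketch below follows that assumption for a single voxel; the signal values are synthetic (noise-free, generated from an assumed T2* of 22.40 ms), and only the seven echo times follow the spacing stated above.

```python
import math


def fit_t2_star(echo_times_ms, signals):
    """Least-squares line through ln(signal) vs. echo time.

    Returns (T2* in ms, R2* in 1/ms), where R2* is minus the fitted slope.
    """
    n = len(echo_times_ms)
    log_s = [math.log(s) for s in signals]
    mean_t = sum(echo_times_ms) / n
    mean_y = sum(log_s) / n
    stt = sum((t - mean_t) ** 2 for t in echo_times_ms)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(echo_times_ms, log_s))
    r2_star = -sty / stt
    return 1.0 / r2_star, r2_star


# Seven echoes starting at 2.28 ms in 3.91 ms steps, as described in the abstract.
echo_times = [2.28 + 3.91 * k for k in range(7)]

# Synthetic voxel (assumption): noise-free decay with T2* = 22.40 ms, S0 = 1000.
signals = [1000.0 * math.exp(-te / 22.40) for te in echo_times]

t2_star, r2_star = fit_t2_star(echo_times, signals)
print(round(t2_star, 2), round(r2_star, 4))
```

With real data the log transform amplifies noise at long echo times, so weighted or non-linear fits are sometimes preferred; the unweighted fit here is only the simplest variant.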

Assessment of Hydroureteronephrosis in Children Using Diuretic Radionuclide Ureterography (동위원소 이뇨 요관그람을 이용한 소아 요관폐쇄의 평가)

  • Kim, Jong-Ho;Lee, Dong-Soo;Kwark, Cheol-Eun;Lee, Kyung-Han;Choi, Chang-Woon;Chung, June-Key;Lee, Myung-Chul;Koh, Chang-Soon;Choi, Yong;Choi, Hwang
    • The Korean Journal of Nuclear Medicine
    • /
    • v.28 no.1
    • /
    • pp.75-84
    • /
    • 1994
  • The need for assessment of ureteric function in the patient with an obviously dilated ureter has increased, particularly with the added spectrum of asymptomatic patients presenting with hydronephrosis and hydroureter on antenatal and perinatal ultrasound. To assess the influence of ureteral status on kidney washout during $^{99m}Tc$-DTPA diuretic renography, ureteral images were reviewed in 80 children referred for hydronephrosis. A scintigraphically abnormal ureter was defined as an intense and continuous image of > 10 min during diuretic renography. Of these, a total of 16 nephroureteral systems in 12 children with a scintigraphically abnormal ureter were analyzed. A diuretic washout index using the response half time (t1/2), obtained by linear fitting after Lasix injection, was determined on renal (Kt1/2) and ureteral (Ut1/2) curves (diuretic renogram vs. diuretic ureterogram). Diuretic ureterogram curve patterns corresponding to normal (type I), obstructive (II) and non-obstructive (III) cases were described. Compared with X-ray data, diuretic renography was highly sensitive (88%) and specific (99%) for detecting any ureteral abnormality. Despite an obstructive Kt1/2 (>20 min), no patient with an abnormal ureter underwent therapy at the ureteropelvic junction because the hydronephrosis regressed after surgery at the lower level. Our data indicate that abnormal ureter findings during diuretic renography have to be recognized before therapy for children with hydronephrosis.

Establishment of a Murine Model for Radiation-induced Bone Loss in Growing C3H/HeN Mice (성장기 마우스에서 방사선 유도 골소실 동물모델 확립)

  • Jang, Jong-Sik;Moon, Changjong;Kim, Jong-Choon;Bae, Chun-Sik;Kang, Seong-Soo;Jung, Uhee;Jo, Sung-Kee;Kim, Sung-Ho
    • Journal of Radiation Protection and Research
    • /
    • v.40 no.1
    • /
    • pp.10-16
    • /
    • 2015
  • Bone changes are a common sequela of irradiation in growing animals. The purpose of this study was to establish an experimental model of radiation-induced bone loss in growing mice using micro-computed tomography (${\mu}CT$). The extent of changes following 2 Gy gamma irradiation ($2Gy{\cdot}min^{-1}$) was studied at 4, 8 or 12 weeks after exposure. Mice that received 0.5, 1.0, 2.0 or 4.0 Gy of gamma-rays were examined 8 weeks after irradiation. Tibiae were analyzed using ${\mu}CT$. Serum alkaline phosphatase (ALP) and biomechanical properties were measured and the osteoclast surface was examined. A significant loss of trabecular bone in tibiae was evident 8 weeks after exposure. Measurements performed after irradiation showed a dose-related decrease in trabecular bone volume fraction (BV/TV) and bone mineral density (BMD). The best-fitting dose-response curves were linear-quadratic. Taking the controls into account, the best-fit curves were as follows: BV/TV (%) = $0.9584D^2-6.0168D+20.377$ ($r^2$ = 0.946, D = dose in Gy) and BMD ($mg{\cdot}cm^{-3}$) = $8.8115D^2-56.197D+194.41$ ($r^2$ = 0.999, D = dose in Gy). Body weight did not differ among the groups. No dose-dependent differences were apparent among the groups with regard to mechanical and anatomical properties of the tibia, serum ALP and osteoclast activity. The findings provide the basis required for better understanding of the results that will be obtained in any further studies of radiation-induced bone responses.
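
The two linear-quadratic curves quoted above can be evaluated directly. The short sketch below uses only the coefficients given in the abstract and tabulates the fitted values at the control and the tested doses; the helper `vertex_dose` is an added illustration that locates the dose at which each fitted parabola reaches its minimum, which is a property of the fitted formula rather than a finding reported by the authors.

```python
def bv_tv_percent(dose_gy):
    """Fitted trabecular bone volume fraction (%), coefficients from the abstract."""
    return 0.9584 * dose_gy ** 2 - 6.0168 * dose_gy + 20.377


def bmd_mg_cm3(dose_gy):
    """Fitted bone mineral density (mg/cm^3), coefficients from the abstract."""
    return 8.8115 * dose_gy ** 2 - 56.197 * dose_gy + 194.41


def vertex_dose(a, b):
    """Dose at which the parabola a*D^2 + b*D + c has its extremum."""
    return -b / (2.0 * a)


# Control (0 Gy) plus the doses examined 8 weeks after irradiation.
for dose in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(dose, round(bv_tv_percent(dose), 3), round(bmd_mg_cm3(dose), 2))

print(round(vertex_dose(0.9584, -6.0168), 2), round(vertex_dose(8.8115, -56.197), 2))
```

Both fitted minima fall a little above 3 Gy, so the curves should not be extrapolated beyond the 0 to 4 Gy range that was actually tested.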

Evaluation on the Effects of Deicing Salts on Crop using Seedling Emergence Assay of Oilseed Rape (Brassica napus) (유채의 출아 검정을 통한 제설제의 작물 영향 평가)

  • Lim, Soo-Hyun;Yu, Hyejin;Lee, Chan-Young;Gong, Yu-Seok;Lee, Byung-Duk;Kim, Do-Soon
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.66 no.1
    • /
    • pp.72-79
    • /
    • 2021
  • The increasing use of deicing salts has caused various environmental problems, including crop damage along motorways where deicing salts are sprayed during winter. Deicing salts used on roads have been reported to negatively affect crops, but detailed information on their impact is limited. A seedling emergence assay was conducted to evaluate the effects of deicing salts on crops using oilseed rape (Brassica napus) as a model plant. We tested five chloride deicing salts consisting of NaCl, CaCl2, or MgCl2 and one non-chloride deicing salt (SM-3) at concentrations of 25, 50, 100, 200, and 400 mM, alongside an untreated control. All deicing salts significantly delayed and reduced seedling emergence of oilseed rape as salt concentration increased. Non-linear regression analysis of seedling emergence across salt concentrations, fitted to the log-logistic model, revealed that the chloride deicing salts reduced seedling emergence more than the non-chloride deicing salt SM-3. The GR50 value, the concentration causing a 50% reduction in seedling emergence, of SM-3 was 47.1 mM, while those of the chloride deicing salts ranged from 30.7 mM (PC-10) to 37.5 mM (ES-1), a difference of approximately 10 mM between non-chloride and chloride deicing salts. Our findings suggest that the seedling emergence assay is a useful tool to estimate the potential damage caused by deicing salts on crops.
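
The abstract does not give the exact parameterization of the log-logistic model. A widely used form with a lower limit of zero is y = D / (1 + (x / GR50)^B), where D is the response of the untreated control and B is the slope. The sketch below assumes that form; the GR50 values are taken from the abstract, while the upper limit (100%) and slope (2.0) are invented for illustration only.

```python
def log_logistic(conc_mm, upper, gr50, slope):
    """Log-logistic dose-response with a lower limit of zero.

    At conc_mm == gr50 the response is exactly half of `upper`.
    """
    return upper / (1.0 + (conc_mm / gr50) ** slope)


# GR50 values reported in the abstract (mM).
GR50_SM3 = 47.1    # non-chloride deicing salt
GR50_PC10 = 30.7   # most inhibitory chloride deicing salt

# Assumed parameters (not from the source): 100% emergence in the control, slope 2.
UPPER = 100.0
SLOPE = 2.0

for conc in (0.0, 25.0, 50.0, 100.0, 200.0, 400.0):
    sm3 = log_logistic(conc, UPPER, GR50_SM3, SLOPE)
    pc10 = log_logistic(conc, UPPER, GR50_PC10, SLOPE)
    print(conc, round(sm3, 1), round(pc10, 1))
```

Under these assumptions, a lower GR50 shifts the whole curve to the left, so the chloride salt gives lower predicted emergence than SM-3 at every non-zero concentration.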

A study on solar radiation prediction using medium-range weather forecasts (중기예보를 이용한 태양광 일사량 예측 연구)

  • Sujin Park;Hyojeoung Kim;Sahm Kim
    • The Korean Journal of Applied Statistics
    • /
    • v.36 no.1
    • /
    • pp.49-62
    • /
    • 2023
  • Solar energy, whose share is rapidly increasing, continues to attract development and investment. As new and renewable energy policies such as the Green New Deal are implemented and home solar panel installations increase, the supply of solar energy in Korea is gradually expanding, and research on accurate demand prediction of power generation is actively underway. Solar radiation prediction is important because solar radiation is the factor that most influences power generation demand prediction. This study differs from previous work chiefly in that it attempts to predict solar radiation using medium-term forecast weather data, which previous studies did not use. In this paper, we combined multiple linear regression, KNN, random forest, and SVR models with the K-means clustering technique to predict hourly solar radiation by calculating the probability density function for each cluster. Before using medium-term forecast data, mean absolute error (MAE) and root mean squared error (RMSE) were used as indicators to compare model prediction results. The data, covering March 1, 2017 to February 28, 2022, were converted into daily data to match the medium-term forecast data format. Comparing the predictive performance of the models, the best-performing method predicted daily solar radiation with random forest, grouped dates with similar climate factors, and calculated the probability density function of solar radiation for each cluster. When the model was then fitted to the medium-term forecast data using this methodology, the prediction error increased by date. This seems to be due to prediction error in the mid-term forecast weather data. 
Future studies should add exogenous variables, such as precipitation, from among the weather factors available in the mid-term forecast data, or apply time series clustering techniques.
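
The two comparison metrics named above have standard definitions: MAE is the mean of the absolute errors, and RMSE is the square root of the mean squared error. A minimal sketch follows; the observed and predicted values are invented for illustration and do not come from the study.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average magnitude of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_squared_error(observed, predicted):
    """Square root of the average squared error; penalizes large errors more."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse)


# Hypothetical daily solar radiation values (arbitrary units).
observed = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

mae = mean_absolute_error(observed, predicted)
rmse = root_mean_squared_error(observed, predicted)
print(mae, round(rmse, 4))
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few predictions are far off, which is why the two are often reported together.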

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.185-202
    • /
    • 2012
  • Since the value of information has been realized in the information society, the usage and collection of information has become important. Like an artistic painting, a facial expression carries a wealth of information that could take thousands of words to describe. Following this idea, there have recently been a number of attempts to provide customers and companies with an intelligent service that enables the perception of human emotions through one's facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed a human emotion prediction model and has applied its studies to commercial business. In the academic area, a number of conventional methods such as Multiple Regression Analysis (MRA) or Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized because of its low prediction accuracy. This is inevitable since MRA can only explain the linear relationship between the dependent variable and the independent variables. To mitigate the limitations of MRA, some studies like Jung and Kim (2012) have used ANN as the alternative, and they reported that ANN generated more accurate predictions than statistical methods like MRA. However, ANN has also been criticized due to overfitting and the difficulty of the network design (e.g. setting the number of layers and the number of nodes in the hidden layers). Against this background, we propose a novel model using Support Vector Regression (SVR) in order to increase the prediction accuracy. SVR is an extended version of the Support Vector Machine (SVM) designed to solve regression problems. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ${\varepsilon}$) to the model prediction. 
Using SVR, we tried to build a model that can measure the level of arousal and valence from facial features. To validate the usefulness of the proposed model, we collected data on facial reactions while presenting appropriate visually stimulating content, and extracted the features from the data. Next, preprocessing steps were taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted the '${\varepsilon}$-insensitive loss function' and the 'grid search' technique to find the optimal values of parameters such as C, d, ${\sigma}^2$, and ${\varepsilon}$. In the case of ANN, we adopted a standard three-layer backpropagation network, which has a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used the sigmoid function as the transfer function of the hidden and output nodes. We performed the experiments repeatedly by varying the number of nodes in the hidden layer to n/2, n, 3n/2, and 2n, where n is the number of input variables. The stopping condition for ANN was set to 50,000 learning events. We used MAE (Mean Absolute Error) as the measure for performance comparison. From the experiment, we found that SVR achieved the highest prediction accuracy for the hold-out data set compared to MRA and ANN. Regardless of the target variable (the level of arousal, or the level of positive / negative valence), SVR showed the best performance for the hold-out data set. ANN also outperformed MRA; however, it showed considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to researchers or practitioners who are willing to build models for recognizing human emotions.
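
The ${\varepsilon}$-insensitive loss mentioned above has the standard form max(0, |y - f(x)| - ε): errors inside the tube of half-width ε cost nothing, and errors outside it grow linearly. The sketch below shows only this loss function, not the full SVR training procedure; the target values, predictions and ε = 0.1 are invented for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear in the excess error outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical targets and model predictions.
targets = [1.00, 1.00, 2.00, 3.00]
predictions = [1.05, 1.50, 1.95, 2.40]
EPSILON = 0.1

losses = [epsilon_insensitive_loss(t, p, EPSILON) for t, p in zip(targets, predictions)]

# Points with zero loss lie inside the tube and do not pull on the fitted model.
outside_tube = [i for i, loss in enumerate(losses) if loss > 0.0]
print([round(loss, 2) for loss in losses], outside_tube)
```

This is why the fitted SVR model depends only on a subset of the training data: cases whose error stays within ε contribute nothing to the cost.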