Accuracy of HF radar-derived surface current data in the coastal waters off the Keum River estuary (금강하구 연안역에서 HF radar로 측정한 유속의 정확도)

  • Lee, S.H.;Moon, H.B.;Baek, H.Y.;Kim, C.S.;Son, Y.T.;Kwon, H.K.;Choi, B.J.
    • The Sea: Journal of the Korean Society of Oceanography / v.13 no.1 / pp.42-55 / 2008
  • To evaluate the accuracy of currents measured by HF radar in the coastal sea off the Keum River estuary, we compared the radial vectors of two facing HF radars, and compared HF radar-derived currents with in-situ current measurements. Principal component analysis was used to extract the regression line and the RMS deviation in each comparison. When the radial vectors of the two facing radars are compared at the midpoint of the baseline, the RMS deviation is 4.4 cm/s in winter and 5.4 cm/s in summer. When the GDOP (Geometric Dilution of Precision) effect is removed from the RMS deviations obtained by comparing HF radar-derived currents with current-meter-measured currents, the error of the velocity combined from HF radar radials is less than 5.1 cm/s at stations with moderate GDOP values. These two results, obtained by different methods, suggest that the lower limit of the accuracy of HF radar-derived currents is 5.4 cm/s in our study area. As reported in previous studies, RMS deviations become large at stations located near islands and increase with mean distance from the radar sites, owing to the decrease in signal-to-noise level and in the intersection angle of the radial vectors. We found that the separation of RMS deviations using GDOP values can produce an uncertain error bound for HF radar-derived currents if the GDOP values of the components are very close and the RMS deviations obtained from the component-wise comparison are also close. When the currents measured at stations with moderate GDOP values are separated into tidal and subtidal currents, the tidal current ellipses derived from HF radar currents agree well with those from current-meter measurements, and the time variation of the subtidal current reflects physical processes driven by wind and the density field.
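The abstract says a regression line and an RMS deviation were extracted by principal component analysis. A minimal sketch of one way to do this for paired velocity samples follows. It assumes the regression line is the major principal axis of the 2x2 covariance matrix and that the RMS deviation is measured perpendicular to that axis; the abstract confirms neither detail. The function name `principal_axis_fit` and the sample values are hypothetical.

```python
import math


def principal_axis_fit(x, y):
    """Return (slope, rms_perp) for paired samples x and y.

    slope    : slope of the major principal axis of the (x, y) scatter
    rms_perp : RMS distance of the points from that axis
               (square root of the minor eigenvalue of the covariance matrix)
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Population (co)variances of the two series.
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Orientation of the major principal axis.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    slope = math.tan(theta)
    # Minor eigenvalue of [[sxx, sxy], [sxy, syy]].
    half_trace = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    minor = max(half_trace - radius, 0.0)
    return slope, math.sqrt(minor)


# Hypothetical radial velocities (cm/s), for illustration only.
radar = [10.0, -5.0, 20.0, -15.0, 0.0]
meter = [12.0, -4.0, 17.0, -16.0, 1.0]
slope, rms_perp = principal_axis_fit(radar, meter)
print(f"slope = {slope:.2f}, perpendicular RMS deviation = {rms_perp:.2f} cm/s")
```

Unlike ordinary least squares, this fit treats both series as noisy, which suits a comparison in which neither instrument is taken as error-free.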

The Impacts of Smoking Bans on Smoking in Korea (금연법 강화가 흡연에 미치는 영향)

  • Kim, Beomsoo;Kim, Ahram
    • KDI Journal of Economic Policy / v.31 no.2 / pp.127-153 / 2009
  • There is growing concern about the potential harmful effects of second-hand, or environmental, tobacco smoke. As a result, workplace smoking bans have become more prevalent worldwide. In Korea, workplace smoking ban policy became more restrictive in 2003, when the National Health Enhancing Law was amended. The new law requires all office buildings larger than 3,000 square meters (and multi-purpose buildings larger than 2,000 square meters) to be smoke free. Therefore, many indoor offices became non-smoking areas. Previous studies in other countries often reached contradictory conclusions about the effects of workplace smoking bans on smoking behavior. In addition, no study in Korea has yet examined the causal impact of smoking bans on smoking behavior, and the situation in Korea might differ from that in other countries. Using the 2001 and 2005 Korea National Health and Nutrition Surveys, which are representative of the Korean population, we examine the impact of the law change on current smoking status and on cigarettes smoked per day. The amended law affected the whole country at the same time, and the smoking rate was already declining before the amendment, so the challenge is to isolate the true impact of the law. We compare indoor occupations, which are constrained by the law change, with outdoor occupations, which are less affected. Since the data were collected before (2001) and after (2005) the law change for both the treated group (indoor occupations) and the control group (outdoor occupations), we use the difference-in-differences method. We restrict our sample to the working-age population (between 20 and 65), since this is the population relevant to the workplace smoking ban policy. 
We also restrict the sample to indoor occupations (executive or administrative, and administrative support) and outdoor occupations (sales and low-skilled workers), after dropping the unemployed and those working for the military, since it is not clear whether these belong to the treated group or the control group. This classification is supported by the answers to a question on workplace smoking ban policy that appears only in the 2005 survey: sixty-eight percent of those in indoor occupations reported having an office smoking ban policy, compared with forty percent of those in outdoor occupations. The estimated impact on current smoking is a decline of 4.1 percentage points, and cigarettes per day show a statistically significant decline of 2.5 cigarettes. Given an average consumption of sixteen cigarettes per day among smokers, this is a decline of about sixteen percent in daily consumption, which is substantial. We tested robustness by using the same sample across the two surveys and by using a tobit model; our results are robust to both checks. It is possible that our classification of treated and control groups contains measurement error, which would lead to attenuation bias. However, we still find statistically significant impacts, which may therefore be a lower bound of the true effects. The magnitude of our findings is not much different from previous findings of significant impacts: previous estimates ranged from 1.37 to 3.9 for cigarettes per day and from 1 to 7.8 percentage points for current smoking.
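The difference-in-differences logic described in this abstract can be sketched with four group means. The sketch below is only the textbook two-group, two-period estimator, not the authors' regression specification. The group means passed to `did_estimate` are hypothetical and do not come from the survey; only the 2.5 and 16 cigarettes-per-day figures are taken from the abstract.

```python
def did_estimate(treated_before, treated_after, control_before, control_after):
    """Two-group, two-period difference-in-differences estimate.

    The change in the control group stands in for the common trend,
    so it is subtracted from the change in the treated group.
    """
    return (treated_after - treated_before) - (control_after - control_before)


def relative_decline(absolute_decline, baseline):
    """Express an absolute decline as a fraction of a baseline level."""
    return absolute_decline / baseline


# Hypothetical mean cigarettes per day (NOT values from the survey).
effect = did_estimate(
    treated_before=18.0, treated_after=15.0,   # indoor occupations, 2001 and 2005
    control_before=17.0, control_after=16.0,   # outdoor occupations, 2001 and 2005
)
print(effect)  # change attributed to the law under the common-trend assumption

# Figures quoted in the abstract: a decline of 2.5 from an average of 16 per day.
print(f"{relative_decline(2.5, 16.0):.1%}")
```

Subtracting the control-group change is what removes the pre-existing downward trend in smoking that the abstract identifies as the main identification challenge.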

Corporate Bond Rating Using Various Multiclass Support Vector Machines (다양한 다분류 SVM을 적용한 기업채권평가)

  • Ahn, Hyun-Chul;Kim, Kyoung-Jae
    • Asia Pacific Journal of Information Systems / v.19 no.2 / pp.157-178 / 2009
  • Corporate credit rating is a very important factor in the market for corporate debt. Information concerning corporate operations is often disseminated to market participants through the changes in credit ratings that are published by professional rating agencies, such as Standard and Poor's (S&P) and Moody's Investors Service. Since these agencies generally require a large fee for the service, and the periodically provided ratings sometimes do not reflect the default risk of the company at the time, it may be advantageous for bond-market participants to be able to classify credit ratings before the agencies actually publish them. As a result, it is very important for companies (especially financial companies) to develop a proper model of credit rating. From a technical perspective, credit rating constitutes a typical multiclass classification problem because rating agencies generally have ten or more categories of ratings. For example, S&P's ratings range from AAA for the highest-quality bonds to D for the lowest-quality bonds. The professional rating agencies emphasize the importance of analysts' subjective judgments in the determination of credit ratings. However, in practice, a mathematical model that uses the financial variables of companies plays an important role in determining credit ratings, since it is convenient to apply and cost-efficient. These financial variables include the ratios that represent a company's leverage status, liquidity status, and profitability status. Several statistical and artificial intelligence (AI) techniques have been applied as tools for predicting credit ratings. Among them, artificial neural networks are most prevalent in the area of finance because of their broad applicability to many business problems and their preeminent ability to adapt. 
However, artificial neural networks also have many defects, including the difficulty of determining the values of the control parameters and the number of processing elements in the layer, as well as the risk of over-fitting. Of late, because of their robustness and high accuracy, support vector machines (SVMs) have become popular as a solution for problems that require accurate prediction. An SVM's solution may be globally optimal because SVMs seek to minimize structural risk. On the other hand, artificial neural network models may tend to find locally optimal solutions because they seek to minimize empirical risk. In addition, no parameters need to be tuned in SVMs, barring the upper bound for non-separable cases in linear SVMs. However, since SVMs were originally devised for binary classification, they are not intrinsically suited to multiclass classification such as credit rating. Thus, researchers have tried to extend the original SVM to multiclass classification. Hitherto, a variety of techniques to extend standard SVMs to multiclass SVMs (MSVMs) have been proposed in the literature. However, only a few types of MSVM have been tested in prior studies that apply MSVMs to credit rating. In this study, we examined six different MSVM techniques: (1) One-Against-One, (2) One-Against-All, (3) DAGSVM, (4) ECOC, (5) Method of Weston and Watkins, and (6) Method of Crammer and Singer. In addition, we examined the prediction accuracy of some modified versions of conventional MSVM techniques. To find the most appropriate MSVM technique for corporate bond rating, we applied all the techniques to a real-world case of credit rating in Korea. The application is corporate bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. For our study, the research data were collected from National Information and Credit Evaluation, Inc., a major bond-rating company in Korea. 
The data set comprises the bond ratings for the year 2002 and various financial variables for 1,295 companies from the manufacturing industry in Korea. We compared the results of these techniques with one another, and with those of traditional methods for credit rating, such as multiple discriminant analysis (MDA), multinomial logistic regression (MLOGIT), and artificial neural networks (ANNs). As a result, we found that DAGSVM with an ordered list was the best approach for the prediction of bond rating. In addition, we found that the modified version of the ECOC approach can yield higher prediction accuracy for the cases showing clear patterns.
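Two of the decomposition schemes named in this abstract, One-Against-One and One-Against-All, differ in how many binary SVMs they train and in how the outputs are combined. The sketch below only illustrates that bookkeeping. The binary classifiers themselves are not implemented, and the pairwise winners in the example are hypothetical placeholders for their outputs, not results from the study.

```python
from collections import Counter
from itertools import combinations


def one_against_one_pairs(classes):
    """All unordered class pairs; One-Against-One trains one binary SVM per pair."""
    return list(combinations(classes, 2))


def one_against_all_count(classes):
    """One-Against-All trains one binary SVM per class (that class vs. the rest)."""
    return len(list(classes))


def one_against_one_vote(pairwise_winners):
    """Combine pairwise decisions by majority vote.

    pairwise_winners maps each class pair to the class its binary classifier chose.
    """
    votes = Counter(pairwise_winners.values())
    return votes.most_common(1)[0][0]


ratings = ["AAA", "AA", "A", "BBB"]
print(len(one_against_one_pairs(ratings)), one_against_all_count(ratings))

# Hypothetical outputs of the six pairwise classifiers for one company.
winners = {
    ("AAA", "AA"): "AA", ("AAA", "A"): "A", ("AAA", "BBB"): "BBB",
    ("AA", "A"): "AA", ("AA", "BBB"): "AA", ("A", "BBB"): "A",
}
print(one_against_one_vote(winners))
```

With k classes, One-Against-One needs k(k-1)/2 classifiers against k for One-Against-All, so a ten-category rating scale requires 45 pairwise models.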

Estimation of Surface fCO2 in the Southwest East Sea using Machine Learning Techniques (기계학습법을 이용한 동해 남서부해역의 표층 이산화탄소분압(fCO2) 추정)

  • HAHM, DOSHIK;PARK, SOYEONA;CHOI, SANG-HWA;KANG, DONG-JIN;RHO, TAEKEUN;LEE, TONGSUP
    • The Sea: Journal of the Korean Society of Oceanography / v.24 no.3 / pp.375-388 / 2019
  • Accurate evaluation of the sea-to-air $CO_2$ flux and its variability is crucial for understanding the global carbon cycle and predicting the atmospheric $CO_2$ concentration. $fCO_2$ observations are sparse in space and time in the East Sea. In this study, we derived high-resolution time series of surface $fCO_2$ values in the southwest East Sea by feeding sea surface temperature (SST), sea surface salinity (SSS), chlorophyll-a (CHL), and mixed layer depth (MLD) values, from either satellite observations or numerical model outputs, to three machine learning models. The root mean square error of the best-performing model, a Random Forest (RF) model, was $7.1\;\mu\mathrm{atm}$. The important parameters in predicting $fCO_2$ in the RF model were SST and SSS, along with time information; CHL and MLD were much less important than the other parameters. The net $CO_2$ flux in the southwest East Sea, calculated from the $fCO_2$ predicted by the RF model, was $-0.76 \pm 1.15\;\mathrm{mol\,m^{-2}\,yr^{-1}}$, close to the lower bound of the previous estimates, which range from $-0.66$ to $-2.47\;\mathrm{mol\,m^{-2}\,yr^{-1}}$. The time series of $fCO_2$ predicted by the RF model showed significant variation even over a short time interval of a week. For accurate evaluation of the $CO_2$ flux in the Ulleung Basin, it is necessary to conduct high-resolution in situ observations in spring, when $fCO_2$ changes rapidly.
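The model comparison in this abstract rests on the root mean square error between predicted and observed $fCO_2$. As a reminder of what that metric computes, a minimal sketch follows; the function name `rmse` and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical fCO2 values in micro-atmospheres, for illustration only.
predicted = [350.0, 360.0]
observed = [345.0, 365.0]
print(rmse(predicted, observed))
```

Because the errors are squared before averaging, a few large misses, such as those during rapid springtime changes, weigh more heavily than many small ones.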

Comparison of Treatment Planning System(TPS) and actual Measurement on the surface under the electron beam therapy with bolus (전자선 치료 시 Bolus를 적용한 경우 표면선량의 Treatment Planning System(TPS) 계산 값과 실제 측정값의 비교)

  • Kim, Byeong Soo;Park, Ju Young;Park, Byoung Suk;Song, Yong Min;Park, Byung Soo;Song, Ki Weon
    • The Journal of Korean Society for Radiation Therapy / v.26 no.2 / pp.163-170 / 2014
  • Purpose: When an electron beam, chosen for superficial tumor therapy, is applied with a bolus, the surface dose changes drastically, which can be an important factor in the treatment outcome. This paper therefore compares and analyzes the surface dose calculated by the Treatment Planning System (TPS) against the actually measured surface dose, according to four variables that influence surface dose when a bolus is used in electron therapy. Materials and Methods: Four variables that frequently occur in actual treatments (A: bolus thickness - 3, 5, 10 mm; B: field size - $6\times6$, $10\times10$, $15\times15\;\mathrm{cm^2}$; C: energy - 6, 9, 12 MeV; D: gantry angle - $0^{\circ}$, $15^{\circ}$) were set to compare the measured values with the TPS (Pinnacle 9.2, Philips, USA). Computed tomography (LightSpeed Ultra 16, General Electric, USA) was performed on a 16 cm-thick solid water phantom without bolus; 3, 5 and 10 mm boluses were then created in the TPS, and a total of 54 beams combining A, B, C and D were planned. Each beam was delivered at SSD 100 cm with 300 MU and measured twice with EBT3 film (International Specialty Products, NJ, USA) placed at the isocenter, to compare the measured values with the TPS. The films were analyzed for mean and standard deviation using a digital flatbed scanner (Expression 10000XL, EPSON, USA) and a dose density analysis system (Complete Version 6.1, RIT, USA). Results: By bolus thickness, the measured values for 3, 5 and 10 mm were 101.41%, 99.58% and 101.28% of the TPS-calculated values, respectively, with standard deviations of 0.0219, 0.0115 and 0.0190. By field size, the measured values for $6\times6$, $10\times10$ and $15\times15\;\mathrm{cm^2}$ were 99.63%, 101.40% and 101.24% of the calculated values, with standard deviations of 0.0138, 0.0176 and 0.0220. 
By energy, the measured values for 6, 9 and 12 MeV were 99.72%, 100.60% and 101.96% of the calculated values, with standard deviations of 0.0200, 0.0160 and 0.0164. By gantry angle, the measured values were 100.45% and 101.07% of the calculated values at $0^{\circ}$ and $15^{\circ}$, respectively, with standard deviations of 0.0199 and 0.0190, so the measured value was 0.62 percentage points higher at $15^{\circ}$ than at $0^{\circ}$. Conclusion: Across the variables used in this paper, the TPS-calculated values were closest to the measured values for the 5 mm bolus, the $6\times6\;\mathrm{cm^2}$ field size, low-energy electrons and the $0^{\circ}$ gantry angle; in all cases, however, the difference was modest, within an error bound of at most 2%. Outside the ranges of the variables selected in this paper, the measured value may differ from the TPS value when electrons and bolus are used together, so QA should be performed to verify the surface dose.
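The figures in this abstract can be cross-checked with a few lines of arithmetic: the number of planned beams, the largest deviation of a measured value from the TPS value, and the difference between the two gantry angles. The sketch below uses only numbers quoted in the abstract and reads each percentage as measured dose relative to the TPS-calculated dose, which is an interpretation of the wording. The variable names are hypothetical.

```python
from itertools import product

# Levels of the four variables (A-D) listed in the abstract.
bolus_mm = [3, 5, 10]
field_cm = [6, 10, 15]
energy_mev = [6, 9, 12]
gantry_deg = [0, 15]

beams = list(product(bolus_mm, field_cm, energy_mev, gantry_deg))

# Measured surface dose as a percentage of the TPS value, as quoted.
measured_pct = {
    "bolus 3 mm": 101.41, "bolus 5 mm": 99.58, "bolus 10 mm": 101.28,
    "field 6x6": 99.63, "field 10x10": 101.40, "field 15x15": 101.24,
    "6 MeV": 99.72, "9 MeV": 100.60, "12 MeV": 101.96,
    "gantry 0": 100.45, "gantry 15": 101.07,
}

# Largest departure from perfect agreement (100%).
worst_case = max(measured_pct, key=lambda k: abs(measured_pct[k] - 100.0))
max_deviation = abs(measured_pct[worst_case] - 100.0)

angle_difference = round(measured_pct["gantry 15"] - measured_pct["gantry 0"], 2)

print(len(beams), worst_case, round(max_deviation, 2), angle_difference)
```

The check agrees with the abstract: 3 x 3 x 3 x 2 gives the 54 planned beams, and the largest deviation, at 12 MeV, stays inside the stated 2% bound.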

Crystal Structures of Dehydrated Partially $Sr^{2+}$-Exchanged Zeolite X, $Sr_{31}K_{30}Si_{100}Al_{92}O_{384}$ and $Sr_{8.5}Tl_{75}Si_{100}Al_{92}O_{384}$ (부분적으로 스트론튬이온으로 교환되고 탈수된, 제올라이트 X의 결정구조)

  • Kim Mi Jung;Kim Yang;Seff Karl
    • Korean Journal of Crystallography / v.8 no.1 / pp.6-14 / 1997
  • The crystal structures of $Sr_{31}K_{30}$-X ($Sr_{31}K_{30}Si_{100}Al_{92}O_{384}$; $a = 25.169(5)$ Å) and $Sr_{8.5}Tl_{75}$-X ($Sr_{8.5}Tl_{75}Si_{100}Al_{92}O_{384}$; $a = 25.041(5)$ Å) have been determined by single-crystal X-ray diffraction techniques in the cubic space group $Fd\bar{3}$ at $21(1)^{\circ}$C. Each crystal was prepared by ion exchange in a flowing stream of aqueous $Sr(ClO_4)_2$ and (K or Tl)$NO_3$, whose mole ratio was 1:5, for five days. Vacuum dehydration was done at $360^{\circ}$C for 2 days. The structures were refined to the final error indices $R_1 = 0.072$ and $R_w = 0.057$ with 293 reflections, and $R_1 = 0.058$ and $R_w = 0.044$ with 351 reflections, respectively, for which $I > 2\sigma(I)$. In dehydrated $Sr_{31}K_{30}$-X, all $Sr^{2+}$ ions and $K^+$ ions are located at five different crystallographic sites. Sixteen $Sr^{2+}$ ions per unit cell are at the centers of the double six-rings (site I), filling that position. The remaining 15 $Sr^{2+}$ ions and 17 $K^+$ ions fill site II in the supercage. These $Sr^{2+}$ and $K^+$ ions are recessed ca. 0.45 Å and 1.06 Å into the supercage, respectively, from the plane of the three oxygens to which each is bound (Sr-O = 2.45(1) Å and K-O = 2.64(1) Å). Eight $K^+$ ions occupy site III' (K-O = 3.09(7) Å and 3.11(10) Å), and the remaining five $K^+$ ions occupy another site III' (K-O = 2.88(7) Å and 2.76(7) Å). In $Sr_{8.5}Tl_{75}$-X, $Sr^{2+}$ and $Tl^+$ ions also occupy five different crystallographic sites. About 8.5 $Sr^{2+}$ ions are at site I. Fifteen $Tl^+$ ions are at site I' in the sodalite cavities on threefold axes opposite double six-rings; each is 1.68 Å from the plane of its three oxygens (Tl-O = 2.70(2) Å). Together these fill the double six-rings. Another 32 $Tl^+$ ions fill site II opposite single six-rings in the supercage, each being 1.48 Å from the plane of three oxygens (Tl-O = 2.70(1) Å). 
About 18 $Tl^+$ ions occupy site III in the supercage (Tl-O = 2.86(2) Å), and the remaining 10 are found at site III' in the supercage (Tl-O = 2.96(4) Å).
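The site occupancies listed in this abstract can be checked against the unit-cell formulas: the per-site counts should add up to the stated cation totals, and the total cation charge should balance the 92 framework Al atoms, each of which contributes one negative charge. The sketch below uses only occupancies quoted in the abstract. The data layout and function names are hypothetical.

```python
# Formal charges of the exchangeable cations.
CHARGE = {"Sr": 2, "K": 1, "Tl": 1}
FRAMEWORK_AL = 92  # Al atoms per unit cell, one negative charge each

# (ion, site, ions per unit cell) as quoted in the abstract.
SR_K_X = [
    ("Sr", "I", 16), ("Sr", "II", 15), ("K", "II", 17),
    ("K", "III'", 8), ("K", "III'", 5),
]
SR_TL_X = [
    ("Sr", "I", 8.5), ("Tl", "I'", 15), ("Tl", "II", 32),
    ("Tl", "III", 18), ("Tl", "III'", 10),
]


def ion_totals(occupancy):
    """Sum the per-site counts for each kind of ion."""
    totals = {}
    for ion, _site, count in occupancy:
        totals[ion] = totals.get(ion, 0) + count
    return totals


def cation_charge(occupancy):
    """Total positive charge carried by the listed cations."""
    return sum(CHARGE[ion] * count for ion, _site, count in occupancy)


for name, occupancy in (("Sr31K30-X", SR_K_X), ("Sr8.5Tl75-X", SR_TL_X)):
    print(name, ion_totals(occupancy), cation_charge(occupancy) == FRAMEWORK_AL)
```

Both structures balance: 31 x 2 + 30 and 8.5 x 2 + 75 each give 92 positive charges per unit cell.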

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings with statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions, including linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically, and it leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. Moreover, SVM does not require many data samples for training, since it builds prediction models using only the representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. 
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not perform as well in multi-class classification as SVM does in binary-class classification. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they can deteriorate classification performance. Third, multi-class prediction problems are subject to the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built due to a skewed boundary, which reduces classification accuracy. SVM ensemble learning is one of the machine learning methods that can cope with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Thus boosting attempts to produce new classifiers that are better able to predict the examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. 
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation is performed three times with different random seeds in order to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, the cross-validated folds are tested independently for each algorithm. Through these steps, we obtained results for each classifier on each of the 30 experiments. In terms of arithmetic mean-based prediction accuracy, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of the classifiers over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
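The gap between the arithmetic and geometric accuracy figures reported here reflects class imbalance: a geometric mean collapses when any single class is predicted poorly. The sketch below shows the effect on a toy example. It assumes that the two measures are the arithmetic and geometric means of per-class accuracy (recall), which the abstract does not spell out. The labels are hypothetical, and this is not an implementation of MGM-Boost itself.

```python
import math


def per_class_recall(y_true, y_pred):
    """Fraction of correctly predicted instances for each true class."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, label in enumerate(y_true) if label == cls]
        hits = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(hits / len(indices))
    return recalls


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # A single zero recall drives the geometric mean to zero.
    return math.prod(values) ** (1.0 / len(values))


# Hypothetical imbalanced labels: 8 instances of class "A", 2 of class "B".
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 8 + ["A", "B"]   # one minority instance is misclassified

recalls = per_class_recall(y_true, y_pred)
print(recalls, arithmetic_mean(recalls), round(geometric_mean(recalls), 4))
```

A classifier that ignored class "B" entirely would still score 0.5 on the arithmetic mean but 0 on the geometric mean, which is why a geometric criterion pushes boosting toward the minority classes.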