• Title/Summary/Keyword: Robust Statistics


An Evaluation Method for Web Contents Services (웹콘텐츠 서비스 평가)

  • Jang, Hee S.;Park, Jong T.
    • Journal of Service Research and Studies / v.3 no.2 / pp.33-44 / 2013
  • As Internet and mobile services spread, the use of wired/wireless web contents services increases and the demand for various contents grows explosively. To survive in a competitive market, to minimize errors and warnings in web accessibility and standardization, and to maximize web usability, periodic evaluation of a web site should be performed along with web marketing events and campaigns. Through web evaluation, errors in the technical programming language and in content offerings can be found and diagnosed. In this paper, quantitative and qualitative evaluation methods for web sites providing web contents are presented, and analytic results for 138 domestic home pages are evaluated to validate the quantitative methodology. Accessibility, standardization, and usability factors are adopted for the evaluation: accessibility is evaluated against the perceivable, operable, understandable, and robust disciplines with the K-WAH (Korea-Web Accessibility Helper) tool; standardization is measured by the number of errors and warnings in the technical language with the W3C validator; and usability is analyzed by the number of visits, average visit duration, and bounce rate with Google Analytics. In addition, the quantitative analysis is performed with consideration of the cost of constructing and operating the web site. From the results, with scores converted to a total of 100 using relative weights, the average and standard deviation are evaluated to be 55 and 14, respectively. The correlation analysis estimates the coefficient as 0.058, so the correlation between the quantitative results and cost is only slightly positive.
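
As a rough illustration of the scoring described above, the sketch below converts three factor scores to a 100-point scale with relative weights and checks the score-cost correlation with Pearson's coefficient. The weights and per-site data are hypothetical placeholders, not the paper's values:

```python
# Hypothetical sketch: weighted conversion of three factor scores to a
# 100-point total, plus the score-cost correlation the abstract reports.
import numpy as np

rng = np.random.default_rng(0)
n_sites = 138  # the paper evaluates 138 domestic home pages

# Hypothetical per-site factor scores on a 0-100 scale.
accessibility = rng.uniform(30, 90, n_sites)
standardization = rng.uniform(20, 95, n_sites)
usability = rng.uniform(25, 85, n_sites)
cost = rng.uniform(10, 500, n_sites)  # construction/operation cost (assumed units)

# Hypothetical relative weights summing to 1; the paper's weights are not given here.
weights = {"accessibility": 0.4, "standardization": 0.3, "usability": 0.3}

total = (weights["accessibility"] * accessibility
         + weights["standardization"] * standardization
         + weights["usability"] * usability)

print(f"mean = {total.mean():.1f}, std = {total.std():.1f}")

# Pearson correlation between total score and cost (the paper reports 0.058).
r = np.corrcoef(total, cost)[0, 1]
print(f"correlation(score, cost) = {r:.3f}")
```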

New Methods for Estimation of Time Delay and Time-Frequency Delay in Impulsive Noise Environment Using FNOM and MD Criterion (임펄스 잡음 환경 하에서 FNOM와 MD를 이용한 새로운 시지연 및 시간-주파수 지연 복합 추정 방법)

  • Lee, Jin;Jung, Jung-Kyun;Lee, Young-Seok;Kim, Sung-Hwan
    • The Journal of the Acoustical Society of Korea / v.16 no.5 / pp.96-104 / 1997
  • In this paper, we propose new methods for the estimation of time delay and joint time-frequency delay in impulsive noise environments. The proposed methods are developed using the theory of $\alpha$-stable distributions: the fractional negative order moment (FNOM) and minimum dispersion (MD) criteria are formulated for time delay estimation, and the fractional negative order ambiguity function and complex minimum dispersion are defined for the joint estimation of time delay and frequency delay. Through simulation, their performance was compared with various other algorithms. As a result, while conventional approaches based on second-order statistics are only verified in a Gaussian noise environment ($S\alpha S$ noise with $\alpha=2$), and the robust methods recently proposed by Nikias [7] are verified only in limited impulsive noise ($S\alpha S$ noise in the range $1<\alpha\le 2$), the proposed methods are able to estimate the time delay in Gaussian and any impulsive noise environment ($S\alpha S$ noise in the range $0<\alpha\le 2$).
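
For a sense of how fractional-order statistics replace second-order correlation under $S\alpha S$ noise, here is a small sketch using the fractional lower-order covariation with order $p < \alpha$. The paper's actual FNOM/MD estimators use fractional *negative* order moments, so this is a related illustration of the idea rather than the proposed method:

```python
# Sketch: time delay estimation under symmetric alpha-stable (SaS) noise
# using a fractional lower-order statistic instead of ordinary correlation,
# which has no finite second moment for alpha < 2.
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(1)
n, true_delay, alpha, p = 2048, 12, 1.5, 1.2  # p < alpha

s = rng.standard_normal(n + 64)  # common source signal
x = s[:n] + 0.1 * levy_stable.rvs(alpha, 0, size=n, random_state=rng)
y = s[true_delay:true_delay + n] + 0.1 * levy_stable.rvs(alpha, 0, size=n, random_state=rng)

def flom_corr(x, y, lag, p):
    """Fractional lower-order 'correlation' of x against y at a given lag."""
    yl = y[:len(y) - lag]
    xl = x[lag:]
    return np.mean(xl * np.sign(yl) * np.abs(yl) ** (p - 1))

lags = np.arange(0, 40)
scores = [flom_corr(x, y, k, p) for k in lags]
print("estimated delay:", lags[np.argmax(scores)])  # should be near 12
```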

A Fast Background Subtraction Method Robust to High Traffic and Rapid Illumination Changes (많은 통행량과 조명 변화에 강인한 빠른 배경 모델링 방법)

  • Lee, Gwang-Gook;Kim, Jae-Jun;Kim, Whoi-Yul
    • Journal of Korea Multimedia Society / v.13 no.3 / pp.417-429 / 2010
  • Though background subtraction has been widely studied for decades, it remains a poorly solved problem, especially in real environments. In this paper, we first address some common problems for background subtraction that occur in real environments, and then resolve them by improving an existing GMM-based background modeling method. First, to achieve low computation, fixed-point operations are used. Because a background model usually does not require high-precision variables, we can reduce the computation time while maintaining accuracy by adopting fixed-point rather than floating-point operations. Second, to avoid erroneous backgrounds induced by high pedestrian traffic, the static level of each pixel is examined using short-time statistics of the pixel history. By using a lower learning rate for non-static pixels, we can preserve valid backgrounds even for busy scenes where foregrounds dominate. Finally, to adapt to rapid illumination changes, we estimate the intensity change between two consecutive frames as a linear transform and compensate the learned background models according to the estimated transform. By applying the fixed-point operation to the existing GMM-based method, the computation time was reduced to about 30% of the original processing time. Also, experiments on a real video with high pedestrian traffic showed that the proposed method improves on previous background modeling methods by 20% in detection rate and 5~10% in false alarm rate.
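
A minimal sketch of the fixed-point idea follows: running per-pixel mean/variance updates in Q8.8 integer arithmetic instead of floats. The scaling, learning rate, and update form are illustrative assumptions, not the authors' exact implementation:

```python
# Sketch: per-pixel background statistics maintained in Q8.8 fixed point
# (integers scaled by 2**8), avoiding floating-point operations.
import numpy as np

Q = 8  # fractional bits

def to_fixed(x):
    return np.int32(np.round(x * (1 << Q)))

def fixed_mul(a, b):
    return (a * b) >> Q  # multiply two Q8.8 numbers, result stays Q8.8

# Initial background state (hypothetical values).
mean = to_fixed(120.0)   # learned background intensity
var = to_fixed(25.0)     # learned variance
alpha = to_fixed(0.05)   # learning rate; lower it for non-static pixels

def update(mean, var, pixel):
    """One running update: mean += a*(x - mean); var += a*((x - mean)^2 - var)."""
    x = to_fixed(float(pixel))
    diff = x - mean
    mean = mean + fixed_mul(alpha, diff)
    var = var + fixed_mul(alpha, fixed_mul(diff, diff) - var)
    return mean, var

for pixel in [121, 119, 122, 118, 200, 120]:  # 200 is a foreground spike
    mean, var = update(mean, var, pixel)

print("mean ~", mean / (1 << Q), "var ~", var / (1 << Q))
```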

The effect of temperature on the electricity demand: An empirical investigation (기온이 전력수요에 미치는 영향 분석)

  • Kim, Hye-min;Kim, In-gyum;Park, Ki-Jun;Yoo, Seung-Hoon
    • Journal of Energy Engineering / v.24 no.2 / pp.167-173 / 2015
  • This paper attempts to estimate the electricity demand function in Korea with quarterly data on average temperature, GDP, and electricity price over the period 2005-2013. We apply a lagged dependent variable model and the ordinary least squares method as a robust approach to estimating the parameters of the electricity demand function. The results show that the short-run price and income elasticities of electricity demand are estimated to be -0.569 and 0.631, respectively; both are statistically significant at the 1% level. Moreover, the long-run income and price elasticities are estimated to be 1.589 and -1.433, respectively. Both results reveal that the demand for electricity is price- and income-elastic in the long run. The relationship between electricity consumption and temperature is supported by many references as U-shaped, and the base temperature of electricity demand is about $15.2^{\circ}C$. It is shown that the explanatory power and goodness-of-fit statistics improve when the lagged dependent variable model is used rather than the conventional model.
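
The reported elasticities are consistent with a Koyck-type specification in which the long-run elasticity equals the short-run elasticity divided by one minus the coefficient on the lagged dependent variable; the figures above imply that coefficient is about 0.6 (e.g., $-0.569 / (1 - 0.603) \approx -1.433$). The sketch below, on synthetic data, shows this estimation pattern with OLS; it is an assumed functional form, not the paper's exact model:

```python
# Sketch: lagged dependent variable (Koyck-type) demand model fitted by OLS:
#   ln(Q_t) = c + b1*ln(P_t) + b2*ln(Y_t) + lam*ln(Q_{t-1}) + e_t
# Short-run elasticities are b1, b2; long-run elasticities are b/(1 - lam).
import numpy as np

rng = np.random.default_rng(2)
T = 36  # quarterly observations, 2005-2013

lnP = rng.normal(0, 0.1, T).cumsum()                 # log price (synthetic)
lnY = 0.01 * np.arange(T) + rng.normal(0, 0.02, T)   # log income (synthetic)
lnQ = np.zeros(T)
for t in range(1, T):  # generate demand with known dynamics
    lnQ[t] = (0.1 - 0.5 * lnP[t] + 0.6 * lnY[t] + 0.6 * lnQ[t - 1]
              + rng.normal(0, 0.01))

# OLS with the lagged dependent variable as a regressor.
X = np.column_stack([np.ones(T - 1), lnP[1:], lnY[1:], lnQ[:-1]])
beta, *_ = np.linalg.lstsq(X, lnQ[1:], rcond=None)
c, b_price, b_income, lam = beta

print(f"short-run price elasticity : {b_price:.3f}")
print(f"long-run  price elasticity : {b_price / (1 - lam):.3f}")
print(f"short-run income elasticity: {b_income:.3f}")
print(f"long-run  income elasticity: {b_income / (1 - lam):.3f}")
```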

Global Patterns of Pigment Concentration, Cloud Cover, and Sun Glint: Application to the OSMI Data Collection Planning (색소농도, 운량 및 태양반사의 전구분포 : OSMI 자료수집계획에 대한 응용)

  • Yongseung Kim;Chiho Kang;Hyo-Suk Lim
    • Korean Journal of Remote Sensing / v.14 no.3 / pp.277-284 / 1998
  • To establish a monthly data collection plan for the Ocean Scanning Multispectral Imager (OSMI), we have examined the global patterns of three impacting factors: pigment concentration, cloud cover, and sun glint. Other than satellite mission constraints (e.g., duty cycle), these three factors are considered critical for OSMI data collection. The Nimbus-7 Coastal Zone Color Scanner (CZCS) monthly mean products and the International Satellite Cloud Climatology Project (ISCCP) monthly mean products (C2) were used for the analysis of pigment concentration and cloud cover distributions, respectively, and the monthly simulated patterns of sun glint were produced by performing OSMI orbit prediction and calculating sun glint radiances at the top of the atmosphere (TOA). Using monthly statistics (mean and/or standard deviation) of each factor on a 10$^{\circ}$ latitude by 10$^{\circ}$ longitude grid, we generated a priority map for each month. The priority maps of the three factors for each month were then superimposed to visualize their combined impact. The initial results illustrate that a large part of the ocean in the summer hemisphere was classified into the low-priority regions because of seasonal changes in clouds and sun illumination. Sensitivity tests for different sets of classifications were performed and demonstrated the seasonal effects of clouds and sun glint to be robust.
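
As a rough sketch of the superposition step, the following combines per-factor priority maps on a 10° by 10° global grid; the fields and classification thresholds are hypothetical stand-ins for the CZCS, ISCCP, and simulated glint statistics:

```python
# Sketch: classify each grid cell per factor, then superimpose the
# per-factor priority maps into one combined map.
import numpy as np

rng = np.random.default_rng(3)
lat_bins, lon_bins = 18, 36  # 10° x 10° global grid

pigment = rng.uniform(0, 3, (lat_bins, lon_bins))  # mg/m^3 (assumed units)
cloud = rng.uniform(0, 1, (lat_bins, lon_bins))    # cloud fraction
glint = rng.uniform(0, 1, (lat_bins, lon_bins))    # normalized glint radiance

def priority(field, thresholds):
    """Classify each cell into priority levels 0 (low) .. len(thresholds) (high)."""
    return np.digitize(field, thresholds)

# Hypothetical classification thresholds per factor.
p_pigment = priority(pigment, [0.5, 1.5])  # higher pigment -> higher priority
p_cloud = priority(1 - cloud, [0.3, 0.7])  # clearer sky    -> higher priority
p_glint = priority(1 - glint, [0.3, 0.7])  # less glint     -> higher priority

combined = p_pigment + p_cloud + p_glint   # superimposed priority map
print("cells at highest combined priority:", np.sum(combined == combined.max()))
```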

Analysis of AI interview data using unified non-crossing multiple quantile regression tree model (통합 비교차 다중 분위수회귀나무 모형을 활용한 AI 면접체계 자료 분석)

  • Kim, Jaeoh;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.753-762 / 2020
  • With increasing interest in integrating artificial intelligence (AI) into interview processes, the Republic of Korea (ROK) Army is trying to introduce and analyze an AI-powered interview platform. This study analyzes the AI interview data using a unified non-crossing multiple quantile regression tree (UNQRT) model. Compared to the UNQRT, existing models such as quantile regression and the quantile regression tree (QRT) model are inadequate for the analysis of AI interview data. Specifically, the linearity assumption of quantile regression is overly strong for this application. While the QRT model seems applicable because it relaxes the linearity assumption, it suffers from crossing among the estimated quantile functions, which leads to an uninterpretable model. The UNQRT circumvents the crossing problem by simultaneously estimating multiple quantile functions under a non-crossing constraint, and it is robust to extreme quantiles. Furthermore, the single tree constructed by the UNQRT yields a more interpretable model than the QRT model. In this study, using the UNQRT, we explored the relationship between the results of the Army AI interview system and existing personnel data to derive meaningful results.
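
The crossing problem mentioned above is easy to reproduce: separately fitted quantile regressions can yield quantile curves that cross. The sketch below detects crossings and applies monotone rearrangement as a simple post-hoc fix; it illustrates the problem the UNQRT solves jointly, not the UNQRT algorithm itself:

```python
# Sketch: fit several quantiles independently, count crossings, and repair
# them by sorting predictions across the quantile axis (rearrangement).
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(4)
n = 60
X = rng.uniform(0, 5, (n, 1))
# Heavy-tailed, heteroskedastic noise makes estimated quantiles noisy.
y = 1.0 + 0.5 * X[:, 0] + (0.5 + 0.3 * X[:, 0]) * rng.standard_t(df=2, size=n)

taus = [0.1, 0.3, 0.5, 0.7, 0.9]
preds = np.column_stack([
    QuantileRegressor(quantile=t, alpha=0.0).fit(X, y).predict(X)
    for t in taus
])

# Count observations where the separately fitted quantile curves cross.
crossings = np.sum(np.any(np.diff(preds, axis=1) < 0, axis=1))
print("points with crossed quantile predictions:", crossings)

# Monotone rearrangement: sort across the quantile axis so the estimated
# quantile functions are non-crossing at every point.
preds_fixed = np.sort(preds, axis=1)
assert np.all(np.diff(preds_fixed, axis=1) >= 0)
```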

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions: linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited their application to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a promising classification and regression method. SVM learns a separating hyperplane that maximizes the margin between two categories. It is simple enough to be analyzed mathematically and achieves high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound on the generalization error. In addition, the solution of SVM may be a global optimum, so overfitting is unlikely to occur. SVM also does not require many training samples, since it builds prediction models using only the representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. First, SVM was originally proposed for binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance in multi-class classification as much as SVM does for binary classification. Second, approximation algorithms (e.g., decomposition methods, the sequential minimal optimization algorithm) can be used to reduce multi-class computation time, but they may deteriorate classification performance. Third, multi-class prediction suffers from the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers that in another; such data sets often yield a default classifier with a skewed boundary and thus reduced classification accuracy. SVM ensemble learning is one machine learning approach for coping with these drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques: it constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations incorrectly predicted by previous classifiers are chosen more often than correctly predicted ones, so boosting produces new classifiers that better predict the examples on which the current ensemble performs poorly. In this way, it can reinforce the training of misclassified observations of the minority class. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process can consider the geometric mean-based accuracy and errors across classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. Ten-fold cross-validation was performed three times with different random seeds to ensure that the comparison among the three classifiers did not happen by chance. For each ten-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and each set is in turn used as the test set while the classifier trains on the other nine; that is, cross-validated folds were tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy, MGM-Boost (52.95%) shows higher accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds differs significantly; the results indicate that the performance of MGM-Boost differs significantly from the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
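
The geometric mean-based accuracy referred to above is commonly computed as the geometric mean of the per-class recalls; unlike arithmetic accuracy, it collapses toward zero whenever any class is badly missed, which is why it penalizes classifiers that ignore minority rating classes. A small sketch with hypothetical labels:

```python
# Sketch: geometric mean of per-class recalls vs. arithmetic accuracy
# on an imbalanced multi-class problem.
import numpy as np

def geometric_mean_accuracy(y_true, y_pred):
    classes = np.unique(y_true)
    recalls = np.array([
        np.mean(y_pred[y_true == c] == c)  # per-class recall
        for c in classes
    ])
    return recalls.prod() ** (1.0 / len(classes))

y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 0, 0, 0, 1, 0, 2, 0])  # minority classes half-missed

acc = np.mean(y_true == y_pred)                  # arithmetic accuracy: 0.80
gmean = geometric_mean_accuracy(y_true, y_pred)  # much lower: ~0.63
print(f"arithmetic accuracy = {acc:.2f}, geometric mean accuracy = {gmean:.3f}")
```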