• Title/Summary/Keyword: confidence probability

On Confidence Intervals of Robust Regression Estimators (로버스트 회귀추정에 의한 신뢰구간 구축)

  • Lee Dong-Hee;Park You-Sung;Kim Kee-Whan
    • The Korean Journal of Applied Statistics / v.19 no.1 / pp.97-110 / 2006
  • Since it is well established that even high-quality data tend to contain outliers, one would expect far greater reliance on robust regression techniques than is actually observed. However, most robust regression estimators suffer from computational difficulties and from lower efficiency than least squares under the normal error model. The weighted self-tuning estimator (WSTE), recently suggested by Lee (2004), does not share these computational difficulties and has asymptotic normality and a high breakdown point simultaneously. Although it has better properties than other robust estimators, WSTE does not attain full efficiency under the normal error model through the widely used weighted least squares. This paper introduces a new approach, called the reweighted WSTE (RWSTE), whose scale estimator is adaptively estimated by the self-tuning constant. A Monte Carlo study shows that the new approach behaves better than the general weighted least squares method under the normal model and with large data sets.

A Study on the Keyword Extraction for ESG Controversies Through Association Rule Mining (연관규칙 분석을 통한 ESG 우려사안 키워드 도출에 관한 연구)

  • Ahn, Tae Wook;Lee, Hee Seung;Yi, June Suh
    • The Journal of Information Systems / v.30 no.1 / pp.123-149 / 2021
  • Purpose: The purpose of this study is to define the anti-ESG activities of companies as recognized by the media, reflecting the attention that ESG has recently attracted. This study extracts keywords for ESG controversies through association rule mining. Design/methodology/approach: A research framework is designed to extract keywords for ESG controversies as follows: 1) From the DeepSearch DB, we collect 23,837 articles on anti-ESG activities, reported by 130 media outlets from 2013 to 2018, concerning 294 listed companies with ESG ratings. 2) We set keywords related to environment, social, and governance, and delete or merge them with other keywords based on the support, confidence, and lift derived from association rule mining. 3) We illustrate the importance of keywords and the relevance between keywords through density, degree centrality, and closeness centrality in network analysis. Findings: We identify a total of 26 keywords for ESG controversies. 'Gapjil' records the highest frequency, followed by 'corruption', 'bribery', and 'collusion'. Of the 26 keywords, 16 are related to governance, 8 to social, and 2 to environment. The highly ranked keywords are mostly related to the responsibility of shareholders within corporate governance. ESG controversies associated with social issues are often related to unfair trade. The confidence analysis shows that the keywords related to social and governance issues are clustered and that the probability of mutual occurrence between keywords is high within each group. In particular, "owner's arrest" is attributed to "bribery" and "misappropriation" with a confidence of 80%. The network analysis shows that 'corruption' is located at the center, is the keyword most likely to occur alone, and is highly related to 'breach of duty', 'embezzlement', and 'bribery'.
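
The support, confidence, and lift measures named in the abstract above can be illustrated with a minimal sketch. The toy article set and the function name `rule_metrics` are hypothetical and are not taken from the paper; each set stands for the keywords found in one article.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    count_a = sum(1 for t in transactions if a <= t)         # articles containing the antecedent
    count_c = sum(1 for t in transactions if c <= t)         # articles containing the consequent
    count_ac = sum(1 for t in transactions if (a | c) <= t)  # articles containing both
    support = count_ac / n
    confidence = count_ac / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Hypothetical toy data (an assumption for illustration only).
articles = [
    {"bribery", "owner's arrest"},
    {"bribery", "misappropriation", "owner's arrest"},
    {"bribery", "owner's arrest"},
    {"bribery", "owner's arrest", "collusion"},
    {"bribery", "collusion"},
    {"corruption"},
    {"corruption", "embezzlement"},
    {"collusion"},
    {"gapjil"},
    {"gapjil", "corruption"},
]

support, confidence, lift = rule_metrics(articles, {"bribery"}, {"owner's arrest"})
print(support, confidence, lift)
```

A lift above 1 indicates that the two keywords co-occur more often than they would if they appeared independently, which is one possible basis for merging or retaining keywords.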

A Comparative Study on Misconception about Statistical Estimation that Future Math Teachers and High School Students have (통계적 추정에 관한 예비 수학교사들과 고등학생들의 오개념 비교 분석)

  • Han, Ga-Hee;Jeon, Youngju
    • Journal of the Korean School Mathematics Society / v.21 no.3 / pp.247-266 / 2018
  • In this paper, three main concepts are chosen for this study of statistical estimation, based on previous studies: confidence interval and reliability, the sampling distribution of the mean and population mean estimation, and relationships between the elements of a confidence interval. The research questions are as follows: 1. What attitudes do future math teachers and high school students have toward statistical estimation? 2. Is there any difference between future math teachers and high school students in their awareness of misconceptions about statistical estimation? The results show that both groups have difficulty understanding the statistical concepts used in the unit on statistical estimation and their meaning. They tend to wrongly think that reliability means the same as probability. They also have difficulty understanding the sample variance in the sampling distribution of the mean, which prevents them from connecting it with population mean estimation. It is also shown that the relationships they perceive between the elements that make up a confidence interval are not consistent.

Reproducibility of Hypothesis Testing and Confidence Interval (가설검정과 신뢰구간의 재현성)

  • Huh, Myung-Hoe
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.645-653 / 2014
  • The p-value is the probability of observing the current sample, or other samples departing equally or more extremely from the null hypothesis toward the postulated alternative hypothesis. When the p-value is less than a certain level $\alpha$ (= 0.05), researchers claim that the alternative hypothesis is supported empirically. Unfortunately, some findings discovered in that way are not reproducible, partly because the p-value itself is a statistic vulnerable to random variation. Boos and Stefanski (2011) suggest calculating the upper limit of the p-value in hypothesis testing, using a bootstrap predictive distribution. To determine the sample size of a replication study, this study proposes thought experiments that simulate boosted bootstrap samples of different sizes from the given observations. The method is illustrated for the cases of two-group comparison and multiple linear regression. This study also addresses the reproducibility of the points in a given 95% confidence interval. Numerical examples show that the center point is covered by the 95% confidence intervals generated from bootstrap resamples, whereas the end points are covered with only a 50% chance. Hence this study draws the graph of the reproducibility rate for each parameter value in the confidence interval.

Improvement of Keyword Spotting Performance Using Normalized Confidence Measure (정규화 신뢰도를 이용한 핵심어 검출 성능향상)

  • Kim, Cheol;Lee, Kyoung-Rok;Kim, Jin-Young;Choi, Seung-Ho
    • The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.380-386 / 2002
  • Conventional post-processing, such as the confidence measure (CM) proposed by Rahim, calculates phone-level CMs using the likelihood between the phoneme model and the anti-model, and then obtains the word-level CM by averaging the phone-level CMs [1]. In the conventional method, the CMs of some specific keywords are very low, and these keywords are usually rejected. The reason is that the statistics of phone-level CMs are not consistent. In other words, phone-level CMs have different probability density functions (pdfs) for each phone, especially for each tri-phone. To overcome this problem, we propose a normalized confidence measure (NCM). Our approach is to transform the CM pdf of each tri-phone to the same pdf under the assumption that CM pdfs are Gaussian. To evaluate our method, we use a common keyword spotting system, in which context-dependent HMM models are used for modeling keyword utterances and context-independent HMM models are applied to non-keyword utterances. The experimental results show that the proposed NCM reduced the FAR (false alarm rate) from 0.44 to 0.33 FA/KW/HR (false alarm/keyword/hour) when the MDR is about 8%, a 25% improvement in FAR.
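
One simple way to map Gaussian-distributed scores onto a common distribution is per-class standardization (z-scoring). The sketch below uses that technique as a stand-in; it is not necessarily the transform used in the paper. The tri-phone labels, scores, and function names are hypothetical.

```python
from statistics import mean, pstdev


def fit_phone_stats(training_scores):
    """Map each tri-phone label to the (mean, std) of its raw confidence scores."""
    return {phone: (mean(scores), pstdev(scores))
            for phone, scores in training_scores.items()}


def normalized_word_cm(phone_scores, stats):
    """Average the z-scored phone-level CMs of one word hypothesis."""
    z_scores = []
    for phone, score in phone_scores:
        mu, sigma = stats[phone]
        # Guard against a degenerate class whose scores are all identical.
        z_scores.append((score - mu) / sigma if sigma > 0 else 0.0)
    return mean(z_scores)


# Hypothetical raw CM samples: the two tri-phones live on very different scales.
training = {
    "k-a+t": [0.2, 0.4, 0.6],
    "a-t+o": [2.0, 3.0, 4.0],
}
stats = fit_phone_stats(training)

# Both phones score exactly at their class mean, so the word-level value is near 0.
word_cm = normalized_word_cm([("k-a+t", 0.4), ("a-t+o", 3.0)], stats)
print(word_cm)
```

After standardization, a single rejection threshold can be applied to every keyword, because each phone-level score is expressed relative to its own class statistics.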

Sensitivity analysis of serological tests for detection of disease in cattle (소 질병 검출을 위한 혈청학적 검사의 민감도 평가)

  • Lee, Sang-Jin;Moon, Oun-Kyong;Pak, Son-Il
    • Korean Journal of Veterinary Research / v.50 no.1 / pp.43-48 / 2010
  • An animal disease surveillance system, defined as the continuous investigation of a given population to detect the occurrence of disease or infection for control purposes, plays a key role in assessing the health status of an animal population and, more recently, in risk assessment for international trade in animals and animal products. In particular, for a system that aims to determine whether or not a disease is present in a population, the sensitivity of the system should be kept high enough not to miss an infected animal. Therefore, when planning the implementation of a surveillance system, a number of factors that affect surveillance sensitivity should be taken into account. Of these factors, sample size is important, and different approaches are used to calculate it, usually depending on the objective of the surveillance system. The purpose of this study was to evaluate the sensitivity of the current national serological surveillance programs for four selected bovine diseases assuming a specified sampling plan, to examine factors affecting the probability of detection, and to provide the sample sizes required to achieve the surveillance goal of detecting at least one infection in a given population. Our results showed that, for example, detecting a low level of prevalence (0.2% for bovine tuberculosis) requires sampling all animals on a typical Korean cattle farm (n = 17), and thus risk-based targeted surveillance of high-risk groups can be an alternative strategy to increase sensitivity without increasing the overall sampling effort. The minimum sample size required to detect at least one positive animal increased sharply as disease prevalence decreased. More importantly, high reliability of prevalence estimation was expected with an increased sampling fraction, even when no infected animals were identified. The effect of sample size is also discussed in terms of the maximum prevalence when no infected animals were identified and the probability of failing to detect an infection. We suggest that for many serological surveillance systems, the diagnostic performance of the testing method, sample size, prevalence, population size, and statistical confidence need to be considered to interpret the results of the system correctly.
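
The relationship between sample size, prevalence, and detection probability discussed above can be sketched with a simple binomial approximation. This is an illustration under stated assumptions (a large population, independent sampling, perfect test specificity), not the calculation used in the paper; the function names are hypothetical.

```python
import math


def detection_probability(prevalence, n_sampled, test_sensitivity=1.0):
    """P(at least one test-positive animal) among n independently sampled animals."""
    p_positive = prevalence * test_sensitivity
    return 1.0 - (1.0 - p_positive) ** n_sampled


def required_sample_size(prevalence, confidence=0.95, test_sensitivity=1.0):
    """Smallest n whose detection probability reaches the requested confidence."""
    p_positive = prevalence * test_sensitivity
    if not 0.0 < p_positive < 1.0:
        raise ValueError("prevalence * test_sensitivity must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_positive))


def max_prevalence_given_all_negative(n_sampled, confidence=0.95):
    """Upper bound on prevalence that is consistent with zero positives in n samples."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_sampled)


n_needed = required_sample_size(0.002, confidence=0.95)
p_detect_17 = detection_probability(0.002, 17)
p_max_17 = max_prevalence_given_all_negative(17)
print(n_needed, p_detect_17, p_max_17)
```

Under these assumptions, sampling 17 animals at a prevalence of 0.2% gives a detection probability of only about 3%, which illustrates why the required sample size rises sharply as prevalence falls. For small herds, a hypergeometric model that accounts for the finite population size would be more appropriate.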

A Study on Review-Level Ground Motion For Seismic Margin Assessment (내진여유도 평가를 위한 부석기준지진동(RLGM) 평가 연구)

  • 연관희;이종림
    • Proceedings of the Earthquake Engineering Society of Korea Conference / 2000.04a / pp.97-104 / 2000
  • Evaluating a Review-Level Ground Motion (RLGM) is key to efficiently performing a Seismic Margin Assessment (SMA) of nuclear power plants, whose purpose is to determine the ground motion level for which a plant has a high confidence of a low probability of seismically induced core damage and to identify any weaker-link components. In this study, a method for obtaining RLGMs recommended by the Electric Power Research Institute (EPRI) is reviewed and applied to the Limerick site in the eastern and central U.S. as a case study. This method provides reasonable, site-specific RLGMs as the minimum required plant HCLPF for SMA that meet a target mean seismic core-damage frequency, based on seismic hazard results and generic values of the uncertainty and randomness parameters of the core-damage fragility curves. In addition, the high-frequency RLGM is justifiably modified to reflect the increased seismic capacity of high-frequency components and the spatial variation and incoherence of the input ground motion on the basemat of large structures, by establishing a method for obtaining high-frequency reduction factors according to EPRI guidelines.

A Web Usage Prediction Model by Transition Probability Matrix (전이 확률 행렬에 의한 웹 사용 예측 모델)

  • 김영희;김응모;정명숙;강우준
    • Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.31-33 / 2004
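  • Association rules and sequential patterns are widely used as mining methods for predicting the next request in web usage, but because these methods base their predictions only on the support and confidence of the generated rules, they have the drawback that accurate prediction is difficult. In this paper, we therefore present a model, based on a frequency-count Markov model, that finds the various types of rules generated from the user behavior patterns stored in web log files and accurately predicts the next request according to a transition probability matrix built from usage frequencies. As a result, various types of rules can be discovered efficiently in a $K^{th}$-order Markov process.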
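
A frequency-based transition probability matrix of the kind described above can be sketched for the first-order case as follows. The session data, page names, and function names are hypothetical, and the paper's $K^{th}$-order model is not reproduced here.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate first-order transition probabilities from page-visit sequences."""
    counts = defaultdict(Counter)
    for pages in sessions:
        # Count each observed (current page -> next page) pair.
        for current, nxt in zip(pages, pages[1:]):
            counts[current][nxt] += 1
    matrix = {}
    for current, row in counts.items():
        total = sum(row.values())
        matrix[current] = {nxt: c / total for nxt, c in row.items()}
    return matrix


def predict_next(matrix, current_page):
    """Return the most probable next page, or None if the page was never left."""
    row = matrix.get(current_page)
    if not row:
        return None
    return max(row, key=row.get)


# Hypothetical sessions reconstructed from a web log (an assumption for illustration).
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "reviews"],
    ["home", "search", "products", "cart"],
    ["home", "products", "cart", "checkout"],
]
matrix = transition_probabilities(sessions)
print(matrix["home"], predict_next(matrix, "products"))
```

A higher-order variant would use the last K pages, rather than a single page, as the state key, at the cost of a sparser matrix.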

Impact study for multi-girder bridge based on correlated road roughness

  • Liu, Chunhua;Wang, Ton-Lo;Huang, Dongzhou
    • Structural Engineering and Mechanics / v.11 no.3 / pp.259-272 / 2001
  • The impact behavior of a multigirder concrete bridge under single and multiple moving vehicles is studied based on correlated road surface characteristics. The bridge structure is modeled as a grillage beam system. A 3D nonlinear vehicle model with eleven degrees of freedom is used, in accordance with the HS20-44 truck design loading in the American Association of State Highway and Transportation Officials (AASHTO) specifications. A triangle correlation model is introduced to generate four classes of longitudinal road surface roughness as multi-correlated random processes along the transverse direction of the deck. On the basis of a correlation length of approximately half the bridge width, the upper limits of the impact factors obtained under a confidence level of 95 percent and side-by-side three-truck loading provide probability-based evidence for the evaluation of the AASHTO specifications. The analytical results indicate that a better transverse correlation of road surface roughness generally leads to slightly higher impact factors. Suggestions are made for the routine maintenance of this type of highway bridge.

A Study on the Influence of the Coaxing Effect on the Fatigue Limit (coaxing 효과가 피로한도에 미치는 영향에 관한 연구)

  • Lee, Jung-Hyoung;Yoo, Duck-Sang;Song, Duek-Chung
    • Journal of the Korean Society of Industry Convergence / v.5 no.1 / pp.3-9 / 2002
  • In the design of mechanical structures and the prediction of their service life, the fatigue limit is one of the most important characteristics. In this paper, in order to obtain the fatigue limit, we (I) investigate the aspects of economy, time, and confidence by comparing two methods, the fracture probability method, which introduces statistical concepts, and the staircase method, and (II) examine the experimental approach to the fatigue limit and the coaxing effect. The fatigue limit obtained by the staircase method is very effective in view of practical use, and coaxing arises from the same material effect as restraining crack propagation, not from strengthening the crack tip alone.
