• Title/Abstract/Keywords: AIC (Akaike Information Criterion)

68 search results (processing time: 0.022 s)

Threshold Estimation of Generalized Pareto Distribution Based on Akaike Information Criterion for Accurate Reliability Analysis

  • 강승훈;임우철;조수길;박상현;이민욱;최종수;홍섭;이태희
    • 대한기계학회논문집A
    • /
    • Vol. 39, No. 2
    • /
    • pp.163-168
    • /
    • 2015
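  • Reliability analysis in engineering increasingly requires prediction of the probability density function in regions of ever higher reliability. To analyze high reliability accurately, the tail of the distribution must therefore be represented accurately. Recently, there has been active research on the generalized Pareto distribution, a method that builds a tail model from tail samples alone and uses it to estimate reliability. In existing studies, however, inaccurate threshold estimation degrades the accuracy of the reliability in the tail. This paper therefore proposes an Akaike information criterion-based generalized Pareto distribution method, which uses the Akaike information criterion to estimate the threshold accurately and robustly and thereby improves the accuracy of the tail model. Reliability analysis performed with the proposed method yielded results of improved accuracy.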

Minimum Message Length and Classical Methods for Model Selection in Univariate Polynomial Regression

  • Viswanathan, Murlikrishna;Yang, Young-Kyu;WhangBo, Taeg-Keun
    • ETRI Journal
    • /
    • Vol. 27, No. 6
    • /
    • pp.747-758
    • /
    • 2005
  • The problem of selection among competing models has been a fundamental issue in statistical data analysis. Good fits to data can be misleading since they can result from properties of the model that have nothing to do with it being a close approximation to the source distribution of interest (for example, overfitting). In this study we focus on the preference among models from a family of polynomial regressors. Three decades of research have spawned a number of plausible techniques for the selection of models, namely, Akaike's Final Prediction Error (FPE) and Information Criterion (AIC), Schwarz's criterion (SCH), Generalized Cross Validation (GCV), Wallace's Minimum Message Length (MML), Minimum Description Length (MDL), and Vapnik's Structural Risk Minimization (SRM). The fundamental similarity between all these principles is their attempt to define an appropriate balance between the complexity of models and their ability to explain the data. This paper presents an empirical study of the above principles in the context of model selection, where the models under consideration are univariate polynomials. The paper includes a detailed empirical evaluation of the model selection methods on six target functions, with varying sample sizes and added Gaussian noise. The results from the study appear to provide strong evidence in support of the MML- and SRM-based methods over the other standard approaches (FPE, AIC, SCH and GCV).
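
The trade-off between fit and complexity described in this abstract can be illustrated with a minimal sketch of AIC-based degree selection for a univariate polynomial. It is not the paper's experimental setup: the quadratic target function, the noise level, the sample size, and all function names are assumptions made for illustration, and the AIC form used assumes independent Gaussian errors.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients (lowest power first) via the normal equations."""
    size = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear(a, b)


def polynomial_aic(xs, ys, degree):
    """AIC under Gaussian errors, constants dropped: n*ln(RSS/n) + 2k."""
    coeffs = fit_polynomial(xs, ys, degree)
    rss = sum(
        (y - sum(c * x ** i for i, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )
    n = len(xs)
    k = degree + 2  # polynomial coefficients plus the noise variance
    return n * math.log(rss / n) + 2 * k


# Hypothetical data: a quadratic target with added Gaussian noise.
rng = random.Random(0)
xs = [-1 + 2 * i / 59 for i in range(60)]
ys = [1 - 2 * x + 3 * x ** 2 + rng.gauss(0, 0.2) for x in xs]

scores = {d: polynomial_aic(xs, ys, d) for d in range(6)}
best_degree = min(scores, key=scores.get)
print(best_degree, {d: round(s, 1) for d, s in scores.items()})
```

Underfitted degrees (0 and 1) are penalized heavily through the residual term, while degrees above the true one gain little fit for each extra parameter they pay for.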


A Machine Learning Univariate Time series Model for Forecasting COVID-19 Confirmed Cases: A Pilot Study in Botswana

  • Mphale, Ofaletse;Okike, Ezekiel U;Rafifing, Neo
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 1
    • /
    • pp.225-233
    • /
    • 2022
  • The recent outbreak of the coronavirus (COVID-19) infectious disease has made its forecasting a critical cornerstone of most scientific studies. This study adopts a machine learning based time series model, the Auto Regressive Integrated Moving Average (ARIMA) model, to forecast COVID-19 confirmed cases in Botswana over a 60-day period. Findings of the study show that COVID-19 confirmed cases in Botswana are steadily rising in a steep upward trend with random fluctuations. This trend can also be described effectively using an additive model when examined with the Seasonal-Trend decomposition method based on Loess. To select the best-fit ARIMA model, a grid search algorithm was developed in Python and used to optimize the Akaike Information Criterion (AIC) metric. The best-fit model was ARIMA(5, 1, 1), which had the lowest AIC score, 3885.091. Results of the study show that the ARIMA model can be useful in generating reliable and volatile forecasts that can be used to guide understanding of the future spread of infectious diseases or pandemics. Most significantly, the findings are expected to raise awareness among disease monitoring institutions and government regulatory bodies, which can use them to support strategic health decisions and initiate policy improvements for better management of the COVID-19 pandemic.
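
The grid search described in this abstract can be sketched in simplified form. The example below is an assumption-laden stand-in, not the authors' algorithm: it fixes the differencing order at d = 1, omits the moving-average term, fits pure autoregressive models with the Levinson-Durbin recursion, and runs on synthetic data. All function names are hypothetical.

```python
import math
import random
from itertools import accumulate


def autocovariances(series, max_lag):
    """Biased sample autocovariances r[0..max_lag] of the mean-centered series."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    return [
        sum(centered[t] * centered[t - lag] for t in range(lag, n)) / n
        for lag in range(max_lag + 1)
    ]


def ar_innovation_variances(series, max_order):
    """Levinson-Durbin recursion: innovation variance of AR(p), p = 0..max_order."""
    r = autocovariances(series, max_order)
    variances = [r[0]]
    phi = []  # coefficients of the current AR(p - 1) fit
    for p in range(1, max_order + 1):
        acc = r[p] - sum(phi[j] * r[p - 1 - j] for j in range(p - 1))
        reflection = acc / variances[-1]
        phi = [phi[j] - reflection * phi[p - 2 - j] for j in range(p - 1)]
        phi.append(reflection)
        variances.append(variances[-1] * (1.0 - reflection ** 2))
    return variances


def grid_search_ar_order(series, max_order):
    """Score each AR order with AIC = n*ln(sigma^2) + 2*(p + 1); return best and all."""
    n = len(series)
    variances = ar_innovation_variances(series, max_order)
    scores = {
        p: n * math.log(variances[p]) + 2 * (p + 1) for p in range(max_order + 1)
    }
    return min(scores, key=scores.get), scores


# Synthetic cumulative series (hypothetical): increments follow an AR(2) process.
rng = random.Random(42)
increments = [0.0, 0.0]
for _ in range(300):
    increments.append(0.6 * increments[-1] - 0.3 * increments[-2] + rng.gauss(0, 1))
cumulative = list(accumulate(increments))

# Difference once (d = 1), then search p = 0..6 for the lowest AIC.
differenced = [b - a for a, b in zip(cumulative, cumulative[1:])]
best_p, aic_scores = grid_search_ar_order(differenced, max_order=6)
print(best_p, round(aic_scores[best_p], 2))
```

A full search over (p, d, q) would normally rely on a time series library that fits ARIMA models by maximum likelihood and reports the AIC of each fit. AIC values are directly comparable only between models fitted to the same series, so comparisons across different differencing orders call for some care.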

A Study on Extraction of Useful Information from Big Dataset of Multi-attributes - Focus on Single Household in Seoul -

  • 최정민;김건우
    • 한국주거학회논문집
    • /
    • Vol. 25, No. 4
    • /
    • pp.59-72
    • /
    • 2014
  • This study proposes a data-mining analysis method for examining big data with many variable attributes, which is considered more applicable in social science, using a Correspondence Analysis of variables obtained by AIC model selection. The proposed method was applied to the Seoul Survey from 2005 to 2010 in order to extract interesting rules or patterns in the characteristics of single households. The results are as follows. Firstly, the paper shows that the proposed method can be applied efficiently to a big dataset with a very large number of categorical multi-attribute variables. Secondly, analysis of the Seoul Survey found that the more dissatisfied single households are with their residential environment, the higher their tendency toward residential mobility. Thirdly, there are three types of single households based on their demographic characteristics, and the three types differ in their perception of home and in their counselling partners. Fourthly, using a spatially aggregated dataset, the paper extracted eight significant variables that are highly correlated with the occupancy ratio of single households in the 25 municipal districts of Seoul. Finally, it investigated the relation between the spatial distribution of single households and their demographic statistics based on six groups obtained by Cluster Analysis.

Reliability-based Design Optimization on Mobility of Deep-seabed Test Miner Using Censored Data of Current Speed

  • 박상현;조수길;임우철;김새결;최성식;이민욱;최종수;김형우;이창호;홍섭;이태희
    • Ocean and Polar Research
    • /
    • Vol. 36, No. 4
    • /
    • pp.487-494
    • /
    • 2014
  • A deep-seabed test miner operated by a self-propelled mining system moving on soft soil is an essential device for securing floating and towing performance. The performance of the tracked vehicle is seriously influenced by noise factors such as the shear strength of the seafloor, bottom current, seafloor slope, speed of the tracked vehicle, reaction forces of the flexible hose, steering ratio, etc. Due to uncertainties related to noise factors, designing a deep-sea manganese nodule test miner that satisfies target reliabilities is difficult. Therefore, reliability-based design optimization (RBDO) is required to guarantee system reliability under circumstances where uncertainties related to noise factors prevail. Among the noise factors, the bottom current, which follows a bimodal distribution, is censored due to the observation limit of measurement devices. The estimated distribution of the bottom current is therefore inaccurate unless these characteristics are considered, and the result of RBDO cannot be guaranteed. In this paper, we define censored data as unknown values over the limit of observation. If the distribution of such data is estimated using the Akaike information criterion (AIC), which cannot consider the characteristics of censored data, the estimated distribution cannot guarantee accurate reliability. Therefore, censored AIC, which can consider these characteristics, is used to estimate an accurate distribution of the bottom current. Finally, RBDO is performed on the mobility of a deep-sea manganese nodule test miner under uncertainties related to noise factors combined with censored data.

Reliability-based Design Optimization for Lower Control Arm using Limited Discrete Information

  • 장준용;나종호;임우철;박상현;최성식;김정호;김용석;이태희
    • 한국자동차공학회논문집
    • /
    • Vol. 22, No. 2
    • /
    • pp.100-106
    • /
    • 2014
  • The lower control arm (LCA) is a part of the automotive chassis. Performances of the LCA such as stiffness, durability and permanent displacement must be considered in design optimization. However, it is hard to consider different performances at once in optimization because they are measured by different commercial tools such as Radioss, Abaqus, etc. In this paper, firstly, we construct an integrated design automation system for the LCA based on Matlab, including Hypermesh, Radioss and Abaqus. Secondly, the Akaike information criterion (AIC) is used to assess the reliability of the LCA. It finds the best estimated distribution of performance from limited and discrete stochastic information, and the reliability is then obtained from that distribution. Finally, we consider tolerances of design variables and variation of elastic modulus, and achieve the target reliability by carrying out reliability-based design optimization (RBDO) with the integrated system.

Determination of the number of sinusoidal frequencies by a new singular value approach

  • 안태천;류창선;이동윤;황금찬
    • 대한전기학회:학술대회논문집
    • /
    • Proceedings of the 대한전기학회 1989 Fall Conference, Society Headquarters
    • /
    • pp.467-469
    • /
    • 1989
  • A new singular value approach is presented and analyzed in order to determine the number of multiple sinusoidal frequencies from finite noisy data. Simulations are conducted for Akaike's information criterion (AIC), Rissanen's shortest data description (MDL) and the new singular value approach, in covariance matrix based methods, and their performances are then compared.


Differences by Selection Method for Exposure Factor Input Distribution for Use in Probabilistic Consumer Exposure Assessment

  • Kang, Sohyun;Kim, Jinho;Lim, Miyoung;Lee, Kiyoung
    • 한국환경보건학회지
    • /
    • Vol. 48, No. 5
    • /
    • pp.266-271
    • /
    • 2022
  • Background: The selection of distributions of input parameters is an important component in probabilistic exposure assessment. Goodness-of-fit (GOF) methods are used to determine the distribution of exposure factors. However, there are no clear guidelines for choosing an appropriate GOF method. Objectives: The outcomes of probabilistic consumer exposure assessment were compared by using five different GOF methods for the selection of input distributions: chi-squared test, Kolmogorov-Smirnov test (K-S), Anderson-Darling test (A-D), Akaike information criterion (AIC) and Bayesian information criterion (BIC). Methods: Individual exposures were estimated based on product usage factor combinations from 10,000 respondents. The distribution of individual exposure was considered as the true value of population exposures. Results: Among the five GOF methods, probabilistic exposure distributions using the A-D and K-S methods were similar to individual exposure estimations. Comparing the 95th percentiles of the probabilistic distributions and the individual estimations for 10 CPs, the differences were 0.73 to 1.92 times for the A-D method, and 0.73 to 1.60 times (excluding tire-shine spray) for the K-S method. Conclusions: Exposure assessment results differed significantly depending on the GOF method selected. Therefore, the GOF methods for probabilistic consumer exposure assessment should be carefully selected.
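
Of the five GOF methods compared in this abstract, the AIC-based one can be sketched as follows. This is an illustrative sketch only, not the study's procedure: the three candidate distributions, the synthetic "usage" sample, and the function names are assumptions, and each candidate is fitted with its closed-form maximum-likelihood estimates.

```python
import math
import random


def normal_loglik(data):
    """Maximized normal log-likelihood (mean and variance estimated by ML)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(data):
    """Maximized lognormal log-likelihood via the change of variables y = ln(x)."""
    logs = [math.log(x) for x in data]
    return normal_loglik(logs) - sum(logs)


def exponential_loglik(data):
    """Maximized exponential log-likelihood (ML rate is 1 / sample mean)."""
    mean = sum(data) / len(data)
    return -len(data) * (math.log(mean) + 1)


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * loglik


# Hypothetical exposure factor sample drawn from a lognormal distribution.
rng = random.Random(1)
usage = [rng.lognormvariate(0.0, 0.5) for _ in range(500)]

candidates = {
    "normal": aic(normal_loglik(usage), 2),
    "lognormal": aic(lognormal_loglik(usage), 2),
    "exponential": aic(exponential_loglik(usage), 1),
}
best_name = min(candidates, key=candidates.get)
print(best_name, {name: round(value, 1) for name, value in candidates.items()})
```

Unlike the chi-squared, K-S and A-D tests, AIC ranks candidates relative to one another and does not by itself say whether even the best-ranked distribution fits well in absolute terms.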

Multiphasic Analysis of Growth Curve of Body Weight in Mice

  • Kurnianto, E.;Shinjo, A.;Suga, D.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • Vol. 12, No. 3
    • /
    • pp.331-335
    • /
    • 1999
  • The present study describes the analysis of the multiphasic growth function (MGF) fitted to body weight in laboratory and wild mice. Three genetic groups of laboratory mice (Mus musculus domesticus) designated $CF_{\sharp 1}$, C3H/HeNCrj and C57BL/6NCrj, and a genetic group of Yonakuni wild mice (Mus musculus molossinus yonakuni, Yk) were used. Mean body weights of each genetic group-sex subclass from birth to 69 days of age, taken at 3-day intervals, were analyzed with monophasic, diphasic and triphasic functions to describe growth patterns. The three functions of the MGF were compared on the goodness-of-fit criteria: residual standard deviation (RSD), adjusted R-square (Adj $R^2$) and Akaike's information criterion (AIC). Results of this study indicated that body weight averaged heavier for males than for females. Among the four genetic groups within both sexes, $CF_{\sharp 1}$ showed the highest, followed in turn by C3H/HeNCrj, C57BL/6NCrj and Yk. Comparison among the three functions revealed that the triphasic function was the best fit to growth data, with the lowest RSD, the highest Adj $R^2$ and the lowest AIC, for the four genetic groups. For the triphasic function, RSD within each genetic group-sex subclass was similar for males and females. Adj $R^2$ was 0.999 for all genetic group-sex subclasses. AIC for laboratory mice males and females ranged from -70.48 to 66.50 and from -92.81 to -68.64, respectively; whereas for Yk wild mice males was -74.29 and females -78.42.
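
For context, the AIC values in this abstract can be read against the standard definition of the criterion. The abstract does not state which form the authors computed, so the least-squares form below is an assumption: it holds for independent Gaussian errors with additive constants dropped. Here $k$ is the number of estimated parameters, $n$ the number of observations, $\hat{L}$ the maximized likelihood, and $\mathrm{RSS}$ the residual sum of squares.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{AIC}_{\mathrm{LS}} = n\,\ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2k
```

When $\mathrm{RSS}/n$ is below 1 the logarithm is negative, which is how negative AIC values can arise under this form. Only differences between models fitted to the same data are meaningful, and the lower value indicates the preferred model.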

Comparison of Temperature Indexes for the Impact Assessment of Heat Stress on Heat-Related Mortality

  • Kim, Young-Min;Kim, So-Yeon;Cheong, Hae-Kwan;Kim, Eun-Hye
    • Environmental Analysis Health and Toxicology
    • /
    • Vol. 26
    • /
    • pp.9.1-9.9
    • /
    • 2011
  • Objectives: In order to evaluate which temperature index is the best predictor for the health impact assessment of heat stress in Korea, several indexes were compared. Methods: We adopted temperature, perceived temperature (PT), and apparent temperature (AT) as heat stress indexes, and changes in the risk of death for Seoul and Daegu were estimated for $1^{\circ}$C increases in those temperature indexes using a generalized additive model (GAM) adjusted for non-temperature related factors: time trends, seasonality, and air pollution. The estimated excess mortality and Akaike's Information Criterion (AIC) due to the increased temperature indexes for the $75^{\text{th}}$ percentile in the summers from 2001 to 2008 were compared and analyzed to define the best predictor. Results: For Seoul, all-cause mortality presented the highest percent increase (2.99% [95% CI, 2.43 to 3.54%]) in maximum temperature, while AIC showed the lowest value when the all-cause daily death counts were fitted with the maximum PT for the $75^{\text{th}}$ percentile of summer. For Daegu, all-cause mortality presented the greatest percent increase (3.52% [95% CI, 2.23 to 4.80%]) in minimum temperature and AIC showed the lowest value in maximum temperature. No lag effect was found in the association between temperature and mortality for Seoul, whereas for Daegu a one-day lag effect was noted. Conclusions: No single temperature measure was superior to the others in summer. To adopt an appropriate temperature index, regional meteorological characteristics and the disease status of the population should be considered.