• Title/Summary/Keyword: Akaike's Information (AIC)

Minimum Message Length and Classical Methods for Model Selection in Univariate Polynomial Regression

  • Viswanathan, Murlikrishna;Yang, Young-Kyu;WhangBo, Taeg-Keun
    • ETRI Journal / v.27 no.6 / pp.747-758 / 2005
  • Selecting among competing models is a fundamental problem in statistical data analysis. Good fits to data can be misleading, since they can result from properties of the model that have nothing to do with it being a close approximation to the source distribution of interest (for example, overfitting). In this study we focus on the preference among models from a family of polynomial regressors. Three decades of research have spawned a number of plausible techniques for model selection, namely Akaike's Final Prediction Error (FPE) and Information Criterion (AIC), Schwarz's criterion (SCH), Generalized Cross Validation (GCV), Wallace's Minimum Message Length (MML), Minimum Description Length (MDL), and Vapnik's Structural Risk Minimization (SRM). What these principles share is an attempt to balance the complexity of a model against its ability to explain the data. This paper presents an empirical study of the above principles in the context of model selection, where the models under consideration are univariate polynomials. It includes a detailed empirical evaluation of the model selection methods on six target functions, with varying sample sizes and added Gaussian noise. The results appear to provide strong evidence in support of the MML- and SRM-based methods over the other standard approaches (FPE, AIC, SCH and GCV).
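
As a rough illustration of the complexity-versus-fit trade-off described in this abstract, the sketch below selects a polynomial degree by AIC and by Schwarz's criterion (BIC). It is not the paper's procedure: the Gaussian least-squares forms `n*ln(RSS/n) + 2k` and `n*ln(RSS/n) + k*ln(n)`, the synthetic quadratic target, the deterministic pseudo-noise, and all function names are assumptions made for this example.

```python
import math

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (adequate for low degrees)."""
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y] for the monomial basis.
    aug = [[sum(x ** (i + j) for x in xs) for j in range(m)]
           + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]  # coefficients, lowest power first

def rss(xs, ys, coeffs):
    """Residual sum of squares of the fitted polynomial."""
    return sum((y - sum(c * x ** p for p, c in enumerate(coeffs))) ** 2
               for x, y in zip(xs, ys))

def aic(n, rss_value, k):
    """AIC for Gaussian least squares, up to an additive constant (assumed form)."""
    return n * math.log(rss_value / n) + 2 * k

def bic(n, rss_value, k):
    """Schwarz's criterion for Gaussian least squares (assumed form)."""
    return n * math.log(rss_value / n) + k * math.log(n)

def select_degree(xs, ys, max_degree, criterion):
    """Return the degree with the lowest criterion value, plus all scores."""
    n = len(xs)
    scores = {}
    for d in range(max_degree + 1):
        coeffs = fit_polynomial(xs, ys, d)
        scores[d] = criterion(n, rss(xs, ys, coeffs), d + 1)
    return min(scores, key=scores.get), scores

# Hypothetical data: a quadratic target plus small deterministic pseudo-noise.
xs = [-1 + 2 * i / 29 for i in range(30)]
ys = [1 + 2 * x + 3 * x * x + 0.05 * math.sin(50 * i) for i, x in enumerate(xs)]
best_aic, _ = select_degree(xs, ys, 5, aic)
best_bic, _ = select_degree(xs, ys, 5, bic)
print("degree chosen by AIC:", best_aic, "| by BIC:", best_bic)
```

Both criteria reward a lower residual sum of squares and charge for each extra coefficient; BIC charges more per coefficient once `n` exceeds about 7, so it tends to choose the same or a lower degree than AIC.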

Mesh Selectivity of Drum Net Fish Trap for Elkhorn Sculpin (Alcichthys alcicornis) in the Eastern Sea of Korea (동해의 장구형 통발에 대한 빨간횟대 (Alcichthys alcicornis)의 망목선택성)

  • Park, Hae-Hoon;Jeong, Eui-Cheol;An, Heui-Chun;Park, Chang-Doo;Kim, Hyun-Young;Bae, Jae-Hyun;Cho, Sam-Kwang;Baik, Chul-In
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.40 no.4 / pp.247-254 / 2004
  • The mesh selectivity of the drum net fish trap for elkhorn sculpin (Alcichthys alcicornis) in the eastern sea of Korea is described. The selection curve for elkhorn sculpin caught in experiments between June 2003 and December 2003 was fitted by the SELECT (Share Each Length Class's Catch Total) model and by Kitahara's method, using a polynomial equation and a two-parameter logistic selection curve. By minimum AIC (Akaike's Information Criterion), the SELECT model indicated equal probability of entrance of the elkhorn sculpin into the large (55 mm) and small (20 mm) mesh traps. The selectivity curve obtained by Kitahara's method using a logistic function with the least squares method was $s(R) = \frac{1}{1+\exp(-0.3545R+2.141)}$, where $R = l/m$, and $l$ and $m$ are total length and mesh size, respectively. The mesh selectivity curve showed that the currently regulated mesh size (35 mm) for the trap corresponded to an $L_{50}$ of 21.4 cm for the elkhorn sculpin.

Determination of the number of sinusoidal frequencies by a new singular value approach (특이값 접근방법에 의한 정현파의 수의 결정에 관한 연구)

  • Ahn, Tae-Chon;Ryu, Chang-Sun;Lee, Dong-Yoon;Whang, Keun-Chan
    • Proceedings of the KIEE Conference / 1989.11a / pp.467-469 / 1989
  • A new singular value approach is presented and analyzed in order to determine the number of multiple sinusoidal frequencies from finite noisy data. Simulations are conducted for Akaike's information criterion (AIC), Rissanen's shortest data description (MDL) and the new singular value approach, all as covariance-matrix-based methods, and their performances are compared.
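
For readers unfamiliar with covariance-based order selection, the sketch below shows one common way AIC and MDL are applied to the eigenvalues of a sample covariance matrix (the Wax-Kailath formulation). The abstract does not state which formulation the authors used, so this choice, the function name, and the eigenvalues are assumptions for illustration only.

```python
import math

def estimate_signal_count(eigenvalues, n_snapshots):
    """Estimate the signal-subspace dimension from covariance eigenvalues.

    Assumed formulation (Wax-Kailath): for each candidate k, the p - k smallest
    eigenvalues are treated as noise, and the log ratio of their geometric to
    arithmetic mean measures how far they are from being equal.
    """
    lam = sorted(eigenvalues, reverse=True)
    p = len(lam)
    aic, mdl = {}, {}
    for k in range(p):
        noise = lam[k:]
        m = p - k
        arithmetic = sum(noise) / m
        log_geometric = sum(math.log(v) for v in noise) / m
        log_ratio = log_geometric - math.log(arithmetic)  # always <= 0
        free_params = k * (2 * p - k)
        aic[k] = -2 * n_snapshots * m * log_ratio + 2 * free_params
        mdl[k] = (-n_snapshots * m * log_ratio
                  + 0.5 * free_params * math.log(n_snapshots))
    return min(aic, key=aic.get), min(mdl, key=mdl.get)

# Hypothetical eigenvalues: two dominant values above a noise floor near 1.
eigs = [10.2, 9.7, 1.05, 0.98, 1.01, 0.96]
k_aic, k_mdl = estimate_signal_count(eigs, n_snapshots=100)
print("subspace dimension by AIC:", k_aic, "| by MDL:", k_mdl)
```

With these made-up eigenvalues both criteria select a two-dimensional signal subspace. For real-valued data each sinusoid typically occupies two dimensions, so under that assumption the estimated number of sinusoids would be half the selected dimension.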

Multiphasic Analysis of Growth Curve of Body Weight in Mice

  • Kurnianto, E.;Shinjo, A.;Suga, D.
    • Asian-Australasian Journal of Animal Sciences / v.12 no.3 / pp.331-335 / 1999
  • The present study describes the application of the multiphasic growth function (MGF) to body weight in laboratory and wild mice. Three genetic groups of laboratory mice (Mus musculus domesticus), designated CF#1, C3H/HeNCrj and C57BL/6NCrj, and a genetic group of Yonakuni wild mice (Mus musculus molossinus yonakuni, Yk) were used. Mean body weights of each genetic group-sex subclass, taken at 3-day intervals from birth to 69 days of age, were analyzed by monophasic, diphasic and triphasic functions to describe growth patterns. The three functions of the MGF were compared using goodness-of-fit criteria: residual standard deviation (RSD), adjusted R-square (Adj $R^2$) and Akaike's information criterion (AIC). The results indicated that average body weight was heavier for males than for females. Among the four genetic groups within both sexes, CF#1 showed the highest body weight, followed by C3H/HeNCrj, C57BL/6NCrj and Yk. Comparison among the three functions revealed that the triphasic function was the best fit to the growth data, with the lowest RSD, the highest Adj $R^2$ and the lowest AIC, for the four genetic groups. For the triphasic function, RSD within each genetic group-sex subclass was similar for males and females. Adj $R^2$ was 0.999 for all genetic group-sex subclasses. AIC for laboratory mice males and females ranged from -70.48 to 66.50 and from -92.81 to -68.64, respectively, whereas for Yk wild mice it was -74.29 for males and -78.42 for females.

Comparison of Temperature Indexes for the Impact Assessment of Heat Stress on Heat-Related Mortality

  • Kim, Young-Min;Kim, So-Yeon;Cheong, Hae-Kwan;Kim, Eun-Hye
    • Environmental Analysis Health and Toxicology / v.26 / pp.9.1-9.9 / 2011
  • Objectives: To evaluate which temperature index is the best predictor for the health impact assessment of heat stress in Korea, several indexes were compared. Methods: We adopted temperature, perceived temperature (PT), and apparent temperature (AT) as heat stress indexes, and changes in the risk of death for Seoul and Daegu were estimated for $1^{\circ}C$ increases in those temperature indexes using a generalized additive model (GAM) adjusted for non-temperature-related factors: time trends, seasonality, and air pollution. The estimated excess mortality and Akaike's Information Criterion (AIC) due to the increased temperature indexes for the $75^{th}$ percentile in the summers from 2001 to 2008 were compared and analyzed to define the best predictor. Results: For Seoul, all-cause mortality presented the highest percent increase (2.99% [95% CI, 2.43 to 3.54%]) in maximum temperature, while AIC showed the lowest value when the all-cause daily death counts were fitted with the maximum PT for the $75^{th}$ percentile of summer. For Daegu, all-cause mortality presented the greatest percent increase (3.52% [95% CI, 2.23 to 4.80%]) in minimum temperature, and AIC showed the lowest value in maximum temperature. No lag effect was found in the association between temperature and mortality for Seoul, whereas for Daegu a one-day lag effect was noted. Conclusions: No single temperature measure was superior to the others in summer. To adopt an appropriate temperature index, regional meteorological characteristics and the disease status of the population should be considered.

Validity and Reliability of the Korean Version of Person-Centered Practice Inventory-Staff for Nurses (간호사 대상 한국어판 인간중심돌봄 측정도구의 타당도와 신뢰도)

  • Kim, Sohyun;Tak, Sunghee H
    • Journal of Korean Academy of Nursing / v.51 no.3 / pp.363-379 / 2021
  • Purpose: The purpose of this study was to evaluate the validity and reliability of the Korean version of the Person-Centered Practice Inventory-Staff (PCPI-S) for nurses. Methods: The English PCPI-S was translated into Korean with forward and backward translation. Data were collected from 338 nurses at one general hospital in Korea. Construct validity was evaluated with confirmatory factor analysis, convergent validity, and discriminant validity. Known-group validity was also evaluated. Cronbach's α was used to assess reliability. Results: The Korean version of the PCPI-S consisted of 51 items in three areas: prerequisites, the care environment, and person-centered process. The comparative fit index (CFI) and values of person-centered care process improved after the engagement and having sympathetic presence items were combined as one component. The construct validity of the Korean version of the PCPI-S was verified using four-factor structures (.05 < RMSEA < .10, AGFI > .70, CFI > .70, and AIC). The convergent validity and discriminant validity of the entire PCPI-S questionnaire were verified using a two-factor structure (AVE > .50, construct reliability > .70). Known-group validity was acceptable, with a significant correlation between the PCPI-S level and the degree of person-centered care awareness and education. Internal consistency was reliable, with a Cronbach's α of .95. Conclusion: The Korean version of the PCPI-S is valid and reliable. It can be used as a standardized Korean-language tool for measuring person-centered care. Abbreviations: RMSEA = root mean square error of approximation; AGFI = adjusted goodness of fit index; AIC = Akaike information criterion; AVE = average variance extracted.

Differences by Selection Method for Exposure Factor Input Distribution for Use in Probabilistic Consumer Exposure Assessment

  • Kang, Sohyun;Kim, Jinho;Lim, Miyoung;Lee, Kiyoung
    • Journal of Environmental Health Sciences / v.48 no.5 / pp.266-271 / 2022
  • Background: The selection of distributions of input parameters is an important component of probabilistic exposure assessment. Goodness-of-fit (GOF) methods are used to determine the distribution of exposure factors. However, there are no clear guidelines for choosing an appropriate GOF method. Objectives: The outcomes of probabilistic consumer exposure assessment were compared using five different GOF methods for the selection of input distributions: the chi-squared test, Kolmogorov-Smirnov test (K-S), Anderson-Darling test (A-D), Akaike information criterion (AIC) and Bayesian information criterion (BIC). Methods: Individual exposures were estimated based on product usage factor combinations from 10,000 respondents. The distribution of individual exposure was taken as the true value of population exposure. Results: Among the five GOF methods, probabilistic exposure distributions obtained with the A-D and K-S methods were similar to the individual exposure estimations. Comparing the 95th percentiles of the probabilistic distributions and the individual estimations for 10 CPs, the ratios ranged from 0.73 to 1.92 for the A-D method and from 0.73 to 1.60 (excluding tire-shine spray) for the K-S method. Conclusions: Exposure assessment results differed significantly depending on the GOF method selected. Therefore, the GOF methods for probabilistic consumer exposure assessment should be selected carefully.
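
To make the AIC/BIC branch of this comparison concrete, the sketch below ranks three candidate input distributions by maximum-likelihood AIC and BIC. It is an illustration only: the candidate set (normal, lognormal, exponential), the closed-form maximum-likelihood fits, the function names, and the synthetic sample are all assumptions, not the study's data or software.

```python
import math
from statistics import NormalDist

def normal_loglik(values):
    """Maximized Gaussian log-likelihood (MLE mean and variance plugged in)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # MLE variance (divide by n)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def candidate_logliks(sample):
    """Map each candidate name to (maximized log-likelihood, parameter count)."""
    n = len(sample)
    logs = [math.log(v) for v in sample]
    return {
        "normal": (normal_loglik(sample), 2),
        # Lognormal: Gaussian fit on the log scale plus the Jacobian term.
        "lognormal": (normal_loglik(logs) - sum(logs), 2),
        # Exponential: MLE rate is 1 / mean, giving -n*ln(mean) - n.
        "exponential": (-n * math.log(sum(sample) / n) - n, 1),
    }

def rank_by_information_criteria(sample):
    """Return AIC and BIC per candidate; lower is better for both."""
    n = len(sample)
    aic, bic = {}, {}
    for name, (loglik, k) in candidate_logliks(sample).items():
        aic[name] = 2 * k - 2 * loglik
        bic[name] = k * math.log(n) - 2 * loglik
    return aic, bic

# Hypothetical exposure-factor sample: evenly spaced lognormal quantiles
# (log-scale mean 0, log-scale standard deviation 0.5).
std_normal = NormalDist()
sample = [math.exp(0.5 * std_normal.inv_cdf((i + 0.5) / 200)) for i in range(200)]
aic, bic = rank_by_information_criteria(sample)
best_aic = min(aic, key=aic.get)
best_bic = min(bic, key=bic.get)
print("best by AIC:", best_aic, "| best by BIC:", best_bic)
```

Unlike the chi-squared, K-S and A-D tests, which measure the distance between the empirical and fitted distributions, AIC and BIC score the maximized likelihood with a penalty on parameter count, which is one reason the methods can select different input distributions for the same data.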

Mesh selectivity of the bottom trammel net for spinyhead sculpin Dasycottus setiger in the eastern coastal sea of Korea (저층 삼중자망에 대한 동해안산 고무꺽정이 (Dasycottus setiger)의 망목 선택성)

  • PARK, Chang-Doo;BAE, Jae-Hyun
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.53 no.4 / pp.317-326 / 2017
  • Comparative fishing experiments were conducted in the eastern coastal waters near Uljin, Korea from 2002 to 2004, using experimental trammel nets to estimate the selectivity for spinyhead sculpin Dasycottus setiger. The inner panels of the nets were made of nylon monofilament with four mesh sizes (82.2, 89.4, 104.8, and 120.2 mm), while the two outer panels were made of twisted nylon multifilament with a mesh size of 510 mm. The SELECT (Share Each Length's Catch Total) procedure with the maximum likelihood method was applied to obtain a master selection curve. Different functional models (normal, lognormal, bi-normal, and logistic) were fitted to the catch data. The lognormal model with fixed relative fishing intensity was chosen as the best-fitting selection curve through comparison of model deviance and AIC (Akaike's Information Criterion). The optimum relative length (the ratio of fish total length to mesh size) with the maximum relative efficiency was 2.492.

Non-linear modelling to describe lactation curve in Gir crossbred cows

  • Bangar, Yogesh C.;Verma, Med Ram
    • Journal of Animal Science and Technology / v.59 no.2 / pp.3.1-3.7 / 2017
  • Background: Modelling of the lactation curve provides guidelines for formulating farm managerial practices in dairy cows. The aim of the present study was to determine the non-linear model that most accurately fitted the lactation curves of five lactations in 134 Gir crossbred cows reared in the Research-Cum-Development Project (RCDP) on Cattle farm, MPKV (Maharashtra). Four models, viz. the gamma-type function, quadratic model, mixed log function and Wilmink model, were fitted to each lactation separately and then compared on the basis of goodness-of-fit measures, viz. adjusted $R^2$, root mean square error (RMSE), Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC). Results: In general, the highest milk yield was observed in the fourth lactation, whereas it was lowest in the first lactation. Among the models investigated, the mixed log function provided the best fit of the lactation curve for the first lactation and the gamma-type function for the remaining lactations. The quadratic model gave the poorest fit in almost all lactations. Peak yield was highest in the fourth lactation and lowest in the first. Further, the first lactation showed the highest persistency but took relatively longer to reach peak yield than other lactations. Conclusion: Lactation curve modelling using the gamma-type function may be helpful in setting management strategies at farm level; however, the models must be optimized regularly before they are implemented to enhance productivity in Gir crossbred cows.

Application of Cox and Parametric Survival Models to Assess Social Determinants of Health Affecting Three-Year Survival of Breast Cancer Patients

  • Mohseny, Maryam;Amanpour, Farzaneh;Mosavi-Jarrahi, Alireza;Jafari, Hossein;Moradi-Joo, Mohammad;Monfared, Esmat Davoudi
    • Asian Pacific Journal of Cancer Prevention / v.17 no.sup3 / pp.311-316 / 2016
  • Breast cancer is one of the most common causes of cancer mortality in Iran. Social determinants of health are among the key factors affecting the pathogenesis of diseases. This cross-sectional study aimed to determine the social determinants of breast cancer survival time with parametric and semi-parametric regression models. It was conducted on male and female patients diagnosed with breast cancer presenting to the Cancer Research Center of Shohada-E-Tajrish Hospital from 2006 to 2010. The Cox proportional hazard model and parametric models including the Weibull, log normal and log-logistic models were applied to determine the social determinants of survival time of breast cancer patients. The Akaike information criterion (AIC) was used to assess the best fit. Statistical analysis was performed with STATA (version 11) software. This study was performed on 797 breast cancer patients, aged 25-93 years with a mean age of 54.7 (${\pm}11.9$) years. In both semi-parametric and parametric models, the three-year survival was related to level of education and municipal district of residence (P<0.05). The AIC suggested that log normal distribution was the best fit for the three-year survival time of breast cancer patients. Social determinants of health such as level of education and municipal district of residence affect the survival of breast cancer cases. Future studies must focus on the effect of childhood social class on the survival times of cancers, which have hitherto only been paid limited attention.