• Title/Abstract/Keywords: Akaike Information Criterion

117 search results (processing time: 0.019 s)

Applying Conventional and Saturated Generalized Gamma Distributions in Parametric Survival Analysis of Breast Cancer

  • Yavari, Parvin;Abadi, Alireza;Amanpour, Farzaneh;Bajdik, Chris
    • Asian Pacific Journal of Cancer Prevention / Vol. 13, No. 5 / pp. 1829-1831 / 2012
  • Background: The generalized gamma (GG) distribution constitutes an extensive family that contains nearly all of the most commonly used distributions, including the exponential, Weibull, and lognormal. A saturated version of the model allows covariates to have effects through all the parameters of the survival time distribution, whereas accelerated failure-time (AFT) models assume that only one parameter of the distribution depends on the covariates. Methods: We fitted both the conventional GG model and the saturated form for each of its members, including the Weibull and lognormal distributions, and compared them using likelihood ratios. To compare the selected parametric distribution with the log-logistic distribution, which is widely used in survival analysis but is not included in the generalized gamma family, we used the Akaike information criterion (AIC; r=l(b)-2p). All models were fitted using data for 369 women aged 50 years or more, diagnosed with stage IV breast cancer in BC during 1990-1999 and followed to 2010. Results: In both conventional and saturated parametric models, the lognormal was the best candidate among the GG family members; the lognormal also fitted better than the log-logistic distribution. In the conventional GG model, the variables "surgery", "radiotherapy", "hormone therapy", "erposneg" and the interaction between "hormone therapy" and "erposneg" were significant. In the AFT model, we estimated the relative time for these variables. The saturated GG model selected similar significant variables. Estimating the relative times at different percentiles of the extended model illustrates the pattern in which the relative survival time changes over time. Conclusions: The advantage of using the generalized gamma distribution is that it facilitates estimating a model with improved fit over the standard Weibull or lognormal distributions. Alternatively, the generalized F family of distributions might be considered; it contains the generalized gamma distribution as a member and also includes the commonly used log-logistic distribution.
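
As a rough illustration of the AIC comparison between candidate survival distributions described in this abstract, the sketch below fits an exponential and a lognormal distribution by maximum likelihood and compares their AIC values. It is only a sketch under stated assumptions: the survival times are hypothetical and uncensored, the function names are invented for illustration, and it uses the common convention AIC = 2p - 2·loglik (lower is better), which is not necessarily the form written in the abstract.

```python
import math


def aic(log_likelihood, n_params):
    """Common AIC convention (assumed here): 2p - 2*loglik, lower is better."""
    return 2 * n_params - 2 * log_likelihood


def exponential_loglik(times):
    """Maximised log-likelihood of an exponential model (no censoring)."""
    rate = len(times) / sum(times)  # MLE of the rate is 1 / mean
    return sum(math.log(rate) - rate * t for t in times)


def lognormal_loglik(times):
    """Maximised log-likelihood of a lognormal model (no censoring)."""
    logs = [math.log(t) for t in times]
    n = len(logs)
    mu = sum(logs) / n
    sigma2 = sum((x - mu) ** 2 for x in logs) / n  # MLE of the variance
    return sum(
        -math.log(t)
        - 0.5 * math.log(2 * math.pi * sigma2)
        - (x - mu) ** 2 / (2 * sigma2)
        for t, x in zip(times, logs)
    )


# Hypothetical, uncensored survival times in months (illustration only)
times = [2.0, 3.5, 4.0, 5.5, 6.0, 7.5, 9.0, 12.0, 15.0, 30.0]

aic_exp = aic(exponential_loglik(times), n_params=1)
aic_lognorm = aic(lognormal_loglik(times), n_params=2)
best = "lognormal" if aic_lognorm < aic_exp else "exponential"
print(f"exponential AIC={aic_exp:.2f}, lognormal AIC={aic_lognorm:.2f}, best={best}")
```

Real survival data are usually right-censored, so a practical analysis would replace these closed-form fits with a censoring-aware likelihood.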

Evaluation of goodness of fit of semiparametric and parametric models in analysis of factors associated with length of stay in neonatal intensive care unit

  • Kheiry, Fatemeh;Kargarian-Marvasti, Sadegh;Afrashteh, Sima;Mohammadbeigi, Abolfazl;Daneshi, Nima;Naderi, Salma;Saadat, Seyed Hossein
    • Clinical and Experimental Pediatrics / Vol. 63, No. 9 / pp. 361-367 / 2020
  • Background: Length of stay is a significant indicator of care effectiveness and hospital performance. Owing to the limited number of healthcare centers and facilities, it is important to optimize length of stay and associated factors. Purpose: The present study aimed to investigate factors associated with neonatal length of stay in the neonatal intensive care unit (NICU) between 2016 and 2018 using parametric and semiparametric models and to compare model fit according to the Akaike information criterion (AIC). Methods: This retrospective cohort study reviewed 600 medical records of infants admitted to the NICU of Bandar Abbas Hospital. Samples were identified using census sampling. Factors associated with NICU length of stay were investigated using the semiparametric Cox model and 4 parametric models (Weibull, exponential, log-logistic, and log-normal) to determine the best-fitting model. The data analysis was conducted using R software. The significance level was set at 0.05. Results: The study findings suggest that breastfeeding, phototherapy, acute renal failure, presence of mechanical ventilation, and availability of central venous catheter were commonly identified as factors associated with NICU length of stay in all 5 models (P<0.05). The parametric models fitted better than the Cox model in this study. Conclusion: Breastfeeding and availability of central venous catheter had protective effects against length of stay, whereas phototherapy, acute renal failure, and mechanical ventilation increased length of stay in the NICU. Therefore, the identification of factors associated with NICU length of stay can help establish effective interventions aimed at decreasing the length of stay among infants.

Size selectivity of the gill net for spinyhead sculpin, Dasycottus setiger, in the eastern coastal waters of Korea

  • 박창두;배재현;조삼광;안희춘;김인옥
    • 수산해양기술연구 / Vol. 52, No. 4 / pp. 281-289 / 2016
  • Spinyhead sculpin Dasycottus setiger, a species of cold water fish, is distributed along the eastern coastal waters of Korea. A series of fishing experiments was carried out in the waters near Uljin from June 2002 to November 2004, using experimental monofilament gill nets of different mesh sizes (82.2, 89.4, 104.8, and 120.2 mm) to describe the selectivity of the gill net for the fish. The SELECT (Share Each Length's Catch Total) analysis with the maximum likelihood method was applied to fit different functional models (normal, lognormal, and bi-normal models) for selection curves to the catch data. The bi-normal model with the fixed relative fishing intensity was selected as the best-fit selection curve by AIC (Akaike's Information Criterion) comparison. For the best-fit selection curve, the optimum relative length (the ratio of fish total length to mesh size) with the maximum efficiency and the selection range ($R_{50\%,large}-R_{50\%,small}$) of 50% retention were obtained as 2.363 and 0.851, respectively. The ratios of body girth to mesh perimeter at 100% retention, where the selection curve of each mesh size represented the optimum total length, were calculated to be in the range of 0.86 ~ 0.87.

A Study on the Regional Factors Affecting the Death Rates of Cardio-Cerebrovascular Disease Using the Spatial Analysis

  • 박영용;박주현;박유현;이광수
    • 보건행정학회지 / Vol. 30, No. 1 / pp. 26-36 / 2020
  • Background: The purpose of this study was to analyze the relationship between regional characteristics and the age-adjusted cardio-cerebrovascular disease mortality rates (SCDMR) in 229 si·gun·gu administrative regions. Methods: The SCDMR of men and women was used as the dependent variable, based on the 2017 cause-of-death statistics. As representative indexes of regional characteristics, health behavior factors, socio-demographic and economic factors, physical environment factors, and health care factors were selected as independent variables. Ordinary least squares (OLS) regression and geographically weighted regression (GWR) were performed to identify their relationship. Results: OLS analysis showed the following significant factors affecting the mortality rates of cardio-cerebrovascular disease: high-risk drinking rates, the ratio of elderly living alone, financial independence, and walking practice rates. GWR analysis showed that the regression coefficients varied by region and that the directions of influence of the independent variables on the dependent variable were mixed. GWR showed higher adjusted R2 and Akaike information criterion values than those of OLS. Conclusion: Where spatial heterogeneity is present, as in Korea, it is appropriate to use the GWR model to estimate the influence of regional characteristics. Therefore, results using the GWR model suggest the need to establish customized health policies and projects for each region that consider its socio-economic characteristics.

Abundance Estimation of the Finless Porpoise, Neophocaena phocaenoides, Using Models of the Detection Function in a Line Transect

  • 박겸준;김장근;장창익
    • 한국수산과학회지 / Vol. 40, No. 4 / pp. 201-209 / 2007
  • Line transect sampling in a sighting survey is one of the most widely used methods for assessing animal abundance. This study applied distance data, collected from three line-transect sighting surveys for finless porpoise conducted in 2004 and 2005 off the west coast of Korea, to four models (hazard-rate, uniform, half-normal and exponential) of the detection function, g(x). The hazard-rate model, a derived model for the detection function that should satisfy the shoulder condition, was chosen using the AIC (Akaike Information Criterion) as the most suitable model. However, it did not describe a shoulder shape for the value of g(x) near the track line and underestimated g(x), just as the exponential model did. The hazard-rate model showed a bias toward overestimating the densities of finless porpoises, with a higher coefficient of variation (CV) than the other models. The uniform model underestimated the densities of finless porpoise but had the lowest CV. The half-normal model described a detection function with a shape similar to that of the uniform model. The half-normal model was robust for finless porpoise data and should be able to avoid density underestimation. The estimated abundance of finless porpoise was 3,602 individuals (95% CI=1,251-10,371) inshore in 2005 and 33,045 individuals (95% CI=24,274-44,985) offshore in 2004.

Extreme value modeling of structural load effects with non-identical distribution using clustering

  • Zhou, Junyong;Ruan, Xin;Shi, Xuefei;Pan, Chudong
    • Structural Engineering and Mechanics / Vol. 74, No. 1 / pp. 55-67 / 2020
  • The common practice for predicting the characteristic structural load effects (LEs) in long reference periods is to employ extreme value theory (EVT) to build limit distributions. However, most applications ignore that LEs are driven by multiple loading events and thus are not identically distributed, which is a prerequisite for EVT. In this study, we propose a composite extreme value modeling approach using clustering to (a) cluster initial blended samples into a finite number of identically distributed subsamples using the finite mixture model, the expectation-maximization algorithm, and the Akaike information criterion; (b) combine limit distributions of subsamples into a composite prediction equation using the generalized Pareto distribution based on a joint threshold. The proposed approach was validated both through numerical examples with known solutions and engineering applications of bridge traffic LEs on a long-span bridge. The results indicate that a joint threshold largely benefits the composite extreme value modeling, many appropriate tail approaching models can be used, and the equation form is simply the sum of the weighted models. In numerical examples, the proposed approach using clustering generated accurate extrema predictions for any reference period compared with the known solutions, whereas the common practice of employing EVT without clustering on the mixture data showed large deviations. Real-world bridge traffic LEs are driven by multiple events and present multipeak distributions, and the proposed approach is more capable of capturing the tendency of tailed LEs than the conventional approach. The proposed approach is expected to have wide applications to general problems such as samples that are driven by multiple events and that are not identically distributed.
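
Step (a) of this abstract combines a finite mixture model, the expectation-maximization (EM) algorithm, and the AIC. A minimal sketch of that combination is given below, assuming a one-dimensional Gaussian mixture; the abstract does not state the component distribution, so the Gaussian choice, the sample values, the quantile-based initialisation, the variance floor, and all function names are assumptions made for illustration and are not the authors' implementation.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=200, var_floor=1e-2):
    """Fit a k-component 1-D Gaussian mixture by EM; return the log-likelihood."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic initialisation: means at evenly spaced sample quantiles
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(xs) / n
    spread = sum((x - centre) ** 2 for x in xs) / n
    variances = [max(spread, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and (floored) variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in xs
    )


def gmm_aic(data, k):
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return 2 * n_params - 2 * fit_gmm_1d(data, k)


# Hypothetical load effects produced by two distinct loading events
samples = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0, 10.0, 10.4, 10.8, 11.2, 11.6, 12.0]
aic_by_k = {k: gmm_aic(samples, k) for k in (1, 2, 3)}
best_k = min(aic_by_k, key=aic_by_k.get)
print(aic_by_k, "selected number of clusters:", best_k)
```

The composite generalized Pareto step (b) is not covered by this sketch.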

Comparison of Temperature Indexes for the Impact Assessment of Heat Stress on Heat-Related Mortality

  • Kim, Young-Min;Kim, So-Yeon;Cheong, Hae-Kwan;Kim, Eun-Hye
    • Environmental Analysis Health and Toxicology / Vol. 26 / pp. 9.1-9.9 / 2011
  • Objectives: In order to evaluate which temperature index is the best predictor for the health impact assessment of heat stress in Korea, several indexes were compared. Methods: We adopted temperature, perceived temperature (PT), and apparent temperature (AT) as heat stress indexes, and changes in the risk of death for Seoul and Daegu were estimated for $1^{\circ}C$ increases in those temperature indexes using a generalized additive model (GAM) adjusted for non-temperature-related factors: time trends, seasonality, and air pollution. The estimated excess mortality and Akaike's Information Criterion (AIC) due to the increased temperature indexes for the $75^{th}$ percentile in the summers from 2001 to 2008 were compared and analyzed to define the best predictor. Results: For Seoul, all-cause mortality presented the highest percent increase (2.99% [95% CI, 2.43 to 3.54%]) in maximum temperature, while AIC showed the lowest value when the all-cause daily death counts were fitted with the maximum PT for the $75^{th}$ percentile of summer. For Daegu, all-cause mortality presented the greatest percent increase (3.52% [95% CI, 2.23 to 4.80%]) in minimum temperature and AIC showed the lowest value in maximum temperature. No lag effect was found in the association between temperature and mortality for Seoul, whereas for Daegu a one-day lag effect was noted. Conclusions: No single temperature measure was superior to the others in summer. To adopt an appropriate temperature index, regional meteorological characteristics and the disease status of the population should be considered.

Validity and Reliability of the Korean Version of Person-Centered Practice Inventory-Staff for Nurses

  • 김소현;탁성희
    • 대한간호학회지 / Vol. 51, No. 3 / pp. 363-379 / 2021
  • Purpose: The purpose of this study was to evaluate the validity and reliability of the Korean version of Person-Centered Practice Inventory-Staff (PCPI-S) for nurses. Methods: The English PCPI-S was translated into Korean with forward and backward translation. Data were collected from 338 nurses at one general hospital in Korea. Construct validity was evaluated with confirmatory factor analysis, convergent validity, and discriminant validity. Known-group validity was also evaluated. Cronbach's α was used to assess the reliability. Results: The PCPI-S Korean version consisted of 51 items in three areas: prerequisites, the care environment, and person-centered process. The comparative fit index (CFI) and values of person-centered care process were improved after the engagement and having sympathetic presence items were combined as one component. The construct validity of the PCPI-S Korean version was verified using a four-factor structure (.05 < RMSEA < .10, AGFI > .70, CFI > .70, and AIC). The convergent validity and discriminant validity of the entire PCPI-S questionnaire were verified using a two-factor structure (AVE > .50, construct reliability > .70). There was acceptable known-group validity, with a significant correlation between the PCPI-S level and the degree of person-centered care awareness and education. Internal consistency was reliable, with a Cronbach's α of .95. Conclusion: The Korean version of PCPI-S is valid and reliable. It can be used as a standardized Korean version of a person-centered care measurement tool. Abbreviations: RMSEA = root mean square error of approximation; AGFI = adjusted goodness of fit index; AIC = Akaike information criterion; AVE = average variance extracted.

Novel nomogram-based integrated gonadotropin therapy individualization in in vitro fertilization/intracytoplasmic sperm injection: A modeling approach

  • Ebid, Abdel Hameed IM;Motaleb, Sara M Abdel;Mostafa, Mahmoud I;Soliman, Mahmoud MA
    • Clinical and Experimental Reproductive Medicine / Vol. 48, No. 2 / pp. 163-173 / 2021
  • Objective: This study aimed to characterize a validated model for predicting oocyte retrieval in controlled ovarian stimulation (COS) and to construct model-based nomograms for assistance in clinical decision-making regarding the gonadotropin protocol and dose. Methods: This observational, retrospective, cohort study included 636 women with primary unexplained infertility and a normal menstrual cycle who were attempting assisted reproductive therapy for the first time. The enrolled women were split into an index group (n=497) for model building and a validation group (n=139). The primary outcome was absolute oocyte count. The dose-response relationship was tested using modified Poisson, negative binomial, hybrid Poisson-Emax, and linear models. The validation group was similarly analyzed, and its results were compared to that of the index group. Results: The Poisson model with the log-link function demonstrated superior predictive performance and precision (Akaike information criterion, 2,704; λ=8.27; relative standard error (λ)=2.02%). The covariate analysis included women's age (p<0.001), antral follicle count (p<0.001), basal follicle-stimulating hormone level (p<0.001), gonadotropin dose (p=0.042), and protocol type (p=0.002 and p<0.001 for short and antagonist protocols, respectively). The estimates from 500 bootstrap samples were close to those of the original model. The validation group showed model assessment metrics comparable to the index model. Based on the fitted model, a static nomogram was built to improve visualization. In addition, a dynamic electronic tool was created for convenience of use. Conclusion: Based on our validated model, nomograms were constructed to help clinicians individualize the stimulation protocol and gonadotropin doses in COS cycles.
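
To illustrate how an AIC value is obtained for a Poisson count model with a log link, as reported in this abstract, the following sketch compares an intercept-only Poisson model with one that adds a single binary covariate. The counts and the covariate are hypothetical, the names are invented for illustration, and the sketch relies on the fact that, with one binary covariate and a log link, the maximum-likelihood fitted rates equal the group means. It is not the model fitted in the study.

```python
import math


def poisson_loglik(counts, rates):
    """Poisson log-likelihood of the counts given one fitted rate per observation."""
    return sum(
        y * math.log(lam) - lam - math.lgamma(y + 1)
        for y, lam in zip(counts, rates)
    )


# Hypothetical oocyte counts and a hypothetical binary covariate (e.g. protocol)
counts = [5, 7, 6, 8, 9, 12, 11, 14, 10, 13]
group = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Intercept-only model: log(lambda) = b0, so the fitted rate is the overall mean
lam0 = sum(counts) / len(counts)
ll_null = poisson_loglik(counts, [lam0] * len(counts))

# log(lambda) = b0 + b1 * group: the fitted rates are the two group means
group_mean = {
    g: sum(y for y, gi in zip(counts, group) if gi == g) / group.count(g)
    for g in set(group)
}
ll_group = poisson_loglik(counts, [group_mean[g] for g in group])

aic_null = 2 * 1 - 2 * ll_null    # one parameter (b0)
aic_group = 2 * 2 - 2 * ll_group  # two parameters (b0, b1)
print(f"intercept-only AIC={aic_null:.2f}, with covariate AIC={aic_group:.2f}")
```

Models with continuous covariates such as age or dose have no closed-form solution and would need an iterative fitting routine.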

Analysis of Success and Failure Factors of OTT Service Contents According to the Rating: Focus on Netflix

  • 홍지수;박진수;강성우
    • 산업경영시스템학회지 / Vol. 44, No. 4 / pp. 65-75 / 2021
  • This study explores multiple variables of an OTT service to discover hidden relationships between the rating and the other variables of successful and failed content, respectively. In order to extract key variables that are strongly correlated with the rating across the contents, this work analyzes 170 Netflix original dramas and 419 movies. These contents are classified as successes or failures using the rating site IMDb. The correlation between the contents, as classified by rating, and variables such as violence, lewdness and running time is analyzed to determine whether a certain variable appears in successful and failed content. This study employs regression analysis as the main method for discovering correlations across the variables. Because the correlation between independent variables should be low, multicollinearity is checked and variables are selected accordingly. Cook's distance is used to detect and remove outliers. To improve the accuracy of the model, variable selection based on AIC (Akaike Information Criterion) is performed. Finally, the basic assumptions of regression analysis are checked by residual diagnostics and the Durbin-Watson test. From the whole analysis process, it is concluded that movies with more director awards and less imitable behavior tend to be successful. Conversely, movies with lower fear tend to fail. For dramas, failure is closely correlated with lower violence, higher fear, and higher drug content.
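
The AIC-based variable selection mentioned in this abstract can be sketched as follows. The abstract does not say which search strategy was used, so forward selection is an assumption here, as are the data, the variable names (`awards`, `runtime`), and the helper functions; the AIC is the Gaussian form n·ln(RSS/n) + 2k, which omits an additive constant shared by all models.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_aic(columns, y):
    """Gaussian AIC (up to a constant) of an OLS fit with an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
              for row, yi in zip(design, y))
    return n * math.log(rss / n) + 2 * (p + 1)  # +1 for the error variance


def forward_select(predictors, y):
    """Add the predictor that lowers AIC most until no addition lowers it."""
    selected, best = [], ols_aic([], y)
    remaining = set(predictors)
    while remaining:
        scores = {name: ols_aic([predictors[s] for s in selected + [name]], y)
                  for name in remaining}
        name = min(scores, key=scores.get)
        if scores[name] >= best:
            break
        best = scores[name]
        selected.append(name)
        remaining.remove(name)
    return selected, best


# Hypothetical content data (illustration only)
predictors = {
    "awards": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    "runtime": [90.0, 120.0, 95.0, 110.0, 100.0, 130.0, 105.0, 115.0],
}
rating = [5.1, 5.3, 5.85, 6.15, 6.7, 6.9, 7.45, 7.75]
selected, best_aic = forward_select(predictors, rating)
print("selected variables:", selected, "AIC:", round(best_aic, 2))
```

Backward elimination or a bidirectional search would follow the same pattern, starting from the full model instead of the intercept-only model.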