• Title/Summary/Keyword: AIC(Akaike's Information Criterion)


A Prediction Model on Freeway Accident Duration using AFT Survival Analysis (AFT 생존분석 기법을 이용한 고속도로 교통사고 지속시간 예측모형)

  • Jeong, Yeon-Sik;Song, Sang-Gyu;Choe, Gi-Ju
    • Journal of Korean Society of Transportation / v.25 no.5 / pp.135-148 / 2007
  • Understanding the relation between the characteristics of an accident and its duration is crucial for responding to accidents efficiently and reducing the total delay they cause. The objective of this study is therefore to model accident duration using an AFT metric model. Although both the log-logistic and log-normal AFT models were selected as candidates based on previous studies and statistical theory, the log-logistic model fit the data better. Since AFT models are commonly used for prediction, the estimated model can also be used to predict accident duration on freeways as soon as the basic accident information is reported. The predicted duration will therefore be directly useful for deciding on the resources needed to clear an accident and on dispatching crews, and should lead to less traffic congestion and help save the injured.

Analysis and probabilistic modeling of wind characteristics of an arch bridge using structural health monitoring data during typhoons

  • Ye, X.W.;Xi, P.S.;Su, Y.H.;Chen, B.
    • Structural Engineering and Mechanics / v.63 no.6 / pp.809-824 / 2017
  • The accurate evaluation of wind characteristics and wind-induced structural responses during a typhoon is of significant importance for bridge design and safety assessment. This paper presents an expectation maximization (EM) algorithm-based angular-linear approach for probabilistic modeling of field-measured wind characteristics. The proposed method has been applied to model the wind speed and direction data during typhoons recorded by the structural health monitoring (SHM) system instrumented on the arch Jiubao Bridge located in Hangzhou, China. In the summer of 2015, three typhoons, i.e., Typhoon Chan-hom, Typhoon Soudelor and Typhoon Goni, made landfall in the east of China and then struck the Jiubao Bridge. By analyzing the wind monitoring data such as the wind speed and direction measured by three anemometers during typhoons, the wind characteristics during typhoons are derived, including the average wind speed and direction, turbulence intensity, gust factor, turbulence integral scale, and power spectral density (PSD). An EM algorithm-based angular-linear modeling approach is proposed for modeling the joint distribution of the wind speed and direction. For the marginal distribution of the wind speed, a finite mixture of two-parameter Weibull distributions is employed, and a finite mixture of von Mises distributions is used to represent the wind direction. The parameters of each distribution model are estimated by use of the EM algorithm, and the optimal model is determined by the values of the $R^2$ statistic and Akaike's information criterion (AIC). The results indicate that the stochastic properties of the wind field around the bridge site during typhoons are effectively characterized by the proposed EM algorithm-based angular-linear modeling approach. The formulated joint distribution of the wind speed and direction can serve as a solid foundation for the purpose of accurately evaluating the typhoon-induced fatigue damage of long-span bridges.

Size selectivity of the gill net for spinyhead sculpin, Dasycottus setiger in the eastern coastal waters of Korea (동해안 자망에 대한 고무꺽정이 (Dasycottus setiger)의 망목 선택성)

  • PARK, Chang-Doo;BAE, Jae-Hyun;CHO, Sam-Kwang;AN, Heui-Chun;KIM, In-Ok
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.52 no.4 / pp.281-289 / 2016
  • Spinyhead sculpin Dasycottus setiger, a cold-water fish species, is distributed along the eastern coastal waters of Korea. A series of fishing experiments was carried out in the waters near Uljin from June 2002 to November 2004, using experimental monofilament gill nets of different mesh sizes (82.2, 89.4, 104.8, and 120.2 mm) to describe the selectivity of the gill net for the fish. The SELECT (Share Each Length's Catch Total) analysis with the maximum likelihood method was applied to fit different functional models (normal, lognormal, and bi-normal) for selection curves to the catch data. The bi-normal model with fixed relative fishing intensity was selected as the best-fit selection curve by AIC (Akaike's Information Criterion) comparison. For the best-fit selection curve, the optimum relative length (the ratio of fish total length to mesh size) with the maximum efficiency and the selection range ($R_{50\%,\mathrm{large}}-R_{50\%,\mathrm{small}}$) of 50% retention were 2.363 and 0.851, respectively. The ratios of body girth to mesh perimeter at 100% retention, where the selection curve of each mesh size reached the optimum total length, ranged from 0.86 to 0.87.

Evaluation of goodness of fit of semiparametric and parametric models in analysis of factors associated with length of stay in neonatal intensive care unit

  • Kheiry, Fatemeh;Kargarian-Marvasti, Sadegh;Afrashteh, Sima;Mohammadbeigi, Abolfazl;Daneshi, Nima;Naderi, Salma;Saadat, Seyed Hossein
    • Clinical and Experimental Pediatrics / v.63 no.9 / pp.361-367 / 2020
  • Background: Length of stay is a significant indicator of care effectiveness and hospital performance. Owing to the limited number of healthcare centers and facilities, it is important to optimize length of stay and associated factors. Purpose: The present study aimed to investigate factors associated with neonatal length of stay in the neonatal intensive care unit (NICU) between 2016 and 2018 using parametric and semiparametric models, and to compare model fit according to the Akaike information criterion (AIC). Methods: This retrospective cohort study reviewed 600 medical records of infants admitted to the NICU of Bandar Abbas Hospital. Samples were identified using census sampling. Factors associated with NICU length of stay were investigated based on the semiparametric Cox model and 4 parametric models (Weibull, exponential, log-logistic, and log-normal) to determine the best-fitting model. The data analysis was conducted using R software. The significance level was set at 0.05. Results: The study findings suggest that breastfeeding, phototherapy, acute renal failure, presence of mechanical ventilation, and availability of central venous catheter were identified as factors associated with NICU length of stay in all 5 models (P<0.05). Parametric models showed a better fit than the Cox model in this study. Conclusion: Breastfeeding and availability of central venous catheter had protective effects against length of stay, whereas phototherapy, acute renal failure, and mechanical ventilation increased length of stay in the NICU. Therefore, the identification of factors associated with NICU length of stay can help establish effective interventions aimed at decreasing the length of stay among infants.
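
The AIC comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis (which was done in R on five models, presumably with censoring handled): it assumes fully observed durations, uses made-up lengths of stay, and compares only the two candidates whose maximum-likelihood fits have closed forms, the exponential and log-normal models. The function names are hypothetical.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def exponential_loglik(times):
    """Maximized log-likelihood of an exponential model (rate = 1 / mean)."""
    n = len(times)
    rate = n / sum(times)
    return n * math.log(rate) - rate * sum(times)

def lognormal_loglik(times):
    """Maximized log-likelihood of a log-normal model (closed-form MLE)."""
    n = len(times)
    logs = [math.log(t) for t in times]
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    return -sum(logs) - 0.5 * n * (math.log(2 * math.pi * var) + 1)

# Hypothetical lengths of stay in days (illustrative values, not study data).
stays = [3, 5, 4, 7, 6, 5, 9, 4, 6, 8, 5, 7, 12, 6, 5]

scores = {
    "exponential": aic(exponential_loglik(stays), 1),  # one parameter: rate
    "log-normal": aic(lognormal_loglik(stays), 2),     # two parameters: mu, sigma
}
best = min(scores, key=scores.get)
for name, value in scores.items():
    print(f"{name}: AIC = {value:.2f}")
print("lowest AIC:", best)
```

Because AIC penalizes each extra parameter by 2, the two-parameter log-normal model is preferred only when its log-likelihood gain outweighs that penalty.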

Abundance Estimation of the Finless Porpoise, Neophocaena phocaenoides, Using Models of the Detection Function in a Line Transect (Line Transect에서 발견율함수 추정에 사용되는 모델에 따른 상괭이, Neophocaena phocaenoides의 자원개체수 추정)

  • Park, Kyum-Joon;Kim, Zang-Geun;Zhang, Chang-Ik
    • Korean Journal of Fisheries and Aquatic Sciences / v.40 no.4 / pp.201-209 / 2007
  • Line transect sampling in a sighting survey is one of the most widely used methods for assessing animal abundance. This study applied distance data, collected from three line-transect sighting surveys for finless porpoise conducted in 2004 and 2005 off the west coast of Korea, to four models (hazard-rate, uniform, half-normal and exponential) of the detection function, g(x). The hazard-rate model, a derived detection-function model that should satisfy the shoulder condition, was chosen by the AIC (Akaike Information Criterion) as the most suitable model. However, it did not describe a shoulder shape for the value of g(x) near the track line and underestimated g(x), just as the exponential model did. The hazard-rate model showed a bias toward overestimating the densities of finless porpoises, with a higher coefficient of variation (CV) than the other models. The uniform model underestimated the densities of finless porpoise but had the lowest CV. The half-normal model described a detection function with a shape similar to that of the uniform model. The half-normal model was robust for the finless porpoise data and should be able to avoid density underestimation. The estimated abundance of finless porpoise was 3,602 individuals (95% CI=1,251-10,371) inshore in 2005 and 33,045 individuals (95% CI=24,274-44,985) offshore in 2004.
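
As a rough illustration of how a detection function g(x) feeds into a density estimate, the sketch below fits two of the four models named in the abstract (half-normal and exponential) to perpendicular distances and compares them by AIC. It assumes untruncated distances, certain detection on the track line, and single-animal sightings. The distances and the 100 km line length are invented, and the function names are hypothetical. The hazard-rate and uniform models are omitted because they need numerical optimization or series adjustments.

```python
import math

def fit_half_normal(distances):
    """Half-normal detection g(x) = exp(-x^2 / (2 sigma^2)), no truncation."""
    n = len(distances)
    sigma2 = sum(x * x for x in distances) / n           # closed-form MLE
    loglik = sum(0.5 * math.log(2 / (math.pi * sigma2)) - x * x / (2 * sigma2)
                 for x in distances)
    esw = math.sqrt(math.pi * sigma2 / 2)                 # integral of g(x)
    return loglik, esw

def fit_exponential(distances):
    """Exponential detection g(x) = exp(-x / lam), no truncation."""
    n = len(distances)
    lam = sum(distances) / n                              # closed-form MLE
    loglik = sum(-math.log(lam) - x / lam for x in distances)
    return loglik, lam                                    # integral of g(x) is lam

# Hypothetical perpendicular sighting distances (km) and total line length (km).
distances = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.25, 0.30, 0.35, 0.45]
L_KM = 100.0

results = {}
for name, fit in (("half-normal", fit_half_normal), ("exponential", fit_exponential)):
    loglik, esw = fit(distances)
    results[name] = {
        "aic": 2 * 1 - 2 * loglik,                        # one parameter each
        "esw": esw,                                       # effective strip half-width
        "density": len(distances) / (2 * L_KM * esw),     # sightings per km^2
    }

for name, r in results.items():
    print(f"{name}: AIC={r['aic']:.2f}  ESW={r['esw']:.3f} km  D={r['density']:.3f}/km^2")
```

The density estimate is inversely proportional to the effective strip half-width, so a model that underestimates g(x) away from the line inflates the estimated density. This is one reason the choice of detection function matters beyond the AIC ranking.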

Analysis of Success and Failure Factors of OTT Service Contents According to the Rating: Focus on Netflix (평점에 따른 OTT 서비스 콘텐츠의 성공과 실패 요인 분석: 넷플릭스를 중심으로)

  • Hong, Ji-Soo;Park, Jin-Soo;Kang, Sung-Woo
    • Journal of Korean Society of Industrial and Systems Engineering / v.44 no.4 / pp.65-75 / 2021
  • This study explores multiple variables of an OTT service to discover hidden relationships between the rating and the other variables of successful and failed content. To extract key variables that are strongly correlated with the rating, this work analyzes 170 Netflix original dramas and 419 movies. Each title is classified as a success or a failure according to its rating on the rating site IMDb. The correlations between these rating-based classes and variables such as violence, lewdness, and running time are analyzed to determine whether a given variable characterizes successful or failed content. Regression analysis is the main method used to discover correlations across the variables. Because the correlation among independent variables should be low, multicollinearity is checked and variables are selected accordingly. Cook's distance is used to detect and remove outliers. To improve the accuracy of the model, variable selection based on AIC (Akaike Information Criterion) is performed. Finally, the basic assumptions of regression analysis are checked by residual diagnostics and the Durbin-Watson test. The analysis concludes that movies tend to be successful when the director has more awards and the imitable-behavior level is lower, whereas movies with lower fear levels tend to fail. For dramas, failure is closely correlated with lower violence, higher fear, and higher drug-use levels.
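
The AIC-based variable selection step can be sketched as backward elimination on an ordinary least-squares fit. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic, the variable names `awards`, `violence`, and `runtime` are placeholders, the helper functions are hypothetical, and the multicollinearity check, Cook's distance filtering, and residual diagnostics mentioned in the abstract are left out.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols_aic(rows, y, cols):
    """Gaussian AIC (up to an additive constant) of y ~ intercept + cols."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)                       # normal equations
    rss = sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(X, y))
    n = len(y)
    return n * math.log(rss / n) + 2 * (p + 1)   # +1 for the error variance

def backward_select(rows, y, names):
    """Drop one predictor at a time while doing so lowers the AIC."""
    kept = list(range(len(names)))
    best = ols_aic(rows, y, kept)
    while kept:
        trials = [(ols_aic(rows, y, [c for c in kept if c != d]), d) for d in kept]
        score, drop = min(trials)
        if score >= best:
            break
        best, kept = score, [c for c in kept if c != drop]
    return [names[c] for c in kept], best

# Synthetic data: the rating depends on awards only; the other columns are noise.
random.seed(0)
names = ["awards", "violence", "runtime"]
rows, y = [], []
for _ in range(60):
    awards, violence = random.randint(0, 5), random.randint(0, 3)
    runtime = random.uniform(80, 150)
    rows.append([awards, violence, runtime])
    y.append(6.0 + 0.5 * awards + random.gauss(0, 0.3))

selected, final_aic = backward_select(rows, y, names)
print("kept predictors:", selected, "AIC:", round(final_aic, 2))
```

AIC selection can still retain a pure-noise predictor by chance, so the kept set should be read together with the other diagnostics rather than taken as a final answer.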

Prediction of Seasonal Nitrate Concentration in Springs on the Southern Slope of Jeju Island using Multiple Linear Regression of Geographic Spatial Data (지리 공간 자료의 다중회귀분석을 이용한 제주도 남측사면 용천수의 시기별 질산성 질소 농도 예측)

  • Jung, Youn-Young;Koh, Dong-Chan;Kang, Bong-Rae;Ko, Kyung-Suk;Yu, Yong-Jae
    • Economic and Environmental Geology / v.44 no.2 / pp.135-152 / 2011
  • Nitrate concentrations in springs on the southern slope of Jeju Island were predicted using multiple linear regression (MLR) of spatial variables including hydrogeological parameters and land use characteristics. The springs showed a wide range of nitrate concentrations, from <0.02 to 86 mg/L, with a mean of 20 mg/L. Spatial variables were generated for circular buffers, with an optimal buffer radius of 400 m. Selected regression models were tested using p values and Durbin-Watson statistics. Explanatory variables were selected using the adjusted $R^2$, Cp (total squared error), AIC (Akaike's Information Criterion), and statistical significance. In addition, mutual linear relations between variables were also considered. A small portion of the springs, usually <10% of total samples, were identified as outliers, indicating the limitations of MLR using circular buffers. The adjusted $R^2$ of the proposed models improved from 0.75 to 0.87 when outliers were eliminated. In particular, the areal proportion of natural area had the greatest influence on the nitrate concentrations in springs. Among anthropogenic land uses, the influence on nitrate contamination decreases in the order of orchard, residential area, and dry farmland. It is apparent that the quality of springs in the study area is likely to be controlled by land uses rather than hydrogeological parameters. Above all, it is worth highlighting that the contamination susceptibility of springs is highly sensitive to nearby land uses, in particular orchards.

Survival Analysis for White Non-Hispanic Female Breast Cancer Patients

  • Khan, Hafiz Mohammad Rafiqullah;Saxena, Anshul;Gabbidon, Kemesha;Stewart, Tiffanie Shauna-Jeanne;Bhatt, Chintan
    • Asian Pacific Journal of Cancer Prevention / v.15 no.9 / pp.4049-4054 / 2014
  • Background: Race and ethnicity are significant factors in predicting survival time of breast cancer patients. In this study, we applied advanced statistical methods to predict the survival of White non-Hispanic female breast cancer patients, who were diagnosed between the years 1973 and 2009 in the United States (U.S.). Materials and Methods: Demographic data from the Surveillance Epidemiology and End Results (SEER) database were used for the purpose of this study. Nine states were randomly selected from 12 U.S. cancer registries. A stratified random sampling method was used to select 2,000 female breast cancer patients from these nine states. We compared four types of advanced statistical probability models to identify the best-fit model for the White non-Hispanic female breast cancer survival data. Three model building criteria were used to measure and compare goodness of fit of the models: the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and Deviance Information Criterion (DIC). In addition, we used a novel Bayesian method and the Markov Chain Monte Carlo technique to determine the posterior density function of the parameters. After evaluating the model parameters, we selected the model having the lowest DIC value. Using this Bayesian method, we derived the predictive survival density for future survival time and its related inferences. Results: The analytical sample of White non-Hispanic women included 2,000 breast cancer cases from the SEER database (1973-2009). The majority of cases were married (55.2%), the mean age at diagnosis was 63.61 years (SD = 14.24) and the mean survival time was 84 months (SD = 35.01). After comparing the four statistical models, results suggested that the exponentiated Weibull model (DIC = 19818.220) was a better fit for White non-Hispanic females' breast cancer survival data. This model predicted the survival times (in months) for White non-Hispanic women after implementation of precise estimates of the model parameters. Conclusions: By using modern model building criteria, we determined that the data best fit the exponentiated Weibull model. We incorporated precise estimates of the parameters into the predictive model and evaluated the survival inference for the White non-Hispanic female population. This method of analysis will assist researchers in making scientific and clinical conclusions when assessing survival time of breast cancer patients.
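
To show what a DIC value is made of, the sketch below computes it for a deliberately simpler model than the exponentiated Weibull used in the study: an exponential survival model with a conjugate gamma prior, so posterior draws can be taken directly instead of by Markov chain Monte Carlo. The survival times are simulated, censoring is ignored, and the prior settings and function names are assumptions made for illustration.

```python
import math
import random

def deviance(rate, times):
    """Deviance D(rate) = -2 ln L for an exponential survival model."""
    return -2 * (len(times) * math.log(rate) - rate * sum(times))

def dic(times, prior_shape=0.01, prior_rate=0.01, draws=20000, seed=1):
    """Return (DIC, pD), where DIC = mean deviance + pD and
    pD = mean deviance - deviance at the posterior mean."""
    rng = random.Random(seed)
    shape = prior_shape + len(times)           # conjugate gamma posterior
    rate = prior_rate + sum(times)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    samples = [rng.gammavariate(shape, 1 / rate) for _ in range(draws)]
    mean_dev = sum(deviance(s, times) for s in samples) / draws
    dev_at_mean = deviance(sum(samples) / draws, times)
    p_d = mean_dev - dev_at_mean
    return mean_dev + p_d, p_d

# Simulated survival times in months (mean 84 chosen only for illustration).
data_rng = random.Random(0)
times = [data_rng.expovariate(1 / 84) for _ in range(200)]

dic_value, p_d = dic(times)
print(f"DIC = {dic_value:.1f}, effective number of parameters pD = {p_d:.2f}")
```

For this one-parameter model pD comes out close to 1, which is the sense in which DIC penalizes complexity. Candidate models are then ranked by DIC, lowest first, as in the abstract.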

Predicting the Concentration of Obesity-related Metabolites via Heart Rate Variability for Korean Premenopausal Obese Women: Multiple Regression Analysis (심박변이도를 통한 폐경 전 한국인 비만 여성의 비만 관련 대사체 농도 예측을 위한 회귀분석)

  • Kim, Jongyeon;Yang, Yo-Chan;Yi, Woon-Sup;Kim, Je-In;Maeng, Tae-Ho;Yoo, Duk-Joo;Shim, Jae-Woo;Cho, Woo-Young;Song, Mi-Yeon;Lee, Jong-Soo
    • Journal of Korean Medicine Rehabilitation / v.24 no.4 / pp.155-162 / 2014
  • Objectives: Previous research on the relationship between obesity and heart rate variability (HRV) has focused on the characteristics of HRV depending on the state of obesity. However, it has not quantified the predictive power of HRV for obesity-related variables, which is more meaningful for clinicians who regularly treat obese patients. Hence, we designed a study to investigate whether HRV could predict serum levels of obesity-related metabolites. Methods: Ninety obese premenopausal women meeting the inclusion criteria were recruited. The HRV test, blood sampling, and measurement of physical traits were conducted. Multiple regression analysis of the measurement data was carried out, with obesity-related metabolites (insulin, glucose, triglyceride, hs-CRP, HDL, LDL, total cholesterol) as outcome variables and the others as predictors. To select appropriate predictive variables, Akaike's Information Criterion (AIC) was applied. Normality and homoskedasticity of the residuals of each model were tested to identify any violations of the basic assumptions of regression analysis. Logarithmic transformation was applied to the metabolite concentrations and the HRV values. Results: The regression model including the Total Power (TP) value and BMI had significant predictive power for serum insulin concentration (F(2, 88)=835.7, p<0.001, $R^2=0.95$). The regression coefficient of ln(TP) was -0.1002. However, it was unclear whether HRV could predict the concentrations of other metabolites. Conclusions: The results suggest that the Total Power (TP) value of the HRV can predict the level of serum insulin. If BMI is assumed to be constant, then when the TP value is multiplied by n, the predicted insulin concentration is multiplied by $n^{-0.1002}$. The uncertainty of this model can be assumed to be approximately 5%.
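
The multiplicative rule in the conclusions follows from the log-log form of the model: if ln(insulin) = a + b·ln(TP) + c·BMI with b = -0.1002, then scaling TP by n adds b·ln(n) to ln(insulin), which multiplies insulin by $n^{b}$. The short sketch below evaluates that factor for a few values of n. Only the coefficient comes from the abstract. The function name is hypothetical, and the form of the BMI term is an assumption that does not affect the result as long as BMI is held fixed.

```python
import math

COEF_LN_TP = -0.1002  # regression coefficient of ln(TP) reported in the abstract

def insulin_multiplier(n, coef=COEF_LN_TP):
    """Factor applied to predicted insulin when TP is multiplied by n (BMI fixed)."""
    return n ** coef

for n in (0.5, 2, 10):
    factor = insulin_multiplier(n)
    print(f"TP x {n}: predicted insulin x {factor:.3f} ({(factor - 1) * 100:+.1f}%)")
```

Doubling TP, for example, lowers the predicted insulin concentration by roughly 7%, which shows how weak the dependence is despite the high $R^2$ of the full model.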