• Title/Summary/Keyword: Log-linear model

Application of Inactivation Model on Phytophthora Blight Pathogen (Phytophthora capsici) using Plasma Process (플라즈마 공정을 이용한 고추역병균(Phytophthora capsici) 불활성화 모델의 적용)

  • Kim, Dong-Seog;Park, Young-Seek
    • Journal of Environmental Science International / v.24 no.11 / pp.1393-1404 / 2015
  • Ten empirical disinfection models for the plasma process were evaluated to find an optimum model. The variation of the model parameters with the operating conditions (first voltage, second voltage, air flow rate, pH, incubation water concentration) was investigated in order to explain the disinfection model. In this experiment, a DBD (dielectric barrier discharge) plasma reactor was used to inactivate Phytophthora capsici, which causes wilt in tomato plantations. The optimum disinfection models were chosen among the ten models by applying the statistical measures SSE (sum of squared errors), RMSE (root mean squared error), and $r^2$ to the experimental data using the GInaFiT software in Microsoft Excel. The optimum models were the Log-linear+Tail model, the Double Weibull model, and the Biphasic model. These three models were applied to the experimental data under varying operating conditions. In the Log-linear+Tail model, the $\log_{10}(N_0)$, $\log_{10}(N_{res})$, and $k_{max}$ values were examined. In the Double Weibull model, the $\log_{10}(N_0)$, $\log_{10}(N_{res})$, $\alpha$, $\delta_1$, $\delta_2$, and p values were calculated and examined. In the Biphasic model, the $\log_{10}(N_0)$, f, $k_{max1}$, and $k_{max2}$ values were used. The appropriate model parameters for calculating the optimum operating conditions were $k_{max}$, $\alpha$, and $k_{max1}$ for the three models, respectively.
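As a rough illustration of the Log-linear+Tail model and the SSE/RMSE criteria mentioned in this abstract, the sketch below uses the form commonly attributed to Geeraerd et al., $N(t) = (N_0 - N_{res}) e^{-k_{max} t} + N_{res}$. The abstract does not state the equation, so the model form, the function names, and all sample numbers are assumptions made for illustration. RMSE is computed here as the square root of SSE divided by the number of points; fitting tools may instead divide by the number of points minus the number of parameters.

```python
import math


def log_linear_tail(t, log10_n0, log10_nres, kmax):
    """Return log10 N(t) for the assumed Log-linear+Tail form."""
    n0 = 10.0 ** log10_n0
    nres = 10.0 ** log10_nres
    return math.log10((n0 - nres) * math.exp(-kmax * t) + nres)


def sse(observed, predicted):
    """Sum of squared errors between two equally long sequences."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def rmse(observed, predicted):
    """Root mean squared error, dividing SSE by the number of points."""
    return math.sqrt(sse(observed, predicted) / len(observed))


# Hypothetical treatment times and parameters (not taken from the paper).
times = [0.0, 1.0, 2.0, 4.0, 8.0]
params = (6.0, 2.0, 3.0)  # log10(N0), log10(Nres), kmax
predicted = [log_linear_tail(t, *params) for t in times]

# Made-up "observations": the model curve plus small fixed offsets.
offsets = [0.1, -0.1, 0.1, -0.1, 0.0]
observed = [p + e for p, e in zip(predicted, offsets)]

print("SSE:", sse(observed, predicted))
print("RMSE:", rmse(observed, predicted))
```

Comparing candidate models then amounts to computing these statistics for each fitted curve and preferring the smaller values, which is the kind of selection the abstract describes.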

Analysis of Online Behavior and Prediction of Learning Performance in Blended Learning Environments

  • JO, Il-Hyun;PARK, Yeonjeong;KIM, Jeonghyun;SONG, Jongwoo
    • Educational Technology International / v.15 no.2 / pp.71-88 / 2014
  • A variety of studies to predict students' performance have been conducted, since educational data such as web-log files traced from a Learning Management System (LMS) are increasingly used to analyze students' learning behaviors. However, it is still challenging to predict students' learning achievement in blended learning environments, where online and offline learning are combined. In higher education, blended learning takes diverse forms, from simple use of an LMS for administrative purposes to full use of LMS functions for online distance learning classes. As a result, a single generalized model for predicting students' academic success does not fit the diverse cases of blended learning. This study compares two blended learning classes, each with its own prediction model. The first blended class, which involved online discussion-based learning, yielded a linear regression model that explained 70% of the variance in total score through six variables: total log-in time, log-in frequency, log-in regularity, visits to boards, visits to repositories, and the number of postings. However, the second case, a lecture-based class providing online lecture notes on a regular basis in Moodle, showed weaker results from the same linear regression model, mainly due to non-linearity of the variables. To investigate the non-linear relations between online activities and total score, RF (Random Forest) was used. The results indicate that different sets of variables are important for the two distinct types of blended learning. They suggest that prediction models and data-mining techniques should take into account the diverse pedagogical characteristics of blended learning classes.

A study on log-density ratio in logistic regression model for binary data

  • Kahng, Myung-Wook
    • Journal of the Korean Data and Information Science Society / v.22 no.1 / pp.107-113 / 2011
  • We present methods for studying the log-density ratio, which allow us to select which predictors are needed, and how they should be included in the logistic regression model. Under multivariate normal distributional assumptions, we investigate the form of the log-density ratio as a function of many predictors. The linear, quadratic and crossproduct terms are required in general. If two covariance matrices are equal, then the crossproduct and quadratic terms are not needed. If the variables are uncorrelated, we do not need the crossproduct terms, but we still need the linear and quadratic terms.
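The structure described in this abstract can be made explicit with a short derivation, sketched here under the stated multivariate normal assumption. The symbols $\mu_j$ and $\Sigma_j$ (mean vector and covariance matrix of the predictors $x$ in group $j = 0, 1$), the densities $f_j$, and the constant $c$ are notation introduced for this illustration and are not taken from the paper.

$$\log\frac{f_1(x)}{f_0(x)} = c + \left(\Sigma_1^{-1}\mu_1 - \Sigma_0^{-1}\mu_0\right)^{\top} x - \frac{1}{2}\, x^{\top}\left(\Sigma_1^{-1} - \Sigma_0^{-1}\right) x,$$

$$c = -\frac{1}{2}\log\frac{|\Sigma_1|}{|\Sigma_0|} - \frac{1}{2}\mu_1^{\top}\Sigma_1^{-1}\mu_1 + \frac{1}{2}\mu_0^{\top}\Sigma_0^{-1}\mu_0.$$

If $\Sigma_1 = \Sigma_0$, the quadratic form vanishes and only linear terms remain. If both matrices are diagonal (uncorrelated variables), $\Sigma_1^{-1} - \Sigma_0^{-1}$ is diagonal, so the quadratic form contains squared terms $x_i^2$ but no cross-products $x_i x_k$.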

Modified Local Density Estimation for the Log-Linear Density

  • Pak, Ro-Jin
    • Communications for Statistical Applications and Methods / v.7 no.1 / pp.13-22 / 2000
  • We consider the local likelihood method with a smoothed version of the model density instead of the original model density. For simplicity, the model is assumed to be a log-linear density. We show that the proposed local density estimator is less affected by changes among observations, but that its bias is slightly larger than that of the currently used local density estimator. Hence, by using the existing method and the proposed method appropriately, one can obtain a local density estimator that fits the data better.

Intensity estimation with log-linear Poisson model on linear networks

  • Idris Demirsoy;Fred W. Huffer
    • Communications for Statistical Applications and Methods / v.30 no.1 / pp.95-107 / 2023
  • Purpose: The statistical analysis of point processes on linear networks is a recent area of research that studies processes of events happening randomly in space (or space-time) but with locations restricted to a linear network. For example, traffic accidents happen at random places that are restricted to a network of streets. This paper applies techniques developed for point processes on linear networks and the tools available in the R package spatstat to estimate the intensity of traffic accidents in Leon County, Florida. Methods: The intensity of accidents on the linear network of streets is estimated using log-linear Poisson models that incorporate cubic basis spline (B-spline) terms, which are functions of the x and y coordinates. The splines used equally spaced knots. Ten different models are fit to the data using a variety of covariates. The models are compared with each other using an analysis of deviance for nested models. Results: We found that all covariates contributed significantly to the model. AIC and BIC were used to select 9 as the number of knots. Additionally, the covariates have different effects: increasing the speed limit would decrease traffic accident intensity by a factor of 0.9794, whereas increasing the number of lanes would increase the intensity by a factor of 1.086. Conclusion: Our analysis shows that if other conditions are held fixed, the number of accidents actually decreases on roads with higher speed limits. The software we currently use allows our models to contain only spatial covariates and does not permit the use of temporal or space-time covariates. We would like to extend our models to include such covariates, which would allow us to include weather conditions or the presence of special events (football games or concerts) as covariates.
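To make the reported effects easier to read, the sketch below shows how coefficients act in a log-linear intensity model of the form $\lambda = \exp(\beta_0 + \sum_i \beta_i x_i)$, where a one-unit increase in covariate $x_i$ multiplies the intensity by $\exp(\beta_i)$. It assumes that the reported values 0.9794 and 1.086 are such multiplicative factors per unit increase, which the abstract does not state explicitly. The function names are hypothetical and the code does not use spatstat.

```python
import math


def coefficient_from_factor(factor):
    """Log-scale coefficient implied by a multiplicative effect."""
    return math.log(factor)


def percent_change(factor):
    """Percent change in intensity for a one-unit covariate increase."""
    return (factor - 1.0) * 100.0


def intensity(baseline, coefficients, covariates):
    """Log-linear intensity: baseline * exp(sum of beta_i * x_i)."""
    linear_part = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline * math.exp(linear_part)


# Factors quoted in the abstract, read as per-unit multiplicative effects.
speed_factor = 0.9794
lanes_factor = 1.086

for name, factor in [("speed limit", speed_factor), ("lanes", lanes_factor)]:
    beta = coefficient_from_factor(factor)
    print(name, "beta =", beta, "percent change =", percent_change(factor))
```

Under this reading, the speed-limit effect corresponds to roughly a 2% decrease in intensity per unit and the lane effect to roughly a 9% increase per additional lane.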

Improvement of Rating Curve Fitting Considering Variance Function with Pseudo-likelihood Estimation (의사우도추정법에 의한 분산함수를 고려한 수위-유량 관계 곡선 산정법 개선)

  • Lee, Woo-Seok;Kim, Sang-Ug;Chung, Eun-Sung;Lee, Kil-Seong
    • Journal of Korea Water Resources Association / v.41 no.8 / pp.807-823 / 2008
  • This paper presents a technique for estimating discharge rating curve parameters. In typical practical applications, the original non-linear rating curve is transformed into a simple linear regression model by log-transforming the measurements, without examining the effect of the log transformation. In this study, a pseudo-likelihood estimation model is developed to deal with the heteroscedasticity of residuals in the original non-linear model. The parameters of the rating curves and the variance functions of the errors are estimated simultaneously by the pseudo-likelihood estimation (P-LE) method. Simulated annealing, a global optimization technique, is adapted to minimize the log likelihood of the weighted residuals. The P-LE model was then applied to a hypothetical site where stage-discharge data were generated by incorporating various errors. The P-LE model shows reduced error values and narrower confidence intervals than the common log-transformed linear least squares (LT-LR) model. The limit of water levels for segmentation of the discharge rating curve is also estimated within the P-LE process using the Heaviside function. Finally, the performance of the conventional log-transformed linear regression and of the developed P-LE model is computed and compared. After the statistical simulation, the developed method is applied to real data sets from 5 gauge stations in the Geum River basin. The results suggest that the developed strategy can be applied to real sites to determine weights that take into account the error distributions of the observed discharge data.
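For context, the conventional log-transformed approach that the abstract compares against can be sketched as follows. The power-law form $Q = a (h - h_0)^b$ is a common choice for rating curves but is an assumption here, since the abstract does not give the equation. The function name, the fixed offset `h0`, and the synthetic data are likewise hypothetical. This is ordinary least squares on the logs, not the P-LE method.

```python
import math


def fit_log_linear_rating_curve(stages, discharges, h0=0.0):
    """Fit log Q = log a + b * log(h - h0) by ordinary least squares.

    Returns (a, b). Assumes every stage exceeds h0 and every discharge
    is positive, so that the logarithms are defined.
    """
    xs = [math.log(h - h0) for h in stages]
    ys = [math.log(q) for q in discharges]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic, error-free stage-discharge pairs generated from a known curve.
true_a, true_b, offset = 2.0, 1.5, 0.1
stages = [0.5, 1.0, 1.5, 2.0, 3.0]
discharges = [true_a * (h - offset) ** true_b for h in stages]

a_hat, b_hat = fit_log_linear_rating_curve(stages, discharges, h0=offset)
print("a =", a_hat, "b =", b_hat)
```

Because the fit minimizes squared errors in log space, it implicitly assumes multiplicative errors with constant variance on the log scale. The abstract's point is that this assumption is usually left unexamined.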

Seven-Parameter Log Linear Model for Estimating Constituent Loads in Nakdong River (7변수 대수선형모형을 이용한 낙동강 오염부하량 추정)

  • Lee, A-Yeon;Choi, Dae-Gyu;Kim, Sang-Dan
    • Proceedings of the Korea Water Resources Association Conference / 2010.05a / pp.1400-1404 / 2010
  • In this study, the flow duration curves and load duration curves for the Nakdong River basin are analyzed. The TANK model is used as a hydrologic simulation model, and its parameters are estimated from flow data measured at 8-day intervals by the Nakdong River Water Environment Laboratory. The study also confirms that a Minimum Variance Unbiased Estimator (MVUE) provides satisfactory load estimates. The seven-parameter log-linear model is used with the MVUE to estimate Total Organic Carbon (TOC) and Biochemical Oxygen Demand (BOD) loads in the Nakdong River.
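For orientation, a seven-parameter log-linear load model is commonly written in the load-estimation literature in the form below. The abstract does not give the equation, so whether the authors used exactly this form is an assumption, and the symbols are introduced here for illustration: $L$ is the constituent load, $Q$ the streamflow, $T$ time in decimal years, $\tilde{Q}$ and $\tilde{T}$ centering constants, and $\varepsilon$ the error term.

$$\ln L = \beta_0 + \beta_1 \ln\frac{Q}{\tilde{Q}} + \beta_2 \left[\ln\frac{Q}{\tilde{Q}}\right]^2 + \beta_3 (T - \tilde{T}) + \beta_4 (T - \tilde{T})^2 + \beta_5 \sin(2\pi T) + \beta_6 \cos(2\pi T) + \varepsilon$$

The seven coefficients $\beta_0, \dots, \beta_6$ capture the dependence on flow, a long-term trend, and seasonality. Because the model is fitted on the log scale, exponentiating the fitted value gives a biased load estimate, which is the retransformation bias an MVUE is designed to correct.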

The Comparative Study of Software Optimal Release Time of Finite NHPP Model Considering Log Linear Learning Factor (로그선형 학습요인을 이용한 유한고장 NHPP모형에 근거한 소프트웨어 최적방출시기 비교 연구)

  • Cheul, Kim Hee;Cheul, Shin Hyun
    • Convergence Security Journal / v.12 no.6 / pp.3-10 / 2012
  • This paper studies the decision problem known as the optimal release policy: when to stop testing a software system in the development phase and transfer it to the user. A finite-failure non-homogeneous Poisson process (NHPP) model that accounts for a learning factor during the correction or modification of the software is presented, and release policies are proposed for a log-linear type life distribution, which is used in the reliability field because of its variety of shape and scale parameters. The paper discusses optimal software release policies that minimize the total average software cost of development and maintenance under the constraint of satisfying a software reliability requirement. In a numerical example, the parameters are estimated from failure time data by maximum likelihood estimation, and the optimal software release time is then estimated.

Diagnostics for Heteroscedasticity in Mixed Linear Models

  • Ahn, Chul-Hwan
    • Journal of the Korean Statistical Society / v.19 no.2 / pp.171-175 / 1990
  • A diagnostic test for detecting nonconstant variance in mixed linear models based on the score statistic is derived through the technique of model expansion, and compared to the log likelihood ratio test.

An Approach to Survey Data with Nonresponse: Evaluation of KEPEC Data with BMI (무응답이 있는 설문조사연구의 접근법 : 한국노인약물역학코호트 자료의 평가)

  • Baek, Ji-Eun;Kang, Wee-Chang;Lee, Young-Jo;Park, Byung-Joo
    • Journal of Preventive Medicine and Public Health / v.35 no.2 / pp.136-140 / 2002
  • Objectives: A common problem in analyzing survey data is incomplete data caused by nonresponse or missing values. The mail questionnaire survey conducted in 1996 to collect lifestyle variables on the members of the Korean Elderly Pharmacoepidemiologic Cohort (KEPEC) contains some nonresponse or missing data. An appropriate statistical method was applied to evaluate the missing pattern of a specific KEPEC data set, which had no missing data in the independent variables and missing data in the response variable, BMI. Methods: The study subjects were 8,689 elderly people. Initially, the BMI and the significant variables that influenced the BMI were categorized. After fitting the log-linear model, the probabilities of people falling in each category were estimated. The EM algorithm was implemented using a log-linear model to determine the missing mechanism causing the nonresponse. Results: Age, smoking status, and a preference for spicy hot food were chosen as the variables that influenced the BMI. When the nonignorable and ignorable nonresponse log-linear models were fitted with these variables, the difference in deviance between the two models was 0.0034 (df=1). Conclusion: There is a lot of risk if an inference regarding the variables and large samples is made without considering the pattern of missing data. On the basis of these results, the missing data in the BMI constitute ignorable nonresponse. Therefore, when analyzing the BMI in the KEPEC data, inferences can be made without considering the missing data.
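The comparison of the two nested models can be illustrated with a small calculation. The sketch assumes the deviance difference is referred to a chi-square distribution with 1 degree of freedom, as in a standard likelihood ratio test for nested models; the abstract reports only the difference and the degrees of freedom. The function name is hypothetical. For 1 degree of freedom the upper tail probability has the closed form erfc(sqrt(x / 2)), so only the standard library is needed.

```python
import math


def chi2_sf_df1(x):
    """Upper tail probability P(X > x) for a chi-square variable with df = 1.

    Uses the identity P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)),
    where Z is a standard normal variable.
    """
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2.0))


# Deviance difference between the nonignorable and ignorable models,
# as reported in the abstract.
deviance_difference = 0.0034
p_value = chi2_sf_df1(deviance_difference)
print("p-value:", p_value)
```

A p-value this close to 1 means the extra parameter of the nonignorable model gives essentially no improvement in fit, which is consistent with the abstract's conclusion that the nonresponse can be treated as ignorable.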