• Title/Abstract/Keywords: log-linear model

229 search results, processing time 0.033 s

Application of Inactivation Model on Phytophthora Blight Pathogen (Phytophthora capsici) using Plasma Process

  • 김동석;박영식
    • 한국환경과학회지
    • /
    • Vol. 24, No. 11
    • /
    • pp.1393-1404
    • /
    • 2015
  • Ten empirical disinfection models for the plasma process were compared to find an optimum model. The variation of the parameters of each model with the operating conditions (first voltage, second voltage, air flow rate, pH, incubation water concentration) was investigated in order to explain the disinfection model. In this experiment, a DBD (dielectric barrier discharge) plasma reactor was used to inactivate Phytophthora capsici, which causes wilt in tomato plantations. The optimum disinfection models were chosen among the ten models by applying the SSE (sum of squared errors), RMSE (root mean squared error), and $r^2$ statistics to the experimental data using the GInaFiT software in Microsoft Excel. The optimum models were the Log-linear+Tail, Double Weibull, and Biphasic models. These three models were applied to the experimental data under varying operating conditions. In the Log-linear+Tail model, the $Log_{10}(N_o)$, $Log_{10}(N_{res})$ and $k_{max}$ values were examined. In the Double Weibull model, the $Log_{10}(N_o)$, $Log_{10}(N_{res})$, ${\alpha}$, ${\delta}_1$, ${\delta}_2$, and p values were calculated and examined. In the Biphasic model, the $Log_{10}(N_o)$, f, $k_{max1}$ and $k_{max2}$ values were used. The appropriate model parameters for calculating the optimum operating conditions were $k_{max}$, ${\alpha}$, and $k_{max1}$ for the three models, respectively.
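As a rough illustration of the Log-linear+Tail model and the SSE and RMSE criteria named in the abstract above, the sketch below evaluates the curve in the form $N(t) = (N_o - N_{res})e^{-k_{max}t} + N_{res}$, which is the form usually associated with this model. The parameter values and observations are hypothetical, the function names are made up for this example, and the RMSE here divides by the number of points rather than by the residual degrees of freedom, so it may differ from what GInaFiT reports.

```python
import math


def log_linear_tail(t, log10_n0, log10_nres, kmax):
    """Return log10 of the survivor count at time t for a log-linear + tail curve."""
    n0 = 10.0 ** log10_n0      # initial population
    nres = 10.0 ** log10_nres  # residual (tail) population
    return math.log10((n0 - nres) * math.exp(-kmax * t) + nres)


def sse_rmse(times, observed_log10, params):
    """Sum of squared errors and root mean squared error on the log10 scale."""
    residuals = [
        obs - log_linear_tail(t, *params)
        for t, obs in zip(times, observed_log10)
    ]
    sse = sum(r * r for r in residuals)
    # Assumption: divide by n; a fitting tool may use n minus the parameter count.
    rmse = math.sqrt(sse / len(residuals))
    return sse, rmse


# Hypothetical parameters (log10 N0, log10 Nres, kmax) and observations.
params = (6.0, 2.0, 1.5)
times = [0, 1, 2, 4, 8, 16]
observed = [6.02, 5.31, 4.74, 3.38, 2.06, 1.98]

sse, rmse = sse_rmse(times, observed, params)
print(f"SSE = {sse:.4f}, RMSE = {rmse:.4f}")
```

Comparing these two statistics across candidate curves is one simple way to rank models, which is the role they play in the abstract.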

Analysis of Online Behavior and Prediction of Learning Performance in Blended Learning Environments

  • JO, Il-Hyun;PARK, Yeonjeong;KIM, Jeonghyun;SONG, Jongwoo
    • Educational Technology International
    • /
    • Vol. 15, No. 2
    • /
    • pp.71-88
    • /
    • 2014
  • A variety of studies to predict students' performance have been conducted since educational data such as web-log files traced from a Learning Management System (LMS) are increasingly used to analyze students' learning behaviors. However, it is still challenging to predict students' learning achievement in blended learning environments, where online and offline learning are combined. In higher education, diverse cases of blended learning can be formed, ranging from simple use of an LMS for administrative purposes to full use of LMS functions for online distance learning classes. As a result, a generalized model to predict students' academic success does not fit the diverse cases of blended learning. This study compares two blended learning classes, each with its own prediction model. For the first blended class, which involved online discussion-based learning, a linear regression model explained 70% of the variance in total score through six variables: total log-in time, log-in frequencies, log-in regularities, visits on boards, visits on repositories, and the number of postings. However, the second case, a lecture-based class providing online lecture notes on a regular basis in Moodle, showed weaker results from the same linear regression model, mainly due to non-linearity of the variables. To investigate the non-linear relations between online activities and total score, RF (Random Forest) was utilized. The results indicate that there are different sets of important variables for the two distinct types of blended learning cases. The results suggest that prediction models and data-mining techniques should take into account the diverse pedagogical characteristics of blended learning classes.

A study on log-density ratio in logistic regression model for binary data

  • Kahng, Myung-Wook
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 22, No. 1
    • /
    • pp.107-113
    • /
    • 2011
  • We present methods for studying the log-density ratio, which allow us to select which predictors are needed, and how they should be included in the logistic regression model. Under multivariate normal distributional assumptions, we investigate the form of the log-density ratio as a function of many predictors. The linear, quadratic and crossproduct terms are required in general. If two covariance matrices are equal, then the crossproduct and quadratic terms are not needed. If the variables are uncorrelated, we do not need the crossproduct terms, but we still need the linear and quadratic terms.
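The three cases listed in the abstract above can be checked with a short derivation. The notation is an assumption made for this sketch and is not taken from the paper: $f_j$ is the density of the predictor vector $x$ in response group $j \in \{0, 1\}$, assumed multivariate normal with mean $\mu_j$ and covariance matrix $\Sigma_j$.

```latex
\begin{aligned}
\log\frac{f_1(x)}{f_0(x)}
 &= -\tfrac{1}{2}\log\frac{|\Sigma_1|}{|\Sigma_0|}
    -\tfrac{1}{2}(x-\mu_1)^{\top}\Sigma_1^{-1}(x-\mu_1)
    +\tfrac{1}{2}(x-\mu_0)^{\top}\Sigma_0^{-1}(x-\mu_0) \\
 &= c
    + x^{\top}\bigl(\Sigma_1^{-1}\mu_1-\Sigma_0^{-1}\mu_0\bigr)
    - \tfrac{1}{2}\,x^{\top}\bigl(\Sigma_1^{-1}-\Sigma_0^{-1}\bigr)x,
\\
c &= -\tfrac{1}{2}\log\frac{|\Sigma_1|}{|\Sigma_0|}
     -\tfrac{1}{2}\mu_1^{\top}\Sigma_1^{-1}\mu_1
     +\tfrac{1}{2}\mu_0^{\top}\Sigma_0^{-1}\mu_0 .
\end{aligned}
```

The last term carries the quadratic and crossproduct terms. It vanishes when $\Sigma_1 = \Sigma_0$, leaving only linear terms. When both covariance matrices are diagonal (uncorrelated variables), $\Sigma_1^{-1}-\Sigma_0^{-1}$ is diagonal, so only the squared terms $x_i^2$ remain alongside the linear ones.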

Modified Local Density Estimation for the Log-Linear Density

  • Pak, Ro-Jin
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 7, No. 1
    • /
    • pp.13-22
    • /
    • 2000
  • We consider the local likelihood method with a smoothed version of the model density instead of the original model density. For simplicity, the model is assumed to be a log-linear density. We show that the proposed local density estimator is less affected by changes among observations, but its bias is slightly larger than that of the currently used local density estimator. Hence, if the existing method and the proposed method are used together in a proper way, a local density estimator that fits the data better can be derived.


Intensity estimation with log-linear Poisson model on linear networks

  • Idris Demirsoy;Fred W. Huffer
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 30, No. 1
    • /
    • pp.95-107
    • /
    • 2023
  • Purpose: The statistical analysis of point processes on linear networks is a recent area of research that studies processes of events happening randomly in space (or space-time) but with locations limited to reside on a linear network. For example, traffic accidents happen at random places that are limited to lying on a network of streets. This paper applies techniques developed for point processes on linear networks and the tools available in the R-package spatstat to estimate the intensity of traffic accidents in Leon County, Florida. Methods: The intensity of accidents on the linear network of streets is estimated using log-linear Poisson models which incorporate cubic basis spline (B-spline) terms which are functions of the x and y coordinates. The splines used equally-spaced knots. Ten different models are fit to the data using a variety of covariates. The models are compared with each other using an analysis of deviance for nested models. Results: We found all covariates contributed significantly to the model. AIC and BIC were used to select 9 as the number of knots. Additionally, the covariates have different effects: increasing the speed limit would decrease traffic accident intensity by 0.9794, whereas increasing the number of lanes would increase the intensity of traffic accidents by 1.086. Conclusion: Our analysis shows that if other conditions are held fixed, the number of accidents actually decreases on roads with higher speed limits. The software we currently use allows our models to contain only spatial covariates and does not permit the use of temporal or space-time covariates. We would like to extend our models to include such covariates which would allow us to include weather conditions or the presence of special events (football games or concerts) as covariates.
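In a log-linear Poisson model the intensity has the form $\lambda = \exp(\beta_0 + \sum_j \beta_j z_j)$, so a one-unit increase in covariate $z_j$ multiplies the intensity by $\exp(\beta_j)$. The sketch below assumes that the figures 0.9794 and 1.086 quoted in the abstract are such multiplicative factors. The abstract does not say so explicitly, so this reading, the baseline coefficient, and the function names are all assumptions made for illustration.

```python
import math


def percent_change(multiplier):
    """Convert a multiplicative effect exp(beta) into a percent change."""
    return (multiplier - 1.0) * 100.0


def intensity(beta0, betas, covariates):
    """Log-linear intensity exp(beta0 + sum_j beta_j * z_j)."""
    return math.exp(beta0 + sum(b * z for b, z in zip(betas, covariates)))


# Assumed interpretation: the reported values are exp(beta) per unit increase.
speed_factor = 0.9794
lanes_factor = 1.086
betas = [math.log(speed_factor), math.log(lanes_factor)]

beta0 = -3.0  # hypothetical baseline log-intensity
base = intensity(beta0, betas, [30.0, 2.0])        # speed limit 30, 2 lanes
faster = intensity(beta0, betas, [31.0, 2.0])      # speed limit raised by 1
more_lanes = intensity(beta0, betas, [30.0, 3.0])  # one extra lane

print(f"speed limit +1: {percent_change(faster / base):+.2f}% intensity")
print(f"lanes +1:       {percent_change(more_lanes / base):+.2f}% intensity")
```

Under this reading, the speed-limit effect is a decrease of roughly 2% per unit and the lane effect an increase of roughly 9% per lane, with the other covariates held fixed.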

Improvement of Rating Curve Fitting Considering Variance Function with Pseudo-likelihood Estimation

  • 이우석;김상욱;정은성;이길성
    • 한국수자원학회논문집
    • /
    • Vol. 41, No. 8
    • /
    • pp.807-823
    • /
    • 2008
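  • Log-linear regression, which is widely used to estimate the parameters of the equation describing a stage-discharge rating curve, cannot account for heteroscedasticity of the residuals. This study therefore presents a method that estimates the variance function by pseudo-likelihood estimation (P-LE) and estimates the regression coefficients together with it. To minimize the regression residuals in this procedure, simulated annealing (SA), a global optimization algorithm, was applied. In addition, because a rating curve must be constructed separately for different ranges owing to effects such as the channel cross-section, a Heaviside function was included in the pseudo-likelihood function so that the split could be judged more objectively and its location estimated, and the soundness of the proposed method was tested statistically using discharge data with two segments. These statistical experiments identified the advantages of the proposed methods over existing methods, and the methods were applied at five stations in the Geum River basin to verify their efficiency.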
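For context, the baseline that the abstract above improves on can be sketched as follows. This is only the ordinary log-linear regression of a single-segment rating curve, assumed here to have the common power-law form $Q = a(h - h_0)^b$ with a known cease-to-flow stage $h_0$. It is not the pseudo-likelihood method of the paper, and the function name and data are hypothetical.

```python
import math


def fit_rating_curve(stages, discharges, h0=0.0):
    """Fit Q = a * (h - h0)**b by ordinary least squares on the log-log scale.

    Assumes h0 is known and every stage exceeds it. Residuals are treated as
    having constant variance, which is the limitation the abstract points out.
    """
    xs = [math.log(h - h0) for h in stages]
    ys = [math.log(q) for q in discharges]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = exponent of the power law
    a = math.exp(y_mean - b * x_mean)  # intercept back-transformed
    return a, b


# Hypothetical, noise-free stage (m) and discharge (m^3/s) data.
h0 = 0.2
stages = [0.5, 0.8, 1.2, 1.9, 2.7, 3.5]
discharges = [5.0 * (h - h0) ** 1.8 for h in stages]

a, b = fit_rating_curve(stages, discharges, h0=h0)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Because the fit minimizes squared errors of $\log Q$, it implicitly assumes a constant error variance on the log scale, which is why a separate variance function is needed when that assumption fails.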

Seven-Parameter Log Linear Model for Estimating Constituent Loads in Nakdong River

  • 이아연;최대규;김상단
    • 한국수자원학회:학술대회논문집
    • /
    • Korea Water Resources Association 2010 Conference
    • /
    • pp.1400-1404
    • /
    • 2010
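  • This study presents a load estimation technique applicable to the monitoring system of the total pollution load management system currently in operation. Streamflow data at 8-day intervals were extended to daily streamflow data using a modified TANK model. Pollutant loads were then estimated with a seven-parameter log-linear model using minimum variance unbiased estimation. The results were satisfactory for TOC and BOD load estimation. As an application of the study, load duration curves for TOC and BOD in the Nakdong River basin were constructed to examine the overall distribution.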
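The abstract above does not state the regression equation. As an assumption, the form below is the seven-parameter log-linear model as it commonly appears in the load-estimation literature, where $L$ is the constituent load, $Q$ the streamflow, $T$ time in decimal years, and $\tilde{Q}$ and $\tilde{T}$ centering constants. The symbols are chosen for this sketch and may differ from the paper's notation.

```latex
\ln L = \beta_0
      + \beta_1 \ln\!\left(\frac{Q}{\tilde{Q}}\right)
      + \beta_2 \left[\ln\!\left(\frac{Q}{\tilde{Q}}\right)\right]^{2}
      + \beta_3 \,(T-\tilde{T})
      + \beta_4 \,(T-\tilde{T})^{2}
      + \beta_5 \sin(2\pi T)
      + \beta_6 \cos(2\pi T)
      + \varepsilon
```

The seven parameters are $\beta_0$ through $\beta_6$: a constant, two flow terms, two trend terms, and two seasonal terms. Because the model is fitted on the log scale, back-transforming to loads needs a bias correction, which is the role of the minimum variance unbiased estimation mentioned in the abstract.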


The Comparative Study of Software Optimal Release Time of Finite NHPP Model Considering Log Linear Learning Factor

  • 김희철;신현철
    • 융합보안논문지
    • /
    • Vol. 12, No. 6
    • /
    • pp.3-10
    • /
    • 2012
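  • This study addresses the release problem of deciding when to deliver a software product to users after development and testing. It is based on a non-homogeneous Poisson process with a finite number of failures that takes into account a learning factor in the process of removing or correcting software defects. Because the lifetime intensity can accommodate a variety of shape and scale parameters, the release-time problem is formulated using the log-linear model, which is widely used in the reliability field. An optimal software release policy that satisfies the required software reliability while minimizing the total cost of software development and maintenance is discussed. In the numerical example, failure time data were used, the parameters were estimated by maximum likelihood, and the optimal release time was estimated.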

Diagnostics for Heteroscedasticity in Mixed Linear Models

  • Ahn, Chul-Hwan
    • Journal of the Korean Statistical Society
    • /
    • Vol. 19, No. 2
    • /
    • pp.171-175
    • /
    • 1990
  • A diagnostic test for detecting nonconstant variance in mixed linear models based on the score statistic is derived through the technique of model expansion, and compared to the log likelihood ratio test.


An Approach to Survey Data with Nonresponse: Evaluation of KEPEC Data with BMI

  • 백지은;강위창;이영조;박병주
    • Journal of Preventive Medicine and Public Health
    • /
    • Vol. 35, No. 2
    • /
    • pp.136-140
    • /
    • 2002
  • Objectives: A common problem in analyzing survey data is incomplete data caused by nonresponse or missing values. The mail questionnaire survey conducted in 1996 to collect lifestyle variables from the members of the Korean Elderly Pharmacoepidemiologic Cohort (KEPEC) contains some nonresponse or missing data. A proper statistical method was applied to evaluate the missing pattern of a specific KEPEC data set, which had no missing data in the independent variables and missing data in the response variable, BMI. Methods: The study subjects were 8,689 elderly people. Initially, the BMI and the significant variables that influenced the BMI were categorized. After fitting the log-linear model, the probabilities of people falling in each category were estimated. The EM algorithm was implemented using a log-linear model to determine the missing mechanism causing the nonresponse. Results: Age, smoking status, and a preference for spicy hot food were chosen as the variables that influenced the BMI. When the nonignorable and ignorable nonresponse log-linear models considering these variables were fitted, the difference in deviance between the two models was 0.0034 (df=1). Conclusion: Inference about the variables carries considerable risk, even with large samples, if it is made without considering the pattern of missing data. On the basis of these results, the missing data in the BMI are ignorable nonresponse. Therefore, when analyzing the BMI in the KEPEC data, inference can be made without considering the missing data.
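The conclusion above rests on the deviance difference of 0.0034 with 1 degree of freedom. A minimal sketch of how such a difference is usually judged follows, assuming it is referred to a chi-square distribution with 1 degree of freedom as in a likelihood ratio test between nested models. The abstract reports only the difference and the degrees of freedom, so the test procedure and the function name are assumptions.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)), valid for 1 df only.
    """
    return math.erfc(math.sqrt(x / 2.0))


deviance_difference = 0.0034  # value reported in the abstract (df = 1)
p_value = chi2_sf_df1(deviance_difference)

print(f"p-value = {p_value:.3f}")
print("nonignorable term needed" if p_value < 0.05 else "ignorable model adequate")
```

A difference this small is far below the 5% critical value of about 3.84, which is consistent with the abstract's conclusion that the nonresponse can be treated as ignorable.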