• Title/Abstract/Keywords: Binary response regression

Search results: 44 items; processing time: 0.029 s

System Identification of a Diesel Engine -Throttle-Smoke Response-

  • 조한근
    • Journal of Biosystems Engineering
    • /
    • Vol. 16, No. 2
    • /
    • pp.111-117
    • /
    • 1991
  • An empirical model for diesel engine control was obtained using a system identification method. A pseudo-random binary sequence was used as the input signal. Spectral analysis was used to find the frequency response of the system. Model parameters of the transfer functions were obtained using nonlinear regression.
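The abstract does not say how the pseudo-random binary sequence was generated. As a minimal sketch, assuming a 7-bit linear-feedback shift register with the common PRBS7 feedback polynomial x^7 + x^6 + 1 (the function name `prbs7`, the seed, and the register length are illustrative choices, not details from the paper), such an input signal could be produced like this:

```python
def prbs7(length=127, seed=0x7F):
    """Return `length` bits of a PRBS7 sequence (feedback polynomial x^7 + x^6 + 1).

    Hypothetical helper for illustration; the paper's actual generator is not specified.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, or the register stays at zero")
    bits = []
    for _ in range(length):
        # XOR the two tapped bits (stages 7 and 6) to form the feedback bit.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        bits.append(new_bit)
        # Shift left, feed the new bit in, and keep only 7 bits of state.
        state = ((state << 1) | new_bit) & 0x7F
    return bits


if __name__ == "__main__":
    sequence = prbs7()
    # Map bits {0, 1} to input levels {-1, +1} for use as an excitation signal.
    excitation = [2 * b - 1 for b in sequence]
    print(len(sequence), sum(sequence))
```

The sequence repeats every 127 steps, so in an identification experiment the bit duration and register length would be chosen to cover the frequency band of interest.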

Forecasting Probability of Precipitation Using Markov Logistic Regression Model

  • Park, Jeong-Soo;Kim, Yun-Seon
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 14, No. 1
    • /
    • pp.1-9
    • /
    • 2007
  • A three-state Markov logistic regression model is suggested to forecast the probability of tomorrow's precipitation based on the current meteorological situation. The suggested model turns out to be better than the Markov regression model in terms of the mean squared error of forecasting for the rainfall data of the Seoul area.

Fuzzy c-Logistic Regression Model in the Presence of Noise Cluster

  • Alanzado, Arnold C.;Miyamoto, Sadaaki
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • Korea Fuzzy Logic and Intelligent Systems Society, ISIS 2003 (2003)
    • /
    • pp.431-434
    • /
    • 2003
  • In this paper we introduce a modified objective function for fuzzy c-means clustering with a logistic regression model in the presence of a noise cluster. The logistic regression model is commonly used to describe the effect of one or several explanatory variables on a binary response variable. In real applications there is very often no sharp boundary between clusters, so fuzzy clustering is often better suited to the data.

Goodness-of-fit tests for a proportional odds model

  • Lee, Hyun Yung
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 24, No. 6
    • /
    • pp.1465-1475
    • /
    • 2013
  • The chi-square type test statistic is the most commonly used test of goodness-of-fit for the multinomial logistic regression model, in which grouped (binomial) data and ungrouped (binary) data are classified by covariate pattern. The chi-square type statistic is not a satisfactory gauge, however, because the ungrouped Pearson chi-square statistic is not well approximated by the chi-square distribution and is also not a satisfactory measure in itself. Currently, goodness-of-fit in the ordinal setting is often assessed using the Pearson chi-square statistic and deviance tests. These tests involve creating a contingency table in which rows consist of all possible cross-classifications of the model covariates, and columns consist of the levels of the ordinal response. I examined goodness-of-fit tests for a proportional odds logistic regression model, the most commonly used regression model for an ordinal response variable. Using a simulation study, I investigated the distribution and power properties of this test and compared these with those of three other goodness-of-fit tests. The new test had lower power than the existing tests; however, it was able to detect a greater number of the different types of lack of fit considered in this study. I illustrated the ability of the tests to detect lack of fit using a study of aftercare decisions for psychiatrically hospitalized adolescents.

Geographically weighted kernel logistic regression for small area proportion estimation

  • Shim, Jooyong;Hwang, Changha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 27, No. 2
    • /
    • pp.531-538
    • /
    • 2016
  • In this paper we deal with small area estimation for the case in which the response variables take binary values. Mixed effects models, which treat the spatial effects as random effects, have been extensively studied for small area estimation. However, when the spatial information of each area is given specifically as coordinates, it is popular to use geographically weighted logistic regression to incorporate the spatial information by assuming that the regression parameters vary spatially across areas. In this paper, we relax the linearity assumption and propose a geographically weighted kernel logistic regression for estimating small area proportions using the basic principle of kernel machines. Numerical studies have been carried out to compare the performance of the proposed method with other methods in estimating small area proportions.

An Empirical Study on Dimension Reduction

  • Suh, Changhee;Lee, Hakbae
    • Journal of the Korean Data Analysis Society
    • /
    • Vol. 20, No. 6
    • /
    • pp.2733-2746
    • /
    • 2018
  • The two inverse regression estimation methods for the central space, SIR and SAVE, are computationally easy and widely used. However, SIR and SAVE may perform poorly in finite samples and need strong assumptions (linearity and/or constant covariance conditions) on the predictors. The two non-parametric estimation methods, MAVE and dMAVE, perform much better in finite samples than SIR and SAVE. MAVE and dMAVE need no strong requirements on the predictors or on the response variable. MAVE focuses on estimating the central mean subspace, whereas dMAVE estimates the central space. This paper explores and compares the four methods to explain dimension reduction. The algorithm of each of the four methods is reviewed. An empirical study on simulated data shows that MAVE and dMAVE perform relatively better than SIR and SAVE, regardless of both the model and the distributional assumptions on the predictors. However, a real data example with a binary response demonstrates that SAVE is better than the other methods.

Variable Selection with Log-Density in Logistic Regression Model

  • 강명욱;신은영
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 19, No. 1
    • /
    • pp.1-11
    • /
    • 2012
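  • In the logistic regression model, the log-density ratio of the conditional distributions of an explanatory variable given the response variable provides useful information for the variable selection problem of which explanatory variables enter the model and in what form. When the conditional distribution of the explanatory variable is not symmetric, it is appropriate to assume a gamma distribution. The results of various simulations show that when the two distributions of $x \mid y = 0$ and $x \mid y = 1$ overlap, both the x term and the log(x) term are needed. When the two distributions are separated, only one of the x term and the log(x) term is needed.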
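To see why a gamma assumption can call for both an x term and a log(x) term, the following worked derivation may help. It is a sketch under stated assumptions, not the paper's own presentation: the shape-rate parameterization $(\alpha_j, \beta_j)$, the prior class probabilities $\pi_j = P(y = j)$, and the constant $c$ are notation introduced here.

$$
\begin{aligned}
f_j(x) &= \frac{\beta_j^{\alpha_j}}{\Gamma(\alpha_j)}\, x^{\alpha_j - 1} e^{-\beta_j x}, \qquad x > 0,\; j \in \{0, 1\}, \\
\log\frac{f_1(x)}{f_0(x)} &= c + (\alpha_1 - \alpha_0)\log x - (\beta_1 - \beta_0)\,x, \\
c &= \alpha_1\log\beta_1 - \alpha_0\log\beta_0 - \log\Gamma(\alpha_1) + \log\Gamma(\alpha_0), \\
\operatorname{logit} P(y = 1 \mid x) &= \log\frac{\pi_1}{\pi_0} + \log\frac{f_1(x)}{f_0(x)}.
\end{aligned}
$$

Under these assumptions the logit is linear in $x$ and $\log x$: the $\log x$ coefficient vanishes only when the two shapes are equal, and the $x$ coefficient vanishes only when the two rates are equal.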

Two-stage imputation method to handle missing data for categorical response variable

  • Jong-Min Kim;Kee-Jae Lee;Seung-Joo Lee
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 30, No. 6
    • /
    • pp.577-587
    • /
    • 2023
  • Conventional categorical data imputation techniques, such as mode imputation, often encounter issues related to overestimation. If the variable has too many categories, the multinomial logistic regression imputation method may be infeasible due to computational limitations. To address these limitations, we propose a two-stage imputation method. During the first stage, we apply the Boruta variable selection method to the complete dataset to identify significant variables for the target categorical variable. Then, in the second stage, we use the selected variables in logistic regression to impute missing data in binary variables, in polytomous regression to impute missing data in categorical variables, and in predictive mean matching to impute missing data in quantitative variables. Through analysis of both asymmetric and non-normal simulated and real data, we demonstrate that the two-stage imputation method outperforms imputation methods lacking variable selection, as evidenced by accuracy measures. In the analysis of real survey data, we also demonstrate that our suggested two-stage imputation method surpasses the current imputation approach in terms of accuracy.

Goodness of Link Tests for Binary Response Data

  • Yeo, In-Kwon
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 8, No. 2
    • /
    • pp.357-366
    • /
    • 2001
  • The present paper develops a method to check the propriety of link functions for binary data. In order to parameterize a certain type of goodness of the link, a family of link functions indexed by a shape parameter is proposed. I first investigate the maximum likelihood estimation of the shape parameter as well as the regression parameters and then derive the large-sample behavior of the estimators. A score test is considered to evaluate the goodness of the current link function. For illustration, I employ two families of power transformations, the modulus transformation by John and Draper (1980) and the extended power transformation by Yeo and Johnson (2000), which are appropriate to detect symmetric and asymmetric inadequacy of the selected link function, respectively.
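For readers unfamiliar with the extended power transformation of Yeo and Johnson (2000), a minimal sketch of the transformation itself is given below. The function name `yeo_johnson` and the tolerance used to detect the limiting cases are illustrative choices, and the way the paper embeds this family into a link function is not stated in the abstract, so it is not shown here.

```python
import math


def yeo_johnson(y, lam, tol=1e-12):
    """Yeo-Johnson power transformation of a single real value y with parameter lam."""
    if y >= 0:
        if abs(lam) < tol:
            # Limiting case lam -> 0 on the non-negative side.
            return math.log1p(y)
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < tol:
        # Limiting case lam -> 2 on the negative side.
        return -math.log1p(-y)
    return -(((1.0 - y) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


if __name__ == "__main__":
    # lam = 1 leaves values unchanged; other values bend the two sides differently.
    for lam in (0.0, 0.5, 1.0, 2.0):
        print(lam, [round(yeo_johnson(v, lam), 4) for v in (-2.0, -0.5, 0.0, 0.5, 2.0)])
```

Because the positive and negative sides are transformed with different exponents (lam and 2 - lam), the family is asymmetric about zero unless lam = 1, which is consistent with the abstract's use of it to detect asymmetric inadequacy.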

Recommended Rice Intake Levels Based on Average Daily Dose and Urinary Excretion of Cadmium in a Cadmium-Contaminated Area of Northwestern Thailand

  • La-Up, Aroon;Wiwatanadate, Phongtape;Pruenglampoo, Sakda;Uthaikhup, Sureeporn
    • Toxicological Research
    • /
    • Vol. 33, No. 4
    • /
    • pp.291-297
    • /
    • 2017
  • This study was performed to investigate the dose-response relationship between average daily cadmium dose (ADCD) from rice and the occurrence of urinary cadmium (U-Cd) in individuals eating that rice. This was a retrospective cohort study designed to compare populations from two areas with different levels of cadmium contamination. Five hundred and sixty-seven participants aged 18 years or older were interviewed to estimate their rice intake and were assessed for U-Cd. The sources of consumed rice were sampled for cadmium measurement, from which the ADCD was estimated. Binary logistic regression was used to examine the association between ADCD and U-Cd (cut-off point at $2\,\mu\text{g/g}$ creatinine), and a correlation between them was established. The lowest estimate was ADCD = $0.5\,\mu\text{g/kg bw/day}$ [odds ratio (OR) = 1.71; 95% confidence interval (CI), 1.02-2.87]. For comparison, the relationship in the contaminated area is expressed by ADCD = $0.7\,\mu\text{g/kg bw/day}$ [OR = 1.84; 95% CI, 1.06-3.19], while no relationship was found in the non-contaminated area, meaning that the highest level at which this relationship does not exist is ADCD = $0.6\,\mu\text{g/kg bw/day}$ [95% CI, 0.99-2.95]. Rice, as a main staple food, is the most likely source of dietary cadmium. Abstaining from or limiting rice consumption, therefore, will increase the likelihood of maintaining U-Cd within the normal range. As the recommended maximum ADCD is not to exceed $0.6\,\mu\text{g/kg bw/day}$, the consumption of rice grown in cadmium-contaminated areas should not be more than 246.8 g/day. However, the exclusion of many edible plants grown in the contaminated area from the analysis might result in an estimated ADCD that does not reflect the true level of cadmium exposure among local people.