• Title/Summary/Keyword: Logistic Modeling

A Study on Improving the predict accuracy rate of Hybrid Model Technique Using Error Pattern Modeling : Using Logistic Regression and Discriminant Analysis

  • Cho, Yong-Jun;Hur, Joon
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.269-278 / 2006
  • This paper presents a new hybrid data mining technique that uses error pattern modeling to improve classification accuracy. The proposed method improves classification accuracy by combining two different supervised learning methods. The main algorithm models the error pattern between the two supervised learning methods (e.g., neural networks, decision trees, logistic regression). The proposed modeling method was applied to a simulation of 10,000 data sets generated from normal and exponential random distributions. The simulation results show that the proposed method outperforms existing methods such as logistic regression and discriminant analysis.

Modeling for Prediction of Potato Late Blight (Phytophthora infestans) (감자역병 진전도 예측모형 작성)

  • 안재훈;함영일;신관용
    • Korean Journal Plant Pathology / v.14 no.4 / pp.331-338 / 1998
  • To develop a model for predicting potato late blight progress, the relationship between the late blight severity index, transformed by the logit and gompit functions, and the cumulative severity value (CSV) derived from weather data during the growing season in the Taegwallyeong alpine area from 1975 to 1992 was examined. When the logistic and Gompertz models were compared for goodness of fit to late blight progress with CSV as the independent variable, the coefficient of determination was higher for the logistic model (0.742) than for the Gompertz model (0.680). The logistic model has two parameters: the progressive rate and the initial value. The initial value was calculated as -3.664. Without fungicide sprays, the progressive rate of potato late blight was 0.137 in cv. Superior, 0.136 in cv. Irish Cobbler, and 0.070 in cv. Jopung. The progressive rate decreased as the number of sprays increased, falling to 0.020 in cv. Superior under the conventional fungicide program of 10 sprays per cropping season. The equation for the progressive rate, b1 = 0.0088 ACSV - 0.033 (R2 = 0.976), was obtained by examining the relationship between the progressive rate of late blight and the average CSV (ACSV), which quantifies weather information. From the estimated parameters of the logistic function, a model describing late blight progress in cv. Superior was formulated as Y = 4/(1 + 39.0·exp((0.0088 ACSV - 0.033)·CSV)).
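
The relationship between the quoted parameters can be checked with a short Python sketch. It is an illustration only, not the authors' code. Two assumptions are made here that the abstract does not state: the exponent is taken as negative, so that severity rises with CSV when the progressive rate is positive, and the gompit transform is taken in its usual form, -ln(-ln p). The ACSV value used in the usage line is hypothetical.

```python
import math

# Values quoted in the abstract.
INITIAL_LOGIT = -3.664   # initial value of the logistic model
MAX_SEVERITY = 4.0       # numerator of the fitted equation


def progressive_rate(acsv):
    """Progressive rate b1 as a linear function of the average CSV (ACSV)."""
    return 0.0088 * acsv - 0.033


def severity(csv, acsv):
    """Severity Y at cumulative severity value `csv`.

    Assumption: the exponent is negative, so Y increases with CSV
    whenever the progressive rate is positive.
    """
    b1 = progressive_rate(acsv)
    # exp(-INITIAL_LOGIT) = exp(3.664), which is about 39.0
    return MAX_SEVERITY / (1.0 + math.exp(-INITIAL_LOGIT) * math.exp(-b1 * csv))


def logit(p):
    """Logit transform of a proportion p in (0, 1)."""
    return math.log(p / (1.0 - p))


def gompit(p):
    """Gompit transform of a proportion p in (0, 1), assumed as -ln(-ln p)."""
    return -math.log(-math.log(p))


# Hypothetical usage: severity after CSV = 50 in a season with ACSV = 10.
example_severity = severity(50.0, 10.0)
```

Under these assumptions the constant 39.0 in the fitted equation is exp(3.664), so the logit of Y/4 at CSV = 0 recovers the reported initial value of -3.664.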

Modeling the Growth of Neurology Literature

  • Hadagali, Gururaj S.;Anandhalli, Gavisiddappa
    • Journal of Information Science Theory and Practice / v.3 no.3 / pp.45-63 / 2015
  • The word ‘growth’ represents an increase in actual size, implying a change of state. In science and technology, growth may imply an increase in the number of institutions, scientists, publications, etc. The present study examines the growth of neurology literature for the period 1961-2010. A total of 291,702 records covering fifty years were extracted from the Science Direct Database. The Relative Growth Rate (RGR) and Doubling Time (Dt.) of neurology literature were calculated, and different growth patterns were tested to check whether neurology literature fits the exponential, linear, or logistic model. The results indicate that the growth of literature in neurology follows neither the linear nor the logistic growth model; it closely follows the exponential growth model. The study concludes that there has been a consistent trend towards increased growth of literature in the field of neurology.
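
The abstract does not give formulas for RGR and Dt. The sketch below uses the definitions commonly applied in bibliometric growth studies, which is an assumption about this paper: RGR is the change in the natural log of the cumulative count per unit time, and Dt is ln 2 divided by RGR. The counts in the usage line are hypothetical.

```python
import math


def relative_growth_rate(count_start, count_end, t_start, t_end):
    """Increase in ln(cumulative count) per unit of time."""
    return (math.log(count_end) - math.log(count_start)) / (t_end - t_start)


def doubling_time(rgr):
    """Time needed for the count to double at a constant growth rate."""
    return math.log(2.0) / rgr


# Hypothetical usage: cumulative records grow from 1,000 to 2,000 in one year.
rgr = relative_growth_rate(1000, 2000, 0, 1)
dt = doubling_time(rgr)
```

A count that doubles in one unit of time gives RGR = ln 2 and a doubling time of one unit, which is a quick sanity check on both definitions.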

An Analysis on Relations between Design Errors Detected during BIM-based Design Validation and the Impacts Using Logistic Regression (로지스틱 회귀분석을 이용한 BIM 설계 검토에 의하여 발견된 설계 오류와 그 영향도간의 관계 분석)

  • Won, Jongsung
    • Proceedings of the Korean Institute of Building Construction Conference / 2017.05a / pp.264-265 / 2017
  • This paper analyzes the relations between design errors prevented by building information modeling (BIM)-based design validation and their impacts, in order to identify critical factors for successfully implementing BIM-based design validation in architecture, engineering, and construction (AEC) projects. More than 800 design errors detected by BIM-based design validation in two BIM-based projects in South Korea are categorized according to their causes and work types. The relations between the causes and work types of design errors and the project delay, cost overrun, low quality, and rework that the errors can cause are analyzed using logistic regression. The characteristics of each design error are analyzed through face-to-face interviews with practitioners in the two BIM-based projects. The results show that design error causes had the greatest impact on predicting project delay, cost overrun, low quality, and rework generation.

Comparison of the Performance of Log-logistic Regression and Artificial Neural Networks for Predicting Breast Cancer Relapse

  • Faradmal, Javad;Soltanian, Ali Reza;Roshanaei, Ghodratollah;Khodabakhshi, Reza;Kasaeian, Amir
    • Asian Pacific Journal of Cancer Prevention / v.15 no.14 / pp.5883-5888 / 2014
  • Background: Breast cancer is the most common cancer in female populations. The exact cause is not known, but it is most likely a combination of genetic and environmental factors. The log-logistic model (LLM) is applied as a statistical method for predicting survival and its influencing factors. In recent decades, artificial neural network (ANN) models have been increasingly applied to predict survival data. The present research was conducted to compare log-logistic regression and artificial neural network models in the prediction of breast cancer (BC) survival. Materials and Methods: A historical cohort study was established with 104 patients suffering from BC from 1997 to 2005. To compare the ANN and LLM in our setting, we used the estimated areas under the receiver-operating characteristic (ROC) curve (AUC) and integrated AUC (iAUC). The data were analyzed using R statistical software. Results: The AUCs for the first, second, and third years after diagnosis were 0.918, 0.780, and 0.800 for the ANN, and 0.834, 0.733, and 0.616 for the LLM, respectively. The mean AUC for the ANN was statistically higher than that of the LLM (0.845 vs. 0.744). Hence, this study showed a significant difference in predictive performance between the ANN and the LLM. Conclusions: This study demonstrated that the predictive ability of the ANN was higher than that of the LLM. Thus, the use of the ANN method for prediction of survival in the field of breast cancer is suggested.
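
For readers unfamiliar with the comparison metric, the sketch below computes a plain binary AUC as the fraction of (event, non-event) pairs in which the event case receives the higher score, with ties counted as half. This is a simplification: the study used time-dependent and integrated AUCs for survival data and ran its analysis in R, so the function and the toy scores here are illustrative assumptions only.

```python
def auc(scores, labels):
    """Binary AUC by pairwise comparison (Mann-Whitney formulation).

    `labels` holds 1 for cases with the event and 0 for cases without it.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # correctly ordered pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores from two models on the same four patients.
labels = [1, 1, 0, 0]
auc_model_a = auc([0.9, 0.8, 0.3, 0.1], labels)
auc_model_b = auc([0.9, 0.2, 0.8, 0.1], labels)
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation, which is the scale on which the reported values should be read.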

Modeling Age-specific Cancer Incidences Using Logistic Growth Equations: Implications for Data Collection

  • Shen, Xing-Rong;Feng, Rui;Chai, Jing;Cheng, Jing;Wang, De-Bin
    • Asian Pacific Journal of Cancer Prevention / v.15 no.22 / pp.9731-9737 / 2014
  • Large-scale secular registry or surveillance systems have been accumulating vast data that allow mathematical modeling of cancer incidence and mortality rates. Most contemporary models in this regard use time series and APC (age-period-cohort) methods and focus primarily on predicting or analyzing cancer epidemiology, with little attention paid to implications for designing cancer registry, surveillance, or evaluation initiatives. This research models age-specific cancer incidence rates using logistic growth equations and explores their performance under different scenarios of data completeness, in the hope of deriving clues for reshaping relevant data collection. The study used the China Cancer Registry Report 2012 as the data source. It employed 3-parameter logistic growth equations and modeled the age-specific incidence rates of all cancers combined and of the top 10 cancers presented in the registry report. The study performed 3 types of modeling: full age-span fitting, multiple 5-year-segment fitting, and single-segment fitting. Model performance was measured with an adjusted goodness of fit that combines the sum of squared residuals and relative errors. Both model simulation and performance evaluation used self-developed algorithms programmed in C# with MS Visual Studio 2008. For models built on full age-span data, predicted age-specific cancer incidence rates fitted observed values very well for most cancers (except cervical and breast), with estimated goodness of fit (Rs) over 0.96. For a given cancer, the R value of the logistic growth model derived from urban residents' data was greater than or equal to that of the same model built on data from rural residents. For models based on multiple 5-year-segment data, the Rs remained fairly high (over 0.89) until three-fourths of the data segments were excluded. For models using a fixed-length single segment of observed data, the older the age covered by the data segment, the higher the resulting Rs. Logistic growth models describe age-specific incidence rates perfectly for most cancers and may be used to inform data collection for monitoring and analyzing cancer epidemics. With appropriate logistic growth equations, the work volume of contemporary data collection, e.g., cancer registry and surveillance systems, may be reduced substantially.
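
The abstract does not spell out the 3-parameter equation or the adjusted goodness-of-fit measure, and the original work was written in C#. The Python sketch below therefore rests on two assumptions: a common 3-parameter logistic form with a ceiling, a growth rate, and a midpoint age, and the ordinary coefficient of determination as a stand-in for the paper's R. All parameter values and the age grid are hypothetical.

```python
import math


def logistic_growth(age, ceiling, rate, midpoint):
    """Assumed 3-parameter logistic curve: ceiling / (1 + exp(-rate * (age - midpoint)))."""
    return ceiling / (1.0 + math.exp(-rate * (age - midpoint)))


def r_squared(observed, predicted):
    """Ordinary coefficient of determination (stand-in for the paper's adjusted measure)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical usage: incidence per 100,000 at the midpoints of 5-year age groups.
ages = [22.5 + 5 * i for i in range(12)]
curve = [logistic_growth(a, 500.0, 0.1, 60.0) for a in ages]
noisy = [y + (3.0 if i % 2 else -3.0) for i, y in enumerate(curve)]
fit_quality = r_squared(noisy, curve)
```

Dropping entries from `ages` and recomputing the fit quality mimics, in a very reduced form, the segment-exclusion scenarios the abstract describes.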

MARS Modeling for Ordinal Categorical Response Data: A Case Study

  • Kim, Ji-Hyun
    • Communications for Statistical Applications and Methods / v.7 no.3 / pp.711-720 / 2000
  • This case study models ordinal categorical response data with the MARS (multivariate adaptive regression splines) method. It analyzes the effect of personal characteristics and socioeconomic status on teenage marijuana use. The MARS method gave new insight into the data set.

Modeling the Natural Occurrence of Selected Dipterocarp Genera in Sarawak, Borneo

  • Teo, Stephen;Phua, Mui-How
    • Journal of Forest and Environmental Science / v.28 no.3 / pp.170-178 / 2012
  • Dipterocarps, or Dipterocarpaceae, are a commercially important timber-producing and dominant keystone tree family in the rain forests of Borneo. Borneo's landscape has been changing at an unprecedented rate in recent years, which affects this important biodiversity. This paper attempts to model the natural occurrence of dipterocarp species in Sarawak (their distribution including areas that held natural forest before conversion to other land uses, as opposed to their current distribution), which is important for forest biodiversity conservation and management. A local modeling method, Inverse Distance Weighting, was compared with a commonly used statistical method, Binary Logistic Regression, to build the best natural distribution models for three genera (12 species) of dipterocarps. A database of species occurrence data and pseudoabsence data was constructed and divided into two halves for model building and validation. For logistic regression modeling, climatic, topographical, and edaphic parameters were used. To avoid over-fitting, proxy variables were used to represent parameters that were highly correlated (p>0.75). The results show that Inverse Distance Weighting produced the best and most consistent predictions, with an average accuracy of over 80%. This study demonstrates that a local interpolation method can be used to model the natural distribution of dipterocarp species. Inverse Distance Weighting proved to be the better method, and possible reasons are discussed.
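
Inverse Distance Weighting predicts a value at an unsampled location as a weighted average of nearby samples, with weights that shrink as distance grows. The Python sketch below is a generic version, not the study's implementation: the power of 2, the use of all samples rather than a search neighborhood, and the 0.5 presence threshold are assumptions, and the coordinates are hypothetical.

```python
import math


def idw_predict(samples, query, power=2.0):
    """Inverse-distance-weighted average at `query`.

    `samples` is a list of ((x, y), value) pairs; for occurrence data the
    value is 1.0 for presence and 0.0 for pseudoabsence.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in samples:
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            return value             # query coincides with a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical usage: two presence points and one pseudoabsence point.
samples = [((0.0, 0.0), 1.0), ((1.0, 0.0), 1.0), ((10.0, 0.0), 0.0)]
score = idw_predict(samples, (2.0, 0.0))
predicted_present = score >= 0.5     # assumed classification threshold
```

Because the prediction depends only on neighboring observations, the method needs no environmental covariates, which is one practical difference from the logistic regression models it was compared against.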

An Introduction to Logistic Regression: From Basic Concepts to Interpretation with Particular Attention to Nursing Domain

  • Park, Hyeoun-Ae
    • Journal of Korean Academy of Nursing / v.43 no.2 / pp.154-164 / 2013
  • Purpose: The purpose of this article is twofold: 1) to introduce logistic regression (LR), a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, and 2) to examine the use and reporting of LR in the nursing literature. Methods: Textbooks on LR and research articles employing LR as the main statistical analysis were reviewed. Twenty-three articles published between 2010 and 2011 in the Journal of Korean Academy of Nursing were analyzed for proper use and reporting of LR models. Results: Logistic regression was presented from basic concepts, such as odds, odds ratio, logit transformation, and the logistic curve, through assumptions, fitting, reporting, and interpretation, to cautions. Substantial shortcomings were found in both the use of LR and the reporting of results. For many studies, the sample size was not sufficiently large, which calls into question the accuracy of the regression model. Additionally, only one study reported a validation analysis. Conclusion: Nursing researchers need to pay greater attention to guidelines concerning the use and reporting of LR models.
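
The basic quantities named in the abstract are related by a few one-line formulas, sketched here in Python with their standard textbook definitions. The probabilities and the coefficient in the usage lines are hypothetical and are not taken from the article.

```python
import math


def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of the odds in two groups."""
    return odds(p_exposed) / odds(p_unexposed)


def logit(p):
    """Logit transformation: natural log of the odds."""
    return math.log(odds(p))


def logistic(z):
    """Logistic curve, the inverse of the logit."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical usage: event probability 0.8 in one group and 0.5 in another.
example_or = odds_ratio(0.8, 0.5)

# In a fitted LR model, exp(coefficient) is the odds ratio per one-unit
# increase in that predictor; 0.7 is a made-up coefficient.
or_per_unit = math.exp(0.7)
```

Reporting exp(coefficient) with its confidence interval, rather than the raw coefficient alone, is what makes an LR result interpretable as an odds ratio.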

Nonparametric logistic regression based on sparse triangulation over a compact domain

  • Seoyeon Kim;Kwan-Young Bak
    • Communications for Statistical Applications and Methods / v.31 no.5 / pp.557-569 / 2024
  • Based on the investigation of logistic regression models utilizing sparse triangulation within a compact domain in ℝ², this study addresses the limited research extending the triogram model to logistic regression. A primary challenge arises from the potential instability induced by a large number of vertices, hindering the effective modeling of complex relationships. To mitigate this challenge, we propose introducing sparsity to boundary vertices of the triangulation based on the Ramer-Douglas-Peucker algorithm and employing the K-means algorithm for adaptive vertex initialization. A second order coordinate-wise descent algorithm is adopted to implement the proposed method. Validation of the proposed algorithm's stability and performance assessment are conducted using synthetic and handwritten digit data (LeCun et al., 1989). Results demonstrate the advantages of our method over existing methodologies, particularly when dealing with non-rectangular data domains.
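
The Ramer-Douglas-Peucker algorithm mentioned above thins a polyline by keeping only the points that deviate from a straight-line approximation by more than a tolerance. The Python sketch below shows the generic algorithm on an open polyline. How the authors apply it to the boundary vertices of a triangulation is not described in the abstract, so the tolerance and the sample points are hypothetical.

```python
import math


def _distance_to_line(point, start, end):
    """Perpendicular distance from `point` to the line through `start` and `end`."""
    (px, py), (ax, ay), (bx, by) = point, start, end
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)   # degenerate line: start == end
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def rdp(points, epsilon):
    """Simplify a polyline, dropping points closer than `epsilon` to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = _distance_to_line(points[i], first, last)
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist > epsilon:
        # Keep the farthest point and simplify each half recursively.
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right              # avoid duplicating the split point
    return [first, last]


# Hypothetical usage: a boundary path with one near-collinear vertex.
path = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
simplified = rdp(path, 0.5)
```

A larger tolerance removes more vertices, which matches the abstract's motivation of reducing the number of vertices to limit instability.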