• Title/Summary/Keyword: logistic regression

Application Method of Logistic Regression Analysis for Annoyance Prediction Model Based on Predicted Noise Level (예측소음도를 이용한 어노이언스 예측모델을 위한 로지스틱 회귀분석의 적용방법)

  • Son, Jin-Hee;Lee, Kun;Choung, Tae-Ryang;Chang, Seo-Il
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.20 no.6 / pp.555-561 / 2010
  • Predicted noise levels have been used to assess annoyance response since noise maps came into general use and became the standard method for assessing environmental noise. Unfortunately, using predicted noise levels to derive an annoyance prediction curve causes some problems: the data have to be grouped manually before the curve can be used. The aim of this paper is to propose a method for handling predicted noise levels and survey data when deriving an annoyance prediction curve. The paper uses the percentage of persons annoyed (%A) and the percentage of persons highly annoyed as descriptors of noise annoyance in a population. Logistic regression was used to derive the annoyance prediction curve. It is concluded that dichotomizing the data and applying logistic regression is suitable for handling predicted noise levels and survey data.
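
The dichotomize-then-fit idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the survey data are synthetic, the 5-point annoyance scale, the cutoffs (3 for "annoyed", 4 for "highly annoyed"), the centering level `REF_DB`, and all function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(x, y, iters=25):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # entries of X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        # Newton step: beta += (X^T W X)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Synthetic respondents (assumption): one predicted noise level in dB
# and one 1..5 annoyance score per person.
random.seed(0)
REF_DB = 60.0  # hypothetical centering level, used only for conditioning
levels = [random.uniform(45.0, 80.0) for _ in range(500)]
scores = [min(5, max(1, round(1 + 4 * (lv - 45.0) / 35.0 + random.gauss(0, 1))))
          for lv in levels]

# Dichotomize each response instead of grouping respondents by noise band.
x = [lv - REF_DB for lv in levels]
annoyed = [1 if s >= 3 else 0 for s in scores]
highly_annoyed = [1 if s >= 4 else 0 for s in scores]

a_fit = fit_logistic(x, annoyed)
ha_fit = fit_logistic(x, highly_annoyed)

def percent(fit, level_db):
    # Predicted percentage of respondents at a given noise level.
    b0, b1 = fit
    return 100.0 * sigmoid(b0 + b1 * (level_db - REF_DB))

for level in (50, 60, 70):
    print(level, round(percent(a_fit, level), 1), round(percent(ha_fit, level), 1))
```

Because every respondent contributes one 0/1 observation, no manual binning of noise levels is needed before fitting.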

APPLICATION AND CROSS-VALIDATION OF SPATIAL LOGISTIC MULTIPLE REGRESSION FOR LANDSLIDE SUSCEPTIBILITY ANALYSIS

  • LEE SARO
    • Proceedings of the KSRS Conference / 2004.10a / pp.302-305 / 2004
  • The aim of this study is to apply and cross-validate a spatial logistic multiple-regression model at Boun, Korea, using a Geographic Information System (GIS). Landslide locations in the Boun area were identified by interpretation of aerial photographs and field surveys. Maps of the topography, soil type, forest cover, geology, and land use were constructed from a spatial database. The factors that influence landslide occurrence, such as slope, aspect, and curvature of topography, were calculated from the topographic database. Texture, material, drainage, and effective soil thickness were extracted from the soil database, and type, diameter, and density of forest were extracted from the forest database. Lithology was extracted from the geological database, and land use was classified from a Landsat TM satellite image. Landslide susceptibility was analyzed from the landslide-occurrence factors by logistic multiple-regression methods. For validation and cross-validation, the result of the analysis was applied both to the study area, Boun, and to another area, Youngin, Korea. The validation and cross-validation results showed satisfactory agreement between the susceptibility map and the existing data on landslide locations. The GIS was used to analyze the vast amount of data efficiently, and statistical programs were used to maintain specificity and accuracy.

Two-Stage Logistic Regression for Cancer Classification and Prediction from Copy-Number Changes in cDNA Microarray-Based Comparative Genomic Hybridization

  • Kim, Mi-Jung
    • The Korean Journal of Applied Statistics / v.24 no.5 / pp.847-859 / 2011
  • cDNA microarray-based comparative genomic hybridization (CGH) data include low-intensity spots, so a statistical strategy is needed to detect subtle differences between cancer classes. In this study, genes displaying a high frequency of alteration in one of the classes were selected from among pre-selected genes that show relatively large between-gene variation compared to total variation. Using copy-number changes of the selected genes, this study suggests a statistical approach that predicts patients' classes with increased performance by pre-classifying patients with similar genetic alteration scores. A two-stage logistic regression model (TLRM) was suggested to pre-classify homogeneous patients and predict patients' classes for cancer prediction; a decision tree (DT) was combined with logistic regression on the set of informative genes. The TLRM was constructed on cDNA microarray-based CGH data from the Cancer Metastasis Research Center (CMRC) at Yonsei University; it predicted the patients' clinical diagnoses with perfect matches (except for one patient) among the patients classified as high-risk and low-risk, where prediction performance is critical because of the high sensitivity and specificity required for clinical treatment. Accuracy validated by leave-one-out cross-validation (LOOCV) was 83.3%, while the other classification methods run for comparison, CART and DT, performed worse than the TLRM.

Empirical Analysis on the Relationship between R&D Inputs and Performance Using Successive Binary Logistic Regression Models (연속적 이항 로지스틱 회귀모형을 이용한 R&D 투입 및 성과 관계에 대한 실증분석)

  • Park, Sungmin
    • Journal of Korean Institute of Industrial Engineers / v.40 no.3 / pp.342-357 / 2014
  • The present study analyzes the relationship between research and development (R&D) inputs and the performance of a national technology innovation R&D program using successive binary logistic regression models based on a typical R&D logic model. In particular, this study focuses on answering three main questions: (1) "To what extent do the R&D inputs have an effect on the performance creation?"; (2) "Is an obvious relationship verified between the immediate predecessor and its successor performance?"; and (3) "Is there a difference in the performance creation between R&D government subsidy recipient types and between R&D collaboration types?" Methodologically, binary logistic regression models are established successively, considering the "Success-Failure" binary nature of the performance creation data. An empirical analysis is presented on a sample of n = 2,178 completed R&D projects. This study's major findings are as follows. First, the R&D inputs have a statistically significant relationship only with the short-term technical output, "Patent Registration." Second, strong dependencies are identified between the immediate predecessor and its successor performance. Third, the success probability of the performance creation differs statistically significantly between the aforementioned R&D types. Specifically, compared with "Large Company", "Small and Medium-Sized Enterprise (SMS)" shows a greater success probability of "Sales" and "New Employment." Meanwhile, "R&D Collaboration" achieves a larger success probability of "Patent Registration" and "Sales."

Data Mining for Knowledge Management in a Health Insurance Domain

  • Chae, Young-Moon;Ho, Seung-Hee;Cho, Kyoung-Won;Lee, Dong-Ha;Ji, Sun-Ha
    • Journal of Intelligence and Information Systems / v.6 no.1 / pp.73-82 / 2000
  • This study examined the characteristics of knowledge discovery and data mining algorithms to demonstrate how they can be used to predict health outcomes and provide policy information for hypertension management using the Korea Medical Insurance Corporation database. Specifically, this study validated the predictive power of data mining algorithms by comparing the performance of logistic regression and two decision tree algorithms, CHAID (Chi-squared Automatic Interaction Detection) and C5.0 (a variant of C4.5), since logistic regression has assumed a major position in the healthcare field as a method for predicting or classifying health outcomes based on the specific characteristics of each individual case. This comparison was performed using the test set of 4,588 beneficiaries and the training set of 13,689 beneficiaries that were used to develop the models. Contrary to the previous study, the CHAID algorithm performed better than logistic regression in predicting hypertension, but C5.0 had the lowest predictive power. In addition, the CHAID algorithm and association rules also provided segment characteristics for the risk factors that may be used in developing hypertension management programs. This showed that the data mining approach can be a useful analytic tool for predicting and classifying health outcomes data.

Arc Detection using Logistic Regression (로지스틱 회기를 이용한 아크 검출)

  • Kim, Manbae
    • Journal of Broadcast Engineering / v.26 no.5 / pp.566-574 / 2021
  • The arc is one of the factors causing electrical fires. Over the past decades, various studies have been carried out to detect arc occurrences. Even though frequency analysis, wavelets, and statistical features have been used, arc detection performance is degraded by the diversity of arc waveforms. In contrast, a deep neural network (DNN) directly uses raw data without feature extraction, based on end-to-end learning. However, a disadvantage of the DNN is its processing complexity, which makes it difficult to deploy on a terminal device. To solve this, this paper proposes an arc detection method using logistic regression, one of the simpler machine learning methods.

Nonlinear Regression on Cold Tolerance Data for Brassica Napus

  • Yang, Woohyeong;Choi, Myeong Seok;Ahn, Sung Jin
    • Journal of the Korean Data Analysis Society / v.20 no.6 / pp.2721-2731 / 2018
  • This study aims to derive a predictive model for the cold tolerance of Brassica napus, using data collected in the Tree Breeding Lab of Gyeongsang National University during July and August of 2016. Three Brassica napus samples were treated at each of a series of low temperatures from $4^{\circ}C$ to $-12^{\circ}C$, in decrements of $4^{\circ}C$, and electrolyte leakage levels were measured at each stage. Electrolyte leakage became clearly observable from $-4^{\circ}C$. We fitted six nonlinear regression models to the electrolyte leakage data of Brassica napus: a 3-parameter logistic model, a baseline logistic model, a 4-parameter logistic model, a (4-1)-parameter logistic model, a 3-parameter Gompertz model, and a (3-1)-parameter Gompertz model. The baseline levels of electrolyte leakage estimated by these models were 4.81%, 4.07%, 4.19%, 4.07%, 4.55%, and 0%, respectively. The estimated median lethal temperatures, LT50, were $-5.87^{\circ}C$, $-6.31^{\circ}C$, $-6.05^{\circ}C$, $-6.35^{\circ}C$, $-4.98^{\circ}C$, and $-5.15^{\circ}C$, respectively. We compared and discussed measures of goodness of fit to select the appropriate nonlinear regression model.
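
To make the role of a 4-parameter logistic curve concrete, here is a minimal sketch. The abstract gives neither the parameterization nor the exact definition of LT50, so both are assumptions here: the curve form, the reading of LT50 as the midpoint temperature, the function names, and every parameter value are hypothetical and are not the paper's estimates.

```python
import math

def logistic4(temp_c, lower, upper, slope, t_mid):
    """Leakage (%) at temp_c; rises from `lower` to `upper` as temperature falls."""
    return lower + (upper - lower) / (1.0 + math.exp(slope * (temp_c - t_mid)))

def temp_at_leakage(target, lower, upper, slope, t_mid):
    """Invert logistic4: temperature at which leakage equals `target`."""
    return t_mid + math.log((upper - target) / (target - lower)) / slope

# Made-up parameters for illustration only (not fitted to any data).
params = dict(lower=5.0, upper=95.0, slope=0.9, t_mid=-6.0)

for temp in (4, 0, -4, -8, -12):
    print(temp, round(logistic4(temp, **params), 2))

# Under this parameterization the midpoint temperature t_mid is where
# leakage is halfway between the baseline and the upper asymptote.
halfway = (params["lower"] + params["upper"]) / 2.0
print(temp_at_leakage(halfway, **params))
```

In practice the four parameters would be estimated from the measured leakage data with a nonlinear least-squares routine; the sketch only shows how the fitted curve and a midpoint temperature relate.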

Making Thoughts Real - a Machine Learning Approach for Brain-Computer Interface Systems

  • Tengis Tserendondog;Uurstaikh Luvsansambuu;Munkhbayar Bat-Erdende;Batmunkh Amar
    • International Journal of Internet, Broadcasting and Communication / v.15 no.2 / pp.124-132 / 2023
  • In this paper, we present a simple classification model based on statistical features and demonstrate the successful implementation of a brain-computer interface (BCI) based light on/off control system. This research covers the study and development of a light on/off control system based on BCI technology, which allows users to switch a lamp using electroencephalogram (EEG) signals. The logistic regression algorithm is used to classify the EEG signal and convert it into light-on and light-off control commands. Training data were collected using a 14-channel BCI system, which records the brain signals of participants watching a screen with flickering lights and saves the data to a .csv file for later analysis. After extracting a number of features from the data and performing classification using logistic regression, we created commands to switch on a physical lamp and tested them in a real environment. Logistic regression classified the EEG signals based on the user's mental state with 82.5% accuracy, producing reliable commands for turning the light on and off.

Extraction of Potential Area for Block Stream and Talus Using Spatial Integration Model (공간통합 모델을 적용한 암괴류 및 애추 지형 분포가능지 추출)

  • Lee, Seong-Ho;JANG, Dong-Ho
    • Journal of The Geomorphological Association of Korea / v.26 no.2 / pp.1-14 / 2019
  • This study analyzed the relationship between block stream and talus distributions by employing a likelihood ratio approach. Possible distribution sites for each debris slope landform were extracted by applying a spatial integration model that combines a fuzzy set model, a Bayesian predictive model, and a logistic regression model. To verify model performance, a success rate curve was prepared by cross-validation. The results showed that elevation, slope, curvature, topographic wetness index, geology, soil drainage, and soil depth were closely related to the debris slope landform sites. In addition, all spatial integration models displayed an accuracy of over 90%. The accuracy of the distribution potential area map of the block stream was highest in the logistic regression model (93.79%). Likewise, the accuracy of the distribution potential area map of the talus was highest in the logistic regression model (97.02%). We expect that the present results will provide essential data and propose methodologies for more efficient and systematic micro-landform studies. Moreover, our research may help to enhance field research and topographic resource management.

A Study of Freshman Dropout Prediction Model Using Logistic Regression with Shift-Sigmoid Classification Function (시프트 시그모이드 분류함수를 가진 로지스틱 회귀를 이용한 신입생 중도탈락 예측모델 연구)

  • Kim Donghyung
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.4 / pp.137-146 / 2023
  • The dropout of university freshmen is a very important issue for university finances. Moreover, the dropout rate is one of the important indicators among the external evaluation items of universities. Therefore, universities need to predict likely dropouts in advance and apply dropout prevention programs targeting them. This paper proposes a method to predict such students in advance. It selects likely dropouts by applying logistic regression with a shift sigmoid classification function, using only quantitative data from the first semester of the first year, which most universities have. Because the shift sigmoid function serves as the classification function, the number of prediction subjects and the prediction accuracy can be selected. In the experiment, when the proposed algorithm was applied, the number of predicted dropout subjects varied from 100% to 20% of the actual number of dropout subjects, with a prediction accuracy of 75% to 98%.
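
One way to read the shifted-sigmoid idea is sketched below. This is an assumption-laden illustration rather than the paper's algorithm: the exact form of the shift is not given in the abstract, and the linear scores, the 0.5 cutoff, and the function names are all hypothetical.

```python
import math

def shifted_sigmoid(z, shift):
    """Sigmoid moved right by `shift`: equals 0.5 at z == shift."""
    t = z - shift
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def predict_dropouts(scores, shift, cutoff=0.5):
    """Indices of students whose shifted probability reaches the cutoff."""
    return [i for i, z in enumerate(scores) if shifted_sigmoid(z, shift) >= cutoff]

# Hypothetical linear scores (w . x + b) from an already fitted logistic model.
scores = [-2.1, -0.4, 0.3, 0.9, 1.7, 2.8]

for shift in (0.0, 1.0, 2.0):
    flagged = predict_dropouts(scores, shift)
    print(shift, len(flagged), flagged)
```

A larger shift flags fewer students, keeping only those with the highest scores, which is the kind of trade-off between the number of prediction subjects and accuracy that the abstract describes.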