• Title/Summary/Keyword: logistic regression model (로지스틱 회귀 모형)

Introduction to variational Bayes for high-dimensional linear and logistic regression models (고차원 선형 및 로지스틱 회귀모형에 대한 변분 베이즈 방법 소개)

  • Jang, Insong;Lee, Kyoungjae
    • The Korean Journal of Applied Statistics / v.35 no.3 / pp.445-455 / 2022
  • In this paper, we introduce existing Bayesian methods for high-dimensional sparse regression models and compare their performance in various simulation scenarios. In particular, we focus on the variational Bayes approach proposed by Ray and Szabó (2021), which enables scalable and accurate Bayesian inference. Using data sets simulated from sparse high-dimensional linear regression models, we compare the variational Bayes approach with other Bayesian and frequentist methods. To check the practical performance of variational Bayes in logistic regression models, we conduct a real data analysis using a leukemia data set.

A Study of Effect on the Smoking Status using Multilevel Logistic Model (다수준 로지스틱 모형을 이용한 흡연 여부에 미치는 영향 분석)

  • Lee, Ji Hye;Heo, Tae-Young
    • The Korean Journal of Applied Statistics / v.27 no.1 / pp.89-102 / 2014
  • In this study, we analyze the factors affecting smoking status in the Seoul Metropolitan area using a multilevel logistic model with Community Health Survey data from the Korea Centers for Disease Control and Prevention. The intraclass correlation coefficient (ICC), profiling analysis, and two types of predicted values were used to determine the appropriate level for the multilevel analysis. Model performance was evaluated using sensitivity, specificity, the percentage of correctly classified observations (PCC), and the ROC curve. Using survey data, we show the applicability of multilevel analysis, which allows for the possibility that different factors contribute to within-group and between-group variability.
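
The evaluation measures named in this abstract (sensitivity, specificity, and PCC) can be illustrated with a minimal sketch. The function name `classification_rates` and the toy labels are assumptions made for illustration only; they are not the authors' code or data.

```python
def classification_rates(y_true, y_pred):
    """Compute sensitivity, specificity, and PCC for binary labels (1 = positive).

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),    # share of actual positives detected
        "specificity": tn / (tn + fp),    # share of actual negatives detected
        "pcc": (tp + tn) / len(pairs),    # share of all observations classified correctly
    }


# Toy labels (made up): 1 = smoker, 0 = non-smoker
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
rates = classification_rates(y_true, y_pred)
print(rates)
```

With predicted probabilities instead of hard labels, the same function can be applied at several cutoffs to trace the points of an ROC curve.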

Study on Detection Technique for Cochlodinium polykrikoides Red tide using Logistic Regression Model under Imbalanced Data (불균형 데이터 환경에서 로지스틱 회귀모형을 이용한 Cochlodinium polykrikoides 적조 탐지 기법 연구)

  • Bak, Su-Ho;Kim, Heung-Min;Kim, Bum-Kyu;Hwang, Do-Hyun;Enkhjargal, Unuzaya;Yoon, Hong-Joo
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.13 no.6 / pp.1353-1364 / 2018
  • This study proposes a method to detect Cochlodinium polykrikoides red tide pixels in satellite images using a logistic regression model, a machine learning technique, under imbalanced data. Spectral profiles extracted from red tide, clear water, and turbid water were used as the training dataset. 70% of the entire data set was extracted and used for model training, and the classification accuracy of the model was evaluated using the remaining 30%. In this process, white noise was added to the spectral profiles of the red tide class, which had relatively few samples compared with clear water and turbid water, and over-sampling was performed to address the data imbalance. In the accuracy evaluation, the proposed algorithm showed about 94% classification accuracy.
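
The 70/30 split and noise-based over-sampling described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the noise level `sigma`, the toy spectral values, and the choice to split first and over-sample only the training portion are all hypothetical.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle rows and return (train, test) with a 70/30 split."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def oversample_with_noise(minority, target_count, sigma=0.01, seed=0):
    """Grow the minority class to target_count rows.

    Each synthetic row is a randomly chosen original row plus
    zero-mean Gaussian (white) noise with standard deviation sigma.
    """
    rng = random.Random(seed)
    augmented = [list(row) for row in minority]  # keep the originals first
    while len(augmented) < target_count:
        base = rng.choice(minority)
        augmented.append([v + rng.gauss(0.0, sigma) for v in base])
    return augmented


# Toy "spectral profiles" for the minority (red tide) class; values are made up.
red_tide = [[0.10 + 0.01 * i, 0.20, 0.30] for i in range(10)]

train, test = split_70_30(red_tide)
train_balanced = oversample_with_noise(train, target_count=20)
print(len(train), len(test), len(train_balanced))
```

Over-sampling only the training portion keeps the held-out 30% free of synthetic rows. Whether the study applied the steps in this order is not stated in the abstract.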

Comparison analysis of big data integration models (빅데이터 통합모형 비교분석)

  • Jung, Byung Ho;Lim, Dong Hoon
    • Journal of the Korean Data and Information Science Society / v.28 no.4 / pp.755-768 / 2017
  • As big data becomes the core of the fourth industrial revolution, big data-based processing and analysis capabilities are expected to influence companies' future competitiveness. Although RHadoop and RHIPE, which integrate R with the Hadoop environment, have each been discussed separately, few researchers have compared them. In this paper, we constructed big data platforms, RHadoop and RHIPE, applicable to large-scale data and implemented machine learning algorithms such as multiple regression and logistic regression on the MapReduce framework. We studied the performance and scalability of these implementations for various sample sizes of real and simulated data. The experiments demonstrated that our RHadoop and RHIPE platforms scale well and efficiently process large data sets on commodity hardware. RHIPE was faster than RHadoop on almost all of the data sets.

Prediction of fine dust PM10 using a deep neural network model (심층 신경망모형을 사용한 미세먼지 PM10의 예측)

  • Jeon, Seonghyeon;Son, Young Sook
    • The Korean Journal of Applied Statistics / v.31 no.2 / pp.265-285 / 2018
  • In this study, we applied a deep neural network model to predict four grades of fine dust $PM_{10}$ ('Good', 'Moderate', 'Bad', 'Very Bad') and two grades ('Good or Moderate' and 'Bad or Very Bad'). The deep neural network model and existing classification techniques (neural network model, multinomial logistic regression model, support vector machine, and random forest) were applied to daily fine dust data observed from 2010 to 2015 in six major metropolitan areas of Korea. The data analysis shows that the deep neural network model outperforms the others in terms of accuracy.

Prediction Model with a Logistic Regression of Sequencing Two Arrival Flows (합류하는 두 항공기간 도착순서 결정에 대한 로지스틱회귀 예측 모형)

  • Jung, Soyeon;Lee, Keumjin
    • Journal of the Korean Society for Aviation and Aeronautics / v.23 no.4 / pp.42-48 / 2015
  • This paper aims to construct a prediction model of the arrival sequencing strategy that reflects the actual sequencing patterns of air traffic controllers. As a first step, we analyzed the pairwise sequencing of two aircraft entering the TMA from different entry points. Based on historical trajectory data, several traffic factors such as time, speed, and traffic density were examined for the model. Using the statistically significant factors, we constructed a prediction model of arrival sequencing through binary logistic regression analysis. With the estimated coefficients, the performance of the model was evaluated through cross-validation.

An Idea, Strategy of Congestion Pricing for Differentiated Services and Forecasting Probability of Access using Logistic Regression Model (차등서비스를 위한 혼잡요금부과의 타당성 검토와 로지스틱 회귀모형을 이용한 인터넷 접속 확률 예측)

  • Ji Seonsu
    • Journal of Korea Society of Industrial Information Systems / v.10 no.1 / pp.9-15 / 2005
  • Congestion control is an important research area in computer networks. In this paper, I present a congestion pricing strategy for differentiated services and propose an access forecasting model, based on logistic regression, that considers differentiated pricing, delay time, and satisfaction. The coefficient of determination of the proposed forecasting model is shown to be $70.7\%$.

The Rating of Korean Basketball League Teams in 2006-2007 Season: Taking Account of Home-Court Advantage (홈팀의 이점을 고려한 KBL 2006-2007 시즌 경기력 평가)

  • Lee, Seung-Chun;Byun, Jong-Seok
    • Communications for Statistical Applications and Methods / v.15 no.5 / pp.687-695 / 2008
  • It is widely known that home advantage is an important factor in determining victory or defeat in sports leagues. A ranking system for a sports league should therefore take home advantage into account as a key factor. Various statistical models are studied to rate the Korean Basketball League teams in the 2006-2007 season. Among them, the model equation provided by Harville and Smith (1994) is useful for constructing two ranking systems. Both systems give quite reasonable quantifications of each team's ability and of the home advantage.

Inferential Problems in Bayesian Logistic Regression Models (베이지안 로지스틱 회귀모형에서의 추론에 대한 연구)

  • Hwang, Jin-Soo;Kang, Sung-Chan
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1149-1160 / 2011
  • Model selection and hypothesis testing problems in Bayesian inference are still debated among scholars. Bayes factors, traditionally used as a criterion in Bayesian hypothesis testing and model selection, are easy to understand but sometimes hard to compute. In addition, there are other model selection criteria, such as the DIC (Deviance Information Criterion) of Spiegelhalter et al. (2002), and Bayesian P-values for testing. In this paper, we briefly introduce Bayesian hypothesis testing and model selection procedures. We also apply Bayesian inference to Swiss banknote data by fitting a logistic regression model and computing several test statistics to see whether they provide consistent results.