• Title/Summary/Keyword: 베이지안 회귀모형 (Bayesian regression model)


Estimating Probability of Mode Choice at Regional Level by Considering Spatial Association of Departure Place (출발지 공간 연관성을 고려한 지역별 수단선택확률 추정 연구)

  • Eom, Jin-Ki;Park, Man-Sik;Heo, Tae-Young
    • Journal of the Korean Society for Railway / v.12 no.5 / pp.656-662 / 2009
  • In general, travelers' mode choice behavior is analyzed by developing utility functions that reflect individuals' mode preferences according to their demographic and travel characteristics. In this paper, we propose a methodology that takes the spatial effects of individuals' departure locations into account in the mode choice model. The statistical models considered here are a spatial logistic regression model and a conditional autoregressive model that includes a spatial association parameter. We employed the Bayesian approach in order to obtain more reliable parameter estimates. The proposed methodology allows us to estimate mode shares by departure place even when the survey does not cover all areas.

Bayesian quantile regression analysis of private education expenses for high school students in Korea (일반계 고등학생 사교육비 지출에 대한 베이지안 분위회귀모형 분석)

  • Oh, Hyun Sook
    • Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1457-1469 / 2017
  • Private education expenses are one of the key issues in Korea and have been widely discussed. Academically, most previous research on private education expenses has used multiple linear regression based on the ordinary least squares (OLS) method. However, if the data do not satisfy the basic assumptions of the OLS method, such as normality and homoscedasticity, the parameter estimates are unreliable. In this case, a quantile regression model is preferred to the OLS model since it does not depend on the assumptions of normality and homoscedasticity. In the present study, data from a survey on private education expenses conducted by Statistics Korea in 2015 are analyzed to investigate the factors that affect private education expenses. Since the data do not satisfy the OLS assumptions, a quantile regression model is fitted in a Bayesian approach using the Gibbs sampling method. The results show that the gender of the student, the parents' age, and the time and cost of participating in after-school programs are not significant. Household income has a significant positive effect of similar size at all levels (quantiles) of private education expenses. Spending on private education is higher in Seoul than in other regions, and the regional difference grows as private education expenditure increases. Total time spent on private education and the student's achievement have a stronger positive effect at the lower quantiles than at the higher quantiles. The father's education level is positively significant for the medium-high quantiles only, whereas the mother's education level is positively significant for all but the low quantiles. Participation in after-school programs is positively significant for the lower quantiles, while EBS textbook cost is positively significant for the higher quantiles.
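
The quantile regression idea in this abstract can be illustrated with the check loss, whose minimizer over a constant is the sample quantile. The sketch below is only an illustration: the expense values are hypothetical, and it does not reproduce the paper's Gibbs sampler. In Bayesian quantile regression the check loss commonly enters through an asymmetric Laplace working likelihood; whether the paper uses exactly that formulation is an assumption here.

```python
def check_loss(residual, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(data, q, tau):
    """Sum of check losses when every observation is predicted by the constant q."""
    return sum(check_loss(y - q, tau) for y in data)


def best_constant(data, tau):
    """Search the data points for the constant that minimizes the total check loss."""
    return min(sorted(data), key=lambda q: total_check_loss(data, q, tau))


# Hypothetical monthly expenses (arbitrary units), not taken from the survey.
expenses = [10, 12, 15, 20, 22, 30, 45, 60, 90]

median_fit = best_constant(expenses, 0.5)   # minimizer for tau = 0.5
upper_fit = best_constant(expenses, 0.9)    # minimizer for tau = 0.9
print(median_fit, upper_fit)
```

Replacing the constant `q` with a linear predictor gives the regression version, so each quantile level `tau` gets its own coefficient vector.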

A comparison and prediction of total fertility rate using parametric, non-parametric, and Bayesian model (모수, 비모수, 베이지안 출산율 모형을 활용한 합계출산율 예측과 비교)

  • Oh, Jinho
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.677-692 / 2018
  • The total fertility rate (TFR) of Korea was 1.05 in 2017, a return to the 1.08 level of 2005. A TFR of 1.05 is a very low fertility level, far from replacement-level fertility or the safety-zone level of 1.5, and may indicate a low fertility trap. Predicting fertility is therefore more important now than at any other time. To date, age-specific fertility rates and the total fertility rate have been predicted by various statistical methods. When the data trend is disconnected or fluctuating, a nonparametric method with smoothing and weighting has been applied. In addition, a Bayesian method has been applied that uses a prior distribution based on the fertility rates of advanced countries, with reference to the three-stage transition phenomenon. This paper examines which method is reasonable in terms of precision and feasibility by estimating, forecasting, and comparing the recent variability of the Korean fertility rate with parametric, nonparametric, and Bayesian methods. The analysis shows that the predicted total fertility rates are ordered as KOSTAT's total fertility rate, followed by the Bayesian, parametric, and nonparametric results. Given the TFR of 1.05 in 2017, the predicted total fertility rates derived from the parametric and nonparametric models are the most reasonable. In addition, if the fertility rate data are highly complete and of good quality, the parametric approach is superior to the other methods in terms of parameter estimation, computational efficiency, and goodness of fit.

Bayesian Analysis and Mapping of Elderly Korean Suicide Rates (베이지안 모형을 활용한 국내 노인 자살률 질병지도)

  • Lee, Jayoun;Kim, Dal Ho
    • The Korean Journal of Applied Statistics / v.28 no.2 / pp.325-334 / 2015
  • Elderly suicide rates tend to be high in Korea. Suicide by the elderly is no longer a personal problem; consequently, further research on risk and regional factors is necessary. Disease mapping in epidemiology estimates spatial patterns of disease risk over a geographical region. In this study, we use a simultaneous conditional autoregressive model for spatial correlations between neighboring areas to estimate standardized mortality ratios and map them. The method is illustrated with cause-of-death data from 2006 and 2010 to analyze regional patterns of elderly suicide in Korea. In the Bayesian spatial models that account for spatial correlations, mean educational attainment and the percentage of the elderly who live alone were the significant regional characteristics for elderly suicide. Gibbs sampling and a grid method are used for computation.
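
As a rough illustration of the quantity being mapped, the sketch below computes a raw standardized mortality ratio (observed over expected deaths) and a smoothed version under an independent Gamma prior. This is a deliberate simplification: the paper uses a conditional autoregressive prior that borrows strength from neighboring areas, which is not implemented here, and the counts and prior parameters are hypothetical.

```python
def raw_smr(observed, expected):
    """Raw standardized mortality ratio: observed deaths / expected deaths."""
    return observed / expected


def gamma_poisson_smr(observed, expected, shape, rate):
    """Posterior mean of the relative risk theta when
    observed ~ Poisson(expected * theta) and theta ~ Gamma(shape, rate).
    The posterior is Gamma(shape + observed, rate + expected)."""
    return (shape + observed) / (rate + expected)


# Hypothetical small-area counts: (observed deaths, expected deaths).
areas = {"area_A": (12, 8.0), "area_B": (3, 5.0)}

for name, (obs, exp) in areas.items():
    # A Gamma(2, 2) prior has mean 1, i.e. no excess risk a priori.
    print(name, raw_smr(obs, exp), gamma_poisson_smr(obs, exp, shape=2.0, rate=2.0))
```

The smoothed estimate is pulled toward the prior mean of 1, and the pull is stronger for areas with small expected counts, which is the basic motivation for Bayesian disease mapping.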

KCYP data analysis using Bayesian multivariate linear model (베이지안 다변량 선형 모형을 이용한 청소년 패널 데이터 분석)

  • Insun, Lee;Keunbaik, Lee
    • The Korean Journal of Applied Statistics / v.35 no.6 / pp.703-724 / 2022
  • Although longitudinal studies mainly produce multivariate longitudinal data, most existing statistical models analyze univariate longitudinal data and are limited in their ability to explain complex correlations properly. Therefore, this paper describes various methods of modeling the covariance matrix to explain the complex correlations. Among them, the modified Cholesky decomposition, the modified Cholesky block decomposition, and the hypersphere decomposition are reviewed. We then analyze Korean children and youth panel (KCYP) data using the Bayesian method. The KCYP data are multivariate longitudinal data with three response variables: school adaptation, academic achievement, and dependence on mobile phones. Several models are compared under the assumption that the correlation structure and the innovation standard deviation structure differ. For the most suitable model, all explanatory variables are significant for school adaptation and academic achievement, and only household income is insignificant when mobile phone dependence is the response variable.
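
The modified Cholesky decomposition mentioned above factors a covariance matrix as T Σ Tᵀ = D, where T is unit lower triangular and D is diagonal. The sketch below is a plain-Python illustration on a hypothetical 3×3 covariance matrix; it does not reproduce the paper's Bayesian model or its block and hypersphere variants.

```python
def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular and T * sigma * T^T = diag(D).

    The negatives of the below-diagonal entries of T are the coefficients for
    regressing each measurement on its predecessors, and D holds the
    corresponding innovation variances.
    """
    n = len(sigma)
    # Step 1: LDL^T factorization, sigma = C * diag(D) * C^T.
    C = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n):
        D[j] = sigma[j][j] - sum(C[j][k] ** 2 * D[k] for k in range(j))
        for i in range(j + 1, n):
            partial = sum(C[i][k] * C[j][k] * D[k] for k in range(j))
            C[i][j] = (sigma[i][j] - partial) / D[j]
    # Step 2: T = C^{-1}, computed by forward substitution.
    T = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i):
            T[i][j] = -sum(C[i][k] * T[k][j] for k in range(j, i))
    return T, D


# Hypothetical covariance matrix of three repeated measurements.
sigma = [[4.0, 2.0, 1.0],
         [2.0, 3.0, 1.0],
         [1.0, 1.0, 2.0]]
T, D = modified_cholesky(sigma)
print(T)
print(D)
```

Because the entries of T are unconstrained and the entries of D only need to be positive, both can be modeled with regressions, which is what makes this parameterization convenient for covariance modeling.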

Development of salinity simulation using a hierarchical bayesian ARX model (계층적 베이지안 ARX 모형을 활용한 염분모의기법 개발)

  • Kim, Hojun;Shin, Choong Hun;Kim, Tae-Woong;Kwon, Hyun-Han
    • Journal of Korea Water Resources Association / v.53 no.7 / pp.481-491 / 2020
  • The development of agricultural land at Saemangeum has required a significant increase in agricultural water use. It is well acknowledged that salinity plays a critical role in the farming system. Therefore, a systematic study of salinity is necessary to better manage agricultural water. This study aims to develop a stochastic salinity simulation model that simultaneously simulates salinities obtained from different layers. More specifically, this study proposes a two-stage Autoregressive Exogenous (ARX) model within a hierarchical Bayesian modeling framework. We derived posterior distributions of the model parameters and used them to obtain the posterior predictive distribution for salinities at three different layers. BIC values are compared to determine the optimal model from a set of candidate models. A detailed discussion of the model is provided.
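
To make the ARX structure and the BIC comparison concrete, the sketch below simulates a first-order ARX series, fits it by ordinary least squares, and evaluates a Gaussian BIC. Everything here is an assumption for illustration: the model order, the sinusoidal input, the parameter values, and the use of least squares in place of the paper's hierarchical Bayesian estimation.

```python
import math
import random


def simulate_arx(x, a, b, sigma, seed=0):
    """Simulate y_t = a*y_{t-1} + b*x_t + e_t with e_t ~ N(0, sigma^2) and y_0 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for t in range(1, len(x)):
        y.append(a * y[t - 1] + b * x[t] + rng.gauss(0.0, sigma))
    return y


def fit_arx(y, x):
    """Least-squares estimates (a, b) and residual sum of squares for ARX(1)."""
    rows = [(y[t - 1], x[t], y[t]) for t in range(1, len(y))]
    s11 = sum(u * u for u, _, _ in rows)
    s12 = sum(u * v for u, v, _ in rows)
    s22 = sum(v * v for _, v, _ in rows)
    r1 = sum(u * w for u, _, w in rows)
    r2 = sum(v * w for _, v, w in rows)
    det = s11 * s22 - s12 * s12          # determinant of the 2x2 normal equations
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    rss = sum((w - a * u - b * v) ** 2 for u, v, w in rows)
    return a, b, rss


def gaussian_bic(rss, n, k):
    """BIC for a Gaussian regression with k coefficients, up to an additive constant."""
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical exogenous input (for example a periodic forcing) and a noisy response.
x = [math.sin(t / 5.0) for t in range(300)]
y = simulate_arx(x, a=0.7, b=0.5, sigma=0.1)
a_hat, b_hat, rss = fit_arx(y, x)
print(a_hat, b_hat, gaussian_bic(rss, n=len(y) - 1, k=2))
```

Fitting several candidate structures the same way and keeping the one with the smallest BIC mirrors the model selection step described in the abstract.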

Imputation for Binary or Ordered Categorical Traits Based on the Bayesian Threshold Model (베이지안 분계점 모형에 의한 순서 범주형 변수의 대체)

  • Lee Seung-Chun
    • The Korean Journal of Applied Statistics / v.18 no.3 / pp.597-606 / 2005
  • Nonresponse in sample surveys causes a problem when it comes time to analyze datasets in public-use files, where the user has only complete-data methods available and limited information about the reasons for nonresponse. Recently, imputation has become a standard approach for handling nonresponse, and various imputation methods have been devised. However, most imputation methods are concerned with continuous traits, while many interesting features in sample surveys are measured on binary or ordered categorical scales. In this note, an imputation method for ignorable nonresponse in binary or ordered categorical traits is considered.
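
The threshold model behind this kind of imputation treats an ordered category as the interval into which a latent normal variable falls. The sketch below is a minimal illustration with a fixed latent mean and fixed thresholds, both hypothetical; a Bayesian implementation would instead draw these from their posterior before imputing, which is not shown.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_probabilities(mu, thresholds):
    """P(Y = k) when Y = k iff the latent Z ~ N(mu, 1) lies between
    consecutive (sorted) thresholds."""
    cdf = [0.0] + [normal_cdf(c - mu) for c in thresholds] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]


def impute_category(mu, thresholds, rng):
    """Impute one missing ordered response by drawing the latent variable."""
    z = rng.gauss(mu, 1.0)
    return sum(1 for c in thresholds if z > c)


# Hypothetical values: two thresholds define three ordered categories (0, 1, 2).
thresholds = [-0.5, 0.8]
probs = category_probabilities(0.3, thresholds)
rng = random.Random(1)
imputed = [impute_category(0.3, thresholds, rng) for _ in range(5)]
print(probs, imputed)
```

A binary trait is the special case with a single threshold.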

Detection and Forecast of Climate Change Signal over the Korean Peninsula (한반도 기후변화시그널 탐지 및 예측)

  • Sohn, Keon-Tae;Lee, Eun-Hye;Lee, Jeong-Hyeong
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.705-716 / 2008
  • The objectives of this study are to detect and forecast the climate change signal in annual mean surface temperature data generated by the MRI/JMA CGCM over the Korean Peninsula. The MRI/JMA CGCM outputs consist of control run data (an experiment with no change in $CO_2$ concentration) and scenario run data (an experiment with a 1%/year $CO_2$ increase up to quadrupling) covering 142 years for surface temperature and precipitation. ECMWF reanalysis data covering 43 years are used as observations. All data share the same spatial structure of 42 grid points. Two statistical models, the Bayesian fingerprint method and the regression model with autoregressive errors (AUTOREG model), are applied separately to detect the climate change signal. Forecasts up to 2100 are generated by the estimated AUTOREG model only for the grid points where a signal is detected.

Technology Forecasting using Bayesian Discrete Model (베이지안 이산모형을 이용한 기술예측)

  • Jun, Sunghae
    • Journal of the Korean Institute of Intelligent Systems / v.27 no.2 / pp.179-186 / 2017
  • Technology forecasting predicts the future trends and states of technology by analyzing the results of technology developed so far. In general, a patent contains novel information about the results of developed technology, because the exclusive right to the technology in a patent is protected for a period of time by patent law. Many studies on technology forecasting using patent data analysis have therefore been performed. The patent keyword data widely used in patent analysis consist of the occurrence frequencies of keywords. In most previous research, continuous data analyses such as regression or Box-Jenkins models were applied to the patent keyword data. However, analytical methods for discrete data should be applied to patent keyword analysis because the keyword data are discrete. To solve this problem, we propose a patent analysis methodology using a Bayesian Poisson discrete model. To verify the performance of our research, we carry out a case study analyzing the patent documents applied for by Apple to date.
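
One simple way to treat keyword frequencies as discrete counts in a Bayesian setting is the conjugate Gamma-Poisson update sketched below. It is an illustrative assumption, not the paper's model: the abstract only states that a Bayesian Poisson model is used, and the counts and prior parameters here are hypothetical.

```python
def gamma_poisson_update(counts, prior_shape, prior_rate):
    """Posterior Gamma(shape, rate) for a Poisson rate after observing counts,
    starting from a Gamma(prior_shape, prior_rate) prior."""
    return prior_shape + sum(counts), prior_rate + len(counts)


def posterior_mean(shape, rate):
    """Posterior mean of the Poisson rate, which is also the predictive mean
    of the next count."""
    return shape / rate


# Hypothetical yearly occurrence counts of one keyword in patent documents.
keyword_counts = [3, 5, 4, 6, 2]
shape, rate = gamma_poisson_update(keyword_counts, prior_shape=1.0, prior_rate=1.0)
print(shape, rate, posterior_mean(shape, rate))
```

Adding covariates such as year or other keyword counts through a log link turns this into a Bayesian Poisson regression, which generally requires sampling rather than a closed-form update.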

Multiple imputation and synthetic data (다중대체와 재현자료 작성)

  • Kim, Joungyoun;Park, Min-Jeong
    • The Korean Journal of Applied Statistics / v.32 no.1 / pp.83-97 / 2019
  • As society develops, the dissemination of microdata has increased in response to the diverse analytical needs of users. Analysis of microdata for policy making, academic purposes, and so on is highly desirable in terms of value creation. However, providing microdata whose usefulness is guaranteed carries a risk of exposing personal information. Several methods have been considered to protect personal information while preserving the usefulness of the data. One such method is to generate and use synthetic data. This paper aims to explain synthetic data by exploring the methodologies and precautions related to them. To this end, we first explain multiple imputation, the Bayesian predictive model, and the Bayesian bootstrap, which are the basic foundations of synthetic data. We then link these concepts to the construction of fully and partially synthetic data. To illustrate the creation of synthetic data, we review a real longitudinal synthetic data example based on sequential regression multivariate imputation.
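
Of the building blocks listed above, the Bayesian bootstrap is the easiest to show in a few lines: instead of resampling observations, it draws Dirichlet(1, ..., 1) weights over them. The sketch below is a minimal illustration on hypothetical values and is not the synthetic data procedure reviewed in the paper.

```python
import random


def bayesian_bootstrap_weights(n, rng):
    """Draw one Dirichlet(1, ..., 1) weight vector by normalizing Exp(1) draws."""
    gammas = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(gammas)
    return [g / total for g in gammas]


def bayesian_bootstrap_means(data, draws, seed=0):
    """Posterior draws of the mean: one weighted average per weight vector."""
    rng = random.Random(seed)
    results = []
    for _ in range(draws):
        weights = bayesian_bootstrap_weights(len(data), rng)
        results.append(sum(w * y for w, y in zip(weights, data)))
    return results


# Hypothetical observed values of one variable in a microdata file.
values = [2.1, 3.4, 1.9, 5.0, 4.2, 2.8]
mean_draws = bayesian_bootstrap_means(values, draws=1000)
print(min(mean_draws), max(mean_draws))
```

Each weight vector can also be used to resample records before imputation, which is how the Bayesian bootstrap typically feeds into multiple imputation schemes.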