• Title/Summary/Keyword: gamma generalized regression model (감마 일반화 회귀모형)

Search results: 8

Comparing the performance of likelihood ratio test and F-test for gamma generalized linear models (감마 일반화 선형 모형에서의 가능도비 검정과 F-검정 비교연구)

  • Jo, Seongil;Han, Jeongseop;Lee, Woojoo
    • The Korean Journal of Applied Statistics / v.31 no.4 / pp.475-484 / 2018
  • Gamma generalized linear models are useful for non-negative and skewed responses. However, these models have received less attention than Poisson and binomial generalized linear models. In particular, hypothesis testing for the significance of regression coefficients has not been thoroughly studied. In this paper we assess the performance of various test statistics for gamma generalized linear models based on numerical studies. Our results show that the likelihood ratio test and F-type test are generally recommended and that the partial deviance test should be avoided in practice.
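The kind of test compared in this abstract can be illustrated with a minimal sketch of one common textbook form of the F-type statistic for a gamma GLM. Everything here is an assumption for illustration and is not taken from the paper: the log link, the single covariate, the Pearson-based dispersion estimate, the simulated data, and the function names `fit_gamma_log`, `gamma_deviance`, and `f_type_statistic`. The exact statistics studied in the paper may differ in detail.

```python
import math
import random


def fit_gamma_log(x, y, iters=50):
    """Fit log(mu) = b0 + b1*x for a gamma GLM by IRLS.

    With the log link and variance function V(mu) = mu^2, the IRLS
    weights are all 1, so each step is an ordinary least-squares fit
    of the working response z on x.
    """
    n = len(y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b0, b1 = math.log(sum(y) / n), 0.0  # start from the intercept-only fit
    for _ in range(iters):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        zbar = sum(z) / n
        b1 = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z)) / sxx
        b0 = zbar - b1 * xbar
    return b0, b1


def gamma_deviance(y, mu):
    """Gamma deviance: 2 * sum((y - mu)/mu - log(y/mu))."""
    return 2.0 * sum((yi - m) / m - math.log(yi / m) for yi, m in zip(y, mu))


def f_type_statistic(x, y):
    """F-type statistic for H0: b1 = 0 (one extra parameter in the full model)."""
    n = len(y)
    b0, b1 = fit_gamma_log(x, y)
    mu_full = [math.exp(b0 + b1 * xi) for xi in x]
    mu_null = [sum(y) / n] * n  # intercept-only fit is the sample mean
    d_full = gamma_deviance(y, mu_full)
    d_null = gamma_deviance(y, mu_null)
    # Pearson-based dispersion estimate from the full model (an assumption)
    phi_hat = sum(((yi - m) / m) ** 2 for yi, m in zip(y, mu_full)) / (n - 2)
    f_stat = (d_null - d_full) / 1 / phi_hat
    return f_stat, d_null, d_full, phi_hat


# Simulated data (hypothetical): shape 5, true log-mean 1 + x
rng = random.Random(1)
n = 200
x = [i / (n - 1) for i in range(n)]
y = [rng.gammavariate(5.0, math.exp(1.0 + xi) / 5.0) for xi in x]

f_stat, d_null, d_full, phi_hat = f_type_statistic(x, y)
print(f_stat, phi_hat)
```

Under the null hypothesis the statistic would be referred to an F distribution with 1 and n - 2 degrees of freedom. A likelihood ratio test would additionally require a maximum likelihood estimate of the gamma shape parameter, which this sketch does not attempt.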

Patent Keyword Analysis using Gamma Regression Model and Visualization

  • Jun, Sunghae
    • Journal of the Korea Society of Computer and Information / v.27 no.8 / pp.143-149 / 2022
  • Since patent documents contain detailed results of research and development, many patent analysis methods have been studied for effective technology analysis. In particular, quantitative patent analysis using statistics and machine learning algorithms has been actively researched recently. The patent data most often used in quantitative patent analysis are technology keywords. Most existing methods for analyzing keyword data are based on the Gaussian distribution, whose random variable ranges over the real line from negative infinity to positive infinity. In this paper, we propose a model using the gamma distribution to analyze patent keyword frequency data, which can theoretically take values from zero to positive infinity. In addition, to determine the regression equation of the gamma-based regression model, a two-mode network is constructed to visualize the technological associations between keywords. Real patent data are collected and analyzed to compare the performance of the proposed method with existing Gaussian-based analysis models.

Unattended Trends and Retail Locations: Focusing on Unmanned Convenience and Discount Stores in Seoul (점포의 무인화와 소매점 입지: 서울시 무인 편의점과 무인 할인판매점을 대상으로)

  • Park, Sohyun;Lee, Keumsook
    • Journal of the Economic Geographical Society of Korea / v.24 no.4 / pp.411-424 / 2021
  • The purpose of this study is to analyze the location characteristics of convenience stores, a consumer industry that is forming a new retail environment and adopting unattended operation, and to reveal the geographic factors that affect their location. For this purpose, we first examine the growth and regional distribution of manned convenience stores and then construct a gamma generalized linear regression model that explains the distribution of unmanned convenience stores as well as of manned and unmanned convenience stores combined. As a result, unmanned convenience stores and unmanned discount stores were found to be located close to public transportation facilities, in areas with relatively little population movement but with a higher density of young residents and dense concentrations of retail stores and restaurants. The effects of demographic factors on the location of unmanned convenience stores and unmanned discount stores differed according to the scope and characteristics of their sales items. As an initial study identifying the geographic factors that influence the choice of opening locations for unmanned stores, our findings provide an empirical basis for subsequent academic research.

Analysis of Household Overdue Loans by Using a Two-stage Generalized Linear Model (이단계 일반화 선형모형을 이용한 은행 고객의 연체성향 분석)

  • Oh, Man-Suk;Oh, Hyeon-Tak;Lee, Young-Mi
    • The Korean Journal of Applied Statistics / v.19 no.3 / pp.407-419 / 2006
  • In this paper, we analyze household overdue loans in Korea, which have been causing serious social and economic problems. We consider customers of Bank A in Korea and focus on overdue cash services, which have been snowballing in the past few years. From an analysis of overdue loans, one can predict possible delays for current customers as well as build a credit evaluation and risk management system for future customers. As a statistical tool, we propose a two-stage generalized linear model (GLM) that assumes a logistic model for the presence or absence of an overdue payment and a gamma model for the overdue amount when one occurs. We perform a goodness-of-fit test for the two-stage model and select significant explanatory variables in each stage of the model. It turns out that age, the amount of credit loans from other financial companies, the amount of cash service from other companies, debit balance, the average amount of cash service, and net profit are important explanatory variables for overdue credit card cash service in Korea.
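The two-stage structure described in this abstract can be written out as a short sketch. The notation is assumed for illustration and does not come from the paper: $Y_i$ is the overdue amount of customer $i$, $\mathbf{x}_i$ and $\mathbf{z}_i$ are hypothetical covariate vectors for the two stages, and the log link in the gamma stage is an assumption, since the abstract does not state which link was used.

```latex
\begin{aligned}
\text{Stage 1 (presence):}\quad
  & \operatorname{logit}\Pr(Y_i > 0) = \mathbf{x}_i^{\top}\boldsymbol{\alpha}, \\
\text{Stage 2 (amount):}\quad
  & Y_i \mid Y_i > 0 \sim \operatorname{Gamma}(\text{mean } \mu_i,\ \text{shape } \nu),
    \qquad \log \mu_i = \mathbf{z}_i^{\top}\boldsymbol{\beta}, \\
\text{Combined:}\quad
  & \mathbb{E}[Y_i] = \Pr(Y_i > 0)\,\mu_i .
\end{aligned}
```

Because the two stages have separate parameters, each can be fitted and have its explanatory variables selected independently, which matches the stage-by-stage variable selection the abstract mentions.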

Mathematics education attitude of the students in the specialized high school (특성화고 학생의 수학교과에 대한 태도 조사)

  • Kim, Minsuk;Oh, Kwangsik
    • Journal of the Korean Data and Information Science Society / v.23 no.6 / pp.1173-1181 / 2012
  • To provide basic resources for mathematics education in specialized high schools, we investigate students' attitudes toward mathematics education. A questionnaire survey was conducted on 654 students, and the responses were analyzed with statistical methods such as the chi-square test, gamma, generalized linear models, ANOVA, and regression. Several results can be derived from the questionnaire analysis. General and specialized high school students differ in interest, pre-learning ability, and other respects. Specialized school students place more importance on the usefulness of mathematics, while general school students regard mathematics as more closely related to their course.

Predicting claim size in the auto insurance with relative error: a panel data approach (상대오차예측을 이용한 자동차 보험의 손해액 예측: 패널자료를 이용한 연구)

  • Park, Heungsun
    • The Korean Journal of Applied Statistics / v.34 no.5 / pp.697-710 / 2021
  • Relative error prediction is preferred over ordinary prediction methods when relative/percentile errors are regarded as important, especially in econometrics, software engineering, and official government statistics. Relative error prediction techniques have been developed for linear/nonlinear regression, nonparametric regression using kernel regression smoothers, and stationary time series models. However, random effect models have not been used in relative error prediction. The purpose of this article is to extend relative error prediction to some generalized linear mixed models (GLMMs) for panel data, namely random effect models based on the gamma, lognormal, or inverse Gaussian distribution. For illustration, real auto insurance data are used to predict claim size, and the best predictor and the best relative error predictor are compared.
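To make the contrast between the best predictor and the best relative error predictor concrete, here is a minimal sketch for a single gamma-distributed response with no random effects, which is much simpler than the GLMM setting of the article. It assumes the loss is the squared relative error E[((Y - p)/Y)^2], whose minimizer is E[1/Y] / E[1/Y^2]; for a gamma distribution with shape k > 2 and scale theta this equals (k - 2) * theta. The loss definition, parameter values, and function names are assumptions for illustration.

```python
def mean_predictor(shape, scale):
    """Best predictor under squared error: the gamma mean k * theta."""
    return shape * scale


def relative_error_predictor(shape, scale):
    """Minimizer of E[((Y - p) / Y)^2] for a gamma response.

    Uses E[1/Y] = 1 / (theta * (k - 1)) and
    E[1/Y^2] = 1 / (theta^2 * (k - 1) * (k - 2)), which require k > 2.
    """
    if shape <= 2:
        raise ValueError("shape must exceed 2 for E[1/Y^2] to be finite")
    return (shape - 2) * scale


def expected_sq_relative_error(p, shape, scale):
    """Closed form of E[((Y - p) / Y)^2] = 1 - 2p E[1/Y] + p^2 E[1/Y^2]."""
    if shape <= 2:
        raise ValueError("shape must exceed 2 for E[1/Y^2] to be finite")
    m1 = 1.0 / (scale * (shape - 1))
    m2 = 1.0 / (scale ** 2 * (shape - 1) * (shape - 2))
    return 1.0 - 2.0 * p * m1 + p * p * m2


# Hypothetical claim-size distribution: shape 5, scale 2
k, theta = 5.0, 2.0
p_mean = mean_predictor(k, theta)
p_rel = relative_error_predictor(k, theta)
print(p_mean, expected_sq_relative_error(p_mean, k, theta))
print(p_rel, expected_sq_relative_error(p_rel, k, theta))
```

With these assumed parameters the mean predictor is 10 and the relative error predictor is 6; the expected squared relative error drops from about 0.583 to 0.25. The relative error predictor is smaller than the mean because overprediction of small responses is penalized heavily under this loss.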

Estimating home fire severity with statistical distributions (통계적 분포를 통한 주택 화재 심도 추정)

  • Yunjung Park;Inha Song;Soyoun Lee;Kwang Hyun Nam;Rosy Oh;Jaeyoun Ahn
    • The Korean Journal of Applied Statistics / v.36 no.6 / pp.591-618 / 2023
  • This paper evaluates the performance of various distributional assumptions in regression settings for estimating insurance loss. The gamma distribution is commonly used to handle the asymmetry of loss distributions. However, recent studies highlight the significance of heavy tails in loss distributions. Through an analysis of real home fire insurance data, we compare the effectiveness of different distributional assumptions in regression methods. Our findings show that the choice of parametric distributional assumption is crucial in determining premiums for various insurance products, including "excess of loss insurance" and "limit insurance". Additionally, we discuss practical considerations for applying our results in home fire insurance.

Rice Yield Estimation Using Sentinel-2 Satellite Imagery, Rainfall and Soil Data (Sentinel-2 위성영상과 강우 및 토양자료를 활용한 벼 수량 추정)

  • KIM, Kyoung-Seop;CHOUNG, Yun-Jae;JUN, Byong-Woon
    • Journal of the Korean Association of Geographic Information Studies / v.25 no.1 / pp.133-149 / 2022
  • Existing domestic studies on estimating rice yield were mainly carried out at the level of cities and counties across the nation using MODIS satellite images with low spatial resolution. Unlike previous studies, this study estimates rice yield at the level of eup-myon-dong in Gimje-si, Jeollabuk-do using Sentinel-2 satellite images with medium spatial resolution together with rainfall and soil data, and then evaluates its accuracy. Five vegetation indices (NDVI, LAI, EVI2, MCARI1 and MCARI2) derived from Sentinel-2 images of August 1, 2018 for Gimje-si, Jeollabuk-do, along with rainfall and paddy soil-type data, were aggregated at the level of eup-myon-dong, and rice yield was then estimated with a gamma generalized linear model, an extension of multivariate regression analysis that addresses the non-normality of the dependent variable. In the final rice yield model, EVI2, the number of rainfall days in September, and the saline soil ratio were used as significant independent variables. The coefficient of determination representing the model fit was 0.68, and the RMSE representing the model accuracy was 62.29 kg/10a. This model estimated the total rice production in Gimje-si in 2018 to be 96,914.6 M/T, which was very close to the actual amount of 94,470.3 M/T specified in the Statistical Yearbook, with an error of 0.46%. Also, the estimated rice production per unit area of Gimje-si was 552 kg/10a, which was almost consistent with the 550 kg/10a of the statistical data. This result is similar to that of previous studies and demonstrates that rice yield can be estimated using Sentinel-2 satellite images at the level of cities and counties or smaller districts in Korea.