• Title/Summary/Keyword: Nonparametric Regression (비모수적 회귀)


Identifying and Predicting Adolescent Smoking Trajectories in Korea (청소년기 흡연 발달궤적 변화와 예측요인)

  • Chung, Ick-joong
    • Korean Journal of Social Welfare Studies
    • /
    • no.39
    • /
    • pp.5-28
    • /
    • 2008
  • The purpose of this study is two-fold: 1) to identify distinct adolescent smoking trajectories in Korea; and 2) to examine predictors of those smoking trajectories within a social developmental frame. Data were from the Korea Youth Panel Survey (KYPS), a longitudinal study of 3,449 youths followed since 2003. Using semi-parametric group-based modeling, four smoking trajectories were identified: non-initiators, late onsetters, experimenters, and escalators. Multinomial logistic regressions were then used to identify risk and protective factors that distinguish the trajectory groups from one another. Among non-smokers at age 13, late onsetters were distinguished from non-initiators by a variety of factors in every ecological domain. Among youths who already smoked at age 13, escalators, who increased their smoking, were distinguished from experimenters, who had almost desisted from smoking by age 17, by self-esteem and academic achievement. Finally, implications of this study for youth welfare practice are discussed.
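
The second analysis step, multinomial logistic regression on trajectory membership, can be sketched roughly as follows; the simulated data and the predictor names (self_esteem, academic achievement) are hypothetical stand-ins, not the KYPS variables used in the paper.

```python
# Hypothetical sketch: multinomial logistic regression of trajectory group
# membership on two illustrative predictors, using simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
self_esteem = rng.normal(size=n)
academic = rng.normal(size=n)
# simulated membership in 4 trajectory groups (0=non-initiator ... 3=escalator)
logits = np.column_stack([np.zeros(n),
                          0.5 * self_esteem,
                          -0.3 * academic,
                          -0.6 * self_esteem - 0.4 * academic])
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
group = np.array([rng.choice(4, p=p) for p in probs])

X = sm.add_constant(np.column_stack([self_esteem, academic]))
fit = sm.MNLogit(group, X).fit(disp=False)
print(fit.params)          # one coefficient column per non-baseline group
```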

A comparison study of Bayesian variable selection methods for sparse covariance matrices (희박 공분산 행렬에 대한 베이지안 변수 선택 방법론 비교 연구)

  • Kim, Bongsu;Lee, Kyoungjae
    • The Korean Journal of Applied Statistics
    • /
    • v.35 no.2
    • /
    • pp.285-298
    • /
    • 2022
  • Continuous shrinkage priors, as well as spike-and-slab priors, have been widely employed for Bayesian inference about sparse regression coefficient vectors or covariance matrices. Continuous shrinkage priors provide computational advantages over spike-and-slab priors because their model space is substantially smaller, especially in high-dimensional settings. However, variable selection based on continuous shrinkage priors is not straightforward because they do not produce exactly zero values. Although a few variable selection approaches based on continuous shrinkage priors have been proposed, no substantial comparative investigation of their performance has been conducted. In this paper, we compare two variable selection methods: a credible interval method and the sequential 2-means algorithm (Li and Pati, 2017). Various simulation scenarios are used to demonstrate the practical performance of the methods. We conclude the paper by presenting some observations and conjectures based on the simulation findings.
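
One of the two compared methods, credible-interval selection, can be sketched as follows under the assumption that posterior draws of the covariance matrix are already available; the array shapes and the toy draws below are illustrative only.

```python
# A minimal sketch of credible-interval variable selection for a sparse
# covariance matrix: `samples` holds posterior draws with shape (n_draws, p, p).
import numpy as np

def credible_interval_selection(samples, level=0.95):
    """Keep an off-diagonal entry if its posterior credible interval excludes 0."""
    alpha = 1.0 - level
    lower = np.quantile(samples, alpha / 2, axis=0)
    upper = np.quantile(samples, 1 - alpha / 2, axis=0)
    selected = (lower > 0) | (upper < 0)            # interval does not contain zero
    np.fill_diagonal(selected, True)                # diagonal variances always kept
    return selected

# Example with fake posterior draws for a 3x3 covariance matrix
rng = np.random.default_rng(0)
draws = rng.normal(loc=[[1, 0.5, 0], [0.5, 1, 0], [0, 0, 1]],
                   scale=0.05, size=(1000, 3, 3))
print(credible_interval_selection(draws))
```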

Retrieval of Atmospheric Optical Thickness from Digital Images of the Moon (월면 디지털 영상 분석을 이용한 대기 광학두께 산출)

  • Jeong, Myeong-Jae
    • Korean Journal of Remote Sensing
    • /
    • v.29 no.5
    • /
    • pp.555-568
    • /
    • 2013
  • Atmospheric optical thickness during nighttime was estimated in this study by analyzing images of the moon taken with a commercial digital camera. The Langley regression method was applied to observations of the moon under cloudless and optically stable sky conditions. The spectral response functions of the red (R), green (G), and blue (B) channels were used to derive the effective wavelength center of each channel for the lunar observations, and the corresponding Rayleigh optical thicknesses were also calculated. Aerosol optical thickness (AOT) was obtained by subtracting the Rayleigh optical thickness from the atmospheric optical thickness derived from the Langley regression. As there are only a handful of nighttime AOT observations, the AOT from the moon observations was compared with AOT from sun photometers and the MODIS satellite sensor taken several hours before the moon observations of this study. The AOT values from the moon observations agree with those from the sun photometers and MODIS to within 0.1 for the R, G, and B channels of the digital camera. On the other hand, the Ångström exponent appears to be subject to larger errors because of its sensitivity to spectral errors in AOT. Nevertheless, the results indicate that the method reported here is promising, as it can provide nighttime AOT relatively easily with a low-cost instrument such as a digital camera. More observations and analyses are warranted to attain improved nighttime AOT observations in the future.
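
The Langley regression step can be illustrated with a short numerical sketch: regressing the log of the lunar signal on airmass gives the total optical thickness as the negated slope, and subtracting the Rayleigh optical thickness yields AOT. The airmass values, signal model, and Rayleigh value below are made up for illustration.

```python
# A minimal Langley-method sketch on a fake Beer-Lambert signal:
# ln(V) = ln(V0) - tau_total * m, so the fitted slope gives -tau_total.
import numpy as np

airmass = np.array([1.2, 1.5, 1.9, 2.4, 3.0])        # illustrative lunar airmasses
signal = 5000.0 * np.exp(-0.35 * airmass)             # simulated camera counts

slope, intercept = np.polyfit(airmass, np.log(signal), 1)
tau_total = -slope                                     # total atmospheric optical thickness

tau_rayleigh = 0.25                                    # assumed Rayleigh optical thickness
aot = tau_total - tau_rayleigh                         # aerosol optical thickness
print(round(tau_total, 3), round(aot, 3))
```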

A Study on Technology Forecasting of Unmanned Aerial Vehicles (UAVs) Using TFDEA (TFDEA를 이용한 무인항공기 기술예측에 관한 연구)

  • Jung, Byungki;Kim, H.C.;Lee, Choonjoo
    • Journal of Korea Technology Innovation Society
    • /
    • v.19 no.4
    • /
    • pp.799-821
    • /
    • 2016
  • Unmanned Aerial Vehicles (UAVs) are essential systems for Intelligence, Surveillance, and Reconnaissance (ISR) operations in the current battlespace, and their importance is expected to grow as the battlespace becomes more complex and uncertain. In this study, we forecast the advancement of 96 UAVs over the 32-year period from 1982 to 2014 using TFDEA, a quantitative technology forecasting method based on non-parametric, non-statistical mathematical programming. Inman et al. (2006) showed that TFDEA forecasts more accurately than classical econometric approaches such as regression. Applying TFDEA, this study obtained an annual technological rate of change (RoC) of 4.06% for UAVs. Most UAVs in the period are inefficient with respect to the global state-of-the-art (SOA) frontiers, because many of the countries developing UAVs are at a middle level of technological maturity, while more than 60% of the world UAV market is held by North America and Europe, which are more advanced in terms of technological maturity. This study offers insights for UAV development and advancement, and it can also be used to evaluate the adequacy of the Required Operational Capability (ROC) of proposed future systems and to manage the progress of research and development (R&D).
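
TFDEA builds on standard data envelopment analysis. The sketch below is a minimal input-oriented CCR DEA efficiency calculation, not the authors' TFDEA implementation, and the UAV-like inputs and outputs are made up for illustration.

```python
# Input-oriented CCR DEA efficiency via linear programming (envelopment form).
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, o):
    """Efficiency of DMU o. X: (n, m) inputs, Y: (n, s) outputs, rows are DMUs."""
    n, m = X.shape
    s = Y.shape[1]
    # decision variables: [theta, lambda_1, ..., lambda_n]
    c = np.zeros(n + 1)
    c[0] = 1.0                                        # minimize theta
    A_ub = np.zeros((m + s, n + 1))
    b_ub = np.zeros(m + s)
    # input constraints:  sum_j lambda_j * x_ij - theta * x_io <= 0
    A_ub[:m, 0] = -X[o]
    A_ub[:m, 1:] = X.T
    # output constraints: -sum_j lambda_j * y_rj <= -y_ro
    A_ub[m:, 1:] = -Y.T
    b_ub[m:] = -Y[o]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (n + 1))
    return res.x[0]

# Toy example: 4 UAV-like DMUs, 2 inputs (e.g. weight, cost), 1 output (endurance)
X = np.array([[2.0, 3.0], [4.0, 1.0], [3.0, 3.0], [5.0, 4.0]])
Y = np.array([[1.0], [1.0], [1.0], [1.0]])
print([round(ccr_efficiency(X, Y, o), 3) for o in range(4)])
```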

A Characteristics of the Vulnerable Area for Emergency Medical Service in Daejeon by Analysis of Geographic Information System (지리정보시스템(GIS)으로 분석한 대전광역시 응급의료서비스 취약지 특성)

  • Hwang, Ji-Hye;Na, Baeg-Ju;Lee, Dong-Woo;Hong, Jee-Young;Lee, Moo-Sik
    • Proceedings of the KAIS Fall Conference
    • /
    • 2010.05b
    • /
    • pp.859-862
    • /
    • 2010
  • This study identifies areas of Daejeon Metropolitan City that are underserved by emergency medical services (EMS) and analyzes the public health characteristics of those areas and their association with EMS vulnerability, in order to provide useful basic data for decision making on emergency medical policy. Underserved areas were identified by analyzing accessibility from emergency medical centers with the cost-weighted distance method, one of the spatial analysis tools of ArcGIS, and the associations between the public health characteristics of underserved areas and EMS vulnerability were examined with nonparametric t-tests and multiple regression analysis using SPSS 17.0. The main results are as follows. The distribution of emergency medical institutions in the study area is uneven: Dong-gu, Yuseong-gu, and Daedeok-gu have no designated emergency medical center, whereas Seo-gu and Jung-gu each have two or more. When accessibility to emergency medical centers was analyzed with GIS, Dong-gu had the highest proportion of underserved area relative to its total area among the districts of Daejeon, at 41.2%. A GIS analysis by administrative dong identified Sintanjin-dong in Daedeok-gu, Daecheong-dong and Sannae-dong in Dong-gu, Gujeuk-dong and Noeun 2-dong in Yuseong-gu, Giseong-dong in Seo-gu, and Sanseong-dong in Jung-gu as underserved; among these, Giseong-dong and Daecheong-dong had high densities of elderly residents. Comparing public health characteristics by EMS vulnerability, the mean proportions of National Basic Livelihood Security recipients, registered persons with disabilities, and the agricultural population were higher in underserved areas than in non-underserved areas, and the differences were statistically significant (p<0.01). In a logistic regression with EMS vulnerability as the dependent variable and regional public health characteristics as independent variables, the proportions of the agricultural population and of National Basic Livelihood Security recipients were statistically significant explanatory variables for EMS vulnerability (p<0.01, p<0.05). Taken together, administrative dongs with unequal access to emergency medical services were identified across the five districts of Daejeon; these areas had high proportions of agricultural population and National Basic Livelihood Security recipients, and these characteristics were significantly associated with EMS vulnerability. For efficient allocation of emergency medical resources, GIS-based decision making is needed, and to improve equity in the use of emergency medical services, policies should consider the public health characteristics of the areas left in EMS blind spots.
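
The dong-level logistic regression described above can be sketched roughly as follows; the simulated data and the variable names (agri_ratio, livelihood_ratio, vulnerable) are hypothetical placeholders, not the study's actual measurements.

```python
# A rough sketch of a logistic regression of EMS vulnerability on two
# district-level ratios, fitted on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 80                                               # hypothetical number of dongs
agri_ratio = rng.uniform(0, 0.4, n)                  # agricultural-population ratio
livelihood_ratio = rng.uniform(0, 0.15, n)           # Basic Livelihood recipient ratio
# simulated vulnerability indicator, more likely when both ratios are high
p = 1 / (1 + np.exp(-(-3 + 8 * agri_ratio + 12 * livelihood_ratio)))
vulnerable = rng.binomial(1, p)

X = sm.add_constant(np.column_stack([agri_ratio, livelihood_ratio]))
fit = sm.Logit(vulnerable, X).fit(disp=False)
print(fit.summary())                                 # odds of EMS vulnerability
```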


A Study on the Efficiency and Productivity Change of Korean Non-Life Insurance Company After Financial Crisis (금융위기 이후 국내 손해보험회사의 효율성 및 생산성 변화 연구)

  • Park, Chun-Gwang;Kim, Byeong-Chul
    • The Korean Journal of Financial Management
    • /
    • v.23 no.2
    • /
    • pp.57-83
    • /
    • 2006
  • The purpose of this paper is to analyze the efficiency and productivity change of Korean non-life insurance companies, and the causes of their inefficiency, before (1993-1996) and after (1998-2004) the IMF financial crisis. We use the DEA (Data Envelopment Analysis) model to measure company efficiency, Malmquist productivity indices (MPI) to measure productivity change, and Tobit regression to analyze the causes of inefficiency. We utilize ten non-life insurance companies in Korea and time-series data for eleven years, from 1993 to 2004 excluding 1997. The empirical results show the following. First, total cost efficiency decreased by 3.7% after the crisis relative to before, while the MPI indicates that productivity increased by 7.7%. Second, the Tobit regression on the causes of inefficiency shows that total cost efficiency is positively related to invested assets, the acquisition expense ratio, and the collection expense ratio, and negatively related to the solicitor ratio, personnel expense ratio, land and buildings expense ratio, loss ratio, and net operating expense ratio. In particular, the inefficiency of small-to-mid-sized companies is the main cause of the overall cost inefficiency of Korean non-life insurers, so small-to-mid-sized companies need to make efforts across various aspects of their business strategies.
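
The second-stage Tobit regression can be sketched as a censored-normal maximum-likelihood fit. The sketch below uses simulated efficiency scores right-censored at 1 and a single illustrative covariate rather than the paper's expense-ratio variables.

```python
# A minimal Tobit-regression sketch (scores right-censored at 1) estimated by
# maximum likelihood with scipy, on simulated data.
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])      # intercept + one covariate
beta_true, sigma_true = np.array([0.7, 0.15]), 0.1
latent = X @ beta_true + rng.normal(scale=sigma_true, size=n)
y = np.minimum(latent, 1.0)                                  # efficiency capped at 1
censored = latent >= 1.0

def negloglik(params):
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)
    xb = X @ beta
    ll_obs = stats.norm.logpdf(y[~censored], loc=xb[~censored], scale=sigma)
    ll_cens = stats.norm.logsf(1.0, loc=xb[censored], scale=sigma)
    return -(ll_obs.sum() + ll_cens.sum())

res = optimize.minimize(negloglik, x0=np.array([0.5, 0.0, np.log(0.2)]))
print(res.x[:-1], np.exp(res.x[-1]))                         # beta estimates, sigma
```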


A comparison of imputation methods using nonlinear models (비선형 모델을 이용한 결측 대체 방법 비교)

  • Kim, Hyein;Song, Juwon
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.4
    • /
    • pp.543-559
    • /
    • 2019
  • Data often include missing values for various reasons. If the missing-data mechanism is not MCAR, analysis based only on fully observed cases may cause estimation bias and decrease the precision of estimates, since partially observed cases are excluded. Missing values cause even more serious problems when data include many variables. Many imputation techniques have been suggested to overcome this difficulty. However, imputation methods based on parametric models may not fit real data well when the model assumptions are not satisfied. In this study, we review imputation methods using nonlinear models, such as kernel, resampling, and spline methods, which are robust to model assumptions. In addition, we suggest utilizing imputation classes to improve imputation accuracy, or adding random errors to correctly estimate the variance of the estimates, in nonlinear imputation models. The performance of imputation methods using nonlinear models is compared under various simulated data settings. Simulation results indicate that the performance of the imputation methods differs as the data settings change; however, imputation based on kernel regression or the penalized spline performs better in most situations. Utilizing imputation classes or adding random errors improves the performance of imputation methods using nonlinear models.
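
One of the reviewed approaches, kernel-regression imputation with an added random error, can be sketched as follows on simulated data; the bandwidth and missingness rate are arbitrary choices for illustration.

```python
# Nadaraya-Watson kernel-regression imputation with an added random residual,
# so that imputed values do not understate the variance; data are simulated.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 300)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=300)
missing = rng.random(300) < 0.2                      # 20% of y missing at random

def nw_predict(x0, x_obs, y_obs, h=0.05):
    """Gaussian-kernel Nadaraya-Watson estimate of E[y | x = x0]."""
    w = np.exp(-0.5 * ((x0 - x_obs) / h) ** 2)
    return np.sum(w * y_obs) / np.sum(w)

x_obs, y_obs = x[~missing], y[~missing]
fitted_obs = np.array([nw_predict(xi, x_obs, y_obs) for xi in x_obs])
resid_sd = np.std(y_obs - fitted_obs)

y_imp = y.copy()
for i in np.where(missing)[0]:
    y_imp[i] = nw_predict(x[i], x_obs, y_obs) + rng.normal(scale=resid_sd)
```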

Estimation of GARCH Models and Performance Analysis of Volatility Trading System using Support Vector Regression (Support Vector Regression을 이용한 GARCH 모형의 추정과 투자전략의 성과분석)

  • Kim, Sun Woong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.107-122
    • /
    • 2017
  • Volatility of stock market returns is a measure of investment risk. It plays a central role in portfolio optimization, asset pricing, and risk management, as well as in most theoretical financial models. Engle (1982) presented a pioneering paper on stock market volatility that explains the time-varying characteristics embedded in stock market return volatility. His model, Autoregressive Conditional Heteroscedasticity (ARCH), was generalized by Bollerslev (1986) into the GARCH models. Empirical studies have shown that GARCH models describe well the fat-tailed return distributions and the volatility clustering phenomenon observed in stock prices. The parameters of GARCH models are generally estimated by maximum likelihood estimation (MLE) based on the standard normal density. However, since Black Monday in 1987, stock market prices have become very complex and noisy, and recent studies have begun to apply artificial intelligence approaches to estimating the GARCH parameters as a substitute for MLE. This paper presents an SVR-based GARCH estimation process and compares it with the MLE-based process for estimating the parameters of GARCH models, which are known to forecast stock market volatility well. The kernel functions used in the SVR estimation are linear, polynomial, and radial. We analyze the suggested models with the KOSPI 200 Index, which comprises 200 blue-chip stocks listed on the Korea Exchange. We sampled daily KOSPI 200 closing values from 2010 to 2015, giving 1,487 observations; 1,187 days were used to train the suggested GARCH models and the remaining 300 days were used as test data. First, symmetric and asymmetric GARCH models were estimated by MLE. Forecasting KOSPI 200 Index return volatility, the MSE metric shows better results for the asymmetric GARCH models such as E-GARCH and GJR-GARCH, consistent with the documented non-normal return distribution characterized by fat tails and leptokurtosis. Compared with the MLE estimation process, SVR-based GARCH models outperform the MLE methodology in forecasting KOSPI 200 Index return volatility, although the polynomial kernel shows exceptionally low forecasting accuracy. We propose an Intelligent Volatility Trading System (IVTS) that utilizes the forecasted volatility. Its entry rules are as follows: if tomorrow's volatility is forecast to increase, buy volatility today; if it is forecast to decrease, sell volatility today; if the forecast direction does not change, hold the existing buy or sell position. IVTS is assumed to buy and sell historical volatility values, which is somewhat unrealistic because historical volatility itself cannot be traded; the simulation results are nonetheless meaningful because the Korea Exchange introduced a volatility futures contract in November 2014 that traders can use. In the test period, trading systems with SVR-based GARCH models show higher returns than MLE-based GARCH systems: the proportion of profitable trades ranges from 47.5% to 50.0% for MLE-based GARCH IVTS models and from 51.8% to 59.7% for SVR-based GARCH IVTS models. MLE-based symmetric S-GARCH shows a +150.2% return while SVR-based symmetric S-GARCH shows +526.4%; MLE-based asymmetric E-GARCH shows -72% while SVR-based asymmetric E-GARCH shows +245.6%; MLE-based asymmetric GJR-GARCH shows -98.7% while SVR-based asymmetric GJR-GARCH shows +126.3%. The linear kernel shows higher trading returns than the radial kernel. The best performance of the SVR-based IVTS is +526.4%, versus +150.2% for the MLE-based IVTS, and the SVR-based GARCH IVTS trades more frequently. This study has some limitations. Our models are based solely on SVR; other artificial intelligence models should be explored for better performance. We do not consider trading costs such as brokerage commissions and slippage, and the IVTS trading performance is unrealistic because historical volatility values are used as trading objects. Accurate forecasting of stock market volatility is essential for real trading as well as for asset pricing models, and further studies on other machine-learning-based GARCH models can provide better information for stock market investors.
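
A rough SVR-based GARCH(1,1)-style sketch is shown below: squared returns are regressed on their own lag and a lagged variance proxy with an RBF-kernel SVR. The simulated return series, rolling-window variance proxy, and SVR hyperparameters are illustrative assumptions, not the paper's exact specification.

```python
# SVR regression of squared returns on (lagged squared return, lagged variance
# proxy), giving one-step-ahead conditional-variance forecasts; data simulated.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)
r = rng.normal(scale=0.01, size=1500)                # stand-in for daily returns
r2 = r ** 2
var_proxy = np.convolve(r2, np.ones(20) / 20, mode="same")   # rolling-mean variance proxy

X = np.column_stack([r2[:-1], var_proxy[:-1]])       # features at t-1
y = r2[1:]                                            # target: squared return at t

split = 1187                                          # train/test split as in the paper
svr = SVR(kernel="rbf", C=1.0, epsilon=1e-5).fit(X[:split], y[:split])
forecast = svr.predict(X[split:])                     # one-step-ahead variance forecasts
print(forecast[:5])
```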

A comparison of synthetic data approaches using utility and disclosure risk measures (유용성과 노출 위험성 지표를 이용한 재현자료 기법 비교 연구)

  • Seongbin An;Trang Doan;Juhee Lee;Jiwoo Kim;Yong Jae Kim;Yunji Kim;Changwon Yoon;Sungkyu Jung;Dongha Kim;Sunghoon Kwon;Hang J Kim;Jeongyoun Ahn;Cheolwoo Park
    • The Korean Journal of Applied Statistics
    • /
    • v.36 no.2
    • /
    • pp.141-166
    • /
    • 2023
  • This paper investigates synthetic data generation methods and their evaluation measures. There have been increasing demands to release various types of data to the public for different purposes, and at the same time there are unavoidable concerns about leaking critical or sensitive information. Many synthetic data generation methods have been proposed over the years to address these concerns and have been implemented in some countries, including Korea. The current study aims to introduce and compare three representative synthetic data generation approaches: sequential regression, nonparametric Bayesian multiple imputation, and deep generative models. Several evaluation metrics that measure the utility and disclosure risk of synthetic data are also reviewed. We provide empirical comparisons of the three synthetic data generation approaches with respect to various evaluation measures. The findings of this work will help practitioners better understand the advantages and disadvantages of these synthetic data methods.
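
The sequential-regression approach can be sketched in a few lines: the first variable is sampled from its marginal distribution, and each later variable is drawn from a regression on the variables already synthesized. The two-column example and the purely linear models below are illustrative assumptions, not the paper's implementation.

```python
# A minimal sequential-regression synthesis sketch on simulated data.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
n = 500
age = rng.normal(45, 12, n)
income = 2.0 * age + rng.normal(0, 10, n)
original = pd.DataFrame({"age": age, "income": income})

synthetic = pd.DataFrame(index=range(n))
# first variable: sample from its marginal distribution (bootstrap)
synthetic["age"] = rng.choice(original["age"].to_numpy(), size=n, replace=True)
# each later variable: regress on already-synthesized variables, then draw
# synthetic values as prediction plus a random residual
model = LinearRegression().fit(original[["age"]], original["income"])
resid_sd = np.std(original["income"] - model.predict(original[["age"]]))
synthetic["income"] = model.predict(synthetic[["age"]]) + rng.normal(0, resid_sd, n)
print(synthetic.describe())
```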

Estimation of growth curve parameters and analysis of year effect for body weight in Hanwoo (한우의 성장곡선의 모수추정과 연도별 효과 분석)

  • 조광현;나승환;최재관;서강석;김시동;박병호;이영창;박종대;손삼규
    • Journal of Animal Science and Technology
    • /
    • v.48 no.2
    • /
    • pp.151-160
    • /
    • 2006
  • This study was conducted to investigate the genetic characteristics of growth stages in Hanwoo and to provide useful information for farm management decisions. Data were taken from the nucleus herds of three farms, Namwon, Daegwalyong, and Seosan, comprising 27,647 cows, 14,744 bulls, and 1,290 steers between 1980 and 2004. In the growth curves fitted by year, the residuals for cows and bulls under the Gompertz model were 68.49 and 54.29, respectively; these values were lower than in other years. The parameters A, b, and k were estimated as 423.6±5.8, 2.387±0.064, and 0.0908±0.0033 in cows and 823.3±15.3, 3.584±0.070, and 0.1139±0.0032 in bulls, respectively. The fit was better under the Gompertz model than under the logistic model: the monthly and daily estimates for cows were 379.3±7.509, 2.499±0.057, 0.114±0.0045 and 367.1±1.9003, 2.3983±0.012, 0.004±0.00003, respectively, with estimated residual mean squares of 31.85 and 998.4 for the respective models. The monthly and daily estimates for bulls were 834.6±22.00, 3.319±0.062, 0.104±0.0037 and 796.0±6.128, 3.184±0.014, 0.003±0.00003, respectively, with estimated residual mean squares of 66.18 and 2106.5. The monthly and daily estimates for steers were 1049.1±144.2, 3.024±0.008, 0.067±0.0096 and 1505.1±176.6, 2.997±0.067, 0.001±0.0001, respectively, with estimated residual mean squares of 186.0 and 1119.1. In terms of the growth characteristics estimated by the Gompertz model, body weights for cows and bulls were 139.53 kg and 307.03 kg, with daily gains of 0.52 kg and 1.04 kg, respectively; body weight for steers at the inflection point was 385.94 kg, with a daily gain of 0.84 kg in both models. Our results show that cows had a lower mature weight and daily weight gain, and reached the inflection point earlier, than bulls or steers.
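
A minimal Gompertz fit of the form W(t) = A·exp(-b·exp(-kt)) can be written with scipy's curve_fit; the monthly weights below are simulated near the cow parameter estimates reported above, and the inflection-point relations t* = ln(b)/k and W(t*) = A/e follow directly from the model.

```python
# Gompertz growth-curve fit W(t) = A * exp(-b * exp(-k * t)) on simulated
# monthly body weights (illustrative only, not the Hanwoo records).
import numpy as np
from scipy.optimize import curve_fit

def gompertz(t, A, b, k):
    return A * np.exp(-b * np.exp(-k * t))

rng = np.random.default_rng(5)
months = np.arange(1, 49)                            # ages in months
true_w = gompertz(months, 420.0, 2.4, 0.09)          # values near the cow estimates
weights = true_w + rng.normal(scale=10.0, size=months.size)

params, _ = curve_fit(gompertz, months, weights, p0=[400.0, 2.0, 0.1])
A, b, k = params
inflection_age = np.log(b) / k                        # inflection point at t = ln(b)/k
inflection_weight = A / np.e                          # weight at the inflection = A/e
print(round(A, 1), round(b, 3), round(k, 4),
      round(inflection_age, 1), round(inflection_weight, 1))
```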