• Title/Abstract/Keywords: mixed effects regression model

48 search results (processing time: 0.03 s)

Panel data analysis with regression trees

  • 장영재
    • Journal of the Korean Data and Information Science Society / Vol. 25, No. 6 / pp.1253-1262 / 2014
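  • A regression tree is a nonparametric method that recursively partitions the space of independent variables and seeks the best predicted value of the dependent variable within each region. Since regression trees were first proposed, research has pursued more flexible and varied model fitting, such as logistic regression trees and quantile regression trees. More recently, advanced tree algorithms for panel data have been proposed, including the RE-EM algorithm of Sela and Simonoff (2012) and GUIDE of Loh and Zheng (2013). This paper introduces each algorithm and examines its characteristics, and compares predictive performance on simulated data in terms of mean squared error. The analysis showed that the RE-EM algorithm had relatively better predictive performance. Applying this algorithm to industry-level panel data from the Business Survey Index showed that sales performance was the factor with the greatest influence on recent business conditions, and that within the top sales group, non-manufacturing firms assessed business conditions more positively than manufacturing firms.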

Is the Peak-Affect Important in Fast Processing of Visual Images in Printed Ads?: A Comparative Study on the Affect Integration Theories

  • Bu, Kyunghee;Lee, Luri
    • Asia Marketing Journal / Vol. 24, No. 3 / pp.96-108 / 2022
  • This study investigates how affects elicited by visual images in print ads are integrated to form a liking for the ads. Assuming a sequential rather than simultaneous processing of still-cut images, we adopt the 'think-aloud' method to capture consumers' spontaneous responses to visual images. We hypothesize that not only would consumers show mixed affects toward a still-cut visual image but that they would also integrate their serial affects heuristically rather than simply averaging the affects as suggested by the compensatory hypothesis. By comparing the effects of two contradictory affect integration hypotheses (i.e., peak-affect and mood-maintenance) with compensatory integration, using a single regression model, we found that peak-negative along with mood maintenance integration of serial affects for a print ad works best in the formation of ad liking. The results also support our initial premise that people can have mixed valence even toward a still-cut ad.

Effects of Na2S, NaCl, and H2O2 Concentrations on Corrosion of Aluminum

  • 이주희;장희진
    • Corrosion Science and Technology / Vol. 18, No. 6 / pp.312-317 / 2019
  • The objective of this study was to investigate the corrosion behavior of aluminum (AA1100) in a mixed solution of 0 ~ 0.1 g/L Na2S + 0.3 ~ 3 g/L NaCl + 0 ~ 10 mL/L H2O2. Potentiodynamic polarization tests were performed. The effects of solution composition on the corrosion potential, corrosion rate, and pitting potential of aluminum were statistically analyzed with a regression model. The results suggested that the localized corrosion susceptibility of aluminum increased with NaCl concentration, because the pitting potential decreased linearly with increasing NaCl concentration. In contrast, H2O2 mitigated the galvanic corrosion of aluminum by increasing the corrosion potential. It also mitigated localized corrosion by increasing the pitting potential of aluminum. Na2S did not exert a noticeable effect on the corrosion of aluminum. The effects of the different chemical species at various concentrations were independent of each other. No synergistic or offsetting effect was observed.

Prediction of Future Milk Yield with Random Regression Model Using Test-day Records in Holstein Cows

  • Park, Byoungho;Lee, Deukhwan
    • Asian-Australasian Journal of Animal Sciences / Vol. 19, No. 7 / pp.915-921 / 2006
  • Various random regression models with different orders of Legendre polynomials for permanent environmental and genetic effects were constructed to predict the future milk yield of Holstein cows in Korea. A total of 257,908 test-day (TD) milk yield records from 28,135 cows belonging to 1,090 herds were used to estimate the (co)variances of the random covariate coefficients with an expectation-maximization REML algorithm in an animal mixed model. The variances did not change much between models with different orders of Legendre polynomial, but a decreasing trend was observed as the order increased. The R-squared value of the model increased and the residual variance decreased as the order of the Legendre polynomial increased. Therefore, a model with a $5^{th}$ order Legendre polynomial was chosen for predicting future milk yield. For prediction, 132,771 TD records from 28,135 cows were randomly selected from the above data by way of preceding partial TD record, and future milk yields were then estimated using the incomplete records randomly retained from each cow. The results suggested that the next four months' milk yield could be predicted with an error deviation of 4 kg. A correlation of more than 70% between predicted and observed values was estimated for the next four months' milk yield. Even using only 3 TD records for some cows, the average milk yield of Korean Holstein cows could be predicted with high accuracy when compared with the observed milk yield. The persistency of each cow was also estimated, which might be useful for selecting cows with higher persistency. The results of the present study suggest the use of a $5^{th}$ order Legendre polynomial to predict the future milk yield of each cow.
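The Legendre covariates that a random regression model of this kind regresses on can be sketched as follows. This is only an illustration, not the authors' code: the function names, the 5 to 305 days-in-milk range, and the use of unnormalized polynomials are assumptions (animal breeding software often uses a normalized variant).

```python
def legendre_basis(x, order):
    """Return [P_0(x), ..., P_order(x)] using Bonnet's recursion."""
    values = [1.0]
    if order >= 1:
        values.append(x)
    for n in range(1, order):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values


def standardize_dim(dim, dim_min=5.0, dim_max=305.0):
    """Map days in milk onto [-1, 1], the domain of Legendre polynomials.

    The default range is a hypothetical choice, not taken from the abstract.
    """
    return 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0


# One row of covariates for a test-day record at 155 days in milk,
# using a 5th order polynomial (six columns, P_0 through P_5).
row = legendre_basis(standardize_dim(155.0), order=5)
```

Each test-day record contributes one such row; the random regression coefficients for the genetic and permanent environmental effects multiply these columns.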

Mixed-effects zero-inflated Poisson regression for analyzing the spread of COVID-19 in Daejeon

  • 김광희;이은지
    • 응용통계연구 / Vol. 34, No. 3 / pp.375-388 / 2021
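  • This study was designed to analyze the increase in confirmed cases in Daejeon Metropolitan City in order to help devise measures to prevent the spread of COVID-19. Attributing the increase to residents' frequent movement and to fatigue and complacency caused by prolonged social distancing, we modeled the weekly number of confirmed cases in each administrative dong (neighborhood) as the response variable, with the time elapsed since the transition to "distancing in daily life" and the number of bus passengers alighting in each dong as explanatory variables. Because case counts for each dong were measured repeatedly on a weekly basis, and because more zeros may be observed than expected under a Poisson distribution, we applied a mixed-effects zero-inflated Poisson regression model. Since trends in cases may differ according to the character of a dong, dongs with similar characteristics were clustered and the cluster was used as a categorical explanatory variable. In addition, considering that the effect of the number of alighting bus passengers may vary with the character of the dong, we included an interaction term between the two variables; the effect was significant in relatively busy dongs (significance level = 0.1). The fitted model showed that a larger population, being a busy dong, and an increase in alighting bus passengers were importantly associated with an increase in confirmed cases. Meanwhile, according to the estimated model, with population and bus alighting volume held fixed, the busy group was expected to have far fewer confirmed cases than the other group, which can be interpreted as the city's strong response to high-risk areas having been effective.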
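The reasoning behind the zero-inflated Poisson choice can be illustrated with a minimal sketch of its probability mass function. The function and parameter names (`zip_pmf`, `lam`, `pi`, `conditional_mean`) are hypothetical, and the log link with a random intercept per dong is an assumed, common specification rather than the model reported in the paper.

```python
import math


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson probability of observing count k.

    lam: Poisson mean of the count process.
    pi:  probability of a structural (excess) zero.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def conditional_mean(linear_predictor, random_intercept):
    """Poisson mean under an assumed log link with a dong-level random intercept."""
    return math.exp(linear_predictor + random_intercept)


# With pi > 0 the model puts more mass on zero than a plain Poisson does.
p_zero_zip = zip_pmf(0, lam=2.0, pi=0.3)
p_zero_poisson = zip_pmf(0, lam=2.0, pi=0.0)
```

The random intercept is what makes the model "mixed-effects": repeated weekly counts from the same dong share one draw of it.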

Factors Affecting Energy Drinks Consumption among Adolescents

  • 윤혜선
    • 한국학교보건학회지 / Vol. 29, No. 3 / pp.218-225 / 2016
  • Purpose: The purpose of this study was to investigate the factors affecting energy drinks consumption among adolescents in South Korea. Methods: The study is a secondary analysis. Using statistics from the 11th (2015) Korea Youth Risk Behavior Web-based Survey, variations among the subjects were presented as percentages and analyzed by $\chi^2$ test and logistic regression analysis. The study sample comprised 68,043 middle and high school students in South Korea. Results: In Model 1, which included general characteristics, the significant factors of energy drinks consumption were gender, weekly allowance, cohabitation with family, and economic status. In the final model, where health-related characteristics were added, the significant factors were gender, school type, weekly allowance, cohabitation with family, stress level, sadness, drinking, smoking, and walking days. Conclusion: The results suggest that intensified education on energy drinks consumption is needed not only at schools but in the whole community. Also, adolescents' awareness of the potential health effects of energy drinks, in particular when mixed with alcoholic beverages, should be increased through health education.

Genetic Relationship between Carcass Traits and Carcass Price of Korean Cattle

  • Kim, Jong-Bok;Kim, Dae-Jung;Lee, Jeong-Koo;Lee, Chae-Young
    • Asian-Australasian Journal of Animal Sciences / Vol. 23, No. 7 / pp.848-854 / 2010
  • The objectives of this study were to estimate genetic parameters for the carcass price and for the carcass traits contributing to carcass grading, and to investigate the influence of each carcass trait on the carcass price using multiple regression and path analyses. Data for carcass traits and carcass prices were collected from March 2003 to January 2009 on steers of Korean cattle raised at private farms. The analytical mixed animal model, including slaughter house-year-month combination and linear and quadratic slaughter age as fixed effects, and random animal and residual effects, was used to estimate genetic parameters. The effects of carcass traits on the carcass price were evaluated by applying multiple regression analyses. Heritability estimates of carcass traits were $0.20{\pm}0.08$ for carcass weight (CWT), $0.33{\pm}0.10$ for back fat thickness (BFT), $0.07{\pm}0.05$ for eye-muscle area (EMA) and $0.25{\pm}0.10$ for marbling score (MS), and those of carcass prices were $0.21{\pm}0.10$ for auction price per 1 kg of carcass weight (AP) and $0.13{\pm}0.07$ for total price (CP). Genetic correlation coefficients of AP with CWT and MS were $-0.35{\pm}0.29$ and $0.99{\pm}0.04$, respectively, and those of CP with CWT and MS were $0.59{\pm}0.22$ and $0.39{\pm}0.29$, respectively. If an appropriate adjustment for temporal economic value is available, the moderate heritability estimates of AP and CP might suggest their potential use as breeding objectives for improving the gross incomes of beef cattle farms. The large genetic correlation estimates of carcass price variables with CWT and MS implied that simultaneous selection for both CWT and MS would also be useful in enhancing income.
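For orientation, the heritability and genetic correlation reported from an animal model with random animal and residual effects are conventionally defined as below. The symbols are assumptions introduced here, not notation from the abstract: $\sigma_a^2$ is the additive genetic (animal) variance, $\sigma_e^2$ the residual variance, and $\sigma_{a_1 a_2}$ the additive genetic covariance between two traits.

```latex
h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2},
\qquad
r_g = \frac{\sigma_{a_1 a_2}}{\sqrt{\sigma_{a_1}^2 \, \sigma_{a_2}^2}}
```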

Quantile Co-integration Application for Maritime Business Fluctuation

  • 김현석
    • 한국항만경제학회지 / Vol. 38, No. 2 / pp.153-164 / 2022
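  • This study estimates a quantile model of the shipping industry using second-hand Capesize vessel prices, a representative means of transporting raw materials, from January 2000 to December 2021. The study aims at two academic contributions. First, it analyzes the relationship between the freight market and second-hand Capesize vessels, the representative ship type in the raw material transport market, for which empirical results are mixed. Second, it presents an empirical model that uses quantile regression to account for the structural change raised in 김현석·장명희 (2020a). The results confirm that, by reflecting structural change in time series data, the quantile model can circumvent the problems raised by error instability. In addition, the long-run equilibrium relationship of the co-integration model is separated, through long-run and short-run estimated parameters, into long-run and short-run effects of exogenous variables, and extended to forecasts broken down by quantile. These estimates provide a basis for extending analysis based on theoretical shipping models to artificial intelligence and machine learning.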

The Ability of L2 LSTM Language Models to Learn the Filler-Gap Dependency

  • Kim, Euhee
    • 한국컴퓨터정보학회논문지 / Vol. 25, No. 11 / pp.27-40 / 2020
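  • This paper investigates the filler-gap dependency, an implicit syntactic relation that a long short-term memory (LSTM) network acquires while learning English, to clarify the correlation between the amount of English sentence training and the sentence processing patterns of Korean learners of English (L2ers). To this end, we first built an LSTM language model (LSTM LM). The model was trained by deep learning on English sentences from an L2 corpus that L2ers could potentially encounter while learning English. Next, using this language model, we examined sentence processing by computing the wh-licensing interaction effect, that is, the degree of surprisal, an information-theoretic measure of information content, for English sentences that violate the filler-gap dependency structure. Furthermore, by comparing the L2ers language model with a corresponding native-speaker language model, we showed that both language models can track the abstract syntactic structure underlying the filler-gap dependency in sentence processing. Using a linear mixed-effects regression model, we also established that there is a statistically significant difference between the native-speaker and L2ers language models in processing the dependency, which is the central research question of this paper.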
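The surprisal measure mentioned in this abstract can be sketched in a few lines. The difference-of-differences form of the wh-licensing interaction below is one formulation used in the literature and is assumed here; the abstract does not spell out its exact definition. All function names and probabilities are hypothetical illustration values, not results from the paper.

```python
import math


def surprisal(probability):
    """Surprisal in bits of a word the language model assigned this probability."""
    return -math.log2(probability)


def wh_licensing_interaction(s_wh_nogap, s_nowh_nogap, s_wh_gap, s_nowh_gap):
    """Difference of differences over the four filler x gap conditions.

    Assumed formulation: a positive value means the presence of a wh-filler
    raises surprisal where no gap follows more than it does where a gap follows.
    """
    return (s_wh_nogap - s_nowh_nogap) - (s_wh_gap - s_nowh_gap)


# Hypothetical model probabilities at the critical region of each condition.
interaction = wh_licensing_interaction(
    s_wh_nogap=surprisal(1 / 16),   # filler present, gap missing
    s_nowh_nogap=surprisal(1 / 4),  # no filler, no gap
    s_wh_gap=surprisal(1 / 4),      # filler present, gap present
    s_nowh_gap=surprisal(1 / 32),   # gap without a licensing filler
)
```

Per-item interaction values like this one are the kind of response that a linear mixed-effects regression could then compare between the two language models.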

Evaluation of the Relationship between the Exposure Level to Mixed Hazardous Heavy Metals and Health Effects Using Factor Analysis

  • 김은섭;문선인;임동혁;최병선;박정덕;엄상용;김용대;김헌
    • 한국환경보건학회지 / Vol. 48, No. 4 / pp.236-243 / 2022
  • Background: In the case of multiple exposures to different types of heavy metals, such as the conditions faced by residents living near a smelter, it would be preferable to group hazardous substances with similar characteristics rather than individually related substances and evaluate the effects of each group on the human body. Objectives: The purpose of this study is to evaluate the utility of factor analysis in the assessment of health effects caused by exposure to two or more hazardous substances with similar characteristics, such as in the case of residents living near a smelter. Methods: Heavy metal concentration data for 572 people living in the vicinity of the Janghang smelter area were grouped based on several subfactors according to their characteristics using factor analysis. Using these factor scores as an independent variable, multiple regression analysis was performed on health effect markers. Results: Through factor analysis, three subfactors were extracted. Factor 1 contained copper and zinc in serum and revealed a common characteristic of the enzyme co-factor in the human body. Factor 2 involved urinary cadmium and arsenic, which are harmful metals related to kidney damage. Factor 3 encompassed blood mercury and lead, which are classified as related to cardiovascular disease. As a result of multiple linear regression analysis, it was found that using the factor index derived through factor analysis as an independent variable is more advantageous in assessing the relevance to health effects than when analyzing the two heavy metals by including them in a single regression model. Conclusions: The results of this study suggest that regression analysis linked with factor analysis is a good alternative in that it can simultaneously identify the effects of heavy metals with similar properties while overcoming multicollinearity that may occur in environmental epidemiologic studies on exposure to various types of heavy metals.