• Title/Abstract/Keywords: Multiple regression model

Search results: 2,531 items (processing time 0.027 seconds)

A Study on the Prediction of Daily Urban Water Demand with Multiple Regression Model

  • 박성천;문병석;오창주;이병조
    • 한국농공학회지 / Vol. 40, No. 1 / pp.68-77 / 1998
  • The purpose of this paper is to establish a method for estimating daily urban water demand using statistical analysis, for use in developing efficient management and operation of water supply facilities; the accuracy of the model is verified by error rate and F-value. The data used in this study were the daily urban water use, weather conditions such as temperature, precipitation, and relative humidity, and the day of the week. The case study was conducted for the city of Namwon in Korea. The raw data were rearranged either by month or by season for analysis, and statistical analysis was applied to the data to obtain the regression model. As a result, a linear regression model was developed to estimate daily urban water use from weather conditions. The regression constant and coefficients of the model were determined for each month of the year. The accuracy of the model was within 3% average error and within 11% maximum error. The resulting model was found to be useful for the practical operation and management of water supply facilities.


A Quantitative Model for the Projection of Health Expenditure

  • 김한중;이영두;남정모
    • Journal of Preventive Medicine and Public Health / Vol. 24, No. 1 / pp.29-36 / 1991
  • Multiple regression analysis using ordinary least squares (OLS) is frequently used for the projection of health expenditure as well as for the identification of factors affecting health care costs. Data for the analysis often have mixed characteristics of time series and cross section. In this case, the parameters obtained from OLS estimation are no longer the best linear unbiased estimators (BLUE), because the data do not satisfy the basic assumptions of regression analysis. The study theoretically examined the statistical problems that arise when OLS estimation is applied to time series cross section data. Then both OLS regression and time series cross section regression (TSCS regression) were applied to the same empirical data. Finally, the differences in parameters between the two estimations were explained through residual analysis.


Comparison of tree-based ensemble models for regression

  • Park, Sangho;Kim, Chanmin
    • Communications for Statistical Applications and Methods / Vol. 29, No. 5 / pp.561-589 / 2022
  • When multiple classification and regression trees are combined, tree-based ensemble models, such as random forest (RF) and Bayesian additive regression trees (BART), are produced. In this study, we compare the model structures and performances of various ensemble models in regression settings. RF learns from bootstrapped samples and selects a splitting variable from predictors gathered at each node. The BART model is specified as a sum of trees and is fitted using the Bayesian backfitting algorithm. Through extensive simulation studies, the strengths and drawbacks of the two methods in the presence of missing data, high-dimensional data, or highly correlated data are investigated. In the presence of missing data, BART performs well in general, whereas RF provides adequate coverage. BART performs better on high-dimensional, highly correlated data. However, in all of the scenarios considered, RF has a shorter computation time. The performance of the two methods is also compared using two real data sets that represent the aforementioned situations, and the same conclusion is reached.

An adaptive neuro-fuzzy inference system (ANFIS) model to predict the pozzolanic activity of natural pozzolans

  • Elif Varol;Didem Benzer;Nazli Tunar Ozcan
    • Computers and Concrete / Vol. 31, No. 2 / pp.85-95 / 2023
  • Natural pozzolans are used as additives in cement to develop more durable and high-performance concrete. The pozzolanic activity index (PAI) is important for assessing the performance of a pozzolan as a binding material and has an important effect on the compressive strength, permeability, and chemical durability of concrete mixtures. However, determining the 28-day (short-term) and 90-day (long-term) PAI of concrete mixtures is a time-consuming process. To reduce extensive experimental work, this study aims to predict the short-term and long-term PAIs as a function of the chemical compositions of various natural pozzolans. For this purpose, the chemical compositions of various natural pozzolans from Central Anatolia were determined with X-ray fluorescence spectroscopy. Mortar samples were prepared with the natural pozzolans, and the short-term and long-term PAIs were then calculated based on the compressive strength method. The effect of the natural pozzolans' chemical compositions on the short-term and long-term PAIs was evaluated, and the PAIs were predicted using multiple linear regression (MLR) and adaptive neuro-fuzzy inference system (ANFIS) models. The prediction model results show that the reactive SiO2 and SiO2+Al2O3+Fe2O3 contents are the most effective parameters on PAI. According to the performance of the prediction models, determined with metrics such as root mean squared error (RMSE) and coefficient of correlation (R2), the ANFIS models are more feasible than the multiple regression model in predicting the 28-day and 90-day pozzolanic activity. Estimating PAIs from the chemical components of natural pozzolana with high-performance prediction models will make an important contribution to material engineering applications, both in selecting favorable natural pozzolana and in saving time on tedious test processes.
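
The abstract above compares models by RMSE and R2 without giving the formulas. A minimal sketch of how such a comparison could be computed follows, assuming R2 is taken as the coefficient of determination (1 - SS_res / SS_tot); the abstract does not confirm which definition the authors used. The function names and all numbers are hypothetical and are not data from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed values and the predictions of two hypothetical models
observed = [70.0, 75.0, 80.0, 85.0, 90.0]
model_a = [71.0, 74.0, 81.0, 84.0, 91.0]   # every prediction is off by 1
model_b = [73.0, 72.0, 83.0, 82.0, 93.0]   # every prediction is off by 3

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(name, round(rmse(observed, pred), 3), round(r_squared(observed, pred), 3))
```

With these made-up numbers the first model has the lower RMSE and the higher R2, which is the pattern one would look for when ranking two predictors on the same data.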

Accident Models of Rotary by Age Group in Korea

  • 박민규;박병호
    • 한국도로학회논문집 / Vol. 15, No. 2 / pp.121-129 / 2013
  • PURPOSES : This study deals with traffic accidents at rotaries in Korea. The objective is to develop accident models by age group based on various rotary data. METHODS : To this end, the study pays particular attention to classifying the accident data of 17 rotaries by age, collecting data on geometric structure, traffic volume and other factors, and developing the models using SPSS 17.0 and EXCEL. RESULTS : First, 3 multiple linear regression models, all statistically significant, were developed. The value of the model of the under 30-49 age group was, however, evaluated to be 0.688, less than those of the other models. Second, the most powerful variables were found to be traffic volume in the model of the under 30 age group, circulatory roadway width in the model of the 30-49 age group, and the number of approach lanes in the model of the above 50 age group. Finally, tests using RMSE showed that all the accident models fit the given data. CONCLUSIONS : This study proposes installing streetlights and speed humps and widening the circulatory roadway as effective improvements for reducing accidents at rotaries.

Prediction Techniques for Difficulty Level of Hanja Using Multiple Linear Regression

  • 최정환;노지우;김순태
    • 한국인터넷방송통신학회논문지 / Vol. 19, No. 6 / pp.219-225 / 2019
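  • Existing methods of grading Hanja difficulty, such as Hanja proficiency levels, have problems. They diverge from the Korean words used in everyday life, and there is no way to know how often the characters of a given level are actually used. To address these problems, Hanja difficulty is measured with multiple regression analysis using frequency counts. The Hanja usage frequency and the Korean meaning frequency are tallied from elementary school textbooks. A questionnaire is prepared using these two frequencies together with the stroke count, and the respondents' answers on the appropriate time to learn each Hanja are used as the target variable for the regression. Stepwise regression is used to select appropriate features, and multiple linear regression is then performed. The model's R2 was 0.1105 and its RMSE was 0.1105.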

APPLICATION AND CROSS-VALIDATION OF SPATIAL LOGISTIC MULTIPLE REGRESSION FOR LANDSLIDE SUSCEPTIBILITY ANALYSIS

  • LEE SARO
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2004 Proceedings of ISRS 2004 / pp.302-305 / 2004
  • The aim of this study is to apply and cross-validate a spatial logistic multiple-regression model at Boun, Korea, using a Geographic Information System (GIS). Landslide locations in the Boun area were identified by interpretation of aerial photographs and field surveys. Maps of the topography, soil type, forest cover, geology, and land-use were constructed from a spatial database. The factors that influence landslide occurrence, such as slope, aspect, and curvature of topography, were calculated from the topographic database. Texture, material, drainage, and effective soil thickness were extracted from the soil database, and type, diameter, and density of forest were extracted from the forest database. Lithology was extracted from the geological database and land-use was classified from the Landsat TM satellite image. Landslide susceptibility was analyzed using landslide-occurrence factors by logistic multiple-regression methods. For validation and cross-validation, the result of the analysis was applied both to the study area, Boun, and another area, Youngin, Korea. The validation and cross-validation results showed satisfactory agreement between the susceptibility map and the existing data with respect to landslide locations. The GIS was used to analyze the vast amount of data efficiently, and statistical programs were used to maintain specificity and accuracy.


Incremental Regression based on a Sliding Window for Stream Data Prediction

  • 김성현;김룡;류근호
    • 한국정보과학회논문지:데이타베이스 / Vol. 34, No. 6 / pp.483-492 / 2007
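  • With the recent development of sensor networks, a great deal of real-world data is being collected in real time with a time attribute. Existing time-series prediction techniques perform prediction without updating the model. However, stream data is collected very quickly and its characteristics may change over time, so applying existing time-series prediction techniques is not appropriate. This paper therefore proposes a stream data prediction technique that uses a sliding window and incremental regression analysis. To feed stream data into a multiple regression model, the technique splits (Fractal) the data into several attributes through dimension splitting, and to reflect the changing data distribution it updates the regression model incrementally using a sliding window. It also uses a fixed-size queue so that the model is maintained with only the most recent data. Because the model is updated through a matrix holding minimal information, without retaining earlier data, the technique has low space complexity, and updating the model incrementally prevents the error rate from increasing. The validity of the proposed technique was measured using RME (Relative Mean Error) and RMSE (Root Mean Square Error), and the experimental results showed that it outperformed other techniques.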
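
The abstract above describes keeping a regression model current with a fixed-size queue and a small summary matrix. One way this idea could be sketched is shown below. It is not the authors' algorithm: the class name `SlidingWindowRegression`, the use of running sums of x xᵀ and x·y as the "minimal information" matrix, and the toy stream are all assumptions made for illustration, and the dimension-splitting step is omitted.

```python
from collections import deque
import numpy as np

class SlidingWindowRegression:
    """Least-squares fit over only the most recent `window` samples (hypothetical sketch)."""

    def __init__(self, n_features, window):
        self.window = window
        self.queue = deque()                  # holds the recent (x, y) pairs only
        d = n_features + 1                    # +1 for the intercept term
        self.xtx = np.zeros((d, d))           # running sum of x x^T
        self.xty = np.zeros(d)                # running sum of x * y

    def update(self, x, y):
        x = np.concatenate(([1.0], np.asarray(x, dtype=float)))
        self.queue.append((x, y))
        self.xtx += np.outer(x, x)
        self.xty += x * y
        if len(self.queue) > self.window:     # evict the oldest sample's contribution
            old_x, old_y = self.queue.popleft()
            self.xtx -= np.outer(old_x, old_x)
            self.xty -= old_x * old_y

    def coefficients(self):
        # lstsq tolerates a singular matrix while the window is still filling
        return np.linalg.lstsq(self.xtx, self.xty, rcond=None)[0]

    def predict(self, x):
        x = np.concatenate(([1.0], np.asarray(x, dtype=float)))
        return float(x @ self.coefficients())

# Toy stream whose underlying relation changes halfway through
model = SlidingWindowRegression(n_features=1, window=20)
for i in range(100):
    x = float(i % 10)
    y = 2.0 * x + 1.0 if i < 50 else -x + 5.0   # the relation changes at i = 50
    model.update([x], y)

print(model.coefficients())  # close to [5, -1]: only recent data shapes the fit
```

Each update costs a fixed amount of work regardless of how long the stream has run, and the fitted coefficients follow the new relation once the window no longer contains samples from the old one.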

Thermal conductivity prediction model for compacted bentonites considering temperature variations

  • Yoon, Seok;Kim, Min-Jun;Park, Seunghun;Kim, Geon-Young
    • Nuclear Engineering and Technology / Vol. 53, No. 10 / pp.3359-3366 / 2021
  • An engineered barrier system (EBS) for the deep geological disposal of high-level radioactive waste (HLW) is composed of a disposal canister, buffer material, gap-filling material, and backfill material. As the buffer fills the empty space between the disposal canisters and the near-field rock mass, heat energy from the canisters is released to the surrounding buffer material. It is vital that this heat energy is rapidly dissipated to the near-field rock mass, and thus the thermal conductivity of the buffer is a key parameter to consider when evaluating the safety of the overall disposal system. Therefore, to take into consideration the sizeable amount of heat being released from such canisters, this study investigated the thermal conductivity of Korean compacted bentonites and its variation within a temperature range of 25 ℃ to 80-90 ℃. As a result, thermal conductivity increased by 5-20% as the temperature increased. Furthermore, temperature had a greater effect under higher degrees of saturation and a lower impact under higher dry densities. This study also conducted a regression analysis with 147 sets of data to estimate the thermal conductivity of the compacted bentonite considering the initial dry density, water content, and variations in temperature. Furthermore, the Kriging method was adopted to establish an uncertainty metamodel of thermal conductivity to verify the regression model. The R2 value of the regression model was 0.925, and the regression model and metamodel showed similar results.
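
As a rough illustration of the kind of regression the abstract above describes, the sketch below fits a response to three predictors by least squares and reports R2. It assumes a linear form, which the abstract does not confirm, and every number in it is synthetic: the predictor ranges, the generating coefficients, and the noise level are invented for illustration and are not the paper's data or results. Only the sample count of 147 is taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 147  # same sample count as the abstract; the values themselves are synthetic

# Hypothetical predictor ranges, chosen only for illustration
dry_density = rng.uniform(1.5, 1.8, n)       # g/cm^3
water_content = rng.uniform(0.0, 20.0, n)    # %
temperature = rng.uniform(25.0, 90.0, n)     # deg C

# Made-up linear relation plus noise, used only to generate a response
true_coef = np.array([-1.0, 0.9, 0.03, 0.002])   # intercept, then three slopes
X = np.column_stack([np.ones(n), dry_density, water_content, temperature])
y = X @ true_coef + rng.normal(0.0, 0.02, n)

# Ordinary least squares fit and coefficient of determination
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ coef
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print("coefficients:", np.round(coef, 4))
print("R2:", round(float(r2), 4))
```

Because the synthetic response is generated from a linear rule with small noise, the fit recovers the generating slopes closely; real measurements would not be expected to behave this cleanly.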

QSPR Study of the Absorption Maxima of Azobenzene Dyes

  • Xu, Jie;Wang, Lei;Liu, Li;Bai, Zikui;Wang, Luoxin
    • Bulletin of the Korean Chemical Society / Vol. 32, No. 11 / pp.3865-3872 / 2011
  • A quantitative structure-property relationship (QSPR) study was performed for the prediction of the absorption maxima of azobenzene dyes. The entire set of 191 azobenzenes was divided into a training set of 150 azobenzenes and a test set of 41 azobenzenes according to Kennard and Stone's algorithm. A seven-descriptor model, with a squared correlation coefficient ($R^2$) of 0.8755 and a standard error of estimation (s) of 14.476, was developed by applying stepwise multiple linear regression (MLR) analysis to the training set. The reliability of the proposed model was further illustrated using various evaluation techniques: a leave-many-out cross-validation procedure, randomization tests, and validation through the test set.
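
The training/test split mentioned in the abstract above can be sketched as follows. This is a minimal version of the Kennard-Stone selection rule under stated assumptions: Euclidean distance in descriptor space, a start from the two most distant samples, and `n_select` between 2 and the number of points. The function name and the toy one-dimensional points are hypothetical; the paper's descriptors and distance measure are not given in the abstract.

```python
import math

def kennard_stone(points, n_select):
    """Return indices of `n_select` points chosen by the Kennard-Stone rule."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Start with the two points that are farthest apart
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda ij: dist[ij[0]][ij[1]])
    selected = [i0, j0]
    remaining = set(range(n)) - {i0, j0}

    # Repeatedly add the point whose nearest selected neighbour is farthest away
    while len(selected) < n_select:
        nxt = max(sorted(remaining),
                  key=lambda i: min(dist[i][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected

# Toy data: six samples described by a single hypothetical descriptor
points = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,), (9.0,)]
train_idx = kennard_stone(points, 3)
test_idx = [i for i in range(len(points)) if i not in train_idx]
print(train_idx, test_idx)
```

The selected samples spread across the descriptor range, which is why this rule is often used to pick a training set that covers the space while leaving interior points for testing.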