• Title/Abstract/Keywords: Multiple regression model

Search results: 2,531 items (processing time: 0.026 seconds)

Subset selection in multiple linear regression: An improved Tabu search

  • Bae, Jaegug;Kim, Jung-Tae;Kim, Jae-Hwan
    • Journal of Advanced Marine Engineering and Technology
    • /
    • Vol. 40, No. 2
    • /
    • pp.138-145
    • /
    • 2016
  • This paper proposes an improved tabu search method for subset selection in multiple linear regression models. Variable selection is a vital combinatorial optimization problem in multivariate statistics. Selecting the optimal subset of variables is necessary to construct a reliable multiple linear regression model. Its applications range widely, from machine learning, time-series prediction, and multi-class classification to noise detection. Because the problem is NP-complete, finding the optimal solution becomes more difficult as the number of variables increases. Two typical metaheuristic methods have been developed to tackle the problem: the tabu search algorithm and a hybrid genetic and simulated annealing algorithm. However, both methods have shortcomings. The tabu search method requires a large amount of computing time, and the hybrid algorithm produces a less accurate solution. To overcome these shortcomings, we propose an improved tabu search algorithm that reduces the number of neighborhood moves and adopts an effective move search strategy. To evaluate the performance of the proposed method, comparative studies are performed on small data sets from the literature and on large simulated data sets. Computational results show that the proposed method outperforms the two metaheuristic methods in terms of computing time and solution quality.
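
As a rough illustration of the problem setting, the following is a minimal sketch of a basic tabu search for subset selection. It is not the improved algorithm proposed in the paper. It assumes a fixed subset size `k`, swap moves (one variable in, one out), the residual sum of squares as the objective, and a simple tenure-based tabu list with an aspiration rule. The names `rss` and `tabu_subset_search`, the tenure value, and the synthetic data are all hypothetical.

```python
import numpy as np

def rss(X, y, subset):
    """Residual sum of squares of an OLS fit (with intercept) on the chosen columns."""
    A = np.column_stack([np.ones(len(y)), X[:, sorted(subset)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def tabu_subset_search(X, y, k, n_iter=50, tenure=3, seed=0):
    """Basic tabu search over size-k subsets using swap moves (illustrative only)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    current = set(rng.choice(p, size=k, replace=False).tolist())
    best, best_rss = set(current), rss(X, y, current)
    tabu = {}  # variable index -> last iteration at which moving it is forbidden
    for it in range(n_iter):
        candidates = []
        for out_var in current:
            for in_var in set(range(p)) - current:
                trial = (current - {out_var}) | {in_var}
                value = rss(X, y, trial)
                is_tabu = tabu.get(out_var, -1) >= it or tabu.get(in_var, -1) >= it
                # Aspiration: a tabu move is allowed if it beats the best solution so far.
                if not is_tabu or value < best_rss:
                    candidates.append((value, out_var, in_var, trial))
        if not candidates:
            continue  # every move is tabu; wait for tenures to expire
        value, out_var, in_var, trial = min(candidates, key=lambda c: c[0])
        current = trial  # take the best admissible move, even if it is worse
        tabu[out_var] = it + tenure
        tabu[in_var] = it + tenure
        if value < best_rss:
            best, best_rss = set(trial), value
    return sorted(best), best_rss

# Synthetic data (assumption): only columns 0, 3 and 5 carry signal.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 8))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + 1.5 * X[:, 5] + 0.1 * rng.normal(size=200)
selected, selected_rss = tabu_subset_search(X, y, k=3)
print(selected, selected_rss)
```

Every swap in the neighborhood is evaluated at each iteration, which is the cost that the abstract says the improved method reduces.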

다중선형회귀분석 기반 건설장비 이산화탄소 배출량 예측모델 개발 (Development of prediction methodology from CO2 emissions of construction equipment based multiple linear regression)

  • 권재민;이재학;조민도;최영준;한승우
    • 한국건축시공학회:학술대회논문집
    • /
    • 한국건축시공학회 2019년도 추계 학술논문 발표대회
    • /
    • pp.38-39
    • /
    • 2019
  • Environmental problems caused by greenhouse gases (GHG) emitted by various industries are emerging around the world, and countries are applying relevant regulations accordingly. Korea operates a carbon credit system in which industrial GHG emissions are traded for money, and this system is expected to be applied to the construction industry. In addition, construction equipment using fossil fuels accounts for the largest portion of $CO_2$ emissions in the construction industry, and the importance of $CO_2$ reduction and prediction is increasing. However, directly measured data on the $CO_2$ emissions of construction equipment are lacking, and there is no accurate measurement methodology. Therefore, in this study, independent variables were derived from the $CO_2$ emission data. Multiple linear regression was then performed on the independent variables to derive a predictive model of carbon dioxide emissions by work type of construction equipment. It is expected that construction process plans based on environmental factors can be established in the construction industry in the future.
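
The abstract does not list the independent variables or the data, so the following is only a minimal sketch of fitting a multiple linear regression by ordinary least squares. The predictors `engine_load` and `operating_hours`, the coefficients, and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns coefficients and R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ beta
    ss_res = float(np.sum((y - fitted) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return beta, 1.0 - ss_res / ss_tot

# Synthetic, made-up data: two hypothetical predictors per observation.
rng = np.random.default_rng(0)
engine_load = rng.uniform(0.2, 1.0, size=100)      # hypothetical: fraction of rated load
operating_hours = rng.uniform(1.0, 8.0, size=100)  # hypothetical: hours of operation
emissions = (5.0 + 40.0 * engine_load + 12.0 * operating_hours
             + rng.normal(0.0, 1.0, size=100))     # hypothetical response

beta, r2 = fit_mlr(np.column_stack([engine_load, operating_hours]), emissions)
print(beta, r2)  # beta = [intercept, engine_load coefficient, operating_hours coefficient]
```

In practice, one such model would be fitted per work type, each with its own data.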


기계학습 기반의 가스폭발위험범위 예측모델에 관한 연구 (A Study on Predictive Models based on the Machine Learning for Evaluating the Extent of Hazardous Zone of Explosive Gases)

  • 정용재;이창준
    • Korean Chemical Engineering Research
    • /
    • Vol. 58, No. 2
    • /
    • pp.248-256
    • /
    • 2020
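  • In this study, we developed predictive models for the extent of the gas explosion hazardous zone, which is needed for installing explosion-proof equipment in hazardous locations. To this end, 1,200 hazardous-zone data points were generated for 12 flammable gases. The extent of the hazardous zone was set as the output variable, and the 12 variables required in the data-generation process were set as input variables. Predictive models were developed using multiple regression, principal component regression, and artificial neural networks. Comparing the predictive performance of the models, the mean absolute percentage error (MAPE) was 44.2%, 49.3%, and 5.7%, respectively, and the root mean square error (RMSE) was 1.389 m, 1.602 m, and 0.203 m, respectively. The results show that the artificial neural network performed best and is the most suitable model for predicting the extent of the gas explosion hazardous zone.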
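
The two error measures used for the comparison can be sketched as follows, using their standard definitions. The function names and the four sample values are made up for illustration and are unrelated to the study's data.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the same unit as the data (metres in the study)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Made-up values, for illustration only.
actual = [2.0, 4.0, 5.0, 10.0]
predicted = [1.0, 5.0, 5.0, 8.0]
print(mape(actual, predicted), rmse(actual, predicted))
```

MAPE is relative to each true value and RMSE is in absolute units, so the two can rank models differently when the hazardous-zone extents vary widely.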

PMF 모델을 이용한 대기 중 PM-10 오염원의 확인 (Source Identification of Ambient PM-10 Using the PMF Model)

  • 황인조;김동술
    • 한국대기환경학회지
    • /
    • Vol. 19, No. 6
    • /
    • pp.701-717
    • /
    • 2003
  • The objective of this study was to estimate air quality trends in the study area by surveying monthly and seasonal concentration trends, after analyzing the mass concentration of PM-10 samples and the inorganic elements, ions, and total carbon in PM-10. The study also applied the PMF (Positive Matrix Factorization) model, which is useful when source profiles are unavailable. The model was therefore considered suitable for Korea, where information about pollution sources is often scarce. After obtaining results from the PMF modeling, the existing sources in the study area were qualitatively identified. The PM-10 particles were collected on quartz fiber filters by a PM-10 high-volume air sampler for 3 years (Mar. 1999∼Dec. 2001) at Kyung Hee University. The 25 chemical species (Al, Mn, Ti, V, Cr, Fe, Ni, Cu, Zn, As, Se, Cd, Ba, Ce, Pb, Si, $Na^{+}$, $NH_4^{+}$, $K^{+}$, $Mg^{2+}$, $Ca^{2+}$, $Cl^{-}$, $NO_3^{-}$, $SO_4^{2-}$, TC) were analyzed by ICP-AES, IC, and EA after proper pretreatment of each sample filter. The PMF model was applied to estimate the quantitative contribution of air pollution sources based on the chemical information (128 samples and 25 chemical species). Through a case study of PMF modeling for the PM-10 aerosols, a total of 11 factors were determined. Multiple linear regression analysis between the observed PM-10 mass concentration and the estimated G matrix was performed following the FPEAK test. Finally, the regression analysis provided source profiles (scaled F matrix). Thus, 11 sources were qualitatively identified, such as a secondary aerosol related source, soil related source, waste incineration source, field burning source, fossil fuel combustion source, industry related source, motor vehicle source, oil/coal combustion source, non-ferrous metal source, and aged sea-salt source.
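
The regression step can be sketched as follows, assuming the common approach of regressing measured mass on the columns of G without an intercept and using the coefficients as per-factor scaling constants. The abstract does not spell out the exact procedure, so this may differ from what the authors did. The matrices below are synthetic, and 3 factors and 6 species are used for brevity instead of the study's 11 and 25.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_factors, n_species = 128, 3, 6  # reduced sizes for brevity

# Hypothetical unscaled PMF output: G (samples x factors), F (factors x species).
G = rng.uniform(0.0, 2.0, size=(n_samples, n_factors))
F = rng.uniform(0.0, 1.0, size=(n_factors, n_species))

# Synthetic PM-10 mass built from known scaling constants plus noise.
true_scale = np.array([10.0, 25.0, 5.0])
mass = G @ true_scale + rng.normal(0.0, 0.5, size=n_samples)

# Regress mass on the columns of G (no intercept) to estimate the scaling constants.
scale, *_ = np.linalg.lstsq(G, mass, rcond=None)

G_scaled = G * scale           # factor contributions in mass units
F_scaled = F / scale[:, None]  # profiles rescaled so that the product G @ F is unchanged
print(scale)
```

Scaling each column of G by a constant and dividing the matching row of F by the same constant leaves the fitted species concentrations unchanged. This is why the regression coefficients can be absorbed into the profiles.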

청소년의 인터넷 중독현상과 자기통제기대의 구조적 경로모형에 관한 연구 (The Structural Path Model of Adolescents′ Internet Addiction and Expected Self-Control)

  • 박재성
    • 보건교육건강증진학회지
    • /
    • Vol. 21, No. 3
    • /
    • pp.1-17
    • /
    • 2004
  • The purpose of this study is to evaluate the roles of expected self-control and expected self-control results in explaining adolescents' Internet addiction. In the study model, expectations of self-control and self-control results directly determine Internet addiction, and Internet use time mediates the impacts of these expectations on Internet addiction. The study subjects are 1,080 middle and high school students in Busan. Stratified cluster sampling is applied by school type and school year. The response rate is 96% (1,037 cases). This study develops scales of expected self-control and expected self-control results. The scales of Internet addiction are devised using the concept of functional dependency, covering salience, withdrawal symptoms, mood modification, tolerance, relapse, and conflict. To verify the study model, path analysis is applied to identify path significance, and multiple regression models are applied to evaluate the confounding effects of control variables. Moreover, a multiple partial F-test is performed to select the best regression model. Expected self-control is a significant determinant of both Internet addiction and Internet use time, and Internet use time in turn significantly explains Internet addiction. The total effect of expected self-control on Internet addiction is -.95, composed of the direct effect (-.71) and the indirect effect (-.24). Here the direct effect represents a curative effect, since expected self-control directly reduces the level of Internet addiction, and the indirect effect represents a preventive effect, because self-control can reduce Internet use time, which is a direct determinant of Internet addiction. In the test of the confounding effects of control variables, no confounding effects are found in the multiple regression models. This implies that the study model is robust with respect to the control variables.
In conclusion, improving adolescents' expected self-control can reduce the level of Internet addiction. This finding implies that a health promotion program for improving expected self-control can be a cost-effective method compared with other approaches.
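
The effect decomposition reported above is simple arithmetic and can be checked as follows. Only the direct and indirect effects come from the abstract. The variable names are illustrative, and the individual path coefficients behind the indirect effect are not reported, so they are not reconstructed here.

```python
import math

direct_effect = -0.71    # expected self-control -> Internet addiction (reported)
indirect_effect = -0.24  # via Internet use time (reported)

# In path analysis the total effect is the sum of the direct and indirect effects.
total_effect = direct_effect + indirect_effect
share_direct = direct_effect / total_effect  # fraction of the total that is direct

print(round(total_effect, 2), round(share_direct, 3))
```

By this split, roughly three quarters of the total effect operates directly rather than through Internet use time.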

실제 컨버터 출력 데이터를 이용한 특정 지역 태양광 장단기 발전 예측 (Prediction of Short and Long-term PV Power Generation in Specific Regions using Actual Converter Output Data)

  • 하은규;김태오;김창복
    • 한국항행학회논문지
    • /
    • Vol. 23, No. 6
    • /
    • pp.561-569
    • /
    • 2019
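  • Because photovoltaic (PV) power generation can produce electrical energy wherever there is solar irradiance, its use as a new energy source is growing rapidly. This paper performs short- and long-term output prediction using the converter output of an actual PV power generation system. The prediction algorithms were multiple linear regression; the support vector machine (SVM), a classification model in supervised machine learning; and deep learning methods such as DNN and LSTM. In addition, three models were used according to the input/output structure of the meteorological factors. Long-term prediction covered monthly, seasonal, and yearly forecasts, and short-term prediction covered seven days. Comparing prediction errors by the RMSE measure, the deep learning networks were more accurate than multiple linear regression and SVM. Moreover, LSTM, a model well suited to time-series prediction, was more accurate than DNN. In the experiments on input/output structure, Model 2 had lower error than Model 1, and Model 3 had lower error than Model 2.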

계획적 행동이론을 이용한 여대생의 유제품 섭취 행동 분석 (Using the Theory of Planned Behavior to Explain Dairy Food Consumption among University Female Students)

  • 김경원;신은미
    • 대한지역사회영양학회지
    • /
    • Vol. 8, No. 1
    • /
    • pp.53-61
    • /
    • 2003
  • This study was designed to explain the intentions and consumption of dairy foods among university female students. The factors related to intentions of consumption or actual consumption of dairy foods were identified within the theory of planned behavior. The survey questionnaire, developed using open-ended questions (n = 35), was administered to university female students (n = 184). Subjects completed information regarding attitudes, subjective norms, perceived control, intentions, and consumption of dairy foods. Correlation analysis and multiple regression were used to study the association of these factors with intentions and consumption of dairy foods. Subjects showed relatively low intention to consume dairy foods (-0.4 $\pm$ 1.6 from a scale of -4-14). They ate 1.2 $\pm$ 0.9 servings of dairy foods a day, and 52.2% of subjects had less than one serving a day, showing inadequate consumption of dairy foods. All three factors (attitudes, subjective norms, and perceived control) were significantly correlated with the intention to consume dairy foods regularly (r = 0.26-0.27). Multiple regression results, however, revealed that subjective norms (p < 0.01) and perceived control (p < 0.05) contributed to the model explaining intentions, while attitudes did not (model $R^2$ = 0.154). To predict and explain actual consumption of dairy foods, two regression models were examined. In the first model, perceived control was significant in predicting dairy food consumption, while attitudes and subjective norms were not. In the second model, intentions and perceived control were significantly related to actual consumption of dairy foods, providing empirical evidence for the theory (model $R^2$ = 0.121). These results suggest that perceived control was significant in explaining actual behavior as well as intentions.
This study suggests that nutrition education to increase dairy food consumption among young adults should focus on increasing perception of control and eliciting social support from respected others.

다중회귀모형의 그래픽적 방법 (Graphical Method for Multiple Regression Model)

  • 이우리;이의기;홍종선
    • 응용통계연구
    • /
    • Vol. 20, No. 1
    • /
    • pp.195-204
    • /
    • 2007
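  • We propose a regression sum of squares plot, which uses a geometric approach to display multiple regression data graphically. The plot is drawn as a semicircle using the relation that the regression sum of squares for two explanatory variables equals the simple regression sum of squares for one variable plus the increase in the regression sum of squares when the other variable is added to that one-variable model. The plot represents each explanatory variable as a vector and is a graphical method for visualizing the degree of its influence on the response variable. The explanatory variable whose vector lies closer to the horizontal axis can be judged to have greater influence on the response variable. In addition, the size of the angle between the vectors of two explanatory variables can be used to diagnose whether suppression occurs.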
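
The sum-of-squares relation that the plot builds on, SSR(x1, x2) = SSR(x1) + SSR(x2 | x1), can be checked numerically as below. This sketch does not draw the semicircle plot itself. The helper names and the synthetic data are assumptions. The increment SSR(x2 | x1) is computed independently, from the part of x2 that is orthogonal to the intercept and x1, so that the identity is a real check and not a definition.

```python
import numpy as np

def design(*cols):
    """Design matrix with an intercept column followed by the given predictors."""
    return np.column_stack([np.ones(len(cols[0])), *cols])

def ssr(A, y):
    """Regression sum of squares: sum of (fitted - mean(y))^2 for an OLS fit on A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((A @ beta - y.mean()) ** 2))

# Synthetic data (assumption): two correlated explanatory variables.
rng = np.random.default_rng(7)
x1 = rng.normal(size=60)
x2 = 0.5 * x1 + rng.normal(size=60)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=60)

ssr_x1 = ssr(design(x1), y)
ssr_both = ssr(design(x1, x2), y)

# Residual of x2 after regressing it on the intercept and x1.
A1 = design(x1)
g, *_ = np.linalg.lstsq(A1, x2, rcond=None)
e2 = x2 - A1 @ g

# Extra sum of squares from adding x2 to the model that already contains x1.
ssr_x2_given_x1 = float((e2 @ y) ** 2 / (e2 @ e2))

print(ssr_both, ssr_x1 + ssr_x2_given_x1)
```

Because the two components come from orthogonal directions, they add like the squared sides of a right triangle, which fits the geometric reading given in the abstract.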

유전 알고리즘을 이용한 국소가중회귀의 다중모델 결합을 위한 점진적 앙상블 학습 (Incremental Ensemble Learning for The Combination of Multiple Models of Locally Weighted Regression Using Genetic Algorithm)

  • 김상훈;정병희;이건호
    • 정보처리학회논문지:소프트웨어 및 데이터공학
    • /
    • Vol. 7, No. 9
    • /
    • pp.351-360
    • /
    • 2018
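  • The locally weighted regression (LWR) model, traditionally a form of lazy learning, is a regression equation over a short interval: to obtain a prediction for a query point (the input), it is fitted to the training data within a given range, with weights that vary according to distance from the query point. This study proposes an incremental ensemble learning procedure for LWR, which is a form of memory-based learning. The proposed method uses a genetic algorithm to generate and combine LWR models sequentially over time. A limitation of existing LWR is that multiple LWR models can be generated depending on the choice of indicator function and training data, and the quality of the prediction varies with the model. However, no research has addressed the selection or combination of multiple LWR models. In this study, after an initial LWR model is generated from an indicator function and training data, an evolutionary learning process is repeated to select an appropriate indicator function, and LWR models applied to other training data are evaluated and improved to overcome bias caused by the training data. As data arrive for each interval, LWR models are incrementally generated and stored, in the manner of eager learning. To obtain a prediction at a given time, an LWR model is generated from the data newly arrived within the interval and is then combined with the existing LWR models in that interval using the genetic algorithm. The proposed method shows better fitness than the existing approach of selecting among multiple LWR models by simple averaging. Using real data such as hourly traffic volume in a specific area and hourly sales at highway rest areas, the connected pattern of the LWR results of this study is compared with predictions from multiple regression analysis.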
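
For orientation, here is a minimal sketch of plain locally weighted linear regression at a single query point, followed by the simple-averaging baseline over several LWR models. It does not implement the genetic-algorithm ensemble proposed in the paper. The Gaussian kernel, the bandwidth values, the function name `lwr_predict`, and the noise-free sine data are all assumptions.

```python
import numpy as np

def lwr_predict(x_train, y_train, x_query, bandwidth=0.5):
    """Locally weighted linear regression prediction at one query point (1-D input)."""
    # Gaussian kernel weights: training points near the query count more.
    w = np.exp(-0.5 * ((x_train - x_query) / bandwidth) ** 2)
    A = np.column_stack([np.ones_like(x_train), x_train])
    sw = np.sqrt(w)
    # Weighted least squares, done by scaling each row by sqrt(weight).
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y_train * sw, rcond=None)
    return float(beta[0] + beta[1] * x_query)

# Synthetic training data (assumption).
x_train = np.linspace(0.0, 6.0, 200)
y_train = np.sin(x_train)

pred = lwr_predict(x_train, y_train, x_query=2.0, bandwidth=0.3)

# Simple-averaging baseline: several LWR models that differ only in bandwidth.
bandwidths = [0.2, 0.3, 0.5]
preds = [lwr_predict(x_train, y_train, 2.0, h) for h in bandwidths]
simple_average = float(np.mean(preds))

print(pred, simple_average)
```

Each bandwidth yields a different model and a different prediction at the same query point, which is the model-selection problem the abstract describes.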

Forecasting for a Credit Loan from Households in South Korea

  • Jeong, Dong-Bin
    • 산경연구논집
    • /
    • Vol. 8, No. 4
    • /
    • pp.15-21
    • /
    • 2017
  • Purpose - In this work, we examined the causal relationship between credit loans from households (CLH), loans collateralized with housing (LCH), and the interest rate on certificates of deposit (ICD), among others, in South Korea. Furthermore, optimal forecasts from the underlying model are obtained, with potential applications in the economic field. Research design, data, and methodology - A total of 31 realizations sampled from the 4th quarter of 2008 to the 4th quarter of 2016 were chosen for this research. To achieve the purpose of this study, a regression model with correlated errors was used. Furthermore, goodness-of-fit measures were used as tools for constructing the optimal model. Results - We found that, by applying the regression model with an ARMA(1,5) error component to CLH, a steep and lasting rise can be expected over the next year, with a moderate increase in LCH and ICD. Conclusions - Based on the 2017-2018 forecasts for CLH, a precipitous and lasting increase can be expected over the next two years, with a gradual rise in the two major explanatory variables. Allowing for the assumption that feedback among the variables can exist, we can in the future consider more general models, such as the vector autoregressive model and the structural equation model, to name a few.