• Title/Summary/Keyword: nonlinear multiple regression analysis (비선형 다중회귀분석)


Improvement of Search Method of Genetic Programing for Wind Prediction MOS (풍속 예측 보정을 위한 Genetic Programing 탐색 기법의 개선)

  • Oh, Seungchul;Seo, Kisung
    • Proceedings of the KIEE Conference / 2015.07a / pp.1349-1350 / 2015
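  • Wind speed fluctuates more sharply from moment to moment and is more localized than other meteorological elements, so it is difficult to raise prediction accuracy with a numerical weather prediction model alone. The Korea Meteorological Administration's short-term wind speed forecast corrects the predictions of the UM (Unified Model), a global unified forecast model, through MOS (Model Output Statistics), and uses multiple linear regression to generate the correction equations. The authors previously improved on this by generating correction equations based on nonlinear regression using Genetic Programming (GP). In this study, to obtain further gains, we seek to improve performance on the GP side by using Automatically Defined Functions and multiple populations.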
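
The multiple linear regression baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the KMA's operational MOS and not the authors' GP method: the predictor columns (UM wind speed forecast, temperature), the toy numbers, and the function names `fit_linear` and `predict` are all assumptions made for the example.

```python
def fit_linear(X, y):
    """Least-squares coefficients [b0, b1, ...] from the normal equations (X'X) b = X'y.

    Assumes the predictors are not collinear (no singular-matrix handling).
    """
    rows = [[1.0] + list(r) for r in X]          # prepend intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


def predict(coef, x):
    """Corrected value for one predictor vector x."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


# Hypothetical training data: (UM wind speed forecast, temperature) -> observed wind speed
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
y = [2.0, 4.5, 4.5, 7.5]
coef = fit_linear(X, y)
corrected = predict(coef, [2.5, 2.0])
```

A GP-based approach would replace the fixed linear form with an evolved nonlinear expression over the same predictors.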


Analysis of AI interview data using unified non-crossing multiple quantile regression tree model (통합 비교차 다중 분위수회귀나무 모형을 활용한 AI 면접체계 자료 분석)

  • Kim, Jaeoh;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.753-762 / 2020
  • With growing interest in integrating artificial intelligence (AI) into interview processes, the Republic of Korea (ROK) Army is seeking to take the lead in adopting and analyzing an AI-powered interview platform. This study analyzes the AI interview data using a unified non-crossing multiple quantile regression tree (UNQRT) model. Existing models, such as quantile regression and the quantile regression tree (QRT) model, are inadequate for analyzing AI interview data. Specifically, the linearity assumption of quantile regression is overly strong for this application. While the QRT model appears applicable because it relaxes the linearity assumption, it suffers from crossing among the estimated quantile functions, which leads to an uninterpretable model. The UNQRT circumvents the crossing problem by simultaneously estimating multiple quantile functions under a non-crossing constraint, and it is robust at extreme quantiles. Furthermore, the single tree constructed by the UNQRT yields a more interpretable model than the QRT model. In this study, using the UNQRT, we explored the relationship between the results of the Army AI interview system and existing personnel data to derive meaningful results.
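
The crossing problem described above can be illustrated with a short sketch. It shows only the standard check (pinball) loss used in quantile regression and a simple test for crossing between separately estimated quantile predictions; it is not the UNQRT algorithm, and the function names and toy values are assumptions.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean check (pinball) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau)
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def has_crossing(quantile_preds):
    """quantile_preds maps tau -> list of predictions (one per observation).

    Returns True if, for any observation, a higher quantile is predicted
    below a lower quantile.
    """
    taus = sorted(quantile_preds)
    for lo, hi in zip(taus, taus[1:]):
        if any(h < l for l, h in zip(quantile_preds[lo], quantile_preds[hi])):
            return True
    return False


# Hypothetical predictions for two observations at two quantile levels
crossed = has_crossing({0.1: [1.0, 2.0], 0.9: [2.0, 1.5]})  # second observation crosses
```

A non-crossing method estimates all quantile levels jointly under a constraint that rules out the situation `has_crossing` detects.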

Estimating soil moisture using machine learning approach: A Case Study to Yongdam watershed (기계학습 기반의 토양함수 예측 기법 개발 (용담댐 시험유역을 중심으로))

  • Huy, Nguyen Dinh;Kwon, Hyun-Han
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.167-167 / 2018
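  • Soil moisture represents the average amount of water contained in the soil and is one of the most important hydrological variables from the perspective of the hydrological cycle. This study aims to develop a soil moisture prediction technique using the Support Vector Machine (SVM), a representative machine learning method, with remote-sensing-based soil moisture data, precipitation, temperature, and similar variables as predictors. SVM is a machine learning method that uses a kernel function to treat complex nonlinear relationships under a linear assumption, and as a global model it has been applied in various hydrometeorological fields. An advantage of SVM is that, by tolerating a certain amount of error, it generalizes better than conventional artificial neural networks (ANN), and it is highly applicable as a prediction model. The purpose of this study is to fit the model using past soil moisture data together with precipitation, temperature, and satellite-observation-based information, and to extend it to ungauged basins. The proposed model is applied to the Yongdam Dam experimental watershed, and its suitability is to be evaluated by comparison with an existing ANN model and multiple regression analysis results.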
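
The two SVM ingredients named in this abstract, a kernel function and a tolerance for small errors, can be sketched as below. The abstract does not say which kernel or tolerance the authors used; the RBF kernel, the `gamma` and `epsilon` values, and the function names here are assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Radial basis function kernel exp(-gamma * ||x - z||^2).

    Measures similarity in an implicit feature space where a linear
    model can represent a nonlinear relationship in the inputs.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss used in support vector regression.

    Errors smaller than epsilon cost nothing; larger errors are
    penalized linearly by the amount they exceed epsilon.
    """
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical soil moisture values (volumetric fraction)
small_error = epsilon_insensitive_loss(0.30, 0.35, epsilon=0.1)  # inside the tolerance band
large_error = epsilon_insensitive_loss(0.30, 0.80, epsilon=0.1)  # outside the band
```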


Comparative Analysis on the Characteristics and Models of Traffic Accidents by Day and Nighttime in the Case of Cheongju 4-legged Signalized Intersections (주·야간 교통사고의 특성 및 사고모형 비교분석 -청주시 4지 신호교차로를 중심으로 -)

  • Yoo, Doo Seon;Oh, Sang Jin;Kim, Tae Young;Park, Byung Ho
    • KSCE Journal of Civil and Environmental Engineering Research / v.28 no.2D / pp.181-189 / 2008
  • The purpose of this study is to comparatively analyze the characteristics and models of traffic accidents by daytime and nighttime. To that end, the study pays particular attention to testing the differences and developing models (multiple linear, multiple non-linear, Poisson, and negative binomial regression) using data from 4-legged signalized intersections in Cheongju. The main results are as follows. First, the differences between daytime and nighttime accidents were identified. Second, 12 accident models, all statistically significant, were developed. Finally, the differences between the daytime and nighttime models were comparatively analyzed using their common and specific variables.

Non-linear regression model considering all association thresholds for decision of association rule numbers (기본적인 연관평가기준 전부를 고려한 비선형 회귀모형에 의한 연관성 규칙 수의 결정)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.2 / pp.267-275 / 2013
  • Among data mining techniques, the association rule is the most recently developed, and it finds the relevance between two items in a large database. It is applied directly in the field because it clearly quantifies the relationship between two or more items. To determine whether an association rule is meaningful, we use interestingness measures such as support, confidence, and lift. These measures are useful in that they give statistical or logical grounds for pruning uninteresting rules. However, the thresholds for these measures are chosen from experience, and the number of useful rules is hard to estimate. If too many rules are generated, the useful ones cannot be extracted effectively. In this paper, we designed a variety of non-linear regression equations, considering all association thresholds, that relate the number of rules to the three interestingness measures. We then diagnosed multicollinearity and autocorrelation problems, and used analysis of variance results and adjusted coefficients of determination to select the best model through numerical experiments.
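
The three interestingness measures, and the way the rule count depends on their thresholds, can be sketched as follows. The transactions, the restriction to single-item rules, and the function names are assumptions for illustration; the paper's regression equations themselves are not reproduced here.

```python
# Toy transactions (hypothetical data, for illustration only)
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's support; 1.0 means independence."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


def count_rules(transactions, min_support, min_confidence, min_lift):
    """Count single-item rules A -> B that pass all three thresholds."""
    items = sorted(set().union(*transactions))
    count = 0
    for a in items:
        for b in items:
            if a == b:
                continue
            if (support({a, b}, transactions) >= min_support
                    and confidence({a}, {b}, transactions) >= min_confidence
                    and lift({a}, {b}, transactions) >= min_lift):
                count += 1
    return count
```

Evaluating `count_rules` over a grid of thresholds would give the kind of (thresholds, number of rules) data to which a regression equation could be fitted.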

N-supplying Capability Evaluation of Corn Field Soils in Pennsylvania (Pennsylvania주 옥수수 재배 토양의 질소공급능력 평가)

  • Hong, Soon-Dal
    • Korean Journal of Soil Science and Fertilizer / v.31 no.4 / pp.359-367 / 1998
  • To determine the nitrogen supplying capabilities (NSC) of corn fields, 47 field experiments were performed in Pennsylvania over 3 years from 1986, and NSCs were estimated by regression analysis with chemical properties and soil attributes. Although the content of $NO_3-N$ in soil showed the best correlation with NSC ($R^2=0.518$), the standardized partial regression coefficient of $NO_3-N$ for NSC was 0.52, with some variation over the years. This value was slightly higher than those of the other properties, which ranged from 0.001 to 0.351. Multiple linear regression with soil attributes evaluated NSC better than simple regression with $NO_3-N$. The coefficient of determination ($R^2$) for the evaluation of NSC increased gradually: 0.599 with selected chemical properties, 0.698 with quantitative attributes (chemical properties and depth of Ap horizon), and 0.839 with quantitative and selected qualitative soil attributes. Consequently, for evaluating NSC, multiple linear regression with soil attributes was a more reliable and better model than simple regression.


Multiple linear regression model-based voltage imbalance estimation for high-power series battery pack (다중선형회귀모델 기반 고출력 직렬 배터리 팩의 전압 불균형 추정)

  • Kim, Seung-Woo;Lee, Pyeong-Yeon;Han, Dong-Ho;Kim, Jong-hoon
    • Journal of IKEEE / v.23 no.1 / pp.1-8 / 2019
  • In this paper, the electrical characteristics of a high-power series battery pack composed of 18650 cylindrical nickel cobalt aluminum (NCA) lithium-ion cells are tested at various C-rates. The electrical characteristics from a discharge capacity test with a 14S1P battery pack and an electric vehicle (EV) cycle test with a 4S1P battery pack are compared and analyzed across the C-rates. Multiple linear regression is used to estimate the voltage imbalance of the 14S1P and 4S1P battery packs at various C-rates based on the experimental data. The estimation accuracy is evaluated by root mean square error (RMSE) to validate the multiple linear regression. The results indicate that multiple linear regression estimates the voltage imbalance better for the discharge capacity test with the 14S1P battery pack than for the EV cycle test with the 4S1P battery pack.

Morphometric Characteristics and Correlation Analysis with Rainfall-runoff in the Han River Basin (한강 유역의 형태학적 특성과 강우-유출의 상관분석)

  • Lee, Ji Haeng;Lee, Woong Hee;Choi, Heung Sik
    • KSCE Journal of Civil and Environmental Engineering Research / v.38 no.2 / pp.237-247 / 2018
  • Basin characteristics, which reflect the geomorphological pattern of the basin and its stream networks, affect rainfall-runoff. To analyze the relationship between basin runoff and stream morphometric characteristics, the morphometric characteristics were investigated for 27 water-level observation stations on 19 rivers in the Han River basin using ArcMap. The morphometric characteristics were divided into linear, areal, and relief aspects for calculation, while the annual mean runoff ratio, as the basin response to rainfall, was estimated from the measured precipitation and discharge to analyze the rainfall-runoff characteristics. The relationships among the morphometric parameters were schematized to analyze their correlations. A multiple regression equation for the rainfall-runoff ratio was derived from the morphometric parameters of stream length ratio, form factor ratio, shape factor, stream area ratio, and relief ratio, with a coefficient of determination of 0.691. The RMSE and MAPE between the measured and estimated annual runoff ratios were 0.09 and 11.61%, respectively, indicating that the suggested regression equation gives good estimates.
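
The two error measures reported above can be computed as in the following sketch. The runoff ratio values are hypothetical and are not the paper's data; the function names are chosen for the example.

```python
import math


def rmse(observed, estimated):
    """Root mean square error, in the same units as the data."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def mape(observed, estimated):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero.
    """
    n = len(observed)
    return 100.0 * sum(abs((o - e) / o) for o, e in zip(observed, estimated)) / n


# Hypothetical annual mean runoff ratios (measured vs. regression estimate)
measured = [0.5, 0.8, 0.4]
estimated = [0.6, 0.7, 0.4]
error_rmse = rmse(measured, estimated)
error_mape = mape(measured, estimated)
```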

Analysis of Factors Affecting Travel Time Change Using the Time Use Survey Data in Seoul (서울시 통행시간 변화의 요인분석: 생활시간조사자료를 중심으로)

  • Koo, Ja hun;Choo, Sangho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.17 no.1 / pp.1-16 / 2018
  • Changes in lifestyle may alter trip purposes, ultimately changing travel behavior. This study therefore analyzed the factors affecting travel time change using the time use survey data for Seoul collected by Statistics Korea from 1999 to 2014. We developed multiple linear regression models for travel time, with individual, household, and time-related variables as independent variables. The models were estimated separately for weekdays and weekends. The model results show that the household, individual, and time-related variables have a significant effect on travel time. In addition, travel time is influenced more by individual characteristics than by household ones. Each activity time positively affects travel time, indicating that travel is a derived demand. The variable with the greatest influence on travel time is the activity time for leisure.

Estimation of surface nitrogen dioxide mixing ratio in Seoul using the OMI satellite data (OMI 위성자료를 활용한 서울 지표 이산화질소 혼합비 추정 연구)

  • Kim, Daewon;Hong, Hyunkee;Choi, Wonei;Park, Junsung;Yang, Jiwon;Ryu, Jaeyong;Lee, Hanlim
    • Korean Journal of Remote Sensing / v.33 no.2 / pp.135-147 / 2017
  • For the first time, we estimated daily and monthly surface nitrogen dioxide ($NO_2$) volume mixing ratio (VMR) in Seoul, South Korea, at the OMI overpass time (13:45 local time) using three regression models with $NO_2$ tropospheric vertical column density (OMI-Trop $NO_2$ VCD) data obtained from the Ozone Monitoring Instrument (OMI). The first linear regression model (M1) is a linear regression equation between OMI-Trop $NO_2$ VCD and in situ $NO_2$ VMR, whereas the second linear regression model (M2) incorporates boundary layer height (BLH), temperature, and pressure obtained from the Atmospheric Infrared Sounder (AIRS) together with OMI-Trop $NO_2$ VCD. The last models (M3M and M3D) are multiple linear regression equations that include OMI-Trop $NO_2$ VCD, BLH, and various meteorological data. In this study, we determined the three types of regression models for the training period between 2009 and 2011, and evaluated their performance by comparison with the surface $NO_2$ VMR data obtained from in situ measurements (in situ $NO_2$ VMR) in 2012. The monthly mean surface $NO_2$ VMRs estimated by M3M showed good agreement with the in situ measurements (avg. R = 0.77). For the daily (13:45 LT) $NO_2$ estimation, the highest correlations were found between the daily surface $NO_2$ VMRs estimated by M3D and the in situ $NO_2$ VMRs (avg. R = 0.55). The surface $NO_2$ VMRs estimated by the three models tend to be underestimated. We also discussed the performance of these empirical models for surface $NO_2$ VMR estimation with respect to other statistical measures such as root mean square error (RMSE), mean bias, mean absolute error (MAE), and percent difference. This study shows the possibility of estimating surface $NO_2$ VMR using satellite measurements.