• Title/Summary/Keyword: multiple linear regression model (다중선형회귀모델)


A Multivariate Analysis of Korean Professional Players Salary (한국 프로스포츠 선수들의 연봉에 대한 다변량적 분석)

  • Song, Jong-Woo
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.441-453 / 2008
  • We analyzed the salaries of Korean professional basketball and baseball players under the assumption that salary depends on a player's personal records and contribution to the team in the previous year. We made extensive use of data visualization tools to examine the relationships among the variables, to find outliers, and to perform model diagnostics. We fitted multiple linear regression and regression tree models and used cross-validation to find an optimal model. We checked the relationships between variables carefully and chose a subset of variables for stepwise regression instead of using all of them. For basketball players, points per game, number of assists, number of successful free throws, and career length were important variables. For baseball pitchers, the important variables were career length, strikeouts per 9 innings, ERA, and number of home runs; for baseball hitters, they were career length, number of hits, and FA.

Relationship Between Physical Properties and Compression Index for Marine Clay (해성점토의 물리적 특성과 압축지수의 상관성)

  • 김동후;김기웅;백영식
    • Journal of the Korean Geotechnical Society / v.19 no.6 / pp.371-378 / 2003
  • This study examines the compression index of clay distributed along the west and south coasts of the Korean Peninsula. The compression index was obtained from conventional consolidation tests, and Schmertmann's graphical correction was applied to obtain the field virgin compression curve. To examine the correlation between the physical properties of the soils ($e_o$, LL, w) and the compression index (Cc), linear and non-linear regression analyses were carried out on the test data. The conclusions are as follows. The compression index obtained by Schmertmann's graphical correction is about 1.16 times the value from the original oedometer test curve for U/D samples. A non-linear regression curve was preferable to a linear one for establishing a correlation equation. All equations derived so far are summarized and presented. However, because linear equations are more convenient in practice, simplified piecewise linear equations are also suggested alongside the corresponding non-linear regression curves.

Predicting Harvest Date of 'Niitaka' Pear by Using Full Bloom Date and Growing Season Weather (배 '신고'의 만개일 및 생육기 기상을 이용한 수확일 예측)

  • Han, Jeom-Hwa;Son, In-Chang;Choi, In-Myeong;Kim, Seung-Heui;Cho, Jung-Gun;Yun, Seok-Kyu;Kim, Ho-Cheol;Kim, Tae-Choon
    • Horticultural Science & Technology / v.29 no.6 / pp.549-554 / 2011
  • This study examined the effect of full bloom date and growing season weather on the harvest date of 'Niitaka' pear (Pyrus pyrifolia) in Naju province, and developed a multiple linear regression model for predicting fruit growing days. In years with an earlier full bloom date, the harvest date tended to be earlier but the number of fruit growing days tended to be larger. The mean and coefficient of variation of fruit growing degree days (GDD), accumulated from full bloom date to harvest date at a base of $0^{\circ}C$, were 3,565 and 2.9% for daily mean temperature and 4,463 and 2.5% for daily maximum temperature, respectively. Fruit growing days were not correlated with the GDD accumulated from daily mean and maximum temperature (base $0^{\circ}C$) within each month, but were highly correlated with the GDD accumulated from daily meteorological factors over the days after full bloom. In particular, they were highly negatively correlated with the GDD accumulated from daily mean and maximum temperature (base $0^{\circ}C$) from the $1^{st}$ to the $60^{th}$ day after full bloom. The coefficient of determination ($r^2$) of the multiple linear regression model that predicts fruit growing days from full bloom date and the GDD accumulated from daily mean and maximum temperature over the $1^{st}$ to $60^{th}$ day after full bloom was 0.7212. The model therefore explains about 72% of the variation in fruit growing days of 'Niitaka' pear in Naju province.
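
The abstract does not spell out how GDD was accumulated. A minimal sketch, assuming the common definition in which each day contributes its temperature above the base (and nothing when it is at or below the base), might look like the following. The function name and the sample temperatures are hypothetical and are not taken from the paper.

```python
def growing_degree_days(daily_temps_c, base_c=0.0):
    """Accumulate growing degree days over a sequence of daily temperatures.

    Assumed definition: each day adds max(T - base, 0), so days at or
    below the base temperature contribute nothing.
    """
    return sum(max(t - base_c, 0.0) for t in daily_temps_c)


# Hypothetical daily mean temperatures (deg C) for five days after full bloom.
sample_mean_temps = [12.5, 14.0, -1.0, 9.5, 16.0]

gdd_base_0 = growing_degree_days(sample_mean_temps)          # base 0 deg C, as in the abstract
gdd_base_10 = growing_degree_days(sample_mean_temps, 10.0)   # a different base, for comparison
print(gdd_base_0, gdd_base_10)
```

Passing the daily maximum temperatures instead of the daily means would give the second GDD series that the abstract reports.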

Prediction Techniques for Difficulty Level of Hanja Using Multiple Linear Regression (다중 회귀 분석을 이용한 한자 난이도 예측 기법 연구)

  • Choi, Jeongwhan;Noh, Jiwoo;Kim, Suntae
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.6 / pp.219-225 / 2019
  • Existing methods for assigning difficulty levels to Hanja characters have a problem: some of the characters they select differ from the Sino-Korean words used in real life, and they give no indication of how often each character is used. To solve this problem, we measure the difficulty of Hanja characters by multiple regression analysis with frequencies as features. Two frequencies, FWS and FHU, are counted from elementary school textbooks. A questionnaire built from these two frequencies together with stroke count asks respondents for the appropriate time to learn each Hanja character, and the answers are used as the target variable for regression. Stepwise regression is used to select appropriate features, and multiple linear regression is then performed. The $R^2$ score of the model was 0.1105 and the RMSE was 0.1105.
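
To make the reported metrics concrete, here is a minimal sketch of a multiple linear regression fit by ordinary least squares, followed by the $R^2$ and RMSE calculations. It assumes the standard definitions of both metrics; the paper's own features, data, and tooling are not given in the abstract, so the data below are made up and every function name is hypothetical. Stepwise feature selection is not shown.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(features, targets):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    x = [[1.0] + list(row) for row in features]  # prepend an intercept column
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(x, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(beta, features):
    """Apply the fitted intercept and coefficients to each feature row."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], row)) for row in features]


def r2_and_rmse(targets, predictions):
    """Return (R^2, RMSE) using the standard definitions."""
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - p) ** 2 for y, p in zip(targets, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(targets))


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2 (not from the paper).
features = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 3]]
targets = [1 + 2 * a + 3 * b for a, b in features]

beta = fit_mlr(features, targets)
r2, rmse = r2_and_rmse(targets, predict(beta, features))
print(beta, r2, rmse)
```

Because the made-up targets are an exact linear function of the features, the fit recovers the generating coefficients up to rounding. Real questionnaire data would leave residual error, giving an $R^2$ below 1 and a non-zero RMSE.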

Development of Demand Forecasting Model for Public Bicycles in Seoul Using GRU (GRU 기법을 활용한 서울시 공공자전거 수요예측 모델 개발)

  • Lee, Seung-Woon;Kwahk, Kee-Young
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.1-25 / 2022
  • After the first confirmed Covid-19 case in Korea in January 2020, interest grew in personal transportation such as public bicycles rather than public transportation such as buses and subways. Demand for 'Ddareungi', the public bicycle service operated by the Seoul Metropolitan Government, also increased. In this study, a demand prediction model based on a GRU (Gated Recurrent Unit) is presented, using the rental history of public bicycles in Seoul by time slot (2019-2021). The usefulness of the GRU method was verified on the rental history around Exit 1 of Yeouido, Yeongdeungpo-gu, Seoul. In particular, the GRU model was compared with multiple linear regression and recurrent neural network models under the same conditions. In addition to weather factors, the Seoul living population was used as a variable in model development and verified. MAE and RMSE were used as performance indicators. The proposed GRU model showed higher prediction accuracy than the traditional multiple linear regression model and than the LSTM and Conv-LSTM models, which have recently been in the spotlight. The GRU model was also faster than the LSTM and Conv-LSTM models. By predicting the demand for public bicycles in Seoul more quickly and accurately, this study can help solve the bicycle relocation problem in the future.

Prediction of Equivalent Stress Block Parameters for High Strength Concrete (고강도 콘크리트의 등가응력 매개변수 추정에 관한 연구)

  • Lee, Do Hyung;Jeon, Jeongmoon;Jeong, Minchul;Kong, Jungsik
    • KSCE Journal of Civil and Environmental Engineering Research / v.31 no.3A / pp.227-234 / 2011
  • Recently, high strength concrete of more than 40 MPa has been increasingly used in practice. However, the use of high strength concrete may influence design parameters, particularly the stress distribution, because current practice employs an equivalent rectangular stress distribution derived from normal strength concrete. The stress distribution therefore needs to be reevaluated, and a new distribution with new parameters needs to be suggested for high strength concrete. For this purpose, linear and multiple regression analyses were carried out using experimental data available in the literature for high strength concrete of 40 to 80 MPa. New parameters for the stress distribution were then proposed and applied to the design of flexural and compressive members. Comparative design examples indicate that designs with the new parameters reduce section dimensions compared to those with the current code parameters for concrete strengths of 40 to 70 MPa. In particular, for compressive members, designs with the new parameters give a conservative compressive force compared to those with the current code parameters.

Self-Organizing Fuzzy Modeling using Creation of Clusters (클러스터 생성을 이용한 자기구성 퍼지 모델링)

  • 고택범
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.05a / pp.245-251 / 2002
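  • This paper proposes a self-organizing fuzzy modeling method that applies multiple regression analysis to groups of input-output data with relatively large fuzzy entropy to create multidimensional planar clusters, adds each cluster as a new rule of the fuzzy model, and then performs coarse tuning and fine tuning of the fuzzy model parameters. The parameters are coarsely tuned by the weighted recursive least squares algorithm and fuzzy C-regression model clustering, and then finely tuned by the gradient descent algorithm while a meiosis genetic algorithm searches for the optimal learning rate. The self-organizing fuzzy modeling method is applied to the Box-Jenkins gas furnace data, data from a multivariable nonlinear static function, and an activated sludge process in sewage treatment, and its performance is demonstrated by comparison with modeling results from existing methods.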


Augmented Multiple Regression Algorithm for Accurate Estimation of Localized Solar Irradiance (국지적 일사량 산출 정확도 향상을 위한 다중회귀 증강 알고리즘)

  • Choi, Ji Nyeong;Lee, Sanghee;Ahn, Ki-Beom;Kim, Sug-Whan;Kim, Jinho
    • Korean Journal of Remote Sensing / v.36 no.6_1 / pp.1435-1447 / 2020
  • The seasonal variations in weather parameters can significantly affect the atmospheric transmission characteristics. Herein, we propose a novel augmented multiple regression algorithm for the accurate estimation of atmospheric transmittance and solar irradiance over highly localized areas. The algorithm employs 1) adaptive atmospheric model selection using measured meteorological data and 2) multiple linear regression computation augmented with the conventional application of MODerate resolution atmospheric TRANsmission (MODTRAN). In this study, the proposed algorithm was employed to estimate the solar irradiance over Taean coastal area using the 2018 clear days' meteorological data of the area, and the results were compared with the measurement data. The difference between the measured and computed solar irradiance significantly improved from 89.27 ± 48.08σ W/㎡ (with standard MODTRAN) to 21.35 ± 16.54σ W/㎡ (with augmented multiple regression algorithm). The novel method proposed herein can be a useful tool for the accurate estimation of solar irradiance and atmospheric transmission characteristics of highly localized areas with various weather conditions; it can also be used to correct remotely sensed atmospheric data of such areas.

Prediction Model of Energy Consumption of Wired Access Networks using Machine Learning (기계학습을 이용한 유선 액세스 네트워크의 에너지 소모량 예측 모델)

  • Suh, Yu-Hwa;Kim, Eun-Hoe
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.1 / pp.14-21 / 2021
  • Green networking, which adds energy management mechanisms to wired data networks, has become an issue for reducing energy waste and CO2 emissions. Excluding end devices, the energy consumption of wired data networks overall is driven by access networks. However, on a global scale it is difficult to manage energy centrally and to measure and model the real energy use and energy saving potential of access networks. This paper presents a multiple linear regression model that predicts the energy consumption of wired access networks using supervised machine learning, with data collected from existing studies, actual measurements, and the results of many models. In addition, the model's performance was optimized through various experiments and the model was used to predict the energy consumption of wired access networks. The regression model was evaluated with well-known evaluation metrics.

Prediction of Final Construction Cost and Duration by Forecasting the Slopes of Cost and Time for Each Stage (공사 진행단계별 기울기 추정을 통한 최종 공사비 및 공기 예측)

  • Jin, Eui-Jae;Kwak, Soo-Nam;Kim, Du-Yon;Kim, Hyoung-Kwan;Han, Seung-Heon
    • Proceedings of the Korean Institute Of Construction Engineering and Management / 2006.11a / pp.137-142 / 2006
  • Cost and duration are important factors that directly affect profit and must therefore be forecast correctly for a project to succeed. Construction companies use EVMS (Earned Value Management System) to forecast final cost and duration. However, previous forecasting models have low accuracy because they forecast linearly and cannot reflect the characteristics of the company and project or the changes at each stage of progress. This paper presents a cost and duration forecasting model that predicts the slopes of cost and duration at each stage of progress in order to reflect the various characteristics of the construction industry. EVMS data from 23 road construction projects were used to build the regression equations of the slope forecasting model.
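
The abstract does not say which linear forecast the earlier models use. As an illustration only, the sketch below assumes the conventional earned value formulas: CPI = EV / AC, SPI = EV / PV, estimate at completion = BAC / CPI, and forecast duration = planned duration / SPI. The function name and all input figures are hypothetical. It is not the slope-based model the paper proposes.

```python
def linear_evm_forecast(bac, planned_duration, pv, ev, ac):
    """Conventional earned value forecast (assumed baseline, not the paper's model).

    bac: budget at completion
    planned_duration: planned total duration (any time unit)
    pv, ev, ac: planned value, earned value, and actual cost to date
    """
    cpi = ev / ac  # cost performance index
    spi = ev / pv  # schedule performance index
    return {
        "cpi": cpi,
        "spi": spi,
        "estimate_at_completion": bac / cpi,
        "forecast_duration": planned_duration / spi,
    }


# Hypothetical project status: budget 1000, planned for 20 months,
# with 400 planned, 360 earned, and 450 spent so far.
forecast = linear_evm_forecast(bac=1000.0, planned_duration=20.0,
                               pv=400.0, ev=360.0, ac=450.0)
print(forecast)
```

This baseline extrapolates the performance to date at a constant rate through the end of the project, which is the kind of linear assumption the abstract says a stage-by-stage slope forecast is meant to relax.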
