• Title/Summary/Keyword: correlation regression

Canonical Correlation: Permutation Tests and Regression

  • Yoo, Jae-Keun;Kim, Hee-Youn;Um, Hye-Yeon
    • Communications for Statistical Applications and Methods, Vol. 19, No. 3, pp. 471-478, 2012
  • In this paper, we present a permutation test to select the number of pairs of canonical variates in canonical correlation analysis. The existing chi-squared test is known to be valid only under normality. We compare the existing test with the proposed permutation test and study their asymptotic behaviors through numerical studies. In addition, we connect canonical correlation analysis to regression and show that certain inferences in regression can be made through canonical correlation analysis. A regression analysis of real data through canonical correlation analysis is illustrated.
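
The idea of a permutation test can be sketched for the simplest special case, where each variable set contains a single variable and the first canonical correlation reduces to the absolute Pearson correlation. This is only an illustration under that assumption, not the procedure proposed in the paper; the function names `pearson_r` and `permutation_pvalue` and the sample values are hypothetical.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Permutation p-value for H0: no association between x and y.

    The pairing is broken by shuffling y; no normality assumption is needed.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson_r(x, y_perm)) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)

# Made-up illustrative data with a strong linear association.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.1]
p_value = permutation_pvalue(x, y)
```

With several variables per set, the same shuffling scheme would be applied to a statistic built from the canonical correlations rather than to a single Pearson coefficient.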

Correlation and Simple Linear Regression

  • 박선일;오태호
    • 한국임상수의학회지, Vol. 27, No. 4, pp. 427-434, 2010
  • Correlation is a technique used to measure the strength, or degree of closeness, of the linear association between two quantitative variables. Common misuses of this technique are highlighted. Linear regression is a technique used to express the relationship between two continuous variables as a mathematical equation, which can be used for comparison or estimation purposes. Specifically, regression analysis can answer questions such as how much one variable changes for a given change in the other, and how accurately the value of one variable can be predicted from knowledge of the other. Regression does not give any indication of how good the association is, while correlation provides a measure of how well a least-squares regression line fits the given set of data. The better the correlation, the closer the data points are to the regression line. In this tutorial article, the process of obtaining a linear regression relationship for a given set of bivariate data is described. The least-squares method, which finds the line that minimizes the total squared error between the data points and the regression line, is employed and illustrated. The coefficient of determination, the ratio of the explained variation in the values of the dependent variable to the total variation, is described. Finally, the process of calculating confidence and prediction intervals is reviewed and demonstrated.
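
The quantities named in this abstract can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formulas, not the article's own worked example; the function names and data are hypothetical, and the critical value `t_crit` is supplied by the caller (2.776 is the two-sided 95% t value for 4 degrees of freedom).

```python
from math import sqrt

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(x, y, slope, intercept):
    """Coefficient of determination: explained / total variation in y."""
    my = sum(y) / len(y)
    ss_total = sum((b - my) ** 2 for b in y)
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1.0 - ss_resid / ss_total

def interval_half_widths(x, y, slope, intercept, x0, t_crit):
    """Half-widths of the confidence interval (mean response) and the
    prediction interval (a single new observation) at x0."""
    n = len(x)
    mx = sum(x) / n
    sxx = sum((a - mx) ** 2 for a in x)
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s = sqrt(ss_resid / (n - 2))          # residual standard error
    leverage = 1.0 / n + (x0 - mx) ** 2 / sxx
    return t_crit * s * sqrt(leverage), t_crit * s * sqrt(1.0 + leverage)

# Made-up illustrative data: n = 6, so n - 2 = 4 degrees of freedom.
x = [1, 2, 3, 4, 5, 6]
y = [2.0, 4.1, 5.9, 8.2, 9.9, 12.1]
slope, intercept = fit_line(x, y)
r2 = r_squared(x, y, slope, intercept)
ci_half, pi_half = interval_half_widths(x, y, slope, intercept, 3.5, 2.776)
```

The prediction interval is always wider than the confidence interval at the same point because it adds the variance of an individual observation to the uncertainty of the fitted mean.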

Correlations and Regression Analysis Between Reservoir Water Quality Parameters

  • 최은희;박영석
    • 한국관개배수논문집, Vol. 18, No. 1, pp. 25-32, 2011
  • To manage a reservoir effectively, water quality management should be based on the reservoir's physicochemical and configurational characteristics. In this research, the correlations between factors affecting reservoir water quality were examined. Chl-a and COD show the highest positive correlation. Chl-a and T-P also have a high positive correlation, whereas Chl-a and T-N show a relatively low correlation. Even though T-N is an important factor for phytoplankton growth, which increases the Chl-a concentration, the low correlation between Chl-a and T-N indicates that nitrogen is abundant enough in the reservoirs that it is no longer a limiting factor. Reservoir age can contribute to increases in COD and SS. Embankment height and reservoir elevation show strong negative correlations with the water quality parameters; that is, reservoirs with higher embankments and located at higher elevations are less contaminated. A regression equation was derived relating Chl-a to the water quality parameters and reservoir height. Finally, Chl-a was simulated using the regression equation, which proved to be a good approach for predicting the Chl-a concentration.

Predictive Relationships of the Nonpoint Source Pollutant Loads with Stormwater Runoff Volumes based on the Various Regression Analyses

  • 신지웅;길경익
    • 한국물환경학회지, Vol. 27, No. 3, pp. 257-263, 2011
  • This study analyzes the correlations between non-point source pollutants and runoff in order to estimate non-point source loads for effective management. From the monitoring results, correlation coefficients among pollutant mass loading, event mean concentration (EMC), total runoff volume, and average flow are calculated. Using these coefficients, the two most closely related constituents are identified, and the most appropriate regression between them is determined. Pollutant mass loading and total runoff volume have the highest correlation, and compound regression is found to be the most appropriate regression. This shows that pollutant mass loading increases as total runoff volume increases; the increase is not continuous but follows a certain pattern.
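
As a rough sketch of what such a fit might look like, the code below assumes that "compound regression" refers to the curve-estimation model y = a · b^x, which becomes linear after taking logarithms: ln y = ln a + x · ln b. The abstract does not give the model form, so this is an assumption; the function name `fit_compound` and the numbers are hypothetical.

```python
from math import exp, log

def fit_compound(x, y):
    """Fit y = a * b**x by least squares on ln(y); requires all y > 0."""
    logs = [log(v) for v in y]
    n = len(x)
    mx, ml = sum(x) / n, sum(logs) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxl = sum((u - mx) * (v - ml) for u, v in zip(x, logs))
    slope = sxl / sxx                 # estimate of ln(b)
    a = exp(ml - slope * mx)          # back-transform the intercept
    b = exp(slope)
    return a, b

# Made-up illustrative values: runoff volume (x) and pollutant load (y).
runoff_volume = [0, 1, 2, 3]
pollutant_load = [3.0, 6.0, 12.0, 24.0]
a, b = fit_compound(runoff_volume, pollutant_load)
predicted_load = a * b ** 4           # extrapolated load at x = 4
```

Because the fit minimizes squared error on the log scale, it weights relative rather than absolute deviations, which is worth keeping in mind when loads span several orders of magnitude.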

Forecasting Energy Consumption of Steel Industry Using Regression Model

  • Sung-Ho KANG;Hyun-Ki KIM
    • Journal of Korea Artificial Intelligence Association, Vol. 1, No. 2, pp. 21-25, 2023
  • The purpose of this study was to compare the performance of multiple regression models in predicting the energy consumption of the steel industry. Independent variables were selected in consideration of the correlations among attributes such as CO2 concentration, NSM, Week Status, Day of week, and Load Type, and preprocessing was performed to address multicollinearity. In data preprocessing, we evaluated linear and nonlinear relationships between attributes through correlation analysis. In particular, we selected variables with high correlation and included only appropriate variables in the final model to prevent multicollinearity problems. Among the regression models trained, Boosted Decision Tree Regression showed the best predictive performance. By combining multiple decision trees, the ensemble learning in this model was able to learn complex patterns effectively while preventing overfitting. Consequently, these predictive models are expected to provide important information for improving energy efficiency and management decision-making in the steel industry. In the future, we plan to improve model performance by collecting more data and extending the set of variables, and we will also consider applying the model with interactions with external factors taken into account.

Analysis of the Correlation and Regression Analysis Studies from the Korean Journal of Women Health Nursing over the Past Three Years (2007~2009)

  • 이은주;이은희;김증임;강희선;오현이;전은미;천숙희
    • 여성건강간호학회지, Vol. 17, No. 2, pp. 187-194, 2011
  • Purpose: This study investigated the statistical methods and the reporting of results in studies using correlation/regression analysis published in the Korean Journal of Women Health Nursing (KJWHN). Methods: We reviewed 45 studies using correlation/regression analysis, published in the KJWHN from vol 13 (1) in 2007 to vol 15 (4) in 2009, for the fit between statistical methods and research purposes and against criteria for the analysis of figures, tables, and charts. Results: In forty-three studies, the statistical methodology fitted the research purpose. Eleven studies considered the minimum sample size. Fourteen regression studies used multiple regression, and 12 studies used the forward method for variable entry. Only one of the 17 regression studies examined scatter plots and residuals. Sixteen correlation studies and six regression studies showed errors in the title, variables, or category of figures, tables, and charts. All regression studies except one reported $R^2$ and $\beta$ values. Conclusion: Statistical errors and errors of expression were still found in the statistical analyses. Reviewers need to check more closely for errors, not only during manuscript review but also in published issues, to maintain the quality of this academic journal.

Regression Analysis Between Specific Sediments of Reservoirs and Physiographic Factors of Watersheds

  • 서승덕;박흥익;천만복;윤경덕
    • 한국농공학회지, Vol. 30, No. 4, pp. 45-61, 1988
  • The purpose of this study is to develop regression equations between the annual specific sediment of reservoirs and the physiographic factors of watersheds. The analysis uses 122 irrigation reservoirs with irrigation areas of 200 ha or more, located in Korea excluding Cheju province. Simple regression analyses between the annual specific sediment and each of the physical characteristic factors of the reservoirs are carried out first. Then, multiple regression analyses are made between the annual specific sediment and the physical characteristic factors that had high correlation coefficients in the simple regression analyses. The results obtained from this study are as follows: 1. The simple regression analyses show that in each province the watershed area, the length of the mainstream, and the circumferential length of the watershed have high correlation coefficients (R=0.814-0.986), and that drainage density, reservoir capacity per watershed area, drainage frequency, and basin relief have low correlation coefficients (R=0.387-0.955). 2. Multiple regression equations between the annual specific sediment of reservoirs and three major characteristic factors of watersheds, namely the watershed area, the circumferential length of the watershed, and the length of the mainstream, are proposed as given in Table 2. 3. The simple regression analyses with respect to reservoir elevation, excluding Jeonnam province, whose characteristics differ greatly from those of the other provinces, show that watershed area, mainstream length, and circumferential length have high correlation coefficients (R=0.806-0.884) in low-elevation and intermediate-elevation reservoirs, but low correlation coefficients (R=0.639-0.739) in high-elevation reservoirs. 4. With respect to reservoir elevation, multiple regression equations between the annual specific sediment of reservoirs and the three major characteristic factors of watersheds that have high correlation coefficients are proposed as given in Table 5.

Regression and Correlation Analysis via Dynamic Graphs

  • Kang, Hee Mo;Sim, Songyong
    • Communications for Statistical Applications and Methods, Vol. 10, No. 3, pp. 695-705, 2003
  • In this article, we propose regression and correlation analysis via dynamic graphs and implement it in Java Web Start. For polynomial relations between the dependent and independent variables, dynamic graphics are implemented for both polynomial regression and spline estimates, allowing instant model selection. The results include basic statistics. The tools are available both as a web-based service and as an application.

Statistical Study on Correlation Between Design Variable and Shape Error in Flexible Stretch Forming

  • 서영호;허성찬;강범수;김정
    • 소성∙가공, Vol. 20, No. 2, pp. 124-131, 2011
  • A flexible stretch forming process is useful for small-quantity batch production because the shape of the flexible die can be changed conveniently. In this study, the design variables, namely the punch size, curvature radius, and elastic pad thickness, were quantitatively evaluated to understand their influence on sheet formability using statistical methods such as correlation and regression analyses. Forming simulations were designed and conducted according to a three-way factorial design to obtain numerical values of the shape error. The Pearson correlation analysis revealed linear relationships between the design variables and the shape error. Subsequently, a regression analysis was conducted between the design variables and the shape error. A regression equation was derived and used in the flexible die design stage to estimate the shape error.

Analysis of Insulation Degradation Characteristics with Aging Time for Oil-filled Transformers and Their Correlation Using the Linear Regression Method

  • 이승민
    • 전기학회논문지, Vol. 59, No. 4, pp. 693-699, 2010
  • The life of a transformer is generally determined by the life of its paper insulation. When a transformer is degraded by aging factors, the electrical, mechanical, and chemical characteristics of its oil-paper insulation are known to change. When kraft paper ages, the cellulose polymer chains break down into shorter lengths, which decreases both the tensile strength and the degree of polymerization of the paper insulation. The paper breakdown is accompanied by an increase in the content of furanic compounds in the dielectric liquid. This paper analyzes the correlation between aging characteristics used for insulation diagnosis of thermally aged paper. To investigate the accelerated aging process of oil-paper samples, an accelerated aging cell was manufactured to estimate the variation of the paper insulation over 500 hours at $140^{\circ}C$. The aged paper was analyzed for tensile strength (TS), degree of polymerization (DP), dielectric strength (DS), relative permittivity, water content (WC), and furan compounds (FC). The linear regression method was used to analyze the correlation between the insulation degradation characteristics. The linear regression analysis showed a close correlation between TS and DP, WC, and FC, whereas dielectric strength was only weakly correlated with aging time.