• Title/Summary/Keyword: Regression problem

Supervised Learning-Based Collaborative Filtering Using Market Basket Data for the Cold-Start Problem

  • Hwang, Wook-Yeon;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems
    • /
    • v.13 no.4
    • /
    • pp.421-431
    • /
    • 2014
  • The market basket data in the form of a binary user-item matrix or a binary item-user matrix can be modelled as a binary classification problem. The binary logistic regression approach tackles the binary classification problem, where principal components are predictor variables. If users or items are sparse in the training data, the binary classification problem can be considered as a cold-start problem. The binary logistic regression approach may not function appropriately if the principal components are inefficient for the cold-start problem. Assuming that the market basket data can also be considered as a special regression problem whose response is either 0 or 1, we propose three supervised learning approaches: random forest regression, random forest classification, and elastic net to tackle the cold-start problem, comparing the performance in a variety of experimental settings. The experimental results show that the proposed supervised learning approaches outperform the conventional approaches.

THE USE OF MATHEMATICAL PROGRAMMING FOR LINEAR REGRESSION PROBLEMS

  • Park, Sung-Hyun
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.3 no.1
    • /
    • pp.75-79
    • /
    • 1978
  • The use of three mathematical programming techniques (quadratic programming, integer quadratic programming and linear programming) is discussed to solve some problems in linear regression analysis. When the criterion is the minimization of the sum of squared deviations and the parameters are linearly constrained, the problem may be formulated as a quadratic programming problem. For the selection of variables to find the "best" regression equation in statistics, the technique of integer quadratic programming is proposed and found to be a very useful tool. When the criterion of fitting a linear regression is the minimization of the sum of absolute deviations from the regression function, the problem may be reduced to a linear programming problem and can be solved reasonably well.
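
The two reductions mentioned in this abstract can be written out as follows. This is a standard textbook formulation, not necessarily the one used in the paper; the symbols $X$, $y$, $\beta$, $A$, $b$, $u_i$ and $v_i$ are notation assumed here for illustration.

```latex
% Least squares with linear constraints on the parameters: a quadratic program.
% A and b are hypothetical constraint data.
\min_{\beta} \; (y - X\beta)^{\top} (y - X\beta)
\quad \text{subject to} \quad A\beta \le b

% Least absolute deviations: a linear program.
% Each residual is split into nonnegative parts u_i and v_i,
% so that |y_i - x_i^{\top}\beta| = u_i + v_i at the optimum.
\min_{\beta,\, u,\, v} \; \sum_{i=1}^{n} (u_i + v_i)
\quad \text{subject to} \quad
y_i - x_i^{\top}\beta = u_i - v_i, \qquad u_i \ge 0, \; v_i \ge 0, \qquad i = 1, \dots, n
```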

Combining Ridge Regression and Latent Variable Regression

  • Kim, Jong-Duk
    • Journal of the Korean Data and Information Science Society
    • /
    • v.18 no.1
    • /
    • pp.51-61
    • /
    • 2007
  • Ridge regression (RR), principal component regression (PCR) and partial least squares regression (PLS) are among the popular regression methods for collinear data. While RR adds a small quantity called the ridge constant to the diagonal of X'X to stabilize the matrix inversion and regression coefficients, PCR and PLS use latent variables derived from the original variables to circumvent the collinearity problem. One problem of PCR and PLS is that they are very sensitive to overfitting. A new regression method is presented by combining RR with PCR and PLS, respectively, in a unified manner. It is intended to provide better predictive ability and improved stability for regression models. Real-world data from NIR spectroscopy are used to investigate the performance of the newly developed regression method.
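
The effect of adding a ridge constant to the diagonal of X'X can be illustrated with a small sketch. This is plain ridge regression without an intercept, not the combined method proposed in the paper; the function names `solve_linear_system` and `ridge_fit`, the toy data and the value k = 0.1 are assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def solve_linear_system(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def ridge_fit(x: Matrix, y: Vector, k: float) -> Vector:
    """Return beta solving (X'X + k I) beta = X'y (no intercept term)."""
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    for i in range(p):
        xtx[i][i] += k  # the ridge constant added to the diagonal of X'X
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Hypothetical toy data: the second column is exactly twice the first,
    # so X'X is singular and ordinary least squares (k = 0) has no unique solution.
    X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
    y = [3.0, 6.0, 9.0]
    print(ridge_fit(X, y, k=0.1))
```

With k = 0 the call raises `ValueError` on this perfectly collinear data, while any positive k makes the system solvable, which is the stabilizing effect the abstract describes.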

Improvement of Genetic Programming Based Nonlinear Regression Using ADF and Application for Prediction MOS of Wind Speed (ADF를 사용한 유전프로그래밍 기반 비선형 회귀분석 기법 개선 및 풍속 예보 보정 응용)

  • Oh, Seungchul;Seo, Kisung
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.64 no.12
    • /
    • pp.1748-1755
    • /
    • 2015
  • Linear regression is widely used for prediction problems, but it struggles with the irregular nature of nonlinear systems. Although nonlinear regression methods have been adopted, most of them are suited only to problems of low, limited structure with a small number of independent variables. However, real-world problems such as weather prediction require complex nonlinear regression with a large number of variables. A GP (Genetic Programming) based evolutionary nonlinear regression method is an efficient approach to attack this challenging problem. This paper introduces an improvement of a GP-based nonlinear regression method using ADFs (Automatically Defined Functions). It is believed that ADFs allow the evolution of modular solutions and, consequently, improve the performance of the GP technique. The suggested ADF-based GP nonlinear regression methods are compared with UM, MLR, and the previous GP method for 3-day prediction of wind speed using MOS (Model Output Statistics) for parts of South Korea. The UM and KLAPS data from 2007-2009 and 2011-2013 are used for experimentation.

Bayesian Logistic Regression for Human Detection (Human Detection 을 위한 Bayesian Logistic Regression)

  • Aurrahman, Dhi;Setiawan, Nurul Arif;Lee, Chil-Woo
    • 한국HCI학회:학술대회논문집
    • /
    • 2008.02a
    • /
    • pp.569-572
    • /
    • 2008
  • The possibility of extending human detection solutions as plug-ins for the vision-based Human Computer Interaction domain is very attractive, given the successful marriage of machine learning theory and computer vision. Bayesian logistic regression is a powerful classifier that offers sparseness and high accuracy. The difficulty of finding people in an image is addressed by implementing this Bayesian model as the classifier. A comparison with other classifiers, e.g. SVM and RVM, supports the acceptance of this method for the human detection problem. Our experimental results show the good performance of Bayesian logistic regression in the human detection problem, both in trade-off curves (ROC, DET) and in a real implementation, compared to SVM and RVM.

Analysis of the Relationship between Technological Problem-Solving Traits and Engineering Design Competency of Universities (대학생의 기술적 문제해결 성향과 공학설계 역량 간의 관계 분석)

  • Wee, Seonbouk;Kim, Taehoon
    • Journal of Engineering Education Research
    • /
    • v.25 no.6
    • /
    • pp.103-113
    • /
    • 2022
  • The purpose of this study is to analyze the correlation between technological problem-solving traits and engineering design competency. To this end, correlation analysis and regression analysis were used to examine the relationship between the two. To collect data on individual characteristics, technological problem-solving traits, and engineering design competency, a survey was conducted with university students. The analysis showed no difference in engineering design competency by gender, but a difference in technological problem-solving traits. There was no difference in technological problem-solving traits by major, but there was a difference in engineering design competency. The correlation analysis found a correlation between the two. In the regression analysis, a statistically significant result was found in the problem-solving trait domain, and the regression model was found to be suitable. The analysis of differences in engineering design competency according to technological problem-solving traits showed that effective problem solvers scored significantly higher.

CENSORED FUZZY REGRESSION MODEL

  • Choi, Seung-Hoe;Kim, Kyung-Joong
    • Journal of the Korean Mathematical Society
    • /
    • v.43 no.3
    • /
    • pp.623-634
    • /
    • 2006
  • Various methods have been studied to construct a fuzzy regression model in order to present a fuzzy relation between a dependent variable and an independent variable. However, in fuzzy regression analysis the value of the center point of the estimated fuzzy output may be either greater than the value of the right endpoint or smaller than the value of the left endpoint. In that case, we cannot predict the fuzzy output properly. This paper presents sufficient conditions to construct the fuzzy regression model using several methods investigated by some authors and then introduces the censored fuzzy regression model, which uses censored samples to handle the problem of crossing of the center and the end points of the estimated fuzzy number. Examples show that the censored fuzzy regression model is an extension of the fuzzy regression model and that it also mitigates the problem of crossing.
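
The "crossing" problem described in this abstract can be made concrete with a small check. This is only a sketch of how to detect the problem, not the paper's censored model; representing an estimated fuzzy output as a (left, center, right) triple and the names `TriangularFuzzyNumber` and `has_crossing` are assumptions made here for illustration.

```python
from typing import NamedTuple


class TriangularFuzzyNumber(NamedTuple):
    """Hypothetical container for an estimated fuzzy output."""
    left: float    # left endpoint
    center: float  # center point
    right: float   # right endpoint


def has_crossing(z: TriangularFuzzyNumber) -> bool:
    """Return True when the center lies outside the interval [left, right]."""
    return z.center < z.left or z.center > z.right


if __name__ == "__main__":
    # A well-formed estimate: the center sits between the endpoints.
    print(has_crossing(TriangularFuzzyNumber(1.0, 2.0, 3.0)))
    # A crossed estimate: the center exceeds the right endpoint.
    print(has_crossing(TriangularFuzzyNumber(1.0, 4.0, 3.0)))
```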

Determination of Research Octane Number using NIR Spectral Data and Ridge Regression

  • Jeong, Ho Il;Lee, Hye Seon;Jeon, Ji Hyeok
    • Bulletin of the Korean Chemical Society
    • /
    • v.22 no.1
    • /
    • pp.37-42
    • /
    • 2001
  • Ridge regression is compared with multiple linear regression (MLR) for determination of Research Octane Number (RON) when the baseline and signal-to-noise ratio are varied. MLR analysis of near-infrared (NIR) spectroscopic data usually encounters a collinearity problem, which adversely affects long-term prediction performance. The collinearity problem can be eliminated or greatly reduced by using ridge regression, which is a biased estimation method. To evaluate the robustness of each calibration, the calibration models developed by both calibration methods were used to predict RONs of gasoline spectra in which the baseline and signal-to-noise ratio were varied. The ridge calibration model showed more stable prediction performance than MLR, especially when the spectral baselines were varied. In conclusion, ridge regression is shown to be a viable method for calibration of RON with NIR data when only a few wavelengths are available, such as in a hand-carry device using a few diodes.

Influence of parents' parenting values and beliefs on preschoolers' problem behaviors (부모의 양육가치와 양육신념이 유아의 행동문제에 미치는 영향)

  • Lee, Eun-Ju;Min, Ha-Yeoung
    • Korean Journal of Human Ecology
    • /
    • v.15 no.4
    • /
    • pp.541-549
    • /
    • 2006
  • The purpose of this study is to clarify that parents' values and beliefs in bringing up their children deeply relate to their children's problem behaviors. The subjects are 267 preschoolers attending kindergarten in the Daegu area. The statistical techniques are two-way ANOVA, the Scheffé test, Pearson's correlation and regression. The results of this study are as follows: (1) Problem behaviors of preschoolers are significantly related to parents' values. Preschoolers whose parents have a higher level of values have a lower level of problem behaviors. (2) Problem behaviors of preschoolers are significantly related to parents' beliefs. Preschoolers whose parents have a higher level of beliefs have a higher level of problem behaviors. (3) The multiple regression analysis shows that parents' parenting values and beliefs are crucially predictive of preschoolers' problem behaviors. In particular, parents' parenting beliefs are more relevant to preschoolers' problem behaviors than parents' parenting values.

ROBUST REGRESSION ESTIMATION BASED ON DATA PARTITIONING

  • Lee, Dong-Hee;Park, You-Sung
    • Journal of the Korean Statistical Society
    • /
    • v.36 no.2
    • /
    • pp.299-320
    • /
    • 2007
  • We introduce a high breakdown point estimator referred to as the data partitioning robust regression estimator (DPR). Since the DPR is obtained by partitioning observations into a finite number of subsets, it has no computational problem, unlike the previous robust regression estimators. Empirical and extensive simulation studies show that the DPR is superior to the previous robust estimators. This is especially so in large samples.