• Title/Summary/Keyword: Regression Algorithm

Search Result 1,065, Processing Time 0.022 seconds

Estimation of Pollutant Load Using Genetic-algorithm and Regression Model (유전자 알고리즘과 회귀식을 이용한 오염부하량의 예측)

  • Park, Youn Shik
    • Korean Journal of Environmental Agriculture
    • /
    • v.33 no.1
    • /
    • pp.37-43
    • /
    • 2014
  • BACKGROUND: Water quality data are collected less frequently than flow data because of the cost to collect and analyze, while water quality data corresponding to flow data are required to compute pollutant loads or to calibrate other hydrology models. Regression models are applicable to interpolate water quality data corresponding to flow data. METHODS AND RESULTS: A regression model was suggested which is capable to consider flow and time variance, and the regression model coefficients were calibrated using various measured water quality data with genetic-algorithm. Both LOADEST and the regression using genetic-algorithm were evaluated by 19 water quality data sets through calibration and validation. The regression model using genetic-algorithm displayed the similar model behaviors to LOADEST. The load estimates by both LOADEST and the regression model using genetic-algorithm indicated that use of a large proportion of water quality data does not necessarily lead to the load estimates with smaller error to measured load. CONCLUSION: Regression models need to be calibrated and validated before they are used to interpolate pollutant loads, as separating water quality data into two data sets for calibration and validation.

Object Tracking with the Multi-Templates Regression Model Based MS Algorithm

  • Zhang, Hua;Wang, Lijia
    • Journal of Information Processing Systems
    • /
    • v.14 no.6
    • /
    • pp.1307-1317
    • /
    • 2018
  • To deal with the problems of occlusion, pose variations and illumination changes in the object tracking system, a regression model weighted multi-templates mean-shift (MS) algorithm is proposed in this paper. Target templates and occlusion templates are extracted to compose a multi-templates set. Then, the MS algorithm is applied to the multi-templates set for obtaining the candidate areas. Moreover, a regression model is trained to estimate the Bhattacharyya coefficients between the templates and candidate areas. Finally, the geometric center of the tracked areas is considered as the object's position. The proposed algorithm is evaluated on several classical videos. The experimental results show that the regression model weighted multi-templates MS algorithm can track an object accurately in terms of occlusion, illumination changes and pose variations.

Estimation of Software Reliability with Immune Algorithm and Support Vector Regression (면역 알고리즘 기반의 서포트 벡터 회귀를 이용한 소프트웨어 신뢰도 추정)

  • Kwon, Ki-Tae;Lee, Joon-Kil
    • Journal of Information Technology Services
    • /
    • v.8 no.4
    • /
    • pp.129-140
    • /
    • 2009
  • The accurate estimation of software reliability is important to a successful development in software engineering. Until recent days, the models using regression analysis based on statistical algorithm and machine learning method have been used. However, this paper estimates the software reliability using support vector regression, a sort of machine learning technique. Also, it finds the best set of optimized parameters applying immune algorithm, changing the number of generations, memory cells, and allele. The proposed IA-SVR model outperforms some recent results reported in the literature.

Speed-up of the Matrix Computation on the Ridge Regression

  • Lee, Woochan;Kim, Moonseong;Park, Jaeyoung
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.10
    • /
    • pp.3482-3497
    • /
    • 2021
  • Artificial intelligence has emerged as the core of the 4th industrial revolution, and large amounts of data processing, such as big data technology and rapid data analysis, are inevitable. The most fundamental and universal data interpretation technique is an analysis of information through regression, which is also the basis of machine learning. Ridge regression is a technique of regression that decreases sensitivity to unique or outlier information. The time-consuming calculation portion of the matrix computation, however, basically includes the introduction of an inverse matrix. As the size of the matrix expands, the matrix solution method becomes a major challenge. In this paper, a new algorithm is introduced to enhance the speed of ridge regression estimator calculation through series expansion and computation recycle without adopting an inverse matrix in the calculation process or other factorization methods. In addition, the performances of the proposed algorithm and the existing algorithm were compared according to the matrix size. Overall, excellent speed-up of the proposed algorithm with good accuracy was demonstrated.

Prediction of the number of public bicycle rental in Seoul using Boosted Decision Tree Regression Algorithm

  • KIM, Hyun-Jun;KIM, Hyun-Ki
    • Korean Journal of Artificial Intelligence
    • /
    • v.10 no.1
    • /
    • pp.9-14
    • /
    • 2022
  • The demand for public bicycles operated by the Seoul Metropolitan Government is increasing every year. The size of the Seoul public bicycle project, which first started with about 5,600 units, increased to 3,7500 units as of September 2021, and the number of members is also increasing every year. However, as the size of the project grows, excessive budget spending and deficit problems are emerging for public bicycle projects, and new bicycles, rental office costs, and bicycle maintenance costs are blamed for the deficit. In this paper, the Azure Machine Learning Studio program and the Boosted Decision Tree Regression technique are used to predict the number of public bicycle rental over environmental factors and time. Predicted results it was confirmed that the demand for public bicycles was high in the season except for winter, and the demand for public bicycles was the highest at 6 p.m. In addition, in this paper compare four additional regression algorithms in addition to the Boosted Decision Tree Regression algorithm to measure algorithm performance. The results showed high accuracy in the order of the First Boosted Decision Tree Regression Algorithm (0.878802), second Decision Forest Regression (0.838232), third Poison Regression (0.62699), and fourth Linear Regression (0.618773). Based on these predictions, it is expected that more public bicycles will be placed at rental stations near public transportation to meet the growing demand for commuting hours and that more bicycles will be placed in rental stations in summer than winter and the life of bicycles can be extended in winter.

Nonparametric logistic regression based on sparse triangulation over a compact domain

  • Seoyeon Kim;Kwan-Young Bak
    • Communications for Statistical Applications and Methods
    • /
    • v.31 no.5
    • /
    • pp.557-569
    • /
    • 2024
  • Based on the investigation of logistic regression models utilizing sparse triangulation within a compact domain in ℝ2, this study addresses the limited research extending the triogram model to logistic regression. A primary challenge arises from the potential instability induced by a large number of vertices, hindering the effective modeling of complex relationships. To mitigate this challenge, we propose introducing sparsity to boundary vertices of the triangulation based on the Ramer-Douglas-Peucker algorithm and employing the K-means algorithm for adaptive vertex initialization. A second order coordinate-wise descent algorithm is adopted to implement the proposed method. Validation of the proposed algorithm's stability and performance assessment are conducted using synthetic and handwritten digit data (LeCun et al., 1989). Results demonstrate the advantages of our method over existing methodologies, particularly when dealing with non-rectangular data domains.

A Fast Kernel Regression Framework for Video Super-Resolution

  • Yu, Wen-Sen;Wang, Ming-Hui;Chang, Hua-Wen;Chen, Shu-Qing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.1
    • /
    • pp.232-248
    • /
    • 2014
  • A series of kernel regression (KR) algorithms, such as the classic kernel regression (CKR), the 2- and 3-D steering kernel regression (SKR), have been proposed for image and video super-resolution. In existing KR frameworks, a single algorithm is usually adopted and applied for a whole image/video, regardless of region characteristics. However, their performances and computational efficiencies can differ in regions of different characteristics. To take full advantage of the KR algorithms and avoid their disadvantage, this paper proposes a kernel regression framework for video super-resolution. In this framework, each video frame is first analyzed and divided into three types of regions: flat, non-flat-stationary, and non-flat-moving regions. Then different KR algorithm is selected according to the region type. The CKR and 2-D SKR algorithms are applied to flat and non-flat-stationary regions, respectively. For non-flat-moving regions, this paper proposes a similarity-assisted steering kernel regression (SASKR) algorithm, which can give better performance and higher computational efficiency than the 3-D SKR algorithm. Experimental results demonstrate that the computational efficiency of the proposed framework is greatly improved without apparent degradation in performance.

Noisy label based discriminative least squares regression and its kernel extension for object identification

  • Liu, Zhonghua;Liu, Gang;Pu, Jiexin;Liu, Shigang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.5
    • /
    • pp.2523-2538
    • /
    • 2017
  • In most of the existing literature, the definition of the class label has the following characteristics. First, the class label of the samples from the same object has an absolutely fixed value. Second, the difference between class labels of the samples from different objects should be maximized. However, the appearance of a face varies greatly due to the variations of the illumination, pose, and expression. Therefore, the previous definition of class label is not quite reasonable. Inspired by discriminative least squares regression algorithm (DLSR), a noisy label based discriminative least squares regression algorithm (NLDLSR) is presented in this paper. In our algorithm, the maximization difference between the class labels of the samples from different objects should be satisfied. Meanwhile, the class label of the different samples from the same object is allowed to have small difference, which is consistent with the fact that the different samples from the same object have some differences. In addition, the proposed NLDLSR is expanded to the kernel space, and we further propose a novel kernel noisy label based discriminative least squares regression algorithm (KNLDLSR). A large number of experiments show that our proposed algorithms can achieve very good performance.

Performance Comparison of Logistic Regression Algorithms on RHadoop

  • Jung, Byung Ho;Lim, Dong Hoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.4
    • /
    • pp.9-16
    • /
    • 2017
  • Machine learning has found widespread implementations and applications in many different domains in our life. Logistic regression is a type of classification in machine leaning, and is used widely in many fields, including medicine, economics, marketing and social sciences. In this paper, we present the MapReduce implementation of three existing algorithms, this is, Gradient Descent algorithm, Cost Minimization algorithm and Newton-Raphson algorithm, for logistic regression on RHadoop that integrates R and Hadoop environment applicable to large scale data. We compare the performance of these algorithms for estimation of logistic regression coefficients with real and simulated data sets. We also compare the performance of our RHadoop and RHIPE platforms. The performance experiments showed that our Newton-Raphson algorithm when compared to Gradient Descent and Cost Minimization algorithms appeared to be better to all data tested, also showed that our RHadoop was better than RHIPE in real data, and was opposite in simulated data.

Ensemble variable selection using genetic algorithm

  • Seogyoung, Lee;Martin Seunghwan, Yang;Jongkyeong, Kang;Seung Jun, Shin
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.6
    • /
    • pp.629-640
    • /
    • 2022
  • Variable selection is one of the most crucial tasks in supervised learning, such as regression and classification. The best subset selection is straightforward and optimal but not practically applicable unless the number of predictors is small. In this article, we propose directly solving the best subset selection via the genetic algorithm (GA), a popular stochastic optimization algorithm based on the principle of Darwinian evolution. To further improve the variable selection performance, we propose to run multiple GA to solve the best subset selection and then synthesize the results, which we call ensemble GA (EGA). The EGA significantly improves variable selection performance. In addition, the proposed method is essentially the best subset selection and hence applicable to a variety of models with different selection criteria. We compare the proposed EGA to existing variable selection methods under various models, including linear regression, Poisson regression, and Cox regression for survival data. Both simulation and real data analysis demonstrate the promising performance of the proposed method.