• Title/Summary/Keyword: Regression testing


The Forward Sequential Procedure for the Identifying Multiple Outliers in Linear Regression

  • Park, Jin-Pyo
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.1053-1066 / 2005
  • In this paper we consider the problem of identifying and testing outliers in linear regression. First, we consider the use of so-called scale ratio tests for testing the null hypothesis of no outliers. The test is based on the ratio of two residual scale estimates. We give the asymptotic distribution of the test statistic and investigate its properties. Next, we consider the problem of identifying the outliers and propose a forward sequential procedure using the suggested test. The new method is compared with the classical procedure on a real data example. Unlike other forward procedures, the present one is unaffected by masking and swamping effects because the test statistic is based on a robust scale estimate.


Influential Points in GLMs via Backwards Stepping

  • Jeong, Kwang-Mo;Oh, Hae-Young
    • Communications for Statistical Applications and Methods / v.9 no.1 / pp.197-212 / 2002
  • When assessing the goodness-of-fit of a model, a small subset of deviating observations can give rise to a significant lack of fit. It is therefore important to identify such observations and to assess their effects on various aspects of the analysis. Cook's distance is usually used to detect influential observations, but it is sometimes not fully effective in identifying a truly influential set of observations because masking or swamping effects may exist. In this paper we confine our attention to influential subsets in GLMs such as logistic regression models and loglinear models. We modify a backwards stepping algorithm, originally suggested for detecting outlying cells in contingency tables, to detect influential observations in GLMs. The algorithm consists of two steps: the identification step and the testing step. In the identification step we identify influential observations based on influence measures such as Cook's distances. In the testing step we test whether the subset of identified observations is significant. Finally, we illustrate the proposed method with two datasets, one for a logistic regression model and one for a loglinear model.
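
As a rough illustration of the identification step, the sketch below computes Cook's distances for an ordinary least-squares fit and picks out the most influential observation. It is a simplification and not the authors' algorithm: it assumes a linear model rather than a GLM and omits both the backwards stepping and the testing step. The function name `cooks_distance` and the simulated data are made up for illustration.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each observation in an OLS fit of y on X.

    X is assumed to already contain an intercept column.
    """
    n, p = X.shape
    # Leverages (diagonal of the hat matrix) from the thin QR factor.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)
    # Least-squares fit and residual variance estimate.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)
    return resid**2 * h / (p * s2 * (1 - h) ** 2)

# Simulated straight-line data (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)
y[-1] += 3.0  # plant one gross outlier at a high-leverage point

X = np.column_stack([np.ones_like(x), x])
d = cooks_distance(X, y)
most_influential = int(np.argmax(d))  # index of the largest Cook's distance
```

A single-case measure like this is exactly what can be misled by masking or swamping when several influential points occur together, which is the motivation the abstract gives for a stepwise subset procedure.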

On inference of multivariate means under ranked set sampling

  • Rochani, Haresh;Linder, Daniel F.;Samawi, Hani;Panchal, Viral
    • Communications for Statistical Applications and Methods / v.25 no.1 / pp.1-13 / 2018
  • In many studies, a researcher attempts to describe a population where units are measured for multiple outcomes, or responses. In this paper, we present an efficient procedure based on ranked set sampling to estimate and perform hypothesis testing on a multivariate mean. The method ranks on an auxiliary covariate, which is assumed to be correlated with the multivariate response, in order to improve the efficiency of the estimation. We show that the proposed estimators developed under this sampling scheme are unbiased, have smaller variance in the multivariate sense, and are asymptotically Gaussian. We also demonstrate that the efficiency of the multivariate regression estimator can be improved by using ranked set sampling. A bootstrap routine is developed in the statistical software R to perform inference when the sample size is small. We use a simulation study to investigate the performance of the method under known conditions and apply the method to biomarker data collected in the China Health and Nutrition Survey (CHNS 2009).

Agent Based Object Oriented Software Test Technique (에이전트 기반의 객체지향 소프트웨어 테스트 방안)

  • Choe, Jeong-Eun;Choe, Byeong-Ju
    • Journal of KIISE:Software and Applications / v.27 no.11 / pp.1106-1114 / 2000
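  • In computing, the concept of an agent has become important through its application to many areas such as electronic commerce and information retrieval. So far, however, there has been no test tool with intelligence. The test agent system proposed in this paper is a test tool that has the characteristics of an agent and assists the tester. The test agent system follows the object-oriented test process, carries out work on the tester's behalf, and minimizes tester intervention. From a large number of automatically generated test cases, the system intelligently selects test cases that are non-redundant and consistent, which shortens testing time. The test agent system consists of three agents: the User Interface Agent, the Test Case Selection & Testing Agent, and the Regression Test Agent. In particular, the Test Case Selection & Testing Agent intelligently selects non-redundant and consistent test cases through the RE-Rule and the CTS-Rule, and the Regression Test Agent intelligently selects regression test items through the RRTIS-Rule.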


Using Machine Learning Algorithms for Housing Price Prediction: The Case of Islamabad Housing Data

  • Imran, Imran;Zaman, Umar;Waqar, Muhammad;Zaman, Atif
    • Soft Computing and Machine Intelligence / v.1 no.1 / pp.11-23 / 2021
  • House price prediction informs significant financial decisions for individuals working in the housing market as well as for potential buyers. Whether investing or buying a house to live in, a person putting money into the housing market is interested in the potential gain. This paper presents machine learning algorithms for developing intelligent regression models for house price prediction. The proposed research methodology consists of four stages: collecting the data; preprocessing the collected data and transforming it into the best format; developing intelligent models using machine learning algorithms; and training, testing, and validating the models on house prices in the housing market of the capital, Islamabad. The data used for model validation and testing are asking prices from online property stores, which provide a reasonable estimate of the city's housing market. The prediction model can significantly assist in predicting future housing prices in Pakistan. The regression results are encouraging and give promising directions for future prediction work on the collected dataset.

A linearity test statistic in a simple linear regression (단순회귀모형에서 선형성 검정통계량)

  • Park, Chun Gun;Lee, Kyeong Eun
    • Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.305-315 / 2014
  • In a simple linear regression, a linear relationship between an explanatory variable and a response variable can be easily recognized in their scatter plot. The lack-of-fit test for replicated data is commonly used for testing linearity, but linearity is not easy to test when the explanatory variable is not replicated. In this paper, we propose three new test statistics for testing linearity regardless of replication, using the principle of average slope, and validate them through several simulations and empirical studies.
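
For context, the classical lack-of-fit test for replicated data that the abstract refers to can be sketched as follows. This is the textbook F statistic, not one of the three statistics proposed in the paper; the function name `lack_of_fit_f` and the toy data are assumptions made for illustration.

```python
import numpy as np

def lack_of_fit_f(x, y):
    """Classical lack-of-fit F statistic for a straight-line fit.

    Requires replicated x values. Returns (F, df_lack_of_fit, df_pure_error).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Residual sum of squares from the straight-line fit.
    slope, intercept = np.polyfit(x, y, 1)
    sse = np.sum((y - (intercept + slope * x)) ** 2)
    # Pure-error sum of squares: spread of y around each replicate-group mean.
    levels = np.unique(x)
    sspe = sum(np.sum((y[x == v] - y[x == v].mean()) ** 2) for v in levels)
    df_pe = n - levels.size
    df_lof = levels.size - 2
    # Lack-of-fit sum of squares is what the line leaves unexplained
    # beyond pure replicate-to-replicate error.
    sslof = sse - sspe
    f_stat = (sslof / df_lof) / (sspe / df_pe)
    return f_stat, df_lof, df_pe

# Toy data: four x levels, three replicates each (illustrative only).
x = np.repeat([1.0, 2.0, 3.0, 4.0], 3)
e = np.tile([-0.1, 0.0, 0.1], 4)
f_curved, df_lof, df_pe = lack_of_fit_f(x, x**2 + e)    # truly quadratic
f_linear, _, _ = lack_of_fit_f(x, 2.0 * x + e)          # truly linear
```

The statistic is compared with an F distribution on `(df_lof, df_pe)` degrees of freedom. With no replicates, `df_pe` is zero and the statistic is undefined, which is the gap the paper addresses.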

High Temperature Reliability Study of Low Frequency In-door Electrodeless Lamp (무전극형광램프의 고온 신뢰성 연구)

  • Jeong, Ui-Hyo;Hyung, Jae-Phil;Lim, Seong-Yong;Lim, Hong-Woo;Jang, Joong-Soon
    • Journal of Applied Reliability / v.14 no.3 / pp.203-207 / 2014
  • The electrodeless lamp is known for its long life, but its reliability depends not only on electrodes but also on materials and structures. To evaluate the reliability of the end product, we studied high-temperature durability with tests at $60^{\circ}C$, $75^{\circ}C$ and $90^{\circ}C$, and predicted failure times with an exponential model fitted by regression analysis. However, the tests showed that temperature does not affect the degradation of electrodeless lamps. Their luminous output degrades during the early part of the test (up to 250 hours) and then converges to a saturation point. In addition, the 410 nm to 530 nm part of the spectrum degrades more than other parts.

Tests for Panel Regression Model with Unbalanced Data

  • Song, Suck-Heun;Jung, Byoung-Cheol
    • Journal of the Korean Statistical Society / v.30 no.3 / pp.511-527 / 2001
  • This paper considers the problem of testing variance components in the unbalanced two-way error component model. We provide a conditional LM test statistic for testing zero individual (time) effects, assuming that the other time-specific (individual) effects are present. This test is an extension of Baltagi, Chang and Li (1998, 1992). Monte Carlo experiments are conducted to study the performance of this LM test.


CASE-DELETION DIAGNOSTICS FOR TESTING A LINEAR HYPOTHESIS ABOUT REGRESSION COEFFICIENTS

  • Kim, Myung-Geun
    • Journal of applied mathematics & informatics / v.10 no.1_2 / pp.111-118 / 2002
  • We study the influence of observations on testing a linear hypothesis using single and multiple case-deletions. The change in the F-test statistic due to case-deletions is shown to be completely determined by two externally Studentized residuals. These residuals are used for investigating outlyingness whether or not there are linear constraints. An illustrative example is given that shows the usefulness of case-deletions.

Test for Discontinuities in Nonparametric Regression

  • Park, Dong-Ryeon
    • Communications for Statistical Applications and Methods / v.15 no.5 / pp.709-717 / 2008
  • The difference of two one-sided kernel estimators is usually used to detect the location of the discontinuity points of a regression function. A large absolute value of this statistic implies a discontinuity in the regression function, so the difference of two one-sided kernel estimators can serve as the test statistic for the null hypothesis of a smooth regression function. The problem, however, is that we know only the asymptotic distribution of the test statistic under $H_0$, and the test can hardly be expected to perform well if we rely solely on the asymptotic distribution to determine the critical points. In this paper, we show that if the bias of the test statistic is adjusted properly, the asymptotic rules hold even for small sample sizes.
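
The statistic described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the paper's procedure: it uses local-constant (Nadaraya-Watson) one-sided estimators with an Epanechnikov kernel and a hand-picked bandwidth, and it applies no bias adjustment and no critical values. The function name `one_sided_difference` and the simulated data are hypothetical.

```python
import numpy as np

def one_sided_difference(x, y, t, h):
    """Right-minus-left one-sided Nadaraya-Watson estimates at point t."""
    u = (x - t) / h
    # Epanechnikov weights within one bandwidth of t.
    w = np.where(np.abs(u) < 1, 0.75 * (1 - u**2), 0.0)
    # Split the weights so each estimator sees only one side of t.
    w_left = np.where(u < 0, w, 0.0)
    w_right = np.where(u >= 0, w, 0.0)
    m_left = np.sum(w_left * y) / np.sum(w_left)
    m_right = np.sum(w_right * y) / np.sum(w_right)
    return m_right - m_left

# Simulated data: smooth curve plus a jump of size 2 at x = 0.5.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 400)
y = np.sin(2 * np.pi * x) + 2.0 * (x >= 0.5) + rng.normal(scale=0.1, size=x.size)

# Evaluate the statistic on an interior grid and locate its largest value.
grid = np.linspace(0.1, 0.9, 161)
diffs = np.array([one_sided_difference(x, y, t, h=0.05) for t in grid])
jump_location = grid[np.argmax(np.abs(diffs))]
```

Even where the curve is smooth, the difference is not centered at zero because each one-sided estimate is pulled by the local slope. That bias is the quantity the abstract says must be adjusted before asymptotic critical points can be trusted in small samples.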