Section 2.3 in the book suggests three ways to numerically evaluate a simple linear regression model: the regression standard error, s, the coefficient of determination, R2, and assessing the slope parameter, b1, using a hypothesis test or confidence interval. For simple linear regression all three methods are mathematically equivalent, but their generalizations for multiple linear regression are not equivalent.
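The equivalence for simple linear regression can be made explicit with a short derivation. The notation here is an assumption, not taken from the page: n is the sample size, SSE is the residual sum of squares, and SST is the total sum of squares of Y about its mean.

\[
s^2 = \frac{\mathrm{SSE}}{n-2}, \qquad
R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}, \qquad
t^2 = \frac{(n-2)\,R^2}{1-R^2}
\]

Since n and SST depend only on the Y values, any two simple linear regression models fit to the same response are ranked identically by all three measures: a lower s goes with a higher R2 and a higher absolute t-statistic.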
The purpose of this exercise is to demonstrate that a simple linear regression model with superior measures of fit (i.e., a lower s, a higher R2, and a higher absolute t-statistic for the slope) does not necessarily imply the model is more appropriate than a model with a higher s, a lower R2, and a lower absolute t-statistic for the slope. These three measures of model fit should always be used in conjunction with a graphical check of the model to make sure that it is appropriate (e.g., see section 2.4 in the book).
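Download the simulated data from one of the following files (in SPSS, text, and Excel format, respectively): COMPARE.SAV, COMPARE.TXT, COMPARE.XLS. There is a single response variable, Y, and four possible predictor variables, X1, X2, X3, and X4. Fit four simple linear regression models, one for Y with each predictor in turn, and for each model note the regression standard error, s, the coefficient of determination, R2, and the t-statistic for the slope.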
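For readers working outside SPSS or Excel, the three measures for one simple linear regression model could be computed along the following lines. This is a minimal sketch rather than the book's own procedure: the function name simple_linear_fit is hypothetical, and the demonstration values are made up for illustration and are not the COMPARE data. Loading the downloaded file is left out because its exact layout is not described here.

```python
import math


def simple_linear_fit(x, y):
    """Least-squares fit of y = b0 + b1*x.

    Returns the estimates b0 and b1, the regression standard error s,
    the coefficient of determination R2, and the slope t-statistic.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares

    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Residual sum of squares for the fitted line
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    s = math.sqrt(sse / (n - 2))    # regression standard error
    r2 = 1 - sse / sst              # coefficient of determination
    t = b1 / (s / math.sqrt(sxx))   # slope estimate over its standard error
    return {"b0": b0, "b1": b1, "s": s, "R2": r2, "t": t}


# Made-up illustrative values, NOT the COMPARE data
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = simple_linear_fit(x_demo, y_demo)
print({name: round(value, 4) for name, value in fit.items()})
```

Calling the function once per predictor, with the same Y each time, gives the four sets of values to compare.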
You should find that the model with X1 has a higher s, a lower R2, and a lower absolute t-statistic for the slope than the other three models (which all have the same values of s, R2, and t). However, the model with X1 is more appropriate than each of the other three models, which a graphical check of each model (e.g., see section 2.4 in the book) should reveal.
Concluding message: measures of regression model fit like the regression standard error, s, the coefficient of determination, R2, and the absolute t-statistic for the slope are only really meaningful when the assumptions of the model are broadly satisfied.
Last updated: September, 2006
© 2006, Iain Pardoe, Lundquist College of Business, University of Oregon