Problem on multiple linear regression model building I

The following problem provides a challenging dataset that students can use to practice finding their best multiple linear regression model.

You've been asked to find out how restaurant profits are affected by certain characteristics of the restaurants. You have data on 39 restaurants, and would like to build a regression model for predicting Y = annual profits (in thousands of dollars) from 5 potential predictor variables:

Note that region is a qualitative (categorical) variable with three levels. The data files RESTAURANT.SAV (SPSS), RESTAURANT.TXT (text), and RESTAURANT.XLS (Excel) contain two dummy indicator variables that code the information in region: "D5" = 1 for Southwest, 0 otherwise, and "D6" = 1 for Northwest, 0 otherwise. The Mountain region is therefore the reference level, with zero for both "D5" and "D6".
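The dummy coding described above can be sketched in a few lines of Python. This is only an illustration: the function name code_region and the region labels as text strings are assumptions made for the example, and the data files themselves already contain "D5" and "D6" as ready-made columns.

```python
# Illustrative sketch of the D5/D6 dummy coding for region.
# The function name and the string labels are assumptions for this example;
# the RESTAURANT data files already store D5 and D6 as columns.

VALID_REGIONS = ("Southwest", "Northwest", "Mountain")

def code_region(region):
    """Return the pair (D5, D6) for a region label.

    D5 = 1 for Southwest, 0 otherwise.
    D6 = 1 for Northwest, 0 otherwise.
    Mountain is the reference level, so it maps to (0, 0).
    """
    if region not in VALID_REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    d5 = 1 if region == "Southwest" else 0
    d6 = 1 if region == "Northwest" else 0
    return d5, d6

if __name__ == "__main__":
    for region in VALID_REGIONS:
        print(region, code_region(region))
```

With this coding, the regression coefficient on "D5" measures the expected difference in profit between Southwest and Mountain restaurants (holding the other predictors fixed), and the coefficient on "D6" does the same for Northwest versus Mountain.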

Build a suitable regression model. You may want to consider the following topics in doing so:

You may use the following for terms in your model:

[Hint: a "good" model should have R² around 0.93 and a regression standard error, s, around 10.6.]
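For reference, the two quantities in the hint can be computed from any fitted model's observed and fitted values. The sketch below assumes the usual definitions, R² = 1 - SSE/SST and s = sqrt(SSE / (n - k - 1)), where k is the number of predictor terms excluding the intercept. The function name r_squared_and_s and the toy numbers are invented for illustration and are not taken from the restaurant data.

```python
import math

def r_squared_and_s(y, fitted, k):
    """Return (R-squared, regression standard error s).

    y      : observed response values
    fitted : fitted values from a least squares model with an intercept
    k      : number of predictor terms in the model (excluding the intercept)
    """
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)               # total sum of squares
    r2 = 1 - sse / sst
    s = math.sqrt(sse / (n - k - 1))
    return r2, s

# Toy data (invented): fit a one-predictor least squares line by hand.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * xi for xi in x]

r2, s = r_squared_and_s(y, fitted, k=1)

if __name__ == "__main__":
    print(f"R-squared = {r2:.3f}, s = {s:.3f}")
```

For the restaurant problem, n = 39 and k is the number of terms in the chosen model, counting both dummy indicators and any transformed or interaction terms. Statistical software reports both values directly, so a helper like this is mainly useful for checking one's understanding of the output.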


Last updated: June, 2008


© 2008, Iain Pardoe, Lundquist College of Business, University of Oregon