Problems on simple linear regression concepts
The following problems enable students to practice important concepts
from Chapter 2, without having to use a computer. Answers are
provided below.
- For simple linear regression, state the four assumptions
concerning the probability distribution of the random error
term. The mean of the distribution is 0 for each value of X, the
variance of the distribution remains constant as X increases, the
distribution is normal for each value of X, and the errors are
independent.
- A restaurant owner uses the straight-line equation, E(Y) =
b0 + b1 X, to model the relationship between
mean daily costs, E(Y), and customers served, X. To test this theory,
a week of data was collected, and the results for n = 7 days are shown
in the accompanying table, followed by an SPSS printout of the
regression analysis.
Day              1     2     3     4     5     6     7
Costs, Y ($)  1000  2180  2240  2410  2590  2820  3060
Customers, X     0    60   120   133   143   175   175
R Square = 0.905    Std. Error of the Estimate (s) = 224
Unstandardized Coefficients
              B      Std. Error   t      Sig.
(Constant)    1192   185          6.44   0.001
X             9.87   1.43         6.91   0.001
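The printout values can be checked by hand from the seven data points. What follows is a minimal sketch in plain Python of the usual least squares formulas, not the SPSS procedure itself; the function name `fit_simple_regression` and the dictionary keys are illustrative assumptions, not part of the original problem.

```python
import math

# Data from the table above: customers served (X) and daily costs in dollars (Y).
customers = [0, 60, 120, 133, 143, 175, 175]
costs = [1000, 2180, 2240, 2410, 2590, 2820, 3060]


def fit_simple_regression(x, y):
    """Least squares fit of y = b0 + b1*x, returning the printout quantities."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)

    b1 = sxy / sxx                      # slope estimate
    b0 = y_bar - b1 * x_bar             # intercept estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))        # std. error of the estimate
    se_b1 = s / math.sqrt(sxx)
    se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
    return {
        "b0": b0, "b1": b1, "s": s, "r_square": 1 - sse / syy,
        "se_b0": se_b0, "se_b1": se_b1,
        "t_b0": b0 / se_b0, "t_b1": b1 / se_b1,
    }


fit = fit_simple_regression(customers, costs)
for name, value in fit.items():
    print(f"{name:9s} {value:10.3f}")
```

Rounded to the precision of the printout, the computed values should agree with the B, Std. Error, and t columns, and with R Square and s.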
- State the set of hypotheses that you would test to determine
whether costs are positively linearly related to number of customers
in the situation above. NH: b1 = 0 vs. AH: b1 > 0
- In the situation above, is there sufficient evidence of a positive
linear relationship between costs and number of customers?
Use significance level 5%. Yes, since the one-tailed p-value of the
test (about 0.0005, half of the two-tailed Sig. value of 0.001
reported for X) is less than 0.05.
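The one-tailed p-value can also be approximated without statistical software. The sketch below integrates the density of Student's t distribution numerically with Simpson's rule; the helper names `t_pdf` and `upper_tail_p` and the step count are illustrative assumptions, and the t statistic is the rounded value from the printout.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def upper_tail_p(t_stat, df, steps=2000):
    """Approximate P(T > t_stat) for t_stat >= 0 using Simpson's rule on [0, t_stat]."""
    h = t_stat / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The density is symmetric about 0, so half the total area lies above 0.
    return 0.5 - area_0_to_t


t_stat = 6.91   # t statistic for the slope, from the printout
df = 7 - 2      # n - 2 degrees of freedom
p_one_tailed = upper_tail_p(t_stat, df)
print(f"one-tailed p-value = {p_one_tailed:.4f}")
print("reject NH at the 5% level:", p_one_tailed < 0.05)
```

Halving the two-tailed Sig. value is appropriate here because the alternative hypothesis is one-sided (b1 > 0) and the estimated slope is positive.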
- Give a practical interpretation of the estimate of the slope of
the least squares line in the situation above. For every additional
customer, we estimate costs to increase by $9.87.
- Give a practical interpretation of the estimate of the Y-intercept
of the least squares line in the situation above. For a day
with no customers (presumably a day the restaurant is closed), we
estimate costs (presumably fixed costs) to be $1192.
- For the situation above, give a practical interpretation of s, the
estimate of the standard deviation of the random error term in the
model. We expect about 95% of the observed cost values to lie
within 2s = 2 × $224 = $448 of their least squares predicted values.
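As a rough check of the 2s interpretation, the residuals for the seven days can be compared with 2s. This is a sketch in plain Python with illustrative variable names; with only n = 7 observations, the "about 95%" statement is an approximation rather than something the sample must match exactly.

```python
import math

# Data from the restaurant problem: customers served (X), daily costs in dollars (Y).
customers = [0, 60, 120, 133, 143, 175, 175]
costs = [1000, 2180, 2240, 2410, 2590, 2820, 3060]
n = len(customers)

# Least squares estimates of the slope and intercept.
x_bar = sum(customers) / n
y_bar = sum(costs) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(customers, costs))
      / sum((x - x_bar) ** 2 for x in customers))
b0 = y_bar - b1 * x_bar

# Residual = observed cost minus least squares predicted cost.
residuals = [y - (b0 + b1 * x) for x, y in zip(customers, costs)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
within_2s = sum(abs(e) <= 2 * s for e in residuals)

print(f"s = {s:.0f}, 2s = {2 * s:.0f}")
for day, e in enumerate(residuals, start=1):
    print(f"day {day}: residual = {e:8.1f}")
print(f"{within_2s} of {n} residuals lie within 2s of the fitted line")
```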
- For the situation above, give a practical interpretation of
R², the coefficient of determination for the least squares
model. About 90.5% of the total variation in the sample of cost
values can be explained by (or attributed to) the linear relationship
between cost and number of customers.
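- Is it true that a simple linear regression model allows the E(Y)
values (expected or predicted Y-values) to fall around the regression
line while the actual values of Y must fall on the line? No, it is
the reverse: the E(Y) values fall on the regression line, while the
actual values of Y fall around it.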
- Is it true that the coefficient of correlation is a useful measure
of the linear relationship between two variables? Yes, it measures
the strength and direction of the linear relationship.
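For the restaurant data, the coefficient of correlation can be computed directly from the seven observations. The sketch below uses the standard Pearson formula in plain Python; the function name `pearson_r` is an illustrative assumption. In simple linear regression, the square of r equals R Square, and the sign of r matches the sign of the estimated slope.

```python
import math

# Data from the restaurant problem: customers served (X), daily costs in dollars (Y).
customers = [0, 60, 120, 133, 143, 175, 175]
costs = [1000, 2180, 2240, 2410, 2590, 2820, 3060]


def pearson_r(x, y):
    """Sample coefficient of correlation between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(customers, costs)
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```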
Last updated: May, 2008
The views and opinions expressed in this page are strictly those of
the page author. The contents of this page have not been reviewed or
approved by the University of Oregon.
© 2008, Iain Pardoe, Lundquist College of Business, University
of Oregon