Problems on simple linear regression concepts

The following problems enable students to practice important concepts from Chapter 2, without having to use a computer. Answers are provided below.

  1. For simple linear regression, state the four assumptions concerning the probability distribution of the random error term. The mean of the distribution is 0 for each value of X, the variance of the distribution remains constant as X increases, the distribution is normal for each value of X, and the errors are independent.
  2. A restaurant owner uses the straight-line equation, E(Y) = b0 + b1 X, to model the relationship between mean daily costs, E(Y), and customers served, X. To test this theory, a week of data was collected; the results for n = 7 days are shown in the accompanying table, followed by an SPSS printout of the regression analysis.
    Day                1     2     3     4     5     6     7
    Costs, Y ($)    1000  2180  2240  2410  2590  2820  3060
    Customers, X       0    60   120   133   143   175   175

    R Square = 0.905     Std. Error of the Estimate (s) = 224

    Unstandardized Coefficients

                       B     Std. Error      t     Sig.
    (Constant)      1192       185         6.44    0.001
    X                  9.87      1.43      6.91    0.001
    
    1. State the set of hypotheses that you would test to determine whether costs are positively linearly related to number of customers in the situation above. NH: b1 = 0 vs. AH: b1 > 0
    2. In the situation above, is there sufficient evidence of a positive linear relationship between costs and number of customers? Use significance level 5%. Yes, since the one-tailed p-value of the test (0.0005, half the two-tailed Sig. value of 0.001 in the printout) is less than 0.05.
    3. Give a practical interpretation of the estimate of the slope of the least squares line in the situation above. For every additional customer, we estimate costs to increase by $9.87.
    4. Give a practical interpretation of the estimate of the Y-intercept of the least squares line in the situation above. For a day with no customers (presumably a day the restaurant is closed), we estimate costs (presumably fixed costs) to be $1192.
    5. For the situation above, give a practical interpretation of s, the estimate of the standard deviation of the random error term in the model. We expect about 95% of the observed cost values to lie within $448 (2s = 2 × $224) of their least squares predicted values.
    6. For the situation above, give a practical interpretation of R², the coefficient of determination for the least squares model. About 90.5% of the total variation in the sample of cost values can be explained by (or attributed to) the linear relationship between cost and number of customers.
  3. Is it true that a simple linear regression model allows the E(Y) values (expected or predicted Y-values) to fall around the regression line while the actual values of Y must fall on the line?
  4. Is it true that the coefficient of correlation is a useful measure of the linear relationship between two variables?
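
The printout values in problem 2 can also be checked with a few lines of code. The following is a minimal Python sketch, not the original SPSS analysis, that refits the least squares line from the seven data points using only the standard library. The function names (fit_simple_linear_regression, upper_tail_t) are hypothetical, introduced here for illustration, and the one-tailed p-value is approximated by numerically integrating the t density with n - 2 = 5 degrees of freedom.

```python
import math

# Data from the table in problem 2 (costs in dollars, customers served).
customers = [0, 60, 120, 133, 143, 175, 175]
costs = [1000, 2180, 2240, 2410, 2590, 2820, 3060]


def fit_simple_linear_regression(x, y):
    """Least squares fit of y = b0 + b1*x with the usual summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)

    b1 = sxy / sxx                      # slope estimate
    b0 = y_bar - b1 * x_bar             # intercept estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))        # std. error of the estimate
    se_b1 = s / math.sqrt(sxx)          # std. error of the slope
    se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
    return {
        "b0": b0, "b1": b1, "s": s,
        "r_squared": 1 - sse / syy,
        "se_b0": se_b0, "se_b1": se_b1,
        "t_b0": b0 / se_b0, "t_b1": b1 / se_b1,
        "df": n - 2,
    }


def upper_tail_t(t, df, steps=20000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on the t density.

    steps must be even for Simpson's rule.
    """
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # The density is symmetric about 0, so P(T > t) = 0.5 - P(0 < T < t).
    return 0.5 - total * h / 3


fit = fit_simple_linear_regression(customers, costs)
p_one_tailed = upper_tail_t(fit["t_b1"], fit["df"])

print(f"b0 = {fit['b0']:.0f}, b1 = {fit['b1']:.2f}")
print(f"s = {fit['s']:.0f}, 2s = {2 * fit['s']:.0f}, R Square = {fit['r_squared']:.3f}")
print(f"t for slope = {fit['t_b1']:.2f}, one-tailed p-value = {p_one_tailed:.4f}")
```

The computed estimates should agree with the SPSS printout to rounding, and 2s reproduces the $448 used in the interpretation of s. The upper-tail test is the one-tailed version appropriate for AH: b1 > 0.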

Last updated: May, 2008

© 2008, Iain Pardoe, Lundquist College of Business, University of Oregon