
Applied Regression Modeling, 2nd edition
by Iain Pardoe
Preface
The first edition of this book was developed from class notes written for an applied regression
course taken primarily by undergraduate business majors
in their junior year at the University of Oregon. Since the regression methods and techniques
covered in the book have broad application in many fields, not just business, this second
edition widens its scope accordingly. Details of the major changes for the second edition are
included below.
The book is suitable for any undergraduate statistics course
in which regression analysis is the main focus. A recommended prerequisite is an introductory
probability and statistics course. It would also be
suitable for use in an applied regression course for nonstatistics
major graduate students, including MBAs, and for vocational, professional, or other nondegree
courses. Mathematical details have
deliberately been kept to a minimum, and the book does not contain
any calculus. Instead, emphasis is placed on applying regression
analysis to data using statistical software, and understanding and
interpreting results.
Chapter 1 reviews essential introductory statistics
material, while Chapter 2 covers simple linear
regression. Chapter 3 introduces multiple linear
regression, while Chapters 4 and 5
provide guidance on building regression models, including
transforming variables, using interactions, incorporating
qualitative information, and using regression diagnostics. Each of
these chapters includes homework problems, mostly based on analyzing
real datasets provided with the book. Chapter 6
contains three in-depth case studies, while Chapter 7
introduces extensions to linear regression and outlines some related
topics. The appendices contain a list of statistical software packages that can be used to carry
out all the analyses covered in the book (each with detailed instructions available from the book
website), a table of critical values for the
t-distribution, notation and formulas used throughout the
book, a glossary of important terms, a short mathematics refresher,
and brief answers to selected homework problems.
The first five chapters of the book have been used successfully in
quarter-length courses at a number of institutions. An alternative
approach for a quarter-length course would be to skip some of the
material in Chapters 4 and 5 and
substitute one or more of the case studies in
Chapter 6, or briefly introduce some of the topics in
Chapter 7. A semester-length course could comfortably
cover all the material in the book.
The website for the book contains supplementary material designed to
help both the instructor teaching from this book and the student
learning from it. There you'll find all the datasets used for
examples and homework problems in formats suitable for most
statistical software packages, as well as detailed instructions for using the major packages,
including SPSS, Minitab, SAS, JMP, Data Desk, EViews, Stata, Statistica, R, and S-PLUS.
There is also
some information on using the Microsoft Excel spreadsheet package for some of the
analyses covered in the book (dedicated statistical software is necessary to carry out
all of the analyses). The website also includes
information on obtaining a solutions manual containing complete
answers to all the homework problems, as well as further ideas for
organizing class time around the material in the book.
The book contains the following stylistic conventions:
 When displaying calculated values, the general approach is to be as accurate as possible
when it matters (such as in intermediate calculations for problems with many steps), but to
round appropriately when convenient or when reporting final results for real-world questions.
Displayed results from statistical software use the default rounding employed in R throughout.
 In the author's experience, many students find some traditional approaches to notation
and terminology a barrier to learning and understanding. Thus, some traditions have been
altered to improve ease of understanding. These include: using familiar Roman letters in place
of unfamiliar Greek letters [e.g., E(Y) rather than μ and b rather than β];
replacing the nonintuitive ȳ for the sample mean of Y with m_{Y}; using NH
and AH for null hypothesis and alternative hypothesis, respectively, rather than the usual
H_{0} and H_{a}.
Major changes for the second edition
 The first edition of this book was used in the regression analysis course run by
Statistics.com from 2008 to 2012. The lively discussion boards provided an invaluable
source for suggestions for changes to the book. This edition clarifies and expands on concepts
that students found challenging and addresses every question posed in those
discussions.
 The foundational material on interval estimation has been rewritten to clarify the
mathematics.
 There is new material on testing model assumptions, transformations, indicator variables,
nonconstant variance, autocorrelation, power and sample size, model building, and model
selection.
 As far as possible, I've replaced outdated data examples with more recent data, and also
used more appropriate data examples for particular topics (e.g., autocorrelation). In total, about
40% of the data files have been replaced.
 Most of the data examples now use descriptive names for variables rather than generic
letters such as Y and X.
 As in the first edition, this edition uses mathematics to explain methods and techniques only
where necessary, and formulas are used within the text only when they are instructive.
However, this edition also includes additional formulas in optional sections to aid those students
who can benefit from more mathematical detail.
 I've added many more end-of-chapter problems. In total, the number of problems has
increased by nearly 25%.
 I've updated and added new references, nearly doubling the total number of references.
 I've added a third case study to Chapter 6.
 The first edition included detailed computer software instructions for five major software
packages (SPSS, Minitab, SAS Analyst, R/S-PLUS, and Excel) in an appendix. This appendix
has been dropped from this edition; instead, instructions for newer software versions and other
packages (e.g., JMP and Stata) are now provided and kept up to date on the book website.
