Minitab instructions
These instructions accompany Applied Regression Modeling by Iain Pardoe (2nd edition,
published by Wiley in 2012). The numbered items cross-reference with the "computer help" references
in the book. These instructions are based on Minitab 17 for Windows, but they (or something similar)
should also work for other versions.
Getting started and summarizing univariate data
 If desired, change Minitab's default options by selecting
Tools > Options.
 To open a Minitab data file, select File > Open.
 To edit the last dialog box, select Edit > Edit Last Dialog or click
the Edit Last Dialog tool (ninth button from the left).
 Output appears in the Session Window and can be copied and pasted
from Minitab to a word processor like OpenOffice Writer or Microsoft Word. Graphs appear in
separate windows and can also easily be copied and pasted to other applications.
 You can access help by selecting Help > Help. For example, to
find out about "boxplots" click the Index tab, type boxplots in the first box, and
select the index entry you want in the second box.
 To transform data or compute a new variable, select
Calc > Calculator. Type a name (with no spaces) for the new variable in the
Store result in variable box, and type a mathematical expression for the variable in the
Expression box. Current variables in the dataset can be moved into the
Expression box, while the keypad and list of functions can be used to create the
expression. Examples are LOGE('X') for the natural logarithm of X and 'X'**2 for
X^{2}. Click OK to create the new variable, which will be added to the dataset
(check it looks correct in the Worksheet Window); it can now be used just like any other
variable. If you get the error message "Completion of computation impossible," this means there is
a syntax error in your Expression—a common mistake is to forget the multiplication
symbol (*) between a number and a variable (e.g., 2*'X' represents 2X).
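
As a cross-check outside Minitab, the same transformations can be sketched in Python. This is only an illustration: the values in x are made up, and the variable names are not from the book.

```python
import math

# Hypothetical values for a column X (illustration only).
x = [1.0, 2.5, 4.0]

log_x = [math.log(v) for v in x]  # like LOGE('X'): natural logarithm
x_sq = [v ** 2 for v in x]        # like 'X'**2
two_x = [2 * v for v in x]        # like 2*'X'; the * is required here too

print(log_x)
print(x_sq)
print(two_x)
```

As in the Minitab Calculator, leaving out the multiplication symbol (writing 2v instead of 2 * v) is a syntax error.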
 To create indicator (dummy) variables from a qualitative variable, select
Calc > Make Indicator Variables. Move the qualitative variable into the
Indicator variables for box, type a range of columns in which to store the variables (e.g.,
C5-C6) in the Store results in box, and click OK (check that the correct indicator
variables have been added to your spreadsheet in the Worksheet Window).
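
To see what the indicator columns should contain, here is a minimal Python sketch. The column name region and its values are hypothetical; the sketch creates one 0/1 indicator per distinct category, which is what you would check for in the Worksheet Window.

```python
# Hypothetical qualitative column (illustration only).
region = ["east", "west", "east", "north"]

# One indicator (dummy) variable per distinct category.
levels = sorted(set(region))
indicators = {lvl: [1 if v == lvl else 0 for v in region] for lvl in levels}

for lvl, col in indicators.items():
    print(lvl, col)
```

Each case has a 1 in exactly one indicator column and 0 in the others.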

 To find a percentile (critical value) for a t-distribution, select
Calc > Probability Distributions > T. Select Inverse cumulative probability, type
the Degrees of freedom, select Input constant, and type the lower-tail area (i.e.,
one minus the one-tail significance level). For example, typing 29 for the
Degrees of freedom and 0.95 for the Input constant returns the 95th
percentile of the t-distribution with 29 degrees of freedom (1.699), which is the critical value
for an upper-tail test with a 5% significance level. By contrast, typing 0.975 for the
Input constant returns the 97.5th percentile (2.045), which is the critical value for a
two-tail test with a 5% significance level.
 To find a percentile (critical value) for an F-distribution, select
Calc > Probability Distributions > F. Select Inverse cumulative probability, type
the Numerator degrees of freedom and Denominator degrees of freedom, select
Input constant, and type the lower-tail area (i.e., one minus the significance level).
For example, typing 2 for the Numerator degrees of freedom, 3 for the
Denominator degrees of freedom, and 0.95 for the Input constant returns
the 95th percentile of the F-distribution with 2 numerator degrees of freedom and 3 denominator
degrees of freedom (9.552).
 To find a percentile (critical value) for a chi-squared distribution, select
Calc > Probability Distributions > Chi-Square. Select Inverse cumulative probability, type
the Degrees of freedom, select Input constant, and type the lower-tail area (i.e.,
one minus the significance level). For example, typing 2 for the Degrees of freedom
and 0.95 for the Input constant returns the 95th percentile of the chi-squared
distribution with 2 degrees of freedom (5.991).
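
The three percentile examples above can be checked with a short Python sketch that uses only the standard library. This is not how Minitab computes them. The t percentile is found by numerically integrating the density and inverting by bisection. The chi-squared and F functions use closed forms that hold only when the degrees of freedom (numerator degrees of freedom for F) equal 2, which happens to match the examples. All function names are made up for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, n=2000):
    """Lower-tail area: Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / n
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_percentile(p, df):
    """Invert t_cdf by bisection on a wide bracket."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def chi2_percentile_2df(p):
    """Closed form valid ONLY for 2 degrees of freedom."""
    return -2 * math.log(1 - p)

def f_percentile_num2(p, den_df):
    """Closed form valid ONLY for 2 numerator degrees of freedom."""
    return (den_df / 2) * ((1 - p) ** (-2 / den_df) - 1)

print(round(t_percentile(0.95, 29), 3))     # compare with 1.699 above
print(round(t_percentile(0.975, 29), 3))    # compare with 2.045 above
print(round(f_percentile_num2(0.95, 3), 3)) # compare with 9.552 above
print(round(chi2_percentile_2df(0.95), 3))  # compare with 5.991 above
```

For other degrees of freedom the chi-squared and F shortcuts do not apply, so use Minitab (or a statistics library) instead.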

 To find an upper-tail area (one-tail p-value) for a t-distribution, select
Calc > Probability Distributions > T. Select Cumulative probability, type
the Degrees of freedom, select Input constant, and type the t-statistic. For
example, typing 29 for the Degrees of freedom and 2.40 for the
Input constant returns 0.988, which is one minus the upper-tail area for a t-statistic of
2.40 from the t-distribution with 29 degrees of freedom (i.e., the p-value for an upper-tail test
is 1−0.988=0.012). By contrast, 2*(1−0.988)=0.023 is the p-value for the corresponding
two-tail test.
 To find an upper-tail area (p-value) for an F-distribution, select
Calc > Probability Distributions > F. Select Cumulative probability, type
the Numerator degrees of freedom and Denominator degrees of freedom, select
Input constant, and type the F-statistic. For example, typing 2 for the
Numerator degrees of freedom, 3 for the Denominator degrees of freedom,
and 51.4 for the Input constant returns 0.995, which is one minus the
upper-tail area for an F-statistic of 51.4 from the F-distribution with 2 numerator degrees of
freedom and 3 denominator degrees of freedom (i.e., the p-value is 1−0.995=0.005).
 To find an upper-tail area (p-value) for a chi-squared distribution, select
Calc > Probability Distributions > Chi-Square. Select Cumulative probability, type
the Degrees of freedom, select Input constant, and type the chi-squared statistic.
For example, typing 2 for the Degrees of freedom and 0.38 for the
Input constant returns 0.173, which is one minus the upper-tail area for a chi-squared
statistic of 0.38 from the chi-squared distribution with 2 degrees of freedom (i.e., the p-value
is 1−0.173=0.827).
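
The three upper-tail area examples above can be checked with the following Python sketch (standard library only). It is an illustration rather than Minitab's method: the t area comes from numerical integration of the density, and the F and chi-squared functions are closed forms that hold only for 2 numerator degrees of freedom and 2 degrees of freedom respectively, as in the examples. The function names are made up.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, n=2000):
    """Lower-tail area: Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / n
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def f_upper_num2(f_stat, den_df):
    """Upper-tail area, valid ONLY for 2 numerator degrees of freedom."""
    return (1 + 2 * f_stat / den_df) ** (-den_df / 2)

def chi2_upper_2df(stat):
    """Upper-tail area, valid ONLY for 2 degrees of freedom."""
    return math.exp(-stat / 2)

p_upper = 1 - t_cdf(2.40, 29)   # one-tail p-value for t = 2.40, 29 df
p_two = 2 * p_upper             # corresponding two-tail p-value

print(round(p_upper, 3), round(p_two, 3))
print(round(f_upper_num2(51.4, 3), 3))
print(round(chi2_upper_2df(0.38), 3))
```

The printed values can be compared with the p-values quoted in the three items above.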
 Calculate descriptive statistics for quantitative variables by selecting
Stat > Basic Statistics > Display Descriptive Statistics. Move the variable(s) into the
Variable(s) list. Click Statistics to select the summaries, such as the
Mean, that you would like.
 Create contingency tables or cross-tabulations for qualitative
variables by selecting Stat > Tables > Cross Tabulation and Chi-Square. Move one
qualitative variable into the rows box and another into the columns box. Cell
percentages (within rows, columns, or the whole table) can be calculated by clicking
the appropriate boxes under Display.
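
To make clear what the cell counts and row percentages represent, here is a small Python sketch with made-up data. The variable names gender and smoker are hypothetical.

```python
from collections import Counter

# Hypothetical paired observations of two qualitative variables.
gender = ["F", "M", "F", "F", "M"]
smoker = ["no", "no", "yes", "no", "yes"]

# Cell counts of the cross-tabulation, keyed by (row, column).
counts = Counter(zip(gender, smoker))
rows = sorted(set(gender))
cols = sorted(set(smoker))

for r in rows:
    row_total = sum(counts[(r, c)] for c in cols)
    # Count and within-row percentage for each cell in this row.
    cells = [f"{counts[(r, c)]} ({100 * counts[(r, c)] / row_total:.0f}%)" for c in cols]
    print(r, cells)
```

Column or whole-table percentages follow the same pattern with a different denominator.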
 If you have a quantitative variable and a qualitative variable, you can calculate
descriptive statistics for cases grouped in different categories by selecting
Stat > Tables > Descriptive Statistics. Move the qualitative variable into the
rows box (and another qualitative variable into the columns box if there is more
than one). Click Associated Variables to select the quantitative variable for which you
would like descriptive statistics, and the descriptive statistics to display; the default is the
number of cases, but other statistics such as the Mean and Standard Deviation can
also be selected.
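
The grouped summaries can be illustrated with a short Python sketch. The data and names (group, y) are invented for the example; the sketch reports the number of cases, mean, and standard deviation within each category.

```python
import statistics
from collections import defaultdict

# Hypothetical data: a qualitative variable and a quantitative variable.
group = ["a", "b", "a", "b", "a"]
y = [10.0, 14.0, 12.0, 18.0, 14.0]

# Collect the quantitative values within each category.
by_group = defaultdict(list)
for g, v in zip(group, y):
    by_group[g].append(v)

# (number of cases, mean, sample standard deviation) per category.
summary = {
    g: (len(v), statistics.mean(v), statistics.stdev(v))
    for g, v in by_group.items()
}

for g in sorted(summary):
    print(g, summary[g])
```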
 To make a stem-and-leaf plot for a quantitative variable, select
Graph > Stem-and-Leaf. Move the variable into the Graph variables box.
 To make a histogram for a quantitative variable, select
Graph > Histogram. Choose Simple and move the variable into the
Graph variables box.
 To make a scatterplot with two quantitative variables, select
Graph > Scatterplot. Choose Simple and move the vertical axis variable into the
first row of the Y variables column and the horizontal axis variable into the first row of
the X variables column.
 All possible scatterplots for more than two variables can be drawn simultaneously
(called a scatterplot matrix) by selecting Graph > Matrix Plot, choosing
Matrix of plots, Simple, and moving the variables into the Graph variables
list.
 You can mark or label cases in a scatterplot with different colors/symbols
according to categories in a qualitative variable by selecting Graph > Scatterplot and
choosing With Groups. After moving the vertical axis variable into the first row of the
Y variables column and the horizontal axis variable into the first row of the
X variables column, move the grouping variable into the
Categorical variables for grouping box. To change the colors/symbols used, select the
symbols you want to change by clicking on one of the points with that symbol twice (all the data
points should become highlighted on the first click, and just the points in that group should remain
highlighted on the second click). Then select Editor > Edit Symbols. Select the
color/symbol you want and click OK to see the effect.
 You can identify individual cases in a scatterplot by hovering over
them.
 To remove one or more observations from a dataset, select
Data > Subset Worksheet. Select Specify which rows to exclude and select
one of the subsequent options.
 To make a bar chart for cases in different categories, select
Graph > Bar Chart.
 For frequency bar charts of one qualitative variable, choose Simple with
Bars represent: Counts of unique values and move the variable into the
Categorical variables box.
 For frequency bar charts of two qualitative variables, choose Cluster
with Bars represent: Counts of unique values and move the variables into the
Categorical variables box.
 The bars can also represent various summary functions for a quantitative variable.
For example, to represent means, select Bars represent: A function of a variable and select
Mean for the function.
 To make boxplots for cases in different categories, select
Graph > Boxplot. Choose One Y, With Groups, move the quantitative variable into
the Graph variables box, and move the qualitative variable(s) into the
Categorical variables box.
 To make a QQ-plot (also known as a normal probability plot) for a
quantitative variable, select Graph > Probability Plot. Choose Single and move
the variable into the Graph variables box.
 To compute a confidence interval for a univariate population mean, select
Stat > Basic Statistics > 1-Sample t. Move the variable for which you want to
calculate the confidence interval into the Samples in columns box. Then click the
Options button to bring up another dialog box in which you can specify the confidence level
for the interval. Clicking OK will take you back to the previous dialog box, where you can
now click OK.
 To do a hypothesis test for a univariate population mean, select
Stat > Basic Statistics > 1-Sample t. Move the variable for which you want to do
the test into the Samples in columns box, check Perform hypothesis test, and type
the (null) hypothesized value into the Hypothesized mean box. Then click the
Options button to bring up another dialog box in which you can specify a lower-tailed
("less than"), upper-tailed ("greater than"), or two-tailed ("not equal") alternative hypothesis.
Clicking OK will take you back to the previous dialog box, where you can now click
OK.
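
The arithmetic behind the 1-Sample t confidence interval and hypothesis test can be sketched in Python. The summary statistics and the null value below are hypothetical. The sample size of 30 is chosen so that the 29 degrees of freedom match the t percentiles quoted earlier in these instructions (2.045 for a two-tail test and 1.699 for an upper-tail test at the 5% level).

```python
import math

# Hypothetical summary statistics (illustration only): n = 30 gives 29 df.
n, ybar, s = 30, 52.0, 6.0

t_two_tail = 2.045   # 97.5th percentile of t with 29 df (quoted earlier)
t_upper = 1.699      # 95th percentile of t with 29 df (quoted earlier)

se = s / math.sqrt(n)                             # standard error of the mean
ci = (ybar - t_two_tail * se, ybar + t_two_tail * se)  # 95% confidence interval

mu0 = 50.0                                        # hypothetical null value
t_stat = (ybar - mu0) / se                        # t-statistic

reject_two_tail = abs(t_stat) > t_two_tail        # "not equal" alternative
reject_upper_tail = t_stat > t_upper              # "greater than" alternative

print(ci, t_stat, reject_two_tail, reject_upper_tail)
```

With these made-up numbers the upper-tail test rejects the null hypothesis at the 5% level while the two-tail test does not, which shows why the choice of alternative in the Options dialog box matters.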
Simple linear regression
 To fit a simple linear regression model (i.e., find a least squares line),
select Stat > Regression > Regression > Fit Regression Model. Move the response variable into the
Response box and the predictor variable into the Predictors box. Just
click OK for now—the other items in the dialog box are addressed below. In the rare
circumstance that you wish to fit a model without an intercept term (regression through the origin),
click the Model button and deselect Include the constant term in the model before clicking OK.
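
For reference, the least squares line that Minitab reports can be reproduced by hand. The Python sketch below uses made-up data and the usual formulas (slope equal to Sxy/Sxx, intercept equal to the mean of Y minus the slope times the mean of X); the variable names are not Minitab's.

```python
# Hypothetical data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Sums of squares and cross-products about the means.
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = sxy / sxx            # estimated slope
b0 = ybar - b1 * xbar     # estimated intercept

print(round(b0, 4), round(b1, 4))
```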
 To add a regression line or least squares line to a scatterplot,
select Editor > Add > Regression Fit, and Linear for the Model Order.
You can create a scatterplot with a regression line superimposed by selecting
Graph > Scatterplot. Choose With Regression and move the response variable into
the first row of the Y variables column and the predictor variable into the first row of
the X variables column.
 To find 95% confidence intervals
for the regression parameters in a linear regression model, select
Stat > Regression > Regression > Fit Regression Model. Move the response variable into the
Response box and the predictor variable into the Predictors box. Before clicking
OK, click the Results button, select Expanded Table and check Coefficients. The confidence intervals are displayed as the final two columns of the "Coefficients" output. This applies more generally to multiple linear regression also.

 To find a fitted value or predicted value of Y (the response
variable) at a particular value of X (the predictor variable), select
Stat > Regression > Regression > Fit Regression Model. Move the response variable into the
Response box and the predictor variable into the Predictors box. Before clicking
OK, click the Storage button, check Fits, then click OK to return to
the main Regression dialog box, and then click OK. The predicted or fitted values
of Y at each of the X-values in the dataset are displayed in
a column headed FITS in the Worksheet Window. Each time you ask Minitab to
calculate predicted or fitted values like this, it will add a new column to the dataset and
increment an end digit by one. For example, the second time you calculate a predicted or fitted
value of Y it will be called FITS_1.
 You can also obtain a predicted or fitted value of Y at an X-value that is
not in the dataset by selecting
Stat > Regression > Regression > Predict (after fitting a model). Type the X-value into the first space beneath the predictor variable label. In this case, the predicted or fitted value of
Y at this X-value is displayed in the Session Window as "Fit" (not in the Worksheet
Window).
 This applies more generally to multiple linear regression also.

 To find a confidence interval for the mean of Y at a particular value of
X, select Stat > Regression > Regression > Predict (after fitting a model). In the pull-down menu below where it says "Response," change the option to "Enter columns of values" and select the predictor variable to go in the box labeled with the name of the predictor. The
confidence intervals for the mean of Y at each of the X-values in the dataset are displayed as two columns headed CLIM and CLIM_1 in the
Worksheet Window. Each time you ask Minitab to calculate confidence intervals like this, it
will add new columns to the dataset and increment the end digit by one. For example, the second
time you calculate confidence intervals for the mean of Y the end points will be called
CLIM_2 and CLIM_3.
 You can also obtain a confidence interval for the mean of Y at
an X-value that is not in the dataset by selecting
Stat > Regression > Regression > Predict (after fitting a model). Type the X-value into the first space beneath the predictor variable label. In this case, the confidence
interval for the mean of Y at this X-value is displayed only in the Session Window (and not
in the Worksheet Window).
 This applies more generally to multiple linear regression also.

 To find a prediction interval for an individual value of Y at a particular
value of X, select Stat > Regression > Regression > Predict (after fitting a model). In the pull-down menu below where it says "Response," change the option to "Enter columns of values" and select the predictor variable to go in the box labeled with the name of the predictor. The
prediction intervals for an individual Y-value at each of the X-values in the dataset are displayed
as two columns headed PLIM and PLIM_1 in
the Worksheet Window. Each time you ask Minitab to calculate prediction intervals like this, it will add new columns to the dataset and increment the end digit by one. For example, the
second time you calculate prediction intervals for an individual Y-value the end points will be
called PLIM_2 and PLIM_3.
 You can also obtain a prediction interval for an individual value of
Y at an X-value that is not in the dataset by selecting
Stat > Regression > Regression > Predict (after fitting a model). Type the X-value into the first space beneath the predictor variable label. In this case, the prediction interval for an individual Y-value at this X-value is displayed only in the Session
Window (and not in the Worksheet Window).
 This applies more generally to multiple linear regression also.
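
To show how the confidence interval for the mean of Y differs from the prediction interval for an individual Y-value, here is a Python sketch for simple linear regression using the standard textbook formulas. The data and the X-value x0 are made up. The critical value 3.182 is the 97.5th percentile of the t-distribution with 3 degrees of freedom (n − 2 for these five cases), taken from standard t tables rather than computed here.

```python
import math

# Hypothetical data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# Regression standard error from the residuals (n - 2 degrees of freedom).
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e * e for e in resid) / (n - 2))

x0 = 3.5                  # hypothetical X-value of interest
fit = b0 + b1 * x0        # fitted (predicted) value at x0

# Standard errors: mean of Y at x0, and an individual Y-value at x0.
se_mean = s * math.sqrt(1 / n + (x0 - xbar) ** 2 / sxx)
se_pred = s * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)

t_crit = 3.182            # 97.5th percentile of t with 3 df (from t tables)
conf_int = (fit - t_crit * se_mean, fit + t_crit * se_mean)
pred_int = (fit - t_crit * se_pred, fit + t_crit * se_pred)

print(fit, conf_int, pred_int)
```

Both intervals are centered at the same fitted value, but the prediction interval is always wider because it also accounts for the variation of an individual Y-value about the mean.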
Multiple linear regression
 To fit a multiple linear regression model, select
Stat > Regression > Regression > Fit Regression Model. Move the response variable into the Response box
and the predictor variables into the Predictors box. In the rare
circumstance that you wish to fit a model without an intercept term (regression through the origin),
click the Model button and deselect Include the constant term in the model before clicking OK.
 To add a quadratic regression line to a scatterplot, select
Editor > Add > Regression Fit, and Quadratic for the Model Order. You
can create a scatterplot with a quadratic regression line superimposed by selecting
Graph > Scatterplot. Choose With Regression and move the vertical axis variable
into the first row of the Y variables column and the horizontal axis variable into the
first row of the X variables column. Before clicking OK, click the
Data View button, click the Regression tab in the subsequent
Scatterplot - Data View dialog box, and change the Model Order from
Linear to Quadratic. Click OK to return to the
Scatterplot - With Regression dialog box, and OK again to create the graph.
 Categories of a qualitative variable can be thought of as defining subsets
of the sample. If there is also a quantitative response and a quantitative predictor variable in
the dataset, a regression model can be fit to the data to represent separate regression lines for
each subset. To display a regression line for each subset in a scatterplot, select
Graph > Scatterplot and choose With Regression and Groups. After moving the
vertical axis variable into the first row of the Y variables column and the horizontal axis
variable into the first row of the X variables column, move the grouping variable into the
Categorical variables for grouping box. Click OK to create the graph.
 Minitab does not appear to offer an automatic way to find the F-statistic and
associated p-value for a nested model F-test in multiple linear regression. It is possible
to calculate these quantities by hand using Minitab regression output and appropriate percentiles
from an F-distribution.
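
The hand calculation can be sketched in Python. The residual sums of squares and degrees of freedom below are hypothetical stand-ins for values read off two Minitab regression outputs (one for the reduced model, one for the complete model), and the variable names are made up. The p-value line uses a closed form that is valid only because the numerator degrees of freedom equal 2 in this example; in general, find the upper-tail area with Calc > Probability Distributions > F as described earlier.

```python
# Hypothetical values read from two Minitab regression outputs.
rss_reduced, df_reduced = 120.0, 27    # residual SS and error df, reduced model
rss_complete, df_complete = 90.0, 25   # residual SS and error df, complete model

num_df = df_reduced - df_complete      # number of predictor terms being tested
den_df = df_complete

# Nested model F-statistic.
f_stat = ((rss_reduced - rss_complete) / num_df) / (rss_complete / den_df)

# Upper-tail area; this shortcut holds ONLY when num_df == 2.
assert num_df == 2
p_value = (1 + 2 * f_stat / den_df) ** (-den_df / 2)

print(round(f_stat, 3), round(p_value, 4))
```

A small p-value suggests that the complete model is a significant improvement over the reduced model.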
 To save residuals in a multiple linear regression model, select
Stat > Regression > Regression > Fit Regression Model. Move the response variable into the Response box
and the predictor variables into the Predictors box. Before clicking OK,
click the Storage button and check Residuals under Diagnostic Measures in
the subsequent Regression: Storage dialog box. Click OK to return to the main
Regression dialog box, and then click OK. The residuals are saved as a variable
called RESI in the Worksheet Window; they can now be used just like any other
variable, for example, to construct residual plots. Each time you ask Minitab to save residuals
like this, it will add a new variable to the dataset and increment an end digit by one; for
example, the second time you save residuals they will be called RESI_1. To save
what Pardoe (2012) calls standardized residuals, check Standardized residuals under
Diagnostic Measures in the Regression: Storage dialog box; they will be saved
as a variable called SRES in the Worksheet Window. To save
what Pardoe (2012) calls studentized residuals, check Deleted t residuals under
Diagnostic Measures in the Regression: Storage dialog box; they will be saved
as a variable called TRES in the Worksheet Window.
 To add a loess fitted line to a scatterplot (useful for checking the zero
mean regression assumption in a residual plot), select Editor > Add > Smoother. The
default value of 0.5 for Degree of smoothing tends to be a little on the low side:
I would change it to 0.75. You can create a scatterplot with a loess fitted line
superimposed by selecting Graph > Scatterplot. Choose With Regression and move
the vertical axis variable into the first row of the Y variables column and the horizontal
axis variable into the first row of the X variables column. Before hitting OK,
click the Data View button, click the Smoother tab in the subsequent
Scatterplot - Data View dialog box, and change the Smoother from None to
Lowess. Hit OK to return to the Scatterplot - With Regression dialog
box, and OK again to create the graph.
 To save leverages in a multiple linear regression model, select
Stat > Regression > Regression > Fit Regression Model. Move the response variable into the Response box
and the predictor variables into the Predictors box. Before clicking OK,
click the Storage button and check Hi (leverages) under Diagnostic
Measures in the subsequent Regression: Storage dialog box. Click OK to
return to the main Regression dialog box, and then hit OK. The leverages
are saved as a variable called HI in the Worksheet Window; they can now be used
just like any other variable, for example, to construct scatterplots. Each time you ask Minitab to
save leverages like this, it will add a new variable to the dataset and increment an end digit by
one; for example, the second time you save leverages they will be called HI_1.
 To save Cook's distances in a multiple linear regression model, select
Stat > Regression > Regression > Fit Regression Model. Move the response variable into the Response box
and the predictor variables into the Predictors box. Before clicking OK,
click the Storage button and check Cook's distance under Diagnostic
Measures in the subsequent Regression: Storage dialog box. Click OK to
return to the main Regression dialog box, and then hit OK. Cook's distances are
saved as a variable called COOK in the Worksheet Window; they can now be used
just like any other variable, for example, to construct scatterplots. Each time you ask Minitab to
save Cook's distances like this, it will add a new variable to the dataset and increment an end
digit by one; for example, the second time you save Cook's distances they will be called
COOK_1.
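
To clarify how the stored diagnostics (RESI, SRES, TRES, HI, and COOK) relate to one another, the Python sketch below computes them with the standard textbook formulas. It is restricted to simple linear regression, where the leverage has a simple closed form, and it uses made-up data; it is not Minitab's code, and the variable names are invented.

```python
import math

# Hypothetical data (illustration only); simple linear regression.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(x)
p = 2                      # regression parameters: intercept and slope

xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]        # like RESI
s = math.sqrt(sum(e * e for e in resid) / (n - p))           # regression std. error

lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]           # like HI (leverages)

# Standardized residuals (like SRES).
std_res = [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, lev)]

# Studentized, i.e. deleted t, residuals (like TRES).
stud_res = [r * math.sqrt((n - p - 1) / (n - p - r * r)) for r in std_res]

# Cook's distances (like COOK).
cooks = [r * r * h / (p * (1 - h)) for r, h in zip(std_res, lev)]

for row in zip(x, resid, lev, std_res, stud_res, cooks):
    print([round(v, 3) for v in row])
```

The leverages always sum to the number of regression parameters (2 here), which is a quick check that they were computed correctly.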
 To create some residual plots automatically in a multiple linear regression
model, select Stat > Regression > Regression > Fit Regression Model. Move the response variable into the
Response box and the predictor variables into the Predictors box. Before
clicking OK, click the Graphs button and select Deleted under
Residuals for Plots in the subsequent
Regression - Graphs dialog box. Check Residuals versus fits under Individual
plots to create a scatterplot of the studentized residuals on the vertical axis versus the
predicted values on the horizontal axis. You could also move individual predictor
variables into the Residuals versus the variables box to create residual plots with each
predictor variable on the horizontal axis. Click OK to return to the main
Regression dialog box, and then hit OK. To create residual plots manually, first
create studentized residuals (see computer help #35), and then construct scatterplots with these studentized
residuals on the vertical axis.
 To create a correlation matrix of quantitative variables (useful for
checking potential multicollinearity problems), select Stat > Basic Statistics >
Correlation. Move the variables into the Variables box and hit OK.
 Minitab now displays variance inflation factors by default in multiple linear regression. The variance inflation
factors are in the last column of the main regression output under "VIF."
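
The link between correlation and variance inflation factors can be illustrated in Python. In general the variance inflation factor for a predictor is 1/(1 − R²), where R² comes from regressing that predictor on the other predictors. With exactly two predictors, that R² is just the squared correlation between them, which is the special case sketched here. The data and the function name pearson are made up.

```python
import math

def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(a)
    abar = sum(a) / n
    bbar = sum(b) / n
    sab = sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b))
    saa = sum((ai - abar) ** 2 for ai in a)
    sbb = sum((bi - bbar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

# Hypothetical predictor values (illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson(x1, x2)
vif = 1 / (1 - r ** 2)    # same for both predictors when there are only two

print(round(r, 3), round(vif, 3))
```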
 To draw a predictor effect plot for graphically displaying the effects of
transformed quantitative predictors and/or interactions between quantitative and qualitative
predictors in multiple linear regression, first create a variable representing the effect, say,
"X1effect" (see computer help #6). Then select Graph > Scatterplot. Choose With
Connect and Groups and move the "X1effect" variable into the first row of the
Y variables column and X1 into the first row of the X variables column.
 If the "X1effect" variable just involves X1 (e.g., 1 + 3X1 + 4X1^{2}),
you can click OK at this point.
 If the "X1effect" variable also involves a qualitative variable (e.g.,
1 − 2X1 + 3D2X1, where D2 is an indicator variable), you should move the qualitative variable
into the Categorical variables for grouping box before clicking OK.
See Section 5.5 in Pardoe (2012) for an example.
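
The "X1effect" variable for the two examples above can be sketched in Python as follows. The X1 values are made up, and the variable names are hypothetical; in Minitab you would create the same columns with Calc > Calculator.

```python
# Hypothetical X1 values (illustration only).
x1 = [0.0, 0.5, 1.0, 1.5, 2.0]

# Effect involving X1 only: 1 + 3*X1 + 4*X1^2.
effect_quadratic = [1 + 3 * v + 4 * v ** 2 for v in x1]

# Effect involving an indicator D2: 1 - 2*X1 + 3*D2*X1,
# evaluated separately for D2 = 0 and D2 = 1 (one line per group in the plot).
effect_by_group = {d2: [1 - 2 * v + 3 * d2 * v for v in x1] for d2 in (0, 1)}

print(effect_quadratic)
print(effect_by_group)
```

Plotting each list against x1, with one connected line per group, gives the predictor effect plot.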
Last updated: Oct 2016
© 2016, Iain Pardoe