JMP instructions

These instructions accompany Applied Regression Modeling by Iain Pardoe, 2nd edition, published by Wiley in 2012. The numbered items cross-reference the "computer help" references in the book. These instructions are based on SAS JMP 10 for Mac OS, but they (or something similar) should also work for other versions.

Getting started and summarizing univariate data

  1. If desired, change JMP's default options by selecting JMP > Preferences (Mac) or File > Preferences (Windows).
  2. To open a JMP data file, select File > Open. You can also use File > Open to open text data files or Excel spreadsheets. For Excel spreadsheets, check the box labeled Always enforce Excel Row 1 as labels if the spreadsheet has the variable labels in the first row.
  3. To relaunch an analysis or recall its dialog after running it, click the red triangle next to the analysis name at the top of the output window, and select Script > Relaunch Analysis or Model dialog.
  4. Output appears in a separate window each time you run an analysis. If you click on the "selection tool" (the third button from the left at the top of the window that looks like a "+"), you can select the output by clicking on it, and then right-click to Copy so that you can then paste it to a word processor like OpenOffice Writer or Microsoft Word.
  5. You can access help by selecting Help > Statistics Index, then selecting the topic that you would like help with. There is also a Help button in each analysis dialog box.
  6. To transform data or compute a new variable, select Cols > New Column, type the new variable name in the Column Name box, and select Formula under Column Properties. In the resulting dialog box, select the variable to be transformed under Table Columns and build the formula using the various operations and functions. Examples are Transcendental > Log for the natural logarithm and x^y for powers such as 2 ("squared"). The new variable should appear in the data spreadsheet (check that it looks correct) and can now be used just like any other variable.
  7. To create indicator (dummy) variables from a qualitative variable, select the qualitative variable and select Cols > Recode. Type the values 0 and 1 under New Value for the appropriate categories and change In Place to New Column. Check that the correct indicator variable has been created in the spreadsheet. Change the name and data/modeling type of the created variable by double-clicking the column heading (Data Type should be Numeric rather than Character and Modeling Type should be Continuous rather than Nominal). Repeat for other indicator variables (if necessary).
  8. Calculate descriptive statistics for quantitative variables by selecting Analyze > Distribution. Move the variable(s) into the Y, Columns list and click OK. In the resulting output window, you can select additional output by clicking on the red triangle next to each variable name.
  9. Create contingency tables or cross-tabulations for qualitative variables by selecting Analyze > Fit Y by X. Move one qualitative variable into the Y, Response list and another into the X, Factor list. Cell percentages (within rows, columns, or the whole table) are displayed automatically in the resulting table.
  10. If you have quantitative variables and qualitative variables, you can calculate descriptive statistics for cases grouped in different categories by selecting Tables > Summary. Select the quantitative variable(s) and then select the summaries that you would like from the Statistics menu. Move the qualitative variable(s) into the Group list.
  11. To make a stem-and-leaf plot for a quantitative variable, select Analyze > Distribution. Move the variable(s) into the Y, Columns list and click OK. In the resulting output window, you can select Stem and Leaf by clicking on the red triangle next to each variable name.
  12. To make a histogram for a quantitative variable, select Analyze > Distribution. Move the variable(s) into the Y, Columns list and click OK. In the resulting output window, you can select various Histogram Options by clicking on the red triangle next to each variable name.
  13. To make a scatterplot with two quantitative variables, select Analyze > Fit Y by X. Move the vertical axis variable into the Y, Response box and the horizontal axis variable into the X, Factor box.
  14. All possible scatterplots for more than two variables can be drawn simultaneously (called a scatterplot matrix) by selecting Graph > Scatterplot Matrix. Move all the variables into the Y, Columns box.
  15. You can mark or label cases in a scatterplot with different colors/symbols according to categories in a qualitative variable by selecting Rows > Color or Mark by Column... before drawing the plot. Select the column containing the variable you wish to mark by.
  16. You can identify individual cases in a scatterplot by hovering over individual points in the scatterplot. If you double-click a point, the corresponding row in the spreadsheet will be highlighted.
  17. To remove one or more observations from a dataset, right-click on the row number(s) in the data spreadsheet and select Exclude/Unexclude.
  18. To make a bar chart for cases in different categories, select Graph > Chart.
  19. To make boxplots for cases in different categories, select Analyze > Fit Y by X.
  20. To make a QQ-plot (also known as a normal probability plot) for a quantitative variable, select Analyze > Distribution. Move the variable into the Y, Columns list and click OK. In the resulting output window, you can select Normal Quantile Plot by clicking on the red triangle next to the variable name.
  21. To compute a confidence interval for a univariate population mean, select Analyze > Distribution. Move the variable into the Y, Columns list and click OK. In the resulting output window, you can select Confidence Interval by clicking on the red triangle next to the variable name. Enter the confidence level in the resulting Confidence Intervals dialog box and click OK.
  22. To do a hypothesis test for a univariate population mean, select Analyze > Distribution. Move the variable into the Y, Columns list and click OK. In the resulting output window, you can select Test Mean by clicking on the red triangle next to the variable name. Enter the (null) hypothesized mean in the resulting Test Mean dialog box and click OK.
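
The calculations behind items 6, 7, 10, and 22 above can be cross-checked outside JMP. The following is a minimal Python sketch, not JMP code and not taken from the book; the data values and the names price, garage, group_summary, and t_statistic are hypothetical and chosen only for illustration. It computes a log and a squared transformation, a 0/1 indicator variable, group means and standard deviations, and the t-statistic for a test of a hypothesized mean (the p-value, which JMP reports, requires a t-distribution and is not computed here).

```python
import math
import statistics

# Hypothetical data (assumed for illustration): a quantitative variable
# and a qualitative variable with two categories.
price = [120.0, 150.0, 135.0, 210.0, 180.0, 165.0]
garage = ["yes", "no", "yes", "yes", "no", "yes"]

# Item 6: transformed variables (natural logarithm and square).
log_price = [math.log(p) for p in price]
price_sq = [p ** 2 for p in price]

# Item 7: a numeric 0/1 indicator variable for the category "yes".
d_garage = [1 if g == "yes" else 0 for g in garage]


# Item 10: descriptive statistics for cases grouped by category.
def group_summary(values, groups):
    out = {}
    for g in sorted(set(groups)):
        subset = [v for v, k in zip(values, groups) if k == g]
        out[g] = {
            "n": len(subset),
            "mean": statistics.mean(subset),
            # The sample standard deviation needs at least two cases.
            "sd": statistics.stdev(subset) if len(subset) > 1 else float("nan"),
        }
    return out


# Item 22: t-statistic for testing a (null) hypothesized mean mu0.
def t_statistic(values, mu0):
    n = len(values)
    std_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / std_error


summary = group_summary(price, garage)
t_stat = t_statistic(price, 150.0)
print(summary)
print(round(t_stat, 4))
```

If the numbers differ from JMP's output for the same data, check first that excluded rows (item 17) are handled the same way in both places.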

Simple linear regression

  1. To fit a simple linear regression model (i.e., find a least squares line), select Analyze > Fit Model. Move the response variable into the Y box, select the predictor variable and Add it to the Construct Model Effects box, and click Run. In the rare circumstance that you wish to fit a model without an intercept term (regression through the origin), click No Intercept before clicking Run.
  2. To add a regression line or least squares line to a scatterplot, select Analyze > Fit Y by X. Move the response variable into the Y, Response box, move the predictor variable into the X, Factor box, and click OK. Click on the red triangle in the resulting Fit Y by X output window, and select Fit Line.
  3. To find 95% confidence intervals for the regression parameters in a simple or multiple linear regression model, fit the model using computer help #25 or #31, right-click in the body of the Parameter Estimates table in the resulting Fit Least Squares output window, and select Columns > Lower 95% and Columns > Upper 95%.
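
For readers who want to see what the least squares fit and the Lower 95% and Upper 95% columns are computing, here is a minimal Python sketch under stated assumptions. It is not JMP code; the data are hypothetical, and the critical value T_CRIT is an assumption read from a t table for n - 2 = 3 degrees of freedom (JMP computes this value itself). The sketch fits a line with an intercept, fits a line through the origin (the No Intercept option), and forms a 95% confidence interval for the slope.

```python
import math

# Hypothetical data (assumed for illustration): predictor x and response y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares line with an intercept.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Regression through the origin (no intercept term).
slope_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

# Residual standard error and the standard error of the slope.
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e * e for e in resid) / (n - 2))
se_slope = s / math.sqrt(sxx)

# Assumed critical value: 97.5th percentile of the t-distribution, 3 df.
T_CRIT = 3.182
lower = slope - T_CRIT * se_slope
upper = slope + T_CRIT * se_slope

print(intercept, slope, slope_origin)
print(lower, upper)
```

With a different sample size, T_CRIT has to be replaced by the percentile for the new degrees of freedom.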

Multiple linear regression

  1. To fit a multiple linear regression model, select Analyze > Fit Model. Move the response variable into the Y box, select the predictor variables and Add them to the Construct Model Effects box, and click Run. In the rare circumstance that you wish to fit a model without an intercept term (regression through the origin), click No Intercept before clicking Run.
  2. To add a quadratic regression line to a scatterplot, select Analyze > Fit Y by X. Move the response variable into the Y, Response box, move the predictor variable into the X, Factor box, and click OK. Click on the red triangle in the resulting Fit Y by X output window, and select Fit Polynomial > 2,quadratic.
  3. Categories of a qualitative variable can be thought of as defining subsets of the sample. If there are also a quantitative response and a quantitative predictor variable in the dataset, a regression model can be fit to the data to represent separate regression lines for each subset. First use computer help #15 and #17 to make a scatterplot with the response variable on the vertical axis, the quantitative predictor variable on the horizontal axis, and the cases marked with different colors according to the categories in the qualitative predictor variable. To add a regression line for each subset to this scatterplot, first click on the red triangle in the resulting Fit Y by X output window, select Group By ..., select the qualitative predictor variable, and click OK. Then click on the red triangle again and select Fit Line.
  4. To find the F-statistic and associated p-value for a nested model F-test in multiple linear regression, fit the model using computer help #31, click on the red triangle next to Response in the resulting Fit Least Squares output window, and select Custom Test.... The resulting Custom Test output will have a list of regression parameters that has a column of zeroes next to it; click the zero next to the first parameter in the nested F-test null hypothesis and change the value to "1." Then click Add Column and repeat for the second parameter in the null hypothesis. Repeat for each of the parameters in the null hypothesis, then click Done.
  5. To save residuals in a multiple linear regression model, fit the model using computer help #31, click on the red triangle next to Response in the resulting Fit Least Squares output window, and select Save Columns > Residuals. The residuals are saved as a variable called Residual *, where the star represents the response variable name; they can now be used just like any other variable, for example, to construct residual plots. To save what Pardoe (2012) calls standardized residuals, select Save Columns > Studentized Residuals; they will be saved as a variable called Studentized Resid *. JMP does not appear to offer a way to save what Pardoe (2012) calls studentized residuals.
  6. JMP does not appear to offer a way to add a loess fitted line to a scatterplot but it can add a similar smoothing spline fitted line (useful for checking the zero mean regression assumption in a residual plot). To do so, select Analyze > Fit Y by X. Move the vertical axis variable (e.g., the studentized residuals) into the Y, Response box, move the horizontal axis variable into the X, Factor box, and click OK. Click on the red triangle in the resulting Fit Y by X output window, and select Fit Spline; you can experiment to find a value for the smoothing parameter "lambda" that captures the major trends in the scatterplot without being overly "wiggly," but typically a value of 1 or 10 should work well.
  7. To save leverages in a multiple linear regression model, fit the model using computer help #31, click on the red triangle next to Response in the resulting Fit Least Squares output window, and select Save Columns > Hats. The leverages are saved as a variable called h *, where the star represents the response variable name; they can now be used just like any other variable, for example, to construct scatterplots.
  8. To save Cook's distances in a multiple linear regression model, fit the model using computer help #31, click on the red triangle next to Response in the resulting Fit Least Squares output window, and select Save Columns > Cook's D Influence. The Cook's distances are saved as a variable called Cook's D Influence *, where the star represents the response variable name; they can now be used just like any other variable, for example, to construct scatterplots.
  9. JMP will automatically create a residual plot in a multiple linear regression model, specifically one with the (ordinary) residuals on the vertical axis versus the predicted values on the horizontal axis. To create residual plots manually, first create standardized residuals (see computer help #35), and then construct scatterplots with these standardized residuals on the vertical axis.
  10. To create a correlation matrix of quantitative variables (useful for checking potential multicollinearity problems), select Analyze > Multivariate Methods > Multivariate. Move all the variables into the Y, Columns box and click OK.
  11. To find variance inflation factors in multiple linear regression, fit the model using computer help #31, right-click in the body of the Parameter Estimates table in the resulting Fit Least Squares output window, and select Columns > VIF.
  12. To draw a predictor effect plot for graphically displaying the effects of transformed quantitative predictors and/or interactions between quantitative and qualitative predictors in multiple linear regression, first create a variable representing the effect, say, "X1effect" (see computer help #6). See Section 5.5 in Pardoe (2012) for an example.
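
The saved columns and statistics in the items above (residuals, leverages, Cook's distances, variance inflation factors, and the nested model F-statistic) can be reproduced by hand for a small dataset. The following Python sketch is an illustration under stated assumptions, not JMP's implementation: the data and the function names invert, fit, vif, and nested_f are hypothetical, and the formulas are the standard textbook ones (leverage from the hat matrix, residual divided by s times the square root of one minus leverage, and Cook's distance scaled by the number of regression parameters). The residual scaled this way corresponds to what JMP saves as Studentized Resid *.

```python
import math


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring up the row with the largest entry.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit(y, predictors):
    """Least squares fit with an intercept; predictors is a list of columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    p = len(X[0])  # number of regression parameters, including the intercept
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(p)) for i in range(n)]
    # Leverages: diagonal elements of the hat matrix.
    hat = [sum(X[i][a] * inv[a][b] * X[i][b]
               for a in range(p) for b in range(p)) for i in range(n)]
    sse = sum(e * e for e in resid)
    s = math.sqrt(sse / (n - p))
    std_resid = [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, hat)]
    cooks = [r * r * h / ((1 - h) * p) for r, h in zip(std_resid, hat)]
    y_bar = sum(y) / n
    sst = sum((v - y_bar) ** 2 for v in y)
    return {"beta": beta, "resid": resid, "hat": hat, "std_resid": std_resid,
            "cooks": cooks, "sse": sse, "df": n - p, "r2": 1 - sse / sst}


def vif(predictors, j):
    """VIF for predictor j: 1 / (1 - R^2) from regressing it on the others."""
    others = [c for k, c in enumerate(predictors) if k != j]
    return 1.0 / (1.0 - fit(predictors[j], others)["r2"])


def nested_f(reduced, complete):
    """F-statistic comparing a reduced model with a complete model."""
    num = (reduced["sse"] - complete["sse"]) / (reduced["df"] - complete["df"])
    return num / (complete["sse"] / complete["df"])


# Hypothetical data (assumed for illustration).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.2, 3.4, 6.1, 6.3, 9.2, 9.4]

complete = fit(y, [x1, x2])
reduced = fit(y, [x1])
print(complete["hat"])
print(complete["cooks"])
print(vif([x1, x2], 0), nested_f(reduced, complete))
```

The explicit matrix inverse is used only to keep the sketch free of external libraries; for larger or poorly conditioned datasets a dedicated numerical library would be the safer choice.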

Last updated: June, 2012

© 2012, Iain Pardoe