/* File "jobspw2.sas": Jobs example.
   Note that comments in SAS are enclosed by "slash-star" and "star-slash."
   First import the jobspw2.xls data as usual (use "jobspw2" for the "member"
   name), then open this file (use File > Open), highlight the following code
   (from "proc cluster" to the first "run;"), and select Run > Submit from the
   main menu. This should run an Average Linkage Cluster Analysis with an
   agglomeration coefficient of "root-mean-square" (RMS). */

proc cluster method=average noeigen nonorm data=work.jobspw2 outtree=work.atree;
  var Eugene Portland Seattle Denver S50K S60K S70K S80K;
  id id;
run;

/* If you click on the Explorer tab in the Results window you should see the
   file "Atree" in the Work library. This file can be used to construct a
   dendrogram using proc tree (highlight from "proc tree" to "run;" and
   Submit): */

proc tree data=work.atree;
run;

/* Alternatively, run Ward's Minimum Variance Cluster Analysis with an
   agglomeration coefficient of "between-cluster sum of squares" (BSS). */

proc cluster method=ward noeigen nonorm data=work.jobspw2 outtree=work.wtree;
  var Eugene Portland Seattle Denver S50K S60K S70K S80K;
  id id;
run;

/* If you click on the Explorer tab in the Results window you should see the
   file "Wtree" in the Work library. Make sure you close the previous
   dendrogram before constructing this one: */

proc tree data=work.wtree;
run;

/* The % changes in the agglomeration coefficient (BSS) in going from 5
   clusters down to 1 are shown below. Each change is measured from that row
   to the next row (one fewer cluster):

   Clusters   Coefficient   Change
      5          32.271       80%
      4          57.99        31%
      3          76.108       18%
      2          89.693      104%
      1         183.26

   So there is a large jump in going from 5 to 4 clusters, and again in going
   from 2 to 1 clusters. This suggests the 5-cluster and 2-cluster solutions
   are worth investigating.
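
   The percentage changes in the table above can be checked with a short data
   step. This is only a sketch: the data set name work.aggl and the variable
   names clusters, coefficient, next_coefficient and pct_change are
   illustrative assumptions, and the coefficients are typed in by hand from
   the table rather than read from the proc cluster output. */

data work.aggl;
  input clusters coefficient;
  datalines;
5 32.271
4 57.99
3 76.108
2 89.693
1 183.26
;
run;

/* Sort ascending so that lag() returns the coefficient for one fewer cluster. */
proc sort data=work.aggl;
  by clusters;
run;

data work.aggl;
  set work.aggl;
  next_coefficient = lag(coefficient);  /* missing for the 1-cluster row */
  if clusters > 1 then
    pct_change = 100 * (next_coefficient - coefficient) / coefficient;
run;

proc print data=work.aggl;
run;

/*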
   Details of cluster membership for the 5-cluster solution can be saved in a
   data table using the following code: */

proc tree data=work.wtree nclusters=5 noprint out=work.w5cl;
  copy Eugene Portland Seattle Denver S50K S60K S70K S80K;
run;

/* Select Solutions > Analysis > Analyst to profile the 5-cluster solution and
   assess significant differences. First use File > Open By SAS Name to open
   "W5cl" (which should be in the Work library).
   Then use Statistics > Descriptive > Summary Statistics, with all the
   variables (Eugene, Portland, etc.) as the analysis variables and "CLUSTER"
   as the class variable: this will provide cluster means (amongst other
   things).
   Then use Statistics > ANOVA > One-Way ANOVA, with all the variables
   (Eugene, Portland, etc.) as the dependent variables and "CLUSTER" as the
   independent variable: this will provide Analysis of Variance F-tests
   (amongst other things).
   Finally, use the following code to create a data table with cluster means
   (which can be exported to Excel and used to create profile plots): */

proc means data=work.w5cl;
  class CLUSTER;
  output out=work.w5clmeans
    mean(Eugene Portland Seattle Denver S50K S60K S70K S80K)=
         Eugene Portland Seattle Denver S50K S60K S70K S80K;
run;

/* In particular, click on the Explorer tab in the Results window and
   right-click on the file "W5clmeans" in the Work library. Then select Export
   to export the data to an Excel spreadsheet. In this table the row with
   _TYPE_ = 0 holds the overall means and the rows with _TYPE_ = 1 hold the
   cluster means. If you open the spreadsheet in Excel you should be able to
   figure out how to create the profile line plots (see p. 483 for example).