A graph for visualizing hierarchical and non-hierarchical cluster analyses
In hierarchical cluster analysis, dendrograms are used to visualize how clusters are formed. I propose an alternative graph, named the clustergram, to examine how cluster members are assigned to clusters as the number of clusters increases. This graph is useful in exploratory analysis for non-hierarchical clustering algorithms such as k-means, and for hierarchical cluster algorithms when the number of observations is large enough to make dendrograms impractical.
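The quantities behind such a graph can be sketched in a few lines of Python. This is only an illustration, not the Stata ado file or the R implementation. It makes three assumptions: each cluster is placed on the y-axis at the mean, over its members, of each observation's average across variables; the width of a connecting segment is taken from the number of observations moving between two clusters; and the clustering itself is a plain k-means with a deterministic start. The names `kmeans_labels` and `clustergram_data` are hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iterations=25):
    """Plain Lloyd's k-means (assumes k <= number of points).

    Starting centers are picked at evenly spaced positions after sorting
    the observations by their average across variables, so the result is
    deterministic. This start is an illustrative choice only.
    """
    n = len(points)
    order = sorted(range(n), key=lambda i: sum(points[i]) / len(points[i]))
    centers = [list(points[order[(2 * j + 1) * n // (2 * k)]]) for j in range(k)]
    labels = [0] * n
    for _ in range(iterations):
        # Assign every observation to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def clustergram_data(points, max_k):
    """Return (cluster_means, flows) for k = 1..max_k.

    cluster_means[(k, c)]: assumed y-position of cluster c in the k-cluster
        solution (mean over members of each observation's average across
        variables).
    flows[k][(c, d)]: number of observations in cluster c at k clusters that
        are in cluster d at k + 1 clusters (assumed segment width).
    """
    row_means = [sum(p) / len(p) for p in points]
    labels_by_k = {k: kmeans_labels(points, k) for k in range(1, max_k + 1)}

    cluster_means = {}
    for k, labels in labels_by_k.items():
        for c in set(labels):
            members = [row_means[i] for i, lab in enumerate(labels) if lab == c]
            cluster_means[(k, c)] = sum(members) / len(members)

    flows = {
        k: Counter(zip(labels_by_k[k], labels_by_k[k + 1]))
        for k in range(1, max_k)
    }
    return cluster_means, flows


# Two well-separated groups of three observations each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cluster_means, flows = clustergram_data(points, max_k=3)
```

Plotting `cluster_means` against the number of clusters, and joining clusters at k and k + 1 with segments whose width follows `flows`, gives the kind of picture described above: one can follow where the members of each cluster go as the number of clusters grows.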
The clustergram is currently implemented in Stata and R.
Schonlau M. The clustergram: a graph for visualizing hierarchical and non-hierarchical cluster analyses. The Stata Journal, 2002; 2 (4):391-402.
The paper introduces the clustergram and explains how to use the Stata ado files.
Schonlau M. Visualizing hierarchical and non-hierarchical cluster analyses with clustergrams. Computational Statistics, 2004; 19 (1):95-111.
This paper points out that the y-axis of the clustergram can be defined by different functions, and gives further examples.
Stata implementation: clustergram ZIP file
The ZIP file with the Stata implementation contains the following Stata programs:
- clustergram ado file
- clustergram help file
- clustervar ado file
  This supplementary ado file makes it easier to run various cluster algorithms. The Stata Journal paper describes how to run cluster analysis without using this supplementary ado file.
- clustervar help file
clustervar sepallen-petalwid, algorithm(singlelinkage) max(`max') distance(L2)
- Asbestos.dta (Stata data set)
  Asbestos data set used in the paper
Tal Galili has implemented the clustergram in R. He provides the code and discusses it on his blog:
Clustergram blog as part of www.r-statistics.com