Hierarchical Clustering and Heat Maps in Mathematica

Hierarchical clustering is a way to expose the hidden structure of a complex, high-dimensional dataset. Heat maps are a common way to visualize the results of such clustering algorithms. This Demonstration shows how to use the HierarchicalClustering package in Mathematica to generate heat maps with dendrograms drawn along their sides.

Specifically, a sample distribution (pictured on the left) is sampled uniformly, with an added noise term (proportional to the sliders in the x- and y-directions), on a "sampling density" by "sampling density" grid. The resulting matrix of function values is then hierarchically clustered along both its rows and its columns, which often reveals a relatively simple structure underlying data that at first looks complex. You can vary the noise, the functions used in the hierarchical clustering, and the structure of the underlying distribution being sampled to see how these algorithms are both sensitive to their inputs and powerful when used properly.
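The steps just described can be sketched roughly as follows. This is a minimal illustration, not the Demonstration's own code. The function f, the grid range, the values of n, noiseX, and noiseY, the choice of "Average" linkage, and the reading of the noise as a jitter on the sample coordinates are all assumptions made for the example.

```mathematica
(* Load the clustering package named in the text *)
Needs["HierarchicalClustering`"]

(* Hypothetical stand-ins for the Demonstration's controls *)
n = 20;                          (* sampling density *)
{noiseX, noiseY} = {0.1, 0.1};   (* noise slider values *)
f[x_, y_] := Sin[x] Cos[y];      (* sample distribution *)

(* Sample f on an n-by-n grid over [0, 2 Pi]^2, jittering each sample point
   (assumed interpretation of the noise term) *)
SeedRandom[1];
grid = N[2 Pi (Range[n] - 1)/(n - 1)];
data = Table[
   f[x + noiseX RandomReal[{-1, 1}], y + noiseY RandomReal[{-1, 1}]],
   {x, grid}, {y, grid}];

(* Cluster rows and columns separately; each leaf carries its original index *)
rowTree = Agglomerate[data -> Range[n], Linkage -> "Average"];
colTree = Agglomerate[Transpose[data] -> Range[n], Linkage -> "Average"];

(* Read the leaf order off each tree *)
rowOrder = ClusterFlatten[rowTree];
colOrder = ClusterFlatten[colTree];

(* Permute the matrix with those same orders before plotting it *)
heatMap = ArrayPlot[data[[rowOrder, colOrder]],
   ColorFunction -> "TemperatureMap", Frame -> False];

(* Dendrograms for the top and left edges, labeled with the original indices *)
topDendrogram = DendrogramPlot[colTree, Orientation -> Top, LeafLabels -> (# &)];
leftDendrogram = DendrogramPlot[rowTree, Orientation -> Left, LeafLabels -> (# &)];
```

The leaf labels are included so that the order of the dendrogram leaves can be compared by eye with rowOrder and colOrder. Depending on the direction in which each plot draws its leaves, one of the orders may need to be reversed before the dendrograms line up with the rows and columns of the heat map.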

This Demonstration is based on a program originally presented on the Mathematica Stack Exchange blog by user verbeia, incorporating the edits suggested by user kguler and several original modifications. The original blog post can be found at [1].

A word of caution: only the functions contained in this notebook correctly match the dendrograms to the displayed data. The previous implementations mentioned above do not. While the author wants to give attribution to others' code, that code should not be used unless it is corrected in the manner used in this Demonstration.