Loading graph…

Hierarchical Clustering and Ward Linkage

Definition

Hierarchical clustering builds nested clusters instead of choosing a single partition immediately.

Agglomerative hierarchical clustering starts with each observation as its own cluster and repeatedly merges clusters.

Distance Matrix

The algorithm begins with a distance matrix:

\[d(i,j)\]

which measures how far observation $i$ is from observation $j$.

In the retail mining report, product profiles are first embedded into correspondence-analysis coordinates, then clustered using Euclidean distances in that coordinate space.

Dendrogram

The sequence of merges forms a tree called a dendrogram.

Cutting the dendrogram at a chosen number of clusters $k$ gives a cluster assignment.

Ward Linkage

Ward linkage chooses the merge that creates the smallest increase in within-cluster squared error.

For clusters $A$ and $B$, Ward’s method considers the increase in within-cluster variation after merging them:

\[\Delta(A,B) = WSS(A \cup B) - WSS(A) - WSS(B)\]

where:

\[WSS(C) = \sum_{i \in C} ||x_i - \bar{x}_C||^2\]

The merge with the smallest $\Delta(A,B)$ is chosen.

Ward.D2

In R, hclust(..., method = "ward.D2") implements the Ward criterion using squared distances in the intended way.

It tends to prefer compact clusters.

Choosing k

After building the dendrogram, candidate values of $k$ can be evaluated using the silhouette coefficient.

A cluster solution can also be rejected if it is structurally degenerate, such as:

  • one cluster contains almost all observations
  • a cluster has only one or a few observations

See

25

25
Ready to start
Hierarchical Clustering and Ward Linkage
Session: 1 | Break: Short
Today: 0 sessions
Total: 0 sessions