Identifies the posterior draw that minimizes squared distance to the
posterior similarity matrix, following Dahl (2006). Returns relabeled
cluster assignments as consecutive integers 1, 2, ..., K.
Usage
.dahl_representative(z_matrix, PSM)
Arguments
- z_matrix
Integer matrix (iterations x N) of cluster assignments.
- PSM
Posterior similarity matrix (N x N).
Value
List with components: draw_index (integer), labels (integer vector),
K (number of clusters).
Details
For each posterior draw, the helper forms its adjacency matrix and computes
the squared Frobenius distance to the PSM. The selected representative draw is
the one that minimizes that loss, which is Dahl's least-squares rule for
choosing one clustering from the posterior sample.
References
Dahl, D. B. (2006). Model-based clustering for expression data
via a Dirichlet process mixture model. In M. Vannucci, et al. (Eds.),
Bayesian Inference for Gene Expression and Proteomics (pp. 201-218).
Cambridge University Press.