Skip to contents

Extract posterior cluster allocation from mixgpd_fit

Usage

# S3 method for class 'mixgpd_fit'
allocation(object, method = c("dahl"), ...)

Arguments

object

A mixgpd_fit object from dpmgpd or related functions.

method

Character; clustering summarization method. Currently only "dahl" is supported (Dahl's method for finding representative partition).

...

Additional arguments. If newdata is provided, cluster allocations will be predicted for new observations (requires y column in newdata).

Value

An object of class "mixgpd_allocation" containing:

labels_train

Integer vector of cluster labels for training data (1, 2, ..., K)

probs_train

N x K matrix of cluster membership probabilities for training data

PSM

N x N posterior similarity matrix

K

Number of clusters in representative partition

method

Clustering method used

backend

Model backend (crp or spliced)

GPD

Logical; whether GPD tail is included

kernel

Kernel family used

n_train

Number of training observations

labels_new

(if newdata) Integer vector of predicted cluster labels

probs_new

(if newdata) N_new x K matrix of predicted membership probabilities

n_new

(if newdata) Number of new observations

y_new

(if newdata) Response values for new observations

References

Dahl, D. B. (2006). Model-based clustering for expression data via a Dirichlet process mixture model. In M. Vannucci, et al. (Eds.), Bayesian Inference for Gene Expression and Proteomics (pp. 201-218). Cambridge University Press.

Examples

if (FALSE) { # \dontrun{
# Fit a spliced CRP model
data("nc_posX100_p3_k2", package = "CausalMixGPD")
fit <- dpmgpd(y ~ x1 + x2 + x3, data = nc_posX100_p3_k2,
              backend = "spliced", kernel = "lognormal",
              components = 4, GPD = TRUE)

# Extract allocation
alloc <- allocation(fit)
print(alloc)
summary(alloc)
plot(alloc)

# Predict for new data
X_new <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
y_new <- rlnorm(20, meanlog = 1, sdlog = 0.5)
newdata <- cbind(y = y_new, X_new)
alloc_new <- allocation(fit, newdata = newdata)
plot(alloc_new, overlay = TRUE)
} # }