Fit a clustering-only bulk-tail model
dpmgpd.cluster.Rd
Variant of dpmix.cluster() that augments the cluster kernel with a generalized Pareto tail.
This is the clustering analogue of the spliced bulk-tail workflow used by dpmgpd().
Arguments
- formula
Model formula. The response must be present in data.
- data
Data frame containing the response and optional predictors.
- type
Clustering mode:
"weights": links mixture weights to predictors
"param": links kernel parameters to predictors
"both": links both weights and kernel parameters to predictors
- default
Default mode used when type is omitted.
- mcmc
MCMC control list passed into the cluster bundle.
- ...
Additional arguments passed to build_cluster_bundle(), including kernel settings, prior overrides, component counts, and monitoring controls.
Details
For observations above a component-specific threshold, the component density is spliced as $$ f(y) = (1 - F_{bulk}(u)) g_{GPD}(y \mid u, \sigma_u, \xi_u), \qquad y \ge u, $$ so cluster assignment can be informed by both central behavior and tail behavior.
This interface is preferable when cluster separation is driven by upper-tail differences rather than bulk-only shape or location differences.
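The tail piece of the spliced density can be sketched in base R for a single component. This is an illustration only, not the package's internal code: the helper names gpd_density() and spliced_density() are hypothetical, a gamma bulk is assumed, and the sketch further assumes that the unmodified bulk density applies below the threshold (the formula above only states the part at or above u).

```r
# Generalized Pareto density with threshold u, scale sigma, shape xi.
# Hypothetical helper written for this sketch; not a package function.
gpd_density <- function(y, u, sigma, xi) {
  z <- (y - u) / sigma
  if (abs(xi) < 1e-8) {
    # xi -> 0 limit: exponential tail
    dens <- exp(-z) / sigma
  } else {
    base <- 1 + xi * z
    # Zero density outside the support (base <= 0 can occur when xi < 0)
    dens <- ifelse(base > 0, base^(-1 / xi - 1) / sigma, 0)
  }
  dens[z < 0] <- 0  # no mass below the threshold
  dens
}

# Spliced density for one component: gamma bulk below u (an assumption),
# and (1 - F_bulk(u)) * g_GPD(y | u, sigma, xi) at or above u.
spliced_density <- function(y, u, sigma, xi, shape, rate) {
  tail_mass <- 1 - pgamma(u, shape = shape, rate = rate)
  ifelse(
    y < u,
    dgamma(y, shape = shape, rate = rate),
    tail_mass * gpd_density(y, u, sigma, xi)
  )
}

# Illustrative parameter values, chosen arbitrarily for the sketch
u <- 5; sigma <- 1.5; xi <- 0.2; shape <- 2; rate <- 1

# Integrate the two pieces separately because the density can jump at u
below <- integrate(spliced_density, lower = 0, upper = u,
                   u = u, sigma = sigma, xi = xi, shape = shape, rate = rate)
above <- integrate(spliced_density, lower = u, upper = Inf,
                   u = u, sigma = sigma, xi = xi, shape = shape, rate = rate)
total <- below$value + above$value
```

Under these assumptions total should be numerically close to 1, because the factor 1 - F_bulk(u) rescales the generalized Pareto tail to carry exactly the probability mass that the bulk leaves above u.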
See also
dpmix.cluster(), predict.dpmixgpd_cluster_fit(),
dpmgpd(), sim_bulk_tail().
Other cluster workflow:
cluster_profiles(),
dpmix.cluster(),
plot.dpmixgpd_cluster_bundle(),
predict.dpmixgpd_cluster_fit(),
print.dpmixgpd_cluster_bundle(),
print.dpmixgpd_cluster_fit(),
print.dpmixgpd_cluster_labels(),
print.dpmixgpd_cluster_psm(),
summary.dpmixgpd_cluster_bundle(),
summary.dpmixgpd_cluster_fit(),
summary.dpmixgpd_cluster_labels(),
summary.dpmixgpd_cluster_psm()
Examples
# \donttest{
data("nc_posX100_p3_k2", package = "CausalMixGPD")
dat <- data.frame(y = nc_posX100_p3_k2$y[1:20],
                  nc_posX100_p3_k2$X[1:20, , drop = FALSE])
fit <- dpmgpd.cluster(
  y ~ x1 + x2 + x3,
  data = dat,
  kernel = "gamma",
  type = "param",
  components = 3,
  mcmc = list(niter = 60, nburnin = 30, thin = 1, nchains = 1, seed = 1)
)
#> [cluster] Validating configuration
#> [cluster] Checking build/compile cache
#> [cluster] Building model and MCMC configuration
#> [cluster] Compiling NIMBLE model
#> [cluster] Initializing chains
#> [cluster] Running MCMC
#> [cluster] Finalizing WAIC and diagnostics
#> [cluster] Assembling fit object
cluster_profiles(fit)
#>   cluster  n y_mean y_sd x1_mean x1_sd x2_mean x2_sd x3_mean x3_sd
#> 1      C1 20  3.334 2.25   0.211 0.918  -0.096 0.563  -0.489 0.859
#>   certainty_mean certainty_sd
#> 1              1            0
# }