Variant of dpmix.cluster() that augments the cluster kernel with a generalized Pareto tail. This is the clustering analogue of the spliced bulk-tail workflow used by dpmgpd().

Usage

dpmgpd.cluster(
  formula,
  data,
  type = c("weights", "param", "both"),
  default = "weights",
  mcmc = list(),
  ...
)

Arguments

formula

Model formula. The response must be present in data.

data

Data frame containing the response and optional predictors.

type

Clustering mode:

  • "weights": links mixture weights to predictors

  • "param": links kernel parameters to predictors

  • "both": links both weights and kernel parameters to predictors

default

Default mode used when type is omitted.

mcmc

MCMC control list passed into the cluster bundle.

...

Additional arguments passed to build_cluster_bundle(), including kernel settings, prior overrides, component counts, and monitoring controls.

Value

Object of class dpmixgpd_cluster_fit.

Details

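For observations above a component-specific threshold u, the component density is spliced as $$ f(y) = (1 - F_{bulk}(u)) g_{GPD}(y \mid u, \sigma_u, \xi_u), \qquad y \ge u, $$ where F_bulk is the cumulative distribution function of the bulk kernel and g_GPD is the generalized Pareto density with threshold u, scale sigma_u, and shape xi_u. The factor 1 - F_bulk(u) is the bulk mass above the threshold, so the tail piece carries exactly that mass. Cluster assignment can therefore be informed by both central behavior and tail behavior.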
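The splice can be illustrated with a small base R sketch. This is not the package's implementation: the helper names dgpd_sketch() and dspliced_sketch() are hypothetical, the bulk kernel is assumed to be normal purely for illustration, and the use of the unmodified bulk density below the threshold is the usual spliced-model convention rather than something stated on this page.

```r
# Generalized Pareto density with threshold u, scale sigma > 0, shape xi.
# Hypothetical helper for illustration; zero outside the GPD support.
dgpd_sketch <- function(y, u, sigma, xi) {
  z <- (y - u) / sigma
  dens <- numeric(length(y))
  if (abs(xi) < 1e-8) {
    # xi -> 0 limit: exponential tail
    ok <- z >= 0
    dens[ok] <- exp(-z[ok]) / sigma
  } else {
    ok <- z >= 0 & (1 + xi * z) > 0
    dens[ok] <- (1 + xi * z[ok])^(-1 / xi - 1) / sigma
  }
  dens
}

# Spliced density for one component, assuming a normal bulk kernel.
# Below u: bulk density. At or above u: (1 - F_bulk(u)) * g_GPD(y | u, sigma, xi).
dspliced_sketch <- function(y, u, sigma, xi, mean = 0, sd = 1) {
  tail_mass <- 1 - pnorm(u, mean, sd)
  ifelse(
    y < u,
    dnorm(y, mean, sd),
    tail_mass * dgpd_sketch(y, u, sigma, xi)
  )
}

# Sanity check: the two pieces should integrate to approximately 1.
below <- integrate(dspliced_sketch, -Inf, 1.5, u = 1.5, sigma = 1, xi = 0.2)$value
above <- integrate(dspliced_sketch, 1.5, Inf, u = 1.5, sigma = 1, xi = 0.2)$value
total_mass <- below + above
```

Because the GPD piece is scaled by the bulk mass above u, the spliced function remains a proper density for any valid scale and shape, which is what lets each cluster carry its own tail parameters.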

This interface is preferable when cluster separation is driven by upper-tail differences rather than bulk-only shape or location differences.
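A minimal call might look like the sketch below. It uses only the arguments shown under Usage; the simulated data frame toy, its column names, and the choice of a positive response are assumptions made for illustration, and the call is guarded so that the snippet still runs when the package providing dpmgpd.cluster() is not attached.

```r
set.seed(1)

# Simulated data (illustrative assumption): two latent groups that share a
# similar bulk but differ in upper-tail weight.
n <- 200
x <- rnorm(n)
heavy <- rbinom(n, 1, 0.5) == 1
y <- ifelse(heavy, abs(rt(n, df = 2)), abs(rnorm(n)))
toy <- data.frame(y = y, x = x)

# Fit only if the function is available in the current session.
if (exists("dpmgpd.cluster")) {
  fit <- dpmgpd.cluster(y ~ x, data = toy, type = "weights")
  class(fit)
}
```

Here type = "weights" links the mixture weights to x; "param" or "both" would instead link the kernel parameters, or both, to the predictor.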