CausalMixGPD

Theory: Customization maps + extension points

This page turns the package’s customization surface into a conceptual map: where you set options (bundle vs runner vs consumers), and what the supported extension points are.

Core customization map (where each choice lives)

At a high level, CausalMixGPD builds an internal “bundle plan”, runs MCMC on it, and then consumes the resulting fit for posterior summaries and prediction.

The main user-facing entry points, and the choices each one exposes, are:

Interface | Scope | Main choices | Notes
bundle() | Build only | raw vs formula input; optional treat; kernel; backend; GPD; components | Chooses one-arm vs causal bundle without running MCMC.
dpmix() and dpmgpd() | One-arm fit | kernel, backend, components, param_specs, mcmc | Convenience wrappers for bulk-only vs spliced tail.
dpmix.causal() and dpmgpd.causal() | Two-arm causal fit | arm-specific kernel, backend, GPD, components; propensity-score controls | Wraps paired outcome models + optional PS stage; then exposes causal summaries.
dpmix.cluster() and dpmgpd.cluster() | Clustering fit | type (dependence mode), link overrides, priors, components, mcmc | Uses the same mixture machinery but returns label-invariant clustering summaries.
mcmc() | Run only | stored bundle + MCMC overrides | Keeps the build/run split explicit when you want manual inspection.
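
The build/run split in the last row can be pictured with a small base-R sketch. The functions toy_bundle() and toy_run() and the setting names niter, nburnin, and thin are hypothetical stand-ins chosen for illustration; they are not the package's bundle() or mcmc() signatures.

```r
# Hypothetical stand-ins, NOT the CausalMixGPD API: toy_bundle() and toy_run()
# only illustrate keeping "build" and "run" as separate steps.
toy_bundle <- function(y, kernel = "normal", components = 5L,
                       mcmc = list(niter = 2000L, nburnin = 500L, thin = 1L)) {
  stopifnot(is.numeric(y), length(y) > 0)
  structure(
    list(y = y, kernel = kernel, components = components, mcmc = mcmc),
    class = "toy_bundle"
  )
}

toy_run <- function(bundle, ...) {
  stopifnot(inherits(bundle, "toy_bundle"))
  # Overrides replace only the named MCMC settings; the rest come from the bundle.
  settings <- utils::modifyList(bundle$mcmc, list(...))
  stopifnot(settings$nburnin < settings$niter)
  # Number of iterations that would be retained after burn-in and thinning.
  kept <- length(seq(settings$nburnin + 1L, settings$niter, by = settings$thin))
  list(bundle = bundle, settings = settings, n_kept = kept)
}

b <- toy_bundle(y = c(1.2, 0.7, 3.4, 9.8))
str(b$mcmc)                                  # inspect the plan before running anything
fit <- toy_run(b, niter = 4000L, thin = 2L)  # run-time overrides
fit$n_kept
```

The stored bundle is left unchanged by the overrides, so the same plan can be inspected once and run several times with different settings.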

Advanced customization: param_specs modes

The most important low-level knob is param_specs, which moves individual parameters between:

  • fixed values (fixed),
  • prior-driven randomness without covariate regression (dist),
  • covariate-linked regression with an inverse link (link),
  • and an augmented regression layer with additional stochastic dispersion (link+dist).

Supported inverse-link style structures include identity, exp, softplus, log, and power (as implemented by the package’s link plans).
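
As a rough illustration of what an inverse link does, the sketch below applies three of the listed links to a linear predictor. It is written in plain R and does not call the package; the package's own link plans may be parameterized differently, and the power link is omitted because its exponent convention is not stated here.

```r
# Illustrative inverse links; the package's own link plans may differ in detail.
inv_links <- list(
  identity = function(eta) eta,
  exp      = function(eta) exp(eta),
  # Numerically stable softplus: log(1 + exp(eta)) without overflow for large eta.
  softplus = function(eta) pmax(eta, 0) + log1p(exp(-abs(eta)))
)

eta <- c(-3, 0, 3)   # linear predictor values, e.g. from X %*% beta
sapply(inv_links, function(g) g(eta))
```

Both exp and softplus map any real-valued linear predictor to a positive value, which is why links of this kind suit scale-type parameters; identity leaves the predictor unconstrained.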

Representative modes:

Mode | Representative structure | Interpretation | Typical use
fixed | list(mode = "fixed", value = ...) | locks the parameter to a constant | enforce known threshold/shape or simplify a submodel
dist | list(mode = "dist", dist = ..., args = list(...)) | random under a prior, but not regressed on covariates | default component-specific bulk parameters and scalar tail-shape terms
link | list(mode = "link", link = ..., beta_prior = ...) | regression layer + inverse link | covariate-dependent location/scale or PS-adjusted outcome regression
link+dist | threshold-scale style variants with mode = "link" plus extra dispersion | regression mean structure plus additional stochastic dispersion | observation-specific spliced threshold patterns
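
A minimal sketch of how such specifications might be assembled and sanity-checked as plain R lists follows. The parameter names (threshold, tail_shape, scale), the prior family names, the shape of beta_prior, and the one-entry-per-parameter layout are all assumptions made for illustration; only the mode, value, dist, args, link, and beta_prior field names come from the table above.

```r
# Parameter names, prior families, and the flat layout are assumptions for
# illustration; check the package reference for the layout it actually expects.
specs <- list(
  threshold  = list(mode = "fixed", value = 10),
  tail_shape = list(mode = "dist", dist = "normal",
                    args = list(mean = 0, sd = 0.2)),
  scale      = list(mode = "link", link = "exp",
                    beta_prior = list(dist = "normal",
                                      args = list(mean = 0, sd = 2)))
)

# Fields each mode is expected to carry, following the representative structures.
required <- list(
  fixed = "value",
  dist  = c("dist", "args"),
  link  = c("link", "beta_prior")
)

check_spec <- function(spec) {
  stopifnot(is.list(spec), is.character(spec$mode), length(spec$mode) == 1L,
            spec$mode %in% names(required))
  absent <- setdiff(required[[spec$mode]], names(spec))
  if (length(absent) > 0) {
    stop("mode '", spec$mode, "' is missing: ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

invisible(lapply(specs, check_spec))              # errors if any entry is malformed
vapply(specs, function(s) s$mode, character(1))   # mode chosen per parameter
```

Checking the entries before building a bundle surfaces a misspelled field or a missing prior early, rather than during model compilation.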

Advanced core extension points (registry -> bundle -> runner -> consumers)

The package architecture is registry-based: kernels and the GPD tail are declared once, then reused across bundle builders, compilation, MCMC, and S3 methods.

The practical extension rule is:

  1. stabilize registry metadata first (kernel/tail definitions, support, defaults, compatibility);
  2. rely on shared bundle/runner layers instead of writing workflow-specific code;
  3. ensure your new kernel/tail remains structurally compatible with how the fit consumers (summary(), params(), predict(), plot()) read posterior output.

Conceptually, the extension path is:

Registry stage -> Bundle stage -> Runner stage -> Consumer stage

where:

  • Registry stage: get_kernel_registry() and get_tail_registry() define signatures, support declarations, defaults, and GPD compatibility.
  • Bundle stage: builders normalize fixed / dist / link plans into a compiled specification.
  • Runner stage: manual runners compile NIMBLE code, run MCMC, and attach shared fit metadata.
  • Consumer stage: S3 methods (summary(), params(), predict(), plot()) consume the same fit structure.
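
The four stages can be mimicked with a toy pipeline in base R. Every name below (registry, build(), run(), toy_fit) is hypothetical and unrelated to the package internals, and the "runner" draws random numbers instead of running MCMC; the sketch is only meant to show why declaring a kernel once lets the later stages stay generic.

```r
# Toy pipeline; every name here is hypothetical and unrelated to package internals.

# Registry stage: declare each kernel once, with its parameters and support.
registry <- list(
  normal = list(params = c("mean", "sd"),    support = "real",     gpd_ok = TRUE),
  gamma  = list(params = c("shape", "rate"), support = "positive", gpd_ok = TRUE)
)

# Bundle stage: resolve a kernel name against the registry into one plan.
build <- function(y, kernel) {
  entry <- registry[[kernel]]
  if (is.null(entry)) stop("unknown kernel: ", kernel)
  if (entry$support == "positive" && any(y <= 0)) {
    stop("data outside kernel support")
  }
  list(y = y, kernel = kernel, params = entry$params)
}

# Runner stage: stand-in for MCMC that returns draws in one shared layout
# (iterations in rows, one named column per parameter).
run <- function(plan, niter = 100L) {
  draws <- matrix(rnorm(niter * length(plan$params)), nrow = niter,
                  dimnames = list(NULL, plan$params))
  structure(list(plan = plan, draws = draws), class = "toy_fit")
}

# Consumer stage: an S3 method that relies only on the shared layout.
summary.toy_fit <- function(object, ...) {
  colMeans(object$draws)
}

set.seed(1)
fit <- run(build(y = c(0.5, 1.3, 2.1), kernel = "gamma"))
names(summary(fit))
```

In this toy version, supporting another kernel means adding one registry entry; build(), run(), and summary() need no changes as long as the new entry follows the same structure.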

References (key)

  • Antoniak (1974), Mixtures of Dirichlet Processes…, doi:10.1214/aos/1176342871
  • Neal (2000), Markov Chain Sampling Methods for Dirichlet Process Mixture Models, doi:10.1080/10618600.2000.10474879
  • de Valpine et al. (2017), Programming with Models: Writing Statistical Algorithms for General Model Structures with NIMBLE, doi:10.1080/10618600.2016.1172487

Prereqs

  • Required packages and data for this page are listed in the setup chunks above.

Outputs

  • This page renders model fits, diagnostics, and summary artifacts generated by package APIs.

Interpretation

  • Canonical concept page: Index
  • Treat this page as an application/example view and use the canonical page for core definitions.

Next

  • Continue to the linked canonical concept page, then return for implementation-specific details.
(c) CausalMixGPD - Bayesian semiparametric modeling for heavy-tailed data