CausalMixGPD
  • Home
  • Roadmaps
    • Website roadmap
    • Package roadmap
  • Start
    • Start Hub
    • Roadmap
    • Usage Diagrams
    • Start Here
    • Basic Compile and Run
    • Backends and Workflow
    • Troubleshooting
  • Tracks
    • Quickstart
    • Modeling (1-arm)
    • Causal
    • Clustering
    • Kernels & tails
    • Customization
  • Examples
  • Kernels
  • Advanced
  • Developers
  • Reference
    • Reference hub
    • Function reference by job
  • News
  • Cite
  • Coverage
  • API Reference

ex06. Conditional DPM (SB Backend)

Website workflow note. This page reflects the current exported API and recommended wrapper-first usage. Last updated: 2026-02-19.

For the full package narrative, see the main package vignettes (basic, unconditional, conditional, and causal).

Conditional DPmix: Stick-Breaking Backend

Purpose: Replace the CRP backend with stick-breaking truncation while keeping the covariate-dependent bulk structure. This demonstrates how fixed components interplay with covariates.

What you’ll learn

  • How to run the conditional (y \mid X) workflow using the SB backend (fixed truncation).
  • How components controls effective mixture complexity in conditional fits.
  • How to compare kernels under the same SB setup using parallel bundles.

When to use this template

  • You want conditional density estimation with predictable runtime and a tunable complexity knob (components).
  • You want conditional modeling but do not need tail splicing (GPD = FALSE).

Next steps

  • Add tail augmentation for covariate-dependent extremes (ex08), or compare with CRP conditional (ex05).

Data Setup

Code
data("nc_posX100_p3_k2")
y <- nc_posX100_p3_k2$y
X <- as.matrix(nc_posX100_p3_k2$X)
if (is.null(colnames(X))) {
  colnames(X) <- paste0("x", seq_len(ncol(X)))
}

summary_tbl <- tibble(
  statistic = c("N", "Mean", "SD", "Min", "Max"),
  value = c(length(y), mean(y), sd(y), min(y), max(y))
)

ggplot(data.frame(y = y, x1 = X[, 1]), aes(x = x1, y = y)) +
  geom_point(alpha = 0.6, color = "darkorange") +
  geom_smooth(method = "loess", color = "navy", fill = NA) +
  labs(title = "y vs X1 (SB)", x = "X1", y = "y") +
  theme_minimal()

# A tibble: 5 × 2
  statistic   value
  <chr>       <dbl>
1 N         100    
2 Mean        3.45 
3 SD          2.41 
4 Min         0.377
5 Max        10.9  

Model Specification

Code
bundle_sb_normal <- bundle(
  y = y,
  X = X,
  kernel = "normal",
  backend = "sb",
  GPD = FALSE,
  components = 5,
  mcmc = mcmc
)

bundle_sb_cauchy <- bundle(
  y = y,
  X = X,
  kernel = "cauchy",
  backend = "sb",
  GPD = FALSE,
  components = 5,
  mcmc = mcmc
)

Running MCMC

Code
fit_sb_normal <- load_or_fit("ex06-conditional-dpm-sb-fit_sb_normal", dpmix(bundle_sb_normal))
fit_sb_cauchy <- load_or_fit("ex06-conditional-dpm-sb-fit_sb_cauchy", dpmix(bundle_sb_cauchy))
summary(fit_sb_normal)
MixGPD summary | backend: Stick-Breaking Process | kernel: Normal Distribution | GPD tail: FALSE | epsilon: 0.025
n = 100 | components = 5
Summary
Initial components: 5 | Components after truncation: 2

Summary table
       parameter   mean    sd q0.025 q0.500 q0.975    ess
      weights[1]   0.71 0.125  0.432  0.717  0.891  2.821
      weights[2]  0.188 0.091  0.075  0.139  0.353  3.134
           alpha  0.973 0.467  0.266  0.836  2.086 30.527
 beta_mean[1, 1] -0.697 0.689 -2.269 -0.399  0.193  3.626
 beta_mean[2, 1]  0.874 0.527 -0.155  1.016  1.891  3.601
 beta_mean[3, 1]  -0.21 0.482 -1.133 -0.228  0.599 11.882
 beta_mean[4, 1]  -0.86 0.553 -1.798 -0.897  0.353   6.33
 beta_mean[5, 1]  0.552 0.522 -1.185  0.581  1.235  8.768
 beta_mean[1, 2]  1.274 0.846  0.465  0.976  3.069  2.147
 beta_mean[2, 2]    0.3  0.45 -0.736  0.399  1.115  8.059
 beta_mean[3, 2] -0.982 0.425  -1.82 -0.838 -0.251 12.352
 beta_mean[4, 2]  -0.33 0.314 -1.136 -0.367  0.335 29.796
 beta_mean[5, 2]  2.136 0.702  1.518  1.919  3.718  2.955
 beta_mean[1, 3] -0.464 0.897 -2.007 -0.273  0.447  2.248
 beta_mean[2, 3]  0.336 0.852 -0.594  0.263  1.843  1.602
 beta_mean[3, 3] -1.703 0.564 -2.487 -1.858 -0.072  6.078
 beta_mean[4, 3]  0.792 0.817 -0.046  0.432  2.963  2.573
 beta_mean[5, 3]  1.737 0.812  0.548  1.701  3.785  3.443
           sd[1]  0.053  0.01  0.036  0.053  0.073 52.239
           sd[2]  0.794 1.107  0.037  0.234  3.846 22.413
Code
summary(fit_sb_cauchy)
MixGPD summary | backend: Stick-Breaking Process | kernel: Cauchy Distribution | GPD tail: FALSE | epsilon: 0.025
n = 100 | components = 5
Summary
Initial components: 5 | Components after truncation: 3

Summary table
           parameter   mean    sd q0.025 q0.500 q0.975     ess
          weights[1]  0.435 0.078  0.297  0.447  0.573   8.109
          weights[2]  0.285 0.049  0.189    0.3   0.35  10.355
          weights[3]   0.17 0.046  0.083  0.172  0.258   11.74
               alpha  1.381 0.846  0.319  1.133  3.513  17.696
 beta_location[1, 1]  1.031 1.408 -1.795  1.507  2.522   1.503
 beta_location[2, 1]  1.135 0.786 -0.182  1.183   2.12   4.936
 beta_location[3, 1]  0.758 1.099 -1.514  0.772  2.525   3.277
 beta_location[4, 1]  0.153 0.777 -0.631 -0.184  1.832   2.518
 beta_location[5, 1]  0.091 0.661 -0.979  0.309  1.353   9.197
 beta_location[1, 2]  0.532 0.927 -0.891  0.133  2.046    2.91
 beta_location[2, 2] -1.785 1.241 -3.511  -1.88  0.311    2.29
 beta_location[3, 2] -3.126 1.533 -4.652 -3.934  0.302   1.866
 beta_location[4, 2] -1.304 1.338 -3.017 -1.371  1.447   2.479
 beta_location[5, 2]  1.507 0.419  0.527  1.623   2.03   9.906
 beta_location[1, 3]  1.058 0.827 -1.123   0.94  2.358    4.52
 beta_location[2, 3]  1.654 1.804 -2.067   2.43  3.752   2.449
 beta_location[3, 3]  1.364 1.264 -1.047  1.404  2.961   2.195
 beta_location[4, 3]  -1.01 1.051 -2.325  -1.35  1.259   1.961
 beta_location[5, 3]  -1.34 1.028 -3.261 -1.114    0.3   2.243
            scale[1]  2.046  0.56   1.16  1.964  3.233 103.998
            scale[2]   2.19  0.82  0.978  2.027  4.075  78.988
            scale[3]   1.99 0.966  0.612  1.788  4.133  82.816
Code
params_sb <- params(fit_sb_normal)
params_sb
Posterior mean parameters

$alpha
[1] 0.9726

$w
[1] 0.7103 0.1881

$beta_mean
           x1      x2      x3
comp1 -0.6969  1.2740 -0.4642
comp2  0.8744  0.3004  0.3364
comp3 -0.2103 -0.9816 -1.7030
comp4 -0.8597 -0.3299  0.7915
comp5  0.5520  2.1360  1.7370

$sd
[1] 0.05346 0.79370

Conditional Predictive Density

Code
X_new <- expand.grid(x1 = seq(-2, 2, length.out = 3), x2 = c(-1, 1), x3 = 0)
colnames(X_new) <- colnames(X)
y_min <- max(min(y), .Machine$double.eps)
y_grid <- seq(y_min, max(y) * 1.1, length.out = 200)

densities_normal <- lapply(seq_len(nrow(X_new)), function(i) {
  pred <- predict(fit_sb_normal, newdata =as.matrix(X_new[i, , drop = FALSE]), y = y_grid, type = "density")
  data.frame(
    y = pred$fit$y,
    density = pred$fit$density,
    label = paste0("x1=", round(X_new[i, "x1"], 1), ", x2=", X_new[i, "x2"]),
    model = "Normal"
  )
})

densities_cauchy <- lapply(seq_len(nrow(X_new)), function(i) {
  pred <- predict(fit_sb_cauchy, newdata =as.matrix(X_new[i, , drop = FALSE]), y = y_grid, type = "density")
  data.frame(
    y = pred$fit$y,
    density = pred$fit$density,
    label = paste0("x1=", round(X_new[i, "x1"], 1), ", x2=", X_new[i, "x2"]),
    model = "Cauchy"
  )
})

df_cond <- bind_rows(densities_normal, densities_cauchy)

ggplot(df_cond, aes(x = y, y = density, color = label)) +
  geom_line(linewidth = 1) +
  facet_wrap(~ model) +
  labs(title = "SB Conditional Predictive Densities", x = "y", y = "Density") +
  theme_minimal() +
  theme(legend.position = "bottom")


Quantile Drift with Covariates

Code
X_eval <- cbind(x1 = seq(-2, 2, length.out = 5), x2 = 0, x3 = 0)
colnames(X_eval) <- colnames(X)
quant_probs <- c(0.25, 0.5, 0.75)

pred_q_normal <- predict(fit_sb_normal, newdata =as.matrix(X_eval), type = "quantile", p = quant_probs)
pred_q_cauchy <- predict(fit_sb_cauchy, newdata =as.matrix(X_eval), type = "quantile", p = quant_probs)

quant_df_normal <- pred_q_normal$fit
quant_df_normal$x1 <- X_eval[quant_df_normal$id, "x1"]
quant_df_normal$model <- "Normal"

quant_df_cauchy <- pred_q_cauchy$fit
quant_df_cauchy$x1 <- X_eval[quant_df_cauchy$id, "x1"]
quant_df_cauchy$model <- "Cauchy"

bind_rows(quant_df_normal, quant_df_cauchy) %>%
  ggplot(aes(x = x1, y = estimate, color = factor(index), group = index)) +
  geom_line(linewidth = 1) +
  geom_point(size = 2) +
  facet_wrap(~ model) +
  labs(title = "SB Conditional Quantiles vs x1", x = "x1", y = "y", color = "Quantile") +
  theme_minimal()


Residuals & Diagnostics

Code
plot(fitted(fit_sb_cauchy))

Code
plot(fit_sb_normal, family = c("traceplot", "autocorrelation", "geweke"))

=== traceplot ===


=== autocorrelation ===


=== geweke ===

Code
plot(fit_sb_cauchy, family = c("density", "running", "caterpillar"))

=== density ===


=== running ===


=== caterpillar ===


Takeaways

  • Stick-breaking component count is fixed but still supports covariate-dependent mixtures.
  • predict(..., type = "density") returns group-specific densities for each X.
  • predict(..., type = "quantile") reports posterior-mean quantiles; the median is the 0.5 quantile and shifts with x1.
  • Diagnoses rely on the same S3 plot()/fitted() pipeline available in other vignettes.

Workflow Navigation

  • Previous: ex05-conditional-dpm-crp
  • Next: ex07-conditional-dpmgpd-crp
  • Workflow index: Roadmap
  • Practical entry: Examples

Prereqs

  • Required packages and data for this page are listed in the setup chunks above.

Outputs

  • This page renders model fits, diagnostics, and summary artifacts generated by package APIs.

Interpretation

  • Canonical concept page: Model Umbrella
  • Treat this page as an application/example view and use the canonical page for core definitions.

Next

  • Continue to the linked canonical concept page, then return for implementation-specific details.
(c) CausalMixGPD - Bayesian semiparametric modeling for heavy-tailed data
- - Cite - API - GitHub