
ex05. Conditional DPM (CRP Backend)

Website workflow note. This page reflects the current exported API and recommended wrapper-first usage. Last updated: 2026-02-19.

For the full package narrative, see the main package vignettes (basic, unconditional, conditional, and causal).

Conditional DPmix: CRP Backend with Covariates

Purpose: Show how the CRP backend can model \(y \mid X\) via a covariate-dependent Dirichlet process mixture. This vignette parallels start/backends-and-workflow but includes \(X\) so we can explore conditional predictions.

What you’ll learn

  • How to fit a conditional DP mixture \(y \mid X\) with the CRP backend.
  • How to form covariate-slice predictions with predict(fit, newdata = X_new, ...) and interpret them.
  • How kernel choice affects conditional fit even when GPD = FALSE.

When to use this template

  • You want flexible conditional density estimation \(p(y \mid X)\) without committing to a parametric regression form.
  • You expect heteroskedasticity, skewness, or multimodality that can vary with \(X\).
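
To make "multimodality that can vary with \(X\)" concrete, the base-R sketch below simulates a toy outcome whose upper mode becomes more likely, and more spread out, as x grows. The data-generating process, the names (sim_y, p_high, slice_summary), and every numeric setting are illustrative assumptions; none of them come from CausalMixGPD or from the nc_posX100_p3_k2 dataset.

```r
# Toy illustration (hypothetical settings): a two-component outcome whose
# mixing probability and spread both depend on a covariate x.
set.seed(1)
n <- 500
x <- runif(n, -2, 2)

# Probability of the upper component rises with x (logistic link).
p_high <- plogis(1.5 * x)
z <- rbinom(n, size = 1, prob = p_high)

# Lower component: mean 2, sd 0.5. Upper component: mean 6, sd grows with |x|.
sim_y <- ifelse(z == 1,
                rnorm(n, mean = 6, sd = 0.5 + 0.5 * abs(x)),
                rnorm(n, mean = 2, sd = 0.5))

# Compare two covariate slices: share of draws from the upper component,
# plus the mean and sd of the outcome within each slice.
slice_summary <- function(keep) {
  c(share_upper = mean(z[keep]), mean_y = mean(sim_y[keep]), sd_y = sd(sim_y[keep]))
}
rbind(low_x = slice_summary(x < -1), high_x = slice_summary(x > 1))
```

A single regression line through these data would describe only how the average moves with x; the change in mode membership and spread across slices is the kind of structure a conditional mixture is meant to capture.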

Next steps

  • If extremes matter for some covariate regions, extend to conditional tail modeling (ex07/ex08).

Data Setup

Code
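# Packages used on this page (the site's own setup chunk is not shown here;
# the package name is assumed to match the site title).
library(CausalMixGPD)
library(ggplot2)
library(tibble)
library(dplyr)

# Load the bundled example dataset: outcome y and covariate matrix X.
data("nc_posX100_p3_k2")
y <- nc_posX100_p3_k2$y
X <- as.matrix(nc_posX100_p3_k2$X)
# Give the covariates default names x1, x2, ... if the dataset has none.
if (is.null(colnames(X))) {
  colnames(X) <- paste0("x", seq_len(ncol(X)))
}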

summary_tbl <- tibble(
  statistic = c("N", "Mean", "SD", "Min", "Max"),
  value = c(length(y), mean(y), sd(y), min(y), max(y))
)

df_cov <- data.frame(y = y, x1 = X[, 1], x2 = X[, 2])

p_cov <- ggplot(df_cov, aes(x = x1, y = y)) +
  geom_point(alpha = 0.6, color = "steelblue") +
  geom_smooth(method = "loess", color = "firebrick", fill = NA) +
  labs(title = "y vs X1 (loess smoother)", x = "X1", y = "y") +
  theme_minimal()

print(p_cov)

Code
summary_tbl
# A tibble: 5 × 2
  statistic   value
  <chr>       <dbl>
1 N         100    
2 Mean        3.45 
3 SD          2.41 
4 Min         0.377
5 Max        10.9  

Model Specification & Bundle

Code
# `mcmc` holds the MCMC settings defined in the page's setup chunk (not shown here).
bundle_cond_normal <- bundle(
  y = y,
  X = X,
  kernel = "normal",
  backend = "crp",
  GPD = FALSE,
  components = 5,
  mcmc = mcmc
)

bundle_cond_amoroso <- bundle(
  y = y,
  X = X,
  kernel = "amoroso",
  backend = "crp",
  GPD = FALSE,
  components = 5,
  mcmc = mcmc
)

Running MCMC

Code
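# load_or_fit() is a website helper (not shown here); judging by its name and
# arguments, it reuses a cached fit under the given key or runs dpmix() otherwise.
fit_cond_normal <- load_or_fit(
  "ex05-conditional-dpm-crp-fit_cond_normal",
  dpmix(bundle_cond_normal)
)
fit_cond_amoroso <- load_or_fit(
  "ex05-conditional-dpm-crp-fit_cond_amoroso",
  dpmix(bundle_cond_amoroso)
)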
summary(fit_cond_normal)
MixGPD summary | backend: Chinese Restaurant Process | kernel: Normal Distribution | GPD tail: FALSE | epsilon: 0.025
n = 100 | components = 5
Summary
Initial components: 5 | Components after truncation: 2

WAIC: 365.588
lppd: -144.375 | pWAIC: 38.418

Summary table
  parameter  mean    sd q0.025 q0.500 q0.975    ess
 weights[1] 0.529 0.093  0.357   0.53  0.723 13.352
 weights[2] 0.342 0.097  0.167   0.34   0.49  12.52
      alpha 0.615  0.36  0.129  0.542  1.359 56.464
    mean[1] 2.613 1.359  1.427  1.989  5.785 51.113
    mean[2] 4.747 1.916  1.424  5.448  7.147 44.946
      sd[1] 1.156   0.7  0.145  1.176  2.451 32.227
      sd[2] 0.804 0.804  0.166  0.392  2.919 73.942
Code
summary(fit_cond_amoroso)
MixGPD summary | backend: Chinese Restaurant Process | kernel: Amoroso Distribution | GPD tail: FALSE | epsilon: 0.025
n = 100 | components = 5
Summary
Initial components: 5 | Components after truncation: 1

WAIC: 415.282
lppd: -194.939 | pWAIC: 12.702

Summary table
  parameter  mean    sd q0.025 q0.500 q0.975   ess
 weights[1] 0.984 0.032  0.887      1      1 15.19
      alpha 0.282 0.323  0.007  0.197  0.868   150
     loc[1] 0.217 0.194 -0.229  0.294  0.373  9.39
   scale[1] 4.273 1.337  2.465  3.844  6.779 1.894
  shape1[1] 0.858 0.378  0.374  0.829  1.489 2.235
  shape2[1] 1.707 0.411  1.277   1.55  2.549 3.986
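
The WAIC, lppd, and pWAIC lines in the two summaries above are consistent with the usual deviance-scale definition, WAIC = -2 * (lppd - pWAIC). The short check below redoes that arithmetic from the printed values; the vector names are illustrative, and the formula is inferred from the numbers rather than taken from the package source.

```r
# Values copied from the two summary() printouts above.
lppd   <- c(normal = -144.375, amoroso = -194.939)
p_waic <- c(normal = 38.418,   amoroso = 12.702)

# Usual WAIC on the deviance scale: -2 * (lppd - pWAIC).
waic <- -2 * (lppd - p_waic)
round(waic, 3)

# Difference between the two fits; lower WAIC indicates better
# estimated out-of-sample predictive fit.
waic["amoroso"] - waic["normal"]
```

The recomputed values agree with the reported 365.588 and 415.282 up to rounding of the printed inputs, and the normal-kernel fit has the lower WAIC on this dataset.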
Code
params_cond <- params(fit_cond_normal)
params_cond
Posterior mean parameters

$alpha
[1] 0.6151

$w
[1] 0.5287 0.3424

$mean
[1] 2.613 4.747

$sd
[1] 1.1560 0.8038
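
As a rough reading aid for the posterior-mean parameters above, the sketch below plugs them into a two-component normal mixture and evaluates it on a grid. This is only an approximation under stated assumptions: it ignores any covariate dependence and all posterior uncertainty, and it renormalizes the two retained weights (which sum to about 0.87 after truncation). It is not how predict() computes conditional densities.

```r
# Posterior means copied from the params(fit_cond_normal) printout above.
w  <- c(0.5287, 0.3424)
mu <- c(2.613, 4.747)
s  <- c(1.1560, 0.8038)

# The retained weights sum to less than 1 after truncation, so renormalize
# (an assumption made here for illustration only).
w_norm <- w / sum(w)

# Plug-in two-component normal mixture density on a grid of outcome values.
y_grid <- seq(-1, 10, length.out = 200)
plug_in_density <- vapply(
  y_grid,
  function(y) sum(w_norm * dnorm(y, mean = mu, sd = s)),
  numeric(1)
)

# Location of the highest point of the plug-in density on the grid.
y_grid[which.max(plug_in_density)]
```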

Conditional Predictions

Code
X_new <- expand.grid(
  x1 = seq(-2, 2, length.out = 3),
  x2 = c(0, 1),
  x3 = 0
)
colnames(X_new) <- colnames(X)

y_grid <- seq(-1, 10, length.out = 200)
densities_normal <- lapply(seq_len(nrow(X_new)), function(i) {
  pred <- predict(
    fit_cond_normal,
    newdata = as.matrix(X_new[i, , drop = FALSE]),
    y = y_grid,
    type = "density"
  )
  data.frame(
    y = pred$fit$y,
    density = pred$fit$density,
    # Index covariates by position so the labels work whatever colnames(X) are.
    group = paste0("x1=", round(X_new[i, 1], 1), ", x2=", X_new[i, 2]),
    model = "Normal"
  )
})

densities_amoroso <- lapply(seq_len(nrow(X_new)), function(i) {
  pred <- predict(
    fit_cond_amoroso,
    newdata = as.matrix(X_new[i, , drop = FALSE]),
    y = y_grid,
    type = "density"
  )
  data.frame(
    y = pred$fit$y,
    density = pred$fit$density,
    # Index covariates by position so the labels work whatever colnames(X) are.
    group = paste0("x1=", round(X_new[i, 1], 1), ", x2=", X_new[i, 2]),
    model = "Amoroso"  # shape1 is estimated in this fit (see summary above), not fixed at 1
  )
})

df_dens <- bind_rows(densities_normal, densities_amoroso)

ggplot(df_dens, aes(x = y, y = density, color = group)) +
  geom_line(linewidth = 1) +
  facet_wrap(~ model) +
  labs(title = "Conditional Predictive Densities", x = "y", y = "Density") +
  theme_minimal() +
  theme(legend.position = "bottom")
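
One quick sanity check on gridded densities like these is that each curve integrates to roughly 1 over the evaluation grid; a noticeably smaller area means the grid cuts off part of the distribution. The sketch below shows the check with the trapezoid rule on a stand-in curve. The helper trapz and the toy mixture are assumptions made for illustration, not package output.

```r
# Trapezoid rule for a curve sampled at points (x, f).
trapz <- function(x, f) {
  sum(diff(x) * (head(f, -1) + tail(f, -1)) / 2)
}

# Stand-in for one covariate slice (hypothetical values, not package output).
y_grid <- seq(-1, 10, length.out = 200)
toy <- data.frame(
  y = y_grid,
  density = 0.6 * dnorm(y_grid, 2.5, 1) + 0.4 * dnorm(y_grid, 5, 0.8),
  group = "toy slice"
)

# One area per slice; values well below 1 mean the grid misses some mass.
areas <- vapply(
  split(toy, toy$group),
  function(d) trapz(d$y, d$density),
  numeric(1)
)
areas
```

The same pattern should carry over to df_dens by splitting on interaction(df_dens$model, df_dens$group), since that data frame has the same y and density columns.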


Covariate Effect on Conditional Quantiles

Code
X_grid <- cbind(
  x1 = seq(-2, 2, length.out = 5),
  x2 = 0,
  x3 = 0
)
colnames(X_grid) <- colnames(X)

quant_probs <- c(0.25, 0.5, 0.75)
pred_q_normal <- predict(
  fit_cond_normal,
  newdata = as.matrix(X_grid),
  type = "quantile",
  p = quant_probs
)
pred_q_amoroso <- predict(
  fit_cond_amoroso,
  newdata = as.matrix(X_grid),
  type = "quantile",
  p = quant_probs
)

quant_df_normal <- pred_q_normal$fit
quant_df_normal$x1 <- X_grid[quant_df_normal$id, 1]  # first covariate, by position
quant_df_normal$model <- "Normal"

quant_df_amoroso <- pred_q_amoroso$fit
quant_df_amoroso$x1 <- X_grid[quant_df_amoroso$id, 1]  # first covariate, by position
quant_df_amoroso$model <- "Amoroso"  # shape1 is estimated in this fit, not fixed at 1

bind_rows(quant_df_normal, quant_df_amoroso) %>%
  ggplot(aes(x = x1, y = estimate, color = factor(index), group = index)) +
  geom_line(linewidth = 1) +
  geom_point(size = 2) +
  facet_wrap(~ model) +
  labs(title = "Conditional Quantiles vs x1 (x2=0)", x = "x1", y = "y", color = "Quantile") +
  theme_minimal()
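
For intuition about what a mixture quantile is, the sketch below inverts a mixture CDF numerically with uniroot(). The weights, means, standard deviations, and helper names are hypothetical, and this is not necessarily how predict(..., type = "quantile") obtains its estimates; it only shows that a mixture quantile has no closed form and has to be found by root-finding on the CDF.

```r
# Mixture CDF for fixed weights, means, and sds (hypothetical values below).
mix_cdf <- function(q, w, mu, s) sum(w * pnorm(q, mean = mu, sd = s))

# Invert the CDF numerically: find q such that F(q) = p.
mix_quantile <- function(p, w, mu, s) {
  lower <- min(mu - 10 * s)  # bracket wide enough that F(lower) is ~0
  upper <- max(mu + 10 * s)  # and F(upper) is ~1
  uniroot(function(q) mix_cdf(q, w, mu, s) - p,
          lower = lower, upper = upper, tol = 1e-8)$root
}

w  <- c(0.6, 0.4)
mu <- c(2.5, 5)
s  <- c(1, 0.8)
probs <- c(0.25, 0.5, 0.75)

q_hat <- vapply(probs, mix_quantile, numeric(1), w = w, mu = mu, s = s)
q_hat
```

Because the CDF is monotone, quantiles found this way are always ordered in p, which is a useful property to confirm in any quantile-versus-covariate plot.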


Residuals & Diagnostics

Code
fit_vals <- fitted(fit_cond_normal)
plot(fit_vals)

Code
plot(fit_cond_normal, family = c("traceplot", "autocorrelation", "geweke"))

=== traceplot ===


=== autocorrelation ===


=== geweke ===

Code
plot(fit_cond_amoroso, family = c("density", "running", "caterpillar"))

=== density ===


=== running ===


=== caterpillar ===


Takeaways

  • Covariate-informed DP mixtures predict outcome distributions that shift with x1 (and other covariates).
  • Use predict(..., type = "density") to visualize conditional densities and type = "quantile" to compare conditional distribution shifts.
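  • Check diagnostics (plot(fit_cond_normal), fitted(fit_cond_normal)) to confirm the chains mix before relying on predictions. The single-digit ESS values for several Amoroso parameters in the summary above (for example, 1.894 for scale[1]) indicate that fit has not mixed well and would need a longer run.
  • The next vignette (ex06-conditional-dpm-sb) extends the same idea to the SB backend before tails are added.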

Workflow Navigation

  • Previous: ex04-unconditional-dpmgpd-sb
  • Next: ex06-conditional-dpm-sb
  • Workflow index: Roadmap
  • Practical entry: Examples

Prereqs

  • Required packages and data for this page are listed in the setup chunks above.

Outputs

  • This page renders model fits, diagnostics, and summary artifacts generated by package APIs.

Interpretation

  • Canonical concept page: Model Umbrella
  • Treat this page as an application/example view and use the canonical page for core definitions.

Next

  • Continue to the linked canonical concept page, then return for implementation-specific details.