ex05. Conditional DPM (CRP Backend)

Website workflow note. This page reflects the current exported API and recommended wrapper-first usage. Last updated: 2026-02-19.

For the full package narrative, see the main package vignettes (basic, unconditional, conditional, and causal).

Conditional DPmix: CRP Backend with Covariates

Purpose: Show how the CRP backend can model \(y \mid X\) via a covariate-dependent Dirichlet Process mixture. This vignette parallels start/backends-and-workflow but includes \(X\) so we can explore conditional predictions.
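
As background on the "CRP" in the backend name, the following is a minimal base-R sketch of how a Chinese Restaurant Process assigns observations to clusters: observation i joins an existing cluster with probability proportional to its current size, or opens a new cluster with probability proportional to the concentration alpha. The function name simulate_crp is hypothetical, and this is an illustration of the prior only, not the package's sampler.

```r
# Sketch: simulate cluster labels from a Chinese Restaurant Process prior.
# Illustrative only; simulate_crp() is a hypothetical helper.
simulate_crp <- function(n, alpha) {
  stopifnot(n >= 1, alpha > 0)
  z <- integer(n)
  z[1] <- 1L
  counts <- 1L                       # occupancy of each existing cluster
  for (i in seq_len(n)[-1]) {
    # Existing cluster k has weight n_k; a new cluster has weight alpha.
    probs <- c(counts, alpha) / (i - 1 + alpha)
    k <- sample.int(length(probs), size = 1, prob = probs)
    if (k > length(counts)) {
      counts <- c(counts, 1L)        # open a new cluster
    } else {
      counts[k] <- counts[k] + 1L    # join an existing cluster
    }
    z[i] <- k
  }
  z
}

set.seed(1)
z <- simulate_crp(n = 100, alpha = 0.6)
table(z)  # cluster sizes; a small alpha tends to give few clusters
```

Smaller values of alpha favor fewer, larger clusters, which is the role the alpha parameter plays in the model summaries on this page.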

What you’ll learn

How to fit a conditional DP mixture for \(y \mid X\) with the CRP backend.
How to form covariate-slice predictions with predict(fit, newdata = X_new, ...) and interpret them.
How kernel choice affects conditional fit even when GPD = FALSE.

When to use this template

You want flexible conditional density estimation of \(p(y \mid X)\) without committing to a parametric regression form.
You expect heteroskedasticity, skewness, or multimodality that can vary with \(X\).

Next steps

If extremes matter for some covariate regions, extend to conditional tail modeling (ex07/ex08).

Data Setup

Code

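# Helper packages used on this page. The modeling package that provides
# bundle(), dpmix(), and the dataset must also be attached.
library(ggplot2)  # plotting
library(tibble)   # tibble()
library(dplyr)    # bind_rows(), %>%

data("nc_posX100_p3_k2")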
y <- nc_posX100_p3_k2$y
X <- as.matrix(nc_posX100_p3_k2$X)
if (is.null(colnames(X))) {
  colnames(X) <- paste0("x", seq_len(ncol(X)))
}

summary_tbl <- tibble(
  statistic = c("N", "Mean", "SD", "Min", "Max"),
  value = c(length(y), mean(y), sd(y), min(y), max(y))
)

df_cov <- data.frame(y = y, x1 = X[, 1], x2 = X[, 2])

p_cov <- ggplot(df_cov, aes(x = x1, y = y)) +
  geom_point(alpha = 0.6, color = "steelblue") +
  geom_smooth(method = "loess", color = "firebrick", fill = NA) +
  labs(title = "y vs X1 (loess smoother)", x = "X1", y = "y") +
  theme_minimal()

print(p_cov)

Code

summary_tbl

# A tibble: 5 × 2
  statistic   value
  <chr>       <dbl>
1 N         100    
2 Mean        3.45 
3 SD          2.41 
4 Min         0.377
5 Max        10.9

Model Specification & Bundle

Code

bundle_cond_normal <- bundle(
  y = y,
  X = X,
  kernel = "normal",
  backend = "crp",
  GPD = FALSE,
  components = 5,
  mcmc = mcmc
)

bundle_cond_amoroso <- bundle(
  y = y,
  X = X,
  kernel = "amoroso",
  backend = "crp",
  GPD = FALSE,
  components = 5,
  mcmc = mcmc
)

Running MCMC

Code

fit_cond_normal <- load_or_fit("ex05-conditional-dpm-crp-fit_cond_normal", dpmix(bundle_cond_normal))
fit_cond_amoroso <- load_or_fit("ex05-conditional-dpm-crp-fit_cond_amoroso", dpmix(bundle_cond_amoroso))
summary(fit_cond_normal)

MixGPD summary | backend: Chinese Restaurant Process | kernel: Normal Distribution | GPD tail: FALSE | epsilon: 0.025
n = 100 | components = 5
Summary
Initial components: 5 | Components after truncation: 2

WAIC: 365.588
lppd: -144.375 | pWAIC: 38.418

Summary table
  parameter  mean    sd q0.025 q0.500 q0.975    ess
 weights[1] 0.529 0.093  0.357   0.53  0.723 13.352
 weights[2] 0.342 0.097  0.167   0.34   0.49  12.52
      alpha 0.615  0.36  0.129  0.542  1.359 56.464
    mean[1] 2.613 1.359  1.427  1.989  5.785 51.113
    mean[2] 4.747 1.916  1.424  5.448  7.147 44.946
      sd[1] 1.156   0.7  0.145  1.176  2.451 32.227
      sd[2] 0.804 0.804  0.166  0.392  2.919 73.942
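
The load_or_fit() helper used above is not defined on this page. As a hedged illustration of the caching pattern its name suggests, the sketch below stores a fitted object as an .rds file and reuses it on later calls. The name load_or_fit_sketch, the cache_dir argument, and the file layout are all assumptions; the site's actual helper may behave differently.

```r
# Hypothetical caching helper; the real load_or_fit() may differ.
load_or_fit_sketch <- function(name, expr, cache_dir = tempdir()) {
  path <- file.path(cache_dir, paste0(name, ".rds"))
  if (file.exists(path)) {
    return(readRDS(path))            # reuse the cached object
  }
  fit <- expr                        # lazy evaluation: expr runs only here
  saveRDS(fit, path)
  fit
}

# Usage with a stand-in for an expensive model fit.
n_runs <- 0
slow_fit <- function() {
  n_runs <<- n_runs + 1
  list(value = 42)
}
cache_dir <- file.path(tempdir(), "fit-cache-demo")
unlink(cache_dir, recursive = TRUE)  # start from a clean cache for the demo
dir.create(cache_dir)
first  <- load_or_fit_sketch("demo-fit", slow_fit(), cache_dir)
second <- load_or_fit_sketch("demo-fit", slow_fit(), cache_dir)
n_runs  # counts how many times the stand-in fit actually ran
```

Because R evaluates function arguments lazily, the expensive expression is never evaluated when a cached file already exists.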

Code

summary(fit_cond_amoroso)

MixGPD summary | backend: Chinese Restaurant Process | kernel: Amoroso Distribution | GPD tail: FALSE | epsilon: 0.025
n = 100 | components = 5
Summary
Initial components: 5 | Components after truncation: 1

WAIC: 415.282
lppd: -194.939 | pWAIC: 12.702

Summary table
  parameter  mean    sd q0.025 q0.500 q0.975   ess
 weights[1] 0.984 0.032  0.887      1      1 15.19
      alpha 0.282 0.323  0.007  0.197  0.868   150
     loc[1] 0.217 0.194 -0.229  0.294  0.373  9.39
   scale[1] 4.273 1.337  2.465  3.844  6.779 1.894
  shape1[1] 0.858 0.378  0.374  0.829  1.489 2.235
  shape2[1] 1.707 0.411  1.277   1.55  2.549 3.986
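
The WAIC values in the two summaries can be checked by hand. The printed numbers agree, up to rounding, with the common deviance-scale definition WAIC = -2 (lppd - pWAIC); that the package uses exactly this convention is an assumption inferred from the output. The helper name waic_from_parts is hypothetical.

```r
# WAIC on the deviance scale: -2 * (lppd - pWAIC).
waic_from_parts <- function(lppd, p_waic) -2 * (lppd - p_waic)

# Values copied from the two model summaries above.
waic_normal  <- waic_from_parts(lppd = -144.375, p_waic = 38.418)
waic_amoroso <- waic_from_parts(lppd = -194.939, p_waic = 12.702)

round(c(normal = waic_normal, amoroso = waic_amoroso), 3)
waic_amoroso - waic_normal  # positive: the normal kernel has the lower WAIC
```

Lower WAIC is preferred, so on this criterion the normal kernel fits better here. The comparison should be read with caution because several effective sample sizes in the summaries are small.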

Code

params_cond <- params(fit_cond_normal)
params_cond

Posterior mean parameters

$alpha
[1] 0.6151

$w
[1] 0.5287 0.3424

$mean
[1] 2.613 4.747

$sd
[1] 1.1560 0.8038
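
To see what these posterior means describe, the sketch below builds a plug-in two-component normal mixture from the printed values. Two assumptions are made for illustration: the retained weights are renormalized to sum to 1, and covariates are ignored, so this is not the conditional predictive density returned by predict(). The names w, mu, s, and mix_density are hypothetical.

```r
# Plug-in two-component normal mixture from the posterior means printed above.
# Assumptions: weights are renormalized to sum to 1; covariates are ignored.
w  <- c(0.5287, 0.3424)
mu <- c(2.613, 4.747)
s  <- c(1.1560, 0.8038)
w_norm <- w / sum(w)

mix_density <- function(y) {
  # Sum over components of weight * normal density, for each value in y.
  vapply(y, function(v) sum(w_norm * dnorm(v, mean = mu, sd = s)), numeric(1))
}

sum(w)                                   # retained weight; below 1 after truncation
integrate(mix_density, -Inf, Inf)$value  # should be close to 1
```

The retained weights sum to less than 1 because components with negligible weight were dropped, as reported in the "Components after truncation" line of the summary.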

Conditional Predictions

Code

X_new <- expand.grid(
  x1 = seq(-2, 2, length.out = 3),
  x2 = c(0, 1),
  x3 = 0
)
colnames(X_new) <- colnames(X)

y_grid <- seq(-1, 10, length.out = 200)
densities_normal <- lapply(seq_len(nrow(X_new)), function(i) {
  pred <- predict(
    fit_cond_normal,
    newdata = as.matrix(X_new[i, , drop = FALSE]),
    y = y_grid,
    type = "density"
  )
  data.frame(
    y = pred$fit$y,
    density = pred$fit$density,
    group = paste0("x1=", round(X_new[i, "x1"], 1), ", x2=", X_new[i, "x2"]),
    model = "Normal"
  )
})

densities_amoroso <- lapply(seq_len(nrow(X_new)), function(i) {
  pred <- predict(
    fit_cond_amoroso,
    newdata = as.matrix(X_new[i, , drop = FALSE]),
    y = y_grid,
    type = "density"
  )
  data.frame(
    y = pred$fit$y,
    density = pred$fit$density,
    group = paste0("x1=", round(X_new[i, "x1"], 1), ", x2=", X_new[i, "x2"]),
    model = "Amoroso"  # shape1 is estimated in this fit (see summary), not fixed at 1
  )
})

df_dens <- bind_rows(densities_normal, densities_amoroso)

ggplot(df_dens, aes(x = y, y = density, color = group)) +
  geom_line(linewidth = 1) +
  facet_wrap(~ model) +
  labs(title = "Conditional Predictive Densities", x = "y", y = "Density") +
  theme_minimal() +
  theme(legend.position = "bottom")
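
A quick sanity check on gridded densities like these is to integrate each curve numerically and see how much probability mass the grid captures. The sketch below uses a trapezoidal rule on a synthetic stand-in data frame with the same y, density, and group columns; trapz, y_grid_demo, and df_demo are hypothetical names, and the normal curves are placeholders rather than model output.

```r
# Trapezoidal rule for a curve sampled on a grid.
trapz <- function(x, f) sum(diff(x) * (head(f, -1) + tail(f, -1)) / 2)

# Synthetic stand-in with the same columns as the density data frames above.
y_grid_demo <- seq(-1, 10, length.out = 200)
df_demo <- rbind(
  data.frame(y = y_grid_demo, density = dnorm(y_grid_demo, 3, 1), group = "a"),
  data.frame(y = y_grid_demo, density = dnorm(y_grid_demo, 8, 2), group = "b")
)

# Probability mass captured on the grid, per group.
mass <- sapply(split(df_demo, df_demo$group), function(d) trapz(d$y, d$density))
mass
```

A value well below 1 means the grid truncates the distribution. That is worth checking here because the grid stops at 10 while the largest observed y is 10.9.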

Covariate Effect on Conditional Quantiles

Code

X_grid <- cbind(
  x1 = seq(-2, 2, length.out = 5),
  x2 = 0,
  x3 = 0
)
colnames(X_grid) <- colnames(X)

quant_probs <- c(0.25, 0.5, 0.75)
pred_q_normal <- predict(
  fit_cond_normal,
  newdata = as.matrix(X_grid),
  type = "quantile",
  p = quant_probs
)
pred_q_amoroso <- predict(
  fit_cond_amoroso,
  newdata = as.matrix(X_grid),
  type = "quantile",
  p = quant_probs
)

quant_df_normal <- pred_q_normal$fit
quant_df_normal$x1 <- X_grid[quant_df_normal$id, "x1"]
quant_df_normal$model <- "Normal"

quant_df_amoroso <- pred_q_amoroso$fit
quant_df_amoroso$x1 <- X_grid[quant_df_amoroso$id, "x1"]
quant_df_amoroso$model <- "Amoroso"  # shape1 is estimated in this fit (see summary), not fixed at 1

bind_rows(quant_df_normal, quant_df_amoroso) %>%
  ggplot(aes(x = x1, y = estimate, color = factor(index), group = index)) +
  geom_line(linewidth = 1) +
  geom_point(size = 2) +
  facet_wrap(~ model) +
  labs(title = "Conditional Quantiles vs x1 (x2=0)", x = "x1", y = "y", color = "Quantile") +
  theme_minimal()
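
One way to summarize heteroskedasticity from a long-format quantile table is the interquartile range at each covariate row. The sketch below works on a synthetic table with the id, index, and estimate columns used in the plotting code above. It assumes that index 1 and index 3 correspond to the 0.25 and 0.75 quantiles, which should be verified against the actual predict() output; quant_demo and its values are made up for illustration.

```r
# Synthetic stand-in for a long-format quantile table.
# Assumed columns: id (covariate row), index (quantile position), estimate.
quant_demo <- data.frame(
  id = rep(1:3, each = 3),
  index = rep(1:3, times = 3),
  estimate = c(1.0, 2.0, 3.0, 1.5, 3.0, 5.0, 2.0, 4.0, 7.0)
)

# One row per id, one column per quantile position.
wide <- reshape(quant_demo, idvar = "id", timevar = "index", direction = "wide")

# Interquartile range: upper quartile minus lower quartile.
wide$iqr <- wide$estimate.3 - wide$estimate.1
wide[, c("id", "iqr")]
```

An IQR that changes across rows indicates that the spread of \(y\) varies with the covariates, not only its location.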

Residuals & Diagnostics

Code

fit_vals <- fitted(fit_cond_normal)
plot(fit_vals)

Code

plot(fit_cond_normal, family = c("traceplot", "autocorrelation", "geweke"))


=== traceplot ===


=== autocorrelation ===


=== geweke ===

Code

plot(fit_cond_amoroso, family = c("density", "running", "caterpillar"))


=== density ===


=== running ===


=== caterpillar ===
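
The ess column in the summaries reports effective sample size. As a rough illustration of what that quantity measures, the sketch below estimates it from the empirical autocorrelation function in base R, summing autocorrelations up to the first non-positive lag. This truncation rule and the name ess_sketch are assumptions for illustration; the package may compute ESS differently.

```r
# Rough effective sample size from the empirical autocorrelation function.
# Assumption: sum autocorrelations up to the first non-positive lag.
ess_sketch <- function(x) {
  n <- length(x)
  rho <- as.numeric(acf(x, lag.max = n - 1, plot = FALSE)$acf)[-1]  # drop lag 0
  cut <- which(rho <= 0)[1]
  if (!is.na(cut)) rho <- rho[seq_len(cut - 1)]
  n / (1 + 2 * sum(rho))
}

set.seed(1)
iid    <- rnorm(500)                                       # independent draws
sticky <- as.numeric(arima.sim(list(ar = 0.95), n = 500))  # slowly mixing chain
c(iid = ess_sketch(iid), sticky = ess_sketch(sticky))
```

A strongly autocorrelated chain carries far less information than its length suggests, which is why small ESS values in a summary table call for longer runs or thinning before the estimates are used.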

Takeaways

Covariate-informed DP mixtures predict outcome distributions that shift with x1 (and other covariates).
Use predict(..., type = "density") to visualize conditional densities and type = "quantile" to compare conditional distribution shifts.
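Check diagnostics (plot(fit_cond_normal), fitted(fit_cond_normal)) to confirm the chains mix before relying on predictions. In the summaries above, ESS is below 4 for the Amoroso scale and shape parameters, so that fit needs longer chains before its predictions can be trusted.
The next vignette extends the same idea to the SB backend before adding tails.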

Prereqs

Required packages and data for this page are listed in the setup chunks above.

Outputs

This page renders model fits, diagnostics, and summary artifacts generated by package APIs.

Interpretation

Canonical concept page: Model Umbrella
Treat this page as an application/example view and use the canonical page for core definitions.

Continue to the linked canonical concept page, then return for implementation-specific details.
