`R/pp_mixture.R`

`pp_mixture.brmsfit.Rd`

Compute the posterior probabilities of mixture component memberships for each observation, including uncertainty estimates.

```
# S3 method for brmsfit
pp_mixture(
  x,
  newdata = NULL,
  re_formula = NULL,
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  log = FALSE,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ...
)

pp_mixture(x, ...)
```

- x
An R object usually of class `brmsfit`.

- newdata
An optional data.frame for which to evaluate predictions. If `NULL` (default), the original data of the model is used. `NA` values within factors are interpreted as if all dummy variables of this factor are zero. This allows, for instance, predictions of the grand mean when using sum coding.

- re_formula
Formula containing group-level effects to be considered in the prediction. If `NULL` (default), include all group-level effects; if `NA`, include no group-level effects.

- resp
Optional names of response variables. If specified, predictions are performed only for the specified response variables.

- ndraws
Positive integer indicating how many posterior draws should be used. If `NULL` (the default), all draws are used. Ignored if `draw_ids` is not `NULL`.

- draw_ids
An integer vector specifying the posterior draws to be used. If `NULL` (the default), all draws are used.

- log
Logical; indicates whether to return probabilities on the log scale.

- summary
Should summary statistics be returned instead of the raw values? Default is `TRUE`.

- robust
If `FALSE` (the default), the mean is used as the measure of central tendency and the standard deviation as the measure of variability. If `TRUE`, the median and the median absolute deviation (MAD) are applied instead. Only used if `summary` is `TRUE`.

- probs
The percentiles to be computed by the `quantile` function. Only used if `summary` is `TRUE`.

- ...
Further arguments passed to `prepare_predictions` that control several aspects of data validation and prediction.

If `summary = TRUE`, an N x E x K array, where N is the number of observations, K is the number of mixture components, and E is equal to `length(probs) + 2`.
If `summary = FALSE`, an S x N x K array, where S is the number of posterior draws.
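
To make the two return shapes concrete, the following base-R sketch builds a stand-in S x N x K array of membership draws and collapses it into an N x E x K summary array. The simulated `draws` array and the helper `summarise_draws_sketch()` are hypothetical; they only mimic the documented meaning of `summary`, `robust`, and `probs` and are not the package's internal code. The dimension names are likewise an assumption made for readability.

```r
set.seed(1)
S <- 200; N <- 5; K <- 2   # draws, observations, components (made-up sizes)

# Stand-in for the summary = FALSE result: an S x N x K array whose
# K probabilities sum to 1 for every draw and observation.
p1 <- matrix(rbeta(S * N, 2, 5), nrow = S, ncol = N)
draws <- array(c(p1, 1 - p1), dim = c(S, N, K))

# Hypothetical helper mimicking the documented behavior of summary = TRUE.
summarise_draws_sketch <- function(draws, robust = FALSE,
                                   probs = c(0.025, 0.975)) {
  center <- if (robust) median else mean
  spread <- if (robust) mad else sd
  # apply() over (observation, component) yields an E x N x K array
  out <- apply(draws, c(2, 3), function(v) {
    c(center(v), spread(v), quantile(v, probs = probs, names = FALSE))
  })
  out <- aperm(out, c(2, 1, 3))   # reorder to N x E x K
  dimnames(out) <- list(
    NULL,
    c("Estimate", "Est.Error", paste0("Q", probs * 100)),
    NULL
  )
  out
}

summ <- summarise_draws_sketch(draws)
dim(summ)   # N x (length(probs) + 2) x K
```

Passing `robust = TRUE` to the helper swaps in the median and MAD, and a longer `probs` vector widens the second dimension accordingly.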

The returned probabilities can be written as
\(P(K_n = k \mid Y_n)\), that is, the posterior probability
that observation n originates from component k.
They are computed using Bayes' theorem
$$P(K_n = k \mid Y_n) = \frac{P(Y_n \mid K_n = k) \, P(K_n = k)}{P(Y_n)},$$
where \(P(Y_n \mid K_n = k)\) is the (posterior) likelihood
of observation n for component k, \(P(K_n = k)\) is
the (posterior) mixing probability of component k
(i.e., parameter `theta<k>`), and
$$P(Y_n) = \sum_{k=1}^{K} P(Y_n \mid K_n = k) \, P(K_n = k)$$
is a normalizing constant.
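
As an illustration of the formula above, this base-R sketch evaluates the membership probabilities for a two-component Gaussian mixture at a single set of parameter values. All numbers (`theta`, `mu`, `sigma`, `y`) are invented for the example and do not come from a fitted model, and `log_sum_exp()` is a hypothetical helper. The calculation is done on the log scale, which is one common way to keep it numerically stable; whether the package computes it exactly this way is an assumption.

```r
# Hypothetical parameter values for ONE posterior draw of a two-component
# Gaussian mixture (made up for illustration).
theta <- c(0.6, 0.4)              # mixing probabilities P(K_n = k)
mu    <- c(0, 2)                  # component means
sigma <- c(1, 1)                  # component standard deviations
y     <- c(-1.0, 0.5, 1.0, 3.0)   # observations Y_n

# Log numerator: log P(Y_n | K_n = k) + log P(K_n = k), an N x K matrix
log_num <- sapply(seq_along(theta), function(k) {
  dnorm(y, mean = mu[k], sd = sigma[k], log = TRUE) + log(theta[k])
})

# Log normalizing constant log P(Y_n), via log-sum-exp over components
log_sum_exp <- function(v) max(v) + log(sum(exp(v - max(v))))
log_py <- apply(log_num, 1, log_sum_exp)

log_post <- log_num - log_py   # membership probabilities on the log scale
post <- exp(log_post)          # and on the probability scale
rowSums(post)                  # each row sums to 1
```

For the observation at `y = 1`, which is equidistant from both component means, the two likelihoods coincide and the posterior membership probabilities reduce to the mixing probabilities `theta`. Repeating the calculation for every posterior draw would give an array of the S x N x K shape described for `summary = FALSE`.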

```
if (FALSE) {
  ## simulate some data
  set.seed(1234)
  dat <- data.frame(
    y = c(rnorm(100), rnorm(50, 2)),
    x = rnorm(150)
  )

  ## fit a simple normal mixture model
  mix <- mixture(gaussian, nmix = 2)
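  ## mu1 and mu2 are distributional parameters of the mixture,
  ## so their priors are addressed via dpar
  prior <- c(
    prior(normal(0, 5), Intercept, dpar = mu1),
    prior(normal(0, 5), Intercept, dpar = mu2),
    prior(dirichlet(2, 2), theta)
  )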
  fit1 <- brm(bf(y ~ x), dat, family = mix,
              prior = prior, chains = 2, init = 0)
  summary(fit1)

  ## compute the membership probabilities
  ppm <- pp_mixture(fit1)
  str(ppm)

  ## extract point estimates for each observation
  head(ppm[, 1, ])

  ## classify every observation according to
  ## the most likely component
  apply(ppm[, 1, ], 1, which.max)
}
```
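
The classification step at the end of the example can be tried without fitting a model. The sketch below uses a small stand-in array, `ppm_sketch`, in place of a real `pp_mixture()` result; its values and dimension names are invented for illustration, and the 0.7 cutoff used to flag ambiguous assignments is an arbitrary choice rather than a package default.

```r
# Stand-in for the N x E x K array returned with summary = TRUE
# (values invented; N = 3 observations, E = 4 statistics, K = 2 components).
est1 <- c(0.92, 0.55, 0.08)
ppm_sketch <- array(
  NA_real_,
  dim = c(3, 4, 2),
  dimnames = list(NULL, c("Estimate", "Est.Error", "Q2.5", "Q97.5"), NULL)
)
ppm_sketch[, "Estimate", 1] <- est1
ppm_sketch[, "Estimate", 2] <- 1 - est1

est  <- ppm_sketch[, "Estimate", ]   # N x K matrix of point estimates
cls  <- apply(est, 1, which.max)     # most likely component per observation
conf <- apply(est, 1, max)           # estimated probability of that component

# Flag observations whose best component is not clearly preferred
data.frame(component = cls, probability = conf, ambiguous = conf < 0.7)
```

Reporting the winning probability alongside the label shows which assignments are close calls, something `which.max` alone hides.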