`R/posterior_predict.R`

`posterior_predict.brmsfit.Rd`

Compute posterior draws of the posterior predictive distribution. Can be
performed for the data used to fit the model (posterior predictive checks) or
for new data. By definition, these draws have higher variance than draws
of the expected value of the posterior predictive distribution computed by
`posterior_epred.brmsfit`. This is because the residual error
is incorporated in `posterior_predict`. However, the estimated means of
both methods averaged across draws should be very similar.
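The variance claim above can be illustrated without fitting a model. The following is a minimal base R sketch, assuming a simple normal model; `mu` and `sigma` are simulated stand-ins for posterior draws and are not produced by brms.

```r
set.seed(1)
S <- 4000                                   # number of simulated "posterior" draws

# Assumption: stand-in posterior draws for a normal model (not brms output)
mu    <- rnorm(S, mean = 10, sd = 0.2)      # draws of the mean
sigma <- abs(rnorm(S, mean = 2, sd = 0.1))  # draws of the residual SD

epred <- mu                                 # expected value: no residual error
pred  <- rnorm(S, mean = mu, sd = sigma)    # predictive draw: residual error added

# Means are close, but the predictive draws are far more variable
c(mean_epred = mean(epred), mean_pred = mean(pred))
c(var_epred = var(epred), var_pred = var(pred))
```

Here `epred` plays the role of the draws returned by `posterior_epred` and `pred` the role of those returned by `posterior_predict`; only the latter include the residual term.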

```
# S3 method for brmsfit
posterior_predict(
  object,
  newdata = NULL,
  re_formula = NULL,
  re.form = NULL,
  transform = NULL,
  resp = NULL,
  negative_rt = FALSE,
  ndraws = NULL,
  draw_ids = NULL,
  sort = FALSE,
  ntrys = 5,
  cores = NULL,
  ...
)
```

- object
An object of class `brmsfit`.

- newdata
An optional data.frame for which to evaluate predictions. If `NULL` (default), the original data of the model is used. `NA` values within factors are interpreted as if all dummy variables of this factor are zero. This makes it possible, for instance, to predict the grand mean when using sum coding.

- re_formula
Formula containing group-level effects to be considered in the prediction. If `NULL` (default), include all group-level effects; if `NA`, include no group-level effects.

- re.form
Alias of `re_formula`.

- transform
(Deprecated) A function or a character string naming a function to be applied on the predicted responses before summary statistics are computed.

- resp
Optional names of response variables. If specified, predictions are performed only for the specified response variables.

- negative_rt
Only relevant for Wiener diffusion models. A flag indicating whether response times of responses on the lower boundary should be returned as negative values. This makes it possible to distinguish responses on the upper and lower boundary. Defaults to `FALSE`.

- ndraws
Positive integer indicating how many posterior draws should be used. If `NULL` (the default), all draws are used. Ignored if `draw_ids` is not `NULL`.

- draw_ids
An integer vector specifying the posterior draws to be used. If `NULL` (the default), all draws are used.

- sort
Logical. Only relevant for time series models. Indicates whether to return predicted values in the original order (`FALSE`; default) or in the order of the time series (`TRUE`).

- ntrys
Parameter used in rejection sampling for truncated discrete models only (defaults to `5`). See Details for more information.

- cores
Number of cores (defaults to `1`). On non-Windows systems, this argument can be set globally via the `mc.cores` option.

- ...
Further arguments passed to `prepare_predictions` that control several aspects of data validation and prediction.

An `array` of draws. In univariate models,
the output is an S x N matrix, where S is the number of posterior
draws and N is the number of observations. In multivariate models, an
additional dimension is added to the output which indexes along the
different response variables.
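The layout described above determines how predictions are summarized: collapsing over rows (draws) gives one summary per observation. A minimal base R sketch follows, using simulated arrays as stand-ins for the returned object; the response names `y1` and `y2` are hypothetical.

```r
set.seed(2)
S <- 1000  # number of draws
N <- 5     # number of observations

# Stand-in for the S x N matrix of a univariate model (simulated, not brms output)
pp <- matrix(rpois(S * N, lambda = 3), nrow = S, ncol = N)
dim(pp)    # rows index draws, columns index observations

# Per-observation summaries: collapse over draws (rows)
pred_mean <- colMeans(pp)
pred_int  <- apply(pp, 2, quantile, probs = c(0.025, 0.975))  # 2 x N matrix

# Stand-in for a multivariate model with two responses: S x N x 2 array
pp_mv <- array(rpois(S * N * 2, lambda = 3), dim = c(S, N, 2),
               dimnames = list(NULL, NULL, c("y1", "y2")))
mean_mv <- apply(pp_mv, c(2, 3), mean)  # N x 2 matrix: one column per response
```

The same `colMeans` and `apply` calls should work on the matrix returned for a univariate model, since they only rely on the S x N layout.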

`NA` values within factors in `newdata`
are interpreted as if all dummy variables of this factor are
zero. This makes it possible, for instance, to predict the grand mean
when using sum coding.

In multilevel models, it is possible to
allow new levels of grouping factors to be used in the predictions.
This can be controlled via argument `allow_new_levels`.
New levels can be sampled in multiple ways, which can be controlled
via argument `sample_new_levels`. Both of these arguments are
documented in `prepare_predictions` along with several
other useful arguments to control specific aspects of the predictions.

For truncated discrete models only: In the absence of any general
algorithm to sample from truncated discrete distributions, rejection
sampling is applied in this special case. This means that values are
sampled until a value lies within the defined truncation boundaries. In
practice, this procedure may be rather slow (especially in R). Thus,
approximate rejection sampling is used instead: each value is sampled
`ntrys` times and a valid value is then selected. If all values are
invalid, the closest boundary is used instead. If there are more than a
few of these pathological cases, a warning is issued suggesting that
argument `ntrys` be increased.
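The approximate rejection sampling idea described above can be sketched in base R. This is an illustration under stated assumptions, not the brms implementation: the helper `rpois_trunc_approx`, the choice of the first valid candidate, and the 1% warning threshold are all hypothetical, and `lambda` is assumed to be a single number.

```r
# Hypothetical helper, not part of brms: draw n values from a Poisson
# distribution truncated to [lb, ub], using `ntrys` candidates per value.
rpois_trunc_approx <- function(n, lambda, lb, ub, ntrys = 5) {
  # ntrys x n matrix of candidates: one column per requested value
  cand  <- matrix(rpois(n * ntrys, lambda), nrow = ntrys, ncol = n)
  valid <- cand >= lb & cand <= ub
  out <- numeric(n)
  n_invalid <- 0
  for (j in seq_len(n)) {
    ok <- which(valid[, j])
    if (length(ok) > 0) {
      out[j] <- cand[ok[1], j]  # first candidate inside the bounds
    } else {
      # No valid candidate: fall back to the boundary closest to the first one
      out[j] <- if (cand[1, j] < lb) lb else ub
      n_invalid <- n_invalid + 1
    }
  }
  # Assumption: warn when more than 1% of values needed the fallback
  if (n_invalid > 0.01 * n) {
    warning("Many draws fell outside the bounds; consider increasing 'ntrys'.")
  }
  out
}

set.seed(3)
x <- rpois_trunc_approx(1000, lambda = 4, lb = 2, ub = 8)
range(x)
```

Because a fixed number of candidates is drawn up front, the work per value is bounded, at the cost of occasionally returning a boundary value when every candidate is rejected. Raising `ntrys` makes that fallback less likely.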

```
if (FALSE) {
  ## fit a model
  fit <- brm(time | cens(censored) ~ age + sex + (1 + age || patient),
             data = kidney, family = "exponential", init = "0")

  ## predicted responses
  pp <- posterior_predict(fit)
  str(pp)

  ## predicted responses excluding the group-level effect of age
  pp <- posterior_predict(fit, re_formula = ~ (1 | patient))
  str(pp)

  ## predicted responses of patient 1 for new data
  newdata <- data.frame(
    sex = factor(c("male", "female")),
    age = c(20, 50),
    patient = c(1, 1)
  )
  pp <- posterior_predict(fit, newdata = newdata)
  str(pp)
}
```