Predictor variables can have any marginal distribution as long as a function is provided to sample from the distribution. Multivariate distributions are also supported: if the random generation function returns multiple columns, multiple random variables will be created. If the columns are named, the random variables will be named accordingly; otherwise, they will be successively numbered.
Arguments
- dist
Function to generate draws from this predictor's distribution, provided as a function or as a string naming the function.
- ...
Additional arguments to pass to dist when generating draws.
Value
A predictor_dist object, to be used in population() to specify a population distribution.
Details
The random generation function must take an argument named n specifying the number of draws. For univariate distributions, it should return a vector of length n; for multivariate distributions, it should return an array or matrix with n rows and a column per variable.
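As a minimal sketch of these requirements, the two generators below each take n and return draws in the expected shape. The names rskewed and rpair are hypothetical and chosen only for illustration; they are not part of the package.

```r
# Hypothetical univariate generator: takes n and returns a numeric vector
# of length n (exponential draws shifted to have mean zero).
rskewed <- function(n, rate = 1) {
  rexp(n, rate = rate) - 1 / rate
}

# Hypothetical multivariate generator: takes n and returns an unnamed matrix
# with n rows and one column per variable.
rpair <- function(n) {
  x <- rnorm(n)
  matrix(c(x, x + rnorm(n)), nrow = n, ncol = 2)
}

draws_uni <- rskewed(5)
draws_multi <- rpair(5)

length(draws_uni)  # number of draws in the vector
dim(draws_multi)   # rows (draws) by columns (variables)
```

Either function could then be supplied as dist, with extra arguments such as rate passed through the additional arguments.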
Multivariate predictors with unnamed columns are successively numbered. For instance, if predictor X is specified with
library(mvtnorm)
predictor(dist = rmvnorm, mean = c(0, 1),
          sigma = matrix(c(1, 0.5, 0.5, 1), nrow = 2))
then the population predictors will be named X1 and X2, and will have covariance 0.5.
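The covariance of 0.5 comes from the off-diagonal entries of sigma. The sketch below checks this by simulation in base R, using a Cholesky factor in place of rmvnorm so that no extra package is needed. The variable names and the sample size are assumptions made for illustration.

```r
set.seed(1)

mu <- c(0, 1)
sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
n <- 10000

# Independent standard normal draws, one row per observation.
z <- matrix(rnorm(n * 2), nrow = n)

# chol() returns the upper-triangular R with t(R) %*% R equal to sigma,
# so the rows of z %*% R have covariance sigma; sweep() adds the means.
draws <- sweep(z %*% chol(sigma), 2, mu, "+")

cov(draws)       # off-diagonal entries should be close to 0.5
colMeans(draws)  # should be close to c(0, 1)
```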
If the multivariate predictor has named columns, the names will be used instead. For instance, if predictor X generates a matrix with columns A and B, the population predictors will be named XA and XB.
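The naming rule described above can be sketched with a small helper. The function predictor_names is hypothetical and only mirrors the documented behavior; the package's actual implementation may differ.

```r
# Hypothetical helper illustrating the documented naming rule;
# this is not the package's own code.
predictor_names <- function(prefix, draws) {
  cols <- colnames(draws)
  if (is.null(cols)) {
    paste0(prefix, seq_len(ncol(draws)))  # unnamed columns: X1, X2, ...
  } else {
    paste0(prefix, cols)                  # named columns: XA, XB, ...
  }
}

unnamed <- matrix(rnorm(6), ncol = 2)
named <- cbind(A = rnorm(3), B = rnorm(3))

predictor_names("X", unnamed)  # "X1" "X2"
predictor_names("X", named)    # "XA" "XB"
```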
Examples
# Univariate normal distribution
predictor(dist = rnorm, mean = 10, sd = 2.5)
#> rnorm(list(mean = 10, sd = 2.5))
# Multivariate normal distribution
library(mvtnorm)
predictor(dist = rmvnorm, mean = c(0, 1, 7))
#> rmvnorm(list(mean = c(0, 1, 7)))
# Multivariate with named columns
rmulti <- function(n) {
  cbind(treatment = rbinom(n, size = 1, prob = 0.5),
        confounder = rnorm(n))
}
predictor(dist = rmulti)
#> rmulti()