Augment a model fit with partial residuals for all terms

Construct a data frame containing the model data, partial residuals for all quantitative predictors, and predictor effects, for use in residual diagnostic plots and other analyses. The result is in tidy form (one row per predictor per observation), allowing it to be easily manipulated for plots and simulations.

Usage

partial_residuals(fit, predictors = everything())

Arguments

fit: The model to obtain residuals for. This can be a model fit with lm() or glm(), or any model with a predict() method that accepts a newdata argument.
predictors: Predictors to calculate partial residuals for. Defaults to all predictors, skipping factors. Predictors can be specified using tidyselect syntax; see help("language", package = "tidyselect") and the examples below.

Value

Data frame (tibble) containing the model data and residuals in tidy form. There is one row per selected predictor per observation. All predictors are included as columns, plus the following additional columns:

.obs: Row number of this observation in the original model data frame.
.predictor_name: Name of the predictor this row gives the partial residual for.
.predictor_value: Value of the predictor this row gives the partial residual for.
.partial_resid: Partial residual for this predictor for this observation.
.predictor_effect: Predictor effect $\hat \mu(X_{if}, 0)$ for this observation.

Predictors and regressors

To define partial residuals, we must distinguish between the predictors, the measured variables we are using to fit our model, and the regressors, which are calculated from them. In a simple linear model, the regressors are equal to the predictors. But in a model with polynomials, splines, or other nonlinear terms, the regressors may be functions of the predictors.

For example, in a regression with a single predictor $X$, the regression model $Y = \beta_0 + \beta_1 X + e$ has one regressor, $X$. But if we choose a polynomial of degree 3, the model is $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3$, and the regressors are $\{X, X^2, X^3\}$.

Similarly, if we have predictors $X_1$ and $X_2$ and form a model with main effects and an interaction, the regressors are $\{X_1, X_2, X_1 X_2\}$.

Partial residuals are defined in terms of the predictors, not the regressors, and are intended to allow us to see the shape of the relationship between a particular predictor and the response, and to compare it to how we have chosen to model it with regressors. Partial residuals are not useful for categorical (factor) predictors, and so these are omitted.

Linear models

Consider a linear model where $\mathbb{E}[Y \mid X = x] = \mu(x)$. The mean function $\mu(x)$ is a linear combination of regressors. Let $\hat \mu$ be the fitted model and $\hat \beta_0$ be its intercept.

Choose a predictor $X_f$, the focal predictor, to calculate partial residuals for. Write the mean function as $\mu(X_f, X_o)$, where $X_f$ is the value of the focal predictor, and $X_o$ represents all other predictors.

If $e_i$ is the residual for observation $i$, the partial residual is

$$r_{if} = e_i + (\hat \mu(x_{if}, 0) - \hat \beta_0).$$

Setting $X_o = 0$ means setting all other numeric predictors to 0; factor predictors are set to their first (baseline) level.

Generalized linear models

Consider a generalized linear model where $g(\mathbb{E}[Y \mid X = x]) = \mu(x)$, where $g$ is a link function. Let $\hat \mu$ be the fitted model and $\hat \beta_0$ be its intercept.

Let $e_i$ be the working residual for observation $i$, defined to be

$$e_i = (y_i - g^{-1}(x_i)) g'(x_i).$$

Choose a predictor $X_f$, the focal predictor, to calculate partial residuals for. Write $\mu$ as $\mu(X_f, X_o)$, where $X_f$ is the value of the focal predictor, and $X_o$ represents all other predictors. Hence $\mu(X_f, X_o)$ gives the model's prediction on the link scale.

The partial residual is again

$$r_{if} = e_i + (\hat \mu(x_{if}, 0) - \hat \beta_0).$$

Interpretation

In linear regression, because the residuals $e_i$ should have mean zero in a well-specified model, plotting the partial residuals against $x_f$ should produce a shape matching the modeled relationship $\mu$. If the model is wrong, the partial residuals will appear to deviate from the fitted relationship. Provided the regressors are uncorrelated or approximately linearly related to each other, the plotted trend should approximate the true relationship between $x_f$ and the response.

In generalized linear models, this is approximately true if the link function $g$ is approximately linear over the range of observed $x$ values.

Additionally, the function $\mu(X_f, 0)$ can be used to show the relationship between the focal predictor and the response. In a linear model, the function is linear; with polynomial or spline regressors, it is nonlinear. This function is the predictor effect function, and the estimated predictor effects $\hat \mu(X_{if}, 0)$ are included in this function's output.

Limitations

Factor predictors (as factors, logical, or character vectors) are detected automatically and omitted. However, if a numeric variable is converted to factor in the model formula, such as with y ~ factor(x), the function cannot determine the appropriate type and will raise an error. Create factors as needed in the source data frame before fitting the model to avoid this issue.

References

R. Dennis Cook (1993). "Exploring Partial Residual Plots", Technometrics, 35:4, 351-362. doi:10.1080/00401706.1993.10485350

Cook, R. Dennis, and Croos-Dabrera, R. (1998). "Partial Residual Plots in Generalized Linear Models." Journal of the American Statistical Association 93, no. 442: 730–39. doi:10.2307/2670123

Fox, J., & Weisberg, S. (2018). "Visualizing Fit and Lack of Fit in Complex Regression Models with Predictor Effect Plots and Partial Residuals." Journal of Statistical Software, 87(9). doi:10.18637/jss.v087.i09

Examples

fit <- lm(mpg ~ cyl + disp + hp, data = mtcars)
partial_residuals(fit)
#> # A tibble: 96 × 7
#>      cyl  disp    hp .predictor_name .predictor_value .predictor_effect
#>    <dbl> <dbl> <dbl> <chr>                      <dbl>             <dbl>
#>  1     6  160    110 cyl                            6             -7.36
#>  2     6  160    110 cyl                            6             -7.36
#>  3     4  108     93 cyl                            4             -4.91
#>  4     6  258    110 cyl                            6             -7.36
#>  5     8  360    175 cyl                            8             -9.82
#>  6     6  225    105 cyl                            6             -7.36
#>  7     8  360    245 cyl                            8             -9.82
#>  8     4  147.    62 cyl                            4             -4.91
#>  9     4  141.    95 cyl                            4             -4.91
#> 10     6  168.   123 cyl                            6             -7.36
#> # ℹ 86 more rows
#> # ℹ 1 more variable: .partial_resid <dbl>

# You can select predictors with tidyselect syntax:
partial_residuals(fit, c(disp, hp))
#> # A tibble: 64 × 7
#>      cyl  disp    hp .predictor_name .predictor_value .predictor_effect
#>    <dbl> <dbl> <dbl> <chr>                      <dbl>             <dbl>
#>  1     6  160    110 disp                        160              -3.01
#>  2     6  160    110 disp                        160              -3.01
#>  3     4  108     93 disp                        108              -2.03
#>  4     6  258    110 disp                        258              -4.86
#>  5     8  360    175 disp                        360              -6.78
#>  6     6  225    105 disp                        225              -4.24
#>  7     8  360    245 disp                        360              -6.78
#>  8     4  147.    62 disp                        147.             -2.76
#>  9     4  141.    95 disp                        141.             -2.65
#> 10     6  168.   123 disp                        168.             -3.16
#> # ℹ 54 more rows
#> # ℹ 1 more variable: .partial_resid <dbl>

# Predictors with multiple regressors are supported:
fit2 <- lm(mpg ~ poly(disp, 2), data = mtcars)
partial_residuals(fit2)
#> # A tibble: 32 × 5
#>     disp .predictor_name .predictor_value .predictor_effect .partial_resid
#>    <dbl> <chr>                      <dbl>             <dbl>          <dbl>
#>  1  160  disp                        160               2.11          0.909
#>  2  160  disp                        160               2.11          0.909
#>  3  108  disp                        108               5.83          2.71 
#>  4  258  disp                        258              -3.07          1.31 
#>  5  360  disp                        360              -5.89         -1.39 
#>  6  225  disp                        225              -1.59         -1.99 
#>  7  360  disp                        360              -5.89         -5.79 
#>  8  147. disp                        147.              3.00          4.31 
#>  9  141. disp                        141.              3.40          2.71 
#> 10  168. disp                        168.              1.62         -0.891
#> # ℹ 22 more rows

# Allowing an interaction by number of cylinders is fine, but partial
# residuals are not generated for the factor. Notice the factor must be
# created first, not in the model formula:
mtcars$cylinders <- factor(mtcars$cyl)
fit3 <- lm(mpg ~ cylinders * disp + hp, data = mtcars)
partial_residuals(fit3)
#> Factor predictors were dropped:
#> • `cylinders`
#> # A tibble: 64 × 7
#>    cylinders  disp    hp .predictor_name .predictor_value .predictor_effect
#>    <fct>     <dbl> <dbl> <chr>                      <dbl>             <dbl>
#>  1 6          160    110 disp                        160              -20.9
#>  2 6          160    110 disp                        160              -20.9
#>  3 4          108     93 disp                        108              -14.1
#>  4 6          258    110 disp                        258              -33.6
#>  5 8          360    175 disp                        360              -46.9
#>  6 6          225    105 disp                        225              -29.3
#>  7 8          360    245 disp                        360              -46.9
#>  8 4          147.    62 disp                        147.             -19.1
#>  9 4          141.    95 disp                        141.             -18.4
#> 10 6          168.   123 disp                        168.             -21.8
#> # ℹ 54 more rows
#> # ℹ 1 more variable: .partial_resid <dbl>