Groups a data frame (similarly to dplyr::group_by()) based on the values of
a column, either by dividing up the range into equal pieces or by quantiles.
Arguments
- .data
- Data frame to bin 
- col
- Column to bin by 
- breaks
- Number of bins to create. bin_by_interval() also accepts a numeric vector of two or more unique cut points to use. If NULL, a default number of breaks is chosen based on the number of non-NA rows in the data. In bin_by_quantile(), if the number of unique values of the column is smaller than breaks, fewer bins will be produced.
Value
Grouped data frame, similar to those returned by dplyr::group_by().
An additional column .bin indicates the bin number for each group. Use
dplyr::summarize() to calculate values within each group, or other dplyr
operations that work on groups.
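
As a rough illustration of working with such a grouped result, the sketch below builds a stand-in grouped data frame by hand. The .bin column here is created with base R's cut() purely for demonstration; it is an assumption about the shape of the output, not the package's own code. Any grouped dplyr verb can then be applied per bin.

```r
suppressMessages(library(dplyr))

# Stand-in for a binned data frame: an integer .bin column plus grouping.
# (Built by hand with cut(); an illustrative assumption, not the package code.)
binned <- cars |>
  mutate(.bin = cut(speed, breaks = 5, labels = FALSE)) |>
  group_by(.bin)

# Per-bin summaries: one row per bin.
per_bin <- binned |>
  summarize(n = n(),
            mean_dist = mean(dist))

# Other grouped operations also work, e.g. the row with the longest
# stopping distance within each bin.
longest <- binned |>
  slice_max(dist, n = 1, with_ties = FALSE)
```

Because the grouping travels with the data frame, no extra group_by() call is needed before summarize() or slice_max().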
Details
bin_by_interval() breaks the numerical range of col into
equal-sized intervals, or into intervals specified by breaks.
bin_by_quantile() splits the range into pieces based on quantiles of the
data, so each interval contains roughly an equal number of observations.
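
The difference between the two strategies can be sketched with base R alone. The code below is a minimal illustration under the assumption that equal-width bins behave like cut() with a bin count and quantile bins behave like cut() at sample quantiles; it is not necessarily how the functions are implemented, and the variable names are hypothetical.

```r
speed <- cars$speed
n_bins <- 5

# Equal-width bins: cut() with a single number divides the range of the
# data into that many intervals of equal width.
interval_bin <- cut(speed, breaks = n_bins, labels = FALSE)

# Explicit cut points: a numeric vector of breaks defines the intervals
# directly (these particular values are arbitrary, chosen for illustration).
custom_bin <- cut(speed, breaks = c(0, 10, 20, 30), labels = FALSE)

# Quantile bins: use sample quantiles as cut points so each bin holds a
# roughly equal share of the observations. unique() guards against
# duplicated quantiles when the data have few distinct values.
cut_points <- unique(quantile(speed, probs = seq(0, 1, length.out = n_bins + 1)))
quantile_bin <- cut(speed, breaks = cut_points, labels = FALSE,
                    include.lowest = TRUE)

# Compare how many observations land in each bin under each scheme.
interval_counts <- table(interval_bin)
quantile_counts <- table(quantile_bin)
```

Equal-width bins can hold very different numbers of observations when the data are skewed, whereas quantile bins trade equal widths for more balanced counts.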
Examples
suppressMessages(library(dplyr))
cars |>
  bin_by_interval(speed, breaks = 5) |>
  summarize(mean_speed = mean(speed),
            mean_dist = mean(dist))
#> # A tibble: 5 × 3
#>    .bin mean_speed mean_dist
#>   <int>      <dbl>     <dbl>
#> 1     1        6        10.8
#> 2     2       10.9      21.9
#> 3     3       14.2      39.5
#> 4     4       18.7      52.1
#> 5     5       23.7      82.9
cars |>
  bin_by_quantile(speed, breaks = 5) |>
  summarize(mean_speed = mean(speed),
            mean_dist = mean(dist))
#> # A tibble: 5 × 3
#>    .bin mean_speed mean_dist
#>   <int>      <dbl>     <dbl>
#> 1     1       8.27      17  
#> 2     2      13         35.7
#> 3     3      16         36.8
#> 4     4      19.1       55  
#> 5     5      23.7       82.9