purrr enhances R’s functional programming (FP) toolkit by providing a
complete and consistent set of tools for working with functions and
vectors. If you’ve never heard of FP before, the best place to start is
the family of map()
functions which allow you to replace
many for loops with code that is both more succinct and easier to read.
The best place to learn about the map()
functions is the iteration chapter in R
for data science.
# The easiest way to get purrr is to install the whole tidyverse:
install.packages("tidyverse")
# Alternatively, install just purrr:
install.packages("purrr")
# Or the the development version from GitHub:
# install.packages("remotes")
::install_github("tidyverse/purrr") remotes
The following example uses purrr to solve a fairly realistic problem: split a data frame into pieces, fit a model to each piece, compute the summary, then extract the R2.
library(purrr)
|>
mtcars split(mtcars$cyl) |> # from base R
map(\(df) lm(mpg ~ wt, data = df)) |>
map(summary) %>%
map_dbl("r.squared")
#> 4 6 8
#> 0.5086326 0.4645102 0.4229655
This example illustrates some of the advantages of purrr functions over the equivalents in base R:
The first argument is always the data, so purrr works naturally with the pipe.
All purrr functions are type-stable. They always return the
advertised output type (map()
returns lists;
map_dbl()
returns double vectors), or they throw an
error.
All map()
functions accept functions (named,
anonymous, and lambda), character vector (used to extract components by
name), or numeric vectors (used to extract by position).