Functional Programming

purrr and furrr

Jose Storopoli (https://scholar.google.com/citations?user=xGU7H1QAAAAJ&hl=en), UNINOVE (https://www.uninove.br)
April 16, 2021

Functional Programming

Figure 1: Functional Programming

{purrr} follows this logic:

Instead of:

for (i in 1:n) {
  output[[i]] <- f(input[[i]])
}

You write:

library(purrr)
output <- input %>% map(f)    # or map(input, f)

# extra arguments are passed on to rnorm(): n = .x, mean = 5, sd = 10
1L:10L %>% map(rnorm, 5, 10)
[[1]]
[1] -1.6

[[2]]
[1] 5.8 4.3

[[3]]
[1] -12.3 -22.2   9.5

[[4]]
[1] -15.2  -5.1  30.9  -5.7

[[5]]
[1] 16.55  5.18 13.63 11.00  0.89

[[6]]
[1]  15.2  -5.0 -10.1  16.0  20.6  -7.1

[[7]]
[1] 20.1 21.8  7.4  7.4 14.6  7.5  3.6

[[8]]
[1]   3.89   6.48  19.70  -1.79   0.67   3.99 -11.44   8.04

[[9]]
[1] 14.7 15.9 24.0 -4.3 28.1  9.6  4.9  2.0 16.6

[[10]]
 [1]  -9.96  -0.41  -0.29  -8.56  -9.73  13.21  -6.07   0.33  27.48
[10]  27.42
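
To make the loop-versus-map() correspondence concrete, here is a minimal sketch with a made-up `input` list (the data and the choice of `sum` are illustrative assumptions, not part of the original example):

```r
library(purrr)

input <- list(1:3, 4:6, 7:9)

# Loop version: pre-allocate the output, then fill it element by element
output <- vector("list", length(input))
for (i in seq_along(input)) {
  output[[i]] <- sum(input[[i]])
}

# Functional version: one call, no index bookkeeping
output_map <- map(input, sum)

identical(output, output_map)  # TRUE
```

Both versions build the same list; map() simply takes care of the allocation and the indexing.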

For multiple inputs there are also map2*() (two lists of inputs) and pmap*() (a list of input vectors, which can be a data.frame).
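
A small sketch of both, assuming made-up inputs (the numeric vectors and the `params` data frame are hypothetical):

```r
library(purrr)

# map2(): walk over two inputs in step; .x comes from the first, .y from the second
sums <- map2_dbl(c(1, 2, 3), c(10, 20, 30), ~ .x + .y)
sums  # 11 22 33

# pmap(): iterate over the rows of a data.frame; column names are matched
# to the arguments of the function (here n, mean and sd of rnorm())
params <- data.frame(n = c(1, 2, 3), mean = c(0, 10, 100), sd = c(1, 1, 1))
draws <- pmap(params, rnorm)
lengths(draws)  # 1 2 3
```

Naming the columns after the function's arguments avoids relying on their position.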

map_*()

Implicit conversion: the map_*() variants return an atomic vector of the requested type instead of a list.

1L:10L %>%
  map(rnorm) %>%
  map_dbl(mean) # same as map_dbl(~mean(.x))
 [1]  0.30  0.73 -1.12 -0.42  0.58 -0.12  0.29 -0.17  0.49  0.15
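
As a quick illustration of the typed variants, a sketch on a small hypothetical list `x` (names and values are made up for the example):

```r
library(purrr)

x <- list(a = 1:3, b = 4:6)

as_list <- map(x, sum)                              # list with elements a = 6, b = 15
totals  <- map_int(x, sum)                          # named integer vector: a = 6, b = 15
means   <- map_dbl(x, mean)                         # named double vector: a = 2, b = 5
joined  <- map_chr(x, ~ paste(.x, collapse = "-"))  # named character vector: "1-2-3", "4-6" style strings

totals
means
joined
```

The suffix states the type you expect back, so a function that returns something else fails loudly instead of silently producing a list.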

Professor, what about the ~?

The ~ (tilde) is useful when you need to specify more complex functions and arguments. It creates an anonymous function in which .x stands for the current element:

c("meu", "microfone", "está", "aberto") %>% # becomes the vector c("meuprefixo_meu", ...)
  map_chr(~paste0("meuprefixo_", .x))
[1] "meuprefixo_meu"       "meuprefixo_microfone"
[3] "meuprefixo_está"      "meuprefixo_aberto"   
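
The formula shorthand is just another way of writing an anonymous function. A minimal sketch, with a made-up `words` vector and a hypothetical helper `add_one`:

```r
library(purrr)

words <- c("a", "b")

# Two equivalent ways to write the same call
with_formula  <- map_chr(words, ~ paste0("prefix_", .x))           # purrr formula shorthand
with_function <- map_chr(words, function(w) paste0("prefix_", w))  # ordinary anonymous function

identical(with_formula, with_function)  # TRUE

# as_mapper() shows what purrr does with a formula: it turns it into a function
add_one <- as_mapper(~ .x + 1)
add_one(2)  # 3
```
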
library(ggplot2)
c("hp", "wt", "qsec") %>% 
  map(~mtcars %>% 
        ggplot(aes_string(.x, "mpg")) +
        geom_point() +
        geom_smooth())
[[1]]


[[2]]


[[3]]

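Now is a good time to introduce purrr::walk(), which calls the function only for its side effects (here, printing each plot) and returns its input invisibly instead of a list of results: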

c("hp", "wt", "qsec") %>% 
  walk(~ {p <- mtcars %>% 
        ggplot(aes_string(.x, "mpg")) +
        geom_point() +
        geom_smooth()
      print(p)})

y <- c("mpg", "cyl", "gear")

c("hp", "wt", "qsec") %>% 
  walk2(y, ~{p <- mtcars %>% 
        ggplot(aes_string(.x, .y)) +
        geom_point() +
        geom_smooth()
      print(p)})
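
To see the difference between map() and walk() without any plotting, here is a sketch that uses message() as a stand-in side effect (the stand-in is an assumption made for brevity):

```r
library(purrr)

x <- c("hp", "wt", "qsec")

# map() collects the return values into a list
res_map <- map(x, toupper)

# walk() runs the function for its side effect and hands back the input unchanged
res_walk <- walk(x, ~ message("processing ", .x))

identical(res_walk, x)  # TRUE
```

Because walk() returns its input, it can sit in the middle of a pipeline without changing what flows through it.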

Now purrr::pwalk()

x <- c("hp", "wt", "qsec")
y <- c("mpg", "cyl", "gear")
z <- c("vs", "am", "cyl")
list_v <- list(x, y, z)

list_v %>% 
  pwalk(~{p <- mtcars %>% 
        ggplot(aes_string(..1, ..2, colour = ..3)) + # and so on: ..4, ..5, ..6
        geom_point() +
        geom_smooth()
      print(p)})
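
The positional placeholders ..1, ..2, ..3 can be hard to read once there are several inputs. A sketch of the same idea without plots, comparing them with a named list and an ordinary function (the shortened vectors and the label format are illustrative assumptions):

```r
library(purrr)

list_v <- list(c("hp", "wt"), c("mpg", "cyl"), c("vs", "am"))

# Positional access: ..1 is the first vector, ..2 the second, ..3 the third
labels <- pmap_chr(list_v, ~ paste(..1, ..2, ..3, sep = " / "))
labels  # "hp / mpg / vs" "wt / cyl / am"

# With a named list, the names are matched to the function's arguments
named <- list(x = c("hp", "wt"), y = c("mpg", "cyl"), colour = c("vs", "am"))
labels_named <- pmap_chr(named, function(x, y, colour) paste(x, y, colour, sep = " / "))

all(labels == labels_named)  # TRUE
```
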

You can even knit several R Markdown files at once

c("arquivo1.Rmd", "arquivo2.Rmd", ...) %>%    # or fs::dir_ls(glob = "*.Rmd") to pick up every .Rmd in the directory
  walk(~rmarkdown::render(.x,
                          output_format = "html_document",
                          output_dir = "relatorios/"))

MOAH POWER! {furrr}

{purrr} is single-threaded, so let's use {furrr} to run the same code in parallel.

It is very easy to use! Instead of:

seq_len(8) %>% 
  walk(~{
    Sys.sleep(1)
    print("Oi")})
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"

Now with {furrr}:

library(furrr)
# You can also put `options(Ncpus = parallel::detectCores())` and
# `options(mc.cores = parallel::detectCores())` in your `.Rprofile`
plan(multisession)
seq_len(8) %>% 
  future_walk(~{
    Sys.sleep(1)
    print("Oi")},
    .progress = TRUE)
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
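
To check that the parallel version actually saves time, a rough sketch that times both variants. The helper `slow_square`, the half-second sleep, and the choice of two workers are assumptions for illustration; actual timings depend on the machine and include the cost of starting the workers:

```r
library(purrr)
library(furrr)

plan(multisession, workers = 2)  # two background R sessions; adjust to your machine

# Hypothetical slow function: sleeps, then squares its input
slow_square <- function(x) {
  Sys.sleep(0.5)
  x^2
}

t_seq <- system.time(res_seq <- map_dbl(1:4, slow_square))
t_par <- system.time(res_par <- future_map_dbl(1:4, slow_square))

identical(res_seq, res_par)  # TRUE: same results either way
t_seq["elapsed"]             # about 2 seconds (4 calls x 0.5 s, one after another)
t_par["elapsed"]             # usually less, since the calls are spread over the workers

plan(sequential)  # shut the background workers down
```

For very fast functions the overhead of sending data to the workers can outweigh the gain, so it is worth measuring before switching.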

If you are generating random numbers, it is important to pass the argument .options = furrr_options(seed = TRUE), so that the workers draw from parallel-safe random-number streams:

1L:10L %>% future_map(rnorm)
[[1]]
[1] -1.1

[[2]]
[1]  0.34 -2.01

[[3]]
[1]  0.101 -0.048 -0.260

[[4]]
[1] -1.0  1.2  1.9 -0.7

[[5]]
[1] -0.62  0.34 -0.39 -1.47 -0.11

[[6]]
[1]  0.12  0.86  1.38 -0.82  0.59  1.69

[[7]]
[1] -0.256  0.957 -0.486  3.089 -1.994  0.086  0.581

[[8]]
[1]  0.48  1.16 -0.99 -0.12 -1.11 -1.58  0.52  0.94

[[9]]
[1] -1.554  0.152  1.692  0.107 -0.916  0.560  2.328 -0.058 -0.475

[[10]]
 [1]  0.016 -0.362 -0.228 -0.774  1.012  0.665  0.756 -0.043  1.059
[10]  0.869
1L:10L %>%
  future_map(rnorm,
             .options = furrr_options(seed = TRUE))
[[1]]
[1] 0.59

[[2]]
[1]  0.24 -0.98

[[3]]
[1]  0.231  0.064 -0.299

[[4]]
[1] 1.168 1.348 0.785 0.063

[[5]]
[1]  2.58  0.17  0.11 -1.15  2.70

[[6]]
[1]  0.54 -0.23 -1.07  0.39  0.91  0.11

[[7]]
[1]  0.65  1.17 -0.57  1.78  0.20  1.77 -1.42

[[8]]
[1]  1.866 -0.843  0.152 -1.764  1.789 -1.026 -1.072 -0.029

[[9]]
[1]  0.375  1.118  0.942 -0.639 -0.093 -1.363  0.348  0.641 -0.210

[[10]]
 [1]  0.0023  0.0910  0.9352  0.9922  0.8922 -1.7373  1.2078  1.1799
 [9] -2.1303  1.1040
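
If you also need the same numbers on every run, furrr_options() accepts an integer seed instead of TRUE. A minimal sketch, assuming two workers and an arbitrary seed value of 123:

```r
library(furrr)

plan(multisession, workers = 2)

# An integer seed makes the parallel random streams reproducible across runs
run1 <- future_map(1:3, rnorm, .options = furrr_options(seed = 123))
run2 <- future_map(1:3, rnorm, .options = furrr_options(seed = 123))

identical(run1, run2)  # TRUE

plan(sequential)  # shut the background workers down
```

With seed = TRUE the streams are statistically sound but start from a random seed, so two runs will generally differ.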

Environment

R version 4.1.0 (2021-05-18)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Big Sur 11.4

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
 [1] furrr_0.2.3       future_1.21.0     purrr_0.3.4      
 [4] ggridges_0.5.3    ggExtra_0.9       gghighlight_0.3.2
 [7] ggrepel_0.9.1     patchwork_1.1.1   forcats_0.5.1    
[10] plotly_4.9.4.1    repurrrsive_1.0.0 ggplot2_3.3.5    
[13] stringr_1.4.0     tidyr_1.1.3       janitor_2.1.0    
[16] dplyr_1.0.7       readr_1.4.0       magrittr_2.0.1   
[19] tibble_3.1.2     

loaded via a namespace (and not attached):
 [1] nlme_3.1-152       lubridate_1.7.10   bit64_4.0.5       
 [4] RColorBrewer_1.1-2 httr_1.4.2         rprojroot_2.0.2   
 [7] tools_4.1.0        bslib_0.2.5.1      utf8_1.2.1        
[10] R6_2.5.0           DBI_1.1.1          lazyeval_0.2.2    
[13] mgcv_1.8-35        colorspace_2.0-2   withr_2.4.2       
[16] tidyselect_1.1.1   downlit_0.2.1      bit_4.0.4         
[19] compiler_4.1.0     textshaping_0.3.5  cli_3.0.0         
[22] labeling_0.4.2     bookdown_0.22      sass_0.4.0        
[25] scales_1.1.1       systemfonts_1.0.2  digest_0.6.27     
[28] rmarkdown_2.9      jpeg_0.1-8.1       pkgconfig_2.0.3   
[31] htmltools_0.5.1.1  parallelly_1.26.1  dbplyr_2.1.1      
[34] fastmap_1.1.0      highr_0.9          htmlwidgets_1.5.3 
[37] rlang_0.4.11       rstudioapi_0.13    RSQLite_2.2.7     
[40] shiny_1.6.0        jquerylib_0.1.4    farver_2.1.0      
[43] generics_0.1.0     jsonlite_1.7.2     crosstalk_1.1.1   
[46] distill_1.2        Matrix_1.3-4       Rcpp_1.0.6        
[49] munsell_0.5.0      fansi_0.5.0        lifecycle_1.0.0   
[52] stringi_1.6.2      yaml_2.2.1         snakecase_0.11.0  
[55] plyr_1.8.6         grid_4.1.0         blob_1.2.1        
[58] parallel_4.1.0     listenv_0.8.0      promises_1.2.0.1  
[61] crayon_1.4.1       miniUI_0.1.1.1     lattice_0.20-44   
[64] splines_4.1.0      hms_1.1.0          knitr_1.33        
[67] pillar_1.6.1       codetools_0.2-18   glue_1.4.2        
[70] evaluate_0.14      data.table_1.14.0  httpuv_1.6.1      
[73] png_0.1-7          vctrs_0.3.8        gtable_0.3.0      
[76] assertthat_0.2.1   cachem_1.0.5       xfun_0.24         
[79] mime_0.11          xtable_1.8-4       later_1.2.0       
[82] ragg_1.1.3         viridisLite_0.4.0  memoise_2.0.0     
[85] globals_0.14.0     ellipsis_0.3.2    

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. Source code is available at https://github.com/storopoli/Linguagem-R, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Storopoli (2021, April 16). Linguagem R: Programação Funcional. Retrieved from https://storopoli.io/Linguagem-R/4-Programacao_Funcional.html

BibTeX citation

@misc{storopoli2021programacaofuncionalR,
  author = {Storopoli, Jose},
  title = {Linguagem R: Programação Funcional},
  url = {https://storopoli.io/Linguagem-R/4-Programacao_Funcional.html},
  year = {2021}
}