R Language: Functional Programming

Jose Storopoli

Figure 1: Functional Programming

{purrr} follows this logic:

Instead of:

for (i in 1:n) {
  output[[i]] <- f(input[[i]])
}

You write:

library(purrr)
input %>% map(f)    # or map(input, f)
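To make the equivalence concrete, here is a minimal sketch that runs both versions side by side. The `input` list and the choice of `sum` as `f` are made up for illustration only.

```r
library(purrr)

input <- list(1:3, 4:6, 7:9)  # hypothetical example data
f <- sum                      # any function of one element works here

# The explicit loop: pre-allocate the output, then fill it element by element
output <- vector("list", length(input))
for (i in seq_along(input)) {
  output[[i]] <- f(input[[i]])
}

# The purrr version: one call, same result
mapped <- map(input, f)

identical(output, mapped)
#> [1] TRUE
```

The loop needs a pre-allocated container and an index; `map()` takes care of both.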

library(purrr)
1L:10L %>% map(rnorm, 5, 10)

[[1]]
[1] -0.66

[[2]]
[1] -2.8  9.6

[[3]]
[1]  0.47 -6.42 12.58

[[4]]
[1] 11.2  4.5  4.8  8.2

[[5]]
[1] -6.4 20.6 -1.6  5.8  4.3

[[6]]
[1] -12.3 -22.2   9.5 -15.2  -5.1  30.9

[[7]]
[1] -5.65 16.55  5.18 13.63 11.00  0.89 15.21

[[8]]
[1]  -5.0 -10.1  16.0  20.6  -7.1  20.1  21.8   7.4

[[9]]
[1]  7.39 14.63  7.47  3.57  3.89  6.48 19.70 -1.79  0.67

[[10]]
 [1]   4.0 -11.4   8.0  14.7  15.9  24.0  -4.3  28.1   9.6   4.9

map(): always returns a list
map_lgl(), map_int(), map_dbl() and map_chr(): return a vector of the requested type (implicit conversion)
map_dfr() and map_dfc(): return a data.frame built by binding rows (r) or columns (c)
walk(): used for side effects

For multiple inputs there are also the map2*() variants (two parallel inputs) and the pmap*() variants (a list of input vectors, which can be a data.frame).
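A short sketch of these variants, using small made-up inputs. The data frame `df` is hypothetical; its column names were chosen to match the arguments of `rnorm()` so that `pmap()` can pass them by name.

```r
library(purrr)

x <- list(1:3, 4:6)

map_int(x, length)                       # integer vector
map_dbl(x, mean)                         # double vector
map_chr(x, ~ paste(.x, collapse = "-"))  # character vector

# map2: iterate over two inputs in parallel
sums <- map2_dbl(c(1, 2, 3), c(10, 20, 30), ~ .x + .y)

# pmap: each row of the data.frame becomes one call, columns matched by name
df <- data.frame(n = 1:3, mean = c(0, 10, 100), sd = c(1, 1, 1))
draws <- pmap(df, rnorm)
lengths(draws)  # one vector per row, with n elements each
```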

`map_*()`

Implicit conversion

1L:10L %>%
  map(rnorm) %>%
  map_dbl(mean) # same as map_dbl(~mean(.x))

 [1] -0.297 -0.167 -0.809 -0.556  1.251 -0.713  0.319 -0.045  0.389
[10]  0.103

Professor, what about the `~`?

The tilde (~) comes in when you need to specify more complex functions and arguments. It builds an anonymous function in which .x stands for the current element:

c("meu", "microfone", "está", "aberto") %>% # becomes the vector c("meuprefixo_meu", ...)
  map_chr(~paste0("meuprefixo_", .x))

[1] "meuprefixo_meu"       "meuprefixo_microfone"
[3] "meuprefixo_está"      "meuprefixo_aberto"
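As a rough illustration of what the `~` shorthand stands for, the sketch below writes the same mapping three ways. The `words` vector and the `prefix_` string are invented for the example, and the `\(x)` form assumes R 4.1 or later.

```r
library(purrr)

words <- c("my", "microphone")  # hypothetical input

with_formula  <- map_chr(words, ~ paste0("prefix_", .x))
with_function <- map_chr(words, function(x) paste0("prefix_", x))
with_lambda   <- map_chr(words, \(x) paste0("prefix_", x))  # R >= 4.1

identical(with_formula, with_function)
#> [1] TRUE
identical(with_formula, with_lambda)
#> [1] TRUE
```

All three are interchangeable here; the formula is just the most compact to type.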

library(ggplot2)
c("hp", "wt", "qsec") %>% 
  map(~mtcars %>% 
        ggplot(aes_string(.x, "mpg")) +
        geom_point() +
        geom_smooth())

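[Output: a list of three plots (mpg against hp, wt, and qsec), each printed under its list index [[1]], [[2]], [[3]]]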

Now is a good moment to introduce purrr::walk()

walk(): used for side effects

c("hp", "wt", "qsec") %>% 
  walk(~ {p <- mtcars %>% 
        ggplot(aes_string(.x, "mpg")) +
        geom_point() +
        geom_smooth()
      print(p)})
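The difference from `map()` can be seen without any plotting. In this small sketch, `cat()` stands in for the side effect (an assumption for illustration; in the example above the side effect is `print(p)`).

```r
library(purrr)

x <- c("hp", "wt", "qsec")

# map() collects the return values, so printing the result shows a list
res_map <- map(x, toupper)

# walk() calls the function only for its side effect (here, cat())
# and returns its input invisibly, so no list is printed
res_walk <- walk(x, ~ cat("plotting", .x, "\n"))

identical(res_walk, x)
#> [1] TRUE
```

Because `walk()` hands its input back, it can sit in the middle of a pipe without interrupting it.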

y <- c("mpg", "cyl", "gear")

c("hp", "wt", "qsec") %>% 
  walk2(y, ~{p <- mtcars %>% 
        ggplot(aes_string(.x, .y)) +
        geom_point() +
        geom_smooth()
      print(p)})

Now purrr::pwalk()

x <- c("hp", "wt", "qsec")
y <- c("mpg", "cyl", "gear")
z <- c("vs", "am", "cyl")
list_v <- list(x, y, z)

list_v %>% 
  pwalk(~{p <- mtcars %>% 
        ggplot(aes_string(..1, ..2, colour = ..3)) + # and so on: ..4, ..5, ..6
        geom_point() +
        geom_smooth()
      print(p)})
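When the positional `..1`, `..2`, `..3` get hard to follow, one alternative is to name the inputs and use a regular function. The sketch below assumes a hypothetical `combos` data frame and only builds text labels, so that it runs without {ggplot2}.

```r
library(purrr)

# hypothetical combinations of mtcars columns, one row per plot
combos <- data.frame(
  x      = c("hp", "wt"),
  y      = c("mpg", "cyl"),
  colour = c("vs", "am")
)

# the column names are matched to the function's argument names
labels <- pmap_chr(combos, function(x, y, colour) {
  paste0(y, " ~ ", x, ", coloured by ", colour)
})

# pwalk() works the same way, but is called only for the side effect
pwalk(combos, function(x, y, colour) {
  cat("would plot", y, "against", x, "\n")
})
```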

You can even "knit" several R Markdown files

c("arquivo1.Rmd", "arquivo2.Rmd", ...) %>%    # or fs::dir_ls(glob = "*.Rmd") to pick up every .Rmd in the directory
  walk(~rmarkdown::render(.x,
                          output_format = "html_document",
                          output_dir = "relatorios/"))

MOAH POWER! `{furrr}`

{purrr} is single-threaded, so let's use {furrr}.

It is very easy to use! Instead of:

map()
map2()
pmap()
walk()

Use:

future_map()
future_map2()
future_pmap()
future_walk()

seq_len(8) %>% 
  walk(~{
    Sys.sleep(1)
    print("Oi")})

[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"

Now with {furrr}:

library(furrr)
plan(multisession) # optionally, also set `options(Ncpus = parallel::detectCores())` and `options(mc.cores = parallel::detectCores())` in your `.Rprofile`

seq_len(8) %>% 
  future_walk(~{
    Sys.sleep(1)
    print("Oi")},
    .progress = TRUE)

[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
[1] "Oi"
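To see whether the parallel version actually pays off, a rough timing comparison can help. This is only a sketch: `slow_square()` is a made-up stand-in for real work, two workers are assumed, and the measured times depend on the machine and on the overhead of starting the workers.

```r
library(purrr)
library(furrr)

plan(multisession, workers = 2)  # assumption: 2 workers; adjust to your machine

slow_square <- function(x) {
  Sys.sleep(0.5)  # stand-in for real work
  x^2
}

t_seq <- system.time(res_seq <- map_dbl(1:4, slow_square))
t_par <- system.time(res_par <- future_map_dbl(1:4, slow_square))

identical(res_seq, res_par)  # the values are the same either way
c(sequential = t_seq[["elapsed"]], parallel = t_par[["elapsed"]])

plan(sequential)  # shut the workers down again
```

For very fast functions the cost of shipping data to the workers can outweigh the gain, so it is worth measuring before switching everything to `future_*()`.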

If you are working with random numbers, it is important to pass the argument .options = furrr_options(seed = TRUE):

1L:10L %>% future_map(rnorm)

[[1]]
[1] 0.33

[[2]]
[1] -0.76 -0.37

[[3]]
[1] -2.06 -1.83  0.85

[[4]]
[1] -1.306  1.214  0.026 -0.457

[[5]]
[1]  0.95 -0.61 -0.49 -1.37  0.55

[[6]]
[1]  1.07  0.53 -0.35  0.18 -0.31 -1.75

[[7]]
[1] -0.63 -0.40  1.45  1.33 -0.47  0.48 -0.40

[[8]]
[1] -1.005  1.912  0.479  0.578  0.031  0.059  0.534  0.138

[[9]]
[1] -0.948  0.081 -0.310  0.133  0.869  1.014 -0.472  0.433  1.420

[[10]]
 [1]  0.829  2.082 -0.335 -0.657  0.096 -0.346  0.605 -1.011 -0.050
[10] -0.423

1L:10L %>%
  future_map(rnorm,
             .options = furrr_options(seed = TRUE))

[[1]]
[1] 0.32

[[2]]
[1]  0.26 -2.68

[[3]]
[1] 0.16 0.43 0.91

[[4]]
[1]  0.30 -1.11 -2.85  0.42

[[5]]
[1]  0.32 -0.59 -0.98 -0.58 -0.52

[[6]]
[1] -0.627  0.830  0.367  0.122  1.210  0.091

[[7]]
[1] -0.079 -1.188  0.757  1.256  1.521  0.124 -1.445

[[8]]
[1] -0.085 -0.036 -1.797  0.366  0.054 -2.345 -0.218 -0.141

[[9]]
[1] -0.27 -0.42  1.55 -0.39 -0.53 -0.65 -0.85  0.45 -1.15

[[10]]
 [1]  0.166  0.084  3.082 -0.108  0.362 -0.876 -0.067 -0.489 -0.976
[10]  0.712
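If the results also need to be reproducible from one run to the next, `furrr_options()` accepts an integer seed in place of `TRUE`. A minimal sketch, where the seed value `123L` and the two workers are arbitrary choices:

```r
library(furrr)

plan(multisession, workers = 2)  # assumption: 2 workers

run_once <- function() {
  # a fixed integer seed gives every element its own reproducible RNG stream
  future_map(1:3, rnorm, .options = furrr_options(seed = 123L))
}

a <- run_once()
b <- run_once()
identical(a, b)
#> [1] TRUE

plan(sequential)  # shut the workers down again
```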

Environment

sessionInfo()

R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
 [1] furrr_0.3.1       future_1.31.0     purrr_1.0.1      
 [4] ggridges_0.5.4    ggExtra_0.10.0    gghighlight_0.4.0
 [7] ggrepel_0.9.3     patchwork_1.1.2   forcats_1.0.0    
[10] plotly_4.10.1     repurrrsive_1.1.0 ggplot2_3.4.1    
[13] stringr_1.5.0     tidyr_1.3.0       janitor_2.2.0    
[16] dplyr_1.1.0       readr_2.1.4       magrittr_2.0.3   
[19] tibble_3.1.8     

loaded via a namespace (and not attached):
 [1] nlme_3.1-160       lubridate_1.9.2    bit64_4.0.5       
 [4] RColorBrewer_1.1-3 httr_1.4.4         rprojroot_2.0.3   
 [7] tools_4.2.2        bslib_0.4.2        utf8_1.2.3        
[10] R6_2.5.1           DBI_1.1.3          lazyeval_0.2.2    
[13] mgcv_1.8-41        colorspace_2.1-0   withr_2.5.0       
[16] tidyselect_1.2.0   downlit_0.4.2      bit_4.0.5         
[19] compiler_4.2.2     cli_3.6.0          labeling_0.4.2    
[22] bookdown_0.32      sass_0.4.5         scales_1.2.1      
[25] digest_0.6.31      rmarkdown_2.20     pkgconfig_2.0.3   
[28] htmltools_0.5.4    parallelly_1.34.0  dbplyr_2.3.0      
[31] fastmap_1.1.0      highr_0.10         htmlwidgets_1.6.1 
[34] rlang_1.0.6        rstudioapi_0.14    RSQLite_2.3.0     
[37] shiny_1.7.4        jquerylib_0.1.4    farver_2.1.1      
[40] generics_0.1.3     jsonlite_1.8.4     crosstalk_1.2.0   
[43] vroom_1.6.1        distill_1.5        Matrix_1.5-1      
[46] Rcpp_1.0.10        munsell_0.5.0      fansi_1.0.4       
[49] lifecycle_1.0.3    stringi_1.7.12     yaml_2.3.7        
[52] snakecase_0.11.0   grid_4.2.2         blob_1.2.3        
[55] listenv_0.9.0      promises_1.2.0.1   parallel_4.2.2    
[58] crayon_1.5.2       miniUI_0.1.1.1     lattice_0.20-45   
[61] splines_4.2.2      hms_1.1.2          knitr_1.42        
[64] pillar_1.8.1       codetools_0.2-18   glue_1.6.2        
[67] evaluate_0.20      data.table_1.14.8  png_0.1-8         
[70] vctrs_0.5.2        tzdb_0.3.0         httpuv_1.6.9      
[73] gtable_0.3.1       assertthat_0.2.1   cachem_1.0.6      
[76] xfun_0.37          mime_0.12          xtable_1.8-4      
[79] later_1.3.0        viridisLite_0.4.1  memoise_2.0.1     
[82] globals_0.16.2     timechange_0.2.0   ellipsis_0.3.2
