Tables for Publication

How to build publication-ready tables for regression models.

Jose Storopoli https://scholar.google.com/citations?user=xGU7H1QAAAAJ&hl=en (UNINOVE, https://www.uninove.br), Leonardo Vils https://scholar.google.com/citations?user=VO07L9EAAAAJ&hl=en (UNINOVE, https://www.uninove.br)
January 11, 2021

Instead of spending hours on end formatting tables in Excel or in paid software, you can use the {gtsummary} package (Sjoberg, Curry, Hannum, Whiting, & Zabor, 2020) to format your tables automatically.

The language of {gtsummary} tables can be set with the theme_gtsummary_language() function. The examples below use "pt", so the labels that gtsummary generates appear in Portuguese.

Descriptive Statistics

The gtsummary package provides a set of functions for summarizing data and tables. We mostly use gtsummary::tbl_summary(), which formats a descriptive statistics table conveniently.

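library(gtsummary)  # also provides the %>% pipe used below

# Render gtsummary's generated labels in Portuguese
theme_gtsummary_language("pt")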

gtsummary::tbl_summary(
  mtcars,
  by = am,
  type = all_continuous() ~ "continuous2",
  statistic = list(
    all_continuous() ~ c("{N_nonmiss}",
                         "{median} ({p25}, {p75})",
                         "{min}, {max}"),
    all_categorical() ~ "{n} ({p}%)"),
  missing = "no",
  digits = all_continuous() ~ 2) %>%
  add_overall() %>%
  bold_labels() %>%
  italicize_levels() %>%
  modify_header(list(label ~ "**Variáveis**",
                     stat_1 ~ "**Automático**, N = 19",
                     stat_2 ~ "**Manual**, N = 13")) %>%
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Automáticos vs Manuais**") %>%
  add_n()

Spanning header over the last two columns: Automáticos vs Manuais

| Variáveis | N | Total, N = 32 | Automático, N = 19 | Manual, N = 13 |
|---|---|---|---|---|
| mpg | 32.00 | | | |
| N | | 32.00 | 19.00 | 13.00 |
| Mediana (IQR) | | 19.20 (15.43, 22.80) | 17.30 (14.95, 19.20) | 22.80 (21.00, 30.40) |
| Intervalo | | 10.40, 33.90 | 10.40, 24.40 | 15.00, 33.90 |
| cyl | 32 | | | |
| 4 | | 11 (34%) | 3 (16%) | 8 (62%) |
| 6 | | 7 (22%) | 4 (21%) | 3 (23%) |
| 8 | | 14 (44%) | 12 (63%) | 2 (15%) |
| disp | 32.00 | | | |
| N | | 32.00 | 19.00 | 13.00 |
| Mediana (IQR) | | 196.30 (120.83, 326.00) | 275.80 (196.30, 360.00) | 120.30 (79.00, 160.00) |
| Intervalo | | 71.10, 472.00 | 120.10, 472.00 | 71.10, 351.00 |
| hp | 32.00 | | | |
| N | | 32.00 | 19.00 | 13.00 |
| Mediana (IQR) | | 123.00 (96.50, 180.00) | 175.00 (116.50, 192.50) | 109.00 (66.00, 113.00) |
| Intervalo | | 52.00, 335.00 | 62.00, 245.00 | 52.00, 335.00 |
| drat | 32.00 | | | |
| N | | 32.00 | 19.00 | 13.00 |
| Mediana (IQR) | | 3.70 (3.08, 3.92) | 3.15 (3.07, 3.70) | 4.08 (3.85, 4.22) |
| Intervalo | | 2.76, 4.93 | 2.76, 3.92 | 3.54, 4.93 |
| wt | 32.00 | | | |
| N | | 32.00 | 19.00 | 13.00 |
| Mediana (IQR) | | 3.33 (2.58, 3.61) | 3.52 (3.44, 3.84) | 2.32 (1.94, 2.78) |
| Intervalo | | 1.51, 5.42 | 2.46, 5.42 | 1.51, 3.57 |
| qsec | 32.00 | | | |
| N | | 32.00 | 19.00 | 13.00 |
| Mediana (IQR) | | 17.71 (16.89, 18.90) | 17.82 (17.18, 19.17) | 17.02 (16.46, 18.61) |
| Intervalo | | 14.50, 22.90 | 15.41, 22.90 | 14.50, 19.90 |
| vs | 32 | 14 (44%) | 7 (37%) | 7 (54%) |
| gear | 32 | | | |
| 3 | | 15 (47%) | 15 (79%) | 0 (0%) |
| 4 | | 12 (38%) | 4 (21%) | 8 (62%) |
| 5 | | 5 (16%) | 0 (0%) | 5 (38%) |
| carb | 32 | | | |
| 1 | | 7 (22%) | 3 (16%) | 4 (31%) |
| 2 | | 10 (31%) | 6 (32%) | 4 (31%) |
| 3 | | 3 (9.4%) | 3 (16%) | 0 (0%) |
| 4 | | 10 (31%) | 7 (37%) | 3 (23%) |
| 6 | | 1 (3.1%) | 0 (0%) | 1 (7.7%) |
| 8 | | 1 (3.1%) | 0 (0%) | 1 (7.7%) |
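
Because the column headers above carry group sizes typed in by hand, it can help to compute those counts from the data and build the header strings from them. The following is a minimal base R sketch, not part of the original workflow; the names n_auto, n_manual, cabecalho_auto and cabecalho_manual are hypothetical, and it relies on the mtcars coding of am (0 = automatic, 1 = manual).

```r
# Count cars per transmission type; in mtcars, am is 0 (automatic) or 1 (manual).
n_auto   <- sum(mtcars$am == 0)
n_manual <- sum(mtcars$am == 1)

# Build the header labels from the counts instead of hard-coding them
# (hypothetical variable names, for illustration only).
cabecalho_auto   <- sprintf("**Automático**, N = %d", n_auto)
cabecalho_manual <- sprintf("**Manual**, N = %d", n_manual)

print(c(cabecalho_auto, cabecalho_manual))
```

These strings could then be passed to modify_header() in place of the literal labels, so the headers stay in step with the data if the dataset changes.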

Linear/Logistic Regression Table

The gtsummary::tbl_regression() function can be used for both linear and logistic regression models. To report coefficients standardized in standard-deviation units, pass the argument tidy_fun = tidy_standardize.

library(gtsummary)
modelo_linear <- lm(mpg ~ hp + wt, data = mtcars)

tbl_regression(
  modelo_linear,
  tidy_fun = tidy_standardize) %>%
  bold_labels() %>%
  add_glance_source_note(c(r.squared, adj.r.squared)) %>%
  modify_header(list(estimate ~ "**Beta Padronizados**"))
| Características | Beta Padronizados | 95% CI¹ |
|---|---|---|
| hp | -0.36 | -0.57, -0.15 |
| wt | -0.63 | -0.84, -0.42 |

R² = 0.827; Adjusted R² = 0.815

¹ CI = Intervalo de confiança
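
To make "standardized in standard-deviation units" concrete, the sketch below rescales the raw coefficients by hand using only base R. It assumes the classic formula (raw coefficient times the predictor's standard deviation, divided by the outcome's standard deviation), which for a linear model without interactions should agree with refitting on standardized data; whether tidy_standardize uses exactly this route internally is an assumption here. The names coef_brutos, desvios_x and beta_pad are hypothetical.

```r
# Same model as above, fitted on the raw (unstandardized) variables.
modelo_linear <- lm(mpg ~ hp + wt, data = mtcars)

# Raw slopes for the two predictors (intercept dropped).
coef_brutos <- coef(modelo_linear)[c("hp", "wt")]

# Standard deviation of each predictor, in the same order as the slopes.
desvios_x <- sapply(mtcars[c("hp", "wt")], sd)

# Standardized beta = raw slope * sd(predictor) / sd(outcome).
beta_pad <- coef_brutos * desvios_x / sd(mtcars$mpg)

print(round(beta_pad, 2))
```

The rounded values should be close to the standardized betas that tbl_regression() reports for hp and wt.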

For logistic regression, the coefficients are shown in raw form by default: the natural logarithm of the odds ratio (OR). To display them as odds ratios, which are easier to interpret, pass the argument exponentiate = TRUE.

data("TitanicSurvival", package = "carData")
modelo_logistico <- glm(survived ~ age + sex,
                        data = TitanicSurvival, family = binomial)

tbl_regression(
  modelo_logistico,
  exponentiate = TRUE) %>%
  bold_labels() %>%
  bold_p()
| Características | OR¹ | 95% CI¹ | p-value |
|---|---|---|---|
| age | 1.00 | 0.99, 1.01 | 0.4 |
| sex | | | |
| female | | | |
| male | 0.09 | 0.06, 0.11 | <0.001 |

¹ OR = Razão de chances, CI = Intervalo de confiança
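
The relationship between the raw coefficients and the odds ratios can be checked by hand: exponentiating a log-odds coefficient gives the corresponding OR. The sketch below illustrates this with base R only. To stay self-contained it uses a hypothetical stand-in model on the built-in mtcars data (am ~ wt) rather than the TitanicSurvival model, and the names modelo_exemplo, log_odds and razoes_de_chances are illustrative.

```r
# Stand-in logistic model on a built-in dataset (illustration only):
# probability of a manual transmission (am = 1) as a function of weight.
modelo_exemplo <- glm(am ~ wt, data = mtcars, family = binomial)

# Raw coefficients are on the log-odds scale.
log_odds <- coef(modelo_exemplo)

# Exponentiating converts them to odds ratios.
razoes_de_chances <- exp(log_odds)

print(round(cbind(log_odds, razoes_de_chances), 3))
```

A negative log-odds coefficient maps to an OR below 1 and a positive one to an OR above 1, so the direction of the effect is preserved while the scale becomes multiplicative.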

We can also easily compare different regression models in a single table with the gtsummary::tbl_merge() function.

modelo_simples <- glm(survived ~ age + sex,
                      data = TitanicSurvival, family = binomial)
modelo_quali <- glm(survived ~ age + sex + passengerClass,
                    data = TitanicSurvival, family = binomial)
modelo_interacao <- glm(survived ~ age + sex * passengerClass,
                        data = TitanicSurvival, family = binomial)

# Build one regression table per model, all with the same formatting
tabelas_modelos <- list(modelo_simples, modelo_quali, modelo_interacao) %>%
  purrr::map(~ tbl_regression(., exponentiate = TRUE) %>%
               bold_labels() %>%
               bold_p())

tbl_merge(
  tabelas_modelos,
  tab_spanner = c("**Modelo Simples**",
                  "**Modelo Qualitativo**",
                  "**Modelo Interação**")
)
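| Características | Modelo Simples: OR¹ | Modelo Simples: 95% CI¹ | Modelo Simples: p-value | Modelo Qualitativo: OR¹ | Modelo Qualitativo: 95% CI¹ | Modelo Qualitativo: p-value | Modelo Interação: OR¹ | Modelo Interação: 95% CI¹ | Modelo Interação: p-value |
|---|---|---|---|---|---|---|---|---|---|
| age | 1.00 | 0.99, 1.01 | 0.4 | 0.97 | 0.95, 0.98 | <0.001 | 0.96 | 0.95, 0.97 | <0.001 |
| sex | | | | | | | | | |
| female | | | | | | | | | |
| male | 0.09 | 0.06, 0.11 | <0.001 | 0.08 | 0.06, 0.11 | <0.001 | 0.02 | 0.01, 0.05 | <0.001 |
| passengerClass | | | | | | | | | |
| 1st | | | | | | | | | |
| 2nd | | | | 0.28 | 0.18, 0.43 | <0.001 | 0.22 | 0.07, 0.63 | 0.007 |
| 3rd | | | | 0.10 | 0.06, 0.16 | <0.001 | 0.02 | 0.01, 0.04 | <0.001 |
| sex * passengerClass | | | | | | | | | |
| male * 2nd | | | | | | | 0.93 | 0.28, 3.44 | >0.9 |
| male * 3rd | | | | | | | 12.0 | 4.50, 38.7 | <0.001 |

¹ OR = Razão de chances, CI = Intervalo de confiança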

Environment

R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
 [1] MASS_7.3-53          likert_1.3.5         xtable_1.8-4        
 [4] naniar_0.6.0.9000    purrr_0.3.4          gtsummary_1.3.6     
 [7] gt_0.2.2             lm.beta_1.5-1        lmtest_0.9-38       
[10] zoo_1.8-8            ggfortify_0.4.11     sjPlot_2.8.7        
[13] broom_0.7.4          palmerpenguins_0.1.0 magrittr_2.0.1      
[16] mnormt_2.0.2         cowplot_1.1.1        tidyr_1.1.2         
[19] DescTools_0.99.40    skimr_2.1.2          ggpubr_0.4.0        
[22] car_3.0-10           carData_3.0-4        patchwork_1.1.1     
[25] dplyr_1.0.4          ggplot2_3.3.3        DiagrammeR_1.0.6.1  
[28] readxl_1.3.1        

loaded via a namespace (and not attached):
  [1] backports_1.2.1     plyr_1.8.6          repr_1.1.3         
  [4] splines_4.0.3       usethis_2.0.1       digest_0.6.27      
  [7] htmltools_0.5.1.1   magick_2.6.0        fansi_0.4.2        
 [10] checkmate_2.0.0     openxlsx_4.2.3      modelr_0.1.8       
 [13] colorspace_2.0-0    haven_2.3.1         xfun_0.21          
 [16] crayon_1.4.1        jsonlite_1.7.2      Exact_2.1          
 [19] lme4_1.1-26         survival_3.2-7      glue_1.4.2         
 [22] gtable_0.3.0        emmeans_1.5.4       sjstats_0.18.1     
 [25] sjmisc_2.8.6        abind_1.4-5         scales_1.1.1       
 [28] mvtnorm_1.1-1       DBI_1.1.1           rstatix_0.6.0      
 [31] ggeffects_1.0.1     Rcpp_1.0.6          performance_0.7.0  
 [34] tmvnsim_1.0-2       reticulate_1.18     foreign_0.8-80     
 [37] htmlwidgets_1.5.3   RColorBrewer_1.1-2  ellipsis_0.3.1     
 [40] pkgconfig_2.0.3     farver_2.0.3        sass_0.3.1         
 [43] utf8_1.1.4          reshape2_1.4.4      tidyselect_1.1.0   
 [46] labeling_0.4.2      rlang_0.4.10        effectsize_0.4.3   
 [49] munsell_0.5.0       cellranger_1.1.0    tools_4.0.3        
 [52] visNetwork_2.0.9    cli_2.3.0           generics_0.1.0     
 [55] sjlabelled_1.1.7    evaluate_0.14       stringr_1.4.0      
 [58] yaml_2.2.1          fs_1.5.0            knitr_1.31         
 [61] zip_2.1.1           visdat_0.5.3        rootSolve_1.8.2.1  
 [64] nlme_3.1-149        xml2_1.3.2          compiler_4.0.3     
 [67] rstudioapi_0.13     curl_4.3            e1071_1.7-4        
 [70] ggsignif_0.6.0      tibble_3.0.6        statmod_1.4.35     
 [73] broom.helpers_1.1.0 stringi_1.5.3       highr_0.8          
 [76] parameters_0.11.0   forcats_0.5.1       lattice_0.20-41    
 [79] Matrix_1.2-18       psych_2.0.12        commonmark_1.7     
 [82] nloptr_1.2.2.2      ggsci_2.9           vctrs_0.3.6        
 [85] norm_1.0-9.5        pillar_1.4.7        lifecycle_0.2.0    
 [88] estimability_1.3    data.table_1.13.6   insight_0.12.0     
 [91] lmom_2.8            R6_2.5.0            bookdown_0.21      
 [94] gridExtra_2.3       rio_0.5.16          gld_2.6.2          
 [97] distill_1.2         boot_1.3-25         assertthat_0.2.1   
[100] rprojroot_2.0.2     withr_2.4.1         parallel_4.0.3     
[103] mgcv_1.8-33         bayestestR_0.8.2    expm_0.999-6       
[106] hms_1.0.0           labelled_2.7.0      grid_4.0.3         
[109] class_7.3-17        minqa_1.2.4         rmarkdown_2.6      
[112] downlit_0.2.1       lubridate_1.7.9.2   base64enc_0.1-3    

References

Sjoberg, D. D., Curry, M., Hannum, M., Whiting, K., & Zabor, E. C. (2020). gtsummary: Presentation-ready data summary and analytic result tables. Retrieved from https://CRAN.R-project.org/package=gtsummary

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. Source code is available at https://github.com/storopoli/Estatistica, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Storopoli & Vils (2021, Jan. 11). Estatística com R: Tabelas para Publicação. Retrieved from https://storopoli.github.io/Estatistica/aux_Tabelas_para_Publicacao.html

BibTeX citation

@misc{storopoli2021tabelaspublicR,
  author = {Storopoli, Jose and Vils, Leonardo},
  title = {Estatística com R: Tabelas para Publicação},
  url = {https://storopoli.github.io/Estatistica/aux_Tabelas_para_Publicacao.html},
  year = {2021}
}