How to build publication-ready tables for regression models.
Instead of spending hours on end formatting tables in Excel or other paid software, you can use the {gtsummary} package
(Sjoberg, Curry, Hannum, Whiting, & Zabor, 2020) to format your tables automatically:
gtsummary::tbl_summary()
gtsummary::tbl_regression()
The language of {gtsummary} tables can be set with the function theme_gtsummary_language(). The examples here use "pt" (Portuguese), so the labels in the output tables appear in Portuguese.
The gtsummary package provides a set of functions for summarizing data and building tables. We mainly use the function gtsummary::tbl_summary(), which formats a descriptive statistics table in a very convenient way.
library(gtsummary)  # provides the selectors, theme functions, and %>% used here
theme_gtsummary_language("pt")  # table labels in Portuguese
gtsummary::tbl_summary(
  mtcars,
  by = am,  # am: 0 = automatic, 1 = manual
  type = all_continuous() ~ "continuous2",
  statistic = list(
    all_continuous() ~ c("{N_nonmiss}",
                         "{median} ({p25}, {p75})",
                         "{min}, {max}"),
    all_categorical() ~ "{n} ({p}%)"),
  missing = "no",
  digits = all_continuous() ~ 2) %>%
  add_overall() %>%
  bold_labels() %>%
  italicize_levels() %>%
  # group sizes follow the N rows of the output: 19 automatic, 13 manual
  modify_header(list(label ~ "**Variáveis**",
                     stat_1 ~ "**Automático**, N = 19",
                     stat_2 ~ "**Manual**, N = 13")) %>%
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Automáticos vs Manuais**") %>%
  add_n()
Variáveis | N | Total, N = 32 | Automáticos vs Manuais: Automático, N = 19 | Automáticos vs Manuais: Manual, N = 13 |
---|---|---|---|---|
mpg | 32.00 | | | |
N | | 32.00 | 19.00 | 13.00 |
Mediana (IQR) | | 19.20 (15.43, 22.80) | 17.30 (14.95, 19.20) | 22.80 (21.00, 30.40) |
Intervalo | | 10.40, 33.90 | 10.40, 24.40 | 15.00, 33.90 |
cyl | 32 | | | |
4 | | 11 (34%) | 3 (16%) | 8 (62%) |
6 | | 7 (22%) | 4 (21%) | 3 (23%) |
8 | | 14 (44%) | 12 (63%) | 2 (15%) |
disp | 32.00 | | | |
N | | 32.00 | 19.00 | 13.00 |
Mediana (IQR) | | 196.30 (120.83, 326.00) | 275.80 (196.30, 360.00) | 120.30 (79.00, 160.00) |
Intervalo | | 71.10, 472.00 | 120.10, 472.00 | 71.10, 351.00 |
hp | 32.00 | | | |
N | | 32.00 | 19.00 | 13.00 |
Mediana (IQR) | | 123.00 (96.50, 180.00) | 175.00 (116.50, 192.50) | 109.00 (66.00, 113.00) |
Intervalo | | 52.00, 335.00 | 62.00, 245.00 | 52.00, 335.00 |
drat | 32.00 | | | |
N | | 32.00 | 19.00 | 13.00 |
Mediana (IQR) | | 3.70 (3.08, 3.92) | 3.15 (3.07, 3.70) | 4.08 (3.85, 4.22) |
Intervalo | | 2.76, 4.93 | 2.76, 3.92 | 3.54, 4.93 |
wt | 32.00 | | | |
N | | 32.00 | 19.00 | 13.00 |
Mediana (IQR) | | 3.33 (2.58, 3.61) | 3.52 (3.44, 3.84) | 2.32 (1.94, 2.78) |
Intervalo | | 1.51, 5.42 | 2.46, 5.42 | 1.51, 3.57 |
qsec | 32.00 | | | |
N | | 32.00 | 19.00 | 13.00 |
Mediana (IQR) | | 17.71 (16.89, 18.90) | 17.82 (17.18, 19.17) | 17.02 (16.46, 18.61) |
Intervalo | | 14.50, 22.90 | 15.41, 22.90 | 14.50, 19.90 |
vs | 32 | 14 (44%) | 7 (37%) | 7 (54%) |
gear | 32 | | | |
3 | | 15 (47%) | 15 (79%) | 0 (0%) |
4 | | 12 (38%) | 4 (21%) | 8 (62%) |
5 | | 5 (16%) | 0 (0%) | 5 (38%) |
carb | 32 | | | |
1 | | 7 (22%) | 3 (16%) | 4 (31%) |
2 | | 10 (31%) | 6 (32%) | 4 (31%) |
3 | | 3 (9.4%) | 3 (16%) | 0 (0%) |
4 | | 10 (31%) | 7 (37%) | 3 (23%) |
6 | | 1 (3.1%) | 0 (0%) | 1 (7.7%) |
8 | | 1 (3.1%) | 0 (0%) | 1 (7.7%) |
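The group sizes in the column headers are typed by hand in the modify_header() call, so it is worth checking them against the data. A minimal sketch in base R, assuming the usual mtcars coding in which am = 0 is automatic and am = 1 is manual (the object names `contagem` and `rotulos` are only illustrative, not part of the original code):

```r
# Count cars per transmission type (assumption: am 0 = automatic, 1 = manual)
transmissao <- factor(mtcars$am, levels = c(0, 1),
                      labels = c("Automático", "Manual"))
contagem <- table(transmissao)
print(contagem)

# Build the header labels from the counts instead of typing the numbers
rotulos <- sprintf("**%s**, N = %d", names(contagem), as.integer(contagem))
print(rotulos)
```

The resulting strings can be pasted into (or passed to) the header labels, which keeps the header in step with the N rows of the table.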
The function gtsummary::tbl_regression() can be used for both linear and logistic regression models. To report standardized coefficients (expressed in standard deviation units), pass the argument tidy_fun = tidy_standardize.
library(gtsummary)
modelo_linear <- lm(mpg ~ hp + wt, data = mtcars)
tbl_regression(
  modelo_linear,
  tidy_fun = tidy_standardize) %>%  # standardized coefficients
  bold_labels() %>%
  add_glance_source_note(c(r.squared, adj.r.squared)) %>%  # R² in the source note
  modify_header(list(estimate ~ "**Beta Padronizados**"))
Características | Beta Padronizados | 95% CI¹ |
---|---|---|
hp | -0.36 | -0.57, -0.15 |
wt | -0.63 | -0.84, -0.42 |

R² = 0.827; Adjusted R² = 0.815

¹ CI = Intervalo de confiança
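To make the standardized coefficients in this table more concrete, here is a rough base R sketch of what "standardized" means for this linear model: each raw slope is rescaled by the ratio of the predictor's standard deviation to the outcome's standard deviation. This illustrates the idea only and is not a description of how tidy_standardize computes its result internally; the object names `desvios`, `beta_padronizado`, and `modelo_z` are illustrative.

```r
# Same linear model as in the text
modelo_linear <- lm(mpg ~ hp + wt, data = mtcars)

# Rescale each raw slope by sd(predictor) / sd(outcome)
desvios <- sapply(mtcars[c("hp", "wt")], sd)
beta_padronizado <- coef(modelo_linear)[c("hp", "wt")] * desvios / sd(mtcars$mpg)
print(round(beta_padronizado, 2))

# Equivalent check: refit the model on z-scored variables
modelo_z <- lm(scale(mpg) ~ scale(hp) + scale(wt), data = mtcars)
print(round(coef(modelo_z)[-1], 2))
```

Both approaches should give about -0.36 for hp and -0.63 for wt, in line with the table: a one standard deviation increase in wt is associated with a 0.63 standard deviation decrease in mpg, holding hp fixed.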
For logistic regression, the coefficients are shown in raw form by default: the natural logarithm of the odds ratio (OR), that is, log-odds. To display the coefficients as odds ratios, which are easier to interpret, set the argument exponentiate = TRUE.
data("TitanicSurvival", package = "carData")
modelo_logistico <- glm(survived ~ age + sex,
                        data = TitanicSurvival, family = binomial)
tbl_regression(
  modelo_logistico,
  exponentiate = TRUE) %>%  # report odds ratios instead of log-odds
  bold_labels() %>%
  bold_p()
Características | OR¹ | 95% CI¹ | p-value |
---|---|---|---|
age | 1.00 | 0.99, 1.01 | 0.4 |
sex | | | |
female | — | — | |
male | 0.09 | 0.06, 0.11 | <0.001 |

¹ OR = Razão de chances, CI = Intervalo de confiança
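The relationship between the raw coefficients and the odds ratios can be checked by hand: exponentiating a log-odds coefficient gives the corresponding OR. A minimal sketch in base R, assuming the carData package is installed as in the text (the names `log_odds` and `razao_chances` are illustrative):

```r
# Assumes the carData package is available, as in the original example
data("TitanicSurvival", package = "carData")
modelo_logistico <- glm(survived ~ age + sex,
                        data = TitanicSurvival, family = binomial)

log_odds <- coef(modelo_logistico)  # raw coefficients (log-odds scale)
razao_chances <- exp(log_odds)      # odds ratios

# Side-by-side view of both scales
print(round(cbind(log_odds, razao_chances), 3))
```

An odds ratio below 1 for `sexmale` means lower odds of survival for men relative to the reference level (female), which is how the `male` row of the table reads.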
In addition, we can easily compare different regression models in a single table with the function gtsummary::tbl_merge():
modelo_simples <- glm(survived ~ age + sex,
                      data = TitanicSurvival, family = binomial)
modelo_quali <- glm(survived ~ age + sex + passengerClass,
                    data = TitanicSurvival, family = binomial)
modelo_interacao <- glm(survived ~ age + sex * passengerClass,
                        data = TitanicSurvival, family = binomial)
# one regression table per model
tabelas_modelos <- list(modelo_simples, modelo_quali, modelo_interacao) %>%
  purrr::map(~ tbl_regression(
    .,
    exponentiate = TRUE) %>%
      bold_labels() %>%
      bold_p())
# merge the tables side by side, one spanning header per model
tbl_merge(
  tabelas_modelos,
  tab_spanner = c("**Modelo Simples**",
                  "**Modelo Qualitativo**",
                  "**Modelo Interação**")
)
Características | Modelo Simples: OR¹ | Modelo Simples: 95% CI¹ | Modelo Simples: p-value | Modelo Qualitativo: OR¹ | Modelo Qualitativo: 95% CI¹ | Modelo Qualitativo: p-value | Modelo Interação: OR¹ | Modelo Interação: 95% CI¹ | Modelo Interação: p-value |
---|---|---|---|---|---|---|---|---|---|
age | 1.00 | 0.99, 1.01 | 0.4 | 0.97 | 0.95, 0.98 | <0.001 | 0.96 | 0.95, 0.97 | <0.001 |
sex | | | | | | | | | |
female | — | — | | — | — | | — | — | |
male | 0.09 | 0.06, 0.11 | <0.001 | 0.08 | 0.06, 0.11 | <0.001 | 0.02 | 0.01, 0.05 | <0.001 |
passengerClass | | | | | | | | | |
1st | | | | — | — | | — | — | |
2nd | | | | 0.28 | 0.18, 0.43 | <0.001 | 0.22 | 0.07, 0.63 | 0.007 |
3rd | | | | 0.10 | 0.06, 0.16 | <0.001 | 0.02 | 0.01, 0.04 | <0.001 |
sex * passengerClass | | | | | | | | | |
male * 2nd | | | | | | | 0.93 | 0.28, 3.44 | >0.9 |
male * 3rd | | | | | | | 12.0 | 4.50, 38.7 | <0.001 |

¹ OR = Razão de chances, CI = Intervalo de confiança
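The same merging pattern can also be sketched with base R's lapply() in place of purrr::map(), keeping the model labels in the names of a list so that the spanning headers are derived from them. This is only one possible arrangement, assuming the gtsummary and carData packages are installed; the object names `modelos`, `tabelas`, and `tabela_final` are illustrative.

```r
library(gtsummary)
data("TitanicSurvival", package = "carData")

# Named list of fitted models; the names become the spanning headers
modelos <- list(
  "Modelo Simples" = glm(survived ~ age + sex,
                         data = TitanicSurvival, family = binomial),
  "Modelo Qualitativo" = glm(survived ~ age + sex + passengerClass,
                             data = TitanicSurvival, family = binomial)
)

# One regression table per model, reported as odds ratios
tabelas <- lapply(modelos, function(m) tbl_regression(m, exponentiate = TRUE))

# Merge side by side; unname() passes a plain list to tbl_merge()
tabela_final <- tbl_merge(
  unname(tabelas),
  tab_spanner = paste0("**", names(tabelas), "**")
)
tabela_final
```

Keeping labels and models together in one named list avoids the risk of the tab_spanner vector getting out of order when models are added or removed.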
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] MASS_7.3-53 likert_1.3.5 xtable_1.8-4
[4] naniar_0.6.0.9000 purrr_0.3.4 gtsummary_1.3.6
[7] gt_0.2.2 lm.beta_1.5-1 lmtest_0.9-38
[10] zoo_1.8-8 ggfortify_0.4.11 sjPlot_2.8.7
[13] broom_0.7.4 palmerpenguins_0.1.0 magrittr_2.0.1
[16] mnormt_2.0.2 cowplot_1.1.1 tidyr_1.1.2
[19] DescTools_0.99.40 skimr_2.1.2 ggpubr_0.4.0
[22] car_3.0-10 carData_3.0-4 patchwork_1.1.1
[25] dplyr_1.0.4 ggplot2_3.3.3 DiagrammeR_1.0.6.1
[28] readxl_1.3.1
loaded via a namespace (and not attached):
[1] backports_1.2.1 plyr_1.8.6 repr_1.1.3
[4] splines_4.0.3 usethis_2.0.1 digest_0.6.27
[7] htmltools_0.5.1.1 magick_2.6.0 fansi_0.4.2
[10] checkmate_2.0.0 openxlsx_4.2.3 modelr_0.1.8
[13] colorspace_2.0-0 haven_2.3.1 xfun_0.21
[16] crayon_1.4.1 jsonlite_1.7.2 Exact_2.1
[19] lme4_1.1-26 survival_3.2-7 glue_1.4.2
[22] gtable_0.3.0 emmeans_1.5.4 sjstats_0.18.1
[25] sjmisc_2.8.6 abind_1.4-5 scales_1.1.1
[28] mvtnorm_1.1-1 DBI_1.1.1 rstatix_0.6.0
[31] ggeffects_1.0.1 Rcpp_1.0.6 performance_0.7.0
[34] tmvnsim_1.0-2 reticulate_1.18 foreign_0.8-80
[37] htmlwidgets_1.5.3 RColorBrewer_1.1-2 ellipsis_0.3.1
[40] pkgconfig_2.0.3 farver_2.0.3 sass_0.3.1
[43] utf8_1.1.4 reshape2_1.4.4 tidyselect_1.1.0
[46] labeling_0.4.2 rlang_0.4.10 effectsize_0.4.3
[49] munsell_0.5.0 cellranger_1.1.0 tools_4.0.3
[52] visNetwork_2.0.9 cli_2.3.0 generics_0.1.0
[55] sjlabelled_1.1.7 evaluate_0.14 stringr_1.4.0
[58] yaml_2.2.1 fs_1.5.0 knitr_1.31
[61] zip_2.1.1 visdat_0.5.3 rootSolve_1.8.2.1
[64] nlme_3.1-149 xml2_1.3.2 compiler_4.0.3
[67] rstudioapi_0.13 curl_4.3 e1071_1.7-4
[70] ggsignif_0.6.0 tibble_3.0.6 statmod_1.4.35
[73] broom.helpers_1.1.0 stringi_1.5.3 highr_0.8
[76] parameters_0.11.0 forcats_0.5.1 lattice_0.20-41
[79] Matrix_1.2-18 psych_2.0.12 commonmark_1.7
[82] nloptr_1.2.2.2 ggsci_2.9 vctrs_0.3.6
[85] norm_1.0-9.5 pillar_1.4.7 lifecycle_0.2.0
[88] estimability_1.3 data.table_1.13.6 insight_0.12.0
[91] lmom_2.8 R6_2.5.0 bookdown_0.21
[94] gridExtra_2.3 rio_0.5.16 gld_2.6.2
[97] distill_1.2 boot_1.3-25 assertthat_0.2.1
[100] rprojroot_2.0.2 withr_2.4.1 parallel_4.0.3
[103] mgcv_1.8-33 bayestestR_0.8.2 expm_0.999-6
[106] hms_1.0.0 labelled_2.7.0 grid_4.0.3
[109] class_7.3-17 minqa_1.2.4 rmarkdown_2.6
[112] downlit_0.2.1 lubridate_1.7.9.2 base64enc_0.1-3
Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. Source code is available at https://github.com/storopoli/Estatistica, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Storopoli & Vils (2021, Jan. 11). Estatística com R: Tabelas para Publicação. Retrieved from https://storopoli.github.io/Estatistica/aux_Tabelas_para_Publicacao.html
BibTeX citation
@misc{storopoli2021tabelaspublicR, author = {Storopoli, Jose and Vils, Leonardo}, title = {Estatística com R: Tabelas para Publicação}, url = {https://storopoli.github.io/Estatistica/aux_Tabelas_para_Publicacao.html}, year = {2021} }