An aba model is the foundational object in the aba package. It is composed of the following:
- data: a data.frame used to fit the statistical models.
- spec: the specification for the aba model, composed of the following:
  - groups: subsets of the data.
  - outcomes: dependent variables in statistical fits.
  - covariates: independent variables which are always included in statistical fits.
  - predictors: independent variables which vary across different statistical fits.
- results: the resulting fitted statistics.
Usage
aba_model(
data = NULL,
groups = NULL,
outcomes = NULL,
predictors = NULL,
covariates = NULL,
stats = NULL,
evals = NULL,
include_basic = TRUE
)
Arguments
- data
data.frame. The data to use for the object.
- groups
vector or list of logical statements as strings. Groups are subsets of the data on which different models will be fit.
- outcomes
vector or list of strings. Outcomes are the dependent variables in the statistical fits.
- predictors
vector or list of strings. Predictors are independent variables which you want to vary. You can include variables on their own or in combination with others. A collection of variables is referred to as a predictor, and each unique variable is referred to as a term.
- covariates
vector of strings. Covariates are independent variables which remain fixed across all statistical fits and are therefore always included with the different combinations of predictors.
- stats
string or abaStat object(s) with stat_ prefix. Stats are the actual statistical models which you want to fit on the data. Their primary functions are to 1) generate a suitable model formula given the outcome-covariate-predictor combination, and 2) fit the statistical model.
- evals
string or abaEval object(s) with eval_ prefix. Evals are the ways in which your model is fit on the data. The standard method is to simply fit the models on the entire data, but you can also fit models using bootstrapping, train-test splits, or cross validation.
- include_basic
logical. Whether to fit a "basic" model which includes only covariates.
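As a rough illustration of how outcomes, covariates, and predictor sets combine into model formulas, the base R sketch below rebuilds the formula strings that appear in the Examples output. The helper `build_formulas()` is hypothetical and is not part of the aba package; in aba the formula is generated by each stat, so the real logic may differ.

```r
# Hypothetical helper (NOT an aba function): expand a spec into formula strings.
build_formulas <- function(outcomes, covariates, predictors,
                           include_basic = TRUE) {
  # Each element of `predictors` is one predictor set: a character vector of terms.
  if (include_basic) {
    # The "basic" model has no predictor terms, only the covariates.
    predictors <- c(list(character(0)), predictors)
  }
  formulas <- character(0)
  for (outcome in outcomes) {
    for (terms in predictors) {
      rhs_terms <- c(covariates, terms)
      # Fall back to an intercept-only model if nothing is on the right-hand side.
      rhs <- if (length(rhs_terms) == 0) "1" else paste(rhs_terms, collapse = " + ")
      formulas <- c(formulas, paste(outcome, "~", rhs))
    }
  }
  formulas
}

formulas <- build_formulas(
  outcomes   = c("ConvertedToAlzheimers", "CSF_ABETA_STATUS_bl"),
  covariates = c("AGE", "GENDER", "EDUCATION"),
  predictors = list(
    "PLASMA_ABETA_bl",
    "PLASMA_PTAU181_bl",
    "PLASMA_NFL_bl",
    c("PLASMA_ABETA_bl", "PLASMA_PTAU181_bl", "PLASMA_NFL_bl")
  )
)
print(formulas)
```

With two outcomes and four predictor sets plus the basic model, this yields ten formulas per group, which matches the twenty rows (two groups) shown in the results table in the Examples.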
Value
An aba model which can be fitted using the aba_fit() function and which can be modified in any manner.
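The train-test split mentioned under evals can be pictured with plain `glm()`. This is only a generic sketch on simulated data, using base R and the stats package; the variable names, the 80/20 split, and the 0.5 threshold are assumptions made for illustration, and aba's own eval objects may work differently.

```r
set.seed(1)

# Simulated data, purely illustrative (not from adnimerge).
n <- 200
df <- data.frame(x = rnorm(n))
df$y <- rbinom(n, size = 1, prob = plogis(0.5 * df$x))

# Hold out 20% of the rows for testing (the 80/20 split is an assumption).
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train_set <- df[train_idx, ]
test_set  <- df[-train_idx, ]

# Fit on the training rows only, then predict on the held-out rows.
fit  <- glm(y ~ x, data = train_set, family = binomial())
pred <- predict(fit, newdata = test_set, type = "response")

# Simple held-out accuracy at a 0.5 probability threshold.
accuracy <- mean((pred > 0.5) == (test_set$y == 1))
print(accuracy)
```

Fitting on the entire data, the standard method, corresponds to skipping the split and evaluating on the same rows used for fitting.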
Examples
library(aba)
# use built-in data and only take the baseline visit
data <- adnimerge %>% dplyr::filter(VISCODE == 'bl')
# Create an aba model with data, groups, outcomes, covariates, predictors, and stats.
# Note that we start by piping the data into `aba_model()`. This is possible
# because `data` is the first argument of `aba_model()`, and it is useful
# because it gives auto-completion of variable names in RStudio.
model <- data %>% aba_model() %>%
  set_groups(everyone(), DX_bl %in% c('MCI','AD')) %>%
  set_outcomes(ConvertedToAlzheimers, CSF_ABETA_STATUS_bl) %>%
  set_covariates(AGE, GENDER, EDUCATION) %>%
  set_predictors(
    PLASMA_ABETA_bl, PLASMA_PTAU181_bl, PLASMA_NFL_bl,
    c(PLASMA_ABETA_bl, PLASMA_PTAU181_bl, PLASMA_NFL_bl)
  ) %>%
  set_stats('glm')
# get a useful view of the model spec:
print(model)
#> ----------------------
#> ABA MODEL (not fitted)
#> ----------------------
#> Groups:
#> Everyone
#> DX_bl %in% c("MCI", "AD")
#>
#> Outcomes:
#> ConvertedToAlzheimers
#> CSF_ABETA_STATUS_bl
#>
#> Covariates:
#> AGE GENDER EDUCATION
#>
#> Predictors:
#> M1
#> M2
#> M3
#> M4
#>
#> Stats:
#> glm
#>
# Model specs can be modified to build on one another, which saves time when
# doing sensitivity analyses. Here, we create the same model as before but
# add APOE4 as a covariate.
model2 <- model %>%
  set_covariates(AGE, GENDER, EDUCATION, APOE4)
# see this change in the model print
print(model2)
#> ----------------------
#> ABA MODEL (not fitted)
#> ----------------------
#> Groups:
#> Everyone
#> DX_bl %in% c("MCI", "AD")
#>
#> Outcomes:
#> ConvertedToAlzheimers
#> CSF_ABETA_STATUS_bl
#>
#> Covariates:
#> AGE GENDER EDUCATION APOE4
#>
#> Predictors:
#> M1
#> M2
#> M3
#> M4
#>
#> Stats:
#> glm
#>
# Calling the `fit()` function actually triggers fitting of statistics.
model <- model %>% fit()
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_PTAU181_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_PTAU181_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_PTAU181_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_PTAU181_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
model2 <- model2 %>% fit()
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_PTAU181_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_PTAU181_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_PTAU181_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_PTAU181_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
# Access the raw results directly if needed:
print(model$results)
#> # A tibble: 20 × 5
#> group outcome stat predictor fit
#> <chr> <chr> <chr> <chr> <lis>
#> 1 "Everyone" ConvertedToAlzhe… S1 Basic <glm>
#> 2 "Everyone" ConvertedToAlzhe… S1 M1 <glm>
#> 3 "Everyone" ConvertedToAlzhe… S1 M2 <glm>
#> 4 "Everyone" ConvertedToAlzhe… S1 M3 <glm>
#> 5 "Everyone" ConvertedToAlzhe… S1 M4 <glm>
#> 6 "Everyone" CSF_ABETA_STATUS… S1 Basic <glm>
#> 7 "Everyone" CSF_ABETA_STATUS… S1 M1 <glm>
#> 8 "Everyone" CSF_ABETA_STATUS… S1 M2 <glm>
#> 9 "Everyone" CSF_ABETA_STATUS… S1 M3 <glm>
#> 10 "Everyone" CSF_ABETA_STATUS… S1 M4 <glm>
#> 11 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1 Basic <glm>
#> 12 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1 M1 <glm>
#> 13 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1 M2 <glm>
#> 14 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1 M3 <glm>
#> 15 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1 M4 <glm>
#> 16 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1 Basic <glm>
#> 17 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1 M1 <glm>
#> 18 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1 M2 <glm>
#> 19 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1 M3 <glm>
#> 20 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1 M4 <glm>
# Calling the `summary()` function summarises coefficients and metrics for
# each group, outcome, and stat combination.
model_summary <- model %>% summary()
model2_summary <- model2 %>% summary()
# see a nicely formatted print out of the summary
print(model_summary)
#> -----------------------------------------------------------
#> Group: Everyone | Outcome: ConvertedToAlzheimers | Stat: S1
#> -----------------------------------------------------------
#> Coefficients & Metrics:
#> # A tibble: 5 × 11
#> predictor AGE GENDER EDUCATION PLASMA_ABETA_bl PLASMA_PTAU181_bl
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Basic 1.03 [0.96… 0.70 … 1.02 [0.… NA NA
#> 2 M1 1.03 [0.95… 0.66 … 1.02 [0.… 0.00 [0.00, 38… NA
#> 3 M2 1.01 [0.94… 0.75 … 1.01 [0.… NA 2.65 [1.07, 6.96…
#> 4 M3 0.95 [0.87… 0.82 … 1.04 [0.… NA NA
#> 5 M4 0.94 [0.85… 0.75 … 1.05 [0.… 0.00 [0.00, 64… 2.10 [0.79, 5.90…
#> # ℹ 5 more variables: PLASMA_NFL_bl <chr>, auc <chr>, aic <chr>,
#> # pval <chr>, nobs <chr>
#>
#> ---------------------------------------------------------
#> Group: Everyone | Outcome: CSF_ABETA_STATUS_bl | Stat: S1
#> ---------------------------------------------------------
#> Coefficients & Metrics:
#> # A tibble: 5 × 11
#> predictor AGE GENDER EDUCATION PLASMA_ABETA_bl PLASMA_PTAU181_bl
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Basic 1.06 [1.01… 1.27 … 0.99 [0.… NA NA
#> 2 M1 1.03 [0.98… 1.01 … 1.00 [0.… 0.00 [0.00, 0.… NA
#> 3 M2 1.04 [0.99… 1.33 … 0.99 [0.… NA 3.28 [1.71, 6.93…
#> 4 M3 1.06 [1.01… 1.27 … 0.98 [0.… NA NA
#> 5 M4 1.04 [0.98… 1.09 … 1.00 [0.… 0.00 [0.00, 0.… 2.64 [1.36, 5.69…
#> # ℹ 5 more variables: PLASMA_NFL_bl <chr>, auc <chr>, aic <chr>,
#> # pval <chr>, nobs <chr>
#>
#> ----------------------------------------------------------------------------
#> Group: DX_bl %in% c("MCI", "AD") | Outcome: ConvertedToAlzheimers | Stat: S1
#> ----------------------------------------------------------------------------
#> Coefficients & Metrics:
#> # A tibble: 5 × 11
#> predictor AGE GENDER EDUCATION PLASMA_ABETA_bl PLASMA_PTAU181_bl
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Basic 1.06 [0.98… 0.74 … 1.05 [0.… NA NA
#> 2 M1 1.05 [0.97… 0.73 … 1.04 [0.… 0.00 [0.00, 41… NA
#> 3 M2 1.06 [0.97… 0.76 … 1.08 [0.… NA 3.56 [1.19, 12.7…
#> 4 M3 0.99 [0.90… 0.75 … 1.13 [0.… NA NA
#> 5 M4 0.99 [0.89… 0.76 … 1.17 [0.… 0.76 [0.00, 57… 3.29 [1.05, 12.6…
#> # ℹ 5 more variables: PLASMA_NFL_bl <chr>, auc <chr>, aic <chr>,
#> # pval <chr>, nobs <chr>
#>
#> --------------------------------------------------------------------------
#> Group: DX_bl %in% c("MCI", "AD") | Outcome: CSF_ABETA_STATUS_bl | Stat: S1
#> --------------------------------------------------------------------------
#> Coefficients & Metrics:
#> # A tibble: 5 × 11
#> predictor AGE GENDER EDUCATION PLASMA_ABETA_bl PLASMA_PTAU181_bl
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Basic 1.05 [0.98… 2.31 … 0.96 [0.… NA NA
#> 2 M1 1.01 [0.94… 2.41 … 0.96 [0.… 0.00 [0.00, 0.… NA
#> 3 M2 1.05 [0.98… 2.46 … 0.99 [0.… NA 2.93 [1.18, 8.29…
#> 4 M3 1.04 [0.97… 2.29 … 0.97 [0.… NA NA
#> 5 M4 1.01 [0.93… 2.45 … 0.98 [0.… 0.00 [0.00, 0.… 2.06 [0.80, 5.89…
#> # ℹ 5 more variables: PLASMA_NFL_bl <chr>, auc <chr>, aic <chr>,
#> # pval <chr>, nobs <chr>
#>
# or access the raw summary results:
print(model_summary$results)
#> $coefs
#> # A tibble: 84 × 9
#> group outcome stat predictor term estimate conf_low conf_high pval
#> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Everyo… Conver… S1 Basic AGE 1.03e+0 9.62e- 1 1.11e0 0.361
#> 2 Everyo… Conver… S1 Basic GEND… 7.00e-1 2.32e- 1 2.00e0 0.510
#> 3 Everyo… Conver… S1 Basic EDUC… 1.02e+0 8.32e- 1 1.26e0 0.868
#> 4 Everyo… Conver… S1 M1 AGE 1.03e+0 9.52e- 1 1.11e0 0.507
#> 5 Everyo… Conver… S1 M1 GEND… 6.57e-1 2.15e- 1 1.90e0 0.443
#> 6 Everyo… Conver… S1 M1 EDUC… 1.02e+0 8.36e- 1 1.26e0 0.833
#> 7 Everyo… Conver… S1 M1 PLAS… 1.44e-8 4.99e-26 3.90e8 0.368
#> 8 Everyo… Conver… S1 M2 AGE 1.01e+0 9.41e- 1 1.09e0 0.715
#> 9 Everyo… Conver… S1 M2 GEND… 7.52e-1 2.41e- 1 2.23e0 0.610
#> 10 Everyo… Conver… S1 M2 EDUC… 1.01e+0 8.19e- 1 1.25e0 0.937
#> # ℹ 74 more rows
#>
#> $metrics
#> # A tibble: 80 × 8
#> group outcome stat predictor term estimate conf_low conf_high
#> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 Everyone ConvertedTo… S1 Basic auc 0.576 0.416 0.735
#> 2 Everyone ConvertedTo… S1 Basic aic 123. NA NA
#> 3 Everyone ConvertedTo… S1 Basic pval 1 NA NA
#> 4 Everyone ConvertedTo… S1 Basic nobs 198 NA NA
#> 5 Everyone ConvertedTo… S1 M1 auc 0.606 0.454 0.757
#> 6 Everyone ConvertedTo… S1 M1 aic 124. NA NA
#> 7 Everyone ConvertedTo… S1 M1 pval 0.361 NA NA
#> 8 Everyone ConvertedTo… S1 M1 nobs 198 NA NA
#> 9 Everyone ConvertedTo… S1 M2 auc 0.701 0.591 0.812
#> 10 Everyone ConvertedTo… S1 M2 aic 120. NA NA
#> # ℹ 70 more rows
#>