An aba model is the foundational object in the aba package. It is composed of the following:

  • data: a data.frame to be used to fit the statistical models

  • spec: the specification for the aba model composed of the following:

    • groups: subsets of the data

    • outcomes: dependent variables in statistical fits.

    • covariates: independent variables which should always be included in statistical fits.

    • predictors: independent variables which will vary across different statistical fits.

  • results: the resulting fitted statistics.
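To make the relationship between the spec and the results concrete, the following base R sketch enumerates one fit per combination of group, outcome, stat, and predictor set. It does not use aba internals: the names `spec` and `fit_grid` are hypothetical, and the values are borrowed from the Examples section, where a covariates-only "Basic" model is included because include_basic = TRUE.

```r
# Hypothetical spec mirroring the Examples section (not aba's internal structure)
spec <- list(
  groups     = c("Everyone", "DX_bl %in% c('MCI','AD')"),
  outcomes   = c("ConvertedToAlzheimers", "CSF_ABETA_STATUS_bl"),
  stats      = c("glm"),
  predictors = c("Basic", "M1", "M2", "M3", "M4")  # "Basic" = covariates only
)

# One row per combination; each row stands for one statistical fit.
# expand.grid() varies its first argument fastest, so predictor sets cycle
# within each outcome, and outcomes cycle within each group.
fit_grid <- expand.grid(
  predictor = spec$predictors,
  outcome   = spec$outcomes,
  stat      = spec$stats,
  group     = spec$groups,
  stringsAsFactors = FALSE
)

# 2 groups x 2 outcomes x 1 stat x 5 predictor sets
nrow(fit_grid)
```

Under these assumptions the grid has 20 rows, which matches the 20-row results tibble printed in the Examples section.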

Usage

aba_model(
  data = NULL,
  groups = NULL,
  outcomes = NULL,
  predictors = NULL,
  covariates = NULL,
  stats = NULL,
  evals = NULL,
  include_basic = TRUE
)

Arguments

data

data.frame. The data to use for the object.

groups

vector or list of logical statements as strings. Groups are subsets of the data on which different models will be fit.

outcomes

vector or list of strings. Outcomes are the dependent variables in the statistical fits.

predictors

vector or list of strings. Predictors are independent variables which you want to vary across fits. You can include variables on their own or in combination with others. Each collection of variables is referred to as a predictor, and each individual variable within it is referred to as a term.

covariates

vector of strings. Covariates are independent variables which remain fixed across all statistical fits and are therefore always included with the different combinations of predictors.

stats

string or abaStat object(s) with stat_ prefix. Stats are the actual statistical models which you want to fit on the data. Their primary functions are 1) to generate a suitable model formula given the outcome, covariate, and predictor combination, and 2) to fit the statistical model.

evals

string or abaEval object(s) with eval_ prefix. Evals are the ways in which your model is fit on the data. The standard method is to fit the models on the entire data set, but you can also fit models using bootstrapping, train-test splits, or cross-validation.

include_basic

logical. Whether to fit a "basic" model which includes only covariates.
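The way covariates, predictors, and include_basic combine into model formulas can be sketched in base R. This is an illustration only, not the code that abaStat objects run; the variable names come from the Examples section, and the labels `Basic` and `M1` to `M4` follow the printed output there.

```r
outcome    <- "ConvertedToAlzheimers"
covariates <- c("AGE", "GENDER", "EDUCATION")

# Each list element is one predictor (a set of terms).
# "Basic" has no terms, which corresponds to include_basic = TRUE.
predictors <- list(
  Basic = character(0),
  M1    = "PLASMA_ABETA_bl",
  M2    = "PLASMA_PTAU181_bl",
  M3    = "PLASMA_NFL_bl",
  M4    = c("PLASMA_ABETA_bl", "PLASMA_PTAU181_bl", "PLASMA_NFL_bl")
)

# Covariates are always present; predictor terms are appended after them.
formulas <- vapply(
  predictors,
  function(terms) {
    paste(outcome, "~", paste(c(covariates, terms), collapse = " + "))
  },
  character(1)
)

print(formulas)

# A formula string can be turned into a formula object for a model function.
basic_formula <- as.formula(formulas[["Basic"]])
```

The resulting strings have the same shape as the formulas printed when the model is fitted in the Examples section.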

Value

An aba model which can be fitted using the aba_fit() function and which can be modified in any manner.

Examples


# use built-in data and only take the baseline visit
data <- adnimerge %>% dplyr::filter(VISCODE == 'bl')

# Create an aba model with data, groups, outcomes, covariates, predictors, and stats.
# Note that we start by piping the data into `aba_model()`. This is
# possible because `data` is the first argument of the `aba_model()` function
# and is useful because it gives auto-completion of variable names in RStudio.
model <- data %>% aba_model() %>%
  set_groups(everyone(), DX_bl %in% c('MCI','AD')) %>%
  set_outcomes(ConvertedToAlzheimers, CSF_ABETA_STATUS_bl) %>%
  set_covariates(AGE, GENDER, EDUCATION) %>%
  set_predictors(
    PLASMA_ABETA_bl, PLASMA_PTAU181_bl, PLASMA_NFL_bl,
    c(PLASMA_ABETA_bl, PLASMA_PTAU181_bl, PLASMA_NFL_bl)
  ) %>%
  set_stats('glm')

# get a useful view of the model spec:
print(model)
#> ----------------------
#> ABA MODEL (not fitted)
#> ----------------------
#> Groups:
#>    Everyone
#>    DX_bl %in% c("MCI", "AD")
#> 
#> Outcomes:
#>    ConvertedToAlzheimers
#>    CSF_ABETA_STATUS_bl
#> 
#> Covariates:
#>    AGE GENDER EDUCATION
#> 
#> Predictors:
#>    M1
#>    M2
#>    M3
#>    M4
#> 
#> Stats:
#>    glm 
#>    

# Model specs can be modified to build on one another, which saves time when
# doing sensitivity analyses. Here, we create the same model as before but
# add APOE4 as a covariate.
model2 <- model %>%
  set_covariates(AGE, GENDER, EDUCATION, APOE4)

# see this change in the model print
print(model2)
#> ----------------------
#> ABA MODEL (not fitted)
#> ----------------------
#> Groups:
#>    Everyone
#>    DX_bl %in% c("MCI", "AD")
#> 
#> Outcomes:
#>    ConvertedToAlzheimers
#>    CSF_ABETA_STATUS_bl
#> 
#> Covariates:
#>    AGE GENDER EDUCATION APOE4
#> 
#> Predictors:
#>    M1
#>    M2
#>    M3
#>    M4
#> 
#> Stats:
#>    glm 
#>    

# Calling the `fit()` function actually triggers fitting of statistics.
model <- model %>% fit()
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_PTAU181_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_PTAU181_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_PTAU181_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_PTAU181_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
model2 <- model2 %>% fit()
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_PTAU181_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_PTAU181_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_PTAU181_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_NFL_bl"
#> [1] "ConvertedToAlzheimers ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_PTAU181_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_NFL_bl"
#> [1] "CSF_ABETA_STATUS_bl ~ AGE + GENDER + EDUCATION + APOE4 + PLASMA_ABETA_bl + PLASMA_PTAU181_bl + PLASMA_NFL_bl"

# Access the raw results directly if you need them:
print(model$results)
#> # A tibble: 20 × 5
#>    group                           outcome           stat  predictor fit  
#>    <chr>                           <chr>             <chr> <chr>     <lis>
#>  1 "Everyone"                      ConvertedToAlzhe… S1    Basic     <glm>
#>  2 "Everyone"                      ConvertedToAlzhe… S1    M1        <glm>
#>  3 "Everyone"                      ConvertedToAlzhe… S1    M2        <glm>
#>  4 "Everyone"                      ConvertedToAlzhe… S1    M3        <glm>
#>  5 "Everyone"                      ConvertedToAlzhe… S1    M4        <glm>
#>  6 "Everyone"                      CSF_ABETA_STATUS… S1    Basic     <glm>
#>  7 "Everyone"                      CSF_ABETA_STATUS… S1    M1        <glm>
#>  8 "Everyone"                      CSF_ABETA_STATUS… S1    M2        <glm>
#>  9 "Everyone"                      CSF_ABETA_STATUS… S1    M3        <glm>
#> 10 "Everyone"                      CSF_ABETA_STATUS… S1    M4        <glm>
#> 11 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1    Basic     <glm>
#> 12 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1    M1        <glm>
#> 13 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1    M2        <glm>
#> 14 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1    M3        <glm>
#> 15 "DX_bl %in% c(\"MCI\", \"AD\")" ConvertedToAlzhe… S1    M4        <glm>
#> 16 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1    Basic     <glm>
#> 17 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1    M1        <glm>
#> 18 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1    M2        <glm>
#> 19 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1    M3        <glm>
#> 20 "DX_bl %in% c(\"MCI\", \"AD\")" CSF_ABETA_STATUS… S1    M4        <glm>

# Calling the `summary()` function summarises covariates and metrics in
# a useful manner
model_summary <- model %>% summary()
model2_summary <- model2 %>% summary()

# see a nicely formatted print out of the summary
print(model_summary)
#> -----------------------------------------------------------
#> Group: Everyone | Outcome: ConvertedToAlzheimers | Stat: S1
#> -----------------------------------------------------------
#> Coefficients & Metrics:
#> # A tibble: 5 × 11
#>   predictor AGE         GENDER EDUCATION PLASMA_ABETA_bl PLASMA_PTAU181_bl
#>   <chr>     <chr>       <chr>  <chr>     <chr>           <chr>            
#> 1 Basic     1.03 [0.96… 0.70 … 1.02 [0.… NA              NA               
#> 2 M1        1.03 [0.95… 0.66 … 1.02 [0.… 0.00 [0.00, 38… NA               
#> 3 M2        1.01 [0.94… 0.75 … 1.01 [0.… NA              2.65 [1.07, 6.96…
#> 4 M3        0.95 [0.87… 0.82 … 1.04 [0.… NA              NA               
#> 5 M4        0.94 [0.85… 0.75 … 1.05 [0.… 0.00 [0.00, 64… 2.10 [0.79, 5.90…
#> # ℹ 5 more variables: PLASMA_NFL_bl <chr>, auc <chr>, aic <chr>,
#> #   pval <chr>, nobs <chr>
#> 
#> ---------------------------------------------------------
#> Group: Everyone | Outcome: CSF_ABETA_STATUS_bl | Stat: S1
#> ---------------------------------------------------------
#> Coefficients & Metrics:
#> # A tibble: 5 × 11
#>   predictor AGE         GENDER EDUCATION PLASMA_ABETA_bl PLASMA_PTAU181_bl
#>   <chr>     <chr>       <chr>  <chr>     <chr>           <chr>            
#> 1 Basic     1.06 [1.01… 1.27 … 0.99 [0.… NA              NA               
#> 2 M1        1.03 [0.98… 1.01 … 1.00 [0.… 0.00 [0.00, 0.… NA               
#> 3 M2        1.04 [0.99… 1.33 … 0.99 [0.… NA              3.28 [1.71, 6.93…
#> 4 M3        1.06 [1.01… 1.27 … 0.98 [0.… NA              NA               
#> 5 M4        1.04 [0.98… 1.09 … 1.00 [0.… 0.00 [0.00, 0.… 2.64 [1.36, 5.69…
#> # ℹ 5 more variables: PLASMA_NFL_bl <chr>, auc <chr>, aic <chr>,
#> #   pval <chr>, nobs <chr>
#> 
#> ----------------------------------------------------------------------------
#> Group: DX_bl %in% c("MCI", "AD") | Outcome: ConvertedToAlzheimers | Stat: S1
#> ----------------------------------------------------------------------------
#> Coefficients & Metrics:
#> # A tibble: 5 × 11
#>   predictor AGE         GENDER EDUCATION PLASMA_ABETA_bl PLASMA_PTAU181_bl
#>   <chr>     <chr>       <chr>  <chr>     <chr>           <chr>            
#> 1 Basic     1.06 [0.98… 0.74 … 1.05 [0.… NA              NA               
#> 2 M1        1.05 [0.97… 0.73 … 1.04 [0.… 0.00 [0.00, 41… NA               
#> 3 M2        1.06 [0.97… 0.76 … 1.08 [0.… NA              3.56 [1.19, 12.7…
#> 4 M3        0.99 [0.90… 0.75 … 1.13 [0.… NA              NA               
#> 5 M4        0.99 [0.89… 0.76 … 1.17 [0.… 0.76 [0.00, 57… 3.29 [1.05, 12.6…
#> # ℹ 5 more variables: PLASMA_NFL_bl <chr>, auc <chr>, aic <chr>,
#> #   pval <chr>, nobs <chr>
#> 
#> --------------------------------------------------------------------------
#> Group: DX_bl %in% c("MCI", "AD") | Outcome: CSF_ABETA_STATUS_bl | Stat: S1
#> --------------------------------------------------------------------------
#> Coefficients & Metrics:
#> # A tibble: 5 × 11
#>   predictor AGE         GENDER EDUCATION PLASMA_ABETA_bl PLASMA_PTAU181_bl
#>   <chr>     <chr>       <chr>  <chr>     <chr>           <chr>            
#> 1 Basic     1.05 [0.98… 2.31 … 0.96 [0.… NA              NA               
#> 2 M1        1.01 [0.94… 2.41 … 0.96 [0.… 0.00 [0.00, 0.… NA               
#> 3 M2        1.05 [0.98… 2.46 … 0.99 [0.… NA              2.93 [1.18, 8.29…
#> 4 M3        1.04 [0.97… 2.29 … 0.97 [0.… NA              NA               
#> 5 M4        1.01 [0.93… 2.45 … 0.98 [0.… 0.00 [0.00, 0.… 2.06 [0.80, 5.89…
#> # ℹ 5 more variables: PLASMA_NFL_bl <chr>, auc <chr>, aic <chr>,
#> #   pval <chr>, nobs <chr>
#> 

# or access the raw summary results:
print(model_summary$results)
#> $coefs
#> # A tibble: 84 × 9
#>    group   outcome stat  predictor term  estimate conf_low conf_high  pval
#>    <chr>   <chr>   <chr> <chr>     <chr>    <dbl>    <dbl>     <dbl> <dbl>
#>  1 Everyo… Conver… S1    Basic     AGE    1.03e+0 9.62e- 1    1.11e0 0.361
#>  2 Everyo… Conver… S1    Basic     GEND…  7.00e-1 2.32e- 1    2.00e0 0.510
#>  3 Everyo… Conver… S1    Basic     EDUC…  1.02e+0 8.32e- 1    1.26e0 0.868
#>  4 Everyo… Conver… S1    M1        AGE    1.03e+0 9.52e- 1    1.11e0 0.507
#>  5 Everyo… Conver… S1    M1        GEND…  6.57e-1 2.15e- 1    1.90e0 0.443
#>  6 Everyo… Conver… S1    M1        EDUC…  1.02e+0 8.36e- 1    1.26e0 0.833
#>  7 Everyo… Conver… S1    M1        PLAS…  1.44e-8 4.99e-26    3.90e8 0.368
#>  8 Everyo… Conver… S1    M2        AGE    1.01e+0 9.41e- 1    1.09e0 0.715
#>  9 Everyo… Conver… S1    M2        GEND…  7.52e-1 2.41e- 1    2.23e0 0.610
#> 10 Everyo… Conver… S1    M2        EDUC…  1.01e+0 8.19e- 1    1.25e0 0.937
#> # ℹ 74 more rows
#> 
#> $metrics
#> # A tibble: 80 × 8
#>    group    outcome      stat  predictor term  estimate conf_low conf_high
#>    <chr>    <chr>        <chr> <chr>     <chr>    <dbl>    <dbl>     <dbl>
#>  1 Everyone ConvertedTo… S1    Basic     auc      0.576    0.416     0.735
#>  2 Everyone ConvertedTo… S1    Basic     aic    123.      NA        NA    
#>  3 Everyone ConvertedTo… S1    Basic     pval     1       NA        NA    
#>  4 Everyone ConvertedTo… S1    Basic     nobs   198       NA        NA    
#>  5 Everyone ConvertedTo… S1    M1        auc      0.606    0.454     0.757
#>  6 Everyone ConvertedTo… S1    M1        aic    124.      NA        NA    
#>  7 Everyone ConvertedTo… S1    M1        pval     0.361   NA        NA    
#>  8 Everyone ConvertedTo… S1    M1        nobs   198       NA        NA    
#>  9 Everyone ConvertedTo… S1    M2        auc      0.701    0.591     0.812
#> 10 Everyone ConvertedTo… S1    M2        aic    120.      NA        NA    
#> # ℹ 70 more rows
#>
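
The coefficient estimates above look like exponentiated logistic regression coefficients (odds ratios) with confidence limits. That reading is an assumption, not something this page states. Under that assumption, the base R sketch below shows how such a table could be produced for a single glm on simulated data. The data frame `toy` and its columns are hypothetical, and Wald intervals from `confint.default()` are used for simplicity, which may differ from the interval method aba applies.

```r
set.seed(1)

# Simulated data (hypothetical; unrelated to adnimerge)
n   <- 200
toy <- data.frame(AGE = rnorm(n, mean = 70, sd = 8), BIOMARKER = rnorm(n))
toy$OUTCOME <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.8 * toy$BIOMARKER))

# Logistic regression, analogous to one row of the results table
fit <- glm(OUTCOME ~ AGE + BIOMARKER, data = toy, family = binomial())

# Wald confidence intervals on the log-odds scale, then exponentiate
ci <- confint.default(fit)
odds_ratios <- data.frame(
  term      = names(coef(fit)),
  estimate  = exp(coef(fit)),
  conf_low  = exp(ci[, 1]),
  conf_high = exp(ci[, 2]),
  row.names = NULL
)

# Drop the intercept to mirror a table that lists covariates and predictors only
odds_ratios <- odds_ratios[odds_ratios$term != "(Intercept)", ]
print(odds_ratios)
```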