:orphan:

.. _ahelp_3dGLMM:

******
3dGLMM
******

.. contents:: :local:


| 

.. code-block:: none

    
                 ================== Welcome to 3dGLMM ==================
              Program for Voxelwise Generalized Linear Mixed-Models (GLMMs) 
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    Version 0.0.3, Feb 18, 2025
    Author: Gang Chen (gangchen@mail.nih.gov)
    SSCC/NIMH, National Institutes of Health, Bethesda MD 20892, USA
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    
    Introduction
    ------------
    
     ### Generalized Linear Mixed-Models (GLMM) Overview
    
     Generalized Linear Mixed-Models (GLMMs) extend Linear Mixed-Models (LMMs) to
     handle non-normal response variables, such as binary, count, or categorical data.
     The response variable in GLMMs can follow distributions like binomial, Poisson,
     or other members of the exponential family.
    
     ### 3dGLMM: Extension of 3dLMEr
    
     The program **3dGLMM** builds on **3dLMEr**, adding support for Student's
     *t*-distribution for model residuals in addition to the standard normal
     distribution. This functionality requires the R packages **glmmTMB**,
     **car**, and **emmeans**.
    
     Like **3dLMEr**, 3dGLMM automatically provides outputs for all main effects
     and interactions. However, users must explicitly request marginal effects
     and their comparisons through the options `-level` or `-slope` in 3dGLMM
     instead of `-gltCode` or `-glfCode` in 3dLMEr.
    
     1. **Random-Effects Specification**:
    
     Random-effects components must be directly incorporated into the model
     specification via the `-model` option. The `-ranEff` option used in
     3dLMEr is no longer needed. Users are responsible for formulating
     appropriate model structures. For detailed guidance, refer to the blog post:
     [How to specify individual-level random effects in hierarchical modeling]
     (https://discuss.afni.nimh.nih.gov/t/how-to-specify-individual-level-random-effects-in-hierarchical-modeling/6462).
    
     2. **Marginal Effects and Pairwise Comparisons**:
    
     Users can specify marginal effects and pairwise comparisons through the
     options `-level` and `-slope`.
    
     ### Input and Output Formats
    
     3dGLMM accepts input files in various formats, including AFNI, NIfTI,
     surface (`niml.dset`), or 1D. To match the output format with the
     input, append an appropriate suffix to the output option `-prefix`
     (e.g., `.nii` for NIfTI, `.niml.dset` for surface, or `.1D` for 1D).
    
     ### Incorporation of Explanatory Variables
    
     3dGLMM supports various types of explanatory variables and covariates:
     - **Categorical variables**: Between- and within-subject factors.
     - **Quantitative variables**: Continuous predictors like age or
     behavioral data.
    
     #### Declaring Quantitative Variables
    
     When including quantitative variables, you must explicitly declare
     them using the `-qVars` option. Additionally, consider the
     **centering** of these variables:
    
     - Global centering.
     - Within-group (or within-condition) centering.
    
     For further guidance on centering, see: [AFNI documentation on centering]
     (https://afni.nimh.nih.gov/pub/dist/doc/htmldoc/statistics/center.html).
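
     As a minimal sketch of within-group centering done before the values
     are entered into the data table, the helper below subtracts each group's
     mean age from its members. The table, the `group` column, and the file
     names are hypothetical, and the snippet is written for a POSIX shell
     (sh/bash) with awk rather than tcsh.

     ```
     # Hypothetical long-format table: Subj, group, age, InputFile
     printf '%s\n' \
       'Subj group age InputFile' \
       's1 patient 30 data/s1+tlrc.' \
       's2 patient 40 data/s2+tlrc.' \
       's3 control 20 data/s3+tlrc.' \
       's4 control 26 data/s4+tlrc.' > table_raw.txt

     # The file is read twice: the first pass accumulates the sum and count
     # of age per group; the second pass prints the header unchanged and
     # replaces age with (age - group mean) in every data row.
     awk '
       NR == FNR && FNR > 1 { sum[$2] += $3; n[$2]++ }
       NR == FNR            { next }
       FNR == 1             { print; next }
                            { $3 -= sum[$2] / n[$2]; print }
     ' table_raw.txt table_raw.txt > table_centered.txt

     cat table_centered.txt
     ```

     With the covariate centered beforehand, setting `-qVarCenters 0` (as in
     Example 1) would presumably keep the values as they are, by analogy with
     the advice given for already-centered voxel-wise covariates.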
    
     ### Installation of Required R Packages
    
     Before running 3dGLMM, ensure the following R packages are installed:
     - `glmmTMB`
     - `car`
     - `emmeans`
     - `snow`
    
     You can install them via AFNI’s R installation script:
    
     rPkgsInstall -pkgs "glmmTMB,car,emmeans,snow"
    
     Alternatively, install them directly in R:
    
     ```
     install.packages("glmmTMB")
     install.packages("car")
     install.packages("emmeans")
     install.packages("snow")
     ```
    
     ### Example Scripts
    
     The following example scripts demonstrate 3dGLMM applications. More
     examples will be added as scenarios are crowdsourced from users. If
     one of the examples matches your data structure, use it as a template
     to build your own script.
    
     ### Running 3dGLMM
     Once you’ve constructed your command script, run it in the terminal.
     Save the script as a text file (e.g., `GLMM.txt`) and execute it with:
    
     ```
     nohup tcsh -x GLMM.txt &
     ```
    
     Alternatively, for progress tracking, redirect output to a log file with
     either of the following:
    
     ```
     nohup tcsh -x GLMM.txt > diary.txt &
     nohup tcsh -x GLMM.txt |& tee diary.txt &
     ```
     
     Either command saves output in `diary.txt`, allowing you to review
     progress and troubleshoot if needed. The second form also displays the
     output in the terminal through `tee`.
    
    ### Example 1: one within-individual factor and a quantitative predictor
    
    -------------------------------------------------------------------------
      3dGLMM -prefix glmm.student -jobs 12                              \
             -family student.t                                          \
             -model 'task*age+(1|Subj)'                                 \
             -qVars 'age'                                               \
             -qVarCenters 0                                             \
             -level LAB task CAT task                                   \
             -level LAB pos.slp2    CAT 1  FIX task=pos,age=2           \
             -slope LAB pos.age     CAT 1  FIX task=pos       QUANT age \
             -slope LAB task.by.age CAT task                  QUANT age \
             -dataTable                                                 \
             Subj    age   task  InputFile                              \
             s1     3.03   pos   data/pos_s1+tlrc.                      \
             s1     0.82   neg   data/neg_s1+tlrc.                      \
             s2     2.67   pos   data/pos_s2+tlrc.                      \
             s2     0.24   neg   data/neg_s2+tlrc.                      \
    
              ...
    
      #### Data Structure Overview  
    
      This example involves a **within-individual factor** (task with two levels:
      *pos* and *neg*) and a **between-individual quantitative variable** (*age*).
      The GLMM analysis is conducted using a **Student's t-distribution** for the
      model residuals.  
      
      #### Reserved Keywords for Post-Hoc Estimations  
      The following four reserved keywords should not be used in custom
      specifications for post-hoc estimations:  
      
      - **LAB**:   Used to define a label for the estimated effect.  
      - **CAT**:   Specifies a categorical variable for which effects are estimated
                   at each level, along with all possible pairwise comparisons.
                   Use *1* for the intercept or overall mean of the model.  
      - **FIX**:   Indicates variables fixed at specific levels or values.  
      - **QUANT**: Specifies the estimation of a slope for a quantitative variable.  
      
      ---
      
      #### Explanations for Post-Hoc Estimations  
      
      1. **`-level LAB task CAT task`**  
         - Estimates the effects for both levels of the task (*pos* and *neg*) and
           their contrast (evaluated at *age = 0*).  
      
      2. **`-level LAB pos.slp2 CAT 1 FIX task=pos,age=2`**  
         - Estimates the effect of the *pos* task at *age = 2* (relative to the
           centered value of age). The number *1* represents the intercept or grand
           mean of the model.  
      
      3. **`-slope LAB pos.age CAT 1 FIX task=pos QUANT age`**  
         - Estimates the slope effect of *age* for the *pos* task. The number *1* 
           represents the intercept or grand mean of the model.
      
      4. **`-slope LAB task.by.age CAT task QUANT age`**  
         - Estimates the slope effect of *age* for both *pos* and *neg* tasks and
           their contrast.
       
    
    Options in alphabetical order:
    ------------------------------
    
       -bounds lb ub: This option is for outlier removal. Two numbers are expected from
             the user: the lower bound (lb) and the upper bound (ub). The input data will
             be confined within [lb, ub]: any values in the input data that are beyond
             the bounds will be removed and treated as missing. Make sure the first number
             is less than the second. The default (the absence of this option) is no
             outlier removal. 
             **NOTE**: Using the -bounds option to remove outliers should be approached
             with caution due to its arbitrariness. A more principled alternative is to
             use the -family option with a Student's t-distribution.
    
       -cio: Use AFNI's C io functions, which is the default. Alternatively, -Rio
             can be used.
    
       -dataTable TABLE: List the data structure with a header as the first line.
    
             NOTE:
    
             1) This option has to occur last in the script; that is, no other
             options are allowed thereafter. Each line should end with a backslash
             except for the last line.
    
             2) The order of the columns should not matter except that the last
             column has to be the one for input files, 'InputFile'. Unlike 3dMVM, the
             subject column (Subj in 3dMVM) does not have to be the first column,
             and the table does not have to include a subject ID column in some situations.
             Each row should contain only one input file in the table of long format
             (cf. wide format) as defined in R. Input files can be in AFNI, NIfTI or
             surface format. AFNI files can be specified with sub-brick selector (square
             brackets [] within quotes) specified with a number or label.
    
             3) It is fine to have variables (or columns) in the table that are
             not modeled in the analysis.
    
             4) When the table is part of the script, a backslash is needed at the end
             of each line (except for the last line) to indicate the continuation to the
             next line. Alternatively, one can save the content of the table as a separate
             file, e.g., calling it table.txt, and then in the script specify the data
             with '-dataTable @table.txt'. However, when the table is provided as a
             separate file, do NOT put any quotes around the square brackets for each
             sub-brick, otherwise the program would not properly read the files, unlike the
             situation when quotes are required if the table is included as part of the
             script. Backslash is also not needed at the end of each line, but it would
             not cause any problem if present. This option of separating the table from
             the script is useful: (a) when there are many input files so that the program
             complains with an 'Arg list too long' error; (b) when you want to try
             different models with the same dataset.
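
             As a minimal sketch of the separate-file approach, the snippet below
             writes the table to table.txt and passes it with '-dataTable @table.txt'.
             The rows repeat the four shown in Example 1 (a real table would list
             many more subjects), and the options are those of Example 1. The
             'command -v' guard is sh/bash syntax added only so that the snippet is
             harmless where 3dGLMM is not installed; in a tcsh script the 3dGLMM
             command would be used on its own.

             ```
             # Table file: header first, InputFile as the last column,
             # no trailing backslashes needed
             printf '%s\n' \
               'Subj  age   task  InputFile' \
               's1    3.03  pos   data/pos_s1+tlrc.' \
               's1    0.82  neg   data/neg_s1+tlrc.' \
               's2    2.67  pos   data/pos_s2+tlrc.' \
               's2    0.24  neg   data/neg_s2+tlrc.' > table.txt

             # Run only when 3dGLMM is on the PATH; -dataTable stays the last option
             if command -v 3dGLMM > /dev/null 2>&1; then
               3dGLMM -prefix glmm.student -jobs 12 \
                      -family student.t \
                      -model 'task*age+(1|Subj)' \
                      -qVars 'age' \
                      -qVarCenters 0 \
                      -level LAB task CAT task \
                      -dataTable @table.txt
             fi
             ```

             Keeping the table in its own file makes it straightforward to rerun the
             same data with a different -model or -family by editing only the command.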
    
       -dbgArgs: This option will enable R to save the parameters in a
             file called .3dGLMM.dbg.AFNI.args in the current directory
             so that debugging can be performed.
    
       -family: This option specifies the distribution of model residuals. Currently
             two families are supported: "Gaussian" (default) and "student.t".
    
       -help: this help message
    
       -IF var_name: var_name is used to specify the column name that is designated for
            input files of effect estimates. The default (when this option is not invoked)
            is 'InputFile', in which case the column header has to be exactly 'InputFile'.
            This input-file column has to be the last column.
    
       -jobs NJOBS: On a multi-processor machine, parallel computing will speed 
             up the program significantly.
             Choose 1 for a single-processor computer.
    
       -level LAB ... CAT ... BY ... FIX ...: Specify the label, categorical variable
           ....  
    
       -mask MASK: Process voxels inside this mask only.
              Default is no masking.
    
       -model FORMULA: Specify the model structure for all the variables. The
             expression FORMULA with more than one variable has to be surrounded
             within (single or double) quotes. Variable names in the formula
             should be consistent with the ones used in the header of -dataTable.
             In the GLMM context the simplest model is "1+(1|Subj)", in
             which a random intercept is assigned to each subject. Each
             random-effects factor is specified within parentheses per
             formula convention in R. Any
             effects of interest and confounding variables (quantitative or
             categorical variables) can be added as fixed effects without parentheses.
    
       -prefix PREFIX: Output file name. For AFNI format, provide prefix only,
             with no view+suffix needed. Filename for NIfTI format should have
             .nii attached (otherwise the output would be saved in AFNI format).
    
       -qVarCenters VALUES: Specify centering values for quantitative variables
             identified under -qVars. Multiple centers are separated by 
             commas (,) without any other characters such as spaces and should
             be surrounded within (single or double) quotes. The order of the
             values should match that of the quantitative variables in -qVars.
             Default (absence of option -qVarCenters) means centering on the
             average of the variable across ALL subjects regardless of their
             grouping. If within-group centering is desirable, center the
             variable YOURSELF first before the values are fed into -dataTable.
    
       -qVars variable_list: Identify quantitative variables (or covariates) with
             this option. The list with more than one variable has to be
             separated with comma (,) without any other characters such as
             spaces and should be surrounded within (single or double) quotes.
             For example, -qVars "Age,IQ"
             WARNINGS:
             1) Centering a quantitative variable through -qVarCenters is
             critical when other fixed effects are of interest.
             2) Between-subjects covariates are generally acceptable.
             However EXTREME caution should be taken when the groups
             differ substantially in the average value of the covariate.
             
    
       -R2: Enabling this option will prompt the program to provide both
             conditional and marginal coefficient of determination (R^2)
             values associated with the adopted model. Marginal R^2 indicates
             the proportion of variance explained by the fixed effects in the
             model, while conditional R^2 represents the proportion of variance
             explained by the entire model, encompassing both fixed and random
             effects. Two sub-bricks labeled 'R2m' and 'R2c' will be provided
             in the output.
    
       -resid PREFIX: Output file name for the residuals. For AFNI format, provide
             prefix only without view+suffix. Filename for NIfTI format should
             have .nii attached, while file name for surface data is expected
             to end with .niml.dset. The sub-brick labeled with the '(Intercept)',
             if present, should be interpreted as the effect with each factor
             at the reference level (alphabetically the lowest level) for each
             factor and with each quantitative covariate at the center value.
    
       -Rio: Use R's io functions. The alternative is -cio.
    
       -show_allowed_options: list of allowed options
    
       -slope LAB ... CAT ... BY ... FIX ... QUANT ...: Specify the label, categorical variable
           ....  
    
       -SS_type NUMBER: Specify the type for sums of squares in the F-statistics.
             Three options are: sequential (1), hierarchical (2), and marginal (3).
             When this option is absent (default), marginal (3) is automatically set.
             Some discussion regarding their differences can be found here:
             https://sscc.nimh.nih.gov/sscc/gangc/SS.html
     
       -vVarCenters VALUES: Specify centering values for voxel-wise covariates
             identified under -vVars. Multiple centers are separated by 
             commas (,) within (single or double) quotes. The order of the
             values should match that of the voxel-wise covariates in -vVars.
             Default (absence of option -vVarCenters) means centering on the
             average of the variable across ALL subjects regardless of their
             grouping. If within-group centering is desirable, center the
             variable yourself first before the files are fed under -dataTable.
    
       -vVars variable_list: Identify voxel-wise covariates with this option.
             Currently only one voxel-wise covariate is allowed. By default
             mean centering is performed voxel-wise across all subjects.
             Alternatively centering can be specified through a global value
             under -vVarCenters. If the voxel-wise covariates have already
             been centered, set the centers at 0 with -vVarCenters.