Postprocessing

Overview

The clpipe postprocess command combines the functionality of the retired fmri_postprocess and glm_setup commands into a unified postprocessing stream.

This command allows for flexible creation of processing streams. The order of processing steps and their specific implementations can be modified in the configuration file. Any temporally-relevant processing steps can also be applied to each image’s corresponding confounds file. postprocess caches its processing intermediaries in a working directory, which allows quick re-runs of pipelines with new parameters.

This command will also output a detailed processing graph for each processing stream.

Example Pipeline

_images/example_pipeline.png

Configuration

Overview

The top level of the postprocessing configuration section contains general options like the path to your working directory, which tasks to target, etc.

Following this are the ProcessingSteps, which define the steps used for postprocessing. Postprocessing will occur in the order of the list.

ProcessingStepOptions lists every processing step that has configurable options, so that each step can be tuned to suit the needs of your project. See the Processing Step Options section for more information about configuring this section.

ConfoundOptions contains settings specific to each image’s confounds file, and BatchOptions contains settings for job submission.
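Because postprocessing runs in the order of the list, it can help to sanity-check the list before submitting jobs. The following is a minimal sketch, not part of clpipe: the helper name `check_processing_steps` is hypothetical, and the set of step names is assumed from the sections of this page, so your installed version may accept a different set.

```python
import json

# Step names assumed from the headings on this page; your clpipe version
# may support a different set.
KNOWN_STEPS = {
    "SpatialSmoothing", "TemporalFiltering", "IntensityNormalization",
    "ApplyMask", "AROMARegression", "ConfoundRegression",
    "ScrubTimepoints", "Resample", "TrimTimepoints",
}


def check_processing_steps(config_text):
    """Return the configured steps in run order, raising on unknown names."""
    options = json.loads(config_text)["postprocessing"]
    steps = options["processing_steps"]
    unknown = [step for step in steps if step not in KNOWN_STEPS]
    if unknown:
        raise ValueError(f"Unknown processing steps: {unknown}")
    return steps


# A trimmed-down, hypothetical configuration used only for illustration.
example = """
{
  "postprocessing": {
    "processing_steps": ["SpatialSmoothing", "TemporalFiltering",
                         "IntensityNormalization", "ApplyMask"]
  }
}
"""

for position, step in enumerate(check_processing_steps(example), start=1):
    print(position, step)
```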

Option Block

"postprocessing": {
        "working_directory": "/nas/longleaf/home/user/work",
        "write_process_graph": true,
        "target_directory": "/nas/longleaf/home/clpipe/data_fmriprep",
        "target_image_space": "MNI152NLin2009cAsym",
        "target_tasks": [],
        "target_acquisitions": [],
        "output_directory": "/nas/longleaf/home/user/clpipe/data_postprocess",
        "processing_steps": [
                "SpatialSmoothing",
                "TemporalFiltering",
                "IntensityNormalization",
                "ApplyMask"
        ],
        "processing_step_options": {
                "temporal_filtering": {
                        "implementation": "fslmaths",
                        "filtering_high_pass": 0.008,
                        "filtering_low_pass": -1,
                        "filtering_order": 2
                },
                ...additional processing step options
        },
        "confound_options": {
                "columns": [
                        "csf",
                        "csf_derivative1",
                        "white_matter",
                        "white_matter_derivative1"
                ],
                "motion_outliers": {
                        "include": true,
                        "scrub_var": "framewise_displacement",
                        "threshold": 0.9,
                        "scrub_ahead": 0,
                        "scrub_behind": 0,
                        "scrub_contiguous": 0
                }
        },
        "batch_options": {
                "memory_usage": "20G",
                "time_usage": "2:0:0",
                "n_threads": "1"
        },
        "log_directory": "/nas/longleaf/home/user/clpipe/logs/postprocess_logs"
}

Top-Level Definitions

class clpipe.config.options.PostProcessingOptions

Options for additional processing after fMRIPrep’s preprocessing.

working_directory: str = 'SET WORKING DIRECTORY'

Directory for caching intermediary processing files.

write_process_graph: bool = True

Set ‘true’ to write a processing graph alongside your output.

target_directory: str = ''

Which directory to process - leave empty to use your config’s fMRIPrep output directory.

target_image_space: str = 'MNI152NLin2009cAsym'

Which space to use from your fmriprep output. This is the value that follows “space-” in the image file names.

target_tasks: list

Which tasks to use from your fmriprep output. This is the value that follows “task-” in the image file names. Leave blank to target all tasks.

target_acquisitions: list

Which acquisitions to use from your fmriprep output. This is the value that follows “acq-” in the image file names. Leave blank to target all acquisitions.

output_directory: str = 'data_postprocess'

Path to save your postprocessing data. Defaults to data_postprocess.

processing_steps: list

Your list of processing steps to use, in order.

processing_step_options: clpipe.config.options.ProcessingStepOptions

Configuration for each processing step.

confound_options: clpipe.config.options.ConfoundOptions

Options related to the outputted confounds file.

batch_options: clpipe.config.options.BatchOptions

Options for cluster resource usage.

log_directory: str = ''

Log output location. Not normally changed from default.

get_stream_working_dir(processing_stream: str)

Get the working directory relative to the processing stream.

get_stream_output_dir(processing_stream: str)

Get the output directory relative to the processing stream.

get_stream_log_dir(processing_stream: str)

Get the log directory relative to the processing stream.

get_pybids_db_path(processing_stream: str, index_name: str)

Get the path to the pybids index relative to the stream working dir.

Processing Step Options

Temporal Filtering

This step removes signals from an image’s timeseries based on cutoff thresholds. This transformation is also applied to your confounds.

ProcessingStepOptions Block:

"TemporalFiltering": {
        "Implementation":"fslmaths",
        "FilteringHighPass": 0.008,
        "FilteringLowPass": -1,
        "FilteringOrder": 2
}

Definitions:

class clpipe.config.options.TemporalFiltering

This step removes signals from an image’s timeseries based on cutoff thresholds. Also applied to confounds.

implementation: str = 'fslmaths'

Available implementations: fslmaths, afni_3dTproject

filtering_high_pass: float = 0.008

Frequencies below this threshold are filtered out. Defaults to 0.008 Hz. Set to -1 to disable.

filtering_low_pass: int = -1

Frequencies above this threshold are filtered out. Disabled by default (-1).

filtering_order: int = 2

Order of the filter. Defaults to 2.
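To relate the cutoff in Hz to what the fslmaths implementation consumes: the fslmaths `-bptf` option takes its filter widths as sigmas in volumes rather than frequencies, and a negative sigma skips that filter. The sketch below uses a commonly cited conversion, sigma = 1 / (2 * TR * cutoff). The function name is hypothetical, and whether clpipe applies exactly this conversion internally is an assumption.

```python
def highpass_sigma_in_volumes(cutoff_hz, tr_seconds):
    """Convert a high-pass cutoff in Hz to a filter sigma in volumes.

    Assumes the common convention sigma = 1 / (2 * TR * cutoff).
    A non-positive cutoff is treated as "disabled" and mapped to -1,
    mirroring the -1 convention used in the configuration.
    """
    if tr_seconds <= 0:
        raise ValueError("TR must be positive")
    if cutoff_hz <= 0:
        return -1.0
    return 1.0 / (2.0 * tr_seconds * cutoff_hz)


# Default cutoff from this page with a hypothetical TR of 2 seconds.
print(highpass_sigma_in_volumes(0.008, 2.0))
```

With the default 0.008 Hz cutoff, the corresponding period is 125 seconds, so signal drifting more slowly than that is attenuated.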

Special Case: Filtering with Scrubbed Timepoints

When the scrubbing step is active at the same time as temporal filtering (see ScrubTimepoints), filtering is handled with a special workflow. This is for two reasons. First, temporal filtering must be done before scrubbing, because filtering cannot tolerate NAs or non-continuous gaps in the timeseries. Second, filtering can distribute the impact of a disruptive motion artifact throughout a timeseries, even if the offending timepoints are scrubbed afterwards. The solution is to interpolate over the timepoints to be scrubbed when temporal filtering.

The following diagram shows a timeseries with a large motion artifact (blue), with the points to be scrubbed highlighted in red:

_images/filter_with_scrubs_example.png

The processed timeseries (orange), after filtering, shows how the scrubbed points were interpolated to improve the performance of the filter.

Warning: To achieve interpolation, this special case always uses the 3dTproject implementation, regardless of the implementation requested.
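The idea of interpolating over the timepoints to be scrubbed can be illustrated with a small sketch. This is not the 3dTproject algorithm: it assumes plain linear interpolation between the nearest unflagged neighbours, and the function name is hypothetical.

```python
def interpolate_over(series, scrub_indices):
    """Replace flagged samples by linear interpolation between kept neighbours."""
    flagged = set(scrub_indices)
    kept = [i for i in range(len(series)) if i not in flagged]
    if not kept:
        raise ValueError("every timepoint is flagged; nothing to interpolate from")
    result = list(series)
    for i in sorted(flagged):
        before = [k for k in kept if k < i]
        after = [k for k in kept if k > i]
        if before and after:
            lo, hi = before[-1], after[0]
            weight = (i - lo) / (hi - lo)
            result[i] = series[lo] + weight * (series[hi] - series[lo])
        else:
            # The flagged run touches an edge: copy the nearest kept value.
            nearest = before[-1] if before else after[0]
            result[i] = series[nearest]
    return result


# A toy timeseries with a two-sample motion spike at indices 1 and 2.
print(interpolate_over([1.0, 9.0, 9.0, 4.0], [1, 2]))
```

The filter then sees a smooth bridge instead of the spike, so the artifact is not smeared into neighbouring timepoints before those points are scrubbed.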

Intensity Normalization

This step normalizes the central tendency of the data to a standard scale. As data acquired from different subjects can vary in relative intensity values, this step is important for accurate group-level statistics.

ProcessingStepOptions Block

"IntensityNormalization": {
        "Implementation": "10000_GlobalMedian"
}

Definitions

class clpipe.config.options.IntensityNormalization

Normalize the intensity of the image data.

implementation: str = '10000_GlobalMedian'

Currently limited to ‘10000_GlobalMedian’
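As a rough illustration of what the implementation name suggests, the sketch below rescales an image so that its global median becomes 10000. This is an assumption based on the name alone: details such as masking, or whether the median is taken over all voxels and timepoints, are not confirmed here, and the function name is hypothetical.

```python
from statistics import median


def normalize_to_global_median(voxel_timeseries, target=10000.0):
    """Scale all values so the median over every voxel and timepoint equals target.

    voxel_timeseries is a list of per-voxel timeseries (lists of floats).
    """
    all_values = [value for series in voxel_timeseries for value in series]
    global_median = median(all_values)
    if global_median == 0:
        raise ValueError("global median is zero; cannot rescale")
    scale = target / global_median
    return [[value * scale for value in series] for series in voxel_timeseries]


# Two toy voxels with two timepoints each.
scaled = normalize_to_global_median([[100.0, 200.0], [300.0, 400.0]])
print(scaled)
```

Because a single multiplicative factor is applied to the whole image, relative differences between voxels and timepoints are preserved.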

Spatial Smoothing

This step blurs the image data across adjacent voxels. This helps improve the validity of statistical testing by smoothing over random noise in the data and enhancing the underlying brain signal.

To achieve the smoothing, a 3D Gaussian filter is applied to the data. This filter takes a kernel size as input, which is analogous to the size of the blur brush in a photo editing program.

Unsmoothed Raw Image

_images/sample_image_base.png

Smoothed with 6mm Kernel

_images/sample_image_smoothed.png

ProcessingStepOptions Block

"SpatialSmoothing": {
        "Implementation": "SUSAN",
        "FWHM": 6
}

Definitions

class clpipe.config.options.SpatialSmoothing

Apply spatial smoothing to the image data.

implementation: str = 'SUSAN'

Currently limited to ‘SUSAN’

fwhm: int = 6

The size of the smoothing kernel. Specifically the full width half max of the Gaussian kernel. Scaled in millimeters.
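For any Gaussian, the full width at half maximum and the standard deviation are related by FWHM = 2 * sqrt(2 * ln 2) * sigma, which is roughly 2.355 * sigma. The sketch below converts between the two. The function name is hypothetical, and how the chosen implementation consumes the value internally is not shown here.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's FWHM (mm) to its standard deviation (mm)."""
    if fwhm_mm < 0:
        raise ValueError("FWHM must be non-negative")
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


# The default 6 mm kernel from this page.
print(round(fwhm_to_sigma(6), 3))
```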

AROMA Regression

This step removes AROMA-identified noise artifacts from the data with non-aggressive regression.

AROMA regression relies on the presence of AROMA output artifacts in your fMRIPrep directory - they are the files with desc-MELODIC_mixing.tsv and AROMAnoiseICs.csv as suffixes. Thus, you must have the UseAROMA option enabled in your preprocessing options to use this step.

Also applies to confounds.

ProcessingStepOptions Block

"AROMARegression": {
        "Implementation": "fsl_regfilt"
}

Definitions

class clpipe.config.options.AROMARegression

Regress out automatically classified noise artifacts from the image data using AROMA. Also applied to confounds.

Confound Regression

This step regresses the contents of the postprocessed confounds file out of your data. Confounds are processed before their respective image, so regressed confounds will have any selected processing steps applied to them (such as TemporalFiltering) before this regression occurs. The columns used are those defined in the ConfoundOptions configuration block.

Confound regression is typically used for network analysis. GLM analysis instead removes these confounds through their inclusion in the model as nuisance regressors.
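To make "regressing out" concrete, here is a minimal sketch for a single voxel and a single confound column using ordinary least squares. It is an illustration only: the real step hands all selected columns to 3dTproject, and the function name and the choice to return demeaned residuals are assumptions.

```python
from statistics import mean


def regress_out(signal, confound):
    """Return the residual of signal after removing its linear fit to confound.

    Both inputs are equal-length lists of floats. The fit includes an
    intercept, so the returned residuals have zero mean.
    """
    if len(signal) != len(confound):
        raise ValueError("signal and confound must have the same length")
    signal_mean, confound_mean = mean(signal), mean(confound)
    centered = [c - confound_mean for c in confound]
    denominator = sum(c * c for c in centered)
    if denominator == 0:
        # A constant confound explains nothing beyond the mean.
        return [s - signal_mean for s in signal]
    beta = sum(c * (s - signal_mean) for c, s in zip(centered, signal)) / denominator
    return [(s - signal_mean) - beta * c for s, c in zip(signal, centered)]


# A toy signal that is partly driven by the confound.
print(regress_out([1.0, 3.0, 2.0, 5.0], [0.0, 1.0, 2.0, 3.0]))
```

The residuals are uncorrelated with the confound, which is the property network analyses rely on.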

ProcessingStepOptions Block

"ConfoundRegression": {
        "Implementation": "afni_3dTproject"
}

Definitions

class clpipe.config.options.ConfoundRegression

Regress out the confound file values from your image. If any other processing steps are relevant to the confounds, they will be applied first.

implementation: str = 'afni_3dTproject'

Currently limited to “afni_3dTproject”

Scrub Timepoints

The ScrubTimepoints step can be used to remove timepoints from the image timeseries based on a target variable from that image’s confounds file. Timepoints scrubbed from an image’s timeseries are also removed from its respective confounds file.

ProcessingStepOptions Block

"ScrubTimepoints": {
    "InsertNA": true,
    "Columns": [
        {
            "TargetVariable": "non_steady_state_outlier*",
            "Threshold": 0,
            "ScrubAhead": 0,
            "ScrubBehind": 0,
            "ScrubContiguous": 0
        },
        {
            "TargetVariable": "framewise_displacement",
            "Threshold": 0.9,
            "ScrubAhead": 0,
            "ScrubBehind": 0,
            "ScrubContiguous": 0
        }
    ]
}

Definitions

class clpipe.config.options.ScrubTimepoints

This step can be used to remove timepoints from the image timeseries based on a target variable from that image’s confounds file. Timepoints scrubbed from an image’s timeseries are also removed from its respective confounds file.

insert_na: bool = True

Set true to replace scrubbed timepoints with NA. False removes the timepoints completely.

scrub_columns: List[clpipe.config.options.ScrubColumn]

A list of columns to be scrubbed.

class clpipe.config.options.ScrubColumn

A definition for a single column to be scrubbed.

target_variable: str = 'framewise_displacement'

Which confound variable to use as a reference for scrubbing. May use wildcard (*) to select multiple similar columns.

threshold: float = 0.9

Any timepoint of the target variable exceeding this value will be scrubbed

scrub_ahead: int = 0

Set the number of timepoints to scrub ahead of target timepoints

scrub_behind: int = 0

Set the number of timepoints to scrub behind target timepoints

scrub_contiguous: int = 0

Scrub everything between scrub targets up to this far apart
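The thresholding and the two modes of InsertNA can be sketched as follows. This is a simplified illustration with hypothetical function names. It assumes that "ahead" means later timepoints and "behind" means earlier ones, uses None as a stand-in for NA, and leaves out scrub_contiguous.

```python
def find_scrub_targets(values, threshold=0.9, scrub_ahead=0, scrub_behind=0):
    """Return sorted indices of timepoints to scrub.

    values holds the target variable per timepoint; None marks a missing
    entry (for example, the first framewise displacement value).
    """
    targets = set()
    last_index = len(values) - 1
    for i, value in enumerate(values):
        if value is not None and value > threshold:
            start = max(0, i - scrub_behind)      # assumed: earlier timepoints
            stop = min(last_index, i + scrub_ahead)  # assumed: later timepoints
            targets.update(range(start, stop + 1))
    return sorted(targets)


def scrub(series, targets, insert_na=True):
    """Blank out (insert_na=True) or drop (insert_na=False) the target timepoints."""
    flagged = set(targets)
    if insert_na:
        return [None if i in flagged else v for i, v in enumerate(series)]
    return [v for i, v in enumerate(series) if i not in flagged]


fd = [None, 0.1, 1.2, 0.2, 0.3]
targets = find_scrub_targets(fd, threshold=0.9, scrub_ahead=1)
print(targets)
print(scrub([10, 11, 12, 13, 14], targets, insert_na=True))
print(scrub([10, 11, 12, 13, 14], targets, insert_na=False))
```

Keeping the length unchanged with NA placeholders preserves the timing of the remaining volumes, while dropping the timepoints shortens the series.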

Resample

This step will resample your image into the same resolution as the given ReferenceImage. Exercise caution with this step - make sure you are not unintentionally resampling to an image with a lower resolution.

ProcessingStepOptions Block

"Resample": {
        "ReferenceImage": "SET REFERENCE IMAGE"
}

Definitions

class clpipe.config.options.Resample

Resample your image to a new space.

reference_image: str = 'SET REFERENCE IMAGE'

Path to an image against which to resample - often a template

Trim Timepoints

This step performs simple trimming of timepoints from the beginning and/or end of your timeseries with no other logic. Also applies to your confounds.

ProcessingStepOptions Block

"TrimTimepoints": {
        "FromEnd": 0,
        "FromBeginning": 0
}

Definitions

class clpipe.config.options.TrimTimepoints

Trim timepoints from the beginning or end of an image. Also applied to confounds.

from_end: int = 0

Number of timepoints to trim from the end of each image.

from_beginning: int = 0

Number of timepoints to trim from the beginning of each image.
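Trimming amounts to a slice of the timeseries. The sketch below shows the idea on a plain list, with a hypothetical function name. It computes the end index explicitly because a naive `series[a:-b]` returns an empty list when `b` is 0.

```python
def trim_timepoints(series, from_beginning=0, from_end=0):
    """Drop from_beginning leading and from_end trailing timepoints."""
    if from_beginning < 0 or from_end < 0:
        raise ValueError("trim counts must be non-negative")
    # Explicit stop index avoids the series[a:-0] pitfall.
    stop = max(len(series) - from_end, 0)
    return series[from_beginning:stop]


print(trim_timepoints(list(range(10)), from_beginning=2, from_end=3))
```

The same slice would be applied to the rows of the confounds file so that image and confounds stay aligned.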

Apply Mask

This step will apply the image’s fMRIPrep mask.

Note - There is currently nothing to configure for this step, so it is simply added to the ProcessingSteps list as “ApplyMask” and does not have a section in ProcessingStepOptions

"ProcessingSteps": [
        "SpatialSmoothing",
        "TemporalFiltering",
        "IntensityNormalization",
        "ApplyMask"
]

Confounds Options

This option block defines your settings for processing the confounds file accompanying each image. A subset of the columns provided by your base fMRIPrep confounds file is chosen with the Columns list.

The MotionOutliers section is used to add spike regressors based on (usually) framewise displacement for inclusion in a GLM model. Note that this section is independent from the scrubbing step - the scrubbing step removes timepoints from both the image and the confounds, while this step adds a variable number of columns to your confounds.

Definitions

class clpipe.config.options.ConfoundOptions

The default options to apply to the confounds files.

columns: list

A list containing a subset of confound file columns to use from each image’s confound file. You may use the wildcard ‘*’ operator to select groups of columns, such as ‘csf*’

motion_outliers: clpipe.config.options.MotionOutliers

Options specific to motion outliers.
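Wildcard selection of confound columns can be sketched with Python's standard `fnmatch` module. That clpipe matches patterns in exactly this glob style is an assumption, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase


def select_columns(available, patterns):
    """Return the available column names matching any pattern, without duplicates.

    Columns are returned in pattern order, then in file order within a pattern.
    """
    selected = []
    for pattern in patterns:
        for name in available:
            if fnmatchcase(name, pattern) and name not in selected:
                selected.append(name)
    return selected


available = ["csf", "csf_derivative1", "white_matter", "framewise_displacement"]
print(select_columns(available, ["csf*", "white_matter"]))
```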

class clpipe.config.options.MotionOutliers

These options control the construction of spike regressor columns based on a particular confound column (usually framewise_displacement) and a threshold. For each timepoint of the chosen variable that exceeds the threshold, a new column of all 0s and a single ‘1’ at that timepoint is added to the end of the confounds file to serve as a spike regressor for GLM analysis.

include: bool = True

Set ‘true’ to add motion outlier spike regressors to each confound file.

scrub_var: str = 'framewise_displacement'

Which variable in the confounds file should be used to calculate motion outliers.

threshold: float = 0.9

Threshold at which to flag a timepoint as a motion outlier.

scrub_ahead: int = 0

How many time points ahead of a flagged time point should be flagged also.

scrub_behind: int = 0

How many time points behind a flagged time point should be flagged also.

scrub_contiguous: int = 0

How many good contiguous timepoints need to exist.
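The spike regressor construction described for MotionOutliers can be sketched directly: one new column per timepoint that exceeds the threshold, all zeros except for a single 1. The function name is hypothetical, None stands in for a missing value, and the scrub_ahead, scrub_behind and scrub_contiguous options are left out.

```python
def spike_regressors(values, threshold=0.9):
    """Build one indicator column per timepoint whose value exceeds threshold."""
    n_timepoints = len(values)
    columns = []
    for i, value in enumerate(values):
        if value is not None and value > threshold:
            column = [0] * n_timepoints
            column[i] = 1
            columns.append(column)
    return columns


fd = [None, 0.2, 1.5, 0.1, 0.95]
for column in spike_regressors(fd):
    print(column)
```

The number of added columns therefore varies from run to run, depending on how many timepoints cross the threshold.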

Batch Options

These options specify the cluster compute options used when submitting jobs. The default values are usually sufficient to process the data.

Definitions

class clpipe.config.options.BatchOptions

The batch settings for postprocessing.

memory_usage: str = '20G'

How much memory to allocate per job.

time_usage: str = '2:0:0'

How much time to allocate per job.

n_threads: str = '1'

How many threads to allocate per job.

Processing Streams Setup

By default, the output from running clpipe postprocess will appear in your clpipe folder at data_postprocess/default, reflecting the defaults from PostProcessingOptions.

However, you can use processing streams to deploy multiple postprocessing pipelines. Options for processing streams are found in a separate section of your configuration file, ProcessingStreams. Each processing stream you define in your config file’s ProcessingStreams block will create a new output folder named after the stream.

Within each processing stream, you can override any of the settings in the main PostProcessingOptions section. For example, in the following JSON snippet, the first processing stream defines its own list of processing steps. The second stream also defines its own steps, adding ConfoundRegression, and overrides the motion outlier setting so that spike regressors are not added to its confounds files.

Option Block

...
"processing_streams": [
        {
                "stream_name": "GLM_default",
                "postprocessing_options": {
                        "processing_steps": [
                                "SpatialSmoothing",
                                "AROMARegression",
                                "TemporalFiltering",
                                "IntensityNormalization"
                        ]
                }
        },
        {
                "stream_name": "functional_connectivity_default",
                "postprocessing_options": {
                        "processing_steps": [
                                "SpatialSmoothing",
                                "AROMARegression",
                                "TemporalFiltering",
                                "IntensityNormalization",
                                "ConfoundRegression"
                        ],
                        "confound_options": {
                                "motion_outliers": {
                                        "include": false
                                }
                        }
                }
        }
],
...
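One way to picture how a stream's overrides combine with the main options is a recursive dictionary merge, sketched below. How clpipe actually resolves overrides is not documented here, so the merge semantics (nested blocks merged key by key, lists replaced wholesale) and the function name are assumptions.

```python
import copy


def apply_stream_overrides(base, overrides):
    """Return a copy of base with overrides applied; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_stream_overrides(merged[key], value)
        else:
            # Scalars and lists (such as processing_steps) replace the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical, trimmed-down main options and stream overrides.
base = {
    "processing_steps": ["SpatialSmoothing", "TemporalFiltering"],
    "confound_options": {
        "columns": ["csf"],
        "motion_outliers": {"include": True, "threshold": 0.9},
    },
}
stream = {"confound_options": {"motion_outliers": {"include": False}}}

print(apply_stream_overrides(base, stream))
```

Under these assumptions, a stream only needs to list the settings it changes, and everything else falls back to the main options.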

Command

CLI Options

clpipe postprocess

Additional processing for GLM or connectivity analysis.

Providing no SUBJECTS will default to all subjects. List subject IDs in SUBJECTS to process specific subjects:

> clpipe postprocess 123 124 125 …

clpipe postprocess [OPTIONS] [SUBJECTS]...

Options

-config_file, -c <config_file>

Required. The path to your clpipe configuration file.

-fmriprep_dir, -i <fmriprep_dir>

Which fmriprep directory to process. If a configuration file is provided with a BIDS directory, this argument is not necessary. Note, must point to the fmriprep directory, not its parent directory.

-output_dir, -o <output_dir>

Where to put the postprocessed data. If a configuration file is provided with an output directory, this argument is not necessary.

-processing_stream, -p <processing_stream>

Specify a processing stream to use defined in your configuration file.

-log_dir <log_dir>

Where to put your HPC output files (such as SLURM output files).

-index_dir <index_dir>

Give the path to an existing pybids index database.

-refresh_index, -r

Refresh the pybids index database to reflect new fmriprep artifacts.

-batch, -no-batch

Flag to create batch jobs without prompting.

-cache, -no-cache
-submit, -s

Flag to submit commands to the HPC.

-debug, -d

Flag to enable detailed error messages and traceback.

Arguments

SUBJECTS

Optional argument(s)

Examples

Display jobs to be run without submitting.

clpipe postprocess -c clpipe_config.json

Submit jobs.

clpipe postprocess -c clpipe_config.json -submit

Submit jobs for specific subjects.

clpipe postprocess 123 124 125 -c clpipe_config.json -submit

To run a specific stream, give the -processing_stream (-p for short) option the name of the stream:

clpipe postprocess -c clpipe_config.json -p smooth_aroma-regress_filter-butterworth_normalize -submit