Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Physprep docs #123

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
119 changes: 119 additions & 0 deletions source/DERIVATIVES.md
Original file line number Diff line number Diff line change
Expand Up @@ -145,3 +145,122 @@ The above boilerplate text was automatically generated by fMRIPrep
with the express intention that users should copy and paste this
text into their manuscripts *unchanged*.
It is released under the [CC0](https://creativecommons.org/publicdomain/zero/1.0/) license.

## PhysPrep

### Overview
The physiological data were preprocessed using a pipeline developed within the lab, called Physprep.
Physprep is a pipeline that segments, cleans and processes photoplethysmography (PPG), electrocardiography (ECG),
electrodermal (EDA), and respiratory (RSP) signals from minimal user input. The Physprep pipeline integrates
open access python packages including Phys2Bids, NeuroKit2, and Systole.

### Outputs
The description of participant, session, task and event tags can be found in the Datasets section.


**Raw timeseries**

The physiological raw timeseries are located in `<dataset>` alongside the fMRI data. Metadata associated with
the physiological files can be find at the root directory under `physio.json`.

Each participant folder (`sub-*`) contains the following outputs:
- `ses-*/func`
- `<match>_physio.tsv.gz` : raw segmented biosignals.
- `<match>_physio.json` : contains tsv columns names, start time, and signal sampling frequency information.


**Preprocessed timeseries & extracted features (derivatives)**

The physiological derivatives are located in derivatives datasets following the naming convention `<dataset>.physprep`.

Metadata associated with the derivatives files can be find at the root directory under:
- `preproc_physio.json` : contains tsv columns names, start time, and signal sampling frequency information associated with the `<match>_desc-preproc_physio.tsv.gz` files in the `sub-*/ses-*/func/` folders.
- `events.json` : contains description of the columns in the tabular events files located in the `sub-*/ses-*/func/` folders.

Each participant folder (`sub-*`) contains the following outputs:
- `ses-*/func`
- `<match>_desc-preproc_physio.tsv.gz` : processed time series.
- `<match>_desc-quality.json` : quality assessment of the biosignals.
- `<match>_events.tsv` : extracted features.

### Preprocessing pipeline description
The workflow developed to process the physiological data is based on Phys2Bids 2.8.3; Scipy 1.9.0; Neurokit2 0.2.3;
Systole 0.2.4.

The following section details the post-acquisition filtering procedure used for each physiological modality.


Photoplethysmography

: The PPG filtering procedure follows recommandations given by [Elgendi et al. 2013](https://doi.org/10.1371/journal.pone.0076585).
The artefacts were removed using a bidirectional butterworth bandpass filter (low cutoff: 0.5 Hz; high cutoff: 8 Hz; order: 3).
The signal was then downsampled to 1000 Hz. Systolic peaks were then detected using the method described in
[Elgendi et al. (2013)](https://doi.org/10.1371/journal.pone.0076585), and implemented in NeuroKit2 (see
[`ppg_findpeaks`](https://neuropsychology.github.io/NeuroKit/functions/ppg.html#ppg-findpeaks) documentation). Systolic peak placements
were corrected using the peak-to-peak differences from the NeuroKit2 package: intervals between peaks outside of the 0.5-1.5 seconds
range were identified as outliers, and were corrected (see [`signal_fixpeaks`](https://neuropsychology.github.io/NeuroKit/functions/signal.html#signal-fixpeaks) documentation).


Electrocardiography

: The ECG filtering procedure was implemented as per the [manufacturer application notes](https://www.biopac.com/wp-content/uploads/app242x.pdf).
Namely, a bidirectional butterworth highpass filter (cutoff: 2 Hz; order: 2) was first apply to remove low frequency artefacts such as respiration
and baseline wander. A second-order IIR notch digital filter was performed to filter fundamental and specific harmonics (Q: 100; see
[manufacturer application notes](https://www.biopac.com/wp-content/uploads/app242x.pdf) and [Bottenhorn et al., 2021](https://doi.org/10.1101/2021.04.01.437293)).
A bidirectional butterworth lowpass filter (cutoff: 40 Hz; order: 2) was finally applied before downsampling the signal to 1000 Hz. The R-peaks were detected using a
probabilistic approach as implemented in the [NeuroKit2 ProMAC method](https://neuropsychology.github.io/NeuroKit/functions/ecg.html#ecg-peaks). R-peaks were corrected
using the same procedure as described above for the systolic peak detection.


Electrodermal activity

: The EDA filtering procedure was implemented as per [NeuroKit2 default EDA cleaning method](https://neuropsychology.github.io/NeuroKit/functions/eda.html#preprocessing).
Since EDA signal is characterised by low-frequency components and MRI gradients introduce high-frequency artefacts, a bidirectional butterworth lowpass filter
(cutoff: 3 Hz; order: 4) was employed to clean the signal, as suggested by [Privratsky et al. (2020)](https://doi.org/10.1016/j.ijpsycho.2020.09.015). However,
unlike what is proposed in that article, we refrained from applying a highpass filter to retain the tonic component of the EDA signal. The signal was then downsample
to 1000 Hz. After the filtering procedure, the EDA signal was decomposed into its phasic and tonic components using the method implemented in Biopac's Acqknowledge
(i.e. highpass filtering with a cutoff of 0.05 Hz). From the extracted phasic component, the skin conductance responses were detected by finding the local maxima in
the signal (minimum relative amplitude of 0.1).


Respiratory activity

: The RSP filtering procedure was implemented as per [Khodadad et al., 2018](https://doi.org/10.1088/1361-6579/aad7e6), which includes a bidirectional
butterworth bandpass filter (low cutoff: 0.05 Hz; high cutoff: 3 Hz; order: 2). The lower cutoff was set to preserve breathing rate higher than 3 breath
per minute. The signal was then downsample to 1000 Hz. The peaks and trouhgs were identified on the downsampled signal as per [Khondadad et al. (2018)](https://doi.org/10.1088/1361-6579/aad7e6).


### QC-ing pipeline description
In order to evaluate the accuracy of the physiological data, quality assessment was performed for each modality using the filtered signals.
These signals were analysed run-by-run in 1-minute windows. The quality assessment was generated for each run, and can be found under:
`<dataset>.physprep/sub-XX/ses-XX/func/<match>_desc-quality.json`. For a given run, if more than 80% of the 1 min windows were acceptable, the run
passed the quality assessment, otherwise it failed. Those information can be find in the `<match>_desc-quality.json` under the `QualityAssessment` key
(possible value: `Pass` and `Fail`). The details about the quality assessement for each modality are described below.

:::{important}
Even if an automatic quality assessment is provided for each run, it is the responsibility of the users to make sure the data meet their quality requirements.
:::


Cardiac signals

: We assessed the quality of cardiac signals (PPG and ECG) for each run based on normal NN intervals mean and NN intervals standard deviation. One-minute segments were classified as
acceptable if the mean of NN intervals was within the range of 600 and 1200, and if the standard deviation was below 300. Futher quality checks should be carried out to ensure that
the available cardiac signal is suitable for a given analysis.


Electrodermal activity

: We assessed the quality of the preprocessed EDA waveforms for each each run based on the criteria proposed by [Böttcher et al. (2022)](https://doi.org/10.1038/s41598-022-25949-x).
For each 2 sec window, the quality score was equal to 1 if the rate of amplitude change (RAC) was smaller than 0.2 and if the mean of the signal was greater than 0.05, otherwise the
quality score was equal to 0. To obtain quality score for the 1min windows, the quality scores of the 2 sec segments were averaged. The signal within a window was considered acceptable
if the quality score was equal to 1. Futher quality checks should be carried out to ensure that the available EDA signal is suitable for a given analysis.


Respiratory activity

: To ensure that the respiratory waveforms are within a normal range, a threshold of 0.5 Hz was used on the signal rate. If the averaged respiratory rate within a specific
window was below 0.5 Hz, the signal was considered normal. However, if the averaged respiratory rate exceeded that threshold, the signal window was considered unacceptable. Futher quality
checks should be carried out to ensure that the available respiratory signal is suitable for a given analysis.