fMRI preprocessing MATLAB toolbox version 1.8

How to use the toolbox

  1. Assume you have the Haxby's data already.
  2. Download the toolbox here and uncompress.
  3. The toolbox contains two settings:
    1. for subject#0, run main_whole_process_subj0.m (obsolete soon)
    2. for subject#1-6, run plugin_convert_haxby_data.m and then main_whole_process_subj1_6.m (preferable)

Preferable directory structure for the toolbox

The diagram below show the preferable directory structure for the toolbox:

For subject#1-6, when the compressed file of subject#X is extracted, the directory subjX is built to contain some nifti files and a label text file. We can convert such directory into the preferable/ideal one by running the program plugin_convert_haxby_data.m. Here is more details about how to run the code:

  1. First, we extract the zipped file and put all the nifti files in the same dir, say subj1. There are masks, anatomical brain, bold files in the directory.
  2. Make sure that the label file is in "category_name run#" format. In Haxby's label.txt, you need to remove the first line, saying "label chunk", in the txt file.
  3. Specify the file path in the code plugin_convert_haxby_data.m and run it. This step can be pretty long when the file is huge:
    1. macbook pro 2011 + 8GB memory takes 600 sec
    2. Ubuntu machine with lots of memory takes only 16 sec
  4. At the end of the process, the dir subj1 will contain
    1. lots of nifti files already there
    2. dir matlab_format containing:
      1. bold.mat
      2. experiment_design.mat
    3. dir nifti_out containing lots of nifti file, each file for each run

For subject#0, the bold files are separated into several runs already, so we will just need to put them in nifti_out within subjX.

With the ideal format, you can move forward to use the toolbox. Although the document below explains based on the subject#0, the processing is the same for other subjects too.

fMRI preprocessing MATLAB toolbox version 1.8

The main file is main_whole_process_subj0.m and (main_whole_process_subj1_6.m) which is an ensemble of a few sub-routines listed below:

  1. main_convertDataToVTFormat
  2. main_convertMaskToVTFormat
  3. main_preProcessFMRIData
  4. main_convertDataToProperFormat

and the overview of the whole process is shown in the diagram below.

(click at the figure for better resolution.)

The process shown here is perfect for the "subject0" from Haxby's data set. However, for subject 1 to 6, we need plugin to convert the format.

Convert Nifti file into VT format in MATLAB workspace

main_convertDataToVTFormat is a routine that reads each nifti file (.nii), assuming that each file is a run of an experiment, and convert the content into VT-format in matlab workspace. This process reads all the voxels in the nifti file, that is, the whole scan, and does not care about mask or ROI. This routine is a wrapper program calling the function convertDataVTFormat which convert a nifti file into nifti format.

input: <input file name>_r<run#>.nii

output: <input file name>_r<run#>_VT.mat

% the fMRI in VT format out.VT = VT; % fMRI in VT format out.sizeX = sizeX; out.sizeY = sizeY; out.sizeZ = sizeZ; out.sizeT = sizeT; out.coorX = coorX; out.coorY = coorY; out.coorZ = coorZ;

Definition: Voxel-Time series (VT) format is a N by T matrix; where

N: the number of voxels in the whole scan (not just in the whole brain)

T: the number of time stamps

Therefore, the entry (i,j) of such a matrix represents the brain response at the j-th time stamp of the i-th voxel.

The order of voxels is the same as matlab operation X(:).

Import ROI mask (in nifti format) into MATLAB workspace

Goal: to import region of interest (ROI) mask file into a binary mask file in MATLAB.

Given: Haxby provides ROI masks of

  • the whole brain in 'wholebrain.nii' and
  • the ventrical temporal cortex in 'mask_cat_select_vt.nii'

both are in nifti format.

We have the routine "main_convertMaskToVTFormat" to convert a mask into MATLAB space with output file MaskHaxby8_<region name>.mat:

  • mask_bin: is 1 x N vector whose entry is binary; 1 for the voxel is in the ROI, and 0 otherwise. N here is the number of voxels in the whole scan space.
  • voxelID_total: the id (integer from 1 to N) of each voxel in the whole scan space. The order is the same as MATLAB "(:)" operation.
  • voxelID_mask: Only the ids of those in the ROI mask, that is, voxelID_mask is a subset of voxelID_total.
  • coord_scan: the xyz coordinates of all the voxels in the scan space
  • coord_mask: the xyz coordinates of the voxels in the ROI, with the same order of voxelOD_mask

Extract the brain activation (beta) coefficients

The routine main_preProcessFMRIData performs the following processes for every voxel in the whole scan:

  1. Regress out both offset and the drift from each voxel individually
  2. Calculate z-score (standardize) from the time series, each voxel individually
  3. Calculate the beta map for each voxel individually

The last process requires the design/experiment matrix in order to separate each "block" of the experiment, extract the beta from each block and determine the category label for each of them. In summary, this step takes:

input: <input file name>_r<run#>_VT.mat

output: <input file name>_r<run#>_beta.mat containing a beta matrix for each voxel in the same run#

The program handles one run# at a time, and does not require ROI mask. In fact, this step is applied for all voxels in the whole scan.

Here are more details about the process:

In the first step, we remove the offset and the linear drift from the time series of each voxel individually using linear regression, implemented in regressOut. The difference of time series between before and after the process can be found in the figure below.

In step 2, we then calculate the z-score along each time series before calculating the beta coefficients in the next step.

Step 3 is to extract the beta coefficients for each voxel, and requires the run# and category# for each example, which can be obtained from the file tutorial_runs.mat and tutorial_regs.mat respectively. The run# and category# indicators look like the figure below. The top figure illustrates the run# for each example. The bottom figure indicates the category for each example.

The program handles only 1 run at a time, and the design matrix will be built based on the categories for each run. The design matrix will need to be convolved with the haemodynamic response function (HRF) before being used as predictor to calculate beta. The figures show the design matrix before and after convolving with HRF.

We then make a predictor matrix X for run#1 such that the first 2 columns of X are the offset (all 1's) and linear drift predictor (1 to T). The program extracts the run# and the category# automatically, so the user do not need to change anything except just to make sure the format of the experiment design is correct.

The beta for each column can be calculated in closed-form using the equation

Where X is the design matrix for each run and y is the time series for each voxel in the same run. The beta_hat is the estimator of the beta coefficient. The beta calculation is implemented in the code calculateBetaMap, which is similar to regressOut.

The figure below shows the time series in each step of the process.

The beta coefficients for each voxel in run#1 are shown below:

Don't be surprised that the order of the output beta is well arranged already from category 1, 2, ... 8. That is because the design matrix is arranged in that way.

Again, this step does not care about the ROI mask. In fact, the ROI mask will be used in the next step.

Rearrange the beta matrix in a proper format

There are two formats that we use:

  1. Detail format: a 3D matrix of the dimension #voxels_in_mask x (2 + #categories) x #runs. The "2" refers to the coefficient for offset and linear drift. This format is good when you want all the coefficients, including the ones from offset and drift, in a separate run.
  2. Classification format: a 2D matrix of dimension #experiments x #voxels_in_mask. #experiments is the number of all trials/observation, for example, in this case we have #experiments = 80 (= 8 categories * 10 runs). This format is convenient for classification task, shown in the figure below.

(click at the figure for better resolution.)

In this step the user can pick an ROI mask from the step 2 to obtain only the area of interest.

input:

  • the beta matrix file for each run: <input file name>_r<run#>_beta.mat
  • the ROI mask file from step2: mask_<region>.mat

output:

  • Detail format: <filename>_beta_<region>_Classification_format.mat
  • Classification format: <filename>_beta_<region>_Detail_format.mat

The output from Classification format contains the following variables

'beta_matrix',... % the beta matrix 'num_category',...% number of categories in the experiment 'num_run',...% number of runs 'run_id',... % the run# for each experiment 'class_vector'... % the category label for each experiment

which are plotted in the figure above.

The final directory structure

When all the processe are applied, the final directory structure is shown below:

For classification purpose, I recommend using Haxby8_beta_vtc_Classification_format.mat in which the beta coefficient matrix beta_matrix is stored.

Display the data

In the file display_data.m, you can see how to load and display the results.

Above is the averaged beta coefficients across experiments plotted in xyz brain space.