fmristat
A general statistical analysis for fMRI data

The statistical analysis of the fMRI data was based on a linear model with correlated errors. The design matrix of the linear model was first convolved with a hemodynamic response function modeled as a difference of two gamma functions timed to coincide with the acquisition of each slice. Drift was removed by adding polynomial covariates in the frame times, up to degree 3(?), to the design matrix. The correlation structure was modeled as an autoregressive process of degree 1. At each voxel, the autocorrelation parameter was estimated from the least squares residuals using the Yule-Walker equations, after a bias correction for correlations induced by the linear model. The autocorrelation parameter was first regularized by spatial smoothing with a 15(?)mm FWHM Gaussian filter, then used to `whiten' the data and the design matrix. The linear model was then re-estimated using least squares on the whitened data to produce estimates of effects and their standard errors.

In a second step, runs, sessions and subjects were combined using a mixed effects linear model for the effects (as data) with fixed effects standard deviations taken from the previous analysis. This was fitted using ReML implemented by the EM algorithm. A random effects analysis was performed by first estimating the the ratio of the random effects variance to the fixed effects variance, then regularizing this ratio by spatial smoothing with a 15(?)mm FWHM Gaussian filter. The variance of the effect was then estimated by the smoothed ratio multiplied by the fixed effects variance to achieve higher degrees of freedom.

The resulting T statistic images were thresholded using the minimum given by a Bonferroni correction and random field theory.

References

Liao, C., Worsley, K.J., Poline, J-B., Aston, J.A.D., Duncan, G.H., Evans, A.C. (2002). Estimating the delay of the fMRI response. NeuroImage, 16:593-606. POSTER (PowerPoint)

Worsley, K.J., Liao, C., Aston, J., Petre, V., Duncan, G.H., Morales, F., Evans, A.C. (2002). A general statistical analysis for fMRI data. NeuroImage, 15:1-15. POSTER (PowerPoint)

PowerPoint presentation of fMRIstat

The latest release of fmristat

5 August, 2002: A few bug fixes in reading and writing analyze format files. A new program efficiency to calculate efficiencies and optimum designs - see Efficiency and choosing the best design.

25 April, 2002: Fmrilm has been extended to allow for different confounds for different slices. This is useful for effective connectivity, where a new function fmri_interp has been added to resample a single fmri time series at different frame times and slice times. See Effective connectivity of all voxels with a reference.

12 April, 2002: Fmristat now reads ANALYZE format files created on any machine, and the file seperator can be / or \.

8 April, 2002: Fmrilm now reads multiple 3D image files as well as a single 4D file - useful for ANALYZE format data.

Fmristat now allows for non-isotropic (non-constant FWHM) data. Many of the FWHM parameters can now be replaced by FWHM images. Fmristat now works for 2D data (one slice) as well as 3D data (more than one slice).

Fmrilm and multistat have been extended to estimate the effective FWHM of the data both from fixed and random effects. This allows for non-isotropic inference.

Multistat has been extended to estimate mixed effects conjunctions. It is now very fast if no sdeffect files are specified, making it suitable for analyzing PET data.

Locmax now finds clusters as well as local maxima.

Stat_threshold has been extended to find P-values and thresholds for conjunctions, and it now adjusts P-values and thresholds for clusters if the cluster extent is measured in resels and the resels are themselves estimated by fmrilm or multistat .

Example contains all the matlab commands for the example used here.

Data contains all the data for the example used here.

New functions

Fmri_mask_threshold finds a 'nice' threshold for masking the brain based on the first scan of the raw fMRI data.

Pca_image does an exploratory Principal Components Analysis (PCA).

Glass_brain produces a maximum intensity projection or 'glass brain'.

Fdr_threshold finds Q-values and thresholds that control the false discovery rate.

Stat_summary produces an SPM-stlye summary of clusters, peaks and their P-values and Q-values.

Conf_region produces approximate confidence regions for the location of peaks in a statistic image.

Gauss_blur blurs 3D images with a Gaussian filter.

Looking at the fMRI data using pca_image

A Principal Components Analysis (PCA) of the fMRI data is sometimes useful as an exploratory tool that does not require any model for the data. It can reveal unexpected signals, or outlying frames. The PCA writes the data as a sum of temporal components times spatial components, chosen to best approximate the fMRI data. To define the brain, we use the first slice of the fMRI data as a mask. fmri_mask_thresh.m tries to find a nice threshold, in this case 452. Pca_image produces the following graph of the first 4 components: input_file='g:/keith/data/brian_ha_19971124_1_100326_mri_MC.mnc'; mask_file=input_file; mask_thresh=fmri_mask_thresh(mask_file); pca_image(input_file, [], 4, mask_file, mask_thresh); colormap(spectral);

The first component, with an sd of 105.7 that accounts for 77.8% of the variance of the data, has extreme temporal values in the first few frames, perhaps because the EPI scanner has not reached a steady state. The spatial component is roughly the difference between the first few frames and the average of the others. The second component appears to capture a linear drift. We can see this more clearly by excluding the first two frames as follows:

exclude=[1 2]; V=pca_image(input_file, exclude, 4, mask_file, mask_thresh);

Now the first component appears to have captured the linear drift around the edges of the brain, with an sd of 27.8 that explains 22.9% of the variability of the fMRI data. Note that signs are arbitrary, e.g. we can change the signs of both the time and space components. This component is saved in the first column of V; we could use it later on as the CONFOUND parameter to fmrilm to remove drift, instead of, or in addition to removing polynomial drift.

The next two components, with sd's of 17.6 and 15.7, appear to capture a cyclic pattern with 10 cycles that might be related to the 10 repetitions of the heat stimulus paradigm. This cyclic pattern is expressed in the supplementary motor areas which is confirmed below by a model-driven analysis. The fourth component, with an sd of 12.1, appears to be noise.

Pca_image can also do a PCA on the coefficients from a linear model specified by a design matrix X_interest, after removing effects from another linear model specified by a design matrix X_remove (Worsley et al., 1998). As an example, we might try to confine the PCA to the 12 scans in each block. To do this, we set X_interest to a design matrix for the 12 time points within each block. We might also remove a constant and the linear drift already detected as the first component of the PCA above:

X_remove=[ones(120,1) (1:120)']; X_interest=repmat(eye(12),10,1); pca_image(input_file, exclude, 4, mask_file, mask_thresh, X_remove, X_interest);

The stimulus was 3 frames each of rest, hot, rest, warm, repeated 10 times, but delayed 6s (2 frames) by the hemodynamic response. The first component appears to have captured a warm minus hot stimulus effect. The second component, though much smaller, appears to be capturing a 6s delay of the first component. The rest appear to be noise.

Worsley, K.J., Poline, J.B., Friston, K.J. and Evans, A.C. (1998). Characterizing the response of PET and fMRI data using Multivariate Linear Models (MLM). NeuroImage , 6:305-319.

Plotting the hemodynamic response function (hrf) using fmridesign

The hrf is modeled as the difference of two gamma density functions (Glover, G.H. (1999). "Deconvolution of impulse response in event-related BOLD fMRI." NeuroImage, 9:416-429). The parameters of the hrf are specified by a row vector whose elements are:

1. PEAK1: time to the peak of the first gamma density;
2. FWHM1: approximate FWHM of the first gamma density;
3. PEAK2: time to the peak of the second gamma density;
4. FWHM2: approximate FWHM of the second gamma density;
5. DIP: coefficient of the second gamma density. The final hrf is: gamma1/max(gamma1)-DIP*gamma2/max(gamma2) scaled so that its total integral is 1.
If PEAK1=0 then there is no smoothing of that event type with the hrf. If PEAK1>0 but FWHM1=0 then the design is lagged by PEAK1. The default, chosen by Glover (1999) for an auditory stimulus, is:

hrf_parameters=[5.4 5.2 10.8 7.35 0.35] To look at the hemodynamic response function, try:
time=(0:240)/10; hrf0=fmridesign(time,0,[1 0],[],hrf_parameters); plot(time,hrf0.X,'LineWidth',2) xlabel('time (seconds)') ylabel('hrf') title('Glover hrf model')

Making the design matrices using fmridesign

Specifying all the frame times and slice times is now obligatory. For 120 scans, separated by 3 seconds, and 13 interleaved slices every 0.12 seconds use:

frametimes=(0:119)*3; slicetimes=[0.14 0.98 0.26 1.10 0.38 1.22 0.50 1.34 0.62 1.46 0.74 1.58 0.86]; The events are specified by a matrix of EVENTS, or directly by a stimulus design matrix S, or both. EVENTS is a matrix whose rows are events and whose columns are:

1. EVENTID - an integer from 1:(number of events) to identify event type;
2. EVENTIMES - start of event, synchronized with frame and slice times;
3. DURATION (optional - default is 0) - duration of event;
4. HEIGHT (optional - default is 1) - height of response for event.

For each event type, the response is a box function starting at the event times, with the specified durations and heights, convolved with the hemodynamic response function (see above). If the duration is zero, the response is the hemodynamic response function whose integral is the specified height - useful for ‘instantaneous’ stimuli such as visual stimuli. The response is then subsampled at the appropriate frame and slice times to create a design matrix for each slice, whose columns correspond to the event id number. EVENT_TIMES=[] will ignore event times and just use the stimulus design matrix S (see next).

This is a sample run on Brian Ha's data, which was a block design of 3 scans rest, 3 scans hot stimulus, 3 scans rest, 3 scans warm stimulus, repeated 10 times (120 scans total). The hot event is identified by 1, and the warm event by 2. The events start at times 9,27,45,63... and each has a duration of 9s, with equal height:

eventid=kron(ones(10,1),[1; 2]); eventimes=(0:19)'*18+9; duration=ones(20,1)*9; height=ones(20,1); events=[eventid eventimes duration height] Events can also be supplied by a stimulus design matrix S (not recommended), whose rows are the frames, and columns are the event types. Events are created for each column, beginning at the frame time for each row of S, with a duration equal to the time to the next frame, and a height equal to the value of S for that row and column. Note that a constant term is not usually required, since it is removed by the polynomial trend terms provided N_POLY>=0. For the same experiment:
S=kron(ones(10,1),kron([0 0; 1 0; 0 0; 0 1],ones(3,1))); Either of these give the same cache of design matrices, X_CACHE, a 4-D array (Dim 1: frames; Dim 2: response variables; Dim 3: derivative + 1 ; Dim 4: slices):
X_cache=fmridesign(frametimes,slicetimes,events,[],hrf_parameters); X_cache=fmridesign(frametimes,slicetimes, [] , S,hrf_parameters); or just
X_cache=fmridesign(frametimes,slicetimes,events); if you are using the default hrf parameters. To look at the two columns of the design matrix for the 4th slice:
plot(squeeze(X_cache.X(:,:,1,4)),'LineWidth',2) legend('Hot','Warm') xlabel('frame number') ylabel('response') title('Hot and Warm responses')

Analysing one run with fmrilm

First set up the contrasts. CONTRAST is a matrix whose rows are contrasts for the statistic images, with row length equal to the number regressor variables in X_CACHE. We wish to look for regions responding to the hot stimulus, the warm stimulus, and the difference hot-warm:

contrast=[1 0; 0 1; 1 -1]; EXCLUDE is a list of frames that should be excluded from the analysis. This must be used with Siemens EPI scans to remove the first few frames, which do not represent steady-state images. Default is [1], but [1 2] is better:

exclude=[1 2]; WHICH_STATS is a logical matrix indicating output statistics by 1. Rows correspond to rows of CONTRAST, columns correspond to:

1: T statistic image, OUTPUT_FILE_BASE_mag_t.mnc or .img (for contrasts in the delays, mag is replaced by del in the extension).
The degrees of freedom is DF. Note that tstat=effect/sdeffect. 2: effect image, OUTPUT_FILE_BASE_mag_ef.mnc or .img.
3: standard deviation of the effect, OUTPUT_FILE_BASE_mag_sd.mnc or .img.
4: F-statistic image, OUTPUT_FILE_BASE_mag_F.mnc or .img, for testing all rows of CONTRAST that deal with magnitudes (see CONTRAST_IS_DELAY below). The degrees of freedom are [DF, P].
5: the temporal autocorrelation(s), OUTPUT_FILE_BASE_rho.mnc or .img.
6: the residuals from the model, OUTPUT_FILE_BASE_resid.mnc or .img, only for the non-excluded frames (warning: uses lots of memory).
7: the whitened residuals from the model, OUTPUT_FILE_BASE_wresid.mnc or .img, only for the non-excluded frames (warning: uses lots of memory).
8: the AR parameter(s) a_1 ... a_p, in OUTPUT_FILE_BASE_A.mnc or .img.
9: the estimated FWHM in OUTPUT_FILE_BASE_FWHM.mnc or .img - note this uses more time and memory.

If WHICH_STATS to a row vector of length 9, then it is used for all contrasts; if the number of columns is less than 9 it is padded with zeros. Default is 1.

To get t statistics, effects and their sd's, and the F statistic for testing for any effect of either hot or warm vs baseline, the autocorrelation, and the FWHM:

which_stats=[1 1 1 1 1 0 0 0 1] Input minc file for one session, and output base for each row of CONTRAST (hot for hot, wrm for warm, hmw for hot-warm):

input_file='g:/keith/data/brian_ha_19971124_1_100326_mri_MC.mnc'; output_file_base=['g:/keith/results/ha_100326_hot'; 'g:/keith/results/ha_100326_wrm'; 'g:/keith/results/ha_100326_hmw']; The big run:

[df1 p]=fmrilm(input_file, output_file_base, X_cache, contrast, exclude, which_stats); and the output gives the degrees of freedom of the T statistic images (df1=112) and the F statistic image (df1=112 , p=2). Note that p=2, not 3, because the third contrast is a linear combination of the first two. The files generated are:

g:/keith/results/ha_100326_hot_rho.mnc g:/keith/results/ha_100326_hmw_mag_t.mnc g:/keith/results/ha_100326_hmw_mag_ef.mnc g:/keith/results/ha_100326_hmw_mag_sd.mnc g:/keith/results/ha_100326_hot_mag_t.mnc g:/keith/results/ha_100326_hot_mag_ef.mnc g:/keith/results/ha_100326_hot_mag_sd.mnc g:/keith/results/ha_100326_wrm_mag_t.mnc g:/keith/results/ha_100326_wrm_mag_ef.mnc g:/keith/results/ha_100326_wrm_mag_sd.mnc g:/keith/results/ha_100326_hot_fwhm.mnc

Visualizing the results using glass_brain

To look at slice 4 (the slice used in Worsley et al., 2001) of the t-statistic for hot-warm try

t_file='g:/keith/results/ha_100326_hmw_mag_t.mnc'; m=fmris_read_image(t_file,4,1); imagesc(m.data'))',[-6 6]); colorbar; axis xy; colormap(spectral);

To see the brain using the first slice of the fMRI data as a mask:

a=fmris_read_image(mask_file,4,1); imagesc((m.data.*(a.data>mask_thresh))',[-6 6]); colorbar; axis xy;

To get an SPM-style maximum intensity projection ('glass brain') above 3 try:

glass_brain(t_file,3,mask_file,mask_thresh);

F-tests

F-tests should only be used when we are interested in any linear combination of the contrasts. For example, an F-test would be appropriate for detecting regions with high polynomial drift, since we would be interested in either a linear, quadratic or cubic trend, or any linear combination of these. The drift terms appear after the regressor terms, in the order constant, linear, quadratic, cubic etc. So to test for a drift, use the contrast contrast=[0 0 0 1 0 0; 0 0 0 0 1 0; 0 0 0 0 0 1] which_stats=[0 0 0 1]; output_file_base='g:/keith/results/ha_100326_drift'; fwhm_rho='g:/keith/results/ha_100326_hot_rho.mnc'; fmrilm(input_file,output_file_base,X_cache,contrast,exclude,which_stats,fwhm_rho); Note that by replacing the fwhm_rho parameter (15mm by default) with the file name g:/keith/results/ha_100326_hot_rho.mnc we avoid re-calculating the autocorrelations. Another good use of the F-test is for detecting effects when the hemodynamic response is modeled by a set of basis functions (see below).

A linear effect of temperature

Instead of using just two values (hot=49^oC, and low=35^oC), suppose the temperature of the stimulus varied 'randomly' over the 20 blocks, taking 5 equally spaced values between 35 and 49:
temperature=[45.5 35.0 49.0 38.5 42.0 49.0 35.0 42.0 38.5 45.5 ... 38.5 49.0 35.0 45.5 42.0 45.5 38.5 42.0 35.0 49.0]'; To model a linear effect of temperature (see Figure (a) below), use one event type with height=1 and another with a height=temperature:
events=[zeros(20,1)+1 eventimes duration ones(20,1); zeros(20,1)+2 eventimes duration temperature] The following contrasts will estimate first the slope of the temperature effect (per ^oC), then the hot, warm and hot-warm effects exactly as before:
contrast=[0 1; 1 49; 1 35; 0 14];

To model a quadratic effect of temperature (Figure b), add a third event type with a height equal to the temperature²:

events=[zeros(20,1)+1 eventimes duration ones(20,1); zeros(20,1)+2 eventimes duration temperature; zeros(20,1)+3 eventimes duration temperature.^2] contrast=[0 0 1]; The contrast will test for a quadratic effect. Taking this further, we may wish to test for a cubic (Figure c), quartic (Figure d) or even a 'non-linear' effect of temperature, where the temperature effects are arbitrary. Note that the quartic model is identical to the arbitrary model because a quartic can be fitted exactly to any arbitrary 5 points. To model the arbitrary effect, simply assign a different event type from 1 to 5 for each of the 5 different temperature values: events=[floor((temperature-35)/3.5)+1 eventimes duration ones(20,1)] To look for any effect of the stimulus compared to baseline, use:
contrast=[1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0; 0 0 0 0 1]; contrast=[eye(5) ones(5,4)]; which_stats=[0 0 0 1]; The resulting F statistic image will have 5 and 109 d.f.. To see if the changes in temperature have any effect, subtract the average from each row of the contrasts:
contrast=[.8 -.2 -.2 -.2 -.2; -.2 .8 -.2 -.2 -.2; -.2 -.2 .8 -.2 -.2; -.2 -.2 -.2 .8 -.2; -.2 -.2 -.2 -.2 .8]; contrast=[eye(5)-ones(5)/5 ones(5,4)]; which_stats=[0 0 0 1]; The resulting F statistic image will have 4 and 109 d.f.. To see if the changes in temperature have a linear effect, subtract the average from the 5 temperature values:
contrast=[-7.0 -3.5 0 3.5 7.0]; which_stats=[1 1 1]; This will produce a T statistic with 109 d.f.. How is this different from the previous linear effect? The effect is the same, but the standard deviation may be slightly different. The reason is that in this model, we are basing the standard deviation on errors about each temperature value; in the first analysis, it is based on errors about a linear model. As a result, the degrees of freedom is slightly different: 112 before, but 109 here.

Combining runs/sessions/subjects with multistat

Repeat previous analysis for hot-warm for the three other sessions (note that the same X_CACHE is used for each):

contrast=[1 -1]; input_file='g:/keith/data/brian_ha_19971124_1_093923_mri_MC.mnc'; output_file_base='g:/keith/results/ha_093923_hmw'; [df2 p]=fmrilm(input_file, output_file_base, X_cache, contrast, exclude, which_stats); input_file='g:/keith/data/brian_ha_19971124_1_101410_mri_MC.mnc'; output_file_base='g:/keith/results/ha_101410_hmw'; [df3 p]=fmrilm(input_file, output_file_base, X_cache, contrast, exclude, which_stats); input_file='g:/keith/data/brian_ha_19971124_1_102703_mri_MC.mnc'; output_file_base='g:/keith/results/ha_102703_hmw'; [df4 p]=fmrilm(input_file, output_file_base, X_cache, contrast, exclude, which_stats); Now we can average the different sessions as follows. NOTE THAT ALL FILES MUST HAVE EXACTLY THE SAME SHAPE (SLICES, COLUMNS, ROWS). If not, reshape them! This is done by again setting up design matrix and a contrast:

X=[1 1 1 1]'; contrast=[1]; which_stats=[1 1 1]; input_files_effect=['g:/keith/results/ha_093923_hmw_mag_ef.mnc'; 'g:/keith/results/ha_100326_hmw_mag_ef.mnc'; 'g:/keith/results/ha_101410_hmw_mag_ef.mnc'; 'g:/keith/results/ha_102703_hmw_mag_ef.mnc']; input_files_sdeffect=['g:/keith/results/ha_093923_hmw_mag_sd.mnc'; 'g:/keith/results/ha_100326_hmw_mag_sd.mnc'; 'g:/keith/results/ha_101410_hmw_mag_sd.mnc'; 'g:/keith/results/ha_102703_hmw_mag_sd.mnc']; output_file_base='g:/keith/results/ha_multi_hmw' For the final run, we need to specify the row vector of degrees of freedom of the input files, printed out by fmrilm.m:

input_files_df=[df1 df2 df3 df4] To get an idea of the FWHM, look at one of the FWHM files. The FWHM is close to 6 mm outside the brain (due to motion correction), but a value of input_files_fwhm = 8 mm seems more reasonable for the FWHM in the brain:

m=fmris_read_image('g:/keith/results/ha_100326_hot_fwhm.mnc',4,1); imagesc(m.data',[0 20]); colorbar; axis xy; input_files_fwhm=8;

These two parameters are only used to calculate the final degrees of freedom, and don't affect the images very much. Finally, for a fixed effects analysis (not recommended!) we set the last parameter to Inf (see later). The final run is:

df=multistat(input_files_effect,input_files_sdeffect,input_files_df, ... input_files_fwhm,X,contrast,output_file_base,which_stats,Inf) Note that the program prints and returns the final degrees of freedom of the tstat image, which is df=448 (see discussion on fixed and random effects).

For a more elaborate analysis, e.g. comparing the first two sessions with the next two, you can do it by:

X=[1 1 0 0; 0 0 1 1]'; contrast=[1 -1];

Converting the analysis to percent of raw fmri data using fmri2pcnt

To compare effects and sd's from different runs or different subjects, it might be better to convert the data to percentages of raw fmri data first, that is, divide all the data at each voxel by the average over non-exculded frames at that voxel, times 100%. However there is a much simpler way: just run the analysis on the raw data, then convert the results. Only the ef and sd files (and resid if requested) need to be converted, all the rest, inculding the t files, will be the same. This can be done using fmri2pcnt. For example, fmri2pcnt('g:/keith/results/ha_100326_hmw_mag_ef.mnc',input_file,exclude) fmri2pcnt('g:/keith/results/ha_100326_hmw_mag_sd.mnc',input_file,exclude) Note that only the data inside the brain is converted to percent, to avoid spurious large values outside the brain. Top do this, fmri2pcnt uses fmri_mask_thresh to find a suitable threshold (or you can supply your own) to mask the brain using the first slice of input_file.

Fmri2pcnt will also accept a matrix of file names to be converted. It will then convert each file to percent, and return a matrix of the names of the converted files. So to redo the analysis on percentages using multistat, try this:

input_files_effect_pcnt =fmri2pcnt(input_files_effect ,input_file,exclude) input_files_sdeffect_pcnt=fmri2pcnt(input_files_sdeffect,input_file,exclude) output_file_base='g:/keith/results/ha_multi_hmw_pcnt' df=multistat(input_files_effect_pcnt,input_files_sdeffect_pcnt,input_files_df, ... input_files_fwhm,X,contrast,output_file_base,which_stats,Inf)

Fixed and random effects

In the above run, we did a fixed effect analysis, that is, the analysis is only valid for the particular group of 4 sessions on Brian Ha. It is not valid for an arbitrary session on Brian Ha, nor for the population of subjects in general. It may turn out that the underlying effects themselves vary from session to session, and from subject to subject. To take this extra source of variability into account, we should do a random effects analysis.

The ratio of the random effects standard error, divided by the fixed effects standard error, is outputed as ha_multi_sdratio.mnc. If there is no random effect, then sdratio should be roughly 1. If there is a random effect, sdratio will be greater than 1. To allow for the random effects, we multiply the sdeffect by sdratio, and divide the tstat by sdratio. The only problem is that sdratio is itself extremely noisy due to its low degrees of freedom, which results in a loss of efficiency. One way around this is to assume that the underlying sdratio is fairly smooth, so it can be better estimated by smoothing sdratio. This will increase its degrees of freedom, thus increasing efficiency. Note however that too much smoothing could introduce bias.

The amount of smoothing is controlled by the final parameter, fwhm_varatio. It is the FWHM in mm of the Gaussian filter used to smooth the ratio of the random effects variance divided by the fixed effects variance. 0 will do no smoothing, and give a purely random effects analysis. Inf will do complete smoothing to a global ratio of one, giving a purely fixed effects analysis (which we did above). The higher the fwhm_varatio, the higher the ultimate degrees of freedom of the tstat image (printed out before a pause), and the more sensitive the test. However too much smoothing will bias the results. The default is 15, but a value of 20 mm gives a final df that is reasonably high. To invoke this, just run

which_stats=[1 1 1 1 0 0 0 0 1]; [df, df_resid]=multistat(input_files_effect,input_files_sdeffect,input_files_df, ... input_files_fwhm,X,contrast,output_file_base,which_stats,20) Note that the degrees of freedom of tstat is now reduced to df=112. The sd ratio of random to fixed effects looks like this:

m=fmris_read_image('g:/keith/results/ha_multi_hmw_sdratio.mnc',4,1); imagesc(m.data'); colorbar; axis xy;

Note that it is roughly 1 outside the brain and at most places inside, indicating little random effect over the runs. The random effect sd is about 60% higher than the fixed effect sd (sdratio = 1.6) in posterior regions. The sdratio is much higher (~3) over subjects.

The effective FWHM over runs looks very similar to that over scans, but a lot more noisy, so some smoothing with gauss_blur makes it easier to interpret:

m=fmris_read_image('g:/keith/results/ha_multi_hmw_fwhm.mnc',4,1); blured_fwhm=gauss_blur('g:/keith/results/ha_multi_hmw_fwhm.mnc',10); mb=fmris_read_image(blured_fwhm,4,1); imagesc([m.data' mb.data'],[0 20]); colorbar; axis xy;

Reference:

Worsley, K.J. (2002). Non-stationary FWHM and its effect on statistical inference of fMRI data. NeuroImage, submitted.

Thresholding the tstat image with stat_threshold and fdr_threshold

Suppose we search g:/keith/results/ha_multi_hmw_t.mnc for local maxima inside a search region of 1000cc. The voxel volume is 2.34375*2.34375*7= 38.4521, so the number of voxels is 1000000/38.4521=26000. There is roughly 8 mm smoothing. The degrees of freedom returned by multistat, is 112. The significance is the standard 0.05. The threshold to use is

stat_threshold(1000000,26000,8,112,0.05) which gives 4.86. The resulting thresholded tstat is shown in the logo at the top of this page, rendered using IRIS Explorer. For the critical size of clusters above a threshold of say 3 use

stat_threshold(1000000,26000,8,112,0.05,3) which gives peak_threshold=4.86 as above and extent_threshold=789 mm^3. This means that any cluster of neighbouring voxels above a threshold of 3 with a volume larger than 789 mm^3 is significant at P<0.05. Peak_threshold_1 = 4.17 is the threshold for a single peak chosen in advance of looking at the data, e.g. the nearest cluster to a pre-chosen voxel or ROI, extent_threshold_1 = 262 mm^3 is the extent of a single cluster chosen in advance - see Friston (1997).

Often the FWHM is estimated as an image rather than a fixed value. The extra variability in this estimate increases P-values and thresholds. It does not have much effect on P-values and thresholds for local maxima if the search region is large, but it can have a strong effect on P-values and thresholds for spatial extent, since clusters are small. The random noise in the FWHM is measured by the degrees of freedom of the linear model used to estimate it. For example, if the FWHM comes from a single run of fmrilm, then the df of the estimated FWHM is 112. To incorporate this into stat_threshold, add an extra row to the df:

stat_threshold(1000000,26000,8,[112; 112],0.05,3) The thresholds for local maxima are the same, but the thresholds for spatial extent increase to 841 mm^3 and 265 mm^3 for a pre-chosen cluster. If the FWHM comes from multistat, two degrees of freedom are required: the first is the residual degrees of freedom df_resid, equal to the number of runs - number of parameters in the model = 4 - 1 = 3 (output as the second parameter of multistat), and the second is df, the effective degrees of freedom after smoothing the sdratio = 112 (output as the first parameter of multistat). These are put into the second row of a 2 x 2 matrix as follows (the 0 indicates that we want calculations for a t statistic): stat_threshold(1000000,26000,8,[112 0; 3 112],0.05,3) Because of the much more variable estimate of the fwhm, the thresholds jump to 1282 mm^3 and 284 mm^3, respectively.

Fdr_threshold calculates the threshold for a t or F image for controlling the false discovery rate (FDR). The FDR is the expected proportion of false positives among the voxels above the threshold. This threshold is higher than that for controlling the expected proportion of false positives in the whole search volume, and usually lower than the Bonferroni threshold (printed out as BON_THRESH) which controls the probability that *any* voxels are above the threshold. The threshold depends on the data as well as the search region, but does not depend on the smoothness of the data. We must specify a search volume, here chosen as the first frame of the raw fMRI data thresholded at a value of mask_thresh=452:

input_file='g:/keith/results/ha_multi_hmw_t.mnc'; mask_file='g:/keith/data/brian_ha_19971124_1_100326_mri_MC.mnc'; fdr_threshold(input_file,[], mask_file,mask_thresh,112,0.05) The result is a threshold of 2.92, considerably lower than the random field / Bonferroni threshold of 4.86 above. But remember that the interpretation is different: 95% of the voxels above 2.92 are true signal (on average), whereas the 19 times out of 20, there will be *no* false positives in the voxels above 4.86.

References:

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57:289-300.

Friston, K.J. (1997). Testing for anatomically specified regional effects. Human Brain Mapping, 5:133-136.

Genovese et al. (2001). Threshold determination using the false discovery rate. Neuroimage, 13:S124.

Worsley, K.J., Marrett, S., Neelin, P., Vandal, A.C., Friston, K.J., and Evans, A.C. (1996). A Unified Statistical Approach for Determining Significant Signals in Images of Cerebral Activation. Human Brain Mapping, 4:58-73.

Locating peaks and clusters with locmax

To locate peaks and clusters masked by the first slice if the fMRI data, use
stat_file='g:/keith/results/ha_multi_hmw_t.mnc'; mask_file='g:/keith/data/brian_ha_19971124_1_100326_mri_MC.mnc'; lm=locmax(stat_file, 4.86, mask_file, mask_thresh) which gives a list of local maxima above 4.86, sorted in descending order. The first column of LM is the values of the local maxima, and the next three are the x,y,z voxel coordinates in `register' format, i.e. starting at zero. The 5th column is a cluster id, and the 6th is the cluster volume. To find P-values for these local maxima, replace the 0.05 by the list of peak values from the first column of LM:
pval=stat_threshold(1000000,26000,6,112,lm(:,1))

Producing an SPM style summary with stat_summary

To look at all peaks and clusters above a threshold corresponding to an uncorrected P-value of 0.001, together with a complete listing of P-values and false discovery rates (based on the local fwhm), use:
fwhm_file='g:/keith/results/ha_multi_hmw_fwhm.mnc'; stat_summary(stat_file,fwhm_file,[112 0; 3 112],mask_file,mask_thresh,0.001) The columns are:

clus: index of cluster, in descending order of cluster volume.

vol: volume of cluster in mm^3.

resel: resels of cluster.

p-val: P-value of cluster extent.

(one): P-value if the cluster was chosen in advance, e.g. nearest to an ROI.

peak: values of local maxima, sorted in descending order.

p-val: P-value of local maxima if less than 0.2, otherwise expected number of false positive peaks.

(one): P-value if the peak was chosen in advance, e.g. nearest to an ROI.

q-val: Q-value or false discovery rate ~ probability that voxel is not signal.

x,y,z: coords of local maxima starting at 0, as in 'register'.
clus vol resel p-val (one) peak p-val (one) q-val x y z 1 20341 15.43 0 ( 0) 7.8 0 ( 0) 0 59 81 1 1 20341 15.43 0 ( 0) 7.48 0 ( 0) 0 58 73 1 1 20341 15.43 0 ( 0) 7.33 0 ( 0) 0 62 75 1 1 20341 15.43 0 ( 0) 7.17 0 ( 0) 0 61 74 2 1 20341 15.43 0 ( 0) 6.84 0 ( 0) 0 57 70 5 1 20341 15.43 0 ( 0) 6.55 0 ( 0) 0 62 66 4 1 20341 15.43 0 ( 0) 6.19 0 ( 0) 0 61 69 3 1 20341 15.43 0 ( 0) 6.12 0 ( 0) 0 61 70 4 1 20341 15.43 0 ( 0) 6.04 0 ( 0) 0 56 80 0 1 20341 15.43 0 ( 0) 5.85 0.001 ( 0) 0 56 89 1 1 20341 15.43 0 ( 0) 5.76 0.001 ( 0) 0 59 67 4 1 20341 15.43 0 ( 0) 5.44 0.005 (0.001) 0 54 84 2 1 20341 15.43 0 ( 0) 5.27 0.01 (0.002) 0 54 74 1 1 20341 15.43 0 ( 0) 5.14 0.018 (0.003) 0 57 61 5 1 20341 15.43 0 ( 0) 4.86 0.06 (0.007) 0 56 86 1 1 20341 15.43 0 ( 0) 4.66 0.136 (0.014) 0 55 75 0 1 20341 15.43 0 ( 0) 4.65 0.139 (0.015) 0 56 76 0 1 20341 15.43 0 ( 0) 4.21 0.806 (0.062) 0.002 52 83 2 1 20341 15.43 0 ( 0) 3.89 2.229 (0.159) 0.004 57 83 4 1 20341 15.43 0 ( 0) 3.89 2.239 (0.159) 0.004 54 79 2 1 20341 15.43 0 ( 0) 3.7 3.863 (0.273) 0.007 55 71 2 1 20341 15.43 0 ( 0) 3.63 4.597 (0.325) 0.009 52 93 2 2 8075 10.1 0 ( 0) 5.58 0.003 ( 0) 0 39 62 10 2 8075 10.1 0 ( 0) 5.47 0.004 (0.001) 0 52 61 10 2 8075 10.1 0 ( 0) 5.28 0.01 (0.002) 0 40 63 10 2 8075 10.1 0 ( 0) 5.14 0.018 (0.003) 0 51 61 9 2 8075 10.1 0 ( 0) 5.04 0.028 (0.004) 0 47 62 8 2 8075 10.1 0 ( 0) 4.74 0.099 (0.011) 0 46 61 9 2 8075 10.1 0 ( 0) 4.6 0.174 (0.018) 0.001 51 66 9 2 8075 10.1 0 ( 0) 3.78 3.052 (0.216) 0.006 48 71 9 2 8075 10.1 0 ( 0) 3.74 3.464 (0.245) 0.007 49 69 9 2 8075 10.1 0 ( 0) 3.47 6.925 (0.487) 0.013 38 60 8 2 8075 10.1 0 ( 0) 3.27 11.279 (0.787) 0.022 48 68 10 2 8075 10.1 0 ( 0) 3.22 12.668 (0.882) 0.025 37 61 8 3 4883 4.14 0.003 ( 0) 6.48 0 ( 0) 0 75 61 8 3 4883 4.14 0.003 ( 0) 5.46 0.004 (0.001) 0 81 66 9 3 4883 4.14 0.003 ( 0) 5.04 0.028 (0.004) 0 85 64 9 3 4883 4.14 0.003 ( 0) 4.91 0.048 (0.006) 0 85 65 8 3 4883 4.14 0.003 ( 0) 4.27 0.63 (0.051) 0.001 88 67 7 3 4883 4.14 0.003 ( 0) 4.24 0.698 (0.056) 0.002 78 65 8 3 4883 4.14 0.003 ( 0) 4.17 0.942 (0.071) 0.002 78 62 9 3 4883 4.14 0.003 ( 0) 3.66 4.281 (0.303) 0.008 86 65 6 3 4883 4.14 0.003 ( 0) 3.63 4.607 (0.325) 0.009 87 67 6 4 2115 3.26 0.007 (0.001) 5.32 0.008 (0.001) 0 71 61 10 4 2115 3.26 0.007 (0.001) 4.83 0.068 (0.008) 0 72 69 9 4 2115 3.26 0.007 (0.001) 3.8 2.892 (0.205) 0.006 73 73 9 4 2115 3.26 0.007 (0.001) 3.75 3.332 (0.236) 0.006 74 72 9 5 1423 2.4 0.018 (0.001) 5.76 0.001 ( 0) 0 67 85 1 5 1423 2.4 0.018 (0.001) 5.21 0.013 (0.002) 0 67 84 2 5 1423 2.4 0.018 (0.001) 5.17 0.016 (0.002) 0 68 85 2 5 1423 2.4 0.018 (0.001) 4.44 0.327 ( 0.03) 0.001 68 92 1 6 1154 1.42 0.073 (0.006) 4.64 0.144 (0.015) 0 40 77 8 6 1154 1.42 0.073 (0.006) 4.42 0.353 (0.032) 0.001 44 79 8 6 1154 1.42 0.073 (0.006) 4.16 0.952 (0.071) 0.002 40 80 7 7 923 0.91 0.184 (0.016) 5.87 0.001 ( 0) 0 67 79 2 8 692 0.43 0.526 (0.058) 4 1.646 (0.117) 0.003 76 48 4 8 692 0.43 0.526 (0.058) 3.58 5.215 (0.368) 0.01 78 53 4 8 692 0.43 0.526 (0.058) 3.5 6.437 (0.453) 0.012 77 51 4 9 308 0.2 0.844 (0.144) 4.44 0.323 (0.029) 0.001 40 67 4 10 154 0.13 0.937 (0.215) 3.39 8.48 (0.594) 0.016 46 55 10 11 154 0.09 0.979 (0.301) 3.59 5.107 ( 0.36) 0.01 69 70 7 12 154 0.14 0.927 (0.204) 3.91 2.118 (0.151) 0.004 48 46 5 13 154 0.24 0.788 ( 0.12) 4.52 0.24 (0.023) 0.001 45 74 4 14 154 0.21 0.835 ( 0.14) 3.77 3.165 (0.224) 0.006 79 73 2 15 115 0.35 0.621 (0.075) 3.33 9.942 (0.695) 0.019 71 78 7 16 115 0.18 0.874 (0.161) 3.76 3.267 (0.232) 0.006 57 78 5 17 115 0.28 0.723 ( 0.1) 3.39 8.462 (0.593) 0.016 90 83 5 18 77 0.21 0.833 (0.139) 3.57 5.423 (0.382) 0.01 67 66 11 19 77 0.09 0.975 (0.286) 3.61 4.818 ( 0.34) 0.009 43 65 9 20 77 0.05 0.995 (0.416) 3.44 7.509 (0.527) 0.014 65 65 9 21 77 0.07 0.987 (0.337) 3.54 5.841 (0.411) 0.011 47 49 6 22 77 0.02 0.999 (0.586) 3.43 7.816 (0.548) 0.015 42 68 3 23 38 0.05 0.995 (0.406) 3.28 11.131 (0.777) 0.022 76 88 11 24 38 0.04 0.997 (0.462) 3.32 10.023 ( 0.7) 0.02 68 67 10 25 38 0.04 0.997 ( 0.46) 3.4 8.386 (0.588) 0.016 74 53 9 26 38 0.07 0.99 (0.355) 3.41 8.028 (0.563) 0.016 45 77 7 27 38 0.03 0.998 (0.498) 3.21 12.875 (0.896) 0.025 67 90 2 A glass brain is produced as well:

Confidence regions for the spatial location of local maxima using conf_region

The spatial location of a peak also has a random error due to the noise component of the images. This error is roughly equal to sqrt(4log(2))/(FWHM * PEAK), where FWHM is the effective fwhm of the data, and PEAK is the height of the peak. Note that higher peaks are more accurately located, as you would expect. This standard error assumes that the signal has a Gaussian spatial shape with a fwhm that matches FWHM. If this is not the case, a more robust method is based on the spatial shape of the peak itself. Instead of estimating the standard error, we can find a confidence region for the location by thresholding the peak at a value of sqrt( peak^2 - chisq_pvalue), where the probability of a chisq random variable with 3 df exceeding chisq_pvalue is the desired P-value (chisq_pvalue = 7.81 for P-value = 0.05). This is implemented by conf_region, which produces a 95% (P=0.05) confidence region image with values equal to the peak value in each confidence region. Thresholding this image at a value of say 4.86 in register or glass_brain, produces an image of confidence regions for all peaks above 4.86:

conf_region(stat_file,4.86,mask_file,mask_thresh) glass_brain('g:/keith/results/ha_multi_hmw_t_95cr.mnc', ... 4.86,mask_file,mask_thresh);

Reference:

Ma, L., Worsley, K.J. and Evans, A.C. (1999). Variability of spatial location of activation in fMRI and PET CBF images. NeuroImage, 9:S178. (POSTER)

Conjunctions

An alternative to averaging the runs using multistat as above, is to find the common regions that are significantly activated in all runs. These are called conjunctions, and they can be found by thresholding the image of the minimum of the four T statistic images for each run. To incorporate random effects, we use the mixed effects standard deviation from multistat, rather than the fixed effects standard deviation from fmrilm. This can be calculated directly from multistat by setting the fifth component of which_stats to 1. P-values and thresholds can be obtained from stat_threshold, fdr_threshold and finally stat_summary by setting the last parameter to the number of conjunctions, here 4 (note that P-values and thresholds for clusters of conjunction images are not available yet):

which_stats=[0 0 0 0 1]; multistat(input_files_effect,input_files_sdeffect,input_files_df, ... input_files_fwhm,X,contrast,output_file_base,which_stats,20) stat_summary('g:/keith/results/ha_multi_hmw_conj.mnc', fwhm_file, ... [112 0; 3 112], mask_file, mask_thresh, 0.001, 1, 4); clus vol resel p-val(one) peak p-val (one) q-val x y z 1 11151 8.61 NaN (NaN) 3.38 0 ( 0) 0 58 72 1 1 11151 8.61 NaN (NaN) 2.55 0 ( 0) 0 62 75 1 1 11151 8.61 NaN (NaN) 2.43 0 ( 0) 0 58 81 1 1 11151 8.61 NaN (NaN) 2.2 0.001 ( 0) 0 56 86 1 1 11151 8.61 NaN (NaN) 1.99 0.011 (0.002) 0 56 79 0 1 11151 8.61 NaN (NaN) 1.79 0.067 (0.009) 0.001 57 70 5 etc.

Note that the peaks and thresholds are a lot lower than for a conventional T statistic image; the P=0.05 threshold for a 1000000mm^3 brain as above is

stat_threshold(1000000,26000,8,112,0.05,0.001,0.05,4) which gives 1.80, much lower than 4.86. Even allowing for this, there seems to be less significant activation in the conjunction (minimum) image than in the average image, perhaps because the subject was not consistently showing activation in all four runs.

Extracting values from a minc file using extract

You may want to look at the effects and their standard deviations at some of the local maxima just found, for example the peak with value 6.55 in slice 5 with voxel coordinates [62 66 4]:
voxel=[62 66 4] effect=extract(voxel,'g:/keith/results/ha_multi_hmw_ef.mnc') sdeffect=extract(voxel,'g:/keith/results/ha_multi_hmw_sd.mnc') which gives 5.1157 and 0.8627 respectively. Note that 5.1157/0.7810 = 6.5502, the value of the T statistic. A matrix of voxels can be given, and the corresponding vector of values will be extracted.

To look at the time course of the data for the first run, and compare it to the fitted values, try this:

ref_data=squeeze(extract(voxel,'g:/keith/data/brian_ha_19971124_1_100326_mri_MC.mnc')); ef_hot=extract(voxel,'g:/keith/results/ha_100326_hot_mag_ef.mnc') ef_wrm=extract(voxel,'g:/keith/results/ha_100326_wrm_mag_ef.mnc') fitted=mean(ref_data)+ef_hot*X_cache(:,1,1,5)+ef_wrm*X_cache(:,2,1,5); plot(frametimes,[ref_data fitted],'LineWidth',2); ylim([920 960]); legend('Reference data','Fitted values'); xlabel('time (seconds) from start of epoch'); ylabel('fMRI response');

Estimating the time course of the response

We may be intersted in estimating the time course of the response, rather than just the data as above. The easiest way to do this might be to average the responses beginning at the start of each `epoch' of the response, e.g. averaging the 12 scans starting at scans 1, 13, 25, ... 109. This does not remove drift, nor does it give you a standard error. A better, though more complex way, is to replace the block design of 2 blocks (9s hot, 9s warm) with say 12 blocks of 3s each covering the entire epoch, then estimate their effects with NO convolution by the hrf: eventid=kron(ones(10,1),(1:12)'); eventimes=frametimes'; duration=ones(120,1)*3; height=ones(120,1); events=[eventid eventimes duration height] X_bases=fmridesign(frametimes,slicetimes,events,[],zeros(1,5)); plot(squeeze(X_bases(:,1:4,1,5)),'LineWidth',2) legend('Time01','Time02','Time03','Time04') xlabel('frame number') ylabel('response') title('Slice 5, first 4 times only')

The advantage of this is that it is much more flexible; it can be applied to event related designs, with randomly timed events, even with events almost overlaping, and the duration of the `events' need not be equal to the TR. Moreover, the results from separate runs/sessions/subjects can be combined using multistat.

Now all the time course is covered by events, and the baseline has disappeared, so we must contrast each event with the average of all the other events, as follows:

contrast=[eye(12)-ones(12)/12]; exclude=[1 2]; which_stats=[0 1 1 1]; input_file='g:/keith/data/brian_ha_19971124_1_100326_mri_MC.mnc'; output_file_base=['g:/keith/results/ha_100326_time01'; 'g:/keith/results/ha_100326_time02'; 'g:/keith/results/ha_100326_time03'; 'g:/keith/results/ha_100326_time04'; 'g:/keith/results/ha_100326_time05'; 'g:/keith/results/ha_100326_time06'; 'g:/keith/results/ha_100326_time07'; 'g:/keith/results/ha_100326_time08'; 'g:/keith/results/ha_100326_time09'; 'g:/keith/results/ha_100326_time10'; 'g:/keith/results/ha_100326_time11'; 'g:/keith/results/ha_100326_time12']; fmrilm(input_file,output_file_base,X_bases,contrast,exclude,which_stats,fwhm_rho); We need 12 output files, but only the effect and its standard error. The F-statistic has been requested so we can find voxels with significant responses. Its critical threshold is:

stat_threshold(1000000,26000,6,[11 103]) which gives 5.18. The local maxima above this are:

lm=locmax([output_file_base(1,:) '_mag_F.mnc'],5.18) The extracted effects and their standard errors near the previous voxel=[63 66 4] are:

voxel=[63 66 4] values=extract(voxel,output_file_base,'_mag_ef.mnc') sd=extract(voxel,output_file_base,'_mag_sd.mnc') Plotting these out against time gives the estimated response, together with 1 sd error bars. Note that the times are offset by the slicetimes, so they are in seconds starting from the beginning of slice acquisition. On top of this we can add the modeled response by extracting the effects of the hot and warm stimuli, then multiplying these effects by the hrf convolved with the 9s blocks for hot and warm stimuli (also shown):

b_hot=extract(voxel,'g:/keith/results/ha_100326_hot_mag_ef.mnc') b_wrm=extract(voxel,'g:/keith/results/ha_100326_wrm_mag_ef.mnc') time=(1:360)/10; X_hrf=fmridesign(time,0,[1 9 9 1],[],hrf_parameters); hrf=squeeze(X_hrf(:,1,1)); plot((0:12)*3+slicetimes(voxel(3)),values([1:12 1]),'k', ... [0:11; 0:11]*3+slicetimes(voxel(3)), [values+sd; values-sd],'g', ... time,[zeros(1,90) ones(1,90) zeros(1,180)],'r', ... time,[zeros(1,270) ones(1,90)],'b', ... time,hrf*b_hot+hrf([181:360 1:180])*b_wrm,'g'); legend('Estimated response','Modeled response'); text(10,0.5,'Hot') text(28,0.5,'Warm') xlabel('time (seconds) from start of epoch'); ylabel('fMRI response');

The modeled response and the fitted response are quite close. Note that the graph `wraps around' in time.

Estimating the delay

Rather than the whole response at one voxel, we may prefer just one parameter, the delay, estimated at all voxels. Delays are shifts of the time origin of the HRF, measured in seconds. CONTRAST_IS_DELAY is a logical vector indicating if the row of CONTRAST is a contrast in the delays (1) or magnitudes (0). Note that you cannot estimate delays of the polynomial terms or confounds. Note that F statistics are not yet available with this option. If the length of CONTRAST_IS_DELAY is less then the number of contrasts, it is padded with zeros. Default is 0, i.e. all contrasts refer to magnitudes. Here is an example to estimate the delays of the hot, the warm and the difference hot-warm stimuli for the above data. In this case every contrast is a delay, so CONTRAST_IS_DELAY = [1 1 1]:

contrast=[1 0; 0 1; 1 -1] which_stats=[1 1 1] input_file='g:/keith/data/brian_ha_19971124_1_100326_mri_MC.mnc'; output_file_base=['g:/keith/results/ha_100326_hot'; 'g:/keith/results/ha_100326_wrm'; 'g:/keith/results/ha_100326_hmw'] n_poly=3; fwhm_rho='g:/keith/results/ha_100326_hot_rho.mnc'; confounds=[]; contrast_is_delay=[1 1 1]; fmrilm(input_file, output_file_base, X_cache, contrast, exclude, ... which_stats, fwhm_rho, n_poly, confounds, contrast_is_delay); The result is the following files: g:/keith/results/ha_100326_hmw_del_t.mnc g:/keith/results/ha_100326_hmw_del_ef.mnc g:/keith/results/ha_100326_hmw_del_sd.mnc g:/keith/results/ha_100326_hot_del_t.mnc g:/keith/results/ha_100326_hot_del_ef.mnc g:/keith/results/ha_100326_hot_del_sd.mnc g:/keith/results/ha_100326_wrm_del_t.mnc g:/keith/results/ha_100326_wrm_del_ef.mnc g:/keith/results/ha_100326_wrm_del_sd.mnc To visualize the results in MATLAB, try this:

t=fmris_read_image('g:/keith/results/ha_100326_hot_mag_t.mnc',4,1); d=fmris_read_image('g:/keith/results/ha_100326_hot_del_ef.mnc',4,1); s=fmris_read_image('g:/keith/results/ha_100326_hot_del_sd.mnc',4,1); a=fmris_read_image(input_file,5,1); imagesc([(d.data.*(t.data>4.86)+(a.data<mask_thresh)*0.1)' ... (s.data.*(t.data>4.86)+(a.data<mask_thresh)*0.1)']); colorbar; axis xy;

The first image is the delay, in seconds, where the t-statistic for the magnitude is greater than 4.86. The brain is outlined by thesholding the first fMRI scan at mask_thresh=452. The second image gives the sd of the delay. The delay shify is roughly 0.5 +/- 0.6 seconds, indicating no significant delay shift from the Glover HRF.

If a contrast involves the delays of more than one response, e.g. the third contrast which compares the delays of the hot and warm response, then it only makes sense to look at places where the magnitudes of all the responses involved are large. To do this, try:

t2=fmris_read_image('g:/keith/results/ha_100326_wrm_mag_t.mnc',4,1); sig_resp=min(t.data,t2.data)>4.86; d2=fmris_read_image('g:/keith/results/ha_100326_hmw_del_ef.mnc',4,1); s2=fmris_read_image('g:/keith/results/ha_100326_hmw_del_sd.mnc',4,1); imagesc([(d2.data.*sig_resp+(a.data<mask_thresh)*0.1)' ... (s2.data.*sig_resp+(a.data<mask_thresh)*0.1)']); colorbar; axis xy; but unfortunately there are no points where both the hot and the warm stimulus are exceed the 4.86 threshold, so we can't compare the delays.

Note that if you ask for magnitudes as well as delays in the same analysis, i.e.

contrast=[1 0; 0 1; 1 -1; 1 0; 0 1; 1 -1] contrast_is_delay=[0 0 0 1 1 1]; then the magnitude images are not quite the same as what you would get from fmrilm without asking for delays, i.e. contrast=[1 0; 0 1; 1 -1] contrast_is_delay=[0 0 0]; The reason is that in order to calculate delays, fmrilm replaces the convolution of the stimulus with the hrf by the convolution of the stimulus with two basis functions for the hrf, in the linear model. It uses these two basis functions for every response that has a non-zero value in any contrast in the delays, otherwise it just uses one basis function (the hrf itself). If you want to force two basis functions into the model for any extra responses, even though you don't want to estimate delays in these responses, see the next parameter NUM_HRF_BASES.

The best way of visualizing delays is to use register like this:

register g:/keith/results/ha_100326_hot_mag_t.mnc g:/keith/results/ha_100326_hot_del_ef.mnc After synchronising the two images, you can then search the tstat image in the first window for regions of high signal e.g. >4.86, then read off the estimated delay at that location from the second window. A fancier rendering is shown below colouring the outer surface of the activated regions in top and cut-away front views.

You could now put delay effects and sdeffects from separate runs/sessions/subjects through multistat as before.

Efficiency and choosing the best design

The efficiency of a design for an effect is inversely proportional to the sd of that effect using such a design. The program efficiency calculates the sd of an effect without needing any data. It simply executes all the code of fmrilm that calculates the sd, setting the sd of the noise to 1, so the returned sd is relative to the sd of the noise. The call is a lot like fmrilm except that you don't supply input_file or output_file_base, but you must supply details of the design through X_cache from fmridesign, the contrast, and the frames to exclude. Because it works out the sd at a single voxel, it is quite fast. It returns a matrix of sd's for each contrast (rows) and slice (columns). For the design above: efficiency(X_cache, contrast, exclude) ans = Columns 1 through 8 0.1558 0.1565 0.1559 0.1566 0.1560 0.1567 0.1561 0.1567 0.1619 0.1618 0.1619 0.1617 0.1619 0.1617 0.1618 0.1616 0.1918 0.1916 0.1918 0.1916 0.1918 0.1916 0.1917 0.1915 Columns 9 through 13 0.1562 0.1567 0.1563 0.1567 0.1564 0.1618 0.1615 0.1618 0.1613 0.1618 0.1917 0.1915 0.1917 0.1914 0.1917 The sd of the hot stimulus is about 0.156 in all 13 slices, the warm stimulus is about the same (0.162) and the sd for the hot-warm stimulus is larger (0.192) by about sqrt(2) since it is the difference of two effects.

You can also get the actual linear combination of the data that is used to estimate these effects. These weights show how the frames are actually contributing to the effect. The array Y (second component of the output) contains the weights for each frame (first dimension), contrast (second dimension) and frame (third dimension). For the first 40 frames of slice 4:

[sd_ef, Y]=efficiency(X_cache, contrast, exclude); slice=4; plot(squeeze(X_cache.X(:,:,1,4)),'LineWidth',2) hold on; plot(Y(:,:,4)'/max(max(Y(:,:,4))),':','LineWidth',2); hold off xlim([0 40]) legend('Hot resp','Warm resp','Hot wt','Warm wt','Hot - Warm wt') xlabel('frame number') ylabel('response') title('Hot and Warm responses and weights')

The hot weights don't quite match the hot response because they are adjusted for the warm response, which is not quite orthogonal (if they were orthogonal, the weights would match the responses). The weights of the hot-warm are in fact the hot weights minus the warm weights. The weights are also adjusted for the drift as well; this is more evident if we have longer blocks, e.g. the extreme case of two 90 second blocks instead of twenty 9 second blocks:

events=[1 90 90 1; 2 270 90 1] X_cache1=fmridesign(frametimes,slicetimes,events,[],hrf_parameters); [sd_ef, Y]=efficiency(X_cache1, contrast, exclude); sd_ef Columns 1 through 8 0.2770 0.2782 0.2771 0.2784 0.2773 0.2786 0.2775 0.2787 0.4297 0.4288 0.4295 0.4287 0.4294 0.4286 0.4292 0.4285 0.5460 0.5467 0.5461 0.5469 0.5461 0.5470 0.5462 0.5471 Columns 9 through 13 0.2777 0.2789 0.2778 0.2790 0.2780 0.4291 0.4283 0.4290 0.4282 0.4289 0.5464 0.5472 0.5465 0.5472 0.5466 slice=4; plot(squeeze(X_cache1.X(:,:,1,slice)),'LineWidth',2) hold on; plot(Y(:,:,4)'/max(max(Y(:,:,slice))),':','LineWidth',2); hold off legend('Hot resp','Warm resp','Hot wt','Warm wt','Hot - Warm wt') xlabel('frame number') ylabel('response') title('Hot and Warm responses and weights')

Now the sd of the effects are much larger (this is a less efficient design), particularly for the warm stimulus. Part of the reason is that the responses are now correlated (confounded) with the cubic drift. This can be seen more clearly in the plot of the weights, which are concentrated at the beginning and end of the blocks (particularly for the warm stimulus). In other words, because the long blocks look a bit like drift, most of the information about the response comes from the frames when the stimulus is changing rapidly. The steady state of the response in the middle of the blocks is not providing much information because it is indistinguishable from (condfounded with) drift. This also explains why this design is less efficient: the extra frames in the middle of the blocks are wasted because they don't contribute to the effect. This is reflected in the increased sd of the effects, as already noted.

efficiency can also be used to find the sd of delays as well as magnitudes. Interestingly, the sd is in seconds irrespective of the data. This sounds a little strange, but if you think about it, delay estimation does not depend on the scaling of the data. To do this, just set contrast_is_delay to 1 where you want sd's of delays instead of magnitudes. For a temporal correlation of 0.3 (typical of cortex), cubic drift, and no confounds:

rho=0.3; n_poly=3; confounds=[]; contrast_is_delay=[1 1 1]; efficiency(X_cache, contrast, exclude, rho, n_poly, confounds, contrast_is_delay) ans = Columns 1 through 8 0.5543 0.5542 0.5543 0.5542 0.5542 0.5542 0.5542 0.5543 0.5564 0.5593 0.5566 0.5598 0.5570 0.5603 0.5574 0.5608 0.8700 0.8724 0.8701 0.8728 0.8703 0.8733 0.8706 0.8738 Columns 9 through 13 0.5542 0.5544 0.5542 0.5544 0.5542 0.5579 0.5612 0.5583 0.5618 0.5588 0.8710 0.8742 0.8714 0.8747 0.8718 which gives you about 0.55 seconds sd for estimating delays, and about 0.87 seconds for their difference.

By looping through a variety of choices of stimulus duration and interstimulus interval, you can find out which combination gives the best design. The code is given in example.m, but the result is:

(a) A stimulus duration and interstimulus interval of about 8 seconds seems to be optimal for estimating the hot stimulus (the same is true of the warm stimulus). Fortunatley any values in the range 6 to 16 seem to be close to optimal.
(b) A stimulus duration of about 9 seconds and NO interstimulus interval seems to be optimal for hot-warm - this makes sense, because there is no need for the rest periods between the hot and warm stimuli. Again anything in the range 5 to 20 seems to be equally good, and short stimuli should be avoided.
(c) For delays, shorter bloocks are optimal, attaining sd's of about 0.2 seconds. Note that for very short blocks the sd of the magnitude is too large to reliably estimate the delay (black areas).
(d) Again shorter blocks seem to be better for delay differences.

Event related designs can be handled by setting the duration to zero. For a fixed number of events, we tried three different designs: equally spaced events ('uniform'), randomly spaced events ('random'), and all the events concentrated together at a single time point in the middle of the frames ('concentrated'). This last design is an extreme case, equivalent to a single event with magnitude = number of events. We then varied the number of events, and plotted the sd of the effect against the average time between events = time interval for the experiment (360s) divided by the number of events. Again the code is in example.m and the results are shown here:

For the magnitude (solid lines), and well seperated events, uniform and random designs give much the same increase in sd as the time between events increases. This is because as the time between events increases, the number of events decreases, and the sd is proportional to 1/sqrt(number of events) if the events are well separated. The uniform design seems to be optimumal at about 12s between events and gets less efficient as the time between events gets shorter than 12s. This is because after blurring by the HRF, the 'rest' periods get shorter and so there is less baseline to compare the stimulus with. On the other hand, the random design continues to show improvement as the time between events decreases because there is always some baseline present even after blurring with the HRF.

For the delays (dottted lines), it is interesting to note that the uniform design appears to be better, whereas the random design is better for the magnitude. Presumably the more consistent spacing between uniform events is better for estimating the shape of the HRF, including the delay. Also shorter times between events are better, but the results for very short times may be unreliable because the response of closely spaced events may be less than that of widely spaced responses, i.e. the additivity of responses may break down. For example, 10 presentations of a visual stimulus 0.1s apart might evoke less total response than 10 presentations 10s apart. The same thing might happen for blocks: 1s of continuous pain might evoke less total response than 10 well-separated presentations of 0.1s of pain.

This is exagerated when we force all the events very close together at a single time point (red lines). The sd of the effect is now inversely proportional to the number of events concentrated at the point, which is itself inversley propotional to the average time between events, so the sd of the effect is directly proportional to the average time between events. According to the graph, we reach the absurd conclusion that this is by far the best strategy! This shows that the additivity assumption must be wrong.

Conclusion: For block designs, try 8 or 9s blocks; for event-related designs, try about 12s or more between events, either uniformly or randomly spaced. For delays, shorter blocks and times between events are better. Fortunately halving or doubling these values produces only slightly less efficient designs, so there is great flexibility in the design parameters.

Effective connectivity of all voxels with a reference voxel

We might be interested in all voxels that are correlated with the time course at a pre-chosen reference voxel, removing the effect of the signal, to find those voxels that are effectively connected. To do this, we add the data from the pre-chosen voxel (ref_data) as a confound, then estimate its effect, sd, and t statistic as before. Note that ref_data is measured at times that depend on the slicetimes for that slice, i.e. slice 5. For other slices, ref_data must be resampled at the same frametimes and slicetimes as the fmri data, so use fmri_interp to do this:
input_file='g:/keith/data/brian_ha_19971124_1_100326_mri_MC.mnc'; output_file_base='g:/keith/results/ha_100326_connect'; contrast=[0 0 0 0 0 0 1]; which_stats=[1 1 1]; voxel=[62 66 4]; ref_data=squeeze(extract(voxel,input_file)); ref_times=frametimes'+slicetimes(voxel(3)+1); confounds=fmri_interp(ref_times,ref_data,frametimes,slicetimes); fmrilm(input_file, output_file_base, X_cache, contrast, exclude, ... which_stats,fwhm_rho,n_poly,confounds); m=fmris_read_image('g:/keith/results/ha_100326_connect_mag_t.mnc',4,1); imagesc((m.data.*(a.data>mask_thresh))'); colorbar; axis xy;

Note that voxels in the neighbourhood of the pre-chosen voxel are obviously highly correlated due to smoothness of the image. To avoid very large t statistics, the t statistic image has a ceiling of 100 (the F statistic image has a ceiling of 10000).

Local maxima of this image could be identified, their values extracted and added as columns to ref_data to find further voxels that are correlated with the first reference voxel removing the effect of the second reference voxel. This could be repeated to map out the connectivity.

The same idea could be used with multistat to look for connectivity across runs or subjects; simply add an extra column to the design matrix X that contains the subject or run values at the reference voxel, and set the contrast to choose that column by puting a value of 1 for that column and zero for the others.

Higher order autoregressive models

The last parameter of FMRILM is NUMLAGS, the number of lags or order of the autoregressive model for the temporal correlation structure. Order 1 (the default) seems to be adequate for 1.5T data, but higher order models may be needed for 3T data. To assess this, the first NUMLAGS autoregression parameters can be outputted as 'frames' by setting the 8th element of WHICH_STATS to 1. These parameters should be zero for orders higher than the true order of the data. Note: if NUMLAGS>1 it takes a lot longer to fit. Here is an example of an AR(2) model: which_stats=[1 1 1 1 1 0 0 1]; contrast=[1 -1]; output_file_base='g:/keith/results/ha_100326_hmw_ar2'; fmrilm(input_file, output_file_base, X_cache, contrast, exclude, ... which_stats, 15, 3, [], 0, [], 'spectral', 2); m=fmris_read_image('g:/keith/results/ha_100326_hmw_ar2_A.mnc',4,1:2); mm=reshape(permute(m.data,[2 1 3 4]),m.dim(1),prod(size(m.data))/m.dim(1)); imagesc(mm); colorbar; axis xy;

The images of the two autoregressive coeficients show that the second one, a_2, is near to zero so the first order autoregressive model is adequate, which is what we were using before (NUMLAGS=1, the default).

fmristatA general statistical analysis for fMRI data

Previous version of fmristat

All matlab programs are in the toolbox directory - you need to copy ALL the files to be able to read and write ANALYZE and MINC formatted files. If you are using MINC formated files you will need the extra tools in the emma toolbox.

WARNING: The second parameter of stat_threshold has been changed to the number of voxels in the search volume, not the voxel volume. This was done to allow for 2D data. Please update your code.

Summary of method