The Statistical Analysis of FMRI Data
The Statistical Analysis of FMRI Data
Abstract. In recent years there has been explosive growth in the num-
ber of neuroimaging studies performed using functional Magnetic Res-
arXiv:0906.3662v1 [stat.ME] 19 Jun 2009
onance Imaging (fMRI). The field that has grown around the acquisi-
tion and analysis of fMRI data is intrinsically interdisciplinary in nature
and involves contributions from researchers in neuroscience, psychology,
physics and statistics, among others. A standard fMRI study gives rise
to massive amounts of noisy data with a complicated spatio-temporal
correlation structure. Statistics plays a crucial role in understanding
the nature of the data and obtaining relevant results that can be used
and interpreted by neuroscientists. In this paper we discuss the analy-
sis of fMRI data, from the initial acquisition of the raw data to its use
in locating brain activity, making inference about brain connectivity
and predictions about psychological or disease states. Along the way,
we illustrate interesting and important issues where statistics already
plays a crucial role. We also seek to illustrate areas where statistics
has perhaps been underutilized and will have an increased role in the
future.
Key words and phrases: fMRI, brain imaging, statistical analysis, chal-
lenges.
and Ordidge, 1989; Mansfield, Coxon and Hykin, analysis, while the phase portion is discarded. Tra-
1995; Lindquist et al., 2008b). Though this approach ditionally, the phase has not been considered to con-
would potentially allow the same number of k-space tain relevant signal information, though models that
points to be sampled at a faster rate, the stacked use both components (Rowe and Logan, 2004) have
slice approach remains dominant. However, with in- been proposed. It should be noted that the mag-
creases in computational power and hardware im- nitude values no longer follow a normal distribu-
provements, 3D sampling should attract increased tion, but rather a Rice distribution (Gudbjartsson
attention. and Patz, 1995). The shape of this distribution de-
The process of designing new k-space sampling
pends on the signal-to-noise (SNR) ratio within the
trajectories is an interesting mathematical problem,
voxel. For the special case when no signal is present
which can easily be generalized to three dimensions
by letting k(t) = (kx (t), ky (t)), kz (t)). The goal is to (e.g., for voxels outside of the brain), it behaves
find a trajectory k(t) that moves through k-space like a Rayleigh distribution. When the SNR is high
and satisfies the necessary constraints. The trajec- (e.g., for voxels within the brain) the distribution
tory is defined as a continuous curve and along this is approximately Gaussian. Understanding the dis-
curve measurements are made at uniform time inter- tributional properties of MR images is important,
vals determined by the sampling bandwidth of the and this area provides some interesting research op-
scanner. The trajectory starts at the point (0, 0, 0) portunities for statisticians in terms of developing
and its subsequent movement is limited by methods for estimating the variance of the back-
constraints placed on both its speed and accelera- ground noise and methods for identifying and re-
tion. In addition, there is a finite amount of time the moving outliers that arise due to acquisition arti-
signal can be measured before the nuclei need to be facts.
re-exited and the trajectory is returned to the ori-
gin. Finally, the trajectory needs to be space-filling, 2.3 From MRI to fMRI
which implies that each point in the lattice con-
The data acquisition and reconstruction techniques
tained within some cubic or spherical region around
the center of k-space needs to be visited long enough outlined in this section provide the means for obtain-
to make a measurement. The size of this region de- ing a static image of the brain. However, changes
termines the spatial resolution of the subsequent im- in brain hemodynamics in response to neuronal ac-
age reconstruction. For a more complete formula- tivity impact the local intensity of the MR signal.
tion of the problem, see Lindquist et al. (2008a). Therefore, a sequence of properly acquired brain im-
The problem bears some resemblance to the trav- ages allows one to study changes in brain function
eling salesman problem and can be approached in over time.
an analogous manner. One application where trajec- An fMRI study consists of a series of brain vol-
tory design is important is rapid imaging (Lindquist umes collected in quick succession. The temporal
et al., 2006, 2008a) and we return to this issue in a resolution of the acquired data will depend on the
later section. time between acquisitions of each individual vol-
2.2 Statistical Properties of MR Images ume; once the k-space has been sampled, the pro-
cedure is ready to be repeated and a new volume
As the signal in (1) is measured over two chan- can be acquired. This is one reason why efficient
nels, the raw k-space data are complex valued. It
sampling of k-space is important. Typically, brain
is assumed that both the real and imaginary com-
volumes of dimensions 64 × 64 × 30 (i.e. 122,880 vox-
ponent is measured with independent normally dis-
tributed error. Since the Fourier transformation is els) are collected at T separate time points through-
a linear operation, the reconstructed voxel data will out the course of an experiment, where T varies be-
also be complex-valued with both parts following tween 100–2000. Hence, the resulting data consists
a normal distribution. In the final stage of the re- of roughly 100,000 time series of length T . On top
construction process, these complex valued measure- of this, the experiment is often repeated for M sub-
ments are separated into magnitude and phase com- jects, where M usually varies between 10 and 40. It
ponents. In the vast majority of studies only the quickly becomes clear that fMRI data analysis is a
magnitude portion of the signal is used in the data time series analysis problem of massive proportions.
THE STATISTICAL ANALYSIS OF fMRI DATA 5
compared to the positive BOLD signal depends on compare data across subjects, a normalization pro-
the strength of the magnet and has been reported cedure is used to warp the brains onto a standard
to be roughly 20% at 3 Tesla (Yacoub, Le and Hu, template brain. This procedure introduces spatial
1998). There is also evidence that the dip is more imprecision and blurring in the group data. An ob-
localized to areas of neural activity (Yacoub, Le and vious impact of all this blurring is that activation in
Hu, 1998; Duong et al., 2000; Kim, Duong and Kim, small structures may be mislocalized or even missed
2000; Thompson, Peterson and Freeman, 2004) than all together.
the subsequent rise which appears less spatially spe- Inferences in space can potentially be improved
cific. Due in part to these reasons, the negative re- by advances in data acquisition and preprocessing.
sponse has so far not been reliably observed and its The introduction of enhanced spatial inter-subject
existence remains controversial (Logothetis, 2000). normalization techniques and improved smoothing
3.2 Spatial and Temporal Limitations techniques would help researchers avoid the most
dramatic effects of blurring the data. Statistical is-
There are a number of limitations that restrict sues that arise due to smoothing and normalization
what fMRI can measure and what can be inferred will be revisited in a later section dealing specifically
from an fMRI study. Many of these limitations are with preprocessing. A recent innovation in signal ac-
directly linked to the spatial and temporal resolu- quisition has been the use of multiple coils with dif-
tion of the study. When designing an experiment ferent spatial sensitivities to simultaneously measure
it is therefore important to balance the need for k-space (Sodickson and Manning, 1997; Pruessmann
adequate spatial resolution with that of adequate et al., 1999). This approach, known as parallel imag-
temporal resolution. The temporal resolution deter-
ing, allows for an increase in the amount of data that
mines our ability to separate brain events in time,
can be collected in a given time window. Hence, it
while the spatial resolution determines our ability to
can be used to either increase the spatial resolu-
distinguish changes in an image across spatial loca-
tion of an image or decrease the amount of time
tions. The manner in which fMRI data is collected
required to sample an image with a certain speci-
makes it impossible to simultaneously increase both,
fied spatial resolution. Parallel imaging techniques
as increases in temporal resolution limit the number
have already had a great influence on the way data
of k-space measurements that can be made in the al-
located sampling window and thereby directly influ- is collected and its role will only increase. The ap-
ence the spatial resolution of the image. Therefore, propriate manner to deal with parallel imaging data
there are inherent trade-offs required when deter- is a key direction for future research. Designing new
mining the appropriate spatial and temporal resolu- ways of acquiring and reconstructing multi-coil data
tions to use in an fMRI experiment. is an important area of research where statistics can
One of the benefits of MRI as an imaging tech- play a vital role.
nique is its ability to provide detailed anatomical The temporal resolution of an fMRI study de-
scans of gray and white matter with a spatial reso- pends on the time between acquisition of each indi-
3
lution well below 1 mm . However, the time needed vidual image, or the repetition time (TR). In most
to acquire such scans is prohibitively high and cur- fMRI studies the TR ranges from 0.5–4.0 seconds.
rently not feasible for use in functional studies. In- These values indicate a fundamental disconnect be-
stead, the spatial resolution is typically on the order tween the underlying neuronal activity, which takes
of 3 × 3 × 5 mm3 , corresponding to image dimen- place on the order of tens of milliseconds, and the
sions on the order of 64 × 64 × 30, which can readily temporal resolution of the study. However, the sta-
be sampled in approximately 2 seconds. Still, fMRI tistical analysis of fMRI data is primarily focused on
provides relatively high spatial resolution compared using the positive BOLD response to study the un-
with many other functional imaging techniques. How- derlying neural activity. Hence, the limiting factor
ever, it is important to note that the potential high in determining the appropriate temporal resolution
spatial resolution is often limited by a number of fac- is generally not considered the speed of data acqui-
tors. First, it is common to spatially smooth fMRI sition, but rather the speed of the underlying evoked
data prior to analysis which decreases the effective hemodynamic response to a neural event. Since in-
resolution of the data. Second, performing popula- ference is based on oxygenation patterns taking place
tion inference requires the analysis of groups of sub- 5–8 seconds after activation, TR values in the range
jects with varying brain sizes and shapes. In order to of 2 seconds are generally deemed adequate.
THE STATISTICAL ANALYSIS OF fMRI DATA 7
Because of the relatively low temporal resolution the stimulus and BOLD response is modeled. We
and the sluggish nature of the hemodynamic re- differentiate between nonlinear physiological-based
sponse, inference regarding when and where acti- models, such as the Balloon model (Buxton, Wong
vation is taking place is based on oxygenation pat- and Frank, 1998; Friston et al., 2000; Riera et al.,
terns outside of the immediate vicinity of the under- 2004), which describes the dynamics of cerebral blood
lying event we want to base our conclusions on (i.e., volume and deoxygenation and their effects on the
the neural activity). Since the time-to-peak positive resulting BOLD signal, and models that assume a
BOLD response occurs in a larger time scale than linear time invariant (LTI) system, in which assumed
the speed of brain operations, there is a risk of un- neuronal activity (based on task manipulations) con-
known confounding factors influencing the ordering stitutes the input, or impulse, and the HRF is the
of time-to-peak relative to the ordering of brain acti-impulse response function.
vation in different regions of interest. For these rea- The Balloon model consists of a set of ordinary
sons it is difficult to determine the absolute timing differential equations that model changes in blood
of brain activity using fMRI. However, studies have volume, blood inflow, deoxyhemoglobin and flow-
shown (Menon, Luknowsky and Gati, 1998; Miezin inducing signal and how these changes impact the
et al., 2000) that the relative timing within a voxel
observed BOLD response. While models of this type
in response to different stimuli can be accurately
tend to be more biophysically plausible than their
captured. There are also indications that focusing
linear counterparts, they have a number of draw-
inference on features related to the initial dip can
backs. First, they require the estimation of a large
help alleviate concerns (Lindquist et al., 2008a) re-
number of model parameters. Second, they do not
garding possible confounders. However, these types
of studies require significant increases in temporal always provide reliable estimates with noisy data,
resolution and the ability to rapidly acquire data be- and third, they do not provide a direct framework
comes increasingly important. Finally, another way for performing inference. In general, they are not
of improving inferences in time is through appropri- yet considered feasible alternatives for performing
ate experimental design. In principal, it is possible whole-brain multi-subject analysis of fMRI data in
to estimate the HRF at a finer temporal resolution cognitive neuroscience studies, although promising
than the TR as long as the onsets of repeated stim- developments are being pursued on this front and
uli are jittered in time (Dale, 1999). For example, if this is an exciting area for future research.
the onset is shifted by TR/2 in half of the stimuli, While the flexibility of nonlinear models is attrac-
one can potentially estimate the HRF at a temporal tive, linear models provide robust and interpretable
resolution of TR/2, rather than TR if jittering is not characterizations in noisy systems. It is therefore
used. common to assume a linear relationship between
neuronal activity and BOLD response, where lin-
3.3 Understanding Signal and Noise
earity implies that the magnitude and shape of the
Prior to modeling fMRI data, it is useful to gain evoked HRF do not depend on any of the preced-
a better understanding of the components present ing stimuli. Studies have shown that under certain
in an fMRI time series. In general, it consists of the conditions the BOLD response can be considered
BOLD signal (which is the component of interest), linear with respect to the stimulus (Boynton et al.,
a number of nuisance parameters and noise. Each 1996), particularly if events are spaced at least 5
component is discussed in detail below. seconds apart (Miezin et al., 2000). However, other
BOLD signal. The evoked BOLD response in fMRI studies have found that nonlinear effects in rapid
is a complex, nonlinear function of the results of sequences (e.g., stimuli spaced less than 2 seconds
neuronal and vascular changes (Wager et al., 2005), apart) can be quite large (Birn, Saad and Bandet-
complicating the ability to appropriately model its tini, 2001; Wager et al., 2005). The ability to assume
behavior. The shape of the response depends both linearity is important, as it allows the relationship
on the applied stimulus and the hemodynamic re- between stimuli and the BOLD response to be mod-
sponse to neuronal events. A number of methods eled using a linear time invariant system, where the
for modeling the BOLD response and the underlying stimulus acts as the input and the HRF acts as the
HRF exist in the literature. A major difference be- impulse response function. Figure 4 shows an illus-
tween methods lies in how the relationship between tration of the estimated BOLD signal corresponding
8 M. A. LINDQUIST
to two different types of stimulus patterns. In a lin- squarely in the purview of statisticians, and promise
ear system framework the signal at time t, x(t), is to have increased importance in the future.
modeled as the convolution of a stimulus function
Noise and nuisance signal. The measured fMRI
v(t) and the hemodynamic response h(t), that is,
signal is corrupted by random noise and various nui-
(2) x(t) = (v ∗ h)(t). sance components that arise due both to hardware
reasons and the subjects themselves. For instance,
Here h(t) is either assumed to take a canonical form, fluctuations in the MR signal intensity caused by
or alternatively modeled using a set of linear basis thermal motion of electrons within the subject and
functions. the scanner gives rise to noise that tends to be highly
Another important modeling aspect is that the random and independent of the experimental task.
timing and shape of the HRF are known to vary The amount of thermal noise increases linearly as
across the brain, within an individual and across a function of the field strength of the scanner, with
individuals (Aguirre, Zarahn and D’Esposito, 1998; higher field strengths giving rise to more noise. How-
Schacter et al., 1997). Part of the variability is due ever, it does not exhibit spatial structure and its ef-
to the underlying configuration of the vascular bed, fects can be minimized by averaging the signal over
which may cause differences in the HRF across brain multiple data points. Another source of variability in
regions in the same task for purely physiological rea- the signal is due to scanner drift, caused by scanner
sons (Vazquez et al., 2006). Another source of vari- instabilities, which result in slow changes in voxel in-
ability is differences in the pattern of evoked neural tensity over time (low-frequency noise). The amount
activity in regions performing different functions re- of drift varies across space, and it is important to
lated to the same task. It is important that these include this source of variation in your models. Fi-
regional variations are accounted for when model- nally, physiological noise due to patient motion, res-
ing the BOLD signal and we return to this issue in piration and heartbeat cause fluctuations in signal
a later section. across both space and time. Physiological noise can
In general, one of the major shortfalls when an- often be modeled and the worst of its effects re-
alyzing fMRI data is that users typically assume a moved. In the next section we discuss how to correct
canonical HRF (Grinband et al., 2008), which leaves for subject motion as part of the preprocessing step
open the possibility for mismodeling the signal in of the analysis. However, heart-rate and respiration
large portions of the brain (Loh, Lindquist and Wa- gives rise to periodic fluctuations that are difficult to
ger, 2008). There has therefore been a movement model. According to the Nyquist criteria, it is nec-
toward both using more sophisticated models and essary to have a sampling rate at least twice as high
enhanced model diagnostics. Both of these areas fall as the frequency of the periodic function one seeks
Fig. 4. The BOLD response is typically modeled as the convolution of the stimulus function with the HRF. Varying stimulus
patterns will give rise to responses with radically different features.
THE STATISTICAL ANALYSIS OF fMRI DATA 9
to model. If the TR is too low, which is true in most task over time and the specific comparisons that one
fMRI studies, there will be problems with aliasing; is interested in making. In addition, as the efficiency
see Figure 5A for an illustration. In this situation the of the subsequent statistical analysis is directly re-
periodic fluctuations will be distributed throughout lated to the experimental design, it is important that
the time course giving rise to temporal autocorrela- it be carefully considered during the design process.
tion. Noise in fMRI is typically modeled using either A good experimental design attempts to maxi-
an AR(p) or an ARMA(1, 1) process (Purdon et al., mize both statistical power and psychological valid-
2001), where the autocorrelation is thought to be ity. The statistical performance can be characterized
due to an unmodeled nuisance signal. If these terms by its estimation efficiency (i.e., the ability to esti-
are properly removed, there is evidence that the re- mate the HRF) and its detection power (i.e., the
sulting error term corresponds to white noise (Lund ability to detect significant activation). The psycho-
et al., 2006). Note that for high temporal resolution logical validity is often measured by the random-
studies, heart-rate and respiration can be estimated ness of the stimulus presentation, as this helps con-
and included in the model, or alternatively removed trol for issues related to anticipation, habituation
through application of a band-pass filter. and boredom. When designing an experiment there
The spatiotemporal behavior of the noise process is inherent trade-offs between estimation efficiency,
is complex. Figure 5B shows a time course from detection power and randomness. The optimal bal-
a single voxel sampled at high temporal resolution ance between the three ultimately depends on the
(60 ms), as well as its power spectrum. The power goals of the experiment and the combination of con-
spectrum indicates periodic oscillations in the sig- ditions one is interested in studying. For example,
nal due to physiological effects and a low-frequency a design used to localize areas of brain activation
component corresponding to signal drift. At this res- stresses high detection power at the expensive of es-
olution it is relatively straightforward to remove the timation efficiency and randomness.
effects of these nuisance functions by applying an While the area of experimental design is a natu-
appropriate filter. In contrast, Figure 5C shows a ral domain for statisticians to conduct research, it
time course sampled at a more standard resolution has so far been largely unexplored by members of
(1 s). At this resolution, the sampling rate is too the field. Currently there are two major classes of
low to effectively model physiological noise and it fMRI experimental designs: block designs and event-
gives rise to temporal autocorrelation clearly visible related designs. In the following sections we describe
in the accompanying autocorrelation plot. Finally, each type and discuss the applications for which
Figure 5D shows spatial maps of the model parame- they are best suited. In addition, we also discuss
ters from an AR(2) model estimated for each voxel’s ways of optimizing the experimental design.
noise data. It is clear that the behavior of the noise Block Designs
is not consistent throughout the image, indicating
spatial dependence. In fact, it is clearly possible to In a block design the different experimental condi-
make out rough anatomical detail in the maps, indi- tions are separated into extended time intervals, or
cating higher amounts of variability in certain brain blocks. For example, one might repeat the process
regions. of interest (e.g., finger tapping) during an experi-
mental block (A) and have the subject rest during
4. EXPERIMENTAL DESIGN a control block (B); see Figure 6. The A–B com-
parison can than be used to compare differences in
The experimental design of an fMRI study is com- signal between the conditions. In general, increas-
plicated, as it not only involves the standard issues ing the length of each block will lead to a larger
relevant to psychological experiments, but also is- evoked response during the task. This increases the
sues related to data acquisition and stimulus pre- separation in signal between blocks, which, in turn,
sentation. Not all designs with the same number of leads to higher detection power. However, in con-
trials of a given set of conditions are equal, and the trast, it is also important to include multiple transi-
spacing and ordering of events is critical. What con- tions between conditions, as otherwise differences in
stitutes an optimal experimental design depends on signal due to low-frequency drift may be confused for
the psychological nature of the task, the ability of differences in task conditions. In addition, it is im-
the fMRI signal to track changes introduced by the portant that the same mental processes are evoked
10 M. A. LINDQUIST
Fig. 5. (A) The Nyquist criteria states that it is necessary to sample at a frequency at least twice as high as the frequency
of the periodic function one seeks to model to avoid aliasing. As an illustration assume that the signal is measured at the time
points indicated by circles. In this situation it is impossible to determine which of the two periodic signals shown in the plot
give rise to the observed measurements. (B) An fMRI time course measured at a single voxel sampled with 60 ms resolution. Its
power spectrum indicates components present in the signal whose periodicity corresponds to low-frequency drift and physiological
effects. (C) An fMRI time course measured with 1 s resolution. The autocorrelation function indicates autocorrelation present
in the signal. (D) Spatial maps of the model parameters from an AR(2) model (i.e. φ1 , φ2 and σ), estimated from each voxel’s
noise data, indicates clear spatial dependence.
Fig. 6. The two most common classes of experimental design are block designs and event-related designs. In a block design
(top) experimental conditions are separated into extended time intervals, or blocks, of the same type. In an event-related design
(bottom) the stimulus consists of short discrete events whose timing and order can be randomized.
THE STATISTICAL ANALYSIS OF fMRI DATA 11
throughout each block. If block lengths are too long, from a deeper understanding of the statistical anal-
this assumption may be violated due to the effects ysis of fMRI data.
of fatigue and/or boredom. Several methods have been introduced that allow
The main advantages to using a block design are researchers to optimally select the design parame-
that they offer high statistical power to detect acti- ters, as well as the sequencing of events that should
vation and are robust to uncertainties in the shape be used in an experiment (Wager and Nichols, 2003;
of the HRF. The latter advantage is due to the fact Liu and Frank, 2004). These methods define fitness
that the predicted response depends on the total ac- measures for the estimation efficiency, detection
tivation caused by a series of stimuli, which makes it power and randomness of the experiment, and ap-
less sensitive to variations in the shape of responses ply search algorithms (e.g., the genetic algorithm) to
to individual stimulus (see Figure 4). The flip side optimize the design according to the specified crite-
is that block designs provide imprecise information ria. When defining the fitness metrics it is typically
about the particular processes that activated a brain assumed that the subsequent data analysis will be
region and cannot be used to directly estimate im- performed in the general linear model (GLM) frame-
portant features of the HRF (e.g., onset or width). work described in Section 6.2.1 and that the rela-
tionship between stimulus and measured response
Event-Related Designs can be modeled using a linear time invariant sys-
In an event-related design the stimulus consists of tem. The use of more complex nonlinear models
short discrete events (e.g., brief light flashes) whose requires different considerations when defining ap-
timing can be randomized; see Figure 6 for an illus- propriate metrics, the development of which will be
tration with two conditions. These types of designs important as such models gain in popularity. Fi-
are attractive because of their flexibility and that nally, an important consideration relates to assump-
they allow for the estimation of key features of the tions made regarding the shape of the HRF and the
HRF (e.g., onset and width) that can be used to noise structure. The inclusion of flexible basis func-
make inference about the relative timing of activa- tions and correlated noise into the model will mod-
tion across conditions and about sustained activity. ify the trade-offs between estimation efficiency and
Event-related designs allow one to discriminate the detection power, and potentially alter what consti-
effects of different conditions as long as one either in- tutes an optimal design. Hence, even seemingly mi-
termixes events of different types or varies the inter- nor changes in the model formulation can have a
stimulus interval between trials. Another advantage large impact on the efficiency of the design. Together
to event-related designs is that the effects of fatigue, these issues complicate the design of experiments
boredom and systematic patterns of thought unre- and work remains to find the appropriate balance
lated to the task during long inter-trial intervals can between them. As research hypotheses ultimately
be avoided. A drawback is that the power to detect become more complicated, the need for more ad-
activation is typically lower than for block designs, vanced experimental designs will only increase fur-
though the capability to obtain images of more trials ther and this is clearly an area where statisticians
per unit time can counter this loss of power. can play an important role.
assumed that each data point in a specific voxel’s and therefore at different time points, similar time
time series only consists of a signal from that voxel courses from different slices will be temporally shifted
(i.e., that the participant did not move in between relative to one another. Figure 7A illustrates the
measurements). Finally, when performing group anal- point. Assume that three voxels contained in three
ysis and making population inference, all individual adjacent slices exhibit the same true underlying tem-
brains are assumed to be registered, so that each poral profile. Due to the fact that they are sam-
voxel is located in the same anatomical region for pled at different time points relative to one another,
all subjects. Without preprocessing the data prior the corresponding measured time courses will ap-
to analysis, none of these assumptions would hold pear different. Slice timing correction involves shift-
and the resulting statistical analysis would be in- ing each voxel’s time course so that one can as-
valid. sume they were measured simultaneously. This can
The major steps involved in fMRI preprocessing be done either using interpolation or the Fourier
are slice timing correction, realignment, coregistra- shift theorem to correct for differences in acquisi-
tion of structural and functional images, normaliza- tion times.
tion and smoothing. Below each step is discussed in
detail. Motion Correction
obtained from functional data onto a high resolution FWHM of 3 times the voxel size (e.g., 9 mm for
structural MR image for presentation purposes. The 3 mm voxels). Third, if the spatial extent of a re-
process of aligning structural and functional images, gion of interest is larger than the spatial resolution,
called coregistration, is typically performed using ei- smoothing may reduce random noise in individual
ther a rigid body (6 parameters) or an affine (12 voxels and increase the signal-to-noise ratio within
parameters) transformation. the region.
For group analysis, it is important that each voxel The process of spatially smoothing an image is
lie within the same brain structure for each individ- equivalent to applying a low-pass filter to the sam-
ual subject. Of course individual brains have differ- pled k-space data prior to reconstruction. This im-
ent shapes and features, but there are regularities plies that much of the acquired data is discarded as
shared by every nonpathological brain. Normaliza- a byproduct of smoothing and temporal resolution
tion attempts to register each subjects anatomy to is sacrificed without gaining any benefits. Addition-
a standardized stereotaxic space defined by a tem- ally, acquiring an image with high spatial resolution
plate brain [e.g., the Talairach or Montreal Neuro- and thereafter smoothing the image does not lead to
logical Institute (MNI) brain]. In this scenario using the same results as directly acquiring a low resolu-
a rigid body transformation is inappropriate due to tion image. The signal-to-noise ratio during acquisi-
the inherent differences in the subjects brains. In- tion increases as the square of the voxel volume, so
stead, it is common to use nonlinear transformations acquiring small voxels means that signal is lost that
to match local features. One begins by estimating a can never be recovered. Hence, it is optimal in terms
smooth continuous mapping between the points in of sensitivity to acquire images at the desired resolu-
an input image with those in the target image. Next, tion and not employ smoothing. Some recent acqui-
the mapping is used to resample the input image so sition schemes have been designed to acquire images
that it is warped onto the target image. Figure 7B il- at the final functional resolution desired (Lindquist
lustrates the process, where a high resolution image et al., 2008b). This allows for much more rapid im-
is warped onto a template image, resulting in a nor- age acquisition, as time is not spent acquiring in-
malized image that can be compared with similarly formation that will be discarded in the subsequent
normalized images obtained from other subjects. analysis.
The main benefits of normalizing data are that While all the preprocessing steps outlined above
spatial locations can be reported and interpreted are essential for the standard model assumptions re-
in a consistent manner, results can be generalized quired for statistical analysis to hold, there needs to
to a larger population and results can be compared be a clear understanding of the effects they have on
across studies and subjects. The drawbacks are that both the spatial and temporal correlation structure.
it reduces spatial resolution and may introduce er- More generally, it is necessary to study the interac-
rors due to interpolation. tions among the individual preprocessing steps. For
example, is it better to perform slice timing correc-
Spatial Smoothing tion or realignment first, and how will this choice
It is common practice to spatially smooth fMRI impact the resulting data? Ideally there would be
data prior to analysis. Smoothing typically involves one model for both, that also performs outlier de-
convolving the functional images with a Gaussian tection and correction for physiological noise. There
kernel, often described by the full width of the ker- has been increased interest in developing generative
nel at half its maximum height (FWHM). Common models that incorporate multiple steps at once, and
values for the kernel widths vary between 4–12 mm this is another problem with a clear statistical com-
FWHM. There are several reasons why it is com- ponent that promises to play an important role in
mon to smooth fMRI data. First, it may improve the future.
inter-subject registration and overcome limitations
6. DATA ANALYSIS
in the spatial normalization by blurring any resid-
ual anatomical differences. Second, it ensures that There are several common objectives in the anal-
the assumptions of random field theory (RFT), com- ysis of fMRI data. These include localizing regions
monly used to correct for multiple-comparisons, are of the brain activated by a certain task, determining
valid. A rough estimate of the amount of smooth- distributed networks that correspond to brain func-
ing required to meet the assumptions of RFT is a tion and making predictions about psychological or
14 M. A. LINDQUIST
disease states. All of these objectives are related to preprocessing step, or by including covariates of no
understanding how the application of certain stimuli interest into the model. As an example of the latter,
leads to changes in neuronal activity. They are also the drift, µ(t), can be modeled using a pth order
all intrinsically statistical in nature, and this area polynomial function, that is,
is the primary domain of statisticians currently in-
X
p
volved in the field. The statistical analysis of fMRI (4) µij (t) = γijg tg−1 ,
data involves working with massive data sets that g=1
exhibit a complicated spatial and temporal noise
structure. The size and complexity of the data make which, assuming zijg (t) = tg−1 , fits into the frame-
it difficult to create a full statistical model for de- work described in model (3).
scribing its behavior, and a number of shortcuts are There are several alternative functions that have
required to balance computational feasibility with been used to model the drift. For example, it is com-
model efficiency. mon to use a series of low frequency cosine functions.
The most important issue when using a high-pass fil-
6.1 Modeling the fMRI Signal ter is to ensure that the fluctuations induced by the
In this section we introduce a generic model for task design are not in the range of frequencies re-
describing fMRI data, and proceed by making a moved by the filter, as we do not want to throw out
number of model assumptions that impact the direc- the signal of interest. Hence, the ultimate choice in
tion of the analysis. We begin by assuming that the how to model the drift needs to be made with the
data consists of a brain volume with N voxels that experimental design in mind.
is repeatedly measured at T different time points. Seasonal components. Additional covariates may
In addition, suppose the experiment is repeated for be included to account for periodic noise present in
M subjects. In Section 3.3 the various components the signal, such as heart-rate and respiration. Phys-
present in an fMRI time series were discussed. These iological noise can in certain circumstances be di-
included the BOLD response, various nuisance sig- rectly estimated from the data (Lindquist et al.,
nal and noise. Incorporating all these components, 2008a), or it can be removed using a properly de-
our model for fMRI activation in a single voxel for signed band-pass filter. However, in most studies,
a single subject can be expressed with TR values ranging from 2–4 s, one cannot hope
X
G X
K to estimate and remove the effects of heart-rate and
(3) yij (t) = zijg (t)γijg + xijk (t)βijk + εij (t), respiration solely by looking at the fMRI time series.
g=1 k=1 Some groups have therefore begun directly measur-
for i = 1, . . . , N , j = 1, . . . , M and t = 1, . . . , T . Here ing heart beat and respiration during scanning and
zijg (t) represents the contribution of nuisance co- using this information to remove signal related to
variates at time t, including terms modeling the physiological fluctuations from the data (Glover, Li
scanner drift, periodic fluctuations due to heart rate and Ress, 2000). This is done either as a preprocess-
and respiration, and head motion. Similarly, xijk (t) ing step, or by including these terms as covariates
represents the task-related BOLD response (the sig- in the model. However, more often than not, the ef-
nal of interest) corresponding to the kth condition fects of physiological noise are left unmodeled, and
at time t. The terms βijk and γijg represent the un- the aliased physiological artifacts give rise to the au-
known amplitude of xijk and zijg , respectively, and tocorrelated noise present in fMRI data (Lund et al.,
εij (t) the noise process. Appropriate ways of mod- 2006).
eling each of these signal components are described
Noise. In standard time series analysis, model iden-
in detail below.
tification techniques are used to determine the ap-
The drift component. In fMRI the signal typically propriate type and order of a noise process. In fMRI
drifts slowly over time due to scanner instabilities. data analysis this approach is not feasible due to
Therefore, most of the power lies in the low-frequency the large number of time series being analyzed, and
portion of the signal. To remove the effects of drift, it noise models are specified a priori. In our own work,
is common to remove fluctuations below a specified we typically use an auto-regressive process of or-
frequency cutoff using a high-pass filter. This can be der 2. The reason we choose an AR model over an
performed either by applying a temporal filter as a ARMA model is that it allows us to use method
THE STATISTICAL ANALYSIS OF fMRI DATA 15
components and unknown amplitudes. These are the where εj ∼ N (0, V) with the structure of the covari-
assumptions made in the hugely popular GLM ap- ance matrix V corresponding to an AR(2) process
proach (Worsley and Friston, 1995; Friston et al., with unknown parameters φ1 , φ2 and σ. The model
2002), though the assumption regarding fixed HRF parameters can be estimated using a Cochrane–
can be relaxed. However, in many areas of psycho- Orcutt fitting procedure, where the variance compo-
logical inquiry (e.g., emotion and stress), it may be nents are estimated using the Yule–Walker method
difficult to specify information regarding the stimu- (Brockwell and Davis, 1998). After fitting the model,
lus function a priori. If one is unwilling to make any one can test for an effect cT β j where c is a contrast
assumptions regarding the exact timing of neuronal vector. The contrast vector can be used to estimate
activity, alternative methods may be more appropri- signal magnitudes in response to a single condition,
ate for analyzing the data. In the next two sections an average over multiple conditions or the difference
both scenarios will be discussed. in magnitude between two conditions. Hypothesis
6.2.1 The general linear model approach. The gen- testing is performed in the usual manner by testing
eral linear model (GLM) approach has arguably be- individual model parameters using a t-test and sub-
come the dominant way to analyze fMRI data. It sets of parameters using a partial F -test. Since the
models the time series as a linear combination of sev- covariance matrix has to be estimated, a Satterth-
eral different signal components and tests whether waite approximation is used to calculate the effec-
activity in a brain region is systematically related tive degrees of freedom for the test statistics. This
to any of these known input functions. The simplest procedure is repeated for brain voxel and the results
version of the GLM assumes that both the stimu- are summarized in a statistical map consisting of an
lus function and the HRF are known. The stimu- image whose voxel measurements correspond to the
lus is assumed to be equivalent to the experimental test statistic calculated at that particular voxel.
paradigm, while the HRF is modeled using a canon- While the GLM is a simple and powerful approach
ical HRF, typically either a gamma function or the toward modeling the data, it is also extremely rigid.
difference between two gamma functions (see Fig- Even minor mismodeling (e.g., incorrect stimulus
ure 5). Under these assumptions, the convolution function or HRF) can result in severe power loss,
term in the BOLD response is a known function and and can inflate the false positive rate beyond the
(7) reverts to a standard multiple linear regression nominal value. Due to the massive amount of data,
model. The BOLD response can be summarized in a examining the appropriateness of the model is chal-
design matrix X, containing a separate column for lenging and standard methods of model diagnostics
each of the K predictors; see Figure 8 for an example are not feasible. Recently some techniques have been
when K = 2. introduced (Luo and Nichols, 2003; Loh, Lindquist
In the remainder of the section we will, for simplic- and Wager, 2008) that allow one to quickly deter-
ity, assume that the nuisance term Z is accounted for mine, through graphical representations, areas in
and can be ignored. Further, we assume a separate, the brain where assumptions are violated and model
but identical, model for each voxel and suppress the misfit may be present. However, in the vast major-
voxel index. Hence, the data for subject j at voxel i ity of studies no model checking is performed, call-
can be written ing into question the validity of the results. Moving
(9) yj = Xj β j + εj , toward using more sophisticated models, as well as
Fig. 8. In an fMRI experiment with two conditions ( A and B), the stimulus function is convolved with a canonical HRF
to obtain two sets of predicted BOLD responses. The responses are placed into the columns of a design matrix X and used to
compute whether there is significant signal corresponding to the two conditions in a particular time course.
THE STATISTICAL ANALYSIS OF fMRI DATA 17
increased use of diagnostics, is an important area of hierarchical in nature, with lower-level observations
current and future research. In both of these areas (e.g., individual subjects) nested within higher levels
statisticians can play an important role. (e.g., groups of subjects). Multi-level models provide
As mentioned in Section 3.3, the shape of the HRF a framework for performing mixed-effects analysis
may vary across both space and subjects. Therefore, on multi-subject fMRI data. In fMRI it is common
assuming that the shape of the HRF is constant to use a two-level model where the first level deals
across all voxels and subjects may give rise to sig- with individual subjects and the second level deals
nificant mismodeling in large parts of the brain. We with groups of subjects. In the first-level the data
can relax this assumption by expressing the HRF as are autocorrelated with a relatively large number of
a linear combination of reference waveforms. This observations, while in the second-level we have IID
can be done in the GLM framework by convolving data with relatively few observations. The first-level
the same stimulus function with multiple canonical model can be written
waveforms and entering them into multiple columns (10) y = Xβ + ε,
of X for each condition. These reference waveforms
are called basis functions, and the predictors for an where y = (y1T , . . . , yM T )T , X = diag(X , . . . , X ),
1 M
T T T
event type constructed using different basis func- β = (β 1 , . . . , β M ) , ε = (εT1 , . . . , εTM )T and Var(ε) =
tions can combine linearly to better fit the evoked V where V = diag(V1T , . . . , VM T ).
BOLD responses. The ability of a basis set to cap- The second-level model can be written
ture variations in hemodynamic responses depends (11) β = XG β G + εG ,
both on the number and shape of the reference wave- 2 ). Here X is the second-level
where εG ∼ N (0, IσG G
forms. There is a fundamental tradeoff between flex-
design matrix (e.g., separating cases from controls)
ibility to model variations and power, as flexible
and β G the vector of second-level parameters. The
models can model noise and produce noisier parame-
two-level model can be combined into a single level
ter estimates. In addition, the inclusion of additional
model, which can be expressed as
model parameters decreases the number of degrees
of freedom for the subsequent test statistic. (12) y = XXG β G + XεG + ε.
One of the most flexible models, a finite impulse Estimation of the regression parameters and vari-
response (FIR) basis set, contains one free parame- ance components can be performed iteratively, with
ter for every time-point following stimulation in ev- regression parameters estimated using GLS and vari-
ery cognitive event-type that is modeled (Glover, ance components estimated using restricted maxi-
1999a; Goutte, Nielsen and Hansen, 2000). Thus, mum likelihood (ReML) and the EM-algorithm.
the model is able to estimate an HRF of arbitrary Recently, these types of multi-level mixed-effects
shape for each event type in every voxel of the brain. models have become popular in the neuroimaging
Another possible choice is to use the canonical HRF community due to their ability to perform valid pop-
together with its temporal derivative in order to al- ulation level inference (e.g., Friston et al., 2002; Beck-
low for small shifts in the onset of the HRF. Other mann, Jenkinson and Smith, 2003). However, be-
choices of basis sets include those composed of prin- cause of the massive amount of data being analyzed
cipal components (Aguirre, Zarahn and D’Esposito, and the fact that it must be feasible to repeatedly
1998; Woolrich, Behrens and Smith, 2004), cosine fit the model across all brain voxels, the most com-
functions (Zarahn, 2002), radial basis functions (Ri- monly used techniques are by necessity simplistic.
era et al., 2004), spectral basis sets (Liao et al., For example, they do not readily allow for unbal-
2002) and inverse logit functions (Lindquist and Wa- anced designs and missing data. However, both is-
ger, 2007b). For a critical evaluation of various basis sues are prevalent in fMRI data analysis. Missing
sets, see Lindquist and Wager (2007b) and Lindquist data may be present in a study because of artifacts
et al. (2008c). and errors due to the complexity of data acquisi-
Multi-subject analysis. The analysis so far has been tion (including human error), while unbalanced de-
concerned with single subject data. However, re- signs are important because of interest in relating
searchers typically want to make conclusions on pop- brain activity to performance and other variables
ulation effects, and statistical analysis needs to be that cannot be experimentally controlled. The in-
extended to incorporate information from a group troduction of techniques for performing rapid esti-
of subjects. Multi-subject fMRI data is intrinsically mation of multi-level model parameters that allow
18 M. A. LINDQUIST
for this type of data is of utmost importance. Multi- Some headway has recently been made, but work
level models have been heavily researched in the sta- remains to be done and ideas from spatial statistics
tistical community, and statisticians can play an im- can potentially play an important role. Fitting spa-
portant role in developing methods tailored directly tial models using Bayesian statistics has been the
to the complexities of fMRI data analysis. focus of much attention lately and several promis-
Spatial modeling. Up to this point the entire anal- ing approaches have been suggested (e.g., Bowman,
ysis procedure outlined in this section has been uni- 2005; Bowman et al., 2008; Woolrich et al., 2005).
variate, that is, performed separately at each voxel. However, model complexity is sometimes constrained
Indeed, one of the most common short cuts used in by the massive amounts of data and there is a clear
the field is, somewhat surprisingly, to perform fMRI need for statisticians with strong training in Bayesian
data analysis in a univariate setting (the so-called computation to optimize the model fitting proce-
“massive univariate approach”), where each voxel dure.
is modeled and processed independently of the oth-
ers. At the model-level this approach assumes that 6.2.2 Data with uncertain timing of activation. In
neighboring voxels are independent, which is gen- many areas of psychological inquiry—including stud-
erally not a reasonable assumption as most activa- ies on memory, motivation and emotion—it is hard
tion maps show a clear spatial coherence. In these to specify the exact timing of activation a priori. In
situations the spatial relationship is sometimes ac- this situation it may not be reasonable to assume
counted for indirectly by smoothing the data prior that either the experimental paradigm or the HRF
to voxel-wise analysis, and thereafter applying ran- are known. Therefore, the GLM cannot be directly
dom field theory to the map of test statistics to applied to these data sets and alternative methods
determine statistical significance for the entire set are needed. Typically, researchers take a more data-
of voxels. Hence, the “massive univariate approach” driven approach that attempts to characterize reli-
does take spatial correlation into account at the level able patterns in the data, and relate those patterns
of thresholding using Gaussian random fields. How- to psychological activity post hoc. One popular ap-
ever, while the random field theory approach does proach is independent components analysis (ICA)
link voxel-wise statistics, it does not directly esti- (Beckmann and Smith, 2005; Calhoun et al., 2001b;
mate spatial covariances under a linear model. We McKeown and Makeig, 1998), a member of a fam-
discuss random field theory further in Section 6.2.3. ily of analytic methods that also includes principal
Incorporating spatial considerations into the GLM components and factor analysis. While these meth-
framework has become a subject of increased inter- ods provide a great deal of flexibility, they do not
est in recent years. In the earliest approaches indi- provide a formal framework for performing inference
vidual voxel-wise GLMs were augmented with time about whether a component varies over time and
series from neighboring voxels (Katanoda, Matsuda when changes occur in the time series. In addition,
and Sugishita, 2002; Gossl, Auer and Fahrmeir, 2001). because they do not contain any model informa-
Recently, a series of Bayesian approaches have been tion, they capture regularities whatever the source.
suggested. Penny, Trujillo-Barreto and Friston (2005) Therefore, they are highly susceptible to noise and
have proposed a fully Bayesian model with spatial components are often dominated by artifacts. For
priors defined over the coefficients of the GLM. Bow- these reasons we prefer to use methods from change
man (Bowman, 2005) presents a whole-brain spatio- point analysis to model fMRI data with unknown
temporal model that partitions voxels into function- activation profiles.
ally related networks and applies a spatial simulta- In our own work, we use a three step procedure
neous autoregressive model to capture intraregional for modeling such data. In a first stage we employ a
correlations between voxels. Finally, Woolrich et al. multi-subject (mixed-effects) extension of the expo-
(2005) have developed a spatial mixture model us- nentially weighted moving average (EWMA) method
ing a discrete Markov random field (MRF) prior on (Roberts, 1959), denoted HEWMA (Hierarchical
a spatial map of classification labels. While these EWMA) (Lindquist and Wager, 2007a), as a sim-
models are certainly a step in the right direction, it ple screening procedure to determine which voxels
is clear that the massive univariate approach con- have time courses that deviate from a baseline level
tinues to be exceedingly popular among end users and should be moved into the next stage of the anal-
due to its relative simplicity. ysis. In the second stage we estimate voxel-specific
THE STATISTICAL ANALYSIS OF fMRI DATA 19
procedures work only on the p-values and not on N data matrix, Y, into a set of spatial and temporal
the actual test statistics, it can be applied to any components according to some criteria.2
valid statistical test. In contrast, for the RFT ap- PCA allows one to determine the spatial patterns
proach the test statistics need to follow a known that account for the greatest amount of variability
distribution. in a time series of images. This can be achieved by
The FDR controlling procedure that is most com- finding the singular value decomposition of the data
monly used in fMRI data analysis is the so-called matrix,
Benjamini–Hochberg procedure (Benjamini and (13) Y = USVT ,
Hochberg, 1995), where all tests are assumed to be
independent. However, in fMRI data analysis it is where U is an T × T unitary matrix, S is a T × N di-
agonal matrix with nonnegative elements, and V is
more realistic to assume that tests are dependent,
an N × N unitary matrix. The columns of U repre-
as neighboring voxels are more likely to have sim-
sent the weighted sum of time series from different
ilarly valued p-values. Hence, the introduction of
voxels, while the columns of V contain the voxel
FDR controlling procedures that incorporate spatial weights required to create each component in U.
information is of utmost importance and an area of Thus, U represents the temporal components and
future research for statisticians. V the spatial components of the data. The elements
6.3 Assessing Brain Connectivity of S represent the amount of variability explained
by each component and are ordered in nonincreas-
Human brain mapping has primarily been used to ing fashion. Hence, the first column of V shows how
construct maps indicating regions of the brain that to weight each of the N voxel time series in order to
are activated by specific tasks. Recently, there has capture the most variance in Y, etc. The usefulness
been an increased interest in augmenting this type of this technique is twofold: this decomposition can
of analysis with connectivity studies that describe potentially reveal the nature of the observed signal
how various brain regions interact and how these by finding its linearly independent sources. Also, de-
interactions depend on experimental conditions. It composing the signal and ordering the components
is common in the fMRI literature to distinguish be- according to their weight is a useful way to simplify
tween anatomical, functional and effective connec- the data or filter out unwanted components, and can
be used in the preprocessing stage as a data reduc-
tivity (Friston, 1994). Anatomical connectivity deals
tion tool.
with describing how different brain regions are phys-
ICA is similar to PCA, but the components are
ically connected and can be tackled using diffusion
required to be independent rather than orthogonal.
tensor imaging (DTI). Functional connectivity is de- ICA assumes that Y is a weighted sum of p inde-
fined as the undirected association between two or pendent source signals contained in the p × N source
more fMRI time series, while effective connectivity matrix X, whose weights are described by a T × p
is the directed influence of one brain region on oth- mixing matrix of weights M, that is,
ers. In this work we concentrate on describing the
latter two types of connectivity. (14) Y = MX.
Iterative search algorithms are used to estimate M
6.3.1 Functional connectivity. The simplest
and X, simultaneously. In order to solve (14), ICA
approach toward functional connectivity analyses
makes a number of assumptions, the main ones be-
compares correlations between regions of interest, or
ing that the data consist of p statistically indepen-
between a “seed” region and other voxels through- dent components, where at most one component
out the brain. Alternative approaches include us- is Gaussian. The independence assumption entails
ing multivariate methods, such as Principal Compo- that the activations do not have a systematic over-
nents Analysis (PCA) (Andersen, Gash and Avison, lap in time or space, while the non-Gaussiantity as-
1999) and Independent Components Analysis (ICA) sumption is required for the problem to be well de-
(Calhoun et al., 2001b; McKeown and Makeig, 1998), fined. An ICA of Y produces spatially independent
to identify task-related patterns of brain activation
without making any a priori assumptions about its 2
Note that throughout this section we have suppressed the
form. These methods involve decomposing the T × subject index previously used.
THE STATISTICAL ANALYSIS OF fMRI DATA 21
component images in the matrix X and is usually re- connectivity methods depend on two models: a neu-
ferred to as spatial ICA (sICA). Performing ICA on roanatomical model that describes the areas of in-
Y T instead produces temporally independent com- terest, and a model that describes how they are con-
ponent time series and is referred to as temporal nected. Commonly used methods include Structural
ICA (tICA). Equation Modeling (SEM) (McIntosh and Gonzalez-
Both PCA and ICA reduce the data to a lower- Lima, 1994) and Dynamic Causal Modeling (DCM)
dimensional space by capturing the most prominent (Friston, Harrison and Penny, 2003).
variations across the set of voxels. The components In SEM the emphasis lies on explaining the
may either reflect signals of interest or they may be variance-covariance structure of the data. It com-
dominated by artifacts; it is up to the user to deter- prises a set of regions and directed connections be-
mine which are important. Both ICA and PCA as- tween them. Further, path coefficients are defined
sume all variability results from the signal, as noise for each link representing the expected change in
is not included in the model formulation, though activity of one region given a unit change in the re-
noise-added versions of ICA that account for non- gion influencing it. The path coefficient indicates the
average influence across the time interval measured.
source noise have been introduced (Hyvarinen,
Algebraically, we can express an SEM model as
Karhunen and Oja, 2001). In ICA, interpretation
is made more difficult by the fact that the sign of (15) Y = MY + ε,
the independent components cannot be determined.
where Y is the data matrix, M is a matrix of path
In addition, the independent components are not
coefficients and ε is independent and identically dis-
ranked in order of appearance and it is therefore
tributed Gaussian noise. This can be rewritten
necessary to sift through all of the components to
search for the ones that are important. (16) Y = (I − M)−1 ε,
ICA has been successfully applied to single-subject where I represents the identity matrix. The solution
fMRI data. Extending the approach to allow for of the unknown coefficients contained in M is ob-
group inference is currently an active area of re- tained by studying the empirical covariance matrix
search. Several techniques for performing multisub- of Y. In SEM we seek to minimize the difference be-
ject ICA have been proposed. The GIFT approach tween the observed covariance matrix and the one
(Calhoun et al., 2001a) consists of temporally con- implied by the structure of the model (16). The pa-
catenating the data across subjects, and perform- rameters of the model are adjusted iteratively to
ing ICA decomposition on the resulting data set. minimize the difference between the observed and
In contrast, the tensor ICA (Beckmann and Smith, modeled covariance matrix. All inference rests on
2005) approach factors multisubject data as a trilin- the use of nested models and the likelihood ratio
ear combination of three outer products. This results test (LRT) to determine whether a path coefficient
in a three-way decomposition that represents the is reliably different from zero.
data in terms of their temporal, spatial and subject- A number of model assumptions are made when
dependent variations. Finally, Guo and Pagnoni (Guo formulating the SEM. The data are assumed to be
and Pagnoni, 2008) have proposed a unified frame- normally distributed and independent from sample
work for fitting group ICA models. They consider to sample. An important consequence of this as-
a class of models, assuming independence in the sumption is that SEM discounts temporal informa-
spatial domain, which incorporate existing methods tion. Consequently, permuted data sets produce the
such as the GIFT and tensor ICA as special cases. same path coefficients as the original data, which
In general, the ability to perform functional connec- is a major weakness, as the assumption of indepen-
tivity analysis in the multisubject domain promises dence is clearly violated in the analysis of a single
to be an area of intense research in the future. subject.
The measurements used in SEM analysis are based
6.3.2 Effective connectivity. In effective connec- on the observed BOLD response and this ultimately
tivity analysis a small set of regions with a pro- limits the scope of any interpretation that can be
posed set of directed connections are specified a pri- made at the neuronal level. Dynamic Casual Model-
ori, and tests are used to assess the statistical sig- ing (DCM) (Friston, Harrison and Penny, 2003) is an
nificance of individual connections. Most effective attempt to move the analysis to the neuronal level.
22 M. A. LINDQUIST
DCM uses a standard state-space design, and treats past values from one brain region in predicting cur-
the brain as a deterministic nonlinear dynamic sys- rent values in another. Granger causality provides
tem that is subject to inputs and produces outputs. information about the temporal precedence of rela-
It is based on a neuronal model of interacting corti- tionships among two regions, but is a misnomer be-
cal regions, supplemented with a forward model de- cause it does not actually provide information about
scribing how neuronal activity is transformed into causality.
measured hemodynamic response. Effective connec- Let x and y be two time courses of length T
tivity is parameterized in terms of the coupling extracted from two voxels. First, each time course
among unobserved neuronal activity in different re- is modeled using a linear autoregressive model of
gions. We can estimate these parameters by perturb- the pth order (where p ≤ T − 1). Second, each time
ing the system and measuring the response. Experi- course model is expanded using the autoregressive
mental inputs cause changes in effective connectivity terms from the other voxel, that is,
at the neuronal level which, in turn, causes changes X
p
in the observed hemodynamics. x(n) = a(i)x(n − i)
DCM uses a bilinear model for the neuronal level i=1
and an extended Balloon model (Buxton, Wong and (17)
X
p
Frank, 1998) for the hemodynamic level. In a DCM + b(i)y(n − i) + εx (n),
model the user specifies a set of experimental in- i=1
puts (the stimuli) and a set of outputs (the ac- X
p
tivity in each region). The task of the algorithm y(n) = a(i)y(n − i)
is then to estimate the parameters of the system, i=1
(18)
or the “state variables.” Each region has five state X
p
variables, four corresponding to the hemodynamic + b(i)x(n − i) + εy (n),
model and a fifth corresponding to neuronal activ- i=1
ity. The estimation process is then carried out using where both εx and εy are defined to be white noise
Bayesian methods, where Normal priors are placed processes. In this formulation the current value of
on the model parameters and an optimization scheme both time courses are assumed to depend both on
is used to estimate parameters that maximize the the past p-values of its own time course as well as
posterior probability. The posterior density is then the past p-values of the other time course. By fit-
used to make inferences about the significance of the ting both of these models, one can test whether the
connections between various brain regions. previous history of x has predictive value on the
While many researchers use SEM and DCM in or- time course y (and vice versa). If the model fit is
der to ascribe causality between different brain re- significantly improved by the inclusion of the cross-
gions, it is important to keep in mind that the tests autoregressive terms, it provides evidence that the
performed by both techniques are based on model history of one of the time courses can be used to pre-
fit rather than on the causality of the effect. Any dict the current value of the other and a Granger-
misspecification of the underlying model can lead causal relationship is inferred.
to erroneous conclusions. In particular, the exclu- While the analysis of brain connectivity has been
sion of important lurking variables (e.g., brain re- an area of intense research the past couple of years,
gions involved in the network but not included in the it has primarily been concerned with analyzing con-
model) can completely change the fit of the model nectivity between different brain regions. However,
and thereby affect both the direction and strength there is increasing interest in studying networks that
of the connections. Therefore, a great deal of care incorporate information about performance scores
needs to be taken when interpreting the results of on the task and/or physiological measures. For ex-
these methods. ample, it may be of interest to determine brain re-
Granger causality (Roebroeck, Formisano and gions that mediate changes in heart rate or increases
Goebel, 2005) is another approach that is consid- in reported stress in response to a task (Wager et al.,
ered to test effective connectivity. This approach 2008). The incorporation of this information is com-
does not rely on a priori specification of a struc- plicated by the fact that the different components
tural model, but rather quantifies the usefulness of included in the network measure different types of
THE STATISTICAL ANALYSIS OF fMRI DATA 23
responses, possibly on completely different time sophisticated, the need for novel statistical method-
scales. These types of extensions of current connec- ology will only increase. As we look toward the fu-
tivity methods are an area where statisticians can ture, there are many open statistical challenges that
play an important role in the future. need to be addressed for fMRI to reach its full poten-
6.3.3 Understanding connectivity. Ultimately, the tial. We have attempted to highlight many of these
distinction between functional and effective connec- challenges throughout, but below we discuss several
tivity is not entirely clear (Horwitz, 2003). If the additional topics in detail.
discriminating features are a directional model in
Classification and Prediction
which causal influences are specified and the ability
to draw conclusions about direct vs. indirect connec- There is a growing interest in using fMRI data as
tions, then many analyses (e.g., multiple regression) a tool for classification of mental disorders, brain-
might count as effective connectivity. In the end, based nosology and predicting the early onset of
it is not the label that is important, but the spe- disease. For example, the promise of using fMRI
cific assumptions and robustness and validity of in- as a screening device in detecting early onset of
ference afforded by each method. When performing Alzheimer’s disease holds obvious appeal. In addi-
connectivity studies researchers are often interested tion, there has been growing interest in developing
in making statements regarding causal links between
methods for predicting stimuli directly from func-
different brain regions. However, the idea of causal-
tional data. This would allow for the possibility to
ity is a very deep and important philosophical issue
infer information from the scans about the subjects
(Rubin, 1974; Pearl, 2000). Often a cavalier attitude
is taken in attributing causal effects and the differ- thought process and use brain activation patterns
entiation between explanation and causation is of- to characterize subjective human experience. A par-
ten blurred. Properly randomized experimental de- ticularly controversial application has been the idea
signs permit causal inferences of task manipulations of using fMRI for lie detection. The efficient pre-
on brain activity. However, in fMRI studies, all the diction of brain states is a challenging process that
brain variables are observed, and none are manipu- requires the application of novel statistical and ma-
lated. It is therefore difficult to make strong conclu- chine learning techniques. Various multivariate pat-
sions about causality and direct influences among tern classification approaches have successfully been
brain regions, because the validity of such conclu- applied to fMRI data in which a classifier is trained
sions is difficult to verify. In general, the area of to discriminate between different brain states and
brain connectivity is experiencing certain growing then used to predict the brain states in a new set
pains. There is a clear need for a discussion of the of fMRI data. To date, efficient preprocessing of the
goals of the analysis, as well as which model assump- data has been shown to be more important than the
tions are reasonable. To date, many of these critical actual method of prediction. However, this is an area
issues have not been properly addressed, and terms that without a doubt will be the focus of intense re-
such as causality are used inappropriately. In ad- search in the future and where statisticians are well
dition, there is also room for introducing new tech-
positioned to make a significant impact.
niques for testing connectivity and ultimately we be-
lieve ideas from casual inference will come to play a Multi-modal Techniques
role.
In neuroscience there is a general trend toward
7. ADDITIONAL OPEN STATISTICAL using multiple imaging methods in tandem to over-
CHALLENGES come some of the limitations of each method used in
Throughout this paper we have attempted to high- isolation. For example, recent advances in engineer-
light the many interesting and important statistical ing and signal processing allow electroencephalog-
problems that arise in fMRI research. It is clear that raphy (EEG) and fMRI data to be collected simul-
analyzing these massive data structures with their taneously (Goldman et al., 2000). EEG has an ex-
complex correlation patterns provides a serious chal- tremely high temporal resolution (on the order of
lenge for researchers in the field. Many standard sta- ms) but poor spatial resolution, while fMRI suf-
tistical techniques are neither appropriate nor feasi- fers from the opposite problem. By merging these
ble for direct application to fMRI data. As experi- two techniques, the hope is that one can get the
mental designs and imaging techniques become more best of both worlds. In another example, diffusion
24 M. A. LINDQUIST
tensor imaging (DTI), a popular technique for mea- The open statistical challenges discussed in this
suring directional diffusion and reconstructing the paper are by no means complete. Rather, we hope
fiber tracts of the brain (Le Bihan et al., 2001), that they illustrate some of the possible statistical
can be combined with fMRI to determine appro- problems that may be at the forefront of the statisti-
priate regions of the brain to include in subsequent cal analysis of fMRI data in the future. Other prob-
connectivity models. Finally, neuroimaging data are lems that will be of importance include the acqui-
being combined with transcranial magnetic stimula- sition and analysis of real-time fMRI data, the de-
tion (TMS) to integrate the ability of neuroimaging velopment of efficient nonlinear models for describ-
to observe brain activity with the ability of TMS ing the relationship between stimulus and BOLD re-
to manipulate brain function (Bohning et al., 1997). sponse, and the synthesis of findings across studies
Using this technique, one can simulate temporary (e.g., meta-analyses), among many other things.
“brain lesions” while the subject performs certain A critical job for any statistician involved in the
tasks. One can then attempt to infer causal rela- field will be stressing the need for researchers to
tionships by studying differences in a brain network stringently state and check model assumptions. Due
when a region is functioning and when it is not. to the enormity of the analysis, model assumptions
Combining information from different modalities are typically neither checked nor often even speci-
will be challenging to data analysts, if for no other
fied. However, for most models even relatively small
reason than that the amount of data will signifi-
amounts of mismodeling can result in severe power
cantly increase. In addition, since the different modal-
loss, and inflate the false positive rate beyond the
ities are measuring fundamentally different quanti-
nominal value. As inference may be incorrect if model
ties, it is not immediately clear how to best com-
assumptions do not hold, the lack of diagnostics calls
bine the information. This is an extremely impor-
some of the validity of the analysis into doubt. This
tant problem that has already started to become a
major area of research. is an area where statisticians must lead the way.
Katanoda, K., Matsuda, Y. and Sugishita, M. (2002). Mansfield, P., Howseman, A. and Ordidge, R. (1989).
A spatio-temporal regression model for the analysis of func- Volumar imaging using nmr spin echos: Echo-volumar
tional mri data. NeuroImage 17 1415–1428. imaging (evi) at 0.1 t. J. Phys. E 22 324–330.
Kim, D., Duong, T. and Kim, S. (2000). High-resolution McIntosh, A. and Gonzalez-Lima, F. (1994). Structural
mapping of iso-orientation columns by fmri. Nature Neu- equation modeling and its application to network analysis
roscience 3 164–169. in functional brain imaging. Human Brain Mapping 2 2–22.
Le Bihan, D., Mangin, J.-F., Poupon, C., Clark, C. A., McKeown, M. J. and Makeig, S. (1998). Analysis of fmri
Pappata, S., Molko, N. and Chabriat, H. (2001). Dif- data by blind separation into independant spatial compo-
fusion tensor imaging: Concepts and applications. Journal nents. Human Brain Mapping 6 160–188.
of Magnetic Resonance Imaging 13 534–546. Menon, R., Luknowsky, D. C. and Gati, J. S. (1998).
Liao, C., Worsley, K. J., Poline, J.-B., Duncan, G. H. Mental chronometry using latency resolved functional mri.
and Evans, A. C. (2002). Estimating the delay of the re- Proc. Natl. Acad Sci. USA 95 10902–10907.
sponse in fMRI data. NeuroImage 16 593–606. Menon, R., Ogawa, S., Hu, X., Strupp, J., Andersen,
Lindquist, M., Zhang, C., Glover, G., Shepp, L. and P. and Ugurbil, K. (1995). Bold based functional mri at
Yang, Q. (2006). A generalization of the two dimensional 4 tesla includes a capillary bed contribution: Echo-planar
prolate spheroidal wave function method for non-rectilinear imaging mirrors previous optical imaging using intrinsic
mri data acquisition methods. IEEE Transactions in Image signals. Magnetic Resonance in Medicine 33 453–459.
Processing 15 2792–2804. Miezin, F., Maccotta, L., Ollinger, J., Petersen, S.
Lindquist, M., Zhang, C., Glover, G. and Shepp, L. and Buckner, R. (2000). Characterizing the hemody-
(2008a). Acquisition and statistical analysis of rapid 3d namic response: Effects of presentation rate, sampling pro-
fmri data. Statist. Sinica 18 1395–1419. cedure, and the possibility of ordering brain activity based
Lindquist, M., Zhang, C., Glover, G. and Shepp, L. on relative timing. NeuroImage 11 735–759.
(2008b). Rapid three-dimensional functional magnetic res- Nichols, T. E. and Holmes, A. P. (2002). Nonparametric
onance imaging of the negative bold response. Journal of permutation tests for functional neuroimaging: A primer
Magnetic Resonance 191 100–111. with examples. Human Brain Mapping 15 1–25.
Lindquist, M. A., Loh, J., Atlas, L. and Wager, T. D. Ogawa, S., Tank, D., Menon, R., Ellerman, J., Kim, S.,
(2008c). Modeling the hemodynamic response function in Merkle, H. and Ugurbil, K. (1992). Intrinsic sig-
fmri: Efficiency, bias and mis-modeling. NeuroImage. To nal changes accompanying sensory simulation: Functional
appear. brain mapping and magnetic resonance imaging. Proc. Nat.
Lindquist, M. A. and Wager, T. D. (2007a). Modeling Acad. Sci. 89 5951–5955.
state-related fMRI activity using change-point theory. Neu- Pearl, J. (2000). Causality: Models, Reasoning, and Infer-
roImage 35 1125–1141. ence. Cambridge Univ. Press. MR1744773
Lindquist, M. A. and Wager, T. D. (2007b). Validity and Penny, W., Trujillo-Barreto, N. and Friston, K.
power in hemodynamic response modeling: A comparison (2005). Bayesian fMRI time series analysis with spatial pri-
study and a new approach. Human Brain Mapping 28 764– ors. NeuroImage 24 350–362.
784. Pruessmann, K., Weiger, M., Scheidegger, M. and Boe-
Liu, T. and Frank, L. (2004). Efficiency, power, and en- siger, P. (1999). Sense: Sensitivity encoding for fast mri.
tropy in event-related fmri with multiple trial types: Part Magnetic Resonance in Medicine 42 952–956.
i: Theory. NeuroImage 21 387–400. Purdon, P. L., Solo, V., Weissko, R. M. and Brown, E.
Logothetis, N. K. (2000). Can current fmri techniques re- (2001). Locally regularized spatiotemporal modeling and
veal the micro-architecture of cortex? Nature Neuroscience model comparison for functional MRI. NeuroImage 14 912–
3 413. 923.
Loh, J., Lindquist, M. A. and Wager, T. D. (2008). Resid- Riera, J. J., Watanabe, J., Kazuki, I., Naoki, M.,
ual analysis for detecting mis-modeling in fmri. Statist. Aubert, E., Ozaki, T. and Kawashima, R. (2004). A
Sinica. MR2468275 state-space model of the hemodynamic approach: Nonlin-
Lund, T. E., Madsen, K. H., Sidaros, K., Luo, W. L. ear filtering of bold signals. NeuroImage 21 547–567.
and Nichols, T. E. (2006). Non-white noise in fmri: Does Roberts, S. W. (1959). Control chart tests based on geo-
modelling have an impact? NeuroImage 29 54–66. metric moving averages. Technometrics 1 239–250.
Luo, W.-L. and Nichols, T. E. (2003). Diagnosis and explo- Robinson, L. F., Wager, T. and Lindquist, M. (2009).
ration of massively univariate neuroimaging models. Neu- Change point estimation in multi-subject fmri studies. To
roImage 19 1014–1032. appear.
Malonek, D. and Grinvald, A. (1996). The imaging spec- Roebroeck, A., Formisano, E. and Goebel, R. (2005).
troscopy reveals the interaction between electrical activity Mapping directed influence over the brain using granger
and cortical microcirculation: Implication for optical, pet causality and fmri. NeuroImage 25 230–242.
and mr functional brain imaging. Science 272 551–554. Rowe, D. B. and Logan, B. R. (2004). A complex way to
Mansfield, P., Coxon, R. and Hykin, J. (1995). Echo- compute fmri activation. NeuroImage 23 1078–1092.
volumar imaging (evi) at 3.0 t: First normal volunteer and Rubin, D. (1974). Estimating causal effects of treatments in
functional imaging results. Journal of Computer Assisted randomized and non-randomized studies. Journal of Edu-
Tomography 19 847–852. cational Psychology 66 688–701.
THE STATISTICAL ANALYSIS OF fMRI DATA 27
Schacter, D. L., Buckner, R. L., Koutstaal, W., Dale, genetic algorithm. NeuroImage 18 293–309.
A. M. and Rosen, B. R. (1997). Rectangular confidence Wager, T. D., Vazquez, A., Hernandez, L. and Noll,
regions for the means of multivariate normal distributions. D. C. (2005). Accounting for nonlinear BOLD effects in
Late onset of anterior prefrontal activity during true and fMRI: Parameter estimates and a model for prediction in
false recognition: An event-related fMRI study. NeuroIm- rapid event-related studies. NeuroImage 25 206–218.
age 6 259–269. Woolrich, M. W., Behrens, T. E., Beckmann, C. F. and
Sodickson, D. and Manning, W. (1997). Simultaneous Smith, S. M. (2005). Mixture models with adaptive spatial
acquisition of spatial harmonics (smash): Fast imaging regularization for segmentation with an application to fmri
with radiofrequency coil arrays. Magnetic Resonance in data. IEEE Transactions on Medical Imaging 24 1–11.
Medicine 38 591–603. Woolrich, M. W., Behrens, T. E. and Smith, S. M.
Thompson, J., Peterson, M. and Freeman, R. (2004). (2004). Constrained linear basis sets for HRF modelling
High-resolution neurometabolic coupling revealed by focal using variational Bayes. NeuroImage 21 1748–1761.
activation of visual neurons. Nature Neuroscience 7 919– Worsley, K. J. and Friston, K. J. (1995). Analysis of fMRI
920. time-series revisited-again. NeuroImage 2 173–181.
Vazquez, A. L., Cohen, E. R., Gulani, V., Hernandez- Worsley, K. J., Taylor, J. E., Tomaiuolo, F. and
Garcia, L., Zheng, Y., Lee, G. R., Kim, S. G., Grot- Lerch, J. (2004). Unified univariate and multivariate ran-
berg, J. B. and Noll, D. C. (2006). Vascular dynamics dom field theory. NeuroImage 23 189–195.
and bold fmri: Cbf level effects and analysis considerations. Yacoub, E., Le, T. and Hu, X. (1998). Detecting the early
NeuroImage 32 1642–1655. response at 1.5 tesla. NeuroImage 7 S266.
Wager, T. D., Davidson, M. L., Hughes, B. L., Zarahn, E. (2002). Using larger dimensional signal subspaces
Lindquist, M. A. and Ochsner, K. N. (2008). Prefrontal- to increase sensitivity in fmri time series analyses. Hum.
subcortical pathways mediating successful emotion regula- Brain Mapp. 17 13–16.
tion. Neuron 59 1037–1050.
Wager, T. D. and Nichols, T. E. (2003). Optimization of
experimental design in fmri: A general framework using a