Proc Mixed - Sas User Guide
Proc Mixed - Sas User Guide
SAS/STAT 15.3
User’s Guide
The MIXED Procedure
®
SAS Documentation
January 31, 2023
This document is an individual chapter from SAS/STAT® 15.3 User’s Guide.
The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2023. SAS/STAT® 15.3 User’s Guide. Cary, NC:
SAS Institute Inc.
SAS/STAT® 15.3 User’s Guide
Copyright © 2023, SAS Institute Inc., Cary, NC, USA
All Rights Reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by
any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute
Inc.
For a web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time
you acquire this publication.
The scanning, uploading, and distribution of this book via the internet or any other means without the permission of the publisher is
illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic
piracy of copyrighted materials. Your support of others’ rights is appreciated.
January 2023
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the
USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
SAS software may be provided with certain third-party software, including but not limited to open source software, which is licensed
under its applicable third-party software license agreement. For license information about third-party software distributed with SAS
software, refer to Third-Party Software Reference | SAS Support.
Chapter 84
The MIXED Procedure
Contents
Overview: MIXED Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6602
Basic Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6603
Notation for the Mixed Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6604
PROC MIXED Contrasted with Other SAS Procedures . . . . . . . . . . . . . . . . . 6605
Getting Started: MIXED Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6606
Clustered Data Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6606
Syntax: MIXED Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6612
PROC MIXED Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6614
BY Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6626
CLASS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6627
CODE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6628
CONTRAST Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6628
ESTIMATE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6631
ID Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6633
LSMEANS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6634
LSMESTIMATE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6640
MODEL Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6641
PARMS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6656
PRIOR Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6659
RANDOM Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6663
REPEATED Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6668
SLICE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6682
STORE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6682
WEIGHT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6682
Details: MIXED Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6683
Mixed Models Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6683
Parameterization of Mixed Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 6695
Residuals and Influence Diagnostics . . . . . . . . . . . . . . . . . . . . . . . . . . . 6700
Default Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6708
ODS Table Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6712
ODS Graphics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6717
Computational Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6722
Examples: MIXED Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6726
Example 84.1: Split-Plot Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6726
Example 84.2: Repeated Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . 6731
Example 84.3: Plotting the Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . 6742
6602 F Chapter 84: The MIXED Procedure
The means (expected values) of the data are linear in terms of a certain set of parameters.
The variances and covariances of the data are in terms of a different set of parameters, and they exhibit
a structure matching one of those available in PROC MIXED.
Since Gaussian data can be modeled entirely in terms of their means and variances/covariances, the two
sets of parameters in a mixed linear model actually specify the complete probability distribution of the data.
The parameters of the mean model are referred to as fixed-effects parameters, and the parameters of the
variance-covariance model are referred to as covariance parameters.
The fixed-effects parameters are associated with known explanatory variables, as in the standard linear model.
These variables can be either qualitative (as in the traditional analysis of variance) or quantitative (as in
standard linear regression). However, the covariance parameters are what distinguishes the mixed linear
model from the standard linear model.
The need for covariance parameters arises quite frequently in applications, the following being the two most
typical scenarios:
The experimental units on which the data are measured can be grouped into clusters, and the data from
a common cluster are correlated.
Repeated measurements are taken on the same experimental unit, and these repeated measurements are
correlated or exhibit variability that changes.
Basic Features F 6603
The first scenario can be generalized to include one set of clusters nested within another. For example,
if students are the experimental unit, they can be clustered into classes, which in turn can be clustered
into schools. Each level of this hierarchy can introduce an additional source of variability and correlation.
The second scenario occurs in longitudinal studies, where repeated measurements are taken over time.
Alternatively, the repeated measures could be spatial or multivariate in nature.
PROC MIXED provides a variety of covariance structures to handle the previous two scenarios. The most
common of these structures arises from the use of random-effects parameters, which are additional unknown
random variables assumed to affect the variability of the data. The variances of the random-effects parameters,
commonly known as variance components, become the covariance parameters for this particular structure.
Traditional mixed linear models contain both fixed- and random-effects parameters, and, in fact, it is the
combination of these two types of effects that led to the name mixed model. PROC MIXED fits not only
these traditional variance component models but numerous other covariance structures as well.
PROC MIXED fits the structure you select to the data by using the method of restricted maximum likelihood
(REML), also known as residual maximum likelihood. It is here that the Gaussian assumption for the data is
exploited. Other estimation methods are also available, including maximum likelihood and MIVQUE0. The
details behind these estimation methods are discussed in subsequent sections.
After a model has been fit to your data, you can use it to draw statistical inferences via both the fixed-effects
and covariance parameters. PROC MIXED computes several different statistics suitable for generating
hypothesis tests and confidence intervals. The validity of these statistics depends upon the mean and variance-
covariance model you select, so it is important to choose the model carefully. Some of the output from PROC
MIXED helps you assess your model and compare it with others.
Basic Features
PROC MIXED provides easy accessibility to numerous mixed linear models that are useful in many common
statistical analyses. In the style of the GLM procedure, PROC MIXED fits the specified mixed linear model
and produces appropriate statistics.
Here are some basic features of PROC MIXED:
GLM-type grammar, by using MODEL, RANDOM, and REPEATED statements for model specifica-
tion and CONTRAST, ESTIMATE, and LSMEANS statements for inferences
appropriate standard errors for all specified estimable linear combinations of fixed and random effects,
and corresponding t and F tests
subject and group effects that enable blocking and heterogeneity, respectively
PROC MIXED uses the Output Delivery System (ODS), a SAS subsystem that provides capabilities for
displaying and controlling the output from SAS procedures. ODS enables you to convert any of the output
from PROC MIXED into a SAS data set. See the section “ODS Table Names” on page 6712.
The MIXED procedure uses ODS Graphics to create graphs as part of its output. For general information
about ODS Graphics, see Chapter 24, “Statistical Graphics Using ODS.” For specific information about the
statistical graphics available with the MIXED procedure, see the PLOTS= option in the PROC MIXED
statement and the section “ODS Graphics” on page 6717.
y D Xˇ C
In this expression, y represents a vector of observed data, ˇ is an unknown vector of fixed-effects parameters
with known design matrix X, and is an unknown random error vector modeling the statistical noise around
Xˇ. The focus of the standard linear model is to model the mean of y by using the fixed-effects parameters ˇ.
The residual errors are assumed to be independent and identically distributed Gaussian random variables
with mean 0 and variance 2 .
The mixed model generalizes the standard linear model as follows:
y D Xˇ C Z C
Here, is an unknown vector of random-effects parameters with known design matrix Z, and is an unknown
random error vector whose elements are no longer required to be independent and homogeneous.
To further develop this notion of variance modeling, assume that and are Gaussian random variables that
are uncorrelated and have expectations 0 and variances G and R, respectively. The variance of y is thus
V D ZGZ0 C R
Note that, when R D 2 I and Z D 0, the mixed model reduces to the standard linear model.
You can model the variance of the data, y, by specifying the structure (or form) of Z, G, and R. The model
matrix Z is set up in the same fashion as X, the model matrix for the fixed-effects parameters. For G and R,
you must select some covariance structure. Possible covariance structures include the following:
variance components
autoregressive
PROC MIXED Contrasted with Other SAS Procedures F 6605
spatial
general linear
factor analytic
By appropriately defining the model matrices X and Z, as well as the covariance structure matrices G and R,
you can perform numerous mixed model analyses.
the procedure supports certain Kronecker-type covariance structures. These features are not available in the
GLIMMIX procedure. The GLIMMIX procedure, on the other hand, accommodates nonnormal data and
offers a broader array of post-processing features than the MIXED procedure.
data heights;
input Family Gender$ Height @@;
datalines;
1 F 67 1 F 66 1 F 64 1 M 71 1 M 72 2 F 63
2 F 63 2 F 67 2 M 69 2 M 68 2 M 70 3 F 63
3 M 64 4 F 67 4 F 66 4 M 67 4 M 67 4 M 69
;
The response variable Height measures the heights (in inches) of 18 individuals. The individuals are classified
according to Family and Gender. You can perform a traditional two-way analysis of variance of these data
with the following PROC MIXED statements:
Model Information
Data Set WORK.HEIGHTS
Dependent Variable Height
Covariance Structure Diagonal
Estimation Method REML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Residual
The “Class Level Information” table in Figure 84.2 lists the levels of all variables specified in the CLASS
statement. You can check this table to make sure that the data are correct.
Figure 84.2 Class Level Information
Class Level
Information
Class Levels Values
Family 4 1234
Gender 2 FM
The “Dimensions” table in Figure 84.3 lists the sizes of relevant matrices. This table can be useful in
determining CPU time and memory requirements.
Dimensions
Covariance Parameters 1
Columns in X 15
Columns in Z 0
Subjects 1
Max Obs per Subject 18
The “Number of Observations” table in Figure 84.4 displays information about the sample size being
processed.
Number of Observations
Number of Observations Read 18
Number of Observations Used 18
Number of Observations Not Used 0
The “Covariance Parameter Estimates” table in Figure 84.5 displays the estimate of 2 for the model.
6608 F Chapter 84: The MIXED Procedure
Covariance
Parameter
Estimates
Cov Parm Estimate
Residual 2.1000
The “Fit Statistics” table in Figure 84.6 lists several pieces of information about the fitted mixed model,
including values derived from the computed value of the restricted/residual likelihood.
Fit Statistics
-2 Res Log Likelihood 41.6
AIC (Smaller is Better) 43.6
AICC (Smaller is Better) 44.1
BIC (Smaller is Better) 43.9
The “Type 3 Tests of Fixed Effects” table in Figure 84.7 displays significance tests for the three effects listed
in the MODEL statement. The Type 3 F statistics and p-values are the same as those produced by the GLM
procedure. However, because PROC MIXED uses a likelihood-based estimation scheme, it does not directly
compute or display sums of squares for this analysis.
The Type 3 test for Family*Gender effect is not significant at the 5% level, but the tests for both main effects
are significant.
The important assumptions behind this analysis are that the data are normally distributed and that they are
independent with constant variance. For these data, the normality assumption is probably realistic since
the data are observed heights. However, since the data occur in clusters (families), it is very likely that
observations from the same family are statistically correlated—that is, not independent.
The methods implemented in PROC MIXED are still based on the assumption of normally distributed data,
but you can drop the assumption of independence by modeling statistical correlation in a variety of ways.
You can also model variances that are heterogeneous—that is, nonconstant.
For the height data, one of the simplest ways of modeling correlation is through the use of random effects.
Here the family effect is assumed to be normally distributed with zero mean and some unknown variance.
This is in contrast to the previous model in which the family effects are just constants, or fixed effects.
Declaring Family as a random effect sets up a common correlation among all observations having the same
level of Family.
Clustered Data Example F 6609
Declaring Family*Gender as a random effect models an additional correlation between all observations that
have the same level of both Family and Gender. One interpretation of this effect is that a female in a certain
family exhibits more correlation with the other females in that family than with the other males, and likewise
for a male. With the height data, this model seems reasonable.
The statements to fit this correlation model in PROC MIXED are as follows:
proc mixed;
class Family Gender;
model Height = Gender;
random Family Family*Gender;
run;
Note that Family and Family*Gender are now listed in the RANDOM statement. The dummy variables
associated with them are used to construct the Z matrix in the mixed model. The X matrix now consists of a
column of 1s and the dummy variables for Gender.
The G matrix for this model is diagonal, and it contains the variance components for both Family and
Family*Gender. The R matrix is still assumed to equal 2 I, where I is an identity matrix.
Model Information
Data Set WORK.HEIGHTS
Dependent Variable Height
Covariance Structure Variance Components
Estimation Method REML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Containment
The “Model Information” table in Figure 84.8 shows that the containment method is used to compute the
degrees of freedom for this analysis. This is the default method when a RANDOM statement is used; for
more information, see the description of the DDFM= option.
Class Level
Information
Class Levels Values
Family 4 1234
Gender 2 FM
The “Class Level Information” table in Figure 84.9 is the same as before. The “Dimensions” table in
Figure 84.10 displays the new sizes of the X and Z matrices.
6610 F Chapter 84: The MIXED Procedure
Dimensions
Covariance Parameters 3
Columns in X 3
Columns in Z 12
Subjects 1
Max Obs per Subject 18
Number of Observations
Number of Observations Read 18
Number of Observations Used 18
Number of Observations Not Used 0
The “Iteration History” table in Figure 84.11 displays the results of the numerical optimization of the
restricted/residual likelihood. Six iterations are required to achieve the default convergence criterion of 1E–8.
Iteration History
Iteration Evaluations -2 Res Log Like Criterion
0 1 74.11074833
1 2 71.51614003 0.01441208
2 1 71.13845990 0.00412226
3 1 71.03613556 0.00058188
4 1 71.02281757 0.00001689
5 1 71.02245904 0.00000002
6 1 71.02245869 0.00000000
The “Covariance Parameter Estimates” table in Figure 84.12 displays the results of the REML fit. The
Estimate column contains the estimates of the variance components for Family and Family*Gender, as well
as the estimate of 2 .
Figure 84.12 Covariance Parameter Estimates (REML)
Covariance Parameter
Estimates
Cov Parm Estimate
Family 2.4010
Family*Gender 1.7657
Residual 2.1668
The “Fit Statistics” table in Figure 84.13 contains basic information about the REML fit.
Clustered Data Example F 6611
Fit Statistics
-2 Res Log Likelihood 71.0
AIC (Smaller is Better) 77.0
AICC (Smaller is Better) 79.0
BIC (Smaller is Better) 75.2
The “Type 3 Tests of Fixed Effects” table in Figure 84.14 contains a significance test for the lone fixed
effect, Gender. Note that the associated p-value is not nearly as significant as in the previous analysis. This
illustrates the importance of correctly modeling correlation in your data.
An additional benefit of the random effects analysis is that it enables you to make inferences about gender
that apply to an entire population of families, whereas the inferences about gender from the analysis where
Family and Family*Gender are fixed effects apply only to the particular families in the data set.
PROC MIXED thus offers you the ability to model correlation directly and to make inferences about fixed
effects that apply to entire populations of random effects.
6612 F Chapter 84: The MIXED Procedure
The PROC MIXED statement invokes the MIXED procedure. Table 84.2 summarizes the options available
in the PROC MIXED statement. These and other options in the PROC MIXED statement are then described
fully in alphabetical order.
Option Description
Basic Options
DATA= Specifies input data set
METHOD= Specifies the estimation method
NOPROFILE Includes scale parameter in optimization
ORDER= Determines the sort order of CLASS variables
Displayed Output
ASYCORR Displays asymptotic correlation matrix of covariance parameter
estimates
ASYCOV Displays asymptotic covariance matrix of covariance parameter
estimates
CL Requests confidence limits for covariance parameter estimates
COVTEST Displays asymptotic standard errors and Wald tests for covariance
parameters
PROC MIXED Statement F 6615
Option Description
IC Displays a table of information criteria
ITDETAILS Displays estimates and gradients added to “Iteration History”
LOGNOTE Writes periodic status notes to the log
MMEQ Displays mixed model equations
MMEQSOL Displays the solution to the mixed model equations
NOCLPRINT Suppresses “Class Level Information” completely or in parts
NOITPRINT Suppresses “Iteration History” table
PLOTS= Produces ODS statistical graphics
RANKS= Displays a table of ranks of design matrices X and (XZ)
RATIO Produces ratio of covariance parameter estimates with residual
variance
Optimization Options
MAXFUNC= Specifies the maximum number of likelihood evaluations
MAXITER= Specifies the maximum number of iterations
Computational Options
CONVF Requests and tunes the relative function convergence criterion
CONVG Requests and tunes the relative gradient convergence criterion
CONVH Requests and tunes the relative Hessian convergence criterion
DFBW Selects between-within degree of freedom method
EMPIRICAL Computes empirical (“sandwich”) estimators
NOBOUND Unbounds covariance parameter estimates
RIDGE= Specifies starting value for minimum ridge value
SCORING= Applies Fisher scoring where applicable
ABSOLUTE
makes the convergence criterion absolute. By default, it is relative (divided by the current objective
function value). See the CONVF, CONVG, and CONVH options in this section for a description of
various convergence criteria.
ALPHA=number
requests that confidence limits be constructed for the covariance parameter estimates with confidence
level 1 number . The value of number must be between 0 and 1; the default is 0.05.
ANOVAF
The ANOVAF option computes F tests in models with REPEATED statement and without RANDOM
statement by a method similar to that of Brunner, Domhof, and Langer (2002). The method consists of
computing special F statistics and adjusting their degrees of freedom. The technique is a generalization
of the Greenhouse-Geisser adjustment in MANOVA models (Greenhouse and Geisser 1959). For more
details, see the section “F Tests With the ANOVAF Option” on page 6693.
6616 F Chapter 84: The MIXED Procedure
ASYCORR
produces the asymptotic correlation matrix of the covariance parameter estimates. It is computed from
the corresponding asymptotic covariance matrix (see the description of the ASYCOV option, which
follows). The name of the “Asymptotic Correlation” table is AsyCorr.
ASYCOV
requests that the asymptotic covariance matrix of the covariance parameters be displayed. By default,
this matrix is the observed inverse Fisher information matrix, which equals 2H 1 , where H is the
Hessian (second derivative) matrix of the objective function. For more information about this matrix,
see the section “Covariance Parameter Estimates” on page 6710. When you use the SCORING=
option and PROC MIXED converges without stopping the scoring algorithm, PROC MIXED uses the
expected Hessian matrix to compute the covariance matrix instead of the observed Hessian. The ODS
name of the “Asymptotic Covariance” table is AsyCov.
gk 0 H k 1 gk
number
jfk j
where fk is the value of the objective function, gk is the gradient (first derivative) of the objective
function, and Hk is the Hessian (second derivative) of the objective function, all at iteration k.
If Hk is singular, then PROC MIXED uses the following relative criterion:
gk0 gk
number
jfk j
To prevent the division by jfk j, use the ABSOLUTE option. The default convergence criterion is
CONVH, and the default tolerance is 1E–8.
COVTEST
produces asymptotic standard errors and Wald Z-tests for the covariance parameter estimates.
DATA=SAS-data-set
names the SAS data set to be used by PROC MIXED. The default is the most recently created data set.
DFBW
has the same effect as the DDFM=BW option in the MODEL statement.
EMPIRICAL
computes the estimated variance-covariance matrix of the fixed-effects parameters by using the
asymptotically consistent estimator described in Huber (1967); White (1980); Liang and Zeger (1986);
Diggle, Liang, and Zeger (1994). This estimator is commonly referred to as the “sandwich” estimator,
and it is computed as follows:
S
!
X
0b 1
.X V X/ ci 1 bi bi 0 V
X0i V ci 1 Xi .X0 V
b 1
X/
i D1
Here, bi D yi Xi b̌, S is the number of subjects, and matrices with an i subscript are those for the ith
subject. You must include the SUBJECT= option in either a RANDOM or REPEATED statement for
this option to take effect.
When you specify the EMPIRICAL option, PROC MIXED adjusts all standard errors and test statis-
tics involving the fixed-effects parameters. This changes output in the following tables (listed in
Table 84.26): Contrast, CorrB, CovB, Diffs, Estimates, InvCovB, LSMeans, Slices, SolutionF, Tests1–
Tests3. The OUTP= and OUTPM= data sets are also affected. Finally, the Satterthwaite and Kenward-
Roger degrees of freedom methods are not available if you specify the EMPIRICAL option.
6618 F Chapter 84: The MIXED Procedure
IC
displays a table of various information criteria. The criteria are all in smaller-is-better form, and are
described in Table 84.3.
Here ` denotes the maximum value of the (possibly restricted) log likelihood, d the dimension of the
model, and n the number of observations. In SAS 6 of SAS/STAT software, n equals the number of
valid observations for maximum likelihood estimation and n p for restricted maximum likelihood
estimation, where p equals the rank of X. In later versions, n equals the number of effective subjects as
displayed in the “Dimensions” table, unless this value equals 1, in which case n equals the number
of levels of the first random effect you specify in a RANDOM statement. If the number of effective
subjects equals 1 and you have no RANDOM statements, then n reverts to the SAS 6 values. For AICC
(a finite-sample corrected version of AIC), n equals the SAS 6 values of n, unless this number is less
than d + 2, in which case it equals d + 2. When n 1, the value of the HQIC criterion is 2`. When
n=0, the values of the BIC and CAIC criteria are 2` and 2` C d , respectively.
For restricted likelihood estimation, d equals q, the effective number of estimated covariance parameters.
In SAS 6, when a parameter estimate lies on a boundary constraint, then it is still included in the
calculation of d, but in later versions it is not. The most common example of this behavior is when a
variance component is estimated to equal zero. For maximum likelihood estimation, d equals q C p
where p is by default the sum of the Type 3 degrees of freedom associated with each fixed effect or the
rank of X if you specify NOTEST option. The value of d is displayed in the “Information Criteria”
table as the value of Parms variable; see Table 84.27.
The ODS name of the “Information Criteria” table is InfoCrit.
INFO
is a default option. The creation of the “Model Information,” “Dimensions,” and “Number of Observa-
tions” tables can be suppressed by using the NOINFO option.
Note that in SAS 6 this option displays the “Model Information” and “Dimensions” tables.
ITDETAILS
displays the parameter values at each iteration and enables the writing of notes to the SAS log pertaining
to “infinite likelihood” and “singularities” during Newton-Raphson iterations.
PROC MIXED Statement F 6619
LOGNOTE
writes periodic notes to the log describing the current status of computations. It is designed for use
with analyses requiring extensive CPU resources.
MAXFUNC=number
specifies the maximum number of likelihood evaluations in the optimization process. The default is
150.
MAXITER=number
specifies the maximum number of iterations. The default is 50.
MMEQ
requests that the coefficient matrix and the right-hand side of the mixed model equations be displayed.
If G
b is nonsingular, the coefficient matrix and the right-hand side have the following form:
" # " 0 1 #
X0 b
R 1X X0 b
R 1Z b̌ Xb R y
0b 1 0b 1 D
ZR X ZR ZCG b 1 b Z0 b
R 1y
If G
b is singular, the coefficient matrix and right-hand side have the following modified form:
" # " #
X0 b
R 1X X0 b
R 1 ZGb b̌ X0 bR 1y
D
Gb0 Z0 b
R 1X G b0 Z0 b
R 1 ZGbCG
b b0 Z0 b
G R 1y
See the section “Estimating Fixed and Random Effects in the Mixed Model” on page 6690 for further
information about these equations.
MMEQSOL
requests that a solution to the mixed model equations be produced, in addition to the inverted coefficients
matrix. If G
b is nonsingular, the formula is the same as the preceding description of the MMEQ option.
If G
b is singular, b̌ and Gb are displayed in addition to the inverse of the modified coefficient matrix.
See the section “Estimating Fixed and Random Effects in the Mixed Model” on page 6690 for further
information about these equations and solution transformation.
6620 F Chapter 84: The MIXED Procedure
NOBOUND
has the same effect as the NOBOUND option in the PARMS statement.
NOINFO
suppresses the display of the “Model Information,” “Dimensions,” and “Number of Observations”
tables.
NOITPRINT
suppresses the display of the “Iteration History” table.
NOPROFILE
includes the residual variance as part of the Newton-Raphson iterations. This option applies only
to models that have a residual variance parameter. By default, this parameter is profiled out of the
likelihood calculations, except when you have specified the HOLD= option in the PARMS statement.
ORD
displays ordinates of the relevant distribution in addition to p-values. The ordinate can be viewed as an
approximate odds ratio of hypothesis probabilities.
PLOTS < (global-plot-options ) > < =plot-request < (options ) > >
PLOTS < (global-plot-options ) > < = (plot-request< (options) >< . . . plot-request< (options) > >) >
requests that the MIXED procedure produce statistical graphics via the Output Delivery System,
provided that ODS Graphics is enabled.
ODS Graphics must be enabled before plots can be requested. For example:
For more information about enabling and disabling ODS Graphics, see the section “Enabling and
Disabling ODS Graphics” on page 663 in Chapter 24, “Statistical Graphics Using ODS.”
For examples of the basic statistical graphics produced by the MIXED procedure and aspects of their
computation and interpretation, see the section “ODS Graphics” on page 6717.
The global-plot-options apply to all relevant plots generated by the MIXED procedure. The global-plot-
options supported by the MIXED procedure follow.
OBSNO
uses the data set observation number to identify observations in tooltips, provided that the
observation number can be determined. Otherwise, the number displayed in tooltips is the index
of the observation as it is used in the analysis within the BY group.
ONLY
suppresses the default plots. Only the plots specifically requested are produced.
UNPACKPANEL
UNPACK
displays each graph separately. (By default, some graphs can appear together in a single panel.)
MAXPOINTS=NONE | number
specifies that plots with elements that require processing more than number points be sup-
pressed. The default is MAXPOINTS=5000. No plots are suppressed if you specify MAX-
POINTS=NONE.
The following listing describes the specific plots and their options.
6622 F Chapter 84: The MIXED Procedure
ALL
requests that all plots appropriate for the particular analysis be produced.
NONE
suppresses all plots.
The residual-plot-options determine both the composition of the panels and the type of residuals
being plotted.
BOX
BOXPLOT
replaces the inset of summary statistics in the lower-right corner of the panel with a box plot
of the residual (the “PROC GLIMMIX look”).
CONDITIONAL
BLUP
constructs plots from conditional residuals.
MARGINAL
NOBLUP
constructs plots from marginal residuals.
UNPACK
produces separate plots from the elements of the panel. The inset statistics are not part of
the unpack operation.
The boxplot-options determine whether box plots are produced for residuals or for residuals
and observed values, and for which model effects the box plots are constructed. The available
boxplot-options are as follows.
6624 F Chapter 84: The MIXED Procedure
CONDITIONAL
BLUP
constructs box plots from conditional residuals—that is, residuals using the estimated BLUPs
of random effects.
FIXED
produces box plots for all fixed effects (MODEL statement) consisting entirely of classifica-
tion variables
GROUP
produces box plots for all GROUP= effects (RANDOM and REPEATED statement) consist-
ing entirely of classification variables
MARGINAL
NOBLUP
constructs box plots from marginal residuals.
NPANEL=number
provides the ability to break a box plot into multiple graphics. If number is negative, no
balancing of the number of boxes takes place and number is the maximum number of boxes
per graphic. If number is positive, the number of boxes per graphic is balanced. For example,
suppose variable A has 125 levels, and consider the following statements:
The box balancing results in six plots with 18 boxes each and one plot with 17 boxes. If
number is zero, and this is the default, all levels of the effect are displayed in a single plot.
OBSERVED
adds box plots of the observed data for the selected effects.
RANDOM
produces box plots for all random effects (RANDOM statement) consisting entirely of clas-
sification variables. This does not include effects specified in the GROUP= or SUBJECT=
options of the RANDOM statement.
REPEATED
produces box plots for the repeated effects (REPEATED statement). This does not include
effects specified in the GROUP= or SUBJECT= options of the REPEATED statement.
STUDENT
constructs box plots from studentized residuals rather than from raw residuals.
PROC MIXED Statement F 6625
SUBJECT
produces box plots for all SUBJECT= effects (RANDOM and REPEATED statement)
consisting entirely of classification variables.
USEINDEX
uses as the horizontal axis label the index of the effect level rather than the formatted
value(s). For classification variables with many levels or model effects that involve multiple
classification variables, the formatted values identifying the effect levels can take up too
much space as axis tick values, leading to extensive thinning. The USEINDEX option
replaces tick values constructed from formatted values with the internal level number.
You can list a plot request one or more times with different options. For example, the following
statements request a panel of marginal raw residuals, individual plots generated from a panel of the
conditional raw residuals, and a panel of marginal studentized residuals:
The inset of residual statistics is replaced in this last panel by a box plot of the studentized residuals.
Similarly, if you specify the INFLUENCE option in the MODEL statement, then the following
statements request statistical graphics of fixed-effects deletion estimates (in a panel), covariance
parameter deletion estimates (unpacked in individual plots), and box plots for the SUBJECT= and
fixed classification effects based on residuals and observed values:
The STATICMAP image format enables tooltips that show, for example, values of influence diagnostics
associated with a particular delete estimate.
This concludes the syntax section for the PLOTS= option in the PROC MIXED statement.
RANKS
displays the ranks of design matrices X and (XZ).
6626 F Chapter 84: The MIXED Procedure
RATIO
produces the ratio of the covariance parameter estimates to the estimate of the residual variance when
the latter exists in the model.
RIDGE=number
specifies the starting value for the minimum ridge value used in the Newton-Raphson algorithm. The
default is 0.3125.
SIGITER
is an alias for the NOPROFILE option.
UPDATE
is an alias for the LOGNOTE option.
BY Statement
BY variables ;
You can specify a BY statement in PROC MIXED to obtain separate analyses of observations in groups that
are defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be
sorted in order of the BY variables. If you specify more than one BY statement, only the last one specified is
used.
If your input data set is not sorted in ascending order, use one of the following alternatives:
Sort the data by using the SORT procedure with a similar BY statement.
Specify the NOTSORTED or DESCENDING option in the BY statement in the MIXED procedure.
The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged
in groups (according to values of the BY variables) and that these groups are not necessarily in
alphabetical or increasing numeric order.
Create an index on the BY variables by using the DATASETS procedure (in Base SAS software).
Because sorting the data changes the order in which PROC MIXED reads observations, the sort order for the
levels of the CLASS variable might be affected if you have specified ORDER=DATA in the PROC MIXED
statement. This, in turn, affects specifications in the CONTRAST or ESTIMATE statement.
For more information about BY-group processing, see the discussion in SAS Language Reference: Concepts.
For more information about the DATASETS procedure, see the discussion in the Base SAS Procedures Guide.
CLASS Statement F 6627
CLASS Statement
CLASS variable < (REF= option) > . . . < variable < (REF= option) > > < / global-options > ;
The CLASS statement names the classification variables to be used in the model. Typical classification
variables are Treatment, Sex, Race, Group, and Replication. If you use the CLASS statement, it must appear
before the MODEL statement.
Classification variables can be either character or numeric. By default, class levels are determined from the
entire set of formatted values of the CLASS variables.
N OTE : Prior to SAS 9, class levels were determined by using no more than the first 16 characters of the
formatted values. To revert to this previous behavior, you can use the TRUNCATE option in the CLASS
statement.
In any case, you can use formats to group values into levels. See the discussion of the FORMAT procedure
in the Base SAS Procedures Guide and the discussions of the FORMAT statement and SAS formats in SAS
Formats and Informats: Reference. You can adjust the order of CLASS variable levels with the ORDER=
option in the PROC MIXED statement.
You can specify the following REF= option to indicate how the levels of an individual classification variable
are to be ordered by enclosing it in parentheses after the variable name:
You can specify the following global-options in the CLASS statement after a slash (/):
REF=FIRST | LAST
specifies a level of all classification variables to be put at the end of the list of levels. This level thus
corresponds to the reference level in the usual interpretation of the estimates with PROC MIXED’s
singular parameterization. Specify REF=FIRST to designate that the first ordered level for each
classification variable serve as the reference. Specify REF=LAST to designate that the last ordered
level serve as the reference. This option applies to all the variables specified in the CLASS statement. To
specify different reference levels for different classification variables, use REF= options for individual
variables.
TRUNCATE
specifies that class levels be determined by using only up to the first 16 characters of the formatted
values of CLASS variables. When formatted values are longer than 16 characters, you can use this
option to revert to the levels as determined in releases prior to SAS 9.
6628 F Chapter 84: The MIXED Procedure
CODE Statement
CODE < options > ;
The CODE statement writes SAS DATA step code for computing predicted values of the fitted model either
to a file or to a catalog entry. This code can then be included in a DATA step to score new data.
Table 84.4 summarizes the options available in the CODE statement.
Option Description
CATALOG= Names the catalog entry where the generated code is saved
DUMMIES Retains the dummy variables in the data set
ERROR Computes the error function
FILE= Names the file where the generated code is saved
FORMAT= Specifies the numeric format for the regression coefficients
GROUP= Specifies the group identifier for array names and statement labels
IMPUTE Imputes predicted values for observations with missing or invalid
covariates
LINESIZE= Specifies the line size of the generated code
LOOKUP= Specifies the algorithm for looking up CLASS levels
RESIDUAL Computes residuals
For details about the syntax of the CODE statement, see the section “CODE Statement” on page 409 in
Chapter 20, “Shared Concepts and Topics.”
CONTRAST Statement
CONTRAST ’label’ < fixed-effect values . . . >
< | random-effect values . . . >, . . . < / options > ;
The CONTRAST statement provides a mechanism for obtaining custom hypothesis tests. It is patterned after
the CONTRAST statement in PROC GLM, although it has been extended to include random effects. This
enables you to select an appropriate inference space (McLean, Sanders, and Stroup 1991).
You can test the hypothesis L0 D 0, where L0 D .K0 M0 / and 0 D .ˇ 0 0 /, in several inference spaces. The
inference space corresponds to the choice of M. When M D 0, your inferences apply to the entire population
from which the random effects are sampled; this is known as the broad inference space. When all elements
of M are nonzero, your inferences apply only to the observed levels of the random effects. This is known as
the narrow inference space, and you can also choose it by specifying all of the random effects as fixed. The
GLM procedure uses the narrow inference space. Finally, by setting to zero the portions of M corresponding
to selected main effects and interactions, you can choose intermediate inference spaces. The broad inference
space is usually the most appropriate, and it is used when you do not specify any random effects in the
CONTRAST statement.
The CONTRAST statement has the following arguments:
CONTRAST Statement F 6629
label identifies the contrast in the table. A label is required for every contrast specified. Labels
can be up to 200 characters and must be enclosed in quotes.
fixed-effect identifies an effect that appears in the MODEL statement. The keyword INTERCEPT can
be used as an effect when an intercept is fitted in the model. You do not need to include
all effects that are in the MODEL statement.
random-effect identifies an effect that appears in the RANDOM statement. The first random effect must
follow a vertical bar (|); however, random effects do not have to be specified.
values are constants that are elements of the L matrix associated with the fixed and random
effects.
The rows of L0 are specified in order and are separated by commas. The rows of the K0 component of L0 are
specified on the left side of the vertical bars (|). These rows test the fixed effects and are, therefore, checked
for estimability. The rows of the M0 component of L0 are specified on the right side of the vertical bars. They
test the random effects, and no estimability checking is necessary.
If PROC MIXED finds the fixed-effects portion of the specified contrast to be nonestimable (see the
SINGULAR= option), then it displays a message in the log.
The following CONTRAST statement reproduces the F test for the effect A in the split-plot example (see
Example 84.1):
higher-order effect are determined by equitably distributing the coefficients of the lower-level effect, as in
the construction of least squares means. In addition, if the intercept is specified, it is distributed over all
classification effects that are not contained by any other specified effect. If an effect is not specified and does
not contain any specified effects, then all of its coefficients in L are set to 0. You can override this behavior
by specifying coefficients for the higher-order effect.
If too many values are specified for an effect, the extra ones are ignored; if too few are specified, the remaining
ones are set to 0. If no random effects are specified, the vertical bar can be omitted; otherwise, it must be
present. If a SUBJECT= effect is used in the RANDOM statement, then the coefficients specified for the
effects in the RANDOM statement are equitably distributed across the levels of the SUBJECT effect. You
can use the E option to see exactly which L matrix is used.
The SUBJECT and GROUP options in the CONTRAST statement are useful for the case when a SUBJECT=
or GROUP= variable appears in the RANDOM statement, and you want to contrast different subjects or
groups. By default, CONTRAST statement coefficients on random effects are distributed equally across
subjects and groups.
PROC MIXED handles missing level combinations of classification variables similarly to the way PROC
GLM does. Both procedures delete fixed-effects parameters corresponding to missing levels in order to
preserve estimability. However, PROC MIXED does not delete missing level combinations for random-
effects parameters because linear combinations of the random-effects parameters are always estimable. These
conventions can affect the way you specify your CONTRAST coefficients.
The CONTRAST statement computes the statistic
0
CL/ 1 L0
L.L0 b
b̌ b̌
b b
F D
r
where r D rank.L0 b CL/, and approximates its distribution with an F distribution. In this expression, b
C is
an estimate of the generalized inverse of the coefficient matrix in the mixed model equations. For more
information about this F statistic, see the section “Inference and Test Statistics” on page 6692.
The numerator degrees of freedom in the F approximation are r D rank.L0 b CL/, and the denominator degrees
of freedom are taken from the “Tests of Fixed Effects” table and corresponds to the final effect you list in the
CONTRAST statement. You can change the denominator degrees of freedom by using the DF= option.
You can specify the following options in the CONTRAST statement after a slash (/).
CHISQ
requests that chi-square tests be performed in addition to any F tests. A chi-square statistic equals its
corresponding F statistic times the associate numerator degrees of freedom, and the same degrees of
freedom are used to compute the p-value for the chi-square test. This p-value is always less than that
for the F -test, as it effectively corresponds to an F test with infinite denominator degrees of freedom.
DF=number
specifies the denominator degrees of freedom for the F test. For the degrees-of-freedom methods
DDFM=BETWITHIN, DDFM=CONTAIN, and DDFM=RESIDUAL, the default is the denominator
degrees of freedom taken from the “Tests of Fixed Effects” table and corresponds to the final effect
you list in the CONTRAST statement. For DDFM=SATTERTHWAITE, DDFM=KENWARDROGER,
and DDFM=KENWARDROGER2, the denominator degrees of freedom are computed separately for
each contrast.
ESTIMATE Statement F 6631
E
requests that the L matrix coefficients for the contrast be displayed. The ODS name of the “L Matrix
Coefficients” table is Coef.
GROUP coeffs
GRP coeffs
sets up random-effect contrasts between different groups when a GROUP= variable appears in the
RANDOM statement. By default, CONTRAST statement coefficients on random effects are distributed
equally across groups.
SINGULAR=number
tunes the estimability checking. If v is a vector, define ABS(v) to be the absolute value of the element
of v with the largest absolute value. If ABS(K0 K0 T) is greater than c*number for any row of K0 in
the contrast, then K is declared nonestimable. Here T is the Hermite form matrix .X0 V 1 X/ X0 V 1 X,
and c is ABS(K0 ) except when it equals 0, and then c is 1. The value for number must be between 0
and 1; the default is 1E–4.
SUBJECT coeffs
SUB coeffs
sets up random-effect contrasts between different subjects when a SUBJECT= variable appears in the
RANDOM statement. By default, CONTRAST statement coefficients on random effects are distributed
equally across subjects.
ESTIMATE Statement
ESTIMATE ’label’ < fixed-effect values . . . >
< | random-effect values . . . > < / options > ;
The ESTIMATE statement is exactly like a CONTRAST statement, except only one-row L matrices are
permitted. The actual estimate, L0b p, is displayed along with its approximate standard error. An approximate t
0
test that L bp = 0 is also produced.
PROC MIXED selects the degrees of freedom to match those displayed in the “Tests of Fixed Effects” table
for the final effect you list in the ESTIMATE statement. You can modify the degrees of freedom by using the
DF= option.
If PROC MIXED finds the fixed-effects portion of the specified estimate to be nonestimable, then it displays
“Non-est” for the estimate entries.
The following examples of ESTIMATE statements compute the mean of the first level of A in the split-plot
example (see Example 84.1) for various inference spaces:
Option Description
ALPHA= Specifies the confidence level
CL Constructs t-type confidence limits
DF= Specifies the degrees of freedom
DIVISOR= Specifies a value by which to divide all coefficients
E Displays the L matrix coefficients
GROUP Sets up random-effect contrasts between different groups
LOWER Performs lower-tailed tests
SINGULAR= Tunes the estimability checking
SUBJECT Sets up random-effect contrasts between different subjects
UPPER Performs upper-tailed tests
You can specify the following options in the ESTIMATE statement after a slash (/).
ALPHA=number
requests that a t-type confidence interval be constructed with confidence level 1 – number . The value
of number must be between 0 and 1; the default is 0.05.
CL
requests that t-type confidence limits be constructed. The confidence level is 0.95 by default; this can
be changed with the ALPHA= option.
DF=number
specifies the degrees of freedom for the t test and confidence limits. The default is the denominator
degrees of freedom taken from the “Tests of Fixed Effects” table and corresponds to the final effect
you list in the ESTIMATE statement.
DIVISOR=number
specifies a value by which to divide all coefficients so that fractional coefficients can be entered as
integer numerators.
E
requests that the L matrix coefficients be displayed. The ODS name of this “L Matrix Coefficients”
table is “Coef.”
ID Statement F 6633
GROUP coeffs
GRP coeffs
sets up random-effect contrasts between different groups when a GROUP= variable appears in the
RANDOM statement. By default, ESTIMATE statement coefficients on random effects are distributed
equally across groups.
LOWER
LOWERTAILED
requests that the p-value for the t test be based only on values less than the t statistic. A two-tailed test
is the default. A lower-tailed confidence limit is also produced if you specify the CL option.
SINGULAR=number
tunes the estimability checking as documented for the SINGULAR= option in the CONTRAST
statement.
SUBJECT coeffs
SUB coeffs
sets up random-effect contrasts between different subjects when a SUBJECT= variable appears in
the RANDOM statement. By default, ESTIMATE statement coefficients on random effects are
distributed equally across subjects. For example, the ESTIMATE statement in the following code from
Example 84.5 constructs the difference between the random slopes of the first two batches.
UPPER
UPPERTAILED
requests that the p-value for the t test be based only on values greater than the t statistic. A two-tailed
test is the default. An upper-tailed confidence limit is also produced if you specify the CL option.
ID Statement
ID variables ;
The ID statement specifies which variables from the input data set are to be included in the OUTP= and
OUTPM= data sets from the MODEL statement. If you do not specify an ID statement, then all variables are
included in these data sets. Otherwise, only the variables you list in the ID statement are included. Specifying
an ID statement with no variables prevents any variables from being included in these data sets.
6634 F Chapter 84: The MIXED Procedure
LSMEANS Statement
LSMEANS fixed-effects < / options > ;
The LSMEANS statement computes least squares means (LS-means) of fixed effects. As in the GLM
procedure, LS-means are predicted population margins—that is, they estimate the marginal means over a
balanced population. In a sense, LS-means are to unbalanced designs as class and subclass arithmetic means
are to balanced designs. The L matrix constructed to compute them is the same as the L matrix formed in
PROC GLM; however, the standard errors are adjusted for the covariance parameters in the model.
Each LS-mean is computed as Lb̌, where L is the coefficient matrix associated with the least
squares mean and b̌ is the estimate of the fixed-effects parameter vector (see the section
“Estimating Fixed and Random Effects in the Mixed Model” on page 6690). The approximate standard
b 1 X/ L0 .
errors for the LS-mean is computed as the square root of L.X0 V
LS-means can be computed for any effect in the MODEL statement that involves CLASS variables. You
can specify multiple effects in one LSMEANS statement or in multiple LSMEANS statements, and all
LSMEANS statements must appear after the MODEL statement. As in the ESTIMATE statement, the L
matrix is tested for estimability, and if this test fails, PROC MIXED displays “Non-est” for the LS-means
entries.
Assuming the LS-mean is estimable, PROC MIXED constructs an approximate t test to test the null hypothesis
that the associated population quantity equals zero. By default, the denominator degrees of freedom for
this test are the same as those displayed for the effect in the “Tests of Fixed Effects” table (see the section
“Default Output” on page 6708).
Table 84.6 summarizes the options available in the LSMEANS statement. All LSMEANS options are
subsequently discussed in alphabetical order.
Option Description
Construction and Computation of LS-Means
AT Modifies covariate value in computing LS-means
BYLEVEL Computes separate margins
DIFF Requests differences of LS-means
OM Specifies weighting scheme for LS-mean computation
SINGULAR= Tunes estimability checking
SLICE= Partitions F tests (simple effects)
Degrees of Freedom and p-Values
ADJDFE= Determines whether to compute rowwise denominator degrees of
freedom by using DDFM=SATTERTHWAITE,
DDFM=KENWARDROGER, or DDFM=KENWARDROGER2
ADJUST= Determines the method for multiple comparison adjustment of
LS-mean differences
ALPHA=˛ Determines the confidence level (1 ˛)
DF= Assigns specific value to degrees of freedom for tests and
confidence limits
LSMEANS Statement F 6635
Option Description
Statistical Output
CL Constructs confidence limits for means and or mean differences
CORR Displays correlation matrix of LS-means
COV Displays covariance matrix of LS-means
E Prints the L matrix
You can specify the following options in the LSMEANS statement after a slash (/).
ADJDFE=SOURCE | ROW
specifies how denominator degrees of freedom are determined when p-values and confidence limits
are adjusted for multiple comparisons with the ADJUST= option. When you do not specify the
ADJDFE= option, or when you specify ADJDFE=SOURCE, the denominator degrees of freedom for
multiplicity-adjusted results are the denominator degrees of freedom for the LS-mean effect in the
“Type 3 Tests of Fixed Effects” table. When you specify ADJDFE=ROW, the denominator degrees of
freedom for multiplicity-adjusted results correspond to the degrees of freedom displayed in the DF
column of the “Differences of Least Squares Means” table.
The ADJDFE=ROW setting is particularly useful if you want multiplicity adjustments to take into
account that denominator degrees of freedom are not constant across LS-mean differences. This can
be the case, for example, when the DDFM=SATTERTHWAITE, DDFM=KENWARDROGER, or
DDFM=KENWARDROGER2 degrees-of-freedom method is in effect.
In one-way models with heterogeneous variance, combining certain ADJUST= options with the
ADJDFE=ROW option corresponds to particular methods of performing multiplicity adjustments in
the presence of heteroscedasticity. For example, the following statements fit a heteroscedastic one-way
model and perform Dunnett’s T3 method (Dunnett 1980), which is based on the studentized maximum
modulus (ADJUST=SMM):
proc mixed;
class A;
model y = A / ddfm=satterth;
repeated / group=A;
lsmeans A / adjust=smm adjdfe=row;
run;
If you combine the ADJDFE=ROW option with ADJUST=SIDAK, the multiplicity adjustment corre-
sponds to the T2 method of Tamhane (1979), whereas ADJUST=TUKEY corresponds to the method of
Games-Howell (Games and Howell 1976). Note that ADJUST=TUKEY gives the exact results for the
case of fractional degrees of freedom in the one-way model, but it does not take into account that the
degrees of freedom are subject to variability. A more conservative method, such as ADJUST=SMM,
might protect the overall error rate better.
Unless the ADJUST= option of the LSMEANS statement is specified, the ADJDFE= option has no
effect.
6636 F Chapter 84: The MIXED Procedure
ADJUST=BON
ADJUST=DUNNETT
ADJUST=SCHEFFE
ADJUST=SIDAK
ADJUST=SIMULATE< (sim-options) >
ADJUST=SMM | GT2
ADJUST=TUKEY
requests a multiple comparison adjustment for the p-values and confidence limits for the differences
of LS-means. By default, PROC MIXED adjusts all pairwise differences unless you specify AD-
JUST=DUNNETT, in which case PROC MIXED analyzes all differences with a control level. The
ADJUST= option implies the DIFF option.
The BON (Bonferroni) and SIDAK adjustments involve correction factors described in
Chapter 53, “The GLM Procedure,” and Chapter 86, “The MULTTEST Procedure”; also see Westfall
and Young (1993) and Westfall et al. (1999). When you specify ADJUST=TUKEY and your data
are unbalanced, PROC MIXED uses the approximation described in Kramer (1956). Similarly,
when you specify ADJUST=DUNNETT and the LS-means are correlated, PROC MIXED uses the
factor-analytic covariance approximation described in Hsu (1992). The preceding references also
describe the SCHEFFE and SMM adjustments.
The SIMULATE adjustment computes adjusted p-values and confidence limits from the simulated
distribution of the maximum or maximum absolute value of a multivariate t random vector. All
covariance parameters except the residual variance are fixed at their estimated values throughout the
simulation, potentially resulting in some underdispersion. The simulation estimates q, the true .1 ˛/
quantile, where 1 ˛ is the confidence coefficient. The default ˛ is 0.05, and you can change this
value with the ALPHA= option in the LSMEANS statement.
The number of samples is set so that the tail area for the simulated q is within of 1 ˛ with
100.1 /% confidence. In equation form,
P .jF .b
q/ .1 ˛/j / D 1
where qO is the simulated q and F is the true distribution function of the maximum; see Edwards and
Berry (1987) for details. By default, = 0.005 and = 0.01, placing the tail area of qO within 0.005 of
0.95 with 99% confidence. The ACC= and EPS= sim-options reset and , respectively; the NSAMP=
sim-option sets the sample size directly; and the SEED= sim-option specifies an integer used to start
the pseudo-random number generator for the simulation. If you do not specify a seed, or if you specify
a value less than or equal to zero, the seed is generated from reading the time of day from the computer
clock. For additional descriptions of these and other simulation options, see the section “LSMEANS
Statement” on page 4060 in Chapter 53, “The GLM Procedure.”
ALPHA=number
requests that a t-type confidence interval be constructed for each of the LS-means with confidence
level 1 – number . The value of number must be between 0 and 1; the default is 0.05.
LSMEANS Statement F 6637
AT variable = value
AT (variable-list)= (value-list)
AT MEANS
enables you to modify the values of the covariates used in computing LS-means. By default, all
covariate effects are set equal to their mean values for computation of standard LS-means. The AT
option enables you to assign arbitrary values to the covariates. Additional columns in the output table
indicate the values of the covariates.
If there is an effect containing two or more covariates, the AT option sets the effect equal to the product
of the individual means rather than the mean of the product (as with standard LS-means calculations).
The AT MEANS option sets covariates equal to their mean values (as with standard LS-means) and
incorporates this adjustment to crossproducts of covariates.
As an example, consider the following invocation of PROC MIXED:
proc mixed;
class A;
model Y = A X1 X2 X1*X2;
lsmeans A;
lsmeans A / at means;
lsmeans A / at X1=1.2;
lsmeans A / at (X1 X2)=(1.2 0.3);
run;
For the first two LSMEANS statements, the LS-means coefficient for X1 is x1 (the mean of X1) and
for X2 is x2 (the mean of X2). However, for the first LSMEANS statement, the coefficient for X1*X2
is x1 x2 , but for the second LSMEANS statement, the coefficient is x1 x2 . The third LSMEANS
statement sets the coefficient for X1 equal to 1.2 and leaves it at x2 for X2, and the final LSMEANS
statement sets these values to 1.2 and 0.3, respectively.
If a WEIGHT variable is present, it is used in processing AT variables. Also, observations with missing
dependent variables are included in computing the covariate means, unless these observations form a
missing cell and the FULLX option in the MODEL statement is not in effect. You can use the E option
in conjunction with the AT option to check that the modified LS-means coefficients are the ones you
want.
The AT option is disabled if you specify the BYLEVEL option.
BYLEVEL
requests PROC MIXED to process the OM data set by each level of the LS-mean effect (LSMEANS
effect) in question. For more details, see the OM option later in this section.
CL
requests that t-type confidence limits be constructed for each of the LS-means. The confidence level is
0.95 by default; this can be changed with the ALPHA= option.
6638 F Chapter 84: The MIXED Procedure
CORR
displays the estimated correlation matrix of the least squares means as part of the “Least Squares
Means” table.
COV
displays the estimated covariance matrix of the least squares means as part of the “Least Squares
Means” table.
DF=number
specifies the degrees of freedom for the t test and confidence limits. The default is the denomi-
nator degrees of freedom taken from the “Tests of Fixed Effects” table corresponding to the LS-
means effect, unless you specify the DDFM=SATTERTHWAITE, DDFM=KENWARDROGER, or
DDFM=KENWARDROGER2 option in the MODEL statement. For these DDFM= methods, degrees
of freedom are determined separately for each test; for more information, see the DDFM= option.
For multiple effects, the results depend upon the order of the list, and so you should check the output
to make sure that the controls are correct.
Two-tailed tests and confidence limits are associated with the CONTROL difftype. For one-tailed results,
use either the CONTROLL or CONTROLU difftype. The CONTROLL difftype tests whether the
noncontrol levels are significantly smaller than the control; the upper confidence limits for the control
minus the noncontrol levels are considered to be infinity and are displayed as missing. Conversely, the
CONTROLU difftype tests whether the noncontrol levels are significantly larger than the control; the
upper confidence limits for the noncontrol levels minus the control are considered to be infinity and are
displayed as missing.
If you want to perform multiple comparison adjustments on the differences of LS-means, you must
specify the ADJUST= option.
The differences of the LS-means are displayed in a table titled “Differences of Least Squares Means.”
The ODS table name is Diffs.
LSMEANS Statement F 6639
E
requests that the L matrix coefficients for all LSMEANS effects be displayed. The ODS name of this
“L Matrix Coefficients” table is Coef.
PDIFF
is the same as the DIFF option.
SINGULAR=number
tunes the estimability checking as documented for the SINGULAR= option in the CONTRAST
statement.
This code tests for the simple main effects of A for B, which are calculated by extracting the appropriate
rows from the coefficient matrix for the A*B LS-means and by using them to form an F test. For more
information about this F test, see the section “Inference and Test Statistics” on page 6692.
The SLICE option produces a table titled “Tests of Effect Slices.” The ODS table name is Slices.
LSMESTIMATE Statement
LSMESTIMATE model-effect < 'label' > values < divisor =n >
< , . . . < 'label' > values < divisor =n > >
< / options > ;
The LSMESTIMATE statement provides a mechanism for obtaining custom hypothesis tests among least
squares means.
Table 84.7 summarizes the options available in the LSMESTIMATE statement.
Option Description
Construction and Computation of LS-Means
AT Modifies covariate values in computing LS-means
BYLEVEL Computes separate margins
DIVISOR= Specifies a list of values to divide the coefficients
OM= Specifies the weighting scheme for LS-means computation as
determined by a data set
SINGULAR= Tunes estimability checking
Degrees of Freedom and p-Values
ADJUST= Determines the method of multiple-comparison adjustment of
LS-means differences
ALPHA=˛ Determines the confidence level (1 ˛)
LOWER Performs one-sided, lower-tailed inference
STEPDOWN Adjusts multiple-comparison p-values further in a step-down
fashion
TESTVALUE= Specifies values under the null hypothesis for tests
UPPER Performs one-sided, upper-tailed inference
MODEL Statement F 6641
Option Description
Statistical Output
CL Constructs confidence limits for means and mean differences
CORR Displays the correlation matrix of LS-means
COV Displays the covariance matrix of LS-means
E Prints the L matrix
ELSM Prints the K matrix
JOINT Produces a joint F or chi-square test for the LS-means and
LS-means differences
PLOTS= Produces graphs of means and mean comparisons
SEED= Specifies the seed for computations that depend on random
numbers
Generalized Linear Modeling
CATEGORY= Specifies how to construct estimable functions for multinomial data
EXP Exponentiates and displays LS-means estimates
ILINK Computes and displays estimates and standard errors of LS-means
(but not differences) on the inverse linked scale
For more information about the syntax of the LSMESTIMATE statement, see the section “LSMESTIMATE
Statement” on page 498 in Chapter 20, “Shared Concepts and Topics.”
MODEL Statement
MODEL dependent = < fixed-effects > < / options > ;
The MODEL statement names a single dependent variable and the fixed effects, which determine the X
matrix of the mixed model (see the section “Parameterization of Mixed Models” on page 6695 for details).
The specification of effects is the same as in the GLM procedure; however, unlike PROC GLM, you do not
specify random effects in the MODEL statement. The MODEL statement is required.
An intercept is included in the fixed-effects model by default. If no fixed effects are specified, only this
intercept term is fit. The intercept can be removed by using the NOINT option.
Table 84.8 summarizes the options available in the MODEL statement. These are subsequently discussed in
detail in alphabetical order.
Option Description
Model Building
NOINT Excludes fixed-effect intercept from model
Statistical Computations
ALPHA=˛ Determines the confidence level (1 ˛) for fixed effects
6642 F Chapter 84: The MIXED Procedure
Option Description
ALPHAP=˛ Determines the confidence level (1 ˛) for predicted values
CHISQ Requests chi-square tests
DDF= Specifies denominator degrees of freedom (list)
DDFM= Specifies the method for computing denominator degrees of free-
dom
HTYPE= Selects the type of hypothesis test
INFLUENCE Requests influence and case-deletion diagnostics
NOTEST Suppresses hypothesis tests for the fixed effects
OUTP= Specifies output data set for predicted values and related quantities
OUTPM= Specifies output data set for predicted means and related quantities
RESIDUAL Adds Pearson-type and studentized residuals to output data sets
VCIRY Adds scaled marginal residual to output data sets
Statistical Output
CL Displays confidence limits for fixed-effects parameter estimates
CORRB Displays correlation matrix of fixed-effects parameter estimates
COVB Displays covariance matrix of fixed-effects parameter estimates
COVBI Displays inverse covariance matrix of fixed-effects parameter esti-
mates
E, E1, E2, E3 Displays L matrix coefficients
INTERCEPT Adds a row for the intercept to test tables
SOLUTION Displays fixed-effects parameter estimates (and scale parameter in
GLM models)
Singularity Tolerances
SINGCHOL= Tunes sensitivity in computing Cholesky roots
SINGRES= Tunes singularity criterion for residual variance
SINGULAR= Tunes the sensitivity in sweeping
ZETA= Tunes the sensitivity in forming Type 3 functions
You can specify the following options in the MODEL statement after a slash (/).
ALPHA=number
requests that a t-type confidence interval be constructed for each of the fixed-effects parameters with
confidence level 1 – number . The value of number must be between 0 and 1; the default is 0.05.
ALPHAP=number
requests that a t-type confidence interval be constructed for the predicted values with confidence level
1 – number . The value of number must be between 0 and 1; the default is 0.05.
MODEL Statement F 6643
CHISQ
requests that chi-square tests be performed for all specified effects in addition to the F tests. Type 3
tests are the default; you can produce the Type 1 and Type 2 tests by using the HTYPE= option.
CL
requests that t-type confidence limits be constructed for each of the fixed-effects parameter estimates.
The confidence level is 0.95 by default; this can be changed with the ALPHA= option.
CONTAIN
has the same effect as the DDFM=CONTAIN option.
CORRB
produces the approximate correlation matrix of the fixed-effects parameter estimates. The ODS name
of this table is CorrB.
COVB
produces the approximate variance-covariance matrix of the fixed-effects parameter estimates b̌. By
b 1 X/ and results from sweeping .X y/0 V
default, this matrix equals .X0 V b 1 .X y/ on all but its
last pivot and removing the y border. The EMPIRICAL option in the PROC MIXED statement
changes this matrix into “empirical sandwich” form. The ODS name of this table is CovB. If the
degrees-of-freedom method of Kenward and Roger (1997) is in effect (DDFM=KENWARDROGER or
DDFM=KENWARDROGER2), the COVB matrix changes because the method entails an adjustment
of the variance-covariance matrix of the fixed effects by the method proposed by Prasad and Rao
(1990); Harville and Jeske (1992). See also Kackar and Harville (1984).
COVBI
produces the inverse of the approximate variance-covariance matrix of the fixed-effects parameter
estimates. The ODS name of this table is InvCovB.
DDF=value-list
enables you to specify your own denominator degrees of freedom for the fixed effects. The value-list
specification is a list of numbers or missing values (.) separated by commas. The degrees of freedom
should be listed in the order in which the effects appear in the “Tests of Fixed Effects” table. If you
want to retain the default degrees of freedom for a particular effect, use a missing value for its location
in the list. For example, the following statement assigns 3 denominator degrees of freedom to A and
4.7 to A*B, while those for B remain the same:
DDFM=
DDFM=CONTAIN
DDFM=BETWITHIN
DDFM=RESIDUAL
DDFM=SATTERTHWAITE
DDFM=KENWARDROGER< (FIRSTORDER) >
DDFM=KENWARDROGER< (LINEAR) >
DDFM=KENWARDROGER2
specifies the method for computing the denominator degrees of freedom for the tests of fixed effects
resulting from the MODEL, CONTRAST, ESTIMATE, and LSMEANS statements.
Table 84.9 lists syntax aliases for the degrees-of-freedom methods.
The DDFM=CONTAIN option invokes the containment method to compute denominator degrees of
freedom, and it is the default when you specify a RANDOM statement. The containment method is
carried out as follows: Denote the fixed effect in question A, and search the RANDOM effect list for
the effects that syntactically contain A. For example, the random effect B(A) contains A, but the random
effect C does not, even if it has the same levels as B(A).
Among the random effects that contain A, compute their rank contribution to the (X Z) matrix. The
DDF assigned to A is the smallest of these rank contributions. If no effects are found, the DDF for A is
set equal to the residual degrees of freedom, N rank.X Z/. This choice of DDF matches the tests
performed for balanced split-plot designs and should be adequate for moderately unbalanced designs.
C AUTION : If you have a Z matrix with a large number of columns, the overall memory requirements
and the computing time after convergence can be substantial for the containment method. If it is too
large, you might want to use the DDFM=BETWITHIN option.
The DDFM=BETWITHIN option is the default for REPEATED statement specifications (with no
RANDOM statements). It is computed by dividing the residual degrees of freedom into between-
subject and within-subject portions. PROC MIXED then checks whether a fixed effect changes within
any subject. If so, it assigns within-subject degrees of freedom to the effect; otherwise, it assigns
the between-subject degrees of freedom to the effect (see Schluchter and Elashoff 1990). If there
are multiple within-subject effects containing classification variables, the within-subject degrees of
freedom are partitioned into components corresponding to the subject-by-effect interactions.
One exception to the preceding method is the case where you have specified no RANDOM statements
and a REPEATED statement with the TYPE=UN option. In this case, all effects are assigned the
MODEL Statement F 6645
between-subject degrees of freedom to provide for better small-sample approximations to the relevant
sampling distributions. DDFM=KENWARDROGER or DDFM=KENWARDROGER2 might be a
better option to try for this case.
The DDFM=RESIDUAL option performs all tests by using the residual degrees of freedom, n
rank.X/, where n is the number of observations.
The DDFM=SATTERTHWAITE option performs a general Satterthwaite approximation for the
denominator degrees of freedom, computed as follows. Suppose is the vector of unknown parameters
in V, and suppose C D .X0 V 1 X/ , where denotes a generalized inverse. Let b C and b
be the
corresponding estimates.
Consider the one-dimensional case, and consider ` to be a vector defining an estimable linear combina-
tion of ˇ. The Satterthwaite degrees of freedom for the t statistic
`b̌
tDp
O 0
`C`
is computed as
O 0 /2
2.`C`
D
g0 Ag
CL0 as q
For the multidimensional case, let L be an estimable contrast matrix and denote the rank of Lb
> 1. The Satterthwaite denominator degrees of freedom for the F statistic
CL0 /
b̌0 L0 .Lb 1 Lb̌
F D
q
are computed by first performing the spectral decomposition LbCL0 D P0 DP, where P is an orthogonal
matrix of eigenvectors and D is a diagonal matrix of eigenvalues, both of dimension q q. Define `m
to be the mth row of PL, and let
2.Dm /2
m D 0 Ag
gm m
where Dm is the mth diagonal element of D and gm is the gradient of `m C`0m with respect to ,
. Then let
evaluated at b
q
X m
ED I.m > 2/
m 2
mD1
where the indicator function eliminates terms for which m 2. The degrees of freedom for F are
then computed as
2E
D
E q
provided E > q; otherwise is set to zero.
6646 F Chapter 84: The MIXED Procedure
This method is a generalization of the techniques described in Giesbrecht and Burns (1985); McLean
and Sanders (1988); Fai and Cornelius (1996). The method can also include estimated random effects.
C to be the inverse of the coefficient matrix in the mixed model
In this case, append b to b̌ and change b
equations. The calculations require extra memory to hold c matrices that are the size of the mixed
model equations, where c is the number of covariance parameters. In the notation of Table 84.29,
this is approximately 8q.p C g/.p C g/=2 bytes. Extra computing time is also required to process
these matrices. The Satterthwaite method implemented here is intended to produce an accurate F
approximation; however, the results can differ from those produced by PROC GLM. Also, the small
sample properties of this approximation have not been extensively investigated for the various models
available with PROC MIXED.
The DDFM=KENWARDROGER option performs the degrees of freedom calculations detailed by
Kenward and Roger (1997). This approximation involves inflating the estimated variance-covariance
matrix of the fixed and random effects by the method proposed by Prasad and Rao (1990) and Harville
and Jeske (1992), see also Kackar and Harville (1984). Satterthwaite-type degrees of freedom are
then computed based on this adjustment. By default, the observed information matrix of the covari-
ance parameter estimates is used in the calculations. For covariance structures that have nonzero
second derivatives with respect to the covariance parameters, the Kenward-Roger covariance matrix
adjustment includes a second-order term. This term can result in standard error shrinkage. Also,
the resulting adjusted covariance matrix can then be indefinite and is not invariant under reparam-
eterization. The FIRSTORDER or LINEAR suboption of the DDFM=KENWARDROGER option
eliminates the second derivatives from the calculation of the covariance matrix adjustment. The
LINEAR suboption is an alias for FIRSTORDER. For the case of scalar estimable functions, the
e@ in Harville and Jeske (1992). The
resulting estimator is referred to as the Prasad-Rao estimator m
following are examples of covariance structures that generally lead to nonzero second derivatives:
TYPE=ANTE(1), TYPE=AR(1), TYPE=ARH(1), TYPE=ARMA(1,1), TYPE=CSH, TYPE=FA,
TYPE=FA0(q), TYPE=TOEPH, TYPE=UNR, and all TYPE=SP() structures.
The DDFM=KENWARDROGER2 option specifies an improved approximation of the
DDFM=KENWARDROGER method that uses a less biased precision estimator, as proposed
by Kenward and Roger (2009). For an intrinsically linear covariance parameterization, this option
produces the same precision estimator as that obtained using DDFM=KR(FIRSTORDER).
When the asymptotic variance matrix of the covariance parameters is found to be singular, a gen-
eralized inverse is used. Covariance parameters with zero variance then do not contribute to the
degrees-of-freedom adjustment for DDFM=SATTERTHWAITE, DDFM=KENWARDROGER, or
DDFM=KENWARDROGER2, and a message is written to the log.
This method changes output in the following tables (listed in Table 84.26): Contrast, CorrB, CovB,
Diffs, Estimates, InvCovB, LSMeans, Slices, SolutionF, SolutionR, Tests1–Tests3. The OUTP= and
OUTPM= data sets are also affected.
E
requests that Type 1, Type 2, and Type 3 L matrix coefficients be displayed for all specified effects.
The ODS name of the table is Coef.
MODEL Statement F 6647
E1
requests that Type 1 L matrix coefficients be displayed for all specified effects. The ODS name of the
table is Coef.
E2
requests that Type 2 L matrix coefficients be displayed for all specified effects. The ODS name of the
table is Coef.
E3
requests that Type 3 L matrix coefficients be displayed for all specified effects. The ODS name of the
table is Coef.
FULLX
requests that columns of the X matrix that consist entirely of zeros not be eliminated from X; otherwise,
they are eliminated by default. For a column corresponding to a missing cell to be added to X, its
particular levels must be present in at least one observation in the analysis data set along with a
missing dependent variable. The use of the FULLX option can affect coefficient specifications in the
CONTRAST and ESTIMATE statements, as well as covariate coefficients from LSMEANS statements
specified with the AT MEANS option.
HTYPE=value-list
indicates the type of hypothesis test to perform on the fixed effects. Valid entries for values in the list
are 1, 2, and 3; the default value is 3. You can specify several types by separating the values with a
comma or a space. The ODS table names are Tests1 for the Type 1 tests, Tests2 for the Type 2 tests,
and Tests3 for the Type 3 tests.
Description Suboption
Compute influence diagnostics for individual observations Default
Measure influence of sets of observations chosen according to a EFFECT=
classification variable or effect
Remove pairs of observations and report the results sorted by degree SIZE=2
of influence
Remove triples, quadruples of observations, etc. SIZE=
Allow selection of individual observations, observations sharing SELECT=
specific levels of effects, and construction of tuples from specified
subsets of observations
Update fixed effects and covariance parameters by refitting the ITER=n > 0
mixed model, adding up to n iterations
Compute influence diagnostics for the covariance parameters ITER=n > 0
Update only fixed effects and the residual variance, if it is profiled ITER=0
Add the reduced-data estimates to the data set created with ODS ESTIMATES
OUTPUT
The modifiers and their default values are discussed in the following paragraphs. The set of computed
influence diagnostics varies with the suboptions. The most extensive set of influence diagnostics is
obtained when ITER=n with n > 0.
You can produce statistical graphics of influence diagnostics when ODS Graphics is enabled. For gen-
eral information about ODS Graphics, see Chapter 24, “Statistical Graphics Using ODS.” For specific
information about the graphics available in the MIXED procedure, see the section “ODS Graphics” on
page 6717.
You can specify the following influence-options in parentheses:
EFFECT=effect
specifies an effect according to which observations are grouped. Observations sharing the same
level of the effect are removed from the analysis as a group. The effect must contain only
classification variables, but they need not be contained in the model.
Removing observations can change the rank of the .X0 V 1 X/ matrix. This is particularly
likely to happen when multiple observations are eliminated from the analysis. If the rank of
the estimated variance-covariance matrix of b̌ changes or its singularity pattern is altered, no
influence diagnostics are computed.
ESTIMATES
EST
specifies that the updated parameter estimates should be written to the ODS output data set. The
values are not displayed in the “Influence” table, but if you use ODS OUTPUT to create a data
set from the listing, the estimates are added to the data set. If ITER=0, only the fixed-effects
estimates are saved. In iterative influence analyses, fixed-effects and covariance parameters are
MODEL Statement F 6649
stored. The p fixed-effects parameter estimates are named Parm1–Parmp, and the q covariance
parameter estimates are named CovP1–CovPq. The order corresponds to that in the “Solution
for Fixed Effects” and “Covariance Parameter Estimates” tables. If parameter updates fail—for
example, because of a loss of rank or a nonpositive definite Hessian—missing values are reported.
ITER=n
controls the maximum number of additional iterations PROC MIXED performs to update the
fixed-effects and covariance parameter estimates following data point removal. If you specify n >
0, then statistics such as DFFITS, MDFFITS, and the likelihood distances measure the impact
of observation(s) on all aspects of the analysis. Typically, the influence will grow compared to
values at ITER=0. In models without RANDOM or REPEATED effects, the ITER= option has
no effect.
This documentation refers to analyses when n > 0 simply as iterative influence analysis, even
if final covariance parameter estimates can be updated in a single step (for example, when
METHOD=MIVQUE0 or METHOD=TYPE3). This nomenclature reflects the fact that only if
n > 0 are all model parameters updated, which can require additional iterations. If n > 0 and
METHOD=REML (default) or METHOD=ML, the procedure updates fixed effects and variance-
covariance parameters after removing the selected observations with additional Newton-Raphson
iterations, starting from the converged estimates for the entire data. The process stops for each
observation or set of observations if the convergence criterion is satisfied or the number of further
iterations exceeds n. If n > 0 and METHOD=TYPE1, TYPE2, or TYPE3, ANOVA estimates of
the covariance parameters are recomputed in a single step.
Compared to noniterative updates, the computations are more involved. In particular for large
data sets or a large number of random effects (or both), iterative updates require considerably
more resources. A one-step (ITER=1) or two-step update might be a good compromise. The
output includes the number of iterations performed, which is less than n if the iteration converges.
If the process does not converge in n iterations, you should be careful in interpreting the results,
especially if n is fairly large.
Bounds and other restrictions on the covariance parameters carry over from the full-data model.
Covariance parameters that are not iterated in the model fit to the full data (the NOITER or
HOLD= option in the PARMS statement) are likewise not updated in the refit. In certain models,
such as random-effects models, the ratios between the covariance parameters and the residual
variance are maintained rather than the actual value of the covariance parameter estimate (see the
section “Influence Diagnostics” on page 6702).
KEEP=n
determines how many observations are retained for display and in the output data set or how
many tuples if you specify SIZE=. The output is sorted by an influence statistic as discussed for
the SIZE= suboption.
SELECT=value-list
specifies which observations or effect levels are chosen for influence calculations. If the SELECT=
suboption is not specified, diagnostics are computed as follows:
When you specify an effect with the EFFECT= option, the values in value-list represent indices of
the levels in the order in which PROC MIXED builds classification effects. Which observations
in the data set correspond to this index depends on the order of the variables in the CLASS
statement, not the order in which the variables appear in the interaction effect. See the section
“Parameterization of Mixed Models” on page 6695 to understand precisely how the procedure
indexes nested and crossed effects and how levels of classification variables are ordered. The
actual values of the classification variables involved in the effect are shown in the output so you
can determine which observations were removed.
If the EFFECT= suboption is not specified, the SELECT= value list refers to the sequence in
which observations are read from the input data set or from the current BY group if there is a BY
statement. This indexing is not necessarily the same as the observation numbers in the input data
set, for example, if a WHERE clause is specified or during BY processing.
SIZE=n
instructs PROC MIXED to remove groups of observations formed as tuples of size n. For
example, SIZE=2 specifies all n .n 1/=2 unique pairs of observations. The number of tuples
for SIZE=k is nŠ=.kŠ.n k/Š/ and grows quickly with n and k. Using the SIZE= option can result
in considerable computing time. The MIXED procedure displays by default only the 50 tuples
with the greatest influence. Use the KEEP= option to override this default and to retain a different
number of tuples in the listing or ODS output data set. Regardless of the KEEP= specification, all
tuples are evaluated and the results are ordered according to an influence statistic. This statistic is
the (restricted) likelihood distance as a measure of overall influence if ITER= n > 0 or when a
residual variance is profiled. When likelihood distances are unavailable, the results are ordered
by the PRESS statistic.
To reduce computational burden, the SIZE= option can be combined with the SELECT=value-list
modifier. For example, the following statements evaluate all 15 D 6 5=2 pairs formed from
observations 13, 14, 18, 30, 31, and 33 and display the five pairs with the greatest influence:
proc mixed;
class a m f;
model penetration = a m /
influence(size=2 keep=5
select=13,14,18,30,31,33);
random f(m);
run;
If any observation in a tuple contains missing values or has otherwise not contributed to the
analysis, the tuple is not evaluated. This guarantees that the displayed results refer to the same
number of observations, so that meaningful statistics are available by which to order the results.
If computations fail for a particular tuple—for example, because the .X0 V 1 X/ matrix changes
rank or the G matrix is not positive definite—no results are produced. Results are retained when
the maximum number of iterative updates is exceeded in iterative influence analyses.
The SIZE= suboption cannot be combined with the EFFECT= suboption. As in the case of the
EFFECT= suboption, the statistics being computed are those appropriate for removal of multiple
data points, even if SIZE=1.
The ODS name of the “Influence Diagnostics” table is Influence. The variables in this table depend on
whether you specify the EFFECT=, SIZE=, or KEEP= suboption and whether covariance parameters
MODEL Statement F 6651
are iteratively updated. When ITER=0 (the default), certain influence diagnostics are meaningful only
if the residual variance is profiled. Table 84.11 and Table 84.12 summarize the statistics obtained
depending on the model and modifiers. The last column in these tables gives the variable name in the
ODS OUTPUT INFLUENCE= data set. Restricted likelihood distances are reported instead of the
likelihood distance unless METHOD=ML. See the section “Influence Diagnostics” on page 6702 for
details about the individual statistics.
INTERCEPT
adds a row to the tables for Type 1, 2, and 3 tests corresponding to the overall intercept.
LCOMPONENTS
requests an estimate for each row of the L matrix used to form tests of fixed effects. Components
corresponding to Type 3 tests are the default; you can produce the Type 1 and Type 2 component
estimates with the HTYPE= option.
Tests of fixed effects involve testing of linear hypotheses of the form Lˇ D 0. The matrix L is
constructed from Type 1, 2, or 3 estimable functions. By default the MIXED procedure constructs
Type 3 tests. In many situations, the individual rows of the matrix L represent contrasts of interest.
For example, in a one-way classification model, the Type 3 estimable functions define differences
MODEL Statement F 6653
of factor-level means. In a balanced two-way layout, the rows of L correspond to differences of cell
means.
For example, suppose factors A and B have a and b levels, respectively. The following statements
produce (a – 1) one degree of freedom tests for the rows of L associated with the Type 1 and Type 3
estimable functions for factor A, (b – 1) tests for the rows of L associated with factor B, and a single
test for the Type 1 and Type 3 coefficients associated with regressor X:
class A B;
model y = A B x / htype=1,3 lcomponents;
The denominator degrees of freedom associated with a row of L are the same as those in
the corresponding “Tests of Fixed Effects” table, except for DDFM=KENWARDROGER,
DDFM=KENWARDROGER2, and DDFM=SATTERTHWAITE. For these degrees-of-freedom
methods, the denominator degrees of freedom are computed separately for each row of L.
The ODS name of the table containing all requested component tests is LComponents. See Exam-
ple 84.9 for applications of the LCOMPONENTS option.
NOCONTAIN
has the same effect as the DDFM=RESIDUAL option.
NOINT
requests that no intercept be included in the model. An intercept is included by default.
NOTEST
specifies that no hypothesis tests be performed for the fixed effects.
OUTP=SAS-data-set
OUTPRED=SAS-data-set
specifies an output data set containing predicted values and related quantities. This option replaces the
P option from SAS 6.
Predicted values are formed by using the rows from (X Z) as L matrices. Thus, predicted values from
the original data are Xb̌ C Zb. Their approximate standard errors of prediction are formed from the
quadratic form of L with bC defined in the section “Statistical Properties” on page 6691. The L95 and
U95 variables provide a t-type confidence interval for the predicted values, and they correspond to the
L95M and U95M variables from the GLM and REG procedures for fixed-effects models. The residuals
are the observed minus the predicted values. Predicted values for data points other than those observed
can be obtained by using missing dependent variables in your input data set.
Specifications that have a REPEATED statement with the SUBJECT= option and missing dependent
variables compute predicted values by using empirical best linear unbiased prediction (EBLUP). Using
hats .O/ to denote estimates, the EBLUP formula is
O mV
O 1
O D Xm ˇO C C
m .y O
Xˇ/
where m represents a hypothetical realization of a missing data vector with associated design matrix
Xm . The matrix Cm is the model-based covariance matrix between m and the observed data y, and
other notation is as presented in the section “Mixed Models Theory” on page 6683.
6654 F Chapter 84: The MIXED Procedure
b O
Var.m Om
m/ D V C O 1C
O mV O0 C
m
ŒXm O O 1
Cm V X.X0 VO 1 X/ ŒXm O mV
C O 1
X0
where Vm is the model-based variance matrix of m. For further details, see Henderson (1984) and
Harville (1990). This feature can be useful for forecasting time series or for computing spatial
predictions.
By default, all variables from the input data set are included in the OUTP= data set. You can select a
subset of these variables by using the ID statement.
OUTPM=SAS-data-set
OUTPREDM=SAS-data-set
specifies an output data set containing predicted means and related quantities. This option replaces the
PM option from SAS 6.
The output data set is of the same form as that resulting from the OUTP= option, except that the
predicted values do not incorporate the EBLUP values Zb. They also do not use the EBLUPs for
specifications that have a REPEATED statement with the SUBJECT= option and missing dependent
variables. The predicted values are formed as Xb̌ in the OUTPM= data set, and standard errors are
quadratic forms in the approximate variance-covariance matrix of b̌ as displayed by the COVB option.
By default, all variables from the input data set are included in the OUTPM= data set. You can select a
subset of these variables by using the ID statement.
RESIDUAL
RESIDUALS
requests that Pearson-type and (internally) studentized residuals be added to the OUTP= and OUTPM=
data sets. Studentized residuals are raw residuals standardized by their estimated standard error. When
residuals are internally studentized, the data point in question has contributed to the estimation of
the covariance parameter estimates on which the standard error of the residual is based. Externally
studentized marginal residuals can be computed with the INFLUENCE option. Pearson-type residuals
scale the residual by the standard deviation of the response.
The option has no effect unless the OUTP= or OUTPM= option is specified or un-
less ODS Graphics is enabled. For general information about ODS Graphics, see
Chapter 24, “Statistical Graphics Using ODS.” For specific information about the graphics avail-
able in the MIXED procedure, see the section “ODS Graphics” on page 6717. For computational
details about studentized and Pearson residuals in MIXED, see the section “Residual Diagnostics” on
page 6700.
SINGCHOL=number
tunes the sensitivity in computing Cholesky roots. If a diagonal pivot element is less than D*number
as PROC MIXED performs the Cholesky decomposition on a matrix, the associated column is declared
to be linearly dependent upon previous columns and is set to 0. The value D is the original diagonal
element of the matrix. The default for number is 1E4 times the machine epsilon; this product is
approximately 1E–12 on most computers.
MODEL Statement F 6655
SINGRES=number
sets the tolerance for which the residual variance is considered to be zero. The default is 1E4 times the
machine epsilon; this product is approximately 1E–12 on most computers.
SINGULAR=number
tunes the sensitivity in sweeping. If a diagonal pivot element is less than D*number as PROC MIXED
sweeps a matrix, the associated column is declared to be linearly dependent upon previous columns,
and the associated parameter is set to 0. The value D is the original diagonal element of the matrix.
The default is 1E4 times the machine epsilon; this product is approximately 1E–12 on most computers.
SOLUTION
S
requests that a solution for the fixed-effects parameters be produced. Using notation from the section
“Mixed Models Theory” on page 6683, the fixed-effects parameter estimates are b̌ and their approximate
standard errors are the square roots of the diagonal elements of .X0 V b 1 X/ . You can output this
approximate variance matrix with the COVB option or modify it with the EMPIRICAL option in the
PROC MIXED statement or the DDFM=KENWARDROGER or DDFM=KENWARDROGER2 option
in the MODEL statement.
Along with the estimates and their approximate standard errors, a t statistic is computed as the estimate
divided by its standard error. The degrees of freedom for this t statistic matches the one appearing in the
“Tests of Fixed Effects” table under the effect containing the parameter. The “Pr > |t|” column contains
the two-tailed p-value corresponding to the t statistic and associated degrees of freedom. You can use
the CL option to request confidence intervals for all of the parameters; they are constructed around the
estimate by using a radius of the standard error times a percentage point from the t distribution.
VCIRY
requests that responses and marginal residuals be scaled by the inverse Cholesky root of the marginal
variance-covariance matrix. The variables ScaledDep and ScaledResid are added to the OUTPM=
data set. These quantities can be important in bootstrapping of data or residuals. Examination of the
scaled residuals is also helpful in diagnosing departures from normality. Notice that the results of this
scaling operation can depend on the order in which the MIXED procedure processes the data.
The VCIRY option has no effect unless you also use the OUTPM= option or un-
less ODS Graphics is enabled. For general information about ODS Graphics, see
Chapter 24, “Statistical Graphics Using ODS.” For specific information about the graphics avail-
able in the MIXED procedure, see the section “ODS Graphics” on page 6717.
XPVIX
is an alias for the COVBI option.
XPVIXI
is an alias for the COVB option.
ZETA=number
tunes the sensitivity in forming Type 3 functions. Any element in the estimable function basis with an
absolute value less than number is set to 0. The default is 1E–8.
6656 F Chapter 84: The MIXED Procedure
PARMS Statement
PARMS (value-list). . . < / options > ;
The PARMS statement specifies initial values for the covariance parameters, or it requests a grid search over
several values of these parameters. You must specify the values in the order in which they appear in the
“Covariance Parameter Estimates” table.
The value-list specification can take any of several forms:
m a single value
m1 ; m2 ; : : : ; mn several values
m to n a sequence where m equals the starting value, n equals the ending value, and the increment
equals 1
m to n by i a sequence where m equals the starting value, n equals the ending value, and the increment
equals i
m1 ; m2 to m3 mixed values and sequences
You can use the PARMS statement to input known parameters. Referring to the split-plot example (Exam-
ple 84.1), suppose the three variance components are known to be 60, 20, and 6. The SAS statements to fix
the variance components at these values are as follows:
Option Description
HOLD= Holds parameter values equal to the specified values
LOGDETH Evaluates the log determinant of the Hessian matrix
LOWERB= Specifies lower boundary constraints
NOBOUND Removes boundary constraints on covariance parameters
NOITER Performs inferences using the best value from the grid search
NOPRINT Suppresses the “Parameter Search” table
NOPROFILE Specifies a different computational method for the residual variance
OLS Requests starting values corresponding to the usual general linear model
PARMSDATA= Reads in covariance parameter values from a SAS data set
RATIOS Indicates that ratios with the residual variance are specified
UPPERB= Specifies upper boundary constraints
You can specify the following options in the PARMS statement after a slash (/).
HOLD=value-list
EQCONS=value-list
specifies which parameter values PROC MIXED should hold to equal the specified values. For
example, the following statement constrains the first and third covariance parameters to equal 5 and 2,
respectively:
LOGDETH
evaluates the log determinant of the Hessian matrix for each point specified in the PARMS statement.
A Log Det H column is added to the “Parameter Search” table.
LOWERB=value-list
enables you to specify lower boundary constraints on the covariance parameters. The value-list
specification is a list of numbers or missing values (.) separated by commas. You must list the numbers
in the order that PROC MIXED uses for the covariance parameters, and each number corresponds to
the lower boundary constraint. A missing value instructs PROC MIXED to use its default constraint,
and if you do not specify numbers for all of the covariance parameters, PROC MIXED assumes the
remaining ones are missing.
An example for which this option is useful is when you want to constrain the G matrix to be positive
definite in order to avoid the more computationally intensive algorithms required when G becomes
singular. The corresponding statements for a random coefficients model are as follows:
proc mixed;
class person;
model y = time;
random int time / type=fa0(2) sub=person;
parms / lowerb=1e-4,.,1e-4;
run;
6658 F Chapter 84: The MIXED Procedure
Here the TYPE=FA0(2) structure is used in order to specify a Cholesky root parameterization for the
2 2 unstructured blocks in G. This parameterization ensures that the G matrix is nonnegative definite,
and the PARMS statement then ensures that it is positive definite by constraining the two diagonal
terms to be greater than or equal to 1E–4.
NOBOUND
requests the removal of boundary constraints on covariance parameters. For example, variance
components have a default lower boundary constraint of 0, and the NOBOUND option allows their
estimates to be negative.
NOITER
requests that no Newton-Raphson iterations be performed and that PROC MIXED use the best value
from the grid search to perform inferences. By default, iterations begin at the best value from the
PARMS grid search.
NOPRINT
suppresses the display of the “Parameter Search” table.
NOPROFILE
specifies a different computational method for the residual variance during the grid search. By default,
PROC MIXED estimates this parameter by using the profile likelihood when appropriate. This
estimate is displayed in the Variance column of the “Parameter Search” table. The NOPROFILE
option suppresses the profiling and uses the actual value of the specified variance in the likelihood
calculations.
OLS
requests starting values corresponding to the usual general linear model. Specifically, all variances
and covariances are set to zero except for the residual variance, which is set equal to its ordinary least
squares (OLS) estimate. This option is useful when the default MIVQUE0 procedure produces poor
starting values for the optimization process.
PARMSDATA=SAS-data-set
PDATA=SAS-data-set
reads in covariance parameter values from a SAS data set. The data set should contain the Est or
Covp1–Covpn variables.
RATIOS
indicates that ratios with the residual variance are specified instead of the covariance parameters
themselves. The default is to use the individual covariance parameters.
UPPERB=value-list
enables you to specify upper boundary constraints on the covariance parameters. The value-list
specification is a list of numbers or missing values (.) separated by commas. You must list the numbers
in the order that PROC MIXED uses for the covariance parameters, and each number corresponds to
the upper boundary constraint. A missing value instructs PROC MIXED to use its default constraint,
and if you do not specify numbers for all of the covariance parameters, PROC MIXED assumes that
the remaining ones are missing.
PRIOR Statement F 6659
PRIOR Statement
PRIOR < distribution > < / options > ;
The PRIOR statement enables you to carry out a sampling-based Bayesian analysis in PROC MIXED. It
currently operates only with variance component models. Other TYPE= structures are not supported. The
analysis produces a SAS data set containing a pseudo-random sample from the joint posterior density of the
variance components and other parameters in the mixed model.
The posterior analysis is performed after all other PROC MIXED computations. It begins with the “Posterior
Sampling Information” table, which provides basic information about the posterior sampling analysis,
including the prior densities, sampling algorithm, sample size, and random number seed. The ODS name of
this table is Posterior.
By default, PROC MIXED uses an independence chain algorithm in order to generate the posterior sample
(Tierney 1994). This algorithm works by generating a pseudo-random proposal from a convenient base
distribution, chosen to be as close as possible to the posterior. The proposal is then retained in the sample
with probability proportional to the ratio of weights constructed by taking the ratio of the true posterior to the
base density. If a proposal is not accepted, then a duplicate of the previous observation is added to the chain.
In selecting the base distribution, PROC MIXED makes use of the fact that the fixed-effects parameters can
be analytically integrated out of the joint posterior, leaving the marginal posterior density of the variance
components. In order to better approximate the marginal posterior density of the variance components, PROC
MIXED transforms them by using the MIVQUE(0) equations. You can display the selected transformation
with the PTRANS option or specify your own with the TDATA= option. The density of the transformed
parameters is then approximated by a product of inverted gamma densities (see Gelfand et al. 1990).
To determine the parameters for the inverted gamma densities, PROC MIXED evaluates the logarithm of
the posterior density over a grid of points in each of the transformed parameters, and you can display the
results of this search with the PSEARCH option. PROC MIXED then performs a linear regression of these
values on the logarithm of the inverted gamma density. The resulting base densities are displayed in the
“Base Densities” table; the ODS name of this table is Base. You can input different base densities with the
BDATA= option.
At the end of the sampling, the “Acceptance Rates” table displays the acceptance rate computed as the number
of accepted samples divided by the total number of samples generated. The ODS name of the “Acceptance
Rates” table is AccRates.
The OUT= option specifies the output data set containing the posterior sample. PROC MIXED automatically
includes all variance component parameters in this data set (labeled COVP1–COVPn), the Type 3 F statistics
constructed as in Ghosh (1992) discussing Schervish (1992) (labeled T3Fn), the log values of the posterior
(labeled LOGF), the log of the base sampling density (labeled LOGG), and the log of their ratio (labeled
LOGRATIO). If you specify the SOLUTION option in the MODEL statement, the data set also contains
a random sample from the posterior density of the fixed-effects parameters (labeled BETAn); and if you
specify the SOLUTION option in the RANDOM statement, the table contains a random sample from the
posterior density of the random-effects parameters (labeled GAMn). PROC MIXED also generates additional
variables corresponding to any CONTRAST, ESTIMATE, or LSMEANS statement that you specify.
Subsequently, you can use SAS/INSIGHT or the UNIVARIATE, CAPABILITY, or KDE procedure to analyze
the posterior sample.
6660 F Chapter 84: The MIXED Procedure
The prior density of the variance components is, by default, a noninformative version of Jeffreys’ prior (Box
and Tiao 1973). You can also specify informative priors with the DATA= option or a flat (equal to 1) prior for
the variance components. The prior density of the fixed-effects parameters is assumed to be flat (equal to
1), and the resulting posterior is conditionally multivariate normal (conditioning on the variance component
parameters) with mean .X0 V 1 X/ X0 V 1 y and variance .X0 V 1 X/ .
Table 84.14 summarizes the options available in the PRIOR statement.
Option Description
DATA= Inputs the prior densities of the variance components
JEFFREYS Specifies a noninformative reference version of Jeffreys’ prior
FLAT Specifies a prior density equal to 1 everywhere
ALG= Specifies the algorithm used for generating the posterior sample
BDATA= Inputs the base densities used by the sampling algorithm
GRID= Specifies a grid of values over which to evaluate the posterior density
GRIDT= Specifies a transformed grid of values over which to evaluate the posterior
density
IFACTOR= An alias for the SFACTOR= option
LOGNOTE= Writes a note to the log after generating the sample
LOGRBOUND= Specifies the bounding constant for rejection sampling
NSAMPLE= Specifies the number of posterior samples to generate
NSEARCH= Specifies the number of posterior evaluations
OUT= Creates an output data set containing the sample from the posterior density
OUTG= Creates an output data set from the grid evaluations
OUTGT= Creates an output data set from the transformed grid evaluations
PSEARCH Displays the search used to determine the parameters for the inverted gamma
densities
PTRANS Displays the transformation of the variance components
SEED= Specifies an integer used to start the pseudo-random number generator
SFACTOR= Adjusts the search range of the transformed parameters
TDATA= Inputs the transformation used by the sampling algorithm
TRANS= Specifies the algorithm that determines the transformation of the covariance
parameters
UPDATE= An alias for the LOGNOTE= option
The distribution argument in the PRIOR statement determines the prior density for the variance component
parameters of your mixed model. Valid values are as follows.
DATA=
enables you to input the prior densities of the variance components used by the sampling algorithm.
This data set must contain the Type and Parm1–Parmn variables, where n is the largest number of
parameters among each of the base densities. The format of the DATA= data set matches that created
by PROC MIXED in the “Base Densities” table, so you can output the densities from one run and use
them as input for a subsequent run.
PRIOR Statement F 6661
JEFFREYS
specifies a noninformative reference version of Jeffreys’ prior constructed by using the square root of
the determinant of the expected information matrix as in (1.3.92) of Box and Tiao (1973). This is the
default prior.
FLAT
specifies a prior density equal to 1 everywhere, making the likelihood function the posterior.
You can specify the following options in the PRIOR statement after a slash (/).
ALG=IC | INDCHAIN
ALG=IS | IMPSAMP
ALG=RS | REJSAMP
ALG=RWC | RWCHAIN
specifies the algorithm to use for generating the posterior sample. The ALG=IC option requests an
independence chain algorithm. The option ALG=IS requests importance sampling, ALG=RS requests
rejection sampling, and ALG=RWC requests a random walk chain. The IC algorithm works only for
models that have random effects. If you specify ALG=IC and the model has no random effects, then it
silently switches to ALG=IS. By default, when the model has the random-effects, ALG=IC; otherwise,
ALG=IS. When ALG=IS, the “Acceptance Rates” table is not generated. For more information about
these techniques, see Ripley (1987); Smith and Gelfand (1992); Tierney (1994).
BDATA=
enables you to input the base densities used by the sampling algorithm. This data set must contain the
Type and Parm1–Parmn variables, where n is the largest number of parameters among each of the base
densities. The format of the BDATA= data set matches that created by PROC MIXED in the “Base
Densities” table, so you can output the densities from one run and use them as input for a subsequent
run.
GRID=(value-list)
specifies a grid of values over which to evaluate the posterior density. The value-list syntax is the same
as in the PARMS statement, and you must specify an output data set name with the OUTG= option.
GRIDT=(value-list)
specifies a transformed grid of values over which to evaluate the posterior density. The value-list
syntax is the same as in the PARMS statement, and you must specify an output data set name with the
OUTGT= option.
IFACTOR=number
is an alias for the SFACTOR= option.
LOGNOTE=number
instructs PROC MIXED to write a note to the SAS log after it generates the sample corresponding to
each multiple of number . This is useful for monitoring the progress of CPU-intensive runs.
6662 F Chapter 84: The MIXED Procedure
LOGRBOUND=number
specifies the bounding constant for rejection sampling. The value of number equals the maximum of
logff =gg over the variance component parameter space, where f is the posterior density and g is the
product inverted gamma densities used to perform rejection sampling.
When performing the rejection sampling, you might encounter the following message:
When this occurs, PROC MIXED reruns an optimization algorithm to determine a new log upper
bound and then restarts the rejection sampling. The resulting OUT= data set contains all observations
that have been generated; therefore, assuming that you have requested N samples, you should retain
only the final N observations in this data set for analysis purposes.
NSAMPLE=number
specifies the number of posterior samples to generate. The default is 1000, but more accurate results
are obtained with larger samples such as 10000.
NSEARCH=number
specifies the number of posterior evaluations PROC MIXED makes for each transformed parameter in
determining the parameters for the inverted gamma densities. The default is 20.
OUT=SAS-data-set
creates an output data set containing the sample from the posterior density.
OUTG=SAS-data-set
creates an output data set from the grid evaluations specified in the GRID= option.
OUTGT=SAS-data-set
creates an output data set from the transformed grid evaluations specified in the GRIDT= option.
PSEARCH
displays the search used to determine the parameters for the inverted gamma densities. The ODS name
of the table is Search.
PTRANS
displays the transformation of the variance components. The ODS name of the table is Trans.
SEED=number
specifies an integer used to start the pseudo-random number generator for the simulation. If you do not
specify a seed, or if you specify a value less than or equal to zero, the seed is by default generated from
reading the time of day from the computer clock. You should use a positive seed (less than 231 1)
whenever you want to duplicate the sample in another run of PROC MIXED.
SFACTOR=number
enables you to adjust the range over which PROC MIXED searches the transformed parameters in
order to determine the parameters for the inverted gamma densities. PROC MIXED determines the
range by first transforming the estimates from the standard PROC MIXED analysis (REML, ML, or
MIVQUE0, depending upon which estimation method you select). It then multiplies and divides the
transformed estimates by 2number to obtain upper and lower bounds, respectively. Transformed
values that produce negative variance components in the original scale are not included in the search.
The default value is 1; number must be greater than 0.5.
RANDOM Statement F 6663
TDATA=SAS-data-set
enables you to input the transformation of the covariance parameters used by the sampling algorithm.
This data set should contain the CovP1–CovPn variables. The format of the TDATA= data set matches
that created by PROC MIXED in the Trans table, so you can output the transformation from one run
and use it as input for a subsequent run.
UPDATE=number
is an alias for the LOGNOTE= option.
RANDOM Statement
RANDOM random-effects < / options > ;
The RANDOM statement defines the random effects constituting the vector in the mixed model. It
can be used to specify traditional variance component models (as in the VARCOMP procedure) and to
specify random coefficients. The random effects can be classification or continuous, and multiple RANDOM
statements are possible.
Using notation from the section “Mixed Models Theory” on page 6683, the purpose of the RANDOM
statement is to define the Z matrix of the mixed model, the random effects in the vector, and the structure of
G. The Z matrix is constructed exactly as the X matrix for the fixed effects, and the G matrix is constructed
to correspond with the effects constituting Z. The structure of G is defined by using the TYPE= option.
You can specify INTERCEPT (or INT) as a random effect to indicate the intercept. PROC MIXED does not
include the intercept in the RANDOM statement by default as it does in the MODEL statement.
Table 84.15 summarizes the options available in the RANDOM statement. All options are subsequently
discussed in alphabetical order.
Option Description
Construction of Covariance Structure
GDATA= Requests that the G matrix be read from a SAS data set
GROUP= Varies covariance parameters by groups
LDATA= Specifies data set with coefficient matrices for TYPE=LIN
NOFULLZ Eliminates columns in Z corresponding to missing values
RATIOS Indicates that ratios are specified in the GDATA= data set
SUBJECT= Identifies the subjects in the model
TYPE= Specifies the covariance structure
Statistical Output
ALPHA=˛ Determines the confidence level (1 ˛)
6664 F Chapter 84: The MIXED Procedure
Option Description
CL Requests confidence limits for predictors of random effects
G Displays the estimated G matrix
GC Displays the Cholesky root (lower) of estimated G matrix
GCI Displays the inverse Cholesky root (lower) of estimated G matrix
GCORR Displays the correlation matrix corresponding to estimated G ma-
trix
GI Displays the inverse of the estimated G matrix
SOLUTION Displays solutions b of the G-side random effects
V Displays blocks of the estimated V matrix
VC Displays the lower-triangular Cholesky root of blocks of the esti-
mated V matrix
VCI Displays the inverse Cholesky root of blocks of the estimated V
matrix
VCORR Displays the correlation matrix corresponding to blocks of the
estimated V matrix
VI Displays the inverse of the blocks of the estimated V matrix
You can specify the following options in the RANDOM statement after a slash (/).
ALPHA=number
requests that a t-type confidence interval be constructed for each of the random-effect estimates with
confidence level 1 – number . The value of number must be between 0 and 1; the default is 0.05.
CL
requests that t-type confidence limits be constructed for each of the random-effect estimates. The
confidence level is 0.95 by default; this can be changed with the ALPHA= option.
G
requests that the estimated G matrix be displayed. PROC MIXED displays blanks for values that are 0.
If you specify the SUBJECT= option, then the block of the G matrix corresponding to the first subject
is displayed. The ODS name of the table is G.
GC
displays the lower-triangular Cholesky root of the estimated G matrix according to the rules listed
under the G option. The ODS name of the table is CholG.
GCI
displays the inverse Cholesky root of the estimated G matrix according to the rules listed under the G
option. The ODS name of the table is InvCholG.
GCORR
displays the correlation matrix corresponding to the estimated G matrix according to the rules listed
under the G option. The ODS name of the table is GCorr.
RANDOM Statement F 6665
GDATA=SAS-data-set
requests that the G matrix be read in from a SAS data set. This G matrix is assumed to be known;
therefore, only R-side parameters from effects in the REPEATED statement are included in the
Newton-Raphson iterations. If no REPEATED statement is specified, then only a residual variance is
estimated.
The information in the GDATA= data set can appear in one of two ways. The first is a sparse
representation for which you include Row, Col, and Value variables to indicate the row, column, and
value of G, respectively. All unspecified locations are assumed to be 0. The second representation
is for dense matrices. In it you include Row and Col1–Coln variables to indicate, respectively, the
row and columns of G, which is a symmetric matrix of order n. For both representations, you must
specify effects in the RANDOM statement that generate a Z matrix that contains n columns. (See
Example 84.4.)
If you have more than one RANDOM statement, only one GDATA= option is required in any one of
them, and the data set you specify must contain the entire G matrix defined by all of the RANDOM
statements.
If the GDATA= data set contains variance ratios instead of the variances themselves, then use the
RATIOS option.
Known parameters of G can also be input by using the PARMS statement with the HOLD= option.
GI
displays the inverse of the estimated G matrix according to the rules listed under the G option. The
ODS name of the table is InvG.
GROUP=effect
GRP=effect
defines an effect specifying heterogeneity in the covariance structure of G. All observations having the
same level of the group effect have the same covariance parameters. Each new level of the group effect
produces a new set of covariance parameters with the same structure as the original group. You should
exercise caution in defining the group effect, because strange covariance patterns can result from its
misuse. Also, the group effect can greatly increase the number of estimated covariance parameters,
which can adversely affect the optimization process.
Continuous variables are permitted as arguments to the GROUP= option. PROC MIXED does not
sort by the values of the continuous variable; rather, it considers the data to be from a new subject or
group whenever the value of the continuous variable changes from the previous observation. Using a
continuous variable decreases execution time for models with a large number of subjects or groups and
also prevents the production of a large “Class Level Information” table.
LDATA=SAS-data-set
reads the coefficient matrices associated with the TYPE=LIN(number ) option. The data set must
contain the variables Parm, Row, Col1–Coln or Parm, Row, Col, Value. The Parm variable denotes
which of the number coefficient matrices is currently being constructed, and the Row, Col1–Coln, or
Row, Col, Value variables specify the matrix values, as they do with the GDATA= option. Unspecified
values of these matrices are set equal to 0.
6666 F Chapter 84: The MIXED Procedure
NOFULLZ
eliminates the columns in Z corresponding to missing levels of random effects involving CLASS
variables. By default, these columns are included in Z.
RATIOS
indicates that ratios with the residual variance are specified in the GDATA= data set instead of the
covariance parameters themselves. The default GDATA= data set contains the individual covariance
parameters.
SOLUTION
S
requests that the solution for the random-effects parameters be produced. Using notation from the
section “Mixed Models Theory” on page 6683, these estimates are the empirical best linear unbiased
predictors (EBLUPs) b D GZ b 0V b 1 .y Xb̌/. They can be useful for comparing the random effects
from different experimental units and can also be treated as residuals in performing diagnostics for
your mixed model.
The numbers displayed in the SE Pred column of the “Solution for Random Effects” table are not
the standard errors of the b displayed in the Estimate column; rather, they are the standard errors of
predictions bi i , where bi is the ith EBLUP and i is the ith random-effect parameter.
SUBJECT=effect
SUB=effect
identifies the subjects in your mixed model. Complete independence is assumed across subjects; thus,
for the RANDOM statement, the SUBJECT= option produces a block-diagonal structure in G with
identical blocks. The Z matrix is modified to accommodate this block diagonality. In fact, specifying a
subject effect is equivalent to nesting all other effects in the RANDOM statement within the subject
effect.
Continuous variables are permitted as arguments to the SUBJECT= option. PROC MIXED does not
sort by the values of the continuous variable; rather, it considers the data to be from a new subject or
group whenever the value of the continuous variable changes from the previous observation. Using a
continuous variable decreases execution time for models with a large number of subjects or groups and
also prevents the production of a large “Class Level Information” table.
When you specify the SUBJECT= option and a classification random effect, computations are usually
much quicker if the levels of the random effect are duplicated within each level of the SUBJECT=
effect.
TYPE=covariance-structure
specifies the covariance structure of G. Valid values for covariance-structure and their descriptions are
listed in Table 84.17 and Table 84.18. Although a variety of structures are available, most applications
call for either TYPE=VC or TYPE=UN. The TYPE=VC (variance components) option is the default
structure, and it models a different variance component for each random effect.
The TYPE=UN (unstructured) option is useful for correlated random coefficient models. For example,
the following statement specifies a random intercept-slope model that has different variances for the
intercept and slope and a covariance between them:
RANDOM Statement F 6667
You can also use TYPE=FA0(2) here to request a G estimate that is constrained to be nonnegative
definite.
If you are constructing your own columns of Z with continuous variables, you can use the
TYPE=TOEP(1) structure to group them together to have a common variance component. If you
want to have different covariance structures in different parts of G, you must use multiple RANDOM
statements with different TYPE= options.
REPEATED Statement
REPEATED < repeated-effect > < / options > ;
The REPEATED statement is used to specify the R matrix in the mixed model. Its syntax is different from
that of the REPEATED statement in PROC GLM. If no REPEATED statement is specified, R is assumed to
be equal to 2 I.
For many repeated measures models, no repeated effect is required in the REPEATED statement. Simply use
the SUBJECT= option to define the blocks of R and the TYPE= option to define their covariance structure.
In this case, the repeated measures data must be similarly ordered for each subject, and you must indicate
all missing response variables with periods in the input data set unless they all fall at the end of a subject’s
repeated response profile. These requirements are necessary in order to inform PROC MIXED of the proper
location of the observed repeated responses.
Specifying a repeated effect is useful when you do not want to indicate missing values with periods in the
input data set. The repeated effect must contain only classification variables. Make sure that the levels of
the repeated effect are different for each observation within a subject; otherwise, PROC MIXED constructs
identical rows in R corresponding to the observations with the same level. This results in a singular R and an
infinite likelihood.
Whether you specify a REPEATED effect or not, the rows of R for each subject are constructed in the order
in which they appear in the input data set.
Table 84.16 summarizes the options available in the REPEATED statement. All options are subsequently
discussed in alphabetical order.
Option Description
Construction of Covariance Structure
GROUP= Defines an effect specifying heterogeneity in the R-side covariance
structure
LDATA= Specifies data set with coefficient matrices for TYPE=LIN
LOCAL Requests that a diagonal matrix be added to R
LOCALW Specifies that only the local effects are weighted
NONLOCALW Specifies that only the nonlocal effects are weighted
SUBJECT= Identifies the subjects in the R-side model
TYPE= Specifies the R-side covariance structure
Statistical Output
HLM Produces a table of Hotelling-Lawley-McKeon statistics (McKeon
1974)
HLPS Produces a table of Hotelling-Lawley-Pillai-Samson statistics (Pil-
lai and Samson 1959)
R Displays blocks of the estimated R matrix
RC Display the Cholesky root (lower) of blocks of the estimated R
matrix
RCI Displays the inverse Cholesky root (lower) of blocks of the esti-
mated R matrix
REPEATED Statement F 6669
Option Description
RCORR Displays the correlation matrix corresponding to blocks of the
estimated R matrix
RI Displays the inverse of blocks of the estimated R matrix
You can specify the following options in the REPEATED statement after a slash (/).
GROUP=effect
GRP=effect
defines an effect that specifies heterogeneity in the covariance structure of R. All observations that
have the same level of the GROUP effect have the same covariance parameters. Each new level
of the GROUP effect produces a new set of covariance parameters with the same structure as the
original group. You should exercise caution in properly defining the GROUP effect, because strange
covariance patterns can result with its misuse. Also, the GROUP effect can greatly increase the number
of estimated covariance parameters, which can adversely affect the optimization process.
Continuous variables are permitted as arguments to the GROUP= option. PROC MIXED does not
sort by the values of the continuous variable; rather, it considers the data to be from a new subject or
group whenever the value of the continuous variable changes from the previous observation. Using a
continuous variable decreases execution time for models with a large number of subjects or groups and
also prevents the production of a large “Class Level Information” table.
HLM
produces a table of Hotelling-Lawley-McKeon statistics (McKeon 1974) for all fixed effects whose
levels change across data having the same level of the SUBJECT= effect (the within-subject fixed
effects). This option applies only when you specify a REPEATED statement with the TYPE=UN
option and no RANDOM statements. For balanced data, this model is equivalent to the multivariate
model for repeated measures in PROC GLM.
The Hotelling-Lawley-McKeon statistic has a slightly better F approximation than the Hotelling-
Lawley-Pillai-Samson statistic (see the description of the HLPS option, which follows). Both of
the Hotelling-Lawley statistics can perform much better in small samples than the default F statistic
(Wright 1994).
Separate tables are produced for Type 1, 2, and 3 tests, according to the ones you select. The ODS
table names are HLM1, HLM2, and HLM3, respectively.
HLPS
produces a table of Hotelling-Lawley-Pillai-Samson statistics (Pillai and Samson 1959) for all fixed
effects whose levels change across data having the same level of the SUBJECT= effect (the within-
subject fixed effects). This option applies only when you specify a REPEATED statement with the
TYPE=UN option and no RANDOM statements. For balanced data, this model is equivalent to
the multivariate model for repeated measures in PROC GLM, and this statistic is the same as the
Hotelling-Lawley Trace statistic produced by PROC GLM.
Separate tables are produced for Type 1, 2, and 3 tests, according to the ones you select. The ODS
table names are HLPS1, HLPS2, and HLPS3, respectively.
6670 F Chapter 84: The MIXED Procedure
LDATA=SAS-data-set
reads the coefficient matrices associated with the TYPE=LIN(number ) option. The data set must
contain the variables Parm, Row, Col1–Coln or Parm, Row, Col, Value. The Parm variable denotes
which of the number coefficient matrices is currently being constructed, and the Row, Col1–Coln, or
Row, Col, Value variables specify the matrix values, as they do with the RANDOM statement option
GDATA=. Unspecified values of these matrices are set equal to 0.
LOCAL
LOCAL=EXP(< effects >)
LOCAL=POM(POM-data-set)
requests that a diagonal matrix be added to R. With just the LOCAL option, this diagonal matrix
equals 2 I, and 2 becomes an additional variance parameter that PROC MIXED profiles out of the
likelihood provided that you do not specify the NOPROFILE option in the PROC MIXED statement.
The LOCAL option is useful if you want to add an observational error to a time series structure (Jones
and Boadi-Boateng 1991) or a nugget effect to a spatial structure Cressie (1993).
The LOCAL=EXP(<effects>) option produces exponential local effects, also known as dispersion
effects, in a log-linear variance model. These local effects have the form
2 diagŒexp.Uı/
where U is the full-rank design matrix corresponding to the effects that you specify and ı are the
parameters that PROC MIXED estimates. An intercept is not included in U because it is accounted for
by 2 . PROC MIXED constructs the full-rank U in terms of 1s and –1s for classification effects. Be
sure to scale continuous effects in U sensibly.
The LOCAL=POM(POM-data-set ) option specifies the power-of-the-mean structure. This structure
possesses a variance of the form 2 jx0i ˇ j for the ith observation, where xi is the ith row of X (the
design matrix of the fixed effects) and ˇ is an estimate of the fixed-effects parameters that you specify
in POM-data-set .
The SAS data set specified by POM-data-set contains the numeric variable Estimate (in previous
releases, the variable name was required to be EST), and it has at least as many observations as
there are fixed-effects parameters. The first p observations of the Estimate variable in POM-data-set
are taken to be the elements of ˇ , where p is the number of columns of X. You must order these
observations according to the non-full-rank parameterization of the MIXED procedure. One easy way
to set up POM-data-set for a ˇ corresponding to ordinary least squares is illustrated by the following
statements:
proc mixed;
class a;
model y = a x;
repeated / local=pom(sf);
run;
REPEATED Statement F 6671
Note that the generalized least squares estimate of the fixed-effects parameters from the second PROC
MIXED step usually is not the same as your specified ˇ . However, you can iterate the POM fitting
until the two estimates agree. Continuing from the previous example, the statements for performing
one step of this iteration are as follows:
data sf;
set sf1;
run;
Unfortunately, this iterative process does not always converge. For further details, see the description
of pseudo-likelihood in Chapter 3 of Carroll and Ruppert (1988).
LOCALW
specifies that only the local effects and no others be weighted. By default, all effects are weighted. The
LOCALW option is used in connection with the WEIGHT statement and the LOCAL option in the
REPEATED statement.
NONLOCALW
specifies that only the nonlocal effects and no others be weighted. By default, all effects are weighted.
The NONLOCALW option is used in connection with the WEIGHT statement and the LOCAL option
in the REPEATED statement.
See the PARMS statement for the possible forms of value-list . The ODS table name is R.
6672 F Chapter 84: The MIXED Procedure
SSCP
requests that an unstructured R matrix be estimated from the sum-of-squares-and-crossproducts matrix
of the residuals. It applies only when you specify TYPE=UN and have no RANDOM statements. Also,
you must have a sufficient number of subjects for the estimate to be positive definite.
This option is useful when the size of the blocks of R is large (for example, greater than 10) and you
want to use or inspect an unstructured estimate that is much quicker to compute than the default REML
estimate. The two estimates will agree for certain balanced data sets when you have a classification
fixed effect defined across all time points within a subject.
SUBJECT=effect
SUB=effect
identifies the subjects in your mixed model. Complete independence is assumed across subjects;
therefore, the SUBJECT= option produces a block-diagonal structure in R with identical blocks.
When the SUBJECT= effect consists entirely of classification variables, the blocks of R correspond to
observations sharing the same level of that effect. These blocks are sorted according to this effect as
well.
Continuous variables are permitted as arguments to the SUBJECT= option. PROC MIXED does not
sort by the values of the continuous variable; rather, it considers the data to be from a new subject or
group whenever the value of the continuous variable changes from the previous observation. Using a
continuous variable decreases execution time for models with a large number of subjects or groups and
also prevents the production of a large “Class Level Information” table.
If you want to model nonzero covariance among all of the observations in your SAS data set, specify
SUBJECT=INTERCEPT to treat the data as if they are all from one subject. However, be aware that in
this case PROC MIXED manipulates an R matrix with dimensions equal to the number of observations.
If no SUBJECT= effect is specified, then every observation is assumed to be from a different subject
and R is assumed to be diagonal. For this reason, you usually want to use the SUBJECT= option in the
REPEATED statement.
REPEATED Statement F 6673
TYPE=covariance-structure
specifies the covariance structure of the R matrix. The SUBJECT= option defines the blocks of R, and
the TYPE= option specifies the structure of these blocks. Valid values for covariance-structure and
their descriptions are provided in Table 84.17 and Table 84.18. The default structure is VC.
In Table 84.17, “Parms” is the number of covariance parameters in the structure, t is the overall
dimension of the covariance matrix, and 1.A/ equals 1 when A is true and 0 otherwise. For example,
1.i D j / equals 1 when i D j and 0 otherwise, and 1.ji j j < q/ equals 1 when ji j j < q
and 0 otherwise. For the TYPE=TOEPH structures, 0 D 1, and for the TYPE=UNR structures,
6674 F Chapter 84: The MIXED Procedure
i i D 1 for all i. For the direct product structures, the subscripts “1” and “2” see the first and second
structure in the direct product, respectively, and i1 D int..i C t2 1/=t2 /, j1 D int..j C t2 1/=t2 /,
i2 D mod.i 1; t2 / C 1, and j2 D mod.j 1; t2 / C 1.
In Table 84.18, c-list contains the names of the numeric variables used as coordinates of the location
of the observation in space, and dij is the Euclidean distance between the ith and jth vectors of these
coordinates, which correspond to the ith and jth observations in the input data set. For SP(POWA)
and SP(EXPA), c is the number of coordinates, and d.i; j; k/ is the absolute distance between the kth
coordinate, k D 1; : : : ; c, of the ith and jth observations in the input data set. For the geometrically
anisotropic structures SP(EXPGA), SP(GAUGA), and SP(SPHGA), exactly two spatial coordinate
variables must be specified as c 1 and c 2 . Geometric anisotropy is corrected by applying a rotation
and scaling to the coordinate system, and dij .; / represents the Euclidean distance between two
points in the transformed space. SP(MATERN) and SP(MATHSW) represent covariance structures in
a class defined by Matérn (see Matérn 1986; Handcock and Stein 1993; Handcock and Wallis 1994).
The function K is the modified Bessel function of the second kind of (real) order > 0; the parameter
governs the smoothness of the process (see below for more details).
Table 84.19 lists some examples of the structures in Table 84.17 and Table 84.18.
REPEATED Statement F 6675
3 2 1 2
2 2 3
1 0 0
6 1 2 1 0 7
Toeplitz with TOEP(2) 4 0 1 2 1 5
6 7
two bands
0 0 1 2
d12 d13 d14
2 3
1
6d21 1 d23 d24 7
Spatial SP(POW)(c ) 2 6
4d31 d32
7
1 d34 5
power
d41 d42 d43 1
12 1 2 1 3 2 1 4 3
2 3
6 2 1 22 2 3 2 4 2 7
Heterogeneous ARH(1) 2 32
6 7
43 1 3 2 3 4 5
AR(1)
4 1 3 4 2 2 4 3 42
2
2 3
1
1 7
2 6
6
First-order ARMA(1,1) 4
7
1 5
autoregressive 2
1
moving average
6676 F Chapter 84: The MIXED Procedure
12
2 3
1 2 1 1 3 2 1 4 3
62 1 1 22 2 3 1 2 4 2 7
Heterogeneous TOEPH
32
6 7
43 1 2 3 2 1 3 4 1 5
Toeplitz
4 1 3 4 2 2 4 3 1 42
12
2 3
1 2 21 1 3 31 1 4 41
62 1 21 22 2 3 32 2 4 42 7
Unstructured UNR
32
6 7
43 1 31 3 2 32 3 4 43 5
correlations
4 1 41 4 2 42 4 3 43 42
2
2 3
1
12
21
Direct product UN@AR(1) ˝ 1 5D
22
4
21
AR(1) 2 1
The following provides some further information about these covariance structures:
ANTE(1) specifies the first-order antedependence structure (see Kenward 1987; Patel 1991;
Macchiavelli and Arnold 1994). In Table 84.17, i2 is the ith variance parameter,
and k is the kth autocorrelation parameter satisfying jk j < 1.
AR(1) specifies a first-order autoregressive structure. PROC MIXED imposes the con-
straint jj < 1 for stationarity.
REPEATED Statement F 6677
.1 C b1 1 /.1 C b1 /
D
1 C b12 C 2b1 1
You can perform a likelihood ratio test of the Huynh-Feldt conditions by running
PROC MIXED twice, once with TYPE=HF and once with TYPE=UN, and then
subtracting their respective values of –2 times the maximized likelihood.
If PROC MIXED does not converge under your Huynh-Feldt model, you can specify
your own starting values with the PARMS statement. The default MIVQUE(0)
starting values can sometimes be poor for this structure. A good choice for starting
values is often the parameter estimates corresponding to an initial fit that uses
TYPE=CS.
LIN(q ) specifies the general linear covariance structure with q parameters. This structure
consists of a linear combination of known matrices that are input with the LDATA=
option. This structure is very general, and you need to make sure that the variance
matrix is positive definite. By default, PROC MIXED sets the initial values of the
parameters to 1. You can use the PARMS statement to specify other initial values.
LINEAR(q ) is an alias for TYPE=LIN(q ).
SIMPLE is an alias for TYPE=VC.
SP(EXPA)(c-list ) specifies the spatial anisotropic exponential structure, where c-list is a list of
variables indicating the coordinates. This structure has .i; j / element equal to
c
Y
2 expf k d.i; j; k/pk g
kD1
where c is the number of coordinates and d.i; j; k/ is the absolute distance between
the kth coordinate (k D 1; : : : ; c) of the ith and jth observations in the input data
set. There are 2c + 1 parameters to be estimated: k , pk (k D 1; : : : ; c), and 2 .
You might want to constrain some of the EXPA parameters to known values. For
example, suppose you have three coordinate variables C1, C2, and C3 and you want
to constrain the powers pk to equal 2, as in Sacks et al. (1989). Suppose further
that you want to model covariance across the entire input data set and you suspect
the k and 2 estimates are close to 3, 4, 5, and 1, respectively. Then specify the
following statements:
for a properly chosen angle and scaling factor . Elliptical isocorrelation con-
tours are thereby transformed to spherical contours, adding two parameters to the
respective isotropic covariance structures. Euclidean distances (see Table 84.18)
are expressed in terms of c .
The angle of the clockwise rotation is reported in radians, 0 2. The
scaling parameter represents the ratio of the range parameters in the direction of
the major and minor axis of the correlation contours. In other words, following a
rotation of the coordinate system by angle , isotropy is achieved by compressing
or magnifying distances in one coordinate by the factor .
Fixing D 1:0 reduces the models to isotropic ones for any angle of rotation. If the
scaling parameter is held constant at 1.0, you should also hold constant the angle of
rotation, as in the following statements:
If is fixed at any other value than 1.0, the angle of rotation can be estimated.
Specifying a starting grid of angles and scaling factors can considerably improve
the convergence properties of the optimization algorithm for these models. Only a
single random effect with geometrically anisotropic structure is permitted.
SP(MATERN)(c-list ) | SP(MATHSW)(c-list ) specifies covariance structures in the Matérn class of
covariance functions (Matérn 1986). Two observations for the same subject (block
of R) that are Euclidean distance dij apart have covariance
21 dij
2K .dij =/ > 0; > 0
./ 2
where K is the modified Bessel function of the second kind of (real) order > 0.
The smoothness (continuity) of a stochastic process with covariance function in
this class increases with . The Matérn class thus enables data-driven estimation of
the smoothness properties. The covariance is identical to the exponential model for
D 0:5 (TYPE=SP(EXP)(c-list)), while for D 1 the model advocated by Whittle
(1954) results. As ! 1 the model approaches the gaussian covariance structure
(TYPE=SP(GAU)(c-list)).
The MATHSW structure represents the Matérn class in the parameterization of
Handcock and Stein (1993) and Handcock and Wallis (1994),
p p
2 1 dij 2dij
2K
./
Since computation of the function K and its derivatives is numerically very inten-
sive, fitting models with Matérn covariance structures can be more time-consuming
than with other spatial covariance structures. Good starting values are essential.
SP(POW)(c-list ) | SP(POWA)(c-list ) specifies the spatial power structures. When the estimated
value of becomes negative, the computed covariance is multiplied by cos.dij /
to account for the negativity.
6680 F Chapter 84: The MIXED Procedure
where dmin and dmax are the smallest and largest distance between any two obser-
vations, ı 0 is the decay speed, and 0 < 1. See TYPE=SP(EXP) for the
computation of the distance dij from the variables specified in c-list . When the
estimated value of becomes negative, the computed covariance is multiplied by
cos.dij / to account for the negativity.
For power analysis of repeated measures designs that have a LEAR correlation
structure, see the section “POWER Statement” on page 4258 in Chapter 55, “The
GLMPOWER Procedure.”
Note that TYPE=SP(LEAR) is not supported for GROUP= option in this SAS
release.
TOEP< (q ) > specifies a banded Toeplitz structure. This can be viewed as a moving-average
structure with order equal to q 1. The TYPE=TOEP option is a full Toeplitz
matrix, which can be viewed as an autoregressive structure with order equal to the
dimension of the matrix. The specification TYPE=TOEP(1) is the same as 2 I ,
where I is an identity matrix, and it can be useful for specifying the same variance
component for several effects.
TOEPH< (q ) > specifies a heterogeneous banded Toeplitz structure. In Table 84.17, i2 is the ith
variance parameter and j is the jth correlation parameter satisfying jj j < 1. If
you specify the order parameter q , then PROC MIXED estimates only the first q
bands of the matrix, setting all higher bands equal to 0. The option TOEPH(1) is
equivalent to both the TYPE=UN(1) and TYPE=UNR(1) options.
UN< (q ) > specifies a completely general (unstructured) covariance matrix parameterized
directly in terms of variances and covariances. The variances are constrained to be
nonnegative, and the covariances are unconstrained. This structure is not constrained
to be nonnegative definite in order to avoid nonlinear constraints; however, you
can use the TYPE=FA0 structure if you want this constraint to be imposed by a
Cholesky factorization. If you specify the order parameter q , then PROC MIXED
estimates only the first q bands of the matrix, setting all higher bands equal to 0.
UNR< (q ) > specifies a completely general (unstructured) covariance matrix parameterized
in terms of variances and correlations. This structure fits the same model as
the TYPE=UN(q) option but with a different parameterization. The ith variance
parameter is i2 . The parameter j k is the correlation between the jth and kth
measurements; it satisfies jj k j < 1. If you specify the order parameter r, then
PROC MIXED estimates only the first q bands of the matrix, setting all higher
bands equal to zero.
UN@AR(1) | UN@CS | UN@UN specify direct (Kronecker) product structures designed for
multivariate repeated measures (see Galecki 1994). These structures are constructed
by taking the Kronecker product of an unstructured matrix (modeling covariance
REPEATED Statement F 6681
proc mixed;
class Var Year Child;
model Y = Var Year Var*Year;
repeated Var Year / type=un@ar(1)
subject=Child;
run;
You should nearly always want to model different means for the multivariate
observations; hence the inclusion of Var in the MODEL statement. The preceding
mean model consists of cell means for all combinations of VAR and YEAR.
VC specifies standard variance components and is the default structure for both the
RANDOM and REPEATED statements. In the RANDOM statement, a distinct
variance component is assigned to each effect. In the REPEATED statement, this
structure is usually used only with the GROUP= option to specify a heterogeneous
variance model.
Jennrich and Schluchter (1986) provide general information about the use of covariance structures,
and Wolfinger (1996) presents details about many of the heterogeneous structures. Modeling with
spatial covariance structures is discussed in many sources (Marx and Thompson 1987; Zimmerman
and Harville 1991; Cressie 1993; Brownie, Bowman, and Burton 1993; Stroup, Baenziger, and Mulitze
1994; Brownie and Gumpertz 1997; Gotway and Stroup 1997; Chilès and Delfiner 1999; Schabenberger
and Gotway 2005; Littell et al. 2006).
6682 F Chapter 84: The MIXED Procedure
SLICE Statement
SLICE model-effect < / options > ;
The SLICE statement provides a general mechanism for performing a partitioned analysis of the LS-means
for an interaction. This analysis is also known as an analysis of simple effects.
This statement uses the same options as the LSMEANS statement, which are summarized in Table 20.23 in
Chapter 20, “Shared Concepts and Topics.” For more information about the syntax of the SLICE statement,
see the section “SLICE Statement” on page 527 in Chapter 20, “Shared Concepts and Topics.”
N OTE : Use the section “LSMEANS Statement” on page 477 in Chapter 20, “Shared Concepts and Topics,”
only for definitions of the options that you can use with the SLICE statement. PROC MIXED uses a slightly
different syntax for the LSMEANS, which is described in the section “LSMEANS Statement” on page 6634.
STORE Statement
STORE < OUT= >item-store-name < / LABEL='label' > ;
The STORE statement saves the context and results of the statistical analysis. The resulting item store has
a binary file format that cannot be modified. The contents of the item store can be processed using the
PLM procedure. For more information about the syntax of the STORE statement, see the section “STORE
Statement” on page 531 in Chapter 20, “Shared Concepts and Topics.”
WEIGHT Statement
WEIGHT variable ;
If you do not specify a REPEATED statement, the WEIGHT statement operates exactly like the one in PROC
GLM. In this case PROC MIXED replaces X0 X and Z0 Z with X0 WX and Z0 WZ, where W is the diagonal
weight matrix. If you specify a REPEATED statement, then the WEIGHT statement replaces R with LRL,
where L is a diagonal matrix with elements W 1=2 . Observations with nonpositive or missing weights are
not included in the PROC MIXED analysis.
If a computation in PROC MIXED involves R, then the WEIGHT statement replaces R with W 1=2 RW 1=2 .
For example, the covariance matrix V for the observations usually have the form V D ZGZ0 C R, which
with the WEIGHT statement becomes V D ZGZ0 C W 1=2 RW 1=2 :
Details: MIXED Procedure F 6683
Matrix Notation
Suppose that you observe n data points y1 ; : : : ; yn and that you want to explain them by using n values for
each of p explanatory variables x11 ; : : : ; x1p , x21 ; : : : ; x2p , : : : ; xn1 ; : : : ; xnp . The xij values can be either
regression-type continuous variables or dummy variables indicating class membership. The standard linear
model for this setup is
p
X
yi D xij ˇj C i i D 1; : : : ; n
j D1
where ˇ1 ; : : : ; ˇp are unknown fixed-effects parameters to be estimated and 1 ; : : : ; n are unknown inde-
pendent and identically distributed normal (Gaussian) random variables with mean 0 and variance 2 .
The preceding equations can be written simultaneously by using vectors and a matrix, as follows:
2 3 2 32 3 2 3
y1 x11 x12 : : : x1p ˇ1 1
6 y2 7 6 x21 x22 : : : x2p 7 6 ˇ2 7 6 2 7
6 :: 7 D 6 :: :: 7 6 :: 7 C 6 :: 7
6 7 6 76 7 6 7
::
4 : 5 4 : : : 54 : 5 4 : 5
yn xn1 xn2 : : : xnp ˇp n
y D Xˇ C
where y denotes the vector of observed yi ’s, X is the known matrix of xij ’s, ˇ is the unknown fixed-effects
parameter vector, and is the unobserved vector of independent and identically distributed Gaussian random
errors.
In addition to denoting data, random variables, and explanatory variables in the preceding fashion, the
subsequent development makes use of basic matrix operators such as transpose (0 ), inverse ( 1 ), generalized
inverse ( ), determinant (j j), and matrix multiplication. See Searle (1982) for details about these and other
matrix techniques.
6684 F Chapter 84: The MIXED Procedure
y D Xˇ C Z C
where everything is the same as in the general linear model except for the addition of the known design matrix,
Z, and the vector of unknown random-effects parameters, . The matrix Z can contain either continuous
or dummy variables, just like X. The name mixed model comes from the fact that the model contains both
fixed-effects parameters, ˇ, and random-effects parameters, . See Henderson (1990) and Searle, Casella,
and McCulloch (1992) for historical developments of the mixed model.
A key assumption in the foregoing analysis is that and are normally distributed with
0
E D
0
G 0
Var D
0 R
The variance of y is, therefore, V D ZGZ0 C R. You can model V by setting up the random-effects design
matrix Z and by specifying covariance structures for G and R.
Note that this is a general specification of the mixed model, in contrast to many texts and articles that discuss
only simple random effects. Simple random effects are a special case of the general specification with Z
containing dummy variables, G containing variance components in a diagonal structure, and R D 2 In ,
where In denotes the n n identity matrix. The general linear model is a further special case with Z D 0 and
R D 2 In .
The following two examples illustrate the most common formulations of the general linear mixed model.
The first column (coded entirely with 1s) fits an intercept, and the second column (coded with times of 1; 2; 3)
fits a slope. Here, n D 3s and p D 2.
Suppose further that you want to introduce a common correlation among the observations from a single
individual, with correlation being the same for all individuals. One way of setting this up in the general mixed
Mixed Models Theory F 6685
model is to eliminate the Z and G matrices and let the R matrix be block diagonal with blocks corresponding
to the individuals and with each block having the compound-symmetry structure. This structure has two
unknown parameters, one modeling a common covariance and the other modeling a residual variance. The
form for R would then be as follows:
2 2
1 C 2 12 12
3
6 12 12 C 2 12 7
2 2 2 2
6 7
6
6 1 1 1 C 7
7
RD6
6 :: 7
: 7
2 2 2 2
6 7
6
6 1 C 1 1
7
7
2 2 2 2
4 1 1 C 1 5
2 2 2
1 1 1 C 2
where blanks denote zeros. There are 3s rows and columns altogether, and the common correlation is
12 =.12 C 2 /.
The PROC MIXED statements to fit this model are as follows:
proc mixed;
class indiv;
model y = time;
repeated / type=cs subject=indiv;
run;
Here, indiv is a classification variable indexing individuals. The MODEL statement fits a straight line for
time ; the intercept is fit by default just as in PROC GLM. The REPEATED statement models the R matrix:
TYPE=CS specifies the compound symmetry structure, and SUBJECT=INDIV specifies the blocks of R.
An alternative way of specifying the common intra-individual correlation is to let
1
2 3
6
6 1 7
7
6
6 1 7
7
6
6 1 7
7
6 1 7
ZD6
6 7
6 1 7
7
6 :: 7
6
6 : 7
7
6
6 1 7
7
4 1 5
1
12
2 3
6 12 7
GD6
6 7
:: 7
4 : 5
12
proc mixed;
class indiv;
model y = time;
random indiv;
run;
proc mixed;
class indiv;
model y = time;
random intercept / subject=indiv;
run;
Both of these specifications fit the same model as the previous one that used the REPEATED statement;
however, the RANDOM specifications constrain the correlation to be positive, whereas the REPEATED
specification leaves the correlation unconstrained.
proc mixed;
class a b block;
model y = a|b;
random block a*block;
run;
Here
R D 2 I24
Mixed Models Theory F 6687
1 1
2 3
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
1 1
6 7
ZD6
6 7
1 1
7
6 7
1 1
6 7
6 7
1 1
6 7
6 7
6 7
6 1 1 7
6 7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
6
6 1 1 7
7
4 1 1 5
1 1
B2
2 3
2
6 AB 7
2
6 7
6 AB 7
2
6 7
6
6 AB 7
7
GD6
6 :: 7
: 7
B2
6 7
6 7
6 7
2
6
6 AB 7
7
2
4 AB 5
2
AB
Mixed Models Theory F 6689
.y Xˇ/0 V 1
.y Xˇ/
However, it requires knowledge of V and, therefore, knowledge of G and R. Lacking such information, one
approach is to use estimated GLS, in which you insert some reasonable estimate for V into the minimization
problem. The goal thus becomes finding a reasonable estimate of G and R.
In many situations, the best approach is to use likelihood-based methods, exploiting the assumption that
and are normally distributed (Hartley and Rao 1967; Patterson and Thompson 1971; Harville 1977; Laird
and Ware 1982; Jennrich and Schluchter 1986). PROC MIXED implements two likelihood-based methods:
maximum likelihood (ML) and restricted/residual maximum likelihood (REML). A favorable theoretical
property of ML and REML is that they accommodate data that are missing at random (Rubin 1976; Little
1995).
PROC MIXED constructs an objective function associated with ML or REML and maximizes it over all
unknown parameters. Using calculus, it is possible to reduce this maximization problem to one over only the
parameters in G and R. The corresponding log-likelihood functions are as follows:
1 1 0 1 n
ML W l.G; R/ D log jVj rV r log.2/
2 2 2
1 1 1 0 n p
REML W lR .G; R/ D log jVj log jX0 V 1 Xj rV 1
r log.2/g
2 2 2 2
where r D y X.X0 V 1 X/ X0 V 1 y and p is the rank of X. PROC MIXED actually minimizes –2
times these functions by using a ridge-stabilized Newton-Raphson algorithm. Lindstrom and Bates (1988)
provide reasons for preferring Newton-Raphson to the Expectation-Maximum (EM) algorithm (Dempster,
Laird, and Rubin 1977; Laird, Lange, and Stram 1987), as well as analytical details for implementing a
QR-decomposition approach to the problem. Wolfinger, Tobias, and Sall (1994) present the sweep-based
algorithms that are implemented in PROC MIXED.
One advantage of using the Newton-Raphson algorithm is that the second derivative matrix of the objective
function evaluated at the optima is available upon completion. Denoting this matrix H, the asymptotic theory
of maximum likelihood (see Serfling 1980) shows that 2H 1 is an asymptotic variance-covariance matrix of
the estimated parameters of G and R. Thus, tests and confidence intervals based on asymptotic normality can
be obtained. However, these can be unreliable in small samples, especially for parameters such as variance
components that have sampling distributions that tend to be skewed to the right.
If a residual variance 2 is a part of your mixed model, it can usually be profiled out of the likelihood.
This means solving analytically for the optimal 2 and plugging this expression back into the likelihood
formula (see Wolfinger, Tobias, and Sall 1994). This reduces the number of optimization parameters by
one and can improve convergence properties. PROC MIXED profiles the residual variance out of the log
likelihood whenever it appears reasonable to do so. This includes the case when R equals 2 I and when it
has blocks with a compound symmetry, time series, or spatial structure. PROC MIXED does not profile the
log likelihood when R has unstructured blocks, when you use the HOLD= or NOITER option in the PARMS
statement, or when you use the NOPROFILE option in the PROC MIXED statement.
Instead of ML or REML, you can use the noniterative MIVQUE0 method to estimate G and R (Rao 1972;
LaMotte 1973; Wolfinger, Tobias, and Sall 1994). In fact, by default PROC MIXED uses MIVQUE0
6690 F Chapter 84: The MIXED Procedure
estimates as starting values for the ML and REML procedures. For variance component models, another
estimation method involves equating Type 1, 2, or 3 expected mean squares to their observed values and
solving the resulting system. However, Swallow and Monahan (1984) present simulation evidence favoring
REML and ML over MIVQUE0 and other method-of-moment estimators.
" # " #
X0 b
R 1X X0 b
R 1Z X0 b 1y
b̌ R
D
Z0 b
R 1 X Z0 b
R 1Z C Gb 1 b 0b
ZR 1y
b̌ D .X0 b
V 1
X/ X0 b
V 1
y
b 0b
b D GZ V 1
.y Xb̌/
and have connections with empirical Bayes estimators (Laird and Ware 1982; Carlin and Louis 1996).
Note that the mixed model equations are extended normal equations and that the preceding expression
assumes that Gb is nonsingular. For the extreme case where the eigenvalues of G b are very large, Gb 1
contributes very little to the equations and b is close to what it would be if actually contained fixed-effects
parameters. On the other hand, when the eigenvalues of G b 1 dominates the equations
b are very small, G
and b is close to 0. For intermediate cases, G 1
b can be viewed as shrinking the fixed-effects estimates of
toward 0 (Robinson 1991).
If G
b is singular, then the mixed model equations are modified (Henderson 1984) as follows:
" # " #
X0 bR 1X X0 b
R 1 ZG
b b̌ X0 bR 1y
D
Gb 0 Z0 b
R 1X G b 0 Z0 b
R 1 ZGbCG
b b0 Z0 b
G R 1y
Denote the generalized inverses of the nonsingular G b and singular Gb forms of the mixed model equations by
C and M, respectively. In the nonsingular case, the solution b estimates the random effects directly, but in
the singular case the estimates of random effects are achieved through a back-transformation b D Gb b where
is the solution to the modified mixed model equations. Similarly, while in the nonsingular case C itself is
b
the estimated covariance matrix for .b̌; b/, in the singular case the covariance estimate for .b̌; Gb
b / is given
by PMP where
I
PD
G
b
An example of when the singular form of the equations is necessary is when a variance component estimate
falls on the boundary constraint of 0.
Mixed Models Theory F 6691
Model Selection
The previous section on estimation assumes the specification of a mixed model in terms of X, Z, G, and R.
Even though X and Z have known elements, their specific form and construction are flexible, and several
possibilities can present themselves for a particular data set. Likewise, several different covariance structures
for G and R might be reasonable.
Space does not permit a thorough discussion of model selection, but a few brief comments and references are
in order. First, subject matter considerations and objectives are of great importance when selecting a model;
see Diggle (1988) and Lindsey (1993).
Second, when the data themselves are looked to for guidance, many of the graphical methods and diagnostics
appropriate for the general linear model extend to the mixed model setting as well (Christensen, Pearson, and
Johnson 1992).
Finally, a likelihood-based approach to the mixed model provides several statistical measures for model
adequacy as well. The most common of these are the likelihood ratio test and Akaike’s and Schwarz’s criteria
(Bozdogan 1987; Wolfinger 1993; Keselman et al. 1998, 1999).
Statistical Properties
If G and R are known, b̌ is the best linear unbiased estimator (BLUE) of ˇ, and b is the best linear unbiased
predictor (BLUP) of (Searle 1971; Harville 1988, 1990; Robinson 1991; McLean, Sanders, and Stroup
1991). Here, “best” means minimum mean squared error. The covariance matrix of .b̌ ˇ; b / is
0 1
X0 R 1 Z
XR X
CD
Z0 R 1 X Z0 R 1 Z C G 1
as the approximate variance-covariance matrix of .b̌ ˇ; b ). In this case, the BLUE and BLUP acronyms
no longer apply, but the word empirical is often added to indicate such an approximation. The appropriate
acronyms thus become EBLUE and EBLUP.
C can also be written as
McLean and Sanders (1988) show that b
" #
0
C
b 11 C
b
21
CD
b
C21 b
b C22
where
C11 D .X0 b
b V 1 X/
C21 D GZ
b b 0bV 1 Xb
C11
0 1 b 1/ 1
C22 D .Z b
b R ZCG C21 X0 b
b V 1
ZG
b
6692 F Chapter 84: The MIXED Procedure
C11 is the familiar estimated generalized least squares formula for the variance-covariance matrix
Note that b
of .
b̌
C tends to underestimate the true sampling variability of (b̌ b) because no account
As a cautionary note, b
is made for the uncertainty in estimating G and R. Although inflation factors have been proposed (Kackar
and Harville 1984; Kass and Steffey 1989; Prasad and Rao 1990), they tend to be small for data sets that
are fairly well balanced. PROC MIXED does not compute any inflation factors by default, but rather
accounts for the downward bias by using the approximate t and F statistics described subsequently. The
DDFM=KENWARDROGER or DDFM=KENWARDROGER2 option in the MODEL statement prompts
PROC MIXED to compute a specific inflation factor along with Satterthwaite-based degrees of freedom.
The estimability requirement (Searle 1971) applies only to the ˇ portion of L, because any linear combination
of is estimable. Such a formulation in terms of a general L matrix encompasses a wide variety of common
inferential procedures such as those employed with Type 1–Type 3 tests and LS-means. The CONTRAST and
ESTIMATE statements in PROC MIXED enable you to specify your own L matrices. Typically, inference on
fixed effects is the focus, and, in this case, the portion of L is assumed to contain all 0s.
Statistical inferences are obtained by testing the hypothesis
ˇ
H WL D0
Mixed Models Theory F 6693
where tb
;˛=2
is the 100.1 ˛=2/th percentile of the tb
distribution.
When the rank of L is greater than 1, PROC MIXED constructs the following general F statistic:
0
0
CL0 / 1L
b̌ b̌
L .Lb
b b
F D
r
where r D rank.Lb CL0 /. Analogous to t, F in general has an approximate F distribution with r numerator
denominator degrees of freedom.
degrees of freedom and b
The t and F statistics enable you to make inferences about your fixed effects, which account for the variance-
covariance model you select. An alternative is the 2 statistic associated with the likelihood ratio test. This
statistic compares two fixed-effects models, one a special case of the other. It is computed just as when
comparing different covariance models, although you should use ML and not REML here because the penalty
term associated with restricted likelihoods depends upon the fixed-effects specification.
Notice that this is a modification of the usual F statistic where .Lb CL0 / 1 is replaced with .LL0 / 1 and
rank.L/ is replaced with t1 D trace.Mb C/; see, for example, Brunner, Domhof, and Langer (2002, Sec. 5.4).
The p-values for this statistic are computed from either an F1 ;2 or an F1 ;1 distribution. The respective
6694 F Chapter 84: The MIXED Procedure
t12
1 D
trace.Mb CMb
C/
2t12
2 D
g0 Ag
maxfminf2 ; dfe g; 1g g0 Ag > 1E3 MACEPS
2 D
1 otherwise
The term g0 Ag in the term 2 for the denominator degrees of freedom is based on approximating
VarŒtrace.Mb C/ based on a first-order Taylor series about the true covariance parameters. This gener-
alizes results in the appendix of Brunner, Dette, and Munk (1997) to a broader class of models. The vector
g D Œg1 ; : : : ; gq contains the partial derivatives
!
0 1 @C
b
0
trace L LL L
@i
and A is the asymptotic variance-covariance matrix of the covariance parameter estimates (ASYCOV option).
PROC MIXED reports 1 and 2 as “NumDF” and “DenDF” under the “ANOVA F” heading in the output.
The corresponding p-values are denoted as “Pr > F(DDF)” for F1 ;2 and “Pr > F(infty)” for F1 ;1 ,
respectively.
P-values computed with the ANOVAF option can be identical to the nonparametric tests in Akritas, Arnold,
and Brunner (1997) and in Brunner, Domhof, and Langer (2002), provided that the response data consist
of properly created (and sorted) ranks and that the covariance parameters are estimated by MIVQUE0 in
models with REPEATED statement and properly chosen SUBJECT= and/or GROUP= effects.
If you model an unstructured covariance matrix in a longitudinal model with one or more repeated factors,
the ANOVAF results are identical to a multivariate MANOVA where degrees of freedom are corrected with
the Greenhouse-Geisser adjustment (Greenhouse and Geisser 1959). For example, suppose that factor A has
2 levels and factor B has 4 levels. The following two sets of statements produce the same p-values:
run;
y D Xˇ C Z C
where y represents univariate data, ˇ is an unknown vector of fixed effects with known model matrix X, is
an unknown vector of random effects with known model matrix Z, and is an unknown random error vector.
PROC MIXED constructs a mixed model according to the specifications in the MODEL, RANDOM, and
REPEATED statements. Each effect in the MODEL statement generates one or more columns in the model
matrix X, and each effect in the RANDOM statement generates one or more columns in the model matrix Z.
Effects in the REPEATED statement do not generate model matrices; they serve only to index observations
within subjects. This section shows precisely how PROC MIXED builds X and Z.
6696 F Chapter 84: The MIXED Procedure
Intercept
By default, all models automatically include a column of 1s in X to estimate a fixed-effect intercept parameter
. You can use the NOINT option in the MODEL statement to suppress this intercept. The NOINT option is
useful when you are specifying a classification effect in the MODEL statement and you want the parameter
estimate to be in terms of the mean response for each level of that effect, rather than in terms of a deviation
from an overall mean.
By contrast, the intercept is not included by default in Z. To obtain a column of 1s in Z, you must specify in
the RANDOM statement either the INTERCEPT effect or some effect that has only one level.
Regression Effects
Numeric variables, or polynomial terms involving them, can be included in the model as regression effects
(covariates). The actual values of such terms are included as columns of the model matrices X and Z. You
can use the bar operator with a regression effect to generate polynomial effects. For instance, X|X|X expands
to X X*X X*X*X, a cubic model.
Main Effects
If a classification variable has m levels, PROC MIXED generates m columns in the model matrix for its main
effect. Each column is an indicator variable for a given level. The order of the columns is the sort order of
the values of their levels and can be controlled with the ORDER= option in the PROC MIXED statement.
Table 84.20 is an example.
Data I A B
A B A1 A2 B1 B2 B3
1 1 1 1 0 1 0 0
1 2 1 1 0 0 1 0
1 3 1 1 0 0 0 1
2 1 1 0 1 1 0 0
2 2 1 0 1 0 1 0
2 3 1 0 1 0 0 1
Typically, there are more columns for these effects than there are degrees of freedom for them. In other
words, PROC MIXED uses an overparameterized model.
Interaction Effects
Often a model includes interaction (crossed) effects. With an interaction, PROC MIXED first reorders the
terms to correspond to the order of the variables in the CLASS statement. Thus, B*A becomes A*B if A
precedes B in the CLASS statement. Then, PROC MIXED generates columns for all combinations of levels
that occur in the data. The order of the columns is such that the rightmost variables in the cross index faster
than the leftmost variables (Table 84.21). Empty columns (that would contain all 0s) are not generated for X,
but they are for Z.
Parameterization of Mixed Models F 6697
Data I A B A*B
A B A1 A2 B1 B2 B3 A1B1 A1B2 A1B3 A2B1 A2B2 A2B3
1 1 1 1 0 1 0 0 1 0 0 0 0 0
1 2 1 1 0 0 1 0 0 1 0 0 0 0
1 3 1 1 0 0 0 1 0 0 1 0 0 0
2 1 1 0 1 1 0 0 0 0 0 1 0 0
2 2 1 0 1 0 1 0 0 0 0 0 1 0
2 3 1 0 1 0 0 1 0 0 0 0 0 1
In the preceding matrix, main-effects columns are not linearly independent of crossed-effects columns; in
fact, the column space for the crossed effects contains the space of the main effect.
When your model contains many interaction effects, you might be able to code them more parsimoniously by
using the bar operator ( | ). The bar operator generates all possible interaction effects. For example, A|B|C
expands to A B A*B C A*C B*C A*B*C. To eliminate higher-order interaction effects, use the at sign (@) in
conjunction with the bar operator. For instance, A|B|C|D @2 expands to A B A*B C A*C B*C D A*D B*D
C*D.
Nested Effects
Nested effects are generated in the same manner as crossed effects. Hence, the design columns generated by
the following two statements are the same (but the ordering of the columns is different):
Data I A B(A)
A B A1 A2 B1A1 B2A1 B3A1 B1A2 B2A2 B3A2
1 1 1 1 0 1 0 0 0 0 0
1 2 1 1 0 0 1 0 0 0 0
1 3 1 1 0 0 0 1 0 0 0
2 1 1 0 1 0 0 0 1 0 0
2 2 1 0 1 0 0 0 0 1 0
2 3 1 0 1 0 0 0 0 0 1
6698 F Chapter 84: The MIXED Procedure
Note that nested effects are often distinguished from interaction effects by the implied randomization structure
of the design. That is, they usually indicate random effects within a fixed-effects framework. The fact that
random effects can be modeled directly in the RANDOM statement might make the specification of nested
effects in the MODEL statement unnecessary.
Continuous-Nesting-Class Effects
When a continuous variable nests with a classification variable, the design columns are constructed by
multiplying the continuous values into the design columns for the class effect (Table 84.23).
Data I A X(A)
X A A1 A2 X(A1) X(A2)
21 1 1 1 0 21 0
24 1 1 1 0 24 0
22 1 1 1 0 22 0
28 2 1 0 1 0 28
19 2 1 0 1 0 19
23 2 1 0 1 0 23
Continuous-by-Class Effects
Continuous-by-class effects generate the same design columns as continuous-nesting-class effects. The two
models are made different by the presence of the continuous variable as a regressor by itself, as well as a
contributor to a compound effect. Table 84.24 shows an example.
Data I X A X*A
X A X A1 A2 X*A1 X*A2
21 1 1 21 1 0 21 0
24 1 1 24 1 0 24 0
22 1 1 22 1 0 22 0
28 2 1 28 0 1 0 28
19 2 1 19 0 1 0 19
23 2 1 23 0 1 0 23
General Effects
An example that combines all the effects is X1*X2*A*B*C (D E). The continuous list comes first, followed
by the crossed list, followed by the nested list in parentheses. You should be aware of the sequencing
of parameters when you use the CONTRAST or ESTIMATE statement to compute some function of the
parameter estimates.
Effects might be renamed by PROC MIXED to correspond to ordering rules. For example, B*A(E D) might
be renamed A*B(D E) to satisfy the following:
Classification variables that occur outside parentheses (crossed effects) are sorted in the order in which
they appear in the CLASS statement.
Variables within parentheses (nested effects) are sorted in the order in which they appear in the CLASS
statement.
The sequencing of the parameters generated by an effect can be described by which variables have their
levels indexed faster:
Variables in the crossed list index faster than variables in the nested list.
Within a crossed or nested list, variables to the right index faster than variables to the left.
For example, suppose a model includes four effects—A, B, C, and D—each having two levels, 1 and 2.
Suppose the CLASS statement is as follows:
class A B C D;
Then the order of the parameters for the effect B*A(C D), which is renamed A*B (C D), is
A1 B1 C1 D1 ! A1 B2 C1 D1 ! A2 B1 C1 D1 ! A2 B2 C1 D1 !
A1 B1 C1 D2 ! A1 B2 C1 D2 ! A2 B1 C1 D2 ! A2 B2 C1 D2 !
A1 B1 C2 D1 ! A1 B2 C2 D1 ! A2 B1 C2 D1 ! A2 B2 C2 D1 !
A1 B1 C2 D2 ! A1 B2 C2 D2 ! A2 B1 C2 D2 ! A2 B2 C2 D2
Note that first the crossed effects B and A are sorted in the order in which they appear in the CLASS
statement so that A precedes B in the parameter list. Then, for each combination of the nested effects in turn,
combinations of A and B appear. The B effect moves fastest because it is rightmost in the cross list. Then A
moves next fastest, and D moves next fastest. The C effect is the slowest since it is leftmost in the nested list.
When numeric levels are used, levels are sorted by their character format, which might not correspond to their
numeric sort sequence (for example, noninteger levels). Therefore, it is advisable to include a desired format
for numeric levels or to use the ORDER=INTERNAL option in the PROC MIXED statement to ensure that
levels are sorted by their internal values.
6700 F Chapter 84: The MIXED Procedure
and studentized as
eei
p
vi
b
External studentization uses an estimate of VarŒe e i that does not involve the ith observation. Externally
studentized residuals are often preferred over internally studentized residuals because they have well-known
distributional properties in standard linear models for independent data.
b
q
Residuals that are scaled by the estimated variance of the response, i.e., e e i = VarŒYi , are referred to as
Pearson-type residuals.
rm D Y Xb̌
Residuals and Influence Diagnostics F 6701
rc D Y Xb̌ Zb D rm Zb
Then
b
VarŒrm D V Q
barŒr D K.Vb Q/K
b
0
V c
For an individual observation the raw, studentized, and Pearson-type residuals computed by the MIXED
procedure are given in Table 84.25.
Studentized student
rmi D q rmi student
rci D q rci
V
carŒrmi V
carŒrci
pearson
q rmi
pearson
Pearson rmi D rci D q rci
V
carŒYi V
carŒYi j
When the OUTPM= option is specified in addition to the RESIDUAL option in the MODEL statement,
student pearson
rmi and rmi are added to the data set as variables Resid, StudentResid, and PearsonResid, respec-
student pearson
tively. When the OUTP= option is specified, rci and rci are added to the data set. Raw residuals
are part of the OUTPM= and OUTP= data sets without the RESIDUAL option.
Scaled Residuals
For correlated data, a set of scaled quantities can be defined through the Cholesky decomposition of the
variance-covariance matrix. Since fitted residuals in linear models are rank-deficient, it is customary to
draw on the variance-covariance matrix of the data. If VarŒY D V and C0 C D V, then C0 1 Y has uniform
dispersion and its elements are uncorrelated.
Scaled residuals in a mixed model are meaningful for quantities based on the marginal distribution of the
C denote the Cholesky root of V,
data. Let b C0 b
b so that b C D V,
b and define
Yc D bC0 1
Y
C0
rm.c/ D b 1
rm
By analogy with other scalings, the inverse Cholesky decomposition can also be applied to the residual vector,
C0 1 rm , although V is not the variance-covariance matrix of rm .
b
To diagnose whether the covariance structure of the model has been specified correctly can be difficult based
on Yc , since the inverse Cholesky transformation affects the expected value of Yc . You can draw on rm.c/ as
a vector of (approximately) uncorrelated data with constant mean.
6702 F Chapter 84: The MIXED Procedure
When the OUTPM= option in the MODEL statement is specified in addition to the VCIRY option, Yc is
added as variable ScaledDep and rm.c/ is added as ScaledResid to the data set.
Influence Diagnostics
Basic Idea and Statistics
The general idea of quantifying the influence of one or more observations relies on computing parameter
estimates based on all data points, removing the cases in question from the data, refitting the model, and
computing statistics based on the change between full-data and reduced-data estimation. Influence statistics
can be coarsely grouped by the aspect of estimation that is their primary target:
overall measures compare changes in objective functions: (restricted) likelihood distance (Cook and
Weisberg 1982, Ch. 5.2)
influence on parameter estimates: Cook’s D (Cook 1977, 1979), MDFFITS (Belsley, Kuh, and Welsch
1980, p. 32)
influence on precision of estimates: CovRatio and CovTrace
influence on fitted and predicted values: PRESS residual, PRESS statistic (Allen 1974), DFFITS
(Belsley, Kuh, and Welsch 1980, p. 15)
outlier properties: internally and externally studentized residuals, leverage
For linear models for uncorrelated data, it is not necessary to refit the model after removing a data point in
order to measure the impact of an observation on the model. The change in fixed effect estimates, residuals,
residual sums of squares, and the variance-covariance matrix of the fixed effects can be computed based on
the fit to the full data alone. By contrast, in mixed models several important complications arise. Data points
can affect not only the fixed effects but also the covariance parameter estimates on which the fixed-effects
estimates depend. Furthermore, closed-form expressions for computing the change in important model
quantities might not be available.
This section provides background material for the various influence diagnostics available with the MIXED
procedure. See the section “Mixed Models Theory” on page 6683 for relevant expressions and definitions.
The parameter vector denotes all unknown parameters in the R and G matrix.
The observations whose influence is being ascertained are represented by the set U and referred to simply as
“the observations in U.” The estimate of a parameter vector, such as ˇ, obtained from all observations except
those in the set U is denoted b̌.U / . In case of a matrix A, the notation A.U / represents the matrix with the
rows in U removed; these rows are collected in AU . If A is symmetric, then notation A.U / implies removal
of rows and columns. The vector YU comprises the responses of the data points being removed, and V.U / is
the variance-covariance matrix of the remaining observations. When k = 1, lowercase notation emphasizes
that single points are removed, such as A.u/ .
1. If the residual variance is not profiled, either because the model does not contain a residual variance or
because it is part of the Newton-Raphson iterations, then b .U / b
.
.U / b
2. If the residual variance is profiled, then b 2.U / D
and b 6 b 2 . Influence statistics such as
Cook’s D and internally studentized residuals are based on V.b /, whereas externally studentized
2 /. In a random components model
residuals and the DFFITS statistic are based on V.b U / D .U /
V.b
with uncorrelated errors, for example, the computation of V.b U / involves scaling of G b and R
b by the
2 2
and multiplying the result with the reduced-data estimate b
full-data estimate b .U / .
Certain statistics, such as MDFFITS, CovRatio, and CovTrace, require an estimate of the variance of the
fixed effects that is based on the reduced number of observations. For example, V.b U / is evaluated at the
reduced-data parameter estimates but computed for the entire data set. The matrix V.U / .b
.U / /, on the other
hand, has rows and columns corresponding to the points in U removed. The resulting matrix is evaluated at
the delete-case estimates.
When influence analysis is iterative, the entire vector is updated, whether the residual variance is profiled or
not. The matrices to be distinguished here are V.b /, V.b .U / /, and V.U / .b
.U / /, with unambiguous notation.
i.U / D yi
b x0i b̌.U /
Leverage
For the general mixed model, leverage can be defined through the projection matrix that results from a
transformation of the model with the inverse of the Cholesky decomposition of V, or through an oblique
projector. The MIXED procedure follows the latter path in the computation of influence diagnostics. The
leverage value reported for the ith observation is the ith diagonal entry of the matrix
H D X.X0 V.b
/ 1
X/ X0 V.b
/ 1
which is the weight of the observation in contributing to its own predicted value, H D d Y=d
b Y.
While H is idempotent, it is generally not symmetric and thus not a projection matrix in the narrow sense.
The properties of these leverages are generalizations of the properties in models with diagonal variance-
covariance matrices. For example, Y b D HY, and in a model with intercept and V D 2 I, the leverage
values
hi i D x0i .X0 X/ xi
6704 F Chapter 84: The MIXED Procedure
are hli i D 1=n hi i 1 D huii and niD1 hi i D rank.X/. The lower bound for hi i is achieved in an
P
intercept-only model, and the upper bound is achieved in a saturated model. The trace of H equals the rank
of X.
If ij denotes the element in row i, column j of V 1, then for a model containing only an intercept the
diagonal elements of H are
Pn
j D1 ij
hi i D Pn Pn
i D1 j D1 ij
Because njD1 ij is a sum of elements in the ith row of the inverse variance-covariance matrix, hi i can
P
be negative, even if the correlations among data points are nonnegative. In case of a saturated model with
X D I, hi i D 1:0.
Cook’s D
Cook’s D statistic is an invariant norm that measures the influence of observations in U on a vector of
parameter estimates (Cook 1977). In case of the fixed-effects coefficients, let
ı.U / D b̌ b̌.U /
b
where VarŒb̌ is the matrix that results from sweeping .X0 V.b
/ 1 X/ .
If V is known, Cook’s D can be calibrated according to a chi-square distribution with degrees of freedom
equal to the rank of X (Christensen, Pearson, and Johnson 1992). For estimated V the calibration can be
carried out according to an F .rank.X/; n rank.X// distribution. To interpret D on a familiar scale, Cook
(1979) and Cook and Weisberg (1982, p. 116) refer to the 50th percentile of the reference distribution. If D is
equal to that percentile, then removing the points in U moves the fixed-effects coefficient vector from the
center of the confidence region to the 50% confidence ellipsoid (Myers 1990, p. 262).
In the case of iterative influence analysis, the MIXED procedure also computes a D-type statistic for the
covariance parameters. If is the asymptotic variance-covariance matrix of b
, then MIXED computes
D D .b
.U // /0
b b 1
.b
.U / /
b
Residuals and Influence Diagnostics F 6705
DFFITSi D .b
yi y i.u/ /=ese.b
b yi /
The MIXED procedure computes DFFITS when the EFFECT= or SIZE= modifier of the INFLUENCE
option is not in effect. In general, an external estimate of the estimated standard error is used. When ITER >
0, the estimate is
q
y i / D x0i .X0 V.b
ese.b .u/ / X/ 1 xi
When the EFFECT=, SIZE=, or KEEP= modifier is specified, the MIXED procedure computes a multivariate
version suitable for the deletion of multiple data points. The statistic, termed MDFFITS after the MDFFIT
statistic of Belsley, Kuh, and Welsch (1980, p. 32), is closely related to Cook’s D. Consider the case
V D 2 V. / so that
e
and let VarŒb̌.U / be an estimate of VarŒb̌.U / that does not use the observations in U. The MDFFITS statistic
is then computed as
0
MDFFITS.ˇ/ D ı.U e
/ VarŒ .U / ı.U / =rank.X/
b̌
e
If ITER=0 and 2 is profiled, then VarŒb̌.U / is obtained by sweeping
would be VarŒb̌= 2 in a generalized least squares regression with all but the data in U.
e
In the case of iterative influence analysis, VarŒb̌.U / is evaluated at b
.U / . Furthermore, a MDFFITS-type
statistic is then computed for the covariance parameters:
MDFFITS./ D .b
b b
.U / /0 VarŒb
.U / 1
.b
.U / /
b
6706 F Chapter 84: The MIXED Procedure
Likelihood Distances
The log-likelihood function l and restricted log-likelihood function lR of the linear mixed model are given
in the section “Estimating Covariance Parameters in the Mixed Model” on page 6689. Denote as the
collection of all parameters, i.e., the fixed effects ˇ and the covariance parameters . Twice the difference
between the (restricted) log-likelihood evaluated at the full-data estimates b and at the reduced-data estimates
b .U / is known as the (restricted) likelihood distance:
RLD.U / D 2flR . b / lR . b .U / /g
LD.U / D 2fl. b / l. b .U / /g
Cook and Weisberg (1982, Ch. 5.2) refer to these differences as likelihood distances, Beckman, Nachtsheim,
and Cook (1987) call the measures likelihood displacements. If the number of elements in that are subject
to updating following point removal is q, then likelihood displacements can be compared against cutoffs from
a chi-square distribution with q degrees of freedom. Notice that this reference distribution does not depend
on the number of observations removed from the analysis, but rather on the number of model parameters that
are updated. The likelihood displacement gives twice the amount by which the log likelihood of the full data
changes if one were to use an estimate based on fewer data points. It is thus a global, summary measure of
the influence of the observations in U jointly on all parameters.
Unless METHOD=ML, the MIXED procedure computes the likelihood displacement based on the residual
(=restricted) log likelihood, even if METHOD=MIVQUE0 or METHOD=TYPE1, TYPE2, or TYPE3.
Updating the Fixed Effects Denote by U the .n k/ matrix that is assembled from k columns of the
identity matrix. Each column of U corresponds to the removal of one data point. The point being targeted by
the ith column of U corresponds to the row in which a 1 appears. Furthermore, define
D .X0 V 1
X/
0
Q D XX
1 1
PDV .V Q/V
The change in the fixed-effects estimates following removal of the observations in U is
b̌ b̌.U / D X0 V 1
U.U0 PU/ 1
U0 V 1
.y Xb̌/
Using results in Cook and Weisberg (1982, A2) you can further compute
e D .X0 V 1 X.U / / D C X0 V
1
U.U0 PU/ 1
U0 V 1
X
.U / .U /
If X is .n p/ of rank m < p, then is deficient in rank and the MIXED procedure computes needed
e by sweeping (Goodnight 1979). If the rank of the .k k/ matrix U0 PU is less than k, the
quantities in
removal of the observations introduces a new singularity, whether X is of full rank or not. The solution
vectors b̌ and b̌.U / then do not have the same expected values and should not be compared. When the
MIXED procedure encounters this situation, influence diagnostics that depend on the choice of generalized
inverse are not computed. The procedure also monitors the singularity criteria when sweeping the rows of
.X0 V 1 X/ and of .X0.U / V.U1/ X.U / / . If a new singularity is encountered or a former singularity disappears,
no influence statistics are computed.
Residual Variance When 2 is profiled out of the marginal variance-covariance matrix, a closed-form
estimate of 2 that is based on only the remaining observations can be computed provided V D V.b / is
known. Hurtado (1993, Thm. 5.2) shows that
.n q 2.U / D .n
r/b 2
q/b b 2 U0 PU/ 1b
0U .b U
U D U0 V 1 .y Xb̌/. In the case of maximum likelihood estimation q = 0 and for REML estimation
and b
q D rank.X/. The constant r equals the rank of .U0 PU/ for REML estimation and the number of effective
observations that are removed if METHOD=ML.
Likelihood Distances For noniterative methods the following computational devices are used to compute
(restricted) likelihood distances provided that the residual variance 2 is profiled.
The log likelihood function l.b
/ evaluated at the full-data and reduced-data estimates can be written as
n 1 1 1 n
l. b / D 2/
log.b log jV j .y Xb̌/0 V .y Xb̌/=b 2 log.2/
2 2 2 2
n 1 1 1 n
l. b .U / / D 2.U / /
log.b log jV j 2.U /
.y Xb̌.U / /0 V .y Xb̌.U / /=b log.2/
2 2 2 2
Expressions for RLD.U / in noniterative influence analysis are derived along the same lines.
6708 F Chapter 84: The MIXED Procedure
Default Output
The following sections describe the output PROC MIXED produces by default. This output is organized into
various tables, and they are discussed in order of appearance.
Model Information
The “Model Information” table describes the model, some of the variables it involves, and the method used
in fitting it. It also lists the method (profile, factor, parameter, or none) for handling the residual variance in
the model. The profile method concentrates the residual variance out of the optimization problem, whereas
the parameter method retains it as a parameter in the optimization. The factor method keeps the residual
fixed, and none is displayed when a residual variance is not part of the model.
The “Model Information” table also has a row labeled Fixed Effects SE Method. This row describes the
method used to compute the approximate standard errors for the fixed-effects parameter estimates and related
functions of them. The two possibilities for this row are Model-Based, which is the default method, and
Empirical, which results from using the EMPIRICAL option in the PROC MIXED statement.
The ODS name of the “Model Information” table is ModelInfo.
Dimensions
The “Dimensions” table lists the sizes of relevant matrices. This table can be useful in determining CPU time
and memory requirements. The ODS name of the “Dimensions” table is Dimensions.
Number of Observations
The “Number of Observations” table shows the number of observations read from the data set and the number
of observations used in fitting the model.
Iteration History
The “Iteration History” table describes the optimization of the residual log likelihood or log likelihood. The
function to be minimized (the objective function) is 2l for ML and 2lR for REML; the column name of
the objective function in the “Iteration History” table is “-2 Log Like” for ML and “-2 Res Log Like” for
REML. The minimization is performed by using a ridge-stabilized Newton-Raphson algorithm, and the rows
of this table describe the iterations that this algorithm takes in order to minimize the objective function.
The Evaluations column of the “Iteration History” table tells how many times the objective function is
evaluated during each iteration.
Default Output F 6709
The Criterion column of the “Iteration History” table is, by default, a relative Hessian convergence quantity
given by
gk0 Hk 1 gk
jfk j
where fk is the value of the objective function at iteration k, gk is the gradient (first derivative) of fk , and Hk
is the Hessian (second derivative) of fk . If Hk is singular, then PROC MIXED uses the following relative
quantity:
gk0 gk
jfk j
To prevent the division by jfk j, use the ABSOLUTE option in the PROC MIXED statement. To use a relative
function or gradient criterion, use the CONVF or CONVG option, respectively.
The Hessian criterion is considered superior to function and gradient criteria because it measures orthogonality
rather than lack of progress (Bates and Watts 1988). Provided the initial estimate is feasible and the maximum
number of iterations is not exceeded, the Newton-Raphson algorithm is considered to have converged when
the criterion is less than the tolerance specified with the CONVF, CONVG, or CONVH option in the PROC
MIXED statement. The default tolerance is 1E–8. If convergence is not achieved, PROC MIXED displays
the estimates of the parameters at the last iteration.
A convergence criterion that is missing indicates that a boundary constraint has been dropped; it is usually
not a cause for concern.
If you specify the ITDETAILS option in the PROC MIXED statement, then the covariance parameter
estimates at each iteration are included as additional columns in the “Iteration History” table.
The ODS name of the “Iteration History” table is IterHistory.
Convergence Status
The “Convergence Status” table informs about the status of the iterative estimation process at the end of the
Newton-Raphson optimization. It appears as a message in the listing, and this message is repeated in the
log. The ODS object ConvergenceStatus also contains several nonprinting columns that can be helpful in
checking the success of the iterative process, in particular during batch processing or when analyzing BY
groups. The Status variable takes on the value 0 for a successful convergence (even if the Hessian matrix
might not be positive definite). The values 1 and 2 of the Status variable indicate lack of convergence and
infeasible initial parameter values, respectively. The variables pdG and pdH can be used to check whether the
G and H (Hessian) matrices are positive definite.
For models that are not fit iteratively, such as models without random effects or when the NOITER option is
in effect, the “Convergence Status” is not produced.
6710 F Chapter 84: The MIXED Procedure
Fit Statistics
The “Fit Statistics” table provides some statistics about the estimated mixed model. Expressions for the –2
times the log likelihood are provided in the section “Estimating Covariance Parameters in the Mixed Model”
on page 6689. If the log likelihood is an extremely large number, then PROC MIXED has deemed the
estimated V matrix to be singular. In this case, all subsequent results should be viewed with caution.
In addition, the “Fit Statistics” table lists three information criteria: AIC, AICC, and BIC, all in smaller-is-
better form. Expressions for these criteria are described under the IC option.
The ODS name of the “Model Fitting Information” table is FitStatistics.
Pq
contains only the fixed effects that are listed in the MODEL statement, and R D 2 I or R D 2 kD1 .Ak /
when TYPE=LIN(q) in the REPEATED statement. This statistic has an asymptotic 2 distribution with q 1
degrees of freedom, where q is the effective number of covariance parameters (those not estimated to be on a
boundary constraint). The Pr > ChiSq column contains the upper-tail area from this distribution. This p-value
can be used to assess the significance of the model fit.
This test is not produced for cases where the null hypothesis lies on the boundary of the parameter space,
which is typically true of variance component models. This is because the standard asymptotic theory does
not apply in such cases (Self and Liang 1987, Case 5).
If you specify a PARMS statement, PROC MIXED constructs a likelihood ratio test between the best model
from the grid search and the final fitted model and reports the results in the “Parameter Search” table.
The ODS name of the “Null Model Likelihood Ratio Test” table is LRT.
b̌0 L0 ŒL.X0 b
V 1 X/ L0 Lb̌
F D
r
where r D rank.L.X0 b V 1 X/ L0 /. A p-value for the test is computed as the tail area beyond this statistic
from an F distribution with NDF and DDF degrees of freedom. The numerator degrees of freedom (NDF)
are the row rank of L, and the denominator degrees of freedom (DDF) are computed by using one of the
methods described under the DDFM= option. Small p-values (typically less than 0.05 or 0.01) indicate a
significant effect.
You can use the HTYPE= option in the MODEL statement to obtain tables of Type 1 (sequential) tests and
Type 2 (adjusted) tests in addition to or instead of the table of Type 3 (partial) tests.
You can use the CHISQ option in the MODEL statement to obtain Wald 2 tests of the fixed effects. These
are carried out by using the numerator of the F statistic and comparing it with the 2 distribution with NDF
degrees of freedom. It is more liberal than the F test because it effectively assumes infinite denominator
degrees of freedom.
The ODS names of the “Type 1 Tests of Fixed Effects” through the “Type 3 Tests of Fixed Effects” tables are
Tests1 through Tests3, respectively.
6712 F Chapter 84: The MIXED Procedure
In Table 84.26, “Coef” refers to multiple tables produced by the E, E1, E2, or E3 option in the MODEL
statement and the E option in the CONTRAST, ESTIMATE, and LSMEANS statements. You can create one
large data set of these tables with a statement similar to the following:
Some of the variables listed in Table 84.27 are created only when you specify certain options in the relevant
PROC MIXED statements.
ODS Graphics F 6717
ODS Graphics
Statistical procedures use ODS Graphics to create graphs as part of their output. ODS Graphics is described
in detail in Chapter 24, “Statistical Graphics Using ODS.”
Before you create graphs, ODS Graphics must be enabled (for example, by specifying the ODS GRAPH-
ICS ON statement). For more information about enabling and disabling ODS Graphics, see the section
“Enabling and Disabling ODS Graphics” on page 663 in Chapter 24, “Statistical Graphics Using ODS.”
The overall appearance of graphs is controlled by ODS styles. Styles and other aspects of using ODS
Graphics are discussed in the section “A Primer on ODS Statistical Graphics” on page 662 in Chapter 24,
“Statistical Graphics Using ODS.”
Some graphs are produced by default; other graphs are produced by using statements and options.
When ODS Graphics is enabled, the LSMESTIMATE and SLICE statements can produce plots that are
associated with their analyses. For information about these plots, see the sections “LSMESTIMATE
Statement” on page 498 and “SLICE Statement” on page 527 in Chapter 20, “Shared Concepts and Topics.”
Residual Plots
The MIXED procedure can generate panels of residual diagnostics. Each panel consists of a plot of residuals
versus predicted values, a histogram with normal density overlaid, a Q-Q plot, and summary residual and fit
statistics (Figure 84.15). The plots are produced even if the OUTP= and OUTPM= options in the MODEL
statement are not specified. Residual panels can be generated for marginal and conditional raw, studentized,
and Pearson residuals as well as for scaled residuals (see the section “Residual Diagnostics” on page 6700).
Recall the example in the section “Getting Started: MIXED Procedure” on page 6606. The following
statements generate several 2 2 panels of residual graphs:
The graphs are created when ODS Graphics is enabled. The panel of the studentized marginal residuals is
shown in Figure 84.15, and the panel of the studentized conditional residuals is shown in Figure 84.16.
Since the fixed-effects part of the model comprises only an intercept and the gender effect, the marginal
mean takes on only two values, one for each gender. The “Residual Statistics” inset in the lower-right corner
provides descriptive statistics for the set of residuals that is displayed. Note that residuals in a mixed model
do not necessarily sum to zero, even if the model contains an intercept.
6720 F Chapter 84: The MIXED Procedure
Influence Plots
The graphical features of the MIXED procedure enable you to generate plots of influence diagnostics and of
deletion estimates. The type and number of plots produced depend on your modifiers of the INFLUENCE
option in the MODEL statement and on the PLOTS= option in the PROC MIXED statement. Plots related to
covariance parameters are produced only when diagnostics are computed by iterative methods (ITER=). The
estimates of the fixed effects—and covariance parameters when updates are iterative—are plotted when you
specify the ESTIMATES modifier or when you request PLOTS=INFLUENCEESTPLOT.
Two basic types of influence panels are shown in Figure 84.17 and Figure 84.18. The diagnostics panel
shows Cook’s D and CovRatio statistics for the fixed effects and the covariance parameters. For the SAS
statements that produce these influence panels, see Example 84.8. In this example, the impact of subjects
(Person) on the analysis is assessed. The Cook’s D statistic measures a subject’s impact on the estimates,
and the CovRatio statistic measures a subject’s impact on the precision of the estimates. Separate statistics
are computed for the fixed effects and the covariance parameters. The CovRatio statistic has a threshold of
1.0. Values larger than 1.0 indicate that precision of the estimates is lost by exclusion of the observations
in question. Values smaller than 1.0 indicate that precision is gained by exclusion of the observations from
the analysis. For example, it is evident from Output 84.17 that person 20 has considerable impact on the
ODS Graphics F 6721
covariance parameter estimates and moderate influence on the fixed-effects estimates. Furthermore, exclusion
of this subject from the analysis increases the precision of the covariance parameters, whereas the effect on
the precision of the fixed effects is minor.
Output 84.18 shows another type of influence plot, a panel of the deletion estimates. Each plot within the
panel corresponds to one of the model parameters. A reference line is drawn at the estimate based on the full
data.
Figure 84.17 Influence Diagnostics
6722 F Chapter 84: The MIXED Procedure
Computational Issues
Computational Method
In addition to numerous matrix-multiplication routines, PROC MIXED frequently uses the sweep operator
(Goodnight 1979) and the Cholesky root (Golub and Van Loan 1989). The routines perform a modified
W transformation (Goodnight and Hemmerle 1979) for G-side likelihood calculations and a direct method
for R-side likelihood calculations. For the Type 3 F tests, PROC MIXED uses the algorithm described in
Chapter 53, “The GLM Procedure.”
PROC MIXED uses a ridge-stabilized Newton-Raphson algorithm to optimize either a full (ML) or residual
(REML) likelihood function. The Newton-Raphson algorithm is preferred to the EM algorithm (Lindstrom
and Bates 1988). PROC MIXED profiles the likelihood with respect to the fixed effects and also with respect
to the residual variance whenever it appears reasonable to do so. The residual profiling can be avoided by
using the NOPROFILE option of the PROC MIXED statement. PROC MIXED uses the MIVQUE0 method
(Rao 1972; Giesbrecht 1989) to compute initial values.
Computational Issues F 6723
The likelihoods that PROC MIXED optimizes are usually well-defined continuous functions with a single
optimum. The Newton-Raphson algorithm typically performs well and finds the optimum in a few iterations.
It is a quadratically converging algorithm, meaning that the error of the approximation near the optimum
is squared at each iteration. The quadratic convergence property is evident when the convergence criterion
drops to zero by factors of 10 or more.
Symbol Number
p Columns of X
g Columns of Z
N Observations
q Covariance parameters
t Maximum observations per subject
S Subjects
Using the notation from Table 84.29, the following are estimates of the computational speed of the algorithms
used in PROC MIXED. For likelihood calculations, the crossproducts matrix construction is of order
N.p C g/2 and the sweep operations are of order .p C g/3 . The first derivative calculations for parameters
in G are of order qg 3 for ML and q.g 3 C pg 2 C p 2 g/ for REML. If you specify a subject effect in the
RANDOM statement and if you are not using the REPEATED statement, then replace g with g=S and q with
qS in these calculations. The first derivative calculations for parameters in R are of order qS.t 3 C gt 2 C g 2 t /
for ML and qS.t 3 C .p C g/t 2 C .p 2 C g 2 /t / for REML. For the second derivatives, replace q with
q.q C 1/=2 in the first derivative expressions. When you specify both G- and R-side parameters (that is,
when you use both the RANDOM and REPEATED statements), then additional calculations are required of
an order equal to the sum of the orders for G and R. Considerable execution times can result in this case.
For further details about the computational techniques used in PROC MIXED, see Wolfinger, Tobias, and
Sall (1994).
Parameter Constraints
By default, some covariance parameters are assumed to satisfy certain boundary constraints during the
Newton-Raphson algorithm. For example, variance components are constrained to be nonnegative, and
autoregressive parameters are constrained to be between –1 and 1. You can remove these constraints with
the NOBOUND option in the PARMS statement (or with the NOBOUND option in the PROC MIXED
statement), but this can lead to estimates that produce an infinite likelihood. You can also introduce or change
boundary constraints with the LOWERB= and UPPERB= options in the PARMS statement.
During the Newton-Raphson algorithm, a parameter might be set equal to one of its boundary constraints for
a few iterations and then it might move away from the boundary. You see a missing value in the Criterion
column of the “Iteration History” table whenever a boundary constraint is dropped.
For some data sets the final estimate of a parameter might equal one of its boundary constraints. This is
usually not a cause for concern, but it might lead you to consider a different model. For instance, a variance
component estimate can equal zero; in this case, you might want to drop the corresponding random effect
from the model. However, be aware that changing the model in this fashion can affect degrees-of-freedom
calculations.
6724 F Chapter 84: The MIXED Procedure
Convergence Problems
For some data sets, the Newton-Raphson algorithm can fail to converge. Nonconvergence can result from a
number of causes, including flat or ridged likelihood surfaces and ill-conditioned data.
It is also possible for PROC MIXED to converge to a point that is not the global optimum of the likelihood,
although this usually occurs only with the spatial covariance structures.
If you experience convergence problems, the following points might be helpful:
One useful tool is the PARMS statement, which lets you input initial values for the covariance
parameters and performs a grid search over the likelihood surface.
Sometimes the Newton-Raphson algorithm does not perform well when two of the covariance parame-
ters are on a different scale—that is, when they are several orders of magnitude apart. This is because
the Hessian matrix is processed jointly for the two parameters, and elements of it corresponding to
one of the parameters can become close to internal tolerances in PROC MIXED. In this case, you can
improve stability by rescaling the effects in the model so that the covariance parameters are on the
same scale.
Data that are extremely large or extremely small can adversely affect results because of the internal
tolerances in PROC MIXED. Rescaling it can improve stability.
For stubborn problems, you might want to specify ODS OUTPUT COVPARMS=data-set-name to
output the “Covariance Parameter Estimates” table as a precautionary measure. That way, if the
problem does not converge, you can read the final parameter values back into a new run with the
PARMSDATA= option in the PARMS statement.
Fisher scoring can be more robust than Newton-Raphson with poor MIVQUE(0) starting values.
Specifying a SCORING= value of 5 or so might help to recover from poor starting values.
Tuning the singularity options SINGULAR=, SINGCHOL=, and SINGRES= in the MODEL statement
can improve the stability of the optimization process.
Tuning the MAXITER= and MAXFUNC= options in the PROC MIXED statement can save resources.
Also, the ITDETAILS option displays the values of all the parameters at each iteration.
Using the NOPROFILE and NOBOUND options in the PROC MIXED statement might help conver-
gence, although they can produce unusual results.
Although the CONVH convergence criterion usually gives the best results, you might want to try
CONVF or CONVG, possibly along with the ABSOLUTE option.
If the convergence criterion reaches a relatively small value such as 1E–7 but never gets lower than
1E–8, you might want to specify CONVH=1E–6 in the PROC MIXED statement to get results; however,
interpret the results with caution.
An infinite likelihood during the iteration process means that the Newton-Raphson algorithm has
stepped into a region where either the R or V matrix is nonpositive definite. This is usually no cause
for concern as long as iterations continue. If PROC MIXED stops because of an infinite likelihood,
recheck your model to make sure that no observations from the same subject are producing identical
rows in R or V and that you have enough data to estimate the particular covariance structure you
have selected. Any time that the final estimated likelihood is infinite, subsequent results should be
interpreted with caution.
Computational Issues F 6725
A nonpositive definite Hessian matrix can indicate a surface saddlepoint or linear dependencies among
the parameters.
A warning message about the singularities of X changing indicates that there is some linear dependency
b 1 X that is not found in X0 X. This can adversely affect the likelihood calculations
in the estimate of X0 V
and optimization process. If you encounter this problem, make sure that your model specification is
reasonable and that you have enough data to estimate the particular covariance structure you have
selected. Rearranging effects in the MODEL statement so that the most significant ones are first can
help, because PROC MIXED sweeps the estimate of X0 V 1 X in the order of the MODEL effects and
the sweep is more stable if larger pivots are dealt with first. If this does not help, specifying starting
values with the PARMS statement can place the optimization on a different and possibly more stable
path.
Lack of convergence can indicate model misspecification or a violation of the normality assumption.
Memory
Let p be the number of columns in X, and let g be the number of columns in Z. For large models, most of the
memory resources are required for holding symmetric matrices of order p, g, and p C g. The approximate
memory requirement in bytes is
If you have a large model that exceeds the memory capacity of your computer, see the suggestions listed
under “Computing Time.”
Computing Time
PROC MIXED is computationally intensive, and execution times can be long. In addition to the CPU time
used in collecting sums and crossproducts and in solving the mixed model equations (as in PROC GLM),
considerable CPU time is often required to compute the likelihood function and its derivatives. These latter
computations are performed for every Newton-Raphson iteration.
If you have a model that takes too long to run, the following suggestions can be helpful:
Examine the “Model Information” table to find out the number of columns in the X and Z matrices.
A large number of columns in either matrix can greatly increase computing time. You might want to
eliminate some higher-order effects if they are too large.
If you have a Z matrix with a lot of columns, use the DDFM=BW option in the MODEL statement to
eliminate the time required for the containment method.
If possible, “factor out” a common effect from the effects in the RANDOM statement and make it the
SUBJECT= effect. This creates a block-diagonal G matrix and can often speed calculations.
If possible, use the same or nested SUBJECT= effects in all RANDOM and REPEATED statements.
If your data set is very large, you might want to analyze it in pieces. The BY statement can help
implement this strategy.
6726 F Chapter 84: The MIXED Procedure
In general, specify random effects with a lot of levels in the REPEATED statement and those with a
few levels in the RANDOM statement.
The METHOD=MIVQUE0 option runs faster than either the METHOD=REML or METHOD=ML
option because it is noniterative.
You can specify known values for the covariance parameters by using the HOLD= or NOITER option
in the PARMS statement or the GDATA= option in the RANDOM statement. This eliminates the need
for iteration.
The LOGNOTE option in the PROC MIXED statement writes periodic messages to the SAS log
concerning the status of the calculations. It can help you diagnose where the slowdown is occurring.
data sp;
input Block A B Y @@;
datalines;
1 1 1 56 1 1 2 41
1 2 1 50 1 2 2 36
1 3 1 39 1 3 2 35
Example 84.1: Split-Plot Design F 6727
2 1 1 30 2 1 2 25
2 2 1 36 2 2 2 28
2 3 1 33 2 3 2 30
3 1 1 32 3 1 2 24
3 2 1 31 3 2 2 27
3 3 1 15 3 3 2 19
4 1 1 30 4 1 2 25
4 2 1 35 4 2 2 30
4 3 1 17 4 3 2 18
;
The following statements fit the split-plot model assuming random block effects:
proc mixed;
class A B Block;
model Y = A B A*B;
random Block A*Block;
run;
The variables A, B, and Block are listed as classification variables in the CLASS statement. The columns of
model matrix X consist of indicator variables corresponding to the levels of the fixed effects A, B, and A*B
listed on the right side of the MODEL statement. The dependent variable Y is listed on the left side of the
MODEL statement.
The columns of the model matrix Z consist of indicator variables corresponding to the levels of the random
effects Block and A*Block. The G matrix is diagonal and contains the variance components of Block and
A*Block. The R matrix is also diagonal and contains the residual variance.
Model Information
Data Set WORK.SP
Dependent Variable Y
Covariance Structure Variance Components
Estimation Method REML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Containment
The “Class Level Information” table in Output 84.1.2 lists the levels of all variables specified in the CLASS
statement. You can check this table to make sure that the data are correct.
6728 F Chapter 84: The MIXED Procedure
Class Level
Information
Class Levels Values
A 3 123
B 2 12
Block 4 1234
The “Dimensions” table in Output 84.1.3 lists the magnitudes of various vectors and matrices. The X matrix
is seen to be 24 12, and the Z matrix is 24 16.
Dimensions
Covariance Parameters 3
Columns in X 12
Columns in Z 16
Subjects 1
Max Obs per Subject 24
The “Number of Observations” table in Output 84.1.4 shows that all observations read from the data set are
used in the analysis.
Number of Observations
Number of Observations Read 24
Number of Observations Used 24
Number of Observations Not Used 0
PROC MIXED estimates the variance components for Block, A*Block, and the residual by REML. The
REML estimates are the values that maximize the likelihood of a set of linearly independent error contrasts,
and they provide a correction for the downward bias found in the usual maximum likelihood estimates. The
objective function is –2 times the logarithm of the restricted likelihood, and PROC MIXED minimizes this
objective function to obtain the estimates.
The minimization method is the Newton-Raphson algorithm, which uses the first and second derivatives of
the objective function to iteratively find its minimum. The “Iteration History” table in Output 84.1.5 records
the steps of that optimization process. For this example, only one iteration is required to obtain the estimates.
The Evaluations column reveals that the restricted likelihood is evaluated once for each of the iterations. A
criterion of 0 indicates that the Newton-Raphson algorithm has converged.
Iteration History
Iteration Evaluations -2 Res Log Like Criterion
0 1 139.81461222
1 1 119.76184570 0.00000000
Example 84.1: Split-Plot Design F 6729
The REML estimates for the variance components of Block, A*Block, and the residual are 62.40, 15.38,
and 9.36, respectively, as listed in the Estimate column of the “Covariance Parameter Estimates” table in
Output 84.1.6.
Covariance
Parameter
Estimates
Cov Parm Estimate
Block 62.3958
A*Block 15.3819
Residual 9.3611
The “Fit Statistics” table in Output 84.1.7 lists several pieces of information about the fitted mixed model,
including the residual log likelihood. The Akaike (AIC) and Bayesian (BIC) information criteria can be used
to compare different models; the ones with smaller values are preferred. The AICC information criteria is a
small-sample bias-adjusted form of the Akaike criterion (Hurvich and Tsai 1989).
Fit Statistics
-2 Res Log Likelihood 119.8
AIC (Smaller is Better) 125.8
AICC (Smaller is Better) 127.5
BIC (Smaller is Better) 123.9
Finally, the fixed effects are tested by using Type 3 estimable functions (Output 84.1.8).
The tests match the one obtained from the following PROC GLM statements:
You can continue this analysis by producing solutions for the fixed and random effects and then testing
various linear combinations of them by using the CONTRAST and ESTIMATE statements. If you use
the same CONTRAST and ESTIMATE statements with PROC GLM, the test statistics correspond to the
fixed-effects-only model. The test statistics from PROC MIXED incorporate the random effects.
The various “inference space” contrasts given by Stroup (1989a) can be implemented via the ESTIMATE
statement. Consider the following examples:
Estimates
Standard
Label Estimate Error DF t Value Pr > |t|
a1 mean narrow 32.8750 1.0817 9 30.39 <.0001
a1 mean intermed 32.8750 2.2396 9 14.68 <.0001
a1 mean broad 32.8750 4.5403 9 7.24 <.0001
Note that all the estimates are equal, but their standard errors increase with the size of the inference space.
The narrow inference space consists of the observed levels of Block and A*Block, and the t-statistic value of
30.39 applies only to these levels. This is the same t statistic computed by PROC GLM, because it computes
standard errors from the narrow inference space. The intermediate inference space consists of the observed
levels of Block and the entire population of levels from which A*Block are sampled. The t-statistic value of
14.68 applies to this intermediate space. The broad inference space consists of arbitrary random levels of
both Block and A*Block, and the t-statistic value of 7.24 is appropriate. Note that the larger the inference
space, the weaker the conclusion. However, the broad inference space is usually the one of interest, and
even in this space conclusive results are common. The highly significant p-value for ’a1 mean broad’ is an
example. You can also obtain the ’a1 mean broad’ result by specifying A in an LSMEANS statement. For
more discussion of the inference space concept, see McLean, Sanders, and Stroup (1991).
The following statements illustrate another feature of the RANDOM statement. Recall that the basic
statements for a split-plot design with whole plots arranged in randomized blocks are as follows.
Example 84.2: Repeated Measures F 6731
proc mixed;
class A B Block;
model Y = A B A*B;
random Block A*Block;
run;
An equivalent way of specifying this model is as follows:
/* equivalent model */
proc mixed data=sp;
class A B Block;
model Y = A B A*B;
random intercept A / subject=Block;
run;
In general, if all of the effects in the RANDOM statement can be nested within one effect, you can specify
that one effect by using the SUBJECT= option. The subject effect is, in a sense, “factored out” of the
random effects. The specification that uses the SUBJECT= effect can result in faster execution times for large
problems because PROC MIXED is able to perform the likelihood calculations separately for each subject.
data pr;
input Person Gender $ y1 y2 y3 y4;
y=y1; Age=8; output;
y=y2; Age=10; output;
y=y3; Age=12; output;
y=y4; Age=14; output;
drop y1-y4;
datalines;
1 F 21.0 20.0 21.5 23.0
2 F 21.0 21.5 24.0 25.5
3 F 20.5 24.0 24.5 26.0
4 F 23.5 24.5 25.0 26.5
5 F 21.5 23.0 22.5 23.5
6 F 20.0 21.0 21.0 22.5
7 F 21.5 22.5 23.0 25.0
8 F 23.0 23.0 23.5 24.0
9 F 20.0 21.0 22.0 21.5
6732 F Chapter 84: The MIXED Procedure
Model Information
Data Set WORK.PR
Dependent Variable y
Covariance Structure Unstructured
Subject Effect Person
Estimation Method ML
Residual Variance Method None
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Between-Within
In Output 84.2.1, the covariance structure is listed as “Unstructured,” and no residual variance is used with
this structure. The default degrees-of-freedom method here is “Between-Within.”
In Output 84.2.2, note that Person has 27 levels and Gender has 2.
Dimensions
Covariance Parameters 10
Columns in X 6
Columns in Z 0
Subjects 27
Max Obs per Subject 4
In Output 84.2.3, the 10 covariance parameters result from the 4 4 unstructured blocks of R. There is no Z
matrix for this model, and each of the 27 subjects has a maximum of 4 observations.
Number of Observations
Number of Observations Read 108
Number of Observations Used 108
Number of Observations Not Used 0
Iteration History
Iteration Evaluations -2 Log Like Criterion
0 1 478.24175986
1 2 419.47721707 0.00000152
2 1 419.47704812 0.00000000
6734 F Chapter 84: The MIXED Procedure
Three Newton-Raphson iterations are required to find the maximum likelihood estimates (Output 84.2.4).
The default relative Hessian criterion has a final value less than 1E–8, indicating the convergence of the
Newton-Raphson algorithm and the attainment of an optimum.
The 44 matrix in Output 84.2.5 is the estimated unstructured covariance matrix. It is the estimate of the first
block of R, and the other 26 blocks all have the same estimate.
The “Covariance Parameter Estimates” table in Output 84.2.6 lists the 10 estimated covariance parameters in
order; note their correspondence to the first block of R displayed in Output 84.2.5. The parameter estimates
are labeled according to their location in the block in the Cov Parm column, and all of these estimates are
associated with Person as the subject effect. The Std Error column lists approximate standard errors of the
covariance parameters obtained from the inverse Hessian matrix. These standard errors lead to approximate
Wald Z statistics, which are compared with the standard normal distribution The results of these tests indicate
that all the parameters are significantly different from 0; however, the Wald test can be unreliable in small
samples.
To carry out Wald tests of various linear combinations of these parameters, use the following procedure. First,
run the statements again, adding the ASYCOV option and an ODS statement:
Example 84.2: Repeated Measures F 6735
proc iml;
use cp;
read all var {Estimate} into est;
use asy;
read all var ('CovP1':'CovP10') into asy;
quit;
You can then construct your desired linear combinations and corresponding quadratic forms with the asy
matrix.
Output 84.2.7 Repeated Measures Analysis (continued)
Fit Statistics
-2 Log Likelihood 419.5
AIC (Smaller is Better) 447.5
AICC (Smaller is Better) 452.0
BIC (Smaller is Better) 465.6
The null model likelihood ratio test (LRT) in Output 84.2.7 is highly significant for this model, indicating
that the unstructured covariance matrix is preferred to the diagonal matrix of the ordinary least squares null
model. The degrees of freedom for this test is 9, which is the difference between 10 and the 1 parameter for
the null model’s diagonal matrix.
The “Solution for Fixed Effects” table in Output 84.2.8 lists the solution vector for the fixed effects. The
estimate of the boys’ intercept is 15.8423, while that for the girls is 15:8423 C 1:5831 D 17:0654. Similarly,
the estimate for the boys’ slope is 0.8268, while that for the girls is 0:8268 0:3504 D 0:4764. Thus the
girls’ starting point is larger than that for the boys, but their growth rate is about half that of the boys.
Note that two of the estimates equal 0; this is a result of the overparameterized model used by PROC MIXED.
You can obtain a full-rank parameterization by using the following MODEL statement:
The “Type 3 Tests of Fixed Effects” table in Output 84.2.9 displays Type 3 tests for all of the fixed effects.
These tests are partial in the sense that they account for all of the other fixed effects in the model. In addition,
you can use the HTYPE= option in the MODEL statement to obtain Type 1 (sequential) or Type 2 (also
partial) tests of effects.
It is usually best to consider higher-order terms first, and in this case the Age*Gender test reveals a difference
between the slopes that is statistically significant at the 1% level. Note that the p-value for this test (0.0091) is
the same as the p-value in the “Age*Gender F” row in the “Solution for Fixed Effects” table (Output 84.2.8)
and that the F statistic (7.99) is the square of the t statistic (–2.83), ignoring rounding error. Similar
connections are evident among the other rows in these two tables.
The Age test is one for an overall growth curve accounting for possible heterogeneous slopes, and it is highly
significant. Finally, the Gender row tests the null hypothesis of a common intercept, and this hypothesis
cannot be rejected from these data.
As an alternative to the F tests shown here, you can carry out likelihood ratio tests of various hypotheses
by fitting the reduced models, subtracting –2 log likelihoods, and comparing the resulting statistics with 2
distributions.
Since the different levels of the repeated effect represent different years, it is natural to try fitting a time series
model to the data within each subject. To obtain time series structures in R, you can replace TYPE=UN
with TYPE=AR(1) or TYPE=TOEP to obtain the first- or nth-order autoregressive covariance matrices,
respectively. For example, the statements to fit an AR(1) structure are as follows:
/* first-order autoregressive */
proc mixed data=pr method=ml;
class Person Gender;
model y = Gender Age Gender*Age / s;
repeated / type=ar(1) sub=Person r;
run;
To fit a random coefficients model, use the following statements:
Example 84.2: Repeated Measures F 6737
Model Information
Data Set WORK.PR
Dependent Variable y
Covariance Structure Compound Symmetry
Subject Effect Person
Estimation Method ML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Between-Within
The “Dimensions” table in Output 84.2.11 shows that there are only two covariance parameters in the
compound symmetry model; this covariance structure has common variance and common covariance.
Dimensions
Covariance Parameters 2
Columns in X 6
Columns in Z 0
Subjects 27
Max Obs per Subject 4
Number of Observations
Number of Observations Read 108
Number of Observations Used 108
Number of Observations Not Used 0
Since the data are balanced, only one step is required to find the estimates (Output 84.2.12).
Iteration History
Iteration Evaluations -2 Log Like Criterion
0 1 478.24175986
1 1 428.63905802 0.00000000
Output 84.2.13 displays the estimated R matrix for the first subject. Note the compound symmetry structure
here, which consists of a common covariance with a diagonal enhancement.
The common covariance is estimated to be 3.0306, as listed in the CS row of the “Covariance Parameter
Estimates” table in Output 84.2.14, and the residual variance is estimated to be 1.8746, as listed in the
Residual row. You can use these two numbers to estimate the intraclass correlation coefficient (ICC) for
this model. Here, the ICC estimate equals 3:0306=.3:0306 C 1:8746/ D 0:6178. You can also obtain this
number by adding the RCORR option to the REPEATED statement.
In the case shown in Output 84.2.15, the null model LRT has only one degree of freedom, corresponding to
the common covariance parameter. The test indicates that modeling this extra covariance is superior to fitting
the simple null model.
Fit Statistics
-2 Log Likelihood 428.6
AIC (Smaller is Better) 440.6
AICC (Smaller is Better) 441.5
BIC (Smaller is Better) 448.4
Note that the fixed-effects estimates and their standard errors (Output 84.2.16) are not very different from
those in the preceding unstructured example (Output 84.2.8).
The F tests shown in Output 84.2.17 are also similar to those from the preceding unstructured example
(Output 84.2.9). Again, the slopes are significantly different but the intercepts are not.
You can fit the same compound symmetry model with the following specification by using the RANDOM
statement:
run;
Compound symmetry is the structure that Jennrich and Schluchter deemed best among the ones they fit. To
carry the analysis one step further, you can use the GROUP= option as follows to specify heterogeneity of
this structure across girls and boys:
Model Information
Data Set WORK.PR
Dependent Variable y
Covariance Structure Compound Symmetry
Subject Effect Person
Group Effect Gender
Estimation Method ML
Residual Variance Method None
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Between-Within
The four covariance parameters listed in Output 84.2.19 result from the two compound symmetry structures
corresponding to the two levels of Gender.
Dimensions
Covariance Parameters 4
Columns in X 6
Columns in Z 0
Subjects 27
Max Obs per Subject 4
Number of Observations
Number of Observations Read 108
Number of Observations Used 108
Number of Observations Not Used 0
Example 84.2: Repeated Measures F 6741
As Output 84.2.20 shows, even with the heterogeneity, only one iteration is required for convergence.
Iteration History
Iteration Evaluations -2 Log Like Criterion
0 1 478.24175986
1 1 408.81297228 0.00000000
The “Covariance Parameter Estimates” table in Output 84.2.21 lists the heterogeneous estimates. Note that
both the common covariance and the diagonal enhancement differ between girls and boys.
As Output 84.2.22 shows, both Akaike’s information criterion (424.8) and Schwarz’s Bayesian information
criterion (435.2) are smaller for this model than for the homogeneous compound symmetry model (440.6
and 448.4, respectively). This indicates that the heterogeneous model is more appropriate. To construct the
likelihood ratio test between the two models, subtract the –2 log likelihood values: 428:6 408:8 D 19:8.
Comparing this value with the 2 distribution with two degrees of freedom yields a p-value less than 0.0001,
again favoring the heterogeneous model.
Fit Statistics
-2 Log Likelihood 408.8
AIC (Smaller is Better) 424.8
AICC (Smaller is Better) 426.3
BIC (Smaller is Better) 435.2
Note that the fixed-effects estimates shown in Output 84.2.23 are the same as in the homogeneous case, but
the standard errors are different.
6742 F Chapter 84: The MIXED Procedure
The fixed-effects tests shown in Output 84.2.24 are similar to those from previous models, although the
p-values do change as a result of specifying a different covariance structure. It is important for you to select a
reasonable covariance structure in order to obtain valid inferences for your fixed effects.
data hh;
input a b y @@;
datalines;
1 1 237 1 1 254 1 1 246
1 2 178 1 2 179
2 1 208 2 1 178 2 1 187
2 2 146 2 2 145 2 2 141
3 1 186 3 1 183
3 2 142 3 2 125 3 2 136
;
Model Information
Data Set WORK.HH
Dependent Variable y
Covariance Structure Variance Components
Estimation Method REML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Containment
The “Class Level Information” table in Output 84.3.2 lists the levels for A and B.
Class Level
Information
Class Levels Values
a 3 123
b 2 12
The “Dimensions” table in Output 84.3.3 reveals that X is 164 and Z is 168. Since there are no SUBJECT=
effects, PROC MIXED considers the data to be effectively from one subject with 16 observations.
6744 F Chapter 84: The MIXED Procedure
Dimensions
Covariance Parameters 3
Columns in X 4
Columns in Z 8
Subjects 1
Max Obs per Subject 16
Number of Observations
Number of Observations Read 16
Number of Observations Used 16
Number of Observations Not Used 0
Only a portion of the “Parameter Search” table is shown in Output 84.3.4 because the full listing has 651
rows.
Output 84.3.4 Selected Results of Parameter Search
The Mixed Procedure
-2 Res
CovP1 CovP2 CovP3 Variance Res Log Like Log Like
17.0000 0.3000 1.0000 80.1400 -52.4699 104.9399
17.0000 0.3050 1.0000 80.0466 -52.4697 104.9393
17.0000 0.3100 1.0000 79.9545 -52.4694 104.9388
17.0000 0.3150 1.0000 79.8637 -52.4692 104.9384
17.0000 0.3200 1.0000 79.7742 -52.4691 104.9381
17.0000 0.3250 1.0000 79.6859 -52.4690 104.9379
17.0000 0.3300 1.0000 79.5988 -52.4689 104.9378
17.0000 0.3350 1.0000 79.5129 -52.4689 104.9377
17.0000 0.3400 1.0000 79.4282 -52.4689 104.9377
17.0000 0.3450 1.0000 79.3447 -52.4689 104.9378
. . . . . .
. . . . . .
. . . . . .
20.0000 0.3550 1.0000 78.2003 -52.4683 104.9366
20.0000 0.3600 1.0000 78.1201 -52.4684 104.9368
20.0000 0.3650 1.0000 78.0409 -52.4685 104.9370
20.0000 0.3700 1.0000 77.9628 -52.4687 104.9373
20.0000 0.3750 1.0000 77.8857 -52.4689 104.9377
20.0000 0.3800 1.0000 77.8096 -52.4691 104.9382
20.0000 0.3850 1.0000 77.7345 -52.4693 104.9387
20.0000 0.3900 1.0000 77.6603 -52.4696 104.9392
20.0000 0.3950 1.0000 77.5871 -52.4699 104.9399
20.0000 0.4000 1.0000 77.5148 -52.4703 104.9406
As Output 84.3.5 shows, convergence occurs quickly because PROC MIXED starts from the best value from
the grid search.
Example 84.3: Plotting the Likelihood F 6745
Iteration History
Iteration Evaluations -2 Res Log Like Criterion
1 2 104.93416367 0.00000000
The “Covariance Parameter Estimates” table in Output 84.3.6 lists the variance components estimates. Note
that B is much more variable than A*B.
Output 84.3.6 Estimated Covariance Parameters
The asymptotic covariance matrix in Output 84.3.7 also reflects the large variability of B relative to A*B.
As Output 84.3.8 shows, the PARMS likelihood ratio test (LRT) compares the best model from the grid
search with the final fitted model. Since these models are nearly the same, the LRT is not significant.
Fit Statistics
-2 Res Log Likelihood 104.9
AIC (Smaller is Better) 110.9
AICC (Smaller is Better) 113.6
BIC (Smaller is Better) 107.0
The mixed model equations are analogous to the normal equations in the standard linear model. As
Output 84.3.9 shows, for this example, rows 1–4 correspond to the fixed effects, rows 5–12 correspond to the
random effects, and Col13 corresponds to the dependent variable.
6746 F Chapter 84: The MIXED Procedure
Mixed Model
Equations
Row Col13
1 36.4143
2 13.8757
3 12.7469
4 9.7917
5 21.2956
6 15.1187
7 9.3477
8 4.5280
9 7.2676
10 5.4793
11 4.6802
12 5.1115
The solution matrix in Output 84.3.10 results from sweeping all but the last row of the mixed model equations
matrix. The final column contains a solution vector for the fixed and random effects. The first four rows
correspond to fixed effects and the last eight correspond to random effects.
Example 84.3: Plotting the Likelihood F 6747
Output 84.3.12 shows that the significance of A appears to be from the difference between its first level and
its other two levels.
6748 F Chapter 84: The MIXED Procedure
Output 84.3.13 lists the predicted values from the model. These values are the sum of the fixed-effects
estimates and the empirical best linear unbiased predictors (EBLUPs) of the random effects.
To plot the likelihood surface by using ODS Graphics, use the following statements:
proc template;
define statgraph surface;
begingraph;
layout overlay3d;
surfaceplotparm x=CovP1 y=CovP2 z=ResLogLike;
endlayout;
endgraph;
end;
run;
proc sgrender data=parms template=surface;
run;
The results from this plot are shown in Output 84.3.14. The peak of the surface is the REML estimates for
the B and A*B variance components.
Example 84.4: Known G and R F 6749
data h;
input Trait Animal Y;
datalines;
1 1 6
1 2 8
1 3 7
2 1 9
2 2 5
2 3 .
;
6750 F Chapter 84: The MIXED Procedure
In order to read G into PROC MIXED by using the GDATA= option in the RANDOM statement, perform
the following DATA step:
data g;
input Row Col1-Col6;
datalines;
1 2 1 1 2 1 1
2 1 2 .5 1 2 .5
3 1 .5 2 1 .5 2
4 2 1 1 3 1.5 1.5
5 1 2 .5 1.5 3 .75
6 1 .5 2 1.5 .75 3
;
The preceding data are in the dense representation for a GDATA= data set. You can also construct a data
set with the sparse representation by using Row, Col, and Value variables, although this would require 21
observations instead of 6 for this example.
The PROC MIXED statements are as follows:
option, and it is read into PROC MIXED from a SAS data set by using the GDATA=G specification. The G
and GI options request the display of G and G 1 , respectively. The S option requests that the random-effects
solution vector be displayed.
Note that the preceding R matrix is block diagonal if the data are sorted by animals. The REPEATED
statement exploits this fact by requesting R to have unstructured 22 blocks corresponding to animals, which
are the subjects. The R and RI options request that the estimated 22 blocks for the first animal and its
inverse be displayed. The PARMS statement lists the parameters of this 22 matrix. Note that the parameters
from G are not specified in the PARMS statement because they have already been assigned by using the
GDATA= option in the RANDOM statement. The NOITER option prevents PROC MIXED from computing
residual (restricted) maximum likelihood estimates; instead, the known values are used for inferences.
The results from this analysis are shown in Output 84.4.1–Output 84.4.12.
The “Unstructured” covariance structure (Output 84.4.1) applies to both G and R here. The levels of Trait
and Animal have been specified correctly.
Model Information
Data Set WORK.H
Dependent Variable Y
Covariance Structure Unstructured
Subject Effect Animal
Estimation Method REML
Residual Variance Method None
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Containment
Class Level
Information
Class Levels Values
Trait 2 12
Animal 3 123
The three covariance parameters indicated in Output 84.4.2 correspond to those from the R matrix. Those
from G are considered fixed and known because of the GDATA= option.
Dimensions
Covariance Parameters 3
Columns in X 2
Columns in Z 6
Subjects 1
Max Obs per Subject 5
6752 F Chapter 84: The MIXED Procedure
Number of Observations
Number of Observations Read 6
Number of Observations Used 5
Number of Observations Not Used 1
Because starting values for the covariance parameters are specified in the PARMS statement, the MIXED
procedure prints the residual (restricted) log likelihood at the starting values. Because of the NOITER option
in the PARMS statement, this is also the final log likelihood in this analysis (Output 84.4.3).
Parameter Search
CovP1 CovP2 CovP3 Res Log Like -2 Res Log Like
4.0000 1.0000 5.0000 -7.3731 14.7463
The block of R corresponding to the first animal and the inverse of this block are shown in Output 84.4.4.
Estimated R Matrix
for Animal 1
Row Col1 Col2
1 4.0000 1.0000
2 1.0000 5.0000
Estimated Inv(R)
Matrix for Animal 1
Row Col1 Col2
1 0.2632 -0.05263
2 -0.05263 0.2105
The G matrix as specified in the GDATA= data set and its inverse are shown in Output 84.4.5 and Out-
put 84.4.6.
Estimated G Matrix
Row Effect Trait Animal Col1 Col2 Col3 Col4 Col5 Col6
1 Trait*Animal 1 1 2.0000 1.0000 1.0000 2.0000 1.0000 1.0000
2 Trait*Animal 1 2 1.0000 2.0000 0.5000 1.0000 2.0000 0.5000
3 Trait*Animal 1 3 1.0000 0.5000 2.0000 1.0000 0.5000 2.0000
4 Trait*Animal 2 1 2.0000 1.0000 1.0000 3.0000 1.5000 1.5000
5 Trait*Animal 2 2 1.0000 2.0000 0.5000 1.5000 3.0000 0.7500
6 Trait*Animal 2 3 1.0000 0.5000 2.0000 1.5000 0.7500 3.0000
Example 84.4: Known G and R F 6753
The table of covariance parameter estimates in Output 84.4.7 displays only the parameters in R. Because of
the GDATA= option in the RANDOM statement, the G-side parameters do not participate in the parameter
estimation process. Because of the NOITER option in the PARMS statement, however, the R-side parameters
in this output are identical to their starting values.
Covariance Parameter
Estimates
Cov Parm Subject Estimate
UN(1,1) Animal 4.0000
UN(2,1) Animal 1.0000
UN(2,2) Animal 5.0000
The coefficients of the mixed model equations in Output 84.4.8 agree with Henderson (1984, p. 55). Recall
from Output 84.4.1 that there are 2 columns in X and 6 columns in Z. The first 8 columns of the mixed model
equations correspond to the X and Z components. Column 9 represents the Y border.
The solution to the mixed model equations also matches that given by Henderson (1984, p. 55). After solving
the augmented mixed model equations, you can find the solutions for fixed and random effects in the last
column (Output 84.4.9).
6754 F Chapter 84: The MIXED Procedure
The solutions for the fixed and random effects in Output 84.4.10 correspond to the last column in Output 84.4.9.
Note that the standard errors for the fixed effects and the prediction standard errors for the random effects are
the square root values of the diagonal entries in the solution of the mixed model equations (Output 84.4.9).
The estimates for the two traits are nearly identical, but the standard error of the second trait is larger because
of the missing observation.
The Estimate column in the “Solution for Random Effects” table lists the best linear unbiased predictions
(BLUPs) of the breeding values of both traits for all three animals. The p-values are missing because the
default containment method for computing degrees of freedom results in zero degrees of freedom for the
random effects parameter tests.
The two estimated traits are significantly different from zero at the 5% level (Output 84.4.11).
Output 84.4.12 displays the predicted values of the observations based on the trait and breeding value
estimates—that is, the fixed and random effects.
The predicted values are not the predictions of future records in the sense that they do not contain a component
corresponding to a new observational error. See Henderson (1984) for information about predicting future
records. The Lower and Upper columns usually contain confidence limits for the predicted values; they are
missing here because the random-effects parameter degrees of freedom equals 0.
data rc;
input Batch Month @@;
Monthc = Month;
do i = 1 to 6;
input Y @@;
output;
end;
datalines;
1 0 101.2 103.3 103.3 102.1 104.4 102.4
1 1 98.8 99.4 99.7 99.5 . .
1 3 98.4 99.0 97.3 99.8 . .
1 6 101.5 100.2 101.7 102.7 . .
1 9 96.3 97.2 97.2 96.3 . .
1 12 97.3 97.9 96.8 97.7 97.7 96.7
2 0 102.6 102.7 102.4 102.1 102.9 102.6
2 1 99.1 99.0 99.9 100.6 . .
2 3 105.7 103.3 103.4 104.0 . .
2 6 101.3 101.5 100.9 101.4 . .
2 9 94.1 96.5 97.2 95.6 . .
2 12 93.1 92.8 95.4 92.2 92.2 93.0
3 0 105.1 103.9 106.1 104.1 103.7 104.6
6756 F Chapter 84: The MIXED Procedure
Model Information
Data Set WORK.RC
Dependent Variable Y
Covariance Structure Unstructured
Subject Effect Batch
Estimation Method REML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Containment
Batch is the only classification variable in this analysis, and it has three levels (Output 84.5.2).
Example 84.5: Random Coefficients F 6757
Class Level
Information
Class Levels Values
Batch 3 123
The “Dimensions” table in Output 84.5.3 indicates that there are three subjects (corresponding to batches).
The 24 observations not used correspond to the missing values of Y in the input data set.
Dimensions
Covariance Parameters 4
Columns in X 2
Columns in Z per Subject 2
Subjects 3
Max Obs per Subject 28
Number of Observations
Number of Observations Read 108
Number of Observations Used 84
Number of Observations Not Used 24
Iteration History
Iteration Evaluations -2 Res Log Like Criterion
0 1 367.02768461
1 1 350.32813577 0.00000000
The Estimate column in Output 84.5.5 lists the estimated elements of the unstructured 22 matrix comprising
the blocks of G. Note that the random coefficients are negatively correlated.
Covariance Parameter
Estimates
Cov Parm Subject Estimate
UN(1,1) Batch 0.9768
UN(2,1) Batch -0.1045
UN(2,2) Batch 0.03717
Residual 3.2932
The null model likelihood ratio test indicates a significant improvement over the null model consisting of no
random effects and a homogeneous residual error (Output 84.5.6).
6758 F Chapter 84: The MIXED Procedure
Fit Statistics
-2 Res Log Likelihood 350.3
AIC (Smaller is Better) 358.3
AICC (Smaller is Better) 358.8
BIC (Smaller is Better) 354.7
The fixed-effects estimates represent the estimated means for the random intercept and slope, respectively
(Output 84.5.7).
The random-effects estimates represent the estimated deviation from the mean intercept and slope for each
batch (Output 84.5.8). Therefore, the intercept for the first batch is close to 102:7 1 D 101:7, while the
intercepts for the other two batches are greater than 102.7. The second batch has a slope less than the mean
slope of –0.526, while the other two batches have slopes greater than –0.526.
The F statistic in the “Type 3 Tests of Fixed Effects” table in Output 84.5.9 is the square of the t statistic
used in the test of Month in the preceding “Solution for Fixed Effects” table (compare Output 84.5.7 and
Output 84.5.9). Both statistics test the null hypothesis that the slope assigned to Month equals 0, and this
hypothesis can barely be rejected at the 5% level.
Example 84.5: Random Coefficients F 6759
It is also possible to fit a random coefficients model with error terms that follow a nested structure (Fuller and
Battese 1973). The following SAS statements represent one way of doing this:
Model Information
Data Set WORK.RC
Dependent Variable Y
Covariance Structure Variance Components
Subject Effect Batch
Estimation Method REML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Containment
Dimensions
Covariance Parameters 4
Columns in X 2
Columns in Z per Subject 8
Subjects 3
Max Obs per Subject 28
Number of Observations
Number of Observations Read 108
Number of Observations Used 84
Number of Observations Not Used 24
6760 F Chapter 84: The MIXED Procedure
Iteration History
Iteration Evaluations -2 Res Log Like Criterion
0 1 367.02768461
1 4 277.51945360 .
2 1 276.97551718 0.00104208
3 1 276.90304909 0.00003174
4 1 276.90100316 0.00000004
5 1 276.90100092 0.00000000
Covariance Parameter
Estimates
Cov Parm Subject Estimate
Intercept Batch 0
Month Batch 0.01243
Monthc Batch 3.7411
Residual 0.7969
For this analysis, the Newton-Raphson algorithm requires five iterations and nine likelihood evaluations to
achieve convergence. The missing value in the Criterion column in iteration 1 indicates that a boundary
constraint has been dropped.
The estimate for the Intercept variance component equals 0. This occurs frequently in practice and indicates
that the restricted likelihood is maximized by setting this variance component equal to 0. Whenever a zero
variance component estimate occurs, the following note appears in the SAS log:
Fit Statistics
-2 Res Log Likelihood 276.9
AIC (Smaller is Better) 282.9
AICC (Smaller is Better) 283.2
BIC (Smaller is Better) 280.2
The better-fitting covariance model affects the standard errors of the fixed-effects parameter estimates more
than the estimates themselves (Output 84.5.12).
Example 84.5: Random Coefficients F 6761
The random-effects solution provides the empirical best linear unbiased predictions (EBLUPs) for the
realizations of the random intercept, slope, and nested errors (Output 84.5.13). You can use these values to
compare batches and months.
The test of Month is similar to that from the previous model, although it is no longer significant at the 5%
level (Output 84.5.14).
data line;
length Cult$ 8;
input Block Cult$ @;
row = _n_;
do Sbplt=1 to 12;
if Sbplt le 6 then do;
Irrig = Sbplt;
Dir = 'North';
end; else do;
Irrig = 13 - Sbplt;
Dir = 'South';
end;
input Y @; output;
end;
datalines;
1 Luke 2.4 2.7 5.6 7.5 7.9 7.1 6.1 7.3 7.4 6.7 3.8 1.8
1 Nugaines 2.2 2.2 4.3 6.3 7.9 7.1 6.2 5.3 5.3 5.2 5.4 2.9
1 Bridger 2.9 3.2 5.1 6.9 6.1 7.5 5.6 6.5 6.6 5.3 4.1 3.1
2 Nugaines 2.4 2.2 4.0 5.8 6.1 6.2 7.0 6.4 6.7 6.4 3.7 2.2
2 Bridger 2.6 3.1 5.7 6.4 7.7 6.8 6.3 6.2 6.6 6.5 4.2 2.7
2 Luke 2.2 2.7 4.3 6.9 6.8 8.0 6.5 7.3 5.9 6.6 3.0 2.0
3 Nugaines 1.8 1.9 3.7 4.9 5.4 5.1 5.7 5.0 5.6 5.1 4.2 2.2
3 Luke 2.1 2.3 3.7 5.8 6.3 6.3 6.5 5.7 5.8 4.5 2.7 2.3
3 Bridger 2.7 2.8 4.0 5.0 5.2 5.2 5.9 6.1 6.0 4.3 3.1 3.1
;
proc mixed;
class Block Cult Dir Irrig;
model Y = Cult|Dir|Irrig@2;
random Block Block*Dir Block*Irrig;
repeated / type=toep(4) sub=Block*Cult r;
lsmeans Cult|Irrig;
estimate 'Bridger vs Luke' Cult 1 -1 0;
estimate 'Linear Irrig' Irrig -5 -3 -1 1 3 5;
Example 84.6: Line-Source Sprinkler Irrigation F 6763
Model Information
Data Set WORK.LINE
Dependent Variable Y
Covariance Structures Variance Components, Toeplitz
Subject Effect Block*Cult
Estimation Method REML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Containment
The levels of each classification variable are listed as a single string in the Values column, regardless of
whether the levels are numeric or character (Output 84.6.2).
Even though there is a SUBJECT= effect in the REPEATED statement, the analysis considers all of the data
to be from one subject because there is no corresponding SUBJECT= effect in the RANDOM statement
(Output 84.6.3).
6764 F Chapter 84: The MIXED Procedure
Dimensions
Covariance Parameters 7
Columns in X 48
Columns in Z 27
Subjects 1
Max Obs per Subject 108
Number of Observations
Number of Observations Read 108
Number of Observations Used 108
Number of Observations Not Used 0
Iteration History
Iteration Evaluations -2 Res Log Like Criterion
0 1 226.25427252
1 4 187.99336173 .
2 3 186.62579299 0.10431081
3 1 184.38218213 0.04807260
4 1 183.41836853 0.00886548
5 1 183.25111475 0.00075353
6 1 183.23809997 0.00000748
7 1 183.23797748 0.00000000
The first block of the estimated R matrix has the TOEP(4) structure, and the observations that are three plots
apart exhibit a negative correlation (Output 84.6.5).
Output 84.6.6 lists the estimated covariance parameters from both G and R. The first three are the variance
components making up the diagonal G, and the final four make up the Toeplitz structure in the blocks of
R. The Residual row corresponds to the variance of the Toeplitz structure, and it represents the parameter
profiled out during the optimization process.
Covariance Parameter
Estimates
Cov Parm Subject Estimate
Block 0.2194
Block*Dir 0.01768
Block*Irrig 0.03539
TOEP(2) Block*Cult 0.007986
TOEP(3) Block*Cult 0.001452
TOEP(4) Block*Cult -0.09253
Residual 0.2850
The “–2 Res Log Likelihood” value in Output 84.6.7 is the same as the final value listed in the “Iteration
History” table (Output 84.6.4).
Fit Statistics
-2 Res Log Likelihood 183.2
AIC (Smaller is Better) 197.2
AICC (Smaller is Better) 198.8
BIC (Smaller is Better) 190.9
Every fixed effect except for Dir and Cult*Irrig is significant at the 5% level (Output 84.6.8).
The “Estimates” table lists the results from the various linear combinations of fixed effects specified in the
ESTIMATE statements (Output 84.6.9). Bridger is not significantly different from Luke, and Irrig possesses a
strong linear component. This strength appears to be influencing the significance of the interaction.
6766 F Chapter 84: The MIXED Procedure
Estimates
Standard
Label Estimate Error DF t Value Pr > |t|
Bridger vs Luke -0.03889 0.09524 68 -0.41 0.6843
Linear Irrig 30.6444 1.4412 10 21.26 <.0001
B vs L x Linear Irrig -9.8667 2.7400 68 -3.60 0.0006
The least squares means shown in Output 84.6.10 are useful in comparing the levels of the various fixed
effects. For example, it appears that irrigation levels 5 and 6 have virtually the same effect.
Output 84.6.10 Least Squares Means for Cult, Irrig, and Their Interaction
An interesting exercise is to fit other variance-covariance models to these data and to compare them to
this one by using likelihood ratio tests, Akaike’s information criterion, or Schwarz’s Bayesian information
criterion. In particular, some spatial models are worth investigating (Marx and Thompson 1987; Zimmerman
and Harville 1991). The following is one example of spatial model statements:
Example 84.7: Influence in Heterogeneous Variance Model F 6767
proc mixed;
class Block Cult Dir Irrig;
model Y = Cult|Dir|Irrig@2;
repeated / type=sp(pow)(Row Sbplt) sub=intercept;
run;
The TYPE=SP(POW)(Row Sbplt) option in the REPEATED statement requests the spatial power structure,
with the two defining coordinate variables being Row and Sbplt. The SUBJECT=INTERCEPT option
indicates that the entire data set is to be considered as one subject, thereby modeling R as a dense 108108
covariance matrix. See Wolfinger (1993) for further discussion of this example and additional analyses.
data absorb;
input FatType Absorbed @@;
datalines;
1 164 1 172 1 168 1 177 1 156 1 195
2 178 2 191 2 197 2 182 2 185 2 177
3 175 3 193 3 178 3 171 3 163 3 176
4 155 4 166 4 149 4 164 4 170 4 168
;
The statistical model for these data can be written as
Yij D C i C ij
i D 1; : : : ; t D 4
j D 1; : : : ; r D 6
ij D N.0; i2 /
where Yij is the amount of fat absorbed by the jth batch of the ith fat type, and i denotes the fat-type effects.
A quick glance at the data suggests that observations 6, 9, 14, and 21 might be influential on the analysis,
because these are extreme observations for the respective fat types.
The following SAS statements fit this model and request influence diagnostics for the fixed effects and
covariance parameters. ODS Graphics is used to create plots of the influence diagnostics in addition to the
tabular output. The ESTIMATES suboption requests plots of “leave-one-out” estimates for the fixed effects
and group variances.
run;
Model Information
Data Set WORK.ABSORB
Dependent Variable Absorbed
Covariance Structure Variance Components
Group Effect FatType
Estimation Method REML
Residual Variance Method None
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Between-Within
Covariance Parameter
Estimates
Cov Parm Group Estimate
Residual FatType 1 178.00
Residual FatType 2 60.4000
Residual FatType 3 97.6000
Residual FatType 4 67.6000
2
You can easily verify that these estimates are simple functions of the arithmetic means y i: in the groups.
2
For example, C 4 D y 4: D 162:0, 1 4 D y 1: y 4: D 10:0, and so forth. The covariance parameter
estimates are the sample variances in the groups and are uncorrelated.
The variances in the four groups are shown in the “Covariance Parameter Estimates” table (Output 84.7.1).
The estimated variance in the first group is two to three times larger than the variance in the other groups.
Example 84.7: Influence in Heterogeneous Variance Model F 6769
In groups where the residual variance estimate is large, the precision of the estimate is also small (Out-
put 84.7.2).
The following statements print the “leave-one-out” estimates for fixed effects and covariance parameters that
were written to the inf data set with the ESTIMATES suboption (Output 84.7.3):
The graphical displays in Output 84.7.4 and Output 84.7.5 are created when ODS Graphics is enabled. For
general information about ODS Graphics, see Chapter 24, “Statistical Graphics Using ODS.” For specific
6770 F Chapter 84: The MIXED Procedure
information about the graphics available in the MIXED procedure, see the section “ODS Graphics” on
page 6717.
The estimate of the intercept is affected only when observations from the last group are removed. The
estimate of the “FatType 1” effect reacts to removal of observations in the first and last group (Output 84.7.4).
While observations can affect one or more fixed-effects solutions in this model, they can affect only one
covariance parameter, the variance in their group (Output 84.7.5). Observations 6, 9, 14, and 21, which are
extreme in their group, reduce the group variance considerably.
Diagnostics related to residuals and predicted values are printed with the following statements:
Internally Externally
Observed Predicted PRESS Studentized Studentized
Obs Value Mean Residual Residual Residual Residual
1 164 172.0 -8.000 -9.600 -0.6569 -0.6146
2 172 172.0 0.000 0.000 0.0000 0.0000
3 168 172.0 -4.000 -4.800 -0.3284 -0.2970
4 177 172.0 5.000 6.000 0.4105 0.3736
5 156 172.0 -16.000 -19.200 -1.3137 -1.4521
6 195 172.0 23.000 27.600 1.8885 3.1544
7 178 185.0 -7.000 -8.400 -0.9867 -0.9835
8 191 185.0 6.000 7.200 0.8457 0.8172
9 197 185.0 12.000 14.400 1.6914 2.3131
10 182 185.0 -3.000 -3.600 -0.4229 -0.3852
11 185 185.0 0.000 -0.000 0.0000 0.0000
12 177 185.0 -8.000 -9.600 -1.1276 -1.1681
13 175 176.0 -1.000 -1.200 -0.1109 -0.0993
14 193 176.0 17.000 20.400 1.8850 3.1344
15 178 176.0 2.000 2.400 0.2218 0.1993
16 171 176.0 -5.000 -6.000 -0.5544 -0.5119
17 163 176.0 -13.000 -15.600 -1.4415 -1.6865
18 176 176.0 0.000 0.000 0.0000 0.0000
19 155 162.0 -7.000 -8.400 -0.9326 -0.9178
20 166 162.0 4.000 4.800 0.5329 0.4908
21 149 162.0 -13.000 -15.600 -1.7321 -2.4495
22 164 162.0 2.000 2.400 0.2665 0.2401
23 170 162.0 8.000 9.600 1.0659 1.0845
24 168 162.0 6.000 7.200 0.7994 0.7657
Observations 6, 9, 14, and 21 have large studentized residuals (Output 84.7.6). That the externally studentized
residuals are much larger than the internally studentized residuals for these observations indicates that the
variance estimate in the group shrinks when the observation is removed. Also important to note is that
comparisons based on raw residuals in models with heterogeneous variance can be misleading. Observation
5, for example, has a larger residual but a smaller studentized residual than observation 21. The variance for
the first fat type is much larger than the variance in the fourth group. A “large” residual is more “surprising”
in the groups with small variance.
A measure of the overall influence on the analysis is the (restricted) likelihood distance, shown in Out-
put 84.7.7. Observations 6, 9, 14, and 21 clearly displace the REML solution more than any other observa-
tions.
Example 84.7: Influence in Heterogeneous Variance Model F 6773
The following statements list the restricted likelihood distance and various diagnostics related to the fixed-
effects estimates (Output 84.7.8):
Restr.
Observed Cook's Likelihood
Obs Leverage Value D DFFITS COVRATIO Distance
1 0.167 164 0.02157 -0.27487 1.3706 0.1178
2 0.167 172 0.00000 -0.00000 1.4998 0.1156
3 0.167 168 0.00539 -0.13282 1.4675 0.1124
4 0.167 177 0.00843 0.16706 1.4494 0.1117
5 0.167 156 0.08629 -0.64938 0.9822 0.5290
6 0.167 195 0.17831 1.41069 0.4301 5.8101
7 0.167 178 0.04868 -0.43982 1.2078 0.1935
8 0.167 191 0.03576 0.36546 1.2853 0.1451
9 0.167 197 0.14305 1.03446 0.6416 2.2909
10 0.167 182 0.00894 -0.17225 1.4463 0.1116
11 0.167 185 0.00000 -0.00000 1.4998 0.1156
12 0.167 177 0.06358 -0.52239 1.1183 0.2856
13 0.167 175 0.00061 -0.04441 1.4961 0.1151
14 0.167 193 0.17766 1.40175 0.4340 5.7044
15 0.167 178 0.00246 0.08915 1.4851 0.1139
16 0.167 171 0.01537 -0.22892 1.4078 0.1129
17 0.167 163 0.10389 -0.75423 0.8766 0.8433
18 0.167 176 0.00000 0.00000 1.4998 0.1156
19 0.167 155 0.04349 -0.41047 1.2390 0.1710
20 0.167 166 0.01420 0.21950 1.4148 0.1124
21 0.167 149 0.15000 -1.09545 0.6000 2.7343
22 0.167 164 0.00355 0.10736 1.4786 0.1133
23 0.167 170 0.05680 0.48500 1.1592 0.2383
24 0.167 168 0.03195 0.34245 1.3079 0.1353
In this example, observations with large likelihood distances also have large values for Cook’s D and values
of CovRatio far less than one (Output 84.7.8). The latter indicates that the fixed effects are estimated more
precisely when these observations are removed from the analysis.
The following statements print the values of the D statistic and the CovRatio for the covariance parameters:
Cook's D COVRATIO
Obs Iterations CovParms CovParms
1 3 0.05050 1.6306
2 3 0.15603 1.9520
3 3 0.12426 1.8692
4 3 0.10796 1.8233
5 4 0.08232 0.8375
6 4 1.02909 0.1606
7 1 0.00011 1.2662
8 2 0.01262 1.4335
9 3 0.54126 0.3573
10 3 0.10531 1.8156
11 3 0.15603 1.9520
12 2 0.01160 1.0849
13 3 0.15223 1.9425
14 4 1.01865 0.1635
15 3 0.14111 1.9141
16 3 0.07494 1.7203
17 3 0.18154 0.6671
18 3 0.15603 1.9520
19 2 0.00265 1.3326
20 3 0.08008 1.7374
21 1 0.62500 0.3125
22 3 0.13472 1.8974
23 2 0.00290 1.1663
24 2 0.02020 1.4839
Output 84.7.10 displays the standard panel of influence diagnostics that is obtained when influence analysis is
iterative. The Cook’s D and CovRatio statistics are displayed for each deletion set for both fixed-effects and
covariance parameter estimates. This provides a convenient summary of the impact on the analysis for each
deletion set, since Cook’s D statistic measures impact on the estimates and the CovRatio statistic measures
impact on the precision of the estimates.
6776 F Chapter 84: The MIXED Procedure
Observations 6, 9, 14, and 21 have considerable impact on estimates and precision of fixed effects and
covariance parameters. This is not necessarily the case. Observations can be influential on only some aspects
of the analysis, as shown in the next example.
Each observation in the “Influence Diagnostics for Levels of Person” table in Output 84.8.1 represents the
removal of four observations. The subjects 10, 15, and 24 have the greatest impact on the fixed effects
(Cook’s D), and subject 10 and 21 have large PRESS statistics. The 21st child has a large PRESS statistic,
and its D statistic is not that extreme. This is an indication that the model fits rather poorly for this child,
whether it is part of the data or not.
6778 F Chapter 84: The MIXED Procedure
The previous analysis does not take into account the effect on the covariance parameters when a subject is
removed from the analysis. If you also update the covariance parameters, the impact of observations on these
can amplify or allay their effect on the fixed effects. To assess the overall influence of subjects on the analysis
and to compute separate statistics for the fixed effects and covariance parameters, an iterative analysis is
obtained by adding the INFLUENCE suboption ITER=, as follows:
As judged by the restricted likelihood distance, subjects 20 and 24 clearly have the most influence on the
overall analysis (Output 84.8.2).
Output 84.8.3 displays Cook’s D and CovRatio statistics for the fixed effects and covariance parameters.
Clearly, subject 20 has a dramatic effect on the estimates of variances and covariances. This subject also
affects the precision of the covariance parameter estimates more than any other subject in Output 84.8.3
(CovRatio near 0).
The child who exerts the greatest influence on the fixed effects is subject 24. Maybe surprisingly, this
subject affects the variance-covariance matrix of the fixed effects more than subject 20 (small CovRatio in
Output 84.8.3).
The final model investigated for these data is a random coefficient model as in Stram and Lee (1994) with
random effects for the intercept and age effect. The following statements examine the estimates for fixed
effects and the entries of the unstructured 2 2 variance matrix of the random coefficients graphically:
In Output 84.8.4 the graphs on the left side of the panel represent the intercept and slope estimate for boys;
the graphs on the right side represent the difference in intercept and slope between boys and girls. Removing
any one of the first eleven children, who are girls, does not alter the intercept or slope in the group of boys.
The difference in these parameters between boys and girls is altered by the removal of any child. Subject 24
changes the fixed effects considerably, subject 20 much less so.
6782 F Chapter 84: The MIXED Procedure
The covariance parameter deletion estimates in Output 84.8.5 show several important features.
The panels do not contain information about subject 24. Estimation of the G matrix following removal
of that child did not yield a positive definite matrix. As a consequence, covariance parameter diagnostics
are not produced for this subject.
Subject 20 has great impact on the four covariance parameters. Removing this child from the analysis
increases the variance of the random intercept and random slope and reduces the residual variance by
almost 80%. The repeated measurements of this child exhibit an up-and-down behavior.
The variance of the random intercept and slope are reduced when child 15 is removed from the analysis.
This child’s growth measurements oscillate about 27.0 from age 10 on.
Examining observed and residual values by levels of classification variables is also a useful tool to diagnose
the adequacy of the model and unusual observations. Box plots for effects in the model that consist of only
classification variables can be requested with the BOXPLOT option of the PLOTS= option in the PROC
MIXED statement. For example, the following statements produce box plots for the SUBJECT= effects in
the model:
Example 84.8: Influence Analysis for Repeated Measures Data F 6783
The conditional residuals incorporate the EBLUPs for each child and enable you to examine whether the
subject-specific model is adequate (Output 84.8.8). By using each child “as its own control,” the residuals are
now centered near zero. Subjects 20 and 24 stand out as unusual in all three sets of box plots.
Example 84.9: Examining Individual Test Components F 6785
If i: denotes a whole-plot main effect mean, :j denotes a subplot main effect mean, and ij denotes a cell
mean, the five components shown in Output 84.9.3 correspond to tests of the following:
H0 W 1: D 3:
H0 W 2: D 3:
H0 W :1 D :2
The first three components are comparisons of marginal means. The fourth component compares the effect
of factor B at the first whole-plot level against the effect of B at the third whole-plot level. Finally, the last
component tests whether the factor B effect changes between the second and third whole-plot level.
The Type 3 component tests can also be produced with these corresponding ESTIMATE statements:
Estimates
Standard
Label Estimate Error DF t Value Pr > |t|
a 1 7.1250 3.1672 6 2.25 0.0655
a 2 8.3750 3.1672 6 2.64 0.0383
b 1 5.5000 1.2491 9 4.40 0.0017
a*b 1 7.7500 3.0596 9 2.53 0.0321
a*b 2 7.2500 3.0596 9 2.37 0.0419
A second useful application of the LCOMPONENTS option is in polynomial models, where Type 1 tests are
often used to test the entry of model terms sequentially. The SOLUTION option in the MODEL statement
displays the regression coefficients that correspond to a Type 3 analysis. That is, the coefficients represent
the partial coefficients you would get by adding the regressor variable last in a model containing all other
effects, and the tests are identical to those in the “Type 3 Tests of Fixed Effects” table.
Consider the following DATA step and the fit of a third-order polynomial regression model.
data polynomial;
do x=1 to 20; input y@@; output; end;
datalines;
1.092 1.758 1.997 3.154 3.880
3.810 4.921 4.573 6.029 6.032
6.291 7.151 7.154 6.469 7.137
6.374 5.860 4.866 4.155 2.711
;
The Type 3 L components are identical to the tests in the “Solutions for Fixed Effects” table shown in
Output 84.9.5. The Type 1 table yields the following:
Example 84.10: Isotonic Contrasts for Ordered Mean Values F 6789
sequential (Type 1) tests of regression variables that test the significance of a regressor given all other
variables preceding it in the model list
The estimate of 0.1763 is the regression coefficient in a simple linear regression of Y on X. The estimate of
–0.04886 is the partial coefficient for the quadratic term when it is added to a model containing only a linear
component. Similarly, the value –0.00306 is the partial coefficient for the cubic term when it is added to a
model containing a linear and quadratic component. The last Type 1 component is always identical to the
corresponding Type 3 component.
data FerriteCores;
do Temp = 1 to 4;
do rep = 1 to 5; drop rep;
input MagneticForce @@;
output;
end;
end;
datalines;
10.8 9.9 10.7 10.4 9.7
10.7 10.6 11.0 10.8 10.9
11.9 11.2 11.0 11.1 11.3
11.4 10.7 10.9 11.3 11.7
;
The method presented by Hirotsu and Srivastava (2000) to test whether the magnetic force of the cores rises
monotonically with temperature depends on the lower confidence limits of the isotonic contrasts of the force
6790 F Chapter 84: The MIXED Procedure
means at each temperature, adjusted for multiplicity. The corresponding isotonic contrast compares the
average of a particular group and the preceding groups with the average of the succeeding groups. You can
compute adjusted confidence intervals for isotonic contrasts by using the LSMESTIMATE statement.
The following statements analyze the FerriteCores data as a one-way design and multiplicity-adjusted lower
confidence limits for the isotonic contrasts. For the multiplicity adjustment, the LSMESTIMATE statement
employs simulation, which provides adjusted p-values and lower confidence limits that are exact up to Monte
Carlo error.
With an adjusted p-value of 0.001, the magnetic force at the first temperature is significantly less than the
average of the other temperatures. Likewise, the average of the first two temperatures is significantly less
than the average of the last two (p = 0.0009). However, the magnetic force at the last temperature is not
significantly greater than the average magnetic force of the others (p = 0.0625). These results indicate a
significant monotone increase over the first three temperatures, but not across all four temperatures.
References
Akaike, H. (1974). “A New Look at the Statistical Model Identification.” IEEE Transactions on Automatic
Control AC-19:716–723.
Akritas, M. G., Arnold, S. F., and Brunner, E. (1997). “Nonparametric Hypotheses and Rank Statistics for
Unbalanced Factorial Designs.” Journal of the American Statistical Association 92:258–265.
Allen, D. M. (1974). “The Relationship between Variable Selection and Data Augmentation and a Method of
Prediction.” Technometrics 16:125–127.
References F 6791
Bates, D. M., and Watts, D. G. (1988). Nonlinear Regression Analysis and Its Applications. New York: John
Wiley & Sons.
Beckman, R. J., Nachtsheim, C. J., and Cook, R. D. (1987). “Diagnostics for Mixed-Model Analysis of
Variance.” Technometrics 29:413–426.
Belsley, D. A., Kuh, E., and Welsch, R. E. (1980). Regression Diagnostics: Identifying Influential Data and
Sources of Collinearity. New York: John Wiley & Sons.
Box, G. E. P., and Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis. New York: John Wiley &
Sons.
Bozdogan, H. (1987). “Model Selection and Akaike’s Information Criterion (AIC): The General Theory and
Its Analytical Extensions.” Psychometrika 52:345–370.
Brown, H., and Prescott, R. (1999). Applied Mixed Models in Medicine. New York: John Wiley & Sons.
Brownie, C., Bowman, D. T., and Burton, J. W. (1993). “Estimating Spatial Variation in Analysis of Data
from Yield Trials: A Comparison of Methods.” Agronomy Journal 85:1244–1253.
Brownie, C., and Gumpertz, M. L. (1997). “Validity of Spatial Analysis of Large Field Trials.” Journal of
Agricultural, Biological, and Environmental Statistics 2:1–23.
Brunner, E., Dette, H., and Munk, A. (1997). “Box-Type Approximations in Nonparametric Factorial
Designs.” Journal of the American Statistical Association 92:1494–1502.
Brunner, E., Domhof, S., and Langer, F. (2002). Nonparametric Analysis of Longitudinal Data in Factorial
Experiments. New York: John Wiley & Sons.
Burdick, R. K., and Graybill, F. A. (1992). Confidence Intervals on Variance Components. New York: Marcel
Dekker.
Burnham, K. P., and Anderson, D. R. (1998). Model Selection and Inference: A Practical Information-
Theoretic Approach. New York: Springer-Verlag.
Carlin, B. P., and Louis, T. A. (1996). Bayes and Empirical Bayes Methods for Data Analysis. London:
Chapman & Hall.
Carroll, R. J., and Ruppert, D. (1988). Transformation and Weighting in Regression. London: Chapman &
Hall.
Chilès, J.-P., and Delfiner, P. (1999). Geostatistics: Modeling Spatial Uncertainty. New York: John Wiley &
Sons.
Christensen, R., Pearson, L. M., and Johnson, W. (1992). “Case-Deletion Diagnostics for Mixed Models.”
Technometrics 34:38–45.
Cook, R. D. (1979). “Influential Observations in Linear Regression.” Journal of the American Statistical
Association 74:169–174.
Cook, R. D., and Weisberg, S. (1982). Residuals and Influence in Regression. New York: Chapman & Hall.
6792 F Chapter 84: The MIXED Procedure
Cressie, N. (1993). Statistics for Spatial Data. Rev. ed. New York: John Wiley & Sons.
Crowder, M. J., and Hand, D. J. (1990). Analysis of Repeated Measures. New York: Chapman & Hall.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). “Maximum Likelihood from Incomplete Data via
the EM Algorithm.” Journal of the Royal Statistical Society, Series B 39:1–38.
Diggle, P. J. (1988). “An Approach to the Analysis of Repeated Measurements.” Biometrics 44:959–971.
Diggle, P. J., Liang, K.-Y., and Zeger, S. L. (1994). Analysis of Longitudinal Data. Oxford: Clarendon Press.
Dunnett, C. W. (1980). “Pairwise Multiple Comparisons in the Unequal Variance Case.” Journal of the
American Statistical Association 75:796–800.
Edwards, D., and Berry, J. J. (1987). “The Efficiency of Simulation-Based Multiple Comparisons.” Biometrics
43:913–928.
Everitt, B. S. (1995). “The Analysis of Repeated Measures: A Practical Review with Examples.” Journal of
the Royal Statistical Society, Series D 44:113–135. https://doi.org/10.2307/2348622.
Fai, A. H. T., and Cornelius, P. L. (1996). “Approximate F-Tests of Multiple Degree of Freedom Hypotheses
in Generalized Least Squares Analyses of Unbalanced Split-Plot Experiments.” Journal of Statistical
Computation and Simulation 54:363–378.
Federer, W. T., and Wolfinger, R. D. (1998). “SAS Code for Recovering Intereffect Information in Experi-
ments with Incomplete Block and Lattice Rectangle Designs.” Agronomy Journal 90:545–551.
Fuller, W. A. (1976). Introduction to Statistical Time Series. New York: John Wiley & Sons.
Fuller, W. A., and Battese, G. E. (1973). “Transformations for Estimation of Linear Models with Nested
Error Structure.” Journal of the American Statistical Association 68:626–632.
Galecki, A. T. (1994). “General Class of Covariance Structures for Two or More Repeated Factors in
Longitudinal Data Analysis.” Communications in Statistics—Theory and Methods 23:3105–3109.
Games, P. A., and Howell, J. F. (1976). “Pairwise Multiple Comparison Procedures with Unequal n’s and/or
Variances: A Monte Carlo Study.” Journal of Educational Statistics 1:113–125.
Gelfand, A. E., Hills, S. E., Racine-Poon, A., and Smith, A. F. M. (1990). “Illustration of Bayesian
Inference in Normal Data Models Using Gibbs Sampling.” Journal of the American Statistical Association
85:972–985.
Ghosh, M. (1992). “Discussion of Schervish, M., ‘Bayesian Analysis of Linear Models’.” In Bayesian
Statistics, vol. 4, edited by J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith, 432–433. Oxford:
Clarendon Press.
Giesbrecht, F. G. (1989). A General Structure for the Class of Mixed Linear Models. Southern Cooperative
Series Bulletin 343, Louisiana Agricultural Experiment Station, Baton Rouge.
Giesbrecht, F. G., and Burns, J. C. (1985). “Two-Stage Analysis Based on a Mixed Model: Large-Sample
Asymptotic Theory and Small-Sample Simulation Results.” Biometrics 41:477–486.
Golub, G. H., and Van Loan, C. F. (1989). Matrix Computations. 2nd ed. Baltimore: Johns Hopkins
University Press.
References F 6793
Goodnight, J. H. (1978). Tests of Hypotheses in Fixed-Effects Linear Models. Technical Report R-101, SAS
Institute Inc., Cary, NC.
Goodnight, J. H., and Hemmerle, W. J. (1979). “A Simplified Algorithm for the W-Transformation in
Variance Component Estimation.” Technometrics 21:265–267.
Gotway, C. A., and Stroup, W. W. (1997). “A Generalized Linear Model Approach to Spatial Data and
Prediction.” Journal of Agricultural, Biological, and Environmental Statistics 2:157–187.
Greenhouse, S. W., and Geisser, S. (1959). “On Methods in the Analysis of Profile Data.” Psychometrika
32:95–112.
Gregoire, T. G., Schabenberger, O., and Barrett, J. P. (1995). “Linear Modelling of Irregularly Spaced,
Unbalanced, Longitudinal Data from Permanent Plot Measurements.” Canadian Journal of Forest Research
25:137–156.
Handcock, M. S., and Stein, M. L. (1993). “A Bayesian Analysis of Kriging.” Technometrics 35:403–410.
Handcock, M. S., and Wallis, J. R. (1994). “An Approach to Statistical Spatial-Temporal Modeling of
Meteorological Fields.” Journal of the American Statistical Association 89:368–390. With discussion.
Hanks, R. J., Sisson, D. V., Hurst, R. L., and Hubbard, K. G. (1980). “Statistical Analysis of Results from
Irrigation Experiments Using the Line-Source Sprinkler System.” Soil Science Society American Journal
44:886–888.
Hannan, E. J., and Quinn, B. G. (1979). “The Determination of the Order of an Autoregression.” Journal of
the Royal Statistical Society, Series B 41:190–195.
Hartley, H. O., and Rao, J. N. K. (1967). “Maximum-Likelihood Estimation for the Mixed Analysis of
Variance Model.” Biometrika 54:93–108.
Harville, D. A. (1977). “Maximum Likelihood Approaches to Variance Component Estimation and to Related
Problems.” Journal of the American Statistical Association 72:320–338.
Harville, D. A. (1990). “BLUP (Best Linear Unbiased Prediction), and Beyond.” In Advances in Statistical
Methods for Genetic Improvement of Livestock, edited by D. Gianola and K. Hammond, 239–276. Vol. 18
of Advanced Series in Agricultural Sciences. Berlin: Springer-Verlag.
Harville, D. A., and Jeske, D. R. (1992). “Mean Squared Error of Estimation or Prediction under a General
Linear Model.” Journal of the American Statistical Association 87:724–731.
Hemmerle, W. J., and Hartley, H. O. (1973). “Computing Maximum Likelihood Estimates for the Mixed
AOV Model Using the W-Transformation.” Technometrics 15:819–831.
Henderson, C. R. (1984). Applications of Linear Models in Animal Breeding. Guelph, ON: University of
Guelph.
Hirotsu, C., and Srivastava, M. (2000). “Simultaneous Confidence Intervals Based on One-Sided Max t Test.”
Statistics and Probability Letters 49:25–37.
Hsu, J. C. (1992). “The Factor Analytic Approach to Simultaneous Inference in the General Linear Model.”
Journal of Computational and Graphical Statistics 1:151–168.
Huber, P. J. (1967). “The Behavior of Maximum Likelihood Estimates under Nonstandard Conditions.”
Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability 1:221–233.
Hurtado, G. I. (1993). “Detection of Influential Observations in Linear Mixed Models.” Ph.D. diss.,
Department of Statistics, North Carolina State University.
Hurvich, C. M., and Tsai, C.-L. (1989). “Regression and Time Series Model Selection in Small Samples.”
Biometrika 76:297–307.
Huynh, H., and Feldt, L. S. (1970). “Conditions Under Which Mean Square Ratios in Repeated Measurements
Designs Have Exact F-Distributions.” Journal of the American Statistical Association 65:1582–1589.
Jennrich, R. I., and Schluchter, M. D. (1986). “Unbalanced Repeated-Measures Models with Structured
Covariance Matrices.” Biometrics 42:805–820.
Johnson, D. E., Chaudhuri, U. N., and Kanemasu, E. T. (1983). “Statistical Analysis of Line-Source Sprinkler
Irrigation Experiments and Other Nonrandomized Experiments Using Multivariate Methods.” Soil Science
Society American Journal 47:309–312.
Jones, R. H., and Boadi-Boateng, F. (1991). “Unequally Spaced Longitudinal Data with AR(1) Serial
Correlation.” Biometrics 47:161–175.
Kackar, R. N., and Harville, D. A. (1984). “Approximations for Standard Errors of Estimators of Fixed and
Random Effects in Mixed Linear Models.” Journal of the American Statistical Association 79:853–862.
Kass, R. E., and Steffey, D. (1989). “Approximate Bayesian Inference in Conditionally Independent Hierar-
chical Models (Parametric Empirical Bayes Models).” Journal of the American Statistical Association
84:717–726.
Kenward, M. G. (1987). “A Method for Comparing Profiles of Repeated Measurements.” Journal of the
Royal Statistical Society, Series C 36:296–308.
Kenward, M. G., and Roger, J. H. (1997). “Small Sample Inference for Fixed Effects from Restricted
Maximum Likelihood.” Biometrics 53:983–997.
Kenward, M. G., and Roger, J. H. (2009). “An Improved Approximation to the Precision of Fixed Effects
from Restricted Maximum Likelihood.” Computational Statistics and Data Analysis 53:2583–2595.
Keselman, H. J., Algina, J., Kowalchuk, R. K., and Wolfinger, R. D. (1998). “A Comparison of Two
Approaches for Selecting Covariance Structures in the Analysis of Repeated Measures.” Communications
in Statistics—Simulation and Computation 27:591–604.
Keselman, H. J., Algina, J., Kowalchuk, R. K., and Wolfinger, R. D. (1999). “A Comparison of Recent
Approaches to the Analysis of Repeated Measurements.” British Journal of Mathematical and Statistical
Psychology 52:63–78.
Kramer, C. Y. (1956). “Extension of Multiple Range Tests to Group Means with Unequal Numbers of
Replications.” Biometrics 12:307–310.
References F 6795
Laird, N. M., Lange, N. T., and Stram, D. O. (1987). “Maximum Likelihood Computations with Repeated
Measures: Application of the EM Algorithm.” Journal of the American Statistical Association 82:97–105.
Laird, N. M., and Ware, J. H. (1982). “Random-Effects Models for Longitudinal Data.” Biometrics
38:963–974.
Liang, K.-Y., and Zeger, S. L. (1986). “Longitudinal Data Analysis Using Generalized Linear Models.”
Biometrika 73:13–22.
Lindstrom, M. J., and Bates, D. M. (1988). “Newton-Raphson and EM Algorithms for Linear Mixed-Effects
Models for Repeated-Measures Data.” Journal of the American Statistical Association 83:1014–1022.
Littell, R. C., Milliken, G. A., Stroup, W. W., Wolfinger, R. D., and Schabenberger, O. (2006). SAS for Mixed
Models. 2nd ed. Cary, NC: SAS Institute Inc.
Little, R. J. A. (1995). “Modeling the Drop-Out Mechanism in Repeated-Measures Studies.” Journal of the
American Statistical Association 90:1112–1121.
Louis, T. A. (1988). “General Methods for Analyzing Repeated Measures.” Statistics in Medicine 7:29–45.
Macchiavelli, R. E., and Arnold, S. F. (1994). “Variable Order Ante-dependence Models.” Communications
in Statistics—Theory and Methods 23:2683–2699.
Marx, D., and Thompson, K. (1987). Practical Aspects of Agricultural Kriging. Bulletin 903, Arkansas
Agricultural Experiment Station, Fayetteville.
McLean, R. A., and Sanders, W. L. (1988). “Approximating Degrees of Freedom for Standard Errors in
Mixed Linear Models.” In Proceedings of the Statistical Computing Section, 50–59. Alexandria, VA:
American Statistical Association.
McLean, R. A., Sanders, W. L., and Stroup, W. W. (1991). “A Unified Approach to Mixed Linear Models.”
American Statistician 45:54–64.
Milliken, G. A., and Johnson, D. E. (1992). Designed Experiments. Vol. 1 of Analysis of Messy Data. Reprint
edition. New York: Chapman & Hall.
Moriguchi, S., ed. (1976). Statistical Method for Quality Control. Tokyo: Japan Standards Association. In
Japanese.
Murray, D. M. (1998). Design and Analysis of Group-Randomized Trials. New York: Oxford University
Press.
Myers, R. H. (1990). Classical and Modern Regression with Applications. 2nd ed. Belmont, CA: PWS-Kent.
Obenchain, R. L. (1990). STABLSIM.EXE, Version 9010. Unpublished C code. Indianapolis: Eli Lilly.
6796 F Chapter 84: The MIXED Procedure
Patel, H. I. (1991). “Analysis of Incomplete Data from a Clinical Trial with Repeated Measurements.”
Biometrika 78:609–619.
Patterson, H. D., and Thompson, R. (1971). “Recovery of Inter-block Information When Block Sizes Are
Unequal.” Biometrika 58:545–554.
Pillai, K. C. S., and Samson, P., Jr. (1959). “On Hotelling’s Generalization of T 2 .” Biometrika 46:160–168.
Pothoff, R. F., and Roy, S. N. (1964). “A Generalized Multivariate Analysis of Variance Model Useful
Especially for Growth Curve Problems.” Biometrika 51:313–326.
Prasad, N. G. N., and Rao, J. N. K. (1990). “The Estimation of Mean Squared Error of Small-Area Estimators.”
Journal of the American Statistical Association 85:163–171.
Pringle, R. M., and Rayner, A. A. (1971). Generalized Inverse Matrices with Applications to Statistics. New
York: Hafner Publishing.
Rao, C. R. (1972). “Estimation of Variance and Covariance Components in Linear Models.” Journal of the
American Statistical Association 67:112–115.
Ripley, B. D. (1987). Stochastic Simulation. New York: John Wiley & Sons.
Robinson, G. K. (1991). “That BLUP Is a Good Thing: The Estimation of Random Effects.” Statistical
Science 6:15–51.
Sacks, J., Welch, W. J., Mitchell, T. J., and Wynn, H. P. (1989). “Design and Analysis of Computer
Experiments.” Statistical Science 4:409–435.
Schabenberger, O., and Gotway, C. A. (2005). Statistical Methods for Spatial Data Analysis. Boca Raton,
FL: Chapman & Hall/CRC.
Schervish, M. J. (1992). “Bayesian Analysis of Linear Models.” In Bayesian Statistics, vol. 4, edited by J. M.
Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith, 419–434. Oxford: Clarendon Press.
Schluchter, M. D., and Elashoff, J. D. (1990). “Small-Sample Adjustments to Tests with Unbalanced
Repeated Measures Assuming Several Covariance Structures.” Journal of Statistical Computation and
Simulation 37:69–87.
Searle, S. R. (1971). Linear Models. New York: John Wiley & Sons.
Searle, S. R. (1982). Matrix Algebra Useful for Statisticians. New York: John Wiley & Sons.
Searle, S. R. (1988). “Mixed Models and Unbalanced Data: Wherefrom, Whereat, and Whereto?” Communi-
cations in Statistics—Theory and Methods 17:935–968.
Searle, S. R., Casella, G., and McCulloch, C. E. (1992). Variance Components. New York: John Wiley &
Sons.
Self, S. G., and Liang, K.-Y. (1987). “Asymptotic Properties of Maximum Likelihood Estimators and
Likelihood Ratio Tests under Nonstandard Conditions.” Journal of the American Statistical Association
82:605–610.
References F 6797
Serfling, R. J. (1980). Approximation Theorems of Mathematical Statistics. New York: John Wiley & Sons.
Simpson, S. L., Edwards, L. J., Muller, K. E., Sen, P. K., and Styner, M. A. (2010). “A Linear Exponent
AR(1) Family of Correlation Structures.” Statistics in Medicine 29:1825–1838.
Singer, J. D. (1998). “Using SAS PROC MIXED to Fit Multilevel Models, Hierarchical Models, and
Individual Growth Models.” Journal of Educational and Behavioral Statistics 23:323–355.
Smith, A. F. M., and Gelfand, A. E. (1992). “Bayesian Statistics without Tears: A Sampling-Resampling
Perspective.” American Statistician 46:84–88.
Snedecor, G. W., and Cochran, W. G. (1980). Statistical Methods. 7th ed. Ames: Iowa State University Press.
Steel, R. G. D., Torrie, J. H., and Dickey, D. A. (1997). Principles and Procedures of Statistics: A Biometrical
Approach. 3rd ed. New York: McGraw-Hill.
Stram, D. O., and Lee, J. W. (1994). “Variance Components Testing in the Longitudinal Mixed Effects
Model.” Biometrics 50:1171–1177.
Stroup, W. W. (1989a). “Predictable Functions and Prediction Space in the Mixed Model Procedure.”
Applications of Mixed Models in Agriculture and Related Disciplines, 39–48. Southern Cooperative Series
Bulletin No. 343, Louisiana Agricultural Experiment Station, Baton Rouge.
Stroup, W. W. (1989b). “Use of Mixed Model Procedure to Analyze Spatially Correlated Data: An Example
Applied to a Line-Source Sprinkler Irrigation Experiment.” Applications of Mixed Models in Agriculture
and Related Disciplines, 104–122. Southern Cooperative Series Bulletin No. 343, Louisiana Agricultural
Experiment Station, Baton Rouge.
Stroup, W. W., Baenziger, P. S., and Mulitze, D. K. (1994). “Removing Spatial Variation from Wheat Yield
Trials: A Comparison of Methods.” Crop Science 86:62–66.
Sullivan, L. M., Dukes, K. A., and Losina, E. (1999). “An Introduction to Hierarchical Linear Modelling.”
Statistics in Medicine 18:855–888.
Swallow, W. H., and Monahan, J. F. (1984). “Monte Carlo Comparison of ANOVA, MIVQUE, REML, and
ML Estimators of Variance Components.” Technometrics 28:47–57.
Tamhane, A. C. (1979). “A Comparison of Procedures for Multiple Comparisons of Means with Unequal
Variances.” Journal of the American Statistical Association 74:471–480.
Tierney, L. (1994). “Markov Chains for Exploring Posterior Distributions.” Annals of Statistics 22:1701–1762.
Verbeke, G., and Molenberghs, G., eds. (1997). Linear Mixed Models in Practice: A SAS-Oriented Approach.
New York: Springer.
Verbeke, G., and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. New York: Springer.
Westfall, P. H., Tobias, R. D., Rom, D., Wolfinger, R. D., and Hochberg, Y. (1999). Multiple Comparisons
and Multiple Tests Using the SAS System. Cary, NC: SAS Institute Inc.
Westfall, P. H., and Young, S. S. (1993). Resampling-Based Multiple Testing: Examples and Methods for
p-Value Adjustment. New York: John Wiley & Sons.
6798 F Chapter 84: The MIXED Procedure
White, H. (1980). “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for
Heteroskedasticity.” Econometrica 48:817–838.
Winer, B. J. (1971). Statistical Principles in Experimental Design. 2nd ed. New York: McGraw-Hill.
Wolfinger, R. D. (1997). “An Example of Using Mixed Models and PROC MIXED for Longitudinal Data.”
Journal of Biopharmaceutical Statistics 7:481–500.
Wolfinger, R. D., and Chang, M. (1995). “Comparing the SAS GLM and MIXED Procedures for Repeated
Measures.” In Proceedings of the Twentieth Annual SAS Users Group Conference, 1172–1182. Cary, NC:
SAS Institute Inc. https://support.sas.com/resources/papers/proceedings-archive/
SUGI95/Sugi-95-198%20Wolfinger%20Chang.pdf.
Wolfinger, R. D., Tobias, R. D., and Sall, J. (1991). “Mixed Models: A Future Direction.” In Pro-
ceedings of the Sixteenth Annual SAS Users Group Conference, 1380–1388. Cary, NC: SAS Institute
Inc. https://support.sas.com/resources/papers/proceedings-archive/SUGI91/
Sugi-91-249%20Wolfinger%20Tobias%20Sall.pdf.
Wolfinger, R. D., Tobias, R. D., and Sall, J. (1994). “Computing Gaussian Likelihoods and Their Derivatives
for General Linear Mixed Models.” SIAM Journal on Scientific Computing 15:1294–1310.
Wright, S. P. (1994). Adjusted F Tests for Repeated Measures with the MIXED Procedure. Knoxville:
Statistics Department, University of Tennessee.
Zimmerman, D. L., and Harville, D. A. (1991). “A Random Field Approach to the Analysis of Field-Plot
Experiments and Other Spatial Experiments.” Biometrics 47:223–239.