Factorial Design Notes and Examples

- A $2^k$ factorial design is a commonly used response surface design in which each of k factors has two levels (−1 and +1) and there are $2^k$ experimental runs, one for each combination of the factor levels.
- For a $2^2$ design with factors A and B, the average effects of A and B and their interaction AB can be estimated from the contrasts ab + a − b − (1), ab − a + b − (1), and ab − a − b + (1), respectively.
- These are orthogonal contrasts whose sums of squares give the mean squares used to test the significance of A, B, and AB.

Uploaded by

AZ Ndingwan

4 Two-Level ($2^k$) Factorial Designs

• Many applications of response surface methodology are based on fitting one of the following models:

First order model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \qquad (3)$$

Interaction model:
$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum\sum_{i<j} \beta_{ij} x_i x_j \qquad (4)$$

Second order model:
$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum\sum_{i<j} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 \qquad (5)$$

• One commonly used response surface design is a $2^k$ factorial design.

• A $2^k$ factorial design is a k-factor design such that

(i) Each factor has two levels (coded −1 and +1).

(ii) The $2^k$ experimental runs are based on the $2^k$ combinations of the ±1 factor levels.
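The $2^k$ combinations in (ii) can be enumerated mechanically. The following is a minimal Python sketch, not part of the original notes; the function name `standard_order_runs` is a hypothetical helper, and the ordering (first factor changing fastest) is an assumed convention.

```python
from itertools import product

def standard_order_runs(k):
    """Return the 2**k runs of coded levels, first factor changing fastest.

    Hypothetical helper for illustration only.
    """
    # itertools.product varies the LAST position fastest, so each tuple is
    # reversed to make the FIRST factor change fastest instead.
    return [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]

runs = standard_order_runs(2)
for run in runs:
    print(run)  # one (x1, x2) combination per line
```

For k = 2 this lists the four runs with both factors low first and both factors high last.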

• Common applications of $2^k$ factorial designs (and the fractional factorial designs in Section 5 of the course notes) include the following:

– As screening experiments: A $2^k$ design is used to identify or screen for potentially important process or system variables. Once screened, these important variables are then incorporated into a more complex experimental study.
– To fit the first-order model in (3) or the interaction model in (4): The $2^k$ design can be used to fit model (3) or (4). One application of fitting these models is in the method of steepest ascent or descent (Section 6 of the course notes).
– As a building block for second-order response surface designs: $2^k$ designs are used to generate central composite designs (CCDs) and Box-Behnken designs (BBDs).

• We will first analyze each $2^k$ design as a fixed effects design. We will also generalize the fixed effects results to the regression model approach for which the model contains regression coefficients $\beta_0, \beta_1, \beta_2, \ldots$ as in (3) and (4).

• Before analyzing the data, you must determine if the design was completely randomized or
if blocking was used. Your answer to this question will indicate the appropriate analysis.
Initially, we will assume the design was completely randomized.

4.1 The $2^2$ Design

• The simplest $2^k$ design is the $2^2$ design. This is a special case of a two-factor factorial design with factors A and B each having two levels.

• Because a $2^2$ design has only 4 runs, several (n) replications are taken.

• Notationally, we use lowercase letters a, b, ab, and (1) to indicate the sum of the responses
for all replications at each of the corresponding levels of A and B.

– If the lower case letter appears, then that factor is at its high (+1) level.
– If the lower case letter does not appear, then that factor is at its low (−1) level.

Factor Level        Coded Levels    Replicate                 Sum of n
Combination           A     B       1     2     ···   n       Replicates
A low,  B low        −1    −1       xxx   xxx   ···   xxx     (1) = $y_{11\cdot}$
A high, B low        +1    −1       xxx   xxx   ···   xxx     a   = $y_{21\cdot}$
A low,  B high       −1    +1       xxx   xxx   ···   xxx     b   = $y_{12\cdot}$
A high, B high       +1    +1       xxx   xxx   ···   xxx     ab  = $y_{22\cdot}$

• We will use the notation $A^+$ and $A^-$ to represent the set of observations with factor A at its high (+1) and its low (−1) levels, respectively. The same notation applies to $B^+$ and $B^-$ for factor B.

– a and ab correspond to $A^+$, and (1) and b correspond to $A^-$.

– b and ab correspond to $B^+$, and (1) and a correspond to $B^-$.

• $\bar{y}_{A^+}$ and $\bar{y}_{A^-}$ are the means of all observations when A = +1 and A = −1, respectively.

• $\bar{y}_{B^+}$ and $\bar{y}_{B^-}$ are the means of all observations when B = +1 and B = −1, respectively.

• The average effect of a factor is the average change in the response produced by a change
in the level of that factor averaged over the levels of the other factor.

• For a $2^2$ design with n replicates:

– The average effect of Factor A, denoted A, is

$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n} = \frac{1}{2n}\,[ab + a - b - (1)].$$

– The average effect of Factor B, denoted B, is

$$B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n} = \frac{1}{2n}\,[ab - a + b - (1)].$$

– The interaction effect between Factors A and B, denoted AB, is one half of the difference between (i) the average change in response when the levels of Factor A are changed given Factor B is at its high level and (ii) the average change in response when the levels of Factor A are changed given Factor B is at its low level:

$$AB = \frac{1}{2}\left[(\bar{y}_{A^+B^+} - \bar{y}_{A^-B^+}) - (\bar{y}_{A^+B^-} - \bar{y}_{A^-B^-})\right] = \frac{(ab - b) - (a - (1))}{2n} = \frac{ab - a - b + (1)}{2n}.$$

Note: The result is the same if we switch the roles of A and B in the definition:

$$AB = \frac{1}{2}\left[(\bar{y}_{A^+B^+} - \bar{y}_{A^+B^-}) - (\bar{y}_{A^-B^+} - \bar{y}_{A^-B^-})\right] = \frac{(ab - a) - (b - (1))}{2n} = \frac{ab - a - b + (1)}{2n}.$$

Sums of Squares for A, B and AB.

• Note that when estimating the effects for A, B and AB the following contrasts are used:

$$\Gamma_A = ab + a - b - (1) \qquad \Gamma_B = ab - a + b - (1) \qquad \Gamma_{AB} = ab - a - b + (1)$$

• $\Gamma_A$, $\Gamma_B$, and $\Gamma_{AB}$ are used to estimate A, B, and AB, and they are orthogonal contrasts.

– With the totals taken in the order (ab, a, b, (1)), the coefficient vectors for the contrasts are [1 1 −1 −1] for A, [1 −1 1 −1] for B, and [1 −1 −1 1] for AB. Note the dot product of any two of these vectors is 0. This is why they are called orthogonal contrasts.
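The orthogonality claim is easy to check numerically. The sketch below is an illustration added to the notes; it takes the coefficient vectors in the order (ab, a, b, (1)), and the helper `dot` is a hypothetical name.

```python
from itertools import combinations

# Contrast coefficients applied to the totals in the order (ab, a, b, (1))
contrasts = {
    "A":  [1,  1, -1, -1],
    "B":  [1, -1,  1, -1],
    "AB": [1, -1, -1,  1],
}

def dot(u, v):
    """Dot product of two equal-length coefficient vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Every pair of distinct contrasts should have dot product 0 (orthogonal)
dot_products = {
    (p, q): dot(contrasts[p], contrasts[q]) for p, q in combinations(contrasts, 2)
}
# Each vector should also sum to 0, which is what makes it a contrast
coefficient_sums = {name: sum(c) for name, c in contrasts.items()}

print(dot_products)
print(coefficient_sums)
```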

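• The sum of squares for a contrast $\Gamma = \sum_i c_i y_{i\cdot}$ of treatment totals, each total being a sum of n observations, is

$$SS_\Gamma = \frac{\Gamma^2}{n \sum_i c_i^2}.$$

• For a replicated $2^2$ design every coefficient is ±1, so $\sum_i c_i^2 = 4$ and this is equivalent to:

$$SS_A = \frac{[ab + a - b - (1)]^2}{4n} \qquad SS_B = \frac{[ab - a + b - (1)]^2}{4n} \qquad SS_{AB} = \frac{[ab - a - b + (1)]^2}{4n}$$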

• Because there are two levels for both factors, the degrees of freedom associated with each sum of squares is 1. Thus, $MS_A = SS_A$, $MS_B = SS_B$, and $MS_{AB} = SS_{AB}$.

• Because there are n replicates for each of the four A ∗ B treatment combinations, there are
4(n − 1) degrees of freedom for error for the four-parameter interaction model in (4).

• It is common to list the treatment combinations in standard order: (1), a, b, and ab. Many
references use a shortened notation (− or +) to denote the low (−1) and high (+1) levels of
a factor.
Example: An engineer runs a $2^2$ design with n = 4 replicates to study the effects of bit size (A) and cutting speed (B) on routing notches in a printed circuit board.

 A   B   AB    Replicates                    Totals
 −   −   +     18.2   18.9   12.9   14.4     (1) =  64.4
 +   −   −     27.2   24.0   22.4   22.5     a   =  96.1
 −   +   −     15.9   14.5   15.1   14.2     b   =  59.7
 +   +   +     41.0   43.9   36.3   39.9     ab  = 161.1

Note: the signs in the AB column are the signs that result when multiplying the A and B columns.
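• The estimates of the fixed effects are:

$$A = \frac{\Gamma_A}{2n} = \frac{ab + a - b - (1)}{2n} = \frac{161.1 + 96.1 - 59.7 - 64.4}{8} = \frac{133.1}{8} = 16.6375$$

$$B = \frac{\Gamma_B}{2n} = \frac{ab - a + b - (1)}{2n} = \frac{161.1 - 96.1 + 59.7 - 64.4}{8} = \frac{60.3}{8} = 7.5375$$

$$AB = \frac{\Gamma_{AB}}{2n} = \frac{ab - a - b + (1)}{2n} = \frac{161.1 - 96.1 - 59.7 + 64.4}{8} = \frac{69.7}{8} = 8.7125$$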

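• The sums of squares $SS_i = \Gamma_i^2 / 4n$ for i = A, B, AB, together with the total and error sums of squares, are:

$$SS_A = \frac{133.1^2}{16} = 1107.2256 \qquad SS_B = \frac{60.3^2}{16} = 227.2556 \qquad SS_{AB} = \frac{69.7^2}{16} = 303.6306$$

$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{2}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{\cdot\cdot\cdot}^2}{4n} = 10796.69 - \frac{381.3^2}{16} = 1709.8344$$

$$SS_E = SS_T - SS_A - SS_B - SS_{AB} = 71.7225$$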

• Sums of squares can also be calculated using the formulas for a two-factor factorial design.
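As a check on the hand calculations for the router example, the effects and sums of squares can be recomputed from the raw replicates. This is a minimal Python sketch added for illustration; the variable names are assumptions, and only the formulas $\Gamma/2n$ and $\Gamma^2/4n$ from the notes are used.

```python
# Replicates for each treatment combination of the 2^2 router example
replicates = {
    "(1)": [18.2, 18.9, 12.9, 14.4],
    "a":   [27.2, 24.0, 22.4, 22.5],
    "b":   [15.9, 14.5, 15.1, 14.2],
    "ab":  [41.0, 43.9, 36.3, 39.9],
}
n = 4  # replicates per treatment combination

totals = {label: sum(ys) for label, ys in replicates.items()}
one, a, b, ab = totals["(1)"], totals["a"], totals["b"], totals["ab"]

# Contrasts, effect estimates (contrast / 2n) and sums of squares (contrast^2 / 4n)
gamma = {"A": ab + a - b - one, "B": ab - a + b - one, "AB": ab - a - b + one}
effects = {name: g / (2 * n) for name, g in gamma.items()}
ss = {name: g ** 2 / (4 * n) for name, g in gamma.items()}

# Total and error sums of squares
all_y = [y for ys in replicates.values() for y in ys]
ss_total = sum(y * y for y in all_y) - sum(all_y) ** 2 / (4 * n)
ss_error = ss_total - sum(ss.values())

for name in gamma:
    print(name, round(effects[name], 4), round(ss[name], 4))
print("SST", round(ss_total, 4), "SSE", round(ss_error, 4))
```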

The Regression Model

• If both factors in the $2^2$ design are quantitative (say, $x_1$ and $x_2$), we can fit the first order regression model
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$
or we can fit the regression model with interaction:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \epsilon.$$

• The least squares estimates $[\, b_0 \; b_1 \; b_2 \; b_{12} \,]' = (X'X)^{-1} X'y$ are directly related to the estimated effects A, B, and AB from the fixed effects analysis:

$$b_0 = \frac{ab + a + b + (1)}{4n} \quad \text{or} \quad b_0 = \bar{y}$$

$$b_1 = \frac{\Gamma_A}{4n} = \frac{ab + a - b - (1)}{4n} \quad \text{or} \quad b_1 = A/2$$

$$b_2 = \frac{\Gamma_B}{4n} = \frac{ab + b - a - (1)}{4n} \quad \text{or} \quad b_2 = B/2$$

$$b_{12} = \frac{\Gamma_{AB}}{4n} = \frac{ab + (1) - a - b}{4n} \quad \text{or} \quad b_{12} = AB/2$$

• For the previous example:
b0 = ȳ = 381.3/16 = 23.83125
b1 = A/2 = 16.6375/2 = 8.31875
b2 = B/2 = 7.5375/2 = 3.76875
b12 = AB/2 = 8.7125/2 = 4.35625

• Therefore, the fitted regression equation is
$$\hat{y} = 23.83125 + 8.31875\,x_1 + 3.76875\,x_2 + 4.35625\,x_1 x_2$$
where $(x_1, x_2)$ are the coded levels of factors A and B.
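The link between the regression coefficients and the effects can be verified directly. The sketch below is an added illustration that relies on one stated assumption: because the columns of X (intercept, x1, x2, x1x2) are mutually orthogonal with entries ±1, X'X = N·I, so the least squares solution reduces to b = X'y / N. The names `rows`, `b`, and `predict` are hypothetical.

```python
# (x1, x2, replicate responses) for the 2^2 router example
data = [
    (-1, -1, [18.2, 18.9, 12.9, 14.4]),
    ( 1, -1, [27.2, 24.0, 22.4, 22.5]),
    (-1,  1, [15.9, 14.5, 15.1, 14.2]),
    ( 1,  1, [41.0, 43.9, 36.3, 39.9]),
]

# One row per observation: (intercept, x1, x2, x1*x2, y)
rows = [(1, x1, x2, x1 * x2, y) for x1, x2, ys in data for y in ys]
N = len(rows)

# Orthogonal +/-1 columns give X'X = N * I, so b = X'y / N (no matrix inverse needed)
b = [sum(r[j] * r[4] for r in rows) / N for j in range(4)]

def predict(x1, x2):
    """Fitted value from the interaction regression model at coded levels."""
    return b[0] + b[1] * x1 + b[2] * x2 + b[3] * x1 * x2

print([round(coef, 5) for coef in b])
print(round(predict(1, 1), 3))  # fitted value at A high, B high
```

Since this model has four parameters and there are four treatment combinations, each fitted value equals the corresponding cell mean.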

4.2 The $2^3$ Design

• Let A, B, and C be three factors each having two levels. The design which includes the $2^3 = 8$ treatment combinations of A ∗ B ∗ C is called a $2^3$ (factorial) design.

• The following table summarizes the eight treatment combinations and the signs for calculating effects in the $2^3$ design (I = intercept). Assume each treatment is replicated n times.

       Factorial Effect                       Sum of
 I   A   B   C   AB   AC   BC   ABC           replicates
 +   −   −   −   +    +    +    −             (1) = $y_{111\cdot}$
 +   +   −   −   −    −    +    +             a   = $y_{211\cdot}$
 +   −   +   −   −    +    −    +             b   = $y_{121\cdot}$
 +   +   +   −   +    −    −    −             ab  = $y_{221\cdot}$
 +   −   −   +   +    −    −    +             c   = $y_{112\cdot}$
 +   +   −   +   −    +    −    −             ac  = $y_{212\cdot}$
 +   −   +   +   −    −    +    −             bc  = $y_{122\cdot}$
 +   +   +   +   +    +    +    +             abc = $y_{222\cdot}$

• The signs in the interaction columns are the signs that result when multiplying the main effect columns in the interaction of interest. Note that all columns are mutually orthogonal.
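The sign table can be generated by multiplying main-effect columns, which also makes the mutual orthogonality easy to confirm. The following Python sketch is an added illustration; the names `label`, `effects`, and `sign_table` are hypothetical.

```python
from itertools import combinations, product
from math import prod

factors = "ABC"
# Runs with A changing fastest: (1), a, b, ab, c, ac, bc, abc
runs = [tuple(reversed(r)) for r in product((-1, 1), repeat=len(factors))]

def label(run):
    """Treatment label: lowercase letter for each factor at its high level."""
    name = "".join(f.lower() for f, s in zip(factors, run) if s == 1)
    return name or "(1)"

# Effects A, B, C, AB, AC, BC, ABC
effects = ["".join(c) for size in range(1, 4) for c in combinations(factors, size)]

# Sign of an effect in a run = product of the signs of the factors it contains
sign_table = {
    label(run): {e: prod(run[factors.index(f)] for f in e) for e in effects}
    for run in runs
}

for treatment, signs in sign_table.items():
    print(f"{treatment:>4}", " ".join("+" if signs[e] > 0 else "-" for e in effects))

# Orthogonality: the dot product of any two distinct effect columns is 0
column_dots = {
    (p, q): sum(row[p] * row[q] for row in sign_table.values())
    for p, q in combinations(effects, 2)
}
```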

• For a $2^3$ design with n replicates, each estimated effect is the difference between two means: the first mean is the average of all data corresponding to the + rows in an effect column and the second mean is the average of all data corresponding to the − rows in an effect column.

Average effect of Factor A, denoted A, is

$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{a + ab + ac + abc}{4n} - \frac{(1) + b + c + bc}{4n} = \frac{1}{4n}\,[a + ab + ac + abc - (1) - b - c - bc].$$

Average effect of Factor B, denoted B, is

$$B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{b + ab + bc + abc}{4n} - \frac{(1) + a + c + ac}{4n} = \frac{1}{4n}\,[b + ab + bc + abc - (1) - a - c - ac].$$

Average effect of Factor C, denoted C, is

$$C = \bar{y}_{C^+} - \bar{y}_{C^-} = \frac{c + ac + bc + abc}{4n} - \frac{(1) + a + b + ab}{4n} = \frac{1}{4n}\,[c + ac + bc + abc - (1) - a - b - ab].$$

Two-factor interaction effect between Factors A and B, denoted AB, is

$$AB = \frac{ab + abc - a - ac}{4n} - \frac{b + bc - (1) - c}{4n} = \frac{abc + ab + c + (1) - a - ac - bc - b}{4n}.$$

Two-factor interaction effect between Factors A and C, denoted AC, is

$$AC = \frac{ac + abc - a - ab}{4n} - \frac{c + bc - (1) - b}{4n} = \frac{abc + ac + b + (1) - ab - a - bc - c}{4n}.$$

Two-factor interaction effect between Factors B and C, denoted BC, is

$$BC = \frac{bc + abc - b - ab}{4n} - \frac{c + ac - (1) - a}{4n} = \frac{abc + bc + a + (1) - ab - b - ac - c}{4n}.$$

Three-factor interaction effect between Factors A, B and C, denoted ABC, is the average difference between the AB interaction for the two different levels of C. That is,

$$ABC = \frac{(abc - bc) - (ac - c)}{4n} - \frac{(ab - b) - (a - (1))}{4n} = \frac{abc + a + b + c - ab - ac - bc - (1)}{4n}.$$

• Let Γ = the contrast sum in the numerator for any of the effects. Then the sum of squares associated with that effect is $SS = \Gamma^2 / 8n$.
[Figures: Geometric Representation for a $2^3$ Design. Panels: Estimation of Main Effects (A effect, B effect, C effect), Estimation of Two-Factor Interaction Effects, and Estimation of the Three-Factor Interaction Effect.]
The Regression Model

• If all three factors in the $2^3$ design are quantitative (say, $x_1$, $x_2$, and $x_3$), we can fit the regression model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \beta_{123} x_1 x_2 x_3 + \epsilon. \qquad (6)$$

• The least squares estimates (with the exception of $b_0$) are 1/2 of the estimated effects from the fixed effects analysis. That is,

$$b_0 = \bar{y} \qquad b_1 = A/2 \qquad b_2 = B/2 \qquad b_3 = C/2$$

$$b_{12} = AB/2 \qquad b_{13} = AC/2 \qquad b_{23} = BC/2 \qquad b_{123} = ABC/2$$

• Because all of the contrasts associated with each of the effects are orthogonal, the least squares estimates remain unchanged for any model containing a subset of terms in (6).

4.2.1 A $2^3$ Design Example

An engineer is interested in the effects of cutting speed (A), tool geometry (B), and cutting angle (C) on the life (in hours) of a machine tool. Two levels of each factor are chosen, and three replicates of a $2^3$ design are run. The results are summarized below:

 A    B    C                       Treatment
 x1   x2   x3    Replicates        Sums
 −    −    −     22   31   25      (1) =  78
 +    −    −     32   43   29      a   = 104
 −    +    −     35   34   50      b   = 119
 +    +    −     55   47   46      ab  = 148
 −    −    +     44   45   38      c   = 127
 +    −    +     40   37   36      ac  = 113
 −    +    +     60   50   54      bc  = 164
 +    +    +     39   41   47      abc = 127

Analyze the data (with lack-of-fit tests) assuming the following 4 models:

• (Model 1): An additive model with fixed (categorical) effects.

• (Model 2): A first-order regression model.

• (Model 3): An interaction model with fixed (categorical) effects.

• (Model 4): A regression model with all two-factor crossproduct (interaction) terms.

Note there are 8(n − 1) = 16 df for pure error.

• We will first estimate effects and sums of squares using the formulas, then use SAS to perform the analysis. Recall:

 (1)    a     b     ab    c     ac    bc    abc
 78     104   119   148   127   113   164   127

Model
Fixed Effects −→   I     A    B    C    AB     AC     BC     ABC        Treatment
Regression    −→   Int   x1   x2   x3   x1x2   x1x3   x2x3   x1x2x3     Sums
                   +     −    −    −    +      +      +      −          (1) =  78
                   +     +    −    −    −      −      +      +          a   = 104
                   +     −    +    −    −      +      −      +          b   = 119
                   +     +    +    −    +      −      −      −          ab  = 148
                   +     −    −    +    +      −      −      +          c   = 127
                   +     +    −    +    −      +      −      −          ac  = 113
                   +     −    +    +    −      −      +      −          bc  = 164
                   +     +    +    +    +      +      +      +          abc = 127

• The fixed effects estimates are

$$A = \frac{104 + 148 + 113 + 127 - 78 - 119 - 127 - 164}{(4)(3)} = \frac{4}{12} = 0.3333$$

$$B = \frac{119 + 148 + 164 + 127 - 78 - 104 - 127 - 113}{(4)(3)} = \frac{136}{12} = 11.3333$$

$$C = \frac{127 + 113 + 164 + 127 - 78 - 104 - 119 - 148}{(4)(3)} = \frac{82}{12} = 6.8333$$

$$AB = \frac{78 + 148 + 127 + 127 - 104 - 119 - 113 - 164}{(4)(3)} = \frac{-20}{12} = -1.6667$$

$$AC = \frac{78 + 119 + 113 + 127 - 104 - 148 - 127 - 164}{(4)(3)} = \frac{-106}{12} = -8.8333$$

$$BC = \frac{78 + 104 + 164 + 127 - 119 - 148 - 127 - 113}{(4)(3)} = \frac{-34}{12} = -2.8333$$

$$ABC = \frac{104 + 119 + 127 + 127 - 78 - 148 - 113 - 164}{(4)(3)} = \frac{-26}{12} = -2.1667$$

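• The sums of squares are calculated using $\Gamma_{effect}^2 / 8n$:

$$SS_A = \frac{4^2}{24} = 0.6667 \qquad SS_B = \frac{(136)^2}{24} = 770.6667 \qquad SS_C = \frac{82^2}{24} = 280.1667$$

$$SS_{AB} = \frac{(-20)^2}{24} = 16.6667 \qquad SS_{AC} = \frac{(-106)^2}{24} = 468.1667$$

$$SS_{BC} = \frac{(-34)^2}{24} = 48.1667 \qquad SS_{ABC} = \frac{(-26)^2}{24} = 28.1667$$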
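The seven contrasts, effects, and sums of squares above can be reproduced from the treatment sums. This Python sketch is an added illustration; the helper `sign` is a hypothetical name, and it relies on the labelling convention that a lowercase letter in a treatment label means the corresponding factor is at its high level.

```python
from math import prod

# Treatment sums for the tool-life example (n = 3 replicates each)
totals = {"(1)": 78, "a": 104, "b": 119, "ab": 148,
          "c": 127, "ac": 113, "bc": 164, "abc": 127}
n = 3
effect_names = ["A", "B", "C", "AB", "AC", "BC", "ABC"]

def sign(effect, treatment):
    """+1 or -1: product over the effect's factors of their level in this treatment."""
    return prod(1 if f.lower() in treatment else -1 for f in effect)

# Contrast = signed sum of treatment totals
contrast = {e: sum(sign(e, t) * y for t, y in totals.items()) for e in effect_names}
effect = {e: g / (4 * n) for e, g in contrast.items()}    # contrast / 4n
ss = {e: g ** 2 / (8 * n) for e, g in contrast.items()}   # contrast^2 / 8n

for e in effect_names:
    print(f"{e:>3}  contrast={contrast[e]:>5}  effect={effect[e]:8.4f}  SS={ss[e]:9.4f}")
```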

• Fixed effects additive model (Model 1):

$$y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \epsilon_{ijkl} \qquad (i = \pm 1,\; j = \pm 1,\; k = \pm 1,\; l = 1, 2, 3)$$

• Note the effect estimates in the SAS output match the formula calculations.

• First-order regression model (Model 2): For i = 1, 2, . . . , 24

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \epsilon_i$$

Note that the parameter estimates are 1/2 of those from the fixed effects in Model 1.

• For Models 1 and 2, there are 16 df for pure error and 20 df for total error. Thus, the df for lack-of-fit = 20 − 16 = 4. This means we can add at most 4 additional terms in the model (such as interaction terms).

• There is a significant lack-of-fit (p-value = 0.0111), which suggests that additional terms (such as interaction terms) are needed in the model.

• The residuals in the Residual vs Predicted Value plot for Model 2 are not randomly scattered about 0 for several $(x_1, x_2, x_3)$ combinations. This suggests a lack-of-fit problem.
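The lack-of-fit decomposition for the first-order model can be reproduced by hand: pure error comes from variation within each of the eight cells, and lack of fit is whatever error is left over. The Python sketch below is an added illustration with hypothetical names; it computes the F ratio only, since the standard library has no F distribution for the p-value.

```python
# Replicates for the tool-life example, keyed by treatment label
cells = {
    "(1)": [22, 31, 25], "a": [32, 43, 29], "b": [35, 34, 50], "ab": [55, 47, 46],
    "c": [44, 45, 38], "ac": [40, 37, 36], "bc": [60, 50, 54], "abc": [39, 41, 47],
}
n = 3
all_y = [y for ys in cells.values() for y in ys]
N = len(all_y)
grand_mean = sum(all_y) / N

ss_total = sum((y - grand_mean) ** 2 for y in all_y)

# Pure error: squared deviations of each observation from its own cell mean
ss_pure_error = sum(sum((y - sum(ys) / n) ** 2 for y in ys) for ys in cells.values())
df_pure_error = len(cells) * (n - 1)

def main_effect_ss(letter):
    """SS for a main effect: contrast^2 / 8n, sign +1 when the letter is in the label."""
    gamma = sum((1 if letter in t else -1) * sum(ys) for t, ys in cells.items())
    return gamma ** 2 / (8 * n)

# First-order model: intercept plus the three main effects
ss_model = sum(main_effect_ss(f) for f in "abc")
ss_error = ss_total - ss_model
df_error = N - 4

ss_lof = ss_error - ss_pure_error
df_lof = df_error - df_pure_error
f_lof = (ss_lof / df_lof) / (ss_pure_error / df_pure_error)

print(f"SS lack of fit = {ss_lof:.4f} on {df_lof} df")
print(f"SS pure error  = {ss_pure_error:.4f} on {df_pure_error} df")
print(f"F = {f_lof:.2f}")
```

The lack-of-fit sum of squares equals the sum of the four interaction sums of squares that the first-order model leaves out.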

MODEL 1: ADDITIVE FIXED EFFECTS MODEL

The GLM Procedure

Dependent Variable: Y

                           Sum of
Source             DF      Squares        Mean Square    F Value   Pr > F
Model               3      1051.500000    350.500000     6.72      0.0026
Error              20      1043.833333     52.191667
Corrected Total    23      2095.333333

R-Square    Coeff Var   Root MSE    Y Mean
0.501829    17.69236    7.224380    40.83333

Source   DF   Type III SS    Mean Square    F Value   Pr > F
A         1     0.6666667      0.6666667     0.01     0.9111
B         1   770.6666667    770.6666667    14.77     0.0010
C         1   280.1666667    280.1666667     5.37     0.0312

                             Standard
Parameter    Estimate        Error         t Value   Pr > |t|
A             0.3333333      2.94934079    0.11      0.9111
B            11.3333333      2.94934079    3.84      0.0010
C             6.8333333      2.94934079    2.32      0.0312

MODEL 2: FIRST ORDER REGRESSION MODEL

The REG Procedure

Model: MODEL1
Dependent Variable: Y

Number of Observations Read   24
Number of Observations Used   24

Analysis of Variance
                           Sum of         Mean
Source             DF      Squares        Square        F Value   Pr > F
Model               3      1051.50000     350.50000     6.72      0.0026
Error              20      1043.83333      52.19167
  Lack of Fit       4       561.16667     140.29167     4.65      0.0111
  Pure Error       16       482.66667      30.16667
Corrected Total    23      2095.33333

Root MSE          7.22438    R-Square   0.5018
Dependent Mean   40.83333    Adj R-Sq   0.4271
Coeff Var        17.69236

Parameter Estimates
                   Parameter    Standard                          Variance
Variable     DF    Estimate     Error      t Value   Pr > |t|     Inflation
Intercept     1    40.83333     1.47467    27.69     <.0001       0
X1            1     0.16667     1.47467     0.11     0.9111       1.00000
X2            1     5.66667     1.47467     3.84     0.0010       1.00000
X3            1     3.41667     1.47467     2.32     0.0312       1.00000
MODEL 1: ADDITIVE FIXED EFFECTS MODEL

The GLM Procedure

Level of          ---------------Y---------------
A           N     Mean           Std Dev
-1         12     40.6666667     11.7808267
 1         12     41.0000000      7.1858447

Level of          ---------------Y---------------
B           N     Mean           Std Dev
-1         12     35.1666667      7.46912838
 1         12     46.5000000      8.03967435

Level of          ---------------Y---------------
C           N     Mean           Std Dev
-1         12     37.4166667     10.5093753
 1         12     44.2500000      7.3870279

MODEL 3: INTERACTION FIXED EFFECTS MODEL

The GLM Procedure

The one-factor means for A, B, and C are the same as in the Model 1 output. The two-factor means are:

Level of   Level of         ---------------Y---------------
A          B          N     Mean           Std Dev
-1         -1         6     34.1666667      9.7039511
-1          1         6     47.1666667     10.4769588
 1         -1         6     36.1666667      5.1153364
 1          1         6     45.8333333      5.6005952

Level of   Level of         ---------------Y---------------
A          C          N     Mean           Std Dev
-1         -1         6     32.8333333      9.82683401
-1          1         6     48.5000000      7.84219357
 1         -1         6     42.0000000      9.79795897
 1          1         6     40.0000000      3.89871774

Level of   Level of         ---------------Y---------------
B          C          N     Mean           Std Dev
-1         -1         6     30.3333333      7.25718035
-1          1         6     40.0000000      3.74165739
 1         -1         6     44.5000000      8.36062199
 1          1         6     48.5000000      7.91833316
• Now let's add the three two-factor interactions to get Models 3 and 4.

• Fixed effects interaction model (Model 3):

$$y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + \epsilon_{ijkl}$$

for (i = ±1, j = ±1, k = ±1, l = 1, 2, 3)

• Note the effect estimates match the formula calculations.

• Interaction regression model (Model 4): For i = 1, 2, . . . , 24

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_{12} x_{1i} x_{2i} + \beta_{13} x_{1i} x_{3i} + \beta_{23} x_{2i} x_{3i} + \epsilon_i$$

Note that the parameter estimates are 1/2 of those from the fixed effects in Model 3.

• The residuals are randomly scattered about 0. This suggests there is no lack-of-fit problem. The lack-of-fit test (p-value = 0.3483) supports this.

MODEL 3: INTERACTION FIXED EFFECTS MODEL

The GLM Procedure

Dependent Variable: Y

                           Sum of
Source             DF      Squares        Mean Square    F Value   Pr > F
Model               6      1584.500000    264.083333     8.79      0.0002
Error              17       510.833333     30.049020
Corrected Total    23      2095.333333

R-Square    Coeff Var   Root MSE    Y Mean
0.756204    13.42457    5.481699    40.83333

Source   DF   Type III SS    Mean Square    F Value   Pr > F
A         1     0.6666667      0.6666667     0.02     0.8833
B         1   770.6666667    770.6666667    25.65     <.0001
A*B       1    16.6666667     16.6666667     0.55     0.4666
C         1   280.1666667    280.1666667     9.32     0.0072
A*C       1   468.1666667    468.1666667    15.58     0.0010
B*C       1    48.1666667     48.1666667     1.60     0.2226

                             Standard
Parameter    Estimate        Error         t Value   Pr > |t|
A             0.3333333      2.23789408     0.15     0.8833
B            11.3333333      2.23789408     5.06     <.0001
C             6.8333333      2.23789408     3.05     0.0072
A*B          -1.6666667      2.23789408    -0.74     0.4666
A*C          -8.8333333      2.23789408    -3.95     0.0010
B*C          -2.8333333      2.23789408    -1.27     0.2226

MODEL 4: INTERACTION REGRESSION MODEL

The REG Procedure

Model: MODEL1
Dependent Variable: Y

Number of Observations Read   24
Number of Observations Used   24

Analysis of Variance
                           Sum of         Mean
Source             DF      Squares        Square        F Value   Pr > F
Model               6      1584.50000     264.08333     8.79      0.0002
Error              17       510.83333      30.04902
  Lack of Fit       1        28.16667      28.16667     0.93      0.3483
  Pure Error       16       482.66667      30.16667
Corrected Total    23      2095.33333

Root MSE          5.48170    R-Square   0.7562
Dependent Mean   40.83333    Adj R-Sq   0.6702
Coeff Var        13.42457

Parameter Estimates
                   Parameter    Standard                          Variance
Variable     DF    Estimate     Error      t Value   Pr > |t|     Inflation
Intercept     1    40.83333     1.11895    36.49     <.0001       0
X1            1     0.16667     1.11895     0.15     0.8833       1.00000
X2            1     5.66667     1.11895     5.06     <.0001       1.00000
X3            1     3.41667     1.11895     3.05     0.0072       1.00000
X1X2          1    -0.83333     1.11895    -0.74     0.4666       1.00000
X1X3          1    -4.41667     1.11895    -3.95     0.0010       1.00000
X2X3          1    -1.41667     1.11895    -1.27     0.2226       1.00000
[Figure: MODEL 2: FIRST ORDER REGRESSION MODEL. The REG Procedure, Model: MODEL1, Dependent Variable: Y, Fit Diagnostics for Y. Panels: Residual vs Predicted Value, RStudent vs Predicted Value, RStudent vs Leverage, Residual vs Quantile, Y vs Predicted Value, Cook's D vs Observation, Percent vs Residual, and Fit–Mean and Residual vs Proportion Less. Summary box: Observations 24, Parameters 4, Error DF 20, MSE 52.192, R-Square 0.5018, Adj R-Square 0.4271.]

[Figure: MODEL 4: INTERACTION REGRESSION MODEL. The REG Procedure, Model: MODEL1, Dependent Variable: Y, Fit Diagnostics for Y. Same panel layout as for Model 2. Summary box: Observations 24, Parameters 7.]
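SAS Code for the $2^3$ Design Example

• ESTIMATE statements in SAS are used to calculate average effect estimates.

• Because of orthogonality, all effect estimates have the same standard error. With MSE = 30.1667 (the pure-error mean square, which is the MSE when all seven effects are in the model), it is

$$\sqrt{MSE/2n} = \sqrt{30.1667/6} = 2.24227067$$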
DM 'LOG; CLEAR; OUT; CLEAR;';

ODS LISTING;
ODS PRINTER PDF file='C:\COURSES\ST578\SAS\TWO3.PDF';
OPTIONS PS=54 LS=76 NODATE NONUMBER;

* Read the 24 responses: REP changes fastest, then A, then B, then C;
DATA IN;
  DO C = -1 TO 1 BY 2;
  DO B = -1 TO 1 BY 2;
  DO A = -1 TO 1 BY 2;
  DO REP = 1 TO 3;
    INPUT Y @@;
    * Coded regressors and their two-factor crossproducts;
    X1=A; X2=B; X3=C;
    X1X2 = X1*X2; X1X3 = X1*X3; X2X3 = X2*X3;
    OUTPUT;
  END; END; END; END;
LINES;
22 31 25 32 43 29 35 34 50 55 47 46
44 45 38 40 37 36 60 50 54 39 41 47
;

* Model 1: main effects only, factors treated as categorical;
PROC GLM DATA=IN PLOTS=NONE;
  CLASS A B C;
  MODEL Y = A B C / SS3;
  MEANS A B C;
  ESTIMATE 'A' A -1 1;
  ESTIMATE 'B' B -1 1;
  ESTIMATE 'C' C -1 1;
TITLE 'MODEL 1: ADDITIVE FIXED EFFECTS MODEL';

* Model 2: first-order regression with a lack-of-fit test;
PROC REG DATA=IN PLOTS=(DIAGNOSTICS);
  MODEL Y = X1 X2 X3 / LACKFIT VIF;
TITLE 'MODEL 2: FIRST ORDER REGRESSION MODEL';

* Model 3: main effects plus all two-factor interactions;
PROC GLM DATA=IN PLOTS=NONE;
  CLASS A B C;
  MODEL Y = A|B|C@2 / SS3 ;
  MEANS A|B|C@2;
  ESTIMATE 'A' A -1 1;
  ESTIMATE 'B' B -1 1;
  ESTIMATE 'C' C -1 1;
  ESTIMATE 'A*B' A*B 1 -1 -1 1 / DIVISOR=2;
  ESTIMATE 'A*C' A*C 1 -1 -1 1 / DIVISOR=2;
  ESTIMATE 'B*C' B*C 1 -1 -1 1 / DIVISOR=2;
  * ESTIMATE 'A*B*C' A*B*C -1 1 1 -1 1 -1 -1 1 ;
TITLE 'MODEL 3: INTERACTION FIXED EFFECTS MODEL';

* Model 4: regression with two-factor crossproducts and a lack-of-fit test;
PROC REG DATA=IN PLOTS=(DIAGNOSTICS);
  MODEL Y = X1 X2 X3 X1X2 X1X3 X2X3 / LACKFIT VIF;
TITLE 'MODEL 4: INTERACTION REGRESSION MODEL';
RUN;

4.3 Analyzing Unreplicated Experiments

• To test hypotheses in an unreplicated $2^k$ design (n = 1), it is necessary to "pool" interaction terms (especially higher-order interaction terms), and use the MSE after pooling as an estimate of the random error variance $\sigma^2$.

• The problem is to determine which interaction terms should be pooled together. The following three steps are recommended:

1. Estimate all effects for the full-factorial interaction model.
2. Make a normal probability plot of the estimated effects (excluding the intercept), and label the "outlier" effects. Higher-order interactions which are not outliers can be pooled to form the MSE.
3. Run the ANOVA using this pooled error term.

• Warning: When a higher-order interaction exists, it is inappropriate to pool that interaction with the other interactions because it will inflate the MSE.

• Some comments on the normal probability plot of the $2^k − 1$ estimates for either the fixed effects or regression model:

– If a true effect is zero, then its estimate is randomly and normally distributed about 0. That is, an effect estimate is $N(0, \sigma^2/2^{k-2})$ (the corresponding regression coefficient is $N(0, \sigma^2/2^k)$). When plotted, all of the estimates whose true effects are zero should lie along a straight line on the normal probability plot.
– If a true effect is different from zero, then its estimate is randomly and normally distributed about its mean, which we will call β. That is, the effect estimate is $N(\beta, \sigma^2/2^{k-2})$. Then, in the normal probability plot, the non-zero effects will be plotted away from the line formed by the zero-mean effects.
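The variance of an effect estimate in an unreplicated $2^k$ design follows from the contrast form of the estimate. The short derivation below is an added sketch; it assumes independent observations with common variance $\sigma^2$ and takes the effect divisor to be $2^{k-1}$, which is the n = 1 case of the divisors 2n and 4n used for the $2^2$ and $2^3$ designs.

```latex
\begin{aligned}
\text{effect} &= \frac{\Gamma}{2^{k-1}}, \qquad
\Gamma = \sum_{i=1}^{2^k} c_i\, y_i, \quad c_i = \pm 1, \\[4pt]
\operatorname{Var}(\Gamma) &= \sum_{i=1}^{2^k} c_i^2\, \sigma^2 = 2^k \sigma^2, \\[4pt]
\operatorname{Var}(\text{effect}) &= \frac{2^k \sigma^2}{\left(2^{k-1}\right)^2}
 = \frac{\sigma^2}{2^{k-2}}, \\[4pt]
\operatorname{Var}(b) &= \operatorname{Var}\!\left(\tfrac{1}{2}\,\text{effect}\right)
 = \frac{\sigma^2}{2^{k}} .
\end{aligned}
```

For k = 4 this gives $\sigma^2/4$ for each effect estimate and $\sigma^2/16$ for each regression coefficient. Every effect has the same variance, which is why estimates of null effects fall along a single line in the plot.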

Unreplicated $2^4$ Design Example (from Montgomery text): In a process development study on process yield in pounds, four factors were studied: time, concentration (conc), pressure, and temperature (temp). Each factor had two levels. A single replicate of the $2^4$ design was run as a completely randomized design. The resulting data are shown in the following table:

 time   conc   pressure   temp   yield
  −      −       −         −      12
  +      −       −         −      18
  −      +       −         −      13
  +      +       −         −      16
  −      −       +         −      17
  +      −       +         −      15
  −      +       +         −      20
  +      +       +         −      15
  −      −       −         +      10
  +      −       −         +      25
  −      +       −         +      13
  +      +       −         +      24
  −      −       +         +      19
  +      −       +         +      21
  −      +       +         +      17
  +      +       +         +      23

Analyze the data from this unreplicated experiment from Design and Analysis of Experiments, by D. Montgomery (8th ed., p. 298).

A 2**4 DESIGN -- ESTIMATION OF EFFECTS

The GLM Procedure

Dependent Variable: YIELD

                           Sum of
Source             DF      Squares        Mean Square    F Value   Pr > F
Model              15      291.7500000    19.4500000     .         .
Error               0        0.0000000    .
Corrected Total    15      291.7500000

R-Square    Coeff Var   Root MSE    YIELD Mean
1.000000    .           .           17.37500

                              Type III    Mean
Source                  DF    SS          Square    F Value   Pr > F
TIME                     1    81.00       81.00     .         .
CONC                     1     1.00        1.00     .         .
TIME*CONC                1     2.25        2.25     .         .
PRESSURE                 1    16.00       16.00     .         .
TIME*PRESSURE            1    72.25       72.25     .         .
CONC*PRESSURE            1     0.25        0.25     .         .
TIME*CONC*PRESSURE       1     4.00        4.00     .         .
TEMP                     1    42.25       42.25     .         .
TIME*TEMP                1    64.00       64.00     .         .
CONC*TEMP                1     0.00        0.00     .         .
TIME*CONC*TEMP           1     2.25        2.25     .         .
PRESSURE*TEMP            1     0.00        0.00     .         .
TIME*PRESSURE*TEMP       1     0.25        0.25     .         .
CONC*PRESSURE*TEMP       1     2.25        2.25     .         .
TIME*CONC*PRESS*TEMP     1     4.00        4.00     .         .

                                    Standard
Effect     Parameter    Estimate    Error      t Value   Pr > |t|
A          TIME          4.50       .          .         .
B          CONC          0.50       .          .         .
C          PRESSURE      2.00       .          .         .
D          TEMP          3.25       .          .         .
A*B        TIME*CONC    -0.75       .          .         .
A*C        TIME*PRES    -4.25       .          .         .
A*D        TIME*TEMP     4.00       .          .         .
B*C        CONC*PRES     0.25       .          .         .
B*D        CONC*TEMP     0.00       .          .         .
C*D        PRES*TEMP     0.00       .          .         .
A*B*C      TIME*C*P      1.00       .          .         .
A*B*D      TIME*C*T      0.75       .          .         .
A*C*D      TIME*P*T     -0.25       .          .         .
B*C*D      C*P*TEMP     -0.75       .          .         .
A*B*C*D    T*C*P*T       1.00       .          .         .

Make a normal probability plot (NPP) of the estimates in the Estimate column.
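The 15 effect estimates and their sums of squares can be recomputed from the 16 yields without SAS. The Python sketch below is an added illustration with hypothetical names. It assumes the yields are listed with TIME changing fastest, then CONC, PRESSURE, and TEMP, as in the data table, and uses effect = Γ/8 and SS = Γ²/16 for an unreplicated $2^4$ design.

```python
from itertools import combinations, product
from math import prod

# Yields with TIME changing fastest, then CONC, PRESSURE, TEMP
yields = [12, 18, 13, 16, 17, 15, 20, 15, 10, 25, 13, 24, 19, 21, 17, 23]
factors = ("TIME", "CONC", "PRESSURE", "TEMP")

# Coded runs in the same order as the yields: (TIME, CONC, PRESSURE, TEMP)
runs = [tuple(reversed(r)) for r in product((-1, 1), repeat=4)]

effects = {}
ss = {}
for size in range(1, 5):
    for idx in combinations(range(4), size):
        name = "*".join(factors[i] for i in idx)
        # Contrast: sign of each run is the product of the involved factor levels
        gamma = sum(prod(run[i] for i in idx) * y for run, y in zip(runs, yields))
        effects[name] = gamma / 8        # contrast / 2^(k-1)
        ss[name] = gamma ** 2 / 16       # contrast^2 / 2^k

for name, value in effects.items():
    print(f"{name:<28}{value:7.2f}{ss[name]:9.2f}")
print("Total SS:", sum(ss.values()))
```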

DM 'LOG; CLEAR; OUT; CLEAR;';
ODS LISTING;
* ODS PRINTER PDF file='C:\COURSES\ST578\SAS\TWO4.PDF';
OPTIONS PS=54 LS=78 NODATE NONUMBER;

* Read the 16 yields: TIME changes fastest, then CONC, PRESSURE, TEMP;
DATA IN;
  DO TEMP = -1 TO 1 BY 2;
  DO PRESSURE = -1 TO 1 BY 2;
  DO CONC = -1 TO 1 BY 2;
  DO TIME = -1 TO 1 BY 2;
    INPUT YIELD @@; OUTPUT;
  END; END; END; END;
LINES;
12 18 13 16 17 15 20 15 10 25 13 24 19 21 17 23
;
**********************************************************;
*** PART I: DETERMINE THE ESTIMATES OF THE 15 EFFECTS ***;
**********************************************************;

PROC GLM DATA=IN;
  CLASS TIME CONC PRESSURE TEMP;
  MODEL YIELD = TIME|CONC|PRESSURE|TEMP / SS3;

  * Main effects;
  ESTIMATE 'TIME' TIME -1 1;
  ESTIMATE 'CONC' CONC -1 1;
  ESTIMATE 'PRESSURE' PRESSURE -1 1;
  ESTIMATE 'TEMP' TEMP -1 1;

  * Two-factor interactions;
  ESTIMATE 'TIME*CONC' TIME*CONC 1 -1 -1 1 / DIVISOR=2;
  ESTIMATE 'TIME*PRES' TIME*PRESSURE 1 -1 -1 1 / DIVISOR=2;
  ESTIMATE 'TIME*TEMP' TIME*TEMP 1 -1 -1 1 / DIVISOR=2;
  ESTIMATE 'CONC*PRES' CONC*PRESSURE 1 -1 -1 1 / DIVISOR=2;
  ESTIMATE 'CONC*TEMP' CONC*TEMP 1 -1 -1 1 / DIVISOR=2;
  ESTIMATE 'PRES*TEMP' PRESSURE*TEMP 1 -1 -1 1 / DIVISOR=2;

  * Three-factor interactions;
  ESTIMATE 'TIME*C*P' TIME*CONC*PRESSURE -1 1 1 -1 1 -1 -1 1 / DIVISOR=4;
  ESTIMATE 'TIME*C*T' TIME*CONC*TEMP -1 1 1 -1 1 -1 -1 1 / DIVISOR=4;
  ESTIMATE 'TIME*P*T' TIME*PRESSURE*TEMP -1 1 1 -1 1 -1 -1 1 / DIVISOR=4;
  ESTIMATE 'C*P*TEMP' CONC*PRESSURE*TEMP -1 1 1 -1 1 -1 -1 1 / DIVISOR=4;

  * Four-factor interaction;
  ESTIMATE 'T*C*P*T' TIME*CONC*PRESSURE*TEMP
    1 -1 -1 1 -1 1 1 -1 -1 1 1 -1 1 -1 -1 1 / DIVISOR=8;
TITLE 'A 2**4 DESIGN -- ESTIMATION OF EFFECTS';

**************************************************************************;
*** PART II: MAKE A NORMAL PROBABILITY PLOT OF THE ESTIMATED EFFECTS ***;
**************************************************************************;

* The 15 effect estimates from Part I, in the order they were estimated;
DATA FX; INPUT EFFECTS @@; LINES;
4.5 0.5 2 3.25 -0.75 -4.25 4 0.25 0 0 1 0.75 -0.25 -0.75 1
;
PROC UNIVARIATE DATA=FX PLOTS;
  VAR EFFECTS;
TITLE 'A 2**4 DESIGN -- NORMAL PROBABILITY PLOT OF EFFECTS';
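The coordinates of a normal probability plot can also be computed directly, which makes it easier to label the points. The Python sketch below is an added illustration, not the SAS method: it pairs each ordered effect with a standard normal quantile at plotting position (i − 0.5)/m. That plotting position is an assumption, and PROC UNIVARIATE may use a different one.

```python
from statistics import NormalDist

# Effect estimates from Part I (A = TIME, B = CONC, C = PRESSURE, D = TEMP)
effects = {
    "A": 4.5, "B": 0.5, "C": 2.0, "D": 3.25,
    "AB": -0.75, "AC": -4.25, "AD": 4.0, "BC": 0.25, "BD": 0.0, "CD": 0.0,
    "ABC": 1.0, "ABD": 0.75, "ACD": -0.25, "BCD": -0.75, "ABCD": 1.0,
}

# Sort effects from smallest to largest and attach normal quantiles
ordered = sorted(effects.items(), key=lambda kv: kv[1])
m = len(ordered)
std_normal = NormalDist()  # mean 0, standard deviation 1
npp_points = [
    (name, value, std_normal.inv_cdf((i + 0.5) / m))  # (i + 0.5)/m for i = 0..m-1
    for i, (name, value) in enumerate(ordered)
]

for name, value, quantile in npp_points:
    print(f"{name:<5}{value:7.2f}{quantile:8.3f}")
```

Points that sit far from the line through the middle of this listing are the candidates for real effects. Here the extremes are AC at the low end and A, AD, and D at the high end.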

[Figure: A 2**4 DESIGN -- NORMAL PROBABILITY PLOT OF EFFECTS. The UNIVARIATE Procedure, Distribution and Probability Plot for EFFECTS. Upper panel: distribution of EFFECTS against Count. Lower panel: EFFECTS against Normal Quantiles.]
Analysis I: Pooling high order interactions

• After pooling all 3-factor and 4-factor interactions (four 3-factor terms and one 4-factor term), we have 5 df for the MSE.

• The ANOVA indicates significant A, C, AC, D, and AD effects. These match the highlighted points on the normal probability plot of effects.

******************************************************************;
*** PART III: RUN ANOVA WITH POOLED HIGHER ORDER INTERACTIONS ***;
******************************************************************;

* The @2 keeps main effects and two-factor interactions only, so the
  3-factor and 4-factor interactions are pooled into the error term;
PROC GLM DATA=IN;
  CLASS TIME CONC PRESSURE TEMP;
  MODEL YIELD = TIME|CONC|PRESSURE|TEMP@2 / SS3;
TITLE 'A 2**4 DESIGN -- POOLING HIGHER ORDER INTERACTIONS';
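The pooled error term can be built by hand from the Type III sums of squares in the effect-estimation output. The Python sketch below is an added illustration with hypothetical names (A = TIME, B = CONC, C = PRESSURE, D = TEMP). It computes F ratios only, because the standard library has no F distribution for p-values.

```python
# Single-df sums of squares from the full 2^4 model
ss = {
    "A": 81.0, "B": 1.0, "AB": 2.25, "C": 16.0, "AC": 72.25, "BC": 0.25,
    "D": 42.25, "AD": 64.0, "BD": 0.0, "CD": 0.0,
    "ABC": 4.0, "ABD": 2.25, "ACD": 0.25, "BCD": 2.25, "ABCD": 4.0,
}

# Pool every interaction involving three or more factors into the error term
pooled = [name for name in ss if len(name) >= 3]
ss_error = sum(ss[name] for name in pooled)
df_error = len(pooled)          # one df per pooled effect
mse = ss_error / df_error

# F ratio for each retained effect (each has 1 df, so MS = SS)
f_stats = {name: ss[name] / mse for name in ss if name not in pooled}

print(f"SSE = {ss_error}, df = {df_error}, MSE = {mse}")
for name, f in sorted(f_stats.items(), key=lambda kv: -kv[1]):
    print(f"{name:<3} F = {f:6.2f}")
```

Each F ratio is compared with an F distribution on 1 and 5 degrees of freedom.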

Analysis II: Pooling terms involving factor B = concentration (CONC)

• After pooling all terms involving CONC, we have 8 df for the MSE.

• The ANOVA indicates significant A, C, AC, D, and AD effects. These match the highlighted points on the normal probability plot of effects.

• After factor B is removed, we still retain balance and orthogonality. We now have a $2^3$ design with n = 2 replicates for each combination of factor levels for A, C, and D.

**************************************************************;
*** RUN ANOVA WITH CONCENTRATION REMOVED FROM THE ANALYSIS ***;
**************************************************************;

* Dropping CONC leaves a full 2**3 model in TIME, PRESSURE, TEMP with 2 replicates;
PROC GLM DATA=IN;
  CLASS TIME PRESSURE TEMP;
  MODEL YIELD = TIME|PRESSURE|TEMP / SS3;
TITLE 'ANOVA WITH CONCENTRATION REMOVED FROM THE ANALYSIS';

RUN;
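Dropping CONC projects the 16 runs onto a $2^3$ design in TIME, PRESSURE, and TEMP with two observations per cell, so the error sum of squares is just the within-cell variation. The Python sketch below is an added illustration with hypothetical names. It assumes the yields are listed with TIME changing fastest, then CONC, PRESSURE, and TEMP.

```python
from collections import defaultdict
from itertools import product

# Yields with TIME changing fastest, then CONC, PRESSURE, TEMP
yields = [12, 18, 13, 16, 17, 15, 20, 15, 10, 25, 13, 24, 19, 21, 17, 23]
runs = [tuple(reversed(r)) for r in product((-1, 1), repeat=4)]

# Group observations by (TIME, PRESSURE, TEMP), ignoring CONC
cells = defaultdict(list)
for (time, conc, pressure, temp), y in zip(runs, yields):
    cells[(time, pressure, temp)].append(y)

# Error SS = squared deviations from each cell mean; df = sum of (cell size - 1)
ss_error = sum(
    sum((y - sum(ys) / len(ys)) ** 2 for y in ys) for ys in cells.values()
)
df_error = sum(len(ys) - 1 for ys in cells.values())
mse = ss_error / df_error

print(f"{len(cells)} cells, SSE = {ss_error}, df = {df_error}, MSE = {mse}")
```

The resulting error sum of squares equals the sum of the eight single-df sums of squares for the terms involving CONC in the full model (1 + 2.25 + 0.25 + 4 + 0 + 2.25 + 2.25 + 4 = 16).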
