Course Reference Sheet - Revision 2

This document provides a reference sheet summarizing common statistical formulas for measures of central tendency, dispersion, probability, confidence intervals, and distributions. Key formulas are given for the mean, median, mode, variance, standard deviation, probability, binomial and hypergeometric distributions, normal distribution, confidence intervals for the mean, variance and proportion.

Uploaded by

mattbutler1401

COURSE REFERENCE SHEET – REVISION 2

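Mean ($\bar{x}$): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Median: $position = \frac{n+1}{2}$ (in the ordered data)
Mode: The mode is the most frequently occurring number in the data set.
Midrange: $\frac{x_{smallest} + x_{largest}}{2}$
Lower (first) Quartile (Q1): the $\frac{n+1}{4}$th value (in the ordered data)
Upper (third) Quartile (Q3): the $\frac{3(n+1)}{4}$th value (in the ordered data)
Midhinge: $\frac{Q_1 + Q_3}{2}$
Range: Largest Measurement – Smallest Measurement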
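(Sample) Variance ($s^2$): $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$
(Population) Variance ($\sigma^2$): $\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$
(Sample) Standard Deviation (s): $s = \sqrt{s^2}$
Interquartile Range (Midspread): $IQR = Q_3 - Q_1$
Coefficient of Variation (CV): $CV = \frac{s}{\bar{x}}$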
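
The descriptive measures above can be checked on a small data set. The following is a minimal Python sketch, not part of the original sheet; the sample values are hypothetical, and it assumes that `statistics.quantiles` with `method="exclusive"` is an acceptable stand-in for the (n + 1)/4 position rule (it interpolates linearly when the position is not a whole number).

```python
import math
import statistics

data = sorted([7, 2, 9, 4, 11, 4, 5])  # hypothetical sample, n = 7
n = len(data)

mean = sum(data) / n
median = statistics.median(data)        # value at position (n + 1) / 2
mode = statistics.mode(data)            # most frequent value
midrange = (data[0] + data[-1]) / 2

# "exclusive" places quartile i at position i * (n + 1) / 4
q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
midhinge = (q1 + q3) / 2
iqr = q3 - q1
data_range = data[-1] - data[0]

sample_var = sum((x - mean) ** 2 for x in data) / (n - 1)
sample_sd = math.sqrt(sample_var)
cv = sample_sd / mean

print(mean, median, mode, midrange, q1, q3, midhinge, iqr, data_range)
print(sample_var, sample_sd, cv)
```

With n = 7 the quartile positions are whole numbers (2 and 6), so Q1 and Q3 are simply the 2nd and 6th ordered values.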
Probability Basic Formulas
Permutation
$P_r^n = \frac{n!}{(n-r)!}$

Combination
$C_r^n = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}$
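
As a quick check of the two counting formulas, here is a short Python sketch using the standard library's `math.perm` and `math.comb` (Python 3.8 or later). The values n = 5 and r = 2 are arbitrary illustration values.

```python
from math import comb, factorial, perm

n, r = 5, 2  # hypothetical values

permutations = perm(n, r)   # n! / (n - r)!
combinations = comb(n, r)   # n! / (r! (n - r)!)

# The same results straight from the factorial definitions
permutations_by_formula = factorial(n) // factorial(n - r)
combinations_by_formula = factorial(n) // (factorial(r) * factorial(n - r))

print(permutations, combinations)
```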

Union Probability of two events (not mutually exclusive)
$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

Union Probability of two mutually exclusive events
$P(A \text{ or } B) = P(A) + P(B)$

Conditional Probability of event B given event A has occurred
$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$

Intersection (Joint) Probability of two events
$P(A \text{ and } B) = P(A) \times P(B \mid A)$

Intersection (Joint) Probability of two independent events
$P(A \text{ and } B) = P(A) \times P(B)$
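
The probability rules above can be verified by direct counting. This is an illustrative Python sketch only: the single fair die and the events A, B and C are hypothetical choices, not taken from the sheet.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed example)

def prob(event):
    """Probability of an event as favourable outcomes / total outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {2, 4, 6}   # even number
B = {4, 5, 6}   # greater than 3
C = {1, 2}      # one or two

p_union = prob(A) + prob(B) - prob(A & B)   # union rule
p_b_given_a = prob(A & B) / prob(A)         # conditional probability
p_joint = prob(A) * p_b_given_a             # general multiplication rule

# A and C are independent: the joint probability equals the product
independent_ac = prob(A & C) == prob(A) * prob(C)
# A and B are not independent
independent_ab = prob(A & B) == prob(A) * prob(B)

print(p_union, p_b_given_a, p_joint, independent_ac, independent_ab)
```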
For a discrete random variable
Expectation of a discrete random variable: $E(X) = \mu_X = \sum_{i=1}^{n} x_i\, p(x_i)$
Variance of a discrete random variable: $V(X) = \sigma_X^2 = \sum_{i=1}^{n} (x_i - \mu_X)^2\, p(x_i) = \left(\sum_{i=1}^{n} x_i^2\, p(x_i)\right) - \mu_X^2$
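
To see that the definition and the shortcut form of the variance agree, here is a small Python sketch. The fair six-sided die is an assumed example, and exact fractions are used so the two forms can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical probability mass function: a fair six-sided die
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in pmf.items())                          # E(X)
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())  # sum (x - mu)^2 p(x)
var_shortcut = sum(x**2 * p for x, p in pmf.items()) - mu**2     # sum x^2 p(x) - mu^2

print(mu, var_definition, var_shortcut)
```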

For a discrete probability distribution

Binomial
Probability Mass Function: $P(X = x \mid n, p) = \binom{n}{x} p^x (1-p)^{n-x}$ for $x = 0, 1, \ldots, n$
Mean: $np$
Variance: $np(1-p)$
Where: n = # of trials, p = probability of success, x = # of successes

Hypergeometric
Probability Mass Function: $P(X = x \mid n, N, A) = \frac{\binom{A}{x}\binom{N-A}{n-x}}{\binom{N}{n}}$ for $x = 0, 1, \ldots, A$, and $\binom{b}{a} = 0$ if $a > b$
Mean: $\frac{nA}{N}$
Variance: $\frac{nA(N-A)}{N^2} \cdot \frac{N-n}{N-1}$
Where: N = population size, A = total successes available, n = sample size, x = # of successes
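
The two probability mass functions can be sketched in Python as below. This is an illustration under assumed parameter values (n = 10, p = 0.3 for the binomial; n = 5, N = 20, A = 8 for the hypergeometric), using `math.comb`, which returns 0 when the lower index exceeds the upper one.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x | n, p) for x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def hypergeometric_pmf(x, n, N, A):
    """P(X = x | n, N, A); comb(b, a) is 0 when a > b."""
    return Fraction(comb(A, x) * comb(N - A, n - x), comb(N, n))

# Binomial: the mean should equal n * p
n, p = 10, 0.3
binom_total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
binom_mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))

# Hypergeometric: exact mean and variance from the pmf
n_h, N, A = 5, 20, 8
xs = range(0, min(n_h, A) + 1)
hyper_total = sum(hypergeometric_pmf(x, n_h, N, A) for x in xs)
hyper_mean = sum(x * hypergeometric_pmf(x, n_h, N, A) for x in xs)
hyper_var = sum((x - hyper_mean) ** 2 * hypergeometric_pmf(x, n_h, N, A) for x in xs)

print(binom_mean, hyper_mean, hyper_var)
```

The last factor in the hypergeometric variance, (N − n)/(N − 1), is what distinguishes sampling without replacement from the binomial case.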


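Continuous Random Variables
$P(a \le X \le b) = \int_a^b f(x)\,dx$
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$
$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
$\sigma^2 = V(X) = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = \left(\int_{-\infty}^{\infty} x^2 f(x)\,dx\right) - \mu^2$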
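
These integrals can be approximated numerically. The sketch below is illustrative only: the density f(x) = 2x on [0, 1] is a hypothetical example, and `integrate` is a simple midpoint rule written for this illustration rather than a library routine.

```python
def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

# The density is zero outside [0, 1], so the infinite limits reduce to 0 and 1
prob = integrate(f, 0.2, 0.5)                                    # P(0.2 <= X <= 0.5)
mu = integrate(lambda x: x * f(x), 0, 1)                         # E(X)
var_definition = integrate(lambda x: (x - mu) ** 2 * f(x), 0, 1)
var_shortcut = integrate(lambda x: x**2 * f(x), 0, 1) - mu**2

print(prob, mu, var_definition, var_shortcut)
```

For this density the exact values are P = 0.21, μ = 2/3 and σ² = 1/18, which the approximation should match closely.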
Confidence Intervals for Single Sample

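CI on the Mean of a Normal Distribution, Standard Deviation Known or Large n (note 1)

2-Sided: $\bar{X} - Z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \le \mu \le \bar{X} + Z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
1-Sided Lower: $\bar{X} - Z_{\alpha}\left(\frac{\sigma}{\sqrt{n}}\right) \le \mu$
1-Sided Upper: $\mu \le \bar{X} + Z_{\alpha}\left(\frac{\sigma}{\sqrt{n}}\right)$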
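
A minimal Python sketch of the two-sided interval follows. The function name `z_interval` and the input values are hypothetical; the critical value comes from the standard library's `statistics.NormalDist`, with $Z_{\alpha/2}$ taken as the upper-tail point.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, alpha=0.05):
    """Two-sided CI on the mean when sigma is known (or n is large)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper-tail critical value Z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical inputs: sample mean 50, known sigma 4, n = 25, 95% confidence
lower, upper = z_interval(xbar=50.0, sigma=4.0, n=25)
print(round(lower, 3), round(upper, 3))
```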

Choice of Sample Size for CI on the Mean of a Normal Distribution, Std Dev Known or Large n
$n = \left(\frac{Z_{\alpha/2}\,\sigma}{E}\right)^2$
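
The sample-size formula can be sketched as follows. This assumes E denotes the desired margin of error (the half-width of the interval), which the sheet does not define explicitly. Rounding up to the next whole observation is a common convention, not something the sheet states.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, margin, alpha=0.05):
    """Smallest n whose two-sided z-interval half-width is at most `margin`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / margin) ** 2)

# Hypothetical inputs: sigma = 4, margin of error E = 1, 95% confidence
n_required = sample_size(sigma=4.0, margin=1.0)
print(n_required)
```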

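CI on the Mean of a Normal Distribution, Standard Deviation Unknown and/or n < 25

2-Sided: $\bar{X} - t_{\alpha/2,\,n-1}\left(\frac{s}{\sqrt{n}}\right) \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\left(\frac{s}{\sqrt{n}}\right)$
1-Sided Lower: $\bar{X} - t_{\alpha,\,n-1}\left(\frac{s}{\sqrt{n}}\right) \le \mu$
1-Sided Upper: $\mu \le \bar{X} + t_{\alpha,\,n-1}\left(\frac{s}{\sqrt{n}}\right)$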
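
A short Python sketch of the two-sided t-interval is given below. Python's standard library has no t-distribution quantile function, so the critical value is passed in as a parameter and is assumed to come from a t-table; the data values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Two-sided CI on the mean; t_crit is t_{alpha/2, n-1} from a table."""
    n = len(data)
    xbar, s = mean(data), stdev(data)   # stdev uses the n - 1 divisor
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

sample = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 10.3, 9.7, 10.0, 9.6]  # hypothetical
T_CRIT = 2.262  # table value t_{0.025, 9} for a 95% interval with n = 10
lower, upper = t_interval(sample, T_CRIT)
print(lower, upper)
```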

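Prediction Interval Estimate for a Future Individual Value ($X_f$)

2-Sided: $\bar{X} - (s \times t_{\alpha/2,\,n-1})\sqrt{1 + \frac{1}{n}} \le X_f \le \bar{X} + (s \times t_{\alpha/2,\,n-1})\sqrt{1 + \frac{1}{n}}$
1-Sided Lower: $\bar{X} - (s \times t_{\alpha,\,n-1})\sqrt{1 + \frac{1}{n}} \le X_f$
1-Sided Upper: $X_f \le \bar{X} + (s \times t_{\alpha,\,n-1})\sqrt{1 + \frac{1}{n}}$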

Tolerance Interval: Includes at least a certain proportion of measurements with a stated confidence

2-Sided: $\bar{X} \pm K_2 s$
1-Sided Lower: $\bar{X} - K_1 s$
1-Sided Upper: $\bar{X} + K_1 s$

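CI on the Variance of a Normal Distribution (note 2)

2-Sided: $\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
1-Sided Lower: $\frac{(n-1)s^2}{\chi^2_{\alpha,\,n-1}} \le \sigma^2$
1-Sided Upper: $\sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha,\,n-1}}$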
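
The variance interval can be sketched in Python as below. The chi-square critical values are assumed to be read from a table with upper-tail subscripts, since the standard library provides no chi-square quantile function; the sample variance and sample size are hypothetical.

```python
def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Two-sided CI on sigma^2.

    chi2_upper is chi^2_{alpha/2, n-1} and chi2_lower is chi^2_{1-alpha/2, n-1},
    both indexed by upper-tail probability.
    """
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# Table values for 9 degrees of freedom and a 95% interval
CHI2_0025_9 = 19.023   # upper-tail probability 0.025
CHI2_0975_9 = 2.700    # upper-tail probability 0.975

s2, n = 0.25, 10       # hypothetical sample variance and sample size
lower, upper = variance_interval(s2, n, CHI2_0025_9, CHI2_0975_9)
print(lower, upper)
```

Note that the larger critical value produces the lower limit, because it appears in the denominator.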

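CI Estimate for the Proportion (note 3)

2-Sided: $p - Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le \pi \le p + Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$
1-Sided Lower: $p - Z_{\alpha}\sqrt{\frac{p(1-p)}{n}} \le \pi$
1-Sided Upper: $\pi \le p + Z_{\alpha}\sqrt{\frac{p(1-p)}{n}}$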
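
Here is a minimal Python sketch of the two-sided proportion interval, including the np ≥ 10 and n(1 − p) ≥ 10 check from note 3. The function name and the counts (40 successes in 100 trials) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, alpha=0.05):
    """Two-sided normal-approximation CI for a population proportion."""
    p = successes / n
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("need np >= 10 and n(1 - p) >= 10 for this interval")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

lower, upper = proportion_interval(successes=40, n=100)
print(round(lower, 4), round(upper, 4))
```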

1. For large samples (25 or more), this interval can be used by substituting the sample standard deviation (s) for the known standard deviation.
2. Verify the assumption of a normal distribution using a probability plot and a statistical test.
3. Assumes p is not very close to 0 or 1 AND np≥10 AND n(1-p) ≥10
Hypothesis Test Reference Sheet
1-Sample Tests
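Mean of a Normal Distribution, Variance Known or Large n (note 1)
Test Statistic: $Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Null | Alternative | Rejection Criteria
$H_0: \mu = \mu_0$ | $H_1: \mu \ne \mu_0$ | $Z_0 > Z_{\alpha/2}$ or $Z_0 < -Z_{\alpha/2}$
$H_0: \mu \le \mu_0$ | $H_1: \mu > \mu_0$ | $Z_0 > Z_{\alpha}$
$H_0: \mu \ge \mu_0$ | $H_1: \mu < \mu_0$ | $Z_0 < -Z_{\alpha}$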
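
The three rejection rules for the one-sample Z-test can be sketched as a single Python function. The function name, the `alternative` labels, and the inputs are illustrative assumptions, not part of the sheet.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Return (Z0, reject H0?) for a one-sample Z-test with known sigma."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "two-sided":      # H1: mu != mu0
        reject = abs(z0) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":      # H1: mu > mu0
        reject = z0 > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":         # H1: mu < mu0
        reject = z0 < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z0, reject

# Hypothetical inputs: sample mean 52, hypothesised mean 50, sigma 4, n = 25
z0, reject = z_test(xbar=52.0, mu0=50.0, sigma=4.0, n=25)
print(z0, reject)
```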
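Mean of a Normal Distribution, Variance Unknown (note 2)
Test Statistic: $t_0 = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Null | Alternative | Rejection Criteria
$H_0: \mu = \mu_0$ | $H_1: \mu \ne \mu_0$ | $t_0 > t_{\alpha/2,\,n-1}$ or $t_0 < -t_{\alpha/2,\,n-1}$
$H_0: \mu \le \mu_0$ | $H_1: \mu > \mu_0$ | $t_0 > t_{\alpha,\,n-1}$
$H_0: \mu \ge \mu_0$ | $H_1: \mu < \mu_0$ | $t_0 < -t_{\alpha,\,n-1}$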
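Variance of a Normal Distribution (notes 2, *)
Test Statistic: $\chi_0^2 = \frac{(n-1)s^2}{\sigma_0^2}$
Null | Alternative | Rejection Criteria
$H_0: \sigma^2 = \sigma_0^2$ | $H_1: \sigma^2 \ne \sigma_0^2$ | $\chi_0^2 > \chi^2_{\alpha/2,\,n-1}$ or $\chi_0^2 < \chi^2_{1-\alpha/2,\,n-1}$
$H_0: \sigma^2 \le \sigma_0^2$ | $H_1: \sigma^2 > \sigma_0^2$ | $\chi_0^2 > \chi^2_{\alpha,\,n-1}$
$H_0: \sigma^2 \ge \sigma_0^2$ | $H_1: \sigma^2 < \sigma_0^2$ | $\chi_0^2 < \chi^2_{1-\alpha,\,n-1}$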

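Approximate Value of a Binomial Proportion (note 3)
Test Statistic: $Z_0 = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
Sample Proportion: $\hat{p} = \frac{X}{n}$
Null | Alternative | Rejection Criteria
$H_0: p = p_0$ | $H_1: p \ne p_0$ | $Z_0 > Z_{\alpha/2}$ or $Z_0 < -Z_{\alpha/2}$
$H_0: p \le p_0$ | $H_1: p > p_0$ | $Z_0 > Z_{\alpha}$
$H_0: p \ge p_0$ | $H_1: p < p_0$ | $Z_0 < -Z_{\alpha}$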
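
The proportion test statistic can be sketched in Python as follows. The counts (58 successes in 100 trials) and the hypothesised value p0 = 0.5 are hypothetical, and only the upper-tail alternative is shown.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(x, n, p0):
    """Z0 for H0: p = p0, using the standard error under the null."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Hypothetical data: X = 58 successes in n = 100 trials, p0 = 0.5
z0 = proportion_z(x=58, n=100, p0=0.5)

alpha = 0.05
z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value Z_alpha
reject_upper = z0 > z_alpha                 # H1: p > p0
print(z0, z_alpha, reject_upper)
```

Here Z0 is about 1.6, just below the critical value of about 1.645, so H0 is not rejected at α = 0.05.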
2-Sample Tests
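Difference in Means of two Normal Distributions, Variance Known (note 1)
Test Statistic: $Z_0 = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Null | Alternative | Rejection Criteria
$H_0: \mu_1 - \mu_2 = D_0$ | $H_1: \mu_1 - \mu_2 \ne D_0$ | $Z_0 > Z_{\alpha/2}$ or $Z_0 < -Z_{\alpha/2}$
$H_0: \mu_1 - \mu_2 \le D_0$ | $H_1: \mu_1 - \mu_2 > D_0$ | $Z_0 > Z_{\alpha}$
$H_0: \mu_1 - \mu_2 \ge D_0$ | $H_1: \mu_1 - \mu_2 < D_0$ | $Z_0 < -Z_{\alpha}$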
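Difference in Means of two Normal Distributions, Variance Unknown (note 1)

Var. Assumed Equal
Test Statistic: $t_0 = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
Pooled SD: $S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}}$
Null | Alternative | Rejection Criteria
$H_0: \mu_1 - \mu_2 = D_0$ | $H_1: \mu_1 - \mu_2 \ne D_0$ | $t_0 > t_{\alpha/2,\,(n_1+n_2-2)}$ or $t_0 < -t_{\alpha/2,\,(n_1+n_2-2)}$
$H_0: \mu_1 - \mu_2 \le D_0$ | $H_1: \mu_1 - \mu_2 > D_0$ | $t_0 > t_{\alpha,\,(n_1+n_2-2)}$
$H_0: \mu_1 - \mu_2 \ge D_0$ | $H_1: \mu_1 - \mu_2 < D_0$ | $t_0 < -t_{\alpha,\,(n_1+n_2-2)}$

Var. Assumed Unequal
Test Statistic: $t_0 = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Degrees of Freedom: $v = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\left[\left(\frac{s_1^2}{n_1}\right)^2 / (n_1 - 1)\right] + \left[\left(\frac{s_2^2}{n_2}\right)^2 / (n_2 - 1)\right]}$
Null | Alternative | Rejection Criteria
$H_0: \mu_1 - \mu_2 = D_0$ | $H_1: \mu_1 - \mu_2 \ne D_0$ | $t_0 > t_{\alpha/2,\,v}$ or $t_0 < -t_{\alpha/2,\,v}$
$H_0: \mu_1 - \mu_2 \le D_0$ | $H_1: \mu_1 - \mu_2 > D_0$ | $t_0 > t_{\alpha,\,v}$
$H_0: \mu_1 - \mu_2 \ge D_0$ | $H_1: \mu_1 - \mu_2 < D_0$ | $t_0 < -t_{\alpha,\,v}$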
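
Both two-sample t statistics can be computed from summary statistics alone. The Python sketch below is an illustration with hypothetical function names and inputs; comparing the result against a t critical value is left to a table lookup.

```python
from math import sqrt

def pooled_t(xbar1, s1_sq, n1, xbar2, s2_sq, n2, d0=0.0):
    """Equal-variance case: returns (t0, pooled SD, degrees of freedom)."""
    sp = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / ((n1 - 1) + (n2 - 1)))
    t0 = ((xbar1 - xbar2) - d0) / (sp * sqrt(1 / n1 + 1 / n2))
    return t0, sp, n1 + n2 - 2

def unequal_var_t(xbar1, s1_sq, n1, xbar2, s2_sq, n2, d0=0.0):
    """Unequal-variance case: returns (t0, degrees of freedom v)."""
    a, b = s1_sq / n1, s2_sq / n2
    t0 = ((xbar1 - xbar2) - d0) / sqrt(a + b)
    v = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t0, v

# Hypothetical summary statistics (means, sample variances, sample sizes)
stats = dict(xbar1=20.0, s1_sq=4.0, n1=10, xbar2=18.0, s2_sq=9.0, n2=12)
t_pooled, sp, df_pooled = pooled_t(**stats)
t_unequal, v = unequal_var_t(**stats)
print(t_pooled, sp, df_pooled)
print(t_unequal, v)
```

The computed v is generally not a whole number; rounding it down before looking up the critical value is a common conservative choice, though the sheet does not specify one.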

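Difference in Means of two Normal Distributions, Paired Samples (notes 2, 4)
Test Statistic: $t_0 = \frac{\bar{D} - D_0}{s_D/\sqrt{n}}$
Null | Alternative | Rejection Criteria
$H_0: \mu_D = D_0$ | $H_1: \mu_D \ne D_0$ | $t_0 > t_{\alpha/2,\,n-1}$ or $t_0 < -t_{\alpha/2,\,n-1}$
$H_0: \mu_D \le D_0$ | $H_1: \mu_D > D_0$ | $t_0 > t_{\alpha,\,n-1}$
$H_0: \mu_D \ge D_0$ | $H_1: \mu_D < D_0$ | $t_0 < -t_{\alpha,\,n-1}$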
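
For paired samples the test reduces to a one-sample t-test on the differences. The following Python sketch uses hypothetical before/after measurements and a table critical value.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second, d0=0.0):
    """t0 for paired samples, computed on the pairwise differences."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return (mean(diffs) - d0) / (stdev(diffs) / sqrt(n))

before = [12, 15, 11, 14, 13]   # hypothetical paired measurements
after = [10, 14, 10, 11, 12]

t0 = paired_t(before, after)
T_CRIT = 2.776                  # table value t_{0.025, 4} for n = 5 pairs
reject = abs(t0) > T_CRIT       # two-sided test at alpha = 0.05
print(t0, reject)
```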
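Equality of Variances for two Normal Distributions (notes 2, *)
Test Statistic: $F_0 = \frac{S_1^2}{S_2^2}$
Null | Alternative | Rejection Criteria
$H_0: \sigma_1^2 = \sigma_2^2$ | $H_1: \sigma_1^2 \ne \sigma_2^2$ | $f_0 > f_{\alpha/2,\,(n_1-1),\,(n_2-1)}$ or $f_0 < f_{(1-\alpha/2),\,(n_1-1),\,(n_2-1)} = 1/f_{\alpha/2,\,(n_2-1),\,(n_1-1)}$
$H_0: \sigma_1^2 \le \sigma_2^2$ | $H_1: \sigma_1^2 > \sigma_2^2$ | $f_0 > f_{\alpha,\,(n_1-1),\,(n_2-1)}$
$H_0: \sigma_1^2 \ge \sigma_2^2$ | $H_1: \sigma_1^2 < \sigma_2^2$ | $f_0 < f_{(1-\alpha),\,(n_1-1),\,(n_2-1)}$ or, equivalently, $f_0 < 1/f_{\alpha,\,(n_2-1),\,(n_1-1)}$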
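Difference of Population Proportions (note 3)
Test Statistic: $Z_0 = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Common Parameter (Proportion): $\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$
Null | Alternative | Rejection Criteria
$H_0: p_1 - p_2 = D_0$ | $H_1: p_1 - p_2 \ne D_0$ | $Z_0 > Z_{\alpha/2}$ or $Z_0 < -Z_{\alpha/2}$
$H_0: p_1 - p_2 \le D_0$ | $H_1: p_1 - p_2 > D_0$ | $Z_0 > Z_{\alpha}$
$H_0: p_1 - p_2 \ge D_0$ | $H_1: p_1 - p_2 < D_0$ | $Z_0 < -Z_{\alpha}$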
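
A minimal Python sketch of the two-proportion statistic is shown below. It assumes the common (pooled) proportion combines the success counts of both samples, $X_1$ and $X_2$; the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2, d0=0.0):
    """Return (Z0, pooled proportion) for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)           # common parameter p-hat
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return ((p1 - p2) - d0) / se, pooled

# Hypothetical counts: 45 of 100 in sample 1, 30 of 100 in sample 2
z0, pooled = two_proportion_z(x1=45, n1=100, x2=30, n2=100)

alpha = 0.05
reject = abs(z0) > NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
print(z0, pooled, reject)
```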

* All values of α assume the upper tail probability of the stated distribution.
1. For large samples (25 or more), this test can be used by substituting the sample standard deviation (s) for the known standard deviation.
2. Verify the assumption of a normal distribution using a probability plot and a statistical test.
3. Assumes p is not very close to 0 or 1 AND np≥10 AND n(1-p) ≥10
4. Can substitute a Z-test when sample sizes are large (25 or more).
