IEM 4103 Quality Control & Reliability Analysis
IEM 5103 Breakthrough Quality & Reliability
Modeling Process Quality
Instructor: Dr. Chenang Liu
Email: Chenang.Liu@okstate.edu
Outline
The DMAIC Process
Describing Variation
Histogram
Numerical Summary of Data
Probability Distributions
Important Probability Distributions
Discrete Distribution
Continuous Distribution
Some Useful Approximations
Textbook: Chapters 2 and 3
1/20/2021
The DMAIC Process
DMAIC is a structured problem-solving procedure widely used in quality and
process improvement (a very general procedure).
It is often associated with Six Sigma activities, but not necessarily. Almost all
implementations of Six Sigma use the DMAIC process for project management and completion.
The letters DMAIC form an acronym for the five steps:
Define: identify the project opportunity.
Measure: evaluate and understand the current state of the process.
Analyze: determine the cause-and-effect relationships and understand the variability.
Improve: think creatively about the specific changes that will have the desired impact.
Control: ensure that the gains are sustained in the process.
Control charts are an important statistical tool used in the Control step of DMAIC.
The DMAIC Process (cont’d)
2.1 Describing Variation
Method 1: Statistical Plot - Histogram
Method 2: Numerical Summary
Method 3: Probability Distribution
Population & Sample
A population is the entire collection of units/individuals/outcomes in which we
are interested. It is usually very large (and sometimes infinite), so to find out
what is going on in the population we observe a sample - a representative
subset.
The key word here is representative. A sample should be 'the population in
miniature'. Then by examining a sample we can draw conclusions about the
population. Such conclusions, however, CANNOT be made with 100% certainty
and are stated in terms of probabilities.
[Figure: Population to Sample (probability); Sample to Population (statistical inference)]
2.1.1 Histogram
Example
Measured variable: thickness of metal layer on silicon wafer
Samples: 100 observations
To construct a histogram
Find the range of the data
Min: 413, Max: 487
Divide the range of the data into equal-width intervals (bins)
In practice, number of bins $\approx \sqrt{N} = \sqrt{100} = 10$
Width of bins $= \frac{487 - 413}{10} = 7.4$, so use width $= 8$
Determine the lower and upper bounds of each bin
Count the frequency of each bin
Bin 1: [410, 417], frequency 1
Bin 2: [418, 425], frequency 3
…
Bin 10: [482, 489], frequency 1
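The bin arithmetic for the wafer example can be sketched in a few lines. This is only an illustration: the variable names are made up here, and rounding the width up to the next integer is one convenient choice rather than a fixed rule.

```python
import math

n_obs = 100              # number of observations in the wafer example
x_min, x_max = 413, 487  # smallest and largest thickness readings

# Square-root rule: the number of bins is roughly sqrt(N).
n_bins = round(math.sqrt(n_obs))

# Raw bin width, then rounded up to a convenient integer width.
raw_width = (x_max - x_min) / n_bins
bin_width = math.ceil(raw_width)

print(n_bins, raw_width, bin_width)
```

With a width of 8 and ten bins, starting the first bin at 410 covers the whole range from 413 to 487.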
Interpretation based on the Histogram
Histogram with different bins
8 bins (top) and 15 bins (bottom)
Visual display of three properties of sample data
Shape of distribution
Roughly symmetric and unimodal
Central tendency
Data tend to cluster near 450
Scatter or variability
Variability is relatively high (min=413; max=487)
2.1.2 Numerical Summary of Data
Central Tendency: sample average/mean
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Scatter/variability: sample variance or sample standard deviation
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \hat{\sigma}^2$$
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Note: these measures do not reflect the magnitude of the data, only the scatter
about the average.
Median: a value such that at least 50% of the data values are at or below this value
and at least 50% of the data values are at or above this value.
Example
Calculate the sample mean, median, variance, and standard deviation of a
sample of observations: $x_1 = 1$, $x_2 = 3$, $x_3 = 5$
If $x_3$ is 500 instead of 5, what are the sample mean and median of the sample?
If $x_1 = 101$, $x_2 = 103$, $x_3 = 105$, is the sample variance different from that of the first
sample?
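One way to check the three parts of this example is with Python's standard `statistics` module. The helper name `summarize` is hypothetical and used only for this sketch.

```python
import statistics

def summarize(sample):
    """Return (mean, median, sample variance, sample standard deviation)."""
    return (
        statistics.mean(sample),
        statistics.median(sample),
        statistics.variance(sample),  # divides by n - 1
        statistics.stdev(sample),     # square root of the sample variance
    )

original = summarize([1, 3, 5])
outlier = summarize([1, 3, 500])
shifted = summarize([101, 103, 105])

print(original)
print(outlier[:2])
print(shifted[2] == original[2])
```

The first sample has mean 3, median 3, variance 4, and standard deviation 2. Replacing 5 with 500 pulls the mean to 168 while the median stays at 3. Shifting every value by 100 leaves the variance at 4.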
Numerical Summary of Data - More
Skewness
If the distribution of the data is not symmetrical it is called asymmetrical or skewed
Skewness characterizes the degree of asymmetry of a distribution around its mean
$$\text{skewness} = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3$$
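A minimal sketch of the skewness formula, assuming the sample has at least three observations and a nonzero standard deviation. The function name and the two test samples are illustrative only.

```python
import statistics

def sample_skewness(sample):
    """Skewness as n / ((n-1)(n-2)) * sum(((x_i - x_bar) / s)^3)."""
    n = len(sample)
    if n < 3:
        raise ValueError("need at least three observations")
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return n / ((n - 1) * (n - 2)) * sum(((x - x_bar) / s) ** 3 for x in sample)

symmetric = sample_skewness([1, 3, 5])     # symmetric about its mean
right_tail = sample_skewness([1, 3, 500])  # one large value on the right

print(symmetric, right_tail)
```

A symmetric sample gives a skewness of zero, and a long right tail gives a positive value.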
Examples
Numerical Summary of Data - More
Kurtosis
Kurtosis characterizes the relative peakedness or flatness of a distribution
compared with the bell-shaped distribution (normal distribution)
Examples
2.1.3 Probability Distributions
Definition: a probability distribution is a mathematical model that relates the value
of the variable with the probability of occurrence of that value in the population.
Two types of distributions:
Continuous: if the variable being measured is expressed on a continuous scale
Discrete: if the parameter being measured can only take on certain values, e.g. 1,2,3…
Examples:
Discrete case (left)
Continuous case (right)
Review of probability distribution calculation
Probability
Continuous distribution: $P\{a \le x \le b\} = \int_{a}^{b} f(x)\,dx$
Discrete distribution: $P\{x = x_i\} = p(x_i)$
Population mean/expectation
Continuous distribution: $E(x) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$
Discrete distribution: $E(x) = \mu = \sum_{i=1}^{\infty} x_i\, p(x_i)$
Population variance
Continuous distribution: $\mathrm{Var}(x) = \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
Discrete distribution: $\mathrm{Var}(x) = \sigma^2 = \sum_{i=1}^{\infty} (x_i - \mu)^2\, p(x_i)$
Sample mean
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$
Sample variance
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \hat{\sigma}^2$
Useful results on mean and variance
If $x$ is a random variable and $a$ is a constant, then
$E(x + a) = E(x) + a$
$E(ax) = aE(x)$
$\mathrm{Var}(x + a) = \mathrm{Var}(x)$
$\mathrm{Var}(ax) = a^2\,\mathrm{Var}(x)$
If $x_1, x_2, \cdots, x_n$ are random variables, then
$E(x_1 + x_2 + \cdots + x_n) = E(x_1) + E(x_2) + \cdots + E(x_n)$
If they are mutually independent, and $a_1, a_2, \cdots, a_n$ are constants, then
$\mathrm{Var}(a_1 x_1 + a_2 x_2 + \cdots + a_n x_n) = a_1^2\,\mathrm{Var}(x_1) + a_2^2\,\mathrm{Var}(x_2) + \cdots + a_n^2\,\mathrm{Var}(x_n)$
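The single-variable rules can be checked exactly on a small discrete distribution. The sketch below uses a fair six-sided die and the constant $a = 3$, both chosen here purely for illustration, with exact fractions so that no rounding is involved.

```python
from fractions import Fraction

# Hypothetical example distribution: a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(x)] for a discrete distribution given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf, g=lambda x: x):
    """Var[g(x)] = E[(g(x) - E[g(x)])^2]."""
    mu = expectation(pmf, g)
    return sum((g(x) - mu) ** 2 * p for x, p in pmf.items())

a = 3
mean_x, var_x = expectation(pmf), variance(pmf)

shift_mean = expectation(pmf, lambda x: x + a)  # E(x + a)
scale_mean = expectation(pmf, lambda x: a * x)  # E(ax)
shift_var = variance(pmf, lambda x: x + a)      # Var(x + a)
scale_var = variance(pmf, lambda x: a * x)      # Var(ax)

print(mean_x, var_x, shift_mean, scale_mean, shift_var, scale_var)
```

Adding a constant moves the mean but not the variance, while multiplying by $a$ scales the variance by $a^2$.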
2.2 Important Probability Distributions
Discrete Probability Distribution
Hypergeometric distribution
Binomial distribution
Poisson distribution
Continuous Probability Distribution
Normal distribution
Chi-square distribution
$t$ distribution
$F$ distribution
Hypergeometric Distribution
Example: Suppose that there are $N$ balls in total, including $D$ red balls. $n$ balls
are randomly selected without replacement, and the number of red balls, say
$x$, is observed. The probability distribution of $x$ is hypergeometric.
Definition:
$$p(x) = \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}}, \quad x = 0, 1, 2, \cdots, \min(n, D)$$
The mean and variance of the distribution are
$$\mu = \frac{nD}{N} \quad \text{and} \quad \sigma^2 = \frac{nD}{N}\left(1 - \frac{D}{N}\right)\left(\frac{N - n}{N - 1}\right)$$
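As a quick numerical check of the hypergeometric formulas, the sketch below evaluates the probability function with `math.comb` and compares the computed mean and variance with the closed forms. The numbers $N = 20$, $D = 5$, $n = 4$ are hypothetical and not taken from the slides.

```python
from math import comb

def hypergeom_pmf(x, N, D, n):
    """P(x red balls) when n balls are drawn without replacement from N, D of them red."""
    if x < 0 or x > min(n, D) or n - x > N - D:
        return 0.0
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

# Hypothetical numbers: N = 20 balls, D = 5 red, n = 4 drawn.
N, D, n = 20, 5, 4
probs = [hypergeom_pmf(x, N, D, n) for x in range(min(n, D) + 1)]

# Mean and variance computed directly from the probabilities.
mean = sum(x * p for x, p in enumerate(probs))
var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))

print(sum(probs), mean, var)
```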
Binomial Distribution
Example: A quarter is tossed $n$ times ($n$ independent trials). For each
toss, suppose the probability of getting the obverse is $p$. The number of obverses
in these $n$ trials, say $x$, is observed. Then the probability distribution of $x$ is
binomial.
Definition:
$$p(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \cdots, n$$
The mean and variance of the distribution are
$$\mu = np \quad \text{and} \quad \sigma^2 = np(1 - p)$$
Binomial Distribution (cont’d)
Assumptions:
Constant probability of success $p$
Two mutually exclusive outcomes
All trials statistically independent
Number of trials $n$ is known and constant
Application:
It can be used as a model when sampling from an infinitely large population. The
constant $p$ normally represents the fraction of defective or nonconforming items in
the population.
Example
A firm claims that 99% of their products meet specifications. To support this claim, an
inspector draws a random sample of 20 items and ships the lot if the entire sample is
in conformance. Find the probability of committing each of the following errors:
(1) Refusing to ship a lot even though 99% of the items are in conformance.
(2) Shipping a lot even though only 95% of the items are conforming.
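One possible way to work this example is with the binomial model, assuming the lot is large enough that the 20 sampled items behave as independent trials. In this sketch $x$ counts conforming items, so the lot ships only when $x = 20$. The function and variable names are illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n = 20  # sample size drawn by the inspector

# (1) The lot is refused when at least one sampled item is nonconforming,
#     even though each item conforms with probability 0.99.
p_refuse_good = 1 - binom_pmf(n, n, 0.99)

# (2) The lot is shipped when all 20 sampled items conform,
#     even though each item conforms with probability only 0.95.
p_ship_bad = binom_pmf(n, n, 0.95)

print(round(p_refuse_good, 4), round(p_ship_bad, 4))
```

Under these assumptions the two probabilities are about 0.18 and 0.36, respectively.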
Poisson Distribution
Example: Suppose that the number of wire-bonding defects per unit that occur
in a semiconductor device (i.e., $x$) is Poisson distributed with parameter $\lambda = 4$.
Definition: the number of random events occurring during a specific "time"
period, with the average occurrence rate $\lambda$ known:
$$p(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \cdots$$
The mean and variance of the distribution are
$$\mu = \lambda \quad \text{and} \quad \sigma^2 = \lambda$$
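For the wire-bonding example with $\lambda = 4$, the probability function is easy to evaluate directly. The question answered below, the probability of at most two defects, is a hypothetical one added for illustration. The mean and variance are approximated by truncating the infinite sums at 100 terms.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson probability of exactly x occurrences with rate lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 4  # wire-bonding defects per unit, from the example

# Hypothetical question: probability of at most two defects on a unit.
p_at_most_2 = sum(poisson_pmf(x, lam) for x in range(3))

# Truncated sums approximate the mean and variance (the tail beyond 100 is negligible).
xs = range(100)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(round(p_at_most_2, 4), round(mean, 6), round(var, 6))
```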
Poisson Distribution (cont’d)
Assumptions:
The average occurrence rate $\lambda$ (per unit) is a known constant
Occurrences are equally likely within any unit of time/area
Occurrences are statistically independent
[Figure: Poisson probability distributions for selected values of $\lambda$]
Exercises for discrete distribution
What is the distribution of $x$ in the following scenarios?
60% of pulleys are produced using Lathe #1, and 40% are produced using Lathe #2. A
random sample of four production parts contains $x$ parts from Lathe #1.
Accidents in a building are assumed to occur randomly with an average rate of 36
per year. There will be $x$ accidents in the coming April.
The probability that a basketball player will make a free throw is 0.7. Let $x$ denote the
number of free throws he will make in a game with 7 free throw attempts.
A production process operates with 2% nonconforming output. Every hour a sample
of 50 units of product is taken, and the number of nonconforming units is counted as $x$.
A book of 200 pages has 2 pages with errors. There are $x$ error pages in a random
selection of 10 pages.
Normal Distribution (Continuous)
The normal distribution is probably the most important distribution in both the
theory and application of statistics.
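Definition: if $x$ is a normal random variable, then the probability distribution
of $x$ is defined as follows,
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$$
The mean of the normal distribution is $\mu$ and the variance is $\sigma^2 > 0$. Also,
$$x \sim N(\mu, \sigma^2) \Rightarrow z = \frac{x - \mu}{\sigma} \sim N(0, 1) \quad \text{(standard normal distribution)}$$
$$P\{x \le a\} = P\left\{z \le \frac{a - \mu}{\sigma}\right\} = \Phi\left(\frac{a - \mu}{\sigma}\right)$$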
Normal Distribution (cont’d)
If $x_1$ and $x_2$ are independent, normally distributed random variables, then
$y = a_1 x_1 + a_2 x_2$ also follows the normal distribution, i.e.,
$$y \sim N(a_1 \mu_1 + a_2 \mu_2,\; a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2)$$
The visual appearance of the normal distribution is a symmetric, unimodal or
bell-shaped curve
$P\{\mu - \sigma \le x \le \mu + \sigma\} = 68.26\%$
$P\{\mu - 2\sigma \le x \le \mu + 2\sigma\} = 95.46\%$
$P\{\mu - 3\sigma \le x \le \mu + 3\sigma\} = 99.73\%$
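The three coverage probabilities can be reproduced without a table, using the identity $P\{|z| \le k\} = \mathrm{erf}(k/\sqrt{2})$ for a standard normal $z$. The function name below is illustrative.

```python
from math import erf, sqrt

def prob_within(k):
    """P(mu - k*sigma <= x <= mu + k*sigma) for a normal random variable."""
    return erf(k / sqrt(2))

coverage = {k: prob_within(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(k, round(100 * p, 2))
```

The computed values are about 68.27%, 95.45%, and 99.73%. The small differences in the last digit from the figures quoted on the slide come from rounding in printed normal tables.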
Normal Distribution (cont’d)
Example for calculation: $x \sim N(40, 5^2)$
$$P\{x \ge 42.1\} = 1 - P\{x \le 42.1\} = 1 - \Phi\left(\frac{42.1 - 40}{5}\right) = 1 - \Phi(0.42)$$
Look up $\Phi(0.42)$ in the table in Appendix II; the
table will be provided in exams.
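As a cross-check on the table lookup, the standard normal cumulative distribution can be computed from the error function. This sketch defines a helper `phi` (a name chosen here) and evaluates the example.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 40, 5
a = 42.1

z = (a - mu) / sigma  # standardized value
p_upper = 1 - phi(z)  # P(x >= 42.1)

print(round(z, 2), round(p_upper, 4))
```

The result is about 0.337, matching $1 - \Phi(0.42)$ with $\Phi(0.42) \approx 0.6628$ from the table.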
Example
Three shafts are made and assembled in a linkage. The length of each shaft,
in centimeters, is distributed as follows:
Shaft 1: $N(75, 0.09)$
Shaft 2: $N(75, 0.16)$
Shaft 3: $N(75, 0.25)$
Assume the shaft lengths are independent of each other:
(a) What is the distribution of the linkage length?
(b) What is the probability that the linkage will be longer than 160.5 cm?
Central limit theorem
In practice, the normal distribution is often assumed as the appropriate
probability model for a random variable. Why?
The central limit theorem is often a justification of approximate normality.
The distribution of an average tends to be Normal, even when the distribution from
which the average is computed is decidedly non-Normal.
Thus, the Central Limit Theorem is the foundation for many statistical
procedures, including quality control charts: the distribution of the
phenomenon under study does not have to be Normal, because the distribution of
its average will be approximately Normal.
Central limit theorem (cont’d)
Mathematical representation:
If $x_1, x_2, \cdots, x_n$ are independent random variables with mean $\mu_i$ and variance $\sigma_i^2$,
and if $y = x_1 + x_2 + \cdots + x_n$, then the distribution of
$$\frac{y - \sum_{i=1}^{n} \mu_i}{\sqrt{\sum_{i=1}^{n} \sigma_i^2}}$$
approaches the $N(0, 1)$ distribution as $n$ approaches infinity.
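The theorem can be illustrated with a small simulation. The sketch below sums $n = 30$ Uniform(0, 1) draws, standardizes each sum with the known mean $1/2$ and variance $1/12$ of a single draw, and checks that the results behave roughly like $N(0, 1)$. The choice of distribution, sample size, number of repetitions, and seed are all assumptions made here for illustration.

```python
import random
import statistics
from math import sqrt

def standardized_sum(n, rng):
    """Sum n Uniform(0, 1) draws and standardize with the known mean and variance."""
    y = sum(rng.random() for _ in range(n))
    mu_total = n * 0.5   # each Uniform(0, 1) draw has mean 1/2
    var_total = n / 12   # and variance 1/12
    return (y - mu_total) / sqrt(var_total)

rng = random.Random(0)   # fixed seed so the run is repeatable
z_values = [standardized_sum(30, rng) for _ in range(20000)]

z_mean = statistics.fmean(z_values)
z_var = statistics.pvariance(z_values)
share_within_1 = sum(abs(z) <= 1 for z in z_values) / len(z_values)

print(round(z_mean, 3), round(z_var, 3), round(share_within_1, 3))
```

Even though a single uniform draw is far from bell-shaped, the standardized sums should have mean near 0, variance near 1, and roughly 68% of values within one unit of zero.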
Chi-square ($\chi^2$) Distribution
The Chi-square distribution is associated with Normal random variables
$$y = x_1^2 + x_2^2 + \cdots + x_n^2$$
$y$ follows $\chi^2(n)$ if $x_1, x_2, \cdots, x_n$ are normally and independently distributed
with mean 0 and variance 1.
$n$ is the number of degrees of freedom.
The most popular use of this distribution is for testing hypotheses about
variances of samples from normal distributions.
The percentage points of the $\chi^2$ distribution are listed in Appendix III.
t Distribution
If $x$ is a standard normal random variable and $y$ is an independent chi-square
random variable with $k$ degrees of freedom, then
$$t = \frac{x}{\sqrt{y/k}}$$
is distributed as $t$ with $k$ degrees of freedom.
This distribution is used for testing hypotheses about two population means.
As $k \to \infty$, the $t$ distribution approaches the standard normal distribution.
The percentage points of the $t$ distribution are listed in Appendix IV.
F Distribution
If $w$ and $y$ are two independent chi-square random variables with $u$ and $v$
degrees of freedom, respectively, then the ratio
$$F_{u,v} = \frac{w/u}{y/v}$$
is distributed as $F$ with $u$ numerator degrees of freedom and $v$ denominator
degrees of freedom.
This distribution is used for testing hypotheses about two population variances.
The percentage points of the $F$ distribution are listed in Appendix V (textbook,
page 715)
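The chi-square, $t$, and $F$ distributions are all built from standard normal variables, and those constructions can be simulated directly. The sketch below is only illustrative: the degrees of freedom, seed, and repetition count are arbitrary, and the theoretical means it is compared against ($n$ for chi-square, 0 for $t$, and $v/(v-2)$ for $F$ with $v > 2$) are standard results that the slides do not state.

```python
import random
import statistics
from math import sqrt

rng = random.Random(1)  # fixed seed so the run is repeatable

def chi_square(df):
    """Sum of df squared independent N(0, 1) draws."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(df))

def t_variate(k):
    """Standard normal divided by sqrt(chi-square / k)."""
    return rng.gauss(0, 1) / sqrt(chi_square(k) / k)

def f_variate(u, v):
    """Ratio of two independent chi-squares, each divided by its degrees of freedom."""
    return (chi_square(u) / u) / (chi_square(v) / v)

reps = 20000
chi_mean = statistics.fmean([chi_square(5) for _ in range(reps)])    # theory: 5
t_mean = statistics.fmean([t_variate(10) for _ in range(reps)])      # theory: 0
f_mean = statistics.fmean([f_variate(5, 10) for _ in range(reps)])   # theory: 10/8

print(round(chi_mean, 2), round(t_mean, 2), round(f_mean, 2))
```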
Example
[Figure: Chi-square distribution for selected values of $n$]
[Figure: The $t$ distribution for selected values of $k$]
[Figure: The $F$ distribution for selected values of $u$ (numerator degrees of freedom) and $v$ (denominator degrees of freedom)]
2.3 Some Useful Approximations
The Binomial Approximation to the Hypergeometric
If $\frac{n}{N} < 0.1$, use $p = \frac{D}{N}$
The Poisson Approximation to the Binomial
If $n$ is large enough and $p$ is small enough, use $\lambda = np$
The Normal Approximation to the Binomial
If $np > 10$ and $0.1 \le p \le 0.9$, use $\mu = np$ and $\sigma^2 = np(1 - p)$
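To get a feel for how close the first two approximations are, the sketch below compares probability functions point by point and reports the largest absolute gap. The parameter values are hypothetical, chosen so that the stated conditions hold ($n/N = 0.02$ for the first case, $n = 100$ and $p = 0.01$ for the second).

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

def hypergeom_pmf(x, N, D, n):
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

# Binomial approximation to the hypergeometric: n/N = 20/1000 = 0.02 < 0.1.
N, D, n = 1000, 50, 20
gap_hyper = max(
    abs(hypergeom_pmf(x, N, D, n) - binom_pmf(x, n, D / N)) for x in range(n + 1)
)

# Poisson approximation to the binomial: large n, small p, lambda = n * p.
n_b, p_b = 100, 0.01
gap_poisson = max(
    abs(binom_pmf(x, n_b, p_b) - poisson_pmf(x, n_b * p_b)) for x in range(n_b + 1)
)

print(gap_hyper, gap_poisson)
```

In both cases the largest difference between the exact and approximate probabilities is well under 0.01.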
Approximations to distributions
Thank you!
Any Questions?
Appendix I
Summary of common probability distributions often used in statistical quality control
Appendix I (cont’d)
Summary of common probability distributions often used in statistical quality control
Appendix II
Cumulative Standard Normal Distribution
Appendix II (cont’d)
Cumulative Standard Normal Distribution
Appendix III
Percentage Points of the $\chi^2$ Distribution
Appendix IV
Percentage Points of the $t$ Distribution
Appendix V
Percentage Points of the $F$ Distribution
Appendix V (cont’d)
Percentage Points of the $F$ Distribution (cont’d)
Appendix V (cont’d)
Percentage Points of the $F$ Distribution (cont’d)