Probability Basics for Students
4.1 Randomness
The Language of Probability
Chance behavior is unpredictable in the short run but has a regular and
predictable pattern in the long run.
Thinking About Randomness
The result of any single coin toss is random. But the result over
many tosses is predictable, as long as the trials are independent
(i.e., the outcome of a new coin flip is not influenced by the result of
the previous flip).
The probability of heads is 0.5, the proportion of times you get heads in
many repeated trials.
[Figure: first series of tosses; second series]
The Language of Probability (Cont..)
Concept of Probability
1. Repeat an experiment (or observe a random phenomenon)
a large number of times.
2. Record the number of times a desirable outcome occurs.
3. Compute the ratio:
(# of times the desirable outcome occurs) / (total # of times the experiment was performed)
Definition: The probability of any outcome of a random
phenomenon is the proportion of times the outcome would
occur in a very long series of repetitions.
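The three steps above can be sketched as a short simulation. This is only an illustration under stated assumptions: a fair coin, a seeded pseudo-random generator, and a hypothetical helper name `proportion_of_heads`.

```python
import random

def proportion_of_heads(n_tosses, seed=0):
    """Simulate n_tosses flips of a fair coin and return the proportion of heads."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The proportion is erratic for few tosses and settles near 0.5 for many.
for n in (10, 100, 1_000, 100_000):
    print(n, proportion_of_heads(n, seed=n))
```

Short runs can land far from 0.5; the ratio only becomes stable as the number of repetitions grows, which is the sense in which the definition speaks of a very long series.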
The Language of Probability (Cont..)
Example:
The statistics of a particular basketball player state that he makes
4 out of 5 free-throw attempts.
The basketball player is just about to attempt a free throw.
What is your estimate of the probability that the player makes
this next free throw?
A. 0.16
B. 50-50. Either he makes it or he doesn’t.
C. 0.80
D. 1.2
Answer: C. He makes 4 out of 5 attempts, so the estimate is 4/5 = 0.80.
4.2 Probability Models
▪ Sample spaces
▪ Probability rules
▪ Assigning probabilities
▪ Independence and the multiplication rule
Probability Models
Descriptions of chance behavior contain two parts:
1. a list of possible outcomes and
2. a probability for each outcome.
Probability Models (Cont…)
Probability Rules (Cont…)
Example:
Venn Diagrams
Sometimes, it is helpful to draw a picture to display relations among several
events. A picture that shows the sample space S as a rectangular area and
events as areas within S is called a Venn diagram.
Two events A and B are independent if knowing that one occurs does
not change the probability that the other occurs. If A and B are
independent:
P(A and B) = P(A) P(B)
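As a quick check of the multiplication rule for independent events, the sketch below enumerates two tosses of a fair coin. The events (first toss is heads, second toss is heads) and the helper `prob` are chosen here for illustration and are not taken from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin: four equally likely outcomes.
sample_space = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event given as a true/false test on an outcome."""
    favorable = [o for o in sample_space if event(o)]
    return Fraction(len(favorable), len(sample_space))

def first_is_heads(o):
    return o[0] == "H"

def second_is_heads(o):
    return o[1] == "H"

def both_heads(o):
    return first_is_heads(o) and second_is_heads(o)

p_a = prob(first_is_heads)
p_b = prob(second_is_heads)
p_a_and_b = prob(both_heads)

print(p_a, p_b, p_a_and_b)      # 1/2 1/2 1/4
print(p_a_and_b == p_a * p_b)   # True
```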
4.3 Random Variables
▪ Random variable
▪ Discrete random variables
▪ Continuous random variables
▪ Normal distributions as probability distributions
Random Variables
➢ A probability model describes the possible outcomes of a chance
process and the likelihood that those outcomes will occur.
➢ A numerical variable that describes the outcomes of a chance
process is called a random variable.
➢ The probability model for a random variable is its probability
distribution.
Example: toss a coin three times and let X be the number of heads.
X = 0: TTT
X = 1: HTT THT TTH
X = 2: HHT HTH THH
X = 3: HHH

Value:        0     1     2     3
Probability:  1/8   3/8   3/8   1/8
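The probability distribution of X can be reproduced by listing the eight outcomes of three coin tosses and counting heads. The sketch assumes a fair coin (so all eight outcomes are equally likely), which is consistent with the 1/8 and 3/8 values shown.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All outcomes of three tosses, e.g. "HHT".
outcomes = ["".join(t) for t in product("HT", repeat=3)]

# Count how many outcomes give each value of X = number of heads.
counts = Counter(o.count("H") for o in outcomes)

# Each outcome has probability 1/8, so P(X = x) = count / 8.
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
```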
Discrete Random Variable
There are two main types of random variables: discrete and continuous.
If we can find a way to list all possible outcomes for a random variable
and assign probabilities to each one, we have a discrete random
variable.
Uniform Distribution
Normal Probability Model (Cont…)
Reminder: standardizing N(µ, σ)
General form: $z = \frac{x - \mu}{\sigma}$
For x = 70" with µ = 64.5 and σ = 2.5: $z = \frac{70 - 64.5}{2.5} = 2.2$
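The standardization can be checked numerically. In the sketch below, `z_score` is a hypothetical helper name, and the upper-tail probability is an added illustration that does not appear in the slides; it uses `statistics.NormalDist` from the Python standard library for the standard Normal curve.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x for a Normal distribution with mean mu and standard deviation sigma."""
    return (x - mu) / sigma

z = z_score(70, 64.5, 2.5)
print(round(z, 2))  # 2.2

# Added illustration: proportion of a N(64.5, 2.5) population above 70,
# i.e. the area to the right of z under the standard Normal curve.
upper_tail = 1 - NormalDist().cdf(z)
print(round(upper_tail, 4))  # about 0.0139
```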
The Mean of a Random Variable
When analyzing discrete random variables, we’ll follow the same
strategy we used with quantitative data―describe the shape, center,
and spread, and identify any outliers.
$\mu_X = E(X) = x_1 p_1 + x_2 p_2 + x_3 p_3 + \dots = \sum_i x_i p_i$
The Mean of a Random Variable (Cont…)
$\mu_X = x_1 p_1 + x_2 p_2 + x_3 p_3 + \dots + x_k p_k = (0)(1/8) + (1)(3/8) + (2)(3/8) + (3)(1/8) = 12/8 = 3/2 = 1.5$
Example: Babies’ Health at Birth
The probability distribution for X = Apgar scores is shown below:
Value:         0      1      2      3      4      5      6      7      8      9      10
Probability:   0.001  0.006  0.007  0.008  0.012  0.020  0.038  0.099  0.319  0.437  0.053
a. Show that the probability distribution for X is legitimate.
b. Make a histogram of the probability distribution. Describe what you see.
c. Apgar scores of 7 or higher indicate a healthy baby. What is P(X ≥ 7)?
$\mu_X = E(X) = \sum_i x_i p_i = (0)(0.001) + (1)(0.006) + (2)(0.007) + \dots + (10)(0.053) = 8.128$
The mean Apgar score of a randomly selected newborn is 8.128. This is the
long-term average Apgar score of many, many randomly chosen babies.
Note: The expected value does not need to be a possible value of X or an
integer! It is a long-term average over many repetitions.
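The Apgar calculations can be reproduced in a few lines. This is a sketch using the probabilities from the table; the variable names are chosen here for illustration, and the legitimacy check (every probability between 0 and 1, total equal to 1) is the usual criterion rather than a quotation from the slides.

```python
import math

# Apgar score distribution from the example: value -> probability.
apgar = {0: 0.001, 1: 0.006, 2: 0.007, 3: 0.008, 4: 0.012, 5: 0.020,
         6: 0.038, 7: 0.099, 8: 0.319, 9: 0.437, 10: 0.053}

# Legitimate: each probability lies in [0, 1] and they sum to 1.
is_legitimate = (all(0 <= p <= 1 for p in apgar.values())
                 and math.isclose(sum(apgar.values()), 1.0))

# P(X >= 7): add the probabilities of the scores 7, 8, 9 and 10.
p_healthy = sum(p for x, p in apgar.items() if x >= 7)

# Mean: sum of value times probability.
mean = sum(x * p for x, p in apgar.items())

print(is_legitimate)        # True
print(round(p_healthy, 3))  # 0.908
print(round(mean, 3))       # 8.128
```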
Rules for Means
Rule 1: If X is a random variable and a and b are fixed numbers,
then:
$\mu_{a+bX} = a + b\mu_X$
Example: The crickets living in a field have a mean length of 1.2 inches.
What is the mean in centimeters? There are 2.54 cm in an inch, so the
mean length in inches is multiplied by 2.54: 1.2 × 2.54 ≈ 3.05 cm. (Note
that we used Rule 1 with b = 2.54 and a = 0; there was no added constant
in this situation.)
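Rule 1 can be verified on a small distribution. The sketch below reuses the number-of-heads distribution for three coin tosses; the constants a = 2 and b = 5 are arbitrary values picked for illustration, and `mean` and `transform` are hypothetical helper names.

```python
from fractions import Fraction

# X = number of heads in three tosses of a fair coin.
dist = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def mean(d):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in d.items())

def transform(d, a, b):
    """Distribution of a + b*X (assumes b != 0 so that values stay distinct)."""
    return {a + b * x: p for x, p in d.items()}

a, b = 2, 5  # arbitrary illustrative constants
lhs = mean(transform(dist, a, b))  # mean of a + bX computed directly
rhs = a + b * mean(dist)           # Rule 1
print(lhs, rhs, lhs == rhs)

# Cricket example: a = 0, b = 2.54 cm per inch, mean length 1.2 inches.
cricket_cm = 0 + 2.54 * 1.2
print(round(cricket_cm, 2))  # 3.05
```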
Variance of a Random Variable
Since we use the mean as the measure of center for a discrete random
variable, we’ll use the standard deviation as our measure of spread. The
definition of the variance of a random variable is similar to the definition
of the variance for a set of quantitative data.
Value: 0 1 2 3 4 5 6 7 8 9 10
Probability: 0.001 0.006 0.007 0.008 0.012 0.020 0.038 0.099 0.319 0.437 0.053
$\sigma_X^2 = \sum_i (x_i - \mu_X)^2 p_i = (0 - 8.128)^2 (0.001) + (1 - 8.128)^2 (0.006) + \dots + (10 - 8.128)^2 (0.053) = 2.066$ (variance)
$\sigma_X = \sqrt{2.066} = 1.437$
The standard deviation of X is 1.437. On average, a randomly selected
baby’s Apgar score will differ from the mean 8.128 by about 1.4 units.
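The variance and standard deviation of the Apgar distribution can be recomputed directly from the definition. This is a sketch with variable names chosen for illustration.

```python
import math

# Apgar score distribution from the example: value -> probability.
apgar = {0: 0.001, 1: 0.006, 2: 0.007, 3: 0.008, 4: 0.012, 5: 0.020,
         6: 0.038, 7: 0.099, 8: 0.319, 9: 0.437, 10: 0.053}

mean = sum(x * p for x, p in apgar.items())

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in apgar.items())

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(variance, 3))  # 2.066
print(round(std_dev, 3))   # 1.437
```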
Variance of a Random Variable (Cont…)
Theorem: $\sigma_X^2 = E(X^2) - (E(X))^2$
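The shortcut formula can be compared with the definition of variance on a small case. The sketch uses the number of heads in three tosses of a fair coin and exact fractions, so the two results can be tested for equality; the helper name `expect` is an assumption made here.

```python
from fractions import Fraction

# X = number of heads in three tosses of a fair coin.
dist = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def expect(d, f=lambda x: x):
    """Expected value of f(X) for a discrete distribution {value: probability}."""
    return sum(f(x) * p for x, p in d.items())

mu = expect(dist)                                    # E(X) = 3/2
var_definition = expect(dist, lambda x: (x - mu) ** 2)
var_shortcut = expect(dist, lambda x: x ** 2) - mu ** 2

print(mu, var_definition, var_shortcut)  # 3/2 3/4 3/4
```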
Rules for Means and Variance (Cont…)
Example:
You invest 20% of your funds in Treasury bills and 80% in an “index
fund” that represents all U.S. common stocks. Your rate of return over
time is a weighted combination of the return on the T-bills (X) and the
return on the index fund (Y):
R = 0.2 X + 0.8 Y.
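Because the mean of a sum of random variables is the sum of their means, Rule 1 extends to this portfolio: µ_R = 0.2 µ_X + 0.8 µ_Y. The sketch below applies that relation. The mean returns of 5% and 13% are hypothetical placeholders chosen only to make the arithmetic concrete; the slides do not give these values here.

```python
def mean_of_linear_combination(weights, means):
    """Mean of w1*X1 + w2*X2 + ... given the weights and the individual means."""
    return sum(w * m for w, m in zip(weights, means))

# HYPOTHETICAL mean annual returns in percent (not taken from the slides).
mu_x = 5.0   # Treasury bills
mu_y = 13.0  # index fund

mu_r = mean_of_linear_combination((0.2, 0.8), (mu_x, mu_y))
print(round(mu_r, 2))  # 11.4 for these placeholder values
```

The variance of R is not computed here, since it also depends on how X and Y vary together.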
The Law of Large Numbers
Suppose we would like to estimate an unknown mean µ.
We could select an SRS and calculate the sample mean x̄.
However, a different SRS would probably yield a different
sample mean.
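The law of large numbers says that the sample mean gets closer and closer to µ as the sample size grows. A simulation can illustrate this. The sketch below draws independent values from the Apgar distribution used earlier (µ = 8.128) as a stand-in for an SRS from a very large population; that substitution and the helper name `sample_mean` are assumptions made for the illustration.

```python
import random

# Apgar score distribution from the earlier example (population mean 8.128).
values = list(range(11))
probs = [0.001, 0.006, 0.007, 0.008, 0.012, 0.020,
         0.038, 0.099, 0.319, 0.437, 0.053]

def sample_mean(n, seed=0):
    """Mean of n independent draws from the Apgar distribution."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    draws = rng.choices(values, weights=probs, k=n)
    return sum(draws) / n

# Different samples give different means, but large samples stay close to 8.128.
for n in (10, 100, 10_000, 100_000):
    print(n, round(sample_mean(n, seed=n), 3))
```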
▪ Rules of probability
▪ General addition rules
▪ Conditional probability
▪ General multiplication rules
▪ Bayes’s rule
▪ Independence
Probability Rules
The General Addition Rule
Conditional Probability
The probability we assign to an event can change if we know that some
other event has occurred. This idea is the key to many applications of
probability.
When we are trying to find the probability that one event will happen
under the condition that some other event is already known to have
occurred, we are trying to determine a conditional probability.
The probability that one event happens given that another event is
already known to have happened is called a conditional probability.
When P(A) > 0, the probability that event B happens given that event A
has happened is found by:
$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$
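A small enumeration shows the formula at work. The example is chosen here for illustration and is not from the slides: toss a fair coin twice, let A be "at least one head" and B be "both tosses are heads". The helpers `prob` and `conditional_prob` are hypothetical names.

```python
from fractions import Fraction
from itertools import product

# Two tosses of a fair coin: four equally likely outcomes.
sample_space = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event given as a true/false test on an outcome."""
    return Fraction(sum(1 for o in sample_space if event(o)), len(sample_space))

def conditional_prob(b, a):
    """P(B | A) = P(A and B) / P(A); only defined when P(A) > 0."""
    p_a = prob(a)
    if p_a == 0:
        raise ValueError("P(A) must be positive")
    return prob(lambda o: a(o) and b(o)) / p_a

def at_least_one_head(o):
    return "H" in o

def both_heads(o):
    return o == ("H", "H")

print(prob(both_heads))                                 # 1/4
print(conditional_prob(both_heads, at_least_one_head))  # 1/3
```

Knowing that at least one head occurred raises the probability of two heads from 1/4 to 1/3.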
The General Multiplication Rule
The probability that events A and B both occur can be found using the general
multiplication rule:
P(A and B) = P(A) • P(B | A)
where P(B | A) is the conditional probability that event B occurs given that
event A has already occurred.
Note: Two events A and B that both have positive probability are independent if:
P(B|A) = P(B)
Example:
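Suppose 29% of Internet users download music
files, and 67% of downloaders say they don’t care if the music
is copyrighted.
What percent of Internet users download music
and don’t care about copyright?
P(download and don’t care) = P(download) • P(don’t care | download)
= (0.29)(0.67) = 0.1943, or about 19% of Internet users.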
Example:
Consider flipping a coin twice.
Sample Space: HH HT TH TT
Example
The Pew Internet and American Life Project finds that 93% of
teenagers (ages 12 to 17) use the Internet, and that 55% of online
teens have posted a profile on a social-networking site.
What percent of teens are online and have posted a profile?
P (online) = 0.93
P (profile | online) = 0.55
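The question can be answered with the general multiplication rule, P(online and profile) = P(online) • P(profile | online). A minimal sketch, with a helper name chosen here for illustration:

```python
def prob_a_and_b(p_a, p_b_given_a):
    """General multiplication rule: P(A and B) = P(A) * P(B | A)."""
    return p_a * p_b_given_a

p_online = 0.93
p_profile_given_online = 0.55

p_online_and_profile = prob_a_and_b(p_online, p_profile_given_online)
print(round(p_online_and_profile, 4))  # 0.5115
```

So about 51% of teens are online and have posted a profile.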
Bayes’s rule:
$P(A_i \mid C) = \frac{P(C \mid A_i) P(A_i)}{P(C \mid A_1) P(A_1) + P(C \mid A_2) P(A_2) + \dots + P(C \mid A_k) P(A_k)}$
Example
If a woman in her 20s gets screened for breast cancer and receives a positive test
result, what is the probability that she does have breast cancer?
[Tree diagram: disease incidence, followed by mammography diagnosis]
Incidence of breast cancer among women ages 20–30:
  P(cancer) = 0.0004
  P(no cancer) = 0.9996
Mammography performance:
  P(positive | cancer) = 0.8 (sensitivity)
  P(negative | cancer) = 0.2 (false negative)
  P(positive | no cancer) = 0.1 (false positive)
  P(negative | no cancer) = 0.9 (specificity)
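Applying Bayes's rule to the numbers in the diagram gives the probability of cancer given a positive test. This is a sketch; `bayes_posterior` is a hypothetical helper name, with A1 = cancer, A2 = no cancer and C = positive test.

```python
def bayes_posterior(priors, likelihoods, i):
    """P(A_i | C) from the priors P(A_j) and the likelihoods P(C | A_j)."""
    numerator = likelihoods[i] * priors[i]
    denominator = sum(l * p for l, p in zip(likelihoods, priors))
    return numerator / denominator

priors = [0.0004, 0.9996]  # P(cancer), P(no cancer)
likelihoods = [0.8, 0.1]   # P(positive | cancer), P(positive | no cancer)

p_cancer_given_positive = bayes_posterior(priors, likelihoods, 0)
print(round(p_cancer_given_positive, 4))  # 0.0032
```

Even after a positive result the probability of cancer is only about 0.3%, because the disease is so rare in this age group that false positives far outnumber true positives.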
If two events A and B do not influence each other, and if knowledge about
one does not change the probability of the other, the events are said to
be independent of each other. If A and B are independent:
P(A and B) = P(A) P(B)
Note: Two events A and B that both have positive probability are
independent if:
P(B|A) = P(B)