Statistics and Probability
QUARTER III
Begin
The hands are one of the greatest assets of the human body. No other being in the world has hands that can grasp, hold,
move, and manipulate objects like human hands. Through our hands, we may learn, create, and accomplish. Hence, the hand in this
learning resource signifies that you, as a learner, are capable and empowered to successfully achieve the relevant and essential
competencies and skills at your own pace and time. Your academic success lies in your own hands!
This module was designed to provide you with fun and meaningful opportunities for guided and independent learning at your
own pace and time. You will be enabled to process the contents of the learning resource while being an active learner.
4. A researcher separates the list of boys and girls, then draws 7 names by gender.
Situation 4 illustrates stratified random sampling because the students were divided into two different strata or groups, boys and girls.
With a proportional number for each group, samples will then be selected at random from these two groups.
5. A researcher surveys all students from 3 randomly selected classes out of 7 classes.
Situation 5 illustrates cluster sampling since all students are divided into clusters or classes, then 3 classes were selected at random out
of the 7 classes. All of the students of these three classes comprised the samples of the study. Take note that each cluster is mutually
homogeneous yet internally heterogeneous.
c. Stratified random sampling is a random sampling wherein the population is divided into different strata or divisions. The
number of samples will be proportionately picked in each stratum that is why all strata are represented in the samples.
d. Cluster sampling is a random sampling wherein population is divided into clusters or groups and then the clusters are
randomly selected. All elements of the clusters randomly selected are considered the samples of the study.
Sampling techniques that involve random selection are called probability sampling. Simple random, systematic,
stratified, and cluster sampling are all probability sampling techniques.
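The difference between stratified and cluster sampling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of 50 students, the field names, and the helper functions stratified_sample and cluster_sample are all hypothetical and do not come from the module.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical population: 20 boys and 30 girls spread over 7 classes.
students = [{"id": i, "gender": "boy" if i < 20 else "girl", "class": i % 7}
            for i in range(50)]

def stratified_sample(population, key, total):
    """Draw from every stratum in proportion to its share of the population."""
    strata = {}
    for person in population:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        k = round(total * len(members) / len(population))
        sample.extend(random.sample(members, k))
    return sample

def cluster_sample(population, key, clusters_to_pick):
    """Pick whole clusters at random and keep every member of each one."""
    cluster_ids = sorted({person[key] for person in population})
    chosen = set(random.sample(cluster_ids, clusters_to_pick))
    return [person for person in population if person[key] in chosen]

strat = stratified_sample(students, "gender", 10)  # 4 boys and 6 girls
clust = cluster_sample(students, "class", 3)       # all students of 3 classes
```

In the stratified sample every stratum (boys, girls) is represented; in the cluster sample only the chosen classes appear, but each chosen class is taken whole.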
There are also sampling techniques that do not involve random selection of data. They are called non-probability sampling. An
example is convenience sampling, wherein the researcher gathers data from nearby sources of information with minimal
effort. Convenience sampling is used by people who hand out questionnaires on the street to passers-by.
Purposive sampling is also not considered random sampling, since the respondents are selected based on the goal of the
researcher's study. If the study is about students who are children of OFWs, the researcher will take as samples only students who are
children of OFWs. This excludes other students from being part of the sample.
POPULATION MEAN
The mean is the sum of the data divided by the number of data. The mean is used to describe the point around which a set of data tends to
concentrate. The population mean is the mean computed from all the elements of the population. The symbol µ
(read as “mu”) is used to represent the population mean. To compute the population mean, we simply add all the data (X) and then
divide the sum by the number of elements in the population (N). We apply the formula:
µ = ∑X / N
where:
µ = the population mean
∑X = the sum of all the data (X)
N = the number of elements in the population
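Population variance is the computed variance of the elements of the population. The symbol σ² (read as “sigma squared”) is
used to represent population variance. To compute the population variance, we use the formula:
σ² = ∑(X − µ)² / N
where:
X = given data
µ = population mean
N = number of elements in the population
Population standard deviation is the computed standard deviation of the elements of the population. The symbol σ (read as
“sigma”) is used to represent population standard deviation. The standard deviation is simply the square root of the variance.
To compute the population standard deviation, we use the formula:
σ = √[∑(X − µ)² / N]
where:
X = given data
µ = population mean
N = number of elements in the population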
Consider the data given above, to solve for the population variance and population standard deviation, we have this table:
The third column is computed by subtracting the mean from each score, while the fourth column is computed by squaring
the values in the third column. Since the formula contains the symbol ∑ (summation), we need to add the computed values in the fourth column.
Again, for the population mean,
For the population variance, we substitute the computed values to our formula, thus
For the population standard deviation, we can also substitute the computed values to the formula, or we can simply get the square root
of the variance.
Population mean (µ), population variance (σ²), and population standard deviation (σ) are what we call parameters.
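The three parameters can be checked with a short Python sketch. The ten grades below are hypothetical (the module's own data table is not reproduced in this text); only the formulas µ = ∑X / N and σ² = ∑(X − µ)² / N come from the discussion above.

```python
import math

# Hypothetical population of N = 10 grades (assumed for illustration only).
population = [80, 82, 84, 85, 86, 87, 88, 90, 92, 96]

N = len(population)
mu = sum(population) / N                               # population mean
variance = sum((x - mu) ** 2 for x in population) / N  # population variance, divisor N
sigma = math.sqrt(variance)                            # population standard deviation

print(mu, variance, round(sigma, 2))
```

Note that the divisor is N because every element of the population is included.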
STATISTIC
From the previous data of the population, suppose that we randomly select only 7 data out of the total 10 data in the
population.
Compute the sample mean, sample variance, and sample standard deviation. Here is the result:
SAMPLE MEAN
The sample mean is the average of all the data in the sample. The symbol x̄ (read as “x bar”) is used to represent the sample
mean. To compute the sample mean, we simply add all the data and divide the sum by the number of elements in the sample (n). We
apply the formula x̄ = ∑x / n.
In our case, adding the 7 sample values gives a sum of 602. Substituting into the formula, x̄ = 602/7 = 86.
In this example, there is a slight difference between the population mean and the sample mean. But notice that the method of
computing the two is the same. Only the divisor differs: the population
mean µ uses N (population size), while the sample mean x̄ uses n (sample size).
Sample standard deviation is the computed standard deviation of the elements of the sample. The symbol s is used to represent sample
standard deviation. To compute the sample standard deviation, we use the formula:
s = √[∑(x − x̄)² / (n − 1)]
where:
x = given data
x̄ = sample mean
n = number of elements in the sample
As you will notice, the sample standard deviation is also the square root of the sample variance, s² = ∑(x − x̄)² / (n − 1).
The fourth column is computed by subtracting the mean from each grade, while the last column is computed by squaring those
differences. Since the formula contains the symbol ∑ (summation), we need to add the computed values.
Sample mean (x̄), sample variance (s²), and sample standard deviation (s) are what we call statistics. Remember that
parameters are for populations while statistics are for samples.
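A minimal Python sketch of the sample statistics follows. The seven values are hypothetical; they were chosen only so that their sum is 602, matching the sum quoted above, and they are not the module's actual sample. The sketch assumes the usual n − 1 divisor for the sample variance, which is also what Python's statistics.variance uses.

```python
import math
import statistics

# Hypothetical sample of n = 7 grades whose sum is 602 (assumed data).
sample = [80, 84, 85, 86, 87, 88, 92]

n = len(sample)
x_bar = sum(sample) / n                                # sample mean
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)   # sample variance, divisor n - 1
s = math.sqrt(s2)                                      # sample standard deviation

# The standard library gives the same results.
same_variance = math.isclose(s2, statistics.variance(sample))
same_stdev = math.isclose(s, statistics.stdev(sample))
```

Compared with the population formulas, only the symbols and the divisor change.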
Other examples of parameters and statistics are the population proportion and correlation coefficient. For population
proportion, we use “p” for sample and “P” for the population. In correlation coefficient, we use “r” for the sample and “𝜌” (read as
rho) for the population. These will be discussed in the latter part of this course.
2. List all the possible random samples and solve for the sample mean of each set of samples.
3. Construct a frequency and probability distribution table of the sample means indicating its number of occurrence or the
frequency and probability.
Mean of the Sampling Distribution of the Sample Mean
µx̄ = ∑[X̄ • P(X̄)]
(X̄ − µ)² = square of the difference between the sample mean and the population mean
P(X̄) = probability of the sample mean
∑[P(X̄) • (X̄ − µ)²] = summation of the products of the probability of the sample mean and the square of the difference between the
sample mean and the population mean
Example:
Mark is conducting a survey of Grade 12 students of Nasyonalismo High School. He found out that only a few students
knew about the makers of the Philippine flag: 1, 2, 3, 4, and 5 SHS students from the 5 sections, respectively. Suppose that samples
of size 2 are drawn from this population (without replacement). Describe the sampling distribution of the sample means.
Solution:
1. Compute the mean of the population using the formula µ = Σx/N.
µ = Σx/N = (1 + 2 + 3 + 4 + 5)/5 = 3.00
The value equals 3.0.
2. Compute the variance of the population using the formula σ² = Σ(x − µ)²/N. a. Subtract the computed population mean from
each measurement. b. Square the results obtained in (a), then add. Divide the sum by the number of measurements
to get the value of the population variance. The value equals 2.0.
3. Determine the number of possible samples of size 2 (without replacement). Use the combination formula NCn, where N is the
population size and n is the sample size.
Use the formula NCn = N! / [n!(N − n)!]. Here N = 5 and n = 2.
5C2 = 5! / [2!(5 − 2)!] = 10. So, there are 10 possible samples of size 2 that can be drawn.
6. Compute the mean of the sampling distribution of the sample means. Follow these steps:
a. Multiply each sample mean by the corresponding probability.
MIND THIS: The mean of the population is equal to the mean of the sampling distribution of the sample means.
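The whole procedure for this example can be sketched in Python by listing every possible sample. The variable names are illustrative only; exact fractions are used so that no rounding hides the result.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [1, 2, 3, 4, 5]
n = 2

# Step 3: all possible samples of size 2 without replacement (5C2 of them).
samples = list(combinations(population, n))

# Mean of each sample, then the frequency and probability of each distinct mean.
means = [Fraction(sum(sample), n) for sample in samples]
frequency = Counter(means)
probability = {m: Fraction(f, len(samples)) for m, f in frequency.items()}

# Mean and variance of the sampling distribution of the sample means.
mu_xbar = sum(m * p for m, p in probability.items())
var_xbar = sum(p * (m - mu_xbar) ** 2 for m, p in probability.items())

print(len(samples), mu_xbar, var_xbar)
```

The mean of the sample means comes out equal to the population mean of 3, while the variance of the sample means (3/4) is smaller than the population variance of 2.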
population variance. Write your answer on the space provided. Identify also the formula to be used to estimate the standard error of
the mean by writing the symbol σx̄ when the population variance is known and the symbol sx̄ when the population variance is
unknown.
Solution:
1. Known – This is population data. Although the value of the variance is not given, you can still determine the population
mean and variance using µ = ΣX/N and σ² = Σ(x − µ)²/N. So, this situation is an example of a sampling distribution where the
population variance can be computed. Apply the formula σx̄ = σ/√n to solve for the standard error of the mean.
... σx̄ = σ/√n in the computation of the standard error of the mean.
2. Unknown – The population variance is unknown. The given values are the population mean µ = 15, the sample standard ...
... σx̄ = σ/√n in computing for the standard error of the mean.
4. Unknown – The population variance is unknown. The only given values are the population mean µ = 92.78 and the sample ...
Based on the previous activity, you learned from the situations presented that, by analysis, you can easily find out if the given problem
provides the value of the population variance or if the population variance is unknown. Also, when the population variance is known,
the standard deviation of the sampling distribution of the mean is computed using the formula σx̄ = σ/√n, while the formula
sx̄ = s/√n is used to estimate the standard error of the mean when the population variance is unknown.
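Both formulas have the same shape, so one small Python function covers them. The numbers below are illustrative assumptions, not values from the activity.

```python
import math

def standard_error(standard_deviation, n):
    """Standard error of the mean: standard deviation divided by the square root of n."""
    return standard_deviation / math.sqrt(n)

# Population variance known: pass the population standard deviation (sigma).
se_known = standard_error(4, 100)      # sigma = 4, n = 100

# Population variance unknown: pass the sample standard deviation (s) instead.
se_estimated = standard_error(2.5, 25) # s = 2.5, n = 25

print(se_known, se_estimated)
```

The only decision the learner has to make is which standard deviation is available, σ or s.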
Distribution of the Sample Mean for Normal Population
1. Population variance σ² is known. The population has a mean µ and variance σ². The distribution of the sample mean is (at
least approximately) normal and the standard error of the mean is σx̄ = σ/√n, where σ is the population standard deviation and n is the
sample size. To determine the probability of a certain event, we can use the z-distribution by transforming the mean of the
sample data to an approximately normal variable z, using the relation z = (x̄ − µ) / (σ/√n). This distribution is best applied for large sample
sizes.
2. Population variance σ² is unknown. The standard error of the mean becomes sx̄ = s/√n, where s is the point estimate of σ
(population standard deviation), that is, the sample standard deviation, and n is the sample size. To estimate the population parameters, we
can use the t-distribution by using the formula t = (x̄ − µ) / (s/√n). Remember that when the sample size n is very large, the sample standard
deviation s is almost indistinguishable from the population standard deviation σ, and therefore the t and z distributions are essentially
identical. Remember that we use the t-distribution for small sample sizes, say n < 30.
If the population is normally distributed, the sampling distribution of the sample mean is also normally distributed. But what if the
population is not normal?
The Central Limit Theorem addresses this question. The distribution of the sample mean tends toward the normal
distribution as the sample size increases, regardless of the distribution from which we are sampling. As a simple guideline, the sample
mean can be considered approximately normally distributed if the sample size is at least 30 (n ≥ 30). If the sample size is sufficiently
large, the Central Limit Theorem can be used to answer questions about the sample mean in the same manner that a normal distribution
can be used to answer questions about individual observations. This also means that even if the population is not normally distributed, or if we don’t
know its distribution, the Central Limit Theorem allows us to conclude that the distribution of the sample mean will be approximately normal if
the sample size is sufficiently large. Further, we can continue to use the z conversion formula in our calculations. This time we will use the formula
z = (x̄ − µ) / (σ/√n)
Why is it important to know the Central Limit Theorem?
Suppose that the average age of the people living in a barangay is 34 with a standard deviation of 4. If 100 residents of the
barangay decide to take a summer outing for bonding and relaxation after the COVID-19 Enhanced Community Quarantine has been
lifted, what is the probability that the average age of these residents is less than 35?
Solution:
It is not given that the population is normally distributed but since n > 30, then you can assume that the sampling distribution of
the mean ages of 100 barangay residents is normal according to the Central Limit Theorem.
The Central Limit Theorem describes the normality of the distribution of the sample mean taken from a population that is not
normally distributed.
Step 2: Convert the raw score to the standard score using the formula
z = (x̄ − µ) / (σ/√n)
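For the barangay problem (µ = 34, σ = 4, n = 100, x̄ = 35), the conversion and the area under the normal curve can be checked with Python's standard library. This is a sketch of the computation, not a replacement for reading the z-table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 34, 4, 100   # given in the problem
x_bar = 35                  # sample mean of interest

standard_error = sigma / sqrt(n)      # 4 / 10 = 0.4
z = (x_bar - mu) / standard_error     # (35 - 34) / 0.4 = 2.5

# Area to the left of z under the standard normal curve, P(z < 2.5).
probability = NormalDist().cdf(z)
print(z, round(probability, 4))
```

The result, about 0.9938, is the probability that the average age of the 100 residents is less than 35.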
Suppose that the average age of the people living in a barangay is 34 with a standard deviation of 4. One hundred (100) residents
of the barangay decide to take a summer outing for bonding and relaxation after the COVID-19 Enhanced Community Quarantine has
been lifted.
If we make a relative histogram of samples with various sample sizes, it would look like the histograms below.
calculate the sample mean, the histogram of the illustration comes to be normally distributed. And that is where the Central Limit
Theorem is used to make better inferences.
Step 1. Identify the given information:
µ = 50.6   σ = 6   X = 48
Therefore, the probability that a randomly selected college student will complete the examination in less than 48 minutes is
0.3336 or 33.36%.
2. If 49 randomly selected senior high school students take the examination, what is the probability that the mean time it takes
the group to complete the test will be less than 48 minutes?
Step 1: Identify the given information:
µ = 50.6   σ = 6   X̄ = 48   n = 49
With z = (48 − 50.6) / (6/√49) = −3.03, find P(X̄ < 48) by getting the area under the normal curve.
P(X̄ < 48) = P(z < −3.03) = 0.0012
The probability that the mean time of 49 randomly selected senior high school students to
complete the test is less than 48 minutes is 0.0012 or 0.12%.
3. If 49 randomly selected senior high school students take the examination, what is the probability that the mean time it takes
the group to complete the test will be more than 51 minutes?
Step 1: Identify the given information:
µ = 50.6   σ = 6   X̄ = 51   n = 49
With z = (51 − 50.6) / (6/√49) = 0.47, find P(X̄ > 51) by getting the area under the normal curve.
P(X̄ > 51) = P(z > 0.47)
= 1 − P(z < 0.47)
= 1 − 0.6808
= 0.3192
The probability that the mean time of 49 randomly selected senior high school students to
complete the test is more than 51 minutes is 0.3192 or 31.92%.
4. If 49 randomly selected senior high school students take the examination, what is the probability that the mean time it takes
the group to complete the test is between 47.8 and 53 minutes?
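Find P(X̄ < 47.8) by getting the area under the normal curve.
P(X̄ < 47.8) = P(z < −3.27) = 0.0005
Find P(X̄ < 53) by getting the area under the normal curve.
P(X̄ < 53) = P(z < 2.8) = 0.9974
P(47.8 < X̄ < 53) = 0.9974 − 0.0005 = 0.9969
The probability that the mean time of 49 randomly selected senior high school students to
complete the test is between 47.8 and 53 minutes is 0.9969 or 99.69%.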
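The three probabilities for the sample of 49 students can be cross-checked with a short Python sketch that uses µ = 50.6, σ = 6, and n = 49 from the problems above. The helper name z_score is an assumption made for illustration. Because software does not round z to two decimal places the way the z-table does, the results may differ from the table-based answers in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50.6, 6, 49
standard_error = sigma / sqrt(n)   # 6 / 7
std_normal = NormalDist()          # standard normal distribution

def z_score(x_bar):
    """Standard score of a sample mean."""
    return (x_bar - mu) / standard_error

p_less_48 = std_normal.cdf(z_score(48))                                   # P(mean < 48)
p_more_51 = 1 - std_normal.cdf(z_score(51))                               # P(mean > 51)
p_between = std_normal.cdf(z_score(53)) - std_normal.cdf(z_score(47.8))   # P(47.8 < mean < 53)

print(round(p_less_48, 4), round(p_more_51, 4), round(p_between, 4))
```

For a "between" question, the area to the left of the lower value is subtracted from the area to the left of the upper value.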
Illustrating the t-Distribution
According to the Central Limit Theorem, the sampling distribution of a statistic (like a sample mean, x̄) will follow a normal
distribution, as long as the sample size (n) is sufficiently large. Therefore, when we know the standard deviation of the population, we
can compute a z-score and use the normal distribution to evaluate probabilities with the sample mean.
But sample sizes are sometimes small, and often we do not know the standard deviation of the population. When either of these
problems occurs, the solution is to use a different distribution.
Student’s t-distribution
The Student’s t-distribution is a probability distribution that is used to estimate population parameters when the sample size is
small (i.e., sample size < 30) and/or when the population variance is unknown. It was developed by William Sealy Gosset in
1908. He used the pseudonym or pen name “Student” when he published the paper that describes the distribution. That is why it is
called “Student’s t-distribution”. He worked at a brewery and was interested in the problems of small samples, for example, the
chemical properties of barley. In the problems he analyzed, the sample size might be as low as three.
Suppose you are about to draw a random sample of n observations from a normally distributed population. You previously learned
that the quantity
z = (x̄ − µ) / (σ/√n)
where z is the z-score, x̄ is the sample mean, µ is the population mean, σ is the population standard deviation, and n is the sample
size, has the standard normal distribution. (Note that if we are standardizing a single observation, the value of n is 1. Hence, the
formula becomes z = (X − µ)/σ.) You can use this concept to construct a confidence interval for the population mean, µ. But in practice,
you encounter a problem: you don’t know the value of the population standard deviation, σ. The standard
deviation for the entire population, σ, is a parameter and you don’t typically know its value, so you can’t use it in your formula. If
that happens, you can do the next best thing: instead of using the “population” standard deviation, σ, you use your
“sample” standard deviation, s, to estimate it. And instead of (x̄ − µ) / (σ/√n), you are going to have (x̄ − µ) / (s/√n), where s is your sample
standard deviation.
You must take note of the change in the formula. The quantity σ is a constant but you don’t know its value, so you used s, which is
a statistic. This statistic s has a sampling distribution and its value varies from sample to sample. And so, the quantity
(x̄ − µ) / (s/√n) no longer has the standard normal distribution. This quantity is labeled t because it has a t-distribution. When you
are sampling from a normally distributed population, the quantity t = (x̄ − µ) / (s/√n) has the t-distribution with n − 1 degrees of freedom.
Note that the number of degrees of freedom is one less than the sample size. So, if the sample size n is 25, the number of degrees of
freedom is 24. Similarly, for a t-distribution having 16 degrees of freedom, the sample size is 17.
What does the t-distribution look like? If you look at the statistic (x̄ − µ) / (s/√n), it looks like a z-statistic, which has the standard normal
distribution, except that you replaced the population standard deviation, σ, by the sample standard deviation, s. You are estimating a
parameter with a statistic, so there is greater variability. Hence, your t-distribution is going to look like the normal distribution
except with greater variance.
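A minimal Python sketch of the t statistic and its degrees of freedom is shown below. The six measurements and the value of µ are hypothetical and serve only to show how the pieces of the formula fit together.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical small sample and hypothesized population mean (assumed values).
sample = [12.1, 11.6, 12.4, 11.9, 12.3, 11.7]
mu = 11.8

n = len(sample)
x_bar = mean(sample)       # sample mean
s = stdev(sample)          # sample standard deviation (n - 1 divisor)

t = (x_bar - mu) / (s / sqrt(n))   # t = (x̄ − µ) / (s/√n)
degrees_of_freedom = n - 1         # one less than the sample size

print(round(t, 3), degrees_of_freedom)
```

Because s changes from sample to sample, repeating this with a different sample gives a different denominator, which is the extra variability described above.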
You have here a plot of standard normal distribution in black and t-distributions with 3, 5, 20, and 30 degrees of freedom in red,
green, violet, and blue respectively. You can see that both the z-distribution and t-distributions are symmetric about 0 and bell-shaped.
But the t-distributions have heavier tails (more area in the tails) and lower peaks.
The exact shape of the t-distribution depends on the degrees of freedom. The figure above tells you that as the degrees of freedom
increase, the t-distribution tends toward the standard normal distribution. At 30 degrees of freedom, the blue curve might look very
close to the normal curve. But if you look very closely, you would see that the t-distribution still has slightly heavier tails and slightly
lower peak. But if you let those degrees of freedom continue to increase, the t-distribution is going to get closer and closer to the
standard normal distribution.
Properties of t-distribution
The t-distribution has the following properties:
1. The t-distribution is symmetrical about 0. That means if you draw a segment from the peak of the curve down to the 0 mark
on the horizontal axis, the curve is divided into two equal parts or areas. The t- scores on the horizontal axis will be divided
also with half of the t-scores being positive and half negative.
2. The t-distribution is bell-shaped like the normal distribution but has heavier tails. That means it is more prone to producing
values that fall far from the mean. The tails are asymptotic to the horizontal axis. (Each tail approaches the horizontal axis but
never touches it.)
3. The mean, median, and mode of the t-distribution are all equal to zero.
4. The variance is always greater than 1. It is equal to v/(v − 2), where v is the number of degrees of freedom (v > 2). As the number of
degrees of freedom increases and approaches infinity, the variance approaches 1. Using the formula, if the number of degrees
of freedom is 10, the variance is 10/(10 − 2) = 10/8 = 1.25.
5. As the degrees of freedom increase, the t-distribution curve looks more and more like the normal distribution. With infinite
degrees of freedom, the t-distribution is the same as the normal distribution.
6. The standard deviation of the t-distribution varies with the sample size. It is always greater than 1, unlike the standard normal
distribution, which has a standard deviation of 1.
7. The total area under a t-distribution curve is 1 or 100%. One can say that the area under the t-distribution curve represents the
probability or the percentage associated with specific sets of t-values.
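Property 4 can be explored with a few lines of Python. The function name t_variance is illustrative; the formula v/(v − 2) is the one stated in the property and applies only when v is greater than 2.

```python
def t_variance(v):
    """Variance of a t-distribution with v degrees of freedom (requires v > 2)."""
    if v <= 2:
        raise ValueError("the formula v / (v - 2) applies only when v > 2")
    return v / (v - 2)

# The variance shrinks toward 1 as the degrees of freedom grow.
for v in (3, 10, 30, 1000):
    print(v, round(t_variance(v), 4))
```

Taking the square root of each value gives the standard deviation mentioned in property 6, which likewise stays above 1.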
In finding the areas and percentiles for a t-distribution, you need to familiarize yourself with the t-table. You are going to use
a table that is different from the z-table you used in finding the area under the normal curve.
Below is an example of a t-table. It is a right-tailed t-table because the given areas in this table are areas on the right tail of
the t-distribution. Some t-tables are slightly different in format. Look at the t-table below. In the first column in the leftmost part, you
have the degrees of freedom. It ranges from 1 down to ∞. While the first row in the upper part of the t-table represents the area under
the right tail of the t-distribution. Some of the given areas are from 0.25 down to 0.0005. The rest of the entries in the body of the table
are the values of the variable t (t-values).
By looking at the table, you can see that the t-value for an area of 0.10 in the right tail of the t-distribution with 10 degrees of
freedom is 1.372. This is the intersection of the row containing the 10 degrees of freedom and the column containing the area of 0.10.
Similarly, the area to the right tail of a t-distribution with 15 degrees of freedom corresponding to the t-value of 2.249 is 0.02.
Focus on the row containing 15 degrees of freedom, then look for the t-value of 2.249. The column that you need is the column
containing the area of 0.02.
Identifying Percentiles
1. Using the t-Table
A percentile is a value of t below which the given percentage of the distribution lies. For example, the
90th percentile of the t-distribution is the t-value whose left-tail
probability is 90% and whose right-tail probability is 10%. Since
the area under the t-distribution curve also represents
probability, the 90th percentile of the t-distribution is the t-value
whose area to its left is 0.90 and whose area to its right is
0.10.
And since the area of the entire curve is 1, this implies that the area
to the right of the 95th percentile is 0.05. Hence, the 95th percentile
is the value of the variable t that has an area of 0.05 to the right.
That means finding the 95th percentile is looking for the t-value
with an area to the right of 0.05 under a t-distribution with 6
degrees of freedom.
Also, if you look at your illustration of the 5th percentile below you will
realize that the t-value that you are looking for lies between -1 and -2. Hence
its value should be a negative number. But if you observe the body of the table
where t-values are located, you cannot find any negative t-value. The table
gives only positive values of t.
At this point, you need to recall one of the properties of the t-distribution: it is symmetric about zero. That means the right tail of
the distribution is exactly the mirror image of its left tail. So, you can easily find the values in the left tail by relying on this
“symmetry-about-zero” property. Hence, if you are going to find the value of t such that the area to the left of it is 0.05, recall that the area to
the right of 1.943 is also 0.05.
Therefore, since the t-distribution is symmetric about 0, the t-value with an area of 0.05 to its left must be
−1.943. So, the 5th percentile is −1.943.
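When no t-table is at hand, the same percentiles can be approximated numerically. The sketch below is one possible approach, not the module's method: it integrates the t-distribution density with Simpson's rule and then searches for the t-value by bisection. The function names t_pdf, right_tail_area, and t_percentile are assumptions made for illustration, and only the Python standard library is used.

```python
import math

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def right_tail_area(t, df, steps=1000):
    """Area to the right of t (t >= 0), using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3   # half the curve minus the area from 0 to t

def t_percentile(percent, df):
    """t-value that has `percent` percent of the area to its left."""
    if percent < 50:
        return -t_percentile(100 - percent, df)   # symmetry about 0
    target = 1 - percent / 100                    # required right-tail area
    low, high = 0.0, 50.0
    for _ in range(50):                           # bisection search
        mid = (low + high) / 2
        if right_tail_area(mid, df) > target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

p95 = t_percentile(95, 6)   # 95th percentile, 6 degrees of freedom
p5 = t_percentile(5, 6)     # 5th percentile, found through symmetry
print(round(p95, 3), round(p5, 3))
```

The negative percentile is obtained exactly as in the discussion above: by reflecting the positive value about zero.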
4. What is the area to the right of 2.4 under a t-distribution with 7 degrees of freedom?
Remember that in the previous example, you found t-values using the given areas under the t-distribution curve. But in this
example, you will be doing the opposite because in this problem you are given a t-value and you need to find the area to the
right of the t-distribution with 7 degrees of freedom.
You can illustrate the problem with the figure shown below. The t-value of 2.4 is somewhere between 2 and 3, and you are
going to find the area to the right of it.
So, looking back at the table, you need to focus on the 7 degrees of freedom line. You will observe that the t-value of 2.4
cannot be found in this row but you do find these two values 2.365 and 2.517 that surround 2.4 (The t-value 2.4 is between
2.365 and 2.517).
The table tells you that the area to the right of 2.365 is 0.025 and the area to the right of 2.517 is 0.02. You figured out earlier
that the t-value of 2.4 falls between the two values 2.365 and 2.517, so it stands to reason that the area to the right of 2.4 must be
between 0.02 and 0.025.
So, using the table, you found that the area to the right of 2.4 under the t-distribution with 7 degrees of freedom lies
somewhere between 0.02 and 0.025.
If you need to get the exact value, you need to use software that easily calculates the area under the t-distribution curve with
the given t-value and number of degrees of freedom. Using such software, you could find that the area to five decimal places is
0.02373.
What if you needed to use the t-table to find the area to the left of 2.4?
Since the area under the entire curve is 1, the area to the left of 2.4 is equal to 1 minus the area to the right of 2.4. So, based
on the table the area to the left of 2.4 under the t distribution with 7 degrees of freedom must lie somewhere between 0.98 and 0.975 (1
– 0.02 = 0.98 and 1 – 0.025 = 0.975). But since you already knew that the area to the right of 2.4 is 0.02373, you could find the exact
area to the left of 2.4 to five decimal places as 1 minus 0.02373 or 0.97627.
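As one possible stand-in for the software mentioned above, the area can be approximated with the Python standard library by integrating the t-distribution density numerically. The function names and the choice of Simpson's rule are assumptions made for this sketch.

```python
import math

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def area_right_of(t, df, steps=2000):
    """Area to the right of t (t >= 0): 0.5 minus the Simpson's-rule area on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

right = area_right_of(2.4, 7)   # area to the right of 2.4, 7 degrees of freedom
left = 1 - right                # the total area under the curve is 1

print(round(right, 5), round(left, 5))
```

The computed right-tail area falls between the two table values 0.02 and 0.025, as the table-based reasoning predicts.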
Identifying the Length of a Confidence Interval
What is the difference between the Confidence Level and Confidence interval?
The confidence level of an interval estimate of a parameter is the probability that the interval estimate contains the parameter.
It describes what percentage of intervals from many different samples contain the unknown population parameter.
Each confidence level has a corresponding coefficient, called the confidence coefficient. These coefficients are used to
find the margin of error. For instance, the table below shows the confidence coefficient corresponding to each confidence level.
Confidence interval or interval estimate is a range of values that is used to estimate a parameter. This estimate may or may
not contain the true value of the parameter. A confidence interval is written as
Lower limit < µ < Upper limit
or
(Lower limit, Upper limit)
The lower limit is obtained by using the formula LL = X̄ − E, while the upper limit is obtained by using the formula UL = X̄ + E,
where E is the margin of error.
Example:
A random sample of 46 scores from the examination of ABM learners is taken and it gives a sample mean of 78 with the
interval scores between 77.18 and 78.82 having a 90% level of confidence.
What is the confidence interval in the given statement? To find the confidence interval, we have to use Lower limit < µ <
Upper limit. Thus, the confidence interval is 77.18 < µ < 78.82 or (77.18, 78.82).
As we can see, the margin of error is not directly mentioned, but the lower limit and upper limit are there. As mentioned
earlier, the formulas for the upper limit and the lower limit include the margin of error.
LL = X̄ − E
77.18 = 78 − E
E = 0.82
Let’s see if we can get the same value of E if we use the formula for the upper limit.
78.82 = X̄ + E
78.82 = 78 + E
E = 0.82
Therefore, the margin of error is 0.82.
Note: Sometimes, you just need to rearrange the formula to find what is missing.
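The rearrangement in this example can be written as a two-line Python check. The limits 77.18 and 78.82 come from the example; the variable names are illustrative.

```python
lower_limit, upper_limit = 77.18, 78.82

# The sample mean sits at the center of the interval,
# and the margin of error is half of the interval's width.
sample_mean = (lower_limit + upper_limit) / 2
margin_of_error = (upper_limit - lower_limit) / 2

print(round(sample_mean, 2), round(margin_of_error, 2))
```

This works because LL = X̄ − E and UL = X̄ + E place the two limits at equal distances on either side of the sample mean.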
The margin of error is the range of values above and below the sample statistic in a confidence interval.
To compute the margin of error, use the formula given below:
E = zα/2(σ/√n)
where zα/2 is the critical value or confidence coefficient, σ is the population standard deviation, and n is the sample
size.
Consider the given example below.
Example 1:
Isabel owns a shoe store. She used 160 pairs of shoes as her samples for the different designs. The population standard
deviation of the price of the shoes is ₱75. Suppose that Isabel wants a 95% level of confidence to determine the mean price of all her
shoes she is selling. Compute for the margin of error of her estimate.
Solution:
Step 1: Write the given data. n = 160, σ = ₱75, 95% confidence level where zc = 1.96
Step 2: Apply the formula and substitute the given data.
E = zα/2(σ/√n) = (1.96)(75/√160) = 11.62
A confidence interval is written as
X̄ − E < µ < X̄ + E
or
(Lower limit, Upper limit)
The lower limit is obtained by using the formula LL = X̄ − E, while the upper limit is obtained by using the formula UL = X̄ + E,
where E is the margin of error, zα/2 is the confidence coefficient, σ is the population standard deviation, and n is the sample
size.
Example 2:
The population of Sulu Hornbill (one of the endangered bird species in the Philippines) has a standard deviation of 40.
Compute for the length of the confidence interval for a 90% confidence level having a sample size of 150 and a sample mean of 65.
Solution:
Step 3: Compute for the upper and lower limit of the confidence interval.
UL = X̄ + E = 65 + 5.37 = 70.37
LL = X̄ − E = 65 − 5.37 = 59.63
The length of the confidence interval is given by the formula L = 2zα/2(σ/√n) = 2E. Therefore, L = 2(5.37) = 10.74.
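The confidence coefficient itself can be obtained from the confidence level instead of being read from a table. The sketch below uses NormalDist.inv_cdf from Python's standard library with the values of Example 2 (σ = 40, n = 150, 90% confidence); the helper name confidence_coefficient is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def confidence_coefficient(level):
    """z value that leaves (1 - level) / 2 in each tail of the standard normal curve."""
    alpha = 1 - level
    return NormalDist().inv_cdf(1 - alpha / 2)

sigma, n, level = 40, 150, 0.90

z = confidence_coefficient(level)   # about 1.645 for a 90% confidence level
E = z * sigma / sqrt(n)             # margin of error
length = 2 * E                      # length of the confidence interval

print(round(z, 3), round(E, 2), round(length, 2))
```

The same function can be used for confidence levels that are not listed in a table of common coefficients.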
Example 3:
Jennifer wanted to know the average price of the shoes that her customers purchased. She sampled 160 pairs of shoes that were
sold and found out that the mean price is ₱800 with a standard deviation of ₱75. Construct a 95% confidence interval for the
mean price of all shoes that were sold. Compute for the length of the confidence interval.
Solution:
E = zα/2(σ/√n)
E = (1.96)(75/√160)
E = (1.96)(5.929) (use three decimal places for partial answer)
E = 11.62 (round off final answer to two decimal places)
Step 3: Compute for the upper and lower limit of the confidence interval.
UL = X̄ + E = 800 + 11.62 = 811.62
LL = X̄ − E = 800 − 11.62 = 788.38
Step 4: Write the confidence interval.
788.38 < µ < 811.62 or (788.38, 811.62)
This means that we are 95% confident that the true mean lies between 788.38 and 811.62. In the context of this problem, Jennifer can
state that she is 95% confident that the average price of a pair of shoes purchased by her customers lies between ₱788.38 to ₱811.62 or
₱788 to ₱812 when rounded to the nearest peso.
Step 5: Compute for the length of the confidence interval.
L = UL − LL
= 811.62 − 788.38
= 23.24
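The steps of Example 3 can be collected into one short Python sketch. The values come from the example; the rounding to two decimal places follows the example's instruction, and the variable names are illustrative.

```python
from math import sqrt

x_bar, sigma, n = 800, 75, 160   # sample mean, standard deviation, sample size
z = 1.96                         # confidence coefficient for a 95% confidence level

E = round(z * sigma / sqrt(n), 2)   # margin of error
lower = round(x_bar - E, 2)         # lower limit, LL
upper = round(x_bar + E, 2)         # upper limit, UL
length = round(upper - lower, 2)    # length of the confidence interval

print(f"E = {E}, interval = ({lower}, {upper}), length = {length}")
```

Since L = UL − LL = 2E, the length can also be obtained directly by doubling the margin of error.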
I. Directions: Solve each of the following problems. All answers should be rounded to two decimal places.
1. The IQs of Grade 11 students in MAKATAO NATIONAL HIGH SCHOOL were measured and found to be normally distributed
with a mean of 98 and a standard deviation of 8.
a. If a student from the school is chosen at random, what is the probability that his score is higher than 110?
________________________________________________________________________
b. What is the probability that a random sample of 4 students will have an average of above 110?
________________________________________________________________________
2. The mean annual salary of all the frontliners (nurses, medical technologists, radiologic technologists, phlebotomists) in the
Philippines is Php 42,500. Assume that this is normally distributed with a standard deviation of Php 5,600. If a random sample of 25 health
workers is drawn from this population, find the probability that the mean salary of the sample is:
a. between Php 40,400 and Php 45,000?
_________________________________________________________________________
b. greater than Php 41,000?
________________________________________________________________________
3. Find the 90th percentile of the t-distribution if the sample size is 25.
III. Directions: Compute the length of the confidence interval for estimating the population mean using a sample size of 300 and a
standard deviation of 84. Use a 92% confidence level.
Write all the given data based on the problem statement and show the complete solution.
n = __________
σ = __________
zc = __________
formula to be used: ________________
solution: