MODULE STATISTICAL ANALYSIS WITH SOFTWARE APPLICATION – CAE11
CHAPTER 3: Data Collection and Basic Concepts in Sampling Design
Objectives:
1. Determine the sources of data (primary and secondary data).
2. Determine the appropriate sample size.
3. Differentiate various sampling techniques.
BASIC SAMPLING DESIGN
The goal in sampling is to obtain individuals for a study in such a way that accurate information about the
population can be obtained.
Reason for Sampling
- It is important that the individuals included in a sample represent a cross section of individuals in the population.
- If a sample is not representative, it is biased, and you cannot generalize to the population from your statistical data.
Some definitions are needed to make the notion of a good sample more precise.
Definitions:
• Observation unit - An object on which a measurement is taken. This is the basic unit of observation,
sometimes called an element. In studying human populations, observation units are often individuals.
• Target population - The complete collection of observations we want to study.
• Sampled population - The collection of all possible observation units that might have been chosen in a
sample; the population from which the sample was taken.
• Sample - A subset of a population.
• Sampling unit - A unit that can be selected for a sample. We may want to study individuals, but do not
have a list of all individuals in the target population. Instead, households serve as the sampling units,
and the observation units are the individuals living in the households.
• Sampling frame - A list, map, or other specification of sampling units in the population from which a
sample may be selected. For a survey using in-person interviews, the sampling frame might be a list of
all street addresses.
• Sampling technique/Sampling Strategies - It is a plan you set forth to be sure that the sample you use
in your research study represents the population from which you drew your sample.
• Sampling Bias - Problems in your sampling that make your sample unrepresentative of your population.
The following examples indicate some ways in which selection bias can occur:
Deliberately or purposively selecting a “representative” sample.
Misspecifying the target population.
Failing to include all of the target population in the sampling frame, called undercoverage.
Including population units in the sampling frame that are not in the target population, called
overcoverage.
Having multiplicity of listings in the sampling frame.
Substituting a convenient member of a population for a designated member who is not readily
available.
Failing to obtain responses from all of the chosen sample. (Nonresponse)
Allowing the sample to consist entirely of volunteers.
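Three of the frame problems listed above (undercoverage, overcoverage, and multiplicity of listings) can be checked by comparing the sampling frame against the target population. The Python sketch below is only an illustration: the function name check_frame and the student IDs are hypothetical, and in practice a full list of the target population is often not available for this kind of comparison.

```python
from collections import Counter

def check_frame(target_population, sampling_frame):
    """Compare a sampling frame against the target population."""
    target = set(target_population)
    frame_counts = Counter(sampling_frame)
    frame = set(frame_counts)
    return {
        # Units in the target population that the frame misses.
        "undercoverage": sorted(target - frame),
        # Units in the frame that are not in the target population.
        "overcoverage": sorted(frame - target),
        # Units listed more than once in the frame.
        "multiple_listings": sorted(u for u, c in frame_counts.items() if c > 1),
    }

# Hypothetical IDs, for illustration only.
target_population = ["S01", "S02", "S03", "S04", "S05"]
sampling_frame = ["S01", "S02", "S02", "S04", "X99"]

report = check_frame(target_population, sampling_frame)
print(report)
```

Here S03 and S05 are missing from the frame, X99 does not belong to the target population, and S02 is listed twice.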
Advantages of Sampling Over Complete Enumeration
• Less Labor
• Reduced Cost
• Greater Speed
• Greater Scope
• Greater Efficiency and Accuracy
• Convenience
• Ethical Considerations
Two Types of Samples
1. Probability Sample
- Samples are obtained using some objective chance mechanism, thus involving randomization.
- They require the use of a complete listing of the elements of the universe called the sampling frame.
- The probabilities of selection are known.
- They are generally referred to as random samples.
- They allow drawing of valid generalizations about the universe/population.
2. Non-probability Sample
- Samples are obtained haphazardly, selected purposively or are taken as volunteers.
- The probabilities of selection are unknown.
- They should not be used for statistical inference.
Sampling Procedure
- Identify the population.
- Determine if the population is accessible.
- Select a sampling method.
- Choose a sample that is representative of the population.
- Ask the question: can I generalize to the general population from the accessible population?
Sampling techniques can be grouped according to how the items are selected: probability sampling and
non-probability sampling.
Basic Sampling Techniques of Probability Sampling
• Simple Random Sampling
- Most basic method of drawing a probability sample.
- Assigns equal probabilities of selection to each possible sample of the same size.
- Results in a simple random sample.
Advantage: It is very simple and easy to use.
Disadvantage: The sample chosen may be distributed over a wide geographic area.
When to use: This is preferable if the population is not widely spread geographically. It is also more
appropriate if the population is more or less homogeneous with respect to the characteristics under
study.
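A simple random sample can be drawn with a random number generator once a sampling frame is available. The sketch below uses Python's standard random module. The frame of 500 student labels and the function name simple_random_sample are hypothetical and serve only as an illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the frame without replacement.

    Every possible sample of size n has the same chance of selection.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame of 500 students.
frame = [f"student_{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(frame, 50, seed=1)
print(len(sample), sample[:5])
```

Fixing the seed is optional. It is used here only so that the same sample can be reproduced when the draw has to be documented.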
[Figure: Simple Random Sampling]
Systematic Random Sampling
- It is obtained by selecting every kth individual from the population.
- The first individual selected corresponds to a random number between 1 and k.
Obtaining a Systematic Random Sample
1. Decide on a method of assigning a unique serial number, from 1 to N, to each one of the elements in
the population.
2. Compute the sampling interval k = N/n, where n is the desired sample size.
3. Select a number i, from 1 to k, using a randomization mechanism. The element in the population
assigned to this number is the first element of the sample. The other elements of the sample are those
assigned to the numbers i + k, i + 2k, and so on until you get a sample of size n.
Example:
We want to select a sample of 50 students from 500 students. Under this method, the sampling interval
is k = 500/50 = 10, so every 10th item is picked from the sampling frame.
We start the sample at a random number i between 1 and 10 and take every kth unit after it. Suppose the
random number i is 6; then we select 6, 16, 26, 36, 46, ..., 496.
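The procedure can be sketched in Python as follows. This is a minimal illustration that assumes the elements are numbered 1 to N and that the sampling interval is taken as the whole number N // n. The function name systematic_sample and its parameters are hypothetical.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the serial numbers selected by systematic sampling.

    N: population size, n: sample size.
    start: random start between 1 and k; drawn at random if not given.
    """
    k = N // n  # sampling interval (assumes N is at least n)
    if start is None:
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    # start, start + k, start + 2k, ... until n units are selected
    return [start + j * k for j in range(n)]

# 50 students out of 500, with a random start of 6.
selected = systematic_sample(N=500, n=50, start=6)
print(selected[:5])   # [6, 16, 26, 36, 46]
print(selected[-1])   # 496
```

With N = 500 and n = 50 the interval is k = 10, so a random start of 6 selects the students numbered 6, 16, 26, and so on up to 496.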
Advantage: Drawing of the sample is easy. It is easy to administer in the field, and the sample is spread
evenly over the population.
Disadvantage: May give poor precision when unsuspected periodicity is present in the population.
When to use: This is advisable to use if the ordering of the population is essentially random and when
stratification with numerous data is used.
Stratified Random Sampling
- It is obtained by separating the population into non-overlapping groups called strata and then obtaining a simple
random sample from each stratum.
- The individuals within each stratum should be homogeneous (or similar) in some way.
Example:
A sample of 50 students is to be drawn from a population of 500 students belonging to two institutions, A
and B. Institution A has 200 students and institution B has 300. How will you draw the sample
using proportional allocation?
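One way to work this out, taking proportional allocation to mean that each stratum's share of the sample equals its share of the population (sample size for a stratum = n × stratum size / N): institution A gets 50 × 200/500 = 20 students and institution B gets 50 × 300/500 = 30 students. A simple random sample of that size is then drawn within each institution. The Python sketch below illustrates both steps. The function names and student labels are hypothetical.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Return the sample size for each stratum: n * N_h / N."""
    N = sum(stratum_sizes.values())
    # round() is a simplification; with awkward sizes the rounded values
    # may not add up to n exactly and would need manual adjustment.
    return {name: round(n * size / N) for name, size in stratum_sizes.items()}

def stratified_sample(strata, allocation, seed=None):
    """Draw a simple random sample of the allocated size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, allocation[name])
            for name, units in strata.items()}

# Hypothetical student labels for institutions A (200) and B (300).
strata = {
    "A": [f"A{i:03d}" for i in range(1, 201)],
    "B": [f"B{i:03d}" for i in range(1, 301)],
}
sizes = {name: len(units) for name, units in strata.items()}
allocation = proportional_allocation(sizes, n=50)
sample = stratified_sample(strata, allocation, seed=1)
print(allocation)  # {'A': 20, 'B': 30}
```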
Advantage: Stratification improves the precision of the estimates of the characteristics of the population.
Sampling designs may vary by stratum to adjust for the differences in the conditions across strata. It
is easy to use as a random sampling design.
Disadvantage: Values of the stratification variable may not be easily available for all units in the population, especially
if the characteristic of interest is homogeneous. It is possible that one or two strata have no representatives in the sample.
Also, transportation costs can be high if the population covers a wide geographic area.
When to use: If the distribution of the characteristics of the respondents under consideration is
concentrated in small and scattered segments of the population. This design is also preferred if precise estimates are
desired for stratified parts of the population and if sampling problems differ in the various strata of the population.
[Figure: Stratified Random Sampling]
Cluster Sampling
- You take the sample from naturally occurring groups in your population.
- The clusters are constructed such that the sampling units are heterogeneous within the cluster and homogeneous
among the clusters.
Obtaining a Cluster Sample
1. Divide the population into non-overlapping clusters.
2. Number the clusters in the population from 1 to N.
3. Select n distinct numbers from 1 to N using a randomization mechanism. The selected clusters are the
clusters associated with the selected numbers.
4. The sample will consist of all the elements in the selected clusters.
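The four steps above can be sketched in Python as follows. The clusters used here (class sections with student labels) and the function name cluster_sample are hypothetical and serve only as an illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and keep every element in them.

    clusters: dict mapping a cluster label to the list of its elements.
    """
    rng = random.Random(seed)
    # Steps 2 and 3: list the clusters, then pick n_clusters distinct ones.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Step 4: the sample is all the elements of the selected clusters.
    elements = [unit for label in chosen for unit in clusters[label]]
    return chosen, elements

# Step 1: a hypothetical population divided into non-overlapping clusters.
clusters = {
    f"Section {s}": [f"{s}-{i:02d}" for i in range(1, 31)]
    for s in "ABCDEF"
}
chosen, elements = cluster_sample(clusters, n_clusters=2, seed=1)
print(chosen, len(elements))
```

Only the list of clusters is needed to make the selection. The elements are listed only for the clusters that end up in the sample.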
Example:
A researcher wants to survey academic performance of high school students in MIMAROPA.
1. The researcher can divide the entire population into different clusters.
2. Then the researcher selects a number of clusters, depending on the research, through simple or
systematic random sampling.
3. Then, from the selected clusters, the researcher can either include all the high school students as
subjects or select a number of subjects from each cluster through simple or systematic random
sampling.
Advantage: There is no need to come up with a list of units in the population; all that is needed is a
list of the clusters. It is also less costly since the elements are physically closer together.
Disadvantage: In actual field applications, adjacent households tend to have more similar characteristics than
households far apart, so a cluster sample may be less precise than a simple random sample of the same size.
When to use: If the population can be grouped into clusters where individual population elements are known
to be different with respect to the characteristics under study, this method is preferable.
Multi-Stage Sampling
- Selection of the sample is done in two or more steps or stages, with sampling units varying in each stage.
- The population is first divided into a number of first-stage sampling units from which a sample is drawn.
Smaller units, called the secondary sampling units, comprising the selected first-stage units then serve as the
sampling units for the next stage. If needed, additional stages may be added until the units of observation for the
survey are clearly identified. The units comprising the samples selected from the previous stage constitute the frame
for the next stage.
Obtaining a Multi-Stage Sample
1. Organize the sampling process into stages where the unit of analysis is systematically grouped.
2. Select a sampling technique for each stage.
3. Systematically apply the sampling technique to each stage until the unit of analysis has been selected.
Example:
Suppose we wish to study the expenditure patterns of households in NCR. We can select a sample of households for
this study using simple three-stage sampling.
• First, NCR is divided into cities/municipalities, and a random sample of these cities/municipalities is drawn.
• Second, a random sample of smaller areas such as barangays is taken from within each of the
cities/municipalities chosen in the first stage.
• Third, a random sample of households is taken from within each of the areas
chosen in the second stage.
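The three stages can be sketched in Python as nested random draws. This is only an illustration: the region below is made-up data (4 cities with 5 barangays of 20 households each), the function name three_stage_sample is hypothetical, and simple random sampling is assumed at every stage.

```python
import random

def three_stage_sample(region, n_cities, n_barangays, n_households, seed=None):
    """Sample cities, then barangays within them, then households.

    region: dict of city -> dict of barangay -> list of households.
    """
    rng = random.Random(seed)
    selected = []
    # Stage 1: random sample of cities/municipalities.
    for city in rng.sample(sorted(region), n_cities):
        barangays = region[city]
        # Stage 2: random sample of barangays within each selected city.
        for brgy in rng.sample(sorted(barangays), n_barangays):
            # Stage 3: random sample of households within each barangay.
            for household in rng.sample(barangays[brgy], n_households):
                selected.append((city, brgy, household))
    return selected

# Made-up region: 4 cities x 5 barangays x 20 households.
region = {
    f"City{c}": {
        f"Brgy{c}-{b}": [f"HH{c}-{b}-{h}" for h in range(1, 21)]
        for b in range(1, 6)
    }
    for c in range(1, 5)
}
sample = three_stage_sample(region, n_cities=2, n_barangays=2,
                            n_households=5, seed=0)
print(len(sample))  # 2 x 2 x 5 = 20 households
```

A list of households is needed only for the barangays selected in the second stage, which is why sampling frames are easier to build under this design.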
Advantage: It is easier to generate adequate sampling frames. Transportation costs are greatly reduced since there
is some form of clustering among the ultimate or final samples; i.e., they are in the sample lower-stage units.
Disadvantage: Its complexity in theory may be difficult to apply in the field. Estimation procedures may be difficult for
non-statisticians to follow.
When to use: If no population list is available and if the population covers a wide area.
Take Note!
Use probability sampling if the main objective of the sample survey is making inferences about the characteristics of
the population under study.
[Figure: Multi-Stage Sampling]
Basic Sampling Techniques of Non-Probability Sampling
• Accidental Sampling - There is no system of selection; the sample consists only of those whom the researcher or
interviewer meets by chance.
• Quota Sampling - A specified number of persons of certain types is included in the sample. The
researcher is aware of categories within the population and draws samples from each category. The size of each
categorical sample is proportional to the proportion of the population that belongs in that category.
• Convenience Sampling - It is a process of picking out people in the most convenient and fastest way
to get reactions immediately. This method can be done by telephone interview to get the immediate reactions of
a certain group to a certain issue.
• Purposive Sampling - It is based on certain criteria laid down by the researcher. People who satisfy the
criteria are interviewed. It is used to determine the target population of those who will be taken for the study.
• Judgement Sampling - It selects a sample in accordance with an expert’s judgment.
Cases wherein Non-Probability Sampling is Useful
- Only a few people are willing to be interviewed.
- Extreme difficulties in locating or identifying subjects
- Probability sampling is more expensive to implement
- Cannot enumerate the population elements.
Sources of Errors in Sampling
1. Non-sampling Error
- Errors that result from the survey process.
- Any errors that cannot be attributed to the sample-to-sample variability.
Sources of Non-Sampling Error
1. Non-responses
2. Interviewer Error
3. Misrepresented Answers
4. Data entry errors
5. Questionnaire Design
6. Wording of Questions
7. Selection Bias
2. Sampling Error
- Error that results from taking one sample instead of examining the whole population.
- Error that results from using sampling to estimate information regarding a population.
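Sampling error can be seen by drawing many samples from the same population and comparing each sample mean with the population mean. The sketch below is a small simulation under stated assumptions: the population is 5,000 made-up scores generated from a normal distribution, and the function name sampling_errors is hypothetical.

```python
import random
import statistics

rng = random.Random(42)

# Made-up population of 5,000 scores, for illustration only.
population = [rng.gauss(75, 10) for _ in range(5000)]

def sampling_errors(population, n, repeats, rng):
    """Return (sample mean - population mean) for repeated random samples."""
    mu = statistics.mean(population)
    return [statistics.mean(rng.sample(population, n)) - mu
            for _ in range(repeats)]

errors_small = sampling_errors(population, n=25, repeats=200, rng=rng)
errors_large = sampling_errors(population, n=400, repeats=200, rng=rng)

# Spread of the sampling error for each sample size.
print(statistics.pstdev(errors_small))
print(statistics.pstdev(errors_large))
```

Each sample gives a slightly different mean even though the population never changes. This sample-to-sample variability is the sampling error, and it tends to shrink as the sample size grows. Non-sampling errors such as nonresponse or data entry mistakes are not captured by this simulation.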
For more knowledge about this lesson, please check the link provided:
https://www.youtube.com/watch?v=5PsF5MsrCOo