What is Sampling
Sampling is a process used in statistical analysis in which a predetermined
number of observations are taken from a larger population. The methodology
used to sample from a larger population depends on the type of analysis being
performed but may include simple random sampling or systematic sampling.
Difference Between Census and Sampling
Census and sampling are two methods of collecting survey data about a population, and both are
used by many countries. A census is a quantitative research method in which all the
members of the population are enumerated. Sampling, on the other hand, is a widely
used method in statistical testing in which a subset is selected from a large population
to represent the entire group.
A census implies complete enumeration of the study objects, whereas sampling implies
enumeration of only the subgroup of elements chosen for participation. These two survey methods
are often contrasted with each other, and the sections below set out the differences
between census and sampling in detail.
Definition of Census
A well-organised procedure of gathering, recording and analysing information regarding the
members of the population is called a census. It is an official and complete count of the
universe, wherein each and every unit of the universe is included in the collection of data. Here,
universe means any region (a city or country) or group of people from which the data can be
acquired.
Under this technique, every member of the population is enumerated. Hence this method
requires a great deal of money, time and labour for gathering information. This method is
useful for finding out, for example, the ratio of males to females, the ratio of literate
to illiterate people, or the ratio of people living in urban areas to people in rural areas.
 BASIS FOR COMPARISON   CENSUS                               SAMPLING
 Meaning                A systematic method that             Sampling refers to a portion of
                        collects and records data            the population selected to
                        about every member of the            represent the entire group in
                        population is called a census.       all its characteristics.
 Enumeration            Complete                             Partial
 Study of               Each and every unit of the           Only a handful of units of the
                        population.                          population.
 Time required          It is a time-consuming process.      It is a fast process.
 Cost                   Expensive method                     Economical method
 Results                Reliable and accurate                Less reliable and accurate, due
                                                             to the margin of error in the
                                                             data collected.
 Sampling error         Not present.                         Present; depends on the size of
                                                             the sample.
 Appropriate for        Population of heterogeneous          Population of homogeneous
                        nature.                              nature.
Sampling: Definition
Sampling is defined as the process of selecting certain members or a subset of the population
to make statistical inferences from them and to estimate characteristics of the whole
population. Sampling is widely used by researchers in market research so that they do not need
to research the entire population to collect actionable insights. It is also a time-saving and
cost-effective method and hence forms the basis of any research design.
For example, if a drug manufacturer would like to research the adverse side effects of a drug on
the population of the country, it is close to impossible to conduct a research study
that involves everyone. In this case, the researcher selects a sample of people from
each demographic and then conducts the research on them, which gives indicative
feedback on the behavior of the drug in the population.
Types of Sampling: Sampling Methods 
Sampling methods used in market research fall into two essential types. They are:
   1. Probability Sampling: Probability sampling is a sampling method that selects
       members of a population at random according to a few fixed selection criteria. These
       selection parameters give every member a known chance (in the simplest designs, an
       equal chance) of being part of the sample.
   2. Non-probability Sampling: Non-probability sampling relies on the researcher’s
       judgment, rather than random selection, to choose members. There is no fixed or
       pre-defined selection process, so not all elements of a population have an equal
       opportunity to be included in a sample.
In this blog, we discuss the various probability and non-probability sampling methods that can
be implemented in any market research study.
Types of Sampling: Probability Sampling Methods
Probability Sampling is a sampling technique in which samples from a larger population are
chosen using a method based on the theory of probability. This sampling method considers
every member of the population and forms samples on the basis of a fixed process. For
example, in a population of 1000 members, each member has a 1/1000 chance of
being picked on any single draw. This reduces selection bias and gives all members a fair
chance of being included in the sample.
There are 4 types of probability sampling techniques:
      Simple Random Sampling: One of the best probability sampling techniques for
       saving time and resources is the Simple Random Sampling method. It is a trustworthy
       method of obtaining information in which every member of the sample is chosen
       randomly, merely by chance, and each individual has exactly the same probability of
       being chosen to be a part of the sample.
For example, in an organization of 500 employees, the HR team could choose participants for a
team building activity by picking chits out of a bowl. In this
case, each of the 500 employees has an equal opportunity of being selected.
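The chit-drawing example above can be sketched in a few lines of Python. This is only an illustration: the employee IDs, the team size of 20 and the function name `simple_random_sample` are assumptions made for the sketch, not part of the original example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member is equally likely to be picked."""
    rng = random.Random(seed)  # the seed only makes the draw reproducible
    return rng.sample(population, n)

# Hypothetical employee IDs 1..500, mirroring the bowl of chits.
employees = list(range(1, 501))
team = simple_random_sample(employees, 20, seed=42)
print(team)
```

Sampling is done without replacement, so no employee can be drawn twice, just as a chit leaves the bowl once picked.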
      Cluster Sampling: Cluster sampling is a method where the researchers divide the entire
       population into sections or clusters that represent the population. Clusters are identified
       and included in a sample on the basis of defining demographic parameters such as age,
       location, sex etc., which makes it easier for a survey creator to derive effective
       inferences from the feedback.
For example, if the government of the United States wishes to evaluate the number of
immigrants living in the US, it can divide the country into clusters on the basis of states such
as California, Texas, Florida, Massachusetts, Colorado, Hawaii etc. This way of conducting a
survey will be more effective, as the results will be organized by state and provide insightful
immigration data.
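A minimal sketch of the idea follows. It assumes the usual textbook form of cluster sampling, in which a few whole clusters are picked at random and everyone inside them is surveyed; the state rosters, their sizes and the function name `cluster_sample` are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters whole clusters at random and return all their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical residents grouped by state (names and sizes are made up).
states = {
    "California": [f"CA-{i}" for i in range(40)],
    "Texas": [f"TX-{i}" for i in range(30)],
    "Florida": [f"FL-{i}" for i in range(20)],
    "Colorado": [f"CO-{i}" for i in range(10)],
}
chosen_states, respondents = cluster_sample(states, 2, seed=1)
print(chosen_states, len(respondents))
```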
      Systematic Sampling: Using the systematic sampling method, members of a sample are
       chosen at regular intervals from a population. It requires selecting a starting point for
       the sample and a fixed interval, derived from the sample size, that is repeated through
       the population. Because the interval is predefined, this sampling technique is among the
       least time-consuming.
For example, a researcher intends to collect a systematic sample of 500 people from a population
of 5000. Each element of the population will be numbered from 1 to 5000 and every 10th
individual will be chosen to be a part of the sample (Total population / Sample size = 5000/500 =
10).
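The interval calculation above can be sketched as follows. The random starting point within the first interval is a common convention assumed here, and `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(population, n, seed=None):
    """Select every k-th member, k = len(population) // n, from a random start."""
    k = len(population) // n        # sampling interval: 5000 // 500 = 10
    rng = random.Random(seed)
    start = rng.randrange(k)        # random start within the first interval
    return population[start::k][:n]

people = list(range(1, 5001))       # population numbered 1 to 5000
chosen = systematic_sample(people, 500, seed=7)
print(chosen[:5])
```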
      Stratified Random Sampling: Stratified random sampling is a method where the
       population is divided into smaller groups that don’t overlap but together represent the
       entire population. While sampling, the researcher organizes these groups and then
       draws a sample from each group separately.
For example, a researcher looking to analyze the characteristics of people belonging to different
annual income divisions will create strata (groups) according to annual family income, such as
less than $20,000, $21,000 to $30,000, $31,000 to $40,000, $41,000 to $50,000 etc. People
belonging to different income groups can then be observed to draw conclusions about which income
strata have which characteristics. Marketers can analyze which income groups to target and
which ones to eliminate in order to create a roadmap that is more likely to bear fruitful results.
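As a rough sketch of the income example, the code below groups synthetic incomes into strata and draws a simple random sample from each. The band boundaries, the synthetic data and the names `stratified_sample` and `income_band` are assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, per_stratum, seed=None):
    """Group records into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)
    return {name: rng.sample(members, min(per_stratum, len(members)))
            for name, members in strata.items()}

def income_band(income):
    """Assumed, non-overlapping bands loosely following the example above."""
    if income < 20_000:
        return "under 20k"
    if income < 30_000:
        return "20k-30k"
    if income < 40_000:
        return "30k-40k"
    return "40k and over"

data_rng = random.Random(0)
incomes = [data_rng.randint(10_000, 60_000) for _ in range(1000)]  # synthetic data
picked = stratified_sample(incomes, income_band, per_stratum=25, seed=0)
print({band: len(values) for band, values in picked.items()})
```

Drawing the same number from every stratum is only one option; the sample sizes can also be made proportional to the size of each stratum.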
Use of the Probability Sampling Method
There are multiple uses of the probability sampling method. They are:
      Reduce Sample Bias: Using the probability sampling method, the bias in the sample
       derived from a population is greatly reduced. Because selection follows a fixed random
       procedure, the sample does not simply reflect the expectations of the researcher.
       Probability sampling leads to higher quality data collection as the population is
       appropriately represented by the sample.
      Diverse Population: When the population is large and diverse, it is important to have
       adequate representation so that the data is not skewed towards one demographic. For
       example, if Square would like to understand the people that could use their point-of-sale
       devices, a survey conducted on a sample of people across the US from different industries
       and socio-economic backgrounds helps.
      Create an Accurate Sample: Probability sampling helps the researchers plan and create
       an accurate sample. This helps to obtain well-defined data.
Types of Sampling: Non-probability Sampling Methods
The non-probability method is a sampling method that involves a collection of feedback on the
basis of a researcher or statistician’s sample selection capabilities and not on a fixed selection
process. In most situations, a survey conducted with a non-probability sample leads to
skewed results, which may not fully represent the desired target population. However, there are
situations, such as the preliminary stages of research or where there are cost constraints on
conducting research, where non-probability sampling will be much more effective than the
other type.
There are 4 types of non-probability sampling, which explain the purpose of this sampling
method in a better manner:
      Convenience sampling: This method is dependent on the ease of access to subjects, such
       as surveying customers at a mall or passers-by on a busy street. It is termed
       convenience sampling because it is carried out on the basis of how easy it is for a
       researcher to get in touch with the subjects. Researchers have almost no control over
       selecting elements of the sample, and selection is done purely on the basis of proximity
       and not representativeness. This non-probability sampling method is used when there are
       time and cost limitations in collecting feedback, for example in the initial stages of
       research when resources are limited.
For example, startups and NGOs usually conduct convenience sampling at a mall to distribute
leaflets about upcoming events or to promote a cause. They do that by standing at the entrance
of the mall and giving out pamphlets to whoever passes by.
      Judgmental or Purposive Sampling: In judgmental or purposive sampling, the sample
       is formed at the discretion of the researcher, considering purely the purpose of the study
       along with an understanding of the target audience. In this method, also known as
       deliberate sampling, participants are selected solely on the basis of research
       requirements, and elements that do not serve the purpose are kept out of the sample.
       For instance, researchers may want to understand the thought process of people who are
       interested in studying for their master’s degree. The selection criterion will be: “Are
       you interested in studying for Masters in …?” and those who respond with a “No” will be
       excluded from the sample.
      Snowball sampling: Snowball sampling is a sampling method used in studies
       of subjects who are difficult to trace. For
       example, it will be extremely challenging to survey homeless people or illegal
       immigrants. In such cases, using the snowball approach, researchers track down a few
       members of that category to interview, and those participants lead them to others. This
       sampling method is also implemented in situations where the topic is highly sensitive and
       not openly discussed, such as surveys to gather information about HIV/AIDS. Not many
       affected people will readily respond to the questions, but researchers can contact people
       they might know, or volunteers associated with the cause, to get in touch with them and
       collect information.
      Quota sampling: In quota sampling, members are selected
       on the basis of a pre-set standard. Because the sample is formed on the basis of
       specific attributes, the created sample will have the same attributes that are found in
       the total population. It is an extremely quick method of collecting samples.
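Quota sampling can be sketched as filling pre-set quotas in whatever order respondents happen to arrive. The age groups, quota sizes and the helper name `quota_sample` below are hypothetical and serve only to illustrate the mechanism.

```python
def quota_sample(arrivals, group_of, quotas):
    """Accept respondents in arrival order until each group's quota is filled."""
    remaining = dict(quotas)
    accepted = []
    for person in arrivals:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            accepted.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota is full; stop recruiting
    return accepted

# Hypothetical stream of (name, age_group) pairs in the order they walk by.
arrivals = [("p%d" % i, "18-34" if i % 3 else "35+") for i in range(30)]
quota = quota_sample(arrivals, lambda p: p[1], {"18-34": 6, "35+": 4})
print(quota)
```

No random draw is involved: whoever arrives first fills the quota, which is what makes this a non-probability method.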
Use of the Non-Probability Sampling Method
There are multiple uses of the non-probability sampling method. They are:
      Create a hypothesis: The non-probability sampling method is used to create a
       hypothesis when little or no prior information is available. This method helps with
       immediate return of data and helps to build a base for any further research.
      Exploratory research: This sampling technique is widely used when researchers aim at
       conducting qualitative research, pilot studies or exploratory research.
      Budget and time constraints: The non-probability method is used when there are budget
       and time constraints and some preliminary data has to be collected. Since the survey
       design is not rigid, it is easier to pick whichever respondents are available and have
       them take the survey or questionnaire.
Difference between Probability Sampling and Non-Probability Sampling Methods
We have looked at the different types of sampling methods above and their subtypes. To
encapsulate the whole discussion though, the major differences between probability sampling
methods and non-probability sampling methods are as below:
                   Probability Sampling Methods            Non-Probability Sampling Methods
 Definition        Probability sampling is a sampling      Non-probability sampling is a sampling
                   technique in which samples from a       technique in which the researcher selects
                   larger population are chosen using a    samples based on subjective
                   method based on the theory of           judgment rather than random
                   probability.                            selection.
 Alternatively     Random sampling method.                 Non-random sampling method.
 known as
 Population        The population is selected randomly.    The population is selected arbitrarily.
 selection
 Market research   The research is conclusive in nature.   The research is exploratory in nature.
 Sample            Since there is a method for deciding    Since the sampling method is arbitrary,
                   the sample, the population              the representation of the population
                   demographics are conclusively           demographics is almost always skewed.
                   represented.
 Time taken        Takes longer to conduct, since          This type of sampling is quick,
                   the research design defines the         since neither the sample nor its
                   selection parameters before the         selection criteria are defined in
                   market research study begins.           advance.
 Results           This type of sampling is designed to    This type of sampling is prone to bias
                   be unbiased and hence the results are   and hence the results are biased too,
                   unbiased too and conclusive.            rendering the research speculative.
 Hypothesis        In probability sampling, there is an    In non-probability sampling, the
                   underlying hypothesis before the        hypothesis is derived after conducting
                   study begins and the objective of       the research study.
                   this method is to test the
                   hypothesis.
Essentials of sampling
The following are some of the essentials of sampling.
1. The sample selected should be representative of the entire population. This may be achieved
by using the random sampling method.
2. The size of the sample must also be adequate. The larger the size of the sample, the greater
will be the accuracy of the results.
3. All the units of the universe should have the same chance of getting selected. The researcher
should not use their own judgement in selecting the sample.
4. There should be no basic difference in the nature of the units of the universe.
Types of sampling
1. Simple random sampling
In this case each individual is chosen entirely by chance and each member of the population has
an equal chance, or probability, of being selected. One way of obtaining a random sample is to
give each individual in a population a number, and then use a table of random numbers to
decide which individuals to include. For example, if you have a sampling frame of 1000
individuals, labelled 0 to 999, use groups of three digits from the random number table to pick
your sample. So, if the first three digits from the random number table were 094, select the
individual labelled “94”, and so on.
As with all probability sampling methods, simple random sampling allows the sampling error to
be calculated and reduces selection bias. A specific advantage is that it is the most
straightforward method of probability sampling. A disadvantage of simple random sampling is
that you may not select enough individuals with your characteristic of interest, especially if that
characteristic is uncommon. It may also be difficult to define a complete sampling frame and
inconvenient to contact the selected individuals, especially if different forms of contact are
required (email, phone, post) and your sample units are scattered over a wide geographical area.
  Restricted Random Sampling
There are three methods of restricted random sampling each of which is briefly explained
below:
    1) systematic
    2) stratified
    3) cluster
2. Systematic sampling
Individuals are selected at regular intervals from the sampling frame. The intervals are chosen
to ensure an adequate sample size. If you need a sample size n from a population of size x, you
should select every (x/n)th individual for the sample. For example, if you wanted a sample size of
100 from a population of 1000, select every 1000/100 = 10th member of the sampling frame.
Systematic sampling is often more convenient than simple random sampling, and it is easy to
administer. However, it may also lead to bias, for example if there are underlying patterns in
the order of the individuals in the sampling frame, such that the sampling technique coincides
with the periodicity of the underlying pattern. As a hypothetical example, if a group of students
were being sampled to gain their opinions on college facilities, but the Student Record
Department’s central list of all students was arranged such that the sex of students alternated
between male and female, choosing an even interval (e.g. every 20th student) would result in a
sample of all males or all females. Whilst in this example the bias is obvious and should be
easily corrected, this may not always be the case.
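The periodicity problem in the student example can be reproduced with a small sketch. The alternating roster is synthetic, and the helper name `every_kth` and the comparison interval of 21 are assumptions chosen for illustration.

```python
# Hypothetical student list in which sex strictly alternates: M, F, M, F, ...
students = [("student%d" % i, "M" if i % 2 == 0 else "F") for i in range(1000)]

def every_kth(frame, k, start=0):
    """Systematic sample: take every k-th entry beginning at index start."""
    return frame[start::k]

even_interval = every_kth(students, 20)  # even interval, in step with the pattern
odd_interval = every_kth(students, 21)   # odd interval, out of step with it

print({sex for _, sex in even_interval})  # only one sex is ever selected
print({sex for _, sex in odd_interval})   # both sexes are selected
```

With the even interval, every selected index has the same parity, so the sample contains one sex only; which one depends solely on the starting point.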
 
3. Stratified sampling
In this method, the population is first divided into subgroups (or strata) whose members all share
a similar characteristic. It is used when we might reasonably expect the measurement of interest
to vary between the different subgroups, and we want to ensure representation from all the
subgroups. For example, in a study of stroke outcomes, we may stratify the population by sex,
to ensure equal representation of men and women. The study sample is then obtained by
taking equal sample sizes from each stratum. In stratified sampling, it may also be appropriate
to choose non-equal sample sizes from each stratum. For example, in a study of the health
outcomes of nursing staff in a county, if there are three hospitals each with different numbers
of nursing staff (hospital A has 500 nurses, hospital B has 1000 and hospital C has 2000), then it
would be appropriate to choose the sample numbers from each hospital proportionally (e.g. 10
from hospital A, 20 from hospital B and 40 from hospital C). This ensures a more realistic and
accurate estimation of the health outcomes of nurses across the county, whereas taking equal
sample sizes from each hospital would over-represent nurses from hospitals A and B. The fact
that the sample was stratified should be taken into account at the analysis stage.
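The proportional allocation in the hospital example works out as below. The total sample of 70 is simply the sum of the figures quoted above, and `proportional_allocation` is a hypothetical helper name.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    # Rounding can leave the parts one or two off the total in general;
    # for these figures the shares are exact.
    return {name: round(total_sample * size / population)
            for name, size in stratum_sizes.items()}

nurses = {"A": 500, "B": 1000, "C": 2000}
allocation = proportional_allocation(nurses, 70)
print(allocation)  # {'A': 10, 'B': 20, 'C': 40}
```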
Stratified sampling improves the accuracy and representativeness of the results by reducing
sampling bias. However, it requires knowledge of the appropriate characteristics of the
sampling frame (the details of which are not always available), and it can be difficult to decide
which characteristic(s) to stratify by.
 
4. Multi-stage sampling
Multi-stage sampling (also known as multi-stage cluster sampling) is a more complex form
of cluster sampling which contains two or more stages in sample selection. In simple terms, in
multi-stage sampling large clusters of population are divided into smaller clusters in several
stages in order to make primary data collection more manageable. It has to be acknowledged
that multi-stage sampling is not as effective as true random sampling; however, it addresses
certain disadvantages associated with true random sampling such as being overly expensive
and time-consuming.
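A two-stage version of this idea might look like the sketch below: clusters are sampled first, then units within each chosen cluster. The districts, household IDs and the function name `two_stage_sample` are invented for illustration, and real designs may use more stages or unequal selection probabilities.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample units inside each chosen cluster."""
    rng = random.Random(seed)
    stage1 = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name],
                             min(n_per_cluster, len(clusters[name])))
            for name in stage1}

# Hypothetical districts, each holding 50 household IDs.
districts = {f"district{d}": [f"d{d}-h{h}" for h in range(50)] for d in range(10)}
selected = two_stage_sample(districts, n_clusters=3, n_per_cluster=5, seed=3)
print(selected)
```

Only 15 households are visited, spread over 3 districts, instead of travelling to all 10.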
Merits and Demerits of Sampling Method of Data Collection
Merits:
1. Economical:
It is economical, because we do not have to collect all the data. Instead of getting data from 5000
farmers, we get it from only 50-100.
2. Less Time Consuming:
As the number of units is only a fraction of the total universe, the time consumed is also a
fraction of the total time.
3. Reliable:
If the sample is taken judiciously, the results are very reliable and accurate.
4. Organisational Convenience:
As samples are taken and the number of units is smaller, better-trained enumerators can
be employed by the organisation.
5. More Scientific:
According to Prof R.A. Fisher, “The sample technique has four important advantages over
census technique of data collection. They are Speed, Economy, Adaptability and Scientific
approach.”
It is based on certain laws such as:
(a) Law of Statistical Regularity
(b) Law of Inertia of Large numbers
(c) Law of Persistence
(d) Law of Validity.
6. Detailed Enquiry:
A detailed study can be undertaken in case of the units included in the sample. Size of sample
can be taken according to time and money available with the investigator.
7. Indispensable Method:
If the universe is very large, there remains no option but to use this method. It is especially
used for infinite, hypothetical and perishable universes.
Demerits:
1. Absence of Being Representative:
Methods such as purposive sampling may not provide a sample that is representative.
2. Wrong Conclusion:
If the sample is not representative, the results will not be correct. This will lead to wrong
conclusions.
3. Small Universe:
Sometimes the universe is so small that proper samples cannot be taken out of it, because the
number of units is too low.
4. Specialised Knowledge:
It is a scientific method. Therefore, one needs special knowledge to draw a good and
representative sample and to perform proper analysis, so that a reliable result
may be achieved.
5. Inherent defects:
The results achieved through the analysis of sampling data may not be accurate, as every
method has inherent defects. There is not a single method of sampling which has no
demerit.
6. Sampling Error:
Any estimate based on a sample is subject to sampling error, because the sample never
matches the population exactly.
7. Personal Bias:
In many cases the investigator chooses the sample, as in the convenience method, so
personal bias can creep in.
Content: Sampling Error Vs Non-Sampling Error
   1. Comparison Chart
   2. Definition
   3. Key Differences
   4. Conclusion
Comparison Chart
 BASIS FOR COMPARISON   SAMPLING ERROR                          NON-SAMPLING ERROR
 Meaning                Sampling error is a type of error       An error that arises from sources
                        that occurs because the sample          other than sampling while
                        selected does not perfectly represent   conducting survey activities is
                        the population of interest.             known as non-sampling error.
 Cause                  Deviation between sample mean and       Deficiency and inappropriate
                        population mean                         analysis of data
 Type                   Random                                  Random or non-random
 Occurs                 Only when a sample is selected.         Both in sample and census.
 Sample size            Possibility of error reduces with an    It has nothing to do with the sample
                        increase in sample size.                size.
Definition of Sampling Error
Sampling Error denotes a statistical error arising out of a certain sample selected being
unrepresentative of the population of interest. In simple terms, it is an error which occurs when
the sample selected does not contain the true characteristics, qualities or figures of the whole
population.
The main reason behind sampling error is that the sampler draws various sampling units from
the same population, but the units may have individual variances. Moreover, sampling errors can
also arise out of defective sample design, faulty demarcation of units, wrong choice of statistic,
or substitution of a sampling unit by the enumerator for their convenience. Sampling error is
therefore measured as the deviation between the mean of the sample and the true mean of the
population.
Definition of Non-Sampling Error
Non-Sampling Error is an umbrella term which comprises all the errors other than the
sampling error. They arise due to a number of reasons, i.e. errors in problem definition,
questionnaire design, approach, coverage, information provided by respondents, data
preparation, collection, tabulation, and analysis.
There are two types of non-sampling error:
      Response Error: Error arising when respondents give inaccurate answers, or when
       their answers are misinterpreted or recorded wrongly. It consists of researcher error,
       respondent error and interviewer error, which are further classified as under.
          o Researcher Error
                    Surrogate Error
                    Sampling Frame Error
                    Measurement Error
                    Data Analysis Error
                    Population Definition Error
          o Respondent Error
                    Inability Error
                    Unwillingness Error
          o Interviewer Error
                    Questioning Error
                    Recording Error
                    Respondent Selection Error
                    Cheating Error
      Non-Response Error: Error arising when some respondents who are a part of
       the sample do not respond.
Key Differences Between Sampling and Non-Sampling Error
The significant differences between sampling and non-sampling error are mentioned in the
following points:
   1. Sampling error is a statistical error that happens because the sample selected does not
      perfectly represent the population of interest. An error that occurs due to
      sources other than sampling while conducting survey activities is known as
      non-sampling error.
   2. Sampling error arises because of the variation between the mean value of the
      sample and the true mean of the population. On the other hand, the non-sampling error
      arises because of deficiency and inappropriate analysis of data.
   3. Non-sampling error can be random or non-random, whereas sampling error is
      random only.
   4. Sampling error arises only when a sample is taken as a representative of a
      population, as opposed to non-sampling error, which arises both in sampling and
      in complete enumeration.
   5. Sampling error is mainly associated with the sample size, i.e. as the sample size
      increases the possibility of error decreases. On the contrary, the non-sampling error is
      not related to the sample size, so it won’t be reduced by an increase in sample size.
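The link between sample size and sampling error in point 5 can be checked with a small simulation. The population below is synthetic (normally distributed values, an assumption made only for this sketch), and `mean_abs_sampling_error` is a hypothetical helper.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, trials=200, seed=0):
    """Average |sample mean - population mean| over repeated simple random samples."""
    rng = random.Random(seed)
    mu = statistics.fmean(population)
    errors = [abs(statistics.fmean(rng.sample(population, n)) - mu)
              for _ in range(trials)]
    return statistics.fmean(errors)

data_rng = random.Random(1)
population = [data_rng.gauss(50, 10) for _ in range(10_000)]  # synthetic population
small = mean_abs_sampling_error(population, 25)
large = mean_abs_sampling_error(population, 400)
print(small, large)
```

The error for samples of 400 should come out well below the error for samples of 25, and it vanishes when the "sample" is the whole population, which matches the observation that a census has no sampling error.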
Conclusion
To end this discussion, it is true to say that sampling error is one which is completely related to
the sampling design and can be reduced by expanding the sample size. Conversely, non-sampling
error is a basket that covers all the errors other than the sampling error, and so it is
unavoidable by nature, as it is not possible to completely remove it.
Central Limit Theorem
Definition: The Central Limit Theorem states that when simple random samples of a
sufficiently large size are repeatedly selected from the population and the mean is calculated
for each, the distribution of these sample means will approximate the normal probability
distribution.
In other words, provided the population has a finite mean and standard
deviation and large random samples are selected from the
population, the sample means will be approximately normally distributed, irrespective of
whether the population is normal or skewed. Symbolically the
central limit theorem can be explained as:
When n independent random variables are given, each having the same
distribution with mean $\mu$ and variance $\sigma^2$, then for
$X = X_1 + X_2 + X_3 + X_4 + \dots + X_n$ the mean and variance of X will be:
\[ E(X) = n\mu, \qquad \mathrm{Var}(X) = n\sigma^2 \]
and for large n the distribution of X is approximately normal with this mean and variance.
The following three probability
distributions must be understood for the complete understanding of the Sampling Theory:
      Population (Universe) Distribution
      Sample Distribution
      Sampling Distribution
The utility of the central limit theorem is that it places no condition on the distribution of
the random variables (beyond a finite mean and variance) and provides a practical method for
computing approximate probability values for arbitrarily distributed random variables.
It also helps to explain why so many phenomena show an approximately normal
distribution. If the population is skewed, the skewness of the sampling distribution of the mean
is inversely proportional to the square root of the sample size. Thus, if the sample size is 25,
the sampling distribution exhibits only one-fifth as much skewness as the population.
It can therefore be said that the sampling distribution of the sample mean approaches the normal
distribution irrespective of the distribution of the population from which the samples are
drawn, and the approximation to the normal distribution improves as the sample size
increases.
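The one-fifth figure can be checked with a rough simulation. The sketch assumes an exponential population, which has a skewness of 2, so means of samples of size 25 should show a skewness of about 2/5 = 0.4. The helper names `sample_means`, `skewness` and `draw_exponential` are hypothetical.

```python
import random
import statistics

def sample_means(draw, n, trials, seed=0):
    """Return the means of `trials` independent samples of size n."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]

def skewness(xs):
    """Third central moment divided by the cube of the standard deviation."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean((x - m) ** 3 for x in xs) / s ** 3

def draw_exponential(rng):
    """One draw from an Exponential(1) population, which is right-skewed."""
    return rng.expovariate(1.0)

means_n1 = sample_means(draw_exponential, 1, 20_000)    # the population itself
means_n25 = sample_means(draw_exponential, 25, 20_000)  # means of samples of 25
print(skewness(means_n1), skewness(means_n25))
```

Because the draws are random, the printed values will only be close to 2 and 0.4 rather than exactly equal to them.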