Research Project
Lecture 6
Quantitative Data Collection
LECTURE CONTENTS
6.1. INTRODUCTION TO QUANTITATIVE DATA COLLECTION
6.2. SURVEYS AND QUESTIONNAIRES
6.3. SECONDARY DATA COLLECTION
6.1. Introduction To Quantitative Data Collection
Definition of Quantitative Research
Quantitative research is a systematic investigation of
phenomena by gathering numerical data and analyzing
it using statistical, mathematical, or computational
techniques.
Saunders, Lewis, and Thornhill (2023)
Characteristics Of Quantitative Research
It focuses on measuring variables numerically and analyzing relationships between them.
It employs statistical and graphical techniques to examine relationships, differences, and
trends.
Data is collected in a structured manner using surveys, structured observations,
experiments, or secondary datasets.
Probability sampling techniques are used to ensure results can be generalized to a larger
population.
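As a minimal sketch of probability sampling, the snippet below draws a simple random sample, in which every unit in the sampling frame has an equal chance of selection. The sampling frame (employee IDs 1 to 500), the sample size, and the function name are hypothetical and chosen only for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement, each with equal selection probability."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical sampling frame: employee IDs 1..500 (assumption for illustration)
sampling_frame = list(range(1, 501))
sample = simple_random_sample(sampling_frame, n=50, seed=42)

print(len(sample), "units drawn from a frame of", len(sampling_frame))
```

Fixing the seed is a design choice that lets another researcher reproduce the same draw; omit it for a fresh random sample.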
When To Use Quantitative Methods
When the research aims to test a theory or hypothesis by measuring relationships
between variables in a systematic way.
When researchers need results that can be generalized to a broader population; probability
sampling is often employed to support statistical validity.
Quantitative methods are ideal for surveys, structured observations, and experiments
where data collection needs to be efficient and systematic.
Quantitative methods are useful in explanatory studies to understand cause-and-effect
relationships, as well as in evaluative research to assess the effectiveness of policies,
programs, or interventions.
Types of Data
Discrete Data vs Continuous Data
Discrete data refers to numerical data that can take only specific, separate values and
cannot be meaningfully subdivided.
It is countable, meaning it can be obtained by counting and is typically represented by
whole numbers.
Example:
Number of employees in a company (e.g., 25, 50, 100 employees—whole numbers).
Number of products sold per day in a retail store (e.g., 10, 25, 57 items).
Number of invoices processed per week in an accounting department (e.g., 34, 78, 120 invoices).
Continuous data refers to numerical data that can take any value within a given range.
Unlike discrete data, which consists of distinct, separate values, continuous data can be
measured with as much precision as the measuring instrument allows.
Example:
Temperature (e.g., 36.5°C, 98.7°F) – Can be measured to any decimal point depending on the precision of the
thermometer.
Delivery Distance (e.g., 5.3 km, 10.25 miles) – Can have fractional values.
Length of Service in a Company (e.g., 5.7 years, 12.25 years) – A measure that is not limited to whole
numbers.
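The distinction affects how data is summarized. The sketch below, using made-up values, tallies discrete counts directly and groups continuous measurements into intervals before counting. The variable names, the data, and the 5 km bin width are assumptions for illustration only.

```python
from collections import Counter
from statistics import mean

# Hypothetical data, for illustration only
invoices_per_week = [34, 78, 34, 120, 78, 34]  # discrete: whole-number counts
delivery_km = [5.3, 10.25, 7.8, 5.9, 12.4]     # continuous: measured values

# Discrete values can be tallied as they are.
invoice_counts = Counter(invoices_per_week)


def bin_label(distance, width=5.0):
    """Return the label of the fixed-width interval that contains distance."""
    low = (distance // width) * width
    return f"{low:.0f}-{low + width:.0f} km"


# Continuous values rarely repeat exactly, so they are binned before counting.
distance_bins = Counter(bin_label(d) for d in delivery_km)
mean_distance = mean(delivery_km)

print(invoice_counts)
print(distance_bins)
print(round(mean_distance, 2))
```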
Levels of Measurement:
Nominal;
Ordinal;
Interval;
Ratio.
Quantitative Data Collection Methods:
Surveys;
Secondary data collection.
6.2. Surveys And Questionnaires
Elements Of An Effective Survey
Clarity
Clarity is a fundamental aspect of an effective survey, ensuring that respondents can
understand and accurately answer questions.
Each question should have a single, clear meaning to prevent misinterpretation.
The questionnaire should cover all necessary topics without overwhelming the
respondent or omitting essential aspects.
The layout should be clear and attractive, making it easy for respondents to follow the
sequence of questions without confusion.
Before finalizing the survey, pilot testing with a small group helps identify unclear
questions and areas needing improvement.
Relevance
Every question in the survey should directly contribute to answering the research
question(s).
A survey should be designed to collect only the necessary information, avoiding overly
detailed or tangential questions that do not contribute to the study's purpose.
If respondents see the survey as relevant to their experiences or opinions, they are more
likely to provide accurate and thoughtful responses.
Brevity
Excessively long surveys reduce response rates.
A balance should be maintained where the survey is long enough to capture necessary
information but concise enough to prevent respondent fatigue.
Long or complex questions can confuse respondents and lead to inaccurate answers.
Different survey formats require varying levels of brevity. For example:
SMS Surveys: Should be limited to a few questions.
Online & Paper Surveys: Typically 4-8 pages are acceptable.
Telephone Surveys: Should not exceed 30 minutes.
Structuring Questions:
Closed-ended questions;
Likert scale;
Demographic questions.
Closed-ended Questions
Closed-ended questions limit respondents to specific
choices, such as "yes/no," "true/false," or multiple-choice
options.
This structure ensures uniformity in responses, simplifying
analysis and comparison.
Respondents can quickly select from predefined choices,
making the survey more efficient and increasing completion
rates.
Likert Scale
A Likert scale is commonly used in rating questions where respondents express their level
of agreement or disagreement with a set of statements.
A simple Likert scale typically offers five response options, e.g.,
Strongly disagree (1)
Disagree (2)
Neutral (3)
Agree (4)
Strongly agree (5)
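For analysis, the response labels are usually converted to their numeric codes. The sketch below uses the 1 to 5 coding listed above; the sample responses and the function name are hypothetical.

```python
from collections import Counter
from statistics import median

# Coding scheme taken from the five-point scale above
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def code_responses(responses):
    """Map each response label to its numeric code; unknown labels raise KeyError."""
    return [LIKERT_CODES[label] for label in responses]


# Hypothetical answers to a single statement (assumption for illustration)
responses = ["Agree", "Strongly agree", "Neutral", "Agree", "Disagree"]
codes = code_responses(responses)

frequencies = Counter(responses)  # how many respondents chose each option
median_code = median(codes)       # a summary that only relies on the ordering

print(codes, frequencies, median_code)
```

Raising an error on an unknown label, rather than silently skipping it, is a deliberate choice here so that data entry mistakes surface early.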
Demographic Questions
Demographic questions are used to gather factual information about
respondents.
These questions help in analyzing differences in opinions, behaviors,
and events across various population segments.
Common Demographic Variables: Age; Gender; Marital status;
Education level; Occupation; Income; Nationality or Ethnicity.
Sensitive topics (e.g., income) should be framed carefully to encourage
responses.
Avoiding Common Pitfalls:
Leading questions;
Double-barreled questions;
Jargon.
Leading Questions
Leading questions introduce bias by suggesting a particular answer.
They often contain assumptions that the respondent may not have made independently.
For example: "How did the change impact your workload?" (This assumes that the change had
an impact.)
How to avoid:
Ensure questions do not contain assumptions or implied judgments.
Use neutral wording to allow respondents to provide their own perspective.
Test for bias by reviewing "cleanness" in question phrasing, ensuring they do
not lead respondents toward a particular answer.
Double-barreled Questions
These are long questions that contain two or more questions in one.
They can confuse respondents because they do not know which part to answer. This often
results in incomplete or ambiguous responses.
Example: "How often do you visit your mother and father?"
This question assumes that both parents are equally accessible and that the respondent
visits them together.
A better approach is to split it into two separate questions:
"How often do you visit your mother?"
"How often do you visit your father?"
Jargon
Overuse of jargon makes writing unclear and pretentious.
It can alienate readers who are unfamiliar with certain industry-specific terms.
Writers should not assume that readers have the same level of subject knowledge.
Examples of phrases that are vague or unnecessarily complex:
"Ongoing situation"; "Going down the route of"; "At the end of the day";
"The bottom line"; "At this moment in time".
Instead, use clear and direct alternatives: "Now" is clearer than "at this
moment in time."
Cronbach’s alpha
Cronbach’s alpha is a statistic that measures internal consistency, assessing whether a set
of items in a scale reliably measure a single concept.
Values typically range between 0 and 1 and can be interpreted as follows:
Cronbach’s Alpha Value    Interpretation
0.91-1.00                 Excellent
0.81-0.90                 Good
0.71-0.80                 Good and Acceptable
0.61-0.70                 Acceptable
0.01-0.60                 Not acceptable
(Konting et al., 2009)
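As a rough sketch of how the statistic is obtained, the code below applies the commonly cited formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name, the data layout (one list of scores per item), and the sample scores are assumptions for illustration; statistical packages provide their own implementations.

```python
from statistics import variance


def cronbach_alpha(items):
    """Compute Cronbach's alpha.

    items: a list of k lists, each holding one item's scores
    for the same n respondents, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("At least two items are required.")
    n = len(items[0])
    if n < 2 or any(len(item) != n for item in items):
        raise ValueError("Each item needs scores from the same 2+ respondents.")

    sum_item_variances = sum(variance(item) for item in items)  # sample variances
    totals = [sum(scores) for scores in zip(*items)]            # total per respondent
    total_variance = variance(totals)
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical scores of five respondents on three Likert items
scale_items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 1],
]
alpha = cronbach_alpha(scale_items)
print(round(alpha, 2))
```

With these made-up scores the result falls in the 0.91-1.00 band of the table above.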
6.3. Secondary Data Collection
Sources:
Government reports;
Databases;
Published research.
Government Reports
Government agencies regularly publish statistical reports, policy documents, and official
datasets that provide structured and reliable numerical data.
Data can be collected from government reports:
Census Reports – National population statistics, demographics, employment rates.
Economic Reports – GDP growth, inflation rates, trade balances.
Health Statistics – Disease prevalence, vaccination rates, mortality rates.
Education Reports – School enrollment rates, literacy rates, graduation statistics.
Business and Industry Reports – Market trends, labor force participation, taxation
statistics.
Some websites for collecting data:
General Statistics Office of Vietnam: https://www.gso.gov.vn/en/homepage/
U.S. Open Data: https://data.gov/
Ministry of Finance (MoF) Statistics: https://www.mof.gov.vn/webcenter/portal/btcvn
U.S. Embassy & Consulate in Vietnam – Economic Data & Reports
Bureau of Labor Statistics (BLS): https://www.bls.gov/
Databases
Databases are large repositories of organized data that researchers can access for
statistical analysis in quantitative research. These include open-access government
databases, commercial research databases, and institutional repositories.
Example:
World Bank Open Data: https://data.worldbank.org/
Financial Market Databases – Bloomberg, Reuters, Stock Market Indices.
https://vietstock.vn/
Published Research
Published research includes peer-reviewed journal articles, industry reports, and
conference papers that present findings based on previous quantitative studies.
Example:
Google Scholar, PubMed, ScienceDirect – Academic journals with statistical findings.
Market Research Reports (Statista, IBISWorld, Nielsen) – Consumer behavior,
product demand.
Company Financial Reports – Annual reports, financial performance metrics.
Technical Reports from Research Institutions – Engineering, environmental impact
studies.
Evaluating Data Quality:
Relevance,
Reliability,
Timeliness.
Relevance
Relevance refers to how well the secondary data aligns with the specific
objectives and research questions of the study.
Key Considerations:
Does the data directly address the research problem?
Was the data collected for a similar purpose or industry?
Are the variables and definitions consistent with the study’s requirements?
Reliability
Reliability refers to the accuracy, consistency, and trustworthiness of the
secondary data source.
Reliable data comes from credible and well-established institutions with
transparent data collection methodologies.
Key Considerations:
Who collected the data? (Government agencies, research institutions, reputable
organizations)
What methods were used? (Sampling techniques, data collection procedures,
verification process)
Has the data been peer-reviewed or validated?
Timeliness
Timeliness refers to how current and up-to-date the secondary data is in
relation to the research needs.
Outdated data can lead to misleading conclusions in quantitative research.
Key Considerations:
When was the data last updated?
Does the dataset cover the relevant timeframe for the research?
Are there more recent datasets available?
THANK YOU