Summary - Research methodology
Research Methodology for IB (Rijksuniversiteit Groningen)
StuDocu is not sponsored or endorsed by any college or university
Downloaded by farabi nawar (faraobon@gmail.com)
Summary Research Methodology
The research process
Step 1: The problem statement
Analyse specific situations within the company in order to gain insight into possible problems and
how to solve them
Step 2: Research question and factors
What information is needed to address the problem statement?
Use the scientific literature to find factors that might influence the central concept
Step 3: Sub-research questions
Step 4: Conceptual model
Dependent and independent variable
Control variables
Step 5: Hypotheses
Sub-research questions formulated in a clearly directional way
Step 6: Conceptual and operational definitions
Conceptual definition: definition originated directly from literature about the concept
Operational definition: indicates how the concept can be measured
Step 7: Indicators
Used to gather all necessary information about the concept
Content validity: degree to which all aspects of a concept are covered by the indicators
Construct validity: do the indicators fit with existing theory about the concept
The measurement instrument
> Experiment
True experiment
o Full experimental control: groups are equal
o Makes it possible to study something that is impossible in real life, but is less realistic
Field experiment
o Natural setting, more heterogeneous, but less experimental control
Quasi-experiment
o Natural setting, more heterogeneous, use existing groups, but no random assignment
and less experimental control
Data types
1: Nominal data: collecting information on a variable that naturally or by design can be grouped into
two or more categories that are mutually exclusive and collectively exhaustive
Least powerful
Categories (1 = food; 2 = manufacturing; 3 = transport; 4 = utility; etc.)
2: Ordinal data: characteristics of nominal scale plus an indicator of order
Categories (1 = poor; 2 = reasonable; 3 = good; 4 = excellent; etc..)
3: Interval data: power of nominal and ordinal data plus the concept of equality of
interval (distance between 1 and 2 equals distance between 2 and 3).
Categories (1 = 6 a.m.; 2 = 7 a.m.; 3 = 8 a.m.; 4 = 9 a.m.; etc.)
4: Ratio data: has all the powers of previous data types plus provision for absolute zero or origin
Number
The sample
Target population: the group that you will target
Sample: when target population is too large, use sample group
Likely to differ if you take several samples
Has a certain random distribution that can differ across samples
o Respondents that are used for sample shouldn’t deviate from the target population
Otherwise: only discuss operational population (only about actual
respondents)
o Sample should be a type of probability sample
o Sample size: not too small, not too large
What makes a good sample
Accuracy: degree to which bias is absent from sample
o In a proper sample, elements that underestimate the population values and elements that
overestimate them offset each other
o No systematic variance: variation in measures due to some known or unknown
influences that ‘cause’ the scores to lean in one direction more than another
o Non-response can be systematic: those who respond to a survey request differ from
those who refuse to participate
Precision: degree to which sample represents its population
o Absence of sampling error: numerical descriptors that describe sample may be
expected to differ from those that describe populations because of random
fluctuations
o Measured by the standard error of estimate; the smaller the standard error, the greater the precision
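The link between sample size and precision can be illustrated with a small sketch. It assumes the standard error of the mean (sample standard deviation divided by the square root of n) as the measure of precision; the scores and the function name `standard_error` are made up for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical satisfaction scores (illustration only)
small_sample = [6, 7, 5, 8, 6, 7, 9, 5]
# Same spread of answers, but four times as many respondents
large_sample = small_sample * 4

se_small = standard_error(small_sample)
se_large = standard_error(large_sample)
print(f"n = {len(small_sample)}: standard error = {se_small:.3f}")
print(f"n = {len(large_sample)}: standard error = {se_large:.3f}")
```

With the same spread in the answers, the larger sample gives the smaller standard error, so its estimate of the population mean is more precise.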
Representation
Probability sampling: based on the concept of random selection – a controlled procedure
that ensures that each population element is given a known non-zero chance of selection
Provide estimates of precision
o Simple random sample: each population element has a known and equal chance of
selection
Difficult and impractical (lots of information necessary)
o Systematic sample: select every kth element of the population
Simple, but not possible without meaningful system in elements
o Stratified sample: divide population into subpopulations, large difference between
strata, small differences within strata
Possibility of different methods and efficient, but hard to make decision
between independent variables
o Cluster sample: population consists of clusters of elements which are close to each
other
Non-probability sampling: arbitrary (non-random) and subjective
o Convenience sampling: freedom to choose whoever can be found or wants to
participate
Quick, can be used to examine an idea in short time
o Snowball sampling: respondents identify other potential respondents
Efficient to find respondents in niche situations, but respondents may overlap
Census study: study of every unit in a population
Feasible when population is small
Necessary when elements are quite different from each other
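Three of the probability sampling designs described above can be sketched with Python's standard `random` module. The sampling frame of 100 numbered employees, the two strata and all function names are assumptions made for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    """Every element has the same chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Random start within the first k elements, then every k-th element."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Simple random sample drawn separately within each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = list(range(1, 101))  # hypothetical frame: employees 1..100
strata = {"junior": population[:60], "senior": population[60:]}

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
print(srs, sys_sample, strat, sep="\n")
```

The systematic sample only needs an ordered list and a step size k, whereas the stratified sample requires knowing beforehand which stratum each element belongs to.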
Why sample instead of whole population?
Costs: sample is less expensive
Accuracy: research based on sampling can be more accurate than research based on census
Speed: less information means faster processing
Availability of population elements: some tests destroy the element (destructive testing), so testing the whole population is not an option
The data collection
> Not every respondent will return the questionnaire; non-response may bias your results
Random non-response: respondents and non-respondents do not differ systematically on
important variables; results can give a correct view of reality; a larger sample is needed
Systematic non-response: respondents and non-respondents differ systematically on
important variables; results can give a biased view of reality; a larger and better sample is needed
The data-analysis
Step 1: Preparing the dataset
Every variable has to be labelled
Check for outliers (check what causes them)
Step 2: Descriptive statistics
Check distribution of variables (e.g. age, gender, education level)
Mention total number of respondents as well as actual respondents
Step 3: Correlation and reliability analysis
Find out whether the indicators actually measure the same underlying concept: calculate the correlation
o Only possible for variables measured on an ordinal or interval scale
When the correlation is high: calculate Cronbach’s alpha
o Measure of internal reliability
o If alpha > 0.6, sum the questions
o All variables should use the same measurement scale (4-point or 5-point or 7-point, etc.)
o All questions should be worded in the same direction (all positive or all negative)
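As a minimal sketch of this step, Cronbach's alpha can be computed from the item variances and the variance of the summed score, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of totals). The three questions, the answers and the function name `cronbach_alpha` are hypothetical.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per question, all for the same respondents."""
    k = len(items)
    sum_item_variances = sum(statistics.variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # summed score per respondent
    total_variance = statistics.variance(totals)
    return k / (k - 1) * (1 - sum_item_variances / total_variance)

# Hypothetical answers of five respondents to three 5-point questions
q1 = [4, 5, 3, 2, 4]
q2 = [4, 4, 3, 2, 5]
q3 = [5, 5, 2, 3, 4]

alpha = cronbach_alpha([q1, q2, q3])
# Apply the 0.6 rule of thumb from these notes: only sum the questions if alpha is high enough
scale = [sum(scores) for scores in zip(q1, q2, q3)] if alpha > 0.6 else None
print(f"alpha = {alpha:.3f}", scale)
```

Questions worded in the opposite direction would first have to be reverse-coded, otherwise their variance works against the total and alpha drops.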
Validity: Do I measure what I want to measure?
> Face validity: weak evidence
> Content validity: extent to which measuring instrument provides adequate coverage of concept you
study
> Predictive validity: the measure predicts a criterion measured at a later time
> Concurrent validity: the measure agrees with a criterion measured at the same moment
> Construct validity: extent to which concepts relate to other concepts
Convergent validity: measures of constructs that theoretically should be related to each other
are observed to be related to each other
Discriminant validity: measures of constructs that theoretically should not be related to each
other are observed to not be related to each other
Reliability: the degree to which items within a survey that theoretically should measure the same concept give consistent results
> Stability: fluctuations in results because of personal and situational aspects
> Equivalence: fluctuations in results because of differences between researchers
> Internal consistency: degree in which multiple items measure the same construct
Split half method: choose one half of items randomly and compare those to other half
Statistic: Cronbach’s alpha
Association vs. tests
Associations: provide information about the strength of a relationship between two (or
more) variables
Tests: provide information about generalisability of the results (from sample to population)
o Test statistic (χ2, t, F): Number which gives information about the test outcome
o P-value: probability of obtaining a result at least as extreme as the one observed if H0 is true
o Degrees of freedom (df): number of independent observations on which the
test statistic is based
Describing associations
Pearson correlation
o Measure for strength of linear relationship between two interval variables expressed
in r
o -1 ≤ r ≤ 1
o Absolute value ↑ then Strength relationship ↑
Spearman rank correlation
o Measure for how well the relationship between two ordinal variables can be
described using a monotonic function
o -1 ≤ rho ≤ 1
o Absolute value ↑ then Strength relationship ↑
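The difference between the two coefficients can be sketched as follows. Spearman's rho is computed here as Pearson's r applied to the ranks, with tied values given their average rank; the data and the function names are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Strength of the linear relationship between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / math.sqrt(ss_x * ss_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for index in order[i:j + 1]:
            result[index] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # rises with x, but not in a straight line
print(f"r = {pearson_r(x, y):.3f}, rho = {spearman_rho(x, y):.3f}")
```

Because y always rises with x, the monotonic relationship is perfect and rho equals 1, while r stays slightly below 1 because the relationship is not linear.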
Testing groups
T-test
o When two groups are compared on an interval variable
o e.g. Hypothesis: Men are more afraid of spiders than women
H0: µ(men) = µ(women)
H1: µ(men) > µ(women)
Result: t(38) = 4.5; p < 0.001
o Involved parameters: t-value, df
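The two parameters can be computed by hand, as in the sketch below. It assumes the pooled-variance (equal variances) version of the independent-samples t-test; the scores and the function name `independent_t` are hypothetical. The p-value would then be read from a t-table or from statistical software.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and df for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * statistics.variance(group_a)
                       + (n_b - 1) * statistics.variance(group_b)) / df
    se_difference = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_difference
    return t, df

# Hypothetical fear-of-spiders scores on a 1-10 scale
men = [7, 8, 6, 9, 7, 8]
women = [5, 6, 4, 6, 5, 4]

t_value, df = independent_t(men, women)
print(f"t({df}) = {t_value:.2f}")
```

The df equals the total number of participants minus two, which is why a result reported as t(38) implies 40 participants in total.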
One-way ANOVA
o When you compare more than two groups on an interval variable
o e.g. Hypothesis: Negotiators will demand least for themselves when opponent is
angry, more when opponent does not express any emotion and most when opponent
expresses happiness
H0: µ(angry opponent) = µ(no emotion) = µ(happy opponent)
H1: µ(angry opponent) < µ(no emotion) < µ(happy opponent)
Result: F(2, 78) = 3.6; p < 0.05
o Involved parameters: F-value, df (between groups) and df (within groups)
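A minimal sketch of how the F-value and both df arise from the between-groups and within-groups sums of squares. The demand scores and the function name `one_way_anova` are made up for illustration.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, group_means))
    # Variation of the scores around their own group mean
    ss_within = sum((score - mean) ** 2
                    for group, mean in zip(groups, group_means)
                    for score in group)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical demands of negotiators per opponent emotion
angry = [4, 5, 6]
neutral = [6, 7, 8]
happy = [8, 9, 10]

f_value, df_between, df_within = one_way_anova([angry, neutral, happy])
print(f"F({df_between}, {df_within}) = {f_value:.1f}")
```

df (between groups) is the number of groups minus one and df (within groups) is the number of participants minus the number of groups, so F(2, 78) corresponds to three groups and 81 participants.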
Chi-square test
o When two groups are compared on a nominal variable
o e.g. Hypothesis: Debtors receiving a positive message are more likely to contact debt
collection agency than debtors receiving a negative message
H0: P(A│B) = P(A)
H1: P(A│B) > P(A)
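The test statistic compares the observed counts with the counts expected if message type and contacting the agency were independent (H0). The sketch below uses a hypothetical 2 x 2 table; the counts and the function name `chi_square` are assumptions for illustration.

```python
def chi_square(observed):
    """Pearson chi-square statistic and df for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected under independence of rows and columns
            expected = row_totals[i] * column_totals[j] / total
            chi2 += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical counts; rows: positive / negative message,
# columns: contacted agency / did not contact agency
table = [[30, 20],
         [15, 35]]

chi2, df = chi_square(table)
print(f"chi-square({df}) = {chi2:.2f}")
```

For df = 1 the critical value at the 5% level is 3.84, so a statistic above that value leads to rejecting H0.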
Mann-Whitney U-test
o To compare two groups on an ordinal variable (or when the assumptions of the t-test are violated)
o e.g. Hypothesis: Men have a higher educational degree than women
H0: Median(men) = Median(women)
H1: Median(men) > Median(women)
Result: MWU = 345; p < 0.001
Involved parameters: Mann-Whitney U, Wilcoxon W, Wilcoxon Z
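The U statistic itself only uses the order of the scores, which is why it suits ordinal data. A minimal sketch by pairwise comparison follows; the education codes and the function name `mann_whitney_u` are hypothetical, and the p-value would come from a table or statistical software.

```python
def mann_whitney_u(group_a, group_b):
    """U for each group: count the pairs in which that group scores higher (a tie counts 0.5)."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b

# Hypothetical education levels coded 1 (lowest) to 5 (highest)
men = [3, 4, 5, 4, 5]
women = [2, 3, 3, 4, 2]

u_men, u_women = mann_whitney_u(men, women)
print(f"U(men) = {u_men}, U(women) = {u_women}, reported U = {min(u_men, u_women)}")
```

The two U values always add up to the number of pairs (here 5 x 5 = 25); the smaller one is usually reported.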
Testing relationships
Regression test
o To predict an interval variable (Y) from one or more interval variables (X1, X2, …, Xk)
o e.g. Hypothesis: Advertising results in more sales
H0: b = 0
H1: b > 0
Result: Sales = 1055 + 10*advertising
Simple regression: Y = a + bX + Ԑ
Multiple regression: Y = a + b1X1 + b2X2 + … + bkXk + Ԑ
Predict Y from X: X → Y
Simple regression vs. correlation: X → Y; X1 ↔ X2
o Regression model: Y = β0 + β1X + β2Z + Ԑ
o Involved parameters: b or β (positive or negative), a (constant) and R² (proportion of
explained variance)
B vs. β
B = unstandardised: when X increases by one unit, Y increases by
b units
β = standardised: when X increases by one standard deviation, Y
increases by β standard deviations
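The parameters of a simple regression can be computed with ordinary least squares, as in the sketch below. The advertising and sales figures are hypothetical, chosen to give coefficients close to the example result above, and the function name `simple_regression` is an assumption.

```python
import math

def simple_regression(x, y):
    """Least-squares estimates for Y = a + bX: returns (a, b, beta, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sp_xy / ss_x                    # unstandardised slope
    a = mean_y - b * mean_x             # constant
    beta = b * math.sqrt(ss_x / ss_y)   # slope in standard deviation units
    r_squared = sp_xy ** 2 / (ss_x * ss_y)  # proportion of explained variance
    return a, b, beta, r_squared

# Hypothetical data (illustration only)
advertising = [10, 20, 30, 40, 50]
sales = [1160, 1250, 1360, 1450, 1560]

a, b, beta, r_squared = simple_regression(advertising, sales)
print(f"Sales = {a:.0f} + {b:.0f}*advertising; beta = {beta:.3f}; R2 = {r_squared:.3f}")
```

With a single predictor, β equals the Pearson correlation between X and Y, so β squared equals R².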
Step 4: New variable
When items have a Cronbach’s alpha > 0.6, they can be summed into a new variable
Step 5: Choice of technique
The Big Three:
o How many variables are involved in analysis?
One: univariate analysis (descriptive statistics)
Two: bivariate analysis (inferential statistics)
More than two: multivariate analysis (inferential statistics)
o What is the data type of the involved variable(s)?
Independent variable (X): nominal, ordinal or interval (ratio)
Dependent variable (Y): nominal, ordinal or interval (ratio)
o Asymmetric vs. symmetric (only for two or more variables)
Asymmetric: when variables have a different data type or if you want to
predict the dependent variable based on the independent variables (causal
relationship)
Symmetric: when you don’t want to predict a causal relationship and the
variables have the same data type
[Overviews not reproduced: Univariate; Bivariate symmetric; Bivariate asymmetric]
Regression vs. Correlation
Correlation: when we want to establish a linear association between variables
Regression: when we want to predict one variable based on another variable: causal
relationship
Related vs. unrelated sample
Related sample
- Within-subjects design
- Each participant in each condition
- Small sample needed and control for intra-individual effects
Unrelated sample
- Between-subjects design
- Each participant only in one condition
- No practice effects or hypothesis guessing
Step 6: Control variable
Possible effects of control variables have to be excluded
Test their effect on the dependent variable
If it does have an effect, correct for this when testing hypotheses
o Stepwise multiple regression
o ANOVA-analysis
Step 7: Testing the hypotheses
Formulate H0 and H1
o H0: no effect or relationship
o H1: does assume an effect
Use statistical test to see whether or not the two groups differ significantly
o P-value > 0.05: the data do not provide enough evidence against H0, so H0 is not rejected
o P-value < 0.05: such data would be unlikely if H0 were true, so H0 is rejected in favour of H1
Step 8: Conclusions and implications
Draw a conclusion based on whether H0 or H1 is supported
Measurement error
Systematic error: results from a bias
Random error: occurs unpredictably and has no consistent direction
Error sources
1: Participant
2: Situational factors
3: Measurer
4: Data-collection instrument
Appendix