CH 8 Data Analysis

Chapter Eight discusses data analysis, outlining the importance of understanding data types, which include categorical and numerical data, before applying statistical techniques. It explains various coding methods for data consistency and describes different types of data analysis, including univariate, bivariate, and multivariate analysis, along with specific statistical methods such as regression and correlation. The chapter emphasizes the significance of these analyses in deriving insights and making informed decisions based on the data.

Uploaded by

Abdiman Habibo

CHAPTER EIGHT

DATA ANALYSIS

Data for Analysis?

• Data analysis is the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data.

Before analyzing the data for your research, it is important to know the type of data you have at hand, because the type of data determines the technique you can use.
The following figure summarizes the types of data used in research.
1 01-09-2024
8.1.1. Quantitative data can be divided into two distinct groups:

A. Categorical and
B. Numerical

A. Categorical data

• These are data that can't be measured numerically as quantities.
• Categorical data can be further sub-divided into:
1. Nominal data: values that can't be measured numerically and can't be ranked. These data simply count the number of occurrences in each category of a variable.
Examples of nominal variables:
  o Where a person lives (AA, Adama, B/Dar, etc.)
  o Gender (male, female)
  o Nationality (American, Ethiopian, Chinese)
  o Ethnicity (Oromo, Amhara, Tigre, Gurage…)
2. Ranked/Ordinal data: values that can be ranked in order.
Examples of ordinal data:
  o Education (Elementary school, High school, College Diploma, College degree, Masters)
  o Agreement (strongly disagree, disagree, neutral, agree, strongly agree)
  o Rating (poor, fair, good, excellent)
  o Frequency (never, sometimes, often, always)
  o Any other scale ("On a scale of 1 to 5...")

• Descriptive data with only two categories are known as dichotomous data.
  o E.g. gender can be divided into female and male.
  o Or questions with a 'Yes' or 'No' response.
B. Numerical Data

• Numerical data, sometimes termed 'quantifiable', are those whose values are measured or counted numerically as quantities.
• Numerical data can be analysed using a far wider range of statistics than categorical data.
Coding the Data
• Coding is the process of translating information gathered from questionnaires or other sources into something that can be analyzed.
• It involves assigning a value to the information given; often the value is given a label.
• Coding can make data more consistent.
  o Example: Question = Sex
  o Answers = Male, Female, M, or F
  o Coding avoids such inconsistencies by mapping every variant of an answer to a single value.
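The Sex example above can be sketched in a few lines of Python. This is only an illustration: the numeric codes (1 and 2), the names `SEX_CODES` and `code_sex`, and the sample answers are assumptions made for the sketch, not part of the slides.

```python
# Hypothetical codebook: every spelling of an answer maps to one numeric code.
SEX_CODES = {"male": 1, "m": 1, "female": 2, "f": 2}
# Labels say what each code means.
SEX_LABELS = {1: "Male", 2: "Female"}


def code_sex(answer):
    """Return the numeric code for a raw answer, or None if it is not recognized."""
    return SEX_CODES.get(answer.strip().lower())


# Made-up raw answers with the kind of inconsistency described above.
raw_answers = ["Male", "F", "female", "M", " m "]
coded = [code_sex(a) for a in raw_answers]
print(coded)  # [1, 2, 2, 1, 1]
print([SEX_LABELS[c] for c in coded])
```

Returning `None` for an unrecognized answer keeps unexpected responses visible so they can be checked by hand rather than silently miscoded.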
Coding Systems
• Common coding systems (code and label) for dichotomous variables:

0 = No, 1 = Yes
(1 = value assigned, Yes = label of value)
OR: 1 = No, 2 = Yes

• When you assign a value, you must also make it clear what that value means.
• In the first example above, 1 = Yes, but in the second example 1 = No.
• As long as it is clear how the data are coded, either is fine.
Coding: Ordinal Variables
• The coding process is similar to that for other categorical variables.
• Example: variable EDUCATION, possible coding:

0 = Did not graduate from high school
1 = High school graduate
2 = Some college or post-high school education
3 = College graduate

• It could also be coded in reverse order (0 = college graduate, 3 = did not graduate from high school).
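As a rough sketch of the EDUCATION coding and its reverse, the snippet below uses the labels from the slide. The helper name `reverse_code` and the sample responses are assumptions for illustration only.

```python
# Codebook taken from the EDUCATION example; the order of the codes carries meaning.
EDUCATION_LABELS = {
    0: "Did not graduate from high school",
    1: "High school graduate",
    2: "Some college or post-high school education",
    3: "College graduate",
}


def reverse_code(code, lowest=0, highest=3):
    """Mirror an ordinal code so that the highest category becomes the lowest."""
    return highest + lowest - code


# Made-up responses, already coded with the scheme above.
responses = [0, 3, 2, 1]
reversed_responses = [reverse_code(c) for c in responses]
print(reversed_responses)  # [3, 0, 1, 2]
```

Either direction keeps the ranking intact; what matters is that the codebook states which direction was used.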
Coding: Nominal Variables
• For coding nominal variables, order makes no difference.
• Example: variable RESIDENCE

1 = Northeast
2 = South
3 = Northwest
4 = Midwest
5 = Southwest

• Order does not matter; no ordered value is associated with each response.
Coding: Continuous Variables
• Creating categories from a continuous variable (e.g. age) is common.
• A continuous variable may be broken down into chosen categories, creating an ordinal categorical variable.
• Example: variable = AGE

1 = 0–9 years old
2 = 10–19 years old
3 = 20–39 years old
4 = 40–59 years old
5 = 60 years or older
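The AGE categories above can be produced from raw ages with a small lookup. The sketch below is one possible approach, using Python's standard `bisect` module; the function name `code_age` and the sample ages are assumptions for illustration.

```python
from bisect import bisect_right

# Lower bounds of categories 2 to 5 from the AGE example (10, 20, 40, 60).
AGE_CUTOFFS = [10, 20, 40, 60]


def code_age(age):
    """Map an age in years to the ordinal category 1 to 5 defined above."""
    if age < 0:
        raise ValueError("age cannot be negative")
    # bisect_right counts how many cutoffs are <= age; add 1 so codes start at 1.
    return bisect_right(AGE_CUTOFFS, age) + 1


# Made-up ages, including values that sit exactly on category boundaries.
ages = [4, 10, 19, 39, 40, 59, 60, 85]
age_codes = [code_age(a) for a in ages]
print(age_codes)  # [1, 2, 2, 3, 4, 4, 5, 5]
```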
8.2. Types of Data Analysis
• Data analysis is the process of inspecting, cleaning, transforming, and modelling data with the goal of discovering useful information, suggesting conclusions, and supporting decision making.
• Data analysis can be made using:
(i) Descriptive Statistics
(ii) Inferential Statistics
• Descriptive statistics are used to describe, summarize, or explain a given set of data.
• Inferential statistics are used to infer characteristics of a population from a sample.
8.2.1. Univariate Analysis
• Univariate analysis is the description of a single variable in terms of the applicable unit of analysis.
• Measures of central tendency and measures of dispersion are the typical categories of univariate analysis.
A. Measures of Central Tendency

• The three most frequently used measures of central tendency are:
  o Mode
  o Median and
  o Mean
1. Mode
• The mode is the most frequently occurring value in a group of observations.
• If the scores for a given sample distribution are:
32, 32, 35, 36, 37, 38, 38, 39, 39, 39, 40, 40, 42, 45
• then the mode is 39, because a score of 39 occurs three times, more than any other score.
• The mode is a very good measure of the location of a distribution in the case of nominal data.
2. Median
• The median is the middle value in an ordered arrangement of observations.
• The median is often used to summarize the location of a distribution.
• Further, the median can be used with ordinal, interval, or ratio measurements.
• If the scores for a given sample distribution are:
32, 32, 35, 36, 37, 38, 38, 39, 39, 39, 40, 40, 42, 45
• then, with 14 scores, the median is the average of the two middle values (the 7th and 8th): (38 + 39) / 2 = 38.5
3. Mean
• The arithmetic mean is the most commonly used and accepted measure of central tendency.
• It should be used in the case of interval or ratio data.
• If the scores for a given sample distribution are:
32, 32, 35, 36, 37, 38, 38, 39, 39, 39, 40, 40, 42, 45
• then the mean of the distribution is:
(32 + 32 + 35 + 36 + 37 + 38 + 38 + 39 + 39 + 39 + 40 + 40 + 42 + 45) / 14 = 532 / 14 = 38
• Mid-mean, geometric mean, and mid-range are other types of means (p. 139 of QRM).
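The three measures of central tendency can be checked on the same 14 scores with Python's standard `statistics` module. This is a minimal sketch; the variable names are chosen for illustration.

```python
from statistics import mean, median, multimode

# The 14 sample scores used in the mode, median and mean examples.
scores = [32, 32, 35, 36, 37, 38, 38, 39, 39, 39, 40, 40, 42, 45]

modes = multimode(scores)    # every value tied for most frequent
middle = median(scores)      # average of the 7th and 8th ordered scores
average = mean(scores)       # sum of the scores divided by 14

print(modes)    # [39]
print(middle)   # 38.5
print(average)  # 38
```

`multimode` returns a list because a distribution can have more than one mode; here only 39 occurs three times.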
Bivariate Analysis/Relationships between Variables

• Bivariate analysis helps researchers to know the nature, direction, and significance of the relationships between two variables in the study.
• Often in practical situations, researchers are interested in describing associations between variables.
• They try to ascertain how two variables are related to each other, that is, whether a change in one affects the other.
• The measures of association depend on the nature of the data, and the association could be positive, negative or neutral.
8.2.1.1. Relation between two nominal variables: χ² (chi-square) test

This analysis technique is used to find out whether there is a relationship between two nominal variables.
• E.g. Is viewing a television advertisement for a product (yes/no) related to buying that particular product (buy/not buy)?
• An international business researcher wants to establish whether the performance of a firm (categorized as loss, breakeven and profit) depends on the country in which it is located (categorized as low, middle and high income).
There are three different types of chi-square analysis:
1. Chi-square test for goodness of fit
2. Chi-square test for homogeneity
3. Chi-square test of independence

• The first is used to see whether the sample has been drawn from a given population, and the second whether populations are homogeneous with respect to a given characteristic.
• These two are less common, and we will focus on the third type of test.
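To make the test of independence concrete, the sketch below computes the chi-square statistic for the advertisement example. The counts are invented purely for illustration, and the function name `chi_square_independence` is an assumption; the statistic itself is the usual sum of (observed − expected)² / expected over all cells.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Made-up counts: rows = viewed the advertisement (yes, no),
# columns = bought the product (buy, not buy).
observed_counts = [[30, 20],
                   [20, 30]]
chi2, dof = chi_square_independence(observed_counts)
print(chi2, dof)  # 4.0 1
```

With one degree of freedom, the 5% critical value of the chi-square distribution is about 3.84, so for these invented counts the hypothesis of independence would be rejected at that level.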
8.2.1.2. Correlation Analysis
• Correlation is a measure of the relationship between two variables. It has wide application in business and statistics.
• The correlation coefficient describes the direction of the correlation, that is, whether it is
  o Positive or
  o Negative,
• and the strength of the correlation, that is, whether an existing correlation is:
  o Strong or
  o Weak.
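The slides do not name a specific coefficient; the sketch below assumes the Pearson correlation coefficient, which is the usual choice for numerical data. The function name `pearson_r` and the small data sets are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up data: one perfectly increasing and one perfectly decreasing relationship.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]
y_down = [10, 8, 6, 4, 2]

r_up = pearson_r(x, y_up)      # sign is positive: the variables rise together
r_down = pearson_r(x, y_down)  # sign is negative: one falls as the other rises
print(r_up, r_down)
```

The sign of the coefficient gives the direction, and its absolute value (between 0 and 1) gives the strength.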
8.2.1.3. Bivariate regression analysis
• Regression is one of the most frequently used techniques in business and social research.
• Regression analysis is used to predict the value of one variable (the dependent variable) on the basis of other variables (the independent variables).
• The most common form of regression is linear regression, where the dependent variable is related to the independent variable in a linear way.
• The linear regression equation takes the following form:

Y = β0 + β1X + ε

Variables:
X = Independent Variable (we provide this)
Y = Dependent Variable (we observe this)
Parameters:
β0 = Y-Intercept
β1 = Slope
ε = error term
Note: β1 indicates the change in the dependent variable for every unit change in the independent variable.
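As a rough illustration of how β0 and β1 can be estimated, the sketch below uses ordinary least squares, which is the standard estimation method but is not spelled out in the slides. The function name `fit_simple_regression` and the data are assumptions for the example.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates (intercept, slope) for Y = b0 + b1*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data lying exactly on the line Y = 1 + 2X (so the error term is zero).
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

b0, b1 = fit_simple_regression(x, y)
print(b0, b1)  # 1.0 2.0

# Predict Y for a new X using the fitted equation.
predicted = b0 + b1 * 6
print(predicted)  # 13.0
```

Here the slope of 2 means the dependent variable rises by 2 for every unit increase in the independent variable.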
Regression coefficient

• It is the measure of how strongly the predictor (the independent variable, IDV) predicts the dependent variable (DV).
• There are two types of regression coefficients:

1. Unstandardized coefficients
2. Standardized coefficients (Beta values)

• The unstandardized coefficients can be used in the equation, as coefficients of the different independent variables along with the constant term, to predict the value of the dependent variable.
  o Difference in "Y" per unit change in "X"
• The standardized coefficient (Beta) is measured in standard deviations, i.e. the difference in "Y", in standard deviations, per standard deviation difference in "X".
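One way to see the difference between the two kinds of coefficient: a standardized Beta can be obtained from an unstandardized coefficient b as b × (standard deviation of X) / (standard deviation of Y). The sketch below assumes this conversion for a single predictor; the function name and the data (the same measurements expressed in metres and in centimetres) are invented for illustration.

```python
from statistics import stdev


def standardized_beta(b, x, y):
    """Convert an unstandardized slope b into a standardized coefficient."""
    return b * stdev(x) / stdev(y)


# Made-up data where Y = 1 + 2X exactly, with X measured in metres.
x_metres = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
b_metres = 2.0  # change in Y per metre

# The same X measured in centimetres: the unstandardized slope shrinks by 100.
x_cm = [100 * v for v in x_metres]
b_cm = 0.02     # change in Y per centimetre

beta_metres = standardized_beta(b_metres, x_metres, y)
beta_cm = standardized_beta(b_cm, x_cm, y)
print(beta_metres, beta_cm)  # both close to 1.0
```

The unstandardized slope depends on the units of X, while the standardized coefficient does not, which is why Beta values are convenient for comparing predictors measured on different scales.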
R values
• R represents the correlation between the observed values and the predicted values (based on the regression equation obtained) of the dependent variable.
• It is used to measure the fit of the model used for the research.
• R square is the square of R and gives the proportion of variance in the dependent variable accounted for by the set of independent variables chosen for the model.
• The R-square value tends to be influenced by the number of independent variables and the number of cases; in particular, it never decreases when another independent variable is added.
• Therefore the adjusted R square, which takes these things into account, provides more accurate information about the fit of the model.
• While it is not uncommon to get an R-square value as high as 0.99 in natural science, a much lower value (0.10 – 0.20) of R² is acceptable in social science research.
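A small sketch of both quantities follows. It assumes the common definitions R² = 1 − SS_residual / SS_total and adjusted R² = 1 − (1 − R²)(n − 1) / (n − k − 1), where n is the number of cases and k the number of independent variables; the function names and data are invented for illustration.

```python
def r_squared(observed, predicted):
    """Proportion of variance in the observed values explained by the predictions."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_y) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_cases, n_predictors):
    """Penalize R-square for the number of predictors relative to the cases."""
    return 1 - (1 - r2) * (n_cases - 1) / (n_cases - n_predictors - 1)


# Made-up observations and the predictions of the least-squares line
# Y = 2.2 + 0.6X fitted to X = 1..5.
observed = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(observed, predicted)                 # about 0.60
adj_r2 = adjusted_r_squared(r2, n_cases=5, n_predictors=1)  # about 0.47
print(round(r2, 3), round(adj_r2, 3))
```

With only five cases the adjustment is large; with many cases and few predictors, R² and adjusted R² are nearly equal.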
2. Multicollinearity

• Multicollinearity is a situation in which two or more independent variables (IVs) are highly correlated with each other.
• If variables are so highly correlated with each other, it is difficult to come up with reliable estimates of their individual regression coefficients.
• In other words, when two variables are highly correlated, they both convey essentially the same information.
How to know the presence of Multicollinearity?
1. If the Variance Inflation Factor (VIF) is > 5, which means the Tolerance is < 0.2, as tolerance is the inverse of VIF.
2. If any two IDVs have a variance proportion in excess of 0.9 (column value) corresponding to any row in which the condition index is in excess of 30.

• If there is a serious multicollinearity problem, try solutions such as:
  o Removing highly correlated predictors
  o Linearly combining predictors, such as adding them together
  o Running entirely different analyses, such as principal components analysis (to know similarities and differences)
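The VIF rule of thumb can be sketched as follows. The snippet assumes the usual definitions: tolerance = 1 − R²ⱼ and VIF = 1 / tolerance, where R²ⱼ is the R-square obtained when predictor j is regressed on the other predictors. The function names and the example values of R²ⱼ are invented for illustration.

```python
def tolerance_and_vif(r_squared_j):
    """Tolerance and VIF of a predictor, given its R-square on the other predictors."""
    if not 0 <= r_squared_j < 1:
        raise ValueError("r_squared_j must be in the range [0, 1)")
    tolerance = 1 - r_squared_j
    return tolerance, 1 / tolerance


def is_multicollinear(r_squared_j, vif_threshold=5.0):
    """Apply the rule of thumb above: flag the predictor if VIF exceeds 5."""
    _, vif = tolerance_and_vif(r_squared_j)
    return vif > vif_threshold


# Made-up values: a predictor that is 90% explained by the others,
# and one that is 50% explained by the others.
tol_high, vif_high = tolerance_and_vif(0.9)   # tolerance about 0.1, VIF about 10
tol_low, vif_low = tolerance_and_vif(0.5)     # tolerance 0.5, VIF 2
print(is_multicollinear(0.9), is_multicollinear(0.5))  # True False
```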
8.2.2. Multivariate Analysis
• In many real-life situations, it becomes necessary to analyse the relationships among three or more variables, which has led to the popularity of multivariate statistics.
• Multivariate statistical techniques look at the pattern of relationships between several variables simultaneously.
• The following section deals with categories of multivariate analysis techniques.
8.2.2.1. Multiple linear regression
• In simple regression, there is one dependent variable and one independent variable, whereas in multiple regression, there is one dependent variable and many independent variables.
• It examines the relationship between a single metric dependent variable and two or more metric independent variables.
• Assumptions of normality and linearity should be checked before using multiple regression.
• The multiple regression equation takes the following form:

y = a + b1x1 + b2x2 + … + bkxk

Where: y is the dependent variable, x1, x2, … xk are the independent variables, a is the Y-intercept, and b1, b2, … bk are the regression coefficients.

Note: All the conditions and tests above also apply in the case of multivariate analysis.
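For readers who want to see how a, b1, … bk can be estimated, here is a minimal pure-Python sketch that solves the least-squares normal equations. It is an illustration under stated assumptions, not the method prescribed by the slides: the function names and the data are invented, and in practice a statistics package would normally be used.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], converted to floats.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfectly collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Least-squares estimates [a, b1, ..., bk] for y = a + b1*x1 + ... + bk*xk."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 carries the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up cases (x1, x2), with y generated exactly from y = 1 + 2*x1 + 3*x2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

a, b1, b2 = fit_multiple_regression(rows, y)
print(round(a, 6), round(b1, 6), round(b2, 6))
```

Because the invented data contain no error term, the fit recovers the intercept and both coefficients used to generate them; the singular-system check is where perfectly collinear predictors would show up.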
End

Thanks

Questions
