Case 4 - Tutorial 2

1. Descriptive statistics, including summary tables, graphs, and tests, are produced for a dataset to analyze characteristics by province and ownership.
2. A one-way ANOVA finds no significant difference in total assets between provinces. Assumptions of normality and equal variances are satisfied.
3. A two-way ANOVA tests for differences in total assets between province, ownership, and their interaction. While distributions are approximately normal, variances are unequal. The ANOVA finds no significant differences.

Uploaded by Hoàng Huế

Question 1: Produce descriptive statistics to summarize the data.

You are expected to generate as many relevant descriptive statistics as possible using ALL the relevant tools introduced in the
labs of this course. Remember to provide appropriate interpretations for the descriptive statistics.
Try not to include unnecessary or irrelevant descriptive statistics.
First, import the dataset23.csv file into R and assign it to the data frame case4.
getwd()
setwd("")
case4 <- read.table("dataset23.csv", header = TRUE, sep = ",", quote = "/",
                    stringsAsFactors = FALSE)
1. First rows of the data
head(case4)

2. Display the structure of case4 data frame

str(case4)

3. Convert character variables into factors

case4$X.province<-factor(case4$X.province, levels = c("Hanoi","Haiphong","TP HCM"))
case4$own<-factor(case4$own, levels = c("One-owner","Multi-owner"))
str(case4)
4. Summary of the data
summary(case4)

table(case4$X.province,case4$own)

5. Summary data by groups

by(case4$X.quantityproduct,list(case4$X.province,case4$own),summary)
by(case4$X.quantitysold,list(case4$X.province,case4$own),summary)

by(case4$totalass,list(case4$X.province,case4$own),summary)
# Descriptive statistics by group
install.packages("psych")
library("psych")
describeBy(case4["X.quantityproduct"],list(case4$X.province,case4$own))

describeBy(case4["X.quantitysold"],list(case4$X.province,case4$own))
describeBy(case4["totalass"],list(case4$X.province,case4$own))
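As a cross-check on describeBy(), the same group-wise summaries can be computed in base R. The sketch below is only illustrative: the data frame demo and all of its values are hypothetical (they are not taken from dataset23.csv), and only the column names mirror case4.

```r
# Hypothetical stand-in for case4; values are invented for illustration only.
demo <- data.frame(
  X.province = factor(rep(c("Hanoi", "Haiphong", "TP HCM"), each = 4),
                      levels = c("Hanoi", "Haiphong", "TP HCM")),
  own = factor(rep(c("One-owner", "Multi-owner"), times = 6),
               levels = c("One-owner", "Multi-owner")),
  totalass = c(10, 12, 14, 16, 20, 22, 24, 26, 30, 32, 34, 36)
)

# n, mean and standard deviation of totalass for each province-by-ownership cell
cell_stats <- aggregate(
  totalass ~ X.province + own, data = demo,
  FUN = function(x) c(n = length(x), mean = mean(x), sd = sd(x))
)

# aggregate() stores the three statistics in one matrix column; flatten it
# into ordinary columns named totalass.n, totalass.mean, totalass.sd
cell_stats <- do.call(data.frame, cell_stats)
print(cell_stats)
```

With the real data, replacing demo by case4 should give one row per province-by-ownership cell, which can be compared against the describeBy() output.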

# Descriptive graphs

boxplot(X.quantityproduct ~ X.province + own, data = case4,
        xlab = "Specify address of firm and Ownership status",
        ylab = "Quantity produced for the most important product",
        col = c("red", "blue", "yellow", "pink", "grey", "green"))
boxplot(X.quantitysold ~ X.province + own, data = case4,
        xlab = "Specify address of firm and Ownership status",
        ylab = "Quantity sold based on quantity produced for the most important product",
        col = c("red", "blue", "yellow", "pink", "grey", "green"))

boxplot(totalass ~ X.province + own, data = case4,
        xlab = "Specify address of firm and Ownership status",
        ylab = "Total assets in 2014",
        col = c("red", "blue", "yellow", "pink", "grey", "green"))

# plotmeans() comes from the gplots package
library(gplots)
plotmeans(X.quantitysold ~ interaction(X.province, own), data = case4,
          xlab = "Specify address of firm and Ownership status",
          ylab = "Quantity sold based on quantity produced for the most important product",
          main = "Mean Plot with 95% CI")
plotmeans(totalass ~ interaction(X.province, own), data = case4,
          xlab = "Specify address of firm and Ownership status",
          ylab = "Total assets in 2014", main = "Mean Plot with 95% CI")

Question 2: Use analysis of variance to test for any significant differences due to province. Use
a .05 level of significance, and for now, ignore the effect of types of ownership, quantity
produced and quantity sold. Check all the assumptions of the inference technique you use. Are
the assumptions satisfied? Explain.
Check assumptions:
1. All populations are normally distributed (Q-Q plot)
install.packages("car")
library(car)
qqPlot(lm(totalass ~ X.province, data = case4), simulate = TRUE, main = "Q-Q Plot",
       labels = FALSE)
2. Samples are independent simple random samples, and the sample sizes are equal (checked with
the table of counts per province)
table(case4$X.province)

3. All population variances are equal (rule of thumb: $s_{\text{largest}} < 2\,s_{\text{smallest}}$)

by(case4$totalass, case4$X.province, sd)
$$\frac{s_{\text{largest}}}{s_{\text{smallest}}} = \frac{110745.9}{11162.1} = 9.921601 > 2$$
The ratio of the largest to the smallest standard deviation is about 9.92, well above 2, so the
rule of thumb does not support pooling the variances. We therefore check again with Levene's test
and record the large ratio as a limitation.
Hypotheses:
H0: All population variances are equal
Ha: At least two population variances are different
R code:
library(car)
leveneTest(case4$totalass, case4$X.province, center = median)

p-value = 0.077
Decision rule: Reject H0 if p-value < α
We have: p-value = 0.077 > 0.05
→ Do not reject H0
→ There is not enough evidence to conclude that the population variances differ
→ Assumption 3 is treated as satisfied
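To make the two variance checks concrete, here is a minimal base-R sketch of the standard-deviation ratio and of what Levene's test with center = median computes: a one-way ANOVA on the absolute deviations from each group median. The data frame demo, its column names grp and y, and all values are hypothetical and unrelated to dataset23.csv.

```r
# Hypothetical data: three groups with invented values (not from dataset23.csv)
demo <- data.frame(
  grp = factor(rep(c("A", "B", "C"), each = 5)),
  y   = c(1, 2, 3, 4, 5,   2, 4, 6, 8, 10,   1, 3, 5, 7, 30)
)

# Rule of thumb: ratio of the largest to the smallest group standard deviation
sds <- tapply(demo$y, demo$grp, sd)
sd_ratio <- max(sds) / min(sds)

# Levene's test with center = median: one-way ANOVA on the absolute
# deviations of each observation from its own group median
abs_dev <- abs(demo$y - ave(demo$y, demo$grp, FUN = median))
levene_fit <- anova(lm(abs_dev ~ demo$grp))
levene_p <- levene_fit[["Pr(>F)"]][1]

# Decision rule: reject H0 (equal variances) if the p-value is below alpha
alpha <- 0.05
reject_h0 <- levene_p < alpha
print(c(sd_ratio = sd_ratio, levene_p = levene_p))
```

The two checks can disagree, as they do in this report: the ratio is sensitive to a single extreme group, while the median-centred test is more robust to outliers.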
Use one-way ANOVA to test for any significant differences due to province
# One-way ANOVA
aovcase4.1 <- aov(totalass ~ X.province, data = case4)
summary(aovcase4.1)
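The decision step for the one-way ANOVA can also be scripted rather than read off the printed table. The following is a sketch on simulated data (the data frame demo and its values are generated here and are not the case4 results); it shows where the p-value sits in the summary() output and how the .05 rule is applied.

```r
set.seed(1)  # simulated data, for illustration only
demo <- data.frame(
  X.province = factor(rep(c("Hanoi", "Haiphong", "TP HCM"), each = 10)),
  totalass   = rnorm(30, mean = 100, sd = 15)
)

fit <- aov(totalass ~ X.province, data = demo)
tab <- summary(fit)[[1]]          # ANOVA table as a data frame
p_value <- tab[["Pr(>F)"]][1]     # row 1 is the X.province effect

# Decision rule: reject H0 (all province means equal) if p-value < alpha
alpha <- 0.05
decision <- if (p_value < alpha) "Reject H0" else "Do not reject H0"
cat("p-value =", round(p_value, 4), "->", decision, "\n")
```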

Question 3: At the .05 level of significance, test for any significant differences due to
X.province, types of ownership, and interaction (ignore the effect of quantity produced and
quantity sold). Check all the assumptions of the inference technique you use. Are the assumptions
satisfied? Explain. Draw an interaction plot and interpret the plot. Is the plot consistent with the
conclusions?
I. Assumptions:
1) All populations are normally distributed
2) Samples were selected by using simple random sampling
3) Samples are independent
4) All population standard deviations are equal (rule of thumb: $s_{\text{largest}} < 2\,s_{\text{smallest}}$)

Assumption 1: All populations are normally distributed

To check that the populations are normally distributed, we use a Q-Q plot with the R commands:
install.packages("car")
library(car)
qqPlot(lm(totalass ~ X.province + own + own*X.province, data = case4), simulate = TRUE,
       main = "Q-Q Plot", labels = FALSE)
There are a few outliers. We still treat the populations as normally distributed and record the
outliers as a limitation.
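A Q-Q plot is judged by eye, so a numerical companion can help when a few outliers make the picture ambiguous. One option, which is not part of the original analysis, is the Shapiro-Wilk test on the model residuals. The sketch below uses simulated data (demo is hypothetical) and base R only.

```r
set.seed(2)  # simulated data, for illustration only
demo <- data.frame(
  X.province = factor(rep(c("Hanoi", "Haiphong", "TP HCM"), each = 8)),
  own        = factor(rep(c("One-owner", "Multi-owner"), times = 12)),
  totalass   = rnorm(24, mean = 100, sd = 15)
)

# Same model structure as in the Q-Q plot call: both factors and their interaction
fit <- lm(totalass ~ X.province * own, data = demo)
res <- residuals(fit)

# Base-R Q-Q plot of the residuals with a reference line
qqnorm(res, main = "Q-Q Plot of residuals")
qqline(res)

# Shapiro-Wilk test: H0 is that the residuals come from a normal distribution
sw <- shapiro.test(res)
print(sw$p.value)
```

A small p-value would argue against normality; a large one is only an absence of evidence, so the outliers would still belong in the limitations.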
Assumptions 2 & 3: Samples were selected by simple random sampling and are independent
table(case4$own, case4$X.province)

Assumption 4: All population standard deviations are equal

by(case4$totalass, list(case4$X.province, case4$own), sd)

$$\frac{s_{\text{largest}}}{s_{\text{smallest}}} = \frac{148425.6}{7562.748} = 19.62588 > 2$$

→ The sample standard deviations are not all equal. We continue with ANOVA and record this as a
limitation.
→ We still run Levene's test, even though the ratio far exceeds the rule-of-thumb threshold.
RStudio:
install.packages("car")
library(car)
leveneTest(case4$totalass, case4$own, center = median)

1. Hypotheses
H0: The population variances are equal
Ha: The population variances are not all equal
2. P-value = 0.2179
3. Rejection rule: Reject H0 if p-value < α
We have: 0.2179 > 0.05
→ Do not reject H0
4. Conclusion
→ There is not enough evidence that the variances differ across ownership types, so Assumption 4
is treated as satisfied. Note that this call groups by own only, not by the six
province-by-ownership cells.
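II. Hypotheses
H0: µ1 = µ2 (mean total assets are the same for both ownership types)
Ha: µ1 ≠ µ2 (the two population means are different)
aov2 <- aov(totalass ~ own, data = case4)
summary(aov2)

III. Rejection rule: Reject H0 if p-value < α

We have: 0.152 > 0.05
→ Do not reject H0
Conclusion
At the .05 level of significance, there is not enough evidence that mean total assets differ
between one-owner and multi-owner firms. This model contains own only, so it does not test
province or the interaction term.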
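Question 3 asks for province, ownership, and interaction effects together with an interaction plot, whereas the aov() call in this section includes own only. A minimal sketch of a two-way fit with interaction is given here on simulated data; demo and its values are hypothetical, so the printed p-values say nothing about dataset23.csv.

```r
set.seed(3)  # simulated data, for illustration only
demo <- expand.grid(
  rep        = 1:5,
  X.province = c("Hanoi", "Haiphong", "TP HCM"),
  own        = c("One-owner", "Multi-owner")
)
demo$X.province <- factor(demo$X.province)
demo$own <- factor(demo$own)
demo$totalass <- rnorm(nrow(demo), mean = 100, sd = 15)

# Two-way ANOVA: main effects of province and ownership plus their interaction
fit2 <- aov(totalass ~ X.province * own, data = demo)
tab2 <- summary(fit2)[[1]]
print(tab2)

# One p-value per effect, in table order: X.province, own, X.province:own
p_values <- tab2[["Pr(>F)"]][1:3]

# Interaction plot: roughly parallel lines suggest little interaction
with(demo, interaction.plot(X.province, own, totalass,
                            xlab = "Province", ylab = "Mean total assets",
                            trace.label = "Ownership"))
```

Applied to case4, each of the three p-values would be compared with α = 0.05, and the plot would be read for crossing or diverging lines, which indicate an interaction.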
R INPUT
Q1
# import the .csv file “dataset23.csv”
getwd()
setwd("")
case4 <- read.table("dataset23.csv", header = TRUE, sep = ",", quote = "/",
                    stringsAsFactors = FALSE)

# First rows of the data

head(case4)

# Display the structure of case4 data frame

str(case4)

# Convert character variables into factors

case4$X.province<-factor(case4$X.province, levels = c("Hanoi","Haiphong","TP HCM"))
case4$own<-factor(case4$own, levels = c("One-owner","Multi-owner"))
str(case4)

# Summary data
summary(case4)
table(case4$X.province,case4$own)

# Summary data by groups

by(case4$X.quantityproduct,list(case4$X.province,case4$own),summary)
by(case4$X.quantitysold,list(case4$X.province,case4$own),summary)
by(case4$totalass,list(case4$X.province,case4$own),summary)

# Descriptive statistics by group

install.packages("psych")
library("psych")
describeBy(case4["X.quantityproduct"],list(case4$X.province,case4$own))
describeBy(case4["X.quantitysold"],list(case4$X.province,case4$own))
describeBy(case4["totalass"],list(case4$X.province,case4$own))

# Descriptive graphs

install.packages("gplots")
library("gplots")
plotmeans(X.quantityproduct ~ interaction(X.province, own), data = case4,
          xlab = "Specify address of firm and Ownership status",
          ylab = "Quantity produced for the most important product",
          main = "Mean Plot with 95% CI")
plotmeans(X.quantitysold ~ interaction(X.province, own), data = case4,
          xlab = "Specify address of firm and Ownership status",
          ylab = "Quantity sold based on quantity produced for the most important product",
          main = "Mean Plot with 95% CI")
plotmeans(totalass ~ interaction(X.province, own), data = case4,
          xlab = "Specify address of firm and Ownership status",
          ylab = "Total assets in 2014", main = "Mean Plot with 95% CI")

Q2
# Check assumptions
# Check independence, simple random sampling, and equal sample sizes
table(case4$X.province)

# Check that the populations are normally distributed

install.packages("car")
library(car)
qqPlot(lm(totalass ~ X.province, data = case4), simulate = TRUE, main = "Q-Q Plot",
       labels = FALSE)

# Check that all population variances are equal (largest SD / smallest SD)

by(case4$totalass, case4$X.province, sd)
110745.9/11162.1

# Levene's test
library(car)
leveneTest(case4$totalass, case4$X.province, center = median)

# One-way ANOVA
aovcase4.1 <- aov(totalass ~ X.province, data = case4)
summary(aovcase4.1)

Q3
# Check assumptions
# Check independence, simple random sampling, and cell sample sizes
table(case4$own, case4$X.province)
str(case4)

# Check that the populations are normally distributed

library(car)
qqPlot(lm(totalass ~ X.province + own + own*X.province, data = case4), simulate = TRUE,
       labels = FALSE)

# Check that all population variances are equal (largest SD / smallest SD)

by(case4$totalass, list(case4$X.province, case4$own), sd)
148425.6/7562.748

# Levene's test (groups by own only)
library(car)
leveneTest(case4$totalass, case4$own, center = median)

# ANOVA on ownership (as fitted, the model contains own only)
aovcase4.2 <- aov(totalass ~ own, data = case4)
summary(aovcase4.2)