R Statistic Example

Uploaded by

yaw shu

An Example of Using R in Statistical Analysis

OUTLINE
• Correlation Analysis
• Simple Linear Regression
• Multiple Linear Regression
• Non-Linear Regression
• Hypothesis Testing
• Sampling Design (Cochran’s Formula)

Correlation Analysis
• Positive Correlation:
Variables move in the same direction.
• Negative Correlation:
Variables move in opposite directions.
• No Correlation:
No apparent relationship between the variables.

The correlation coefficient r:
• Ranges from -1 to +1
• r = +1: Perfect positive correlation
• r = -1: Perfect negative correlation
• r = 0: No correlation
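
A minimal R sketch of these cases, assuming Pearson correlation. The data values are hypothetical and chosen only for illustration; they do not come from the slides.

```r
# Hypothetical data (not from the slides): curing time in days, strength in MPa
days     <- c(1, 3, 7, 14, 28)
strength <- c(8, 15, 22, 27, 32)

r <- cor(days, strength)                # Pearson correlation (the default method)
r_test <- cor.test(days, strength)      # tests H0: the true correlation is 0

# The two limiting cases: exact linear relationships give r = +1 or r = -1
r_pos <- cor(days, 2 * days + 1)        # perfect positive correlation
r_neg <- cor(days, -2 * days)           # perfect negative correlation

cat("r =", round(r, 3), " p-value =", signif(r_test$p.value, 3), "\n")
```

Here `cor()` returns only the coefficient, while `cor.test()` also reports a p-value and confidence interval.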

Simple Linear Regression
Definition:
Simple linear regression is a statistical method to model and
predict the relationship between two variables:

• Independent Variable (X): The predictor.
• Dependent Variable (Y): The outcome being predicted.

R² (R-squared) ranges from 0 to 1, where:

• R² = 1: Perfect fit (the model explains 100% of the variance in Y).
• R² = 0: No explanatory power (the model explains none of the variance in Y).
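
As a rough illustration, a simple linear regression can be fitted in R with `lm()`. The x and y values below are hypothetical, made up for this sketch.

```r
# Hypothetical data (not from the slides)
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

fit <- lm(y ~ x)                        # fits Y = intercept + slope * X
slope <- coef(fit)[["x"]]
r2 <- summary(fit)$r.squared            # share of the variance in Y explained

# Predict Y for a new X value
y_new <- predict(fit, newdata = data.frame(x = 9))

cat("slope =", round(slope, 3), " R^2 =", round(r2, 4), "\n")
```

Calling `summary(fit)` prints the full coefficient table, including standard errors and p-values.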

Multiple Linear Regression
Definition:
Multiple linear regression is a statistical technique used to
model and predict the relationship between one dependent variable
(Y) and two or more independent variables (X₁, X₂, …, Xₖ).

R² (R-squared) ranges from 0 to 1, where:

• R² = 1: Perfect fit (the model explains 100% of the variance in Y).
• R² = 0: No explanatory power (the model explains none of the variance in Y).
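
A sketch of the same idea with two predictors. The data frame and its column names (`cement`, `age`, `strength`) are hypothetical and used only for illustration.

```r
# Hypothetical data (not from the slides): cement content (kg/m^3),
# curing age (days), and compressive strength (MPa)
d <- data.frame(
  cement   = c(300, 320, 340, 360, 380, 400, 420, 440),
  age      = c(7, 28, 14, 7, 28, 14, 28, 7),
  strength = c(21.5, 30.2, 28.1, 26.0, 35.9, 33.8, 40.1, 32.2)
)

# One dependent variable, two independent variables
fit <- lm(strength ~ cement + age, data = d)
r2 <- summary(fit)$r.squared

print(coef(fit))                        # intercept plus one slope per predictor
cat("R^2 =", round(r2, 4), "\n")
```

Additional predictors are added to the right-hand side of the formula with `+`.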

Non-Linear Regression
Definition:
Non-linear regression models the relationship between a dependent
variable (Y) and one or more independent variables (X) when the
relationship is not linear. The model fits a curve instead of a
straight line.

R² (R-squared) usually ranges from 0 to 1 (for some non-linear fits,
R² computed as 1 − SSE/SST can fall below 0), where:

• R² = 1: Perfect fit (the model explains 100% of the variance in Y).
• R² = 0: No explanatory power (the model explains none of the variance in Y).

Non-Linear Regression (Example II)
A civil engineer studies the relationship between concrete's
compressive strength (strength) and its curing time (time). It is
hypothesized that the relationship follows a logarithmic model:

Strength = a + b·ln(time)

Using this data, perform the following tasks:

1. Fit a non-linear regression model to estimate the parameters a and b.
2. Evaluate the model's fit and interpret the results.
3. Visualize the data and the fitted model.
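
The three tasks could be approached along the following lines with `nls()`. The data set for this example is not preserved in the text, so the values below are hypothetical stand-ins, and the starting values are an arbitrary assumption.

```r
# Hypothetical data (the original data set is not available in the text)
dat <- data.frame(
  days     = c(1, 3, 7, 14, 28, 56, 90),
  strength = c(10.2, 17.9, 24.1, 28.6, 33.5, 38.4, 41.0)
)

# 1. Fit Strength = a + b * ln(time); start values are an arbitrary guess
fit <- nls(strength ~ a + b * log(days), data = dat,
           start = list(a = 1, b = 1))
est <- coef(fit)

# 2. Evaluate the fit: R^2 computed as 1 - SSE/SST
sse <- sum(residuals(fit)^2)
sst <- sum((dat$strength - mean(dat$strength))^2)
r2  <- 1 - sse / sst
cat("a =", round(est[["a"]], 3), " b =", round(est[["b"]], 3),
    " R^2 =", round(r2, 4), "\n")

# 3. Visualize the data and the fitted curve (only in an interactive session)
if (interactive()) {
  plot(dat$days, dat$strength,
       xlab = "Curing time (days)", ylab = "Compressive strength (MPa)")
  curve(est[["a"]] + est[["b"]] * log(x), add = TRUE)
}
```

Because this model is linear in a and b, `lm(strength ~ log(days), data = dat)` should give the same estimates; `nls()` is used here to match the task wording.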

Hypothesis Testing

Significance Level (Alpha)
The significance level, also known as alpha or α, is an evidentiary standard that researchers set before the study.
It specifies how strongly the sample evidence must contradict the null hypothesis before you can reject the null
for the entire population.

α = 0.05 is the conventional choice.

P-values
P-values indicate the strength of the sample evidence against the null hypothesis. If the p-value is less than or
equal to the significance level, your results are statistically significant.

• When the p-value is less than or equal to the significance level, you reject the null hypothesis.
“When the p-value is low, the null must go.”

• When the p-value is greater than the significance level, you fail to reject the null hypothesis.
“If the p-value is high, the null will fly.”
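
The decision rule can be written as a small R helper. The function name `decide` is hypothetical, introduced only for this sketch.

```r
# Hypothetical helper: applies the p-value decision rule at level alpha
decide <- function(p_value, alpha = 0.05) {
  if (p_value <= alpha) "reject H0" else "fail to reject H0"
}

decide(0.03)   # "reject H0"
decide(0.20)   # "fail to reject H0"
```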

Hypothesis Testing (Example of Z-Score)
A civil engineer is testing the compressive strength of a new type of
concrete. The manufacturer claims that the average compressive
strength is 35 MPa. The engineer collects a random sample of 30
specimens and finds an average strength of 33 MPa with a standard
deviation of 4 MPa.

Perform a one-sample z-test to determine if the observed average
compressive strength significantly differs from the manufacturer's
claim at a 5% significance level.
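
One way to carry out this test in base R, using only the numbers in the problem statement. The sketch assumes a two-sided test and treats the sample standard deviation as if it were the population value, which is what a z-test with these inputs implies.

```r
# Values from the problem statement
mu0   <- 35    # claimed mean strength (MPa)
xbar  <- 33    # sample mean (MPa)
s     <- 4     # sample standard deviation (MPa), used in place of sigma
n     <- 30    # sample size
alpha <- 0.05  # significance level

z       <- (xbar - mu0) / (s / sqrt(n))   # test statistic
p_value <- 2 * pnorm(-abs(z))             # two-sided p-value
z_crit  <- qnorm(1 - alpha / 2)           # two-sided critical value

cat("z =", round(z, 3), " critical value =", round(z_crit, 3),
    " p =", signif(p_value, 3), "\n")
if (p_value <= alpha) cat("Reject H0\n") else cat("Fail to reject H0\n")
```

With these inputs z is about -2.74 and the p-value is roughly 0.006, so the null hypothesis that the mean is 35 MPa is rejected at the 5% level.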

Hypothesis Testing (Example of 1 sample T-test)

Hypothesis Testing (Paired T-Test)
A civil engineer wants to determine if the curing process significantly
improves the compressive strength of concrete samples. The engineer tests
the compressive strength of 10 samples before and after curing and
performs a paired t-test.
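
A sketch of this test with `t.test()`. The before and after measurements are hypothetical, since the original data are not preserved in the text, and the one-sided alternative is an assumption based on the word "improves".

```r
# Hypothetical strengths in MPa for the same 10 samples (not from the slides)
before <- c(20.1, 21.4, 19.8, 22.0, 20.7, 21.1, 19.5, 20.9, 21.8, 20.3)
after  <- c(24.9, 26.3, 24.1, 27.2, 25.0, 26.4, 23.8, 25.7, 26.9, 25.1)

# H0: mean(after - before) = 0; H1: mean(after - before) > 0
res <- t.test(after, before, paired = TRUE, alternative = "greater")

# The same statistic computed by hand from the paired differences
d <- after - before
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))

cat("t =", round(unname(res$statistic), 3),
    " df =", unname(res$parameter),
    " p =", signif(res$p.value, 3), "\n")
```

Pairing matters here because each sample is measured twice; the test works on the 10 differences rather than on two independent groups.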

Hypothesis Testing
(2 Sample T-Test: Equal Variance)
A civil engineer wants to determine if two types of concrete (Type A and
Type B) have significantly different compressive strengths. The engineer
collects data on compressive strength (in MPa) from 10 samples for each
type of concrete.
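
In R this comparison might look like the following. The strength values are hypothetical placeholders for the missing data.

```r
# Hypothetical compressive strengths in MPa (not from the slides)
type_a <- c(32.1, 30.5, 33.8, 31.2, 29.9, 32.7, 31.5, 30.8, 33.0, 31.9)
type_b <- c(34.2, 35.1, 33.5, 36.0, 34.8, 35.5, 33.9, 34.4, 36.2, 35.0)

# var.equal = TRUE requests the pooled-variance (Student) two-sample t-test
res <- t.test(type_a, type_b, var.equal = TRUE)

cat("t =", round(unname(res$statistic), 3),
    " df =", unname(res$parameter),          # n1 + n2 - 2
    " p =", signif(res$p.value, 3), "\n")
```

The degrees of freedom are n₁ + n₂ − 2 = 18 because the two sample variances are pooled into one estimate.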

Hypothesis Testing
(2 Sample T-Test: Unequal Variance)
A civil engineer wants to determine if the slump height (in cm) of two
different types of concrete mix designs (Mix A and Mix B) significantly
differs. The slump height is an indicator of the workability of concrete.
The engineer tests 8 samples of Mix A and 10 samples of Mix B.
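
A sketch using Welch's t-test, which is what `t.test()` runs by default when equal variances are not assumed. The slump values are hypothetical, with 8 samples for Mix A and 10 for Mix B as in the problem.

```r
# Hypothetical slump heights in cm (not from the slides)
mix_a <- c(7.5, 8.1, 6.9, 9.4, 7.2, 8.8, 6.5, 9.0)                      # 8 samples
mix_b <- c(10.2, 10.5, 9.9, 10.8, 10.1, 10.4, 10.0, 10.6, 10.3, 10.7)   # 10 samples

# Optional check: F test comparing the two variances
f_res <- var.test(mix_a, mix_b)

# var.equal = FALSE (the default) gives Welch's unequal-variance t-test
res <- t.test(mix_a, mix_b, var.equal = FALSE)

cat(res$method, "\n")
cat("t =", round(unname(res$statistic), 3),
    " df =", round(unname(res$parameter), 2),
    " p =", signif(res$p.value, 3), "\n")
```

Welch's test adjusts the degrees of freedom downward from n₁ + n₂ − 2, so the reported df is usually not a whole number.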

General Methods for Designing Sample Size

Sample Size Design (Cochran’s Formula)
Cochran’s Formula helps determine the minimum sample size needed
for a survey or experiment to achieve a desired level of
precision, confidence, and variability.
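
For reference, the standard textbook form of Cochran's formula for a proportion is written out below, together with the usual finite population correction. The symbols are not defined in the slide text and follow the common convention: z is the standard normal quantile for the chosen confidence level, p the expected proportion, e the margin of error, and N the population size.

```latex
n_0 = \frac{z^2 \, p \,(1 - p)}{e^2},
\qquad
n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}}
```

The first expression applies when the population size is unknown or very large; the second adjusts n₀ when N is known.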

Sample Size Design (Cochran’s Formula)
To determine the sample size for two cities, one large and one small, with
unknown population sizes, we can apply a conservative approach using the
sample size formula for proportions. Here's a breakdown:

Sample Size Design (Cochran’s Formula)

City    n (N unknown)   n (N known)   Margin of error e   Known population N
Large   385             384           5%                  100,000
Small   1,068           357           3%                  5,000
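
The unknown-population figures (385 at e = 5% and 1,068 at e = 3%) can be reproduced with a short R sketch. It assumes a 95% confidence level and p = 0.5, neither of which is stated in the text, and the function names are hypothetical. The finite population correction shown is the standard textbook form and is not guaranteed to be the method behind the known-population column.

```r
# Hypothetical helper: Cochran's sample size for a proportion
# (assumes p = 0.5 and 95% confidence unless told otherwise)
cochran_n0 <- function(e, p = 0.5, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)        # about 1.96 for 95% confidence
  z^2 * p * (1 - p) / e^2
}

# Hypothetical helper: standard finite population correction for known N
finite_correction <- function(n0, N) {
  n0 / (1 + (n0 - 1) / N)
}

ceiling(cochran_n0(e = 0.05))           # 385
ceiling(cochran_n0(e = 0.03))           # 1068

# Corrected size for a known population; always smaller than n0
n_known <- ceiling(finite_correction(cochran_n0(e = 0.05), N = 100000))
```

Sample sizes are rounded up with `ceiling()` because rounding down would fall short of the requested precision.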
