Statistical Analysis of Rainfall Data

Dr. Morad Abdelsalheen

1
Statistics
Statistics is a branch of science in which we:

1. Collect
2. Observe
3. Analyze
4. Interpret

Numerical Facts and Figures

2
Population & Sample
Population
A population is the entire pool from which a
statistical sample is drawn. A population may refer to
an entire group of people, objects, events, hospital
visits, or measurements. A population can thus be
said to be an aggregate observation of subjects
grouped together by a common feature.

Sample
A sample is a selection of members of a population. It
is a smaller group drawn from the population that
has the characteristics of the entire population. The
observations and conclusions made against the
sample data are attributed to the population.

3
Finite & infinite
Finite Population
A finite population is a collection of objects or
individuals under study that occupy a certain area.
It has clear boundaries that distinguish it from
other populations.
>>>>DICE

Infinite Population
An infinite population is a collection of objects or
individuals that has no clear boundaries, or whose
total number of members cannot be counted.
>>>>BLOOD CELLS

4
RAINFALL DATA???

A. Population or Sample ???

B. Finite or Infinite Population ???

C. Rainfall values for a given point are analyzed as a sample from an infinite
population, and it is usually assumed that the sample is
representative of its parent population.

5
Probability Concepts
▪ A probability is a number that reflects the chance or likelihood that a particular event will occur.
▪ Probabilities can be expressed as proportions that range from 0 to 1, and they can also be expressed as
percentages ranging from 0% to 100%.
▪ A probability of 0 indicates that there is no chance that a particular event will occur, whereas a
probability of 1 indicates that an event is certain to occur. A probability of 0.45 (45%) indicates that there
are 45 chances out of 100 of the event occurring.

Suppose an event X can happen in r ways out of a total of n possible equally likely ways.

Probability of occurrence of the event (SUCCESS):

$$\Pr(X) = \frac{r}{n}$$

Probability of non-occurrence (FAILURE):

$$\Pr(\bar{X}) = \frac{n - r}{n} = 1 - \frac{r}{n}$$

$$\Pr(X) + \Pr(\bar{X}) = 1$$
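The success and failure probabilities above can be checked with a few lines of Python. This is a minimal sketch: the function name `probability` and the die example are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction


def probability(r: int, n: int) -> Fraction:
    """Probability of an event that can happen in r of n equally likely ways."""
    if n <= 0 or not 0 <= r <= n:
        raise ValueError("need n > 0 and 0 <= r <= n")
    return Fraction(r, n)


# Illustrative case: rolling an even number with a fair six-sided die (r = 3, n = 6).
p_success = probability(3, 6)   # Pr(X) = r / n
p_failure = 1 - p_success       # Pr(X-bar) = 1 - r / n

print(p_success, p_failure, p_success + p_failure)
```

Exact fractions are used so that the sum of success and failure is exactly 1, with no floating-point rounding.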

6
Probability Concepts
Return period
The return period is the inverse of the probability of an event occurring in any one year
(the probability is often quoted in %). It gives the estimated average time interval
between events of a similar size or intensity.

For example, the return period of a flood might be 100 years; otherwise expressed as its
probability of occurring being 1/100, or 1% in any one year.
This does not mean that if a flood with such a return period occurs, then the next will occur
in about one hundred years' time - instead, it means that, in any given year, there is a 1%
chance that it will happen, regardless of when the last similar event was.

Or, put differently, it is 10 times less likely to occur than a flood with a return period of 10
years (or a probability of 10%).
$$T = \frac{1}{\Pr}$$
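The relation between return period and annual probability can be sketched in Python as follows. The function names are hypothetical, and the probability is taken as a fraction between 0 and 1 rather than a percentage.

```python
def return_period(pr: float) -> float:
    """Return period T (years) from annual probability pr (as a fraction)."""
    if not 0 < pr <= 1:
        raise ValueError("pr must be in (0, 1]")
    return 1.0 / pr


def annual_probability(t_years: float) -> float:
    """Annual probability (as a fraction) from return period T in years."""
    if t_years < 1:
        raise ValueError("T must be at least 1 year")
    return 1.0 / t_years


p100 = annual_probability(100)   # 100-year flood: 1% chance in any one year
p10 = annual_probability(10)     # 10-year flood: 10% chance in any one year

print(f"100-year event: {p100:.0%} per year")
print(f"10-year event:  {p10:.0%} per year")
print(f"ratio of probabilities: {p10 / p100:.1f}")
```

The ratio illustrates the statement in the text: the 100-year flood is about 10 times less likely in any given year than the 10-year flood.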

7
Length of data needed for analysis
What is the difference between interpolation & extrapolation?

For extrapolation :

• Water resources council bulletin 17B, 1981


➢ At least 10 years of records to warrant a statistical analysis.

• Some references
➢ Suggest a record length of at least 1/3 of the design return period (e.g., a 75-year design storm needs 25 years of records).

• USGS (United States Geological Survey)
➢ Recommends a record length of at least 1/2 of the design return period (e.g., a 50-year design storm needs 25 years of records).
➢ Then, in 1973, it recommended:
➢ 10 years of records >>>>> for a 10-year design storm
➢ 15 years of records >>>>> for a 25-year design storm
➢ 20 years of records >>>>> for a 50-year design storm
➢ 25 years of records >>>>> for a 100-year design storm
8
Descriptive statistics
Descriptive statistics allow you to characterize your data based on its properties. There are four major types of descriptive statistics:

1. Measures of Frequency:
• Count, Percent, Frequency
• Shows how often something occurs
• Use this when you want to show how often a response is given

2. Measures of Central Tendency


• Mean, Median, and Mode
• Locates the distribution by various points
• Use this when you want to show an average or most commonly indicated response

3. Measures of Dispersion or Variation


• Range, Variance, Standard Deviation
• Identifies the spread of scores by stating intervals
• Range = High/Low points
• Variance or Standard Deviation = difference between observed score and mean
• Use this when you want to show how "spread out" the data are. It is helpful to know when your data are so spread out that it affects the mean

4. Measures of Position
• Percentile Ranks, Quartile Ranks
• Describes how scores fall in relation to one another. Relies on standardized scores
• Use this when you need to compare scores to a normalized score (e.g., a national norm)

9
Descriptive statistics
Mean - Median - Mode >>>>>>>Find the Mathematical Equation!!!
• The Mean is the average of a data set.
• The Median is the middle of the set of numbers.
• The Mode is the most common number in a data set.

EX. Find the mean, median, mode, and range for the following list of values:
(13, 18, 13, 14, 13, 16, 14, 21, 13)
➢ The mean >>> (13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 15
➢ Note that the mean, in this case, isn't a value from the original list. This is a common result. You should not assume that your mean will be one of your
original numbers.

➢ The median is the middle value, so first I'll have to rewrite the list in numerical order:
➢ 13, 13, 13, 13, 14, 14, 16, 18, 21
➢ There are nine numbers in the list, so the middle one will be the (9 + 1) ÷ 2 = 10 ÷ 2 = 5th number:
➢ 13, 13, 13, 13, 14, 14, 16, 18, 21 >>> So the median is 14.

➢ The mode is the number that is repeated more often than any other, so 13 is the mode.

➢ The range is the largest value minus the smallest value, so the range is 21 − 13 = 8.
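The worked example can be reproduced with Python's standard `statistics` module. This is only a quick check of the hand calculation; the range (largest minus smallest value), which the exercise also asks for, is computed directly.

```python
import statistics

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

mean = statistics.mean(values)           # 15
median = statistics.median(values)       # 14 (5th value of the sorted list)
mode = statistics.mode(values)           # 13 (appears four times)
value_range = max(values) - min(values)  # 21 - 13 = 8

print(sorted(values))
print(mean, median, mode, value_range)
```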

10
Descriptive statistics
Outlier:
➢ is a data point that differs significantly from other observations.
➢ may be due to variability in the measurement or it may indicate experimental error.
➢ can cause serious problems in statistical analyses.

Will the mean, median, and mode be affected by an outlier in the sample? And how?


Mean will ……
Median will ……
Mode will ……

11
Descriptive statistics
Standard deviation σ:
➢ The Standard Deviation is a measure of how spread out numbers are.
➢ It is the amount of variation or dispersion of a set of values.
➢ The formula is easy: it is the square root of the Variance.
➢ So now you ask, “What is the Variance?”

• Variance σ²
➢ The average of the squared differences from the Mean.

• Coefficient of Variation (CV) or Relative Standard Deviation (RSD)


➢ is a statistical measure of the dispersion of data points in a data series around the mean. The coefficient of variation
represents the ratio of the standard deviation to the mean, and it is a useful statistic for comparing the degree of
variation from one data series to another, even if the means are drastically different from one another. It is equal to:

$$CV = \frac{\sigma}{\bar{X}}$$

12
Descriptive statistics
Standard deviation σ:
Ex. You and your friends have just measured the heights of your dogs (in millimeters):
The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and 300mm.
Find the Standard Deviation.

$$\text{mean} = \frac{600 + 470 + 170 + 430 + 300}{5} = 394\ \text{mm}$$

$$\sigma^2 = \frac{(600 - 394)^2 + (470 - 394)^2 + (170 - 394)^2 + (430 - 394)^2 + (300 - 394)^2}{5} = 21704\ \text{mm}^2$$

$$\sigma = \sqrt{21704} \approx 147\ \text{mm}$$

Standard deviation for population and for sample
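The difference between the population and the sample standard deviation can be seen on the dog-height data. The sketch below uses Python's `statistics` module: `pstdev` divides the sum of squared differences by n (as in the worked example), while `stdev` divides by n − 1. The coefficient of variation is added as the ratio of standard deviation to mean.

```python
import statistics

heights_mm = [600, 470, 170, 430, 300]

mean = statistics.mean(heights_mm)               # 394 mm
pop_variance = statistics.pvariance(heights_mm)  # 21704 mm^2 (divide by n)
pop_sd = statistics.pstdev(heights_mm)           # about 147.32 mm
sample_sd = statistics.stdev(heights_mm)         # about 164.71 mm (divide by n - 1)
cv = pop_sd / mean                               # about 0.374, dimensionless

print(f"mean = {mean} mm, variance = {pop_variance} mm^2")
print(f"population sd = {pop_sd:.2f} mm, sample sd = {sample_sd:.2f} mm")
print(f"coefficient of variation = {cv:.3f}")
```

The sample value is always the larger of the two, and the gap shrinks as the number of observations grows.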

13
Descriptive statistics
Low & high Standard deviation σ:
A standard deviation (or σ) is a measure of how dispersed the data is in relation to the mean.
➢ A low standard deviation means the data are clustered around the mean; a standard deviation close to zero
indicates that the data points are close to the mean.
➢ A high standard deviation means the data are more spread out, with data points lying
far above or below the mean.

14
Descriptive statistics
• What is Normal Distribution
Normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric
about the mean, showing that data near the mean are more frequent in occurrence than data far from the
mean. In graph form, normal distribution will appear as a bell curve.

• A normal distribution is the proper term for a probability bell curve.


• In the standard normal distribution the mean is zero and the standard deviation is 1.
• It has zero skew and a kurtosis of 3.
• Normal distributions are symmetrical, but not all symmetrical distributions are normal.

The normal distribution is the most common type of distribution assumed in statistical analyses.
The normal distribution has two parameters: the mean and the standard deviation.
For a normal distribution, about 68% of the observations are within ± one standard deviation of the mean, about 95%
are within ± two standard deviations, and about 99.7% are within ± three standard deviations.
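The 68 / 95 / 99.7 percentages can be verified numerically. A minimal sketch using `statistics.NormalDist` from the Python standard library, with a mean of 0 and a standard deviation of 1 chosen for convenience (the shares are the same for any normal distribution):

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of observations within +/- k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within +/- {k} sd: {share:.4f}")
```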

15
Descriptive statistics
Skewness
• Skewness refers to distortion or asymmetry in a symmetrical bell curve, or normal distribution, in a set of data. If the curve is shifted
to the left or to the right, it is said to be skewed.
• Skewness can be quantified as a representation of the extent to which a given distribution varies from a normal distribution. A normal
distribution has a skew of zero
• A distribution can be positively skewed (right-skewed, with the longer tail on the right) to varying degrees.
• It can also be negatively skewed (left-skewed, with the longer tail on the left) to varying degrees.
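One common way to quantify skewness is the moment coefficient: the third central moment divided by the 1.5 power of the second central moment. The sketch below assumes that definition (the slides do not state a formula), and the data sets are made up for illustration.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]            # hypothetical data, no skew
right_skewed = [1, 1, 2, 2, 3, 10]     # hypothetical data, long right tail
left_skewed = [-x for x in right_skewed]

print(skewness(symmetric))      # zero for symmetric data
print(skewness(right_skewed))   # positive
print(skewness(left_skewed))    # negative
```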

16
Descriptive statistics
Coefficient of Kurtosis
Kurtosis is a measure of how sharp the data peak is. A normal distribution (i.e. the bell-shaped curve) has a
kurtosis of 3, so traditionally the excess kurtosis (kurtosis minus 3) is reported and compared to a value of 0.0.
➢ An excess kurtosis greater than 0 indicates a peaked distribution.
➢ An excess kurtosis less than 0 indicates a flat distribution.
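A coefficient that is compared against 0 can be computed as the fourth central moment divided by the squared second central moment, minus 3. The sketch below assumes that moment-based definition, since no formula is given on the slide, and uses made-up data.

```python
def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3.0


flat = [1, 2, 3, 4, 5]                       # hypothetical, evenly spread
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]     # hypothetical, sharp central peak

print(excess_kurtosis(flat))     # negative: flatter than normal
print(excess_kurtosis(peaked))   # positive: more peaked than normal
```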

17
Descriptive statistics
Empirical Probability Equations
• Empirical probability uses the number of occurrences of an outcome within a sample set as a basis for
determining the probability of that outcome. The number of times "event X" happens, divided by the number of
trials, is the empirical probability of event X. An empirical probability is closely related to the relative
frequency of an event.
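Empirical probability as a relative frequency can be sketched in a few lines of Python. The function name, the rainfall values, and the 40 mm threshold are all hypothetical, chosen only to illustrate the counting.

```python
def empirical_probability(observations, event):
    """Relative frequency: occurrences of the event divided by number of trials."""
    if not observations:
        raise ValueError("need at least one observation")
    hits = sum(1 for x in observations if event(x))
    return hits / len(observations)


# Hypothetical rainfall depths (mm), one value per year.
rain_mm = [50, 45, 42, 39, 36, 31, 25, 22, 18, 12]

# Event X: rainfall exceeds 40 mm (3 of the 10 values).
p_exceed = empirical_probability(rain_mm, lambda x: x > 40)
print(p_exceed)
```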

18
Hydrological Data characteristics
Rank   X     Pr    T (year)      Rank   X     Pr    T (year)
1      50    0.1   10            1      500   0.1   10
2      45    0.2   5             2      45    0.2   5
3      42    0.3                 3      42    0.3
4      39    0.4                 4      39    0.4
5      36    0.5                 5      36    0.5
6      31    0.6                 6      31    0.6
7      25    0.7                 7      25    0.7
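The two X columns in the table differ only in the rank-1 value (50 versus 500). A short Python comparison shows how that single value changes the descriptive statistics. This is a sketch using the `statistics` module, and the variable names are illustrative.

```python
import statistics

x_left = [50, 45, 42, 39, 36, 31, 25]     # left-hand X column
x_right = [500, 45, 42, 39, 36, 31, 25]   # right-hand X column (rank 1 is 500)

summary = {}
for label, data in (("left", x_left), ("right", x_right)):
    summary[label] = {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "sd": statistics.stdev(data),      # sample standard deviation
    }
    print(label, summary[label])
```

The median is identical for both columns, while the mean and the standard deviation are much larger for the column containing 500.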

19
Typical Distribution
Pr(x) = F(X, X avg, SD, Sk, Kur, …..)

Types :
Normal Family (Normal , log Normal)
Extreme value Family (Gumbel)
Pearson Family (Log Pearson type III)

Characteristics
Number of distribution parameters
Range of variables
Values of Statistical Parameters

20
Typical Distribution
Distribution Fittings for What?
▪ To extrapolate
▪ To assume a population with known characteristics of sample

21
Typical Distribution
Types :
Normal Family (Normal , log Normal)
HyFrAn
Extreme value Family (Gumbel) >>>>>> SEARCH
Pearson Family (Log Pearson type III)

Frequency factor method


$$X_p = \bar{X} + k \cdot S$$

k depends on:
1. Distribution
2. Return period
3. Sample statistics
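The frequency factor method can be sketched in Python for two of the distributions listed. The assumptions are as follows: X̄ and S are taken as the sample mean and sample standard deviation; for the normal distribution k is the standard normal variate at non-exceedance probability 1 − 1/T; for the Gumbel distribution k follows the commonly quoted large-sample expression k = −(√6/π)[0.5772 + ln(ln(T/(T − 1)))]. The function names are hypothetical, and the data are the left-hand X column of the table in this lecture, used purely for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def k_normal(t_years: float) -> float:
    """Frequency factor for a normal distribution (standard normal variate)."""
    return NormalDist().inv_cdf(1.0 - 1.0 / t_years)


def k_gumbel(t_years: float) -> float:
    """Frequency factor for a Gumbel distribution (large-sample form, assumed)."""
    return -(math.sqrt(6) / math.pi) * (
        0.5772 + math.log(math.log(t_years / (t_years - 1.0)))
    )


def design_value(data, t_years, k_func):
    """X_p = X_bar + k * S for return period t_years (must be > 1 year)."""
    return mean(data) + k_func(t_years) * stdev(data)


x = [50, 45, 42, 39, 36, 31, 25]   # illustrative data only

for t in (10, 50, 100):
    print(t, round(design_value(x, t, k_normal), 1),
          round(design_value(x, t, k_gumbel), 1))
```

For the same return period the Gumbel factor is larger than the normal one at long return periods, so the choice of distribution directly affects the extrapolated design value.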
22
