Hydrology Time Series Analysis

This document describes an analysis of annual mean runoff data from a station. The analysis includes: 1) Testing for trends using linear regression and finding a significant positive trend. 2) Removing the trend from the data by calculating the difference between observed and estimated values from the regression model. 3) Calculating the autocorrelation coefficient up to a time lag of 5 years and determining which lags are statistically significant by comparing to confidence limits.

Uploaded by

Bikas C. Bhattarai


9/26/2020 Jupyter Notebook Viewer


Exercise 6: Time series analysis and stochastic modelling
Name: Bikas Chandra Bhattarai

Date: 2015/10/12

Question 1: Time series analysis

The annual mean runoff data in a station are given in the file Qyearlymean.txt.

1. Investigate if there is a trend using the linear regression method for α = 10%.
2. If the trend is significant, remove the trend from the data series.
3. Calculate and plot the autocorrelation $\rho_k$ and 95% confidence limits for time shifts, k, up to 5 years. Check up to which time lag $\rho_k$ is significant.

Test for significance of b

$$X_i = a + bT_i$$

$$H_0: b = 0; \quad H_a: b \neq 0$$

Test statistic:

$$t = \frac{b - 0}{s_b}$$

$H_0$ is rejected if $|t| \geq t_{1-\alpha/2,\,n-2}$, i.e. b is significantly different from zero.

Where

$$s_b = \sqrt{\frac{S^2}{\sum_{t=1}^{n} (T_t - \bar{T})^2}} \quad \text{and} \quad S^2 = \frac{\sum_{i=1}^{n} (X_i - \hat{X}_i)^2}{n-2} \tag{1}$$
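The test statistic built from equation (1) can also be computed by hand, which makes it easier to see where the values in a regression summary come from. The following is a minimal sketch, assuming NumPy is available; the function name `slope_t_statistic` and the synthetic data are illustrative and are not part of the original notebook.

```python
import numpy as np

def slope_t_statistic(T, X):
    """Return (a, b, s_b, t) for the least-squares line X = a + b*T."""
    T = np.asarray(T, dtype=float)
    X = np.asarray(X, dtype=float)
    n = len(X)
    b, a = np.polyfit(T, X, deg=1)            # polyfit returns slope first, then intercept
    residuals = X - (a + b * T)               # observed minus estimated values
    S2 = np.sum(residuals ** 2) / (n - 2)     # residual variance S^2 from equation (1)
    s_b = np.sqrt(S2 / np.sum((T - T.mean()) ** 2))
    return a, b, s_b, (b - 0.0) / s_b

# Synthetic example (NOT the station data): a weak trend plus noise
rng = np.random.default_rng(0)
T_demo = np.arange(1, 52)
X_demo = 500.0 + 2.0 * T_demo + rng.normal(0.0, 120.0, size=T_demo.size)
a_demo, b_demo, sb_demo, t_demo = slope_t_statistic(T_demo, X_demo)
print("b = %.3f, s_b = %.3f, t = %.3f" % (b_demo, sb_demo, t_demo))
```

The returned t value is then compared with the critical value of the t-distribution with n − 2 degrees of freedom.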

The equation for the autocorrelation coefficient, $\rho_k$, is given as

$$\rho_k = \frac{Cov(X_t, X_{t+k})}{Var(X_t)} = \frac{\frac{1}{n-1}\sum_{t=1}^{n-k} (X_t - \bar{X})(X_{t+k} - \bar{X})}{\frac{1}{n-1}\sum_{t=1}^{n} (X_t - \bar{X})^2} \tag{2}$$

Test the significance of ρ k

https://nbviewer.jupyter.org/github/bikasbhattarai/Course-work-and-data-analysis/blob/master/Hydrology-Course/GEO4310_20… 1/18
The lower (l) and upper (u) confidence limits are:

$$l = \frac{-1 - z_{1-\alpha/2}\sqrt{n-2}}{n-1}$$

and

$$u = \frac{-1 + z_{1-\alpha/2}\sqrt{n-2}}{n-1}$$

$$H_0: \rho_k = 0; \quad H_a: \rho_k \neq 0$$

If the calculated $\rho_k$ falls outside these confidence limits, then the hypothesis is rejected, i.e., $\rho_k$ is significant at significance level α and time lag k.
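Equation (2) and the confidence limits translate almost directly into code. The sketch below assumes NumPy and the standard-library `statistics.NormalDist` for the normal quantile; the function names `autocorr` and `confidence_limits` are illustrative choices, not names from the notebook.

```python
import numpy as np
from statistics import NormalDist

def autocorr(x, k):
    """Lag-k autocorrelation following equation (2)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    # the 1/(n-1) factors in numerator and denominator cancel
    return float(np.sum(d[:n - k] * d[k:]) / np.sum(d ** 2))

def confidence_limits(n, alpha=0.05):
    """Lower and upper limits for rho_k under H0: rho_k = 0."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # about 1.96 for alpha = 5%
    root = np.sqrt(n - 2)
    return (-1.0 - z * root) / (n - 1), (-1.0 + z * root) / (n - 1)

lower, upper = confidence_limits(51)
print("limits for n = 51: %.4f, %.4f" % (lower, upper))
print("lag-1 autocorrelation of [1, 2, 3, 4, 5]: %.2f" % autocorr([1, 2, 3, 4, 5], 1))
```

A lag is flagged as significant when `autocorr(x, k)` falls outside the interval returned by `confidence_limits`.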

In [2]:

%matplotlib inline
import matplotlib.pyplot as plt
import scipy.stats
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
from scipy.stats import pearsonr
from pandas import Series
import statsmodels.tsa.stattools

In [3]:

q=pd.read_table('Qyearlymean.txt')
q.columns = ([u'YEAR', u'T', u'discharge'])
discharge = q['discharge']
print(q.head(3))
print(len(q))

YEAR T discharge
0 1940 1 398.0
1 1941 2 429.1
2 1942 3 690.2
51


In [4]:

q= pd.read_table('Qyearlymean.txt')
# fit an OLS model with discharge as the dependent variable and time (T) as the independent variable
fd = smf.ols(formula='discharge ~ T', data= q).fit()
#print summary statistics
print(fd.summary())

OLS Regression Results

==========================================================================
Dep. Variable: discharge R-squared: 0
Model: OLS Adj. R-squared: 0
Method: Least Squares F-statistic: 2
Date: Tue, 13 Oct 2015 Prob (F-statistic): 0
Time: 09:41:14 Log-Likelihood: -3
No. Observations: 51 AIC: 6
Df Residuals: 49 BIC: 6
Df Model: 1
Covariance Type: nonrobust
==========================================================================
coef std err t P>|t| [95.0% Conf.
-------------------------------------------------------------------------
Intercept 491.2333 36.817 13.343 0.000 417.247 565
T 2.1136 1.232 1.715 0.093 -0.363 4
==========================================================================
Omnibus: 2.044 Durbin-Watson: 0
Prob(Omnibus): 0.360 Jarque-Bera (JB):
Skew: 0.348 Prob(JB): 0
Kurtosis: 2.390 Cond. No.
==========================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is co

So the regression model becomes:

$$X_t = a + bT_t = 491.23 + 2.11 \, T_t$$

The slope coefficient is now tested to check whether it is significantly different from zero. The test hypotheses are:

$$H_0: b = 0$$

$$H_a: b \neq 0$$

If $|t| > t_{critical}$, then $H_0$ is rejected.

From the regression table above, |t| = 1.715.

In [5]:

# Critical value of the t-distribution for the two-sided test at alpha = 10%
# (quantile 1 - alpha/2 = 0.95) with n - 2 degrees of freedom
round(scipy.stats.t.ppf(0.95,len(q)-2),3)

Out[5]:

1.677

From the t-distribution, $t_{critical} = t_{1-\alpha/2,\,n-2}$ = 1.677 (for α = 10%)


Conclusion: The |t| value (1.715) is greater than $t_{critical}$, so $H_0$ is rejected at α = 10% and it is concluded that there is a significant trend.

2. If the trend is significant, remove the trend from the data series.

To remove the trend, the difference between the observed values and the values estimated by the regression line (the error, or residual) is calculated. These residuals contain no linear trend. To get the discharge time series without trend, the intercept is added to each residual.
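The detrending step described above (intercept plus residuals) can be sketched as follows. This is an illustration on synthetic data, assuming NumPy; the helper name `remove_linear_trend` is hypothetical and does not appear in the notebook.

```python
import numpy as np

def remove_linear_trend(T, X):
    """Return (detrended series, intercept a, slope b) for X = a + b*T."""
    T = np.asarray(T, dtype=float)
    X = np.asarray(X, dtype=float)
    b, a = np.polyfit(T, X, deg=1)
    residuals = X - (a + b * T)      # error: observed minus estimated
    return a + residuals, a, b       # intercept added back to each residual

# Synthetic series (NOT the station data): linear trend plus a deterministic wiggle
T_demo = np.arange(1, 21)
X_demo = 400.0 + 3.0 * T_demo + 25.0 * np.sin(T_demo)
detrended, a_fit, b_fit = remove_linear_trend(T_demo, X_demo)

# Refitting a line to the detrended series should give a slope of about zero
refit_slope, refit_intercept = np.polyfit(T_demo, detrended, deg=1)
print("original slope %.3f, slope after detrending %.6f" % (b_fit, refit_slope))
```

Subtracting only `b*T` from the observations, as the notebook cell does, gives the same series because `X - b*T` equals `a + residuals`.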


In [6]:

# Defining x-axis
year = q['T']
#creating plot
fig, ax = plt.subplots(figsize=(15,7))

#creating linear fit

fit = np.polyfit(year, discharge, deg=1)

#plotting trend line to the figure

ax.plot(year, fit[0] * year + fit[1], color='red')
ax.plot(year, discharge,color='red', label = 'with trend')
plt.legend()
#setting the font size for x and y-axis
plt.rc('xtick', labelsize=16)
plt.rc('ytick', labelsize=16)

#printing the linear equation of fit

print('y = %.2fx + %.3f'%(fit[0],fit[1]))

#plot for without trend

#creating x-axis for the data with out trend
x1 = year
#date = range(1940, 1991,1)
#removing the trend from data ( compendium 139, 2013)
y1 = 2.114*x1
without_trend = discharge-y1
ax.plot(x1,without_trend,label='without trend')
plt.legend()

#creating linear fit for non trended data series

fit = np.polyfit(x1, without_trend, deg=1)
ax.plot(x1, fit[0] * x1 + fit[1], color='blue')
ax.set_xlim([0,52])
# display the equation for data without trend
print('y_without_trend = %.2fx + %.3f'%(fit[0],fit[1]))

plt.ylabel('Discharge ($m^3/s$)', fontsize =18)

plt.xlabel('Year',fontsize=18)
plt.show()

y = 2.11x + 491.233
y_without_trend = -0.00x + 491.233

3) Calculate and plot the autocorrelation $\rho_k$ and 95% confidence limits for time shifts, k, up to 5 years. Check up to which time lag $\rho_k$ is significant.

Solution: The equation for the autocorrelation coefficient, $\rho_k$, is given as

$$\rho_k = \frac{Cov(X_t, X_{t+k})}{Var(X_t)} = \frac{\frac{1}{n-1}\sum_{t=1}^{n-k} (X_t - \bar{X})(X_{t+k} - \bar{X})}{\frac{1}{n-1}\sum_{t=1}^{n} (X_t - \bar{X})^2}$$

which is computed with the Python function statsmodels.tsa.stattools.acf(Series, nlags=Timelag). Confidence intervals are calculated by using the formulas given below:

$$l = \frac{-1 - z_{1-\alpha/2}\sqrt{n-2}}{n-1}$$

and

$$u = \frac{-1 + z_{1-\alpha/2}\sqrt{n-2}}{n-1}$$

$$H_0: \rho_k = 0; \quad H_a: \rho_k \neq 0$$

If the calculated $\rho_k$ falls outside these confidence limits, then the hypothesis is rejected, i.e., $\rho_k$ is significant at significance level α and time lag k.

In [7]:

# series of sample sizes used as n in the confidence-limit formula (51 down to 46, one value per lag)

v = pd.Series(range(51 ,45, -1))
# Defining the time lag as index in the dataframe
index = range(0 ,6)
# creating dataframe
auto = pd.DataFrame( index = index)
# assigning the title to index as time lag
auto.columns.name = 'Time lag'
# calculating the autocorrelation up to nlags = 5 with the statsmodels acf function
auto['autocorr (pk)'] = Series(statsmodels.tsa.stattools.acf(without_trend, nlags=5))
# calculating lower confidence interval where 1.96 is the normal-distribution quantile for alpha = 5%
auto['CI-lower'] = (-1-1.96*np.sqrt(v-2))/(v-1)
# calculating upper confidence interval
auto['CI-upper'] = (-1+1.96*np.sqrt(v-2))/(v-1)
# displaying the data
auto

Out[7]:

Time lag  autocorr (pk)   CI-lower  CI-upper
0              1.000000  -0.294400  0.254400
1              0.527878  -0.297536  0.256720
2              0.453294  -0.300773  0.259106
3              0.386475  -0.304114  0.261561
4              0.194384  -0.307567  0.264089
5              0.159846  -0.311137  0.266693

Conclusion: From the above calculation, $H_0: \rho_k = 0$ is rejected up to a time lag of 3 years. This means the autocorrelation coefficient is significant at α = 5% up to a time lag of 3 years. The autocorrelation for time lags up to 5 years, together with the confidence limits at the 5% level of significance, is plotted below:
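The conclusion can be checked mechanically from the tabulated values. The short sketch below uses only the numbers reported in Out[7]; the variable names are illustrative.

```python
# Values copied from the Out[7] table (lags 0 to 5)
rho = [1.000000, 0.527878, 0.453294, 0.386475, 0.194384, 0.159846]
ci_lower = [-0.294400, -0.297536, -0.300773, -0.304114, -0.307567, -0.311137]
ci_upper = [0.254400, 0.256720, 0.259106, 0.261561, 0.264089, 0.266693]

# Walk through lags 1, 2, ... and stop at the first one inside the limits
last_significant_lag = 0
for k in range(1, len(rho)):
    inside = ci_lower[k] <= rho[k] <= ci_upper[k]
    if inside:
        break
    last_significant_lag = k

print("autocorrelation is significant up to lag", last_significant_lag)
```

Lag 0 is skipped because the autocorrelation of a series with itself is always 1.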


In [8]:

fig, ax = plt.subplots(figsize=(10,5))
ax.plot(auto, marker='o')
plt.grid()
plt.legend(['Auto correlation', 'Lower CI' , 'Upper CI'])
plt.ylabel('rL', fontsize =18)
plt.xlabel('Time lag (Years)',fontsize=18)
plt.show()

In Python there is a built-in function in statsmodels.graphics.tsaplots called plot_acf that calculates and plots the autocorrelation together with its 95% confidence interval. The plot below is produced with plot_acf; the shaded region shows the confidence interval.

In [9]:

from statsmodels.graphics.tsaplots import plot_acf

plot_acf(without_trend)
plt.show()

Question 2: Stochastic models for stationary time series

Fit the first order Markov model for annual data (after the removal of trend if there is any)

$$X_{t+1} = \mu + \beta_1 (X_t - \mu) + \epsilon_{t+1} \tag{3}$$

Parameter estimation:

$\hat{\beta}_1 = \rho_1$, where $\rho_1$ is calculated using equation (2).

$$\hat{\sigma}_\epsilon^2 = \sigma_x^2 (1 - \hat{\beta}_1^2) \quad \text{where} \quad \sigma_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

and

$$\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Equation (3) becomes

$$x_t = \hat{\mu}_x + \hat{\beta}_1 (x_{t-1} - \hat{\mu}_x) + z_t \sigma_x \sqrt{1 - \hat{\beta}_1^2} \tag{4}$$

Procedure of using equation (4):

1. Estimate $\mu_x$, $\sigma_x$ and $\beta_1$ and substitute them in (4).
2. Generate a random series $z_t$ from an N(0,1) distribution, e.g. with N_yrs = 1991-1960+1 and random.standard_normal((N_yrs,)).
3. Use the 1959 value as the initial value and simulate for the period 1960-1991.
4. Compare the simulated time series with the observed time series. Plot the two time series in the same graph, then calculate and compare the statistics of the two series, i.e. mean, standard deviation, $\rho_1$ and $\rho_2$, etc.
5. Use the AR(1) model to simulate 100 samples, each having 33 years, calculate the mean values of the statistics (average, standard deviation, r1) from the 100 samples and compare them with the statistics calculated from the observed 33-year data series.

Note that in step (4), one simulation sample is compared with the observed data, while in step (5) the mean values
of 100 samples are compared with the observed values.
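Steps 1 to 3 describe a recursive simulation: each simulated value feeds into the next one through equation (4). The following is a minimal sketch of that recursion, assuming NumPy; the function name `simulate_ar1` and the seed are illustrative, while the parameter values in the example call are the ones estimated in this exercise.

```python
import numpy as np

def simulate_ar1(x0, mu, sigma, beta, n_steps, rng):
    """Simulate n_steps values with equation (4), starting from x0."""
    z = rng.standard_normal(n_steps)            # z_t ~ N(0, 1)
    noise_scale = sigma * np.sqrt(1.0 - beta ** 2)
    x = np.empty(n_steps)
    prev = float(x0)
    for t in range(n_steps):
        x[t] = mu + beta * (prev - mu) + z[t] * noise_scale
        prev = x[t]                             # simulated value becomes x_{t-1}
    return x

# Example call: 32 years starting from the 1959 value (seed chosen arbitrarily)
rng = np.random.default_rng(42)
sim_demo = simulate_ar1(x0=552.63, mu=491.22, sigma=128.23, beta=0.528,
                        n_steps=32, rng=rng)
print("simulated mean %.1f, std %.1f" % (sim_demo.mean(), sim_demo.std(ddof=1)))
```

Because the random series changes with the seed, the printed statistics differ from run to run unless the seed is fixed.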

Solution:

In [10]:

#Calculating the different statistics for the data:

#Defining the dataframe
frame= pd.DataFrame()
#counting the number of data
n = len(without_trend)
#calculating mean
mean = 491.22
frame['no_obs'] = Series(len(without_trend))
frame['mean']= np.mean(without_trend).round(2)
frame['std'] = (np.sqrt(sum((without_trend-mean)**2)/(n-1))).round(2)
#Displaying the basic statistics of data with out trend
frame

Out[10]:

no_obs mean std

0 51 491.22 128.23

Now the first order Markov model is fitted to the annual data after the trend is removed, and equation (4) is used to simulate the discharge. The autocorrelation for a time lag of 1 year is calculated by using the Python function:


In [11]:

#calculating the autocorrelation

auto_1 = statsmodels.tsa.stattools.acf(without_trend, nlags=1)
#converting the array data in to series
auto_1 = Series(auto_1[1,]).round(3)
print(auto_1)

0 0.528
dtype: float64

Here, the autocorrelation with a time lag of 1 year is found to be 0.528

$\beta_1 = \rho_1$ = auto_1 = 0.528

mean = $\bar{x}$ = 491.22

std = $\sigma_x$ = 128.23

and $z_t$ is a series generated from the standard normal distribution N(0,1). The initial value for $X_{t-1}$ is selected as the 1959 value, i.e. 552.63 $m^3/s$.
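As a quick worked example of equation (4) with these numbers, the sketch below computes the standard deviation of the random term and the deterministic part of the first simulated value. It uses only the values stated above; the variable names are illustrative.

```python
import math

beta_1 = 0.528      # lag-1 autocorrelation
mu_x = 491.22       # mean of the detrended series
sigma_x = 128.23    # standard deviation of the detrended series
x_1959 = 552.63     # initial value X_{t-1}

# Standard deviation of the random term: sigma_x * sqrt(1 - beta_1^2)
sigma_eps = sigma_x * math.sqrt(1.0 - beta_1 ** 2)

# Deterministic part of the 1960 value: mu + beta_1 * (x_{t-1} - mu)
expected_1960 = mu_x + beta_1 * (x_1959 - mu_x)

print("sigma_eps = %.2f" % sigma_eps)
print("expected 1960 value before adding noise = %.2f" % expected_1960)
```

The simulated 1960 value is this deterministic part plus `z * sigma_eps` for one draw z from N(0,1).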

In [12]:

mean = 491.22
std = 128.23
#Selecting the data series 1959 to 1990
no_trend = without_trend[19:]
#generating normally distributed random data series
z= np.random.normal(0, 1, 32)
# Simulating the discharge by using equation (4) with beta_1 = 0.528 (the lag-1 autocorrelation estimated above)
simulated = pd.DataFrame(mean+(0.528*(no_trend -mean))+(z*std*(np.sqrt(1-0.528**2))))

Calculating the autocorrelation up to a 2-year time lag for the observed and simulated datasets.

In [13]:

auto_2_obs = pd.DataFrame(statsmodels.tsa.stattools.acf(no_trend, nlags=2))[1:]

print(auto_2_obs.T)

1 2
0 0.595307 0.614706

In [14]:

auto_2_sim = pd.DataFrame(statsmodels.tsa.stattools.acf(simulated, nlags=2))[1:]

print(auto_2_sim.T)

1 2
0 0.252234 0.267889

Calculating other statistics and making the dataframe


In [15]:

index = ['Observed','Simulated']
statistic = pd.DataFrame(index = index)
statistics = pd.DataFrame({'Auto_1': [0.595, 0.128], 'Auto_2': [0.6147, 0.0735]}, index=index)
statistics['Mean'] = round(no_trend.mean(),2),round(simulated.mean(),2)
statistics['std'] = round(no_trend.std(),2), round(simulated.std(),2)
statistics['median'] = round(no_trend.median(),2), round(simulated.median(),2)
statistics

Out[15]:

           Auto_1  Auto_2    Mean     std  median
Observed    0.595  0.6147  494.06  126.30  471.80
Simulated   0.128  0.0735  511.54  110.66  511.32

In [16]:

fig, ax = plt.subplots(figsize=(13,5))
ax.plot(no_trend, label='observed')
ax.plot(simulated, label='simulated')
plt.grid()
ax.set_xlim([1 ,31])
plt.legend()
plt.xlabel('Years',fontsize =18)
plt.ylabel('Discharge ($m^3/s$)',fontsize =18)
plt.show()

Conclusion: The statistics for the observed and modelled time series (1960-1990) are given in the table above. It is also clear from the plot that the difference between the observed and simulated data series is large. The comparison shows that the Markov-modelled time series is more random, as expected when random z values from a normal distribution are included in the Markov model: its lag-1 and lag-2 autocorrelations are much lower than the observed ones. In this run the simulated series has a higher mean and a lower standard deviation than the observed series. The simulated mean and standard deviation change every time the model is run because of the random numbers.

5. Use the AR(1) model to simulate 100 samples, each having 33 years, calculate the mean values of the statistics (average, standard deviation, r1) from the 100 samples and compare them with the statistics calculated from the observed 33-year data series.


In [395]:

#Defining the dataframe

df = pd.DataFrame()
#defining the function for n times model run
def repeat_fun(times, f, args):
    for i in range(times):
        f(i, args)

def f(iteration, df):
    # one model run: new random series, equation (4) with beta_1 = 0.528
    g = np.random.normal(0, 1, 32)
    mm = Series(no_trend)
    x = 491.22+(0.528*(mm -491.22))+(g*128.23*(np.sqrt(1-0.528**2)))
    df[iteration] = x

repeat_fun(100,f,df)
#Dataframe with 100 columns and 32 rows will be obtained by printing df.
#calculating the mean for 100 times run datasets
Simulated_100 =df.mean(axis=1)

Calculating the autocorrelation up to a 2-year time lag for the mean series calculated from the 100 simulation runs.

In [397]:

auto_2_100_sim = pd.DataFrame(statsmodels.tsa.stattools.acf(Simulated_100, nlags=2))[1:]

print(auto_2_100_sim.T)

1 2
0 0.219762 0.432154

Calculating and comparing the different statistics for the mean of 100 times run and observed data without trend.

In [398]:

ind = ['Observed','Simulated']
simulated_100 = pd.DataFrame(index = ind)
simulated_100 = pd.DataFrame({'Auto_1': [0.595, 0.128], 'Auto_2': [0.219, 0.432]}, index=ind)
simulated_100['Mean'] = round(no_trend.mean(),2),round(Simulated_100.mean(),2)
simulated_100['std'] = round(no_trend.std(),2), round(Simulated_100.std(),2)
simulated_100['median'] = round(no_trend.median(),2), round(Simulated_100.median(),2)
simulated_100

Out[398]:

           Auto_1  Auto_2    Mean     std  median
Observed    0.595   0.219  494.06  126.30  471.80
Simulated   0.128   0.432  491.35   70.28  476.09


In [401]:

fig, ax = plt.subplots(figsize=(13,5))
ax.plot(Simulated_100,label='Simulated_100')
ax.plot(no_trend, label='observed')
plt.grid()
ax.set_xlim([1 ,31])
plt.ylabel('Discharge($m^3/s$)', fontsize =18)
plt.xlabel('Years',fontsize=18)
plt.legend()
plt.show()

Conclusion: The mean and standard deviation of the mean simulated discharge are lower than those of the observed series. The lower standard deviation shows that the averaged simulated series varies comparatively less than the observed data, which is expected because averaging 100 runs year by year smooths out much of the random component.
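An alternative reading of step 5 is to compute each statistic per simulated sample first and only then average over the 100 samples, which avoids the smoothing effect of averaging the series themselves. The sketch below illustrates this under stated assumptions: it uses NumPy, simulates recursively from the 1959 value with the parameters estimated above, and the function name `simulate_many` and the seed are illustrative and not taken from the notebook.

```python
import numpy as np

def simulate_many(x0, mu, sigma, beta, n_years, n_samples, rng):
    """Return an array (n_samples, n_years) of AR(1) series from equation (4)."""
    z = rng.standard_normal((n_samples, n_years))
    scale = sigma * np.sqrt(1.0 - beta ** 2)
    x = np.empty((n_samples, n_years))
    prev = np.full(n_samples, float(x0))
    for t in range(n_years):
        x[:, t] = mu + beta * (prev - mu) + z[:, t] * scale
        prev = x[:, t]
    return x

rng = np.random.default_rng(2015)   # arbitrary seed for reproducibility
samples = simulate_many(552.63, 491.22, 128.23, 0.528,
                        n_years=32, n_samples=100, rng=rng)

# Statistics per sample, then averaged over the 100 samples
d = samples - samples.mean(axis=1, keepdims=True)
r1 = np.sum(d[:, :-1] * d[:, 1:], axis=1) / np.sum(d ** 2, axis=1)
mean_of_means = samples.mean(axis=1).mean()
mean_of_stds = samples.std(axis=1, ddof=1).mean()
mean_of_r1 = r1.mean()

# For comparison: std of the year-by-year average of all samples
std_of_averaged_series = samples.mean(axis=0).std(ddof=1)

print("mean of means %.1f, mean of stds %.1f, mean r1 %.2f"
      % (mean_of_means, mean_of_stds, mean_of_r1))
print("std of the averaged series %.1f" % std_of_averaged_series)
```

With this approach the averaged standard deviation stays close to the standard deviation of a single series, whereas the standard deviation of the averaged series is much smaller.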

Question 3: Stochastic model for non-stationary time series

Adapt a stochastic model to monthly discharge COLUMBIA_M1.txt.

Thomas-Fiering is a generalization of the first order Markov model. In general for the stochastic models (MA, AR,
ARMA) stationarity is assumed. For example, the first order Markov model assumes stationarity in the first three
moments. However, there are generalizations of these models that account for non-stationarity. For data sets which
are seasonally non-stationary, you can use Thomas-Fiering.

1. Fit the Thomas-Fiering Model to the data and simulate the monthly discharge from January 1949 to December
1975 using the value of December 1948 as the initial value. Plot the simulated and observed data on the same
graph and compare the statistics of the two data series
2. Simulate the 100-year discharge series from 1976 to 2075 using the value of December 1975 as initial value
and compare the statistics of the simulated data series with the observed data series.

The Thomas-Fiering Model is given as:

$$\hat{q}_{j+1,i} = \bar{q}_{j+1} + b_j (q_{j,i} - \bar{q}_j) + z_{j+1,i} \cdot s_{j+1} \sqrt{1 - r_j^2} \tag{1}$$

Where i = year, j = month, and z is a random series with N(0, 1).

Calculation procedure:

(a) The mean flow:

$$\bar{q}_j = \frac{1}{n}\sum_i q_{j,i}, \quad \text{where } i = \{j, 12+j, 24+j, \cdots\}$$

(b) The standard deviation:

$$s_j = \sqrt{\frac{\sum_i (q_{j,i} - \bar{q}_j)^2}{n-1}}$$

(c) The correlation coefficient with flow in the next month:

$$r_j = \frac{\sum_i (q_{j,i} - \bar{q}_j)(q_{j+1,i} - \bar{q}_{j+1})}{\sqrt{\sum_i (q_{j,i} - \bar{q}_j)^2 \sum_i (q_{j+1,i} - \bar{q}_{j+1})^2}}$$

(d) The slope of the regression equation relating the month's flow to the flow in the preceding month:

$$b_j = r_j \frac{s_{j+1}}{s_j}$$
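Steps (a) to (d) and the model equation can be sketched compactly with NumPy on a table of shape (years, 12). This is an illustrative sketch, not the notebook's implementation: the function names are hypothetical, the input data are synthetic, and December is paired with January of the following year, so the December correlation uses one pair fewer than the other months.

```python
import numpy as np

def thomas_fiering_parameters(q):
    """q: array (n_years, 12), columns Jan..Dec. Returns mean, std, r, b per month."""
    q = np.asarray(q, dtype=float)
    mean = q.mean(axis=0)                       # (a) monthly mean flow
    std = q.std(axis=0, ddof=1)                 # (b) monthly standard deviation
    r = np.empty(12)
    for j in range(12):
        if j < 11:
            cur, nxt = q[:, j], q[:, j + 1]
        else:
            cur, nxt = q[:-1, 11], q[1:, 0]     # Dec of year i with Jan of year i+1
        r[j] = np.corrcoef(cur, nxt)[0, 1]      # (c) correlation with the next month
    b = r * np.roll(std, -1) / std              # (d) b_j = r_j * s_{j+1} / s_j
    return mean, std, r, b

def thomas_fiering_simulate(q_init, month_init, n_months, params, rng):
    """Simulate n_months flows; q_init is the flow in month_init (1..12)."""
    mean, std, r, b = params
    out = np.empty(n_months)
    prev, j = float(q_init), month_init - 1     # zero-based index of the previous month
    for k in range(n_months):
        nxt = (j + 1) % 12
        noise = rng.standard_normal() * std[nxt] * np.sqrt(1.0 - r[j] ** 2)
        out[k] = mean[nxt] + b[j] * (prev - mean[j]) + noise
        prev, j = out[k], nxt
    return out

# Synthetic monthly flows (NOT the Columbia data): seasonal cycle plus noise
rng = np.random.default_rng(7)
seasonal = 5000.0 + 3000.0 * np.sin(2.0 * np.pi * (np.arange(12) - 3) / 12.0)
q_demo = seasonal + rng.normal(0.0, 500.0, size=(30, 12))
params_demo = thomas_fiering_parameters(q_demo)
sim_demo_tf = thomas_fiering_simulate(q_demo[-1, 11], 12, 24, params_demo, rng)
print(sim_demo_tf[:3].round(1))
```

Each simulated value is fed back as the previous month's flow for the next step, which is what makes the series a simulation rather than a one-step-ahead prediction.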

Solution


In [18]:

def thomas_fiering(q_hist, q_init, j_init, period):
    """ Thomas Fiering Model """
    q = q_hist
    months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    months1 = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1]

    # random generator: Z(1)[0] draws one standard normal value
    Z = np.random.randn

    # get the mean monthly discharges for the j months and the j+1 months
    q_mean_j = [q[q.index.month == i].discharge.mean() for i in months]
    q_mean_j1 = [q[q.index.month == i].discharge.mean() for i in months1]

    # do the same for standard deviation
    sj = [q[q.index.month == i].discharge.std() for i in months]
    sj1 = [q[q.index.month == i].discharge.std() for i in months1]

    # calculate the pearson r between each month and the following month
    rj = [pearsonr(q[q.index.month == i].discharge,
                   q[q.index.month == j].discharge)[0] for i, j in zip(months, months1)]

    # the model, not vectorized. assumes each of the variables above are
    # of length 12 ... and note that Python is zero-indexed
    def tf1(qi, month):
        # qi is the flow in 'month'; the return value is the flow in the following month
        i = month - 1  # account for zero offset
        return (q_mean_j1[i] + (rj[i] * (sj1[i] / sj[i])) * (qi - q_mean_j[i])
                + Z(1)[0] * sj1[i] * np.sqrt(1 - rj[i]**2))

    # initiate simulated q with q_init and j_init (month)
    # simq is a list, and we just call the model function to get the first element
    simq = [tf1(q_init, j_init)]

    # each further value follows from the previous simulated value and its month
    for m in period.month[:-1]:
        simq.append(tf1(simq[-1], m))

    # convert it to a dataframe for convenience
    return pd.DataFrame(np.array(simq), columns=['discharge'], index=period)

if __name__ == "__main__":
    # use pandas to read the data (the first row is skipped and values are converted to numbers)
    q = pd.read_table('mon.csv', sep=',',
                      names=['year', 'month', 'discharge'])[1:].apply(pd.to_numeric)

    # create a pandas date range to index the data frame
    rng = pd.date_range('1/1/1949', periods=len(q), freq='M')

    # set the index (now we can easily get monthly values)
    q.index = rng

    # create a simulation period
    period = pd.date_range('Jan-1949', periods=324, freq='M')
    q_init = q.discharge[pd.Timestamp('1949-12-31')]  # December 1949

    simq = thomas_fiering(q, q_init, 12, period)

    # plot the data
    plt.figure(figsize=(15,5))
    plt.plot(q.index, q.discharge, c='b', label='observed')
    plt.plot(simq.index, simq.discharge, c='r', label='simulated')
    plt.legend()
    plt.grid(True)

Comparing the statistics:

In [19]:

index = ['Observed','Simulated']
stat = pd.DataFrame(index = index)
stat['Mean'] = round(q.discharge.mean(),2),round(simq.mean(),2)
stat['std'] = round(q.discharge.std(),2), round(simq.std(),2)
stat['median'] = round(q.discharge.median(),2), round(simq.median(),2)
stat

Out[19]:

              Mean      std   median
Observed   5551.51  3366.80  4207.00
Simulated  5697.04  2480.31  5043.47

Conclusion: From the plot above it is clear that there is a difference between the observed discharge and the discharge simulated with the Thomas-Fiering model. The simulated mean is higher than the observed mean, while the simulated standard deviation is lower than that of the observed data. The mean is also greater than the median in both cases. This is common for a distribution that is skewed to the right (that is, bunched up toward the left and with a "tail" stretching toward the right).
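The link between "mean greater than median" and right skew can be illustrated with a sample skewness coefficient. This is a small sketch on made-up numbers, assuming NumPy; the helper `sample_skewness` is hypothetical and uses the simple moment ratio m3 / m2^1.5.

```python
import numpy as np

def sample_skewness(x):
    """Moment-based skewness: positive values indicate a longer right tail."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.mean(d ** 3) / np.mean(d ** 2) ** 1.5)

# Made-up monthly flows with a few large values (NOT the Columbia data)
flows = np.array([1200.0, 1500.0, 1800.0, 2500.0, 4000.0, 9000.0, 12000.0])
print("mean %.1f, median %.1f, skewness %.2f"
      % (flows.mean(), np.median(flows), sample_skewness(flows)))
```

For these values the mean lies well above the median and the skewness is positive, which is the pattern described in the conclusion.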

Simulating discharge for monthly 100 years by using the Thomas-Fiering model


In [20]:

# Repeat Thomas-Fiering for 1200 time steps

# Define the period (100 years = 1200 monthly time steps), starting in January 1976
period = pd.date_range('Jan-1976', periods=1200, freq='M')
# use the observed December 1975 discharge as the initial value
q_init_1975 = q.discharge[pd.Timestamp('1975-12-31')]
sim_100 = thomas_fiering(q, q_init_1975, 12, period)
#print(sim_100)

# plot the data

plt.figure(figsize=(15,6))
plt.plot(sim_100.index, sim_100, c='b')
#plt.plot(simq.index, simq.discharge, c='r', label='simulated')
plt.xlabel('Years',fontsize =18)
plt.ylabel('Discharge ($m^3/s$)',fontsize =18)
plt.grid(True)

Calculating the different statistics for 100 years simulated data

In [21]:

sim=pd.DataFrame(sim_100['discharge'])
mean_100 = pd.DataFrame(sim.mean().round(2))
std_100 = pd.DataFrame(sim.std().round(2))
median_100 =pd.DataFrame(sim.median().round(2))
stats_100= (pd.concat([mean_100, std_100,median_100], axis =1))
my_columns = ["mean", "std", "median"]
stats_100.columns = my_columns
stats_100

Out[21]:

mean std median

discharge 5685.43 2622.74 5085.56

Calculating the yearly sum for 100 years simulated data

In [29]:

sim_yearly = sim_100.groupby(sim_100.index.year).sum()

In [30]:

plot_acf(sim_yearly)

In [40]:

# plotting 100 years simulated data

fig, ax = plt.subplots(figsize=(10,5))
ax.plot(sim_yearly.index,sim_yearly)
plt.grid()
plt.ylabel('Discharge($m^3/s$)', fontsize =18)
ax.set_xlim([1975 ,2075])
plt.xlabel('Years',fontsize=18)
plt.show()
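The annual aggregation behind the plot above can also be sketched with plain NumPy. The sketch assumes the monthly series starts in January and covers whole years; the function name `annual_aggregate` is illustrative. Taking the annual mean instead of the sum keeps the result in $m^3/s$, which may be preferable when the axis is labelled as discharge.

```python
import numpy as np

def annual_aggregate(monthly, how="mean"):
    """Aggregate a monthly series (whole years, starting in January) to annual values."""
    monthly = np.asarray(monthly, dtype=float)
    if monthly.size % 12 != 0:
        raise ValueError("the series must cover whole years (multiple of 12 months)")
    table = monthly.reshape(-1, 12)             # one row per year
    if how == "mean":
        return table.mean(axis=1)
    if how == "sum":
        return table.sum(axis=1)
    raise ValueError("how must be 'mean' or 'sum'")

# Two made-up years of monthly values: 0..11 and 12..23
monthly_demo = np.arange(24)
print(annual_aggregate(monthly_demo, "sum"))
print(annual_aggregate(monthly_demo, "mean"))
```

The sum and the mean differ only by a factor of 12, so the shape of the annual series and its autocorrelation are the same for both.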
