Deep Learning Manual
LIST OF EXPERIMENTS
S.No.   Name of the Experiment   Date   Page No.   Signature
Initially, it had over 4,800 contributors at launch, a community that has since grown to around
250,000 developers, roughly doubling every year. Big companies like Microsoft, Google, NVIDIA,
and Amazon have actively contributed to the development of Keras. It has strong industry
adoption and is used in products of popular firms like Netflix, Uber, Google, Expedia, etc.
Specialties of Keras
Keras can be used from R as well as Python, and the same code can be run with
TensorFlow, Theano, CNTK, or MXNet as the backend, as per the requirement. Keras can be run
on CPU, NVIDIA GPU, AMD GPU, TPU, etc. Producing models with Keras is really simple, as it
supports deployment with TensorFlow Serving, GPU acceleration in the browser (WebKeras,
Keras.js), Android (TF, TF Lite), iOS (native CoreML) and Raspberry Pi.
Keras Backend
o TensorFlow
TensorFlow is a Google product, which is one of the most famous deep
learning tools
widely used in the research area of machine learning and deep neural
network. It came into the market on 9th November 2015 under the Apache
License 2.0. It is built in such a way that it can easily run on multiple CPUs
and GPUs as well as on mobile operating systems. It consists of various
wrappers in distinct languages such as Java, C++, or Python.
o Theano
Theano was developed at the University of Montreal, Quebec, Canada, by
the MILA group. It is an open-source python library that is widely used for
performing mathematical operations on multi-dimensional arrays by
incorporating scipy and numpy. It utilizes GPUs for faster computation and
efficiently computes the gradients by building
symbolic graphs automatically. It has turned out to be very suitable for numerically
unstable expressions, as it first detects them and then computes them with more
stable algorithms.
o CNTK
Microsoft Cognitive Toolkit (CNTK) is an open-source deep learning framework. It
consists of all the basic building blocks that are required to form a neural
network. The models are trained using C++ or Python, but it incorporates
C# or Java to load the model for making predictions.
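Keras picks one of these backends through its configuration. For reference, with multi-backend Keras the backend is typically selected in the keras.json configuration file (or via the KERAS_BACKEND environment variable); a minimal sketch, with "tensorflow" swapped for "theano" or "cntk" as needed:

# ~/.keras/keras.json
{
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}

# or, per session, on Linux/macOS:
export KERAS_BACKEND=theano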
Advantages of Keras
Disadvantages of Keras
o The only disadvantage is that Keras has its own pre-configured layers, and if
you want to create an abstract layer, it won't let you because it cannot handle
low-level APIs.
o It only supports high-level APIs running on top of the backend engine
(TensorFlow, Theano, and CNTK).
Prerequisite
This Keras tutorial is made for both beginners and professionals, to help them
understand the fundamental concept of Keras. After the completion of this tutorial,
you will find yourself at a moderate level of expertise from where you can take
yourself to the next level.
To download Anaconda, you can either go to your favorite browser and type
Download Anaconda Python in the search bar, or simply follow the link given below.
https://www.anaconda.com/distribution/#download-section.
Click on the very first link, and you will get directed to the Anaconda's download
page, as shown below:
You will notice that Anaconda is available for various operating systems such as
Windows, macOS, and Linux. You can download it as per your OS by clicking on the
available option. It will offer you Python 2.7 and Python 3.7 versions. Since the
latest version is Python 3.7, download it by clicking on the download option. The
downloading will automatically start after you hit the download option.
After the download is finished, go to the download folder and click on the
Anaconda .exe file (Anaconda3-2019.03-Windows-x86_64.exe). The setup window
for the installation of Anaconda will open up, where you have to click on Next,
as shown below:
After clicking on Next, a License Agreement window will open; click on I Agree
to move ahead with the installation.
Next, you will get two options in the window; click on the first option, followed
by clicking on Next.
Now that you are done with installing Anaconda, you have to create a new conda
environment where you will be installing all your modules to build your models.
You can run the Anaconda prompt as an Administrator, which you can do by searching
for the Anaconda prompt in the search bar, then right-clicking on it and
selecting the first option, which says Run as administrator.
After you click on it, you will see that your anaconda prompt has opened, and it
will look like the image given below.
Next, you will need to create an environment, for which you have to write the
following command in the Anaconda prompt and press Enter. Here, deeplearning
specifies the name of the environment, but you can use any name of your choice.
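The create command itself appears only as a screenshot in the original; a typical form (the Python version pin is an assumption) is:

1. conda create -n deeplearning python=3.7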
From the image given above, you can see that it is asking you to confirm the package
plan and the environment location; type y and press Enter.
So, you can see in the above image that you have successfully created an
environment. Now the next step is to activate the environment that you created
earlier. To activate the environment, write the following;
1. activate deeplearning
From the above image, you can see that you are in this environment.
Next, you have to install Keras, which you can simply do by using the
below-given command.
1. conda install -c anaconda keras
You can see that it is asking you to install the following packages, so proceed with
typing y.
From the above image, you can see that you are done with the installation
successfully.
So, you have to run two more important commands, because when you create an
environment, jupyter and spyder are not preinstalled; that is why you have to
install them.
First, you will run the command for jupyter, which is as follows:
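The command is shown only as a screenshot in the original; it is typically:

1. conda install jupyter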
Again, it will ask you to install the following packages, so proceed with typing y.
Now, run the command for spyder. Since you are installing it for the very first time,
it will again ask you for y/n, so you just simply type y.
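The spyder command is likewise typically:

1. conda install spyder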
From the image given above, you can see that it also has been installed successfully.
Experiment 3 - Train the model to add two numbers and report the
result
With the advent of Deep Learning, there have been huge successes for these kinds
of perceptual problems. In this guide, for the sake of simplicity and ease of
understanding, we will recast simple arithmetic addition as a perceptual problem
and then try to predict the values through this trained model.
In this guide, we are going to use the Keras library, which is made available as part
of the TensorFlow library.
Data Tensors
Getting the data in proper shape is perhaps the most important aspect of any
machine learning model and it holds true here as well. The below program
(data_creation.py) creates the training and test sets for the Addition problem.
import numpy as np
train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])
print(train_data)
for i in range(3,10000,2):
    train_data = np.append(train_data,[[i,i]],axis=0)
    train_targets = np.append(train_targets,[i+i])
test_data = np.array([[2.0,2.0]])
test_targets = np.array([4.0])
for i in range(4,8000,4):
    test_data = np.append(test_data,[[i,i]],axis=0)
    test_targets = np.append(test_targets,[i+i])
python
Let's analyze the above program:
import numpy as np
train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])
python
In the above three lines, we are importing the Numpy library and creating the
train_data and train_targets data sets. train_data is the array that will be used to
hold the two numbers that are going to be added, while train_targets is the vector
that will hold their sum. train_data is initialized to contain 1.0 and 1.0 as the two
numbers. This is a very simple program, so you will see the same number repeated (1.0),
and this pattern is repeated in the entire train and test data set, that is, the same
number (i) is added to itself.
for i in range(3,10000,2):
    train_data = np.append(train_data,[[i,i]],axis=0)
    train_targets = np.append(train_targets,[i+i])
python
The above lines append to the train_data array and the train_targets vector by looping over a
counter (i) that starts from 3 and goes up to 10000 with a step of 2. This is what train_data
looks like:
Output
[[1.000e+00 1.000e+00]
 [3.000e+00 3.000e+00]
 [5.000e+00 5.000e+00]
 ...
 [9.995e+03 9.995e+03]
 [9.997e+03 9.997e+03]
 [9.999e+03 9.999e+03]]
train_targets:
Output
[2.0000e+00 6.0000e+00 1.0000e+01......1.9990e+04 1.9994e+04 1.9998e+04]
test_data and test_targets are also created in a similar fashion, with one difference: it goes till
8000 with the step of 4.
test_data = np.array([[2.0,2.0]])
test_targets = np.array([4.0])
for i in range(4,8000,4):
    test_data = np.append(test_data,[[i,i]],axis=0)
    test_targets = np.append(test_targets,[i+i])
test_data:
Output
[[2.000e+00 2.000e+00]
 [4.000e+00 4.000e+00]
 [8.000e+00 8.000e+00]
 ...
 [7.988e+03 7.988e+03]
 [7.992e+03 7.992e+03]
 [7.996e+03 7.996e+03]]
test_targets:
Output
[4.0000e+00 8.0000e+00 1.6000e+01......1.5976e+04 1.5984e+04 1.5992e+04]
import tensorflow as tf
from tensorflow import keras
import numpy as np
import data_creation as dc
python
The above lines import the TensorFlow, Keras, and Numpy libraries into the program. The
data_creation.py program that we created earlier is also imported and given the name dc.
All the train and test data sets we created can now be referenced through dc; for example, if
the user needs to use the contents of train_data, then all she has to do is use dc.train_data
to access it.
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(1)
])
python
The above code creates the actual deep learning model. It initializes the model as a stack of
layers (keras.Sequential) and then flattens the input array to a vector
(keras.layers.Flatten(input_shape=(2,))). The flattening step also happens to be the first layer of
the neural network. The second and third layers of the network consist of 20 nodes each, and the
activation function we are using is relu (rectified linear unit). Other activation functions, such
as softmax, can also be used. The last (fourth) layer is the output layer. Since we expect only
one output value (a predicted value, since this is a regression model), we have just one output
node in this model (keras.layers.Dense(1)).
The architecture of the model depends, to a large extent, on the problem we are trying to solve.
The model we have created above will not work very well for classification problems, such
as image classification.
model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])
python
The above code compiles the network. The optimization function we are using
is adam, which is a momentum-based optimizer that helps prevent the model from getting stuck in
local minima. The loss function we are using is mse (mean squared error); it considers the squared
difference between the predicted values and the actual values. Also, we are monitoring another
metric, mae (mean absolute error).
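As a quick illustration (not part of the manual's program), this is what the mse loss and mae metric compute, written out with NumPy:

import numpy as np

y_true = np.array([2.0, 6.0, 10.0])     # actual values
y_pred = np.array([2.5, 5.0, 10.0])     # predicted values

mse = np.mean((y_pred - y_true) ** 2)   # mean squared error, the training loss
mae = np.mean(np.abs(y_pred - y_true))  # mean absolute error, the monitored metric
print(mse, mae)                         # ~0.4167 0.5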
model.fit(dc.train_data, dc.train_targets, epochs=10, batch_size=1)
python
This is where the actual training of the network happens. The training set will be fed to the
network 10 times (epochs) for training. The number of epochs needs to be carefully selected, as
too few epochs may lead to an under-trained network, while too many epochs may lead to
overfitting, wherein the network works well on the training data but not on the test data set.
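One common way to avoid hand-picking the epoch count (not used in this manual's code, shown only as an illustration) is an EarlyStopping callback that stops training once the validation loss stops improving:

from tensorflow import keras

early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=3,
                                           restore_best_weights=True)
model.fit(dc.train_data, dc.train_targets, epochs=50, batch_size=1,
          validation_split=0.2, callbacks=[early_stop])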
test_loss, test_acc = model.evaluate(dc.test_data, dc.test_targets)
print('Test accuracy:', test_acc)
python
The above code evaluates the trained model on the test data set and prints the returned metric.
Note that since the only metric specified was mae, the second value returned by evaluate() is the
mean absolute error rather than a classification accuracy, even though the variable is named
test_acc here.
a = np.array([[2000,3000],[4,5]])
print(model.predict(a))
python
Once the model has been trained and tested, we can use it to predict values by supplying real-
world inputs. In this case, we are supplying the two pairs of values (2000,3000) and (4,5), and the
output from the model is printed.
Output
Epoch 1/10
5000/5000 [==============================] - 5s 997us/sample - loss: 1896071.4827 - mean_absolute_error: 219.0276
Epoch 2/10
5000/5000 [==============================] - 5s 956us/sample - loss: 492.9092 - mean_absolute_error: 3.8202
Epoch 3/10
5000/5000 [==============================] - 5s 1ms/sample - loss: 999.7580 - mean_absolute_error: 7.1740
Epoch 4/10
5000/5000 [==============================] - 5s 1ms/sample - loss: 731.0374 - mean_absolute_error: 6.0325
Epoch 5/10
5000/5000 [==============================] - 5s 935us/sample - loss: 648.6434 - mean_absolute_error: 7.5037
Epoch 6/10
import numpy as np
import tensorflow as tf
from tensorflow import keras
train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])
print(train_data)
for i in range(3,10000,2):
    train_data = np.append(train_data,[[i,i]],axis=0)
    train_targets = np.append(train_targets,[i+i])
test_data = np.array([[2.0,2.0]])
test_targets = np.array([4.0])
for i in range(4,8000,4):
    test_data = np.append(test_data,[[i,i]],axis=0)
    test_targets = np.append(test_targets,[i+i])
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(1)
])
model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])
model.fit(train_data, train_targets, epochs=10, batch_size=1)
a = np.array([[2000,3000],[4,5]])   # values to predict on (restored from the earlier snippet)
print(model.predict(a))
[[1. 1.]]
Epoch 1/10
5000/5000 [==============================] - 8s 1ms/step - loss:
742813.1875 - mae: 105.6519
Epoch 2/10
5000/5000 [==============================] - 7s 1ms/step - loss:
1772.8480 - mae: 6.0134
Epoch 3/10
5000/5000 [==============================] - 7s 1ms/step - loss:
1884.9642 - mae: 8.9911
Epoch 4/10
5000/5000 [==============================] - 8s 2ms/step - loss:
149.1685 - mae: 10.6520
Epoch 5/10
5000/5000 [==============================] - 7s 1ms/step - loss:
1018.8793 - mae: 6.2299
Epoch 6/10
5000/5000 [==============================] - 8s 2ms/step - loss:
1276.4749 - mae: 5.4312
Epoch 7/10
5000/5000 [==============================] - 7s 1ms/step - loss:
1943.9398 - mae: 8.8076
Epoch 8/10
5000/5000 [==============================] - 7s 1ms/step - loss:
3522.0959 - mae: 8.8434
Epoch 9/10
Experiment 4 - Train the model to multiply two matrices and report the result using
Keras.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

np.random.seed(1)

model = Sequential()
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['binary_accuracy'])

# NOTE: the original listing omits the definition of training_data (and its
# targets) as well as the model.fit(...) call that produces the 1000-epoch
# output shown below; they must be supplied before predict() is called.
# decimal output
print('decimal output:\n' + str(model.predict(training_data)))
# rounded output
print('rounded output:\n' + str(model.predict(training_data).round()))
Epoch 1/1000
Epoch 2/1000
Epoch 3/1000
.
Epoch 989/1000
Epoch 990/1000
Epoch 991/1000
Epoch 992/1000
Epoch 993/1000
Epoch 994/1000
Epoch 995/1000
Epoch 996/1000
Epoch 997/1000
Epoch 998/1000
Epoch 999/1000
Epoch 1000/1000
decimal output:
[[0.34885803]
[0.7179798 ]
[0.68972814]
[0.28727812]]
rounded output:
[[0.]
[1.]
[1.]
[0.]]
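Because the listing above is incomplete and its sigmoid output can only represent values between 0 and 1, a self-contained sketch closer to the experiment's stated goal (learning the product of two 2x2 matrices) is given below. All names, layer sizes, and epoch counts here are illustrative assumptions, not the manual's original code.

import numpy as np
from tensorflow import keras

np.random.seed(1)

# Training data: pairs of random 2x2 matrices (flattened to 8 inputs) and
# their matrix product (flattened to 4 targets).
n_samples = 10000
A = np.random.uniform(0, 10, size=(n_samples, 2, 2))
B = np.random.uniform(0, 10, size=(n_samples, 2, 2))
x_train = np.concatenate([A.reshape(n_samples, 4), B.reshape(n_samples, 4)], axis=1)
y_train = np.matmul(A, B).reshape(n_samples, 4)

model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(8,)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(4)                       # linear output: the 4 entries of the product
])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.fit(x_train, y_train, epochs=50, batch_size=32, verbose=0)

# Check on one pair of matrices
A_test = np.array([[1.0, 2.0], [3.0, 4.0]])
B_test = np.array([[5.0, 6.0], [7.0, 8.0]])
x_test = np.concatenate([A_test.reshape(1, 4), B_test.reshape(1, 4)], axis=1)
print(model.predict(x_test))                    # approximately [[19, 22, 43, 50]]
print(np.matmul(A_test, B_test).reshape(1, 4))  # the exact product, for comparison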
Experiment 5 – Train the model to print the prime numbers using Keras
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, PReLU

seed = 7
np.random.seed(seed)

# Number of binary digits used to encode each integer; the value is not given
# in the original listing, so 7 (i.e. numbers up to 127) is an assumption.
num_digits = 7
max_number = 2 ** num_digits

def prime_list():
    # Build the list of primes below max_number by trial division.
    counter = 0
    primes = [2, 3]
    for n in range(5, max_number, 2):
        is_prime = True
        for i in range(1, len(primes)):
            counter += 1
            if primes[i] ** 2 > n:
                break
            counter += 1
            if n % primes[i] == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(n)
    return primes

primes = prime_list()

def prime_encode(i):
    # 1 if i is prime, 0 otherwise
    if i in primes:
        return 1
    else:
        return 0

def bin_encode(i):
    # Encode i as a list of num_digits binary digits (reconstructed; the body
    # is missing from the original listing).
    return [i >> d & 1 for d in range(num_digits)]

def create_dataset():
    x, y = [], []
    for i in range(max_number):
        x.append(bin_encode(i))
        y.append(prime_encode(i))
    return np.array(x), y
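A quick check of the reconstructed helpers (the values shown assume num_digits = 7):

print(bin_encode(5))    # [1, 0, 1, 0, 0, 0, 0] -> 5 in binary, least significant bit first
print(prime_encode(5))  # 1, because 5 is prime
x, y = create_dataset()
print(x.shape)          # (128, 7), since max_number = 2 ** 7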
model = Sequential()
model.add(Dense(units=100, input_dim=num_digits))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=50))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=25))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=1))
model.add(Activation("sigmoid"))
model.compile(optimizer='RMSprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# The fit call is truncated in the original listing; the epoch count and batch
# size below are assumptions, only validation_split=0.1 is from the original.
x, y = create_dataset()
history = model.fit(x, np.array(y), epochs=100, batch_size=16,
                    validation_split=0.1)
# predict
errors, correct = 0, 0
tp, fn, fp = 0, 0, 0
# The loop bounds, the 0.5 threshold, and the tp/fn/fp bookkeeping below are
# reconstructed; the original listing omits them. The range 2..100 matches the
# output table shown further down.
for i in range(2, 101):
    x = bin_encode(i)
    y = model.predict(np.array(x).reshape(-1, num_digits))
    if y[0][0] > 0.5:
        pred = 1
    else:
        pred = 0
    obs = prime_encode(i)
    if pred == obs:
        correct += 1
    else:
        errors += 1
    if pred == 1 and obs == 1:
        tp += 1
    if pred == 0 and obs == 1:
        fn += 1
    if pred == 1 and obs == 0:
        fp += 1
    print(i, obs, pred, y[0][0])
f_score = 2 * tp / (2 * tp + fn + fp)
print("Errors :", errors, " Correct :", correct, "F score :", f_score)
def plot_history(history):
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.savefig('RMSprop_more')

plot_history(history)
Output
Errors : 9  Correct : 90  F score : 0.8235294117647058
n   actual   predicted   probability
2 1 0 3.6083e-08
3 1 0 0.032073
4 0 0 8.2177e-06
5 1 1 0.61928
6 0 0 3.9766e-15
7 1 1 0.82378
8 0 0 0.00872908
9 0 0 0.024348
10 0 0 1.0083e-09
11 1 0 0.18609
12 0 0 5.5388e-08
13 1 1 0.79401
14 0 0 3.6747e-15
15 0 0 0.02023
16 0 0 4.049e-07
17 1 1 0.84016
18 0 0 4.1057e-14
19 1 1 0.91604
20 0 0 4.376e-15
21 0 0 0.011451
22 0 0 6.4946e-25
23 1 1 0.59597
24 0 0 4.4555e-08
25 0 1 0.73933
26 0 0 4.9732e-18
27 0 0 0.0958722
28 0 0 1.80154e-16
29 1 1 0.722513
30 0 0 2.00777e-28
31 1 1 0.774054
32 0 0 3.93779e-05
33 0 0 0.118341
34 0 0 1.88295e-11
35 0 0 0.480108
36 0 0 3.0609e-07
37 1 1 0.847888
38 0 0 3.42833e-18
39 0 0 0.0514646
40 0 0 5.82673e-07
41 1 1 0.726771
42 0 0 3.72693e-11
43 1 1 0.861872
44 0 0 5.71867e-14
45 0 0 0.18657
46 0 0 7.03075e-16
47 1 1 0.654062
48 0 0 1.30385e-10
49 0 1 0.923631
50 0 0 1.30955e-17
51 0 0 0.190215
52 0 0 6.45953e-19
53 1 1 0.558284
54 0 0 1.83163e-29
55 0 0 0.287756
56 0 0 3.29105e-11
57 0 0 0.292637
58 0 0 3.57044e-23
59 1 0 0.152102
60 0 0 1.80104e-22
61 1 1 0.858877
62 0 0 1.92684e-32
63 0 0 0.27367
64 0 0 1.74397e-09
65 0 1 0.727574
66 0 0 1.33752e-20
67 1 1 0.891129
68 0 0 1.47396e-17
69 0 0 0.346057
70 0 0 5.27672e-27
71 1 1 0.932053
72 0 0 4.04155e-10
73 1 1 0.879374
74 0 0 1.4077e-18
75 0 0 0.0290487
76 0 0 6.39801e-17
77 0 1 0.629597
78 0 0 1.54139e-30
79 1 1 0.791511
80 0 0 7.56631e-21
81 0 0 0.0438443
82 0 0 4.24787e-30
83 1 1 0.596353
84 0 0 6.45592e-32
85 0 0 0.431211
86 0 0 0.0
87 0 0 0.00903795
88 0 0 9.54647e-23
89 1 1 0.827787
90 0 0 2.43897e-31
91 0 1 0.746695
92 0 0 8.37092e-37
93 0 0 0.0384408
94 0 0 0.0
95 0 0 0.3743
96 0 0 7.28071e-13
97 1 1 0.888417
98 0 0 3.04541e-25
99 0 0 0.0649973
100 0 0 1.59478e-18
# imports reconstructed; the original listing shows only numpy and tensorflow
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM

(x_train,y_train),(x_test,y_test)=imdb.load_data(num_words=20000)
x_train=pad_sequences(x_train,maxlen=100)
x_test=pad_sequences(x_test,maxlen=100)
vocab_size=20000
emed_size=128
model=Sequential()
model.add(Embedding(vocab_size,emed_size,input_shape=(x_train.shape[1],)))
model.add(LSTM(units=60,activation='tanh'))
model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
model.summary()
history=model.fit(x_train,y_train,epochs=5,batch_size=128,validation_data=(x_test,y_test))
epoch_range=[1,2,3,4,5]
plt.plot(epoch_range, history.history['accuracy'])
plt.plot(epoch_range, history.history['val_accuracy'])
Model: "sequential"
=================================================================
Total params: 2,605,360
Trainable params: 2,605,360
Non-trainable params: 0
Epoch 1/5
196/196 [==============================] - 38s 170ms/step - loss: 1.7606 - accuracy: 0.0078 - val_loss: 1.1591 - val_accuracy: 0.1327
Epoch 2/5
196/196 [==============================] - 32s 164ms/step - loss: 1.1524 - accuracy: 0.1252 - val_loss: 1.1365 - val_accuracy: 0.0272
Epoch 3/5
196/196 [==============================] - 32s 162ms/step - loss: 0.9810 - accuracy: 0.0443 - val_loss: 0.9238 - val_accuracy: 0.0105
Epoch 4/5
196/196 [==============================] - 32s 165ms/step - loss: 0.7770 - accuracy: 0.0857 - val_loss: 0.9253 - val_accuracy: 0.1010
Epoch 5/5
196/196 [==============================] - 32s 165ms/step - loss: 0.6973 - accuracy: 0.0709 - val_loss: 1.0435 - val_accuracy: 0.0549
You can download a historical weather dataset from here; also feel free to use any weather dataset
that you prefer.
Let's load the dataset and see the first few rows:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# import dataset from data.csv file
dataset = pd.read_csv('data.csv')
dataset = dataset.dropna(subset=["Temperature"])
dataset = dataset.reset_index(drop=True)
training_set = dataset.iloc[:, 4:5].values
We include only the temperature column, as we are going to forecast temperature, and drop all
the rows that have no values or contain NaN.
Next, we will have to apply feature scaling to normalize temperature in the range 0 to 1.
# Feature Scaling
from sklearn.preprocessing import MinMaxScaler

sc = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = sc.fit_transform(training_set)
We will create a training set such that for every 30 days we will provide the next 4 days
temperature as output. In other words, input for our RNN would be 30 days temperature data
and the output would be 4 days forecast of temperature.
x_train = []
y_train = []

n_future = 4   # next 4 days temperature forecast
n_past = 30    # past 30 days

for i in range(0, len(training_set_scaled) - n_past - n_future + 1):
    x_train.append(training_set_scaled[i : i + n_past, 0])
    y_train.append(training_set_scaled[i + n_past : i + n_past + n_future, 0])

x_train, y_train = np.array(x_train), np.array(y_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
x_train contains 30 previous temperature inputs before that day and y_train contains 4 days
temperature outputs after that day. Since x_train and y_train are lists we will have to convert
them to numpy array to fit training set to our model.
Now we are ready with our training data so let’s proceed to build an RNN model for forecasting
weather.
1. First, we will import the Keras sequential model from keras.models and the Keras layers,
i.e. LSTM, Dense and Dropout (Bidirectional is also imported here, since it is used below).
You can refer to the Keras documentation for more info on Keras models and layers here.
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Bidirectional
# Fitting RNN to training set using Keras Callbacks. Read Keras callbacks docs for more info.
2. Let us define the layers in our RNN. We will create a sequential model by adding layers
sequentially using Sequential(). The first layer is a Bidirectional LSTM with 30 memory
units; return_sequences=True means that the layer returns the full output sequence (one output
per timestep) rather than only the last output, and input_shape describes the structure of the
input. With a Bidirectional LSTM the output layer gets feedback from past (forward) as well as
future (backward) states simultaneously. We add 3 hidden layers and an output layer with a
linear activation function that outputs the 4 days of temperature.
And at the last, we fit the RNN model with our training data.
regressor = Sequential()
regressor.add(Bidirectional(LSTM(units=30, return_sequences=True, input_shape=(x_train.shape[1], 1))))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=30, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=30, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=30))
regressor.add(Dropout(0.2))
regressor.add(Dense(units=n_future, activation='linear'))
regressor.compile(optimizer='adam', loss='mean_squared_error', metrics=['acc'])
regressor.fit(x_train, y_train, epochs=500, batch_size=32)
4. Now that we have our test data ready, we can test our RNN model.
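The code that builds the testing array is not shown in this listing. A minimal sketch, assuming the test window is simply the last 30 scaled temperature values reshaped like the training input (the slicing below is an assumption, not the original code):

# Hypothetical test input: the most recent n_past scaled temperatures,
# reshaped to (samples, timesteps, features) as the LSTM expects.
testing = training_set_scaled[-n_past:, 0]
testing = np.reshape(testing, (1, n_past, 1))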
predicted_temperature = regressor.predict(testing)
predicted_temperature = sc.inverse_transform(predicted_temperature)
predicted_temperature = np.reshape(predicted_temperature, (predicted_temperature.shape[1], predicted_temperature.shape[0]))
The output from the model is in the normalized form, so to get the actual temperature values we
apply inverse_transform() to the predicted_temperature and then reshape it.
Let’s compare the predicted and real temperatures. As we can see the model performs well with
the given test data.
real_temperature
array([[82.], [82.], [83.], [83.]])
predicted_temperature
array([[83.76233 ], [83.957565], [83.70461 ], [83.6326 ]])
If we forecast temperature for a month and visualize it we get the following results.
tf.keras.layers.LSTM(
    units,
    activation="tanh",
recurrent_activation="sigmoid",
use_bias=True,
kernel_initializer="glorot_uniform",
recurrent_initializer="orthogonal",
bias_initializer="zeros",
unit_forget_bias=True,
kernel_regularizer=None,
recurrent_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
recurrent_constraint=None,
bias_constraint=None,
dropout=0.0,
recurrent_dropout=0.0,
return_sequences=False,
return_state=False,
go_backwards=False,
stateful=False,
time_major=False,
unroll=False,
**kwargs
)
Based on available runtime hardware and constraints, this layer will choose different
implementations (cuDNN-based or pure-TensorFlow) to maximize the performance. If a GPU
is available and all the arguments to the layer meet the requirement of the cuDNN kernel (see
below for details), the layer will use a fast cuDNN implementation.
1. activation == tanh
2. recurrent_activation == sigmoid
3. recurrent_dropout == 0
4. unroll is False
5. use_bias is True
6. Inputs, if use masking, are strictly right-padded.
7. Eager execution is enabled in the outermost context.
For example:
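The example itself is missing from this copy of the documentation; a usage sketch along the lines of the Keras docs (shapes shown in the comments):

import tensorflow as tf

inputs = tf.random.normal([32, 10, 8])          # batch of 32 sequences, 10 timesteps, 8 features
lstm = tf.keras.layers.LSTM(4)
output = lstm(inputs)
print(output.shape)                             # (32, 4): last output only

lstm = tf.keras.layers.LSTM(4, return_sequences=True, return_state=True)
whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)
print(whole_seq_output.shape)                   # (32, 10, 4): full sequence
print(final_memory_state.shape)                 # (32, 4)
print(final_carry_state.shape)                  # (32, 4)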
• dropout: Float between 0 and 1. Fraction of the units to drop for the linear
transformation of the inputs. Default: 0.
• recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear
transformation of the recurrent state. Default: 0.
• return_sequences: Boolean. Whether to return the last output in the output sequence, or
the full sequence. Default: False.
• return_state: Boolean. Whether to return the last state in addition to the output.
Default: False.
• go_backwards: Boolean (default False). If True, process the input sequence backwards
and return the reversed sequence.
• stateful: Boolean (default False). If True, the last state for each sample at index i in
a batch will be used as initial state for the sample of index i in the following batch.
• time_major: The shape format of the inputs and outputs tensors. If True, the inputs and
outputs will be in shape [timesteps, batch, feature], whereas in the False case, it will
be [batch, timesteps, feature]. Using time_major = True is a bit more efficient because it
avoids transposes at the beginning and end of the RNN calculation. However, most
TensorFlow data is batch-major, so by default this function accepts input and emits
output in batch-major form.
• unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic
loop will be used. Unrolling can speed-up a RNN, although it tends to be more
memory- intensive. Unrolling is only suitable for short sequences.
Call arguments
• Sometimes, a sequence is better used in reversed order. In those cases, you can
simply reverse a vector x using the Python syntax x[::-1] before using it to train your
LSTM
network.
• Sometimes, neither the forward nor the reversed order works perfectly, but combining
them will give better results. In this case, you will need a bidirectional LSTM network.
• A bidirectional LSTM network is simply two separate LSTM networks; one feeds with
a forward sequence and another with reversed sequence. Then the output of the two
LSTM networks is concatenated together before being fed to the subsequent layers of
the network. In Keras, you have the function Bidirectional() to clone an LSTM layer for
forward-backward input and concatenate their output. For example,
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(Bidirectional(LSTM(100, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(1, activation='sigmoid'))
• Since you created not one, but two LSTMs with 100 units each, this network will take
twice the amount of time to train. Depending on the problem, this additional cost may
be justified.
• The full code listing with the bidirectional LSTM added to the last example is given
below for completeness.
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
• Note: Your results may vary given the stochastic nature of the algorithm or
evaluation procedure, or differences in numerical precision. Consider running the
example a few times and compare the average outcome.
• Running this example provides the following output.
Epoch 1/3
391/391 [==============================] - 405s 1s/step - loss: 0.4960 - accuracy: 0.7532
Epoch 2/3
391/391 [==============================] - 439s 1s/step - loss: 0.3075 - accuracy: 0.8744
Epoch 3/3
391/391 [==============================] - 430s 1s/step - loss: 0.2551 - accuracy: 0.9014
Accuracy: 87.69%
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
max_features = 20000 # Only consider the top 20k words
maxlen = 200 # Only consider the first 200 words of each movie review
Model: "model"
length = len(new_sentence.split(" "))
if length > max_length:
    max_length = length
def tokenize_the_data_from_pandas(dataframe, column_name):
    # This function returns a tokens dictionary that includes every unique word
    # as keys and a unique integer for each key.
    # Here we clean the string out of punctuations.
    import string
    for sent in range(len(data[column_name])):
        example_sentence = data[column_name].iloc[sent]
        new_sentence = example_sentence.translate(str.maketrans("", "", string.punctuation))
        data[column_name].iloc[sent] = new_sentence
    # Here we create a dictionary that will encode each word into an integer to have a
    # representation of the word in the deep neural network's processing
    tokens = {}
    for sent in range(len(data[column_name])):
        example_sentence = data[column_name].iloc[sent]
        values = example_sentence.split(" ")
        for word in values:
            tokens[word] = 0
    names = list(tokens.keys())
    for num in range(len(names)):
        tokens[names[num]] = num + 1
    return tokens

tokens = tokenize_the_data_from_pandas(data, column_name="text")
len(tokens.keys())
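The steps that turn each cleaned review into a fixed-length sequence of token ids and split it into the X_train/X_test and y_train/y_test used by the convolutional model below are not shown in this copy. A sketch under explicit assumptions (the "label" column name, the 100-token length, and the 80/20 split are all illustrative, not the original code):

import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Encode every review as a list of token ids, then pad/truncate to a fixed length.
sequences = []
for sent in range(len(data["text"])):
    words = data["text"].iloc[sent].split(" ")
    sequences.append([tokens[w] for w in words])

maxlen = 100
X = pad_sequences(sequences, maxlen=maxlen)
y = np.array(data["label"])              # assumed 0/1 sentiment label column

# Simple train/test split (assumed proportions)
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]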
model.add(tf.keras.layers.Conv1D(64,7,activation="relu"))
model.add(tf.keras.layers.MaxPooling1D(5))
model.add(tf.keras.layers.Conv1D(64,7,activation="relu"))
model.add(tf.keras.layers.MaxPooling1D(5))
#model.add(tf.keras.layers.GlobalMaxPooling1D())
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer = "adam",loss="binary_crossentropy",metrics=["accuracy"])
model.fit(X_train,y_train,epochs=5,validation_data=(X_test,y_test))
Epoch 1/5
2022-01-26 13:16:46.737730: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded
cuDNN version 8005
1125/1125 [==============================] - 37s 26ms/step - loss: 0.3360 - accuracy:
0.8362 - val_loss: 0.2162 - val_accuracy: 0.9087
Epoch 2/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0945 - accuracy:
0.9680 - val_loss: 0.2506 - val_accuracy: 0.9013
Epoch 3/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0161 - accuracy:
0.9954 - val_loss: 0.4001 - val_accuracy: 0.8938
Epoch 4/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0096 - accuracy:
0.9965 - val_loss: 0.4988 - val_accuracy: 0.8920
Epoch 5/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0107 - accuracy:
0.9967 - val_loss: 0.5361 - val_accuracy: 0.8915
Out[14]:
<keras.callbacks.History at 0x7f560b67b890>
#Predicting the test set to investigate further.
y_pred=model.predict(X_test[:])
#We turn our networks output into binary values
y_pred[y_pred>0.5] = 1
y_pred[y_pred< 0.5] = 0
Truth = 0
Falset = 0
# The comparison loop is truncated in the original; it is reconstructed here
# from the commented print statement and the counters used below.
for pik in range(len(y_pred)):
    if y_pred[pik] == y_test[pik]:
        Truth += 1
    else:
        Falset += 1
    #print(y_pred[pik] == y_test[pik])
print(Truth,Falset)
3566 434
print("accuracy:",Truth / (Truth + Falset))
accuracy: 0.8915
You can quickly develop a small LSTM for the IMDB problem and achieve good accuracy.
Let’s start by importing the classes and functions required for this model and initializing the
random number generator to a constant value to ensure you can easily reproduce the results.
1 import tensorflow as tf
2 from tensorflow.keras.datasets import imdb
3 from tensorflow.keras.models import Sequential
4 from tensorflow.keras.layers import Dense
5 from tensorflow.keras.layers import LSTM
6 from tensorflow.keras.layers import Embedding
7 from tensorflow.keras.preprocessing import sequence
8 # fix random seed for reproducibility
9 tf.random.set_seed(7)
You need to load the IMDB dataset. You are constraining the dataset to the top 5,000
words. You will also split the dataset into train (50%) and test (50%) sets.
1 # load the dataset but only keep the top n words, zero the rest
2 top_words = 5000
3 (X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
Next, you need to truncate and pad the input sequences, so they are all the same length for
modeling. The model will learn that the zero values carry no information. The sequences are not
the same length in terms of content, but same-length vectors are required to perform the
computation in Keras.
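The corresponding code (it appears again in the full listing further below) is:

# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)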
The first layer is the Embedded layer that uses 32-length vectors to represent each word. The
next layer is the LSTM layer with 100 memory units (smart neurons). Finally, because this is a
classification problem, you will use a Dense output layer with a single neuron and a sigmoid
activation function to make 0 or 1 predictions for the two classes (good and bad) in the problem.
Because it is a binary classification problem, log loss is used as the loss function
(binary_crossentropy in Keras). The efficient ADAM optimization algorithm is used. The
model is fit for only three epochs because it quickly overfits the problem. A large batch size of
64 reviews is used to space out weight updates.
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3, batch_size=64)
Once fit, you can estimate the performance of the model on unseen reviews.
# load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
Running this example produces the following output.
Epoch 1/3
391/391 [==============================] - 124s 316ms/step - loss: 0.4525 - accuracy: 0.7794
Epoch 2/3
391/391 [==============================] - 124s 318ms/step - loss: 0.3117 - accuracy: 0.8706
Epoch 3/3
391/391 [==============================] - 126s 323ms/step - loss: 0.2526 - accuracy: 0.9003
Accuracy: 86.83%