Deep Learning Manual

LIST OF EXPERIMENTS

S.NO | NAME OF THE EXPERIMENT | DATE | PAGE NO | SIGNATURE

Experiment 1 – Introduction to Keras

Keras is an open-source, high-level neural network library written in Python that can run on top of Theano, TensorFlow, or CNTK. It was developed by Francois Chollet, a Google engineer. It is designed to be user-friendly, extensible, and modular in order to facilitate faster experimentation with deep neural networks. It supports not only Convolutional Networks and Recurrent Networks individually but also their combination.

Keras does not handle low-level computations itself; instead, it relies on a backend library. Keras acts as a high-level API wrapper over the low-level backend, which lets the same code run on TensorFlow, CNTK, or Theano.

Keras had over 4,800 contributors at launch, and its user base has since grown to around 250,000 developers, roughly doubling every year. Big companies like Microsoft, Google, NVIDIA, and Amazon have actively contributed to its development. It has excellent industry adoption and is used in production at popular firms such as Netflix, Uber, Google, and Expedia.

Specialties of Keras

o Focus on user experience has always been a major part of Keras.


o Large adoption in the industry.
o It is multi-backend and multi-platform, which allows developers on different stacks to work together on the same code.
o The Keras research community works closely with the production community.
o Easy to grasp all concepts.
o It supports fast prototyping.
o It seamlessly runs on CPU as well as GPU.
o It provides the freedom to design any architecture, which can later be utilized as an API for the project.
o It is very simple to get started with.
o Easy production of models is what actually makes Keras special.

Keras user experience

1. Keras is an API designed for humans – Keras follows best practices to decrease cognitive load: it ensures that models are consistent and that the corresponding APIs are simple.
2. Not designed for machines – Keras minimizes the number of user actions required for the majority of common use cases and provides clear feedback upon the occurrence of any error.
3. Easy to learn and use.
4. Highly flexible – Keras provides high flexibility to all of its developers by integrating with low-level deep learning languages such as TensorFlow or Theano, which ensures that anything written in the base language can be implemented in Keras.

Multi-backend and multi-platform support in Keras

Keras code can be written in R as well as Python, and the same code can run with TensorFlow, Theano, CNTK, or MXNet as required. Keras can run on CPU, NVIDIA GPU, AMD GPU, TPU, etc. Producing models with Keras is simple because it supports deployment through TensorFlow Serving, GPU acceleration in the browser (WebKeras, Keras.js), Android (TF, TF Lite), iOS (native CoreML), and Raspberry Pi.

Keras Backend

Keras is a model-level library: it helps in developing deep learning models by offering high-level building blocks. Low-level computations such as tensor products and convolutions are not handled by Keras itself; they are delegated to a specialized, well-optimized tensor manipulation library that serves as the backend engine. Rather than tying itself to a single tensor library, Keras lets you plug different backend engines into it (a short sketch of switching backends follows the list below).

Keras supports three backend engines, which are as follows:

o TensorFlow
TensorFlow is a Google product and one of the most widely used deep learning tools in machine learning and deep neural network research. It was released on 9th November 2015 under the Apache License 2.0. It is built so that it can easily run on multiple CPUs and GPUs as well as on mobile operating systems. It provides wrappers in several languages such as Java, C++, and Python.

o Theano
Theano was developed by the MILA group at the University of Montreal, Quebec, Canada. It is an open-source Python library widely used for performing mathematical operations on multi-dimensional arrays, building on NumPy and SciPy. It utilizes GPUs for faster computation and efficiently computes gradients by building symbolic graphs automatically. It has proven very suitable for numerically unstable expressions, as it first detects them and then computes them with more stable algorithms.

o CNTK
The Microsoft Cognitive Toolkit is an open-source deep learning framework. It provides all the basic building blocks required to form a neural network. Models are trained using C++ or Python, and C# or Java can be used to load a trained model for making predictions.
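Because the backend is pluggable, you can check or switch the engine without touching model code. A minimal sketch, assuming one of the multi-backend Keras releases where the backend is selected through the keras.json file or the KERAS_BACKEND environment variable:

import os
os.environ["KERAS_BACKEND"] = "tensorflow"   # or "theano" / "cntk", set before importing keras

from keras import backend as K
print(K.backend())   # reports which engine Keras is currently using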

Advantages of Keras

Keras offers the following advantages:



o It is very easy to understand and enables faster deployment of network models.
o It has huge community support, as most AI companies are keen on using it.
o It supports multiple backends, which means you can use TensorFlow, CNTK, or Theano as the backend according to your requirement.
o It is easy to deploy and supports cross-platform use. Following are the platforms on which Keras models can be deployed:
1. iOS with CoreML
2. Android with TensorFlow Android
3. Web browser with .js support
4. Cloud engine
5. Raspberry Pi
o It supports data parallelism, which means a Keras model can be trained on multiple GPUs at once, speeding up training and allowing huge amounts of data to be processed (see the sketch after this list).
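As an illustration of the data-parallelism point above, here is a minimal sketch using tf.keras and TensorFlow's MirroredStrategy (the layer sizes are arbitrary and purely illustrative); when more than one GPU is visible, each training batch is automatically split across the replicas:

import tensorflow as tf

# Replicate the model on all visible GPUs; batches are split across replicas.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(1)
    ])
    model.compile(optimizer="adam", loss="mse")

# model.fit(...) is then called exactly as in the single-GPU case.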

Disadvantages of Keras

o The only disadvantage is that Keras has its own pre-configured layers, and if you want to create an abstract layer, it won't let you because it cannot handle low-level APIs. It only supports high-level APIs running on top of the backend engine (TensorFlow, Theano, and CNTK).

Prerequisite

This Keras tutorial is made for both beginners and professionals, to help them
understand the fundamental concept of Keras. After the completion of this tutorial,
you will find yourself at a moderate level of expertise from where you can take
yourself to the next level.

Experiment 2 – Installing Keras and packages in Keras

To install Keras, you will need the Anaconda Distribution, which is supported by a company called Continuum Analytics. Anaconda is an open-source, free distribution that provides a platform for the Python and R languages. It is platform-independent, which means it can be installed on any operating system such as macOS, Windows, or Linux as per the user's requirement. It ships with more than 1,500 Python/R packages that are useful for developing deep learning as well as machine learning models.

It provides an easy Python installation together with several IDEs such as Jupyter Notebook, Anaconda Prompt, and Spyder. Once installed, it automatically sets up Python with some of its basic IDEs and libraries, providing as much convenience as it can to the user.

Following are the steps that illustrate Keras installation:

Step 1: Download Anaconda Python

To download Anaconda, either open your favorite browser and type Download Anaconda Python in the search bar, or simply follow the link given below.

https://www.anaconda.com/distribution/#download-section.

Click on the very first link, and you will get directed to the Anaconda's download
page, as shown below:

You will notice that Anaconda is available for various operating systems such as Windows, macOS, and Linux. You can download it as per your OS by clicking on the available option. It will offer you the Python 2.7 and Python 3.7 versions. Since the latest version is Python 3.7, download it by clicking on the download option. The downloading will automatically start after you hit the download option.

Step 2: Install Anaconda Python

After the download is finished, go to the download folder and click on the Anaconda .exe file (Anaconda3-2019.03-Windows-x86_64.exe). The setup window for the installation of Anaconda will open, where you have to click on Next, as shown below:

After clicking on Next, a License Agreement window opens; click on I Agree to move ahead with the installation.

Next, you will get two options in the window; click on the first option, followed by clicking on Next.

Once you are done with the installation, click on Next.



Click on Finish after the installation is completed to end the process.

Step 3: Create Environment

Now that you are done with installing Anaconda, you have to create a new conda environment where you will be installing all your modules to build your models.

You can run the Anaconda Prompt as an administrator by searching for Anaconda Prompt in the search bar, right-clicking on it, and then selecting the option that says Run as administrator.

After you click on it, you will see that your anaconda prompt has opened, and it
will look like the image given below.

Next, you will need to create an environment, for which you have to write the following command at the Anaconda Prompt and press Enter. Here deeplearning is the name of the environment, but you can use any name of your choice.

1. conda create --name deeplearning

From the image given above, you can see that it is asking you to confirm the package plan and environment location; type y and press Enter.

So, you can see in the above image that you have successfully created an environment. The next step is to activate the environment you created earlier. To activate the environment, write the following:

1. activate deeplearning

From the above image, you can see that you are in this environment. Next, you have to install Keras, which you can simply do by using the below-given command.
1. conda install -c anaconda keras

You can see that it is asking you to install the following packages, so proceed with
typing y.

From the above image, you can see that you are done with the installation
successfully.

Since this is a new environment, you need to do a few installations again, so as to avoid the occurrence of an error (ModuleNotFoundError: No module named 'keras') while importing Keras.

So, you have to run two more important commands, because when you create an environment, Jupyter and Spyder are not preinstalled, which is why you have to install them yourself.

First, you will run the command for jupyter, which is as follow:

1. conda install jupyter



Again, it will ask you to install the following packages, so proceed with typing y.

You can see in the above image that it has been successfully installed. Next, you will do the same for Spyder.

1. conda install spyder

Since you are doing this for the very first time, it will again ask you for y/n; simply proceed by typing y as you did before.

You can see that your installation is successfully completed.


You will also need to install matplotlib for visualization. Again, the same procedure is carried out.
1. conda install matplotlib

It will ask you for y/n; type y to proceed further.



You can see that you have successfully installed matplotlib.


Lastly, you will be installing pandas, and again the procedure is the same.
1. conda install pandas

Proceed by typing y.



From the image given above, you can see that it also has been installed successfully.
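As a quick sanity check (not shown in the screenshots above), you can confirm inside the activated deeplearning environment that everything imports cleanly; the exact version numbers you see will depend on what conda resolved:

# Run inside the activated "deeplearning" environment (python or a Jupyter cell)
import keras
import matplotlib
import pandas as pd

print("Keras:", keras.__version__)
print("matplotlib:", matplotlib.__version__)
print("pandas:", pd.__version__)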

Experiment 3 - Train the model to add two numbers and report the
result
With the advent of deep learning, there have been huge successes on perceptual problems. In this guide, for the sake of simplicity and ease of understanding, we will recast simple arithmetic addition as such a learning problem and then try to predict values with the trained model. We are going to use the Keras library, which is made available as part of the TensorFlow library.

Data Tensors
Getting the data in proper shape is perhaps the most important aspect of any
machine learning model and it holds true here as well. The below program
(data_creation.py) creates the training and test sets for the Addition problem.

import numpy as np

train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])
print(train_data)
for i in range(3,10000,2):
    train_data = np.append(train_data,[[i,i]],axis=0)
    train_targets = np.append(train_targets,[i+i])

test_data = np.array([[2.0,2.0]])
test_targets = np.array([4.0])
for i in range(4,8000,4):
    test_data = np.append(test_data,[[i,i]],axis=0)
    test_targets = np.append(test_targets,[i+i])
python
Let's analyze the above program:
import numpy as np

train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])
python
In the above lines, we import the NumPy library and create the train_data and train_targets data sets. train_data is the array that holds the two numbers to be added, while train_targets is the vector that holds their sum. train_data is initialized to contain 1.0 and 1.0 as the two numbers. This is a very simple program, so you will see the same number repeated (1.0); this pattern is repeated in the entire train and test data sets, that is, the same number (i) is added to itself.

for i in range(3,10000,2):
    train_data = np.append(train_data,[[i,i]],axis=0)
    train_targets = np.append(train_targets,[i+i])
python
The above lines append to the train_data array and the train_targets vector by looping over a counter (i) that starts from 3 and goes up to 10000 with a step of 2. This is what train_data looks like:
Output
[[1.000e+00 1.000e+00]
 [3.000e+00 3.000e+00]
 [5.000e+00 5.000e+00]
 ...
 [9.995e+03 9.995e+03]
 [9.997e+03 9.997e+03]
 [9.999e+03 9.999e+03]]
train_targets:
Output
[2.0000e+00 6.0000e+00 1.0000e+01 ... 1.9990e+04 1.9994e+04 1.9998e+04]
test_data and test_targets are also created in a similar fashion, with one difference: it goes till
8000 with the step of 4.
test_data = np.array([[2.0,2.0]])
test_targets = np.array([4.0])
for i in range(4,8000,4):
    test_data = np.append(test_data,[[i,i]],axis=0)
    test_targets = np.append(test_targets,[i+i])
test_data:
Output
[[2.000e+00 2.000e+00]
 [4.000e+00 4.000e+00]
 [8.000e+00 8.000e+00]
 ...
 [7.988e+03 7.988e+03]
 [7.992e+03 7.992e+03]
 [7.996e+03 7.996e+03]]
test_targets:

Output
[4.0000e+00 8.0000e+00 1.6000e+01 ... 1.5976e+04 1.5984e+04 1.5992e+04]

Developing Neural Network for Addition Using Keras


Keras is an API spec that can be used to run various deep learning libraries e.g. Tensorflow,
Theano, etc. It is to be noted that Keras does not have an implementation and it is a high-level
API that runs on top of other deep learning libraries. The problem we are attempting to solve is a
regression problem where the output can be a continuum of values rather than taking a specified
set of values. Below, the program creates a Deep Learning model, trains it using the training set
we created in the data_creation.py program, and then tests it using the test set also created in the
same program. Finally, the trained model is used to predict the values.
import tensorflow as tf
from tensorflow import keras
import numpy as np
import data_creation as dc

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(1)
])

model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])

model.fit(dc.train_data, dc.train_targets, epochs=10, batch_size=1)

test_loss, test_acc = model.evaluate(dc.test_data, dc.test_targets)
print('Test accuracy:', test_acc)
a = np.array([[2000,3000],[4,5]])
print(model.predict(a))
python
Let's analyze the above program by breaking it into small chunks:

import tensorflow as tf
from tensorflow import keras
import numpy as np
import data_creation as dc
python
The above lines import the TensorFlow, Keras, and NumPy libraries into the program. The data_creation.py program that we created earlier is also imported and aliased as dc. All the train and test data sets we created can now be referenced through dc; for example, if you need the contents of train_data, you access it as dc.train_data.
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(1)
])
python
The above code creates the actual Deep Learning model. The above model initializes a model as
a stack of layers (Keras.Sequential) and then flattens the input array to a vector
(keras.layers.Flatten(input_shape=(2,)). The flattening part also happens to be the first layer of
the neural network. The second and third layers of the network consist of 20 nodes each and the
activation function we are using is relu (rectified linear unit). Other activation functions such
as softmax can also be used. The last (fourth) layer is the output layer. Since we expect only one output value (a predicted value, since this is a regression model), we have just one output node in this model (keras.layers.Dense(1)).
The architecture of the model depends, to a large extent, on the problem we are trying to solve.
The model we have created above will not work very well for the classification problems, such

as image classification.
model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])
python
The above code will be used to compile the network. The optimization function we are using
is adam which is a momentum based optimizer and prevents the model from getting stuck in
local minima. The loss function we are using is mse (mean square error). It considers the squared
difference between the predicted values and the actual values. Also, we are monitoring another
metric, mae (mean absolute error).
model.fit(dc.train_data, dc.train_targets, epochs=10, batch_size=1)
python

This is where the actual training of the network happens. The training set will be fed to the network 10 times (epochs). The number of epochs needs to be chosen carefully: too few epochs may leave the network under-trained, while too many may lead to overfitting, where the network works well on the training data but not on the test data set.
test_loss, test_acc = model.evaluate(dc.test_data, dc.test_targets)
print('Test accuracy:', test_acc)
python
The above code evaluates the trained model on the test data set and subsequently prints the test
accuracy value.
a = np.array([[2000,3000],[4,5]])
print(model.predict(a))
python
Once the model has been trained and tested, we can use it to predict values for new inputs. In this case, we supply the two input pairs (2000,3000) and (4,5), and the output from the model is printed.
Output

Epoch 1/10
5000/5000 [==============================] - 5s 997us/sample - loss: 1896071.4827 - mean_absolute_error: 219.0276
Epoch 2/10
5000/5000 [==============================] - 5s 956us/sample - loss: 492.9092 - mean_absolute_error: 3.8202
Epoch 3/10
5000/5000 [==============================] - 5s 1ms/sample - loss: 999.7580 - mean_absolute_error: 7.1740
Epoch 4/10
5000/5000 [==============================] - 5s 1ms/sample - loss: 731.0374 - mean_absolute_error: 6.0325
Epoch 5/10
5000/5000 [==============================] - 5s 935us/sample - loss: 648.6434 - mean_absolute_error: 7.5037
Epoch 6/10
5000/5000 [==============================] - 5s 942us/sample - loss: 603.1096 - mean_absolute_error: 7.7574
Epoch 7/10
5000/5000 [==============================] - 5s 1ms/sample - loss: 596.2445 - mean_absolute_error: 5.1727
Epoch 8/10
5000/5000 [==============================] - 5s 924us/sample - loss: 685.5327 - mean_absolute_error: 4.9312
Epoch 9/10
5000/5000 [==============================] - 5s 931us/sample - loss: 1895.0845 - mean_absolute_error: 5.7679
Epoch 10/10
5000/5000 [==============================] - 5s 996us/sample - loss: 365.9733 - mean_absolute_error: 2.7120
2000/2000 [==============================] - 0s 42us/sample - loss: 5.8080 - mean_absolute_error: 2.0810
Test accuracy: 2.0810156
[[5095.9385 ]
 [   9.108022]]
python
As can be seen, the value predicted for the input pair (2000,3000) is 5095.9385 and for the input pair (4,5) it is 9.108022. This can be improved by changing the number of epochs, adding layers, or increasing the number of nodes in a layer; one possible variant is sketched at the end of this experiment.

import numpy as np
import tensorflow as tf
from tensorflow import keras

train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])
print(train_data)
for i in range(3,10000,2):
    train_data = np.append(train_data,[[i,i]],axis=0)
    train_targets = np.append(train_targets,[i+i])

test_data = np.array([[2.0,2.0]])
test_targets = np.array([4.0])
for i in range(4,8000,4):
    test_data = np.append(test_data,[[i,i]],axis=0)
    test_targets = np.append(test_targets,[i+i])

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(1)
])

model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])
model.fit(train_data, train_targets, epochs=10, batch_size=1)

test_loss, test_acc = model.evaluate(test_data, test_targets)
print('Test accuracy:', test_acc)
a = np.array([[2000,3000],[4,5]])
print(model.predict(a))
[[1. 1.]]
Epoch 1/10
5000/5000 [==============================] - 8s 1ms/step - loss: 742813.1875 - mae: 105.6519
Epoch 2/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1772.8480 - mae: 6.0134
Epoch 3/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1884.9642 - mae: 8.9911
Epoch 4/10
5000/5000 [==============================] - 8s 2ms/step - loss: 5049.1685 - mae: 10.6520
Epoch 5/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1018.8793 - mae: 6.2299
Epoch 6/10
5000/5000 [==============================] - 8s 2ms/step - loss: 1276.4749 - mae: 5.4312
Epoch 7/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1943.9398 - mae: 8.8076
Epoch 8/10
5000/5000 [==============================] - 7s 1ms/step - loss: 3522.0959 - mae: 8.8434
Epoch 9/10
5000/5000 [==============================] - 7s 1ms/step - loss: 707.0856 - mae: 5.1806
Epoch 10/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1415.4739 - mae: 8.1555
63/63 [==============================] - 0s 2ms/step - loss: 1182.1486 - mae: 29.7238
Test accuracy: 29.723848342895508
1/1 [==============================] - 0s 93ms/step
[[5440.4873  ]
 [   9.6786175]]
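One possible variant is sketched below. The wider layers, 30 epochs, and batch size of 32 are illustrative choices, not values from the original experiment; train_data, train_targets, test_data, and test_targets are the arrays built in the listing above.

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(64, activation=tf.nn.relu),   # wider hidden layers
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# More epochs and a larger batch size; tune these against the test error.
model.fit(train_data, train_targets, epochs=30, batch_size=32)
print(model.evaluate(test_data, test_targets))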

Experiment 4 - Train the model to multiply two matrices and report the result using
keras.

import numpy as np

from keras.models import Sequential

from keras.layers.core import Dense

# Set seed for reproducibility

np.random.seed(1)

# the four different states of the XOR gate

training_data = np.array([[0,0],[0,1],[1,0],[1,1]], "float32")

# the four expected results in the same order

target_data = np.array([[0],[1],[1],[0]], "float32")

model = Sequential()

model.add(Dense(4, input_dim=2, activation='relu'))

model.add(Dense(1, activation='sigmoid'))

model.compile(loss='mean_squared_error',optimizer='adam',metrics=['binary_accuracy'])

history = model.fit(training_data, target_data, epochs=1000, verbose=1)

# decimal output

print('decimal output:\n'+str(model.predict(training_data)))

# rounded output

print('rounded output:\n'+str(model.predict(training_data).round()))


Epoch 1/1000

1/1 [==============================] - 0s 473ms/step - loss: 0.2552 -


binary_accuracy: 0.7500

Epoch 2/1000

1/1 [==============================] - 0s 6ms/step - loss: 0.2550 - binary_accuracy:


0.7500

Epoch 3/1000

1/1 [==============================] - 0s 5ms/step - loss: 0.2547 - binary_accuracy:


0.7500

.
Epoch 989/1000

1/1 [==============================] - 0s 5ms/step - loss: 0.0967 - binary_accuracy:


1.0000

Epoch 990/1000

1/1 [==============================] - 0s 4ms/step - loss: 0.0966 - binary_accuracy:


1.0000

Epoch 991/1000

1/1 [==============================] - 0s 5ms/step - loss: 0.0964 - binary_accuracy:


1.0000

Epoch 992/1000

1/1 [==============================] - 0s 4ms/step - loss: 0.0963 - binary_accuracy:


1.0000

Epoch 993/1000

1/1 [==============================] - 0s 4ms/step - loss: 0.0962 - binary_accuracy:


1.0000

Epoch 994/1000

1/1 [==============================] - 0s 5ms/step - loss: 0.0960 - binary_accuracy:


1.0000

Epoch 995/1000

1/1 [==============================] - 0s 4ms/step - loss: 0.0959 - binary_accuracy:


1.0000

Epoch 996/1000

1/1 [==============================] - 0s 5ms/step - loss: 0.0957 - binary_accuracy:


1.0000

Epoch 997/1000

1/1 [==============================] - 0s 4ms/step - loss: 0.0956 - binary_accuracy:


1.0000

Epoch 998/1000

1/1 [==============================] - 0s 5ms/step - loss: 0.0954 - binary_accuracy:


1.0000

Epoch 999/1000

1/1 [==============================] - 0s 4ms/step - loss: 0.0953 - binary_accuracy:


1.0000

Epoch 1000/1000

1/1 [==============================] - 0s 5ms/step - loss: 0.0951 - binary_accuracy:


1.0000

1/1 [==============================] - 0s 63ms/step

decimal output:

[[0.34885803]

[0.7179798 ]

[0.68972814]

[0.28727812]]

1/1 [==============================] - 0s 24ms/step

rounded output:

[[0.]

[1.]

[1.]

[0.]]

Experiment 5 – Train the model to print the prime numbers using Keras

import numpy as np

from keras.layers import Dense, Dropout, Activation

from keras.layers.advanced_activations import PReLU

from keras.models import Sequential

from matplotlib import pyplot as plt

seed = 7

np.random.seed(seed)

num_digits = 14 # binary encode numbers

max_number = 2 ** num_digits

def prime_list():
    counter = 0
    primes = [2, 3]
    for n in range(5, max_number, 2):
        is_prime = True
        for i in range(1, len(primes)):
            counter += 1
            if primes[i] ** 2 > n:
                break
            counter += 1
            if n % primes[i] == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(n)
    return primes

primes = prime_list()

def prime_encode(i):
    if i in primes:
        return 1
    else:
        return 0

def bin_encode(i):
    return [i >> d & 1 for d in range(num_digits)]

def create_dataset():
    x, y = [], []
    for i in range(102, max_number):
        x.append(bin_encode(i))
        y.append(prime_encode(i))
    return np.array(x), y

x_train, y_train = create_dataset()

model = Sequential()
model.add(Dense(units=100, input_dim=num_digits))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=50))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=25))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=1))
model.add(Activation("sigmoid"))

model.compile(optimizer='RMSprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train, epochs=1000, batch_size=128,
                    validation_split=0.1)

# predict
errors, correct = 0, 0
tp, fn, fp = 0, 0, 0
for i in range(2, 101):
    x = bin_encode(i)
    y = model.predict(np.array(x).reshape(-1, num_digits))
    if y[0][0] >= 0.5:
        pred = 1
    else:
        pred = 0
    obs = prime_encode(i)
    print(i, obs, pred, y[0][0])
    if pred == obs:
        correct += 1
    else:
        errors += 1
    if obs == 1 and pred == 1:
        tp += 1
    if obs == 1 and pred == 0:
        fn += 1
    if obs == 0 and pred == 1:
        fp += 1

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_score = 2 * precision * recall / (precision + recall)
print("Errors :", errors, " Correct :", correct, "F score :", f_score)

def plot_history(history):
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.legend(['loss', 'val_loss'], loc='upper right')
    plt.savefig('RMSprop_more')

plot_history(history)

Output
Errors : 9  Correct : 90  F score : 0.8235294117647058

2 1 0 3.6083e-08
3 1 0 0.032073
4 0 0 8.2177e-06
5 1 1 0.61928
6 0 0 3.9766e-15
7 1 1 0.82378
8 0 0 0.00872908
9 0 0 0.024348
10 0 0 1.0083e-09
11 1 0 0.18609
12 0 0 5.5388e-08
13 1 1 0.79401
14 0 0 3.6747e-15
15 0 0 0.02023
16 0 0 4.049e-07
17 1 1 0.84016
18 0 0 4.1057e-14
19 1 1 0.91604
20 0 0 4.376e-15
21 0 0 0.011451
22 0 0 6.4946e-25
23 1 1 0.59597
24 0 0 4.4555e-08
25 0 1 0.73933
26 0 0 4.9732e-18

27 0 0 0.0958722
28 0 0 1.80154e-16
29 1 1 0.722513
30 0 0 2.00777e-28
31 1 1 0.774054
32 0 0 3.93779e-05
33 0 0 0.118341
34 0 0 1.88295e-11
35 0 0 0.480108
36 0 0 3.0609e-07
37 1 1 0.847888
38 0 0 3.42833e-18
39 0 0 0.0514646
40 0 0 5.82673e-07
41 1 1 0.726771
42 0 0 3.72693e-11
43 1 1 0.861872
44 0 0 5.71867e-14
45 0 0 0.18657
46 0 0 7.03075e-16
47 1 1 0.654062
48 0 0 1.30385e-10
49 0 1 0.923631
50 0 0 1.30955e-17
51 0 0 0.190215
52 0 0 6.45953e-19
53 1 1 0.558284
54 0 0 1.83163e-29
55 0 0 0.287756
56 0 0 3.29105e-11
57 0 0 0.292637
58 0 0 3.57044e-23
59 1 0 0.152102
60 0 0 1.80104e-22
61 1 1 0.858877
62 0 0 1.92684e-32
63 0 0 0.27367
64 0 0 1.74397e-09
65 0 1 0.727574
66 0 0 1.33752e-20
67 1 1 0.891129
68 0 0 1.47396e-17
69 0 0 0.346057
70 0 0 5.27672e-27
71 1 1 0.932053
72 0 0 4.04155e-10
73 1 1 0.879374
74 0 0 1.4077e-18
75 0 0 0.0290487
76 0 0 6.39801e-17
77 0 1 0.629597
78 0 0 1.54139e-30
79 1 1 0.791511
80 0 0 7.56631e-21
81 0 0 0.0438443
82 0 0 4.24787e-30
83 1 1 0.596353
84 0 0 6.45592e-32
85 0 0 0.431211
86 0 0 0.0
87 0 0 0.00903795
88 0 0 9.54647e-23
89 1 1 0.827787
90 0 0 2.43897e-31
91 0 1 0.746695
92 0 0 8.37092e-37

93 0 0 0.0384408
94 0 0 0.0
95 0 0 0.3743

96 0 0 7.28071e-13
97 1 1 0.888417
98 0 0 3.04541e-25
99 0 0 0.0649973
100 0 0 1.59478e-18

6. Recurrent Neural Network

a. NumPy implementation of a simple recurrent neural network (a sketch is given after this list)

b. Create a recurrent layer in keras

c. Prepare IMDB data for movie review classification problem.


d. Train the model with embedding and simple RNN layers.

e. Plot the Results
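For part (a), the listing below only covers the Keras part; a minimal NumPy sketch of a simple recurrent layer (forward pass only, with sizes chosen purely for illustration) could look like this:

import numpy as np

timesteps, input_dim, hidden_dim = 10, 8, 16
inputs = np.random.random((timesteps, input_dim))   # one input sequence

W = np.random.random((hidden_dim, input_dim))       # input-to-hidden weights
U = np.random.random((hidden_dim, hidden_dim))      # hidden-to-hidden weights
b = np.random.random((hidden_dim,))                 # bias

state = np.zeros((hidden_dim,))                     # initial hidden state
outputs = []
for x_t in inputs:
    # simple RNN step: new state depends on the current input and the previous state
    state = np.tanh(np.dot(W, x_t) + np.dot(U, state) + b)
    outputs.append(state)

outputs = np.stack(outputs)                         # shape (timesteps, hidden_dim)
print(outputs.shape)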

import numpy as np

import matplotlib.pyplot as plt

import tensorflow as tf

from tensorflow.keras.datasets import imdb

from tensorflow.keras.preprocessing.sequence import pad_sequences

(x_train,y_train),(x_test,y_test) = imdb.load_data(num_words=20000)

x_train=pad_sequences(x_train,maxlen=100)

x_test=pad_sequences(x_test,maxlen=100)

vocab_size=20000

emed_size=128

from tensorflow.keras import Sequential

from tensorflow.keras.layers import LSTM, Dropout, Dense, Embedding

model=Sequential()

model.add(Embedding(vocab_size,emed_size,input_shape=(x_train.shape[1],)))

model.add(LSTM(units=60,activation='tanh'))
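# NOTE: a Dense(1, activation='sigmoid') classification layer is normally added here
# for binary sentiment classification; the listing omits it, which may explain the
# unusual accuracy values in the output below.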

model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])

model.summary()

history=model.fit(x_train,y_train,epochs=5,batch_size=128,validation_data=(x_test,y_test))

epoch_range = [1,2,3,4,5]

plt.plot(epoch_range, history.history['accuracy'])
plt.plot(epoch_range, history.history['val_accuracy'])
plt.legend(['accuracy', 'val_accuracy'])
plt.show()

Model: "sequential"

Layer (type) Output Shape Param #


=================================================================
embedding (Embedding)        (None, 100, 128)          2560000

lstm (LSTM)                  (None, 60)                45360

=================================================================
Total params: 2,605,360
Trainable params: 2,605,360
Non-trainable params: 0

Epoch 1/5
196/196 [==============================] - 38s 170ms/step - loss: 1.7606 - accuracy: 0.0078 - val_loss: 1.1591 - val_accuracy: 0.1327
Epoch 2/5
196/196 [==============================] - 32s 164ms/step - loss: 1.1524 - accuracy: 0.1252 - val_loss: 1.1365 - val_accuracy: 0.0272
Epoch 3/5
196/196 [==============================] - 32s 162ms/step - loss: 0.9810 - accuracy: 0.0443 - val_loss: 0.9238 - val_accuracy: 0.0105
Epoch 4/5
196/196 [==============================] - 32s 165ms/step - loss: 0.7770 - accuracy: 0.0857 - val_loss: 0.9253 - val_accuracy: 0.1010
Epoch 5/5
196/196 [==============================] - 32s 165ms/step - loss: 0.6973 - accuracy: 0.0709 - val_loss: 1.0435 - val_accuracy: 0.0549
that you can use according to your preference.

You can download a historical weather dataset from here; also feel free to use any weather dataset of your choice that has temperature data.



Let's load the dataset and see the first few rows:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# import dataset from data.csv file
dataset = pd.read_csv('data.csv')
dataset = dataset.dropna(subset=["Temperature"])
dataset = dataset.reset_index(drop=True)

training_set = dataset.iloc[:,4:5].values

We include only the temperature column, as we are going to forecast temperature, and drop all the rows that have missing or NaN values.

Next, we will have to apply feature scaling to normalize temperature in the range 0 to 1.
# Feature Scaling
from sklearn.preprocessing import MinMaxScaler

sc = MinMaxScaler(feature_range=(0,1))
training_set_scaled = sc.fit_transform(training_set)

We will create a training set such that for every 30 days we will provide the next 4 days
temperature as output. In other words, input for our RNN would be 30 days temperature data
and the output would be 4 days forecast of temperature.
x_train = []
y_train = []

n_future = 4   # next 4 days temperature forecast
n_past = 30    # past 30 days

for i in range(0, len(training_set_scaled) - n_past - n_future + 1):
    x_train.append(training_set_scaled[i : i + n_past, 0])
    y_train.append(training_set_scaled[i + n_past : i + n_past + n_future, 0])

x_train, y_train = np.array(x_train), np.array(y_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

x_train contains 30 previous temperature inputs before that day and y_train contains 4 days

temperature outputs after that day. Since x_train and y_train are lists we will have to convert
them to numpy array to fit training set to our model.

Now we are ready with our training data so let’s proceed to build an RNN model for forecasting
weather.

1. First, we will import the Keras Sequential model from keras.models and the Keras layers we need, i.e. LSTM, Dense, Dropout, and Bidirectional (the latter is used in the model below). You can refer to the Keras documentation for more info on Keras models and layers here.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Bidirectional

# Fitting RNN to training set using Keras Callbacks. Read Keras callbacks docs for more info.

2. Let us define the layers in our RNN. We will create a sequential model by adding layers one after another using Sequential(). The first layer is a Bidirectional LSTM with 30 memory units; return_sequences=True means that the full output sequence (one output per timestep) is returned so that the next LSTM layer can consume it, and input_shape describes the structure of the input. With a Bidirectional LSTM the layer gets information from past (forward) as well as future (backward) states simultaneously. We add 3 more hidden LSTM layers and an output layer with a linear activation function that outputs the 4 days of temperature.

Finally, we fit the RNN model to our training data.
regressor = Sequential()

regressor.add(Bidirectional(LSTM(units=30, return_sequences=True,
                                 input_shape=(x_train.shape[1], 1))))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=30, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=30, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=30))
regressor.add(Dropout(0.2))
regressor.add(Dense(units=n_future, activation='linear'))

regressor.compile(optimizer='adam', loss='mean_squared_error', metrics=['acc'])
regressor.fit(x_train, y_train, epochs=500, batch_size=32)

Note: I have used Adam optimizer because it is computationally efficient.

3. Create test data to test our model performance.

# read test dataset
testdataset = pd.read_csv('data (12).csv')

# get only the temperature column
testdataset = testdataset.iloc[:30, 3:4].values

real_temperature = pd.read_csv('data (12).csv')
real_temperature = real_temperature.iloc[30:, 3:4].values

testing = sc.transform(testdataset)
testing = np.array(testing)
testing = np.reshape(testing, (testing.shape[1], testing.shape[0], 1))

4. Now that we have our test data ready, we can test our RNN model.

predicted_temperature = regressor.predict(testing)
predicted_temperature = sc.inverse_transform(predicted_temperature)
predicted_temperature = np.reshape(predicted_temperature,
                                   (predicted_temperature.shape[1], predicted_temperature.shape[0]))

The output from the model is in the normalized form, so to get the actual temperature values we
apply inverse_transform() to the predicted_temperature and then reshape it.

Let’s compare the predicted and real temperatures. As we can see the model performs well with
the given test data.
real_temperature
array([[82.], [82.], [83.], [83.]])

predicted_temperature
array([[83.76233 ], [83.957565], [83.70461 ], [83.6326 ]])

If we forecast temperature for a month and visualize it we get the following results.

Forecast of temperature over a month



8. Long short-term memory network

a. Implement LSTM using LSTM layer in keras

b. Train and evaluate using reversed sequences for IMDB data

c. Train and evaluate a bidirectional LSTM for IMDB data


a. Implement LSTM using LSTM layer in keras

tf.keras.layers.LSTM(
    units,
    activation="tanh",
    recurrent_activation="sigmoid",
    use_bias=True,
    kernel_initializer="glorot_uniform",
    recurrent_initializer="orthogonal",
    bias_initializer="zeros",
    unit_forget_bias=True,
    kernel_regularizer=None,
    recurrent_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    recurrent_constraint=None,
    bias_constraint=None,
    dropout=0.0,
    recurrent_dropout=0.0,
    return_sequences=False,
    return_state=False,
    go_backwards=False,
    stateful=False,
    time_major=False,
    unroll=False,
    **kwargs
)

Based on available runtime hardware and constraints, this layer will choose different
implementations (cuDNN-based or pure-TensorFlow) to maximize the performance. If a GPU
is available and all the arguments to the layer meet the requirement of the cuDNN kernel (see
below for details), the layer will use a fast cuDNN implementation.

The requirements to use the cuDNN implementation are:

1. activation == tanh

2. recurrent_activation == sigmoid
3. recurrent_dropout == 0
4. unroll is False
5. use_bias is True
6. Inputs, if use masking, are strictly right-padded.
7. Eager execution is enabled in the outermost context.

For example:

>>> inputs = tf.random.normal([32, 10, 8])


>>> lstm = tf.keras.layers.LSTM(4)
>>> output = lstm(inputs)
>>> print(output.shape)
(32, 4)
>>> lstm = tf.keras.layers.LSTM(4, return_sequences=True, return_state=True)
>>> whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)
>>> print(whole_seq_output.shape)
(32, 10, 4)
>>> print(final_memory_state.shape)
(32, 4)
>>> print(final_carry_state.shape)
(32, 4)
Arguments

• units: Positive integer, dimensionality of the output space.


• activation: Activation function to use. Default: hyperbolic tangent (tanh). If you
pass None, no activation is applied (ie. "linear" activation: a(x) = x).
• recurrent_activation: Activation function to use for the recurrent step. Default: sigmoid
(sigmoid). If you pass None, no activation is applied (ie. "linear" activation: a(x) = x).
• use_bias: Boolean (default True), whether the layer uses a bias vector.
• kernel_initializer: Initializer for the kernel weights matrix, used for the linear
transformation of the inputs. Default: glorot_uniform.
• recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the
linear transformation of the recurrent state. Default: orthogonal.
• bias_initializer: Initializer for the bias vector. Default: zeros.
• unit_forget_bias: Boolean (default True). If True, add 1 to the bias of the forget gate at
initialization. Setting it to true will also force bias_initializer="zeros". This is
recommended in Jozefowicz et al..
• kernel_regularizer: Regularizer function applied to the kernel weights matrix.
Default: None.
• recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights
matrix. Default: None.
• bias_regularizer: Regularizer function applied to the bias vector. Default: None.
• activity_regularizer: Regularizer function applied to the output of the layer (its
"activation"). Default: None.

• kernel_constraint: Constraint function applied to the kernel weights matrix.


Default: None.
• recurrent_constraint: Constraint function applied to the recurrent_kernel weights
matrix. Default: None.
• bias_constraint: Constraint function applied to the bias vector. Default: None.
• dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Default: 0.
• recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. Default: 0.
• return_sequences: Boolean. Whether to return the last output in the output sequence, or
the full sequence. Default: False.
• return_state: Boolean. Whether to return the last state in addition to the output.
Default: False.
• go_backwards: Boolean (default False). If True, process the input sequence backwards
and return the reversed sequence.
• stateful: Boolean (default False). If True, the last state for each sample at index i in
a batch will be used as initial state for the sample of index i in the following batch.
• time_major: The shape format of the inputs and outputs tensors. If True, the inputs and
outputs will be in shape [timesteps, batch, feature], whereas in the False case, it will
be [batch, timesteps, feature]. Using time_major = True is a bit more efficient because it
avoids transposes at the beginning and end of the RNN calculation. However, most
TensorFlow data is batch-major, so by default this function accepts input and emits
output in batch-major form.
• unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.

Call arguments

• inputs: A 3D tensor with shape [batch, timesteps, feature].


• mask: Binary tensor of shape [batch, timesteps] indicating whether a given timestep
should be masked (optional, defaults to None). An individual True entry indicates that
the
corresponding timestep should be utilized, while a False entry indicates that the
corresponding timestep should be ignored.
• training: Python boolean indicating whether the layer should behave in training mode or
in inference mode. This argument is passed to the cell when calling it. This is only
relevant if dropout or recurrent_dropout is used (optional, defaults to None).
• initial_state: List of initial state tensors to be passed to the first call of the cell (optional,
defaults to None which causes creation of zero-filled initial state tensors).
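For example, you can pass an explicit initial_state when calling the layer. A minimal sketch (the zero tensors below are exactly the default state the layer would otherwise create itself):

import tensorflow as tf

lstm = tf.keras.layers.LSTM(4, return_state=True)
inputs = tf.random.normal([32, 10, 8])        # [batch, timesteps, feature]

# Explicit zero-filled initial hidden state and cell state.
h0 = tf.zeros([32, 4])
c0 = tf.zeros([32, 4])

output, final_h, final_c = lstm(inputs, initial_state=[h0, c0])
print(output.shape, final_h.shape, final_c.shape)   # (32, 4) for each tensor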

b. Train and evaluate using reversed sequences for IMDB data

• Sometimes, a sequence is better used in reversed order. In those cases, you can simply reverse a vector x using the Python syntax x[::-1] before using it to train your LSTM network (a minimal sketch is given after this list).

• Sometimes, neither the forward nor the reversed order works perfectly, but combining
them will give better results. In this case, you will need a bidirectional LSTM network.
• A bidirectional LSTM network is simply two separate LSTM networks; one feeds with
a forward sequence and another with reversed sequence. Then the output of the two
LSTM networks is concatenated together before being fed to the subsequent layers of
the network. In Keras, you have the function Bidirectional() to clone an LSTM layer for
forward-backward input and concatenate their output. For example,
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(Bidirectional(LSTM(100, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(1, activation='sigmoid'))
• Since you created not one, but two LSTMs with 100 units each, this network will take
twice the amount of time to train. Depending on the problem, this additional cost may
be justified.
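For part (b), the reversal itself is a one-liner. A minimal sketch, assuming X_train and X_test are the padded integer arrays produced by sequence.pad_sequences in the listing below; the reversed copies are then used in place of the originals when fitting and evaluating:

# Reverse every review along the timestep axis; with padded sequences the
# padding simply ends up on the other side, and the model code is unchanged.
X_train_rev = X_train[:, ::-1]
X_test_rev = X_test[:, ::-1]

# model.fit(X_train_rev, y_train, epochs=3, batch_size=64)
# scores = model.evaluate(X_test_rev, y_test, verbose=0)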

The full code listing with adding the bidirectional LSTM to the last example is listed
below for completeness.

# LSTM with dropout for sequence classification in the IMDB dataset
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Bidirectional
from tensorflow.keras.layers import Embedding
from tensorflow.keras.preprocessing import sequence
# fix random seed for reproducibility
tf.random.set_seed(7)
# load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(Bidirectional(LSTM(100, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
• Note: Your results may vary given the stochastic nature of the algorithm or
evaluation procedure, or differences in numerical precision. Consider running the
example a few times and compare the average outcome.
• Running this example provides the following output.

Epoch 1/3
391/391 [==============================] - 405s 1s/step - loss: 0.4960 - accuracy: 0.7532
Epoch 2/3
391/391 [==============================] - 439s 1s/step - loss: 0.3075 - accuracy: 0.8744
Epoch 3/3
391/391 [==============================] - 430s 1s/step - loss: 0.2551 - accuracy: 0.9014
Accuracy: 87.69%

c. Train and evaluate a bidirectional LSTM for IMDB data

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
max_features = 20000 # Only consider the top 20k words
maxlen = 200 # Only consider the first 200 words of each movie review

Build the model

# Input for variable-length sequences of integers


inputs = keras.Input(shape=(None,), dtype="int32")
# Embed each integer in a 128-dimensional vector
x = layers.Embedding(max_features, 128)(inputs)
# Add 2 bidirectional LSTMs
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
x = layers.Bidirectional(layers.LSTM(64))(x)
# Add a classifier
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.summary()

Model: "model"

Layer (type) Output Shape Param #


=================================================================
input_1 (InputLayer) [(None, None)] 0

embedding (Embedding) (None, None, 128) 2560000


bidirectional (Bidirectional (None, None, 128) 98816

bidirectional_1 (Bidirection (None, 128) 98816

dense (Dense) (None, 1) 129


=================================================================
Total params: 2,757,761
Trainable params: 2,757,761
Non-trainable params: 0

Load the IMDB movie review sentiment data

(x_train, y_train), (x_val, y_val) = keras.datasets.imdb.load_data( num_words=max_features)


print(len(x_train), "Training sequences")
print(len(x_val), "Validation sequences")
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
x_val = keras.preprocessing.sequence.pad_sequences(x_val, maxlen=maxlen)
25000 Training sequences
25000 Validation sequences

Train and evaluate the model

model.compile("adam", "binary_crossentropy", metrics=["accuracy"])


model.fit(x_train, y_train, batch_size=32, epochs=2, validation_data=(x_val, y_val))
Epoch 1/2
782/782 [==============================] - 220s 281ms/step - loss: 0.4117 - accuracy:
0.8083 - val_loss: 0.6497 - val_accuracy: 0.6983
Epoch 2/2
726/782 [==========================>...] - ETA: 11s - loss: 0.3170 - accuracy: 0.8683

10. Convolutional Neural Networks


a. Preparing the IMDB data
b. Train and evaluate a simple 1D convnet on IMDB data
c. Train and evaluate a simple 1D convnet on temperature prediction data (a sketch is given at the end of this experiment)
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline
import string
import os

mother_directory = os.getcwd()
data_directory = "../input/imdb-movie-ratings-sentiment-analysis/movie.csv"
data = pd.read_csv(data_directory)
max_length = 0
for sentence in data["text"]:
    new_sentence = sentence.translate(str.maketrans("","",string.punctuation))
    #print(new_sentence)
    length = len(new_sentence.split(" "))
    if length > max_length:
        max_length = length
def tokenize_the_data_from_pandas(dataframe, column_name):
    # This function returns a tokens dictionary that includes every unique word
    # as a key and a unique integer for each key.

    # Here we clean the string of punctuation.
    import string
    for sent in range(len(data[column_name])):
        example_sentence = data[column_name].iloc[sent]
        new_sentence = example_sentence.translate(str.maketrans("","",string.punctuation))
        data[column_name].iloc[sent] = new_sentence

    # Here we create a dictionary that will encode each word into an integer to have a
    # representation of the word in the deep neural network's processes.
    tokens = {}
    for sent in range(len(data[column_name])):
        example_sentence = data[column_name].iloc[sent]
        values = example_sentence.split(" ")
        for word in values:
            tokens[word] = 0

    names = list(tokens.keys())
    for num in range(len(names)):
        tokens[names[num]] = num+1

    return tokens

tokens = tokenize_the_data_from_pandas(data,column_name="text")
len(tokens.keys())
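The fitting step below uses X_train, y_train, X_test, and y_test, which the excerpt above never builds. A hedged sketch of one way to produce them follows; the label column name and the 90/10 split are assumptions, while the maximum length of 2470 is taken from the model definition below:

from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Encode every cleaned review as a sequence of token ids, then pad to a fixed length.
encoded = [[tokens[w] for w in text.split(" ")] for text in data["text"]]
X = pad_sequences(encoded, maxlen=2470, padding="post")
y = data["label"].values   # assumed name of the sentiment column in movie.csv

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)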

#This is a 1D convolutional model for the task

model = tf.keras.Sequential()
#This embedding layer reduces the data dimension from 290 to 10 by creating relations
#and using floating numbers to represent the words.
model.add(tf.keras.layers.Embedding(190020, 200, input_length=2470))
model.add(tf.keras.layers.Conv1D(32, 7, activation="relu"))
model.add(tf.keras.layers.MaxPooling1D(5))
model.add(tf.keras.layers.Conv1D(64, 7, activation="relu"))
model.add(tf.keras.layers.MaxPooling1D(5))
model.add(tf.keras.layers.Conv1D(64, 7, activation="relu"))
model.add(tf.keras.layers.MaxPooling1D(5))
#model.add(tf.keras.layers.GlobalMaxPooling1D())
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, validation_data=(X_test, y_test))
Epoch 1/5
2022-01-26 13:16:46.737730: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8005
1125/1125 [==============================] - 37s 26ms/step - loss: 0.3360 - accuracy: 0.8362 - val_loss: 0.2162 - val_accuracy: 0.9087
Epoch 2/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0945 - accuracy: 0.9680 - val_loss: 0.2506 - val_accuracy: 0.9013
Epoch 3/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0161 - accuracy: 0.9954 - val_loss: 0.4001 - val_accuracy: 0.8938
Epoch 4/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0096 - accuracy: 0.9956 - val_loss: 0.4988 - val_accuracy: 0.8920
Epoch 5/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0107 - accuracy: 0.9967 - val_loss: 0.5361 - val_accuracy: 0.8915
Out[14]:
<keras.callbacks.History at 0x7f560b67b890>
#Predicting on the test set to investigate further.
y_pred=model.predict(X_test[:])
#We turn our networks output into binary values
y_pred[y_pred>0.5] = 1
y_pred[y_pred< 0.5] = 0
Truth = 0
Falset = 0

for pik in range(len(y_pred)):
    result = y_pred[pik] == y_test[pik]
    if False in result:
        Falset += 1
    else:
        Truth += 1
    #print(y_pred[pik] == y_test[pik])

print(Truth, Falset)
3566 434

print("accuracy:", Truth / (Truth + Falset))
accuracy: 0.8915
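Part (c) of this experiment is not worked out in the listing above. A minimal hedged sketch, reusing the x_train (samples, 30, 1) and y_train (samples, 4) windows built in the weather-forecasting experiment earlier in this manual; the filter counts, kernel size, and number of epochs are illustrative choices, not values from the original:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

# 1D convnet that maps 30 past days of (scaled) temperature to the next 4 days.
conv_model = Sequential([
    Conv1D(32, 5, activation="relu", input_shape=(30, 1)),
    MaxPooling1D(2),
    Conv1D(32, 5, activation="relu"),
    Flatten(),
    Dense(4, activation="linear"),
])
conv_model.compile(optimizer="adam", loss="mean_squared_error", metrics=["mae"])
conv_model.fit(x_train, y_train, epochs=50, batch_size=32)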

11. Develop a traditional LSTM for sequence classification problem.

You can quickly develop a small LSTM for the IMDB problem and achieve good accuracy.

Let’s start by importing the classes and functions required for this model and initializing the
random number generator to a constant value to ensure you can easily reproduce the results.

import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Embedding
from tensorflow.keras.preprocessing import sequence
# fix random seed for reproducibility
tf.random.set_seed(7)
You need to load the IMDB dataset. You are constraining the dataset to the top 5,000
words. You will also split the dataset into train (50%) and test (50%) sets.

# load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
Next, you need to truncate and pad the input sequences, so they are all the same length for
modeling. The model will learn that the zero values carry no information. The sequences are not
the same length in terms of content, but same-length vectors are required to perform the
computation in Keras.

# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
You can now define, compile and fit your LSTM model.

The first layer is the Embedded layer that uses 32-length vectors to represent each word. The
next layer is the LSTM layer with 100 memory units (smart neurons). Finally, because this is a
classification problem, you will use a Dense output layer with a single neuron and a sigmoid
activation function to make 0 or 1 predictions for the two classes (good and bad) in the problem.

Because it is a binary classification problem, log loss is used as the loss function (binary_crossentropy in Keras). The efficient ADAM optimization algorithm is used. The model is fit for only a few epochs (three here) because it quickly overfits the problem. A large batch size of 64 reviews is used to space out weight updates.
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3, batch_size=64)
Once fit, you can estimate the performance of the model on unseen reviews.

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
For completeness, here is the full code listing for this LSTM network on the IMDB dataset.

# LSTM for sequence classification in the IMDB dataset
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Embedding
from tensorflow.keras.preprocessing import sequence
# fix random seed for reproducibility
tf.random.set_seed(7)
# load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
Running this example produces the following output.

Epoch 1/3
391/391 [==============================] - 124s 316ms/step - loss: 0.4525 - accuracy: 0.7794
Epoch 2/3
391/391 [==============================] - 124s 318ms/step - loss: 0.3117 - accuracy: 0.8706
Epoch 3/3
391/391 [==============================] - 126s 323ms/step - loss: 0.2526 - accuracy: 0.9003
Accuracy: 86.83%
