
Ex. No 4 : LANGUAGE MODELING USING RNN

AIM: To write a Python program for language modeling using an RNN.

ALGORITHM:

1. Use a recurrent neural network to do the following:

 Sentiment Analysis
 Character Generation

2. Collect a Bag of Words representation: store only the frequency of each word, not its order (see the sketch after this list).

3. Apply Word Embedding, which attempts to encode not only the frequency and order of words but also their meaning in the sentence. Each word is encoded as a dense vector that represents its context in the sentence.

4. More Preprocessing: follow the procedure below:

 if a review is longer than 250 words, trim off the extra words
 if a review is shorter than 250 words, add the necessary number of 0's to pad it to 250.

5. Creating the Model

 Use a word embedding layer as the first layer in the model, and add an LSTM layer afterwards that feeds into a dense node to get the predicted sentiment.

 32 is the output dimension of the vectors generated by the embedding layer.

6. Making Predictions
 Use the trained network to make predictions on our own reviews.

 Since the reviews in the dataset are encoded, we'll need to convert any review we write into the same encoded form so the network can understand it.

 To do that, we'll load the encodings from the dataset and use them to encode our own data.
7. RNN Play Generator

7.1 Loading Your Own Data: To load your own data, upload a file from the dialog below.
7.2 Encoding: Encode each unique character as a different integer.
7.3 Creating Training Examples

 Feed the model a sequence and have it return the next character.

 The training examples we prepare use a sequence of length seq_length as the input and a sequence of length seq_length as the output, where the output sequence is the input sequence shifted one character to the right.

7.4 Creating a Loss Function:

 Create our own loss function for this problem.

 This is needed because the model outputs a (64, sequence_length, 65) shaped tensor that represents the probability distribution over characters at each timestep for every sequence in the batch.

7.5 Creating Checkpoints

 Set up and configure the model to save checkpoints as it trains. This will allow us to load the model from a checkpoint and continue training it.
7.6 Loading the Model
 Rebuild the model from a checkpoint using a batch_size of 1 so that we can feed one piece of
text to the model and have it make a prediction.
7.7 Generating Text: Use a helper function built with TensorFlow to generate text from any starting string.
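
As referenced in step 2, a minimal illustrative sketch of the Bag of Words idea (the example sentence is made up for illustration; the program below uses word embeddings instead):

from collections import Counter

# Bag of Words keeps only the frequency of each word and discards word order
sentence = "the movie was good and the acting was good"
bag = Counter(sentence.split())
print(bag)  # e.g. Counter({'the': 2, 'was': 2, 'good': 2, 'movie': 1, 'and': 1, 'acting': 1})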

PROGRAM

from keras.datasets import imdb
from keras.preprocessing import sequence
import keras
import tensorflow as tf
import os
import numpy as np

VOCAB_SIZE = 88584
MAXLEN = 250
BATCH_SIZE = 64

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=VOCAB_SIZE)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
17464789/17464789 [==============================] - 9s 1us/step

len(train_data[1])

189

train_data=sequence.pad_sequences(train_data,MAXLEN)
test_data=sequence.pad_sequences(test_data,MAXLEN)

len(train_data[1])

250

model=tf.keras.Sequential([
tf.keras.layers.Embedding(VOCAB_SIZE,32),
tf.keras.layers.LSTM(32),
tf.keras.layers.Dense(1,activation='sigmoid')
])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 32) 2834688

lstm (LSTM) (None, 32) 8320

dense (Dense) (None, 1) 33

=================================================================
Total params: 2843041 (10.85 MB)
Trainable params: 2843041 (10.85 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
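
A quick hand check of the parameter counts shown in the summary (arithmetic only, not part of the program):

# Embedding : VOCAB_SIZE * 32 = 88584 * 32                  = 2,834,688
# LSTM      : 4 * (input_dim*units + units*units + units)
#           = 4 * (32*32 + 32*32 + 32)                      = 8,320
# Dense     : 32 weights + 1 bias                           = 33
# Total     : 2,834,688 + 8,320 + 33                        = 2,843,041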

model.compile(loss="binary_crossentropy", optimizer="rmsprop", metrics=['accuracy'])
history=model.fit(train_data,train_labels,epochs=10,validation_split=0.2)

Epoch 1/10
625/625 [==============================] - 85s 131ms/step - loss: 0.4563 -
accuracy: 0.7702 - val_loss: 0.3436 - val_accuracy: 0.8610
Epoch 2/10
625/625 [==============================] - 82s 131ms/step - loss: 0.2598 -
accuracy: 0.8989 - val_loss: 0.2953 - val_accuracy: 0.8844
Epoch 3/10
625/625 [==============================] - 82s 131ms/step - loss: 0.1996 -
accuracy: 0.9259 - val_loss: 0.3137 - val_accuracy: 0.8780
Epoch 4/10
625/625 [==============================] - 79s 126ms/step - loss: 0.1583 -
accuracy: 0.9434 - val_loss: 0.3233 - val_accuracy: 0.8820
Epoch 5/10
625/625 [==============================] - 84s 134ms/step - loss: 0.1349 -
accuracy: 0.9547 - val_loss: 0.3979 - val_accuracy: 0.8650
Epoch 6/10
625/625 [==============================] - 83s 132ms/step - loss: 0.1127 -
accuracy: 0.9603 - val_loss: 0.3648 - val_accuracy: 0.8838
Epoch 7/10
625/625 [==============================] - 85s 136ms/step - loss: 0.0970 -
accuracy: 0.9691 - val_loss: 0.3821 - val_accuracy: 0.8862
Epoch 8/10
625/625 [==============================] - 78s 124ms/step - loss: 0.0796 -
accuracy: 0.9747 - val_loss: 0.3580 - val_accuracy: 0.8750
Epoch 9/10
625/625 [==============================] - 76s 122ms/step - loss: 0.0694 -
accuracy: 0.9780 - val_loss: 0.4439 - val_accuracy: 0.8604
Epoch 10/10
625/625 [==============================] - 78s 124ms/step - loss: 0.0551 -
accuracy: 0.9831 - val_loss: 0.4081 - val_accuracy: 0.8796

#model.save("lstm_model")
#or
model.save("lstm.h5")

new_model = tf.keras.models.load_model('lstm.h5')

results=new_model.evaluate(test_data,test_labels)
print(results)

782/782 [==============================] - 31s 39ms/step - loss: 0.4922 - accuracy: 0.8602
[0.49219557642936707, 0.8602399826049805]

results=model.evaluate(test_data,test_labels)
print(results)

782/782 [==============================] - 31s 39ms/step - loss: 0.4922 - accuracy: 0.8602
[0.49219557642936707, 0.8602399826049805]

word_index=imdb.get_word_index()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb_word_index.json
1641221/1641221 [==============================] - 1s 0us/step

for i in range(10):
    print(list(word_index.keys())[i], ':', list(word_index.values())[i])

fawn : 34701
tsukino : 52006
nunnery : 52007
sonja : 16816
vani : 63951
woods : 1408
spiders : 16115
hanging : 2345
woody : 2289
trawling : 52008

def encode_text(text):
    tokens = keras.preprocessing.text.text_to_word_sequence(text)
    tokens = [word_index[word] if word in word_index else 0 for word in tokens]
    return sequence.pad_sequences([tokens], MAXLEN)[0]

text="that movie was amazing, i have to watch it again"


encoded=encode_text(text)
print(encoded)

[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 12 17 13 477 10 25 5 103 9 171]

# Decode function that converts integers back to text

reverse_word_index={value:key for (key,value) in word_index.items()}

def decode_integers(integers):
    PAD = 0
    text = ""
    for num in integers:
        if num != PAD:
            text += reverse_word_index[num] + " "

    return text[:-1]

print(decode_integers(encoded))

that movie was amazing i have to watch it again

def predict(text):
    encoded_text = encode_text(text)
    pred = encoded_text.reshape(1, 250)  # converting the vector to a 2D batch of one review
    result = model.predict(pred)
    print(result[0])

positive_review = "That was a good movie, i will definitely watch it again"
predict(positive_review)

negative_review = "Don't waste your time watching this movie, so disappointing"
predict(negative_review)

1/1 [==============================] - 1s 938ms/step
[0.97995764]
1/1 [==============================] - 0s 50ms/step
[0.5679327]
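
The predict function prints the raw sigmoid output for each review. A small optional extension (a sketch, not part of the original program; predict_label is a hypothetical helper) that maps the score to a positive/negative label using a 0.5 threshold:

def predict_label(text, threshold=0.5):
    # Encode and reshape exactly as in predict(), then threshold the sigmoid score
    encoded_text = encode_text(text)
    pred = encoded_text.reshape(1, 250)
    score = model.predict(pred)[0][0]
    print("positive" if score >= threshold else "negative", score)

predict_label(positive_review)
predict_label(negative_review)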

# Load data from keras
path_to_file = tf.keras.utils.get_file('shakespeare.txt',
    'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt
1115394/1115394 [==============================] - 1s 1us/step

text=open(path_to_file,'rb').read().decode(encoding='utf-8')
print("Length of text : ",len(text))

Length of text : 1115394

print(text[:250])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.

vocab=sorted(set(text))

# Creating a mapping from unique characters to indices
char2idx = {u: i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

def text_to_int(text):
    return np.array([char2idx[t] for t in text])

text_as_int=text_to_int(text)

print('Text:',text[0:13])
print('Encoded:',text_to_int(text[:13]))

Text: First Citizen


Encoded: [18 47 56 57 58 1 15 47 58 47 64 43 52]

# Convert integers back to text
def int_to_text(ints):
    try:
        ints = ints.numpy()    # handle TensorFlow tensors
    except AttributeError:
        pass                   # already a NumPy array or list
    return ''.join(idx2char[ints])

print(int_to_text(text_to_int(text[:13])))

First Citizen

seq_length = 100  # length of sequence for a training example
examples_per_epoch = len(text)//(seq_length+1)

# Create training examples / targets


char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

def split_input_target(chunk):      # for the example: hello
    input_text = chunk[:-1]         # hell
    target_text = chunk[1:]         # ello
    return input_text, target_text  # hell, ello

dataset = sequences.map(split_input_target)  # we use map to apply the above function to every entry

for x, y in dataset.take(2):
    print("\n\nEXAMPLE\n")
    print("INPUT")
    print(int_to_text(x))
    print("\nOUTPUT")
    print(int_to_text(y))

EXAMPLE

INPUT
First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You

OUTPUT
irst Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You

EXAMPLE

INPUT
are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you

OUTPUT
re all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you k

BATCH_SIZE = 64
VOCAB_SIZE = len(vocab) # vocab is number of unique characters
EMBEDDING_DIM = 256
RNN_UNITS = 1024

# Buffer size to shuffle the dataset


# (TF data is designed to work with possibly infinite sequences,
# so it doesn't attempt to shuffle the entire sequence in memory. Instead,
# it maintains a buffer in which it shuffles elements).
BUFFER_SIZE = 10000

data = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.LSTM(rnn_units,
                             return_sequences=True,
                             stateful=True,
                             recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model

model = build_model(VOCAB_SIZE,EMBEDDING_DIM, RNN_UNITS, BATCH_SIZE)


model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (64, None, 256) 16640

lstm_1 (LSTM) (64, None, 1024) 5246976

dense_1 (Dense) (64, None, 65) 66625

=================================================================
Total params: 5330241 (20.33 MB)
Trainable params: 5330241 (20.33 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

for input_example_batch, target_example_batch in data.take(1):
    # ask our model for a prediction on our first batch of training data (64 entries)
    example_batch_predictions = model(input_example_batch)
    # print out the output shape
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

(64, 100, 65) # (batch_size, sequence_length, vocab_size)

# we can see that the prediction is an array of 64 arrays, one for each entry in the batch

print(example_batch_predictions.shape)
print(len(example_batch_predictions))
print(example_batch_predictions)

(64, 100, 65)


64
tf.Tensor(
[[[-3.93073249e-04 -4.13193274e-03 2.95304321e-03 ... -2.39735539e-03
-4.29750001e-03 -5.21482714e-03]
[-4.22039768e-04 -6.91745989e-03 3.74294701e-03 ... -5.37824444e-03
4.01285943e-05 -5.18289767e-03]
[ 3.14008445e-03 -2.59073591e-03 6.18810300e-03 ... -6.22856803e-03
-5.04955323e-03 -1.66755891e-03]
...
[-4.08313412e-04 4.04390274e-04 -4.97683510e-03 ... -4.46194550e-03
-2.48599751e-03 -2.99828593e-03]
[-6.51751319e-03 1.13740680e-03 5.76881459e-04 ... 1.14745833e-03
-2.71934411e-03 -5.65160112e-03]
[-5.81418304e-03 7.45741418e-05 7.74319284e-04 ... 1.95595226e-03
1.00510637e-03 -2.07067793e-03]]

[[ 2.98142736e-03 1.84627529e-03 -7.37963011e-03 ... -6.58784620e-03


-1.31064357e-04 -3.49854189e-03]
[ 4.49225260e-03 1.75412372e-03 -4.33418714e-03 ... -1.52143277e-03
5.20772440e-03 -5.24359290e-03]
[ 2.57248338e-03 -2.77510053e-03 -1.58735993e-03 ... -3.14915390e-03
-7.10702501e-04 -9.71744582e-03]
...
[ 5.04800444e-03 -6.15518913e-03 5.02665248e-03 ... -5.18665044e-03
-1.12859230e-03 3.72367864e-03]
[-4.90805786e-03 -3.12168244e-03 2.55103968e-03 ... -1.23845483e-03
7.18085561e-03 9.14988806e-04]
[ 1.73065613e-03 -3.34052043e-03 3.73464567e-03 ... 5.22589916e-03
4.09955531e-03 2.75234925e-05]]

[[ 3.07811610e-03 -4.89329745e-04 -1.33164215e-03 ... -2.27746228e-03


2.54902686e-03 -2.45156232e-03]
[ 9.74630937e-04 -1.91917177e-03 -1.51449814e-03 ... 3.52289353e-04
5.16330451e-03 -1.42873498e-03]
[-7.41399126e-05 9.48775094e-04 -1.23013637e-03 ... 5.67079429e-03
6.52022706e-03 -1.46835577e-03]
...
[-5.20169782e-03 6.67268643e-03 -9.16008651e-03 ... -1.19209150e-02
-1.23572641e-03 -4.01304569e-03]
[-1.01522561e-02 8.48373212e-03 -3.08945356e-03 ... -4.64959303e-03
-1.61087350e-03 -4.04672651e-03]
[-4.93974984e-03 9.79977846e-03 -9.37165972e-03 ... -1.01580238e-02
-4.90848965e-04 -6.86038006e-03]]

...

[[-5.06412331e-03 1.89589732e-03 2.89738621e-03 ... 3.95299867e-03


-1.01080327e-03 -1.29630999e-03]
[-1.09053566e-04 1.37019227e-03 7.33342662e-04 ... 2.68259668e-04
2.30084686e-03 -3.97849921e-03]
[-1.14224455e-03 -1.34483678e-04 -1.83367403e-04 ... 1.89813343e-03
5.34742558e-03 -3.02217295e-03]
...
[-1.48321060e-03 -1.13428582e-03 -2.41922238e-03 ... 9.48187429e-04
-7.84166041e-05 -1.85853895e-03]
[-6.86807185e-03 3.76374170e-04 2.32790876e-03 ... 5.83458412e-03
-8.01210059e-04 -2.36657518e-03]
[-5.50743425e-03 -4.03277948e-03 5.60935633e-03 ... 2.74476269e-03
-4.36123833e-03 -7.27833062e-03]]

[[-3.74341710e-03 3.78564978e-03 -6.55010575e-03 ... -4.16637585e-03


-2.78695137e-04 -3.71830305e-04]
[ 3.56418989e-03 3.84594593e-03 -1.87180401e-03 ... 2.75947712e-03
-2.73818383e-03 -1.48385111e-03]
[-9.77663556e-04 4.65825759e-03 4.42212168e-03 ... 3.78555525e-03
5.40595734e-04 -7.57254753e-03]
...
[-4.55768965e-03 -7.16793723e-03 5.00304997e-03 ... -7.03381235e-03
5.62174246e-04 -2.88552046e-03]
[ 1.72426400e-04 -1.37326727e-03 -1.20883517e-04 ... -7.60819251e-03
-1.31173711e-03 -4.44826018e-03]
[-3.99947213e-03 2.49462062e-03 -5.07832458e-03 ... -9.85019282e-03
-1.47505908e-03 -2.22451566e-03]]

[[ 2.66172946e-03 2.22246279e-04 1.83949992e-03 ... 3.09980777e-03


5.65271359e-03 -1.73419621e-03]
[ 4.92220558e-03 -3.99032608e-04 -6.84230006e-04 ... 2.86078430e-04
6.53239898e-03 -3.69627075e-03]
[ 2.48080585e-03 -4.65725362e-03 2.44203676e-03 ... -1.08791026e-03
2.18785251e-04 -8.17750301e-03]
...
[-1.21543221e-02 -4.54739854e-03 5.88502036e-04 ... 1.21296803e-03
4.90353536e-03 -7.66825397e-03]
[-1.55522004e-02 -2.53014686e-03 4.17143898e-03 ... 5.40314289e-03
2.56778859e-03 -7.31407665e-03]
[-1.09163178e-02 -4.85545956e-03 7.86961406e-04 ... 1.60261523e-03
5.13016060e-03 -5.22394851e-03]]], shape=(64, 100, 65), dtype=float32)

# let's examine one prediction
pred = example_batch_predictions[0]
print(len(pred))
print(pred)
# notice this is a 2d array of length 100, where each interior array is the prediction for the next character at each time step

100
tf.Tensor(
[[-3.9307325e-04 -4.1319327e-03 2.9530432e-03 ... -2.3973554e-03
-4.2975000e-03 -5.2148271e-03]
[-4.2203977e-04 -6.9174599e-03 3.7429470e-03 ... -5.3782444e-03
4.0128594e-05 -5.1828977e-03]
[ 3.1400844e-03 -2.5907359e-03 6.1881030e-03 ... -6.2285680e-03
-5.0495532e-03 -1.6675589e-03]
...
[-4.0831341e-04 4.0439027e-04 -4.9768351e-03 ... -4.4619455e-03
-2.4859975e-03 -2.9982859e-03]
[-6.5175132e-03 1.1374068e-03 5.7688146e-04 ... 1.1474583e-03
-2.7193441e-03 -5.6516011e-03]
[-5.8141830e-03 7.4574142e-05 7.7431928e-04 ... 1.9559523e-03
1.0051064e-03 -2.0706779e-03]], shape=(100, 65), dtype=float32)

# and finally we'll look at a prediction at the first timestep
time_pred = pred[0]
print(len(time_pred))
print(time_pred)
# and of course it's 65 values representing the probability of each character occurring next

65
tf.Tensor(
[-3.9307325e-04 -4.1319327e-03 2.9530432e-03 1.2779376e-02
-4.8698825e-03 -1.8498915e-03 -4.5865178e-03 8.7094121e-04
1.9650310e-03 3.2511496e-03 2.1952731e-03 6.6440525e-03
1.4319521e-03 -3.5579172e-03 2.2880444e-03 -7.4413568e-03
1.8639711e-03 8.5085770e-04 -2.9051816e-04 -4.6098186e-03
3.8397252e-03 -2.1187393e-03 4.5483760e-04 -1.6458960e-03
5.4401148e-04 -7.4393884e-04 9.8232669e-04 -4.9993750e-03
-1.7126356e-03 1.4183960e-03 5.5882139e-03 -1.2871707e-03
6.0840254e-03 -1.6565667e-03 -6.6662161e-03 -6.0936613e-03
-9.8395627e-03 -3.9169355e-04 1.4780747e-03 -6.7412155e-05
8.8197575e-04 8.5747265e-04 2.0879199e-04 3.1654395e-03
-5.0588325e-04 -2.3278731e-03 5.4742588e-04 -1.7510151e-03
8.2861143e-04 -4.3253875e-03 2.2218318e-03 -2.9482259e-03
-4.2824708e-03 -2.8045098e-03 1.5042936e-03 1.0843073e-03
-4.4986671e-03 4.1392604e-03 2.5661956e-03 6.6752401e-03
-7.3688838e-04 1.7982540e-03 -2.3973554e-03 -4.2975000e-03
-5.2148271e-03], shape=(65,), dtype=float32)

# If we want to determine the predicted character we need to sample the output distribution (pick a value based on probability)
sampled_indices = tf.random.categorical(pred, num_samples=1)

# now we can reshape that array and convert the integers back to characters to see the actual text
sampled_indices = np.reshape(sampled_indices, (1, -1))[0]
predicted_chars = int_to_text(sampled_indices)

predicted_chars  # and this is what the model predicted for training sequence 1

"DzvJO!VWlRmk!RaLU'uG&k?GZh?KWI-,RRUIPQbyeWKTn-\nxA!ubU:dr:pKuwqT'qj?\nV
vfG;rjwZRJ&qMT&d WJlohxib-MlpT"

def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits,
                                                           from_logits=True)

model.compile(optimizer='adam', loss=loss)

# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

# More epochs will give better results; there is no overfitting here, so in practice use 50 or more
history = model.fit(data, epochs=2, callbacks=[checkpoint_callback])

Epoch 1/2
172/172 [==============================] - 1253s 7s/step - loss: 2.4090
Epoch 2/2
172/172 [==============================] - 1465s 8s/step - loss: 1.9573
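
Step 7.5 of the algorithm notes that checkpoints allow training to be resumed. A brief sketch of how that could look, assuming the checkpoint_dir, build_model, loss, data and checkpoint_callback defined above (resumed_model is a hypothetical name, not part of the original program):

# Rebuild the training model (batch_size = 64) and restore the latest saved weights
resumed_model = build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, BATCH_SIZE)
resumed_model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
resumed_model.compile(optimizer='adam', loss=loss)

# Continue training from where the previous run stopped, saving new checkpoints as we go
resumed_model.fit(data, epochs=2, callbacks=[checkpoint_callback])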

model = build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, batch_size=1)

model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))

#checkpoint_num = 10
#model.load_weights("./training_checkpoints/ckpt_" + str(checkpoint_num))
#model.build(tf.TensorShape([1, None]))

def generate_text(model, start_string):
    # Evaluation step (generating text using the learned model)

    # Number of characters to generate
    num_generate = 800

    # Converting our start string to numbers (vectorizing)
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)

    # Empty string to store our results
    text_generated = []

    # Low temperatures result in more predictable text.
    # Higher temperatures result in more surprising text.
    # Experiment to find the best setting.
    temperature = 1.0

    # Here batch size == 1
    model.reset_states()
    for i in range(num_generate):
        predictions = model(input_eval)
        # remove the batch dimension
        predictions = tf.squeeze(predictions, 0)

        # using a categorical distribution to predict the character returned by the model
        predictions = predictions / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()

        # We pass the predicted character as the next input to the model
        # along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)

        text_generated.append(idx2char[predicted_id])

    return (start_string + ''.join(text_generated))

#inp=input('Type starting string')
inp = "Romeo said"
print(generate_text(model, inp))

Romeo said:
Well hen twen 'et to lay hath now:--pake treen's.
VICWARA:
Whal, choobed me thy doss a fandire:
Uther claceon come menine of Sanclasain:
To, thes windist or to your elve
lith; I heer him. For she for my sont. Theroris; withard be sunter.
Who have within the Hanow call.

ASINIUS:
Yet, grisico, be lide mandy;
I leverchedsh, pontery he esser by the prope
pros a hand;
O, not alliald gow' thwer propenty a evero,
Or for clut the king ol Clirpakes me intrity.

PETEUNI:
Firly prection of that she mode,
And the reprect'd me
no he stieps of thather on my him ago stand o Thise here th

KENG HICiRINA:
Ickine; if you werl we the mad our adl,
Teth hemall plise be be not be hould on your.

ASHERG:
Grime your flain chol thee.

First Seast
Mancourter not me the king butrer as-us; on offor it thee!
On je
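
As the comments in generate_text note, the temperature setting controls how predictable the generated text is. A short optional experiment (a sketch, not part of the original program; generate_text_t is a hypothetical variant of generate_text with temperature as a parameter):

def generate_text_t(model, start_string, temperature, num_generate=200):
    # Same loop as generate_text above, but with temperature passed in
    input_eval = tf.expand_dims([char2idx[s] for s in start_string], 0)
    text_generated = []
    model.reset_states()
    for i in range(num_generate):
        predictions = tf.squeeze(model(input_eval), 0) / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
        input_eval = tf.expand_dims([predicted_id], 0)
        text_generated.append(idx2char[predicted_id])
    return start_string + ''.join(text_generated)

# Lower temperature -> more predictable text; higher temperature -> more surprising text
print(generate_text_t(model, "Romeo said", 0.5))
print(generate_text_t(model, "Romeo said", 1.5))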
