Recurrent Neural Networks (RNNs)
and the Exploding & Vanishing Gradient Problems
Dang Huu Phat 22521065
Phan Nguyen Huu Phong 22521090
Chau The Vi 22521653
University of Information Technology
November 2, 2024
References
The contents of this document are taken mainly from the following sources:
Robin M. Schmidt, "Recurrent Neural Networks (RNNs): A Gentle Introduction and Overview", https://arxiv.org/pdf/1912.05911.pdf
Razvan Pascanu, Tomas Mikolov, Yoshua Bengio, "On the difficulty of training recurrent neural networks", https://arxiv.org/pdf/1211.5063.pdf
Yoshua Bengio, Patrice Simard, Paolo Frasconi, "Learning long-term dependencies with gradient descent is difficult", https://www.researchgate.net/profile/Y-Bengio/publication/5583935_Learning_long-term_dependencies_with_gradient_descent_is_difficult/links/546b702d0cf2397f7831c03d/Learning-long-term-dependencies-with-gradient-descent-is-difficult.pdf
Table of Contents
1 Introduction & Notation
2 Training Recurrent Neural Networks
3 Exploding and Vanishing Gradient
Sequential Data
Sequential data is more commonly known as sequence data or sequences.
What makes it unique? Elements in a sequence appear in a certain order and are not independent of each other.
Example: Stock Price Prediction
In this example, the input data is the historical stock prices: $i_1$ is the stock price on January 1st, $i_2$ is the stock price on January 2nd, and so on. $(i_1, i_2, \dots, i_{30})$ is called a sequence. The RNN learns from this input and predicts the stock price in the future.
Figure: Example of Stock Price Prediction
Sequential data versus time series data
Time series data is a special type of sequential data where each
example is associated with a dimension for time.
In time series data, samples are taken at successive timestamps, and
therefore, the time dimension determines the order among the data
points. For example, stock prices and voice or speech records are time
series data.
In contrast, not all sequential data shares this temporal structure. Text
data or DNA sequences, for instance, exhibit ordered patterns, but lack
a distinct time dimension.
Sequential data versus time series data
An example is machine translation: the input is the sentence "I love CS115", and the order of the words significantly impacts the meaning of the sentence.
The input data, which consists of the sequence of words ['I', 'love', 'CS115'], is referred to as a sequence.
In natural language processing (NLP), it is not feasible to process entire
sentences as inputs. Similar to how videos are processed by breaking
them down into individual frames, in NLP, sentences are broken down
into individual words for input processing.
Representing sequences
The movie my friend has not seen is good
The movie my friend has seen is not good
Each input $x^{<t>}$ is paired with an output $y^{<t>}$, and the pairs are ordered along the time axis.
Figure: An example of time series data
What is Recurrent Neural Networks (RNNs)?
Recurrent Neural Networks (RNNs) are a type of neural network architecture used to detect patterns in sequences.
They can also be extended to images by breaking them down into a series of patches and treating them as a sequence.
What differentiates Recurrent Neural Networks from Feedforward Neural Networks, also known as Multi-Layer Perceptrons (MLPs), is how information gets passed through the network.
What is Recurrent Neural Networks (RNNs)?
Multilayer perceptrons (MLPs), and CNNs for image data, assume that the training examples are independent of each other and thus do not incorporate ordering information.
We can say that such models do not have a memory of previously seen
training examples.
RNNs, by contrast, are designed for modeling sequences and are capable
of remembering past information and processing new events accordingly,
which is a clear advantage when working with sequence data.
Pros and Cons
Benefits
Possibility of processing input of any length;
Model size not increasing with size of input;
Computation takes into account historical information;
Weights are shared across time
Costs
Computation being slow;
Difficulty of accessing information from a long time ago;
Cannot consider any future input for the current state;
Algorithm 2: Evaluate the learned policy on an unlabeled test graph
Input: the unlabeled test graph G, the learned policy π_θ, the query budget B
Result: the classification GNN f trained on the selected node sequence τ
τ = []; V_label^0 ← ∅;
Randomly initialize the classification GNN as f_{G, V_label^0};
for t = 1 to B do
    Generate the graph state S_G^t based on f_{G, V_label^(t-1)} and G;
    Compute the probability distribution over candidate nodes as p_G^t = π_θ(S_G^t);
    Select v_t = arg max_v p_G^t(v) from V_train \ V_label^(t-1) and query for its label;
    V_label^t ← V_label^(t-1) ∪ {v_t};
    Train the classification GNN for one more epoch to get f_{G, V_label^t};
    τ.append(v_t);
end
The different categories of sequence modeling
Many to One RNN
Many-to-One is used when a single output is required from
multiple input units or a sequence of them. It takes a sequence of
inputs to display a fixed output.
Sentiment Analysis is a common example of this type of Recurrent
Neural Network.
Figure: Many to One model
The different categories of sequence modeling
Many to Many RNN
Many-to-Many is used to generate a sequence of output data from
a sequence of input units.
This type of RNN is further divided into the following two
subcategories:
1 Equal Unit Size: In this case, the number of both the input and output units is the same. A common application can be found in Named-Entity Recognition.
2 Unequal Unit Size: In this case, inputs and outputs have different numbers of units. Its application can be found in Machine Translation.
Understanding the dataflow in RNNs
In a standard feedforward network, information flows from the input to
the hidden layer, and then from the hidden layer to the output layer.
On the other hand, in an RNN, the hidden layer receives its input from
both the input layer of the current time step and the hidden layer from
the previous time step.
Figure: The dataflow of a standard feedforward NN and an RNN
Understanding the dataflow in RNNs
Figure: Examples of an RNN with one and two hidden layers, unfolded over time. The single-layer RNN connects inputs $x^{<t>}$, hidden states $h^{<t>}$, and outputs $y^{<t>}$; the multilayer RNN stacks hidden layers $h_1^{<t>}$ and $h_2^{<t>}$ between the input and the output.
Table of Contents
1 Introduction & Notation
2 Training Recurrent Neural Networks
3 Exploding and Vanishing Gradient
Weight matrices in a single-hidden layer RNN
Figure: The single-hidden-layer RNN unfolded over time, with weight matrices $W_{hx}$ (input to hidden), $W_{hh}$ (hidden to hidden), and $W_{yh}$ (hidden to output). The hidden-layer weights can be concatenated as $W_h = [W_{hh}; W_{hx}]$.
Weight matrices in a single-hidden layer RNN
Using the same unfolded network and weight matrices as above:
Net input: $z_h^{(t)} = W_{hx}\, x^{(t)} + W_{hh}\, h^{(t-1)} + b_h$
Activation: $h^{(t)} = \phi_h\!\left(z_h^{(t)}\right)$
Weight matrices in a single-hidden layer RNN
Adding the output layer:
Net input: $z_h^{(t)} = W_{hx}\, x^{(t)} + W_{hh}\, h^{(t-1)} + b_h$
Activation: $h^{(t)} = \phi_h\!\left(z_h^{(t)}\right)$
Net input: $z_y^{(t)} = W_{yh}\, h^{(t)} + b_y$
Output: $y^{(t)} = \phi_y\!\left(z_y^{(t)}\right)$
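To make the forward pass concrete, the following is a minimal NumPy sketch of the equations above. The dimensions, the choice of tanh for $\phi_h$ and the identity for $\phi_y$, and all variable names are illustrative assumptions rather than anything fixed by the slides.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the slides).
n_x, n_h, n_y, T = 3, 4, 2, 5

# Weight matrices and biases, following the slide notation.
W_hx = rng.normal(0, 0.5, (n_h, n_x))   # input  -> hidden
W_hh = rng.normal(0, 0.5, (n_h, n_h))   # hidden -> hidden
W_yh = rng.normal(0, 0.5, (n_y, n_h))   # hidden -> output
b_h = np.zeros(n_h)
b_y = np.zeros(n_y)

def rnn_forward(x_seq, h0=None):
    """Run the single-hidden-layer RNN over a sequence of shape (T, n_x).

    phi_h is tanh and phi_y is the identity (assumptions made for simplicity).
    """
    h = np.zeros(n_h) if h0 is None else h0
    hs, ys = [], []
    for x_t in x_seq:
        z_h = W_hx @ x_t + W_hh @ h + b_h   # net input of the hidden layer
        h = np.tanh(z_h)                    # activation h^(t) = phi_h(z_h^(t))
        z_y = W_yh @ h + b_y                # net input of the output layer
        y = z_y                             # output y^(t) = phi_y(z_y^(t)), identity here
        hs.append(h)
        ys.append(y)
    return np.array(hs), np.array(ys)

x_seq = rng.normal(size=(T, n_x))
h_seq, y_seq = rnn_forward(x_seq)
print(h_seq.shape, y_seq.shape)   # (5, 4) (5, 2)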
Gradient-based Learning Methods
Gradient descent (GD) adjusts the weights of the model by finding the
error function derivatives with respect to each member of the weight
matrices.
Batch Gradient Descent
Computing the gradient for the whole dataset in each optimization iteration to perform a single update as
$$\theta_{t+1} = \theta_t - \frac{\lambda}{U} \sum_{k=1}^{U} \frac{\partial \ell_k}{\partial \theta}$$
However, computing error derivatives through time is difficult. This is mostly due to the relationship among the parameters and the dynamics of the RNN, which is highly unstable and makes GD ineffective.
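As a small illustration of the batch update above, the sketch below averages the gradients of U per-example losses before taking a single step. The quadratic per-example loss and the learning rate are arbitrary assumptions chosen only to make the update visible.

import numpy as np

rng = np.random.default_rng(1)

# Toy setup: U examples, each contributing a squared-error loss
# l_k(theta) = 0.5 * (theta - c_k)^2, so dl_k/dtheta = theta - c_k.
U = 8
c = rng.normal(size=U)
theta = 0.0
lam = 0.1   # learning rate (lambda in the slide)

for step in range(100):
    grads = theta - c                        # dl_k/dtheta for every example k
    theta = theta - lam / U * grads.sum()    # theta_{t+1} = theta_t - (lambda/U) * sum_k dl_k/dtheta

print(theta, c.mean())   # the parameter converges towards the mean of the c_k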
Back-propagation Through Time (BPTT)
Hidden variable
$$h^{(t)} = \phi_h\!\left(x^{(t)} \cdot W_{xh} + h^{(t-1)} \cdot W_{hh} + b_h\right)$$
Output variable
$$y^{(t)} = \phi_y\!\left(h^{(t)} \cdot W_{yh} + b_y\right)$$
We can define a loss function $L(y, o)$ to describe the difference between all outputs $y^{(t)}$ and the target values $o^{(t)}$ as shown in the equation below.
Loss Function
$$L(y, o) = \sum_{t=1}^{T} \ell^{(t)}\!\left(y^{(t)}, o^{(t)}\right)$$
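A minimal sketch of the summed loss, assuming a squared-error per-step loss $\ell^{(t)}$; the arrays simply stand in for RNN outputs $y^{(t)}$ and targets $o^{(t)}$.

import numpy as np

def sequence_loss(y_seq, o_seq):
    """L(y, o) = sum_t l^(t)(y^(t), o^(t)) with a squared-error l^(t) (an assumption)."""
    per_step = 0.5 * np.sum((y_seq - o_seq) ** 2, axis=1)   # l^(t) for each time step
    return per_step.sum()

y_seq = np.array([[0.2, 0.1], [0.4, -0.3], [0.0, 0.5]])   # outputs for T = 3 steps
o_seq = np.array([[0.0, 0.0], [0.5, -0.5], [0.1, 0.4]])   # targets
print(sequence_loss(y_seq, o_seq))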
Back-propagation Through Time (BPTT)
Since we have three weight matrices $W_{xh}$, $W_{hh}$, and $W_{yh}$, we need to compute the partial derivative of the loss with respect to each of them using the chain rule.
Partial derivative with respect to $W_{yh}$
$$\frac{\partial L}{\partial W_{yh}} = \sum_{t=1}^{T} \frac{\partial \ell^{(t)}}{\partial y^{(t)}} \cdot \frac{\partial y^{(t)}}{\partial z_y^{(t)}} \cdot \frac{\partial z_y^{(t)}}{\partial W_{yh}} = \sum_{t=1}^{T} \frac{\partial \ell^{(t)}}{\partial y^{(t)}} \cdot \frac{\partial y^{(t)}}{\partial z_y^{(t)}} \cdot h^{(t)}$$
Partial derivative with respect to $W_{hh}$
$$\frac{\partial L}{\partial W_{hh}} = \sum_{t=1}^{T} \frac{\partial \ell^{(t)}}{\partial y^{(t)}} \cdot \frac{\partial y^{(t)}}{\partial z_y^{(t)}} \cdot \frac{\partial z_y^{(t)}}{\partial h^{(t)}} \cdot \frac{\partial h^{(t)}}{\partial W_{hh}} = \sum_{t=1}^{T} \frac{\partial \ell^{(t)}}{\partial y^{(t)}} \cdot \frac{\partial y^{(t)}}{\partial z_y^{(t)}} \cdot W_{yh} \cdot \frac{\partial h^{(t)}}{\partial W_{hh}}$$
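To sanity-check the formula for $\partial L / \partial W_{yh}$, the sketch below assumes a squared-error loss and an identity output activation (so the factor $\frac{\partial \ell^{(t)}}{\partial y^{(t)}} \cdot \frac{\partial y^{(t)}}{\partial z_y^{(t)}}$ reduces to $y^{(t)} - o^{(t)}$), sums the resulting outer products with $h^{(t)}$, and compares one entry against a finite-difference estimate. All sizes and values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(2)
n_x, n_h, n_y, T = 3, 4, 2, 6
W_hx = rng.normal(0, 0.4, (n_h, n_x))
W_hh = rng.normal(0, 0.4, (n_h, n_h))
W_yh = rng.normal(0, 0.4, (n_y, n_h))
x_seq = rng.normal(size=(T, n_x))
o_seq = rng.normal(size=(T, n_y))

def forward(W_yh_):
    """Forward pass (tanh hidden units, identity outputs) returning hidden states, outputs, loss."""
    h = np.zeros(n_h)
    hs, ys = [], []
    for x_t in x_seq:
        h = np.tanh(W_hx @ x_t + W_hh @ h)
        hs.append(h)
        ys.append(W_yh_ @ h)
    ys = np.array(ys)
    loss = 0.5 * np.sum((ys - o_seq) ** 2)   # L = sum_t l^(t)
    return np.array(hs), ys, loss

# Analytic gradient: dL/dW_yh = sum_t (dl/dy * dy/dz_y) outer h^(t);
# with squared error and identity phi_y the bracket is simply y^(t) - o^(t).
hs, ys, _ = forward(W_yh)
grad = sum(np.outer(ys[t] - o_seq[t], hs[t]) for t in range(T))

# Finite-difference check of one entry.
eps = 1e-6
W_pert = W_yh.copy(); W_pert[0, 0] += eps
num = (forward(W_pert)[2] - forward(W_yh)[2]) / eps
print(grad[0, 0], num)   # the two numbers should agree closely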
Back-propagation through time (BPTT)
Figure: a sequence of slides expands the gradient on the unfolded computation graph (inputs $x^{(t)}$, hidden states $h^{(t)}$, outputs $y^{(t)}$, and the shared weights $W_{xh}$, $W_{hh}$, $W_{yh}$). Because $h^{(t)}$ depends on $h^{(t-1)}$, which in turn depends on $W_{hh}$, the derivative $\partial h^{(t)} / \partial W_{hh}$ is expanded step by step into a sum with one term per earlier time step, leading to the formulas on the next slide.
Back-propagation through time (BPTT)
Since each $h^{(t)}$ depends on the previous time step, we can substitute the last part of the above equation.
Partial derivative with respect to $W_{hh}$
$$\frac{\partial L}{\partial W_{hh}} = \sum_{t=1}^{T} \frac{\partial \ell^{(t)}}{\partial y^{(t)}} \cdot \frac{\partial y^{(t)}}{\partial z_y^{(t)}} \cdot W_{yh} \cdot \sum_{k=1}^{t} \frac{\partial h^{(t)}}{\partial h^{(k)}} \cdot \frac{\partial h^{(k)}}{\partial W_{hh}}$$
Similarly, we have the partial derivative with respect to $W_{xh}$ as below
Partial derivative with respect to $W_{xh}$
$$\frac{\partial L}{\partial W_{xh}} = \sum_{t=1}^{T} \frac{\partial \ell^{(t)}}{\partial y^{(t)}} \cdot \frac{\partial y^{(t)}}{\partial z_y^{(t)}} \cdot W_{yh} \cdot \sum_{k=1}^{t} \frac{\partial h^{(t)}}{\partial h^{(k)}} \cdot \frac{\partial h^{(k)}}{\partial W_{xh}}$$
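The double sum above can be evaluated directly. The sketch below keeps the same squared-error/identity-output assumptions as before, accumulates for every pair (t, k) the product of Jacobians between $h^{(k)}$ and $h^{(t)}$ together with the immediate derivative $\partial h^{(k)} / \partial W_{hh}$, and checks one entry against a finite difference. Because the code uses column vectors and $W_{hh} h$, each Jacobian factor appears as $\mathrm{diag}(1 - h^2)\, W_{hh}$, the column-vector counterpart of $W_{hh}^T \mathrm{diag}(\phi'_h)$ in the slides.

import numpy as np

rng = np.random.default_rng(3)
n_x, n_h, n_y, T = 3, 4, 2, 6
W_hx = rng.normal(0, 0.4, (n_h, n_x))
W_hh = rng.normal(0, 0.4, (n_h, n_h))
W_yh = rng.normal(0, 0.4, (n_y, n_h))
x_seq = rng.normal(size=(T, n_x))
o_seq = rng.normal(size=(T, n_y))

def forward(W_hh_):
    h = np.zeros(n_h)
    hs = [h]                                  # hs[k] = h^(k), with h^(0) = 0
    for x_t in x_seq:
        h = np.tanh(W_hx @ x_t + W_hh_ @ h)
        hs.append(h)
    ys = np.array([W_yh @ h for h in hs[1:]])
    loss = 0.5 * np.sum((ys - o_seq) ** 2)
    return hs, ys, loss

hs, ys, _ = forward(W_hh)
grad = np.zeros_like(W_hh)
for t in range(1, T + 1):
    delta_y = ys[t - 1] - o_seq[t - 1]        # dl^(t)/dy^(t) with identity phi_y
    r = delta_y @ W_yh                        # error propagated into h^(t)
    for k in range(t, 0, -1):
        # Here r = (dl^(t)/dy^(t)) W_yh * prod_{i=k+1}^{t} dh^(i)/dh^(i-1).
        grad += np.outer(r * (1.0 - hs[k] ** 2), hs[k - 1])   # immediate dh^(k)/dW_hh term
        r = (r * (1.0 - hs[k] ** 2)) @ W_hh                   # multiply in one more Jacobian

# Finite-difference check of one entry.
eps = 1e-6
W_pert = W_hh.copy(); W_pert[1, 2] += eps
num = (forward(W_pert)[2] - forward(W_hh)[2]) / eps
print(grad[1, 2], num)   # should agree closely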
Back-propagation through time (BPTT)
In order to transport the error through time from time step t back to time step k, we can write
$$\frac{\partial h^{(t)}}{\partial h^{(k)}} = \prod_{i=k+1}^{t} \frac{\partial h^{(i)}}{\partial h^{(i-1)}}$$
We can consider the previous equation as a Jacobian matrix for the hidden state parameter as
$$\prod_{i=k+1}^{t} \frac{\partial h^{(i)}}{\partial h^{(i-1)}} = \prod_{i=k+1}^{t} W_{hh}^{T}\, \mathrm{diag}\!\left(\phi'_h\!\left(z_h^{(i)}\right)\right)$$
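The sketch below builds $\partial h^{(t)} / \partial h^{(k)}$ as the product of per-step Jacobians and compares it with a finite-difference Jacobian obtained by perturbing $h^{(k)}$ and rerunning the recurrence. The sizes, the choice t = 6 and k = 2, and the column-vector form $\mathrm{diag}(\phi'_h)\, W_{hh}$ are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(4)
n_x, n_h, T, k = 3, 4, 6, 2     # compare dh^(T)/dh^(k) for k = 2 (illustrative choice)
W_hx = rng.normal(0, 0.4, (n_h, n_x))
W_hh = rng.normal(0, 0.4, (n_h, n_h))
x_seq = rng.normal(size=(T, n_x))

def roll_forward(h_k, start):
    """Run the recurrence from h^(start) = h_k up to h^(T), recording every state."""
    h, hs = h_k, {start: h_k}
    for i in range(start + 1, T + 1):
        h = np.tanh(W_hx @ x_seq[i - 1] + W_hh @ h)
        hs[i] = h
    return hs

# Forward pass from h^(0) = 0 to obtain the states along the trajectory.
hs = roll_forward(np.zeros(n_h), 0)

# Product of per-step Jacobians: dh^(i)/dh^(i-1) = diag(1 - h^(i)^2) @ W_hh
# (the column-vector counterpart of W_hh^T diag(phi') in the slide).
J = np.eye(n_h)
for i in range(k + 1, T + 1):
    J = np.diag(1.0 - hs[i] ** 2) @ W_hh @ J

# Finite-difference Jacobian: perturb each component of h^(k) and rerun the tail.
eps = 1e-6
J_num = np.zeros((n_h, n_h))
for j in range(n_h):
    h_pert = hs[k].copy(); h_pert[j] += eps
    J_num[:, j] = (roll_forward(h_pert, k)[T] - hs[T]) / eps

print(np.max(np.abs(J - J_num)))   # should be tiny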
Back-propagation through time (BPTT)
Assume the activation function for h is tanh.
$$\frac{\partial \tanh(x)}{\partial x} = 1 - \tanh(x)^2$$
Thus:
$$\frac{\partial h_j^{(k+1)}}{\partial h_i^{(k)}} = w_{i,j}\left(1 - \left[h_j^{(k+1)}\right]^2\right)$$
Back-propagation through time (BPTT)
$$\frac{\partial h^{(k+1)}}{\partial h^{(k)}} =
\begin{pmatrix}
w_{1,1}\left(1-[h_1^{(k+1)}]^2\right) & w_{2,1}\left(1-[h_1^{(k+1)}]^2\right) & \cdots \\
w_{1,2}\left(1-[h_2^{(k+1)}]^2\right) & w_{2,2}\left(1-[h_2^{(k+1)}]^2\right) & \cdots \\
\vdots & \vdots & \ddots
\end{pmatrix}$$
This is just $W^T$ with the $j$-th row multiplied by $\left(1 - [h_j^{(k+1)}]^2\right)$, or:
$$\frac{\partial h^{(k+1)}}{\partial h^{(k)}} = \mathrm{diag}\!\left(1 - \left[h^{(k+1)}\right]^2\right) \cdot W^T
= \begin{pmatrix}
1-[h_1^{(k+1)}]^2 & 0 & \cdots & 0\\
0 & 1-[h_2^{(k+1)}]^2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots
\end{pmatrix}
\cdot
\begin{pmatrix}
W_{11} & W_{21} & W_{31} & \cdots\\
W_{12} & W_{22} & \cdots & \cdots\\
\vdots & \vdots & \ddots & \ddots
\end{pmatrix}$$
Back-propagation through time (BPTT)
This is very problematic: Vanishing/Exploding gradient problem!
Table of Contents
1 Introduction & Notation
2 Training Recurrent Neural Networks
3 Exploding and Vanishing Gradient
Difficulty of training RNNs
$$\left\|\frac{\partial h^{(i)}}{\partial h^{(i-1)}}\right\| = \left\|W_{hh}^{T}\,\mathrm{diag}\!\left(\phi'_h\!\left(z_h^{(i)}\right)\right)\right\| \le \left\|W_{hh}^{T}\right\|\,\left\|\mathrm{diag}\!\left(\phi'_h\!\left(z_h^{(i)}\right)\right)\right\| \le \gamma\,\lambda$$
where $\lambda$ is an upper bound on $\left\|W_{hh}^{T}\right\|$ (its largest singular value) and $\gamma$ bounds the derivative of the activation function:
$$\left|\phi'_h\right| \le \gamma = \tfrac{1}{4} \quad \text{if } \phi_h \text{ is the sigmoid}, \qquad \left|\phi'_h\right| \le \gamma = 1 \quad \text{if } \phi_h \text{ is tanh}.$$
Prove the derivative of the sigmoid function
$$f(x) = \frac{1}{1+e^{-x}}$$
$$f'(x) = \frac{d}{dx}\left(\frac{1}{1+e^{-x}}\right) = \frac{-1}{\left(1+e^{-x}\right)^2}\cdot\left(-e^{-x}\right) = \frac{e^{-x}}{\left(1+e^{-x}\right)^2} = \frac{1+e^{-x}-1}{\left(1+e^{-x}\right)^2}$$
$$= \frac{1+e^{-x}}{\left(1+e^{-x}\right)^2} - \frac{1}{\left(1+e^{-x}\right)^2} = \frac{1}{1+e^{-x}} - \left(\frac{1}{1+e^{-x}}\right)^2 = f(x) - f(x)^2 = f(x)\left(1-f(x)\right)$$
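A quick numerical check of the identity $f'(x) = f(x)(1 - f(x))$, using central differences at a few arbitrary points; it also confirms that the slope never exceeds 1/4.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-6, 6, 7)
eps = 1e-5
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)   # central difference
analytic = sigmoid(x) * (1.0 - sigmoid(x))                    # f(x)(1 - f(x))
print(np.max(np.abs(numeric - analytic)))   # tiny: the identity holds numerically
print(analytic.max())                       # maximum slope 0.25, attained at x = 0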
Gradient of Sigmoid Function
Figure: Sigmoid function
$$\sigma'(x) = \frac{d}{dx}\,\sigma(x) = \text{slope of } \sigma(x) \text{ at } x = \frac{1}{1+e^{-x}}\left(1 - \frac{1}{1+e^{-x}}\right)$$
Derivative of $\sigma(x)$ with respect to $x$
Gradient of Sigmoid Function
$x = 10 \;\Rightarrow\; \sigma(x) \approx 1$: $\quad \frac{d}{dx}\sigma(x) \approx 1\,(1-1) = 0$
$x = -10 \;\Rightarrow\; \sigma(x) \approx 0$: $\quad \frac{d}{dx}\sigma(x) \approx 0\,(1-0) = 0$
$x = 0 \;\Rightarrow\; \sigma(x) = \tfrac{1}{2}$: $\quad \frac{d}{dx}\sigma(x) = \tfrac{1}{2}\left(1-\tfrac{1}{2}\right) = \tfrac{1}{4}$
Difficulty of training RNNs
$$\left\|\frac{\partial h^{(t)}}{\partial h^{(k)}}\right\| = \left\|\prod_{i=k+1}^{t}\frac{\partial h^{(i)}}{\partial h^{(i-1)}}\right\| \le \prod_{i=k+1}^{t}\gamma\lambda = (\gamma\lambda)^{t-k}$$
1 If $\gamma\lambda > 1$, then the gradient will explode.
2 If $\gamma\lambda < 1$, then the gradient will vanish.
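The sketch below illustrates the bound. A random $W_{hh}$ is rescaled so that its largest singular value equals a chosen $\lambda$, $\gamma = 1$ is used for tanh, and the norm of the Jacobian product is tracked over $t - k = 30$ steps for $\lambda$ above and below 1. All numbers are illustrative; for $\gamma\lambda < 1$ the bound forces the product to vanish, while for $\gamma\lambda > 1$ explosion is merely possible.

import numpy as np

rng = np.random.default_rng(5)
n_h, span = 4, 30        # span = t - k
gamma = 1.0              # bound on |tanh'|; it would be 1/4 for the sigmoid

for lam in (1.5, 0.5):   # largest singular value of W_hh above / below 1/gamma
    W = rng.normal(size=(n_h, n_h))
    W *= lam / np.linalg.svd(W, compute_uv=False)[0]   # rescale so the spectral norm is lam
    h = np.zeros(n_h)
    J = np.eye(n_h)
    for i in range(span):
        h = np.tanh(W @ h + rng.normal(scale=0.1, size=n_h))
        J = np.diag(1.0 - h ** 2) @ W @ J              # one more factor dh^(i)/dh^(i-1)
    actual = np.linalg.norm(J, 2)
    bound = (gamma * lam) ** span
    # For gamma*lam < 1 the bound forces the product to vanish; for gamma*lam > 1 the
    # bound grows, so the gradient *can* explode (lam > 1/gamma is only a necessary condition).
    print(f"gamma*lam = {gamma * lam:.2f}: ||prod J|| = {actual:.3e}, bound = {bound:.3e}")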
Exploding Gradient
When training RNNs on long sequences, gradients may explode as the weights become larger and the norm of the gradient increases greatly during training.
As stated before, the necessary condition for this situation to happen is $\lambda > \frac{1}{\gamma}$.