FPGA Based Implementation of Neural Network

Sainath Shravan Lingala, Swanand Bedekar, Piyush Tyagi, Purba Saha and Priti Shahane
2022 International Conference on Advances in Computing, Communication and Applied Informatics (ACCAI) | 978-1-6654-9529-5/22/$31.00 ©2022 IEEE | DOI: 10.1109/ACCAI53970.2022.9752656

Symbiosis Institute of Technology, Pune, India


E-mail: shravanls1015@gmail.com, swanandbedekar26@gmail.com, piyushtyagi99@gmail.com, sweetpop.purba@gmail.com, pritis@sitpune.edu.in

Abstract- The objective of this paper is to implement a set of Neural Networks (NN) for the detection and recognition of handwritten digit characters. This work is based on the Modified National Institute of Standards and Technology (MNIST) dataset, which has been used for training and testing the NN model. For this application, both software and hardware platforms have been used to obtain efficient outcomes and to carry out a comparative analysis of software and hardware performance on the basis of several parameters, namely accuracy, resource utilisation and operating frequency. The software implementation of the NN has been performed using Python libraries such as TensorFlow and Zynet. However, studies of software-based implementations report various limitations in executing Convolutional Neural Networks (CNN) as well as NNs, because large-scale networks are computation-intensive, memory-intensive and resource-intensive, which poses various challenges. Hence, similar techniques have been used to implement the network on a Field Programmable Gate Array (FPGA) and to utilize its properties such as parallelism and pipelining for efficient execution. This hardware implementation is achieved through Vivado High Level Synthesis (HLS) software using Verilog programming.

Index Terms—FPGA, Neural Network, MNIST, ReLU, Sigmoid, Tensorflow, Vivado HLS

I. INTRODUCTION

With the development of technology, NNs and their significance have grown over the years. Applications of NNs such as face recognition, text and digit recognition, and image classification have been widely accepted [1]. A NN is a set of algorithms that attempts to recognise underlying relationships in a set of data using a technique that mimics how the human brain functions. NNs have become the state of the art for AI algorithms because of their high accuracy. However, executing NN algorithms on hardware platforms is challenging because of high computational complexity, memory bandwidth and power consumption. With larger NN models, the requirements placed on the processors also increase [2]–[5]. This is where hardware accelerators such as the Graphics Processing Unit (GPU), FPGA and Application-Specific Integrated Circuit (ASIC) play a significant part and are suitable platforms for NN computations [6]. Lately, FPGAs have been considered an attractive platform for NN execution, as they are suitable hardware accelerators owing to their flexibility and efficiency. Present-day FPGAs contain different hardware blocks such as dedicated processors, DSPs, adders, multiplexers and memory blocks. These embedded resources, along with customized logic blocks, make the FPGA a strong candidate for NN models [8]. Section II discusses the previous advances achieved by several authors in disciplines such as AI and FPGA. Section III discusses the techniques, which are divided into software and hardware implementations. Finally, Section IV discusses the findings and results of the model achieved on hardware, followed by the conclusion in Section V.

II. PREVIOUS DEVELOPMENT

From previous work it is observed that, with the development of technology, machine learning and deep learning algorithms and their applications are becoming more common. Executing ML algorithms on an FPGA gives a more straightforward way to configure hardware specific to the algorithm and enables parallel execution. Previous works use the MNIST database for training and testing the NN on the FPGA, with the aim of achieving a design that offers higher performance, increased accuracy and no CPU overhead compared with conventional software techniques [6], [7], [9].

III. METHODOLOGY

The process followed for the implementation of the NN includes designing a neuron, considering parameters for optimization, generating weights and biases

Authorized licensed use limited to: COLLEGE OF ENGINEERING - Pune. Downloaded on July 31,2024 at 11:12:50 UTC from IEEE Xplore. Restrictions apply.
using TensorFlow, designing the layers, deciding on activation functions such as ReLU and sigmoid, and verification.

A. Software Implementation Methodology
The software implementation methodology of NN development includes the generation of weights, biases and test data, along with the results, which are described below.

Neural Network Development: This includes defining a very simple NN in which the images are fed to the input layer and whose output layer has 10 neurons, since the images are classified into 10 different classes. The function used here is the sigmoid function, which is a special case of the logistic function and is generally denoted as sigmoid(x). Fig. 1 shows the simple neural network.

Fig. 1. Simple Neural Network.

The fundamental reason for employing the sigmoid function is that its output lies between 0 and 1. As a result, it is particularly useful for models in which a probability must be predicted as the output. Since a probability exists only between 0 and 1, the sigmoid is a suitable choice. An image is ultimately a two-dimensional matrix in which each pixel is represented by a value between 0 and 255, where 255 is white and 0 is black, so the pixels can be held as a two-dimensional array. The two-dimensional array is then converted into a one-dimensional array by flattening it: the 28x28 grid is flattened into 784 values, one per input neuron. The NN is then created using "keras.Sequential"; Keras provides the layer API "keras.layers.Dense", where "Dense" signifies that every neuron in one layer is connected to every neuron in the next layer.

The optimizer used in this case is the "adam" optimizer, to train the model efficiently. The loss function used is "sparse_categorical_crossentropy" and the metric used is "accuracy". Later, four hidden layers containing 30, 30, 10 and 10 neurons, respectively, are added for better accuracy.

Generating Weights, Biases and Test Data from TensorFlow: The MNIST dataset is available as part of TensorFlow itself. In TensorFlow, weights are stored according to input, so if there are 784 inputs to a first layer containing 30 neurons, TensorFlow stores the weights as 784 lists of 30 weights each. The first weight therefore represents the weight from the first input to the first neuron. For the hardware implementation, and for the script to work, all the weights of a particular neuron must be stored as a single list, which is why the transpose is taken. Biases always belong to a particular neuron, so no such issue arises and the bias list is not transposed. The sigmoid function is used in the software implementation. Training in this case is much faster than in the hardware implementation. After completion of this step, a text file consisting of the weights and biases is generated. This file is used later in the hardware implementation to generate the NN by producing mif-type files using the Zynet library. The accuracy achieved in the software implementation is around 96.09 percent.

Comparison of Results: The feature extraction method used in this paper is image pixels. With an ANN as the classifier, an accuracy of 96 percent was achieved. The paper [13] used the Regular Histogram of Oriented Gradients (R-HOG) method for the same purpose of handwritten character recognition, with a Support Vector Machine (SVM) as the classifier, and achieved an accuracy of 95.64 percent. [10] used the convolutional co-occurrence Histogram of Oriented Gradients (HOG) method for the same purpose of handwritten character recognition, with an SVM as the classifier, and obtained an accuracy of 81.64 percent. [11] used the convolutional co-occurrence HOG method for text recognition, with an SVM as the classifier, and achieved an accuracy of 83.60 percent. [12] worked on function object detection with HOG as the feature extraction method and an ANN as the classifier, and achieved an accuracy of 97.33 percent. [13] used the Chain Code Histogram (CCH) along with an SVM as the classifier to achieve an accuracy of 98.48 percent. [14] used a hybrid classifier in the form of CNN and SVM along with CNN as their feature
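The weight reordering described in this section can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' script: the names `transpose_weights`, `weights_by_input` and `weights_by_neuron` are hypothetical, and a toy layer with 3 inputs and 2 neurons stands in for the 784-input, 30-neuron first layer.

```python
def transpose_weights(weights_by_input):
    """Convert weights grouped per input into weights grouped per neuron.

    weights_by_input[i][j] is the weight from input i to neuron j
    (the layout the text attributes to TensorFlow). The result is
    indexed as weights_by_neuron[j][i].
    """
    return [list(column) for column in zip(*weights_by_input)]


# Hypothetical toy layer: 3 inputs feeding 2 neurons.
weights_by_input = [
    [0.1, 0.2],  # input 0 -> neuron 0, neuron 1
    [0.3, 0.4],  # input 1 -> neuron 0, neuron 1
    [0.5, 0.6],  # input 2 -> neuron 0, neuron 1
]
biases = [0.01, 0.02]  # one bias per neuron, so no transpose is needed

weights_by_neuron = transpose_weights(weights_by_input)
```

After the transpose, each inner list holds every weight of a single neuron, which is the per-neuron layout the text says the hardware flow expects.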
extraction method. They were able to achieve an accuracy of 94.40 percent.

Feature Extraction Method    Classifier    Accuracy (%)
R-HOG                        SVM           95.64
HOG                          SVM           81.00
HOG                          SVM           83.60
HOG                          ANN           97.33
CCH                          SVM           98.48
CNN                          CNN+SVM       94.40
Image Pixels                 ANN           96.00

From this it can be concluded that the ANN's performance is superior to that of SVM-based classification [15].

B. Hardware Implementation Methodology
FPGAs are integrated circuits consisting of an array of reprogrammable logic blocks. Their adoption is driven by their flexibility, hardware-timed speed and reliability, and parallelism. Unlike processors, FPGAs are inherently parallel and pipelined, so different processing tasks do not have to compete for the same resources. Independent processing tasks are assigned to different sections of the chip and operate independently of the other logic blocks. As a result, adding extra processing has no impact on the performance of the other parts of the application.

1) Designing the Model: The implementation of the NN in Vivado begins with understanding the parameters that must be taken into consideration for implementing handwritten digit recognition on a hardware platform. This paper focuses on the implementation of a NN for handwritten digit recognition on an FPGA using Vivado HLS software. For this process, a neuron file is first designed by programming the required parameters of the neurons, such as the depth of the neuron, layer number, address width and data width, along with the weights, bias and weight memory files and the activation function files, which are described in the preceding section. The layers are then designed in a feed-forward manner, comprising five layers: an input layer, an output layer and three hidden layers. These layers consist of 784, 30, 30, 10 and 10 neurons, respectively, and contain the neuron and activation function files.

2) Activation Functions (ReLU and Sigmoid): In a NN, the activation function defines how the weighted sum of the inputs is converted into an output from a node or nodes in a layer. The neuron computes a weighted sum and adds the bias to it, and the activation function then establishes whether the neuron should be activated. The ReLU activation function is a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. Similarly, the sigmoid activation function can be implemented in the program by changing the activation type to sigmoid. The sigmoid function is an activation function in machine learning that is used to introduce nonlinearity into a model. Put another way, it chooses which values to pass as output and which not to.

3) Constraints File: A constraints file is required to tell the software which physical pins on the FPGA will be used or connected to, alongside the HDL code that defines the FPGA's behaviour. The value 4.000 used in this paper represents the input clock period of 4 nanoseconds.

IV. RESULTS

Resource Utilization: Every FPGA includes a fixed amount of resources that are joined by programmable interconnects to build a reconfigurable digital circuit, and I/O blocks that allow the circuit to interface with the outside world. These are often the most essential FPGA specifications to consider when evaluating an FPGA for a certain application. The figures below show the resources utilized during the execution of the NN in Vivado HLS. Fig. 2 shows the resource utilization of the NN using ReLU as the activation function.

Similarly, Fig. 3 shows the resource utilization using Sigmoid for the execution of the NN. Comparing Fig. 2 and Fig. 3, it can be seen that the implementation of the NN using Sigmoid as the activation function is more efficient than the one using ReLU. This is because the resources utilized by the Sigmoid
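The neuron computation and the two activation functions discussed in this section can be sketched as follows. This is a floating-point software illustration only, assuming hypothetical names (`relu`, `sigmoid`, `neuron_output`) and made-up input values; it does not represent the paper's hardware neuron or its activation function files.

```python
import math


def relu(x):
    """Return x for positive inputs and 0 otherwise."""
    return x if x > 0 else 0.0


def sigmoid(x):
    """Logistic function; the output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum + bias)


# Hypothetical values chosen only for illustration.
inputs = [1.0, 0.5]
weights = [0.4, -1.2]
bias = 0.1

relu_out = neuron_output(inputs, weights, bias, relu)
sigmoid_out = neuron_output(inputs, weights, bias, sigmoid)
```

With these values the pre-activation sum is slightly negative, so ReLU outputs zero while the sigmoid returns a value just below 0.5, which shows how the two functions treat the same input differently.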
activation function are fewer than those utilized by the ReLU activation function for the same implementation.

Fig. 2. Utilisation of Resources using ReLU.

Fig. 3. Utilisation of Resources using Sigmoid.

Accuracy: Accuracy is interpreted as the percentage of correct predictions for the given test data. Depending on the data used for training, it helps in choosing the model that is best at recognising correlations and patterns between variables in a dataset.

Fig. 4. Accuracy using Sigmoid.

Fig. 4 shows the accuracy achieved while using Sigmoid as the activation function for the implementation of the NN. By comparing the accuracy achieved by the two models, it can be deduced that the NN model using Sigmoid as the activation function provides a better functioning model, with an accuracy of 96 percent, compared to the model using ReLU as the activation function, which provided an accuracy of only 33 percent.

Calculation of Maximum Frequency: The number of operational clock cycles per second of the processor is referred to as the frequency of the device. The maximum frequency is used to identify the amount of time the device requires to execute a set of instructions of the program. The maximum frequency is calculated by taking the slack time into consideration. Slack is the difference between what was requested and what has been achieved. For calculating the maximum frequency, 4 ns is taken as the clock period, which was declared in the constraints file as described in Section III.B.3. Then, 0.035 ns is taken as the worst negative slack obtained in Vivado using ReLU. The sum of these values gives the overall time period of the device, i.e. 4 + 0.035 = 4.035 ns. Finally, the frequency is calculated as the reciprocal of this value, i.e. 1/4.035 ns. Hence the maximum frequency obtained for the NN using ReLU as the activation function is Fmax = 247 MHz. The timing and frequency values for the implementation of the NN using Sigmoid as the activation function are calculated in a similar manner, with 4 ns taken as the clock period. Then, 0.011 ns is taken as the worst negative slack obtained in Vivado using Sigmoid. The sum of these values gives the overall time period of the device, i.e. 4 + 0.011 = 4.011 ns. Finally, the frequency is calculated as the reciprocal of this value, i.e. 1/4.011 ns. Hence the maximum frequency obtained for the NN using Sigmoid as the activation function is Fmax = 249 MHz.

V. CONCLUSION

In an era of advancing technology and increasing transistor density in devices, as described by Moore's law, the FPGA has proven to be an efficient platform. In comparison to the software model, hardware implementation using an FPGA provides reconfigurability of logic gates along with the features of pipelining and parallelism. In this paper, handwritten digit recognition using a NN was performed on both software and hardware platforms. In the software implementation of the
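Accuracy, taken in this paper as the percentage of correct predictions on the test data, can be computed with a few lines of Python. The sketch below is illustrative only: the function name `accuracy_percent` and the five sample labels are assumptions, not the authors' verification code or data.

```python
def accuracy_percent(predicted_labels, true_labels):
    """Percentage of predictions that match the true labels."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one test sample is required")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return 100.0 * correct / len(true_labels)


# Hypothetical predictions for five test digits.
predicted = [7, 2, 1, 0, 4]
expected = [7, 2, 1, 0, 9]
score = accuracy_percent(predicted, expected)  # 4 of 5 predictions are correct
```

The same ratio, applied to the full test set, would yield figures of the kind reported for the Sigmoid and ReLU models.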
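The maximum-frequency arithmetic stated in this section can be reproduced with a short Python sketch. It mirrors the calculation as the paper describes it (clock period plus the reported slack, then the reciprocal); the function name `max_frequency_mhz` is hypothetical, and the inputs are the 4 ns period and the slack values quoted in the text.

```python
def max_frequency_mhz(clock_period_ns, slack_ns):
    """Follow the paper's calculation: Fmax = 1 / (clock period + slack)."""
    total_period_ns = clock_period_ns + slack_ns
    return 1000.0 / total_period_ns  # 1/ns is GHz, so scale to MHz


relu_fmax = max_frequency_mhz(4.0, 0.035)     # about 247.8 MHz
sigmoid_fmax = max_frequency_mhz(4.0, 0.011)  # about 249.3 MHz
```

Dropping the fractional part of these results gives the 247 MHz and 249 MHz figures reported for ReLU and Sigmoid, respectively.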
NN, an accuracy of 96 percent was achieved using Zynet and TensorFlow by configuring the network in the Python programming language. For the hardware implementation on the FPGA, where Vivado HLS was used to configure the hardware device, a similar accuracy of 96 percent was achieved along with a maximum operating frequency of 249 MHz. In the hardware execution of the NN, this implementation also compared the results for parameters such as resource utilisation, maximum operating frequency and accuracy of the hardware device for the two kinds of activation functions, i.e., the ReLU activation function and the Sigmoid activation function. This provides a better understanding of the working of NNs on FPGAs and forms a basic framework for future work. In conclusion, the FPGA proves to be an effective and efficient solution to Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) challenges, where parameters such as accuracy and resource utilization can be optimized with ease.

REFERENCES

[1] NO, THIET KE KIEN TRUC MANG, RONNTA NHAN, and DANG CHU SO. "Design of Artificial Neural Network Architecture for Handwritten Digit Recognition on FPGA." (2016).
[2] Park, Jinhwan, and Wonyong Sung. "FPGA based implementation of deep neural networks using on-chip memory only." 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, (2016).
[3] Hagan, Martin T., Howard B. Demuth, and Mark Beale. Neural network design. PWS Publishing Co., (1997).
[4] Netzer, Yuval, et al. "Reading digits in natural images with unsupervised feature learning." (2011).
[5] Moradi, Marzieh, Mohammad Ali Pourmina, and Farbod Razzazi. "FPGA-Based farsi handwritten digit recognition system." International Journal of Simulation Systems, Science and Technology 11.2 (2010).
[6] Huynh, Thang Viet. "Design space exploration for a single-FPGA handwritten digit recognition system." 2014 IEEE Fifth International Conference on Communications and Electronics (ICCE). IEEE, (2014).
[7] Savich, Antony W., Medhat Moussa, and Shawki Areibi. "The impact of arithmetic representation on implementing MLP-BP on FPGAs: A study." IEEE Transactions on Neural Networks 18.1 (2007): 240-252.
[8] Nichols, Kristian R., Medhat A. Moussa, and Shawki M. Areibi. "Feasibility of floating-point arithmetic in FPGA based artificial neural networks." In CAINE. (2002).
[9] Si, Jiong, and Sarah L. Harris. "Handwritten digit recognition system on an FPGA." 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC). IEEE, (2018).
[10] Su, Bolan, et al. "Character recognition in natural scenes using convolutional co-occurrence hog." 2014 22nd International Conference on Pattern Recognition. IEEE, (2014).
[11] Tian, Shangxuan, et al. "Scene text recognition using co-occurrence of histogram of oriented gradients." 2013 12th International Conference on Document Analysis and Recognition. IEEE, (2013).
[12] Varagul, Jittima, and Toshio Ito. "Simulation of detecting function object for AGV using computer vision with neural network." Procedia Computer Science 96 (2016): 159-168.
[13] Kamble, Parshuram M., and Ravinda S. Hegadi. "Handwritten Marathi character recognition using R-HOG Feature." Procedia Computer Science 45 (2015): 266-274.
[14] Niu, Xiao-Xiao, and Ching Y. Suen. "A novel hybrid CNN–SVM classifier for recognizing handwritten digits." Pattern Recognition 45.4 (2012): 1318-1325.
[15] Islam, Kh Tohidul, et al. "Handwritten digits recognition with artificial neural network." 2017 International Conference on Engineering Technology and Technopreneurship (ICE2T). IEEE, (2017).
