HANDWRITTEN TEXT RECOGNITION USING DEEP LEARNING
M HARINI, G SUBHIKSHA, R SRIHARINI, V ABINAYA
ABSTRACT
The system is built to recognize handwritten text and convert the recognized text into digital
form using deep learning. Deep learning is an advanced technique that can improve efficiency and
approach human-level prediction. Handwriting recognition is a technology for recognizing
handwritten characters; the handwritten text is supplied as images. In this system we use
convolutional neural networks to predict handwritten text in real time, because these networks are
well suited to analysing images. To predict handwritten text, an Optical Character Recognition
(OCR) algorithm is used within a Convolutional Recurrent Neural Network (CRNN) model. OCR is a
type of image-based sequence recognition problem. Recurrent Neural Networks (RNN) are best suited
to sequence recognition problems, while Convolutional Neural Networks (CNN) are best suited to
image-based problems. To cope with OCR problems, we therefore combine CNN and RNN. Deep learning
can give higher recognition accuracy than conventional classification methods. The aim of our
project is to build an application that can recognize handwriting using concepts of deep learning.
We approach the problem using CNNs, as they provide better accuracy on such tasks. Image
processing is the manipulation of images within computer vision, and advances in technology have
produced many techniques for manipulating photographs.
PROBLEM STATEMENT
• In this system we use a convolutional neural network to predict handwritten
digits in real time, because these networks are well suited to analyzing
images.
EXISTING SYSTEM
● Relies on conventional classification methods for handwritten digit recognition.
● Acknowledges the progress made in recognizing handwritten digits but highlights the
  limitations in accuracy impacting work efficiency.
● Utilizes a two-layer CNN network with two fully connected layers.
● Employs the ReLU function to mitigate gradient disappearance and saturation challenges.
PROPOSED SYSTEM
● Introduces a novel approach using deep learning techniques for handwritten text
  recognition and conversion into digital form.
● Aims to enhance efficiency and achieve human-level prediction accuracy by leveraging
  advancements in deep learning.
● Integrates Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) in a
  Convolutional Recurrent Neural Network Model.
● Utilizes Optical Character Recognition (OCR) algorithm, leveraging the strengths of CNN
  for image-based problems and RNN for sequence recognition.
SYSTEM ARCHITECTURE
MODULES
● Preprocessing
● OCR Algorithm
● Convolution Layer
● Recurrent Layer
● Transcription Layer
PREPROCESSING
● The preprocessing unit in the architecture diagram prepares input data for the
  neural network model.
● It includes resizing, normalization, noise reduction, contrast enhancement, and
  segmentation. These steps ensure the input images are standardized, clean, and
  optimized for effective recognition by the neural network.
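Two of the steps listed above, resizing and normalization, can be sketched in plain Python as follows. This is only an illustration, not the project's actual pipeline: the function names, the nearest-neighbour resizing method, and the 32 x 128 default target size are assumptions, and noise reduction, contrast enhancement, and segmentation are left out.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D grayscale image (list of rows) by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normalize(image):
    """Scale 8-bit pixel values (0..255) to floats in the range 0.0..1.0."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def preprocess(image, out_h=32, out_w=128):
    """Resize, then normalize; the default size is an assumed example value."""
    return normalize(resize_nearest(image, out_h, out_w))


# A tiny 2x2 "image" stretched to 4x4 and scaled to 0.0..1.0.
sample = [[0, 255], [255, 0]]
prepared = preprocess(sample, out_h=4, out_w=4)
```

Every image leaving `preprocess` has the same shape and value range, which is what lets a fixed-size network input accept word images of different original sizes.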
OCR ALGORITHM
●   OCR (Optical Character Recognition): OCR is a technology used to recognize text within images, including
    scanned documents and photos. It converts various types of text images (typed, handwritten, or printed) into
    machine-readable text data.
●   OCR Process: OCR involves converting digital or hand-written text images into machine-readable text that
    computers can process, store, and edit. This enables the manipulation of text as part of data entry and processing
    software.
●   Feature Extraction Methods: There are two main methods for extracting features in OCR: one evaluates
    characters based on lines and strokes, while the other identifies entire characters through pattern recognition.
●   Pattern-Matching Algorithms: OCR software uses pattern-matching algorithms to compare text images
    character by character with its internal database. If the system matches the text word by word, it's called optical
    word recognition. OCR software essentially "reads" text and converts it into digital form.
●   Evolution of OCR: OCR is one of the earliest computer vision tasks to be addressed. It does not always require
    deep learning techniques, as it can also be accomplished with traditional algorithms and methods.
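The character-by-character pattern matching described above can be illustrated with a toy example. The sketch below is a simplified assumption of how a traditional, non-deep-learning matcher might work: the three 3x3 templates and the pixel-agreement score are invented for illustration and do not come from any real OCR product.

```python
# Hypothetical "internal database" of 3x3 binary character templates.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def similarity(glyph, template):
    """Count the pixels on which a glyph and a template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def match_character(glyph):
    """Return the label of the template with the highest pixel agreement."""
    return max(TEMPLATES, key=lambda label: similarity(glyph, TEMPLATES[label]))


# An "L" with one missing pixel is still closest to the "L" template.
noisy_l = ["100", "100", "110"]
print(match_character(noisy_l))  # prints "L"
```

Handwriting varies far more than this toy glyph does, which is one reason the proposed system replaces fixed templates with learned CNN features.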
CONVOLUTIONAL LAYER
●   This layer extracts image features. In the CRNN model, the convolutional component
    is built from convolutional and max-pooling layers, and it extracts a sequential
    feature representation from the input image.
●   The first layer of a Convolutional Neural Network is typically a convolutional layer.
    Convolutional layers apply a convolution operation to the input and pass the result to
    the next layer. Each convolution reduces the pixels in its receptive field to a single
    value.
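The two operations named above, convolution and max pooling, can be sketched in plain Python. This is a minimal illustration under assumptions: the kernel is a hand-picked edge detector rather than learned weights, there is no padding or stride option, and, as in most deep learning frameworks, the "convolution" is computed without flipping the kernel.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image; each receptive field becomes one value."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# A 5x5 image whose right-hand side is bright: there is a vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Assumed example kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)  # 4x4 feature map, every row is [0, 2, 0, 0]
pooled = max_pool(features)       # 2x2 map: [[2, 0], [2, 0]]
```

The feature map responds only where the edge sits, and pooling halves its size while keeping that response. In a CRNN, the columns of the final feature map are read from left to right as the sequence handed to the recurrent layers.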
TRANSCRIPTION LAYER
Transcription Process:
  ●   The transcription layer converts per-frame predictions made by the RNN into a sequence of labels or text. This process is crucial
      for transforming the output of the neural network into readable text.
  ●   Connectionist Temporal Classification (CTC) is a commonly used technique in the transcription process. It helps decode the
      output from the RNN and convert it into text labels.
Role of Transcription Layer:
  ●   The transcription layer operates after the recognition model, taking the output probabilities from the model.
  ●    Its primary function is to convert these probabilities into a sequence of recognized text or characters. This involves applying
      decoding algorithms to determine the most probable sequence based on the output probabilities.
Mapping Probabilities to Symbols:
  ●    The transcription layer maps the continuous probability distributions generated by the recognition model into discrete symbols,
      such as characters or words, representing the recognized text.
  ●    By converting probabilities into discrete symbols, the transcription layer enables the neural network to output readable text that
      accurately represents the input sequence.
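As an illustration of how per-frame probabilities become text, here is a minimal sketch of greedy (best-path) CTC decoding. It is one simple decoding choice and not necessarily the one used in this project; beam search is a common alternative. The function name, the three-symbol alphabet, and the use of index 0 for the CTC blank are assumptions made for the example.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Best-path CTC decoding: take the most probable symbol in each frame,
    merge consecutive repeats, then drop the blank symbol."""
    best_path = [
        max(range(len(probs)), key=probs.__getitem__) for probs in frame_probs
    ]
    decoded = []
    previous = None
    for index in best_path:
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


# Assumed alphabet: index 0 is the blank, then the characters "a" and "b".
alphabet = ["-", "a", "b"]
# Six frames of per-symbol probabilities, as an RNN might output them.
frame_probs = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank (separates the two a's)
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.20, 0.70],  # b
    [0.10, 0.10, 0.80],  # b (repeat, merged)
]
print(ctc_greedy_decode(frame_probs, alphabet))  # prints "aab"
```

The blank frame is what allows a genuine double letter to survive the merging of repeated frames.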
RECURRENT LAYER
1. Bi-directional RNN for Sequence Labelling:
  ●     Bi-directional RNNs are used on top of convolutional layers to label sequences.
  ●     They capture information from both directions, enhancing sequence understanding.
2. Fully Connected Layer:
  ●    Connects every neuron from the previous layer to every neuron in the next.
  ●    In the recurrent setting, the output at each time step is fed back as input at the
       next step; the number of units determines the output dimensionality.
  ●    Typically uses the hyperbolic tangent (tanh) activation function.
3. Recurrent Layer:
  ●    Composed of recurrent units that combine the current input with the previous hidden
       state to produce an output.
  ●    Output can be further processed or sent to subsequent layers.
  ●    Captures temporal dependencies within sequences, aiding pattern recognition.
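The recurrence and the bi-directional pass described above can be sketched with a deliberately tiny model: a single tanh unit with scalar weights. The weight values and function names are assumptions for illustration; a real CRNN would use learned weight matrices and usually LSTM or GRU units.

```python
import math


def rnn_pass(inputs, w_in, w_rec, reverse=False):
    """Run a one-unit tanh RNN over a sequence of scalars.

    Each hidden state depends on the current input and the previous hidden
    state. Returned states are aligned with the original input order."""
    steps = list(reversed(inputs)) if reverse else list(inputs)
    hidden, states = 0.0, []
    for x in steps:
        hidden = math.tanh(w_in * x + w_rec * hidden)
        states.append(hidden)
    return states[::-1] if reverse else states


def bidirectional_rnn(inputs, fwd=(0.5, 0.8), bwd=(0.5, 0.8)):
    """Pair a left-to-right pass with a right-to-left pass at every time step.

    fwd and bwd are (input weight, recurrent weight); the values are arbitrary."""
    forward = rnn_pass(inputs, *fwd)
    backward = rnn_pass(inputs, *bwd, reverse=True)
    return list(zip(forward, backward))


# A single impulse at the first time step.
outputs = bidirectional_rnn([1.0, 0.0, 0.0])
```

In `outputs`, the forward state at the last step is still non-zero because it remembers the impulse from the first step, while the backward state there is exactly 0.0 because nothing lies to its right. Combining both directions gives every time step context from the whole sequence.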
CONCLUSION
●   An adaptive method is proposed for handwritten text recognition: the dataset is
    pre-processed and then used to train CNN and RNN layers consecutively.
●   During recognition, the input word images are processed and fed into the layers of
    the neural network model.
●   The output of the CNN layers is further processed by the RNN layers. The results demonstrate
    that using CNN and RNN consecutively has the potential to improve accuracy steadily.
FUTURE SCOPE
●   In future, we plan to extend this study by evaluating different embedding models on a
    wide variety of datasets.
●   We also aim to enhance the work by implementing online recognition, extending it to
    different languages, and improving the system so that it can recognize degraded text
    and broken characters.