On Text To Speech Conversion Using OCR

The input can be given as an image or a PDF; the words are extracted and the output is produced as audio.

Uploaded by

gagana


TEXT EXTRACTION AND VOICE SYNTHESIS

Presented by
Anusha M (4AD15CS008)
Bhoomika H S (4AD15CS013)
Gagana V (4AD15CS022)
Lavanya S (4AD15CS039)

Under the Guidance of
Mr. Raghuram A S,
Asst. Professor, Dept. of CSE,
ATME College, Mysuru.

Under the coordination of
Mrs. Sunitha Patel M S,
Asst. Professor, Dept. of CSE,
ATME College, Mysuru.

Under the coordination of
Mr. Anilkumar C J,
Associate Professor, Dept. of CSE,
ATME College, Mysuru.
INDEX

• Introduction
• Problem Statement
• Advantages
• System Specifications
• Methodology
• Design
• Implementation
• Snapshots
INTRODUCTION
• Our project recognizes text and converts the input into audio.
• The input can be given in many formats, such as text, pdf,
docx and image (jpg, png).
• Image acquisition, recognition and speech conversion are
performed using Optical Character Recognition (OCR).
• OCR is an image processing technology used to convert an
image containing horizontal text into a text document; the
extracted text is then converted into speech.
PROBLEM STATEMENT
• The project recognizes the text characters in an image
and converts this text into a speech signal. To achieve this,
the text contained in the page is first pre-processed. The
pre-processed text is then prepared for voice output.
REQUIREMENTS SPECIFICATION

Software Requirements
• Operating System : Windows 7
• Coding Language : Python 3.6
• Database : SQLite
• Tools : Sublime Text, Django
Hardware Requirements
• Processor : Intel i3
• Speed : 2.53 GHz
• RAM : 4 GB
• Hard Disk : 500 GB
• Speakers
METHODOLOGY
OCR (Optical Character Recognition)
Optical character recognition, or OCR, is a method of
converting a saved image into text.

OpenCV (Open Source Computer Vision)

It is a library with which we can develop real-time
computer vision applications. The library provides
built-in functions and focuses mainly on image
processing, including features such as physical object,
face and text identification and recognition.
TTS (Text to Speech)
It is a type of speech synthesis application that is used to
create a spoken version of the text in a computer
document or image.
NLP (Natural Language Processing)
It handles different varieties of English. It matches
the ASCII values (extracted from the text or document)
against the HMM values (from the speech dataset).
DESIGN

Start
-> Input files
-> Check extension
   - Image files (jpg, png) -> OpenCV -> Pre-processing
   - Doc files (doc, pdf)
-> Recognition
-> Text to speech
-> Voice output
-> Stop
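
The "Check extension" step of the flow can be sketched as follows. This is a minimal illustration, assuming the two extension groups shown in the design; the function name `route_for` and the route labels are hypothetical and not taken from the project code.

```python
import os

# Extension groups taken from the design flow; the route labels are hypothetical.
IMAGE_EXTENSIONS = {".jpg", ".png"}
DOC_EXTENSIONS = {".doc", ".pdf"}


def route_for(filename):
    """Return which branch of the design flow a file would follow."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in IMAGE_EXTENSIONS:
        return "image"
    if extension in DOC_EXTENSIONS:
        return "document"
    raise ValueError("unsupported file type: " + repr(extension))


print(route_for("page.png"))   # image
print(route_for("notes.pdf"))  # document
```

Lower-casing the extension means that a file such as `SCAN.JPG` is routed the same way as `scan.jpg`.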
ALGORITHM
OPENCV

• The character set is predefined: 26 letters, the digits 0-9
and special characters.
• The predefined characters are matched against the input source.
• This is how the words are divided into blocks.
• For binarization, the colour image is first converted into a
grayscale image, which is then thresholded to black and white.
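
Binarization by a fixed threshold can be sketched in plain Python as below. This is only an illustration: the threshold of 128 and the function name `binarize` are assumptions, and in practice OpenCV provides `cv2.threshold` for this step.

```python
def binarize(gray_rows, threshold=128):
    """Map each grayscale value (0-255) to 0 (black) or 1 (white).

    The default threshold of 128 is an assumed value for illustration.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in gray_rows]


gray = [[12, 200, 130],
        [127, 128, 255]]
print(binarize(gray))  # [[0, 1, 1], [0, 1, 1]]
```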
Grayscale conversion

• Among files of, for example, 5 MB, 10 MB and 15 MB, the
5 MB file is processed faster than the others.
• Each pixel has its own color, angle and depth.
• If we use grayscale, the size of the file is reduced.
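
The size reduction can be seen in a small sketch: each colour pixel carries three values (R, G, B), while a grayscale pixel carries one. The weights below are the commonly used luminance weights; the function name `to_grayscale` and the sample image are assumptions for illustration only.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (R, G, B) pixels to rows of single 0-255 gray values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_rows
    ]


image = [[(255, 255, 255), (0, 0, 0)],
         [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(image)

values_before = sum(len(pixel) for row in image for pixel in row)
values_after = sum(len(row) for row in gray)
print(gray)                         # [[255, 0], [76, 29]]
print(values_before, values_after)  # 12 4
```

The grayscale image stores one third as many values as the colour image, which is why later steps have less data to process.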
Filter
A filter is used to modify or enhance the image.

Noise
Noise is removed from the image.
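
The slides do not say which filter is used for noise removal. As one common possibility, a 3x3 median filter can be sketched as follows; the choice of a median filter and the function name `median_filter` are assumptions, not the project's confirmed method.

```python
from statistics import median


def median_filter(gray_rows):
    """Replace each interior pixel with the median of its 3x3 neighbourhood."""
    height, width = len(gray_rows), len(gray_rows[0])
    result = [row[:] for row in gray_rows]  # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [
                gray_rows[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            result[y][x] = median(window)
    return result


noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter(noisy)[1][1])  # 10
```

A single bright speck (255) surrounded by dark pixels is replaced by the value of its neighbours, while the input image itself is left unmodified.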
The binary input is compared with the dataset; if it
matches, the output is stored in binary form again.
For example:
Binary input: A-11, B-10, C-01, D-00
Dataset: A-11, B-10, C-01, D-00
Matched: A-A, B-B, C-C, D-D
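
The matching example above can be written as a dictionary lookup. This is a toy sketch using the four codes from the example; the names `DATASET` and `match_codes` are hypothetical, and a real dataset would hold patterns for the full character set.

```python
# Toy dataset from the example above: binary code -> character.
DATASET = {"11": "A", "10": "B", "01": "C", "00": "D"}


def match_codes(codes):
    """Look up each binary code in the dataset; unknown codes become None."""
    return [DATASET.get(code) for code in codes]


print(match_codes(["11", "10", "01", "00"]))  # ['A', 'B', 'C', 'D']
```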
NLP ALGORITHM
• The text-to-speech synthesizer uses the Google
text-to-speech algorithm.
• The binary output (ASCII) is given as its input.
• The hidden Markov model (HMM) values are
stored in the database.
• The binary output (ASCII values) is matched
with the HMM values in the database.
• After matching, digital signal processing takes
place.
• Finally, the output is converted to analog signals.
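
The ASCII step can be illustrated with a short sketch that turns recognized text into 8-bit ASCII codes and back. It covers only the encoding; the HMM matching and signal processing are left to the speech engine. The function names are hypothetical.

```python
def to_ascii_bits(text):
    """Encode each character as an 8-bit binary string of its ASCII code."""
    return [format(ord(ch), "08b") for ch in text]


def from_ascii_bits(bits):
    """Decode a list of 8-bit binary strings back into text."""
    return "".join(chr(int(b, 2)) for b in bits)


bits = to_ascii_bits("Hi")
print(bits)                   # ['01001000', '01101001']
print(from_ascii_bits(bits))  # Hi
```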
IMPLEMENTATION

Register
• On the registration page, a new user enters a name
and creates a password.
• As soon as the user gives a name, the application
generates a unique username.
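
One way to generate a unique username from a name is sketched below. The slides do not describe the actual scheme, so the numeric-suffix rule, the function name `generate_username` and the in-memory `taken` set are all assumptions for illustration.

```python
def generate_username(name, taken):
    """Return a username derived from `name` that is not already in `taken`.

    `taken` is a set of existing usernames; the new one is added to it.
    """
    base = "".join(ch for ch in name.lower() if ch.isalnum()) or "user"
    candidate, suffix = base, 1
    while candidate in taken:
        suffix += 1
        candidate = base + str(suffix)
    taken.add(candidate)
    return candidate


existing = set()
print(generate_username("Anusha M", existing))  # anusham
print(generate_username("Anusha M", existing))  # anusham2
```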
Login
• Once registered successfully, the user can log in
to the application.
• The user logs in with the username generated by
the application.
• If login fails, the user can use the forgot-password
option: after the email ID is given, the password
is sent to that email ID.
File upload
• In this module the user can upload a file.
• As soon as an image is uploaded, the application
automatically creates a unique ID and records the
upload date.
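
Creating the unique ID and upload date can be sketched with the standard library. The record layout and the function name `new_upload_record` are hypothetical; the project may instead rely on Django model fields for this.

```python
import uuid
from datetime import datetime


def new_upload_record(filename):
    """Build a record with a unique ID and the current date and time."""
    return {
        "id": uuid.uuid4().hex,  # random 32-character hexadecimal ID
        "filename": filename,
        "uploaded_at": datetime.now().isoformat(timespec="seconds"),
    }


record = new_upload_record("page.png")
print(record["id"], record["uploaded_at"])
```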
SNAPSHOTS
