From book to presentation generation using hybrid summarization

Final Year Project Proposal by

Samiha Azeem
(231462422)

Aleezae Zia
(231523065)

Fateh Ali Alim
(231523017)

Primary Advisor: Muhammad Salman Chaudhry
Secondary Advisor: Rabranea Bqa

Department of Computer Science
Forman Christian College University
Lahore, Pakistan (20xx)

ABSTRACT

Books are central to learning in every field, from story books to extensive academic texts. Over the past few years, Artificial Intelligence (AI) and machine learning have gained widespread importance and are now integral to many industries. Automatic summarization has become significant for information retrieval: enormous volumes of information are available on the internet, and time constraints make it hard for a person to sift through them to find what is relevant. The goal of this research is to identify techniques suitable for a prototype that automatically generates a PowerPoint presentation from a given book using hybrid summarization, which combines extractive and abstractive methods. Hybrid summarization will be studied with the aim of producing summaries that are both more precise and more coherent. The objective is to give teachers and professionals a tool for quickly and effectively creating presentations from written material. The system would take the most important ideas and concepts from the book, build a presentation outline, and produce slides with the text and images that best represent the content. The study includes investigating different summarization strategies, assessing the effectiveness of the system, and exploring its potential applications in different fields. The models will be evaluated using several metrics, including ROUGE and human evaluation.

TABLE OF CONTENTS

Introduction
Problem Statement
Literature Review
Project Overview
Project Development Methodology
Project Milestones and Deliverables
Work Division
Costing
References

1. INTRODUCTION

In today's knowledge-driven economy, academic books play a critical role in disseminating new ideas and research findings across the scholarly community. However, the traditional format of academic books can make it hard for readers to quickly identify the most important ideas and concepts they contain. With so much information available, time is a precious commodity, and communicating key ideas to others effectively is difficult, especially for busy academics and researchers who may not have time to read an entire book cover to cover.

Researchers and practitioners have begun to explore the use of AI summarization approaches to

automatically generate presentations from academic books. By leveraging the power of machine

learning and natural language generation, it may be possible to quickly and efficiently extract the

key ideas and concepts from an academic book and present them in a way that is both informative

and engaging.

To address this challenge, we are researching summarization techniques that can support an AI-powered product that uses machine learning and natural language generation to automatically generate presentations from books. Our research will aid the design of a product that helps individuals and organizations quickly and easily extract the most important ideas and concepts from a book and present them in a way that is engaging, informative, and customizable to the needs of the audience.


The system can save time and effort for those who need to prepare presentations from written material, as it automates the process of summarizing and making slides. The generated PowerPoint presentations may also be more accurate and comprehensive than manually made ones, as the system uses modern summarization methods to extract key information.

In addition, the tool can improve accessibility for people with visual impairments or learning disabilities who might find it difficult to read lengthy books. Accurate and easy-to-follow PowerPoint presentations can help them better access and understand the content.

In this paper, we will describe the development process for our prototype, including the AI models and algorithms used, as well as the user interface and user experience design. We will also discuss potential applications and use cases for this technology, as well as its limitations and areas for future development.

2. PROBLEM STATEMENT

There is a growing need for a tool that can convert lengthy and complex academic and educational books into more engaging and interactive presentations that can be consumed in a shorter amount of time. The research topic "From book to presentation generation using hybrid summarization" addresses the time-consuming and difficult process of creating PowerPoint presentations from written material. Professionals, instructors, and researchers frequently need to prepare presentations based on books, reports, and other lengthy documents. However, manually summarizing the content and creating a presentation outline takes considerable time and effort.

Currently available summarization tools rely on either extractive or abstractive methods and are often limited in accuracy and coherence. Abstractive summarization generates new sentences that capture the essence of the original text, whereas extractive summarization simply selects and condenses existing sentences from the text. Hybrid summarization, by contrast, combines these strategies to produce a more accurate and coherent summary of the content.

The problem statement, therefore, is the lack of an automated tool that can generate PowerPoint

presentations from a given book using hybrid summarization techniques. For those who need to

make presentations based on written material, such a tool could save time and effort while also

providing a more accurate and comprehensive summary of the material.

Using hybrid summarization methods, the proposed research aims to create a system that can

automate the process of summarizing books and creating PowerPoint presentations. By streamlining

the process of creating presentations and enhancing the information's accuracy and accessibility, this

would be a useful tool for educators and professionals in a variety of fields.

To achieve this goal, the study will explore several research questions, such as:

1. How can hybrid summarization techniques be effectively applied to generate PowerPoint presentations from a given book?

2. What are the most important features and elements to include in a PowerPoint presentation generated from a book, and how can they be selected using natural language processing techniques?

3. How can the system be evaluated in terms of accuracy, coherence, and efficiency, and how does it compare to existing summarization tools?


3. LITERATURE REVIEW

We begin with a paper that covers the basics of automated summarization. “Text Summarization Techniques: A Brief Survey” by Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, et al. provides a good starting point: it explains abstractive and extractive summarization and discusses the models that can be used for each. Extractive summarization builds a summary from the important sentences and phrases of a text, identified by word frequency and by how representative each sentence, phrase, or word is. Its drawbacks include loss of coherence and the inability to generate new information, so it is best suited to simple, uncomplicated, informal text. Extractive approaches include graph-based methods, sentence scoring, and clustering-based methods, but these cannot capture the semantic relationships between sentences. Abstractive summarization, on the other hand, paraphrases the text by creating new sentences, which addresses the incoherence of extractive summaries. It has its own challenges, however, such as the difficulty of preserving the meaning of the source text, and so it needs more advanced natural language processing techniques. Abstractive techniques may use rule-based, machine learning-based, or deep learning-based methods. Deep learning is the most widely used because it can capture complicated semantic relationships between words as well as sentences, but it requires a sizable amount of data and computational resources. The paper is a good base from which to research more complex and specialized applications of these techniques and models.
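
To make the frequency-based sentence scoring described above concrete, the following is a minimal sketch in Python. It is not the method of the surveyed paper: the stop-word list, the naive sentence splitter, and the function names are assumptions made only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "that", "on", "for"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]

def extractive_summary(text, num_sentences=2):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return [sentences[i] for i in chosen]
```

Because the output consists only of sentences copied from the input, the sketch also shows the limitation noted above: it cannot rephrase or connect the selected sentences.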

Turning to abstractive summarization, "Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond" by Nallapati et al. generates a summary from the meaning of the text itself, rephrasing and shortening the input using deep learning models. The researchers used recurrent neural networks (RNNs), which can be applied to both abstractive and extractive summarization and process text sequentially, within a sequence-to-sequence model. The approach is borrowed from machine translation: an encoder maps the source text into fixed-length vectors, and a decoder turns those vectors into a summary built from the key information in the original text. Combining sequence-to-sequence modeling with RNNs can be seen as a successful method, but abstractive summarization can lead to some issues. The method is likely to work well for simple novels written in plain language, since words can be interchanged to make the text simpler and easier to understand. An input that is terminology-heavy or too complex, however, may yield an inaccurate summary, because some words and definitions are not interchangeable.

For a different approach, Acharya's “Extractive Text Summarization Using Machine Learning” focuses on machine learning techniques for extractive summarization, mainly supervised learning for identifying important words and phrases in an input text. The authors describe several techniques under this umbrella, such as decision trees, Support Vector Machines, and chi-square. They themselves select a combination of Support Vector Machines and chi-square, which is a feature selection technique. This method may produce acceptable F-scores, precision, and recall, but it has major issues. First, it requires a large amount of training data; if the training data is not of sufficient quality, the model may not handle specific or unusual texts that were not covered in training. Another issue is that the approach does not take the semantics of the text into account.

"Get To The Point: Summarization with Pointer-Generator Networks" by See et al. also focuses on abstractive summarization but proposes the use of pointer-generator networks. The paper's main goal is to generate a coherent and informative summary that captures the main idea of the source text. To that end, the authors propose a neural network that incorporates both pointing and generation. Pointing allows the model to copy words from the original input; generation allows it to produce new words that do not appear in the input. The proposed network is based on a sequence-to-sequence model with an encoder and a decoder: the encoder takes the input and produces states that the decoder uses to generate a summary. The paper adds two components to this baseline: a pointer mechanism that copies words from the source, and a coverage mechanism that discourages the model from repeatedly attending to parts of the input it has already covered, which reduces repetition. The evaluation compares ROUGE scores against baseline models and shows an improvement, indicating the potential of the approach for improving automatic summarization tools.
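
The interplay of pointing and generating can be illustrated with a small numeric sketch. In the pointer-generator formulation, the probability of a word is a mixture: a generation probability p_gen weights the decoder's vocabulary distribution, and (1 - p_gen) weights the attention mass on every source position holding that word. The function name and the toy numbers below are assumptions chosen for illustration, not values from the paper.

```python
def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copying into one distribution over words.

    p_gen         -- probability of generating from the vocabulary (0..1)
    vocab_dist    -- dict word -> probability from the decoder's vocabulary softmax
    attention     -- list of attention weights, one per source token (sums to 1)
    source_tokens -- list of source words aligned with `attention`
    """
    mixed = {w: p_gen * p for w, p in vocab_dist.items()}
    for token, weight in zip(source_tokens, attention):
        # Copy probability accumulates over every position where the word occurs,
        # so out-of-vocabulary source words can still be produced.
        mixed[token] = mixed.get(token, 0.0) + (1.0 - p_gen) * weight
    return mixed

# Toy numbers, chosen only to make the arithmetic easy to follow.
dist = final_distribution(
    p_gen=0.6,
    vocab_dist={"the": 0.5, "book": 0.5},
    attention=[0.25, 0.75],
    source_tokens=["book", "Lahore"],
)
```

In this toy case "Lahore" is absent from the vocabulary yet still receives probability through copying, which is the property that makes the mechanism attractive for books full of names and technical terms.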

The paper by Saif Mohammad and Bonnie J. Dorr, "A Hybrid Approach to Automatic Summarization of News Articles," differs from the others in that it implements hybrid summarization, combining extractive and abstractive techniques to generate a summary. It does so in two carefully constructed stages. First, the most important sentences and words are extracted from the input text using a graph-based algorithm, which is a form of extractive summarization. Next, a sequence-to-sequence model, of the kind usually used in abstractive summarization, generates the summary from the extracted sentences. The authors compared their results against purely abstractive and purely extractive summarization using ROUGE, and the hybrid approach outperformed both in terms of ROUGE score. A human survey further showed that hybrid summarization resulted in a more readable and informative summary of the source text.
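
The two-stage structure described above (graph-based extraction followed by abstractive rewriting) can be sketched as follows. This is only an illustration under stated assumptions: the Jaccard word overlap, the PageRank-style iteration, and all function names are our own choices, and `placeholder_rewrite` is a hypothetical stand-in for a trained sequence-to-sequence model.

```python
import re

def tokens(sentence):
    """Set of lowercase word tokens in a sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def similarity(a, b):
    """Jaccard word overlap between two sentences (a stand-in for richer measures)."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def rank_sentences(sentences, damping=0.85, iterations=50):
    """PageRank-style scores over a sentence-similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives score from its neighbours in proportion to edge weight.
        scores = [
            (1 - damping) / n + damping * sum(
                scores[j] * weights[j][i] / out_sum[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores

def placeholder_rewrite(extracted):
    """Hypothetical stand-in for a trained sequence-to-sequence model."""
    return " ".join(extracted)

def hybrid_summary(sentences, rewrite, top_k=2):
    """Stage 1: extract the top_k sentences by graph rank. Stage 2: rewrite them."""
    scores = rank_sentences(sentences)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_k]
    return rewrite([sentences[i] for i in sorted(best)])
```

Passing the rewriting step in as a function keeps the extractive stage testable on its own and lets a real abstractive model be swapped in later without changing the pipeline.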

In "Comparative Study of Text Summarization Methods," Munot and Govilkar evaluate existing summarization techniques, including statistical and machine learning-based approaches, graph-based methods, and hybrid techniques, and highlight the importance of domain-specific summarization. The comparative study used six summarization techniques: KLSum, TextRank, LexRank, Luhn's algorithm, Latent Semantic Analysis (LSA), and Naive Bayes (NB). It was based on a dataset of 10 documents with a total of 5,554 sentences, and the evaluation was performed using ROUGE metrics. The results showed that TextRank and LexRank performed best on this dataset, while LSA and NB performed below expectations. One of the strengths of this paper is that it provides a comprehensive overview of the different text summarization methods and compares them in a detailed and systematic manner. The authors also include a critical analysis of the existing literature on text summarization, which can be valuable for researchers and practitioners working in this field.

Summarizing large amounts of written material has been the subject of much research in the field of natural language processing (NLP). Techniques for automatically summarizing lengthy texts into shorter summaries have been developed using extractive and abstractive summarization. Abstractive summarization creates new sentences that capture the essence of the original text, whereas extractive summarization selects and condenses existing sentences from the text.

However, each of these methods on its own can lack coherence or accuracy, so hybrid summarization methods have been developed to overcome these drawbacks. Hybrid summarization combines extractive and abstractive strategies to produce more accurate and coherent summaries. This approach has been shown to outperform both extractive and abstractive techniques on various metrics (Rush et al., 2015).

In conclusion, automated summarization has come a long way in recent years, with advancements in both extractive and abstractive techniques. Extractive summarization has its limitations but is well suited to simple and informal texts, while abstractive summarization can generate more coherent and informative summaries but requires more advanced natural language processing techniques. Deep learning-based methods are the most popular in abstractive summarization, but they require a large amount of data and computational resources. Pointer-generator networks and hybrid summarization are among the newer approaches that have shown promise in generating more efficient and effective summaries. Despite this progress, challenges remain, chief among them the difficulty of preserving the meaning of the source text.

After the literature review, the main question that came to mind was: "Can a hybrid model for book summarization and presentation generation outperform existing methods in terms of accuracy, efficiency, and user satisfaction?"

Based on this, we formed our hypothesis: "The hybrid model for book summarization and presentation generation will outperform existing methods in terms of accuracy, efficiency, and user satisfaction."

4. PROJECT OVERVIEW/GOAL

"From book to presentation generation using hybrid summarization" aims to create a system that automates the process of summarizing a book and producing a PowerPoint presentation using hybrid summarization methods. The proposed system aims to provide a valuable tool for teachers and professionals in different fields, since making a PowerPoint presentation from a book can be tedious and challenging.

To produce summaries that are more precise and coherent, hybrid summarization combines extractive and abstractive approaches. This approach has been shown to outperform both extractive and abstractive techniques on various metrics and has been applied in different applications, including document summarization and text-to-speech conversion. However, to our knowledge, the use of hybrid summarization methods to generate PowerPoint presentations from a book has not yet been the subject of research.

As a result, the proposed research aims to fill this void by creating a system that can automate the process of summarizing books and creating PowerPoint presentations using hybrid summarization techniques. The system will be compared to existing summarization tools, and its accuracy, coherence, and efficiency will be evaluated. The study will also investigate the tool's potential uses and advantages for educators and professionals in various fields, as well as possible extensions and enhancements, such as the incorporation of multimedia content or interactive elements.

In general, the research that has been proposed has the potential to offer professionals and educators a

useful tool for making book information in PowerPoint presentations more accessible and effective.
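
As a rough illustration of the final step of such a system, the sketch below turns per-chapter summary sentences into a slide outline. The `Slide` class, the `build_outline` function, the bullet limit, and the "(cont. N)" title convention are all hypothetical choices for this example; the proposal does not fix a data model.

```python
from dataclasses import dataclass, field

@dataclass
class Slide:
    """One slide of the outline: a title plus a short list of bullet points."""
    title: str
    bullets: list = field(default_factory=list)

def build_outline(chapter_summaries, max_bullets=4):
    """Group summary sentences into slides.

    chapter_summaries -- dict mapping chapter title -> list of summary sentences
    max_bullets       -- assumed readability limit on bullets per slide
    """
    slides = []
    for title, sentences in chapter_summaries.items():
        for start in range(0, len(sentences), max_bullets):
            part = start // max_bullets + 1
            # Overflowing chapters continue on extra slides with a numbered title.
            slide_title = title if start == 0 else f"{title} (cont. {part})"
            slides.append(Slide(slide_title, sentences[start:start + max_bullets]))
    return slides
```

An outline in this form is independent of any file format, so a rendering library could later be used to write the actual presentation file.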

5. PROJECT DEVELOPMENT METHODOLOGY / ARCHITECTURE

Data Collection:
The collection of a dataset is the first step in this project. Two datasets will be used:

• The "SciSumm" dataset, which contains abstracts and summaries of scientific articles in the field of computational linguistics. The dataset includes both extractive and abstractive summaries and is intended for evaluating the effectiveness of summarization techniques for scientific literature.

• The "ACLRD" (Academic Corpus for Longform Reading Comprehension Dataset) dataset, which contains academic articles and books from various domains, such as computer science, economics, and law. The dataset includes the full text of the articles and book chapters as well as human-generated summaries.

Preprocessing:

Once the dataset is collected, it will be preprocessed before training the models. Preprocessing involves standardizing the format of the text. The hybrid model will then be trained and tested on this dataset by building a sequence-to-sequence model in a deep learning framework such as TensorFlow, Keras, or PyTorch.
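
What "standardizing the format" might involve can be sketched as below. The specific steps (rejoining words hyphenated across line breaks, dropping lines that contain only a page number, collapsing whitespace, and splitting into fixed-size word chunks for a model with limited input length) are assumptions for illustration, as are the function names and the default chunk size.

```python
import re

def clean_text(raw):
    """Standardize raw book text extracted from a file."""
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"-\n(?=[a-z])", "", raw)
    # Drop lines that contain nothing but a page number.
    lines = [ln for ln in text.splitlines() if not re.fullmatch(r"\s*\d+\s*", ln)]
    # Collapse all runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", " ".join(lines)).strip()

def chunk_words(text, max_words=400):
    """Split cleaned text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]
```

For example, `chunk_words(clean_text(raw_chapter))` would yield pieces that can be summarized one at a time and then combined.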

Evaluation Metrics:

• ROUGE (Recall-Oriented Understudy for Gisting Evaluation): measures the overlap between the generated summary and a reference summary.

• Human evaluation: a subjective evaluation in which human judges assess the quality of the generated summaries based on readability, accuracy, and coherence.

• F1 score: balances precision and recall. Precision is the percentage of the generated summary that is relevant, that is, also found in the reference summary. Recall is the percentage of the relevant information in the reference summary that also appears in the generated summary.

These evaluation metrics can be used individually or in combination to assess the quality of the generated summaries and the effectiveness of the presentation generation tool.
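
The overlap-based metrics above can be made concrete with a small ROUGE-1 style calculation over unigrams. This is a simplified sketch rather than the official ROUGE toolkit: the tokenizer is an assumption, and no stemming or stop-word handling is applied.

```python
import re
from collections import Counter

def unigrams(text):
    """Bag of lowercase word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def rouge_1(candidate, reference):
    """Unigram precision, recall, and F1 between a candidate and a reference summary."""
    cand, ref = unigrams(candidate), unigrams(reference)
    # Clipped matches: a word counts at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_1(
    "the book explains hybrid summarization",
    "the book describes hybrid summarization methods",
)
```

Here four of the five candidate words appear among the six reference words, so precision is 4/5, recall is 4/6, and F1 is 8/11 (about 0.73).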

Figure 1: System Block Level Diagram

6. PROJECT MILESTONES AND DELIVERABLES

7. WORK DIVISION

Samiha Azeem               Aleezae Zia                Fateh Ali Alim
-------------------------  -------------------------  -------------------------
Proposal                   Proposal                   Proposal
Documentation (SRS)        Documentation (SRS)        Documentation (SRS)
Dataset Generation         Dataset Generation         Dataset Generation
Training models            Training models            Training models
Testing                    Testing                    Testing
Updating Research Paper    Updating Research Paper    Updating Research Paper
Web app (frontend)         Web app (frontend)         Web app (frontend)
Web app (backend)          Web app (backend)          Web app (backend)

8. COSTING

We will use online resources to conduct the research at this time, so no investment is required;

however, when we build the website in the future, we may require additional funds for the domain.

REFERENCES

1. Allahyari, M., Pouriyeh, S., Assefi, M., et al. "Text Summarization Techniques: A Brief Survey."

2. Nallapati et al. "Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond."

3. Acharya. "Extractive Text Summarization Using Machine Learning."

4. See et al. "Get To The Point: Summarization with Pointer-Generator Networks."

5. Mohammad, S., & Dorr, B. J. "A Hybrid Approach to Automatic Summarization of News Articles."

6. Munot and Govilkar. "Comparative Study of Text Summarization Methods."

7. Rush, A. M., Chopra, S., & Weston, J. (2015). A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685. https://arxiv.org/abs/1509.00685

8. Kirmani, M., Manzoor Hakak, N., Mohd, M., & Mohd, M. (2019). Hybrid text summarization: a survey. In Soft Computing: Theories and Applications: Proceedings of SoCTA 2017 (pp. 63-73). Springer Singapore.

9. El-Kassas, W. S., Salama, C. R., Rafea, A. A., & Mohamed, H. K. (2021). Automatic text summarization: A comprehensive survey. Expert Systems with Applications, 165, 113679.

10. Thione, G. L., Van den Berg, M., Polanyi, L., & Culy, C. (2004, July). Hybrid text summarization: Combining external relevance measures with structural analysis. In Text Summarization Branches Out (pp. 51-55).
