Summasphere: Efficient Text Summarization
Team ID : C241-PS079
Team Member :
Executive Summary/Abstract:
In today’s digital era, information overload and lengthy documents make it difficult to
process the information needed for critical decisions across business, research, and other
economic domains. This can result in missed opportunities, hampered innovation, and
decreased productivity.
Summasphere offers a cost-effective solution while excelling in usability with a simple and
elegant user interface. Compared to other solutions that might be complex or expensive,
Summasphere provides easier and more efficient access to sophisticated natural language
processing technology, making it a compelling choice for users looking for efficiency
without sacrificing quality.
Project Plan
Product-based Capstone Project
How did your team come up with this project?
Our team initiated this project after reviewing data from McKinsey & Company showing that
professionals spend about 20% of their work time gathering information. Additionally, a
2018 IDC study found that data professionals lose 50% of their time every week: 30%
searching for, governing, and preparing data, plus 20% duplicating work. IDC reports that
90% of the data generated by organizations is unstructured, presenting a missed
opportunity to extract potentially valuable information. Enterprises also often face an
overwhelming volume of documents, reports, and other content daily, with a quarter of
respondents in an IDC report noting that data volume is outpacing their ability to utilize,
process, and manage it. This information overload is a natural consequence of the modern
organization’s constant communication approach, which often leads to employee
disengagement and suboptimal decision-making. In life sciences, scientists also often
encounter a wealth of scientific papers, research articles, and experimental reports. This
overload of information can be overwhelming and time-consuming, making it challenging
to look through complex scientific literature to identify breakthroughs, novel
methodologies, and key findings.
Based on a problem analysis using the Five Whys method and on the research data above, it
is evident that there is an urgent need for an effective summarization tool to speed
up decision-making. In today's fast-paced environment, delivering timely and
effective solutions is crucial for success. Every second counts, and the ability to quickly
extract key information from lengthy documents can significantly impact productivity and
competitiveness.
Currently, tools like ChatGPT offer text summarization capabilities, but they lack
personalization and clear use cases. We saw an opportunity to apply LLM technology
to one specific goal: text summarization. The market niche we target is the
information acceleration engineering market, where faster and more accurate
summarization is directly relevant. Generative AI, and text summarization in
particular, is predicted to keep growing rapidly until 2030, as evidenced by surveys
from explodingtopics.com, which makes the need to address this challenge even more
apparent.
Our project, Summasphere, aims to meet this urgent need by leveraging advanced
abstractive modeling techniques to provide concise summaries of complex documents.
Unlike extractive methods, Summasphere comprehends the underlying meaning of the text
and generates summaries using new phrases and sentences, akin to human summarization.
This approach not only saves time but also enhances comprehension and decision-making
so that users can access essential information without being overwhelmed by irrelevant
details. This aligns with the theme of "Economics Empowerment: Navigating Sustainable
Economies for All" by aiming to accelerate decision-making processes. By providing users
with concise summaries, we empower them to make informed decisions swiftly, ultimately
enhancing productivity and contributing to economic sustainability. By integrating
McKinsey's findings into our rationale, we emphasize the critical importance of our project
in today's digital era. Summasphere offers a timely and cost-effective solution to the
challenge of information overload, enabling users to make informed decisions swiftly and
effectively for enhanced productivity.
Project Scope & Deliverables:
Project Objectives:
1. Summasphere is an application that enables users to efficiently grasp the key points
of long, complex documents through concise and informative summaries, using
Large Language Models and Topic Modelling Algorithms.
2. It also uses Data Visualization techniques such as bubble charts and bar charts to
visualize key points and cluster them by topic. This helps users
understand the content of the document in more depth.
Project Scope:
1. Summasphere is accessible to all users, but our primary target users are academia,
researchers, and professionals.
2. This application will be deployed on a GPU-backed cloud server.
3. Summasphere will be developed for web and mobile (Android) platforms for user
interaction.
4. Summasphere's core feature supports English; multi-language support is being
developed as a premium feature.
Project Deliverables:
Week | Expected Result | Percentage
Mobile Development
● Building the medium and high fidelity design of mobile applications.
● Discussing the medium and high fidelity design of mobile application results.
Cloud Computing
● Discuss API Contract for the Summarizer module
● Implement the Business Logic to the APP for the Summarizer module
● Implement the Business Logic to the APP for the User and Authentication module
● Building the medium fidelity design of web application in Figma.
● Research for frameworks, libraries and UI components
Mobile Development
● Develop High Fidelity of UI Design from Figma to XML.
Cloud Computing
● Discuss API Contract for the Analyzer module
● Implement the Business Logic to the APP for the Analyzer module
● Implement the Business Logic to the APP for the History module
● Create basic login, register, summarizer, analyzer pages
● Discussing the read pdf format that will then be sent to the backend
Mobile Development
● UI (XML) refinement based on High Fidelity design.
● Develop mobile (Android) application and integration API to mobile application.
Cloud Computing
● Integration across multiple services by integrating API
Mobile Development
● Ensure mobile application performance including UI performance is seamless.
● Mobile application integration testing.
● Refinement based on testing feedback.
● Bug fixing and optimization.
Cloud Computing
● Testing the API while simultaneously fixing bugs that may appear during the development phase
● Services deployment and testing
● Bug Fixes and Optimization
● Doing testing to ensure the web application is running well
6 (June 9th - June 10th 2024) | Final User Testing & Prepare for Capstone Project Presentation | 100%
Project Schedule:
Here is our project schedule using the Gantt chart.
Access the full detailed chart: Timeline Bangkit Capstone Minerva – FigJam (figma.com)
Based on your team’s knowledge, what tools/IDE/Library and resources will your
team use to solve the problem?
1. Machine Learning
a. TensorFlow
TensorFlow is a free and open-source software library for machine learning
and artificial intelligence. It can be used across a range of tasks but has a
particular focus on training and inference of deep neural networks.
b. Keras-NLP
KerasNLP is a natural language processing library that works natively with
TensorFlow, JAX, or PyTorch. Built on Keras 3, these models, layers, metrics,
and tokenizers can be trained and serialized in any framework and re-used in
another without costly migrations.
c. Gemma-2B (Google)
Gemma is a family of lightweight, state-of-the-art open models from
Google, built from the same research and technology used to create the
Gemini models. They are text-to-text, decoder-only large language models,
available in English, with open weights, pre-trained variants, and
instruction-tuned variants.
d. HuggingFace
Hugging Face is an open-source platform that provides tools and resources
for working on natural language processing (NLP) and computer vision
projects. The platform offers model hosting, tokenizers, machine learning
applications, datasets, and educational materials for training and
implementing AI models
e. FastAPI
FastAPI is a modern, high-performance web framework for building
APIs with Python based on standard Python type hints. Its key features are
high performance, fast development, fewer bugs, and an intuitive, concise, and robust design.
f. Gensim
Gensim is an open-source library for unsupervised topic modeling,
document indexing, retrieval by similarity, and other natural language
processing functionalities, using modern statistical machine learning.
Gensim is implemented in Python and Cython for performance
2. Cloud Computing
a. NodeJS (Runtime)
Node.js is a cross-platform, open-source JavaScript runtime environment
that can run on Windows, Linux, Unix, macOS, and more. Node.js runs on the
V8 JavaScript engine and executes JavaScript code outside a web browser.
b. Backend
i. NestJS (Backend Framework)
Nest is a framework for building efficient, scalable Node.js
server-side applications. It uses modern JavaScript, is built with
TypeScript (preserves compatibility with pure JavaScript), and
combines elements of OOP (Object Oriented Programming), FP
(Functional Programming), and FRP (Functional Reactive
Programming).
ii. PostgreSQL (Database Engine)
PostgreSQL is a free and open-source relational database
management system (RDBMS) emphasizing extensibility and SQL
compliance. PostgreSQL features transactions with atomicity,
consistency, isolation, durability (ACID) properties, automatically
updatable views, materialized views, triggers, foreign keys, and
stored procedures.
iii. Prisma (ORM)
Prisma is an open-source database toolkit that helps developers build
modern, type-safe database applications fast. It replaces traditional
ORMs (Object-Relational Mapping) and provides a more efficient way
to work with databases in our application.
c. Frontend
i. ReactJS (JavaScript Library)
React JS is a JavaScript library used to build user interfaces (UI) in
web applications. React allows developers to create UI components
that can be changed on the fly when the data inside them changes.
ii. Tailwind (CSS Framework)
Tailwind CSS is a CSS framework designed to make it easier and
faster to create applications using custom designs. Tailwind does not
offer specific styles and templates. Instead, Tailwind offers
opinionated building blocks known as utility classes to help style
website components.
iii. React-router-dom
React Router DOM is a routing library for web applications built
with React. It enables fast client-side navigation between views in a
Single Page Application (SPA) without full page reloads.
iv. React-markdown
React-Markdown is a library for rendering Markdown content as
React components. This library provides a simple and flexible way to
display formatted text, lists, code blocks, images, links, and other
Markdown elements within your React UI.
v. React-spinners
React Spinners is a component library used in web development with
React JS to display loading indicators (spinners) indicating that the
application is loading or processing a task. The React Spinners library
provides various types of spinners that can be customized to match
the style and design needs of the application, including circle
spinners, bar spinners, or even custom animations.
vi. Headless UI
Headless UI is a set of completely unstyled, fully accessible UI components
designed to integrate with Tailwind CSS.
d. Hoppscotch / Postman (API Testing Tools)
Hoppscotch and Postman are applications for testing web APIs.
e. Axios ( HTTP Client )
Axios is a promise-based HTTP Client for node.js and the browser.
f. Nodemon ( Monitor Script )
nodemon is a tool that helps develop Node.js-based applications by
automatically restarting the node application when file changes in the
directory are detected.
g. Google Cloud Platform
i. CPU Compute Engine
Backend and Frontend parts of our application will be deployed on
the CPU Compute Engine. We chose the CPU Compute Engine
because it is easily modified to our needs later and also suits the
need for CPU calculation of our NodeJS web parts.
ii. GPU Compute Engine
The Machine Learning part of our application will be deployed on
GPU Compute Engine. We chose the GPU Compute Engine because
it is easily modified to our needs later and also suits the need for GPU
for our Machine Learning model.
iii. Cloud SQL
For the database, we decided to use Cloud SQL instead of running it on a
Compute Engine instance, because we do not need to modify the
database engine and can use regular PostgreSQL.
3. Mobile Development
a. Android Studio
Android Studio is the official Integrated Development Environment (IDE) for
Android app development. It is based on the code editor and
developer tools of IntelliJ IDEA and is used to develop Android applications with
Java, Kotlin, or Flutter (Dart).
b. Figma
Figma is a web-based design and prototyping tool for digital projects that
allows users to collaborate on graphics editing and user interface design.
c. Kotlin
Kotlin is a modern statically typed programming language used by over 60%
of professional Android developers that helps boost productivity, developer
satisfaction, and code safety.
d. Retrofit
Retrofit is a type-safe HTTP client library for sending network requests
from Android applications. It provides a convenient way to
communicate with RESTful web services.
e. Postman
Postman is a platform used to test, manage, and develop APIs (Application
Programming Interface). It allows developers to make HTTP requests to
various API endpoints, organize and store requests and responses, and test
overall API functionality.
f. Room
Room is a persistence library that provides an abstraction layer over SQLite
to allow for more robust database access while harnessing the full power of
SQLite. It simplifies database work and, by default, does not allow database queries
on the main thread, which avoids blocking the UI and improves app
performance. Room also supports LiveData and RxJava for data observation,
making it a robust choice for data storage in Android applications.
g. DataStore
DataStore is a data storage solution that serves as a replacement for
SharedPreferences in Android development. Developed by Google, it offers
two implementations: Preferences DataStore, which is similar to
SharedPreferences and suitable for storing small key-value data sets, and Proto
DataStore, which stores typed objects using protocol buffers and provides
type safety. Both offer transactional consistency and are fully
asynchronous to prevent UI blocking, unlike SharedPreferences.
h. Transition
Transition is a framework that allows developers to create smooth and
animated transitions between different UI states or elements within their
Android applications. It helps enhance the user experience by adding visual
polish and making interface changes more engaging and intuitive.
Based on your knowledge and explorations, what will your team need support for?
1. Machine Learning
Assistance in assessing model quality and improving model performance, especially
for the Topic Modelling Algorithm.
2. Cloud Computing
Guidance on selecting suitable services for our software
deployment architecture.
3. Mobile Development
Assistance in optimizing the performance of our mobile (Android) app, particularly
in handling large text documents and generating summaries efficiently.
Based on your knowledge and explorations, tell us the Machine Learning Part of your
Capstone!
Our team will be using the TensorFlow framework along with several third-party libraries
such as HuggingFace, FastAPI, Langchain, and Gensim for our features. We will utilize
multiple models for text summarization, namely BART and Gemma-2B. We are performing
fine-tuning on BART using high-quality datasets to ensure that the summaries it generates
are of the best quality. However, we have restricted the BART model to only handle text in
English. This will serve as the basic feature of Summasphere, which can be enhanced with
Gemma-2B as a premium feature.
Given that Gemma-2B already performs exceptionally well in text summarization, we only
need to fine-tune it using a few custom datasets and some synthetic datasets generated
using current state-of-the-art Large Language Models. To create high-quality datasets, we
are applying few-shot prompting to LLMs, providing examples of effective text
summarization in the prompt. With these prompts, we will create between 1000 and 3000 synthetic
examples for fine-tuning.
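The few-shot prompting step described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual prompt: the `Example` class, the `build_few_shot_prompt` function, and the instruction wording are all hypothetical, and the call to the LLM itself is left out.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One demonstration pair shown to the LLM (hypothetical structure)."""
    document: str
    summary: str


def build_few_shot_prompt(examples: list[Example], new_document: str) -> str:
    """Assemble a few-shot prompt: instruction, worked examples, then the new input."""
    # Illustrative instruction text (assumption); the real wording may differ.
    parts = ["Summarize each document in a few concise sentences.\n"]
    for i, ex in enumerate(examples, start=1):
        parts.append(f"Example {i}\nDocument: {ex.document}\nSummary: {ex.summary}\n")
    # The final block leaves the summary empty so the LLM completes it.
    parts.append(f"Document: {new_document}\nSummary:")
    return "\n".join(parts)
```

Each generated completion, paired with its source document, would then become one synthetic training example after a quality check.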
Additionally, we are applying topic modeling using the Latent Dirichlet Allocation algorithm
to cluster words into the most relevant topics. The expected output of this feature is that
users can understand the main subjects discussed in a document and identify the most
prominent words within a topic. These results will be supported with data visualizations
such as bar charts (to show the most frequent words) and bubble charts (to display topic
clusters and relevant words).
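As a rough sketch of the data behind the bar chart, the most frequent words in a document can be counted as below. This covers only the word-frequency step, not Latent Dirichlet Allocation itself, which is assumed to run separately (for example with Gensim). The `top_words` function and the tiny stop-word list are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def top_words(text: str, n: int = 10) -> list[tuple[str, int]]:
    """Return the n most frequent non-stop-words as (word, count) pairs."""
    # Lowercase and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)
```

The returned pairs map directly onto bar labels and bar heights in a charting library.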
These models will be run on the Google Cloud Platform using API endpoints provided by
FastAPI, and the output from each model request will be displayed on the application
interface.
Based on your knowledge and explorations, tell us the Mobile Development Part of
your capstone?
For the mobile development part of our capstone, we created a user-friendly interface
using a Figma design. This design served as a blueprint for creating an intuitive and visually
appealing layout. To bring real-time data into our mobile app, we integrated an API using
Retrofit. Retrofit helped us seamlessly connect our app to external data sources, ensuring
that users receive up-to-date information. Additionally, we utilized Postman for API testing,
ensuring reliability and accuracy in data retrieval. To manage local data storage efficiently,
we implemented Room, providing a robust database solution for our application.
Furthermore, for storing user preferences, we adopted DataStore to enhance data
persistence and retrieval. We also incorporated the Transition framework for smooth
screen transitions in our application. This combination of a thoughtful user interface,
seamless transitions and efficient data integration enhances the overall user experience,
making our app both visually pleasing and functionally robust.
Based on your team’s planning, is there any identifiable potential Risk or Issue related
to your project?
1. Machine Learning
One potential risk is model jailbreaking, which
may occur if a user gives irrelevant prompts or tries to divert the model
from its text summarization function. To mitigate this, we decided to require a minimum
input length of 500 words. This helps ensure that the model treats the input as a
document that must be summarized.
2. Cloud Computing
In cloud computing, the main risks include data security and privacy issues, which
can happen if there are system weaknesses or unauthorized access. Other
problems can be service interruptions or slow performance, often due to server
issues or too much traffic, which can delay the processing and access to data.
Managing costs can also be difficult if the use of resources isn't carefully watched,
possibly leading to unexpected high costs. To reduce these risks, it's important to
put strong security measures in place, keep a close eye on how the system is
performing, and use good strategies to manage costs effectively.
3. Mobile Development
We may encounter issues related to UI responsiveness and the minimum SDK used
in our app. For instance, there are sometimes problems when the layout of the app
or page does not fit properly on the user's device screen, making the app
uncomfortable to use. Also, if the app does not run on older versions of Android,
users are forced to upgrade their Android version to use the app. Furthermore, bugs
and API performance issues often arise once the app is launched, leading to user
dissatisfaction and potentially causing users to switch to competitor apps. We
will do our best to minimize these problems. Additionally, when calling APIs,
especially those that send and receive images for the LDA and
WordCloud features, poor internet connections can prevent images from loading properly. Therefore,
it is crucial to optimize the APIs, particularly for image handling, to ensure efficient
transmission and reception of images.
4. Resource Constraints
Meeting the project deadlines and staying within the budget can be challenging.
Resource constraints, whether related to personnel or technology, could affect the
project's progress.
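The 500-word minimum described under the Machine Learning risk above could be enforced with a simple check before text reaches the model. This is a minimal sketch under assumptions: the function names and the error message are hypothetical, and words are counted by splitting on whitespace.

```python
MIN_WORDS = 500  # threshold stated in the plan


def count_words(text: str) -> int:
    """Count whitespace-separated words (a simplifying assumption)."""
    return len(text.split())


def validate_input(text: str, min_words: int = MIN_WORDS) -> tuple[bool, str]:
    """Return (is_valid, message) for a candidate document."""
    n = count_words(text)
    if n < min_words:
        return False, f"Input has {n} words; at least {min_words} are required."
    return True, "ok"
```

A length check alone does not prevent every jailbreak attempt, so it would likely be combined with other safeguards.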