3 - Round The Clock Virtual Friend - Report
on
BACHELOR OF TECHNOLOGY
DEGREE
Session 2021-22
in
AFFILIATED TO
DR. A.P.J. ABDUL KALAM TECHNICAL UNIVERSITY, U.P.,
LUCKNOW
(Formerly UPTU)
Project Report
STUDENT’S DECLARATION
We hereby declare that the work being presented in this report entitled “Round the Clock
Virtual Friend: A Python based AI assistant for desktop” is an authentic record of our own
work carried out under the supervision of Ms. Shanu Sharma.
The matter embodied in this report has not been submitted by us for the award of any other degree.
Dated:
Ayush Yati
Gulshan Sharma
Harshit Choudhary
Computer Science & Engineering
This is to certify that the above statement made by the candidates is correct to the best of my
knowledge.
CERTIFICATE
This is to certify that the Project Report entitled “Round the Clock Virtual Friend:
A Python based AI assistant for desktop” which is submitted by Ayush Yati, Gulshan Sharma
and Harshit Choudhary in partial fulfillment of the requirement for the award of degree B.
Tech. in Computer Science & Engineering of Dr. A.P.J. Abdul Kalam Technical University,
formerly Uttar Pradesh Technical University is a record of the candidates’ own work carried
out by them under my supervision. The matter embodied in this thesis is original and has not
been submitted for the award of any other degree.
ACKNOWLEDGEMENT
It gives us a great sense of pleasure to present the report of the B. Tech Project undertaken
during B. Tech. Final Year. We owe a special debt of gratitude to Ms. Shanu Sharma,
Department of Computer Science & Engineering, ABESEC Ghaziabad for her constant support
and guidance throughout the course of our work. Her sincerity, thoroughness and
perseverance have been a constant source of inspiration for us. It is only through her cognizant
efforts that our endeavors have seen the light of day.
We also take the opportunity to acknowledge the contribution of Prof. (Dr.) Divya Mishra,
Head of the Department, Computer Science & Engineering, ABESEC Ghaziabad for her full
support and assistance during the development of the project.
We also would not like to miss the opportunity to acknowledge the contribution of all faculty
members of the department for their kind assistance and cooperation during the development
of our project. Last but not least, we acknowledge our friends for their contribution to the
completion of the project.
ABSTRACT
TABLE OF CONTENTS
DECLARATION
CERTIFICATE
ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
CHAPTER 1 Introduction
1.1 Problem Introduction
1.2 Related previous work
1.3 Applicability
1.4 Organization of report
CHAPTER 2 Requirement Specification
2.1 Technology Survey
2.2 Requirement Analysis
2.3 Requirement Specification
2.4 Feasibility Study
2.5 Hardware Requirements
2.6 Software Requirements
2.7 Use Case Model
CHAPTER 3 SYSTEM DESIGN
3.1 ER Diagram
3.2 Activity Diagram
3.3 Class Diagram
3.4 Use Case Diagram
3.5 Sequence Diagrams
3.6 Data Flow Diagrams
3.7 Component Diagram
3.8 Deployment Diagram
3.9 Test Case Design
CHAPTER 4 Implementation
4.1 Real Life Applications
4.2 Data Implementation and Program Execution
4.2.1 Libraries and Packages
4.2.2 Functions
4.3 Results
4.4 System Testing
CHAPTER 5 Conclusion
5.1 Limitations
5.2 Scope for Future Work
REFERENCES
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
AI applications that provide natural, humanlike interaction with machines (through voice,
gesture, facial expression, eye movement, etc.) are rapidly gaining popularity. One of the most
studied and popular modes of interaction is based on the machine's comprehension of natural
language. To become the user's best personalized assistant, the computer, rather than the
person, learns how to interact by observing the user's actions, patterns, and behaviour. People
have been working on creating and improving personalized assistants for a long time. These
systems are constantly evolving, and they have now firmly established themselves in a range
of mobile devices and gadgets.
Some of the most popular assistants available right now are Google Assistant, Siri from Apple,
Samsung Bixby, and Microsoft Cortana. This project outlines the architecture and design of
a voice assistant.
It includes a work plan as well as the approach for working with a voice assistant. It
also explains the voice assistant's test findings. The main goal of this project is to develop a
local voice assistant that can perform human-like duties as well as chores that a human would
have to perform on a regular basis. It has a number of features, including the ability
to publish comments on social networking sites such as Facebook, Twitter, and Instagram
with just a few simple instructions. The user can also learn about the weather and climatic
conditions at their location. It can open and start web applications as well as items in the
user's local storage.
challenge for them. Even those who are blind can communicate with the computer by speaking
to it. It is also useful for people with memory loss.
In order to become a personal assistant, a machine learns to converse with people by learning
from the user's behaviour, preferences, and activities.
1.1.1. Motivation
This project was founded on the idea that on the internet, there is enough publicly available
data and knowledge to construct a virtual personal assistant that can be utilized by people with
disabilities, has increased security measures, and can be trained through speech. A virtual
assistant can be extremely beneficial to the elderly, the visually and physically challenged,
children, and others by ensuring that interacting with machines is no longer a challenge.
1.2. Related previous work
Each intelligent assistant developer follows his or her own development methodology, which
affects the final product. One assistant may produce higher-quality speech, while another
completes tasks more efficiently and with fewer clarifications and corrections. Others can do
only a limited number of jobs, but do them more precisely and according to the user's
demands. No universal assistant exists that can perform all tasks equally well. An
assistant's set of capabilities is determined by the field its developer chooses to focus on.
Because these systems rely on machine-learning algorithms that must be trained on massive
amounts of data acquired from a variety of sources, the source of this data is critical, whether it
comes from search engines, other ways of obtaining information, or social media platforms. The
quantity of information received from these sources shapes the assistant's character.
Despite the numerous learning methods, algorithms, and procedures available, the
process of designing such systems is largely the same. The most essential technologies are voice
activation, automatic speech recognition, text-to-speech, voice authentication, dialogue
management, natural language understanding, and named entity recognition.
On the Windows, Android, and iOS platforms, we have all heard of Cortana, Google Assistant,
Bixby, Siri, and a plethora of other virtual assistants that aid users with tasks. However, we
were surprised to find that no such complete virtual assistant is available for the core Windows
platforms older than Windows 10, which account for 70% of all users. This is a particular
problem for users in areas with internet instability, server problems, or no internet access
at all.
1.3. Applicability
The widespread use of artificial intelligence in consumers' daily lives is accelerating
the shift to speech. As the number of smart IoT devices such as smart ACs and smart speakers
increases, virtual assistants will become increasingly useful in the lives of consumers. The
most common way we see voice being used is through smart speakers. Many industry
analysts estimate that within the next five years, practically every application will use voice
technology in some fashion. Virtual assistants can also help to improve the Internet of Things
(IoT) ecosystem. In twenty years, Microsoft and its competitors are expected to offer personal
digital assistants that provide the services of a full-time employee, services often reserved
for the wealthy and famous.
1.4. Organization of report
This document is divided into five chapters. The first is an introduction that discusses the
project's motivation, goal, and scope, as well as related previous work. The second chapter is
the Software Requirement Specification (SRS): it surveys the technology employed and covers
requirement analysis, the feasibility study, the use case model, and the hardware and software
requirements. The third chapter focuses on system design and presents the design diagrams,
including several DFDs. The fourth chapter covers implementation. The conclusion is the last
chapter, followed by references and a plagiarism report.
CHAPTER 2
SOFTWARE REQUIREMENT SPECIFICATION
2.1.1. Python
Python is a high-level, interpreted programming language that supports Object-Oriented
Programming (OOP). It is a powerful, practical language for rapid application development
(RAD). Python makes writing and running programs much easier, and it often needs far less
code than other OOP languages to express the same logic.
Python is versatile and is not limited to a single purpose. As its popularity has grown, it
has become established in some of the most popular and sophisticated fields, such as
Artificial Intelligence (AI), Machine Learning (ML), Natural Language Processing (NLP), and
Data Science. Python has a wide range of libraries to fulfil the needs of each project.
Python is efficient enough for most purposes; in everyday situations, execution speed is
rarely a concern. If your Python code is too slow, one common technique for making it
faster is to figure out what takes the longest and rewrite only that part in a lower-level
language. Compared with writing everything in a low-level language, this results in
considerably less programming, and often in more efficient code, since you have more time to
optimize the parts that matter.
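As a minimal sketch of the "figure out what takes the longest" step, the standard-library cProfile module can rank functions by the time spent in them. The function names below are hypothetical and are not taken from the project code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately slow pure-Python loop (hypothetical hot spot)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_constant():
    """Cheap function, included only for contrast."""
    return 42


def workload():
    slow_sum_of_squares(200_000)
    fast_constant()


def profile_report(func):
    """Run func under the profiler and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive calls come first.
    stats.sort_stats("cumulative").print_stats(5)
    return stream.getvalue()


report = profile_report(workload)
print(report)
```

The function that dominates the report is the candidate for rewriting in a lower-level language; everything else can stay in Python.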
2.1.2. DBpedia
Knowledge bases are becoming more significant in enhancing the intelligence of Web and
corporate search, as well as in assisting data integration. DBpedia takes advantage of
Wikipedia's vast knowledge reservoir by extracting structured data from it and making that
data available on the internet. There are several advantages to using the DBpedia knowledge
base over other knowledge bases:
It covers a broad range of topics, is truly multilingual, represents real community consensus,
and changes automatically as Wikipedia evolves.
You may use the DBpedia knowledge base to ask Wikipedia queries such as "Give me a
list of all cities in California with a population of more than 30,000 people" or "Give
me a list of all twentieth-century American vocalists."
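The California question above could be expressed as a SPARQL query roughly as follows. This is only a sketch: it builds the query text and the request URL without sending anything, and the choice of the properties dbo:isPartOf and dbo:populationTotal, as well as the "format" request parameter, are our assumptions about how the data is modelled and served, not something verified against the project.

```python
from urllib.parse import urlencode

# Public DBpedia SPARQL endpoint (assumed to be reachable when actually used).
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_city_query(region="California", min_population=30000):
    """Return a SPARQL query for cities in a region above a population threshold."""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?city ?population WHERE {{
  ?city a dbo:City ;
        dbo:isPartOf dbr:{region} ;
        dbo:populationTotal ?population .
  FILTER (?population > {min_population})
}}
ORDER BY DESC(?population)
""".strip()


def build_request_url(query):
    """Encode the query as a GET request URL asking for JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return f"{DBPEDIA_ENDPOINT}?{urlencode(params)}"


query = build_city_query()
url = build_request_url(query)
print(query)
```

The resulting URL could then be fetched with any HTTP client; the response handling is omitted here because it depends on the endpoint's actual output.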
2.1.3. Quepy
Quepy is a Python framework that transforms natural language questions into database queries.
It is simple to customize for different kinds of natural language questions and database
queries, so you can build your own natural language database access solution with a little code.
2.1.4. Pyttsx
Pyttsx is short for Python Text-to-Speech. It is a cross-platform Python wrapper for
text-to-speech synthesis that drives the common text-to-speech engines on Mac OS X, Windows,
and Linux. Python versions 2.x and 3.x are supported. Its main advantage is that it can be
used without an internet connection.
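A minimal sketch of offline speech output, assuming the pyttsx3 package (the fork of Pyttsx commonly used with Python 3) is installed. The helper name speak and the RecordingEngine stand-in are hypothetical; the stand-in only exists so that the example can be run on a machine without audio output.

```python
def speak(text, engine=None, rate=None):
    """Speak text aloud and return the engine that was used."""
    if engine is None:
        # Third-party dependency, assumed installed via "pip install pyttsx3".
        import pyttsx3
        engine = pyttsx3.init()
    if rate is not None:
        engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.say(text)
    engine.runAndWait()  # blocks until the queued text has been spoken
    return engine


class RecordingEngine:
    """Stand-in with the same method names, used here instead of a real voice."""

    def __init__(self):
        self.spoken = []
        self.properties = {}

    def setProperty(self, name, value):
        self.properties[name] = value

    def say(self, text):
        self.spoken.append(text)

    def runAndWait(self):
        pass


fake = speak("Hello, I am Virtual Friend.", engine=RecordingEngine(), rate=170)
print(fake.spoken)
```

Calling speak("Hello") without an engine argument would use the real text-to-speech engine of the operating system.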
2.1.6. SQLite
SQLite is a powerful library that provides an in-process relational database for
storing small to medium-sized data collections. It provides the majority of SQL's
functionality, with a few exceptions. The sqlite3 module is part of the standard library in
most Python distributions, so most users will not need to install anything to get started with
SQLite. SQLite runs in the same process as the program, which allows SQLite to be extended
with Python code.
SQLite exposes several hooks, and the standard library database driver supports many of
them.
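The following sketch illustrates both points with the standard sqlite3 module: the database lives inside the Python process (here entirely in memory), and a Python function is registered so that SQL can call it. The notes table and the word_count function are hypothetical examples.

```python
import sqlite3

# ":memory:" creates a database that lives only inside this process.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [("buy milk",), ("call the bank tomorrow",)],
)


def word_count(text):
    """Plain Python function that SQLite will be able to call."""
    return len(text.split())


# Register the Python function under the SQL name word_count (one argument).
conn.create_function("word_count", 1, word_count)

rows = conn.execute(
    "SELECT body, word_count(body) FROM notes ORDER BY id"
).fetchall()
conn.close()
print(rows)
```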
In most circumstances, a user needs to manually handle a large number of programs to
complete a single task. A user planning a trip, for example, should look up airport codes for
nearby airports and then look up tickets between combinations of airports to get to their
destination. A system that can effortlessly manage chores is required.
There are already a number of virtual personal assistants available. However, we hardly ever
use them. A large number of people struggle with speech recognition. These systems are capable
of understanding English phrases, but they are unable to cope with our accent, because our
pronunciation differs significantly from the one they expect. They are also much more common
on mobile devices than on desktop PCs. What is needed is a virtual personal assistant that
understands an Indian accent and runs on a desktop system.
A virtual personal assistant fails to answer enquiries successfully when it lacks context or
does not understand the question's meaning. Only thorough tuning, involving
both human input and machine learning, can yield adequate results. Having sufficient quality
control procedures in place will also reduce the likelihood of the virtual personal assistant
developing undesirable behaviour. In order to function properly, such assistants
require a tremendous amount of data to be supplied to them.
Virtual assistants must be able to model complicated task relationships and make user-friendly
recommendations based on such models. When a task includes several sub-tasks, each with its
own sub-tasks, it should be evaluated to determine the best approach. In this situation, there
are several possibilities, and the system must be able to take into account the users' preferences,
other current chores, and priority when recommending a strategy.
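One simple way to picture nested tasks with priorities is a tree that is walked depth-first, most urgent sub-task first. This is only an illustrative sketch: the Task class, the convention that a lower number means higher urgency, and the hotel sub-task are our assumptions, and a real planner would also weigh user preferences and other pending chores.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    priority: int  # lower number = more urgent (assumed convention)
    subtasks: list = field(default_factory=list)


def plan(task):
    """Return the execution order: sub-tasks (most urgent first), then the task itself."""
    steps = []
    for sub in sorted(task.subtasks, key=lambda t: t.priority):
        steps.extend(plan(sub))
    steps.append(task.name)
    return steps


trip = Task("book trip", 1, [
    Task("reserve hotel", 2),
    Task("book flight", 1, [
        Task("compare fares", 2),
        Task("look up airport codes", 1),
    ]),
])

order = plan(trip)
print(order)
```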
Users may interact with programs without using their hands thanks to speech recognition
software, allowing them to use a speech interface to ask questions or provide commands to the
agent. Users may engage with the virtual assistant while doing other things, which increases
the system's value. It also features ubiquitous connectivity via Wi-Fi or a LAN connection,
allowing distributed apps to use other web APIs without having to store them locally.
2.4.2. Operational feasibility: It relates to the ease with which the proposed system can be
used. The system does not require any special expertise to operate. In fact, it is designed to be
used by almost everyone. Children who do not yet know how to write can speak their questions
to the system and obtain responses.
2.4.3. Economic feasibility: In this section, we weigh the total cost and benefit of the
proposed system against the current system. The major cost for this project is the cost of
documentation. The microphone and speakers would be the user's responsibility; they
are low-cost and easily accessible. The system will not be prohibitively expensive to maintain.
2.4.4. Project management and organizational structure: This section describes the
project's administration and organizational structure. This project was completed by a group of
people. All managerial responsibilities would be shared among the group members. This will
not cause any problems with management and will improve the project's feasibility.
2.4.5. Cultural feasibility: This factor considers the project's compatibility with the
surrounding culture. The virtual assistant is designed to fit in with the broader culture.
This project is technically feasible and does not necessitate any additional hardware. It is
easy to use and does not demand any prior knowledge. According to the overall feasibility
analysis, the planned system's goals are achievable.
2.7. USE CASE MODEL
CHAPTER 3
SYSTEM DESIGN
3.1. ER DIAGRAM
Fig 2: ER Diagram.
The entities and relationships of the virtual assistant system are illustrated in the diagram
above. A system user has access to keys and values, which may be used to keep track of any
kind of user data. The value for the key "username" may, for example, be "Gulshan." Some keys
may be kept secure by the user, who can enable a lock on them and set a password (a voice clip).
A single user is able to ask numerous questions. Each question is given a unique ID that
identifies the enquiry and its reply. A user may also be responsible for a significant number
of tasks. Each task has its own ID and a status indicating its current state.
A task must also have a priority value and a classification indicating whether it is a parent
or a child task of an earlier task.
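The entities described above could be mapped to SQLite tables roughly as follows. This is a sketch under our own assumptions: the table and column names are hypothetical, the voice-clip password is reduced to a simple locked flag, and the parent/child classification is modelled as an optional parent_id.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE user_data (
    user_id    INTEGER NOT NULL REFERENCES users(id),
    data_key   TEXT NOT NULL,
    data_value TEXT,
    locked     INTEGER NOT NULL DEFAULT 0,  -- 1 if protected by a password
    PRIMARY KEY (user_id, data_key)
);
CREATE TABLE questions (
    id       INTEGER PRIMARY KEY,           -- unique ID of question and answer
    user_id  INTEGER NOT NULL REFERENCES users(id),
    question TEXT NOT NULL,
    answer   TEXT
);
CREATE TABLE tasks (
    id          INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES users(id),
    parent_id   INTEGER REFERENCES tasks(id),  -- NULL for a top-level task
    description TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'pending',
    priority    INTEGER NOT NULL DEFAULT 0
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Gulshan')")
conn.execute(
    "INSERT INTO user_data (user_id, data_key, data_value) VALUES (1, 'username', 'Gulshan')"
)
conn.execute(
    "INSERT INTO tasks (id, user_id, description, priority) VALUES (1, 1, 'plan trip', 1)"
)
conn.execute(
    "INSERT INTO tasks (id, user_id, parent_id, description, priority) "
    "VALUES (2, 1, 1, 'book flight', 2)"
)

stored = conn.execute(
    "SELECT data_value, locked FROM user_data WHERE user_id = 1 AND data_key = 'username'"
).fetchone()
children = conn.execute(
    "SELECT description, status FROM tasks WHERE parent_id = 1"
).fetchall()
conn.close()
print(stored, children)
```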
3.2. ACTIVITY DIAGRAM
The system is initially in standby mode. It begins execution as soon as it receives a wake-up
call. It then determines whether the received command is a question to be answered or a task
to be completed, and takes the appropriate steps. After the question is answered or the task
is completed, the system waits for another command. This loop runs until a quit command is
issued, at which point the system goes back to sleep.
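The loop described above can be sketched as a small state machine that replays a list of already-recognised commands. The wake and quit phrases and the question-word heuristic in classify are assumptions made for illustration; they are not the project's actual logic.

```python
def classify(command):
    """Rough split between questions and tasks (assumed heuristic)."""
    question_words = ("what", "who", "when", "where", "why", "how")
    return "question" if command.lower().startswith(question_words) else "task"


def run_assistant(commands, wake_word="wake up", quit_word="quit"):
    """Replay recognised commands through the standby/active loop and log each step."""
    log = []
    awake = False
    for command in commands:
        text = command.strip().lower()
        if not awake:
            # In standby, everything except the wake-up call is ignored.
            if text == wake_word:
                awake = True
                log.append("awake")
            continue
        if text == quit_word:
            awake = False
            log.append("standby")
            continue
        log.append(f"{classify(text)}: {text}")
    return log


log = run_assistant([
    "open notepad",      # ignored: the system is still in standby
    "wake up",
    "What is Python?",
    "open notepad",
    "quit",
    "play some music",   # ignored: the system went back to sleep
])
print(log)
```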
3.3. CLASS DIAGRAM
The User class contains two audio attributes: a command and an answer. It has the ability
to listen for spoken commands. A command is then interpreted, and either an answer is
returned or the requested task is carried out. In the Question class the request is held as a
string, as produced by the Interpret class. Based on its type, the request is passed to the
default, about, or search functionality. The Task class likewise holds an interpreted command
in string format. It offers a number of features, including a reminder, a note, a mimic,
research, and a reader.
3.4. USE CASE DIAGRAM
There is only one user in this project. The computer system receives a command from
the user. The system then interprets it and returns the outcome. The user is notified of
the response.
3.5. SEQUENCE DIAGRAMS
3.5.1. Sequence Diagram (Query-Response)
The diagram above depicts how an answer to a user's question is retrieved from the
internet. The voice question is encoded and sent to the web scraper. The web scraper searches
for and finds the answer, which is then returned to the speaker, which delivers the response
to the user.
3.5.2. Sequence Diagram (Task Execution)
3.6. DATA FLOW DIAGRAMS
3.6.1. DFD Level 0 (Context Level Diagram)
3.6.3. DFD Level 2
3.7. COMPONENT DIAGRAM
The Virtual Assistant is the most important component here. It offers two distinct services:
task execution and query answering.
3.8 DEPLOYMENT DIAGRAM
The user interacts with the SQLite database through an SQLite connection in the Python program.
DBpedia is a knowledge base that can only be accessed via the internet, so a network
connection (wired LAN/Ethernet or WLAN) is required for it.
3.9. TEST CASE DESIGN
Case (i)
Name: Response Time
ID: T1
Priority: High
Objective: To ensure that the system replies as rapidly as feasible.
Description: Time is of the highest significance in a voice-based system. We don't type
inputs; instead, we speak them, so the system must also respond quickly. The user must
receive an immediate response to his or her enquiry.
Case (ii)
Name: Accuracy
ID: T2
Priority: High
Objective: To guarantee that the system's responses are accurate based on the data collected.
Description: The basic purpose of a virtual personal system is to deliver exact responses to
any inquiries asked. It's worthless to get a speedy response if it's incorrect. Precision is crucial
in a virtual assistant system.
Case (iii)
Name: Approximation
ID: T3
Priority: Moderate
Objective: To check estimated answers about calculations.
Description: When performing a mathematical calculation, it is sometimes necessary to make
an approximation. When a user asks for the value of PI, for example, the system must respond
with an approximation rather than the exact value. In such cases, getting an exact figure
isn't necessary.
Note: There may be a few additional test cases, and all these test cases might vary as the
software development progresses.
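Test cases T1 and T3 could be automated along the following lines. This is a sketch only: answer is a hypothetical stand-in for the assistant's query handler, and the one-second response budget and the 0.01 tolerance are assumed values, since the report does not fix them.

```python
import math
import time


def answer(query):
    """Hypothetical stand-in for the assistant's query handler."""
    if query == "value of pi":
        return round(math.pi, 2)
    raise ValueError(f"unsupported query: {query}")


def timed(func, *args):
    """Call func and return its result together with the elapsed time in seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


result, elapsed = timed(answer, "value of pi")

# T1 (Response Time): the reply must arrive within the assumed 1-second budget.
response_time_ok = elapsed < 1.0

# T3 (Approximation): the reply must be close to, not identical to, the true value.
approximation_ok = math.isclose(result, math.pi, abs_tol=0.01)

print(result, response_time_ok, approximation_ok)
```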
CHAPTER 4
IMPLEMENTATION
Round the Clock Virtual Friend is a voice assistant that can execute numerous common
desktop operations with a single voice command, such as playing music or opening your
favorite IDE. Virtual Friend is unique among virtual personal assistants in that it is designed
exclusively for the desktop and does not need the user to register an account. It also does not
need an internet connection for every task.
4.1.1. Saves time: Virtual Friend is a desktop voice assistant that works on the voice
commands given to it. It can perform voice searches and voice-activated device control, and
it lets us complete a set of tasks by voice.
4.1.2. Conversational interaction: It makes it easy to complete any activity because it does
so automatically, using Python's essential modules and libraries, in a conversational manner.
As a result, when users give it a task, they feel as if they are handing it to a human assistant,
because of the conversational way of giving input and receiving the desired output in the form
of a completed task.
4.1.3. Reactive nature: The desktop assistant is reactive, meaning it understands human
language and the context provided by the user, and responds in the same manner, i.e., in
human-comprehensible language, English. As a result, the user can make an informed and
intelligent decision.
4.1.4. Multitasking: Its key use case could be its multitasking ability. It can keep asking for
instructions until the user tells it to "QUIT."
4.1.5. No trigger phrase: It requests instructions and listens to the user's response without
requiring a trigger phrase, and only then carries out the task.
4.2. DATA IMPLEMENTATION AND PROGRAM EXECUTION
Install all of the essential packages and libraries first. The libraries are installed with
"pip install" and then loaded into the program with "import". The following are the required
packages:
4.2.2. FUNCTIONS
4.2.2.1. take_user_input(): This function takes the command as input from the user's
microphone and returns the result as a string.
4.2.2.2. greet_user(): This function greets the user according to the time of day, e.g.,
Good Morning, Good Afternoon, or Good Evening.
4.2.2.3. taskExecution(): This is the function that contains all of the essential task execution
definitions, such as sendEmail(), pdf reader(), and news(), along with numerous if conditions
for commands such as "open google", "search on Wikipedia", "open notepad", "play some music",
and "open discord".
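The behaviour of greet_user() and the if conditions of taskExecution() can be sketched as follows. This is not the project's actual code: the hour cut-offs, the task_execution name, the ACTIONS table, and the returned description strings are assumptions, and a real implementation would launch the application instead of returning text.

```python
from datetime import datetime


def greet_user(now=None):
    """Return a greeting based on the hour of the day (cut-offs are assumed)."""
    hour = (now or datetime.now()).hour
    if hour < 12:
        return "Good Morning"
    if hour < 17:
        return "Good Afternoon"
    return "Good Evening"


# Phrase -> description of the action; stands in for the chain of if conditions.
ACTIONS = {
    "open google": "opening Google",
    "search on wikipedia": "searching Wikipedia",
    "open notepad": "opening Notepad",
    "play some music": "playing music",
    "open discord": "opening Discord",
}


def task_execution(command):
    """Match the spoken command against known phrases and report the action taken."""
    text = command.lower()
    for phrase, action in ACTIONS.items():
        if phrase in text:
            return action
    return "Sorry, I did not understand that."


print(greet_user(datetime(2022, 1, 1, 9)))
print(task_execution("Please open Notepad"))
```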
4.3. RESULTS
Fig 16. Output for YouTube search.
Fig 18. Output to play music.
Fig 21. Input and Output to open Discord
Fig 23. Input and Output to Weather
4.4. SYSTEM TESTING
System testing is carried out on a fully integrated system to determine whether the
requirements have been met.
4.4.1. FUNCTIONALITY
This is where we test the system's functionality to see if it fulfils the purpose it was designed
for. Each function was checked and run; if it was able to complete the required task correctly,
the system passed that functionality test. To check whether Virtual Friend could
search on Google, for example, the user said "Open Google," and Virtual Friend replied, "What
do I search on Google?" The user then asked, "What is Python?" As indicated in the diagram
below, Virtual Friend then launched Google and searched for that query.
4.4.2. USABILITY
The usability of a system is determined by assessing the software's ease of use, as well as how
it responds to each query posed by the user. Virtual Friend makes it easy to complete any
activity because it does so automatically, using Python's essential modules and libraries, in a
conversational manner. As a result, when users give it a task, they feel as if they are handing
it to a human assistant, because of the conversational way of providing input and receiving the
desired output in the form of a completed task.
The virtual assistant is reactive, meaning it understands human language and the context
provided by the user, and responds in the same manner, i.e., in human-comprehensible language,
English. As a result, the user can make an informed and intelligent decision.
Its key use case could be its multitasking ability. It can keep asking for instructions until the
user tells it to "QUIT." It requests instructions and listens to the user's response without
requiring a trigger phrase, and only then carries out the task.
4.4.3. SECURITY
Security testing focuses on vulnerabilities and threats. Because Virtual Friend is a local
desktop application, there is no risk of a data breach through remote access. Because
the software is tied to a particular system, it is launched when the user logs in.
4.4.4. STABILITY
The output of a system determines its stability: if every bounded input produces a bounded,
well-defined output, the system is stable. Virtual Friend is therefore considered stable as
long as all of its functional components behave in this way.
CHAPTER 5
CONCLUSION
Without a doubt, Round the Clock Virtual Friend is a very useful voice assistant because it
saves the user’s time through conversational engagements, efficacy, and efficiency. However,
while carrying out the project, several limitations were discovered, as well as potential future
upgrade opportunities, which are listed below:
5.1. LIMITATIONS
5.1.1. Security is a concern: there is no voice command encryption in this project.
5.1.2. Background voices can interfere with recognition.
5.1.3. Accents can lead to misinterpretation and cause inaccurate results.
5.1.4. Virtual Friend cannot be invoked at any time by a wake phrase, whereas assistants
such as Google Assistant can be called just by saying, “Ok Google!”
REFERENCES
[2] Deller John R., Jr., Hansen John J.L., Proakis John G., Discrete-Time Processing of Speech
Signals, IEEE Press, ISBN 0-7803-5386-2.
[3] Proakis John G., Manolakis Dimitris G., Digital Signal Processing: Principles, Algorithms,
and Applications, Third Edition, Prentice Hall, New Jersey, 1996, ISBN 0-13-394338-9.
[4] Ashish Jain, Hohn Harris, Speaker Identification Using MFCC and HMM Based Techniques,
University of Florida, April 25, 2004.
[5] M. Kolss, D. Bernreuther, M. Paulik, S. Stücker, S. Vogel, and A. Waibel, “Open Domain
Speech Recognition & Translation: Lectures and Speeches,” in Proceedings of ICASSP.
[6] Klüwer, Tina. "From chatbots to dialog systems." Conversational Agents and Natural
Language Interaction: Techniques and Effective Practices. IGI Global, 2011.
[7] Moskvitch, Katia. "The machines that learned to listen". BBC.com. Retrieved 5 May 2020.
[8] "Developing Your Own Wake Word Engine Just Like 'Alexa' and 'OK Google'". GPU
Technology Conference. 17 July 2017.