Understanding Intelligence & AI
What is Intelligence?
Intelligence is a complex and multifaceted concept that has been studied and debated by
researchers for centuries. Broadly, intelligence is the ability to perceive or infer
information, and to retain it as knowledge to be applied towards adaptive behaviors
within an environment or context.
Traits of Intelligence
Intelligence involves several abilities, such as learning, reasoning, problem-solving, and decision-making.
Definition of Intelligence
We may define intelligence as:
"The ability to perceive or infer information, and to retain it as knowledge to be applied towards adaptive behaviors within an environment or context."
Page 1
Created by Turbolearn AI
Decision Making
Decision making is a critical aspect of intelligence, and it involves the ability to make choices
based on available information.
Common applications of AI include:

Virtual assistants
Image recognition
Natural language processing
Robotics
Self-driving cars
Conclusion
Intelligence is a complex and multifaceted concept that involves the ability to perceive or
infer information, and to retain it as knowledge to be applied towards adaptive behaviors
within an environment or context. Artificial Intelligence is the development of computer
systems that can perform tasks that typically require human intelligence, such as learning,
problem-solving, and decision-making.

Intelligence and Decision Making
Intelligence is the ability to perceive, understand, and act in the world. It involves reasoning,
planning, and learning.
Decision Making
Decision making is a crucial part of intelligence. It involves making choices based on
available information and experience.
Scenarios
Artificial Intelligence
Artificial Intelligence (AI) is the ability of a machine to mimic human traits, such as making
decisions, predicting the future, learning, and improving on its own.
Not all technologies are AI. To be considered AI, a machine must be trained with data and
be able to make decisions or predictions on its own.
Note: IoT (Internet of Things) and automation are not the same as AI.

Introduction to Artificial Intelligence
Artificial Intelligence (AI) is a vast domain that has been defined in various ways by different
organizations. Here are a few definitions:
NITI Aayog: AI refers to the ability of machines to perform cognitive tasks like
thinking, perceiving, learning, problem solving, and decision making.
World Economic Forum: AI is the software engine that drives the Fourth Industrial
Revolution.
European Artificial Intelligence (AI) leadership: AI is a cover term for techniques
associated with data analysis and pattern recognition.
Encyclopaedia Britannica: AI is the ability of a digital computer or computer-controlled
robot to perform tasks commonly associated with intelligent beings.
| Term | Definition |
| --- | --- |
| Artificial Intelligence (AI) | Refers to any technique that enables computers to mimic human intelligence. |
| Machine Learning (ML) | A subset of AI that enables machines to improve at tasks with experience (data). |
| Deep Learning (DL) | Enables software to train itself to perform tasks with vast amounts of data. |
Note that Machine Learning and Deep Learning are part of Artificial Intelligence, but not
everything that is Machine Learning will be Deep Learning.
AI Domains
AI models can be broadly categorized into three domains based on the type of data fed into
them:
Data Sciences: Related to data systems and processes, where the system collects
large amounts of data, maintains data sets, and derives meaning/sense out of them.
Computer Vision: Depicts the capability of a machine to get and analyze visual
information and predict decisions about it.
Natural Language Processing: Not discussed in this lecture, but an important domain
of AI.
Data Sciences
Data sciences is a domain of AI that involves collecting, maintaining, and deriving meaning
from data.
Computer Vision
Computer Vision is a domain of AI that involves analyzing visual information and predicting
decisions about it.
Natural language refers to language that is spoken and written by people, and
natural language processing (NLP) attempts to extract information from the
spoken and written word using algorithms.
The ultimate objective of NLP is to read, decipher, understand, and make sense of human
languages in a manner that is valuable.
AI Ethics
Nowadays, we are moving from the Information era to the Artificial Intelligence era. We no
longer use only data or information, but the intelligence collected from the data, to build
solutions. These solutions can even recommend the next TV show or movie you should
watch on Netflix.
| Scenario | Description |
| --- | --- |
| 1 | A self-driving car is faced with a sudden decision: hit a small boy who has come in front of the car, or take a sharp right turn to save the boy and smash the car into a metal pole, damaging the car and injuring the person sitting in it. |
| 2 | The car has hit the boy who came in front of it. Who should be held responsible for the accident? |
Data Privacy
The world of Artificial Intelligence revolves around data. Every company, whether small or
big, is mining data from as many sources as possible. More than 70% of the data collected
till now has been collected in the last 3 years, which shows how important data has
become in recent times.
| Question | Answer |
| --- | --- |
| Where do we collect data from? | Various sources, including smartphones, online activities, and more. |
| Why do we need to collect data? | To build solutions, provide personalized recommendations, and improve services. |
| Example | Description |
| --- | --- |
| 1 | You discuss buying new shoes with a friend on a mobile network or app, and later receive notifications from online shopping websites recommending shoes. |
| 2 | You search for a trip to Kerala on Google, and later receive messages from apps about packages and deals for the trip. |
| 3 | You discuss a book with someone face-to-face while your phone is nearby, and later receive notifications about similar books or messages about the same book. |
AI Bias
Everyone has a bias of their own, no matter how much one tries to be unbiased. Biases are
not negative all the time. Sometimes, it is required to have a bias to control a situation and
keep things working.
| Example | Description |
| --- | --- |
| 1 | Most virtual assistants have a female voice. Can you think of some reasons for this? |
| 2 | If you search on Google for salons, the first few results are mostly for female salons. Is this a bias? If yes, then is it a negative bias or a positive one? |
Artificial Intelligence (AI) is a budding technology that not everyone has access to. The
people who can afford AI-enabled devices make the most of it, while others who cannot are
left behind. This creates a gap between these two classes of people, which gets widened
with the rapid advancement of technology.
AI Creates Unemployment
AI is making people's lives easier by automating laborious tasks. However, this may lead to
mass unemployment, where people with little or no skills may be left without jobs. On the
other hand, those who keep up with their skills will flourish.
AI for Kids
Kids nowadays are smart enough to understand technology from a very early age.
However, should technology be given to children so young?
The Concern:
"While it is good that the boy knows how to use technology effectively, on the
other hand, he uses it to complete his homework without really learning
anything since he is not applying his brain to solve the Math problems."
Is it Ethical?
AI Project Cycle
The AI Project Cycle provides a framework for developing AI projects. It consists of 5
stages:

| Stage | Description |
| --- | --- |
| Problem Scoping | Identifying a problem and having a vision to solve it |
| Data Acquisition | Collecting data for the project |
| Data Exploration | Analyzing and understanding the data |
| Modelling | Developing an AI model to solve the problem |
| Evaluation | Testing the trained model's efficiency and performance |
Problem Scoping
Problem Scoping is about identifying a problem and having a vision to solve it. It involves
understanding the problem and its parameters.
The 4Ws Problem Canvas helps in identifying the key elements related to the problem.
| W | Description |
| --- | --- |
| Who | Analyze the people getting affected directly or indirectly due to the problem |
| What | Determine the nature of the problem |
| Where | Identify the location where the problem exists |
| Why | Understand the reasons behind the problem |
The United Nations has announced 17 SDGs, which are a set of goals to be achieved by
2030. These goals correspond to problems that we might observe around us.
| Goal | Description |
| --- | --- |
| 1 | No Poverty |
| 2 | Zero Hunger |
| 3 | Good Health and Well-being |
| ... | ... |
| 17 | Partnerships for the Goals |
Stakeholders
Stakeholders are the people who face the problem and would benefit from the solution.

Problem Definition
The problem definition stage involves identifying and understanding the problem you want
to solve. This stage is crucial in the AI project cycle as it sets the foundation for the entire
project.
The problem statement template is used to summarize the key points of the problem. It
consists of the following format:

| Template | 4W |
| --- | --- |
| Our... [Stakeholder(s)] | Who |
| ...who has/have a problem of... [Issue] | What |
| ...while/when... [Context, situation] | Where |
| An ideal solution would... [Benefit of solution for them] | Why |
Data Acquisition
Data acquisition is the process of collecting data for the project. Data can be a piece of
information or facts and statistics collected together for reference or analysis.
Types of Data
Data Features
Data features refer to the type of data you want to collect. For example, in a project to
predict employee salaries, data features might include:
Salary amount
Increment percentage
Increment period
Bonus
Data Exploration
Data exploration is the process of analyzing and understanding the data. This stage
involves visualizing the data to identify trends, relationships, and patterns.
Modelling
Modelling is the process of developing an AI model to solve the problem. There are two
main approaches to modelling:
Rule-Based Approach
A rule-based approach refers to the AI modelling where the rules are defined by
the developer. The machine follows the rules or instructions mentioned by the
developer and performs its task accordingly.
Example: A dataset that tells us about the conditions on the basis of which we can
decide if an elephant may be spotted or not while on safari.
Drawback: The learning is static, and the machine does not adapt to changes in the
data.
Learning-Based Approach
A learning-based approach refers to the AI modelling where the machine learns
by itself. Under the learning-based approach, the AI model gets trained on the
data fed to it and then is able to design a model which is adaptive to the change
in data.
Types of AI Models
The machine learning approach introduces dynamicity in the model by allowing it to adapt
to new data. This approach can be divided into three parts:
Supervised Learning
Definition: A supervised learning model is trained on a labelled dataset, where
the data is already known to the person training the model.
In a supervised learning model, the dataset is labelled, and the model is trained to predict
the label of new data. There are two types of supervised learning models:
Classification: Where the data is classified according to the labels. For example, in a
grading system, students are classified on the basis of their grades.
Regression: Where the model works on continuous data. For example, predicting the
next salary based on previous salaries and increments.
Unsupervised Learning
Definition: An unsupervised learning model works on unlabelled data, where
the data is random and unknown to the person training the model.
Unsupervised learning models are used to identify relationships, patterns, and trends in the
data. They can be further divided into two categories:
Clustering: Refers to the unsupervised learning algorithm that can cluster unknown
data according to patterns or trends identified.
Dimensionality Reduction: Reduces the dimensions of the data to make it easier to
visualize and understand.
Evaluation
Once a model is trained, it needs to be evaluated to calculate its efficiency and performance.
The model is tested with testing data, and its efficiency is calculated based on the following
parameters:
Accuracy
Precision
Recall
F1 Score
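These parameters can be made concrete with a small sketch. All label values below are hypothetical; the point is only how each metric is computed from true and predicted labels:

```python
# Compute accuracy, precision, recall and F1 from hypothetical labels
true_labels = [1, 0, 1, 1, 0, 1, 0, 0]
predicted =   [1, 0, 1, 0, 0, 1, 1, 0]

# Count true/false positives and negatives
tp = sum(1 for t, p in zip(true_labels, predicted) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(true_labels, predicted) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(true_labels, predicted) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(true_labels, predicted) if t == 0 and p == 0)

accuracy = (tp + tn) / len(true_labels)       # fraction of correct predictions
precision = tp / (tp + fp)                    # how many predicted positives were right
recall = tp / (tp + fn)                       # how many actual positives were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
print(accuracy, precision, recall, f1)
```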
Neural Networks
Neural networks are loosely modelled after how neurons in the human brain behave. They
are able to extract data features automatically without needing the input of the
programmer.
A neural network is divided into multiple layers, each with several blocks called nodes. Each
node has its own task to accomplish, which is then passed to the next layer.
Input Layer: Acquires data and feeds it to the neural network. No processing occurs at
this layer.
Hidden Layers: Where the whole processing occurs. Each node has its own machine
learning algorithm that it executes on the data received from the input layer.
Output Layer: Gives the final output to the user. No processing occurs at this layer.
Imagine a scenario where we are working on two Python-based projects and one of them
works on Python 2.7 and the other uses Python 3.7. In such situations, virtual environments
can be really useful to maintain dependencies of both the projects as the virtual
environments will make sure that these dependencies are not conflicting with each other
and no impact reaches the base environment at any point in time.
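The creation command itself is not shown in the source (it was an image); based on the surrounding text, it is presumably the standard Anaconda command:

```shell
# Hypothetical reconstruction: create an environment named "env" with Python 3.7
conda create -n env python=3.7
```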
This code will create an environment named env and will install Python 3.7 and other basic
packages into it.
4. After some processing, the prompt will ask if we wish to proceed with installations or
not. Type Y on it and press Enter. Once we press Enter, the packages will start getting
installed in the environment.
5. Depending upon the internet speed, the downloading of packages might take varied
time.
6. Once all the packages are downloaded and installed, we will get a message like this:
7. This shows that our environment called env has been successfully created. Once an
environment has been successfully created, we can access it by writing:
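The activation command is likewise omitted in the source; with conda it is presumably:

```shell
# Activate the virtual environment named "env"
conda activate env
```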
This would activate the virtual environment and we can see the term written in brackets has
changed from base to env. Now our virtual environment is ready to be used.
To install Jupyter Notebook dependencies, we need to activate our virtual environment env
and write:
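The install command is omitted in the source; with conda it is presumably:

```shell
# Install Jupyter Notebook into the active environment (hypothetical reconstruction)
conda install jupyter
```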
It will again ask if we wish to proceed with the installations, type Y to begin the
installations. Once the installations are complete, we can start working with Jupyter
notebooks in this environment.
Introduction to Python
Python is a programming language which was created by Guido Van Rossum in Centrum
Wiskunde & Informatica. The language was publicly released in 1991 and it got its name
from a BBC comedy series from 1970s Monty Python's Flying Circus.
Why Python?
Python is a popular language for developing applications of Artificial Intelligence. Here are
some reasons why Python gains maximum popularity:
Easy to learn, read and maintain: Python has few keywords, simple structure and a
clearly defined syntax.
A Broad Standard library: Python has a huge bunch of libraries with plenty of built-in
functions to solve a variety of problems.
Interactive Mode: Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
Portability and Compatibility: Python can run on a wide variety of operating systems
and hardware platforms, and has the same interface on all platforms.
Extendable: We can add low-level modules to the Python interpreter.
Databases and Scalability: Python provides interfaces to all major open-source and
commercial databases, along with better structure and support for much larger
programs than shell scripting.
Applications of Python
There exist a wide variety of applications when it comes to Python. Some of the
applications are:
Web Development: Python can be used to build web applications using popular
frameworks like Django and Flask.
Data Analysis: Python has popular libraries like Pandas and NumPy for data analysis.
Machine Learning: Python has popular libraries like scikit-learn and TensorFlow for
machine learning.
Automation: Python can be used to automate tasks using scripts.
Python Basics
Printing Statements
We can use Python to display outputs for any code we write. To print any statement, we
use the print function in Python.
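A minimal example of the print function:

```python
# The print function displays output to the user
print("Hello, world!")
```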
On the other hand, there exist some statements which do not get executed by the
computer. These lines of code are skipped by the machine. They are known as comments.

| Comment | Description |
| --- | --- |
| `# This is a comment and will not be read by the machine.` | This is a comment as it starts with `#`. |
A keyword is a reserved word in Python that has a special meaning for the interpreter and
cannot be used as an identifier.

An identifier is any name given by the user to a variable, function, or other object.
Identifiers can be declared by the user as per their convenience and can vary according to
the way the user wants.
The type of data is defined by the term datatype in Python. There can be various types of
data which are used in Python programming.
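A quick sketch of common Python datatypes, checked with the built-in type() function (the variable names and values are illustrative):

```python
# Common Python datatypes
age = 16              # int: whole number
price = 49.99         # float: decimal number
name = "Asha"         # str: text (hypothetical example name)
is_student = True     # bool: True or False
marks = [78, 85, 92]  # list: ordered collection

print(type(age), type(price), type(name), type(is_student), type(marks))
```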
Python Inputs
In Python, not only can we display the output to the user, but we can also take input from
the user.

User Input and Operators
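Taking input from the user, as described above, can be sketched as follows. The stdin here is pre-filled so the example runs without a live user; the name and age values are illustrative:

```python
# input() reads a line typed by the user and always returns a string.
import io, sys
sys.stdin = io.StringIO("Asha\n16\n")  # simulated user input

name = input("Enter your name: ")      # str
age = int(input("Enter your age: "))   # convert the string to an int
print("Hello", name, "- you are", age, "years old")
```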
Operators
Operators are special symbols that represent computation. They are applied on operands,
which can be values or variables.
Arithmetic Operators
| Operator | Operation | Example | Result |
| --- | --- | --- | --- |
| `+` | Addition | 10 + 20 | 30 |
| `-` | Subtraction | 30 - 10 | 20 |
| `*` | Multiplication | 30 * 10 | 300 |
| `/` | Division | 30 / 10 | 3.0 |
| `//` | Integer Division | 25 // 10 | 2 |
| `%` | Remainder | 25 % 10 | 5 |
| `**` | Raised to power | 3 ** 2 | 9 |
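The arithmetic operators can be tried directly in Python:

```python
# Arithmetic operators in Python
print(10 + 20)   # 30   addition
print(30 - 10)   # 20   subtraction
print(30 * 10)   # 300  multiplication
print(30 / 10)   # 3.0  division always returns a float
print(25 // 10)  # 2    integer division discards the fraction
print(25 % 10)   # 5    remainder
print(3 ** 2)    # 9    raised to power
```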
Conditional Operators
Logical Operators
Assignment Operators
| Operator | Example | Equivalent to |
| --- | --- | --- |
| `=` | X = 5 | X = 5 |
| `+=` | X += 5 | X = X + 5 |
| `-=` | X -= 5 | X = X - 5 |
| `*=` | X *= 5 | X = X * 5 |
| `/=` | X /= 5 | X = X / 5 |
Conditional Statements
Conditional statements help the machine in taking a decision according to the condition that
gets fulfilled.
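A minimal sketch of a conditional statement (the marks value and grade cut-offs are hypothetical):

```python
# A conditional statement chooses a branch based on which condition is fulfilled
marks = 72
if marks >= 90:
    grade = "A"
elif marks >= 60:
    grade = "B"
else:
    grade = "C"
print(grade)  # B
```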
Looping
Looping mechanisms are used to iterate statements or a group of statements as many times
as it is asked for.
While Loop: used to execute a block of code as long as a certain condition is true.
For Loop: used to execute a block of code for a specified number of times.
Do-While Loop: used to execute a block of code at least once, and then repeat the
execution as long as a certain condition is true. (Python has no built-in do-while;
the pattern is written as a `while True` loop containing a `break`.)
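The looping mechanisms above can be sketched in Python (the counts are illustrative):

```python
# While loop: repeats as long as the condition is true
count = 0
while count < 3:
    count += 1

# For loop: repeats a fixed number of times
squares = []
for i in range(3):
    squares.append(i ** 2)

# Python has no built-in do-while; run-at-least-once is written as:
attempts = 0
while True:
    attempts += 1
    if attempts >= 1:  # condition checked after the body has run once
        break

print(count, squares, attempts)  # 3 [0, 1, 4] 1
```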
Python Packages
A package is a space where we can find codes or functions or modules of similar type.
Installing Packages
To install a package in Python, we need to use the conda install command.
Importing Packages
To use a package in Python, we need to import it.
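To make the import step concrete, here are three common ways to import a package, shown with the built-in math module so the sketch runs anywhere (a third-party package such as NumPy is imported the same way after installation):

```python
import math                # qualified access: math.sqrt(...)
import math as m           # access under an alias: m.sqrt(...)
from math import sqrt      # direct access: sqrt(...)

print(math.sqrt(16), m.sqrt(16), sqrt(16))  # 4.0 4.0 4.0
```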
Data Sciences
Data Science is a concept to unify statistics, data analysis, machine learning, and their
related methods in order to understand and analyze actual phenomena with data.
Fraud and Risk Detection: used in finance to detect fraudulent transactions and
predict risk.
Genetics and Genomics: used in medicine to understand the impact of DNA on health
and develop personalized treatments.
Genomic Data: Techniques allow integration of different kinds of data with genomic
data in disease research, providing a deeper understanding of genetic issues in
reactions to particular drugs and diseases.
Internet Search: Search engines like Google, Yahoo, and Bing use data science
algorithms to deliver the best results for searched queries in a fraction of a second.
Targeted Advertising: Digital marketing uses data science algorithms to target ads
based on users' past behavior, resulting in higher click-through rates (CTRs) than
traditional advertisements.
Website Recommendations: Companies like Amazon, Twitter, Google Play, Netflix,
LinkedIn, and IMDB use data science to improve user experience by recommending
similar products based on previous search results.
Airline Route Planning: Airline companies use data science to identify strategic areas
of improvement, predict flight delays, decide which class of airplanes to buy, and
effectively drive customer loyalty programs.
Problem Scoping
Problem Statement
"Our restaurant owners have a problem of losses due to food wastage. The food is left
unconsumed due to improper estimation. We want to be able to predict the amount of food
to be prepared for every day's consumption."
Data Acquisition
The following data features affect the problem:
| Data Feature | Description |
| --- | --- |
| Quantity of dish prepared per day | The amount of food prepared for each dish |
| Total number of customers per day | The number of customers visiting the restaurant |
| Unconsumed quantity of dish per day | The amount of food left unconsumed |
| Price of dish | The price of each dish |
| Fixed customers per day | The number of regular customers |
System Map
The system map shows the relationship between each element and the project's goal.
Positive arrows indicate a direct relationship, while negative arrows indicate an inverse
relationship.
Data Exploration
The collected data is analyzed to understand its requirements. The goal is to predict the
quantity of food to be prepared for the next day's consumption.

Modelling

In this section, we will discuss the process of modelling our dataset using a regression
model.
A regression model is a type of supervised learning model that takes in continuous values
of data over a period of time. Since our dataset consists of 30 days of continuous data, we
can use a regression model to predict the next values.
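The regression idea can be sketched with a simple least-squares line fit. The daily quantities below are hypothetical stand-ins for the restaurant data; the real model would be trained on the full 30-day dataset:

```python
# Fit a straight line to hypothetical daily food-quantity data
# and predict the quantity for the next day.
days = [1, 2, 3, 4, 5]
quantity = [40, 42, 41, 44, 45]  # hypothetical kg prepared per day

n = len(days)
mean_x = sum(days) / n
mean_y = sum(quantity) / n
# Least-squares slope and intercept
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, quantity)) / \
        sum((x - mean_x) ** 2 for x in days)
intercept = mean_y - slope * mean_x

prediction = intercept + slope * 6  # predicted quantity for day 6
print(round(prediction, 2))  # 46.0
```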
Evaluation
Once the model has been trained, it is time to see if the model is working properly or not.
The evaluation process involves the following steps:
Step 1: The trained model is fed data regarding the name of the dish and the quantity
produced for the same.
Step 2: It is then fed data regarding the quantity of food left unconsumed for the same
dish on previous occasions.
Step 3: The model then works upon the entries according to the training it got at the
modelling stage.
Step 4: The model predicts the quantity of food to be prepared for the next day.
Step 5: The prediction is compared to the testing dataset value.
Step 6: The model is tested for 10 testing datasets kept aside while training.
Step 7: Prediction values of the testing dataset are compared to the actual values.
Step 8: If the prediction values are the same as or close to the actual values, the model
is said to be accurate.
Data Collection
Data collection is the process of gathering data from various sources. It has been a part of
our society for ages, even when people did not have much knowledge of calculations.
"Data collection is an exercise which does not require even a tiny bit of
technological knowledge. But when it comes to analysing the data, it becomes a
tedious process for humans as it is all about numbers and alpha-numerical
data."
Sources of Data
There exist various sources of data from where we can collect any type of data required.
The data collection process can be categorised in two ways: Offline and Online.
Types of Data
For Data Science, usually the data is collected in the form of tables. These tabular datasets
can be stored in different formats.
| Format | Description |
| --- | --- |
| CSV | Comma Separated Values. A simple file format used to store tabular data. |
| Spreadsheet | A piece of paper or a computer program used for accounting and recording data using rows and columns. |
| SQL | Structured Query Language. A programming language used for managing data held in different kinds of DBMS (Database Management Systems). |
Data Access
After collecting the data, to be able to use it for programming purposes, we should know
how to access the same in a Python code.
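A small sketch of reading a tabular dataset in Python with the built-in csv module. In practice the data would come from a file such as `open("data.csv")`; an in-memory string is used here so the example is self-contained, and the dish names are hypothetical:

```python
import csv, io

# Simulated CSV file contents
data = io.StringIO("dish,quantity\nrice,40\ndal,25\n")
rows = list(csv.reader(data))
header, records = rows[0], rows[1:]
print(header)   # ['dish', 'quantity']
print(records)  # [['rice', '40'], ['dal', '25']]
```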
NumPy
NumPy is the fundamental package for Mathematical and logical operations on arrays in
Python.
Pandas
Pandas is a software library written for the Python programming language for data
manipulation and analysis.
"Pandas offers data structures and operations for efficiently handling structured
data, including tabular data such as spreadsheets and SQL tables."
```python
import numpy

# A NumPy array and an equivalent plain Python list
A = numpy.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])
B = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
```

## Data Structures and Operations
> Pandas is a library that provides data structures and operations for manipulating numerical tables and time series.

Pandas has two primary data structures:

* **Series** (1-dimensional)
* **DataFrame** (2-dimensional)

These data structures handle the vast majority of typical use cases in finance, statistics, social science, and engineering.
Pandas provides several features that make it a powerful tool for data manipulation and analysis.

Matplotlib is a library used for data visualization in Python. It supports a variety of plot types, such as:

* Scatter plots
* Bar charts
* Histograms
* Box plots

Matplotlib provides several features that make it a powerful tool for data visualization:

* Customizable plots: plots can be stylized and made more descriptive and communicative
* Variety of plot types: Matplotlib provides a wide range of plot types that can be used to visualize data in different ways
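A minimal Matplotlib sketch, assuming the library is installed. The sales figures and filename are hypothetical, and a non-interactive backend is used so no window is needed:

```python
import matplotlib
matplotlib.use("Agg")  # render to file, no display required
import matplotlib.pyplot as plt

days = ["Mon", "Tue", "Wed"]
sales = [120, 95, 130]  # hypothetical values

plt.bar(days, sales)          # draw a bar chart
plt.title("Daily sales")
plt.xlabel("Day")
plt.ylabel("Dishes sold")
plt.savefig("daily_sales.png")  # save the plot to an image file
```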
Python provides several libraries that can be used for basic statistical analysis.
| Method | Description |
| --- | --- |
| **Mean** | The average value of a dataset |
| **Median** | The middle value of a dataset when it is sorted in ascending order |
| **Mode** | The most frequently occurring value in a dataset |
| **Standard Deviation** | A measure of the spread or dispersion of a dataset |
| **Variance** | The square of the standard deviation; also measures the spread of a dataset |
Python provides several libraries, such as NumPy and Pandas, that can be used to calculate these statistical methods.
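The methods in the table can also be computed with Python's built-in statistics module (the dataset is hypothetical):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.mean(data))       # 5    average value
print(statistics.median(data))     # 4.5  middle value of the sorted data
print(statistics.mode(data))       # 4    most frequent value
print(statistics.pstdev(data))     # 2.0  population standard deviation
print(statistics.pvariance(data))  # 4    population variance (stdev squared)
```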
Python provides several libraries that can be used for data visualization, including Matplotlib. Common plot types include:
* Scatter plots
* Bar charts
* Histograms
* Box plots
Python's data visualization libraries provide several features that make them powerful tools:

* Customizable plots: plots can be stylized and made more descriptive and communicative
* Variety of plot types: Python's data visualization libraries provide a wide range of plot types that can be used to explore data in different ways
## Data Issues
Data can have several issues that need to be addressed before it can be analyzed. Python provides several libraries that can be used to handle data issues, including Pandas:

* Handling missing data: Pandas provides several features that can be used to handle missing values, for example by detecting, filling, or dropping them
* Handling outliers: Python's data visualization libraries provide several features, such as box plots, that help in spotting outliers
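A minimal sketch of handling missing data with Pandas (assuming pandas and numpy are installed; the marks column is hypothetical):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"marks": [78, np.nan, 92, 85]})
missing = int(df["marks"].isnull().sum())        # count missing entries
filled = df["marks"].fillna(df["marks"].mean())  # fill gaps with the mean
dropped = df.dropna()                            # or drop incomplete rows
print(missing, len(dropped))  # 1 3
```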
| Quartile | Description |
| --- | --- |
| Q1 | 0th - 25th percentile |
| Q2 | 25th - 50th percentile (median) |
| Q3 | 50th - 75th percentile |
| Q4 | 75th - 100th percentile |
> The interquartile range (IQR) is the difference between the 75th percentile (Q3) and the 25th percentile (Q1).
### Outliers
> Outliers are data points that are significantly different from the other data points in a dataset.
## Personality Prediction
The KNN (K-Nearest Neighbours) algorithm is a supervised machine learning algorithm that can be used for both classification and regression.

> The KNN algorithm relies on the surrounding points or neighbours to determine the class or value of a new data point.

* The KNN prediction model relies on the surrounding points or neighbours to determine its prediction
* It utilises the properties of the majority of the nearest points to decide how to classify an unknown point
* It is based on the concept that similar data points should be close to each other
| K Value | Prediction |
| --- | --- |
| 1 | Not sweet (based on the nearest neighbour) |
| 2 | No prediction (due to conflicting neighbours) |
| 3 | Sweet (based on the majority of neighbours) |
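The table above can be reproduced with a tiny one-dimensional KNN sketch. The data points and the query value are hypothetical; prediction is by majority vote among the k nearest neighbours, with a tie giving no prediction:

```python
from collections import Counter

points = [(1.0, "not sweet"), (2.0, "sweet"), (3.0, "sweet")]
query = 1.2  # hypothetical new data point

def knn_predict(points, query, k):
    # Sort by distance to the query and keep the k nearest
    nearest = sorted(points, key=lambda p: abs(p[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    top = votes.most_common()
    if len(top) > 1 and top[0][1] == top[1][1]:
        return None  # tied vote: no prediction
    return top[0][0]

print(knn_predict(points, query, 1))  # not sweet (nearest neighbour decides)
print(knn_predict(points, query, 2))  # None (conflicting neighbours)
print(knn_predict(points, query, 3))  # sweet (majority of neighbours)
```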
## Computer Vision
### Introduction
> Computer vision is a field of study that enables computers to interpret and understand visual information from the world around them.

The Emoji Scavenger Hunt game is an interactive way to experience the capabilities of computer vision:

* Players must analyze the items and their surroundings to identify the correct objects
* The computer uses algorithms and methods to process and analyze the visual data it receives
| Application | Description |
| --- | --- |
| **Facial Recognition** | Used in security systems, attendance tracking, and smartphones |
| **Face Filters** | Used in social media apps like Instagram and Snapchat to apply effects to faces |
| **Google Search by Image** | Allows users to search with images and get related results |
| **Computer Vision in Retail** | Used to track customer movement, analyze navigation patterns, and manage inventory |
| **Self-Driving Cars** | Uses Computer Vision to identify objects, navigate routes, and avoid obstacles |
| **Medical Imaging** | Assists doctors in interpreting medical images and creating 3D models for diagnosis |
| **Google Translate App** | Uses Computer Vision to translate text in real time through the camera |
| Task | Description |
| --- | --- |
| **Image Classification** | Assigns a label to an input image from a fixed set of categories |
| **Classification + Localisation** | Identifies the object and its location in the image |
| **Object Detection** | Finds instances of real-world objects in images or videos |
| **Instance Segmentation** | Detects instances of objects, assigns a category, and labels each instance at the pixel level |
## Basics of Images
> A pixel is a picture element, the smallest unit of information that makes up an image.
### Resolution
| Term | Description |
| --- | --- |
| **Pixel Count** | The number of pixels in an image, expressed as width x height |
| **Megapixel** | A unit of measurement for pixel count, equal to one million pixels |

> Each pixel has a pixel value that describes its brightness and/or color, typically ranging from 0 to 255.
A **grayscale image** is an image that has a range of shades of gray without apparent color.

> "A grayscale image has each pixel of size 1 byte, having a single plane of a 2D array of pixels, with pixel values ranging from 0 (black) to 255 (white)."
The size of a grayscale image is defined as the Height x Width of that image.
An **RGB image** is an image that is made up of three primary colors: Red, Green, and Blue.

> "Every RGB image is stored in the form of three different channels called the R channel, the G channel, and the B channel."

Each plane separately has a number of pixels, with each pixel value varying from 0 to 255.
| Channel | Description |
| --- | --- |
| R (Red) | Values range from 0 to 255 |
| G (Green) | Values range from 0 to 255 |
| B (Blue) | Values range from 0 to 255 |
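The size arithmetic above can be sketched for a hypothetical 1920 x 1080 image:

```python
# Size of a grayscale image = height x width (1 byte per pixel);
# an RGB image stores three such channels.
width, height = 1920, 1080     # hypothetical resolution
pixel_count = width * height
grayscale_bytes = pixel_count  # 1 byte per pixel
rgb_bytes = pixel_count * 3    # R, G and B channels

print(pixel_count)      # 2073600 pixels, roughly 2.1 megapixels
print(grayscale_bytes)  # 2073600 bytes
print(rgb_bytes)        # 6220800 bytes
```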
## Image Features
## Convolution
> "The convolution operation is commonly used to create effects such as filters
## Introduction to OpenCV
**OpenCV** (Open Source Computer Vision Library) is a tool that helps a computer extract and analyze information from images and videos.
To install OpenCV library, open anaconda prompt and write the following command:
```python
pip install opencv-python
Using OpenCV, we can perform many operations on images, such as:

* Resizing
* Cropping
* And many more
To learn more about OpenCV, head to Jupyter Notebook for introduction to OpenCV:
http://bit.ly/cv_notebook

## Convolution Operator
What is Convolution?
Convolution is a simple mathematical operation that is fundamental to many common
image processing operators. It provides a way of "multiplying together" two arrays of
numbers, generally of different sizes, but of the same dimensionality, to produce a third
array of numbers of the same dimensionality.
Kernel: A kernel is a matrix that is slid across the image and multiplied with the input such
that the output is enhanced in a certain desirable manner. Each kernel has a different value
for different kinds of effects that we want to apply to an image.
Convolution Operation
The convolution operation can be represented as:

I * K = the resulting array after performing the convolution operator, where:

* I = Image array
* K = Kernel array
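The sliding-and-multiplying operation can be sketched in plain Python. The 3 x 3 image and the identity kernel below are hypothetical; a real filter kernel would hold blur or sharpen coefficients instead:

```python
# Slide a kernel across an image and sum the element-wise products.
# Without edge extension the output is smaller than the input.
def convolve2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]  # identity kernel keeps the centre pixel unchanged
print(convolve2d(image, identity))  # [[5]]
```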
Edge Extension
To achieve an output image of the same size as the input image, we need to extend the
edge values out by one in the original image while overlapping the centers and performing
the convolution. This will help us keep the input and output image of the same size.
Convolution Layer: The first layer of a CNN, responsible for extracting high-level
features such as edges from the input image.
Rectified Linear Unit (ReLU) Layer: The next layer in the CNN, responsible for
introducing non-linearity in the feature map.
Pooling Layer: Responsible for reducing the spatial size of the convolved feature
while still retaining the important features.
Fully Connected Layer: The final layer of the CNN, responsible for making predictions
based on the features extracted by the previous layers.
CNN Layers
| Layer | Description |
| --- | --- |
| Convolution Layer | Extracts high-level features such as edges from the input image |
| ReLU Layer | Introduces non-linearity in the feature map |
| Pooling Layer | Reduces the spatial size of the convolved feature while still retaining the important features |
| Fully Connected Layer | Makes predictions based on the features extracted by the previous layers |
The ReLU function is used to introduce non-linearity in the feature map, making the color
change more obvious and more abrupt.
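The ReLU operation itself is simply max(0, x) applied element-wise to the feature map; a minimal NumPy sketch:

```python
import numpy as np

feature_map = np.array([[-3.0, 1.5],
                        [ 2.0, -0.5]])

# ReLU replaces every negative value with 0, introducing non-linearity
relu_out = np.maximum(0, feature_map)
print(relu_out)  # [[0.  1.5]
                 #  [2.  0. ]]
```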
Pooling Layer
There are two types of pooling that can be performed on an image:
Max Pooling: Returns the maximum value from the portion of the image covered by
the kernel.
Average Pooling: Returns the average value from the portion of the image covered by
the kernel.
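Both pooling types can be sketched with NumPy over non-overlapping 2x2 windows (a minimal illustration; deep-learning frameworks provide optimized pooling layers):

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    h, w = feature_map.shape
    # Split the map into non-overlapping size x size windows
    windows = (feature_map[:h - h % size, :w - w % size]
               .reshape(h // size, size, w // size, size))
    if mode == "max":
        return windows.max(axis=(1, 3))   # max pooling
    return windows.mean(axis=(1, 3))      # average pooling

fm = np.array([[1, 3, 2, 4],
               [5, 7, 6, 8],
               [9, 2, 1, 0],
               [3, 4, 5, 6]], dtype=float)

max_pooled = pool2d(fm, mode="max")
avg_pooled = pool2d(fm, mode="avg")
print(max_pooled)  # [[7. 8.]
                   #  [9. 6.]]
print(avg_pooled)  # [[4.  5. ]
                   #  [4.5 3. ]]
```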
Note: The resulting array is obtained by performing the convolution operation on the image array and the kernel array.
Convolutional Neural Networks (CNNs)
Pooling Layer
The pooling layer is an important layer in the CNN: it reduces the spatial size of the convolved feature while still retaining the important features.
Introduction
Natural Language Processing (NLP) is the sub-field of AI that is focused on enabling computers to understand and process human languages.
NLP is concerned with the interactions between computers and human natural languages,
in particular how to program computers to process and analyze large amounts of natural
language data.
Applications of NLP
Application Description
Problem Scoping
| Canvas | Description |
| --- | --- |
| Who | People who suffer from stress and are at the onset of depression. |
| What | People who need help are reluctant to consult a psychiatrist and hence live miserably. |
| Where | When they are going through a stressful period of time. |
| Why | People get a platform where they can talk and vent out their feelings anonymously. |
Problem Statement
People undergoing stress have the problem of not being able to share their feelings, even though they need to vent out their stress.
Chatbots and Natural Language Processing
Data Acquisition
To understand the sentiments of people, we need to collect their conversational data. This
data can be collected from various means:
Surveys
Observing therapists' sessions
Databases available on the internet
Interviews
Data Exploration
Once the textual data has been collected, it needs to be processed and cleaned so that an
easier version can be sent to the machine. This process is called Data Normalization.
Modelling
Once the text has been normalized, it is then fed to an NLP (Natural Language Processing) based AI model. NLP is a subfield of artificial intelligence that deals with the interaction between computers and humans in natural language.
Evaluation
The model trained is then evaluated and the accuracy for the same is generated on the
basis of the relevance of the answers which the machine gives to the users' responses.
| Model | Description |
| --- | --- |
| Underfitting | The model's output does not match the true function, resulting in lower accuracy. |
| Perfect Fit | The model's performance matches well with the true function, resulting in optimum accuracy. |
| Overfitting | The model tries to cover all the data samples, even those out of alignment with the true function, resulting in lower accuracy. |
Chatbots
A chatbot is a computer program that uses NLP to simulate conversation with human users.
There are two types of chatbots:
| Chatbot Type | Description |
| --- | --- |
| Script-bot | Works around a script which is programmed into them. Easy to make and mostly free, but with limited functionality. |
| Smart-bot | Works on bigger databases and other resources directly. Flexible and powerful, but requires coding to take them on board. |
"Human language is complex and has multiple characteristics that might be easy
for a human to understand but extremely difficult for a computer to understand."
Arrangement of words and meaning: Human language has rules and structure,
which can be difficult for computers to understand.
Multiple meanings of a word: Words can have different meanings depending on the
context, which can be difficult for computers to understand.
Part-of-Speech Tagging
Part-of-speech tagging is a technique used to identify the different parts of a speech, such
as nouns, verbs, adverbs, and adjectives.
| Syntax and Semantics | Example |
| --- | --- |
| Different syntax, same semantics | 2 + 3 = 3 + 2 (both statements have the same meaning) |
| Different semantics, same syntax | 2/3 in Python 2.7 vs 2/3 in Python 3 (both statements have the same syntax but different meanings) |
Natural language processing NLP is a subfield of artificial intelligence that deals with the
interaction between computers and humans in natural language. However, human
language is complex and can be challenging for computers to understand.
Homophones: Words that sound the same but have different meanings.
Homographs: Words that are spelled the same but have different meanings.
Idioms: Phrases that have a different meaning than the literal meaning of the
individual words.
Sarcasm: Language that is intended to convey a meaning that is opposite of its literal
meaning.
Text Normalisation
Text normalisation is the process of simplifying text data to make it easier for computers to
understand. The goal of text normalisation is to reduce the complexity of the text data
while preserving its meaning.
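A minimal normalisation sketch (lower-casing, punctuation stripping, tokenisation, and removal of a small illustrative stop-word list; real pipelines use fuller stop-word lists and also apply stemming or lemmatisation):

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "to", "is", "are"}  # illustrative subset

def normalise(text):
    text = text.lower()                       # case normalisation
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # strip punctuation and symbols
    tokens = text.split()                     # tokenisation
    return [t for t in tokens if t not in STOP_WORDS]  # stop-word removal

print(normalise("Aman and Anil are stressed!"))  # ['aman', 'anil', 'stressed']
```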
Stemming vs Lemmatization
| Stemming | Lemmatization |
| --- | --- |
| Chops off word endings to produce a stem, which may not be a meaningful word (e.g. "studies" → "studi") | Reduces a word to its dictionary form (lemma), which is always a meaningful word (e.g. "studies" → "study") |
Bag of Words
The bag of words model is a simple NLP model that represents text data as a bag (or a set) of its word occurrences, without considering grammar or word order.
Document 1: "Aman and Anil are stressed"
Document 2: "Aman went to a therapist"
Document 3: "Anil went to download a health chatbot"
After tokenisation (with stop words removed):

Document 1: ["aman", "anil", "are", "stressed"]
Document 2: ["aman", "went", "to", "therapist"]
Document 3: ["anil", "went", "to", "download", "health", "chatbot"]

Dictionary: ["aman", "anil", "are", "stressed", "went", "to", "therapist", "download", "health", "chatbot"]
Document 1: [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
Document 2: [1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
Document 3: [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
Note that the document vectors represent the frequency of each word in the dictionary for each document.
Bag of Words Algorithm
The Bag of Words algorithm is a method used to represent text data in a numerical format
that can be processed by machines.
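The steps (build a dictionary, then count each dictionary word per document) can be sketched as follows. Note that this minimal version keeps stop words such as "and" and "a", unlike the worked example above, which removes them first:

```python
docs = [
    "Aman and Anil are stressed",
    "Aman went to a therapist",
    "Anil went to download a health chatbot",
]

# Tokenise (lower-case, split on spaces) and build the dictionary
# in order of first appearance
tokenised = [d.lower().split() for d in docs]
vocab = []
for tokens in tokenised:
    for t in tokens:
        if t not in vocab:
            vocab.append(t)

# Document vector = count of each dictionary word in the document
vectors = [[tokens.count(w) for w in vocab] for tokens in tokenised]
print(vocab)
print(vectors[0])
```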
| Word | Document 1 | Document 2 | Document 3 |
| --- | --- | --- | --- |
| aman | 1 | 1 | 0 |
| and | 1 | 0 | 0 |
| anil | 1 | 0 | 1 |
| are | 1 | 0 | 0 |
| stressed | 1 | 0 | 0 |
| went | 0 | 1 | 1 |
| download | 0 | 0 | 1 |
| health | 0 | 0 | 1 |
| chatbot | 0 | 0 | 1 |
| therapist | 0 | 1 | 0 |
| to | 0 | 1 | 1 |
Term Frequency
Term frequency is the frequency of a word in one document.
| Word | Document Frequency |
| --- | --- |
| aman | 2 |
| and | 1 |
| anil | 2 |
| are | 1 |
| stressed | 1 |
| went | 2 |
| download | 1 |
| health | 1 |
| chatbot | 1 |
| therapist | 1 |
| to | 2 |
| Vocabulary | Inverse Document Frequency |
| --- | --- |
| aman | log(3/2) |
| and | log(3/1) |
| anil | log(3/2) |
| are | log(3/1) |
| stressed | log(3/1) |
| went | log(3/2) |
| download | log(3/1) |
| health | log(3/1) |
| chatbot | log(3/1) |
| therapist | log(3/1) |
| to | log(3/2) |
TFIDF Formula
TFIDF(W) = TF(W) × log(IDF(W))

where TF(W) is the term frequency of word W, and IDF(W) is the inverse document frequency of word W (the total number of documents divided by the number of documents in which W occurs).
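Using the formula above, a word's score can be sketched in Python (here IDF(W) is taken as total documents ÷ documents containing W, matching the log(3/2)-style entries in the table):

```python
import math

def tfidf(term_freq, total_docs, docs_with_word):
    idf = total_docs / docs_with_word   # inverse document frequency
    return term_freq * math.log(idf)    # TFIDF(W) = TF(W) * log(IDF(W))

# "stressed" appears once, in 1 of 3 documents: TFIDF = 1 * log(3)
print(round(tfidf(1, 3, 1), 4))   # 1.0986

# "aman" appears once per document, in 2 of 3 documents: TFIDF = 1 * log(3/2)
print(round(tfidf(1, 3, 2), 4))   # 0.4055
```

Words that occur in every document get log(1) = 0, so TFIDF down-weights common words and highlights distinctive ones.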
Applications of TFIDF
TFIDF is commonly used in the Natural Language Processing domain. Some of its
applications are:
DIY: Do It Yourself!
Try completing the following exercise using the corpus provided:
Document 1: We can use health chatbots for treating stress. Document 2: We can use NLP
to create chatbots and we will be making health chatbots now! Document 3: Health
Chatbots cannot replace human counsellors now. Yay >< !! @1nteLA!4Y
What is Evaluation?
Evaluation is the process of understanding the reliability of any AI model by feeding a test dataset into the model and comparing its outputs with the actual answers.
True Positive (TP): When the model predicts a positive outcome and the reality is also positive.
True Negative (TN): When the model predicts a negative outcome and the reality is also negative.
False Positive (FP): When the model predicts a positive outcome but the reality is negative.
False Negative (FN): When the model predicts a negative outcome but the reality is positive.
Confusion Matrix
The confusion matrix is a table used to evaluate the performance of a classification model. It
maps the predictions against the actual outcomes.
| Prediction | Reality | Outcome |
| --- | --- | --- |
| Positive | Positive | TP |
| Positive | Negative | FP |
| Negative | Positive | FN |
| Negative | Negative | TN |
Evaluation Methods
Accuracy
Accuracy is defined as the percentage of correct predictions out of all the
observations.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision
Precision is defined as the percentage of true positive cases versus all the cases
where the prediction is true.
Precision = TP / (TP + FP)
Recall
Recall is defined as the fraction of positive cases that are correctly identified.
Recall = TP / (TP + FN)
| | Precision | Recall |
| --- | --- | --- |
| Numerator | TP | TP |
| Denominator | TP + FP | TP + FN |
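The three formulas can be computed directly from the four confusion-matrix counts (the numbers below are hypothetical):

```python
def evaluate(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

# Hypothetical counts: 60 true positives, 25 true negatives,
# 5 false positives, 10 false negatives
acc, prec, rec = evaluate(tp=60, tn=25, fp=5, fn=10)
print(acc)   # 0.85
print(prec)  # 60/65 ~ 0.923
print(rec)   # 60/70 ~ 0.857
```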
When false positives dominate, precision is low, indicating that the model is prone to false alarms.
Key Takeaways
High accuracy does not necessarily mean good performance.
Good precision does not guarantee good model performance.
Recall is an important evaluation metric that considers both true positives and false negatives.
Choosing the Right Metric
When evaluating the performance of a model, it's essential to choose the right metric. The
choice between Precision and Recall depends on the specific use case.
Forest Fire: A false negative can lead to a forest fire not being detected, resulting in
significant damage.
Viral Outbreak: A false negative can lead to a viral outbreak not being detected,
resulting in widespread infection.
F1 Score
The F1 Score is a measure of the balance between precision and recall. It is calculated using the following formula:

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)

"The F1 Score is a way to combine precision and recall into a single metric."
F1 Score Variations
Practice Time
Let's practice calculating accuracy, precision, recall, and F1 score using the following
scenarios:
Scenario 1:

| | Predicted Flood | Predicted No Flood |
| --- | --- | --- |
| Actual Flood | 15 | 3 |
| Actual No Flood | 1 | 10 |

Scenario 2:

| | Predicted Rain | Predicted No Rain |
| --- | --- | --- |
| Actual Rain | 12 | 4 |
| Actual No Rain | 3 | 9 |

Calculate the accuracy, precision, recall, and F1 score for each scenario.