Deep Learning Notes

Machine Learning Intro

Machine Learning (ML) is a subset of Artificial Intelligence (AI) that enables systems to learn
from data and make predictions or decisions without explicit programming. It involves
algorithms and statistical models that allow computers to identify patterns and improve
performance over time. Unlike traditional programming, where rules are manually defined, ML systems learn those rules directly from the data.

ML is widely used in various fields, such as healthcare, finance, e-commerce, cybersecurity,


and robotics. It powers technologies like self-driving cars, voice assistants (Alexa, Siri),
fraud detection systems, recommendation engines (Netflix, Amazon), and medical
diagnosis tools.

With the rapid growth of big data, powerful computing resources, and advanced
algorithms, ML has evolved into an essential tool for solving complex real-world problems. It
plays a crucial role in automation, decision-making, and predictive analytics, making
systems more intelligent and efficient.

Types of Machine Learning

1. Supervised Learning

In supervised learning, the model is trained using labeled data (input-output pairs), meaning
it learns from past examples to make predictions.

Key Characteristics:

 Requires a labeled dataset (with input and correct output).

 The model maps inputs to outputs by minimizing errors.

 Used for classification and regression problems.

Examples:

 Spam Detection – Classifying emails as spam or not spam.

 House Price Prediction – Predicting house prices based on features like area, number
of rooms, etc.

Common Algorithms:

 Linear Regression (for regression tasks)

 Decision Trees (for classification and regression)

 Support Vector Machines (SVM)

 Neural Networks
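
A minimal sketch of supervised learning (assuming scikit-learn is installed; the built-in Iris dataset stands in for any labeled dataset):

```python
# Supervised learning sketch: train a classifier on labeled data, then test on unseen data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Labeled dataset: X holds input features, y holds the correct output for each example.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the model on labeled examples, then evaluate on data it has not seen.
model = DecisionTreeClassifier(max_depth=3)
model.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```
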
2. Unsupervised Learning

In unsupervised learning, the model is trained on unlabeled data, meaning it finds hidden
patterns and structures without explicit labels.

Key Characteristics:

 No labeled data is provided.

 The model learns by detecting patterns, similarities, and clusters.

 Used for clustering and association rule mining.

Examples:

 Customer Segmentation – Grouping customers based on shopping behavior.

 Anomaly Detection – Identifying fraudulent transactions in banking.

Common Algorithms:

 K-Means Clustering

 Hierarchical Clustering

 Principal Component Analysis (PCA) (for dimensionality reduction)
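
A minimal clustering sketch (assuming scikit-learn; the two synthetic point clouds stand in for, say, customer groups):

```python
# Unsupervised learning sketch: group unlabeled points with K-Means.
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: two loose groups of 2-D points, with no output labels provided.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Cluster labels:", kmeans.labels_[:10])
print("Cluster centers:", kmeans.cluster_centers_)
```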

3. Semi-Supervised Learning

This approach combines supervised and unsupervised learning, using a small amount of
labeled data along with a large amount of unlabeled data to improve learning.

Key Characteristics:

 Less labeled data is required compared to supervised learning.

 It helps when labeling data is expensive or time-consuming.

Examples:

 Medical Diagnosis – Using a few labeled medical images to train a model for disease
detection.

 Webpage Classification – Categorizing web pages with minimal labeled examples.
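
A minimal semi-supervised sketch using scikit-learn's LabelPropagation (one possible technique for this setting; here most Iris labels are hidden and marked with -1 to mimic scarce labeling):

```python
# Semi-supervised sketch: a few labeled points plus many unlabeled ones (-1).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelPropagation

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
unlabeled = rng.random(len(y)) < 0.9   # hide roughly 90% of the labels
y_partial = y.copy()
y_partial[unlabeled] = -1              # -1 marks an unlabeled example

model = LabelPropagation().fit(X, y_partial)
print("Accuracy on the originally hidden labels:",
      (model.transduction_[unlabeled] == y[unlabeled]).mean())
```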

Deep Learning Intro


Deep Learning (DL) is a subset of Machine Learning (ML) that uses artificial neural networks
to model and solve complex problems. It mimics the way the human brain processes
information, enabling computers to recognize patterns, classify data, and make decisions
with minimal human intervention.

Deep learning has revolutionized fields like computer vision, natural language processing
(NLP), speech recognition, and robotics. It powers advanced AI applications such as
autonomous vehicles, real-time translation, medical diagnosis, and facial recognition.

Key Components of Deep Learning

1. Artificial Neural Networks (ANNs)

Deep Learning models are built using Artificial Neural Networks (ANNs), inspired by the
structure of the human brain.

 Neuron (Perceptron): The basic unit of a neural network that processes input data.

 Layers of Neural Networks:

o Input Layer: Receives raw data (e.g., images, text, numerical values).

o Hidden Layers: Perform feature extraction and pattern recognition.

o Output Layer: Produces the final result (e.g., classification, prediction).

2. Activation Functions

Activation functions introduce non-linearity in neural networks, helping them learn complex
patterns. Common activation functions include:

 ReLU (Rectified Linear Unit): Used in most deep networks to prevent vanishing
gradients.

 Sigmoid: Used for binary classification problems.

 Softmax: Used for multi-class classification.
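
A small NumPy sketch of these three functions (illustrative only; deep learning frameworks provide optimized versions):

```python
# Common activation functions implemented with NumPy.
import numpy as np

def relu(x):
    # Keeps positive values, zeroes out negatives; helps avoid vanishing gradients.
    return np.maximum(0, x)

def sigmoid(x):
    # Squashes values into (0, 1); common for binary classification outputs.
    return 1 / (1 + np.exp(-x))

def softmax(x):
    # Converts a vector of scores into probabilities that sum to 1 (multi-class output).
    e = np.exp(x - np.max(x))   # subtract the max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(relu(z), sigmoid(z), softmax(z))
```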

3. Backpropagation & Optimization

Deep learning models learn through backpropagation, which adjusts the weights of neurons
based on the error (loss function). Optimizers like SGD (Stochastic Gradient Descent) and
Adam help minimize the error efficiently.
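
A minimal sketch of the idea, hand-coding gradient descent for a single linear neuron with a mean-squared-error loss (NumPy only; real deep networks use frameworks that backpropagate through many layers automatically):

```python
# Train one linear neuron on y = 2x by repeatedly following the loss gradient.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x                     # target outputs
w, b, lr = 0.0, 0.0, 0.01       # initial weight, bias, and learning rate

for step in range(200):
    y_pred = w * x + b                      # forward pass
    error = y_pred - y
    loss = np.mean(error ** 2)              # loss function (MSE)
    grad_w = np.mean(2 * error * x)         # gradient of the loss w.r.t. the weight
    grad_b = np.mean(2 * error)             # gradient of the loss w.r.t. the bias
    w -= lr * grad_w                        # gradient-descent update
    b -= lr * grad_b

print(f"learned w={w:.3f}, b={b:.3f}, final loss={loss:.5f}")
```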

Types of Deep Learning Architectures

1. Feedforward Neural Networks (FNNs)


 The simplest type of deep learning model where data flows in one direction from
input to output.

 Used in basic classification and regression tasks.

2. Convolutional Neural Networks (CNNs)

 Designed for image and video processing.

 Uses convolutional layers to detect edges, textures, and patterns in images.

 Used in face recognition, object detection, medical imaging, and self-driving cars.

3. Recurrent Neural Networks (RNNs)

 Used for sequential data processing (e.g., time-series data, speech, and text).

 Maintains a memory of past inputs using hidden states.

 Used in language translation, chatbots, and speech recognition.
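
As a concrete illustration of the simplest case, a minimal feedforward-network sketch in PyTorch (assuming PyTorch is installed; the layer sizes are arbitrary):

```python
# Feedforward network sketch: data flows in one direction, input -> hidden -> output.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 32),   # input layer -> hidden layer (10 input features)
    nn.ReLU(),           # non-linear activation
    nn.Linear(32, 3),    # hidden layer -> output layer (3 classes)
)

x = torch.randn(4, 10)   # a batch of 4 examples
logits = model(x)        # forward pass
print(logits.shape)      # torch.Size([4, 3])
```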

Key Differences Between Machine Learning (ML) and Deep Learning (DL):

1. Feature Engineering:

o Machine Learning requires manual feature extraction, where experts define important data attributes for training.

o Deep Learning automatically extracts features from raw data using neural
networks.

2. Data Requirements:

o ML works well with small to medium-sized datasets.

o DL requires large datasets to effectively learn complex patterns.

3. Model Complexity:

o ML models like decision trees, SVMs, and linear regression are relatively
simple and interpretable.

o DL models involve deep neural networks with multiple layers, making them
more complex but powerful.

4. Computation Power:

o ML models can run efficiently on standard CPUs.


o DL requires high-performance GPUs or TPUs due to its heavy computational
needs.

Q. Supervised and Unsupervised Learning – A Detailed Explanation

1. Supervised Learning

Definition:

Supervised Learning is a type of Machine Learning (ML) where the model is trained on a
labeled dataset. This means that each input data point has a corresponding correct output,
and the model learns to map inputs to outputs based on this labeled data.

How It Works:

 The dataset consists of input features (X) and output labels (Y).

 The model learns patterns from the training data and makes predictions on new,
unseen data.

 The learning process continues until the model achieves a desired level of accuracy.

Types of Supervised Learning:

1. Classification:

o The goal is to classify input data into predefined categories.

o Example:

 Spam Detection: Classifying emails as Spam or Not Spam.

 Handwritten Digit Recognition: Identifying digits in postal codes.

2. Regression:

o The goal is to predict a continuous numerical value.

o Example:

 House Price Prediction: Estimating the price of a house based on features like size, location, and number of bedrooms.

 Stock Market Forecasting: Predicting future stock prices based on historical trends.

Advantages of Supervised Learning:

✔ Produces accurate and reliable predictions.
✔ Helps in making well-defined decisions with clear output labels.
✔ Useful for applications like medical diagnosis, fraud detection, and sentiment analysis.

Disadvantages of Supervised Learning:

❌ Requires large amounts of labeled data, which can be time-consuming and expensive to obtain.
❌ Cannot discover hidden patterns that are not explicitly labeled.

2. Unsupervised Learning

Definition:

Unsupervised Learning is a type of Machine Learning where the model is trained on unlabeled data. The algorithm identifies patterns, structures, and relationships in the data without predefined output labels.

How It Works:

 The dataset contains only input features (X) without corresponding output labels.

 The model groups or clusters similar data points together based on inherent
similarities.

 It is used for exploratory data analysis, anomaly detection, and data compression.

Types of Unsupervised Learning:

1. Clustering:

o The goal is to group similar data points into clusters.

o Example:

 Customer Segmentation: Dividing customers into different groups based on purchasing behavior.

 Document Categorization: Grouping news articles based on topics like sports, politics, or technology.

2. Dimensionality Reduction:

o Reducing the number of features in a dataset while retaining important information.

o Example:

 Principal Component Analysis (PCA): Used for image compression and visualization of high-dimensional data.

 Feature Selection in Genomics: Identifying key genes that contribute to disease prediction.
Advantages of Unsupervised Learning:

✔ Useful for discovering hidden patterns in data.
✔ Works well for large and complex datasets without labeled examples.
✔ Helps in anomaly detection (e.g., fraud detection in banking).

Disadvantages of Unsupervised Learning:

❌ Less accurate compared to supervised learning, as there is no predefined output to guide the model.
❌ Results can be difficult to interpret, requiring domain expertise.

Q. Bias-Variance Tradeoff in Machine Learning – A Detailed Explanation

1. What is Bias-Variance Tradeoff?

The Bias-Variance Tradeoff is a fundamental concept in Machine Learning (ML) that describes the balance between two sources of error in a model:

 Bias (Underfitting): The error due to oversimplification of the model, leading to an inability to capture complex patterns in the data.

 Variance (Overfitting): The error due to overcomplexity, causing the model to learn
noise instead of general patterns.

A well-trained machine learning model should have a balance between bias and variance to
achieve good generalization on new, unseen data.

2. Understanding Bias and Variance

Bias (Underfitting)

Definition:
Bias refers to the assumptions made by a model to simplify the learning process. A high bias
model is too simple and fails to capture the underlying patterns in the data.

Characteristics of High Bias Models:


✔ Simple models with few parameters
✔ Makes strong assumptions about the data
✔ Performs poorly on both training and test data

Example:

 Using Linear Regression to model a highly non-linear dataset will lead to high bias
because the model is too simple to capture the complex relationship.
 A model predicting house prices based only on square footage, ignoring other factors
like location and number of bedrooms, has high bias.

Variance (Overfitting)

Definition:
Variance refers to the model’s sensitivity to small changes in the training data. A high
variance model is too complex and memorizes the training data, but fails to generalize to
new data.

Characteristics of High Variance Models:


✔ Highly complex models with many parameters
✔ Captures noise along with the actual pattern
✔ Performs well on training data but poorly on test data

Example:

 A deep neural network with too many layers trained on a small dataset may learn
random noise, leading to poor performance on unseen data.

 A decision tree with excessive depth may fit training data perfectly but fail on test
data due to memorization.

3. The Tradeoff Between Bias and Variance

 Low Bias & High Variance → Model overfits and memorizes data, leading to poor
generalization.

 High Bias & Low Variance → Model underfits and fails to learn meaningful patterns.

 Optimal Model → A balance between bias and variance to ensure the model
generalizes well.

Graphical Representation

 Low bias, high variance → Overfitting

 High bias, low variance → Underfitting

 Optimal → A balance between bias and variance
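
A small NumPy sketch of this tradeoff, fitting polynomials of increasing degree to noisy data (the degrees, noise level, and dataset are arbitrary choices for illustration):

```python
# Fit a too-simple, a balanced, and a too-flexible model and compare train vs. test error.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 15)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, x_train.size)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

for degree in (1, 4, 14):
    coefs = np.polyfit(x_train, y_train, degree)          # fit a polynomial of this degree
    train_err = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
# degree 1 underfits (high bias), degree 14 overfits (high variance), degree 4 is near the balance.
```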

4. Why is Bias-Variance Tradeoff Important?

1. Improves Generalization:
o Helps in creating models that perform well on new, unseen data.

2. Optimizes Model Performance:

o Avoids models that are too simple (high bias) or too complex (high variance).

3. Prevents Overfitting and Underfitting:

o Ensures the model does not memorize the training data but learns general
patterns.

4. Better Decision Making in Model Selection:

o Helps in choosing the right algorithm and hyperparameters.

5. How to Manage Bias-Variance Tradeoff?

✔ Increase Training Data → Reduces variance and improves generalization.


✔ Use Cross-Validation → Helps in assessing model performance and avoiding overfitting.

Q. Importance of Hyperparameters in a Machine Learning Model

1. What are Hyperparameters?

In Machine Learning (ML), hyperparameters are parameters set before training a model,
determining its behavior and performance. Unlike model parameters (e.g., weights in neural
networks), hyperparameters cannot be learned from data and must be tuned manually or
using optimization techniques.

2. Why are Hyperparameters Important?

1. Control Model Complexity:

o Prevents overfitting (too complex) or underfitting (too simple).

o Example: In Decision Trees, setting a very deep tree can cause overfitting.

2. Optimize Learning Efficiency:

o Affects training speed and convergence.

o Example: A high learning rate in gradient descent may lead to divergence, while a low learning rate makes training too slow.

3. Improve Model Accuracy:


o Well-tuned hyperparameters ensure the model captures meaningful patterns
and generalizes well.

o Example: Number of hidden layers in a neural network affects accuracy.

4. Balance Bias-Variance Tradeoff:

o Hyperparameters help in adjusting the model to achieve an optimal balance between bias (underfitting) and variance (overfitting).

5. Essential for Model Selection & Performance Tuning:

o Different ML algorithms require different hyperparameters.

o Example: Support Vector Machine (SVM) requires kernel selection and regularization tuning.

3. Examples of Hyperparameters

Example 1: Learning Rate (α) in Neural Networks & Gradient Descent

 The learning rate determines how much the model updates weights during training.

 A high learning rate can cause the model to overshoot the optimal solution, while a
low learning rate makes learning too slow.

🔹 Example:

 A deep learning model using Stochastic Gradient Descent (SGD) needs a well-tuned
learning rate.

 If the learning rate is too high, the loss function fluctuates without convergence.

 If it is too low, training takes too long.
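
A tiny sketch of this learning-rate effect, running gradient descent on the function f(w) = w² with three different step sizes (values chosen only for illustration):

```python
# Compare a too-large, a too-small, and a reasonable learning rate on f(w) = w**2.
import numpy as np

def gradient_descent(lr, steps=20, w0=5.0):
    w = w0
    for _ in range(steps):
        grad = 2 * w          # derivative of w**2
        w = w - lr * grad     # weight update
    return w

for lr in (1.1, 0.01, 0.4):
    print(f"lr={lr}: w after 20 steps = {gradient_descent(lr):.4f}")
# lr=1.1 diverges (overshoots), lr=0.01 barely moves (too slow), lr=0.4 converges quickly to 0.
```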

Example 2: Number of Trees in Random Forest

 Random Forest is an ensemble learning technique where multiple decision trees vote for predictions.

 The number of trees is a hyperparameter that controls model robustness.

🔹 Example:

 Too few trees → High variance; predictions are unstable and may generalize poorly.

 Too many trees → Increases computation time without significant accuracy gain.
4. How to Tune Hyperparameters?

1. Grid Search:

o Tries different combinations of hyperparameters and selects the best one.

2. Random Search:

o Randomly samples hyperparameters to find an optimal set faster.

3. Bayesian Optimization & Genetic Algorithms:

o Advanced methods for optimizing hyperparameters dynamically.
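
A minimal grid-search sketch with scikit-learn (the estimator, parameter grid, and values are arbitrary examples):

```python
# Hyperparameter tuning sketch: evaluate each combination with cross-validation.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Candidate hyperparameter values to try.
param_grid = {"n_estimators": [10, 50, 100], "max_depth": [2, 4, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print("Best hyperparameters:", search.best_params_)
print("Best cross-validation score:", search.best_score_)
```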

Q. Difference Between Underfitting and Overfitting

1. Definition:

o Underfitting: The model is too simple and fails to capture patterns in the training data.

o Overfitting: The model is too complex and learns noise along with actual patterns.

2. Bias-Variance Tradeoff:

o Underfitting: High bias, low variance – makes strong assumptions and oversimplifies the data.

o Overfitting: High variance, low bias – memorizes training data instead of generalizing.

3. Performance on Training Data:

o Underfitting: Poor – does not learn well from training data.

o Overfitting: Excellent – fits the training data almost perfectly.

4. Performance on Test Data:

o Underfitting: Poor – fails to generalize and gives inaccurate predictions.

o Overfitting: Poor – fails to generalize; performs well on training but poorly on unseen data.

5. Cause:

o Underfitting: Model is too simple (e.g., using linear regression for non-linear data); not enough features; insufficient training (too few epochs in deep learning).

o Overfitting: Model is too complex (e.g., a deep neural network with too many layers); too many features (including irrelevant ones); lack of proper regularization.

6. Solution:

o Underfitting: Use a more complex model (e.g., polynomial regression instead of linear regression); add more relevant features; train the model longer.

o Overfitting: Reduce model complexity (e.g., prune decision trees, reduce deep learning layers); use regularization techniques (L1, L2); increase training data.

7. Example:

o Underfitting: Using linear regression to fit a highly non-linear dataset.

o Overfitting: A deep neural network with many layers that memorizes training data but performs poorly on new data.

1. Addressing Overfitting Using Regularization

Overfitting happens when the model is too complex and learns noise in the training data,
leading to poor generalization.

How Regularization Helps?

✅ Reduces model complexity by shrinking large weights.


✅ Forces the model to focus on important patterns instead of noise.
✅ Prevents extreme variations in predictions.

Techniques to Reduce Overfitting

1. L1 Regularization (Lasso Regression)

o Adds absolute values of weights as a penalty: Loss = MSE + λ·Σ|wᵢ|

o Helps in feature selection by reducing some weights to zero (removing irrelevant features).

2. L2 Regularization (Ridge Regression)

o Adds squared values of weights as a penalty: Loss = MSE + λ·Σwᵢ² (a short sketch comparing both penalties follows below).
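
A small scikit-learn sketch comparing an unregularized fit with L1 (Lasso) and L2 (Ridge) penalties (the synthetic data and alpha values are illustrative):

```python
# Regularization sketch: L1 and L2 penalties shrink weights to curb overfitting.
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))                     # 10 features, but only the first 2 matter
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 50)

for name, model in [("Plain", LinearRegression()),
                    ("L1/Lasso", Lasso(alpha=0.1)),
                    ("L2/Ridge", Ridge(alpha=1.0))]:
    model.fit(X, y)
    print(f"{name:9s} weights:", np.round(model.coef_, 2))
# Lasso drives the irrelevant weights to exactly zero; Ridge shrinks them toward zero.
```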

2. Addressing Underfitting Using Regularization

Underfitting happens when the model is too simple and cannot capture underlying patterns.

How Regularization Helps?

✅ Prevents excessive simplification of the model.


✅ Ensures the model learns enough complexity to fit the data properly.
✅ Helps adjust the penalty term (λ) to find the right balance.

Techniques to Reduce Underfitting

1. Increase Model Complexity


o If underfitting occurs, using a deeper neural network or adding polynomial
features can improve learning.

2. Train Longer & Use More Data

o Sometimes, underfitting is caused by stopping training too early.

o Increasing epochs or using more data helps the model learn better.

Q. Advantages and Challenges of Deep Learning

1. Advantages of Deep Learning

Deep Learning (DL) has revolutionized artificial intelligence by enabling machines to learn
complex patterns and representations. It outperforms traditional machine learning in
various tasks due to its ability to automate feature extraction, handle large data, and
generalize well.

(i) Automatic Feature Extraction

 Traditional Machine Learning requires manual feature engineering, but DL automatically learns features from raw data.

 Eliminates the need for domain expertise in feature selection.

 Example: In image recognition, CNNs (Convolutional Neural Networks) automatically extract edges, textures, and object features without human intervention.

(ii) High Accuracy and Generalization

 Deep Learning models, especially deep neural networks (DNNs), achieve state-of-the-art accuracy in tasks like image recognition, speech processing, and NLP.

 They generalize well when trained on large, diverse datasets.

 Example: Google's BERT model provides human-like accuracy in natural language understanding (NLU), powering applications like Google Search and chatbots.

(iii) Scalability with Big Data

 DL models perform better with more data, unlike traditional ML models which
saturate quickly.

 Example: Deep Learning is used in Autonomous Vehicles, where massive sensor data
is processed to detect objects, predict movements, and navigate safely.

(iv) Handles Unstructured Data Efficiently


 Deep learning can analyze images, audio, text, and videos, unlike traditional ML
which struggles with unstructured data.

 Example: Netflix uses DL for personalized movie recommendations based on user behavior and viewing history.

2. Challenges of Deep Learning

Despite its advantages, Deep Learning has some limitations that make its deployment and
implementation challenging.

(i) High Computational Cost & Hardware Dependency

 Training deep networks requires powerful GPUs/TPUs and high memory, making it
expensive.

 Example: Training GPT-4 requires thousands of GPUs, making it costly for small
companies to implement.

(ii) Requires a Large Amount of Data

 Deep learning models perform well only when trained on large labeled datasets.

 Small datasets can lead to poor generalization and overfitting.

 Example: In medical diagnosis, DL models require thousands of labeled medical images to predict diseases accurately. Small datasets lead to incorrect diagnoses.

(iii) Black-Box Nature (Lack of Interpretability)

 DL models make highly accurate predictions, but their internal decision-making process is difficult to interpret.

 This limits their use in applications where explainability is essential (e.g., healthcare
and finance).

 Example: In loan approval systems, deep learning models predict whether a person
should get a loan, but banks cannot explain the exact reasons for rejection.

(iv) Longer Training Time & Hyperparameter Tuning Complexity

 Training deep models can take hours to weeks, depending on data and model
complexity.

 Requires careful tuning of learning rate, batch size, number of layers, etc., which can
be time-consuming.

 Example: Training self-driving car models takes months, requiring massive data
collection and continuous fine-tuning.
Q. How Deep Learning Works: Understanding Neurons, Layers, and Networks

Deep Learning is a subset of Machine Learning that uses Artificial Neural Networks (ANNs)
to simulate the way the human brain processes information. It involves three core
components: neurons, layers, and networks. These elements work together to perform
complex tasks such as image recognition, natural language processing, and autonomous
driving.

1. Neurons: The Building Blocks of Deep Learning

A neuron is the fundamental unit of a deep learning model, inspired by biological neurons in
the brain.

Structure of a Neuron:

Each neuron in a deep learning model consists of the following components:

 Inputs (x₁, x₂, ..., xₙ): These represent the features of the input data (e.g., pixel values in an image).

 Weights (w₁, w₂, ..., wₙ): These determine the importance of each input.

 Bias (b): An additional parameter that helps the model shift decision boundaries.

 Summation Function (∑wᵢxᵢ + b): Computes a weighted sum of inputs.

 Activation Function: Applies a transformation to introduce non-linearity. Examples include ReLU, Sigmoid, and Tanh.

Mathematical Representation of a Neuron:

y = f(∑wᵢxᵢ + b)

where f is the activation function that decides whether the neuron should fire or not.

Example of a Neuron in Action

Consider an image classification task where we need to identify if an image contains a cat.
Each neuron in the first layer takes pixel values as inputs, multiplies them by weights, adds a
bias, and then applies an activation function to determine the next action.
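
A minimal NumPy sketch of one neuron's forward pass (the input, weight, and bias values are made up for illustration):

```python
# Forward pass of a single neuron: weighted sum plus bias, then an activation function.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([0.5, 0.8, 0.2])      # inputs (e.g., normalized pixel values)
w = np.array([0.4, -0.6, 0.9])     # weights: importance of each input
b = 0.1                            # bias shifts the decision boundary

z = np.dot(w, x) + b               # summation: sum(w_i * x_i) + b
y = sigmoid(z)                     # activation decides how strongly the neuron "fires"
print(f"weighted sum z = {z:.3f}, output y = {y:.3f}")
```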

2. Layers: Organizing Neurons to Extract Features

Neurons are grouped into layers, each performing a different role in data processing.

Types of Layers in a Neural Network:

(i) Input Layer:


 The first layer that receives raw input data (e.g., image pixels, text embeddings).

 Passes information to the next layer without modifications.

(ii) Hidden Layers:

 These layers extract important features by learning patterns.

 Each neuron in a hidden layer applies a transformation to the input data.

 Deep Learning involves multiple hidden layers, making it a Deep Neural Network
(DNN).

(iii) Output Layer:

 Produces the final prediction (e.g., classification label, probability score).

 Uses an activation function like Softmax (for classification) or Linear Activation (for
regression tasks).

Example of Layers in Deep Learning

 In an image recognition model, the first hidden layer detects simple edges, the next
layers detect shapes, and the final layers recognize objects like "cat" or "dog."

3. Networks: Connecting Layers to Form a Deep Learning Model

A Deep Neural Network (DNN) consists of multiple layers of interconnected neurons.

Types of Deep Learning Networks:

(i) Feedforward Neural Networks (FNNs)

 Data flows in one direction, from input to output.

 Used in basic classification and regression tasks.

(ii) Convolutional Neural Networks (CNNs)

 Specialized for image processing tasks.

 Uses convolutional layers to detect edges, textures, and objects.

 Example: Used in facial recognition systems.

(iii) Recurrent Neural Networks (RNNs)

 Used for sequence-based tasks like text and speech recognition.

 Maintains memory using feedback loops.

 Example: Used in chatbots and language translation.


Q. Common Architectural Principles of Deep Networks

Deep Learning models are built on fundamental architectural principles that help in efficient
learning, generalization, and optimization. These principles define how neurons, layers,
and networks interact to solve complex tasks. Below are the key architectural principles
used in deep networks:

1. Layered Hierarchical Representation

Principle:

Deep networks stack multiple layers, where each layer learns a higher-level abstraction of
the input data. The deeper the network, the more complex features it can extract.

How It Works:

 Lower layers learn basic features (e.g., edges in an image).

 Intermediate layers learn patterns (e.g., textures or shapes).

 Higher layers detect complex objects (e.g., faces, cars).

Example:

 In image recognition, early layers detect edges, mid-level layers recognize shapes,
and deeper layers classify objects (e.g., "dog" or "cat").

 In speech recognition, initial layers detect phonemes, while deeper layers understand words and sentences.

2. Non-linearity for Complex Representations

Principle:

Deep networks use non-linear activation functions (e.g., ReLU, Sigmoid, Tanh) to allow
complex transformations. Without non-linearity, multiple layers would behave like a single
linear transformation.

How It Works:

 ReLU (Rectified Linear Unit): Introduces non-linearity by setting negative values to zero.

 Sigmoid and Tanh: Used in older networks for binary classification but suffer from
vanishing gradient problems.

Example:

 In a CNN, ReLU activation allows the model to detect edges, curves, and shapes.
 In transformer models (like BERT or GPT-4), non-linear activations help in capturing
contextual relationships in text.

3. Feature Extraction through Convolutions and Pooling

Principle:

Instead of processing raw input directly, deep networks extract relevant features through
convolutions and pooling.

How It Works:

 Convolutional Layers use small filters to detect patterns in images.

 Pooling Layers reduce dimensionality by summarizing important features, improving computational efficiency.

Example:

 A CNN for facial recognition uses convolutional layers to detect eyes, nose, and
mouth patterns.

 A speech recognition system extracts important frequencies using similar techniques.
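
A small NumPy sketch of max pooling on a hypothetical 4×4 feature map (the values are made up; deep learning frameworks implement pooling far more efficiently):

```python
# 2x2 max pooling: halve each spatial dimension, keeping the strongest response in each block.
import numpy as np

feature_map = np.array([[1, 3, 2, 0],
                        [4, 6, 1, 2],
                        [7, 2, 9, 5],
                        [3, 1, 4, 8]], dtype=float)

# Reshape into 2x2 blocks and take the maximum of each block.
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6. 2.], [7. 9.]]
```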

4. Attention Mechanism for Context Awareness

Principle:

Instead of treating all inputs equally, deep networks use attention mechanisms to focus on
important parts of the input data.

How It Works:

 Attention assigns higher weights to important words in a sentence or key objects in an image.

 Transformers (e.g., GPT, BERT) rely on self-attention to process text efficiently.

Example:

 Google Translate uses attention to focus on important words while translating sentences.

 Chatbots (e.g., ChatGPT) use attention to generate coherent and context-aware responses.
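
A minimal NumPy sketch of scaled dot-product attention, the operation at the heart of transformer self-attention (the matrix sizes are arbitrary; real models add learned projections and multiple heads):

```python
# Scaled dot-product attention: weight each value by how relevant its key is to the query.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # how strongly each query attends to each key
    weights = softmax(scores)         # higher weight = more important input
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 tokens, 4-dimensional queries
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))

output, weights = attention(Q, K, V)
print("attention weights per token:\n", np.round(weights, 2))
```
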
Q. Difference Between Machine Learning (ML) and Deep Learning (DL)

1. Definition:

o ML: Uses algorithms to analyze data, learn patterns, and make decisions.

o DL: Uses artificial neural networks (ANNs) to automatically extract complex patterns from data.

2. Data Dependency:

o ML: Works well with small to medium-sized datasets.

o DL: Requires large volumes of data for better performance.

3. Feature Engineering:

o ML: Requires manual feature selection and extraction by experts.

o DL: Automatically learns features without manual intervention.

4. Training Time:

o ML: Faster training due to simpler algorithms.

o DL: Slower training due to deep neural network complexity.

5. Computational Power:

o ML: Requires less computing power (runs on CPUs).

o DL: Needs high computational power (often GPUs/TPUs).

6. Interpretability:

o ML: More interpretable and easier to debug.

o DL: Acts as a black box, making it hard to interpret decisions.

7. Examples:

o ML: Email spam filters, fraud detection, recommendation systems.

o DL: Self-driving cars, facial recognition, voice assistants (like Siri, Alexa).
