
Introduction to AI and ML: Easy-to-Understand Notes

What is Artificial Intelligence (AI)?

Imagine you want a computer to be smart, like a human. That's the main idea behind Artificial Intelligence (AI). It's a
big field in computer science that tries to make machines do things that normally need human brains.

This includes things like:

• Learning: Getting better at tasks over time.

• Understanding language: Like talking to Alexa or Siri.

• Seeing things: Like a self-driving car recognizing a stop sign.

• Solving problems: Like a computer beating a chess champion.

• Making decisions: Choosing the best path in a game.

In short, AI is about making computers think and act smart.

What is Machine Learning (ML)?

Machine Learning (ML) is a big part of AI. Think of it as a way to teach computers without telling them every single
step. Instead, you show them lots and lots of examples (data), and they learn from those examples.

Imagine you want to teach a computer to spot cats in pictures.

• Old way (not ML): You'd write very specific rules: "If it has pointy ears AND whiskers AND a tail AND says
'meow', it's a cat." This gets really complicated for every tiny detail!

• ML way: You show the computer thousands of pictures – some with cats, some without. You tell it, "This one
is a cat, this one isn't." After seeing enough examples, the computer figures out for itself what makes a cat a
cat. It learns the patterns.

So, ML is about computers learning from data, without being told every single rule. It's like learning from experience.

Why is Machine Learning so important?

• It helps computers tackle problems that are too complex to program with simple rules.

• ML systems can get smarter over time as they get more data.

• They can find hidden patterns in huge amounts of information that humans would never see.

Where Do We See Machine Learning in Action?

ML is everywhere around us!

• Image & Face Recognition:

o Unlocking your phone with your face.

o Self-driving cars knowing what's on the road (people, other cars).

o Doctors using computers to find diseases in X-rays.

• Understanding Language:

o Talking to voice assistants like Alexa or Siri.

o Google Translate converting languages for you.


o Your email knowing if a message is spam or not.

• Recommendations:

o Netflix suggesting movies you might like.

o Amazon showing you products you might want to buy.

o Spotify recommending new songs.

• Healthcare:

o Helping find new medicines faster.

o Predicting if someone might get a certain illness.

• Money & Finance:

o Banks catching fraudulent transactions (stolen credit cards).

o Automated trading on the stock market.

• Robots:

o Teaching robots how to walk or pick up different objects.

• Self-Driving Cars & Drones:

o ML helps them "see" the world, make decisions, and navigate.

Types of Machine Learning

There are three main ways machine learning algorithms learn:

1. Supervised Learning ("Learning with a Teacher")

In Supervised Learning, we train the machine using "labelled" data. This means the data comes with the correct
answers or outcomes already provided. It's like a teacher supervising a student, giving them examples with solutions
so they can learn the rules.

• The Idea: The computer learns from labelled data. This means for every piece of information you give it, you
also tell it the correct answer.

• How it works:

o You give the computer lots of examples: "This picture is a cat," "This picture is a dog." (The picture is
the data, "cat" or "dog" is the label/answer).

o The computer studies these examples and learns the connection between the picture and its label.

o Once it's learned, you can show it a new picture it's never seen, and it will try to guess if it's a cat or a
dog.
• When to use it:

o Classification: When you want to put things into categories.

▪ Examples: Is this email spam or not spam? Is this fruit an apple or an orange? Does this patient
have disease A or disease B?

o Regression: When you want to predict a number.

▪ Examples: How much will this house sell for? What will the temperature be tomorrow?

• Think of it like: A student doing homework with an answer key. They check their answers to learn.
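
To make the idea concrete, here is a minimal sketch of supervised classification, assuming the scikit-learn library is installed; the tiny fruit dataset below is invented purely for illustration.

from sklearn.tree import DecisionTreeClassifier

# Labelled examples: [weight in grams, skin smoothness from 0 to 1]
X = [[150, 0.9], [170, 0.95], [130, 0.2], [120, 0.1]]   # the data
y = ["apple", "apple", "orange", "orange"]              # the labels (correct answers)

model = DecisionTreeClassifier()
model.fit(X, y)                       # the "studying with an answer key" step

print(model.predict([[160, 0.85]]))   # a new, unseen fruit -> expected ['apple']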

2. Unsupervised Learning ("Learning without a Teacher")

Unsupervised Learning deals with "unlabelled" data, meaning there are no pre-defined correct answers. The
machine must find hidden patterns or structures within the data all by itself. It's like letting a student explore and
discover things on their own without direct guidance.

• The Idea: The computer gets unlabelled data. This means you just give it information, but you don't tell it any
correct answers or categories. The computer must find hidden patterns or structures on its own.

• How it works:

o You give the computer a big pile of data (like a box full of mixed LEGO bricks).
o The computer looks for similarities and groups the data into natural categories based on what it finds.

o It's like finding natural groups or ways to organize the data without any help.

• When to use it:

o Clustering: Grouping similar things together.

▪ Examples: Finding different types of customers in a store's data based on what they buy,
without being told what those types are beforehand. Grouping similar news articles together.

o Finding Hidden Connections: Discovering unexpected relationships.

▪ Examples: People who buy barbecue grills often also buy charcoal, lighter fluid, etc.

• Think of it like: A student exploring a new topic on their own, trying to find out how different ideas connect.
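
As a small illustration, here is a sketch of clustering with scikit-learn's KMeans (again assuming scikit-learn is installed; the customer numbers are made up):

import numpy as np
from sklearn.cluster import KMeans

# Unlabelled data: [visits per month, average spend] for six customers -- no answers given
X = np.array([[ 2,  15], [ 3,  20], [ 2,  18],
              [40, 150], [42, 160], [38, 155]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)   # e.g. [0 0 0 1 1 1] -- two natural groups found without any labels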

3. Reinforcement Learning ("Learning by Trial and Error")

Reinforcement Learning is about an agent learning to make decisions by trial and error in an environment. The agent
receives rewards for good actions and penalties for bad ones, just like training a pet. The goal is to learn a policy
that maximizes the total reward.

• The Idea: The computer (called an "agent") learns by trying things out in an "environment." It gets rewards
for doing good things and penalties for bad things. Its goal is to learn a strategy that gets it the most rewards
over time.

• How it works:

o Agent (the computer program) is in an Environment (like a game or a robot's world).

o The agent tries an Action (moves a chess piece, makes a robot take a step).

o The environment gives the agent a Reward (positive points for a good move) or a Penalty (negative
points for a bad move).

o The agent uses these rewards/penalties to learn which actions are best in different situations, slowly
getting better and better.
• When to use it:

o Playing Games: Teaching AI to play complex games like chess, Go, or even video games, often better
than humans.

o Robotics: Training robots to walk, grab objects, or navigate tricky places.

o Self-Driving Cars: Helping cars learn how to make decisions on the road (like when to speed up, slow
down, or turn).

• Think of it like: Training a pet. When it does something good, you give it a treat. If it does something bad, you
might say "no." Over time, the pet learns what to do to get treats.
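
The reward/penalty loop can be captured in a few lines of tabular Q-learning. The toy "corridor" environment below is an invented example used only to show the update rule:

import numpy as np

# Toy corridor with 5 positions; reaching position 4 ends the episode with reward +1.
n_states, n_actions = 5, 2           # actions: 0 = step left, 1 = step right
Q = np.zeros((n_states, n_actions))  # the agent's "knowledge" (Q-table)
alpha, gamma, epsilon = 0.5, 0.9, 0.3  # fairly high exploration for this tiny example

rng = np.random.default_rng(0)
for episode in range(200):
    state = 0
    while state != 4:
        # Sometimes explore a random action, otherwise exploit the best-known one
        action = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[state]))
        next_state = max(state - 1, 0) if action == 0 else min(state + 1, 4)
        reward = 1.0 if next_state == 4 else 0.0
        # Q-learning update: nudge Q towards (reward + discounted best future value)
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

print(np.argmax(Q, axis=1))  # learned policy per state; "move right" (1) should dominate for states 0-3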
Comparison of the Three Types

Primary Goal
• Supervised Learning: Predict a specific output variable (label) based on input features.
• Unsupervised Learning: Discover hidden patterns, structures, or relationships within the data.
• Reinforcement Learning: Learn optimal sequences of actions in an environment to maximize cumulative reward.

Data Requirement
• Supervised Learning: Labelled data – each example has input features (X) and a corresponding known output target (Y).
• Unsupervised Learning: Unlabelled data – examples contain only input features (X); no predefined output target.
• Reinforcement Learning: Interaction-based – data (states, actions, rewards) is generated through experience by interacting with an environment, often represented in tabular form (e.g., a Q-table).

Learning Process
• Supervised Learning: The algorithm learns a mapping from X to Y by finding correlations and dependencies in the labelled data.
• Unsupervised Learning: The algorithm explores the inherent structure of the data itself, identifying groups, dimensions, or anomalies without external guidance.
• Reinforcement Learning: The agent learns through trial and error by taking actions in different states and receiving rewards/penalties, iteratively updating its "knowledge" (e.g., Q-values).

Feedback Mechanism
• Supervised Learning: Direct feedback – the algorithm's predictions are directly compared against the known true labels (Y).
• Unsupervised Learning: No direct feedback – relies on internal metrics (e.g., distance, density) to evaluate discovered patterns.
• Reinforcement Learning: Reward signal – receives a scalar reward from the environment after each action, indicating the immediate goodness of the action in the current state.

Typical Problems
• Supervised Learning: Classification, regression.
• Unsupervised Learning: Clustering, dimensionality reduction, association rule mining, anomaly detection.
• Reinforcement Learning: Sequential decision making, control problems, optimization in dynamic environments.

Output
• Supervised Learning: Predictions (e.g., a class label, a numerical value) for new, unseen data.
• Unsupervised Learning: Groups/clusters of data points, reduced feature sets, identified anomalies, association rules.
• Reinforcement Learning: An optimal "policy" – a mapping from states to the best actions to take in those states.

Common Algorithms
• Supervised Learning: Linear Regression, Logistic Regression, Decision Trees, Random Forests, Gradient Boosting Machines (XGBoost, LightGBM), SVMs, simple Neural Networks.
• Unsupervised Learning: K-Means, Hierarchical Clustering, DBSCAN, PCA, t-SNE, Apriori, Isolation Forest.
• Reinforcement Learning: Q-Learning, SARSA (for small, discrete state/action spaces where Q-tables are feasible); for complex state spaces, often combined with function approximators (e.g., Deep Q-Networks).

Example
• Supervised Learning: Predicting house prices – given features like size, location, and number of rooms (input X), predict the selling price (output Y).
• Unsupervised Learning: Customer segmentation – grouping customers into distinct segments (e.g., "High-Value," "Budget Shopper") based on their purchase history and demographics (input X).
• Reinforcement Learning: Inventory management – an agent learns to adjust order quantities based on current stock levels and demand (states) to minimize holding costs and maximize availability (rewards).

Limitations
• Supervised Learning: Can struggle with very high-dimensional data or highly non-linear relationships without complex models.
• Unsupervised Learning: Can be sensitive to feature scaling and the "curse of dimensionality" for very wide datasets.
• Reinforcement Learning: Severe curse of dimensionality – tabular Q-tables become unmanageably large and sparse for continuous or very high-dimensional state/action spaces; limited to simple, discrete environments.

NumPy
NumPy (Numerical Python) is the foundational library for numerical computing in Python. It provides a high-
performance multidimensional array object, and tools for working with these arrays. It is the cornerstone for many
other scientific Python libraries, including SciPy, Matplotlib, and scikit-learn.

Core Advantage: NumPy's primary benefit lies in its ability to perform operations on large datasets significantly
faster and with greater memory efficiency than standard Python lists, largely due to its implementation in C and
Fortran.

Difference between a Python list and a NumPy array

Data Type
• Python List: Heterogeneous – can store elements of different data types (e.g., [1, 'abc', 3.14]).
• NumPy Array (ndarray): Homogeneous – all elements must be of the same data type (e.g., [1, 2, 3] are all integers).

Memory Efficiency
• Python List: Less efficient – stores references to objects, which can be scattered in memory; higher overhead per element.
• NumPy Array: Highly efficient – stores elements directly and contiguously in memory; lower overhead per element.

Performance (Numerical Ops)
• Python List: Slower – operations require explicit Python loops (e.g., list comprehensions); interpretation and dynamic typing add overhead.
• NumPy Array: Significantly faster (vectorized) – operations are applied to the entire array at once, optimized in C/Fortran.

Functionality
• Python List: General purpose – basic list operations (append, insert, pop, sort, concatenate); limited mathematical functions.
• NumPy Array: Specialized for numerics – extensive math operations (element-wise, ufuncs), linear algebra, statistics, reshaping, broadcasting, etc.

Size/Mutability
• Python List: Dynamic size – can grow or shrink (add/remove elements) after creation; mutable.
• NumPy Array: Fixed size – size is fixed upon creation; reshaping creates a view or a new array; element values are mutable.

Ease of Use (Math)
• Python List: Requires loops for element-wise math (+ concatenates lists, * repeats lists).
• NumPy Array: Direct arithmetic operations are element-wise (+, -, *, / operate on corresponding elements).

Dimensions
• Python List: Primarily 1-dimensional, though lists can be nested (list of lists) to simulate multiple dimensions.
• NumPy Array: Natively supports multi-dimensional arrays (1D, 2D, 3D, ..., nD).

Common Use Cases
• Python List: Storing collections of varied data, handling flexible sequences, general programming.
• NumPy Array: Scientific computing, data analysis, machine learning, image processing, any task involving large numerical datasets.

Core Concept
• Python List: A flexible, ordered collection of arbitrary Python objects.
• NumPy Array: A fixed-size, homogeneous container for numerical data, optimized for array operations.
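
As a quick sketch of the "Ease of Use (Math)" difference above, note how the same + operator behaves on each type:

import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

print(py_list + py_list)   # [1, 2, 3, 1, 2, 3] -> lists are concatenated
print(np_arr + np_arr)     # [2 4 6]            -> arrays are added element-wise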

Demonstrating the execution time difference between a Python list and a NumPy array

import time
import numpy as np

N = 10**6 # One million elements (small enough for quick run, large enough for difference)

start_time_list = time.perf_counter()
list_result = [x + x for x in range(N)] # Double each number
end_time_list = time.perf_counter()
print(f"List time: {end_time_list - start_time_list:.6f} seconds")

# --- NumPy Array Example ---


start_time_numpy = time.perf_counter()
numpy_arr = np.arange(N) # Create array [0, 1, ..., N-1]
numpy_result = numpy_arr + numpy_arr # Double each number (vectorized)
end_time_numpy = time.perf_counter()
print(f"NumPy time: {end_time_numpy - start_time_numpy:.6f} seconds")

# --- Optional: Calculate Speedup ---


# print(f"NumPy is ~{(end_time_list - start_time_list) / (end_time_numpy - start_time_numpy):.2f}x faster")

How to run it:


1. Save the code as a Python file (e.g., speed_test_simple.py).
2. Run from your terminal: python speed_test_simple.py

Expected Output (will vary slightly on your machine):


List time: 0.085345 seconds
NumPy time: 0.001234 seconds

What this shows:

Even with just 1 million elements, you'll see that the NumPy operation completes in milliseconds, while the
equivalent Python list operation takes significantly longer (tens or hundreds of milliseconds). This illustrates the
efficiency gain of NumPy's vectorized operations.

Installation and Importation


To begin utilizing NumPy, ensure it's properly set up in your Python environment.
1. Installation (One-time Setup):
If NumPy is not already installed, use pip, the Python package installer:
pip install numpy

This command downloads and installs the necessary packages.


2. Importation (Per Script/Session):
In every Python script or interactive session where you intend to use NumPy, it's standard practice to import it
using the widely adopted alias np:

import numpy as np

This allows you to refer to NumPy functions and objects using the convenient np. prefix.

Example:

import numpy as np
print(f"NumPy version installed: {np.__version__}")

Understanding the ndarray (The Core Object)


The fundamental object in NumPy is the ndarray (N-dimensional array). Unlike Python lists, ndarray objects are
homogeneous (all elements must be of the same data type) and have a fixed size upon creation.

Key Attributes of an ndarray:


● .shape: A tuple indicating the dimensions of the array (e.g., (rows, columns) for a 2D array).
● .ndim: The number of array dimensions (axes).
● .size: The total number of elements in the array.
● .dtype: The data type of the elements (e.g., int32, float64).
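
For instance, inspecting these attributes on a small 2x3 array looks like this:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print("Shape:", arr.shape)   # (2, 3) -> 2 rows, 3 columns
print("Ndim :", arr.ndim)    # 2
print("Size :", arr.size)    # 6 -> total number of elements
print("Dtype:", arr.dtype)   # e.g. int64 (platform-dependent)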

Creating NumPy Arrays


Arrays can be created from existing Python sequences or generated using specialized NumPy functions.

3.1. From Python Lists/Tuples

Convert standard Python lists (or nested lists for higher dimensions) into NumPy arrays.

Example 1: 1-Dimensional Array (Vector)

import numpy as np

data_1d = [10, 20, 30, 40, 50]


arr_1d = np.array(data_1d)

print("1D Array:", arr_1d)


print("Shape:", arr_1d.shape) # Output: (5,)
print("Dimensions:", arr_1d.ndim) # Output: 1

Example 2: 2-Dimensional Array (Matrix)

import numpy as np

data_2d = [[1, 2, 3], [4, 5, 6]]


arr_2d = np.array(data_2d)

print("\n2D Array:\n", arr_2d)


print("Shape:", arr_2d.shape) # Output: (2, 3) (2 rows, 3 columns)
print("Dimensions:", arr_2d.ndim) # Output: 2

3.2. Using Built-in Array Creation Functions

NumPy provides functions for generating arrays with specific initial values or patterns.

Example 3: np.zeros(), np.ones()

Create arrays filled with zeros or ones.

import numpy as np

zeros_array = np.zeros((3, 4)) # 3 rows, 4 columns of zeros


ones_array = np.ones(5) # 1D array of 5 ones
print("\nZeros Array (3x4):\n", zeros_array)
print("Ones Array (1D):\n", ones_array)

Example 4: np.arange()

Generate an array with values within a specified range (similar to Python's range()).

import numpy as np

sequence_array = np.arange(0, 10, 2) # Start 0, end before 10, step 2


print("\nSequence Array (0 to 9, step 2):", sequence_array) # Output: [0 2 4 6 8]

Example 5: np.linspace()

Create an array of evenly spaced numbers over a specified interval.

import numpy as np

evenly_spaced = np.linspace(0, 10, 5) # 5 values between 0 and 10, inclusive


print("\nEvenly Spaced Array (5 points from 0 to 10):", evenly_spaced) # Output: [ 0. 2.5 5. 7.5 10. ]

Example 6: np.random module

Generate arrays with random numbers.

import numpy as np

random_floats = np.random.rand(2, 3) # 2x3 array of random floats between 0 and 1


random_integers = np.random.randint(0, 100, size=(2, 2)) # 2x2 array of random integers [0, 99]

print("\nRandom Floats (2x3):\n", random_floats)


print("Random Integers (2x2):\n", random_integers)

Indexing and Slicing Arrays

Accessing specific elements or subsets of an array is crucial. NumPy's indexing and slicing capabilities are powerful
and flexible.

4.1. Basic Indexing

Access elements by their position (index). Remember, indexing starts at 0.


Example 1: 1D Array Indexing

import numpy as np

arr = np.array([10, 20, 30, 40, 50])


print("\nOriginal Array:", arr)
print("Element at index 0:", arr[0]) # Output: 10
print("Element at last index:", arr[-1]) # Output: 50

Example 2: 2D Array Indexing

Use [row_index, col_index].

import numpy as np

matrix = np.array([[1, 2, 3],


[4, 5, 6],
[7, 8, 9]])
print("\nOriginal Matrix:\n", matrix)
print("Element at (0, 0):", matrix[0, 0]) # Output: 1
print("Element at (1, 2):", matrix[1, 2]) # Output: 6

4.2. Slicing Arrays

Extract sub-arrays using [start:stop:step] notation, similar to Python lists, but extended for multiple dimensions.

Example 3: 1D Array Slicing

import numpy as np

arr = np.arange(10) # [0 1 2 3 4 5 6 7 8 9]
print("\nOriginal Array:", arr)
print("Elements from index 2 to 7 (exclusive):", arr[2:7]) # Output: [2 3 4 5 6]
print("Every other element:", arr[::2]) # Output: [0 2 4 6 8]

Example 4: 2D Array Slicing

Apply slicing independently to each dimension.

import numpy as np

matrix = np.array([[10, 20, 30, 40],


[50, 60, 70, 80],
[90, 100, 110, 120]])
print("\nOriginal Matrix:\n", matrix)
print("First two rows, first three columns:\n", matrix[:2, :3])
# Output:
# [[10 20 30]
# [50 60 70]]
print("All rows, second column only:\n", matrix[:, 1]) # Output: [20 60 100]

4.3. Boolean (Mask) Indexing

Select elements based on a boolean condition. This creates a "mask" which filters the array.

Example 5: Filtering by Condition

import numpy as np

data = np.array([1, 5, 2, 8, 3, 9, 4, 7])


print("\nOriginal Data:", data)
filtered_data = data[data > 5] # Select elements where the value is greater than 5
print("Elements greater than 5:", filtered_data) # Output: [8 9 7]

Array Manipulation

NumPy provides functions to change array shape, combine, and split arrays.

5.1. Reshaping Arrays (.reshape())

Change the dimensions of an array without altering its data. The new shape must have the same total number of
elements.

Example 1: Reshaping 1D to 2D

import numpy as np

arr_1d = np.arange(9) # [0 1 2 3 4 5 6 7 8]
reshaped_arr = arr_1d.reshape((3, 3)) # Reshape to 3 rows, 3 columns
print("\nOriginal 1D:", arr_1d)
print("Reshaped to 3x3:\n", reshaped_arr)
● You can use -1 for one dimension and NumPy will infer its size: arr_1d.reshape((3, -1)).
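
For example, letting NumPy infer the second dimension:

import numpy as np

arr_1d = np.arange(12)
auto = arr_1d.reshape((3, -1))   # NumPy infers the missing dimension: 12 / 3 = 4 columns
print(auto.shape)                # (3, 4)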

5.2. Concatenating Arrays (np.concatenate(), np.vstack(), np.hstack())

Join arrays along a specified axis.

Example 2: Vertical and Horizontal Stacking

import numpy as np

arr_a = np.array([[1, 2], [3, 4]])


arr_b = np.array([[5, 6], [7, 8]])

print("\nArray A:\n", arr_a)


print("Array B:\n", arr_b)

v_stacked = np.vstack((arr_a, arr_b)) # Stack vertically (add rows)


h_stacked = np.hstack((arr_a, arr_b)) # Stack horizontally (add columns)

print("\nVertically Stacked:\n", v_stacked)


print("Horizontally Stacked:\n", h_stacked)

5.3. Splitting Arrays (np.split(), np.vsplit(), np.hsplit())

Divide an array into multiple sub-arrays.

Example 3: Splitting a Matrix

import numpy as np

arr = np.arange(16).reshape((4, 4))


print("\nOriginal Array for Splitting:\n", arr)

split_rows = np.vsplit(arr, 2) # Split into 2 arrays vertically (2 rows each)


split_cols = np.hsplit(arr, 4) # Split into 4 arrays horizontally (1 column each)

print("\nSplit into 2 vertical parts (list of arrays):\n", split_rows[0], "\n", split_rows[1])


print("\nSplit into 4 horizontal parts (list of arrays):\n", split_cols[0], "\n...", split_cols[3])

Mathematical Operations (Vectorization)

NumPy's core strength is its ability to perform element-wise operations efficiently, a concept known as vectorization.
This eliminates the need for explicit loops, leading to cleaner and faster code.

6.1. Element-wise Arithmetic

Standard arithmetic operators (+, -, *, /, **) apply element by element.

Example 1: Basic Operations

import numpy as np

a = np.array([10, 20, 30])


b = np.array([1, 2, 3])

print("\nArray 'a':", a)
print("Array 'b':", b)
print("a + b (element-wise):", a + b) # [11 22 33]
print("a * b (element-wise):", a * b) # [10 40 90]
print("a ** 2 (element-wise square):", a ** 2) # [100 400 900]

6.2. Universal Functions (UFuncs)

NumPy provides a wide array of mathematical functions (ufuncs) that operate element-wise.

Example 2: Common Mathematical Functions

import numpy as np

angles = np.array([0, np.pi/2, np.pi]) # np.pi is NumPy's pi constant


values = np.array([1, 4, 9])

print("\nAngles:", angles)
print("Sine of angles:", np.sin(angles)) # [0. 1. 0.]
print("\nValues:", values)
print("Square root of values:", np.sqrt(values)) # [1. 2. 3.]

6.3. Aggregation Functions


Compute summary statistics across an entire array or along specific axes.
Example 3: Aggregations

import numpy as np

data_matrix = np.array([[1, 2, 3],


[4, 5, 6],
[7, 8, 9]])
print("\nData Matrix:\n", data_matrix)

print("Sum of all elements:", np.sum(data_matrix)) # 45


print("Mean of all elements:", np.mean(data_matrix)) # 5.0
print("Maximum element:", np.max(data_matrix)) # 9
print("Standard Deviation:", np.std(data_matrix)) # 2.58...

# Aggregations along axes:


print("Sum of columns (axis=0):", np.sum(data_matrix, axis=0)) # [12 15 18]
print("Mean of rows (axis=1):", np.mean(data_matrix, axis=1)) # [2. 5. 8.]

● axis=0 refers to operations along columns (collapsing rows).


● axis=1 refers to operations along rows (collapsing columns).

6.4. Linear Algebra (np.linalg)

NumPy's linalg module provides essential linear algebra operations.

Example 4: Matrix Multiplication, Determinant, Inverse

import numpy as np

A = np.array([[1, 2], [3, 4]])


B = np.array([[5, 6], [7, 8]])

print("\nMatrix A:\n", A)
print("Matrix B:\n", B)

# Matrix multiplication (dot product)


dot_product = np.dot(A, B) # Or A @ B (Python 3.5+)
print("Matrix Product (A @ B):\n", dot_product)

# Determinant of A
det_A = np.linalg.det(A)
print("Determinant of A:", det_A)

# Inverse of A
inv_A = np.linalg.inv(A)
print("Inverse of A:\n", inv_A)

Broadcasting (Advanced Concept)


Broadcasting is a powerful mechanism that allows NumPy to perform operations on arrays of different shapes. When
operations are performed on arrays with compatible shapes, NumPy automatically "stretches" the smaller array
across the larger array without explicitly creating multiple copies of the smaller array.

Broadcasting Rules (simplified):


1. Equal ndim (number of dimensions): If arrays have different numbers of dimensions, the array with fewer
dimensions is "padded" with ones on its left side.
2. Compatible Dimensions: For each dimension, starting from the trailing (rightmost) dimension, the size of the
dimension must either be the same, or one of them must be 1.
3. Stretching: A dimension with size 1 is stretched to match the size of the other array in that dimension.

Example 1: Scalar and Array Broadcasting

import numpy as np

arr = np.array([1, 2, 3])


scalar = 10

result = arr + scalar # Scalar 10 is broadcast to [10, 10, 10]


print("\nArray + Scalar (Broadcasting):", result) # Output: [11 12 13]

Example 2: 1D Array and 2D Array Broadcasting

import numpy as np

matrix = np.array([[10, 20, 30],


[40, 50, 60]]) # Shape (2, 3)

vector = np.array([1, 2, 3]) # Shape (3,)

result = matrix + vector


print("\nMatrix + Vector (Broadcasting):\n", result)
# Vector (3,) is effectively broadcast to (2, 3) like:
# [[1 2 3],
# [1 2 3]]
# And then added element-wise.

Performance Considerations (Why NumPy is Preferred)


NumPy's speed advantage over standard Python lists for numerical operations is significant, especially with large
datasets. This is due to:
● Vectorization: Operations are performed on entire arrays at once, leveraging highly optimized C code under the
hood.
● Homogeneous Data Types: All elements in a NumPy array are of the same type, allowing for more efficient
memory storage and faster CPU operations.
● Contiguous Memory Allocation: NumPy arrays store their elements in a contiguous block of memory,
improving cache performance and access speed.

Example: Performance Comparison

import numpy as np
import time

data_size = 10**7 # 10 million elements

# Python Lists
list1 = list(range(data_size))
list2 = list(range(data_size))

start_time = time.perf_counter()
result_list = [x + y for x, y in zip(list1, list2)]
end_time = time.perf_counter()
print(f"\nTime for Python list addition ({data_size} elements): {end_time - start_time:.4f} seconds")

# NumPy Arrays
arr1 = np.arange(data_size)
arr2 = np.arange(data_size)

start_time = time.perf_counter()
result_arr = arr1 + arr2
end_time = time.perf_counter()
print(f"Time for NumPy array addition ({data_size} elements): {end_time - start_time:.4f} seconds")

Running this comparison typically demonstrates NumPy performing the operation orders of magnitude faster.

NumPy Array Arithmetic


The core idea is vectorization: instead of writing explicit loops to perform an operation on each element, you apply
the operation directly to the entire array (or between two arrays), and NumPy handles the element-wise
computation in highly optimized, low-level code (often C or Fortran).

1. Basic Element-wise Arithmetic Operations

When you perform standard arithmetic operations (+, -, *, /, **, %) between NumPy arrays of the same shape, or
between a NumPy array and a scalar (a single number), the operation is applied element-by-element.

1.1 Array and Scalar Operations

A scalar value is applied to every element in the array.

Example:

import numpy as np

arr = np.array([10, 20, 30, 40])


scalar = 5

print("Original Array:", arr)


print("Scalar:", scalar)

# Addition
print("\nArray + Scalar (Addition):", arr + scalar) # Each element gets 5 added
# Output: [15 25 35 45]

# Subtraction
print("Array - Scalar (Subtraction):", arr - scalar) # Each element gets 5 subtracted
# Output: [ 5 15 25 35]

# Multiplication
print("Array * Scalar (Multiplication):", arr * scalar) # Each element gets multiplied by 5
# Output: [ 50 100 150 200]

# Division
print("Array / Scalar (Division):", arr / scalar) # Each element gets divided by 5
# Output: [ 2. 4. 6. 8.]

# Power
print("Array ** Scalar (Power):", arr ** 2) # Each element gets squared
# Output: [ 100 400 900 1600]

# Modulus (Remainder)
print("Array % Scalar (Modulus):", arr % 3) # Remainder when divided by 3
# Output: [1 2 0 1]

1.2 Array and Array Operations


When performing operations between two arrays, they must generally have compatible shapes (either identical
shapes or broadcastable shapes, which we'll cover in the next section). The operation is then performed on
corresponding elements.

Example:

import numpy as np

arr1 = np.array([1, 2, 3, 4])


arr2 = np.array([5, 6, 7, 8])

print("Array 1:", arr1)


print("Array 2:", arr2)

# Addition
print("\nArray1 + Array2 (Element-wise Addition):", arr1 + arr2)
# Output: [ 6 8 10 12] (1+5, 2+6, 3+7, 4+8)

# Subtraction
print("Array1 - Array2 (Element-wise Subtraction):", arr1 - arr2)
# Output: [-4 -4 -4 -4]

# Multiplication
print("Array1 * Array2 (Element-wise Multiplication):", arr1 * arr2)
# Output: [ 5 12 21 32] (This is NOT matrix multiplication/dot product)

# Division
print("Array1 / Array2 (Element-wise Division):", arr1 / arr2)
# Output: [0.2 0.33333333 0.42857143 0.5 ]

# Power
print("Array1 ** Array2 (Element-wise Power):", arr1 ** arr2)
# Output: [1 64 2187 65536] (1^5, 2^6, 3^7, 4^8)

Important Note on Multiplication (*):

In NumPy, the * operator performs element-wise multiplication. If you need matrix multiplication (dot product), you
should use np.dot() or the @ operator (available in Python 3.5+).

import numpy as np

matrix_a = np.array([[1, 2],


[3, 4]])
matrix_b = np.array([[5, 6],
[7, 8]])

print("Matrix A:\n", matrix_a)


print("Matrix B:\n", matrix_b)
# Element-wise multiplication
print("\nElement-wise multiplication (A * B):\n", matrix_a * matrix_b)
# Output:
# [[ 5 12] (1*5, 2*6)
# [21 32]] (3*7, 4*8)

# Matrix multiplication (Dot product)


print("\nMatrix multiplication (np.dot(A, B)):\n", np.dot(matrix_a, matrix_b))
# Output:
# [[19 22] (1*5+2*7, 1*6+2*8)
# [43 50]] (3*5+4*7, 3*6+4*8)

# Matrix multiplication using @ operator (Python 3.5+)


print("Matrix multiplication (A @ B):\n", matrix_a @ matrix_b)

2. Broadcasting (Arithmetic with Different Shapes)

Broadcasting is NumPy's incredibly powerful feature that allows operations to be performed on arrays of different
shapes. It implicitly "stretches" the smaller array(s) to match the larger array's shape for the operation, without
actually creating copies in memory (unless necessary).

The rules for broadcasting are:


1. Equal ndim (dimensions): If arrays have different numbers of dimensions, the array with fewer dimensions is
"padded" with ones on its left side.
2. Compatible Dimensions: For each dimension, starting from the trailing (rightmost) dimension, the size of the
dimension must either be the same, or one of them must be 1.
3. Stretching: A dimension with size 1 is stretched to match the size of the other array in that dimension.

If these rules are not met, a ValueError: operands could not be broadcast together occurs.
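
For instance, a small sketch of two shapes that break rule 2, with the error caught so the script keeps running:

import numpy as np

a = np.ones((2, 3))   # trailing dimension is 3
b = np.ones(4)        # trailing dimension is 4 -- neither equal nor 1, so rule 2 fails

try:
    a + b
except ValueError as err:
    print("Broadcasting failed:", err)
    # e.g. "operands could not be broadcast together with shapes (2,3) (4,)"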

Example: Broadcasting a 1D Array to a 2D Array


import numpy as np

matrix = np.array([[10, 20, 30],


[40, 50, 60]]) # Shape: (2, 3)

row_vector = np.array([1, 2, 3]) # Shape: (3,)

print("Matrix:\n", matrix)
print("Row Vector:", row_vector)
# Matrix + Row Vector (Broadcasting)
# NumPy broadcasts row_vector (shape (3,)) to (1, 3) and then stretches it along axis 0 to (2, 3)
# It's like:
# [[1 2 3]
# [1 2 3]]
result = matrix + row_vector
print("\nMatrix + Row Vector (Broadcasted Addition):\n", result)
# Output:
# [[11 22 33]
# [41 52 63]]

Example: Broadcasting a Column Vector

To broadcast a 1D array as a column vector, you need to explicitly make it 2D with a column dimension of 1 using
np.newaxis or reshape(-1, 1).

import numpy as np

matrix = np.array([[10, 20, 30],


[40, 50, 60]]) # Shape: (2, 3)

col_vector = np.array([100, 200]) # Shape: (2,)


col_vector_reshaped = col_vector[:, np.newaxis] # Shape: (2, 1)

print("Matrix:\n", matrix)
print("Column Vector (original):", col_vector)
print("Column Vector (reshaped for broadcasting):\n", col_vector_reshaped)

# Matrix + Column Vector (Broadcasting)


# NumPy broadcasts col_vector_reshaped (shape (2, 1)) along axis 1 to (2, 3)
# It's like:
# [[100 100 100]
# [200 200 200]]
result = matrix + col_vector_reshaped
print("\nMatrix + Column Vector (Broadcasted Addition):\n", result)
# Output:
# [[110 120 130]
# [240 250 260]]


3. Comparison Operators
NumPy also supports element-wise comparison operators (==, !=, <, >, <=, >=). These operations return a boolean
array of the same shape as the input, indicating True or False for each element's comparison.

Example:

import numpy as np

arr1 = np.array([1, 5, 3, 8])


arr2 = np.array([2, 5, 4, 7])

print("Array 1:", arr1)


print("Array 2:", arr2)

print("\nArray1 == Array2 (Element-wise equality):", arr1 == arr2)


# Output: [False True False False]

print("Array1 < Array2 (Element-wise less than):", arr1 < arr2)


# Output: [ True False True False]

print("Array1 > 4 (Element-wise greater than scalar):", arr1 > 4)


# Output: [False True False True]

These boolean arrays are extremely useful for boolean indexing (masking), as demonstrated in the previous notes.
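
For example, to combine two conditions into one mask, use the element-wise & (or |) operator with parentheses around each comparison, not Python's and/or:

import numpy as np

arr1 = np.array([1, 5, 3, 8])

mask = (arr1 > 2) & (arr1 < 8)   # element-wise AND of two boolean arrays
print("Combined mask:", mask)    # [False True True False]
print("Filtered:", arr1[mask])   # [5 3]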

4. Bitwise Operations

NumPy also supports element-wise bitwise operations (& - AND, | - OR, ^ - XOR, ~ - NOT) for integer arrays.

Example:

import numpy as np

a = np.array([0, 1, 2, 3], dtype=np.uint8) # Unsigned 8-bit integers


b = np.array([0, 1, 4, 5], dtype=np.uint8)

print("Array a:", a)
print("Array b:", b)

print("\nBitwise AND (a & b):", a & b)


# Output: [0 1 0 1]
# Binary for 2 (0010) & 4 (0100) = 0 (0000)
# Binary for 3 (0011) & 5 (0101) = 1 (0001)

print("Bitwise OR (a | b):", a | b)
# Output: [0 1 6 7]
# Binary for 2 (0010) | 4 (0100) = 6 (0110)
# Binary for 3 (0011) | 5 (0101) = 7 (0111)

print("Bitwise NOT (~a):", ~a)


# Output: [255 254 253 252] (Due to two's complement representation for uint8)

In essence, NumPy array arithmetic is a cornerstone of efficient numerical computing in Python. By leveraging
vectorized operations and broadcasting, you can write concise, readable, and incredibly fast code for complex
mathematical tasks, avoiding the performance bottlenecks of explicit Python loops.

Conclusion
NumPy is an essential library for data manipulation and numerical computation in Python. Mastering its ndarray
object, creation methods, indexing, manipulation techniques, and vectorized operations, along with understanding
broadcasting, will significantly enhance your productivity and performance in any data-intensive task. It forms the
backbone for advanced data analysis, machine learning, and scientific computing workflows.
