NumPy
What is NumPy?
NumPy stands for Numerical Python.
It is a Python library that helps to do fast math calculations with numbers, arrays, and matrices.
It is also the base library for many others like Pandas, Matplotlib, Scikit-Learn, and TensorFlow.
Why NumPy?
Faster → Works much quicker than normal lists.
Uses Less Memory → Takes less space in your computer.
Easy to Use → Has many ready-made math and statistics functions.
Very Important → Other libraries like Pandas, Matplotlib, and TensorFlow need NumPy.
1. ndarray (N-Dimensional Array)
It is the core object of NumPy.
Works like a Python list but is faster and can handle multi-dimensional data.
2. Mathematical Functions
NumPy comes with a lot of built-in math functions.
You don’t need loops → operations are applied to the whole array at once.
3. Broadcasting
Lets you do operations on arrays of different shapes.
Instead of writing loops, NumPy automatically stretches the smaller array.
4. Linear Algebra
Built-in functions for matrix operations like multiplication, transpose, inverse, eigenvalues.
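As a minimal sketch of the broadcasting idea from item 3, the example below adds a 1D array to every row of a 2D array without a loop. The names matrix and row are illustrative only.

```python
import numpy as np

# A 2x3 matrix and a 1D array with 3 elements (shapes (2, 3) and (3,)).
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
row = np.array([10, 20, 30])

# Broadcasting stretches `row` across both rows of `matrix`; no loop is needed.
result = matrix + row
print(result)

# A scalar is broadcast to every element in the same way.
doubled = matrix * 2
print(doubled)
```

Broadcasting works here because the last dimension of both arrays has the same length (3).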
How to Install NumPy?
1. Install with pip
Run the below command in your command prompt:
pip install numpy
2. Install with conda (for Anaconda users)
If you are using the Anaconda distribution, use:
conda install numpy
3. Install inside Jupyter Notebook
If you are using Jupyter Notebook, use:
!pip install numpy
Requirement already satisfied: numpy in c:\users\sameeksha\onedrive\desktop\new folder\envs\samiksha\lib\site-packages (2.2.6)
How to Import NumPy?
import numpy as np
Here, np is a short name (alias) for NumPy.
What is an Array?
An array is a container that stores values of the same type (all integers, all floats, etc.).
In NumPy, arrays are called ndarray (N-Dimensional Array).
Arrays can be:
1D Array → Like a simple list of numbers.
2D Array → Like rows and columns (a matrix).
3D Array → Like a cube of numbers.
Why Are Arrays Important?
Arrays store many values together instead of one by one.
They make calculations faster than normal Python lists.
They use less memory in your computer.
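The memory claim can be checked with a rough sketch like the one below. It assumes a list of 1000 integers and an int64 array of the same values; the exact byte count for the list varies by Python version and platform, so treat it as an estimate.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# A list stores references to separate Python int objects,
# so count the list itself plus every element it references.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An ndarray stores the raw values in one contiguous block.
array_bytes = np_array.nbytes  # 1000 elements * 8 bytes each

print("list bytes :", list_bytes)
print("array bytes:", array_bytes)
```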
Create Arrays
1. 1D Array (like a simple list)
arr1 = np.array([1,2,3,4,5])
print(arr1) #1D array
[1 2 3 4 5]
2. 2D Array (like a matrix)
arr2 = np.array([[1,2,3],[4,5,6]])
print(arr2) #2D array
[[1 2 3]
[4 5 6]]
3. 3D Array (like a cube)
arr3 = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
print(arr3) #3D array
[[[1 2]
[3 4]]
[[5 6]
[7 8]]]
Array Creation Methods
Zeros Array - Creates an array filled with zeros of the given shape.
Z = np.zeros((2,3))
print(Z)
[[0. 0. 0.]
[0. 0. 0.]]
Ones Array - Creates an array filled with ones of the given shape.
O = np.ones((3,3))
print(O)
[[1. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]]
Full Array - Creates an array filled with a specified constant.
F = np.full((2,2),5)
print(F)
[[5 5]
[5 5]]
Identity Matrix - Creates a 2D identity matrix with np.eye (ones on the main diagonal, zeros elsewhere).
I = np.eye(4)
print(I)
[[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 1. 0.]
[0. 0. 0. 1.]]
I = np.eye(2)
print(I)
[[1. 0.]
[0. 1.]]
arange (start, stop, step) - Creates an array with regularly spaced values (like Python’s range); stop is excluded.
A = np.arange(0,10,2)
print(A)
[0 2 4 6 8]
linspace (start, stop, num) - Creates an array of num evenly spaced numbers over a specified interval; stop is included by default.
L = np.linspace(0,1,5)
print(L)
[0. 0.25 0.5 0.75 1. ]
Random Integers - np.random.randint(low, high) returns a random integer from low (inclusive) to high (exclusive).
R = np.random.randint(0,7)
print(R) # a single random integer from 0 to 6; the value changes on every run
4
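Without a size argument, np.random.randint returns a single integer. A short sketch of getting an array instead, using the size parameter (the names R and M are illustrative):

```python
import numpy as np

# `size` controls the shape of the result.
R = np.random.randint(0, 7, size=5)        # 1D array of 5 integers from 0 to 6
M = np.random.randint(0, 7, size=(2, 3))   # 2x3 array of integers from 0 to 6

print(R)
print(M)
```

The values differ on every run unless a seed is set first with np.random.seed().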
Array Attributes
Array attributes give us basic information about an array (like its shape, size, and dimensions).
Attribute   Meaning
ndim        Number of dimensions
shape       Size of each dimension (rows × columns for a 2D array)
size        Total number of elements
dtype       Data type of the elements
itemsize    Bytes per element
nbytes      Total bytes used by the elements
1. ndim → Number of Dimensions
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.ndim) # 2D array
2
import numpy as np
arr = np.array([[[1, 2],[3,4]],[[4, 5],[6,7]]])
print(arr.ndim) # 3D array
3
2. shape → Rows & Columns (Size of Each Dimension)
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape)
(2, 3)
arr = np.array([1, 2, 3, 4, 5, 6])
print(arr.shape)
(6,)
3. size → Total Number of Elements
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.size)
6
4. dtype → Data Type of Elements
arr = np.array([1, 2, 3])
print(arr.dtype)
int64
arr = np.array([[1, 2, 3],[5, 6, 7]])
arr.dtype
dtype('int64')
5. itemsize → Memory Used per Element (in bytes)
arr = np.array([1, 2, 3])
print(arr.itemsize)
8
arr = np.array([[[1],[3]]])
arr.itemsize
8
6. nbytes → Total Memory Used by Array
arr = np.array([1, 2, 3])
print(arr.nbytes)
24
arr = np.array([[[1, 2, 3],[5, 6, 7],[8, 9, 10]]])
arr.nbytes
72
reshape()
Returns a view (or a copy, when a view is not possible) of the array with a different shape. The shape of the original array does not change.
arr = np.array([1, 2, 3, 4, 5, 6])
reshaped = arr.reshape(2, 3)
print("Original Array:", arr)
print("Reshaped Array:\n", reshaped)
Original Array: [1 2 3 4 5 6]
Reshaped Array:
[[1 2 3]
[4 5 6]]
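Two details of reshape() are easy to miss: one dimension can be given as -1 so NumPy infers it, and the result usually shares data with the original. A small sketch, assuming a contiguous 1D array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# -1 asks NumPy to infer that dimension from the array size (6 / 2 = 3).
reshaped = arr.reshape(2, -1)
print(reshaped.shape)

# For a contiguous array like this one, reshape returns a view,
# so writing through the view also changes the original data.
reshaped[0, 0] = 99
print(arr)
print(np.shares_memory(arr, reshaped))
```

Call .copy() on the reshaped array if the original data must stay untouched.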
resize()
Changes the original array shape permanently.
If needed, it adds extra elements (zeros) or removes elements.
arr = np.array([1, 2, 3, 4, 5, 6])
arr.resize(2, 3)
print("Resized Array:\n", arr)
Resized Array:
[[1 2 3]
[4 5 6]]
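The example above keeps the same number of elements, so the zero padding is not visible. The sketch below enlarges an array in place and, for contrast, calls the separate np.resize() function, which fills extra slots by repeating values instead. refcheck=False is used here only to avoid the reference check that can raise an error in notebooks.

```python
import numpy as np

arr = np.array([1, 2, 3])

# In-place resize to a larger shape; the new slots are filled with zeros.
arr.resize((2, 3), refcheck=False)
print(arr)

# np.resize() returns a new array and repeats the original values instead.
repeated = np.resize(np.array([1, 2, 3]), (2, 3))
print(repeated)
```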
flatten()
Returns a copy of the array in 1D.
Changes in new array do not affect the original.
a = np.array([[1,2],[3,4]])
b = a.flatten()
b[0] = 99
print(a)
[[1 2]
[3 4]]
ravel()
Returns a view (when possible) of the array in 1D.
Changes in new array may affect the original.
a = np.array([[1,2],[3,4]])
c = a.ravel()
c[0] = 99
print(a)
[[99 2]
[ 3 4]]
Data Types
NumPy supports many different data types.
When we create an array, NumPy automatically assigns a dtype.
1. Integer Types (int)
Whole numbers (positive or negative).
Sizes: int8, int16, int32, int64 (8, 16, 32, 64 bits).
arr = np.array([1, 2, 3], dtype=np.int16)
print(arr, arr.dtype)
[1 2 3] int16
2. Unsigned Integers (uint)
Only non-negative numbers (zero and positive).
Sizes: uint8, uint16, uint32, uint64.
arr = np.array([1, 2, 3], dtype=np.uint8)
print(arr, arr.dtype)
[1 2 3] uint8
3. Floating Point (float)
Decimal numbers.
Sizes: float16, float32, float64.
arr = np.array([1.5, 2.7, 3.9], dtype=np.float32)
print(arr, arr.dtype)
[1.5 2.7 3.9] float32
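To see what the sizes mean in practice, the sketch below looks up the value range of two integer types with np.iinfo and converts a float array to integers with astype(). The variable names are illustrative.

```python
import numpy as np

# np.iinfo reports the value range of an integer type.
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)
print(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)

# astype() converts to another dtype; float -> int drops the fractional part.
prices = np.array([1.5, 2.7, 3.9])
as_int = prices.astype(np.int32)
print(as_int, as_int.dtype)
```

Note that the conversion truncates toward zero rather than rounding.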
Descriptive Statistics in NumPy
Descriptive statistics summarize and describe the main features of a dataset.
data = np.array([10, 20, 20, 30, 40, 50, 60])
print("Data:", data)
Data: [10 20 20 30 40 50 60]
MEASURES OF CENTRAL TENDENCY
1. Mean (Average)
mean = np.mean(data)
mean
np.float64(32.857142857142854)
2. Median (Middle value)
median = np.median(data)
median
np.float64(30.0)
MEASURES OF DISPERSION
3. Range (Maximum - Minimum)
Range = np.max(data) - np.min(data)
Range
np.int64(50)
4. Variance (Square of Std. Dev.)
Var = np.var(data)
Var
np.float64(277.55102040816325)
5. Standard Deviation (Spread of data)
ST = np.std(data)
ST
np.float64(16.65986255670086)
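By default np.var() and np.std() divide by n, which gives the population variance and standard deviation. When the data is a sample, the ddof parameter switches the divisor to n - 1. A brief sketch using the same data:

```python
import numpy as np

data = np.array([10, 20, 20, 30, 40, 50, 60])

pop_var = np.var(data)              # divides by n (population variance)
sample_var = np.var(data, ddof=1)   # divides by n - 1 (sample variance)
sample_std = np.std(data, ddof=1)   # square root of the sample variance

print(pop_var, sample_var, sample_std)
```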
6. Quantile
A quantile divides the data into equal parts.
0.25 → 25% of data lies below this point (1st quartile).
0.5 → 50% of data lies below this point (Median).
0.75 → 75% of data lies below this point (3rd quartile).
data = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
q1 = np.quantile(data, 0.25) # 25% quantile
q2 = np.quantile(data, 0.5) # 50% quantile (median)
q3 = np.quantile(data, 0.75) # 75% quantile
q1
np.float64(30.0)
q2
np.float64(50.0)
q3
np.float64(70.0)
IQR = q3 - q1
IQR
np.float64(40.0)
7. Percentile
Percentile tells the value below which a given percentage of data falls.
25th percentile = same as 0.25 quantile
50th percentile = same as 0.50 quantile (Median)
90th percentile = value below which 90% of data lies
p25 = np.percentile(data, 25) # 25th percentile
p50 = np.percentile(data, 50) # 50th percentile (median)
p90 = np.percentile(data, 90) # 90th percentile
p25
np.float64(30.0)
p50
np.float64(50.0)
p90
np.float64(82.0)
MEASURES OF RELATIONSHIP
1. Correlation
Correlation measures the strength and direction of the linear relationship between two variables.
Values range between -1 and 1.
exp = np.array([1,2,3,4,5])
sal = np.array([10000,20000,30000,40000,50000])
np.corrcoef(exp,sal)
array([[1., 1.],
[1., 1.]])
2. Covariance
Covariance shows how two variables change together.
Positive → When one increases, the other also increases.
Negative → When one increases, the other decreases.
np.cov(exp,sal)
array([[2.5e+00, 2.5e+04],
[2.5e+04, 2.5e+08]])
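Correlation is covariance rescaled by the two standard deviations, which is why it always falls between -1 and 1. The sketch below recomputes the correlation from np.cov() as a check; ddof=1 is passed to np.std() because np.cov() divides by n - 1 by default.

```python
import numpy as np

exp = np.array([1, 2, 3, 4, 5])
sal = np.array([10000, 20000, 30000, 40000, 50000])

cov_xy = np.cov(exp, sal)[0, 1]      # off-diagonal entry: cov(exp, sal)
std_x = np.std(exp, ddof=1)          # ddof=1 matches np.cov's default
std_y = np.std(sal, ddof=1)

corr_manual = cov_xy / (std_x * std_y)
corr_numpy = np.corrcoef(exp, sal)[0, 1]
print(corr_manual, corr_numpy)
```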
Creation of Multidimensional Arrays
1. While Creation
Directly pass nested lists to np.array() .
Structure of list defines the dimension.
arr2D = np.array([[1,2],[3,4]]) # 2D Array
arr3D = np.array([[[1,2],[3,4]], [[5,6],[7,8]]]) # 3D Array
arr2D
array([[1, 2],
[3, 4]])
arr3D
array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]])
2. After Creation
First create a 1D array.
Use .reshape() to change into multi-dimensional form.
a = np.array([1,2,3,4,5,6])
print(a.reshape(2,3))
[[1 2 3]
[4 5 6]]
INDEXING
Used to access specific elements in arrays.
Works like Python lists but also supports multiple dimensions.
arr = np.array([10,2,11,5,8,100])
arr[3] #single element
np.int64(5)
arr[3:]
array([ 5, 8, 100])
np.random.seed(403)
arr2d = np.random.randint(20,100,12).reshape(4,3)
arr2d
array([[70, 80, 96],
[72, 41, 89],
[21, 40, 54],
[79, 98, 62]], dtype=int32)
arr2d[1,1] #row = 1, column = 1
np.int32(41)
arr2d[0,1] #row = 0, column = 1
np.int32(80)
arr2d[2,1] #row = 2, column = 1
np.int32(40)
What is Indexing?
Indexing means accessing specific elements in an array using their position (index number).
Index starts at 0 (first element).
Single Element Access → Get one value using its index (e.g., arr[2] = 3rd element).
Multi-dimensional Access → Use row, column format → arr[row, column].
Negative Indexing → Use -1, -2, ... to access elements from the end of the array.
Important Points
Works for 1D, 2D, and higher-dimensional arrays.
In 2D arrays:
arr[0, :] → First row
arr[:, 1] → Second column
Indexing can be combined with slicing and masking for more complex selection.
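The points above can be tried on a small array with predictable values. This is a sketch with an illustrative array named grid, not part of the earlier examples:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # values 0..11 in 3 rows and 4 columns

first_row = grid[0, :]       # all columns of row 0
second_col = grid[:, 1]      # all rows of column 1
last_item = grid[-1, -1]     # negative indices count from the end

print(first_row)
print(second_col)
print(last_item)
```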
SLICING
Extracts a range of elements from arrays.
Syntax: array[start:end:step]
1. Slicing 2D
arr2d
array([[70, 80, 96],
[72, 41, 89],
[21, 40, 54],
[79, 98, 62]], dtype=int32)
arr2d[0,0:3]
array([70, 80, 96], dtype=int32)
arr2d[1:4,0:2]
array([[72, 41],
[21, 40],
[79, 98]], dtype=int32)
arr2d[1:3,2:3]
array([[89],
[54]], dtype=int32)
2. Slicing 3D
np.random.seed(403)
arr3d = np.random.randint(20,100,24).reshape(2,4,3)
arr3d
array([[[70, 80, 96],
[72, 41, 89],
[21, 40, 54],
[79, 98, 62]],
[[74, 30, 93],
[35, 52, 61],
[82, 29, 29],
[57, 61, 69]]], dtype=int32)
arr3d[0][2,1:3]
array([40, 54], dtype=int32)
arr3d[1][0:2,1:3]
array([[30, 93],
[52, 61]], dtype=int32)
arr3d[1][2][1:3]
array([29, 29], dtype=int32)
3. Slicing nD
np.random.seed(403)
arr5d = np.random.randint(20,100,24).reshape(1,1,2,4,3)
arr5d
array([[[[[70, 80, 96],
[72, 41, 89],
[21, 40, 54],
[79, 98, 62]],
[[74, 30, 93],
[35, 52, 61],
[82, 29, 29],
[57, 61, 69]]]]], dtype=int32)
arr5d[0,0,0][3,1:3]
array([98, 62], dtype=int32)
arr5d[0][0][0][0,-3]
np.int32(70)
What is Slicing?
Slicing means selecting a part of an array instead of the whole array.
It uses a rule called start : stop : step.
Start → The index from where slicing begins.
Stop → The index where slicing ends (but not included).
Step → The gap or jump between elements.
Important Points
Works on 1D arrays (a single line of numbers), 2D arrays (rows and columns), and higher-dimensional arrays.
You can select a range of elements, only rows, only columns, or even a sub-part of a matrix.
If start or stop is not given, NumPy assumes from the beginning or till the end.
If step is negative, it means slicing in reverse order.
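A short sketch of the start : stop : step rule on a 1D array, including an omitted start and a negative step. The array and variable names are illustrative.

```python
import numpy as np

arr = np.arange(10)          # [0 1 2 3 4 5 6 7 8 9]

every_other = arr[2:8:2]     # start at 2, stop before 8, jump by 2
first_three = arr[:3]        # start omitted: begins at index 0
reversed_arr = arr[::-1]     # negative step: walks the array backwards

print(every_other)
print(first_three)
print(reversed_arr)
```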
BOOLEAN INDEXING
What is Boolean Indexing?
Boolean indexing is a way of filtering arrays using True/False values.
It helps select elements that meet a condition (e.g., numbers > 5).
np.random.seed(403)
salary = np.random.randint(20000,100000,30)
salary
array([22226, 51351, 61916, 58736, 25324, 91041, 92610, 21486, 35953,
38102, 66602, 99305, 66880, 73161, 29406, 94994, 59433, 26628,
28625, 92942, 25346, 62522, 59552, 45368, 41437, 39254, 42825,
51080, 70859, 57661], dtype=int32)
salary > 50000 # True marks employees whose salary is more than 50000
# False marks employees whose salary is 50000 or less
array([False, True, True, True, False, True, True, False, False,
False, True, True, True, True, False, True, True, False,
False, True, False, True, True, False, False, False, False,
True, True, True])
salary[salary > 50000]
array([51351, 61916, 58736, 91041, 92610, 66602, 99305, 66880, 73161,
94994, 59433, 92942, 62522, 59552, 51080, 70859, 57661],
dtype=int32)
How many employees have a salary below 40k?
salary < 40000 # True marks employees whose salary is less than 40000
# False marks employees whose salary is 40000 or more
array([ True, False, False, False, True, False, False, True, True,
True, False, False, False, False, True, False, False, True,
True, False, True, False, False, False, False, True, False,
False, False, False])
salary[salary < 40000 ]
array([22226, 25324, 21486, 35953, 38102, 29406, 26628, 28625, 25346,
39254], dtype=int32)
salary[salary < 40000].size #no of employees whose salary is less than 40000
10
How many employees have a salary between 50k and 70k?
(salary > 50000) & (salary < 70000) #(&) bitwise AND operator
array([False, True, True, True, False, False, False, False, False,
False, True, False, True, False, False, False, True, False,
False, False, False, True, True, False, False, False, False,
True, False, True])
salary[(salary > 50000) & (salary < 70000)].size #the no of employees whose salary is between 50k to 70k..
10
We can also use len() or sum() instead of the size attribute to count the employees.
len(salary[(salary > 50000) & (salary < 70000)]) #len() with bitwise operator..
10
sum((salary > 50000) & (salary < 70000)) #sum() with bitwise operator..
np.int64(10)
Task: add 10000 to salaries above 50000 and 15000 to salaries below 50000.
salary[salary > 50000] + 10000
array([ 61351, 71916, 68736, 101041, 102610, 76602, 109305, 76880,
83161, 104994, 69433, 102942, 72522, 69552, 61080, 80859,
67661], dtype=int32)
salary[salary < 50000] + 15000
array([37226, 40324, 36486, 50953, 53102, 44406, 41628, 43625, 40346,
60368, 56437, 54254, 57825], dtype=int32)
np.where(salary > 50000,salary+10000,salary+15000)
array([ 37226, 61351, 71916, 68736, 40324, 101041, 102610, 36486,
50953, 53102, 76602, 109305, 76880, 83161, 44406, 104994,
69433, 41628, 43625, 102942, 40346, 72522, 69552, 60368,
56437, 54254, 57825, 61080, 80859, 67661], dtype=int32)
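Besides & (AND), masks can be combined with | (OR) and inverted with ~ (NOT), and a mask can also be used on the left side of an assignment. The sketch below uses a small made-up salary array so the results are easy to follow.

```python
import numpy as np

salary = np.array([22000, 51000, 62000, 38000, 91000])

low_or_high = salary[(salary < 40000) | (salary > 90000)]  # | is element-wise OR
not_low = salary[~(salary < 40000)]                        # ~ inverts the mask

# Assigning through a mask updates only the selected elements, in place.
updated = salary.copy()
updated[updated < 40000] += 15000

print(low_or_high)
print(not_low)
print(updated)
```

Each condition needs its own parentheses because &, | and ~ bind more tightly than the comparison operators.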
Mathematical Constants
Pi (π): np.pi
This is the number 3.14159... used for circles.
np.pi #value of pi
3.141592653589793
Euler's Number (e): np.e
This is the number 2.71828... used for growth and decay.
np.e
2.718281828459045
Infinity: np.inf
The constant infinity represents a value that is larger than any finite number.
np.inf
inf
Not a Number (nan)
The NaN stands for Not a Number.
It's used to represent missing or undefined results from a mathematical operation, like 0 / 0
np.nan
nan
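NaN and infinity behave differently from ordinary numbers in comparisons. A small sketch of the two most common surprises:

```python
import numpy as np

# NaN is not equal to anything, including itself, so == cannot detect it.
nan_equals_itself = (np.nan == np.nan)
print(nan_equals_itself)

# Use np.isnan() to find NaN entries instead.
values = np.array([1.0, np.nan, 3.0])
mask = np.isnan(values)
print(mask)

# Infinity compares greater than any finite float.
print(np.inf > 1e308)
```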
Array Manipulation
Array manipulation is simply about changing the shape, size, or content of an array.
1. Reshaping Arrays
Reshaping changes the number of rows and columns of an array without changing the data itself.
N = np.arange(9)
N
array([0, 1, 2, 3, 4, 5, 6, 7, 8])
N.reshape(3,3) # it reshapes into 3x3 matrix ..
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
2. Stacking and Splitting Arrays
Stacking combines multiple arrays into one larger array, while splitting does the opposite: it breaks one large array into several smaller ones.
np.vstack : Stacks arrays vertically (on top of each other).
a = np.array([[1,2],[3,4]])
b = np.array([[20,25],[30,35]])
a
array([[1, 2],
[3, 4]])
b
array([[20, 25],
[30, 35]])
np.vstack((a,b)) #axis = 0 same as concatenation
array([[ 1, 2],
[ 3, 4],
[20, 25],
[30, 35]])
np.concatenate((a,b),axis = 0)
array([[ 1, 2],
[ 3, 4],
[20, 25],
[30, 35]])
np.hstack: Stacks arrays horizontally (side by side).
np.hstack((a,b)) #axis = 1
array([[ 1, 2, 20, 25],
[ 3, 4, 30, 35]])
np.concatenate((a,b),axis = 1)
array([[ 1, 2, 20, 25],
[ 3, 4, 30, 35]])
Splitting (np.split)
Divides an array into smaller, equal parts.
arr = np.array([[1,2,3,2],[4,5,6,4],[7,8,9,8],[11,23,45,76]])
arr
array([[ 1, 2, 3, 2],
[ 4, 5, 6, 4],
[ 7, 8, 9, 8],
[11, 23, 45, 76]])
np.split(arr,2) # separated into 2 equal parts along the rows
[array([[1, 2, 3, 2],
[4, 5, 6, 4]]),
array([[ 7, 8, 9, 8],
[11, 23, 45, 76]])]
np.split(arr,4) # separated into 4 equal parts along the rows
[array([[1, 2, 3, 2]]),
array([[4, 5, 6, 4]]),
array([[7, 8, 9, 8]]),
array([[11, 23, 45, 76]])]
1. Horizontal Split (np.hsplit)
This function divides an array into multiple smaller arrays horizontally, along the columns.
np.hsplit(arr,1) #split into 1 part horizontally
[array([[ 1, 2, 3, 2],
[ 4, 5, 6, 4],
[ 7, 8, 9, 8],
[11, 23, 45, 76]])]
np.hsplit(arr,2) #split into 2 parts horizontally
[array([[ 1, 2],
[ 4, 5],
[ 7, 8],
[11, 23]]),
array([[ 3, 2],
[ 6, 4],
[ 9, 8],
[45, 76]])]
np.hsplit(arr,3) # fails: 4 columns cannot be divided into 3 equal parts
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[139], line 1
----> 1 np.hsplit(arr,3)
File ~\OneDrive\Desktop\New folder\envs\Samiksha\Lib\site-packages\numpy\lib\_shape_base_impl.py:952, in hsplit(ary, indices_or_sections)
950 raise ValueError('hsplit only works on arrays of 1 or more dimensions')
951 if ary.ndim > 1:
--> 952 return split(ary, indices_or_sections, 1)
953 else:
954 return split(ary, indices_or_sections, 0)
File ~\OneDrive\Desktop\New folder\envs\Samiksha\Lib\site-packages\numpy\lib\_shape_base_impl.py:877, in split(ary, indices_or_sections, axis)
875 N = ary.shape[axis]
876 if N % sections:
--> 877 raise ValueError(
878 'array split does not result in an equal division') from None
879 return array_split(ary, indices_or_sections, axis)
ValueError: array split does not result in an equal division
np.hsplit(arr,4)
[array([[ 1],
[ 4],
[ 7],
[11]]),
array([[ 2],
[ 5],
[ 8],
[23]]),
array([[ 3],
[ 6],
[ 9],
[45]]),
array([[ 2],
[ 4],
[ 8],
[76]])]
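When the columns cannot be divided equally, np.array_split() is an alternative that allows parts of unequal size. A minimal sketch with the same 4x4 array:

```python
import numpy as np

arr = np.array([[1, 2, 3, 2],
                [4, 5, 6, 4],
                [7, 8, 9, 8],
                [11, 23, 45, 76]])

# 4 columns into 3 parts: array_split makes the first part larger
# instead of raising a ValueError.
parts = np.array_split(arr, 3, axis=1)
for part in parts:
    print(part.shape)
```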
2. Vertical Split (np.vsplit)
This function divides an array into multiple smaller arrays vertically, along the rows.
np.vsplit(arr,4) #equal parts row wise
[array([[1, 2, 3, 2]]),
array([[4, 5, 6, 4]]),
array([[7, 8, 9, 8]]),
array([[11, 23, 45, 76]])]
np.vsplit(arr,2)
[array([[1, 2, 3, 2],
[4, 5, 6, 4]]),
array([[ 7, 8, 9, 8],
[11, 23, 45, 76]])]
np.vsplit(arr1,1)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[144], line 1
----> 1 np.vsplit(arr1,1)
File ~\OneDrive\Desktop\New folder\envs\Samiksha\Lib\site-packages\numpy\lib\_shape_base_impl.py:1007, in vsplit(ary, indices_or_sections)
959 """
960 Split an array into multiple sub-arrays vertically (row-wise).
961
(...) 1004
1005 """
1006 if _nx.ndim(ary) < 2:
-> 1007 raise ValueError('vsplit only works on arrays of 2 or more dimensions')
1008 return split(ary, indices_or_sections, 0)
ValueError: vsplit only works on arrays of 2 or more dimensions
vsplit fails here because arr1 is a 1D array; it only works on arrays with 2 or more dimensions.
3. arg Functions
The arg functions in NumPy return the position (index) of a value in an array rather than the value itself.
np.argmax() : Returns the index of the largest value.
np.argmin() : Returns the index of the smallest value.
np.argsort() : Returns the indices that would sort the array. It gives you the "map" to rearrange the original numbers from smallest to largest.
arr = np.array([10,11,12,14,18,900,0])
np.argmax(arr) #biggest number..
np.int64(5)
np.argmin(arr) #smallest number..
np.int64(6)
np.argsort(arr) # indices that sort the array in ascending order
array([6, 0, 1, 2, 3, 4, 5])
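The index "map" from argsort() becomes useful once it is applied back to the array. A short sketch, reusing the same values (the names order and top3 are illustrative):

```python
import numpy as np

arr = np.array([10, 11, 12, 14, 18, 900, 0])

order = np.argsort(arr)          # indices from smallest to largest value
sorted_arr = arr[order]          # apply the map to get the sorted values
descending = arr[order[::-1]]    # reverse the map for largest-first order
top3 = arr[order[-3:]]           # the three largest values, in ascending order

print(sorted_arr)
print(descending)
print(top3)
```

The same index array can reorder a second, related array (for example names that belong to these values) so both stay aligned.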
What are NaN Statistics?
NaN statistics are a set of special NumPy functions designed to handle arrays that contain NaN (Not a Number) values. These functions simply ignore any NaN values when performing calculations, allowing you to get meaningful results from your data.
arr = np.array([10,15,18,19,np.nan,200,10])
arr
array([ 10., 15., 18., 19., nan, 200., 10.])
np.max(arr) # NaN propagates through the calculation, so the result is nan
np.float64(nan)
Other Common NaN Statistical Functions
np.nansum(): Calculates the sum of all non-NaN values.
np.nanmedian(): Finds the median of all non-NaN values.
np.nanstd(): Calculates the standard deviation while ignoring NaN values.
np.nanmin(): Finds the minimum value while ignoring NaN values.
np.nanmax(): Finds the maximum value while ignoring NaN values.
np.nanvar(): Finds the variance value while ignoring NaN values.
np.nanmean(): Finds the average value while ignoring NaN values.
To Exclude NaN
np.nanmax(arr) #Maximum value
np.float64(200.0)
np.nanmin(arr) #minimum value
np.float64(10.0)
np.nanmedian(arr) #middle value
np.float64(16.5)
np.nansum(arr) #addition
np.float64(272.0)
np.nanvar(arr) #variance
np.float64(4796.555555555555)
np.nanstd(arr) #standard deviation
np.float64(69.25716970506053)
np.nanmean(arr) #average value
np.float64(45.333333333333336)
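The nan functions are equivalent to removing the NaN entries first. The sketch below does that by hand with np.isnan() and also shows one possible way to fill the gaps (here with the mean, which is an illustrative choice, not a recommendation).

```python
import numpy as np

arr = np.array([10, 15, 18, 19, np.nan, 200, 10])

# Keep only the entries that are not NaN.
clean = arr[~np.isnan(arr)]
print(clean.mean(), np.nanmean(arr))   # both ignore the NaN entry

# Replace NaN entries with the mean of the valid values.
filled = np.where(np.isnan(arr), np.nanmean(arr), arr)
print(filled)
```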
Linear Algebra
Linear algebra is a branch of mathematics that deals with vectors, matrices, and linear equations.
NumPy provides linear algebra functions in the np.linalg module.
1. Solving Linear Equations
Uses matrices to solve for the unknown variables x in a system a @ x = b.
np.linalg.solve()
a = np.array([[1,2,3],[1,-3,9],[1,-10,-90]])
b = np.array([[2],[10],[20]])
a
array([[ 1, 2, 3],
[ 1, -3, 9],
[ 1, -10, -90]])
b
array([[ 2],
[10],
[20]])
np.linalg.solve(a,b)
array([[ 5.1396648 ],
[-1.58659218],
[ 0.01117318]])
a = np.array([[20,3,4,5],[10,-10,9,1],[1,1,1,1],[1,-1,1,1]])
b = np.array([[100],[2],[100],[20]])
a
array([[ 20, 3, 4, 5],
[ 10, -10, 9, 1],
[ 1, 1, 1, 1],
[ 1, -1, 1, 1]])
b
array([[100],
[ 2],
[100],
[ 20]])
np.linalg.solve(a,b)
array([[-17.19379845],
[ 40. ],
[ 62.09302326],
[ 15.10077519]])
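A quick way to check a solution is to substitute it back into the equations. The sketch below does this for the first system; np.allclose() is used instead of == because floating-point results carry small rounding errors.

```python
import numpy as np

a = np.array([[1, 2, 3], [1, -3, 9], [1, -10, -90]])
b = np.array([[2], [10], [20]])

x = np.linalg.solve(a, b)

# Substituting x back should reproduce b (up to rounding error).
check = np.allclose(a @ x, b)
print(check)
```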
2. The Determinant
The determinant is a single number calculated from the elements of a square matrix.
It tells you about the matrix's properties; for example, a matrix is invertible only if its determinant is nonzero.
np.linalg.det()
arr = np.array([[1,2],[4,5]])
arr
array([[1, 2],
[4, 5]])
np.linalg.det(arr)
np.float64(-2.9999999999999996)
arr2 = np.array([[1,2,10],[4,5,9],[8,2,1]])
arr2
array([[ 1, 2, 10],
[ 4, 5, 9],
[ 8, 2, 1]])
np.linalg.det(arr2)
np.float64(-196.99999999999983)
3. The Inverse of a Matrix
The inverse of a square matrix is the matrix that, when multiplied by the original matrix, gives the identity matrix.
np.linalg.inv()
arr2
array([[ 1, 2, 10],
[ 4, 5, 9],
[ 8, 2, 1]])
np.linalg.inv(arr2)
array([[ 0.06598985, -0.09137056, 0.16243655],
[-0.34517766, 0.40101523, -0.15736041],
[ 0.16243655, -0.07106599, 0.01522843]])
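The definition can be verified directly: multiplying the matrix by its inverse should give the identity matrix. The sketch also shows what happens with a matrix that has no inverse; the singular matrix is a made-up example.

```python
import numpy as np

arr2 = np.array([[1, 2, 10], [4, 5, 9], [8, 2, 1]])
inverse = np.linalg.inv(arr2)

# arr2 @ inverse should equal the 3x3 identity matrix (up to rounding error).
is_identity = np.allclose(arr2 @ inverse, np.eye(3))
print(is_identity)

# A singular matrix (determinant 0) has no inverse; NumPy raises LinAlgError.
singular = np.array([[1, 2], [2, 4]])
raised = False
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as err:
    raised = True
    print("Cannot invert:", err)
```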
4. Eigenvalues and Eigenvectors
np.linalg.eig()
arr = np.array([[1,10,11],[2,14,3],[7,8,9]])
arr
array([[ 1, 10, 11],
[ 2, 14, 3],
[ 7, 8, 9]])
val,vect = np.linalg.eig(arr)
val
array([-4.71378609, 20.89675026, 7.81703582])
vect
array([[ 0.89618065, -0.59532811, 0.45082152],
[-0.02476076, -0.4593855 , -0.50346122],
[-0.4429979 , -0.65920361, 0.73707988]])
np.linalg.eig(arr)
EigResult(eigenvalues=array([-4.71378609, 20.89675026, 7.81703582]), eigenvectors=array([[ 0.89618065, -0.59532811, 0.45082152],
[-0.02476076, -0.4593855 , -0.50346122],
[-0.4429979 , -0.65920361, 0.73707988]]))
np.linalg.eigvals(arr)
array([-4.71378609, 20.89675026, 7.81703582])
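For a square matrix A, an eigenvector v and its eigenvalue λ satisfy A v = λ v. In the result of np.linalg.eig(), each column of the eigenvector array pairs with the eigenvalue at the same position. A minimal check of that relationship:

```python
import numpy as np

arr = np.array([[1, 10, 11], [2, 14, 3], [7, 8, 9]])
val, vect = np.linalg.eig(arr)

# Check A v = lambda v for each eigenpair; column i of vect goes with val[i].
for i in range(len(val)):
    left = arr @ vect[:, i]
    right = val[i] * vect[:, i]
    print(np.allclose(left, right))

# The same check for all pairs at once, using broadcasting over the columns.
all_ok = np.allclose(arr @ vect, vect * val)
```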
Vectorization
Vectorization means applying an operation to every item in an array at once, without writing a slow Python for loop.
arr1 = np.array([[1,3,4],[7,1,3],[1,1,1]])
arr2 = np.array([[11,13,4],[7,21,-3],[1,-1,1]])
arr1 + arr2
array([[12, 16, 8],
[14, 22, 0],
[ 2, 0, 2]])
arr1 - arr2
array([[-10, -10, 0],
[ 0, -20, 6],
[ 0, 2, 0]])
arr1 * arr2
array([[11, 39, 16],
[49, 21, -9],
[ 1, -1, 1]])
arr1/arr2
array([[ 0.09090909, 0.23076923, 1. ],
[ 1. , 0.04761905, -1. ],
[ 1. , -1. , 1. ]])
dot() - For two vectors, computes a1*b1 + a2*b2 + a3*b3; for two 2D arrays, performs matrix multiplication.
np.dot(arr1,arr2)
array([[ 36, 72, -1],
[ 87, 109, 28],
[ 19, 33, 2]])
clip() - Limits the values in an array to a given range. It takes:
The array you want to "clip."
A minimum value (min). Any number in the array that is below this minimum is changed to the minimum value.
A maximum value (max). Any number that is above this maximum is changed to the maximum value.
marks = np.array([-2,-30,18,23,77,98,120,101,1000])
np.clip(marks,0,100) # minimum value 0, maximum value 100
array([ 0, 0, 18, 23, 77, 98, 100, 100, 100])
abs() - This function gives you the absolute value of a number.
np.abs(arr)
array([[ 1, 10, 11],
[ 2, 14, 3],
[ 7, 8, 9]])
ceil() - This function rounds a number up to the nearest whole number.
np.ceil(arr)
array([[ 1, 10, 11],
[ 2, 14, 3],
[ 7, 8, 9]])
floor() - This function rounds a number down to the nearest whole number.
np.floor(arr)
array([[ 1, 10, 11],
[ 2, 14, 3],
[ 7, 8, 9]])
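The array used above holds only positive whole numbers, so abs(), ceil(), and floor() return it unchanged. Their effect is easier to see on an array with negative and fractional values, as in this sketch (vals is an illustrative name):

```python
import numpy as np

vals = np.array([-2.7, -0.5, 1.2, 3.8])

absolute = np.abs(vals)     # drops the sign
rounded_up = np.ceil(vals)  # rounds toward positive infinity
rounded_down = np.floor(vals)  # rounds toward negative infinity

print(absolute)
print(rounded_up)
print(rounded_down)
```

For negative numbers, ceil() moves toward zero and floor() moves away from it.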
File Handling
File handling in NumPy is about saving your arrays to files and loading them back into your program.
Saving and Loading Text Files (.txt)
np.savetxt(): This function saves an array to a plain text file.
np.loadtxt(): This function loads data from a plain text file into an array.
np.random.seed(403)
sal = np.random.randint(10000,90000,10)
sal
array([12226, 41351, 51916, 48736, 15324, 81041, 82610, 11486, 25953,
28102], dtype=int32)
rev_sal = np.where(sal > 50000, sal + 0.05*sal,sal + 0.1*sal)
np.savetxt('Updated sal.txt',rev_sal)
%pwd # shows the working directory where the file was saved
'C:\\Users\\Sameeksha\\Batch 403'
np.loadtxt('Updated sal.txt') #it loads the given data into an array from a txt file....
array([13448.6 , 45486.1 , 54511.8 , 53609.6 , 16856.4 , 85093.05,
86740.5 , 12634.6 , 28548.3 , 30912.2 ])
Saving and Loading Binary Files (.npy)
np.save(): This function saves a single NumPy array to a binary file.
np.load(): This function loads a .npy file back into a NumPy array.
np.save('new sal.npy',rev_sal) #it saves in the form of binary..
np.load('new sal.npy') #it loads the file into numpy array..
array([13448.6 , 45486.1 , 54511.8 , 53609.6 , 16856.4 , 85093.05,
86740.5 , 12634.6 , 28548.3 , 30912.2 ])
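Two common variations are saving text as comma-separated values and saving several arrays in one file with np.savez(). The sketch below writes to a temporary folder so it leaves nothing behind; the file names and the keys first and second are made up for the example.

```python
import os
import tempfile
import numpy as np

data = np.array([[1.5, 2.0], [3.25, 4.0]])

with tempfile.TemporaryDirectory() as tmp:
    # Text file with commas between values and 2 decimal places.
    csv_path = os.path.join(tmp, "data.csv")
    np.savetxt(csv_path, data, delimiter=",", fmt="%.2f")
    loaded_csv = np.loadtxt(csv_path, delimiter=",")

    # One .npz file holding several named arrays.
    npz_path = os.path.join(tmp, "arrays.npz")
    np.savez(npz_path, first=data, second=data * 2)
    with np.load(npz_path) as archive:
        first = archive["first"]
        second = archive["second"]

print(loaded_csv)
print(first)
print(second)
```

Binary formats (.npy, .npz) keep the exact dtype and values, while text files are easier to open in other tools.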
Distributions
Distributions are ways to generate random numbers that follow real-world patterns. NumPy has many built-in random distributions inside the numpy.random module.
Distribution        Function                   Example Use
Uniform             np.random.uniform()        Equal chances across a range
Normal (Gaussian)   np.random.normal()         Heights, test scores
Binomial            np.random.binomial()       Coin toss, success/fail trials
Poisson             np.random.poisson()        Number of events (calls, errors)
Exponential         np.random.exponential()    Waiting time
Beta                np.random.beta()           Probabilities between 0 and 1
Chi-Square          np.random.chisquare()      Statistical testing
Gamma               np.random.gamma()          Time-based modeling
1. Uniform Distribution
All numbers in a range have an equal chance.
np.random.uniform(0, 1, size=5) # Random numbers between 0 and 1
array([0.45692003, 0.0563029 , 0.33208023, 0.24528069, 0.45300511])
2. Normal Distribution
Most numbers are close to the mean; values far from it are increasingly rare.
np.random.normal(0, 1, size=5) # mean=0, std=1, size=5
array([-1.40873101, 1.42538088, -1.07833556, -0.71974151, -0.64653497])
3. Binomial Distribution
Returns the number of successes ("yes" outcomes) in n independent trials, each with success probability p.
np.random.binomial(n = 100,p = 0.3,size = 20)
array([33, 23, 32, 29, 33, 29, 32, 30, 37, 24, 26, 34, 30, 30, 34, 24, 31,
39, 31, 30], dtype=int32)
4. Poisson Distribution
Good for event counts
np.random.poisson(lam=3, size=5)
array([2, 1, 3, 1, 1], dtype=int32)
5. Exponential Distribution
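The counterpart of Poisson: Poisson counts events in an interval, while the exponential distribution models the waiting time between events.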
np.random.exponential(size = 10)
array([0.75241534, 0.87459268, 0.52357768, 0.3921389 , 0.82821386,
0.28066685, 1.32406615, 2.72275075, 0.04495131, 3.45511878])
6. Beta Distribution
Used in probability modeling; values between 0 and 1.
np.random.beta(a=2, b=5, size=5)
array([0.1870489 , 0.3955642 , 0.4594023 , 0.17405814, 0.05320998])
7. Chi-Square Distribution
Used in statistical tests and hypothesis testing.
np.random.chisquare(df=2, size=5)
array([1.86457984, 0.56563743, 0.09816924, 5.49680552, 0.2177673 ])
8. Gamma Distribution
Used to model positive, skewed quantities such as waiting times and rainfall amounts in weather models.
np.random.gamma(shape=2, scale=2, size=5)
array([4.26168205, 1.53341864, 0.266672 , 4.81610956, 0.50010853])
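The calls above give different numbers on every run. One way to make the results reproducible is NumPy's Generator API, created with np.random.default_rng() and a seed. This is a sketch of that alternative, not the approach used in the examples above; the seed value 403 is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence on every run.
rng = np.random.default_rng(seed=403)

uniform = rng.uniform(0, 1, size=5)            # floats in [0, 1)
normal = rng.normal(loc=0, scale=1, size=5)    # mean 0, standard deviation 1
binom = rng.binomial(n=100, p=0.3, size=20)    # successes out of 100 trials

# A second generator with the same seed repeats the first draw exactly.
rng_again = np.random.default_rng(seed=403)
same = np.array_equal(uniform, rng_again.uniform(0, 1, size=5))

print(uniform)
print(normal)
print(binom)
print(same)
```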
Image Manipulation with NumPy + Matplotlib
Images are just NumPy arrays, so you can slice, flip, and rotate them like any array.
NumPy functions like flipud() , fliplr() , flip() , and rot90() let you easily flip and rotate images.
Matplotlib's imshow() displays images, so you can see the results of your changes instantly.
Task         Function/Code
Flip Vert    np.flipud(img) or img[::-1]
Flip Horiz   np.fliplr(img) or img[:, ::-1]
Flip Axis    np.flip(img, axis)
Rotate       np.rot90(img, k=...)
Crop         img[y1:y2, x1:x2]
Show Image   plt.imshow(img)
First Method to Import & View Image
Import Libraries
import matplotlib.pyplot as plt
Read the Image
image = plt.imread(r"C:\Users\Sameeksha\OneDrive\Desktop\image.jpg")
Show the Image with Matplotlib
plt.imshow(image)
<matplotlib.image.AxesImage at 0x19de5387b10>
np.fliplr() – Flip Left/Right (Horizontal)
fliplr = np.fliplr(image) #flip the image from left / right
plt.imshow(fliplr)
<matplotlib.image.AxesImage at 0x19de540bed0>
np.flipud() – Flip Up/Down (Vertical)
flipud = np.flipud(image) #flip the image upside down...
plt.imshow(flipud)
<matplotlib.image.AxesImage at 0x19de6e7c2d0>
np.flip() – General Flip (specify axis)
flip_color = np.flip(image)
plt.imshow(flip_color) # with no axis given, np.flip reverses every axis: up/down, left/right, and the color channels (RGB becomes BGR, so red and blue swap)
<matplotlib.image.AxesImage at 0x19de7bf0690>
image
array([[[ 73, 228, 232],
[ 73, 228, 232],
[ 73, 228, 232],
...,
[ 73, 228, 232],
[ 73, 228, 232],
[ 73, 228, 232]],
[[ 73, 228, 232],
[ 73, 228, 232],
[ 73, 228, 232],
...,
[ 73, 228, 232],
[ 73, 228, 232],
[ 73, 228, 232]],
[[ 73, 228, 232],
[ 73, 228, 232],
[ 73, 228, 232],
...,
[ 73, 228, 232],
[ 73, 228, 232],
[ 73, 228, 232]],
...,
[[ 71, 226, 230],
[ 71, 226, 230],
[ 71, 226, 230],
...,
[ 65, 217, 222],
[ 65, 217, 222],
[ 65, 217, 222]],
[[ 71, 226, 230],
[ 71, 226, 230],
[ 71, 226, 230],
...,
[ 65, 217, 222],
[ 65, 217, 222],
[ 65, 217, 222]],
[[ 71, 226, 230],
[ 71, 226, 230],
[ 71, 226, 230],
...,
[ 65, 217, 222],
[ 65, 217, 222],
[ 65, 217, 222]]], shape=(900, 1600, 3), dtype=uint8)
# it shows the array of the image...
Crop / Cut the Image
plt.imshow(image[260:480,780:1000])
<matplotlib.image.AxesImage at 0x19de81d8f50>
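Images are stored as arrays in the format image[y1:y2, x1:x2] (rows first, then columns), so the slice above selects rows 260 to 479 and columns 780 to 999.
This is a square area around the fish's eye.
Only the pixels in that region are shown with imshow().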
plt.imshow(image[50:250,750:1000]) #cutting only the wing part....
<matplotlib.image.AxesImage at 0x19de97116d0>
Rotation
np.rot90() rotates an image counterclockwise in 90-degree steps; pass a negative k to rotate clockwise.
rotate = np.rot90(image)
plt.imshow(rotate)
<matplotlib.image.AxesImage at 0x19de9763c50>
Joining Images with hstack and vstack
np.hstack() – Horizontal Stack
Joins images side by side (left to right).
join_h = np.hstack((fliplr,flipud,flip_color))
plt.imshow(join_h)
<matplotlib.image.AxesImage at 0x19de9d55a90>
np.vstack() – Vertical Stack
Joins images top to bottom.
Requires images to have the same width and number of channels.
join_v = np.vstack((fliplr,flipud,flip_color))
plt.imshow(join_v)
<matplotlib.image.AxesImage at 0x19de9da9310>
plt.imshow(image*2) # multiplies every pixel value by 2
# uint8 values wrap around past 255, so the colors change
<matplotlib.image.AxesImage at 0x19ded1df110>
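The same operations can be tried without an image file. The sketch below builds a tiny made-up RGB array (left half red, right half black) and applies the flips, rotation, and crop to it. It also shows one possible way to brighten pixels without the uint8 wrap-around: widen the dtype, scale, clip to 0..255, and convert back.

```python
import numpy as np

# Hypothetical 4x6 RGB "image": left half pure red, right half black.
img = np.zeros((4, 6, 3), dtype=np.uint8)
img[:, :3, 0] = 255

mirrored = np.fliplr(img)                  # red moves to the right half
rotated = np.rot90(img)                    # shape becomes (6, 4, 3)
cropped = img[1:3, 2:5]                    # rows 1-2, columns 2-4
channels_swapped = np.flip(img, axis=2)    # RGB -> BGR: red becomes blue

# Brighten safely: widen the dtype, scale, clip to 0..255, convert back.
gray = np.full((2, 2, 3), 200, dtype=np.uint8)
brighter = np.clip(gray.astype(np.int32) * 2, 0, 255).astype(np.uint8)

print(mirrored[0, :, 0])
print(rotated.shape, cropped.shape)
print(channels_swapped[0, 0])
print(brighter[0, 0])
```

Any of these arrays can be passed to plt.imshow() to see the result.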