Unit3 Notes

The document provides an overview of NumPy, a library for scientific computing in Python, detailing its core object, the ndarray, which is a multidimensional array that offers significant performance advantages over standard Python lists. It covers creating ndarrays, data types, indexing, slicing, and various operations such as arithmetic and joining arrays. Additionally, it discusses advanced features like array iteration, reshaping, and memory efficiency, making it essential for data-intensive applications like machine learning.

Uploaded by

babusuresh3743

Unit-3

NumPy Basics, The NumPy ndarray: A Multidimensional Array Object,


Creating ndarrays, Data Types for ndarrays, Arithmetic with NumPy Arrays,
Basic Indexing and Slicing, Boolean Indexing, Fancy Indexing, Transposing
Arrays and Swapping Axes, Universal Functions: Fast Element-Wise Array
Functions, Array-Oriented Programming with Arrays, Expressing Conditional
Logic as Array Operations, Mathematical and Statistical Methods, Methods for
Boolean Arrays, Sorting, Unique and Other Set Logic, File Input and Output
with Arrays, Linear Algebra, Pseudorandom Number Generation, Example:
Random Walks, Simulating Many Random Walks at Once, Example: Linear
algebra.

NumPy Basics:
NumPy, short for "Numerical Python," is a key library for scientific computing,
created by Travis Oliphant in 2005. Its core object, the ndarray, is a
multidimensional array whose elements all share a single data type. Array
operations are often cited as up to 50 times faster than the equivalent loops
over standard Python lists, although the actual speedup depends on the
operation and the data size. This efficiency makes NumPy vital for
data-intensive fields like machine learning.
Advantages of NumPy arrays:
• Memory Efficiency: An ndarray stores its elements in one contiguous
block of memory. This allows cache-friendly access, a concept called
locality of reference, enabling the CPU to retrieve data quickly.
A Python list, by contrast, stores pointers to objects scattered in memory.
• Vectorization: NumPy operations are vectorized, meaning they
apply to the entire array at once without slow Python for loops. These
operations run in optimized, compiled C and Fortran code that can take
advantage of modern CPU features such as SIMD instructions,
significantly boosting performance.
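
The two advantages above can be sketched as follows. This is a minimal illustration rather than a benchmark: the array size and variable names are arbitrary choices, and real speedups depend on the machine and the operation.

```python
import numpy as np

values = list(range(1_000))      # plain Python list (illustrative data)
arr = np.arange(1_000)           # the same numbers in one ndarray

# Python-level loop: one interpreter step per element
squares_list = [v * v for v in values]

# Vectorized: a single expression, the loop runs in compiled code
squares_arr = arr * arr

print(arr.flags['C_CONTIGUOUS'])               # True: one contiguous block
print(arr.nbytes == arr.size * arr.itemsize)   # True: fixed bytes per element
print(squares_arr[:5])                         # [ 0  1  4  9 16]
```

Both approaches produce the same numbers; the vectorized form simply moves the loop out of the Python interpreter.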

Creating a NumPy Array


Creating an ndarray is straightforward. You first import the NumPy library and
then use the np.array() function to convert a Python sequence, such as a list
or tuple, into a NumPy array.
Example:
Python
import numpy as np

# Create a 1D array from a list


arr = np.array([1, 2, 3, 4, 5])
print("Array:", arr)
# Output: Array: [1 2 3 4 5]

# Check the type of the new object


print("Type:", type(arr))
# Output: Type: <class 'numpy.ndarray'>

Creating ndarrays and Dimensions


Creating a NumPy array is done using the np.array() function. An array's
dimension (or rank) refers to the number of axes it has. You can check an
array's dimensions using the .ndim attribute and its shape with the .shape
attribute. The shape is a tuple that gives the size of the array along each
dimension.
1D Array (Vector): A one-dimensional array has a single row of data.
Python
import numpy as np

arr1 = np.array([1, 2, 3])


print(arr1)
# Output: [1 2 3]

print("Dimensions:", arr1.ndim)
# Output: Dimensions: 1

print("Shape:", arr1.shape)
# Output: Shape: (3,)
2D Array (Matrix): A two-dimensional array has rows and columns. This is a
common structure for representing data tables or matrices.
Python
import numpy as np

arr2 = np.array([[1, 2, 3], [4, 5, 6]])


print(arr2)
# Output:
# [[1 2 3]
# [4 5 6]]

print("Dimensions:", arr2.ndim)
# Output: Dimensions: 2
print("Shape:", arr2.shape)
# Output: Shape: (2, 3)
The shape (2, 3) indicates the array has 2 rows and 3 columns.
3D Array (Tensor): A three-dimensional array can be thought of as a
collection of matrices.
Python
import numpy as np

arr3 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr3)
# Output:
# [[[ 1 2 3]
# [ 4 5 6]]
#
# [[ 7 8 9]
# [10 11 12]]]

print("Dimensions:", arr3.ndim)
# Output: Dimensions: 3

print("Shape:", arr3.shape)
# Output: Shape: (2, 2, 3)
The shape (2, 2, 3) means the array has 2 matrices, each with 2 rows and 3
columns. The number of elements is 2 × 2 × 3 = 12.

NumPy array indexing:


NumPy array indexing is how you select individual elements, slices, or
subsets of a NumPy array. It's similar to indexing Python lists but with
powerful extensions for multidimensional arrays.
Basic Indexing
Basic indexing involves using square brackets [] to access elements.
 1D Arrays: You can access elements using a single integer index, just
like with a Python list.
Python
import numpy as np
arr = np.array([10, 20, 30, 40, 50])

print(arr[0]) # Output: 10
print(arr[3]) # Output: 40
print(arr[-1]) # Output: 50 (negative indexing works the same way)
 Multidimensional Arrays: To access an element in a 2D array or higher,
you provide a comma-separated tuple of indices for each dimension.
Python
import numpy as np
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr_2d[1, 2]) # Output: 6 (row 1, column 2)


print(arr_2d[0, 0]) # Output: 1 (row 0, column 0)
You can also use a list of indices to access multiple elements.

Slicing
Slicing is used to select a range of elements. The syntax is start:stop:step,
where stop is exclusive.
 1D Array Slicing:
Python
import numpy as np
arr = np.array([10, 20, 30, 40, 50])

print(arr[1:4]) # Output: [20 30 40] (from index 1 up to but not including 4)


print(arr[::2]) # Output: [10 30 50] (every second element)
print(arr[:2]) # Output: [10 20] (up to index 2)
 Multidimensional Array Slicing: You can slice each dimension
independently.
Python
import numpy as np
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Get the first two rows and all columns


print(arr_2d[:2, :])
# Output:
# [[1 2 3]
# [4 5 6]]

# Get the second column of all rows


print(arr_2d[:, 1])
# Output: [2 5 8]

# Get the elements at (row 1, column 0) and (row 2, column 1).
# Passing lists of indices like this is integer-array ("fancy") indexing, not slicing.


print(arr_2d[[1, 2], [0, 1]]) # Output: [4 8]

Data Types for ndarrays:


NumPy's ndarray has an associated data type (dtype) that specifies the type
of data stored in the array. This is a crucial feature because all elements
within a single ndarray must be of the same type. Unlike Python's standard
lists, which can hold a mix of data types, NumPy's dtypes provide a precise,
memory-efficient way to handle numerical data.

Key Reasons for Specific Dtypes


 Memory Efficiency: By using a specific data type, you can control the
amount of memory allocated for each element. For example, a 16-bit
integer (np.int16) uses half the memory of a 32-bit integer (np.int32).
This is a huge advantage when working with very large datasets.
 Performance: Operations on uniformly typed arrays can be performed
more quickly and efficiently by the CPU. This allows NumPy to leverage
vectorized operations, which are much faster than Python loops.
 Control and Predictability: Explicitly defining a dtype prevents NumPy
from making assumptions about your data. This ensures consistency
and avoids potential errors from implicit type conversions.
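
The memory point can be checked directly with the itemsize and nbytes attributes. The sketch below uses an arbitrary array length of one million elements purely for illustration.

```python
import numpy as np

n = 1_000_000  # illustrative size
small = np.zeros(n, dtype=np.int16)
medium = np.zeros(n, dtype=np.int32)
large = np.zeros(n, dtype=np.int64)

print(small.itemsize, small.nbytes)    # 2 2000000
print(medium.itemsize, medium.nbytes)  # 4 4000000
print(large.itemsize, large.nbytes)    # 8 8000000

# astype() returns a new array with the requested dtype
as_float = small.astype(np.float32)
print(as_float.dtype)                  # float32
```

Smaller dtypes save memory but also hold a smaller range of values, so the choice depends on the data.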

Common NumPy Data Types


NumPy offers a wide range of dtypes to accommodate different needs. The
most common ones fall into these categories:
• Integers: int8, int16, int32, int64. The number indicates the size in bits.
• Unsigned Integers: uint8, uint16, uint32, uint64. These are for non-negative
integers.
• Floating-Point Numbers: float16, float32, float64. These are used for
decimal numbers, with higher precision for larger bit sizes. float64 is
the default for most floating-point operations.
• Booleans: bool. Stores True or False values.
• Strings: str_ (Unicode text) or bytes_ (byte strings). The older alias
string_ for bytes_ was removed in NumPy 2.0.
Specifying and Checking Dtypes
NumPy can infer the dtype from the input data, but it's often best practice to
specify it explicitly. You can check an array's dtype using the .dtype attribute.
Python
import numpy as np

# NumPy infers an integer dtype from the values: int64 on most platforms
# (int32 on Windows with NumPy versions before 2.0)
arr_inferred = np.array([1, 2, 3])
print(arr_inferred.dtype)
# Output: int64

# Explicitly specifying the data type as float32


arr_specified = np.array([1, 2, 3], dtype=np.float32)
print(arr_specified.dtype)
# Output: float32

Array Copy vs. View


In NumPy, an important distinction exists between creating a copy and a
view of an array.
• Copy: A copy is a new array that owns its data. Changes made to the
copy won't affect the original array, and vice versa. Use the .copy()
method to create a copy.
Python
import numpy as np
arr = np.array([1, 2, 3, 4])
x = arr.copy()
arr[0] = 42 # Change in original
print(arr) # Output: [42 2 3 4]
print(x) # Output: [1 2 3 4]
• View: A view is a new array object that looks at the same data as the
original array. Changes made to the view will affect the original array, and
vice versa. Slicing an array creates a view. You can check whether an array
owns its data with the .base attribute: it is None when the array owns its
data (as a copy does) and refers to the source array when the array is a view.
Python
import numpy as np
arr = np.array([1, 2, 3, 4])
y = arr.view()
y[0] = 42 # Change in view
print(arr) # Output: [42 2 3 4]
print(y) # Output: [42 2 3 4]
print(y.base is arr) # Output: True
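
Because slicing creates a view, writing to a slice changes the original array. A minimal sketch of the difference, with illustrative variable names:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

window = arr[1:4]          # basic slice: a view onto arr's data
window[0] = 99             # writes through to arr
print(arr)                 # [ 1 99  3  4  5]
print(window.base is arr)  # True

safe = arr[1:4].copy()     # independent copy of the same slice
safe[0] = -1               # arr is not affected
print(arr)                 # [ 1 99  3  4  5]
print(safe.base is None)   # True: the copy owns its data
print(np.shares_memory(arr, window), np.shares_memory(arr, safe))  # True False
```

Calling .copy() on a slice is the usual way to protect the original when the slice will be modified.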

Array Shape and Reshape


 Shape: The .shape attribute returns a tuple representing the
dimensions of an array.
Python
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape) # Output: (2, 4) (2 rows, 4 columns)
 Reshape: The .reshape() method changes the shape of an array
without changing its data. The new shape must have the same number
of elements as the original array. You can use -1 as a placeholder for
one dimension, and NumPy will automatically calculate its size.
Python
import numpy as np
arr = np.arange(12)
new_arr = arr.reshape(3, 4)
print(new_arr)
# Output:
# [[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]]

# Using -1 to let NumPy infer the size


inferred_arr = arr.reshape(2, -1)
print(inferred_arr.shape) # Output: (2, 6)
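
The element-count rule can be seen by asking for a shape that does not fit. The sketch below also shows that, for a contiguous array like this one, reshape returns a view rather than a copy.

```python
import numpy as np

arr = np.arange(12)

# 12 elements cannot be split into 5 equal rows, so reshape raises ValueError
try:
    arr.reshape(5, -1)
except ValueError as exc:
    print("reshape failed:", exc)

# Here reshape returns a view: writing to grid changes arr as well
grid = arr.reshape(3, 4)
grid[0, 0] = 100
print(arr[0])                       # 100
print(np.shares_memory(arr, grid))  # True
```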

Array Iteration
Iterating through NumPy arrays is fundamental for processing data.
 Standard Iteration: You can iterate over the first dimension of an array
using a for loop. For multidimensional arrays, this will return a sub-
array at each step.
Python
import numpy as np
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
for row in arr_2d:
    print(row)
# Output:
# [1 2 3]
# [4 5 6]
 Iterating with nditer: To iterate through every element of a
multidimensional array, regardless of its shape, use the np.nditer()
function.
Python
import numpy as np
arr_2d = np.array([[1, 2], [3, 4]])
for element in np.nditer(arr_2d):
    print(element)
# Output (one value per line): 1, 2, 3, 4
 Enumerated Iteration: To get both the index and the value during
iteration, use np.ndenumerate().
Python
import numpy as np
arr_2d = np.array([[1, 2], [3, 4]])
for index, value in np.ndenumerate(arr_2d):
    print(f"Index: {index}, Value: {value}")
# Output:
# Index: (0, 0), Value: 1
# Index: (0, 1), Value: 2
# Index: (1, 0), Value: 3
# Index: (1, 1), Value: 4

Joining Arrays
Joining means putting two or more arrays together. The np.concatenate()
function is a primary way to do this. You specify the arrays to join and the
axis along which to join them.
Python
import numpy as np
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Joining along axis 0 (rows)


arr_join_rows = np.concatenate((arr1, arr2), axis=0)
print(arr_join_rows)
# Output:
# [[1 2]
# [3 4]
# [5 6]
# [7 8]]

# Joining along axis 1 (columns)


arr_join_cols = np.concatenate((arr1, arr2), axis=1)
print(arr_join_cols)
# Output:
# [[1 2 5 6]
# [3 4 7 8]]
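
NumPy also offers np.vstack(), np.hstack(), and np.stack() for joining arrays. These are not covered in the notes above; the sketch below shows how they relate to np.concatenate() for the same two 2D arrays.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

rows = np.vstack((arr1, arr2))    # for 2D input: same as concatenate(..., axis=0)
cols = np.hstack((arr1, arr2))    # for 2D input: same as concatenate(..., axis=1)
stacked = np.stack((arr1, arr2))  # joins along a NEW leading axis

print(rows.shape)     # (4, 2)
print(cols.shape)     # (2, 4)
print(stacked.shape)  # (2, 2, 2)
```

np.concatenate() keeps the number of dimensions, while np.stack() adds one.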

Arithmetic operations:
Arithmetic operations are the basis of numerical computation, and NumPy
applies them to whole arrays at once. With NumPy we can quickly add,
subtract, multiply, and divide arrays, raise their elements to a power, and
take remainders, even with large amounts of data. The basic arithmetic
functions in NumPy and their use in simple calculations are shown below.
1. Addition of Arrays
Addition is an arithmetic operation where the corresponding elements of two
arrays are added together. In NumPy the addition of two arrays is done using
the np.add() function.
import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
add_ans = np.add(a, b)
print(add_ans)
Output: [ 7 77 23 130]
2. Subtraction of Arrays
Subtract two arrays element-wise using the np.subtract() function. This
function subtracts each element of the second array from the corresponding
element in the first array.
import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
sub_ans = np.subtract(a, b)
print(sub_ans)
Output: [ 3 67 3 70]
3. Multiplication of Arrays
Multiplication in NumPy can be done element-wise using
the np.multiply() function. This multiplies corresponding elements of two
arrays.
import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
mul_ans = np.multiply(a, b)
print(mul_ans)
Output: [ 10 360 130 3000]
4. Division of Arrays
Division is another important operation that is performed element-wise using
the np.divide() function. This divides each element of the first array by the
corresponding element in the second array.
import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
div_ans = np.divide(a, b)
print(div_ans)
Output: [ 2.5 14.4 1.3 3.33333333]
5. Exponentiation (Power)
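Exponentiation raises each element of the first array to the power given by
the corresponding element of the second array. In NumPy, this can be done
using the np.power() function.
import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
pow_ans = np.power(a, b)
print(pow_ans)
Output: [25 1934917632 137858491849 1152921504606846976]
Note: the last value is not the true value of 100**30. That result (10**60)
does not fit in a 64-bit integer, so it silently wraps around to 2**60. The
output above assumes the default integer dtype is int64; convert the array to
float when results may be this large.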
6. Modulus Operation
It finds the remainder when one number is divided by another. In NumPy, you
can use the np.mod() function to calculate the modulus element-wise
between two arrays.
import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
mod_ans = np.mod(a, b)
print(mod_ans)
Output: [ 1 2 3 10]
With these basic arithmetic functions in NumPy we can efficiently perform
calculations on arrays.

Boolean Indexing:
Access array elements based on conditions instead of explicit indices.
Practice with Boolean Indexing: Dataset
import numpy as np
data = np.array([12, 43, 36, 32, 51, 18, 79, 7])
print("Data: ", data) # Data: [12 43 36 32 51 18 79 7]
Practice with Boolean Indexing: Boolean Mask
Now, suppose we want to extract elements greater than 30. We form a
Boolean array checking this condition for data:
bool_array = data > 30
print("Boolean Array: ", bool_array)
# Boolean Array: [False True True True True False True False]
Practice with Boolean Indexing: Selecting Data
To extract the elements that satisfy our condition from data, we use
the bool_array as an index:
filtered_data = data[bool_array]
print("Filtered Data: ", filtered_data) # Filtered Data: [43 36 32 51 79]
Now filtered_data only holds values from data that are greater than 30.
Complex Filter Condition
We can also use Boolean conditions to combine multiple criteria using logical
operators like & for AND and | for OR.
Consider an array of prices per unit for different products in a retail store:
prices = np.array([15, 30, 45, 10, 20, 35, 50])
print("Prices: ", prices) # Prices: [15 30 45 10 20 35 50]
Suppose we want to find prices that are greater than 20 and less than 40. We
can combine conditions using the & operator:
filtered_prices = prices[(prices > 20) & (prices < 40)]
print("Filtered Prices (20 < price < 40): ", filtered_prices)
# Filtered Prices (20 < price < 40): [30 35]
Now, let's consider finding prices that are either less than 15 or greater than
45 using the | operator:
filtered_prices_or = prices[(prices < 15) | (prices > 45)]
print("Filtered Prices (price < 15 OR price > 45): ", filtered_prices_or)
# Filtered Prices (price < 15 OR price > 45): [10 50]
Using these logical operators, you can create complex filtering conditions to
extract exactly the data you need.
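
Boolean masks can also be negated with ~ and used on the left side of an assignment. A short sketch using the same prices array (the cap of 40 is an arbitrary value chosen for illustration):

```python
import numpy as np

prices = np.array([15, 30, 45, 10, 20, 35, 50])

# Parentheses are required: & binds more tightly than the comparisons
mid_range = (prices > 20) & (prices < 40)
print(prices[~mid_range])            # [15 45 10 20 50]

capped = prices.copy()
capped[capped > 40] = 40             # assign through a mask
print(capped)                        # [15 30 40 10 20 35 40]

print(np.count_nonzero(mid_range))   # 2
```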

Fancy Indexing
Fancy indexing is a tool for accessing multiple non-adjacent array items:
pass an array (or list) of indices to select them. Consider an array of seven
numbers:
data = np.array([11, 22, 33, 44, 55, 66, 77])
print("Data: ", data) # Data: [11 22 33 44 55 66 77]
Fetch the 1st, 3rd, and 5th elements together:
fancy_indexes = np.array([0, 2, 4])
fancy_data = data[fancy_indexes]
print("Data from Fancy Indexing: ", fancy_data)
# Data from Fancy Indexing: [11 33 55]
Practice with Fancy Indexing
Let's consider a practical use of Fancy Indexing. Given an array representing
people's ages, we want to fetch the ages of person 2, person 5, and person
7:
ages = np.array([15, 22, 27, 35, 41, 56, 63, 74, 81])
print("Initial Ages Array: ", ages)
# Initial Ages Array: [15 22 27 35 41 56 63 74 81]
indexes = np.array([1, 4, 6]) # Indices of values of interest
fetched_ages = ages[indexes]
print("Fetched Ages: ", fetched_ages) # Fetched Ages: [22 41 63]
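
Unlike slicing, fancy indexing returns a copy of the selected data. The sketch below illustrates this with the same ages array and also selects whole rows of a small 2D array (the table array is made up for the example).

```python
import numpy as np

ages = np.array([15, 22, 27, 35, 41, 56, 63, 74, 81])

picked = ages[[1, 4, 6]]     # fancy indexing returns a copy
picked[0] = 0
print(ages[1])               # 22: the original is unchanged

ages[[1, 4, 6]] += 1         # assigning through fancy indices does update ages
print(ages[[1, 4, 6]])       # [23 42 64]

table = np.arange(12).reshape(4, 3)
print(table[[3, 0]])         # rows 3 and 0, in that order
```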
Transposing Arrays
The numpy.transpose() function or .T attribute is used to reverse or permute
the axes of an array.
Example:
import numpy as np
# Create a 2D array
arr = np.array([[1, 2, 3], [4, 5, 6]])
# Transpose using .T
transposed1 = arr.T
# Transpose using numpy.transpose()
transposed2 = np.transpose(arr)
print(transposed1)
# Output:
# [[1 4]
# [2 5]
# [3 6]]
For higher-dimensional arrays, you can specify the order of axes:
arr_3d = np.random.rand(2, 3, 4)
transposed = np.transpose(arr_3d, axes=(1, 0, 2)) # Rearrange axes

Swapping Axes
The numpy.swapaxes() function swaps two specified axes of an array while
leaving others unchanged.
Example:
# Create a 3D array
arr_3d = np.random.rand(2, 3, 4)
# Swap axes 0 and 1
swapped = np.swapaxes(arr_3d, 0, 1)
print(swapped.shape) # Output: (3, 2, 4)

Note:
 Transpose: Rearranges all axes (default reverses them).
 Swapaxes: Swaps only two specified axes.
Both are efficient and return views of the original array when possible.
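
The view behaviour can be verified with np.shares_memory(). In this sketch np.arange() replaces the random data so the values are predictable; the axis order (1, 0, 2) matches the earlier example.

```python
import numpy as np

arr_3d = np.arange(24).reshape(2, 3, 4)

moved = np.transpose(arr_3d, axes=(1, 0, 2))
swapped = np.swapaxes(arr_3d, 0, 1)

print(moved.shape)                        # (3, 2, 4)
print(np.array_equal(moved, swapped))     # True: same rearrangement
print(np.shares_memory(arr_3d, swapped))  # True: a view, not a copy

swapped[0, 0, 0] = -1                     # writes through to the original
print(arr_3d[0, 0, 0])                    # -1
```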

Universal Functions:
ufuncs stands for "Universal Functions": NumPy functions that operate
element by element on ndarray objects.
Why use ufuncs?
ufuncs are used to implement vectorization in NumPy, which is much faster
than iterating over elements. They also support broadcasting and provide
additional methods such as reduce and accumulate that are very helpful for
computation.
ufuncs also take additional keyword arguments, such as:
• where: a Boolean array or condition defining where the operation should
take place.
• dtype: the data type of the returned elements.
• out: an output array into which the result is written.
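
A brief sketch of these arguments and methods, using np.add as the example ufunc. The output array is pre-filled with zeros because positions skipped by where are otherwise left untouched.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# out + where: only positions where a > 2 are computed and written
out = np.zeros(4, dtype=np.float64)
np.add(a, b, out=out, where=a > 2)
print(out)                                   # [ 0.  0. 33. 44.]

# dtype: choose the type of the result
print(np.add(a, b, dtype=np.float32).dtype)  # float32

# reduce and accumulate: ufunc methods
print(np.add.reduce(a))                      # 10
print(np.add.accumulate(a))                  # [ 1  3  6 10]
```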
ufunc Rounding Decimals
NumPy offers five main functions for rounding decimals: trunc, fix, around,
floor, and ceil.
1. Truncation – Removes the decimals and returns the value closest to zero.
Both np.trunc() and np.fix() do this.
np.trunc([-3.1666, 3.6667]) # [-3. 3.]
np.fix([-3.1666, 3.6667]) # [-3. 3.]
2. Rounding (around) – Rounds to the given number of decimal places. Values
exactly halfway between two candidates round to the nearest even digit, so
np.around(2.5) is 2.0, not 3.0.
np.around(3.1666, 2) # 3.17
3. Floor – Rounds down to the nearest integer.
np.floor([-3.1666, 3.6667]) # [-4. 3.]
4. Ceil – Rounds up to the nearest integer.
np.ceil([-3.1666, 3.6667]) # [-3. 4.]
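
One detail worth checking is how np.around() treats values that lie exactly halfway between two integers: NumPy rounds them to the nearest even value rather than always up. A minimal sketch:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
print(np.around(halves))        # [0. 2. 2. 4.]

values = np.array([-3.1666, 3.6667])
print(np.around(values, 2))     # [-3.17  3.67]
```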

NumPy Log Functions


NumPy supports logarithms with base 2, 10, and e. The log of 0 is -inf and
the log of a negative number is nan; both cases emit a RuntimeWarning.
Log base 2
import numpy as np
arr = np.arange(1, 10)
print(np.log2(arr))
Log base 10
print(np.log10(arr))
Natural log (base e)
print(np.log(arr))
Log at any base (custom ufunc built from math.log with np.frompyfunc)
from math import log
nplog = np.frompyfunc(log, 2, 1) # 2 inputs (value, base), 1 output
print(nplog(100, 15))
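
An alternative to the custom ufunc is the change-of-base formula, log_b(x) = ln(x) / ln(b), which stays fully vectorized. The sketch below uses base 15 to match the example above and also shows the results for zero and negative input, with the warnings suppressed through np.errstate().

```python
import numpy as np

arr = np.array([1.0, 15.0, 225.0])

# Change of base: log_15(x) = ln(x) / ln(15)
log15 = np.log(arr) / np.log(15)
print(np.allclose(log15, [0, 1, 2]))   # True

# log(0) gives -inf and log of a negative number gives nan
with np.errstate(divide="ignore", invalid="ignore"):
    edge = np.log(np.array([0.0, -1.0]))
print(edge)                            # [-inf  nan]
```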

ufunc Summations
NumPy Addition vs Summation
Addition (np.add) – Element-wise addition between two arrays.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([1, 2, 3])
print(np.add(arr1, arr2)) # [2 4 6]
Summation (np.sum) – Adds all elements.
print(np.sum([arr1, arr2])) # 12
Summation over axis – Adds along a specified axis.
print(np.sum([arr1, arr2], axis=1)) # [6 6]
Cumulative sum (np.cumsum) – Running total of elements.
arr = np.array([1, 2, 3])
print(np.cumsum(arr)) # [1 3 6]
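
The axis argument is easier to follow on an explicit 2D array: axis=0 sums down each column and axis=1 sums across each row. The table below is made up for the illustration.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(table.sum())         # 21: all elements
print(table.sum(axis=0))   # [5 7 9]: one sum per column
print(table.sum(axis=1))   # [ 6 15]: one sum per row

running = np.cumsum(table, axis=1)   # running total along each row
print(running)
```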
ufunc Products
Product (np.prod) – Multiplies all elements.
import numpy as np
arr = np.array([1, 2, 3, 4])
print(np.prod(arr)) # 24
Product of multiple arrays – Flattens and multiplies all elements.
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])
print(np.prod([arr1, arr2])) # 40320
Product over axis – Multiplies along a specific axis.
print(np.prod([arr1, arr2], axis=1)) # [24 1680]
Cumulative product (np.cumprod) – Running product of elements.
arr = np.array([5, 6, 7, 8])
print(np.cumprod(arr)) # [5 30 210 1680]

NumPy Discrete Difference – np.diff


Discrete difference = subtract successive elements.
import numpy as np
arr = np.array([10, 15, 25, 5])
print(np.diff(arr)) # [5 10 -20]
Calculation: 15-10=5, 25-15=10, 5-25=-20
Repeated difference – use n parameter.
print(np.diff(arr, n=2)) # [5 -30]
Step 1: [5, 10, -20]
Step 2: 10-5=5, -20-10=-30

NumPy – LCM & GCD


LCM (Lowest Common Multiple)
 Definition: Smallest positive number divisible by both numbers.
import numpy as np
# Two numbers
print(np.lcm(4, 6)) # 12
# Array
arr = np.array([3, 6, 9])
print(np.lcm.reduce(arr)) # 18
# LCM of 1 to 10
print(np.lcm.reduce(np.arange(1, 11))) # 2520
GCD (Greatest Common Divisor / HCF)
 Definition: Largest number dividing both numbers exactly.
import numpy as np
# Two numbers
print(np.gcd(6, 9)) # 3
# Array
arr = np.array([20, 8, 32, 36, 16])
print(np.gcd.reduce(arr)) # 4

NumPy – Trigonometric and Hyperbolic Functions


1. Trigonometric Functions
NumPy provides: sin(), cos(), tan() – take radian values as input.
import numpy as np
print(np.sin(np.pi/2)) # 1.0
arr = np.array([np.pi/2, np.pi/3, np.pi/4, np.pi/5])
print(np.sin(arr))
Degree–Radian Conversion
 deg2rad() → degrees → radians
 rad2deg() → radians → degrees
print(np.deg2rad([90, 180, 270, 360]))
print(np.rad2deg([np.pi/2, np.pi, 1.5*np.pi, 2*np.pi]))
Inverse Trigonometric Functions
 arcsin(), arccos(), arctan() – return angles in radians.
print(np.arcsin(1.0))
print(np.arcsin([1, -1, 0.1]))
Hypotenuse Calculation
 hypot(base, perpendicular) – Pythagoras theorem.
print(np.hypot(3, 4)) # 5.0

2. Hyperbolic Functions
NumPy provides:
 sinh(), cosh(), tanh() – hyperbolic equivalents.
print(np.sinh(np.pi/2))
arr = np.array([np.pi/2, np.pi/3, np.pi/4, np.pi/5])
print(np.cosh(arr))
Inverse Hyperbolic Functions
• arcsinh(), arccosh(), arctanh() – inverses of sinh(), cosh(), and tanh();
they return plain real numbers rather than angles.
print(np.arcsinh(1.0))
print(np.arctanh([0.1, 0.2, 0.5]))

NumPy Set Operations (1-D arrays only)


1. Create a Set – remove duplicates
np.unique(arr)
2. Union – values present in either array
np.union1d(arr1, arr2)
3. Intersection – values present in both arrays
np.intersect1d(arr1, arr2, assume_unique=True)
4. Difference – values in first array not in second
np.setdiff1d(arr1, arr2, assume_unique=True)
5. Symmetric Difference – values in either array but not both
np.setxor1d(arr1, arr2, assume_unique=True)
Note: assume_unique=True speeds up the computation, but pass it only when
each input array is already free of duplicates; otherwise the result may be
incorrect.

Array-Oriented Programming with Arrays


Array-oriented programming in Python is a programming paradigm that
leverages arrays (especially NumPy arrays) to perform operations on entire
collections of data at once, rather than iterating through individual elements.
This approach is efficient, concise, and ideal for numerical and data-intensive
tasks.
Key Concepts of Array-Oriented Programming in Python
1. Vectorization:
Replaces explicit loops with array operations. Improves performance by
utilizing optimized C-based implementations in libraries like NumPy.
Example:
import numpy as np
# Without vectorization
list1 = [1, 2, 3]
list2 = [4, 5, 6]
result = [x + y for x, y in zip(list1, list2)]
# With vectorization
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
result = array1 + array2
print(result) # Output: [5 7 9]
2. Broadcasting:
Allows operations on arrays of different shapes by automatically expanding
smaller arrays to match the shape of larger ones.
Example:
array = np.array([1, 2, 3])
scalar = 5
result = array + scalar
print(result) # Output: [6 7 8]
3. Multi-Dimensional Arrays:
NumPy supports multi-dimensional arrays for complex data structures like
matrices.
Example:
matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])
result = matrix1 + matrix2
print(result)
# Output:
# [[ 6 8]
# [10 12]]
4. Efficient Aggregations:
Perform operations like summation, mean, or standard deviation directly on
arrays.
Example:
data = np.array([1, 2, 3, 4, 5])
print(data.sum()) # Output: 15
print(data.mean()) # Output: 3.0
5. Logical Operations:
Apply logical conditions across entire arrays.
Example:
array = np.array([1, 2, 3, 4, 5])
result = array > 3
print(result) # Output: [False False False True True]
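
Broadcasting also works between arrays, not only between an array and a scalar. The sketch below uses made-up scores and bonus arrays: a shape (3,) array is stretched across both rows of a shape (2, 3) array, while an incompatible shape raises an error.

```python
import numpy as np

scores = np.array([[80, 90, 70],
                   [60, 75, 85]])     # shape (2, 3)
bonus = np.array([5, 0, 10])          # shape (3,): applied to every row

adjusted = scores + bonus
print(adjusted)
# [[85 90 80]
#  [65 75 95]]

# keepdims=True keeps shape (2, 1), which broadcasts across the columns
row_means = scores.mean(axis=1, keepdims=True)
centered = scores - row_means
print(centered.shape)                 # (2, 3)

# Shapes (2, 3) and (2,) do not line up on the last axis, so this fails
try:
    scores + np.array([1, 2])
except ValueError as exc:
    print("broadcast failed:", exc)
```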
Benefits of Array-Oriented Programming
 Performance: Eliminates Python loops, leveraging optimized C-based
operations.
 Readability: Code is more concise and easier to understand.
 Scalability: Handles large datasets efficiently.
By adopting array-oriented programming, you can write cleaner, faster, and
more efficient Python code, especially for data analysis, scientific computing,
and machine learning tasks.

NumPy mathematical and statistical methods


Mathematical Methods
1. Basic Arithmetic Operations:
o np.add(), np.subtract(), np.multiply(), np.divide()
o Example: np.add([1, 2], [3, 4]) → [4, 6]
2. Trigonometric Functions:
o np.sin(), np.cos(), np.tan(), etc.
o Example: np.sin(np.pi / 2) → 1.0
3. Exponential and Logarithmic Functions:
o np.exp(), np.log(), np.log10()
o Example: np.exp(1) → 2.718
4. Power and Root Functions:
o np.power(), np.sqrt()
o Example: np.sqrt(16) → 4.0
5. Rounding Functions:
o np.round(), np.floor(), np.ceil()
o Example: np.round(3.14159, 2) → 3.14

Statistical Methods
1. Descriptive Statistics:
o np.mean(): Calculates the mean.
o np.median(): Finds the median.
o np.std(), np.var(): Standard deviation and variance.
o Example: np.mean([1, 2, 3]) → 2.0
2. Summation and Products:
o np.sum(), np.prod()
o Example: np.sum([1, 2, 3]) → 6
3. Min/Max and Percentiles:
o np.min(), np.max(), np.percentile()
o Example: np.percentile([1, 2, 3, 4], 50) → 2.5
4. Correlation and Covariance:
o np.corrcoef(): Correlation coefficient.
o np.cov(): Covariance matrix.
o Example: np.corrcoef([1, 2, 3], [4, 5, 6])
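
A short sketch of the statistical functions on one small, made-up dataset. Note that np.std() divides by n by default; the ddof=1 argument switches to the sample formula that divides by n - 1.

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

print(np.mean(data))             # 5.0
print(np.median(data))           # 4.5
print(np.std(data))              # 2.0 (population: divides by n)
print(np.std(data, ddof=1))      # sample: divides by n - 1, slightly larger
print(np.percentile(data, 50))   # 4.5 (the 50th percentile is the median)

r = np.corrcoef([1, 2, 3], [4, 5, 6])
print(r.shape)                   # (2, 2): a correlation matrix
```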

Methods for Boolean Arrays:


1. Creating Boolean Arrays
• From conditions: Use comparison operators to create Boolean arrays.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
bool_arr = arr > 3 # [False, False, False, True, True]
• From values: Convert integers (1/0) or other data types to Boolean.
bool_arr = np.array([1, 0, 1, 0], dtype=bool) # [True, False, True, False]

2. Logical Operations
• AND (&):
a = np.array([True, False, True])
b = np.array([False, False, True])
result = a & b # [False, False, True]
• OR (|):
result = a | b # [True, False, True]
• NOT (~):
result = ~a # [False, True, False]
• XOR (^):
result = a ^ b # [True, False, False]

3. Counting True Values
• Use np.sum() to count True values (since True is treated as 1).
bool_arr = np.array([True, False, True, True])
count = np.sum(bool_arr) # 3

4. Indexing and Filtering
• Use Boolean arrays to filter elements in another array.
arr = np.array([10, 20, 30, 40])
bool_arr = arr > 25
filtered = arr[bool_arr] # [30, 40]

5. Combining Conditions
• Combine multiple conditions using logical operators.
arr = np.array([10, 20, 30, 40])
result = (arr > 15) & (arr < 35) # [False, True, True, False]

6. Boolean Reduction
• np.any(): Checks if at least one value is True.
np.any([False, True, False]) # True
• np.all(): Checks if all values are True.
np.all([True, True, False]) # False

Sorting, Unique and Other Set Logic


1. Sorting
NumPy allows sorting arrays using the np.sort() function:
import numpy as np

arr = np.array([3, 1, 4, 1, 5, 9])


sorted_arr = np.sort(arr) # Returns a sorted copy
print(sorted_arr) # Output: [1 1 3 4 5 9]
 In-place Sorting: Use arr.sort() to sort the array directly.
 Multi-dimensional Sorting: Sort along specific axes using the axis
parameter.

2. Finding Unique Elements


The np.unique() function returns the sorted unique elements of an array:
arr = np.array([1, 2, 2, 3, 4, 4, 4])
unique_elements = np.unique(arr)
print(unique_elements) # Output: [1 2 3 4]
 Additional Features:
o return_counts=True: Returns the count of each unique element.
o return_index=True: Returns the indices of the first occurrences.

3. Set Operations
NumPy supports mathematical set operations for arrays:
• Union: Combine unique elements from two arrays.
a = np.array([1, 2, 3])
b = np.array([3, 4, 5])
union = np.union1d(a, b)
print(union) # Output: [1 2 3 4 5]
• Intersection: Find common elements.
intersection = np.intersect1d(a, b)
print(intersection) # Output: [3]
• Difference: Elements in a but not in b.
difference = np.setdiff1d(a, b)
print(difference) # Output: [1 2]
• Symmetric Difference: Elements in either a or b but not both.
sym_diff = np.setxor1d(a, b)
print(sym_diff) # Output: [1 2 4 5]

File Input and Output with Arrays:


In Python, working with arrays often involves saving and loading data
efficiently. NumPy provides several functions to handle file input and output
for arrays. Here's a concise overview:
1. Saving Arrays
• numpy.save: Saves a single array to a binary .npy file.
import numpy as np
arr = np.array([1, 2, 3, 4])
np.save('array_file.npy', arr)
• numpy.savez: Saves multiple arrays into a single uncompressed .npz
archive. numpy.savez_compressed takes the same arguments and compresses
the archive.
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
np.savez('arrays_file.npz', array1=arr1, array2=arr2)
• numpy.savetxt: Saves an array to a human-readable text file.
np.savetxt('array_file.txt', arr, delimiter=',')

2. Loading Arrays
• numpy.load: Loads arrays from .npy or .npz files.
loaded_arr = np.load('array_file.npy')
print(loaded_arr)
For .npz files:
data = np.load('arrays_file.npz')
print(data['array1'], data['array2'])
• numpy.loadtxt: Loads data from a text file into an array.
loaded_arr = np.loadtxt('array_file.txt', delimiter=',')
print(loaded_arr)
• numpy.genfromtxt: Handles missing data while loading from text
files.
loaded_arr = np.genfromtxt('array_file.txt', delimiter=',', filling_values=0)
print(loaded_arr)

Note:
• Binary Format: Efficient for large arrays but not human-readable
(.npy, .npz).
• Text Format: Human-readable but less efficient for storage (.txt, .csv).
• Compression: Use np.savez_compressed to save multiple arrays in a
compressed .npz file; plain np.savez does not compress.
These functions make it easy to store and retrieve array data for various
applications, ensuring both efficiency and flexibility.
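
A round trip through both formats can be sketched as below. The temporary directory and file names are illustrative choices so the example leaves no files behind; np.savez_compressed() is used as the compressing counterpart of np.savez().

```python
import os
import tempfile

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4.5, 5.5, 6.5])

with tempfile.TemporaryDirectory() as tmp:   # scratch folder for the example
    npz_path = os.path.join(tmp, "arrays_file.npz")
    np.savez_compressed(npz_path, array1=arr1, array2=arr2)
    with np.load(npz_path) as data:          # closes the archive when done
        restored1 = data["array1"]
        restored2 = data["array2"]

    txt_path = os.path.join(tmp, "array_file.txt")
    np.savetxt(txt_path, arr1, delimiter=",", fmt="%d")
    from_text = np.loadtxt(txt_path, delimiter=",")

print(restored1.dtype == arr1.dtype)   # True: binary formats keep the dtype
print(from_text.dtype)                 # float64: loadtxt reads floats by default
```

The binary round trip preserves the dtype exactly, while the text round trip returns floats unless a dtype is passed to np.loadtxt().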

Linear Algebra
Linear algebra in Python can be efficiently handled using libraries like
NumPy and SciPy, which provide robust tools for matrix operations, solving
linear systems, and more. Here's a quick overview of what you can do:
1. Basic Operations
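You can perform operations like addition, subtraction, multiplication, and
division on matrices and vectors. The operators +, -, * and / act
element-wise on arrays of matching shape; the matrix product uses np.dot
(or the @ operator).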
import numpy as np

# Define matrices
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Matrix addition
C = A + B

# Matrix multiplication
D = np.dot(A, B)

print("Addition:\n", C)
print("Multiplication:\n", D)
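A common source of confusion is the difference between the matrix product and the element-wise product. The sketch below uses the same A and B as above; the variable names are chosen only for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Matrix product: rows of A combined with columns of B.
matrix_product = A @ B          # same result as np.dot(A, B) for 2-D arrays

# Element-wise product: each entry multiplied by the entry in the same place.
elementwise_product = A * B

print(matrix_product)       # [[19 22] [43 50]]
print(elementwise_product)  # [[ 5 12] [21 32]]
```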
2. Matrix Properties
You can compute properties like determinant, rank, trace, and eigenvalues.
# Determinant
det = np.linalg.det(A)

# Rank
rank = np.linalg.matrix_rank(A)

# Eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("Determinant:", det)
print("Rank:", rank)
print("Eigenvalues:", eigenvalues)
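The trace is mentioned above but not computed. A small sketch, assuming the same 2×2 matrix A, adds np.trace and checks two standard identities: the eigenvalues sum to the trace and multiply to the determinant. The name residual is introduced here for illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

trace = np.trace(A)                      # sum of the diagonal entries
det = np.linalg.det(A)
eigenvalues, eigenvectors = np.linalg.eig(A)

# Column i of eigenvectors pairs with eigenvalues[i], so A v = lambda v
# should hold column by column (up to floating-point error).
residual = A @ eigenvectors - eigenvectors * eigenvalues

print("Trace:", trace)
print("Sum of eigenvalues:", eigenvalues.sum())
print("Product of eigenvalues:", eigenvalues.prod(), "Determinant:", det)
```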

3. Solving Linear Systems


Solve equations of the form Ax = b.
# Define a system of equations
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])

# Solve for x
x = np.linalg.solve(A, b)

print("Solution:", x)
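It is good practice to verify a computed solution by substituting it back into the system. A minimal check, using the same A and b as above:

```python
import numpy as np

A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])

x = np.linalg.solve(A, b)

# Substitute back: A @ x should reproduce b up to rounding error.
is_valid = np.allclose(A @ x, b)

print("Solution:", x)        # approximately [2. 3.]
print("Check passed:", is_valid)
```

Here 3·2 + 1·3 = 9 and 1·2 + 2·3 = 8, so x = 2, y = 3 satisfies both equations.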

4. Matrix Inversion
Find the inverse of a matrix (if it exists).
# Inverse of a matrix
A_inv = np.linalg.inv(A)

print("Inverse:\n", A_inv)
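The phrase "if it exists" matters in practice: np.linalg.inv raises LinAlgError for a singular matrix. The sketch below checks a valid inverse against the identity and shows the failure case; the matrix named singular is a made-up example whose second row is twice the first.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
A_inv = np.linalg.inv(A)

# A valid inverse satisfies A @ A_inv = I (up to rounding error).
identity_check = np.allclose(A @ A_inv, np.eye(2))

# Hypothetical singular matrix: its determinant is 0, so no inverse exists.
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    inverse_exists = True
except np.linalg.LinAlgError:
    inverse_exists = False

print("A @ A_inv is identity:", identity_check)
print("Singular matrix has inverse:", inverse_exists)
```

When the goal is only to solve Ax = b, np.linalg.solve is usually preferable to computing the inverse explicitly.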

5. Advanced Operations
Perform singular value decomposition (SVD), QR decomposition, or compute
norms.
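# Singular Value Decomposition
# (svd returns the right singular vectors already transposed, named Vh here)
U, S, Vh = np.linalg.svd(A)

# Norm of a matrix (Frobenius norm by default for 2-D input)
norm = np.linalg.norm(A)

print("SVD Components:\n", U, S, Vh)
print("Norm:", norm)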
These tools make Python a powerful choice for linear algebra tasks, whether
you're working on data science, machine learning, or scientific computing.
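QR decomposition is mentioned above without code. The following sketch, assuming the same 2×2 matrix A from the earlier steps, rebuilds A from both its SVD and its QR factors and relates the Frobenius norm to the singular values.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])

# SVD: A = U @ diag(S) @ Vh, where S holds the singular values.
U, S, Vh = np.linalg.svd(A)
A_from_svd = U @ np.diag(S) @ Vh

# QR: A = Q @ R, with Q orthogonal and R upper triangular.
Q, R = np.linalg.qr(A)
A_from_qr = Q @ R

# The Frobenius norm equals the square root of the sum of squared singular values.
frobenius = np.linalg.norm(A)
from_singular_values = np.sqrt(np.sum(S ** 2))

print(np.allclose(A_from_svd, A), np.allclose(A_from_qr, A))
print(frobenius, from_singular_values)
```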
Pseudorandom Number Generation
Pseudo-random number generation is the process of generating a sequence
of numbers that appears random but is actually deterministic. These
numbers are produced by an algorithm, a pseudo-random number
generator (PRNG), using a starting value called a seed. If you use the same
seed, you'll get the exact same sequence of "random" numbers. This
predictability is useful for debugging and reproducibility in simulations.
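The effect of the seed can be demonstrated with NumPy's Generator interface. This is a sketch; the seed values 42 and 43 are arbitrary choices for illustration.

```python
import numpy as np

# Two generators created with the same seed produce identical sequences.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)
first = rng_a.random(5)
second = rng_b.random(5)

# A different seed gives a different sequence.
different = np.random.default_rng(seed=43).random(5)

print(np.array_equal(first, second))     # True
print(np.array_equal(first, different))  # False
```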

Example: Random Walks


A random walk is a mathematical process that describes a path consisting
of a succession of random steps. A simple 1D random walk starts at a point,
say 0, and at each step, it moves either one unit to the right (+1) or one unit
to the left (-1) with equal probability.
To simulate this using a PRNG, you'd perform the following steps:
1. Initialize: Set a starting position (e.g., position = 0).
2. Loop: Repeat for a desired number of steps (e.g., 100).
3. Generate: At each step, generate a pseudo-random number. A
common method is to generate a number between 0 and 1.
4. Decide: If the number is less than 0.5, the walker moves left (position
-= 1). If it's 0.5 or greater, the walker moves right (position += 1).
5. Record: Store the position after each step to track the path.
Here's a simple Python code snippet:
import random

position = 0
path = [position]
for _ in range(100):
    if random.random() < 0.5:
        position -= 1
    else:
        position += 1
    path.append(position)
In this example, random.random() draws from the PRNG and returns a float in
the half-open interval [0.0, 1.0), so each direction is chosen with
probability 0.5.
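The same walk can be written without an explicit loop by drawing all steps at once and taking a cumulative sum. This is one possible NumPy version, not the only way to do it; the seed 12345 and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=12345)  # arbitrary seed for reproducibility
n_steps = 100

# One uniform draw per step; below 0.5 means a step left, otherwise right.
draws = rng.random(n_steps)
steps = np.where(draws < 0.5, -1, 1)

# The position after each step is the running sum of the steps, starting at 0.
path = np.concatenate(([0], np.cumsum(steps)))

print(path[:10])
print("Final position:", path[-1])
```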

Simulating Many Random Walks at Once


Simulating multiple random walks simultaneously is a common task in fields
like physics and finance. For instance, you might want to simulate the paths
of 10,000 particles at once to understand their collective behavior. A naive
approach would be to loop through each particle and calculate its position at
each time step, which can be computationally expensive.
This is where leveraging the power of linear algebra becomes incredibly
useful. Instead of thinking about each walk individually, you can represent
the positions of all the particles as a single vector or matrix.
Let's say you want to simulate N random walks over T time steps.
 The positions of the N particles at any time t can be stored in a vector
$P_t$ of size N×1.
 The random steps for all particles at time t+1 can be represented by
another vector $S_{t+1}$ of size N×1, where each element is either +1 or -1.
The update rule for all particles at once would be:
$P_{t+1} = P_t + S_{t+1}$
When performing this in a programming environment like Python with
libraries such as NumPy, these vector additions are highly optimized. This is
because these libraries use efficient underlying code (often written in C or
Fortran) that processes the entire vector in one compiled loop instead of one
interpreted Python step per element, a technique known as vectorization. This
approach is significantly faster than using a traditional for loop to update
each particle's position one by one.
This is a simple example of how linear algebra can be used to perform many
random simulations in parallel, vastly increasing the efficiency of the
computation. You're not just adding numbers; you're adding vectors, which is
a core operation in linear algebra that can be implemented with incredible
efficiency on modern processors.
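The update rule can be sketched in NumPy as follows. Each row of the steps array is one walk and each column is one time step; the seed and the names n_walks, n_steps, P and positions are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed
n_walks, n_steps = 10_000, 100       # N walks over T time steps

# All random steps at once: an N x T array of +1 / -1 values.
steps = rng.choice([-1, 1], size=(n_walks, n_steps))

# Applying the update rule one time step at a time, for all walks together.
P = np.zeros(n_walks, dtype=steps.dtype)
for t in range(n_steps):
    P = P + steps[:, t]

# Equivalent fully vectorized form: cumulative sum along the time axis.
positions = np.cumsum(steps, axis=1)

print(np.array_equal(P, positions[:, -1]))  # True
print("Mean final position:", positions[:, -1].mean())
```

The loop runs over time steps only, so the work per iteration is a single vector addition across all 10,000 walks.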

An example of pseudo-random number generation is the linear congruential
generator (LCG), a simple and widely used method. The LCG formula is:
$X_{n+1} = (a X_n + c) \bmod m$
 $X_n$ is the current pseudo-random number in the sequence.
 $X_{n+1}$ is the next pseudo-random number.
 a is the multiplier.
 c is the increment.
 m is the modulus.
 The process starts with a seed value, $X_0$.
For example, with parameters a = 5, c = 3, m = 16, and a seed $X_0 = 7$, the
sequence unfolds as follows:
1. $X_1 = (5 \cdot 7 + 3) \bmod 16 = 38 \bmod 16 = 6$
2. $X_2 = (5 \cdot 6 + 3) \bmod 16 = 33 \bmod 16 = 1$
3. $X_3 = (5 \cdot 1 + 3) \bmod 16 = 8 \bmod 16 = 8$
4. $X_4 = (5 \cdot 8 + 3) \bmod 16 = 43 \bmod 16 = 11$
...and so on.
The sequence generated is 7, 6, 1, 8, 11, … and will eventually repeat. The
length of the repeating cycle is known as the period; with these parameters
the sequence returns to 7 after 16 steps, the maximum possible for m = 16.
The LCG's simplicity makes it easy to implement, but its predictable nature
can be a drawback for applications requiring high-quality randomness.
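The worked example can be reproduced with a few lines of Python. This is a teaching sketch of the LCG formula, not a generator suitable for real use; the function name lcg and the helper variables are made up for illustration.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield the LCG sequence X0, X1, X2, ... for the given parameters."""
    x = seed
    while True:
        yield x
        x = (a * x + c) % m


# Parameters from the worked example: a = 5, c = 3, m = 16, seed X0 = 7.
first_five = list(islice(lcg(7, 5, 3, 16), 5))
print(first_five)  # [7, 6, 1, 8, 11]

# Count the steps until the sequence returns to the seed. This simple count
# works here because this parameter set cycles back through the seed.
period = 0
for value in islice(lcg(7, 5, 3, 16), 1, None):
    period += 1
    if value == 7:
        break
print("Period:", period)  # 16
```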
