Unit II - Notes

Uploaded by

Kannan

VEL TECH HIGH TECH Dr. RANGARAJAN Dr. SAKUNTHALA ENGINEERING COLLEGE
An Autonomous Institution
Approved by AICTE-New Delhi, Affiliated to Anna University, Chennai
Accredited by NBA, New Delhi & Accredited by NAAC with “A” Grade & CGPA of 3.27

DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND DATA SCIENCE

Course code:                 Semester:
Category: OPEN ELECTIVE (OE)
Course Title: PYTHON FOR DATA SCIENCE
L T P C: 3 0 0 3

UNIT II TOWARDS DATA SCIENCE USING NUMPY


Understanding Data Types in Python - The Basics of NumPy Arrays - Computation on NumPy Arrays:
Universal Functions - Aggregations: Min, Max, and Everything in Between - Computation on Arrays:
Broadcasting - Comparisons, Masks, and Boolean Logic - Fancy Indexing - Sorting Arrays.

COURSE OBJECTIVES:
· To describe the fundamentals for exploring and managing data with Python.

· To examine the various data analytics techniques for labelled/columnar data using Python.

· To demonstrate a flexible range of data visualization techniques in Python.

· To describe the various machine learning algorithms for data modelling with Python.

COURSE OUTCOMES:

CO.No.     Course Outcomes                                                          Bloom's level

On successful completion of this course, students will be able to

C305.2     Make use of knowledge on NumPy to write programs on array operations.    K2

Understanding Data Types in Python

In Python, data types define the kind of data a variable can hold. Understanding these is essential because different
operations are valid for different data types, and correct usage ensures efficient coding and data processing.

1. Basic Built-in Data Types

Type     Description                       Example
int      Integer numbers                   x = 10
float    Decimal numbers                   y = 3.14
bool     Boolean values (True or False)    flag = True
str      Text (string of characters)       name = "Alice"

2. Type Conversion (Casting)


You can convert between types using built-in functions:

int("10") # Converts string to integer 10
float("3.5") # Converts string to float 3.5
str(100) # Converts integer to string '100'
bool(0) # Converts to boolean False
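
The conversion functions above can also fail or give results that are easy to misread. The following is a minimal sketch of a few such cases; the variable names are illustrative only and do not come from the course material.

```python
# Conversions that succeed
n = int("10")          # 10
f = float("3.5")       # 3.5
s = str(100)           # '100'

# bool() of any non-empty string is True, even for the text "False"
b = bool("False")      # True

# int() does not parse decimal text directly; it raises ValueError
try:
    int("3.5")
    failed = False
except ValueError:
    failed = True

# Going through float first works; int() then truncates toward zero
t = int(float("3.5"))  # 3

print(n, f, s, b, failed, t)
```

When reading numbers from text files, wrapping the conversion in try/except as shown is one common way to handle malformed values.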

3. Collection Data Types

Type     Description                                Example
list     Ordered, mutable collection                fruits = ["apple", "banana"]
tuple    Ordered, immutable collection              coords = (10, 20)
set      Unordered collection of unique elements    ids = {101, 102, 103}
dict     Key-value pairs                            student = {"name": "John", "age": 21}

4. Type Checking
Use the type() function to check the data type of a variable:

x = 25
print(type(x)) # Output: <class 'int'>

5. NumPy Data Types (For Data Science)

In scientific computing with NumPy, special data types are used for efficient computation:

NumPy Data Type           Description
np.int32, np.int64        Fixed-size integers
np.float32, np.float64    Floating-point numbers
np.bool_                  Boolean
np.str_                   Unicode string

Example:
import numpy as np
arr = np.array([1, 2, 3], dtype=np.int32)
print(arr.dtype) # Output: int32
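
To make the memory effect of the dtype concrete, the sketch below compares the same values stored as int32 and int64 and shows how a dtype can be changed or inferred. The array names are illustrative assumptions.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int32)
large = np.array([1, 2, 3], dtype=np.int64)

# itemsize is bytes per element, nbytes is bytes for the whole array
print(small.itemsize, small.nbytes)  # 4 12
print(large.itemsize, large.nbytes)  # 8 24

# astype returns a new array converted to the requested dtype
as_float = small.astype(np.float64)
print(as_float.dtype)                # float64

# A list mixing ints and floats is upcast to a single float dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                   # float64
```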

Why Is Understanding Data Types Important in Data Science?


● Ensures correct operations (e.g., numeric vs. text processing)

● Saves memory and improves performance

● Helps prevent errors in data transformation and modeling

● Required when using libraries like Pandas, NumPy, and scikit-learn

The Basics of NumPy Arrays


What is NumPy?
NumPy (Numerical Python) is a powerful Python library used for:

● Efficient numerical computation,

● Creating and manipulating n-dimensional arrays,

● Performing mathematical operations on large datasets.

1. What is a NumPy Array?


A NumPy array is a grid of values, all of the same type, and is indexed by a tuple of non-negative integers.

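● NumPy arrays store elements of a single type in a compact buffer, which typically makes numerical operations on them faster and more memory-efficient than on Python lists.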

● Arrays support element-wise operations and broadcasting.

import numpy as np

arr = np.array([1, 2, 3]) # 1D array
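
One way to see the difference between a list and an array is to apply the same operators to both. This is a small sketch with illustrative variable names.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

# On a list, * repeats and + concatenates
list_times_two = values * 2    # [1, 2, 3, 1, 2, 3]
list_plus_list = values + values  # [1, 2, 3, 1, 2, 3]

# On an array, the same operators act element by element
array_times_two = arr * 2      # [2 4 6]
array_plus_array = arr + arr   # [2 4 6]

print(list_times_two, array_times_two)
```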

2. Creating NumPy Arrays


From Python lists:
a = np.array([1, 2, 3]) # 1D array
b = np.array([[1, 2], [3, 4]]) # 2D array

Using built-in functions:

np.zeros((2, 3)) # 2x3 array of zeros
np.ones((2, 2)) # 2x2 array of ones
np.arange(0, 10, 2) # [0 2 4 6 8]
np.linspace(0, 1, 5) # 5 evenly spaced numbers from 0 to 1
np.eye(3) # 3x3 identity matrix
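
The creation functions differ in how they treat the end value and in their default dtype. The sketch below, with illustrative names, prints a few of these details.

```python
import numpy as np

zeros = np.zeros((2, 3))
evens = np.arange(0, 10, 2)     # stop value 10 is excluded
points = np.linspace(0, 1, 5)   # end value 1 is included
identity = np.eye(3)

print(zeros.shape, zeros.dtype)  # (2, 3) float64
print(evens)                     # [0 2 4 6 8]
print(points)                    # [0.   0.25 0.5  0.75 1.  ]
print(identity[1])               # [0. 1. 0.]
```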

3. Array Attributes

Attribute    Description                            Example
ndim         Number of dimensions                   arr.ndim
shape        Tuple representing array dimensions    arr.shape
size         Total number of elements               arr.size
dtype        Data type of array elements            arr.dtype

arr = np.array([[1, 2, 3], [4, 5, 6]])


print(arr.ndim) # 2
print(arr.shape) # (2, 3)
print(arr.size) # 6

4. Indexing and Slicing


Just like Python lists, NumPy arrays support indexing and slicing.

arr = np.array([10, 20, 30, 40])
print(arr[1]) # 20
print(arr[1:3]) # [20 30]

matrix = np.array([[1, 2], [3, 4]])


print(matrix[0, 1]) # 2
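
For two-dimensional arrays, slices can be given per axis. The following sketch uses a hypothetical 3x3 array named grid to show a few common patterns.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

first_row = grid[0]           # [1 2 3]
last_col = grid[:, -1]        # [3 6 9], negative index counts from the end
block = grid[:2, 1:]          # rows 0-1, columns 1-2: [[2 3], [5 6]]
every_other = grid[::2, ::2]  # step of 2 on both axes: [[1 3], [7 9]]

print(first_row, last_col)
print(block)
print(every_other)
```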

5. Array Operations
NumPy supports vectorized operations (element-wise), which are faster than loops.

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(a + b) # [5 7 9]
print(a * 2) # [2 4 6]
print(a ** 2) # [1 4 9]
print(np.sin(a)) # Applies sin to each element

6. Reshaping Arrays
You can change the shape of an array without changing its data, as long as the total number of elements stays the same.

arr = np.array([[1, 2], [3, 4], [5, 6]])
reshaped = arr.reshape((2, 3)) # [[1 2 3], [4 5 6]]
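
A few further reshape patterns are sketched below, including letting NumPy infer one dimension with -1 and what happens when the element count does not match. The names are illustrative.

```python
import numpy as np

arr = np.arange(1, 7)                # [1 2 3 4 5 6]

two_by_three = arr.reshape((2, 3))   # [[1 2 3], [4 5 6]]
inferred = arr.reshape((-1, 2))      # -1 lets NumPy work out the row count
flat = two_by_three.ravel()          # back to one dimension

print(inferred.shape)                # (3, 2)
print(flat)                          # [1 2 3 4 5 6]

# 6 elements cannot fill a 4x2 array, so reshape raises ValueError
try:
    arr.reshape((4, 2))
    mismatch = False
except ValueError:
    mismatch = True
print(mismatch)                      # True
```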

7. Array Copying vs. View


● arr.copy() creates a new array with its own data.

● arr.view() or slicing returns a view (reference) to the same data, so changes made through the view also appear in the original array.
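
The difference between a view and a copy can be checked directly. This is a minimal sketch; the array names are assumptions made for illustration.

```python
import numpy as np

original = np.array([10, 20, 30, 40])

# A slice is a view: writing through it changes the original
view = original[1:3]
view[0] = 99
print(original)                                   # [10 99 30 40]

# A copy owns its data: writing to it leaves the original alone
independent = original.copy()
independent[0] = -1
print(original[0])                                # 10

print(np.shares_memory(original, view))           # True
print(np.shares_memory(original, independent))    # False
```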

Why NumPy Arrays Are Important in Data Science:


● Memory-efficient and fast computations

● Support for vectorized operations (no explicit loops)

● Foundation for Pandas, Scikit-learn, TensorFlow, etc.

Computation on NumPy Arrays


NumPy is optimized for performing fast, element-wise operations on large arrays. It includes mathematical, statistical, and logical operations that are
both vectorized and highly efficient.
1. Arithmetic Operations
NumPy allows arithmetic operations to be applied element-wise on arrays of the same shape.

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(a + b) # [5 7 9]
print(a - b) # [-3 -3 -3]
print(a * b) # [ 4 10 18]
print(a / b) # [0.25 0.4  0.5 ]
print(a ** 2) # [1 4 9]

2. Universal Functions (ufuncs)


NumPy provides many universal functions, which are fast, vectorized operations implemented in C.

Examples:

Function              Description
np.add(a, b)          Element-wise addition
np.subtract(a, b)     Element-wise subtraction
np.multiply(a, b)     Element-wise multiplication
np.divide(a, b)       Element-wise division

Trigonometric Functions:
x = np.array([0, np.pi/2, np.pi])
print(np.sin(x)) # approximately [0, 1, 0]; sin(pi) prints as about 1.2e-16 because of floating-point rounding


Exponential and Logarithmic:

np.exp([1, 2, 3]) # [e^1, e^2, e^3]
np.log([1, np.e, np.e**2]) # [0., 1., 2.]
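
The arithmetic operators and the named ufuncs give the same results, and ufuncs additionally offer methods such as reduce and accumulate. The sketch below illustrates this with assumed variable names.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# The + operator and np.add produce identical arrays
same = np.array_equal(a + b, np.add(a, b))
print(same)                      # True

powers = np.power(a, 2)
print(powers)                    # [1 4 9]

# reduce applies the ufunc repeatedly until one value remains
total = np.add.reduce(a)
print(total)                     # 6

# accumulate keeps every intermediate result
running = np.add.accumulate(a)
print(running)                   # [1 3 6]
```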

3. Aggregation Functions
Aggregations compute summary statistics over the entire array or along an axis.

Common Aggregations:

Function           Description
np.sum()           Sum of all elements
np.mean()          Mean (average)
np.std()           Standard deviation
np.min()           Minimum value
np.max()           Maximum value
np.median()        Median
np.percentile()    Nth percentile

Example:

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.sum()) # 21
print(arr.mean()) # 3.5
print(arr.sum(axis=0)) # [5 7 9] - column-wise sum
print(arr.sum(axis=1)) # [ 6 15] - row-wise sum
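
The remaining aggregations follow the same pattern. The sketch below applies min, max, median, percentile, and standard deviation to the same 2x3 array; the result names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.min(), arr.max())       # 1 6

col_min = arr.min(axis=0)         # minimum of each column
row_max = arr.max(axis=1)         # maximum of each row
print(col_min)                    # [1 2 3]
print(row_max)                    # [3 6]

median = np.median(arr)
q25 = np.percentile(arr, 25)      # default linear interpolation
print(median, q25)                # 3.5 2.25

spread = arr.std()                # population standard deviation
print(round(float(spread), 3))    # 1.708
```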

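4. Broadcasting in NumPy
Broadcasting allows NumPy to work with arrays of different shapes during arithmetic operations.

Rules:
1. If the arrays have a different number of dimensions, the shape of the one with fewer dimensions is padded with ones on its left side.

2. Two dimensions are compatible when they are equal or when one of them is 1; a dimension of size 1 is stretched to match the other.

3. If any pair of dimensions is neither equal nor 1, NumPy raises a ValueError.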

Example:

a = np.array([1, 2, 3]) # shape (3,)
b = np.array([[10], [20]]) # shape (2, 1)

print(a + b)
# Output: [[11 12 13]
#          [21 22 23]]
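
The broadcasting rules can be checked by inspecting result shapes. The sketch below also shows one typical use (subtracting column means) and a shape pair that does not broadcast; the data values are made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])            # shape (3,)
b = np.array([[10], [20]])         # shape (2, 1)

# (3,) is treated as (1, 3); (1, 3) and (2, 1) stretch to (2, 3)
result_shape = (a + b).shape
print(result_shape)                # (2, 3)

# Typical use: subtract each column's mean from a 2D array
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
centered = data - data.mean(axis=0)   # (2, 3) minus (3,)
print(centered)

# Shapes (2, 3) and (2,) fail: trailing dimensions 3 and 2 differ
try:
    np.ones((2, 3)) + np.ones((2,))
    incompatible = False
except ValueError:
    incompatible = True
print(incompatible)                # True
```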

5. Comparison and Boolean Logic
You can use NumPy to compare elements or apply Boolean conditions.

a = np.array([1, 2, 3, 4])

print(a > 2) # [False False True True]
print(a == 3) # [False False True False]

Logical Operators:

Operator            Description
np.logical_and()    Element-wise AND
np.logical_or()     Element-wise OR
np.logical_not()    Element-wise NOT

a = np.array([1, 2, 3])
b = np.array([3, 2, 1])

np.logical_and(a < 3, b < 3) # [False True False]
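
Boolean arrays like these can also be used as masks to select or count elements, which is the "Masks" topic named in the unit outline. The following is a minimal sketch with illustrative names, using the &, | and ~ operators as shorthand for the logical functions.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

# Parentheses are needed because & binds tighter than comparisons
mask = (a > 2) & (a < 6)
selected = a[mask]               # elements where the mask is True
print(selected)                  # [3 4 5]

rest = a[~mask]                  # ~ inverts the mask
print(rest)                      # [1 2 6]

# True counts as 1, so summing a mask counts the matches
count = int(np.sum(a > 2))
print(count)                     # 4

print(np.any(a > 5), np.all(a > 0))  # True True
```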

6. Fancy Indexing
Fancy indexing retrieves multiple elements at once using a list or array of indices.

arr = np.array([10, 20, 30, 40])
idx = [0, 3]

print(arr[idx]) # [10 40]
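
Fancy indexing also works with multi-dimensional index arrays, with paired row and column indices, and for assignment. The sketch below uses made-up arrays to show each case.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

# The result takes the shape of the index array, not of arr
idx = np.array([[0, 1], [3, 2]])
picked = arr[idx]
print(picked)                    # [[10 20]
                                 #  [40 30]]

# In 2D, row and column index arrays are paired element by element
grid = np.arange(12).reshape((3, 4))
rows = np.array([0, 1, 2])
cols = np.array([2, 1, 3])
pairs = grid[rows, cols]         # grid[0, 2], grid[1, 1], grid[2, 3]
print(pairs)                     # [ 2  5 11]

# Fancy indexing can also assign to several positions at once
updated = arr.copy()
updated[[0, 3]] = 0
print(updated)                   # [ 0 20 30  0]
```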

7. Sorting Arrays
np.sort() returns a sorted copy and leaves the original array unchanged.

a = np.array([3, 1, 2])

print(np.sort(a)) # [1 2 3]

For multi-dimensional arrays:

arr = np.array([[3, 1], [4, 2]])

np.sort(arr, axis=1) # Sort each row: [[1 3], [2 4]]
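
Related sorting tools include np.argsort, which returns the indices that would sort an array, and sorting along axis 0. This is a sketch under the assumption of a small example array named table, which is not from the notes.

```python
import numpy as np

a = np.array([3, 1, 2])

# argsort gives the index order; indexing with it sorts the array
order = np.argsort(a)
print(order)                     # [1 2 0]
print(a[order])                  # [1 2 3]

# Reversing a sorted copy gives descending order
descending = np.sort(a)[::-1]
print(descending)                # [3 2 1]

table = np.array([[3, 1], [2, 4]])
by_column = np.sort(table, axis=0)   # sort within each column
by_row = np.sort(table, axis=1)      # sort within each row
print(by_column)                 # [[2 1]
                                 #  [3 4]]
print(by_row)                    # [[1 3]
                                 #  [2 4]]
```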


Summary Table

Topic              Examples / Functions
Arithmetic Ops     +, -, *, /, **
Universal Funcs    np.sin(), np.exp(), np.power()
Aggregations       np.sum(), np.mean(), np.max()
Broadcasting       Add arrays of different shapes
Comparisons        a > b, a == b
Boolean Logic      np.logical_and(), np.logical_not()
Fancy Indexing     arr[[0, 2]]
Sorting            np.sort()
