Blueprint:
Unit No.  Unit Name                                            Marks
1         Data Handling using Pandas and Data Visualization    30
2         Database Query using SQL                             25
3         Introduction to Computer Networks                    7
4         Societal Impacts                                     8
          Practical                                            30
          Total                                                100
Unit 1
Data Handling using Pandas and Data Visualization
(Data Handling using Pandas –I)
Module: A module is a single .py file that contains Python code, such as
function definitions and other executable statements.
Package: A package is a namespace that groups multiple modules or
sub-packages. It is a directory that contains a special file, __init__.py.
The __init__.py file tells Python to treat the directory that contains it
as a package.
Library: A library is a collection of packages. Conceptually, there is no
difference between a package and a Python library.
Framework: A framework is a collection of libraries that also dictates the
overall structure and flow of the code.
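The difference between a module and a package can be checked from Python itself. The following is a minimal sketch, assuming a standard CPython installation, where json ships as a package (a directory with __init__.py) and math as a single module. The variable names are only illustrative.

```python
import json   # in CPython, json is a package (a directory with __init__.py)
import math   # math is a single module

# A package has a __path__ attribute; a plain module does not.
json_is_package = hasattr(json, "__path__")
math_is_package = hasattr(math, "__path__")

print(json_is_package)   # True
print(math_is_package)   # False
```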
Pandas:
Pandas is one of the most popular open-source Python libraries
used for data analysis.
Pandas provides two main data structures for working with data:
● Series
● DataFrame
Installation of pandas:
pip install pandas
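To confirm that the installation worked, pandas can be imported and its version printed. This is a small sketch; pd is the conventional alias for pandas, not a requirement.

```python
import pandas as pd   # pd is the commonly used alias

# Prints the installed pandas version string
print(pd.__version__)
```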
Series:
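A Series is a one-dimensional labelled array in pandas that can
store data of any type.
Syntax:
<Series name> = pd.Series(<list name>, ...)
Here pd is the usual alias for pandas (import pandas as pd).
Example: a Series holding the values 5, 15, 16, 4 and 34.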
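The example values above can be placed in a Series as follows. This is a minimal sketch; the variable name marks is only illustrative.

```python
import pandas as pd

# Build a Series from the five example values; pandas assigns
# the default index 0, 1, 2, 3, 4 automatically.
marks = pd.Series([5, 15, 16, 4, 34])

print(marks)
```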
Properties of Series:
• A Series holds data of a single (homogeneous) data type.
• The size of a Series is immutable.
• The values in a Series are mutable.
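Two of these properties can be observed directly: values can be changed in place, and mixed numeric input is stored under one common data type. The sketch below uses illustrative names (s, mixed).

```python
import pandas as pd

s = pd.Series([10, 20, 30])
s[1] = 99                     # values are mutable: the element at index 1 is replaced
print(s.tolist())             # [10, 99, 30]
print(len(s))                 # 3

# An int and a float together are stored as one common type (float64)
mixed = pd.Series([1, 2.5])
print(mixed.dtype)            # float64
```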
Creation of Series:
We can create a pandas Series in the following ways:
● From arrays
● From lists
● From dictionaries
● From a scalar value
From Lists:
Pass a Python list to pd.Series(). Each element gets a default integer
index starting at 0.
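A minimal sketch of creating a Series from a list (the list name fruits and its contents are only illustrative):

```python
import pandas as pd

fruits = ["apple", "banana", "cherry"]

# Each list element becomes one value; the index defaults to 0, 1, 2
s = pd.Series(fruits)

print(s)
```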
From Arrays:
Pass a NumPy array to pd.Series(). Index labels can optionally be
supplied through the index parameter.
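A minimal sketch of creating a Series from a NumPy array, with custom index labels (the values and labels are only illustrative):

```python
import numpy as np
import pandas as pd

arr = np.array([1.5, 2.5, 3.5])

# The index parameter replaces the default 0, 1, 2 with custom labels
s = pd.Series(arr, index=["a", "b", "c"])

print(s)
```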
From Dictionary:
Pass a dictionary to pd.Series(). The keys become the index labels and
the values become the data.
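A minimal sketch of creating a Series from a dictionary (the dictionary capitals is only illustrative):

```python
import pandas as pd

capitals = {"India": "New Delhi", "Japan": "Tokyo", "France": "Paris"}

# Dictionary keys become the index labels; values become the data
s = pd.Series(capitals)

print(s)
```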
From Scalar Value:
Pass a single value together with an index. The value is repeated for
every index label.
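A minimal sketch of creating a Series from a scalar value (the value 7 and the labels are only illustrative):

```python
import pandas as pd

# The scalar 7 is repeated once for each index label
s = pd.Series(7, index=["a", "b", "c"])

print(s)
```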
Mathematical Operations on Series:
Arithmetic operators such as +, -, * and / work element by element on a
Series. When two Series are combined, values are matched by index label.
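A minimal sketch of arithmetic on Series, using illustrative names (a, b, c). It also shows what happens when the index labels of the two Series do not fully match: the unmatched label gets NaN (missing value).

```python
import pandas as pd

a = pd.Series([1, 2, 3])
b = pd.Series([10, 20, 30])

total = a + b        # element-wise addition
scaled = a * 2       # every value multiplied by a scalar
print(total.tolist())    # [11, 22, 33]
print(scaled.tolist())   # [2, 4, 6]

# c has labels 1 and 2 only; label 0 of a has no partner, so the result is NaN there
c = pd.Series([100, 200], index=[1, 2])
aligned = a + c
print(aligned)
```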
Head and Tail functions on Series:
The head() and tail() functions return the first and last n rows of a
Series, respectively.
Syntax:
<Series name>.head(n)
<Series name>.tail(n)
n: number of rows to return
The default value of n is 5.
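A minimal sketch of head() and tail() on a Series of the numbers 1 to 10 (the data is only illustrative):

```python
import pandas as pd

s = pd.Series(range(1, 11))   # values 1 to 10

first_five = s.head()         # n defaults to 5
last_three = s.tail(3)

print(first_five.tolist())    # [1, 2, 3, 4, 5]
print(last_three.tolist())    # [8, 9, 10]
```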
Selection, Indexing and Slicing on Series:
Selection: We can select a value from a Series by using its
corresponding index label (by default, its position number).
Syntax:
<Series name>[<index number>]
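A minimal sketch of selecting a single value, first with the default integer index and then with custom labels (the names s and labelled are only illustrative):

```python
import pandas as pd

s = pd.Series([5, 15, 16, 4, 34])
value = s[2]                  # default index: label 2 is the third element
print(value)                  # 16

labelled = pd.Series([5, 15, 16], index=["a", "b", "c"])
by_label = labelled["b"]      # custom index: select by label
print(by_label)               # 15
```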
Indexing:
The Series.index attribute is used to get or set the index labels of a
given Series.
Syntax:
<Series name>.index
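A minimal sketch of reading the index and then replacing it with new labels (the labels are only illustrative):

```python
import pandas as pd

s = pd.Series([5, 15, 16])
print(s.index)                # RangeIndex(start=0, stop=3, step=1)
default_labels = list(s.index)

# Assigning to .index replaces the labels; the number of labels
# must equal the number of values
s.index = ["a", "b", "c"]
print(s["c"])                 # 16
```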
Slicing:
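Slicing extracts part of a Series based on the given start, stop and
step values.
Syntax:
<Series name>[<start>:<stop>:<step>]
Note: start, stop and step are optional.
Default values: start=0, stop=n (the end of the Series, so the last element at position n-1 is included), step=1
Note: Integer slicing works on positions (the default 0-based index), not on custom index labels, and the element at the stop position is excluded.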
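A minimal sketch of slicing with different start, stop and step values. The last part uses a Series with custom labels (named labelled here for illustration) to show that integer slices still go by position.

```python
import pandas as pd

s = pd.Series([5, 15, 16, 4, 34])

middle = s[1:4]               # positions 1, 2, 3 (position 4 is excluded)
every_second = s[::2]         # positions 0, 2, 4
reversed_s = s[::-1]          # negative step walks backwards

print(middle.tolist())        # [15, 16, 4]
print(every_second.tolist())  # [5, 16, 34]
print(reversed_s.tolist())    # [34, 4, 16, 15, 5]

# With custom labels, an integer slice is still positional
labelled = pd.Series([5, 15, 16], index=["a", "b", "c"])
first_two = labelled[0:2]
print(first_two.tolist())     # [5, 15]
```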