Pandas Part-2

The document provides an overview of various operations that can be performed on a Pandas DataFrame, including accessing data using loc and iloc functions, filtering rows and columns, and dropping specific rows or columns. It also explains how to reset the index of a DataFrame and modify the original DataFrame. Examples of code snippets illustrate these operations using random data.

Uploaded by

imbilalbaig

Pandas DataFrame Operations

In [1]: import numpy as np

import pandas as pd

loc and iloc function

Note:
-> Use loc to select rows and columns by index label (label-based).

-> Use iloc to select rows and columns by integer position (position-based).
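
The difference between the two accessors is easiest to see when the index labels do not match the row positions. The following is a minimal sketch; the frame `demo` and its values are hypothetical and not part of the notebook's data.

```python
import pandas as pd

# Hypothetical frame whose index labels (2, 0, 1) differ from the row positions (0, 1, 2).
demo = pd.DataFrame({"A": [10, 20, 30]}, index=[2, 0, 1])

by_label = demo.loc[0, "A"]     # the row whose index label is 0
by_position = demo.iloc[0, 0]   # the first row and first column, whatever the labels are

print(by_label, by_position)
```

With a default 0..n-1 index, as in the notebook, both accessors happen to return the same rows for integer selections, which can hide the distinction.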

In [2]: # Create a DataFrame of random values using np.random.rand()

df = pd.DataFrame(np.random.rand(15, 5), index=np.arange(15), columns=['A','B','C','D','E'])
df

Out[2]: A B C D E

0 0.155607 0.755827 0.969633 0.868001 0.465959

1 0.960138 0.830459 0.866927 0.451456 0.996912

2 0.359386 0.285517 0.384530 0.762477 0.054584

3 0.518195 0.763627 0.634155 0.446701 0.525050

4 0.724831 0.595158 0.567885 0.443769 0.003731

5 0.653767 0.135068 0.617750 0.916386 0.730770

6 0.361150 0.114477 0.990422 0.338284 0.682234

7 0.849273 0.988311 0.410418 0.094562 0.196757

8 0.057593 0.107239 0.958606 0.293883 0.473595

9 0.433382 0.983331 0.904262 0.710168 0.093863

10 0.742669 0.227531 0.149279 0.006128 0.491768

11 0.639688 0.096979 0.877697 0.463407 0.309152

12 0.711802 0.031837 0.337505 0.147178 0.677049

13 0.229736 0.215737 0.795375 0.613702 0.369378

14 0.841418 0.227723 0.659610 0.575109 0.613839

In [3]: # Filter rows with a boolean condition

df.loc[(df['C']<0.3)]

Out[3]: A B C D E

10 0.742669 0.227531 0.149279 0.006128 0.491768

In [4]: # Combine several conditions with & (wrap each condition in parentheses)

df.loc[(df['C']<0.4) & (df['D']>0.1)]

Out[4]: A B C D E

2 0.359386 0.285517 0.384530 0.762477 0.054584

12 0.711802 0.031837 0.337505 0.147178 0.677049
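
The same pattern extends to "or" and "not" conditions. The sketch below uses a small hypothetical frame, `scores`, rather than the random data above. Each comparison is wrapped in parentheses because `&`, `|` and `~` bind more tightly than `<` and `>` in Python.

```python
import pandas as pd

# Hypothetical data, chosen so the results are easy to check by eye.
scores = pd.DataFrame({"C": [0.9, 0.2, 0.35], "D": [0.5, 0.05, 0.7]})

low_c = scores["C"] < 0.4
high_d = scores["D"] > 0.1

both = scores.loc[low_c & high_d]         # rows where both conditions hold
either = scores.loc[low_c | high_d]       # rows where at least one condition holds
neither = scores.loc[~(low_c | high_d)]   # rows where no condition holds

print(both)
```

Storing each condition in a named variable, as done here, is one way to keep longer filters readable.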

In [5]: # Select specific columns (all rows)

df.loc[:,['C','D']]

Out[5]: C D

0 0.969633 0.868001

1 0.866927 0.451456

2 0.384530 0.762477

3 0.634155 0.446701

4 0.567885 0.443769

5 0.617750 0.916386

6 0.990422 0.338284

7 0.410418 0.094562

8 0.958606 0.293883

9 0.904262 0.710168

10 0.149279 0.006128

11 0.877697 0.463407

12 0.337505 0.147178

13 0.795375 0.613702

14 0.659610 0.575109

In [6]: # Select specific rows by index label (all columns)

df.loc[[6,8],:]

Out[6]: A B C D E

6 0.361150 0.114477 0.990422 0.338284 0.682234

8 0.057593 0.107239 0.958606 0.293883 0.473595
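
Row and column selections can also be combined in a single loc call. A minimal sketch, assuming a hypothetical frame `grid` filled with consecutive integers:

```python
import numpy as np
import pandas as pd

# Hypothetical 4x3 frame holding the integers 0..11.
grid = pd.DataFrame(np.arange(12).reshape(4, 3), columns=["A", "B", "C"])

# Rows labelled 1 and 3, columns A and C only.
block = grid.loc[[1, 3], ["A", "C"]]
print(block)
```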

In [7]: # Drop a specific column (axis=1); this returns a new DataFrame

df.drop(['E'], axis=1)

Out[7]: A B C D

0 0.155607 0.755827 0.969633 0.868001

1 0.960138 0.830459 0.866927 0.451456

2 0.359386 0.285517 0.384530 0.762477

3 0.518195 0.763627 0.634155 0.446701

4 0.724831 0.595158 0.567885 0.443769

5 0.653767 0.135068 0.617750 0.916386

6 0.361150 0.114477 0.990422 0.338284

7 0.849273 0.988311 0.410418 0.094562

8 0.057593 0.107239 0.958606 0.293883

9 0.433382 0.983331 0.904262 0.710168

10 0.742669 0.227531 0.149279 0.006128

11 0.639688 0.096979 0.877697 0.463407

12 0.711802 0.031837 0.337505 0.147178

13 0.229736 0.215737 0.795375 0.613702

14 0.841418 0.227723 0.659610 0.575109

In [8]: # Drop a specific row by index label; this returns a new DataFrame

df.drop([5])

Out[8]: A B C D E

0 0.155607 0.755827 0.969633 0.868001 0.465959

1 0.960138 0.830459 0.866927 0.451456 0.996912

2 0.359386 0.285517 0.384530 0.762477 0.054584

3 0.518195 0.763627 0.634155 0.446701 0.525050

4 0.724831 0.595158 0.567885 0.443769 0.003731

6 0.361150 0.114477 0.990422 0.338284 0.682234

7 0.849273 0.988311 0.410418 0.094562 0.196757

8 0.057593 0.107239 0.958606 0.293883 0.473595

9 0.433382 0.983331 0.904262 0.710168 0.093863

10 0.742669 0.227531 0.149279 0.006128 0.491768

11 0.639688 0.096979 0.877697 0.463407 0.309152

12 0.711802 0.031837 0.337505 0.147178 0.677049

13 0.229736 0.215737 0.795375 0.613702 0.369378

14 0.841418 0.227723 0.659610 0.575109 0.613839

In [9]: # The original DataFrame remains unchanged

df

Out[9]: A B C D E

0 0.155607 0.755827 0.969633 0.868001 0.465959

1 0.960138 0.830459 0.866927 0.451456 0.996912

2 0.359386 0.285517 0.384530 0.762477 0.054584

3 0.518195 0.763627 0.634155 0.446701 0.525050

4 0.724831 0.595158 0.567885 0.443769 0.003731

5 0.653767 0.135068 0.617750 0.916386 0.730770

6 0.361150 0.114477 0.990422 0.338284 0.682234

7 0.849273 0.988311 0.410418 0.094562 0.196757

8 0.057593 0.107239 0.958606 0.293883 0.473595

9 0.433382 0.983331 0.904262 0.710168 0.093863

10 0.742669 0.227531 0.149279 0.006128 0.491768

11 0.639688 0.096979 0.877697 0.463407 0.309152

12 0.711802 0.031837 0.337505 0.147178 0.677049

13 0.229736 0.215737 0.795375 0.613702 0.369378

14 0.841418 0.227723 0.659610 0.575109 0.613839

In [10]: # To change the original DataFrame, assign the result of drop() back to the variable

df=df.drop(['D', 'E'], axis=1) # For dropping columns
df=df.drop([3,5,8,11,14]) # For dropping rows

In [11]: # The original DataFrame has now changed

df

Out[11]: A B C

0 0.155607 0.755827 0.969633

1 0.960138 0.830459 0.866927

2 0.359386 0.285517 0.384530

4 0.724831 0.595158 0.567885

6 0.361150 0.114477 0.990422

7 0.849273 0.988311 0.410418

9 0.433382 0.983331 0.904262

10 0.742669 0.227531 0.149279

12 0.711802 0.031837 0.337505

13 0.229736 0.215737 0.795375
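
The behaviour shown in cells 7 to 11 can be condensed into a short sketch. The frame `frame` is hypothetical; the `columns=` and `index=` keywords are an alternative spelling of `axis=1` and `axis=0`.

```python
import pandas as pd

# Hypothetical frame used only for this illustration.
frame = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

trimmed = frame.drop(columns=["B"])   # same result as frame.drop(["B"], axis=1)
shorter = frame.drop(index=[1])       # same result as frame.drop([1])

# frame itself still has both columns and all three rows.
print(frame.shape, trimmed.shape, shorter.shape)
```

Assigning the result to a new name, instead of overwriting `df`, keeps the original data available for later cells.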

In [12]: # Reset the index; by default the old index is kept as a new 'index' column

df.reset_index()

Out[12]: index A B C

0 0 0.155607 0.755827 0.969633

1 1 0.960138 0.830459 0.866927

2 2 0.359386 0.285517 0.384530

3 4 0.724831 0.595158 0.567885

4 6 0.361150 0.114477 0.990422

5 7 0.849273 0.988311 0.410418

6 9 0.433382 0.983331 0.904262

7 10 0.742669 0.227531 0.149279

8 12 0.711802 0.031837 0.337505

9 13 0.229736 0.215737 0.795375

In [13]: # To avoid the extra column, discard the old index (drop=True) and modify df in place

df.reset_index(drop=True, inplace=True)

In [14]: # The index now runs from 0 to 9 with no extra column

df

Out[14]: A B C

0 0.155607 0.755827 0.969633


1 0.960138 0.830459 0.866927

2 0.359386 0.285517 0.384530

3 0.724831 0.595158 0.567885

4 0.361150 0.114477 0.990422

5 0.849273 0.988311 0.410418

6 0.433382 0.983331 0.904262

7 0.742669 0.227531 0.149279

8 0.711802 0.031837 0.337505

9 0.229736 0.215737 0.795375
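
The two forms of reset_index can be compared side by side. A minimal sketch with a hypothetical frame `gappy` whose index has gaps, similar to the state of df after rows were dropped:

```python
import pandas as pd

# Hypothetical frame with a non-contiguous index.
gappy = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=[0, 4, 6])

kept = gappy.reset_index()             # old labels move into a column named 'index'
fresh = gappy.reset_index(drop=True)   # old labels are discarded

print(kept.columns.tolist(), fresh.index.tolist())
```

Neither call changes `gappy`; the notebook uses `inplace=True` to modify df directly, and assigning the result back is an equivalent alternative.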

In [15]: # Select rows by integer position

df.iloc[[0,4]]

Out[15]: A B C

0 0.155607 0.755827 0.969633

4 0.361150 0.114477 0.990422

In [16]: # Select rows and columns by integer position

df.iloc[[0,1],[1,2]]

Out[16]: B C

0 0.755827 0.969633

1 0.830459 0.866927
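
Slices behave differently under the two accessors: iloc excludes the end position, as Python lists do, while loc includes the end label. A short sketch, assuming a hypothetical frame `letters` with string labels:

```python
import pandas as pd

# Hypothetical frame with string index labels.
letters = pd.DataFrame({"A": [1, 2, 3, 4]}, index=["w", "x", "y", "z"])

pos_slice = letters.iloc[0:2]        # positions 0 and 1; position 2 is excluded
label_slice = letters.loc["w":"y"]   # labels w, x and y; the end label is included

print(len(pos_slice), len(label_slice))
```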

In [17]: df

Out[17]: A B C

0 0.155607 0.755827 0.969633

1 0.960138 0.830459 0.866927

2 0.359386 0.285517 0.384530

3 0.724831 0.595158 0.567885

4 0.361150 0.114477 0.990422

5 0.849273 0.988311 0.410418

6 0.433382 0.983331 0.904262

7 0.742669 0.227531 0.149279

8 0.711802 0.031837 0.337505

9 0.229736 0.215737 0.795375

In [18]: # Check which values of column C are null (this does not change the data)

df['C'].isnull()

Out[18]:
0 False
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
Name: C, dtype: bool

In [19]: # Set every value of column C to None; this modifies df

df['C']=None
df

Out[19]: A B C

0 0.155607 0.755827 None

1 0.960138 0.830459 None

2 0.359386 0.285517 None

3 0.724831 0.595158 None

4 0.361150 0.114477 None

5 0.849273 0.988311 None

6 0.433382 0.983331 None

7 0.742669 0.227531 None

8 0.711802 0.031837 None

9 0.229736 0.215737 None
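
Two related points may be worth sketching here: counting missing values per column, and blanking a column with `np.nan` instead of `None` so that it keeps a numeric dtype. The frame `data` below is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column A.
data = pd.DataFrame({"A": [1.0, np.nan, 3.0], "C": [0.5, 0.6, 0.7]})

# isnull() gives a boolean frame; summing it counts the missing values per column.
missing_per_column = data.isnull().sum()

# Assigning np.nan leaves C as a float column; assigning None gives an object column.
data["C"] = np.nan

print(missing_per_column)
print(data.dtypes)
```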

Drop NaN Value

In [20]: # Read a CSV
df2 = pd.read_csv('superhero.csv')
df2

Out[20]: Name Toy Born

0 Superman NaN NaN

1 Batman Batmobile 1940-04-25

2 Catwoman NaN NaN

In [21]: # Drop rows containing NaN values; the original is not changed

df2.dropna() # OR "df2.dropna(axis=0)"

Out[21]: Name Toy Born

1 Batman Batmobile 1940-04-25

In [22]: # Drop columns containing NaN values; the original is not changed
df2.dropna(axis='columns') # OR "df2.dropna(axis=1)"

Out[22]: Name

0 Superman

1 Batman

2 Catwoman

In [23]: # Drop rows containing NaN values in place (modifies df2)

df2.dropna(inplace=True)
df2

Out[23]: Name Toy Born

1 Batman Batmobile 1940-04-25
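
dropna also accepts parameters that control how strict the filter is. The sketch below builds a hypothetical frame, `heroes`, directly in code, so it does not depend on superhero.csv, and its values differ slightly from that file.

```python
import numpy as np
import pandas as pd

# Hypothetical data, loosely modelled on the table above.
heroes = pd.DataFrame({
    "Name": ["Superman", "Batman", "Catwoman"],
    "Toy": [np.nan, "Batmobile", "Bullwhip"],
    "Born": [np.nan, "1940-04-25", np.nan],
})

born_known = heroes.dropna(subset=["Born"])   # only look at the Born column
mostly_full = heroes.dropna(thresh=2)         # keep rows with at least 2 non-null values
not_empty = heroes.dropna(how="all")          # drop a row only if every value is NaN

print(born_known.index.tolist(), mostly_full.index.tolist(), not_empty.index.tolist())
```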

Drop Duplicates
In [24]: # Read a CSV
df3 = pd.read_csv('iceshop.csv')
df3

Out[24]: Brand Style Rating

0 Kulfa Cup 4.0

1 Kulfa Cup 4.0

2 Praline Pack 3.5

3 Mango Cup 4.5

4 Mango Cup 4.5

5 Chocolate Pack 4.0

In [25]: # Drop fully duplicated rows; the original is not changed

df3.drop_duplicates()

Out[25]: Brand Style Rating

0 Kulfa Cup 4.0

2 Praline Pack 3.5

3 Mango Cup 4.5

5 Chocolate Pack 4.0

In [26]: # Drop rows that are duplicates in the given subset of columns; the original is not changed
df3.drop_duplicates(subset=['Brand'])

Out[26]: Brand Style Rating

0 Kulfa Cup 4.0

2 Praline Pack 3.5


3 Mango Cup 4.5

5 Chocolate Pack 4.0

In [27]: # Drop duplicates in place (modifies df3)

df3.drop_duplicates(inplace=True)
df3

Out[27]: Brand Style Rating

0 Kulfa Cup 4.0

2 Praline Pack 3.5

3 Mango Cup 4.5

5 Chocolate Pack 4.0
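
Before dropping, duplicated() can show which rows would be removed, and the `keep` parameter chooses which copy survives. A minimal sketch with a hypothetical frame `shop`, built in code so that it does not depend on iceshop.csv:

```python
import pandas as pd

# Hypothetical data: the two Mango rows are identical, the two Kulfa rows differ in Rating.
shop = pd.DataFrame({
    "Brand": ["Kulfa", "Kulfa", "Mango", "Mango"],
    "Rating": [4.0, 3.0, 4.5, 4.5],
})

flags = shop.duplicated()   # True for rows that repeat an earlier row exactly

# Keep the last row of each brand instead of the first.
last_per_brand = shop.drop_duplicates(subset=["Brand"], keep="last")

# ignore_index=True renumbers the result from 0, so no separate reset_index is needed.
deduped = shop.drop_duplicates(ignore_index=True)

print(flags.tolist())
```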

In [28]: # Check dataframe shape and info

df3.shape

Out[28]: (4, 3)

In [29]: df3.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 4 entries, 0 to 5
Data columns (total 3 columns):
# Column Non-Null Count Dtype

0 Brand 4 non-null object

1 Style 4 non-null object
2 Rating 4 non-null float64
dtypes: float64(1), object(2)
memory usage: 128.0+ bytes
