Pandas Cheat Sheet
by Justin1209 (Justin1209) via cheatography.com/101982/cs/21202/
Import the Pandas Module

import pandas as pd

Create a DataFrame

# Method 1
df1 = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe'],
    'address': ['13 Main St.', '46 Maple Ave.'],
    'age': [34, 28]
})
# Method 2
df2 = pd.DataFrame([
    ['John Smith', '123 Main St.', 34],
    ['Jane Doe', '456 Maple Ave.', 28],
    ['Joe Schmo', '9 Broadway', 51]],
    columns=['name', 'address', 'age'])

Loading and Saving CSVs

# Load a CSV file into a DataFrame
df = pd.read_csv('my-csv-file.csv')
# Save a DataFrame to a CSV file
df.to_csv('new-csv-file.csv')
# Load a DataFrame in chunks (for large datasets)
# Initialize the reader object:
urb_pop_reader = pd.read_csv('ind_pop_data.csv', chunksize=1000)
# Get the first DataFrame chunk:
df_urb_pop = next(urb_pop_reader)

Converting Datatypes

# Convert argument to a numeric type
pd.to_numeric(arg, errors="raise")
# errors:
#   "raise"  -> raise an exception on invalid parsing
#   "coerce" -> invalid parsing will be set as NaN

Inspect a DataFrame

df.head(5)   First 5 rows
df.info()    Statistics of columns (row count, null values, datatype)

Reshape (for Scikit)

import numpy as np
nums = np.array(range(1, 11))
# -> [ 1  2  3  4  5  6  7  8  9 10]
nums = nums.reshape(-1, 1)
# -> [[ 1]
#     [ 2]
#     ...
#     [10]]

You can think of reshape() as rotating this array: rather than one big row of numbers, nums is now a big column of numbers, with one number in each row.

DataFrame for Select Columns / Rows

df = pd.DataFrame([
    ['January', 100, 100, 23, 100],
    ['February', 51, 45, 145, 45],
    ['March', 81, 96, 65, 96],
    ['April', 80, 80, 54, 180],
    ['May', 51, 54, 54, 154],
    ['June', 112, 109, 79, 129]],
    columns=['month', 'east', 'north', 'south', 'west'])

Select Columns

# Select one column
clinic_north = df.north
# Reshape values for scikit-learn:
clinic_north.values.reshape(-1, 1)
# Select multiple columns
clinic_north_south = df[['north', 'south']]

Make sure that you have a double set of brackets [[ ]], or this command won't work!
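The chunked-loading pattern from the Loading and Saving CSVs section can be sketched end to end. Since 'ind_pop_data.csv' is only an example file name, this sketch builds a small CSV in memory instead:

```python
import io
import pandas as pd

# Stand-in for a file on disk (the column names here are made up)
csv_data = io.StringIO("country,pop\nA,10\nB,20\nC,30\nD,40\nE,50\n")

# Read two rows at a time and aggregate chunk by chunk, so the
# full dataset never has to fit in memory at once
total = 0
for chunk in pd.read_csv(csv_data, chunksize=2):
    total += chunk['pop'].sum()

print(total)  # 150
```

Each iteration yields an ordinary DataFrame, so any of the operations on this sheet can be applied per chunk.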
By Justin1209 (Justin1209), cheatography.com/justin1209/. Published 23rd November, 2019. Last updated 31st January, 2020.
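To make the errors argument of pd.to_numeric from the Converting Datatypes section concrete, a minimal sketch with made-up values:

```python
import pandas as pd

# A column read from a messy file often mixes numbers and junk strings
s = pd.Series(['1', '2', 'three', '4'])

# errors="coerce" turns unparseable entries into NaN instead of raising
nums = pd.to_numeric(s, errors='coerce')

print(nums.isna().sum())  # 1 -- only 'three' failed to parse
```

With the default errors="raise", the same call would throw a ValueError on 'three'.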
Select Rows

# Select one row
march = df.iloc[2]
# Select multiple rows
jan_feb_march = df.iloc[:3]
feb_march_april = df.iloc[1:4]
may_june = df.iloc[-2:]
# Select rows with logic
january = df[df.month == 'January']
# -> <, >, <=, >=, !=, ==
march_april = df[(df.month == 'March') | (df.month == 'April')]
# -> &, |
january_february_march = df[df.month.isin(['January', 'February', 'March'])]
# -> column_name.isin([" ", " "])

Selecting a subset of a DataFrame often results in non-consecutive indices. Using .reset_index() will create a new DataFrame and move the old indices into a new column called index. Use .reset_index(drop=True) if you don't need the index column. Use .reset_index(inplace=True) to prevent a new DataFrame from being created.

Adding a Column

df = pd.DataFrame([
    [1, '3 inch screw', 0.5, 0.75],
    [2, '2 inch nail', 0.10, 0.25],
    [3, 'hammer', 3.00, 5.50],
    [4, 'screwdriver', 2.50, 3.00]],
    columns=['Product ID', 'Description', 'Cost to Manufacture', 'Price'])
# Add a column with specified row values
df['Sold in Bulk?'] = ['Yes', 'Yes', 'No', 'No']
# Add a column with the same value in every row
df['Is taxed?'] = 'Yes'
# Add a column with a calculation
df['Revenue'] = df['Price'] - df['Cost to Manufacture']

Performing Column Operations

df = pd.DataFrame([
    ['JOHN SMITH', 'john.smith@gmail.com'],
    ['Jane Doe', 'jdoe@yahoo.com'],
    ['joe schmo', 'joeschmo@hotmail.com']],
    columns=['Name', 'Email'])
# Change a column with an operation
df['Name'] = df.Name.apply(str.lower)
# -> str.lower, str.upper
# Perform a lambda operation on a column
get_last_name = lambda x: x.split(" ")[-1]
df['last_name'] = df.Name.apply(get_last_name)

Performing an Operation on Multiple Columns

df = pd.DataFrame([
    ["Apple", 1.00, "No"],
    ["Milk", 4.20, "No"],
    ["Paper Towels", 5.00, "Yes"],
    ["Light Bulbs", 3.75, "Yes"]],
    columns=["Item", "Price", "Is taxed?"])
# Lambda function applied row by row
df['Price with Tax'] = df.apply(lambda row:
        row['Price'] * 1.075
        if row['Is taxed?'] == 'Yes'
        else row['Price'],
    axis=1)

We apply a lambda to rows, as opposed to columns, when we want to perform functionality that needs to access more than one column at a time.
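The row-wise lambda pattern above can be checked end to end on a two-row frame (items and prices are made up):

```python
import pandas as pd

df = pd.DataFrame([
    ["Apple", 1.00, "No"],
    ["Light Bulbs", 4.00, "Yes"]],
    columns=["Item", "Price", "Is taxed?"])

# axis=1 hands each row to the function, so it can read
# several columns at once ('Price' and 'Is taxed?' here)
df['Price with Tax'] = df.apply(
    lambda row: row['Price'] * 1.075 if row['Is taxed?'] == 'Yes'
    else row['Price'],
    axis=1)

print(df['Price with Tax'].tolist())
```

Only the taxed row is scaled by 1.075; the untaxed price passes through unchanged.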
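A minimal illustration of the .reset_index() notes from the Select Rows section, using made-up numbers:

```python
import pandas as pd

df = pd.DataFrame({'month': ['January', 'February', 'March', 'April'],
                   'east': [100, 51, 81, 80]})

# Filtering keeps the original row labels,
# so the indices become non-consecutive
subset = df[df.east > 79]
print(subset.index.tolist())  # [0, 2, 3]

# drop=True discards the old index instead of keeping it as a column
subset = subset.reset_index(drop=True)
print(subset.index.tolist())  # [0, 1, 2]
```

Without drop=True, the old labels would survive as a new column named index.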
Rename Columns

# Method 1
df.columns = ['NewName_1', 'NewName_2', 'NewName_3', ...]
# Method 2
df.rename(columns={
    'OldName_1': 'NewName_1',
    'OldName_2': 'NewName_2'
}, inplace=True)

Using inplace=True lets us edit the original DataFrame.

Column Statistics

Mean (average)            df.column.mean()
Median                    df.column.median()
Minimum value             df.column.min()
Maximum value             df.column.max()
Number of values          df.column.count()
Number of unique values   df.column.nunique()
Standard deviation        df.column.std()
List of unique values     df.column.unique()

Series vs. Dataframes

print(type(clinic_north))
# <class 'pandas.core.series.Series'>
print(type(df))
# <class 'pandas.core.frame.DataFrame'>
print(type(clinic_north_south))
# <class 'pandas.core.frame.DataFrame'>

In pandas, a Series is a one-dimensional object that can hold any type of data. A DataFrame is a two-dimensional object that can hold multiple columns of different types of data. A single column of a DataFrame is a Series, and a DataFrame is a container of one or more Series objects.

Calculating Aggregate Functions

# Group by
grouped = df.groupby(['col1', 'col2']).col3.measurement().reset_index()
# -> group by col1 and col2, calculate values of col3
# Percentile
high_earners = df.groupby('category').wage \
    .apply(lambda x: np.percentile(x, 75)) \
    .reset_index()
# np.percentile can calculate any percentile over an array of values

Don't forget reset_index() at the end of a groupby operation.

Pivot Tables

orders = pd.read_csv('orders.csv')
shoe_counts = orders.groupby(['shoe_type', 'shoe_color']).id.count().reset_index()
shoe_counts_pivot = shoe_counts.pivot(
    index='shoe_type',
    columns='shoe_color',
    values='id').reset_index()

We have to build a temporary table where we group by the columns we want to include in the pivot table.

Merge (Same Column Name)

sales = pd.read_csv('sales.csv')
targets = pd.read_csv('targets.csv')
men_women = pd.read_csv('men_women_sales.csv')
# Method 1
sales_targets = pd.merge(sales, targets, how=" ")
# how: "inner" (default), "outer", "left", "right"
# Method 2 (method chaining)
all_data = sales.merge(targets).merge(men_women)
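Since orders.csv is only an example file name, the groupby-then-pivot pattern from the Pivot Tables section can be verified on a small stand-in frame (values made up):

```python
import pandas as pd

# Stand-in for orders.csv
orders = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'shoe_type': ['boot', 'boot', 'sandal', 'sandal'],
    'shoe_color': ['black', 'brown', 'black', 'black'],
})

# Temporary table: group by the columns the pivot will use, then count
shoe_counts = orders.groupby(['shoe_type', 'shoe_color']).id.count().reset_index()

# Spread shoe_color out into columns, one row per shoe_type
pivot = shoe_counts.pivot(index='shoe_type',
                          columns='shoe_color',
                          values='id').reset_index()

print(pivot)
```

Combinations that never occur (brown sandals here) show up as NaN in the pivoted table.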
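To make the how options of pd.merge concrete, a small sketch comparing inner and outer joins (keys and values made up):

```python
import pandas as pd

left = pd.DataFrame({'key': [1, 2, 3], 'a': ['x', 'y', 'z']})
right = pd.DataFrame({'key': [2, 3, 4], 'b': ['p', 'q', 'r']})

# "inner" (the default) keeps only keys present in both frames
inner = pd.merge(left, right, how='inner')

# "outer" keeps every key from both frames and fills gaps with NaN
outer = pd.merge(left, right, how='outer')

print(len(inner), len(outer))  # 2 4
```

With no on= argument, pandas merges on the columns the two frames share, here just key.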
Inner Merge (Different Column Name)

orders = pd.read_csv('orders.csv')
products = pd.read_csv('products.csv')
# Method 1: rename columns so they match
orders_products = pd.merge(orders,
    products.rename(columns={'id': 'product_id'}),
    how=" ").reset_index()
# how: "inner" (default), "outer", "left", "right"
# Method 2: merge on differently named columns
orders_products = pd.merge(orders, products,
    left_on="product_id",
    right_on="id",
    suffixes=["_orders", "_products"])

With Method 2 we would otherwise end up with two columns called id. Pandas won't let you have two columns with the same name, so it changes them to id_x and id_y; the keyword suffixes lets us pick more useful names.

Melt

pd.melt(frame, id_vars, value_vars, var_name, value_name='value')

id_vars: column(s) to use as identifier variables.
value_vars: column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.
var_name: name to use for the 'variable' column.
value_name: name to use for the 'value' column.

melt() unpivots a DataFrame from wide to long format, optionally leaving identifiers set.

Assert Statements

# Test if country is of type object
assert gapminder.country.dtypes == object
# Test if year is of type int64
assert gapminder.year.dtypes == np.int64
# Test if life_expectancy is of type float64
assert gapminder.life_expectancy.dtypes == np.float64
# Assert that country does not contain any missing values
assert pd.notnull(gapminder.country).all()
# Assert that year does not contain any missing values
assert pd.notnull(gapminder.year).all()

Concatenate

bakery = pd.read_csv('bakery.csv')
ice_cream = pd.read_csv('ice_cream.csv')
menu = pd.concat([bakery, ice_cream])
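The melt() signature above can be tried on a tiny wide-format frame (column names are made up):

```python
import pandas as pd

df = pd.DataFrame({'month': ['January', 'February'],
                   'east': [100, 51],
                   'west': [100, 45]})

# Unpivot the two clinic columns into long format: one row per
# (month, clinic) pair, with the counts in a single 'visits' column
long_df = pd.melt(df, id_vars='month',
                  value_vars=['east', 'west'],
                  var_name='clinic', value_name='visits')

print(long_df.shape)  # (4, 3)
```

Two rows times two value_vars gives four rows, with columns month, clinic, and visits.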
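One detail worth knowing about pd.concat: by default it keeps each source frame's indices, so the result can contain duplicate row labels. A small sketch with made-up menu data showing ignore_index=True:

```python
import pandas as pd

# Stand-ins for bakery.csv and ice_cream.csv
bakery = pd.DataFrame({'item': ['bagel', 'scone'], 'price': [2.0, 3.0]})
ice_cream = pd.DataFrame({'item': ['cone'], 'price': [4.0]})

# ignore_index=True renumbers the result 0..n-1 instead of
# keeping each frame's original index labels
menu = pd.concat([bakery, ice_cream], ignore_index=True)

print(menu.index.tolist())  # [0, 1, 2]
```

Without ignore_index=True, the index here would be [0, 1, 0], since ice_cream starts its own numbering at 0.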