KEMBAR78
Pandas Cheatsheet | PDF | Information Science | Information Technology Management
0% found this document useful (0 votes)
12 views1 page

Pandas Cheatsheet

Uploaded by

almnvg22
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
12 views1 page

Pandas Cheatsheet

Uploaded by

almnvg22
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 1

Getting Started with Pandas Cheatsheet

# import from JSON string, file, URL df.iloc[10 : 20] # select rows from 10 to 20
pandas is open-source and the most popular Python tool for data pd.read_json(json_string)
# select all rows with columns at position 2, 4, and 5
wrangling and analytics. It is fast, intuitive, and can handle # extract tables from HTML file, URL df.iloc[ : , [2, 4, 5]]
pd.read_html(url)
multiple data formats such as CSV, Excel, JSON, HTML, and # select all rows with columns from sale to profit
df.loc[ : , 'sale' : 'profit']
SQL.
Exporting Data # filter the dataframe using logical condition and select sale
and profit columns
These commands are commonly used to export files in various df.loc[df['sale'] > 10, ['sale', 'profit']]
Creating DataFrames formats but you can also export the dataframe into binary Feather,
HDF5, BigQuery table, and Pickle file. df.iat[1, 2] # select a single value using positioning
Change the layout, rename the column names, append rows and
df.to_csv(filename) # export CSV tabular file df.at[4, 'sale'] # select single value using label
Create a pandas dataframe object by specifying the columns name
and index.
df.to_excel(filename) # export Excel file
From dictionary:
# apply modifications to SQL database Querying
df.to_sql(table_name, connection_object)
df = pd.DataFrame( {"A" : [1, 4, 7], "B" : [2, 5, 8],
"C" : [3, 6, 3]}, index=[101, 102, 103]) df.to_json(filename) # export JSON format file Filter out the rows using logical conditions. The query() returns a
boolean for filtering rows.
From multi-dimensional list:
df.query('sale > 20') # filters rows using logical conditions
df = pd.DataFrame( [[1, 2, 3], [4, 5, 6],[7, 8, 3]], Inspecting Data
index=[101, 102, 103], columns=['A', 'B', 'C']) df.query('sale > 20 and profit < 30') # combining conditions
Understand the data and the distribution by using these
commands. # string logical condition
df.query('company.str.startswith("ab")', engine="python")
# view first n rows or use df.tail(n) for last n rows
A B C df.head(n)

# display and ordered first n values or use df.nsmallest(n, Reshaping Data


101 1 2 3 'value') for ordered last n rows
df.nlargest(n, 'value')
Change the layout, rename the column names, append rows and
102 4 5 6 df.sample(n=10) # randomly select and display n rows columns, and sort values and index.

Df.shape # view number of rows and columns pd.melt(df) # combine columns into rows
103 7 8 9
# view the index, datatype and memory information # convert rows into columns
df.info() df.pivot(columns='var', values='val')

# view statistical summary of numerical columns pd.concat([df1,df2], axis = 0) # appending rows


df.describe()
Importing Data pd.concat([df1,df2], axis = 1) # appending columns
# view unique values and counts of the city column
df.city.value_counts() # sort values by sale column from high to low
Import the data from text, Excel, website, database, or nested df.sort_values('sale', ascending=False)
JSON file.
df.sort_index() # sort the index
Subsetting
pd.read_csv(file_location) # import tabular CSV file
df.reset_index() # move the index to columns
pd.read_table(file_location) # import delimited text file Select a single row or column and multiple rows or columns using
these commands. # rename a column using dictionary
pd.read_excel(file_location) # import Excel file df.rename(columns = {'sale':'sales'})
df['sale'] # select a single column
# connect and extract the data from SQL database # removing sales and profit columns from dataframe
pd.read_sql(query, connection_object) df[['sale', 'profit']] # select two selected columns df.drop(columns=['sales', 'profit'])

You might also like