Getting Started with Pandas Cheatsheet
# import from JSON string, file, URL                                 df.iloc[10 : 20]                   # select rows from 10 to 20
pandas is open-source and the most popular Python tool for data    pd.read_json(json_string)
                                                                                                                                        # select all rows with columns at position 2, 4, and 5
wrangling and analytics. It is fast, intuitive, and can handle     # extract tables from HTML file, URL                                 df.iloc[ : , [2, 4, 5]]
                                                                   pd.read_html(url)
multiple data formats such as CSV, Excel, JSON, HTML, and                                                                               # select all rows with columns from sale to profit
                                                                                                                                        df.loc[ : , 'sale' : 'profit']
SQL.
                                                                   Exporting Data                                                       # filter the dataframe using logical condition and select sale
                                                                                                                                        and profit columns
                                                                   These commands are commonly used to export files in various          df.loc[df['sale'] > 10, ['sale', 'profit']]
 Creating DataFrames                                               formats but you can also export the dataframe into binary Feather,
                                                                   HDF5, BigQuery table, and Pickle file.                               df.iat[1, 2]         # select a single value using positioning
 Change the layout, rename the column names, append rows and
                                                                   df.to_csv(filename)         # export CSV tabular file                df.at[4, 'sale']     # select single value using label
 Create a pandas dataframe object by specifying the columns name
 and index.
                                                                   df.to_excel(filename)       # export Excel file
 From dictionary:
                                                                   # apply modifications to SQL database                                Querying
                                                                   df.to_sql(table_name, connection_object)
 df = pd.DataFrame( {"A" : [1, 4, 7], "B" : [2, 5, 8],
     "C" : [3, 6, 3]}, index=[101, 102, 103])                      df.to_json(filename)        # export JSON format file                Filter out the rows using logical conditions. The query() returns a
                                                                                                                                        boolean for filtering rows.
 From multi-dimensional list:
                                                                                                                                        df.query('sale > 20') # filters rows using logical conditions
 df = pd.DataFrame( [[1, 2, 3], [4, 5, 6],[7, 8, 3]],              Inspecting Data
    index=[101, 102, 103], columns=['A', 'B', 'C'])                                                                                     df.query('sale > 20 and profit < 30')      # combining conditions
                                                                   Understand the data and the distribution by using these
                                                                   commands.                                                            # string logical condition
                                                                                                                                        df.query('company.str.startswith("ab")', engine="python")
                                                                   # view first n rows or use df.tail(n) for last n rows
                    A               B                 C            df.head(n)
                                                                   # display and ordered first n values or use df.nsmallest(n,          Reshaping Data
 101                1               2                 3            'value') for ordered last n rows
                                                                   df.nlargest(n, 'value')
                                                                                                                                        Change the layout, rename the column names, append rows and
 102                4               5                 6            df.sample(n=10) # randomly select and display n rows                 columns, and sort values and index.
                                                                   Df.shape          # view number of rows and columns                  pd.melt(df)                      # combine columns into rows
 103                7               8                 9
                                                                   # view the index, datatype and memory information                    # convert rows into columns
                                                                   df.info()                                                            df.pivot(columns='var', values='val')
                                                                   # view statistical summary of numerical columns                      pd.concat([df1,df2], axis = 0) # appending rows
                                                                   df.describe()
 Importing Data                                                                                                                         pd.concat([df1,df2], axis = 1) # appending columns
                                                                   # view unique values and counts of the city column
                                                                   df.city.value_counts()                                               # sort values by sale column from high to low
 Import the data from text, Excel, website, database, or nested                                                                         df.sort_values('sale', ascending=False)
 JSON file.
                                                                                                                                        df.sort_index()                  # sort the index
                                                                   Subsetting
 pd.read_csv(file_location)       # import tabular CSV file
                                                                                                                                        df.reset_index()                 # move the index to columns
 pd.read_table(file_location)     # import delimited text file     Select a single row or column and multiple rows or columns using
                                                                   these commands.                                                      # rename a column using dictionary
 pd.read_excel(file_location)     # import Excel file                                                                                   df.rename(columns = {'sale':'sales'})
                                                                   df['sale']                        # select a single column
 # connect and extract the data from SQL database                                                                                       # removing sales and profit columns from dataframe
 pd.read_sql(query, connection_object)                             df[['sale', 'profit']]            # select two selected columns      df.drop(columns=['sales', 'profit'])