Pandas DataFrame Basics Guide

The document discusses various methods for creating, loading, manipulating, and analyzing dataframes in Pandas. Key points include:
- Pandas series and dataframes can be created from arrays, dictionaries, and CSV files using functions like pd.Series(), pd.DataFrame(), and pd.read_csv().
- Data can be extracted from dataframes using indexing, column selection, .loc[], and .pivot_table(). Rows and columns can be renamed, merged, and concatenated.
- Methods like .head(), .info(), .describe() provide information about the data in a dataframe.

Uploaded by

Dev D Ghosh


PANDAS

You can create a Pandas series from an array-like object using the following command:

pd.Series(data, dtype)
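
As a minimal sketch of the command above, the following builds a series from a NumPy array. The sample values and the variable names data and series are illustrative assumptions, not part of the original notes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; any array-like (list, tuple, NumPy array) works.
data = np.array([10, 20, 30])

# dtype is optional; here the integers are stored as 64-bit floats.
series = pd.Series(data, dtype="float64")

print(series)
```

If dtype is left out, Pandas infers it from the data.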

To create a dataframe from a dictionary, you can run the following command:

pd.DataFrame(dictionary_name)

You can also provide lists or arrays to create dataframes. In that case, you should specify the column names as shown below; otherwise, Pandas assigns default integer labels (0, 1, ...) to the columns.

pd.DataFrame(list_of_rows, columns = ['column_1', 'column_2'])
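
The two routes can be compared side by side. This is a small sketch with made-up data; the names scores, rows, df_from_dict and df_from_rows are assumptions chosen for illustration.

```python
import pandas as pd

# From a dictionary: keys become column names, values become column data.
scores = {"name": ["Asha", "Ben"], "score": [82, 91]}
df_from_dict = pd.DataFrame(scores)

# From a list of rows: column names are supplied separately.
rows = [["Asha", 82], ["Ben", 91]]
df_from_rows = pd.DataFrame(rows, columns=["name", "score"])

# Check whether both routes produce the same table.
print(df_from_dict.equals(df_from_rows))
```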

You can use the following command to load data into a dataframe from a CSV file:

pd.read_csv(filepath, sep=',', header='infer')

Use the following code to change the row indices:

dataframe_name.index = list_of_row_labels

To set the index while loading the data from a file, you can use the parameter 'index_col':

pd.read_csv(filepath, index_col = column_number)
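
To try read_csv() without a file on disk, the sketch below reads from an in-memory string through io.StringIO. The CSV contents and the name csv_text are hypothetical; with a real file you would pass its path instead.

```python
import io
import pandas as pd

# An in-memory stand-in for a CSV file (hypothetical contents).
csv_text = "id,name,score\n101,Asha,82\n102,Ben,91\n"

# index_col=0 turns the first column ("id") into the row index.
df = pd.read_csv(io.StringIO(csv_text), sep=",", header="infer", index_col=0)

print(df.index.tolist())
print(df.columns.tolist())
```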

To set the column headers, you can assign a list of column names using the following code:

dataframe_name.columns = list_of_column_names
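
A short sketch of relabelling rows and columns by assignment follows. The data and labels are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame([[82, 91], [75, 68]])  # default labels are 0, 1, ...

# Assigning to .index and .columns relabels the dataframe in place.
# Each list must match the number of rows / columns exactly.
df.index = ["asha", "ben"]
df.columns = ["maths", "physics"]

print(df)
```
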
While working with Pandas, the dataframes may hold large volumes of data, and printing an entire dataframe to check its contents is rarely practical. To preview only the first few rows, use the following code:

dataframe_name.head()

By default, head() returns the first five rows; pass a number, as in head(n), to get the first n rows.

- dataframe.info(): This method prints information about the dataframe, including the index data type, the column data types, the count of non-null values and the memory used.

- dataframe.describe(): This method produces descriptive statistics for the dataframe, covering central tendency (mean, median) and dispersion (standard deviation, minimum, maximum and quartiles). For a dataframe with mixed types it summarises only the numeric columns by default; pass include='all' to also get the count, unique, top and freq values for non-numeric columns.
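
The three inspection methods can be tried together on a small dataframe. The data and the variable names below are assumptions made for this sketch.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Goa", "Delhi", "Goa"],
    "sales": [120, 340, 150, 90, 310, 110],
})

first_rows = df.head(3)                 # first 3 rows; head() alone returns 5
df.info()                               # prints dtypes, non-null counts, memory use
numeric_stats = df.describe()           # numeric columns only, by default
all_stats = df.describe(include="all")  # adds unique/top/freq for "city"

print(first_rows)
print(numeric_stats)
```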

The selection of rows in dataframes is similar to the indexing you saw in NumPy arrays. With integer positions, the syntax df[start_index:end_index] selects the rows from start_index up to, but not including, end_index.

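You can select one or more columns from a dataframe using the following commands:

- df['column'] or df.column: It returns a series. The df.column form works only when the column name is a valid Python identifier that does not clash with an existing dataframe attribute.

- df[['col_x', 'col_y']]: It returns a dataframe.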
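
Row slicing and column selection can be checked quickly on a toy dataframe. The column names follow the placeholders used above; the values are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "col_x": [1, 2, 3, 4],
    "col_y": [10, 20, 30, 40],
    "col_z": [100, 200, 300, 400],
})

middle_rows = df[1:3]                 # rows at positions 1 and 2 (end excluded)
one_column = df["col_x"]              # a Series
two_columns = df[["col_x", "col_y"]]  # a DataFrame

print(type(one_column).__name__, type(two_columns).__name__)
```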

You can use the loc indexer to extract rows and columns from a dataframe based on their labels:

dataframe.loc[[list_of_row_labels], [list_of_column_labels]]
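
A minimal sketch of label-based selection with loc follows, using invented row and column labels.

```python
import pandas as pd

df = pd.DataFrame(
    {"maths": [82, 75, 91], "physics": [91, 68, 77], "art": [60, 88, 70]},
    index=["asha", "ben", "chen"],
)

# Pick two rows and two columns by label.
subset = df.loc[["asha", "chen"], ["maths", "art"]]
print(subset)
```
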
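You can use the following code to rename row labels and columns:

dataframe.rename(index={row_index: "new_name"}, columns={column_name: "new_name"})

rename() returns a new dataframe, so assign the result to a variable to keep the change.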

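The renaming step might look like this in practice. The labels are hypothetical; the point of the sketch is that the original dataframe is left unchanged.

```python
import pandas as pd

df = pd.DataFrame({"maths": [82, 75]}, index=["asha", "ben"])

# rename() returns a new dataframe; df itself keeps its old labels.
renamed = df.rename(index={"ben": "benjamin"}, columns={"maths": "mathematics"})

print(renamed)
```
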
You can use the following code to set a multilevel index in a dataframe:

dataframe.set_index([column_1, column_2])

To obtain data from such dataframes, you have to provide each row label as a tuple inside a list. You can go through the code provided below for reference:

dataframe.loc[[(label_1, sub_label_1), (label_1, sub_label_2)], [column_label_1, column_label_2]]
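
The multilevel index and the tuple-based lookup can be sketched together as follows. The year/quarter data is an assumption chosen only to give two index levels.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2023, 2023, 2024, 2024],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "sales": [100, 120, 130, 150],
    "costs": [70, 80, 85, 95],
})

# Two columns become the two levels of the row index.
indexed = df.set_index(["year", "quarter"])

# Each row label is a (year, quarter) tuple.
picked = indexed.loc[[(2023, "Q1"), (2024, "Q2")], ["sales"]]
print(picked)
```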

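You can use the following command to reshape a dataframe into a pivot layout:

df.pivot(columns='grouping_variable_col', values='value_col', index='grouping_variable_row')

pivot() only reshapes the data; it does not aggregate, and it raises an error if any index/column pair occurs more than once.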
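
As an illustration of pivot(), the sketch below turns a long table into a wide one. The city/year/sales data is invented, and every (city, year) pair appears exactly once, which pivot() requires.

```python
import pandas as pd

long_df = pd.DataFrame({
    "city": ["Pune", "Pune", "Goa", "Goa"],
    "year": [2023, 2024, 2023, 2024],
    "sales": [120, 150, 90, 110],
})

# One row per city, one column per year, cells filled from "sales".
wide = long_df.pivot(index="city", columns="year", values="sales")
print(wide)
```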

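Using the pivot_table() function, you can specify the aggregate function that Pandas applies to each value column. It can be the same for every column or different for each one.

df.pivot_table(values, index, aggfunc={'value_1': 'mean', 'value_2': ['min', 'max', 'mean']})

Function names passed as strings behave the same as np.mean and the built-in min and max, and they avoid the warnings that some recent Pandas versions raise for those callables.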
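
A per-column aggregation might be sketched as below. The data is hypothetical, and the aggregate functions are given as string names ('mean', 'min', 'max') rather than np.mean, min and max; this is a choice made for the sketch, not something the notes prescribe.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Pune", "Pune", "Goa", "Goa"],
    "sales": [120, 150, 90, 110],
    "units": [12, 15, 9, 11],
})

# A different set of aggregates for each value column.
summary = df.pivot_table(
    values=["sales", "units"],
    index="city",
    aggfunc={"sales": "mean", "units": ["min", "max", "mean"]},
)
print(summary)
```

Because one column has several aggregates, the resulting columns form two levels: the value column and the aggregate name.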

You can use the following command to merge two dataframes:

dataframe_1.merge(dataframe_2, on = ['column_1', 'column_2'], how = '____')

The how parameter in the code above specifies the type of merge to be performed:

- left: This keeps every key from the first dataframe and adds the matching entries from the second.

- right: This keeps every key from the second dataframe and adds the matching entries from the first.

- outer: This takes the union of the keys from both dataframes.

- inner: This takes the intersection of the keys from both dataframes. It is the default.

For left, right and outer merges, entries with no match in the other dataframe are filled with NaN.
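
The four merge types can be compared by checking which keys survive each one. The two small dataframes and the name keys_by_how are assumptions made for this sketch.

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

# Collect the keys that survive each merge type.
keys_by_how = {}
for how in ["left", "right", "outer", "inner"]:
    merged = left.merge(right, on="key", how=how)
    keys_by_how[how] = merged["key"].tolist()

print(keys_by_how)
```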

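You can add columns or rows from one dataframe to another using the concat() function:

pd.concat([dataframe_1, dataframe_2], axis = _)

Set axis = 0 to stack the dataframes as additional rows, or axis = 1 to place them side by side as additional columns.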
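
Both axis values can be tried on two small dataframes. The data is made up, and ignore_index=True is an optional extra used here to renumber the stacked rows.

```python
import pandas as pd

df_1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df_2 = pd.DataFrame({"a": [5, 6], "b": [7, 8]})

# axis=0 appends rows; axis=1 appends columns.
stacked = pd.concat([df_1, df_2], axis=0, ignore_index=True)
side_by_side = pd.concat([df_1, df_2], axis=1)

print(stacked.shape, side_by_side.shape)
```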
