Moderate Level Questions
Groupby Operations
Grouping data and performing aggregations are common operations in data
analysis. Here's how you can do it using Pandas:
Example DataFrame
Let's assume you have the following DataFrame:
import pandas as pd

data = {
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva', 'Frank', 'Grace', 'Helen', 'Ian', 'Jack'],
    'Age': [25, 30, 35, 40, 45, 50, 55, 60, 65, 70],
    'Salary': [50000, 55000, 60000, 65000, 70000, 75000, 80000, 85000, 90000, 95000],
    'Department': ['HR', 'IT', 'Finance', 'Marketing', 'Sales', 'HR', 'IT', 'Finance', 'Marketing', 'Sales']
}
df = pd.DataFrame(data)
1. Group Data by a Column and Calculate the Mean
To group data by a column and calculate the mean of another column, you can
use the groupby() method along with mean() :
# Group by 'Department' and calculate the mean 'Salary'
mean_salary_by_department = df.groupby('Department')['Salary'].mean()
print(mean_salary_by_department)
2. Find the Sum of a Numeric Column for Each Category in
Another Column
To find the sum of a numeric column for each category in another column, you
can use the groupby() method along with sum() :
# Group by 'Department' and calculate the sum of 'Salary'
total_salary_by_department = df.groupby('Department')['Salary'].sum()
print(total_salary_by_department)
Explanation
df.groupby('Department')['Salary'].mean() : Groups the DataFrame by the
'Department' column and calculates the mean of the 'Salary' column for
each group.
df.groupby('Department')['Salary'].sum() : Groups the DataFrame by the
'Department' column and calculates the sum of the 'Salary' column for
each group.
These operations allow you to summarize and analyze your data based on
specific categories. You can adjust the column names and aggregation
functions as needed for your specific use case.
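Grouping is not limited to a single aggregation: agg() can compute several statistics in one pass. Here's a minimal sketch using a small made-up DataFrame (not the one above):

```python
import pandas as pd

df = pd.DataFrame({
    'Department': ['HR', 'IT', 'HR', 'IT'],
    'Salary': [50000, 55000, 75000, 80000]
})

# agg() applies several aggregation functions at once;
# the result has one column per function
summary = df.groupby('Department')['Salary'].agg(['mean', 'sum', 'count'])
print(summary)
```

You can also pass a dict to agg() to apply different functions to different columns.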
Merging and Joining
Merging and joining DataFrames are essential operations for combining data
from different sources. Here's how you can perform these operations using
Pandas:
Example DataFrames
Let's assume you have the following two DataFrames:
import pandas as pd

# First DataFrame
df1 = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Department': ['HR', 'IT', 'Finance', 'Marketing', 'Sales']
})

# Second DataFrame
df2 = pd.DataFrame({
    'ID': [3, 4, 5, 6, 7],
    'Salary': [60000, 65000, 70000, 75000, 80000],
    'Department': ['Finance', 'Marketing', 'Sales', 'HR', 'IT']
})
1. Merge Two DataFrames on a Common Column
To merge two DataFrames on a common column, you can use the merge()
method:
# Merge df1 and df2 on the 'ID' column
merged_df = pd.merge(df1, df2, on='ID')
print(merged_df)
2. Perform an Outer Join on Two DataFrames
To perform an outer join on two DataFrames, you can use the merge() method
with the how='outer' parameter:
# Perform an outer join on df1 and df2 based on the 'ID' column
outer_join_df = pd.merge(df1, df2, on='ID', how='outer')
print(outer_join_df)
Explanation
pd.merge(df1, df2, on='ID') : Merges df1 and df2 on the 'ID' column. By default,
this performs an inner join, including only rows with matching 'ID' values in
both DataFrames.
pd.merge(df1, df2, on='ID', how='outer') : Performs an outer join, including all rows
from both DataFrames. Rows with no match will have NaN in the resulting
DataFrame.
These operations allow you to combine data from different DataFrames based
on common columns, enabling more comprehensive analysis. You can adjust
the column names and join types as needed for your specific use case.
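When debugging a merge, it can be hard to tell which rows matched. The indicator parameter of pd.merge() helps here; a small sketch with made-up frames:

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [2, 3, 4], 'Salary': [55000, 60000, 65000]})

# indicator=True adds a '_merge' column recording each row's origin:
# 'left_only', 'right_only', or 'both'
merged = pd.merge(df1, df2, on='ID', how='outer', indicator=True)
print(merged)
```

Filtering on the '_merge' column is a quick way to find unmatched rows after a join.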
Let's dive deeper into merging and joining DataFrames with more
examples and their outputs.
Example DataFrames
We'll use the following DataFrames for demonstration:
import pandas as pd

# First DataFrame
df1 = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Department': ['HR', 'IT', 'Finance', 'Marketing', 'Sales']
})

# Second DataFrame
df2 = pd.DataFrame({
    'ID': [3, 4, 5, 6, 7],
    'Salary': [60000, 65000, 70000, 75000, 80000],
    'Department': ['Finance', 'Marketing', 'Sales', 'HR', 'IT']
})
1. Inner Join (Default Merge)
An inner join returns only the rows with matching keys in both DataFrames.
# Inner join on 'ID'
inner_join_df = pd.merge(df1, df2, on='ID')
print(inner_join_df)
Output:
   ID     Name Department_x  Salary Department_y
0   3  Charlie      Finance   60000      Finance
1   4    David    Marketing   65000    Marketing
2   5      Eva        Sales   70000        Sales

Note that both DataFrames contain a 'Department' column, so pandas appends
the default suffixes _x and _y to distinguish them.
2. Outer Join
An outer join returns all rows from both DataFrames, filling in NaN for missing
matches.
# Outer join on 'ID'
outer_join_df = pd.merge(df1, df2, on='ID', how='outer')
print(outer_join_df)
Output:
   ID     Name Department_x   Salary Department_y
0   1    Alice           HR      NaN          NaN
1   2      Bob           IT      NaN          NaN
2   3  Charlie      Finance  60000.0      Finance
3   4    David    Marketing  65000.0    Marketing
4   5      Eva        Sales  70000.0        Sales
5   6      NaN          NaN  75000.0           HR
6   7      NaN          NaN  80000.0           IT
3. Left Join
A left join returns all rows from the left DataFrame and the matched rows from
the right DataFrame.
# Left join on 'ID'
left_join_df = pd.merge(df1, df2, on='ID', how='left')
print(left_join_df)
Output:
   ID     Name Department_x   Salary Department_y
0   1    Alice           HR      NaN          NaN
1   2      Bob           IT      NaN          NaN
2   3  Charlie      Finance  60000.0      Finance
3   4    David    Marketing  65000.0    Marketing
4   5      Eva        Sales  70000.0        Sales
4. Right Join
A right join returns all rows from the right DataFrame and the matched rows
from the left DataFrame.
# Right join on 'ID'
right_join_df = pd.merge(df1, df2, on='ID', how='right')
print(right_join_df)
Output:
   ID     Name Department_x  Salary Department_y
0   3  Charlie      Finance   60000      Finance
1   4    David    Marketing   65000    Marketing
2   5      Eva        Sales   70000        Sales
3   6      NaN          NaN   75000           HR
4   7      NaN          NaN   80000           IT
Explanation
Inner Join: Returns only the rows where there is a match in both
DataFrames.
Outer Join: Returns all rows from both DataFrames, filling in NaN where
there is no match.
Left Join: Returns all rows from the left DataFrame and the matched rows
from the right DataFrame.
Right Join: Returns all rows from the right DataFrame and the matched
rows from the left DataFrame.
These examples illustrate how you can merge DataFrames using different join
types to suit your data analysis needs.
Data Transformation
Data transformation is a crucial step in data preprocessing. Here's how you
can apply a custom function to a column and work with date columns in
Pandas:
Example DataFrame
Let's assume you have the following DataFrame:
import pandas as pd

data = {
    'ID': [1, 2, 3, 4, 5],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Joining_Date': ['2021-05-15', '2020-06-20', '2019-07-25', '2018-08-30', '2017-09-30'],
    'Salary': [50000, 55000, 60000, 65000, 70000]
}
df = pd.DataFrame(data)
1. Apply a Custom Function to a Column
To apply a custom function to a column, you can use the apply() method. Let's
say you want to increase each salary by 10%.
# Define a custom function to increase salary by 10%
def increase_salary(salary):
    return salary * 1.10

# Apply the custom function to the 'Salary' column
df['Salary'] = df['Salary'].apply(increase_salary)
print(df)
Output:
ID Name Joining_Date Salary
0 1 Alice 2021-05-15 55000.0
1 2 Bob 2020-06-20 60500.0
2 3 Charlie 2019-07-25 66000.0
3 4 David 2018-08-30 71500.0
4 5 Eva 2017-09-30 77000.0
2. Convert a Date Column to Datetime Format and Extract the
Year
To convert a date column to datetime format and extract the year, you can use
pd.to_datetime() and the dt accessor.
# Convert 'Joining_Date' to datetime format
df['Joining_Date'] = pd.to_datetime(df['Joining_Date'])
# Extract the year from 'Joining_Date'
df['Year'] = df['Joining_Date'].dt.year
print(df)
Output:
ID Name Joining_Date Salary Year
0 1 Alice 2021-05-15 55000.0 2021
1 2 Bob 2020-06-20 60500.0 2020
2 3 Charlie 2019-07-25 66000.0 2019
3 4 David 2018-08-30 71500.0 2018
4 5 Eva 2017-09-30 77000.0 2017
Explanation
df['Salary'].apply(increase_salary) : Applies the custom function increase_salary to each
element in the 'Salary' column.
pd.to_datetime(df['Joining_Date']) : Converts the 'Joining_Date' column to datetime
format.
df['Joining_Date'].dt.year : Extracts the year from the 'Joining_Date' column.
These operations allow you to transform and manipulate your data efficiently.
You can adjust the functions and column names as needed for your specific
use case.
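For simple arithmetic like the 10% raise above, a vectorized expression is usually faster than apply(), because it operates on the whole column at once instead of calling a Python function per element. A sketch with a small made-up DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'Salary': [50000, 60000]})

# Vectorized alternative to apply(increase_salary):
# multiply the whole column in one operation
df['Salary'] = df['Salary'] * 1.10
print(df)
```

Reserve apply() for logic that genuinely cannot be expressed with vectorized operations.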
Pivot Tables and Cross Tabulation
Pivot tables and cross-tabulations are powerful tools for summarizing and
analyzing data. Here's how you can create them using Pandas:
Example DataFrame
Let's assume you have the following DataFrame:
import pandas as pd

data = {
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva', 'Frank', 'Grace', 'Helen', 'Ian', 'Jack'],
    'Department': ['HR', 'IT', 'Finance', 'Marketing', 'Sales', 'HR', 'IT', 'Finance', 'Marketing', 'Sales'],
    'Salary': [50000, 55000, 60000, 65000, 70000, 75000, 80000, 85000, 90000, 95000],
    'Performance_Score': [4, 5, 3, 4, 5, 4, 3, 5, 4, 3]
}
df = pd.DataFrame(data)
1. Create a Pivot Table
A pivot table summarizes data by aggregating values based on one or more
keys.
# Create a pivot table to show the average salary by department and performance score
pivot_table = df.pivot_table(values='Salary', index='Department', columns='Performance_Score', aggfunc='mean')
print(pivot_table)
Output:
Performance_Score        3        4        5
Department
Finance            60000.0      NaN  85000.0
HR                     NaN  62500.0      NaN
IT                 80000.0      NaN  55000.0
Marketing              NaN  77500.0      NaN
Sales              95000.0      NaN  70000.0
2. Generate a Cross-Tabulation
Cross-tabulation shows the frequency distribution of two categorical variables.
# Generate a cross-tabulation of 'Department' and 'Performance_Score'
cross_tab = pd.crosstab(df['Department'], df['Performance_Score'])
print(cross_tab)
Output:
Performance_Score  3  4  5
Department
Finance            1  0  1
HR                 0  2  0
IT                 1  0  1
Marketing          0  2  0
Sales              1  0  1
Explanation
df.pivot_table(...) : Creates a pivot table with 'Department' as the index,
'Performance_Score' as the columns, and the mean of 'Salary' as the
values.
pd.crosstab(df['Department'], df['Performance_Score']) : Generates a cross-tabulation of
'Department' and 'Performance_Score', showing the frequency of each
combination.
These operations help you summarize and analyze your data effectively. You
can adjust the columns and aggregation functions as needed for your specific
use case.
Rolling and Window Functions
Rolling and window functions are useful for time series analysis and smoothing
data. Here's how you can compute a moving average and a cumulative sum
using Pandas:
Example DataFrame
Let's assume you have the following DataFrame with time series data:
import pandas as pd

data = {
    'Date': pd.date_range(start='2023-01-01', periods=10, freq='D'),
    'Value': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
}
df = pd.DataFrame(data)
1. Compute a Moving Average
To compute a moving average with a window size of 5, you can use the rolling()
method.
# Compute a moving average with a window size of 5
df['Moving_Average'] = df['Value'].rolling(window=5).mean()
print(df)
Output:
Date Value Moving_Average
0 2023-01-01 10 NaN
1 2023-01-02 20 NaN
2 2023-01-03 30 NaN
3 2023-01-04 40 NaN
4 2023-01-05 50 30.0
5 2023-01-06 60 40.0
6 2023-01-07 70 50.0
7 2023-01-08 80 60.0
8 2023-01-09 90 70.0
9 2023-01-10 100 80.0
2. Find the Cumulative Sum
To find the cumulative sum of a column, you can use the cumsum() method.
# Find the cumulative sum of the 'Value' column
df['Cumulative_Sum'] = df['Value'].cumsum()
print(df)
Output:
Date Value Moving_Average Cumulative_Sum
0 2023-01-01 10 NaN 10
1 2023-01-02 20 NaN 30
2 2023-01-03 30 NaN 60
3 2023-01-04 40 NaN 100
4 2023-01-05 50 30.0 150
5 2023-01-06 60 40.0 210
6 2023-01-07 70 50.0 280
7 2023-01-08 80 60.0 360
8 2023-01-09 90 70.0 450
9 2023-01-10 100 80.0 550
Explanation
rolling(window=5).mean() : Computes the moving average of the 'Value' column
with a window size of 5. The first few values are NaN because there aren't
enough preceding values to compute the average.
cumsum() : Computes the cumulative sum of the 'Value' column, which is the
sum of all previous values up to the current row.
These operations are useful for analyzing trends and patterns in time series
data. You can adjust the window size and column names as needed for your
specific use case.
Date Time Operations
Working with date columns often involves extracting components like day,
month, and year, as well as calculating differences between dates. Here's how
you can do these operations using Pandas:
Example DataFrame
Let's assume you have the following DataFrame with date columns:
import pandas as pd

data = {
    'Start_Date': pd.to_datetime(['2023-01-01', '2023-02-15', '2023-03-20', '2023-04-25']),
    'End_Date': pd.to_datetime(['2023-01-10', '2023-02-20', '2023-03-30', '2023-05-01'])
}
df = pd.DataFrame(data)
1. Extract the Day, Month, and Year
To extract the day, month, and year from a date column, you can use the dt
accessor.
# Extract day, month, and year from 'Start_Date'
df['Day'] = df['Start_Date'].dt.day
df['Month'] = df['Start_Date'].dt.month
df['Year'] = df['Start_Date'].dt.year
print(df)
Output:
Start_Date End_Date Day Month Year
0 2023-01-01 2023-01-10 1 1 2023
1 2023-02-15 2023-02-20 15 2 2023
2 2023-03-20 2023-03-30 20 3 2023
3 2023-04-25 2023-05-01 25 4 2023
2. Find the Difference in Days Between Two Date Columns
To find the difference in days between two date columns, you can subtract one
column from the other.
# Calculate the difference in days between 'End_Date' and 'Start_Date'
df['Difference_Days'] = (df['End_Date'] - df['Start_Date']).dt.days
print(df)
Output:
Start_Date End_Date Day Month Year Difference_Days
0 2023-01-01 2023-01-10 1 1 2023 9
1 2023-02-15 2023-02-20 15 2 2023 5
2 2023-03-20 2023-03-30 20 3 2023 10
3 2023-04-25 2023-05-01 25 4 2023 6
Explanation
dt.day , dt.month , dt.year : Extract the day, month, and year from the
'Start_Date' column.
(df['End_Date'] - df['Start_Date']).dt.days : Calculates the difference in days between
the 'End_Date' and 'Start_Date' columns.
These operations allow you to manipulate and analyze date data efficiently.
You can adjust the column names and operations as needed for your specific
use case.
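The dt accessor also exposes derived attributes such as the weekday name, and pd.Timedelta lets you shift dates arithmetically. A small sketch with made-up dates:

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(['2023-01-01', '2023-02-15']))

# dt.day_name() returns the weekday name for each date
print(dates.dt.day_name().tolist())

# Adding a Timedelta shifts every date by a fixed offset
shifted = dates + pd.Timedelta(days=7)
print(shifted.dt.strftime('%Y-%m-%d').tolist())
```

This combines naturally with the extraction examples above, e.g. grouping records by weekday or computing deadlines from a start date.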