Data Handling

Uploaded by

yajat kohli

Dealing with Duplicates
▪ The drop_duplicates() method in Pandas removes duplicate rows from a
DataFrame, comparing either all columns or only the columns named in its
subset parameter. By default, it compares entire rows, retains the first
occurrence of each, and removes any duplicates that follow. The keep
parameter changes which occurrence is retained.

Calling dataframe.drop_duplicates() with no arguments removes duplicate rows
while retaining the first occurrence of each.
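
A minimal sketch of that behaviour, assuming a small made-up DataFrame (the column names and values are illustrative and not taken from the original slides). It also shows the subset and keep parameters, which restrict the comparison to chosen columns and select which occurrence survives:

```python
import pandas as pd

# Hypothetical sample data; names and values are illustrative only.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Asha", "Cara", "Ben"],
    "city": ["Pune", "Oslo", "Pune", "Lima", "Rome"],
})

# Default: compare all columns and keep the first occurrence of each row.
# Row 2 repeats row 0 exactly, so it is dropped.
deduped = df.drop_duplicates()

# Compare only the "name" column and keep the last occurrence instead.
by_name = df.drop_duplicates(subset=["name"], keep="last")

print(deduped)
print(by_name)
```

drop_duplicates() returns a new DataFrame and keeps the original index labels; chaining reset_index(drop=True) renumbers the rows if that is preferred.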
Scaling Data (Normalization)
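
One common form of normalization is min-max scaling, which maps each column to the range 0 to 1 using (x - min) / (max - min). The sketch below assumes this is the kind of scaling meant here, and uses a made-up DataFrame with hypothetical column names:

```python
import pandas as pd

# Hypothetical numeric data; column names and values are illustrative only.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "weight_kg": [50.0, 60.0, 80.0, 90.0],
})

# Min-max normalization, applied column by column:
# each value becomes (x - column minimum) / (column maximum - column minimum).
normalized = (df - df.min()) / (df.max() - df.min())

print(normalized)
```

A column whose values are all identical has a range of zero, so this formula yields NaN for it; such columns need separate handling.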
Performing aggregation, summarizing, and grouping of data: see
https://www.geeksforgeeks.org/python/pandas-groupby-summarising-aggregating-and-grouping-data-in-python/

Understanding and creating boxplots for outlier detection: see
https://www.geeksforgeeks.org/python/finding-the-outlier-points-from-matplotlib/

Working with missing data in Pandas: see
https://www.geeksforgeeks.org/data-analysis/working-with-missing-data-in-pandas/
