Basic data manipulation with pandas
Introduction to Python for Data
Science
Replay
• OOP concepts (continued)
• Anaconda Installation
• Jupyter Notebook Interface
• Working and Use of Jupyter Notebook
• Shortcuts for cell moving and marking
Session 9: Basic data manipulation with pandas
Agenda
• What is Pandas
  • Series
  • DataFrame
  • Panel
• Installing Pandas
• Creating DataFrame
• Adding data to a DataFrame using the Append Function
• Getting Shape and information of the data
• Getting Statistical Analysis of Data
• Dropping Columns from Data
• Dropping Rows from Data
Pandas
• Pandas is a Python library used for working with data sets.
• It has functions for analyzing, cleaning, exploring, and
manipulating data.
• The name "Pandas" refers to both "Panel Data" and "Python Data
Analysis". The library was created by Wes McKinney in 2008.
Series
• A Pandas Series is a 1-dimensional, array-like structure. It stores its
elements along a single axis, and all elements share one data type (dtype).
• Note: The Pandas documentation describes a Series as value-mutable but
not size-mutable: the values stored in a Series can be changed, but its
length is treated as fixed, and most operations that add or remove
elements return a new Series.
Series Example
• A one-dimensional labeled array capable of holding any data type
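A minimal sketch of a Series, assuming made-up values and the index labels 'a' to 'd' (neither comes from the original slide). It shows the label-to-value mapping and that values can be changed in place:

```python
import pandas as pd

# Hypothetical sample values; the labels 'a'..'d' form the index.
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s)

# Values are mutable: assign through an existing label.
s['b'] = 25
print(s['b'])   # 25
print(len(s))   # 4, the length is unchanged
```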
DataFrame
• Pandas provides the DataFrame, a 2-dimensional structure resembling a
2-D array or table, in which the data is arranged in rows and columns.
• Note: The size of a DataFrame is mutable, i.e. rows and columns can be
added or removed after it is created.
DataFrame Example
• A two-dimensional labeled data structure with columns of potentially
different types.
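A short sketch of a DataFrame with columns of different types, assuming invented sample rows and column names. Adding the `passed` column afterwards illustrates that a DataFrame can grow after creation:

```python
import pandas as pd

# Hypothetical sample rows: one string, one integer, one float per row.
df = pd.DataFrame(
    [['Asha', 28, 88.5], ['Ben', 35, 92.0], ['Chen', 41, 79.5]],
    columns=['name', 'age', 'score'],
)
print(df.dtypes)  # each column gets its own dtype

# Columns can be added after creation.
df['passed'] = df['score'] >= 80
print(df)
```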
Panel
• Older versions of Pandas offered a Panel, a 3-dimensional data
structure with 3 axes:
  • items (axis 0): each item corresponds to one DataFrame contained in the Panel.
  • major_axis (axis 1): the rows of each DataFrame.
  • minor_axis (axis 2): the columns of each DataFrame.
Panel Data Example
• A three-dimensional data structure designed for handling 3D data.
• Panel was deprecated in Pandas 0.20.0 and removed in 0.25.0; users are encouraged to use multi-index DataFrames instead.
import numpy as np
import pandas as pd
# Runs only on pandas versions before 0.25; later versions have no pd.Panel.
panel = pd.Panel(np.random.rand(2, 3, 4), items=['Item1', 'Item2'], major_axis=['A', 'B', 'C'], minor_axis=['X1', 'X2', 'X3', 'X4'])
print(panel)
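Since `pd.Panel` is not available in current pandas releases, the same 2 x 3 x 4 data can be held in a multi-index DataFrame. The following is one possible sketch, not the only layout; the level names `item` and `major` are assumptions chosen for illustration:

```python
import numpy as np
import pandas as pd

data = np.random.rand(2, 3, 4)
items = ['Item1', 'Item2']
major = ['A', 'B', 'C']
minor = ['X1', 'X2', 'X3', 'X4']

# Combine the first two axes into a two-level row index (item, major).
index = pd.MultiIndex.from_product([items, major], names=['item', 'major'])
df = pd.DataFrame(data.reshape(6, 4), index=index, columns=minor)

# Selecting one item returns an ordinary 3 x 4 DataFrame.
print(df.loc['Item1'])
```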
Installation
There are various ways to install the Python Pandas module. One of the easiest is to
use pip, the Python package installer, by running "pip install pandas" in a terminal.
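A quick way to confirm that the installation worked is to import the library and print its version. The install command appears only as a comment because it is run in a terminal, not inside Python:

```python
# Install first from a terminal (not from Python):
#     python -m pip install pandas
import pandas as pd

# If the import succeeds, pandas is installed; the version string varies.
print(pd.__version__)
```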
Creating a DataFrame from a
dictionary
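A minimal sketch of building a DataFrame from a dictionary, assuming invented product data. Each key becomes a column name and each list becomes that column's values:

```python
import pandas as pd

# Hypothetical sample data: all lists must have the same length.
data = {
    'product': ['pen', 'notebook', 'eraser'],
    'price': [10.0, 45.5, 5.0],
    'quantity': [100, 40, 250],
}
df = pd.DataFrame(data)
print(df)
print(df.columns.tolist())  # ['product', 'price', 'quantity']
```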
Adding Data using Append Function
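Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions rows are added with `pd.concat` instead. A sketch with invented sample data:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({'product': ['pen', 'notebook'], 'price': [10.0, 45.5]})
new_rows = pd.DataFrame({'product': ['eraser'], 'price': [5.0]})

# pd.concat returns a new DataFrame; ignore_index=True renumbers rows 0..n-1.
df = pd.concat([df, new_rows], ignore_index=True)
print(df)
```

On pandas versions before 2.0, `df.append(new_rows, ignore_index=True)` gave the same result.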
Getting Shape and Information of
the Data
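The shape and a structural summary can be read with the `shape` attribute and the `info()` method. A small sketch, assuming invented sample data:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    'name': ['Asha', 'Ben', 'Chen'],
    'age': [28, 35, 41],
    'score': [88.5, 92.0, 79.5],
})

print(df.shape)  # (3, 3): number of rows, number of columns
df.info()        # prints column names, non-null counts, dtypes, memory usage
print(df.head(2))  # first two rows
```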
Getting Statistical Analysis of
Data
Statistical data analysis summarizes a data set by performing statistical
operations on it, such as computing counts, means, standard deviations, and
quartiles. It applies to quantitative data, for example numeric survey
responses and observational measurements.
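In pandas, the `describe()` method produces such a summary for every numeric column. A sketch with invented sample data:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    'age': [28, 35, 41, 30],
    'score': [88.5, 92.0, 79.5, 85.0],
})

# Rows of the result: count, mean, std, min, 25%, 50%, 75%, max.
summary = df.describe()
print(summary)

# Individual statistics are also available as methods.
print(df['score'].mean())  # 86.25
```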
Dropping Columns from Data
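Columns can be removed with the `drop` method and its `columns` parameter. A sketch with invented sample data; by default `drop` returns a new DataFrame and leaves the original unchanged:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    'name': ['Asha', 'Ben', 'Chen'],
    'age': [28, 35, 41],
    'score': [88.5, 92.0, 79.5],
})

trimmed = df.drop(columns=['score'])
print(trimmed.columns.tolist())  # ['name', 'age']
print(df.columns.tolist())       # ['name', 'age', 'score']
```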
Dropping Rows from Data
Rows can be dropped with the "drop" method by specifying their
index labels.
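A sketch of dropping rows by index label, assuming invented sample data with the default integer index. Filtering with a boolean condition is shown as a common alternative:

```python
import pandas as pd

# Hypothetical sample data; the default index labels are 0, 1, 2.
df = pd.DataFrame({
    'name': ['Asha', 'Ben', 'Chen'],
    'age': [28, 35, 41],
})

# Drop the row labeled 1; the original DataFrame is left unchanged.
remaining = df.drop(index=[1])
print(remaining)

# Alternative: keep only the rows that satisfy a condition.
under_40 = df[df['age'] < 40]
print(under_40)
```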
Q&A
Q What is a Pandas Series, and how is it
different from a DataFrame?
Q Explain the steps involved in installing
Pandas.
Q How can you get the shape, information,
and statistical analysis of a DataFrame?
