
Seaborn - Ipynb - Colaboratory

The document discusses Seaborn, a visualization library built on top of Matplotlib, which addresses common complaints about Matplotlib's usability and integration with Pandas DataFrames. It covers various types of plots such as histograms, pair plots, and box plots, demonstrating how to visualize data effectively with simple commands. Seaborn simplifies the process of creating informative and interactive visualizations, making it easier for users to represent their data.

Uploaded by

Arise and Shine

11/3/23, 5:23 AM Seaborn.ipynb - Colaboratory

Visualization with Seaborn

Matplotlib has proven to be an incredibly useful and popular visualization tool, but even avid
users will admit it often leaves much to be desired. There are several valid complaints about
Matplotlib that often come up:

Prior to version 2.0, Matplotlib's defaults were not exactly the best choices.
Matplotlib's API is relatively low level. Doing sophisticated statistical visualization is
possible, but often requires a lot of boilerplate code.
Matplotlib predates Pandas by several years, and thus was not designed for use with
Pandas DataFrames. In order to visualize data from a Pandas DataFrame, you must extract
each Series and often concatenate them together into the right format. It would be nicer
to have a plotting library that can intelligently use the DataFrame labels in a plot.

An answer to these problems is Seaborn. Seaborn provides an API on top of Matplotlib that
offers sane choices for plot style and color defaults, defines simple high-level functions for
common statistical plot types, and integrates with the functionality provided by Pandas
DataFrames.
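
As a minimal sketch of the DataFrame integration described above, the snippet below passes column names as strings and lets Seaborn look them up in the DataFrame. The data and the column names (height_cm, weight_kg, group) are made up for illustration and are not part of the original notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch also runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: two numeric columns and one categorical column.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "height_cm": rng.normal(170, 10, 60),
    "weight_kg": rng.normal(70, 8, 60),
    "group": np.repeat(["a", "b", "c"], 20),
})

# Column names are passed as strings; Seaborn resolves them against the DataFrame.
ax = sns.scatterplot(data=df, x="height_cm", y="weight_kg", hue="group")

# The axis labels are taken from the column names, with no manual labelling.
print(ax.get_xlabel(), ax.get_ylabel())
```

No Series has to be extracted by hand, and the hue column is used to colour the points by group.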

Exploring Seaborn Plots

Histograms, KDE, and densities


Often in statistical data visualization, all you want is to plot histograms and joint distributions of
variables. Seaborn makes this straightforward:

# Import the necessary libraries
import seaborn as sns
import numpy as np
import pandas as pd

# Generate a dataset of 200 random numbers from a standard normal distribution
x = np.random.randn(200)
x = pd.Series(x, name="Numerical Variable")

# Plot a histogram without the density estimate
sns.histplot(x, kde=False)


<matplotlib.axes._subplots.AxesSubplot at 0x7eff2f436e20>

By default, the kde parameter of seaborn.histplot is False. Setting kde=True computes a
kernel density estimate to smooth the distribution and draws it as a line over the histogram.

# Plot histograms with density estimate


sns.histplot(x, kde = True)

<matplotlib.axes._subplots.AxesSubplot at 0x7eff31a29c70>

# Plot the KDE alone (fill=False draws only the curve; older Seaborn releases called this parameter shade)
sns.kdeplot(x, fill=False)

<matplotlib.axes._subplots.AxesSubplot at 0x7eff3f3a4070>
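
To make the idea of a kernel density estimate concrete, here is a rough NumPy-only sketch: the estimate is an average of Gaussian bumps, one centred on each sample. The helper name kde_by_hand and the use of Scott's rule for the bandwidth are assumptions made for this illustration; Seaborn's own implementation may differ in its details.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200)

def kde_by_hand(samples, grid, bandwidth):
    """Average of Gaussian bumps of width `bandwidth`, one centred on each sample."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (len(samples) * bandwidth)

# Scott's rule of thumb for the bandwidth (one common choice, assumed here).
h = x.std(ddof=1) * len(x) ** (-1 / 5)

grid = np.linspace(-5, 5, 501)
density = kde_by_hand(x, grid, h)

# A density should integrate to roughly 1 over a wide enough grid.
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

A larger bandwidth gives a smoother curve, and a smaller one follows the individual samples more closely.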


Joint distributions can be drawn with sns.jointplot. Other parameters can be passed to
jointplot; for example, kind="hex" draws a hexagonally binned histogram instead of the
default scatter plot.

Pair plots
To plot multiple pairwise bivariate distributions in a dataset, you can use the sns.pairplot() function.

The result is a matrix of plots: the diagonal shows the univariate distribution of each variable, and
the off-diagonal panels show the relationship for each of the C(n, 2) pairs of numeric variables in a DataFrame.

iris = sns.load_dataset("iris")
iris.head()

   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

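Visualizing the multidimensional relationships among the samples is as easy as calling sns.pairplot:

# height sets the size of each panel in inches (older Seaborn releases called this parameter size)
sns.pairplot(iris, hue='species', height=2.5);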



Faceted histograms
Sometimes the best way to view data is via histograms of subsets. Seaborn's FacetGrid makes
this extremely simple. We'll take a look at some data that shows the amount that restaurant staff
receive in tips based on various indicator data:

tips = sns.load_dataset('tips')
tips.head()

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

# Plot the tip percentage as a histogram for each sex/time subset
import matplotlib.pyplot as plt
tips['tip_pct'] = 100 * tips['tip'] / tips['total_bill']

grid = sns.FacetGrid(tips, row="sex", col="time", margin_titles=True)
grid.map(plt.hist, "tip_pct", bins=np.linspace(0, 40, 15));
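
Conceptually, the faceting step splits the data into one subset per row/column combination and draws the same histogram for each. The sketch below reproduces that split with pandas and NumPy only, on a made-up stand-in for the tips data (the demo DataFrame and its random values are assumptions, not the real dataset).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the tips data, with random values.
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "sex": rng.choice(["Male", "Female"], size=n),
    "time": rng.choice(["Lunch", "Dinner"], size=n),
    "tip_pct": rng.uniform(5, 30, size=n),
})

# Same bin edges as in the notebook: 15 edges between 0 and 40 give 14 bins.
bins = np.linspace(0, 40, 15)

# One histogram per (sex, time) subset, which is what each facet displays.
counts = {
    key: np.histogram(group["tip_pct"], bins=bins)[0]
    for key, group in demo.groupby(["sex", "time"])
}

for key, c in counts.items():
    print(key, int(c.sum()))
```

Sharing the bin edges across subsets is what makes the panels directly comparable.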


Factor plots
Factor plots can be useful for this kind of visualization as well. Seaborn 0.9 renamed
factorplot to catplot, which is the name used in current releases. This allows you to view the
distribution of a parameter within bins defined by any other parameter:

A box-and-whisker plot, also called a box plot, displays the five-number summary of a set of
data: the minimum, first quartile, median, third quartile, and maximum. The box is drawn from
the first quartile to the third quartile, and a line across the box marks the median (a horizontal
line in the plots below, where the boxes are drawn vertically).
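
The five-number summary described above can be computed directly with NumPy. The values below are arbitrary numbers picked for the example, and np.percentile's default linear interpolation is assumed.

```python
import numpy as np

# Arbitrary example values (not taken from the tips dataset).
values = np.array([3.0, 7.0, 8.0, 5.0, 12.0, 14.0, 21.0, 13.0, 18.0])

# Minimum, first quartile, median, third quartile, maximum.
minimum, q1, median, q3, maximum = np.percentile(values, [0, 25, 50, 75, 100])
print(minimum, q1, median, q3, maximum)  # 3.0 7.0 12.0 14.0 21.0

# Interquartile range: the height of the box.
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles; points outside them count as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
print(lower_fence, upper_fence)  # -3.5 24.5
```

Note that Seaborn's box plots, by default, extend the whiskers only to the most extreme points within these 1.5 * IQR fences and draw anything beyond them as individual outlier markers, so the whisker ends do not always coincide with the minimum and maximum.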

# Box plot of the total bill per day, split by sex
sns.catplot(data=tips, x="day", y="total_bill", hue="sex", kind="box")
plt.show()



# Count plot: a bar chart of the number of records per day
sns.catplot(data=tips, x="day", kind="count")
plt.show()


# Violin plot of the total bill per day, split by sex
sns.catplot(data=tips, x="day", y="total_bill", hue="sex", kind="violin")
plt.show()



# Pie chart of party sizes (the size column)
plt.figure(figsize=[9, 7])
tips['size'].value_counts().plot.pie()
plt.show()

Advantages of Seaborn: With the Seaborn library, we can represent our data on a plot with very little code.

We do not need to take care of the internal details; we just pass our dataset to a function
such as relplot(), and it works out where to place each value.

Figure-level functions such as relplot() and catplot() let us switch to another representation
of the same data through the kind parameter.

The resulting plots are informative and easy for the reader to understand. They are static
Matplotlib figures by default; any interactivity comes from the Matplotlib backend in use rather
than from Seaborn itself.

Several Seaborn functions perform statistical aggregation when generating a plot. Because Seaborn
is built on Matplotlib, installing Seaborn also installs Matplotlib among its dependencies, and
Matplotlib's own functions remain available for customizing the plots further.
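
As a small illustration of switching representations through kind, the sketch below draws the same made-up data three ways with catplot. The DataFrame is synthetic; its column names only mimic the tips dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch also runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data shaped loosely like the tips dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "day": np.tile(["Thur", "Fri", "Sat", "Sun"], 25),
    "total_bill": rng.gamma(shape=4.0, scale=5.0, size=100),
})

# Same data and same call; only the kind argument changes.
grids = {
    kind: sns.catplot(data=df, x="day", y="total_bill", kind=kind)
    for kind in ["box", "violin", "bar"]
}
```

With kind="bar", the bar height is an aggregate of total_bill per day (the mean by default), which is one example of the statistical aggregation mentioned above.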
