Introduction to ggplot2 Cheat Sheet
Learn ggplot2 online at www.DataCamp.com

The grammar of graphics

The grammar of graphics is a framework for specifying the components of a plot. This approach of building plots in a modular way allows a high level of flexibility for creating a wide range of visualizations.

This cheat sheet covers the essentials of building data visualizations in ggplot2.

> Creating your first ggplot2 plot

A single line plot

Line chart
# Create a line plot in ggplot2
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_line()

ggplot() creates a canvas to draw on.
data is the data frame containing data for the plot. It contains columns named x_column and y_column.
aes() matches columns of data to aesthetics of the plot. Here, x_column is used for the x-axis and y_column for the y-axis.
geom_line() adds a line geometry. That is, it draws a line plot connecting each data point in the dataset.

> Geometries, attributes, and aesthetics in ggplot2

Geometries are visual representations of the data. Common geometries are points, lines, bars, histograms, boxes, and maps. The visual properties of geometries, such as color, size, and shape, can be defined as attributes or aesthetics.

Attributes are fixed values of visual properties of geometries. For example, if you want to set the color of all the points to red, then you would be setting the color attribute to red. Attributes must always be defined inside the geometry function.

# Create a red line plot in ggplot2
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_line(color = "red")

Aesthetics are values of visual properties of geometries that depend on data values. For example, if you want the color of the points to depend on values in z_column, then you would be mapping z_column to the color aesthetic.

Aesthetics can be defined inside the geometry function or inside ggplot(). The latter makes the aesthetics apply to all the geometries included in the plot.

# Create a line plot where lines are colored according to another column in ggplot2
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_line(aes(color = z_column))

Here are the most common aesthetic mappings and attributes you will encounter in ggplot2:

x      set or map the x-axis coordinate
y      set or map the y-axis coordinate
color  set or map the color or edge color
fill   set or map the interior (fill) color
size   set or map the size or width
alpha  set or map the transparency

Visualize relationships

Scatter plot
# Create a scatter plot with ggplot2
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_point()

Bar plot
# Create a bar plot with ggplot2
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_col()

Swap geom_col() for geom_bar(), and drop the y aesthetic, to calculate the bar heights from counts of the x values.

Lollipop chart
# Create a lollipop chart with ggplot2
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_point() +
  geom_segment(aes(x = x_column, xend = x_column, y = 0, yend = y_column))

Bubble chart
# Create a bubble plot with ggplot2
ggplot(data, aes(x = x_column, y = y_column, size = size_column)) +
  geom_point(alpha = 0.7) +
  scale_size_area()

In a bubble plot, "bubbles" can overlap, which can be mitigated by adjusting the transparency attribute, alpha. scale_size_area() makes the area of the points proportional to the values in size_column.

Visualize distributions

Histogram
# Create a histogram with ggplot2
ggplot(data, aes(x_column)) +
  geom_histogram(bins = 15)

Box plot
# Create a box plot with ggplot2
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_boxplot()

Violin plot
# Create a violin plot with ggplot2
ggplot(data, aes(x = x_column, y = y_column, fill = z_value)) +
  geom_violin()

Density plot
# Create a density plot with ggplot2
ggplot(data, aes(x = x_column)) +
  geom_density()

> Customizing visualizations with ggplot2

Manipulating axes

# Switching to logarithmic scale
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_point() +
  scale_x_log10() # or scale_y_log10()

# Reverse the direction of the axis
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_point() +
  scale_x_reverse()

# Square root scale
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_point() +
  scale_x_sqrt()

# Changing axis limits without clipping
# (min and max are placeholders for numeric limits)
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_point() +
  coord_cartesian(xlim = c(min, max),
                  ylim = c(min, max),
                  clip = "off")

# Changing axis limits with clipping
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_point() +
  xlim(min, max) +
  ylim(min, max)

Changing colors

# Change the outline color of a histogram geom
ggplot(diamonds, aes(price)) +
  geom_histogram(color = "red")

# Change the fill color of a histogram geom
ggplot(diamonds, aes(price)) +
  geom_histogram(fill = "blue")

# Add a gray color scale
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 4) +
  scale_color_grey()

# Change to other native color scales
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 4) +
  scale_color_brewer(palette = "Spectral")

Changing shape

# Change the shape of markers
ggplot(diamonds, aes(price, carat)) +
  geom_point(shape = 1)

shape = 1 makes the points open circles. Run example(points) to see the shape for each number.

# base_plot is the scatter plot defined under Manipulating labels and legends
# Change the shape radius
base_plot +
  scale_radius(range = c(1, 6))

# Change max shape area size
base_plot +
  scale_size_area(max_size = 4)

# Change the shape of markers based on a third column
# (z_column is a placeholder for a categorical column)
base_plot +
  geom_point(aes(shape = z_column), size = 2)

Changing fonts

# Change font family
base_plot +
  theme(text = element_text(family = "serif"))

# Change font size
base_plot +
  theme(text = element_text(size = 20))

# Change text angle
base_plot +
  theme(text = element_text(angle = 90))

# Change alignment with hjust and vjust
base_plot +
  theme(text = element_text(hjust = 0.7, vjust = 0.4))

Changing themes

# Minimal theme
base_plot + theme_minimal()

# Dark theme (high contrast)
base_plot + theme_dark()

# White background
base_plot + theme_bw()

# Classic theme
base_plot + theme_classic()
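
As a runnable sketch that pulls together several pieces from this sheet (a fixed attribute, a mapped aesthetic, and a theme), the following uses the built-in iris data set, the same one used in the color-scale examples. The object names p_attribute and p_aesthetic are illustrative and do not come from the original sheet.

```r
library(ggplot2)

# Attribute: every point gets the same fixed color, set inside the geom.
p_attribute <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(color = "red")

# Aesthetic: color depends on the Species column, mapped with aes().
# theme_minimal() is added only to show that a theme composes the same way.
p_aesthetic <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(aes(color = Species)) +
  theme_minimal()

print(p_attribute)
print(p_aesthetic)
```

The first plot draws all points in red and has no legend. The second draws one color per species and adds a legend, because the color now carries information from the data.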
Manipulating labels and legends

# A scatter plot that will be used throughout these examples
base_plot <- ggplot(data, aes(x = x_column, y = y_column, color = color_column)) +
  geom_point()

# Adding labels on the plot
base_plot + labs(x = 'X Axis Label', y = 'Y Axis Label', title = 'Plot title',
                 subtitle = 'Plot subtitle', caption = 'Image by author')

# Change the legend title of any mapped aesthetic
base_plot + labs(color = "Diamond depth", size = "Diamond table")

# Remove the legend
base_plot + theme(legend.position = "none")

# Change legend position outside of the plot. You can also pass "top", "right", or "left"
base_plot + theme(legend.position = "bottom")

# Place the legend into the plot area
# (in ggplot2 3.5.0 and later, use legend.position = "inside" with legend.position.inside = c(0.1, 0.7))
base_plot + theme(legend.position = c(0.1, 0.7))

> Faceting

Faceting breaks your plot into multiple panels, allowing you to compare different portions of your dataset side-by-side. For example, you can show data for each category of a categorical variable in its own panel.

facet_wrap()

# Facet the figures into a rectangular layout of two columns
base_plot + facet_wrap(vars(cut), ncol = 2)

# Facet the figures into a rectangular layout of two rows
base_plot + facet_wrap(vars(cut), nrow = 2)

# Facet into a rectangular layout but give axes free ranges (variable plot dimensions)
base_plot + facet_wrap(vars(cut), scales = "free")

# Facet by both columns and rows with facet_grid
base_plot +
  facet_grid(rows = vars(clarity), cols = vars(cut),
             space = "free", scales = "free")

> The most common visualizations in ggplot2

Capture a trend

Multi-line chart
# Create a multi-line plot with ggplot2
ggplot(data, aes(x_column, y_column, color = color_column)) +
  geom_line()

Swap the color aesthetic for the group aesthetic to make all lines the same color.

Area chart
# Create an area chart with ggplot2
ggplot(data, aes(x = x_column, y = y_column)) +
  geom_area()

Stacked area chart
# Create a stacked area chart with ggplot2
ggplot(data, aes(x = x_column, y = y_column, fill = z_column)) +
  geom_area()
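
The trend charts in this sheet use placeholder names (data, x_column, and so on). As a self-contained sketch, the following builds a small made-up data frame and draws it both as a multi-line chart and as a stacked area chart. The names trend_data, year, group, and value, and the numbers themselves, are assumptions for illustration only and do not come from the original sheet.

```r
library(ggplot2)

# Made-up example data: two groups measured over five years.
trend_data <- data.frame(
  year  = rep(2018:2022, times = 2),
  group = rep(c("A", "B"), each = 5),
  value = c(3, 4, 6, 5, 7, 2, 3, 3, 5, 6)
)

# Multi-line chart: one colored line per group.
p_lines <- ggplot(trend_data, aes(x = year, y = value, color = group)) +
  geom_line()

# Stacked area chart: geom_area() stacks the filled groups by default.
p_area <- ggplot(trend_data, aes(x = year, y = value, fill = group)) +
  geom_area()

print(p_lines)
print(p_area)
```

Replacing color = group with group = group in p_lines should draw both lines in the same color while still keeping them as separate lines.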