15.003 Software Tools — Data Science Afshine Amidi & Shervine Amidi
Study Guide: Data Visualization with R
Afshine Amidi and Shervine Amidi
August 21, 2020

General structure

Overview – The general structure of the code that is used to plot figures is as follows:
R
ggplot(...) +            # Initialization
  geom_function(...) +   # Main plot(s)
  facet_function(...) +  # Facets (optional)
  labs(...) +            # Legend (optional)
  scale_function(...) +  # Scales (optional)
  theme_function(...)    # Theme (optional)

We note the following points:
• The ggplot() layer is mandatory.
• When the data argument is specified inside the ggplot() function, it is used as default in the following layers that compose the plot command, unless otherwise specified.
• In order for features of a data frame to be used in a plot, they need to be specified inside the aes() function.

Basic plots – The main basic plots are summarized in the table below:

Type          Command
Scatter plot  geom_point(x, y, params)
Line plot     geom_line(x, y, params)
Bar chart     geom_bar(x, y, params)
Box plot      geom_boxplot(x, y, params)
Heatmap       geom_tile(x, y, params)

where the possible parameters are summarized in the table below:

Command   Description                       Use case
color     Color of a line / point / border  'red'
fill      Color of an area                  'red'
size      Size of a line / point            4
shape     Shape of a point                  4
linetype  Shape of a line                   'dashed'
alpha     Transparency, between 0 and 1     0.3

Maps – It is possible to plot maps based on geometrical shapes as follows:
[Figure: example map]

The following table summarizes the main commands used to plot maps:

Category             Action                                        Command
Map                  Draw polygon shapes from the geometry column  geom_sf(data)
Additional elements  Add and customize geographical directions     annotation_north_arrow(l)
                     Add and customize distance scale              annotation_scale(l)
Range                Customize range of coordinates                coord_sf(xlim, ylim)
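
Returning to the general structure and the basic plots above, the following is a minimal sketch of how the layers and appearance parameters fit together. It assumes the built-in mtcars data frame and its wt, mpg and cyl columns, none of which come from this guide; the parameter values are the use cases listed in the parameter table.

```r
library(ggplot2)

# Assumption: mtcars (shipped with R) stands in for the reader's own data.
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +  # Initialization
  # Fixed appearance values are passed outside aes().
  geom_line(color = "red", linetype = "dashed") +
  # A data frame column (cyl) must be mapped inside aes().
  geom_point(aes(color = factor(cyl)),
             size = 4, shape = 4, alpha = 0.3)
```

Both layers inherit data, x and y from ggplot(), as described in the second point of the overview. Calling print(p) renders the figure.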
Massachusetts Institute of Technology https://www.mit.edu/~amidi
Animations – Plotting animations can be made using the gganimate library. The following command gives the general structure of the code:
R
# Main plot
ggplot() +
  ... +
  transition_states(field, state_length)

# Generate and save animation
animate(plot, duration, fps, width, height, units, res, renderer)
anim_save(filename)

Advanced features

Facets – It is possible to represent the data through multiple dimensions with facets using the following commands:

Type            Command
Grid (1 or 2D)  facet_grid(row_var ~ column_var)
Wrapped         facet_wrap(vars(x1, ..., xn), nrow, ncol)

Text annotation – Plots can have text annotations with the following commands:

Command
geom_text(x, y, label, hjust, vjust)
geom_label_repel(x, y, label, nudge_x, nudge_y)

Additional elements – We can add objects on the plot with the following commands:

Type       Command
Line       geom_vline(xintercept, linetype)
           geom_hline(yintercept, linetype)
Curve      geom_curve(x, y, xend, yend)
Rectangle  geom_rect(xmin, xmax, ymin, ymax)

Last touch

Legend – The titles of the plot, axes and legends can be customized with the following command:
R
plot + labs(params)

where the params are summarized below:

Element                            Command
Title / subtitle of the plot       title = 'text' / subtitle = 'text'
Title of the x / y axis            x = 'text' / y = 'text'
Title of the size / color legend   size = 'text' / color = 'text'
Caption of the plot                caption = 'text'

This results in the following plot:
[Figure: resulting plot]
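
As a rough illustration of how the facet, additional-element, text-annotation and labs() commands above can be combined, the sketch below builds a single plot from the built-in mtcars data frame. The model column, the reference-line positions and every label string are assumptions made up for this example.

```r
library(ggplot2)

# Assumption: mtcars stands in for real data; model is a hypothetical label column.
cars <- mtcars
cars$model <- rownames(mtcars)

p <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl))) +
  # Additional elements: horizontal and vertical reference lines
  geom_hline(yintercept = 20, linetype = "dashed") +
  geom_vline(xintercept = 3, linetype = "dotted") +
  # Text annotation: one label per point, anchored at its top-left corner
  geom_text(aes(label = model), hjust = 0, vjust = 1, size = 2) +
  # Facets: one panel per value of gear, laid out on a single row
  facet_wrap(vars(gear), nrow = 1) +
  # Legend: titles for the plot, the axes and the color legend
  labs(
    title = "Fuel economy versus weight",
    subtitle = "One panel per number of gears",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    color = "Cylinders",
    caption = "Data: mtcars"
  )
```

Replacing facet_wrap(vars(gear), nrow = 1) with facet_grid(cyl ~ gear) would give the 2D grid layout instead.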
Plot appearance – The appearance of a given plot can be set by adding the following command:

Type             Command
Black and white  theme_bw()
Classic          theme_classic()
Minimal          theme_minimal()
None             theme_void()

In addition, theme() can adjust the positions and fonts of the elements of the legend.

Remark: in order to fix the same appearance parameters for all plots, the theme_set() function can be used.

Scales and axes – Scales and axes can be changed with the following commands:

Category   Action                                Command
Range      Specify range of x / y axis           xlim(xmin, xmax)
                                                 ylim(ymin, ymax)
Nature     Display ticks in a customized manner  scale_x_continuous()
                                                 scale_x_discrete()
                                                 scale_x_date()
Magnitude  Transform axes                        scale_x_log10()
                                                 scale_x_reverse()
                                                 scale_x_sqrt()

Remark: the scale_x_*() functions apply to the x axis. The same adjustments are available for the y axis with the scale_y_*() functions.

Double axes – A plot can have more than one axis with the sec.axis option within a given scale function scale_function(). It is done as follows:
R
scale_function(sec.axis = sec_axis(~ .))

Saving figure – It is possible to save figures with predefined parameters regarding the scale, width and height of the output image with the following command:
R
ggsave(filename, plot, scale, width, height)
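
The appearance, scale, double-axis and saving commands above can be sketched together as follows. This is only an illustration under assumptions: it uses the built-in mtcars data frame, a hypothetical output file name in a temporary directory, and an approximate conversion of 0.425 km per litre for one mile per US gallon on the secondary axis.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  # Magnitude: log-transform the x axis
  scale_x_log10() +
  # Double axes: the secondary y axis is a transformation of the primary one
  scale_y_continuous(
    name = "Miles per gallon",
    sec.axis = sec_axis(~ . * 0.425, name = "Kilometres per litre (approx.)")
  ) +
  # Plot appearance: predefined theme, then a legend adjustment via theme()
  theme_minimal() +
  theme(legend.position = "bottom")

# Saving figure: out_file is a hypothetical path chosen for this example.
out_file <- file.path(tempdir(), "mpg_vs_wt.png")
ggsave(filename = out_file, plot = p, scale = 1, width = 6, height = 4)
```

Passing filename and plot by name avoids relying on argument order; width and height are interpreted in inches by default.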