Eamonn Maguire, DPhil
Principles of Data Visualization
iCSC, CERN
March 2017
The role of visualization systems is to provide visual representations of datasets that
help people carry out tasks more effectively.
Visualization is suitable when there is a need to augment human capabilities
rather than replace people with computational decision-making methods.
Visualization
Tamara Munzner
A Visualization should
1. Save time
2. Have a clear purpose*
3. Include only the relevant content*
4. Encode data/information appropriately
* from Noah Iliinsky, http://complexdiagrams.com/
Course outcomes
The what? Major data types and classifications of them
The why? Why are we visualising at all?
The how? How can we visualize? What archetypes can we use to guide us?
Finally: case study? Given some data, how can we go about visualising it?
A lot of the content for this introduction comes from this book by Prof.
Tamara Munzner (UBC, Vancouver, Canada), for which I created the
illustrations.
If you’re interested in learning more, it’s a great book to check out from the
CERN library, or buy :)
External representation:
replace cognition with
perception
Visualization
Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEE TVCG (Proc. InfoVis) 14(6):1253-1260, 2008.
The summary statistics alone would lead us to
believe that the datasets are all the
same
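A well-known illustration of this point is Anscombe's quartet; whether the slide shows exactly this dataset is an assumption. The sketch below computes the mean of x, the mean of y, and the Pearson correlation for the four series. The rounded statistics come out identical, although the four scatter plots look nothing alike.

```python
from statistics import mean

# Anscombe's quartet: four x/y series with near-identical summary statistics.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (X4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(quartet):
    """Map each series name to (mean x, mean y, correlation), rounded to 2 decimals."""
    return {
        name: (round(mean(xs), 2), round(mean(ys), 2), round(pearson(xs, ys), 2))
        for name, (xs, ys) in quartet.items()
    }


summary = summarize(QUARTET)
for name, stats in summary.items():
    print(name, stats)
```

Only plotting the four series reveals that one is roughly linear, one is curved, one is a line with a single outlier, and one is a vertical cluster with one extreme point.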
Why visualize?
Analysis framework: Four levels, three questions
[Diagram: four nested levels, from outermost to innermost: Domain situation, Data/task abstraction, Visual encoding/interaction idiom, Algorithm]
• Domain situation
– who are the target users?
• Data/Task Abstraction
– translate from specifics of domain to vocabulary of vis
– What is shown? Data abstraction
– Why is the user looking at it? Task abstraction
• Visual Encoding
– How is it shown?
– visual encoding: how to draw
– interaction: how to manipulate
• Algorithm
– efficient computation, layout algorithms etc.
A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009).
A Multi-Level Typology of Abstract Visualization Tasks. Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013).
What are you visualising?
The branches of data visualization
Scientific Visualization: position is given. Also medical visualizations.
Information Visualization: position is derived. Incl. GeoVis.
Why are you visualising this?
Discover
Finding new insights in your data
This is typically where one should be careful in how information is presented.
An erroneous data encoding could bring about wrong conclusions.
Implies a level of interactivity to query, compare, correlate etc.
Present
Presenting your results, e.g. for a paper
Enjoy
Infographics, art, or superfluous visualizations.
Bear in mind that this was a non-interactive figure in a paper
How can you encode information optimally?
JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
2 10 4 5 6 9 1 3 5 3 4 7
[Figure: the same twelve monthly values drawn eight ways. Panels: Scatter; Line; Histogram; Area; Size; Saturation; Size & Saturation; Size, Saturation, & Position. The position-based panels share a y-axis from 0 to 10.]
And that’s just a really simple, low-dimensional example.
Moreover, all of these visualizations encode the same information,
but the decoding error (when interpreting, comparing, …) is different for each.
So, why?
Stevens, 1975
Our perception system does not behave linearly.
Some stimuli are perceived less or more than intended.
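Stevens' result is usually summarised as a power law, perceived sensation = (physical intensity) ^ n. The sketch below uses exponents that are commonly quoted for a few channels; treat the exact values as approximate assumptions rather than figures taken from these slides.

```python
# Commonly quoted, approximate exponents for Stevens' power law S = I ** n.
# These are illustrative assumptions, not precise constants.
EXPONENTS = {"length": 1.0, "area": 0.7, "brightness": 0.5}


def perceived_ratio(physical_ratio, exponent):
    """How much larger a stimulus appears when it is physical_ratio times larger."""
    return physical_ratio ** exponent


def compensate(perceived_target, exponent):
    """Physical ratio needed for the viewer to perceive perceived_target."""
    return perceived_target ** (1.0 / exponent)


for channel, n in EXPONENTS.items():
    print(f"{channel}: doubling the stimulus looks {perceived_ratio(2, n):.2f}x larger")
```

With an exponent below 1, as assumed here for area, a mark that is twice as big is perceived as less than twice as big, which is one reason length and position tend to be decoded more accurately than area.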
We have to be careful when mapping
data to the visual world
Some visual channels are more effective for some
data types than for others.
Some types of data have a natural mapping that our
brains expect.
There are many intricacies of the visual system that must be considered.
The pop-out effect
• Parallel processing on many individual channels
– speed independent of distractor count
– speed depends on channel and amount of difference from
distractors
• Serial search for (almost all) combinations
– speed depends on number of distractors
We pre-attentively process a scene, and some visual elements
stand out more than others.
Not all exhibit the pop-out effect!
Parallel line pairs do not pop out from tilted pairs…
And not all visual channels pop out as quickly as others, e.g. colour always comes out on top.
Relative Comparison
[Figure: comparing the lengths of bars (example bar labelled 36px). Panels: 4 values, unordered and unaligned; 11 values, unordered and unaligned; 4 values, unordered and aligned; 8 values, unordered and aligned; 8 values; 20 values.]
[Figure: three panels on colour variety, each comparing grouped and random layouts.
A) Known and Unknown Target Search: the target is shown beforehand (known) or not shown (unknown). The unique colour here is the orange square.
B) Subitizing (how many colours?): which grid has more colours, 7 or 8?
C) Response Time and Accuracy Results for Known Target Search, Unknown Target Search, and Subitizing, plotted against Colour Variety.]
How Capacity Limits of Attention Influence Information Visualization Effectiveness.
Haroz S. and Whitney D., IEEE TVCG 2012
Gestalt Laws
A. Law of Closure
B. Law of Similarity
C. Law of Proximity
D. Law of Connectedness
E. Law of Symmetry
F. Law of Good Continuation
G. Contour Saliency
H. Law of Common Fate
I. Law of Past Experience
J. Law of Prägnanz
K. Figure/Ground
Integral/Separable Dimensions
Width against Dimension X: A and B have the same width. However, B and C are perceived as more alike even though they have different widths and heights. Width and Height are integral dimensions.
Colour against Dimension X: A and B have the same colour and are perceived as more similar. Colour and Height are separable dimensions.
[Figure: channel pairs arranged on a scale from Fully Integral to Fully Separable. Panel labels: Width, Orientation, Colour, and Motion, each paired with Dimension X.]
We don’t see in 3D, and we have difficulties interpreting
information on the Z-axis.
There are many visual tricks that can be observed due to
how the visual system works
2D always wins…
Our visual system is not good at interpreting information on
the z-axis.
*3D is normally only used for exploration of inherently 3D information, such as
medical imaging data…
These examples, taken at random from Google image searches, show how widely 3D is abused in
information visualization. All of these charts manipulate our perception of the data by
using the z-axis to occlude information, which would be avoided in 2D.
http://cms-results.web.cern.ch/cms-results/public-results/preliminary-results/BPH-14-008/index.html
Colour
The simplest, yet most abused of all visual encodings.
The problem is that a smooth step in a value does not equate to a
smooth colour transition…
Colour
Additionally, colour is not equally binned in reality. We perceive
colours differently due to an increased sensitivity to the yellow part
of the spectrum…
[Figure: the visible spectrum by wavelength (nm), lying between UV and IR.]
https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/
Luminance is also not constant across colours, meaning some colours
will pop out more than others… and not always intentionally.
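The uneven brightness of a rainbow scale can be checked numerically. The sketch below samples a naive rainbow (fully saturated hues from red to blue, which is an assumption about how such a palette is built) and computes the relative luminance of each sample using the standard sRGB formula with Rec. 709 weights.

```python
import colorsys


def relative_luminance(r, g, b):
    """Relative luminance of an sRGB colour with components in [0, 1]."""
    def linearize(c):
        # Undo the sRGB transfer curve before weighting the channels.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)


def rainbow(steps):
    """Fully saturated hues from red (hue 0) to blue (hue 2/3)."""
    return [colorsys.hsv_to_rgb(i / (steps - 1) * (2 / 3), 1.0, 1.0) for i in range(steps)]


# Five samples: red, yellow, green, cyan, blue.
luminances = [relative_luminance(*rgb) for rgb in rainbow(5)]
for name, lum in zip(["red", "yellow", "green", "cyan", "blue"], luminances):
    print(f"{name:6s} {lum:.3f}")
```

The luminance rises sharply at yellow and collapses at blue, so equal steps in the data do not produce equal steps in perceived brightness.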
How we perceive changes in hue also varies across the spectrum.
Gregory compared the wavelength of light with the smallest observable difference
in hue (expressed as a wavelength difference).
Is this a good visualization?
But, grayscale would be just as useful and less visually distracting.
Is there a colour palette for scientific
visualization that works?
HSL linear L rainbow palette
Kindlmann, G., Reinhard, E. and Creem, S., 2002, Face-based Luminance Matching for Perceptual Colormap
Generation, IEEE Proceedings of the conference on Visualization ’02
https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/
But there are some in CMS that are already moving
away from the rainbow
http://cms-results.web.cern.ch/cms-results/public-results/publications/JME-13-004/index.html
There are also lots of default colour maps that can be applied to
particular data types: Binary, Diverging, Categorical, and Sequential.
http://colorbrewer2.org/
Semantic relevance
Or just consistency
When there are many colours, for example, we find it
difficult to remember abstract associations.
Selecting Semantically-Resonant Colors for Data Visualization
Sharon Lin, Julie Fortuna, Chinmay Kulkarni, Maureen Stone, Jeffrey Heer
Computer Graphics Forum (Proc. EuroVis), 2013
What are semantically resonant colours?
Semantic colouring is a good idea in theory, but
there are limited areas where this really works.
But if you are going to use colour, use a consistent
colour mapping. That way, decoding takes less time.
Saving time…
In general, CMS & other experiments are pretty good at
this, but there are some examples…
http://cms-results.web.cern.ch/cms-results/public-results/preliminary-results/SUS-16-029/CMS-PAS-SUS-16-029_Figure_003.png
What happens when we have a high number of
dimensions?
And that was just to represent a low number
of dimensions
• Parallel Coordinates
• Scatter Plot Matrices
• Glyphs
• Linked Plots
Multidimensional Visualization
Scatter Plot Matrices
Name      Height  Weight  Chol
John      1.76    63      4.5
Mike      1.79    70      4.15
Jim       1.61    60      6.7
Francois  1.84    90      5.03
…
[Figure: a scatter plot matrix with axes Height, Weight, and Cholesterol. John's row is placed first: (1.76, 63) in the Height/Weight panel and (1.76, 4.5) in the Height/Cholesterol panel.]
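The construction of a scatter plot matrix can be sketched as follows: every unordered pair of columns gets its own panel, and each row contributes one point to every panel. The helper name `splom_panels` is a hypothetical choice for this illustration; the rows are the four shown in the example table.

```python
from itertools import combinations

# The four example rows: (Height, Weight, Chol).
COLUMNS = ("Height", "Weight", "Chol")
ROWS = {
    "John": (1.76, 63, 4.5),
    "Mike": (1.79, 70, 4.15),
    "Jim": (1.61, 60, 6.7),
    "Francois": (1.84, 90, 5.03),
}


def splom_panels(columns, rows):
    """Return {(x_name, y_name): [(x, y), ...]} for every unordered column pair."""
    panels = {}
    for i, j in combinations(range(len(columns)), 2):
        panels[(columns[i], columns[j])] = [(row[i], row[j]) for row in rows.values()]
    return panels


panels = splom_panels(COLUMNS, ROWS)
for axes, points in panels.items():
    print(axes, points)
```

With d columns there are d(d-1)/2 distinct panels, so the matrix grows quadratically and becomes hard to read well before the number of dimensions gets large.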
Multidimensional Visualization
Visual Exploration of Large Structured Datasets. Wills. Proc. New Techniques and Trends in
Statistics (NTTS), pp. 237–246. IOS Press, 1995.
Linked Plots
Multidimensional Visualization
When one visualization won’t cut it…
Multidimensional Visualization
With dc.js, crossfilter, and d3.js
Multidimensional Visualization
My Tutorial on Creating Dashboard Visualizations
https://thor-project.github.io/dashboard-tutorial/
Parallel Coordinate Plots
Positive Correlation Negative (inverse) Correlation No Correlation
Multidimensional Visualization
Let's take an example where we have many variables to display...
2 Dimensions: each user (user a to user z) is represented by a circle
3 Dimensions: size indicates number of logins per day
4 Dimensions: color indicates the user's department
5 Dimensions: transparency indicates consistency in logins
As we get to higher numbers of dimensions, we’ll have problems. Our choice of visual encoding will affect the visual
availability of each dimension to the user.
Parallel coordinates are a visualization technique employed when a large number of dimensions need to be displayed
(often without a temporal element) and where each of those dimensions can be equally important in the
decision-making process.
In the scatter plots here, it’s easy to see the correlation between downloads and uploads,
but relationships involving the other dimensions are not so easy to spot.
Parallel Coordinate Plots
[Figure: a parallel coordinate plot built up step by step, first for 1 user and then for 11 users (user a to user z).]
2 Dimensions: Uploads, Downloads
3 Dimensions: Uploads, Downloads, Logins per day
4 Dimensions: Uploads, Downloads, Logins per day, Std. deviation in logins per day
5 Dimensions: Uploads, Downloads, Logins per day, Std. deviation in logins per day, Department
We use color for department since it’s categorical information.
We can keep adding more parallel axes, and comfortably have
around 20 dimensions for many users displayed at once.
Parallel Coordinate Plots
Clustering, Outlier Detection, Correlation, Distribution
Parallel coordinates provide an efficient way to visualize many variables, along with
their associated clusters, anomalies, value distributions and correlations.
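How correlation shows up between two adjacent axes can be sketched as follows: each axis is rescaled to [0, 1], each item becomes a polyline, and the fraction of line pairs that cross between two axes indicates the direction of the relationship. The data values and function names below are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def normalize(values):
    """Rescale values to [0, 1] so that every axis has the same height."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.5 for v in values]


def crossing_fraction(left, right):
    """Fraction of line pairs that cross between two adjacent axes."""
    pairs = list(combinations(range(len(left)), 2))
    crossings = sum(
        1 for a, b in pairs if (left[a] - left[b]) * (right[a] - right[b]) < 0
    )
    return crossings / len(pairs)


# Hypothetical values for five users; not taken from the slides.
uploads = [1, 2, 3, 4, 5]
downloads = [2, 4, 6, 8, 10]   # rises with uploads
idle_hours = [9, 7, 5, 3, 1]   # falls as downloads rise

axes = [normalize(uploads), normalize(downloads), normalize(idle_hours)]
polylines = list(zip(*axes))   # one tuple of axis heights per user

print(crossing_fraction(axes[0], axes[1]))  # no crossings: positive correlation
print(crossing_fraction(axes[1], axes[2]))  # every pair crosses: negative correlation
```

A fraction near one half means the lines cross more or less at random, which is what the "No Correlation" pattern looks like. Because only adjacent axes can be compared this way, the order of the axes matters.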
[Box plot]
• static item aggregation
• task: find distribution
• data: table
• derived data
– 5 quantitative attributes
– median: central line
– lower and upper quartile: boxes
– lower and upper fences: whiskers
– outliers beyond fence cutoffs explicitly shown
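The derived attributes behind a box plot can be computed in a few lines. This is a minimal sketch that assumes the common convention of placing the fences 1.5 interquartile ranges beyond the quartiles; the slide does not say which cutoff is used, and the sample values are hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values, whisker=1.5):
    """Median, quartiles, fences, and outliers (fences at whisker * IQR, an assumption)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "lower_quartile": q1,
        "upper_quartile": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]
summary = box_plot_summary(sample)
print(summary)
```

Many implementations draw the whiskers only as far as the most extreme data value inside each fence, so the drawn whisker can be shorter than the fence itself.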
Glyphs
Multidimensional Visualization
A Simple Example | Student Test Results
Multidimensional Visualization
Table
Math  Physics  English  Religion
85    95       71       65
90    80       60       50
65    50       90       90
50    40       95       80
40    60       80       90
[Figures: the table shown as a Scatter Plot Matrix, as Parallel Coordinates (axes Math, Physics, English, Religion, each scaled 0 to 100), and as a Glyph encoding Math, Physics, English, and Religion. The last build adds Teacher to the glyph.]
Arrange Spatially
What about topological data?
Graphs/Networks
Graphs/Networks
Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner,
Gardy, and Kincaid. IEEE TVCG (Proc. InfoVis) 14(6):1253-1260, 2008.
Graphs/Networks
https://cambridge-intelligence.com
Graphs/Networks
Matrix Representations
https://bost.ocks.org/mike/miserables/
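A matrix representation replaces node-link drawing with a grid: one row and one column per node, and a filled cell wherever two nodes are connected. The sketch below builds such a matrix for a small undirected graph; the node names and edges are hypothetical.

```python
def adjacency_matrix(nodes, edges):
    """Build a symmetric 0/1 matrix for an undirected graph."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected: mirror the cell
    return matrix


# Hypothetical four-node chain a - b - c - d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
matrix = adjacency_matrix(nodes, edges)

for name, row in zip(nodes, matrix):
    print(name, " ".join("#" if cell else "." for cell in row))
```

Unlike a node-link diagram, the matrix never suffers from edge crossings, but the patterns it reveals depend heavily on the order of the rows and columns.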
Graphs/Networks
Hive Plots
http://jsfiddle.net/7a7b5dwp/ http://jsfiddle.net/eamonnmag/vso70qnr/
Trees
[Figure: a dendrogram tool with the controls Cut Level (0.3, giving 4 Clusters), View as treemap, Dendrogram, and Search.]
• split by
neighbourhood
• then by type
• colour by price
• neighbourhood
patterns
– where it’s expensive
– where you pay much
more for detached
type
94
Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, andWood. IEEETransactions on
Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984.
Treemaps Partitioning
• switch order of splits
– type then neighbourhood
• switch colour
– by price variation
• type patterns
– within specific type, which neighbourhoods are inconsistent
Treemaps Partitioning
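The effect of switching the split order can be sketched as a nested group-by in which the order of the keys is a parameter. In the snippet below the sales records, the field names and the function partition are all hypothetical; the count per leaf stands in for rectangle size and the mean price for colour.

```python
from collections import defaultdict

# Made-up property sales: (neighbourhood, type, price). Not taken from the paper.
SALES = [
    ("North", "detached", 500),
    ("North", "flat", 200),
    ("South", "detached", 300),
    ("South", "flat", 180),
    ("South", "flat", 220),
]
FIELDS = {"neighbourhood": 0, "type": 1}


def partition(sales, order):
    """Group sales by the two fields named in `order`.

    Returns {outer_key: {inner_key: (count, mean_price)}}.
    """
    outer, inner = (FIELDS[name] for name in order)
    groups = defaultdict(lambda: defaultdict(list))
    for sale in sales:
        groups[sale[outer]][sale[inner]].append(sale[2])
    return {
        o: {i: (len(prices), sum(prices) / len(prices)) for i, prices in leaves.items()}
        for o, leaves in groups.items()
    }


by_area = partition(SALES, ["neighbourhood", "type"])
by_type = partition(SALES, ["type", "neighbourhood"])
print(by_area)
print(by_type)
```

The same records feed both hierarchies; only the nesting changes, which is what lets one layout answer neighbourhood questions and the other answer type questions. Colouring by price variation would replace the mean with a measure of spread.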
Validation & User Testing
Validating your visualisation…
Domain situation
Observe target users using existing tools
Visual encoding/interaction idiom
Justify design with respect to alternatives
Algorithm
Measure system time/memory
Analyze computational complexity
Observe target users after deployment ( )
Measure adoption
Analyze results qualitatively
Measure human time with lab experiment (lab study)
Data/task abstraction
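For the algorithm level, measuring system time and memory can be done with the Python standard library alone. The following is a minimal sketch: measure and toy_layout are hypothetical names, and toy_layout is only a stand-in for a real layout algorithm.

```python
import time
import tracemalloc


def measure(func, *args):
    """Run func(*args) once and return (result, seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    seconds = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return result, seconds, peak


def toy_layout(n):
    """Stand-in for a layout algorithm: place n items on a square grid."""
    side = int(n ** 0.5) + 1
    return [(i % side, i // side) for i in range(n)]


positions, seconds, peak = measure(toy_layout, 10_000)
print(f"{len(positions)} items, {seconds:.4f} s, peak {peak} bytes")
```

Memory tracing slows execution, so timings taken this way are best treated as relative; timing and memory can also be measured in separate runs.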
“Visualization can surprise you, but doesn't scale well.
Modelling scales well, but can't surprise you.”
Hadley Wickham
The elephant in the room… scaling up
One of the biggest challenges in HEP, biology, chemistry, and business is scale.
Our screens have a limited number of pixels, and our data is often much larger.
You can do two things to get around this.
The elephant in the room… scaling up
Solutions
GPU usage and WebGL
As done in imMens from Jeffrey Heer’s lab in Washington. Renders billions of data points at 50 fps in the browser.
Not yet compatible across all browsers, and not everyone has a dedicated GPU.
Reduce the problem space
Try to get users to focus queries as soon as possible to reduce the amount of data to be visualised.
Provide ways to aggregate information into meaningful overview visualisations…
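One common way to aggregate a large point set into an overview is to count points per cell of a fixed grid, so that the amount drawn depends on the number of bins rather than the number of points. The snippet below is a minimal sketch of that idea using random, made-up data; bin_counts is a hypothetical helper and is not a description of how imMens works.

```python
import random


def bin_counts(points, bins, lo=0.0, hi=1.0):
    """Count points per cell of a bins x bins grid.

    Assumes every coordinate lies in [lo, hi]; a value equal to hi is
    clamped into the last bin.
    """
    grid = [[0] * bins for _ in range(bins)]
    scale = bins / (hi - lo)
    for x, y in points:
        col = min(int((x - lo) * scale), bins - 1)
        row = min(int((y - lo) * scale), bins - 1)
        grid[row][col] += 1
    return grid


random.seed(0)  # made-up data for illustration only
points = [(random.random(), random.random()) for _ in range(100_000)]
grid = bin_counts(points, bins=4)
for row in grid:
    print(row)
```

Here 100,000 points collapse to 16 counts, which could be drawn as a heatmap. Focused queries can then fetch the raw points for a single cell only.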
More??
Visualization Analysis and Design. Munzner. A K Peters Visualization Series, CRC Press, 2014.
Further Links
Tutorial on D3 - http://antarctic-design.co.uk/biovis-workshop15/
Tutorial on Dashboard Visualizations - https://thor-project.github.io/dashboard-tutorial/
Visualization Survey Sites
Set Visualization - http://www.cvast.tuwien.ac.at/SetViz
Time Series Visualization - http://survey.timeviz.net/
Parallel Coordinates Visualization
Periodic Table of Visualizations
Data Vis Catalogue
Eamonn Maguire
CERN
Data Visualization and suggestions for CMS
CERN
August 25th 2016
Questions
@antarcticdesign
eamonnmag@gmail.com

Principles of Data Visualization

  • 1.
    Eamonn Maguire, DPhil Principlesof Data Visualization iCSC, CERN March 2017
  • 2.
    The role ofvisualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. Visualization Tamara Munzner A Visualization should 1. Save time 2. Have a clear purpose* 3. Include only the relevant content* 4. Encodes data/information appropriately * from Noel Illinsky, http://complexdiagrams.com/
  • 3.
    The role ofvisualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. Visualization is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods. Visualization Tamara Munzner A Visualization should 1. Save time 2. Have a clear purpose* 3. Include only the relevant content* 4. Encodes data/information appropriately * from Noel Illinsky, http://complexdiagrams.com/
  • 4.
    Course outcomes The what?Major data types and classifications of them The why? Why are we visualising at all? The how? How can we visualize? What archetypes can we use to guide us? Finally: case study? Given some data, how can we go about visualising it?
  • 5.
    A lot ofthe content for this introduction comes from this book from Prof. Tamara Munzner (UBC, Vancouver, Canada) which I created the illustrations for. If you’re interested in learning more, it’s a great book to check out from the CERN library, or buy :)
  • 6.
    The role ofvisualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. External representation: replace cognition with perception Visualization
  • 7.
    The role ofvisualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. External representation: replace cognition with perception Visualization
  • 8.
    Cerebral:Visualizing Multiple ExperimentalConditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEETVCG (Proc. InfoVis) 14(6):1253-1260, 2008.] The role of visualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. External representation: replace cognition with perception Visualization
  • 10.
    The statistics wouldlead us to believing that everything is the same The role of visualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. Why visualize?
  • 11.
    A Nested Modelof Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 12.
    • Domain situation ANested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 13.
    • Domain situation –who are the target users? A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 14.
    • Domain situation –who are the target users? • Data/Task Abstraction A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 15.
    • Domain situation –who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 16.
    • Domain situation –who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 17.
    • Domain situation –who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 18.
    • Domain situation –who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 19.
    • Domain situation –who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 20.
    • Domain situation –who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? • visual encoding: how to draw A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 21.
    • Domain situation –who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? • visual encoding: how to draw • interaction: how to manipulate A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 22.
    • Domain situation –who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? • visual encoding: how to draw • interaction: how to manipulate • Algorithm A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 23.
    • Domain situation –who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? • visual encoding: how to draw • interaction: how to manipulate • Algorithm – efficient computation, layout algorithms etc. A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  • 24.
    What are youvisualising?
  • 25.
    The branches ofdata visualization Information Visualization Scientific Visualization Position is given. Also medical visualizations Position is derived. Incl. GeoVis
  • 26.
    The branches ofdata visualization Information Visualization Scientific Visualization Position is given. Also medical visualizations Position is derived. Incl. GeoVis
  • 27.
    Why are youvisualising this?
  • 28.
    Discover Finding new insightsin your data Implies a level of interactivity to query, compare, correlate etc.
  • 29.
    Discover Finding new insightsin your data This is typically where one should be careful in how information is presented. An erroneous data encoding could bring about wrong conclusions. Implies a level of interactivity to query, compare, correlate etc.
  • 30.
    Discover Finding new insightsin your data This is typically where one should be careful in how information is presented. An erroneous data encoding could bring about wrong conclusions. Implies a level of interactivity to query, compare, correlate etc.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
    Enjoy Infographics, art, orsuperfluous visualizations.
  • 38.
    Enjoy Infographics, art, orsuperfluous visualizations. Bear in mind that this was a non-interactive figure in a paper
  • 39.
    Enjoy Infographics, art, orsuperfluous visualizations. Bear in mind that this was a non-interactive figure in a paper
  • 40.
    How can youencode information optimally?
  • 41.
    How can youencode information optimally?
  • 42.
    How can youencode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7
  • 43.
    How can youencode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 Scatter
  • 44.
    How can youencode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 Scatter Line
  • 45.
    How can youencode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 Scatter Line Histogram
  • 46.
    How can youencode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 Scatter Line Histogram Area JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10
  • 47.
    How can youencode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC Scatter Line Histogram Area Size JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10
  • 48.
    How can youencode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC Scatter Line Histogram Area Size Saturation JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10
  • 49.
    How can youencode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC Scatter Line Histogram Area Size Saturation Size & Saturation JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10
  • 50.
    How can youencode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC Scatter Line Histogram Area Size Saturation Size & Saturation JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 Size, Saturation, & Position JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5
  • 51.
    And that’s justa really simple low dimensional example Moreover, all of these visualizations encode the information, but the decode error (interpreting, comparing, …) for each is different So, why?
  • 52.
    Stevens, 1975 Our perceptionsystem does not behave linearly. Some stimuli are perceived less or more than intended.
  • 53.
    We have tobe careful when mapping data to the visual world Some visual channels are more effective for some data types over others.
  • 56.
    We have tobe careful when mapping data to the visual world Some visual channels are more effective for some data types over others. Some data has a natural mapping that our brains expect given certain types of data
  • 57.
  • 58.
    We have tobe careful when mapping data to the visual world Some visual channels are more effective for some data types over others. Some data has a natural mapping that our brains expect given certain types of data There are many intricacies of the visual system that must be considered
  • 59.
    The pop-out effect •Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  • 60.
    The pop-out effect •Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  • 61.
    The pop-out effect •Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  • 62.
    The pop-out effect •Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  • 63.
    The pop-out effect •Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  • 65.
    Not all exhibitthe pop-out effect!
  • 66.
    Not all exhibitthe pop-out effect! Parallel line pairs do not pop out from tilted pairs…
  • 67.
    Not all exhibitthe pop-out effect! Parallel line pairs do not pop out from tilted pairs… And not all visual channels pop out as quickly as other. E.g. colour is always on top.
  • 68.
  • 69.
  • 70.
  • 71.
  • 72.
    Relative Comparison 4 valuesUnordered Unaligned
  • 73.
    Relative Comparison 4 valuesUnordered Unaligned 11 values Unordered Unaligned
  • 74.
  • 75.
  • 76.
  • 77.
    Known Target Search ColourVariety Unknown Target Search Grouped Random Grouped Random Random Grouped Target shown before hand (known) or not shown (unknown). The unique colour here is the orange square. A) Known and Unknown Target Search C) Response Time and Accuracy Results Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Subitizing Grouped Random Random Grouped Which grid has more colours? 7 8 B) Subitizing (how many colours?) How Capacity Limits of Attention Influence Information Visualization Effectiveness. Haroz S. and Whitney D., IEEE TVCG 2012
  • 78.
    Known Target Search ColourVariety Unknown Target Search Grouped Random Grouped Random Random Grouped Target shown before hand (known) or not shown (unknown). The unique colour here is the orange square. A) Known and Unknown Target Search C) Response Time and Accuracy Results Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Subitizing Grouped Random Random Grouped Which grid has more colours? 7 8 B) Subitizing (how many colours?) How Capacity Limits of Attention Influence Information Visualization Effectiveness. Haroz S. and Whitney D., IEEE TVCG 2012
  • 79.
    Known Target Search ColourVariety Unknown Target Search Grouped Random Grouped Random Random Grouped Target shown before hand (known) or not shown (unknown). The unique colour here is the orange square. A) Known and Unknown Target Search C) Response Time and Accuracy Results Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Subitizing Grouped Random Random Grouped Which grid has more colours? 7 8 B) Subitizing (how many colours?) How Capacity Limits of Attention Influence Information Visualization Effectiveness. Haroz S. and Whitney D., IEEE TVCG 2012
  • 80.
    Known Target Search ColourVariety Unknown Target Search Grouped Random Grouped Random Random Grouped Target shown before hand (known) or not shown (unknown). The unique colour here is the orange square. A) Known and Unknown Target Search C) Response Time and Accuracy Results Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Subitizing Grouped Random Random Grouped Which grid has more colours? 7 8 B) Subitizing (how many colours?) How Capacity Limits of Attention Influence Information Visualization Effectiveness. Haroz S. and Whitney D., IEEE TVCG 2012
  • 82.
    A. Law ofClosure B. Law of Similarity D. Law of Connectedness E. Law of Symmetry [ ] { } ( ) C. Law of Proximity F. Law of Good Continuation G. Contour Saliency a c b a c b d H. Law of Common Fate I. Law of Past Experience J. Law of Pragnanz K. Figure/Ground Gestalt Laws
  • 83.
    Dimension X A andB have the same width. However B and C are perceived more alike even though they are different widths and heights. Width and Height are integral dimensions Dimension X Width Integral/Separable Dimensions
  • 84.
    Dimension X A andB have the same width. However B and C are perceived more alike even though they are different widths and heights. Width and Height are integral dimensions Dimension X Width Dimension X Colour A and B have the same colour and are perceived more similar. Colour and Height are Separable dimensions Integral/Separable Dimensions
  • 85.
    Fully separableFully Integral DimensionX Width Dimension X Orientation Dimension X Colour Dimension X Motion Dimension X Motion Dimension X Colour Integral/Separable Dimensions
  • 86.
    We don’t seein 3D, and we have difficulties interpreting information on the Z-axis. We have to be careful when mapping data to the visual world Some visual channels are more effective for some data types over others. Some data has a natural mapping that our brains expect given certain types of data There are many visual tricks that can be observed due to how the visual system works
  • 87.
    2D always wins… Ourvisual system is not good at interpreting information on the z-axis. *3D is normally only used for exploration of inherently 3D information, such as medical imaging data…
  • 88.
    These options, takenrandomly from google image searches so how widely 3D is abused in information visualization. All of these charts are manipulating our perception of the data by using the Z axis to occlude information…it would be avoided in 2D. 2D always wins…
  • 89.
  • 90.
  • 91.
  • 93.
    We don’t seein 3D, and we have difficulties interpreting information on the Z-axis. We have to be careful when mapping data to the visual world Some visual channels are more effective for some data types over others. Some data has a natural mapping that our brains expect given certain types of data There are many visual tricks that can be observed due to how the visual system works Colour
  • 94.
    The simplest, yetmost abused of all visual encodings. The problem is that a smooth step in a value does not equate to a smooth colour transition… Colour
  • 95.
    Additionally, colour isnot equally binned in reality. We perceive colours differently due to an increased sensitivity to the yellow part of the spectrum… Wavelength (nm) IRUV Visible Spectrum
  • 96.
    https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/ Luminosity is alsonot stable across the colours, meaning some colours will pop out more than others… and not always intentionally.
  • 97.
    https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/ Luminosity is alsonot stable across the colours, meaning some colours will pop out more than others… and not always intentionally.
  • 98.
    Gregory compared thewavelength of light with the smallest observable difference in hue (expressed as wavelength difference) And how we perceive changes in hue is also very different.
  • 99.
    Is this agood visualization?
  • 100.
    Is this agood visualization? But, grayscale would be just as useful and less visually distracting.
  • 102.
    Is there acolour palette for scientific visualization that works?
  • 103.
    HSL linear Lrainbow palette Kindlmann, G. Reinhard, E. and Creem, S., 2002, Face-based Luminance Matching for Perceptual Colormap Generation, IEEE Proceedings of the conference on Visualization ’02 https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/
  • 104.
    But there aresome in CMS that are already moving away from the rainbow http://cms-results.web.cern.ch/cms-results/public-results/publications/ JME-13-004/index.html
  • 105.
    Is there acolour palette for scientific visualization that works?
  • 106.
    HSL linear Lrainbow palette Kindlmann, G. Reinhard, E. and Creem, S., 2002, Face-based Luminance Matching for Perceptual Colormap Generation, IEEE Proceedings of the conference on Visualization ’02 https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/
  • 107.
    Binary Diverging Categorical Sequential Categorical Categorical There are alsolots of default colour maps that can be applied to particular data types. http://colorbrewer2.org/
  • 108.
    Semantic relevance Or justconsistency When there are many colours for example, we find it difficult to remember abstract associations.
  • 109.
    Selecting Semantically-Resonant Colorsfor Data Visualization Sharon Lin, Julie Fortuna, Chinmay Kulkarni, Maureen Stone, Jeffrey Heer Computer Graphics Forum (Proc. EuroVis), 2013 What are semantically resonant colours?
  • 110.
    Semantic colouring isa good idea in theory, but there are limited areas where this really works. But, if you are going to use colour, have a consistent colour mapping. That way, the decoding time is less. Saving time… In general, CMS & other experiments are pretty good at this, but there are some examples…
  • 113.
  • 114.
  • 115.
    What happens whenwe have a high number of dimensions? And that was just to represent a low number of dimensions
  • 116.
  • 117.
    Scatter Plot Matrices … HeightWeight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois Multidimensional Visualization
  • 118.
    Scatter Plot Matrices … HeightWeight CholesterolHeight Weight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois Multidimensional Visualization
  • 119.
    Scatter Plot Matrices … HeightWeight CholesterolHeight Weight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois 1.76 Multidimensional Visualization
  • 120.
    Scatter Plot Matrices … HeightWeight CholesterolHeight Weight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois 1.76 63 1.76 4.5 Multidimensional Visualization
  • 121.
    Scatter Plot Matrices … HeightWeight CholesterolHeight Weight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois 1.76 63 1.76 4.5 Height Weight Cholesterol Multidimensional Visualization
  • 122.
    Visual Exploration ofLarge Structured Datasets. Wills. Proc. New Techniques and Trends in Statistics (NTTS), pp. 237–246. IOS Press, 1995. Linked Plots Multidimensional Visualization
  • 123.
    When one visualizationwon’t cut it… Multidimensional Visualization
  • 124.
    With dc.js, crossfilter,and d3.js Multidimensional Visualization
  • 125.
    With dc.js, crossfilter,and d3.js Multidimensional Visualization
  • 126.
    With dc.js, crossfilter,and d3.js Multidimensional Visualization
  • 127.
    With dc.js, crossfilter,and d3.js Multidimensional Visualization
  • 128.
    With dc.js, crossfilter,and d3.js Multidimensional Visualization
  • 129.
    My Tutorial onCreating Dashboard Visualizations https://thor-project.github.io/dashboard-tutorial/
  • 130.
    Parallel Coordinate Plots PositiveCorrelation Negative (inverse) Correlation No Correlation Multidimensional Visualization
  • 131.
    Parallel Coordinate Plots PositiveCorrelation Negative (inverse) Correlation No Correlation Multidimensional Visualization
  • 132.
    Lets take anexample where we have many variables to display... Each user is represented by a circle user a user z user a user z 2 Dimensions 3 Dimensions Size indicates number of logins per day Parallel Coordinate Plots
  • 133.
    As we getto higher levels of dimensions, we’ll have problems. Our choice of visual encoding will affect the visual availability of each dimension to the user. 4 Dimensions Color indicates users department 5 Dimensions Transparency indicates consistency in logins user a user z user a user z Parallel Coordinate Plots
  • 134.
    Parallel coordinates area visualization technique employed when a large number of dimensions need to be displayed (often without a temporal element) and where each of those dimensions can be equally important in the decision making process. Not so easy to spotEasy to see In the scatter plots here, it’s easy to see correlation between downloads and uploads, but with the other dimensions that’s difficult.
  • 135.
    Positive Correlation Negative(inverse) Correlation No Correlation Parallel Coordinate Plots
  • 136.
  • 137.
  • 138.
  • 139.
    2 Dimensions Uploads Downloads 11 Users usera user z Parallel Coordinate Plots
  • 140.
    3 Dimensions Uploads Downloads Logins perday 11 Users Parallel Coordinate Plots
  • 141.
    4 Dimensions Uploads Downloads Logins perday Std. deviation in logins per day 11 Users 2 -2 Parallel Coordinate Plots
  • 142.
    5 Dimensions Uploads Downloads Logins perday Std. deviation in logins per day Department 11 Users We use color for department since it’s categorical information. 2 -2 We can keep adding more parallel lines, and comfortably have around 20 dimensions for many users displayed at once. Parallel Coordinate Plots
  • 143.
    Clustering Outlier DetectionCorrelationDistribution Parallel coordinates provide an efficient way to visualize many variables, along with their associated clusters, anomalies, value distributions and correlations. Parallel Coordinate Plots
  • 144.
    83 • static itemaggregation • task: find distribution • data: table • derived data – 4 quantitative attributes • median: central line • lower and upper quartile: boxes • lower upper fences: whiskers – outliers beyond fence cutoffs explicitly shown and kurtosis 20) and a bimo tion 0.31). Richer displays o multi-modality is particularly ! ! !! ! ! ! ! ! n s k mm!2024 !2024 Figure 4: From left to right: box Glyphs Multidimensional Visualization
  • 145.
  • 146.
    A Simple Example | Student Test Results. Multidimensional Visualization. Views shown: Table and Scatter Plot Matrix (axes: Math, Physics, English, Religion).
    Table (one row per student):
    Math  Physics  English  Religion
    85    95       71       65
    90    80       60       50
    65    50       90       90
    50    40       95       80
    40    60       80       90
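
A scatter plot matrix needs one panel per pair of attributes, so four subjects give six distinct panels. The sketch below enumerates those panels. It assumes the flattened numbers on the slide read column by column (five students per subject), which is an interpretation of the extracted text; the function name is hypothetical.

```python
from itertools import combinations

# Student scores, read column-wise from the slide (assumed layout).
scores = {
    "Math":     [85, 90, 65, 50, 40],
    "Physics":  [95, 80, 50, 40, 60],
    "English":  [71, 60, 90, 95, 80],
    "Religion": [65, 50, 90, 80, 90],
}

def splom_panels(columns):
    """Return {(x_name, y_name): [(x, y), ...]} for every unordered attribute pair."""
    return {
        (a, b): list(zip(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    }

panels = splom_panels(scores)
panel_count = len(panels)  # d * (d - 1) / 2 panels for d attributes
```

The panel count grows quadratically with the number of attributes, whereas a parallel coordinate plot needs only one axis per attribute.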
  • 147.
    Table: the same student scores (Math, Physics, English, Religion). Parallel Coordinates: one axis per subject (Math, Physics, English, Religion), each running from 0 to 100. A Simple Example | Student Test Results. Multidimensional Visualization
  • 148.
    Table: the same student scores (Math, Physics, English, Religion). Glyph: one glyph per student with parts for Math, Physics, English, Religion. A Simple Example | Student Test Results. Multidimensional Visualization
  • 149.
    Table: the same student scores (Math, Physics, English, Religion). Glyph: one glyph per student with parts for Math, Physics, English, Religion. [Figure labels: Teacher; Arrange Spatially] A Simple Example | Student Test Results. Multidimensional Visualization
  • 150.
  • 151.
  • 152.
    Graphs/Networks. [Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEE TVCG (Proc. InfoVis) 14(6):1253-1260, 2008.]
  • 153.
  • 154.
  • 155.
  • 156.
  • 157.
  • 158.
    Trees. [Interface labels: Cut Level 0.34; Clusters; View as treemap; Dendrogram; Search]
  • 159.
    Treemaps: Partitioning
    • split by neighbourhood
    • then by type
    • colour by price
    • neighbourhood patterns
      – where it’s expensive
      – where you pay much more for detached type
    Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, and Wood. IEEE Transactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984.
  • 160.
    Treemaps: Partitioning
    • switch order of splits
      – type then neighbourhood
    • switch colour
      – by price variation
    • type patterns
      – within specific type, which neighbourhoods are inconsistent
    Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, and Wood. IEEE Transactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984.
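
The effect of switching the split order can be sketched as a nested group-by. This is a minimal illustration, not the method from the cited paper: the `partition` function, the property records, and the choice of mean price as the value to colour by are all assumptions.

```python
from collections import defaultdict

def partition(records, split_order, value_key):
    """Nest records by the keys in split_order; leaves hold the mean of value_key."""
    if not split_order:
        return sum(r[value_key] for r in records) / len(records)
    key, rest = split_order[0], split_order[1:]
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    return {name: partition(group, rest, value_key) for name, group in groups.items()}

# Hypothetical property sales (assumption, for illustration only).
sales = [
    {"neighbourhood": "North", "type": "flat", "price": 200},
    {"neighbourhood": "North", "type": "detached", "price": 600},
    {"neighbourhood": "South", "type": "flat", "price": 100},
    {"neighbourhood": "South", "type": "detached", "price": 300},
    {"neighbourhood": "South", "type": "flat", "price": 140},
]

# Split by neighbourhood, then by type.
by_area = partition(sales, ["neighbourhood", "type"], "price")
# Switch the order of splits: type, then neighbourhood.
by_type = partition(sales, ["type", "neighbourhood"], "price")
```

The same records produce two different hierarchies. The first groups all types inside each neighbourhood, which makes neighbourhood patterns easy to read; the second places neighbourhoods side by side within each type.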
  • 161.
  • 162.
  • 163.
    Validating your visualisation…
    • Domain situation: observe target users using existing tools; measure adoption
    • Data/task abstraction: observe target users after deployment (field study)
    • Visual encoding/interaction idiom: justify design with respect to alternatives; analyze results qualitatively; measure human time with lab experiment (lab study)
    • Algorithm: measure system time/memory; analyze computational complexity
  • 164.
    “Visualization can surprise you, but doesn't scale well. Modelling scales well, but can't surprise you.” Hadley Wickham. The elephant in the room… scaling up
  • 165.
    One of the biggest challenges in HEP, biology, chemistry, and business is scale. Our screens have a limited number of pixels, and our data is often much larger. You can do two things to get around this. The elephant in the room… scaling up
  • 166.
    Solutions
    • GPU usage and WebGL: not yet compatible across all browsers, and not everyone has a dedicated GPU.
    • Reduce the problem space: try to get users to focus queries as soon as possible to reduce the amount of data to be visualised, as done in imMens from Jeffrey Heer’s lab in Washington, which renders billions of data points at 50 fps in the browser. Provide ways to aggregate information into meaningful overview visualisations…
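
One simple way to aggregate data into an overview is to count points per cell of a fixed grid, so that the amount drawn depends on the grid size rather than on the number of records. The sketch below assumes 2D points and a rectangular view; the function name and sample points are hypothetical, and it is not a description of how any particular system is implemented.

```python
def bin_counts(points, x_range, y_range, nx, ny):
    """Count points per cell of an nx-by-ny grid covering x_range and y_range."""
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # skip points outside the current view
        # Clamp so that points on the upper edge fall into the last cell.
        col = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        row = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        grid[row][col] += 1
    return grid

# Hypothetical points in the unit square.
points = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9), (0.6, 0.1), (1.0, 1.0)]
grid = bin_counts(points, (0.0, 1.0), (0.0, 1.0), nx=2, ny=2)
```

The resulting counts can be mapped to colour to form a density overview. Narrowing `x_range` and `y_range` corresponds to a user focusing the query on a smaller region.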
  • 167.
    Munzner. Visualization Analysis and Design. A K Peters Visualization Series, CRC Press, 2014. More??
  • 168.
    Further Links
    • Tutorial on D3 - http://antarctic-design.co.uk/biovis-workshop15/
    • Tutorial on Dashboard Visualizations - https://thor-project.github.io/dashboard-tutorial/
    • Visualization Survey Sites
      – Set Visualization - http://www.cvast.tuwien.ac.at/SetViz
      – Time Series Visualization - http://survey.timeviz.net/
      – Parallel Coordinates Visualization
    • Periodic Table of Visualizations
    • Data Vis Catalogue
  • 169.
    Eamonn Maguire, CERN. Data Visualization and suggestions for CMS. CERN, August 25th 2016. Questions: @antarcticdesign, eamonnmag@gmail.com