Relational Data Visualisation
Part II
Visualisation (COMP7016)
Dr Zhonglin (Jolin) Qu
Z.Qu@westernsydney.edu.au
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 1
Last Lecture
• Relational Data Visualisation – Part I
• Introduction to Graph Visualisation
• Tree Visualisation
– Connection Approach
– Enclosure Approach
– Connection+Enclosure Approach
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 2
Module Outline
• Relational Data Visualisation – Part II
– Graph and Graph Visualization
– Network Visualisation
– Graph Tools and Applications
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 3
GRAPHS AND
GRAPH TERMINOLOGIES
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 4
What’s a Graph can be Used?
• Any relational data sets
– i.e. most of data sets can be modelled as a graph
• Examples:
– US telephone system
– World Wide Web
– Distribution network for on-line retailer
– Call graph of a large software system
– Semantic map in an AI algorithm
– Set of connected friends
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 5
Graph Visualization Problems
• Graph layout and positioning
– Make a concrete rendering of abstract graph
– None of the layouts are the best
• Scale
– Small graphs is not a problem
– Large graphs is a big challenge
• Navigation/Interaction
– How to support user changing focus and moving
around the graph
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 6
Layout Heuristics
• Layout algorithms can be
– planar
– grid-based
– orthogonal
– curved lines
– hierarchies
– circular
– ...
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 7
Graph Terminologies
• Graph G = (V, E)
– V: set of vertices (objects) or
nodes
– E: set of edges connecting
vertices (relationships)
• A simple graph G(V,E) is a
graph which contains no
multi-edges and no loops Not a Simple Graph
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 8
Graph Terminologies (cont.)
• A Path G contains only edges
that can be consecutively
traversed
• A Tree G contains no cycles A Path
• A Network G contains cycles
A Tree
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 9
Graph Terminologies (cont.)
• Unconnected graph – Includes
1 or more edges traversal
starting from a given vertex
cannot reach any other vertex. Unconnected Graph
• Articulation points – are
vertices, which if deleted from
the graph, would break up the
graph in multiple sub-graphs.
Articulation Point
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 10
GRAPH VISUALISATION AND
GRAPH DRAWING
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 11
Hierarchical Drawings, Sugiyama
• directed acyclic graphs, DAGs
K. Sugiyama, S. Tagawa, and M. Toda IEEE Trans SCM 1981
(1) break cycles
(2) compute layering, the Y-coordinates
and insert dummy nodes for long-span edges
(3) crossing reduction
repeat
down phase: sort next layer
placement on lower layer
up phase: sort previous layer
placement on upper layer
until DONE
(4) routing of the edges
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 12
Hierarchical Drawings, Sugiyama (cont)
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 13
Pros and Cons
• Pros:
– Most visual edges flow in the same direction
– Number of their intersections are minimized
– Show the flow and hierarchy very clearly
• Cons:
– Not well suit for general graphs
– Not scale well on large graphs
• Often chosen for the arrangement of flow
diagrams, process charts and workflows
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 14
Force Directed Methods
Idea: a Spring model (Originated by Peter
Eades, USYD)
select optimal edge length (node distance) k
repeat
for each node v do
for each pair of nodes (u, v)
k2
compute repulsive force fr(u,v) = - c•
d(u, v)
for each edge e = (u,v)
d(u, v)2
compute attractive force fa(u,v) = c•
k
sum all force vectors F(v) = ∑ fr(u,v) + ∑ fa(u,v)
move node v according to F(v)
until DONE
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 15
Forces Directed Methods
• Attractive forces
– along each edge
– proportional to shortest paths
• Repulsive forces
– between each pair of nodes (O(n2) pairs, costly!)
– only between closely related nodes (hash grid)
• Other forces
– center of gravity (attractive)
– underlying magnetic fields (concentric, radial, horizontal)
– angular forces (between adjacent edges at nodes v)
– from the boundary (repulsive; bounce back)
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 16
Forces Directed Methods
Demo
https://www.amcharts.com/demos/force-directed-network/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 17
Pros and Cons
• Pros:
– Well suite for general graphs with different
variations
– Good layout
– Easy to implement
• Cons:
– Layout is not predictable
– Slow in computational cost O(N 3) or O(N2log(N))
– Too much twisting to achieve optimal layout
– Not scale well on large graphs of more than Image generated by:
thousands of nodes http://mbostock.github.io/d3/talk/20111116/for
ce-collapsible.html
• Widely use in practice
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 18
Big Graphs
• Difficult to keep all in memory
• Not effective to show all nodes
• Often visualized as “hairballs”
• Solutions
– Structural clustering or
– Abstraction and aggregation
– Show a high-level overview of topology
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 19
Address Computational Scalability –
Multilevel Approaches
[Schulz 2004]
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 20
Collapsible Nodes
Collapsed
nodes
http://mbostock.github.io/d3/talk/20111116/force-collapsible.html
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 21
Collapsible Nodes
Collapsed
nodes
https://www.amcharts.com/demos/collapsible-force-directed-tree/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 22
Address Computational Scalability –
Abstraction/Aggregation
Cytoscape.org
750 nodes
30,000 nodes
18 nodes 90 nodes
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 23
Aesthetics
• What is a nice drawing ?
• What makes drawings understandable or readable?
• How can we measure quality?
• Can we formalise aesthetics ?
• A proverb
”A picture is worth a thousand words“
• R. Feynman (Nobel prize in Physics)
”It’s all visual“
• R.A. Earnshaw (a poineer in computer graphics, 1973)
”visualization uses interactive compute graphics to help provide
insight on complicated problems, models or systems“.
”Scientific visualization is exploring data and information
graphically, gaining understanding and insights into the data“
• R. Hamming (1973)
"the purpose of computing is insight not numbers"
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 24
Aesthetic Criteria in Graph Drawing
• Edge Crossings: minimise towards planar
• Totall Edge Length: minimise towards
proper scale
• Area: minimise towards efficiency
• Reducing maximum Edge Length: minimise
longest edge
• Uniform Edge Lengths: minimise variances
• Totall Bends: minimise orthogonal towards
straight-line
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 25
Drawing Styles
• polyline drawings
reduce bends, no sharp angles, polish by with Bezier splines
• straight-line or curve
uniform (short) edge length
• orthogonal drawings
minimize bends
• planar drawings
minimize crossings and bends
• grid embeddings
grid coordinates for nodes and bend-points
• visibility
horizontal bar nodes and vertical visibility
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 26
Node Attributes
• Colouring
• Positions
• Size
• Shapes
• Multiple Views
• Etc.
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 27
Edge Attributes
• Colouring
• Positions
• Size
• Shapes
• Thickness
• Drawing style
– Straight line
– Curve (edge
bundling)
– Orthogonal
Demo: http://mbostock.github.io/d3/talk/20111116/bundle.html
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 28
Matrix Representations
• Use adjacency matrix to show the relations
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 29
Matrix Representations - Exercise
• Show the adjacency matrix for the following graph
A A B C D E F G
B C A
B
C
D E
D
E
F G F
G
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 30
Resources
• G. DiBattista, P. Eades, R. Tamassia, I.G. Tollis (1999), Draph Drawing, Prentice
Hall.
• M. Kaufmann, D. Wagner (eds)., Drawing Graphs: Methods and Models, LNCS
2025, Springer Verlag, 2001
• Herman, I, Melancon, G & Marshall, MS (2000), 'Graph Visualization in
Information Visualization: a Survey', IEEE Transactions on Visualization and
Computer Graphics, vol. 6, pp. 24-44
• Jünger, M., Mutzel, P. (2004), Graph Drawing Software, Springer-Verlag, ISBN
978-3-540-00881-1.
• Kaufmann, Michael; Wagner, Dorothea, eds. (2001), Drawing Graphs: Methods
and Models, Lecture Notes in Computer Science 2025, Springer-Verlag,
doi:10.1007/3-540-44969-8.
• Tamassia, Roberto, ed. (2014), Handbook of Graph Drawing and Visualization,
CRC Press.
• Tutorial: www.cs.brown.edu/people/rt/papers/gd-tutorial/gd-constraints.pdf
• Annual Graph Drawing Conference
• http://www.readwriteweb.com/archives/the_best_tools_for_visualization.php
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 31
Graph Visualisation – A Case Study
• NicholasChristakis_2010-480p.mp4
• https://www.ted.com/talks/nicholas_christaki
s_the_hidden_influence_of_social_networks?
subtitle=en
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 32
NETWORK VISUALISATION
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 33
What’s common in Network Visualisation?
• A Network is a type Graph
• Usually refer to real applications with large
size and high complexity
– Millions of items and connections
• Structural behaviour is usually unknown
• Relationships are usually non-directional
• Aesthetic Criteria is not usually considered as
the most important criteria for the design
• Usually accompany with an interaction
technique for the exploration
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 34
Visualisation of the Internet
How many web pages that the Internet has? ≈ 50 billion Web Pages
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 35
http://www.worldwidewebsize.com /
Facebook
2.9+ Billion Users in 2023
Facebook network visualisation of +1000 friends.
http://kimoquaintance.com/2011/08/22/what-can-we-learn-about-somalis-from-their-facebook-networks/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 36
Facebook
2.9+ Billion Users in 2023
Network of 340 friends with clear community clusters and several significant friends identified by betweenness centrality weighting of node size
http://kimoquaintance.com/2011/08/22/what-can-we-learn-about-somalis-from-their-facebook-networks/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 37
And Many More Networks
• Citation Network: 250+ million Articles
• : 500+ million Articles per day
• : 300+ million active Users
• : mobile phone network 100+ million
users
• Protein-protein interactions: 200 million
possible interactions in human genome
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 38
Small-World Networks
• A small-world network is a type of
mathematical graph in which most nodes are
not neighbours of one another, but most
nodes can be reached from every other by a
small number of hops or steps.
• A small world network, where nodes
represent people and edges connect people
that know each other, captures the small
world phenomenon of strangers being linked
by a mutual acquaintance.
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 39
Small-World Networks (cont.)
• It needs 6-steps to reach an unknown item
• Many empirical graphs are well modeled by
small-world networks.
– Social networks
– The connectivity of the Internet
– Gene networks
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 40
World Trade in 1992
https://slideplayer.com/slide/6352926/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 41
Connected Paper Visualisation
https://www.connectedpapers.com/main/d1f0f33c197dc9932c4a2e42eb1a8a60635bb401/Explainable-artificial-intelligence%3A-
a-comprehensive-review/graph
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 42
A Social Network Visualisation
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 43
CorpusViz - A Visualisation Showing the
Communication among Adults and Children
Tran, J., Nguyen, Q. V., Jones, C., Hendery, R., & Simoff, S. (2017). CorpusViz: Child and Adult Speech Visualisation. Paper
presented at the 21st International Conference on Information Visualisation, London
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 44
Visualisation of a very large Network
Problem?
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 45
How to due with large Networks or
Graphs?
• Partition a graph into a hierarchy of sub-graphs
– Simplify the complex structures of large graphs through the global
abstraction for easy interpretation, perception and navigation of large
information spaces.
• Improve the layout algorithms to utilise display space allowing
more nodes and information to be displayed.
– More information can be displayed on very high-resolution and large
screen, but it does not necessarily provide very much more
information into the brain [Ware 2004]
– Investigation of space-efficient techniques for visualising large
datasets could be more effective and economical than the use of
expensive large display devices.
• Apply Interaction and Navigation
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 46
Visualisation of a very large Network
Combined Clustering & Space-Efficient
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 47
Visualisation of a very large Network
Combined Clustering & Space-Efficient
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 48
Handling Large Graphs / Networks
Graph Nodes Edges
YahooWeb 1.4 Billion 6 Billion
Symantec Machine-File 1 Billion 37 Billion
Graph
Twitter 104 Million 3.7 Billion
Phone call network 30 Million 260 Million
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 49
Step 1: Storing Large Graphs/Networks
• On your laptop computer
– SQLite
– Neo4j (GPL license)
• On a server
– MySQL, PostgreSQL, etc.
• With a cluster
– Hadoop (generic framework)
– HBase, inspired by Google’s BigTable
– Hama, inspired by Google’s Pregel
– FlockDB, by Twitter
• Comparison of “graph databases”
– http://nosql.mypopescu.com/post/40759505554/a-comparison-
of-7-graph-databases
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 50
Storing Large Graph on a PC - SQLite
• Light weight
• Easily handle up to gigabytes"
• Roughly tens of millions of nodes/edges (perhaps up
• to billions?). Very good! For today’s standard.
• Very easy to maintain: one cross-platform file
• Has programming wrappers in numerous languages
• C++, Java (Andriod), Python, Objective C (iOS),...
• Queries are so easy!
– e.g., find all nodes’ degrees = 1 SQL statement"
• Bonus: SQLite even supports full-text search
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 51
SQLite – Graph Scheme
• Simplest schema:
– edges(source_id, target_id)
• More sophisticated (flexible; lets you store more things):
CREATE TABLE nodes (
id INTEGER PRIMARY KEY,
type INTEGER DEFAULT 0,
name VARCHAR DEFAULT ' ');
CREATE TABLE edges (
source_id INTEGER,
target_id INTEGER,
type INTEGER DEFAULT 0,
weight FLOAT DEFAULT 1,
timestamp INTEGER DEFAULT 0,
PRIMARY KEY(source_id, target_id, timestamp));
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 52
Step 2: Data Mining or Graph Mining
• Analyse it first by data mining or graph mining.
• If the dataset is small, we can visualise it first
• Does it follow any expected patterns? Or does it
NOT follow some patterns (outliers)?
– Why does this matter?"
– If we know the patterns (models), we can do
prediction, recommendation, etc.
• e.g., is Alice going to “friend” Bob on Facebook?
– People often buy beer and diapers together.
– Outliers often give us new insights
• e.g., telemarketer’s friends don’t know each other
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 53
Step 3: Visualising the Results
• Visualise data mining results
• Visualise the graph using mining attributes
– E.g. filtering, attributes, importance
• Hierarchical visualisation based on clusters
from clustering process
– Coarse to granularity
• Aims: find or verify patterns or (ab)normality
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 54
GRAPH TOOLS &
APPLICATIONS
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 55
D3.js
Image source: https://d3js.org/
• A very comprehensive JavaScript library for data manipulation and visualization
• Include multiple library packages for graph/network visualisation.
• Require certain programming skill.
• Open source, https://d3js.org/
56
Prefuse Visualization Toolkit
• Rich set of features for data modelling, visualization, and interaction
• Written in Java, using the Java 2D graphics library → can use for web applets
• http://prefuse.org/ (BSB Licence)
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 57
Jgraph: Java Graph Visualization and Layout
• Powerful, easy-to-use, feature-rich and standards-compliant open source graph visualization for Java.
• http://www.jgraph.com/jgraph.html
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 58
Jung - Java Universal Network/Graph Framework
• Provides a common and extendible language for the modeling, analysis, and visualization of data that can be
represented as a graph or network.
• Written in Java, which allows JUNG-based applications to make use of the extensive built-in capabilities of the Java API,
as well as those of other existing third-party Java libraries.
• http://jung.sourceforge.net/index.html
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 59
Gephi
https://gephi.org/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 60
Cytoscape
http://www.cytoscape.org/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 61
Cytoscape Web
http://cytoscapeweb.cytoscape.org/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 62
NetworkX
https://networkx.github.io/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 63
Proceus - Network-based Visual Analysis of Tabular Data
http://www.cc.gatech.edu/gvu/ii/ploceus/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 64
Next Lecture
• Multi-dimensional Data Visualisation
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au 65