Relational Data Visualisation
Part II
 Visualisation (COMP7016)
      Dr Zhonglin (Jolin) Qu
   Z.Qu@westernsydney.edu.au
      Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   1
                    Last Lecture
• Relational Data Visualisation – Part I
• Introduction to Graph Visualisation
• Tree Visualisation
  – Connection Approach
  – Enclosure Approach
  – Connection+Enclosure Approach
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   2
              Module Outline
• Relational Data Visualisation – Part II
  – Graph and Graph Visualization
  – Network Visualisation
  – Graph Tools and Applications
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   3
    GRAPHS AND
GRAPH TERMINOLOGIES
  Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   4
    What’s a Graph can be Used?
• Any relational data sets
  – i.e. most of data sets can be modelled as a graph
• Examples:
  – US telephone system
  – World Wide Web
  – Distribution network for on-line retailer
  – Call graph of a large software system
  – Semantic map in an AI algorithm
  – Set of connected friends
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   5
    Graph Visualization Problems
• Graph layout and positioning
  – Make a concrete rendering of abstract graph
  – None of the layouts are the best
• Scale
  – Small graphs is not a problem
  – Large graphs is a big challenge
• Navigation/Interaction
  – How to support user changing focus and moving
    around the graph
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   6
             Layout Heuristics
• Layout algorithms can be
  – planar
  – grid-based
  – orthogonal
  – curved lines
  – hierarchies
  – circular
  – ...
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   7
           Graph Terminologies
• Graph G = (V, E)
  – V: set of vertices (objects) or
    nodes
  – E: set of edges connecting
    vertices (relationships)
• A simple graph G(V,E) is a
  graph which contains no
  multi-edges and no loops                                                Not a Simple Graph
               Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                    8
     Graph Terminologies (cont.)
• A Path G contains only edges
  that can be consecutively
  traversed
• A Tree G contains no cycles                                          A Path
• A Network G contains cycles
                                                                        A Tree
            Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au             9
     Graph Terminologies (cont.)
• Unconnected graph – Includes
  1 or more edges traversal
  starting from a given vertex
  cannot reach any other vertex.                                    Unconnected Graph
• Articulation points – are
  vertices, which if deleted from
  the graph, would break up the
  graph in multiple sub-graphs.
                                                                         Articulation Point
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                        10
GRAPH VISUALISATION AND
    GRAPH DRAWING
    Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   11
Hierarchical Drawings, Sugiyama
• directed acyclic graphs, DAGs
  K. Sugiyama, S. Tagawa, and M. Toda IEEE Trans SCM 1981
  (1) break cycles
  (2) compute layering, the Y-coordinates
       and insert dummy nodes for long-span edges
  (3) crossing reduction
      repeat
        down phase: sort next layer
                       placement on lower layer
        up phase:     sort previous layer
                       placement on upper layer
     until DONE
  (4) routing of the edges
                    Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   12
Hierarchical Drawings, Sugiyama (cont)
         Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   13
                 Pros and Cons
• Pros:
  – Most visual edges flow in the same direction
  – Number of their intersections are minimized
  – Show the flow and hierarchy very clearly
• Cons:
  – Not well suit for general graphs
  – Not scale well on large graphs
• Often chosen for the arrangement of flow
  diagrams, process charts and workflows
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   14
             Force Directed Methods
Idea: a Spring model (Originated by Peter
  Eades, USYD)
select optimal edge length (node distance) k
repeat
   for each node v do
       for each pair of nodes (u, v)
                                                      k2
          compute repulsive force fr(u,v) = - c•
                                                    d(u, v)
        for each edge e = (u,v)
                                                   d(u, v)2
          compute attractive force fa(u,v) = c•
                                                       k
    sum all force vectors F(v) = ∑ fr(u,v) + ∑ fa(u,v)
    move node v according to F(v)
until DONE
                       Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   15
          Forces Directed Methods
• Attractive forces
   – along each edge
   – proportional to shortest paths
• Repulsive forces
   – between each pair of nodes (O(n2) pairs, costly!)
   – only between closely related nodes (hash grid)
• Other forces
   –   center of gravity (attractive)
   –   underlying magnetic fields (concentric, radial, horizontal)
   –   angular forces (between adjacent edges at nodes v)
   –   from the boundary (repulsive; bounce back)
                    Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   16
Forces Directed Methods
                         Demo
 https://www.amcharts.com/demos/force-directed-network/
        Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   17
                           Pros and Cons
• Pros:
   – Well suite for general graphs with different
     variations
   – Good layout
   – Easy to implement
• Cons:
   –   Layout is not predictable
   –   Slow in computational cost O(N 3) or O(N2log(N))
   –   Too much twisting to achieve optimal layout
   –   Not scale well on large graphs of more than              Image generated by:
       thousands of nodes                          http://mbostock.github.io/d3/talk/20111116/for
                                                                 ce-collapsible.html
• Widely use in practice
                        Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au           18
                        Big Graphs
•   Difficult to keep all in memory
•   Not effective to show all nodes
•   Often visualized as “hairballs”
•   Solutions
    – Structural clustering or
    – Abstraction and aggregation
    – Show a high-level overview of topology
               Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   19
Address Computational Scalability –
     Multilevel Approaches
                      [Schulz 2004]
       Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   20
          Collapsible Nodes
                                                                       Collapsed
                                                                       nodes
http://mbostock.github.io/d3/talk/20111116/force-collapsible.html
            Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au           21
        Collapsible Nodes
                                                                          Collapsed
                                                                          nodes
https://www.amcharts.com/demos/collapsible-force-directed-tree/
               Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au               22
Address Computational Scalability –
Abstraction/Aggregation
                 Cytoscape.org
                                                                750 nodes
     30,000 nodes
    18 nodes                                                          90 nodes
     Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                    23
                                 Aesthetics
•   What is a nice drawing ?
•   What makes drawings understandable or readable?
•   How can we measure quality?
•   Can we formalise aesthetics ?
• A proverb
    ”A picture is worth a thousand words“
• R. Feynman (Nobel prize in Physics)
    ”It’s all visual“
• R.A. Earnshaw (a poineer in computer graphics, 1973)
    ”visualization uses interactive compute graphics to help provide
      insight on complicated problems, models or systems“.
     ”Scientific visualization is exploring data and information
     graphically, gaining understanding and insights into the data“
• R. Hamming (1973)
     "the purpose of computing is insight not numbers"
                        Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   24
Aesthetic Criteria in Graph Drawing
• Edge Crossings: minimise towards planar
• Totall Edge Length: minimise towards
  proper scale
• Area: minimise towards efficiency
• Reducing maximum Edge Length: minimise
  longest edge
• Uniform Edge Lengths: minimise variances
• Totall Bends: minimise orthogonal towards
  straight-line
           Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   25
                     Drawing Styles
• polyline drawings
   reduce bends, no sharp angles, polish by with Bezier splines
• straight-line or curve
   uniform (short) edge length
• orthogonal drawings
   minimize bends
• planar drawings
   minimize crossings and bends
• grid embeddings
   grid coordinates for nodes and bend-points
• visibility
   horizontal bar nodes and vertical visibility
                   Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   26
              Node Attributes
•   Colouring
•   Positions
•   Size
•   Shapes
•   Multiple Views
•   Etc.
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   27
                    Edge Attributes
•   Colouring
•   Positions
•   Size
•   Shapes
•   Thickness
•   Drawing style
    – Straight line
    – Curve (edge
      bundling)
    – Orthogonal
         Demo: http://mbostock.github.io/d3/talk/20111116/bundle.html
                   Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   28
         Matrix Representations
• Use adjacency matrix to show the relations
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   29
 Matrix Representations - Exercise
• Show the adjacency matrix for the following graph
          A                                           A       B      C       D   E   F   G
  B               C                            A
                                               B
                                               C
      D       E
                                               D
                                               E
  F                   G                        F
                                               G
                  Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                   30
                                 Resources
•   G. DiBattista, P. Eades, R. Tamassia, I.G. Tollis (1999), Draph Drawing, Prentice
    Hall.
•   M. Kaufmann, D. Wagner (eds)., Drawing Graphs: Methods and Models, LNCS
    2025, Springer Verlag, 2001
•   Herman, I, Melancon, G & Marshall, MS (2000), 'Graph Visualization in
    Information Visualization: a Survey', IEEE Transactions on Visualization and
    Computer Graphics, vol. 6, pp. 24-44
•   Jünger, M., Mutzel, P. (2004), Graph Drawing Software, Springer-Verlag, ISBN
    978-3-540-00881-1.
•   Kaufmann, Michael; Wagner, Dorothea, eds. (2001), Drawing Graphs: Methods
    and Models, Lecture Notes in Computer Science 2025, Springer-Verlag,
    doi:10.1007/3-540-44969-8.
•   Tamassia, Roberto, ed. (2014), Handbook of Graph Drawing and Visualization,
    CRC Press.
•   Tutorial: www.cs.brown.edu/people/rt/papers/gd-tutorial/gd-constraints.pdf
•   Annual Graph Drawing Conference
•   http://www.readwriteweb.com/archives/the_best_tools_for_visualization.php
                       Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au    31
Graph Visualisation – A Case Study
• NicholasChristakis_2010-480p.mp4
• https://www.ted.com/talks/nicholas_christaki
  s_the_hidden_influence_of_social_networks?
  subtitle=en
            Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   32
NETWORK VISUALISATION
   Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   33
What’s common in Network Visualisation?
 • A Network is a type Graph
 • Usually refer to real applications with large
   size and high complexity
   – Millions of items and connections
 • Structural behaviour is usually unknown
 • Relationships are usually non-directional
 • Aesthetic Criteria is not usually considered as
   the most important criteria for the design
 • Usually accompany with an interaction
   technique for the exploration
               Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   34
    Visualisation of the Internet
How many web pages that the Internet has? ≈ 50 billion Web Pages
               Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                             35
                                                                          http://www.worldwidewebsize.com /
                        Facebook
               2.9+ Billion Users in 2023
Facebook network visualisation of +1000 friends.
http://kimoquaintance.com/2011/08/22/what-can-we-learn-about-somalis-from-their-facebook-networks/
                            Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                 36
                                   Facebook
                          2.9+ Billion Users in 2023
Network of 340 friends with clear community clusters and several significant friends identified by betweenness centrality weighting of node size
http://kimoquaintance.com/2011/08/22/what-can-we-learn-about-somalis-from-their-facebook-networks/
                                        Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                                                   37
      And Many More Networks
• Citation Network: 250+ million Articles
•           : 500+ million Articles per day
•           : 300+ million active Users
•           : mobile phone network 100+ million
  users
• Protein-protein interactions: 200 million
  possible interactions in human genome
             Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   38
        Small-World Networks
• A small-world network is a type of
  mathematical graph in which most nodes are
  not neighbours of one another, but most
  nodes can be reached from every other by a
  small number of hops or steps.
• A small world network, where nodes
  represent people and edges connect people
  that know each other, captures the small
  world phenomenon of strangers being linked
  by a mutual acquaintance.
            Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   39
   Small-World Networks (cont.)
• It needs 6-steps to reach an unknown item
• Many empirical graphs are well modeled by
  small-world networks.
  – Social networks
  – The connectivity of the Internet
  – Gene networks
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   40
World Trade in 1992
https://slideplayer.com/slide/6352926/
    Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   41
Connected Paper Visualisation
 https://www.connectedpapers.com/main/d1f0f33c197dc9932c4a2e42eb1a8a60635bb401/Explainable-artificial-intelligence%3A-
                                             a-comprehensive-review/graph
                             Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                                    42
A Social Network Visualisation
       Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   43
  CorpusViz - A Visualisation Showing the
Communication among Adults and Children
Tran, J., Nguyen, Q. V., Jones, C., Hendery, R., & Simoff, S. (2017). CorpusViz: Child and Adult Speech Visualisation. Paper
presented at the 21st International Conference on Information Visualisation, London
                             Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                                       44
Visualisation of a very large Network
Problem?
           Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   45
   How to due with large Networks or
               Graphs?
• Partition a graph into a hierarchy of sub-graphs
   – Simplify the complex structures of large graphs through the global
     abstraction for easy interpretation, perception and navigation of large
     information spaces.
• Improve the layout algorithms to utilise display space allowing
  more nodes and information to be displayed.
   – More information can be displayed on very high-resolution and large
     screen, but it does not necessarily provide very much more
     information into the brain [Ware 2004]
   – Investigation of space-efficient techniques for visualising large
     datasets could be more effective and economical than the use of
     expensive large display devices.
• Apply Interaction and Navigation
                    Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   46
 Visualisation of a very large Network
Combined Clustering & Space-Efficient
        Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   47
 Visualisation of a very large Network
Combined Clustering & Space-Efficient
        Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   48
Handling Large Graphs / Networks
Graph                             Nodes                             Edges
YahooWeb                          1.4 Billion                       6 Billion
Symantec Machine-File             1 Billion                         37 Billion
Graph
Twitter                           104 Million                       3.7 Billion
Phone call network                30 Million                        260 Million
               Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au           49
Step 1: Storing Large Graphs/Networks
• On your laptop computer
   – SQLite
   – Neo4j (GPL license)
• On a server
   – MySQL, PostgreSQL, etc.
• With a cluster
   –   Hadoop (generic framework)
   –   HBase, inspired by Google’s BigTable
   –   Hama, inspired by Google’s Pregel
   –   FlockDB, by Twitter
• Comparison of “graph databases”
   – http://nosql.mypopescu.com/post/40759505554/a-comparison-
     of-7-graph-databases
                   Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   50
    Storing Large Graph on a PC - SQLite
•   Light weight
•   Easily handle up to gigabytes"
•   Roughly tens of millions of nodes/edges (perhaps up
•   to billions?). Very good! For today’s standard.
•   Very easy to maintain: one cross-platform file
•   Has programming wrappers in numerous languages
•   C++, Java (Andriod), Python, Objective C (iOS),...
•   Queries are so easy!
    – e.g., find all nodes’ degrees = 1 SQL statement"
• Bonus: SQLite even supports full-text search
                  Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   51
          SQLite – Graph Scheme
• Simplest schema:
   – edges(source_id, target_id)
• More sophisticated (flexible; lets you store more things):
   CREATE TABLE nodes (
      id INTEGER PRIMARY KEY,
      type INTEGER DEFAULT 0,
      name VARCHAR DEFAULT ' ');
   CREATE TABLE edges (
      source_id INTEGER,
      target_id INTEGER,
      type INTEGER DEFAULT 0,
      weight FLOAT DEFAULT 1,
      timestamp INTEGER DEFAULT 0,
      PRIMARY KEY(source_id, target_id, timestamp));
                  Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   52
 Step 2: Data Mining or Graph Mining
• Analyse it first by data mining or graph mining.
• If the dataset is small, we can visualise it first
• Does it follow any expected patterns? Or does it
  NOT follow some patterns (outliers)?
   – Why does this matter?"
   – If we know the patterns (models), we can do
     prediction, recommendation, etc.
      • e.g., is Alice going to “friend” Bob on Facebook?
   – People often buy beer and diapers together.
   – Outliers often give us new insights
      • e.g., telemarketer’s friends don’t know each other
                 Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   53
    Step 3: Visualising the Results
• Visualise data mining results
• Visualise the graph using mining attributes
  – E.g. filtering, attributes, importance
• Hierarchical visualisation based on clusters
  from clustering process
  – Coarse to granularity
• Aims: find or verify patterns or (ab)normality
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   54
 GRAPH TOOLS &
  APPLICATIONS
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   55
                                      D3.js
                                                       Image source: https://d3js.org/
•   A very comprehensive JavaScript library for data manipulation and visualization
•   Include multiple library packages for graph/network visualisation.
•   Require certain programming skill.
•   Open source, https://d3js.org/
                                                                                         56
       Prefuse Visualization Toolkit
•   Rich set of features for data modelling, visualization, and interaction
•   Written in Java, using the Java 2D graphics library → can use for web applets
•   http://prefuse.org/ (BSB Licence)
                            Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   57
    Jgraph: Java Graph Visualization and Layout
•   Powerful, easy-to-use, feature-rich and standards-compliant open source graph visualization for Java.
•   http://www.jgraph.com/jgraph.html
                             Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                 58
    Jung - Java Universal Network/Graph Framework
•   Provides a common and extendible language for the modeling, analysis, and visualization of data that can be
    represented as a graph or network.
•   Written in Java, which allows JUNG-based applications to make use of the extensive built-in capabilities of the Java API,
    as well as those of other existing third-party Java libraries.
•   http://jung.sourceforge.net/index.html
                                  Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au                                      59
                  Gephi
       https://gephi.org/
Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   60
            Cytoscape
http://www.cytoscape.org/
 Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   61
        Cytoscape Web
http://cytoscapeweb.cytoscape.org/
     Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   62
             NetworkX
https://networkx.github.io/
 Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   63
Proceus - Network-based Visual Analysis of Tabular Data
                    http://www.cc.gatech.edu/gvu/ii/ploceus/
              Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   64
                  Next Lecture
• Multi-dimensional Data Visualisation
             Dr Zhonglin (Jolin) Qu. Email: Z.qu@westernsydney.edu.au   65