An introduction to social
network analysis
June 11, 2008
David Lazer
Director
Program on Networked Governance
Kennedy School of Government
Harvard University
Networks
Definition:
Paradigmatic focus on relationships
Emergent/self-organized interconnected forms
Units are indeterminate
Key issues:
How does the configuration of networks affect how
individuals and systems function?
How to study networks?
A brief history of the study of
networks
Roots in sociology and anthropology
going back to early days of those fields
Sociometry in 1930s (Moreno)
Substantial interest in social
psychology in 1940s-1960s (Festinger,
Milgram, Newcomb)
Economic sociology 1970s-present
(Granovetter, White, Uzzi)
Explosion of social capital research in
1990s (Putnam, Burt)
Invasion of the physicists (Watts,
Barabasi, Newman)
Networks in political science
Examples go back (at least) to 1938.
Applications to:
Legislative processes
Public opinion
IR
Interest groups
But until very recently, a very thin research tradition:
does not fit into dominant paradigms
Network analysis
Overview of foci of current social network research
Research design
Some examples of applications
Multiple levels of network
theory
Systems level: what network structures
function well for what tasks?
Positional level: how does the individual
position in the network affect that individual?
Relational level: what drives the
configuration of the network?
Overview of some social
network “ideas”
How do networks affect how systems and
individuals function?
How are networks structured?
How do networks affect
systemic and individual
functioning?
Regulation
Circulation
Coordination
Control
Regulation vs Circulation
Coordination and control:
centralized vs decentralized
nets
Network structure
Small worlds (Milgram; Watts and Strogatz)
Scale free networks (Barabasi, Stanley)
Homophily (Merton, Lazarsfeld)
Small world networks
Scale free networks
Homophily: birds of a feather
How to do social network
research?
Types of data
Research foci
Design issues
Types of network data
One mode vs two mode
Whole network vs egocentric
Different types of relationships
One mode vs two mode
One mode: person to person (Jack likes Jill)
Two mode: person to event (Jack and Jill are each tied to an event)
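A two-mode data set is often converted to a one-mode network by tying together people who share an event. The sketch below illustrates that projection; the `attendance` data, the event names, and the function name are hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Hypothetical two-mode data: each person maps to the set of events attended.
attendance = {
    "Jack": {"party", "meeting"},
    "Jill": {"party"},
    "Ann": {"meeting", "seminar"},
}

def project_to_one_mode(person_events):
    """Return {pair of people: number of shared events} for pairs sharing at least one event."""
    ties = {}
    for p, q in combinations(sorted(person_events), 2):
        shared = person_events[p] & person_events[q]
        if shared:
            ties[frozenset((p, q))] = len(shared)
    return ties

one_mode = project_to_one_mode(attendance)
for pair, weight in one_mode.items():
    print(sorted(pair), weight)
```

The projection discards which event created a tie, which is one reason to keep the two-mode data when it is available.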
Whole network
Network visualization of Members who traveled at least 10 days
together (Williams 2006)
Egocentric
From: Assessing the Social and Behavioral Science Base for HIV/AIDS
Prevention and Intervention: Workshop Summary (1995)
Types of relationships
Communication
Affection
Advice
Proximity
Power
Multiplexity of relationships
Data
What kind of data might be appropriate?
Survey
Any communication, meeting
Proximity
Affiliation
Behavioral
Ramifications of missing/noisy data
Some network measures degrade more than
others
The coming revolution in
observational data
Impact of removal of links from 7m person
mobile phone network: weak vs strong tie
removal
Structure and tie strengths in mobile communication networks by J.-P. Onnela, J. Saramäki, J. Hyvönen, G.
Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási, PNAS 104, 7332-7336 (2007)
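The logic of a tie-removal experiment can be sketched on a toy network: remove ties in order of strength and track the size of the largest connected component. The nodes, weights, and function names below are hypothetical and are not the mobile phone data from the paper; the toy only shows the mechanism when a single weak tie bridges two tightly knit groups.

```python
def largest_component_size(nodes, edges):
    """Size of the largest connected component, given (a, b, weight) edges."""
    adj = {v: set() for v in nodes}
    for a, b, _ in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

def removal_curve(nodes, edges, weakest_first):
    """Largest component size after removing 0, 1, 2, ... ties in order of strength."""
    ordered = sorted(edges, key=lambda e: e[2], reverse=not weakest_first)
    return [largest_component_size(nodes, ordered[k:]) for k in range(len(ordered) + 1)]

# Two strongly tied triangles joined by one weak bridge (C-D).
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B", 9), ("B", "C", 8), ("A", "C", 7),
         ("D", "E", 6), ("E", "F", 5), ("D", "F", 4),
         ("C", "D", 1)]

weak_first = removal_curve(nodes, edges, weakest_first=True)
strong_first = removal_curve(nodes, edges, weakest_first=False)
print(weak_first)    # [6, 3, 3, 3, 3, 3, 2, 1]
print(strong_first)  # [6, 6, 5, 4, 4, 3, 2, 1]
```

In this toy, removing the single weak bridge halves the largest component at once, while removing strong ties first shrinks it gradually.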
Research foci
Individual level
System level
Network structure
Micro to macro, using computational models
Analysis I: individual-level
outcomes
Impact of being in a particular place in the
network
E.g., impact of centrality (degree, reachability,
betweenness) on outcomes; benefit of brokering
between other actors (Burt)
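A minimal sketch of the three centrality notions named above, on a hypothetical four-person network in which B brokers between A and the pair C, D. Betweenness is computed with Brandes' algorithm for unweighted, undirected graphs; the graph and function names are assumptions for illustration, and dedicated network software would normally be used instead.

```python
from collections import deque

# Hypothetical undirected network as an adjacency dict.
graph = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C"},
}

def degree(g):
    """Number of direct ties per node."""
    return {v: len(nbrs) for v, nbrs in g.items()}

def reachability(g, source):
    """Number of other nodes reachable from source (breadth-first search)."""
    seen = {source}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in g[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) - 1

def betweenness(g):
    """Unnormalized shortest-path betweenness (Brandes' algorithm)."""
    score = {v: 0.0 for v in g}
    for s in g:
        stack = []
        preds = {v: [] for v in g}
        sigma = {v: 0 for v in g}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in g[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in g}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}

print(degree(graph))
print(reachability(graph, "A"))
print(betweenness(graph))
```

Here B lies on the only shortest paths from A to C and from A to D, so B has betweenness 2 and every other node has 0.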
Analysis II: system-level
outcomes
Impact of structure of overall network
E.g., density of connections has an inverted-U
relationship with performance in creative settings
(Leenders, Uzzi)
Impact of centralization on signal aggregation
(Bavelas)
Various research on small groups
Analysis III: network structure
Dyadic correlates of the configuration of the
network
E.g., homophily, distance (McPherson)
Mid-level features (e.g., triads, quads, etc)
Structure “reduction” (Newman, Frank)
Descriptions of overall structure
Degrees of separation, clustering (small world)
Degree distribution (scale free)
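The overall-structure descriptors listed above can be computed directly from an adjacency list. This is a minimal sketch on a hypothetical five-node network; the graph and function names are illustrative assumptions, and the average path length assumes a connected network.

```python
from collections import Counter, deque
from itertools import combinations

# Hypothetical toy network: a triangle (A, B, C) with a tail (C-D-E).
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

def average_path_length(g):
    """Mean shortest-path length over all reachable ordered pairs (degrees of separation)."""
    total, pairs = 0, 0
    for s in g:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in g[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def local_clustering(g, v):
    """Share of pairs of v's neighbours that are themselves tied."""
    nbrs = g[v]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in g[a])
    return links / (len(nbrs) * (len(nbrs) - 1) / 2)

def degree_distribution(g):
    """Map each degree to the number of nodes having it."""
    return Counter(len(nbrs) for nbrs in g.values())

print(average_path_length(graph))
print({v: round(local_clustering(graph, v), 2) for v in graph})
print(degree_distribution(graph))
```

A small world combines short average path length with high clustering; a scale-free network shows a heavy-tailed degree distribution, which a five-node toy cannot display.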
Analysis IV: computational
modeling
In emergent processes, snapshot may not
reflect micro-level processes (Schelling)
Agent-based modeling: very simple
assumptions about behavior, which
(sometimes) yield surprising systemic
patterns.
Ex: my work on the social structure of
exploration and exploitation
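As an illustration of how simple behavioral rules are simulated, here is a minimal sketch of a Schelling-style model on a ring of cells. It is a simplified variant written for this example, not Schelling's original specification and not the exploration and exploitation model mentioned above; the ring layout, neighbourhood radius, threshold, and all names are assumptions.

```python
import random

def share_alike(cells, i, radius=2):
    """Fraction of occupied cells within `radius` of cell i (on a ring) matching its type."""
    n = len(cells)
    nbrs = [cells[(i + d) % n] for d in range(-radius, radius + 1) if d != 0]
    occupied = [c for c in nbrs if c is not None]
    if not occupied:
        return 1.0
    return sum(c == cells[i] for c in occupied) / len(occupied)

def step(cells, threshold, rng):
    """Move one randomly chosen unhappy agent to a random empty cell; False if no move is possible."""
    unhappy = [i for i, c in enumerate(cells)
               if c is not None and share_alike(cells, i) < threshold]
    empty = [i for i, c in enumerate(cells) if c is None]
    if not unhappy or not empty:
        return False
    i, j = rng.choice(unhappy), rng.choice(empty)
    cells[j], cells[i] = cells[i], None
    return True

def mean_share_alike(cells):
    """Average share of like neighbours across all agents (a crude segregation index)."""
    agents = [i for i, c in enumerate(cells) if c is not None]
    return sum(share_alike(cells, i) for i in agents) / len(agents)

def run(n=60, threshold=0.5, steps=2000, seed=1):
    """Return the initial and final configurations; n must be at least 50."""
    rng = random.Random(seed)
    cells = ["x"] * 25 + ["o"] * 25 + [None] * (n - 50)
    rng.shuffle(cells)
    start = list(cells)
    for _ in range(steps):
        if not step(cells, threshold, rng):
            break
    return start, cells

start, end = run()
print(round(mean_share_alike(start), 2), round(mean_share_alike(end), 2))
```

Comparing the index before and after the run shows whether a mild individual preference produced system-level sorting in that run; a single snapshot of the end state would not reveal the rule that generated it.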
Network visualization
Source: Pajek Homepage
(http://research.lumeta.com/ches/map/gallery/index.html)
When useful?
To see unanticipated patterns
More useful for small networks and
egocentric networks
Tools for pattern recognition in larger
networks(?)
Powerful tools for presenting ideas
Software: Netdraw, Pajek, Visone
Example:
Networks among State Health Officials
Legend:
Relationships
1. Grey ties = overlapping ties
2. Red = Talk in general
3. Black = Pandemic preparedness
4. Dark grey ties = Professional
development
Nodes
5. Region 1: red
6. Region 2: blue
7. Region 3: black
8. Region 4: grey
9. Region 5: pink (Territories)
Example:
Flight patterns movie (Aaron Koblin)
http://vw.indiana.edu/07netsci/entries/#flight
Research design
Statistical challenges
Design challenges
Statistical challenges
Interdependence of observations
For example, if A talks to B, and B talks to C, is it
more likely that A talks to C? (transitivity)
Statistical methods to deal with these
interdependencies (QAP, P*, ERGM)
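The idea behind QAP can be sketched briefly: correlate two relationship matrices, then build a reference distribution by relabeling the actors of one matrix (permuting its rows and columns together), which preserves the network's structure while breaking its alignment with the other. The matrices and function names below are hypothetical, and this is a bare-bones illustration rather than a substitute for established implementations.

```python
import random

def offdiag_pairs(x, y):
    """Paired off-diagonal cells of two square matrices (self-ties are ignored)."""
    n = len(x)
    return [(x[i][j], y[i][j]) for i in range(n) for j in range(n) if i != j]

def pearson(pairs):
    """Pearson correlation of a list of (a, b) pairs."""
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    vx = sum((a - mx) ** 2 for a, _ in pairs)
    vy = sum((b - my) ** 2 for _, b in pairs)
    return cov / (vx * vy) ** 0.5

def qap_test(x, y, n_perm=1000, seed=0):
    """Observed correlation and one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(x)
    observed = pearson(offdiag_pairs(x, y))
    hits = 0
    for _ in range(n_perm):
        p = list(range(n))
        rng.shuffle(p)
        # Relabel actors: the same permutation is applied to rows and columns.
        y_perm = [[y[p[i]][p[j]] for j in range(n)] for i in range(n)]
        if pearson(offdiag_pairs(x, y_perm)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical "talk" and "advice" relations among four actors.
talk = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
advice = [[0, 1, 0, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 1, 0]]

r, p_value = qap_test(talk, advice, n_perm=500)
print(round(r, 3), round(p_value, 3))
```

With only four actors there are just 24 possible relabelings, so the p-value here is coarse; the point is the permutation scheme, not the result.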
Design challenges
The causal nexus: whither the causal arrow?
Network to node?
Node to network?
Omitted variable driving
both?
The value of control
The value of
longitudinal data
Example:
studying social influence
How to dissect cause and effect of social
influence?
Problem of unobserved heterogeneity
Some roommate studies
Study of policy school students
Keys to studying social
influence in this study
Longitudinal data
Measurement of views at inception of system
Implausibility of alternative explanation that
network is related to unobserved
heterogeneity
The network of influence
triangle = section 1
square/diamond = section 2
circle/octagon = section 3
dark blue = 1 (extremely liberal)
blue = 2
light blue = 3
gray = 4
pink = 5
red = 6 (conservative)
larger = became more conservative
smaller = became more liberal
in-between = no change
Paradigm shortcomings
Until recently, almost all work was based on
snapshots of small systems.
Lack of attention to causal nexus
Lots of attention on flow in networks, but little
data on actual flow
Relational focus obscures interplay of nodal-level
factors and network