NETWORK ANALYSIS:
PEOPLE AND OPEN
SOURCE COMMUNITIES
Dawn M. Foster
@geekygirldawn
dawn@dawnfoster.com
fastwonderblog.com
PhD Student
University of Greenwich
London, UK
WHOAMI
• Geek, traveler, reader
• 20-year tech career. Past 15
years doing community &
open source (Intel, Jive,
Puppet Labs, etc.)
• PhD student at University of
Greenwich researching Linux
kernel
Photos by Josh Bancroft, Don Park
WHAT IS NETWORK
ANALYSIS?
Studies relationships
between units and looks for
patterns and structure in
those relationships
Image from ANAMIA Project
AGENDA AND INFO
• Gathering your data
• Data manipulation for
network analysis
• Visualization
• What else can you do?
Image from a Northern Mariana Islands Network
Scripts, Data, and More:

github.com/geekygirldawn/oscon_2015
I 💖 METRICS GRIMOIRE
MailingListStats aka MLStats
CVSAnalY - repos
Bicho - bugs
More
Photo by Bitergia
http://metricsgrimoire.github.io/
MLSTATS
a) Install mlstats
$ python setup.py install
b) Create database
mysql> create database mlstats;
c) Import data by running mlstats
$ mlstats --db-user=USERNAME --db-password=PASS http://URLOFYOURLIST
EXTRACT DATA
-- People sending emails
SELECT mp.email_address AS sender,
  -- Subquery: who they replied to
  (SELECT mp2.email_address
   FROM messages m2, messages_people mp2
   WHERE m2.is_response_of = m.is_response_of
     AND mp2.message_id = m2.is_response_of
   LIMIT 1) AS receiver
FROM messages_people mp, messages m
-- Limit time for manageable data
WHERE YEAR(m.first_date) = 2015
  AND MONTH(m.first_date) = 1
  AND mp.message_id = m.message_id;
Output:
sender@example.com in_reply_to@example.com
sender1@example.com in_reply_to1@example.com
sender2@example.com in_reply_to2@example.com
...
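The sender/receiver pairs above are already an edge list: each row is one reply from one person to another. As a minimal sketch (not the deck's own script), the pairs can be collapsed into weighted edges by counting repeats. The function name `parse_reply_pairs` and the sample rows are hypothetical, and the sketch assumes whitespace-separated output in which a missing receiver is printed as `NULL`.

```python
from collections import Counter


def parse_reply_pairs(text):
    """Turn 'sender receiver' lines into a Counter of (sender, receiver) edges."""
    edges = Counter()
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, a trailing "..." and rows without a receiver.
        if len(parts) != 2 or parts[1].upper() == "NULL":
            continue
        sender, receiver = parts
        edges[(sender, receiver)] += 1
    return edges


# Hypothetical sample in the same shape as the query output.
sample = """\
sender@example.com in_reply_to@example.com
sender1@example.com in_reply_to1@example.com
sender@example.com in_reply_to@example.com
sender2@example.com NULL
"""

edges = parse_reply_pairs(sample)
for (sender, receiver), weight in sorted(edges.items()):
    print(sender, "->", receiver, weight)
```

The weight on each edge is the number of replies from sender to receiver in the selected month, which most network tools can use as tie strength.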
EXTRACT DATA: SCRIPTS
Reformat / clean up data
Reproducible
Reduce human error
oscon.py script
Image from Mark Grealish
github.com/geekygirldawn/oscon_2015
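To illustrate the kind of cleanup a script makes repeatable, here is a small sketch. It is not the `oscon.py` script from the repository. The helper names `clean_address` and `write_edge_csv` are hypothetical, and the rules (lowercasing, dropping `NULL` receivers, dropping self-replies) are assumptions about what "clean" might mean for a given list.

```python
import csv
import io


def clean_address(raw):
    """Lowercase and trim an address; return None for empty, NULL or malformed values."""
    addr = raw.strip().strip("<>").lower()
    if not addr or addr == "null" or "@" not in addr:
        return None
    return addr


def write_edge_csv(pairs, out):
    """Write cleaned (sender, receiver) pairs as CSV rows and return how many were kept."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sender", "receiver"])
    kept = 0
    for raw_sender, raw_receiver in pairs:
        sender = clean_address(raw_sender)
        receiver = clean_address(raw_receiver)
        # Assumption: rows with a missing side and self-replies are not useful edges.
        if sender is None or receiver is None or sender == receiver:
            continue
        writer.writerow([sender, receiver])
        kept += 1
    return kept


# Hypothetical raw rows, including the kinds of noise a mailing list produces.
raw_pairs = [
    ("Sender@Example.com ", "in_reply_to@example.com"),
    ("sender1@example.com", "NULL"),
    ("sender2@example.com", "sender2@example.com"),
]

buffer = io.StringIO()
kept = write_edge_csv(raw_pairs, buffer)
print(buffer.getvalue())
```

Writing to a file object rather than a fixed path keeps the sketch easy to rerun on a different month or list, which is the point of scripting this step instead of editing by hand.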
R / VISONE / GOURCE
Convert data for better use
with network analysis
Visualize data using
RStudio, Visone, and Gource
Image from WebOps.com
WHAT ELSE?
So many visualization tools
Python network packages
Network analysis is more
than just pretty pictures!
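As one example of going beyond pictures, simple measures can be computed directly from a weighted edge list. The sketch below uses only the Python standard library so that it stays self-contained; dedicated network packages offer many more measures. The function name `degree_summary` and the sample edges are hypothetical.

```python
from collections import Counter


def degree_summary(edges):
    """Return (out_degree, in_degree, reciprocal_pairs) for weighted directed edges."""
    out_degree, in_degree = Counter(), Counter()
    for (sender, receiver), weight in edges.items():
        out_degree[sender] += weight   # replies this person sent
        in_degree[receiver] += weight  # replies this person received
    # A pair is reciprocal when replies flow in both directions.
    reciprocal = {
        tuple(sorted((a, b)))
        for (a, b) in edges
        if a != b and (b, a) in edges
    }
    return out_degree, in_degree, reciprocal


# Hypothetical weighted edges: (sender, receiver) -> number of replies.
edges = {
    ("a@example.com", "b@example.com"): 2,
    ("b@example.com", "a@example.com"): 1,
    ("c@example.com", "a@example.com"): 2,
}

out_degree, in_degree, reciprocal = degree_summary(edges)
print("most replied-to:", in_degree.most_common(1))
print("reciprocal pairs:", sorted(reciprocal))
```

In-degree here points to the people whose messages draw the most replies, and reciprocal pairs hint at two-way conversations rather than one-way broadcasts.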
Dawn Foster
University of Greenwich
Centre for Business Network Analysis
www2.gre.ac.uk/about/faculty/business/research/centres/cbna/home
@geekygirldawn, dawn@dawnfoster.com
fastwonderblog.com
THANK YOU