Python Load Testing - Pygotham 2012

LOAD TESTING
WITH PYTHON

dan kuebrich / dan.kuebrich@gmail.com / @dankosaur
1

$ finger dkuebric
• Currently: Tracelytics

• Before: Songza.com / AmieStreet.com

• Likes websites, soccer, and beer

2

MENU
• What is load testing?
• How does it work?
• One good way to do it
• (with Python, no less!)
• A few examples
• Survey: making sense of it all

3

WHAT IS LOAD TESTING
• AKA stress testing
• Putting demand on a system (web app)
and measuring its response (latency,
availability)
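
As a rough illustration of "measuring its response", the sketch below reduces a list of request samples to availability, throughput, and latency percentiles. The function name `summarize` and the `(latency_seconds, ok)` sample format are assumptions made for this example, not part of any tool mentioned in the talk, and the sample list is assumed to be non-empty.

```python
def summarize(samples, duration):
    """Summarize load-test samples (hypothetical helper).

    samples:  non-empty list of (latency_seconds, ok) tuples, one per request
    duration: wall-clock length of the test in seconds
    """
    latencies = sorted(latency for latency, _ in samples)
    count = len(latencies)

    def percentile(pct):
        # Nearest-rank percentile; integer math gives an exact ceiling.
        rank = -(-pct * count // 100)
        return latencies[max(rank, 1) - 1]

    return {
        # Fraction of requests that succeeded.
        "availability": sum(1 for _, ok in samples if ok) / float(count),
        # Requests completed per second of test time.
        "throughput": count / float(duration),
        "p50": percentile(50),
        "p95": percentile(95),
    }


result = summarize(
    [(0.1, True), (0.2, True), (0.3, True), (0.4, False)],
    duration=2.0,
)
```

Percentiles are usually more telling than averages here, since a few slow requests can hide behind a healthy mean.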

http://www.flickr.com/photos/60258236@N02/5515327431/

4

Q: WILL IT PERFORM
Do my code + infrastructure provide the
desired level of performance in terms of
latency and throughput for real user
workloads?

And if not, how can I get there?

5

http://www.flickr.com/photos/dvs/3540320095/

6

WHEN TO LOAD TEST

• Launching a new feature (or site!)
• Expecting a big traffic event
• Changing infrastructure
• “I just don’t like surprises”

7

A PAGEVIEW IS BORN
[ Timeline of a pageview, from t=0 to t=done: DNS, first HTTP request (a lolcats search returning search results), then retrieve assets and run js ]
8

THE WATERFALL CHART

9

A PAGEVIEW IS BORN
yslow?

DNS, connection
Fulfill HTTP Request (“Time to first byte”)
Download + render page contents (+js)
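
The phases above can be timed roughly from Python with nothing but sockets. This is a minimal sketch under stated assumptions: plain HTTP on port 80, an HTTP/1.0 request so the server closes the connection, and no rendering or JavaScript (so it stops where a browser's work begins). The names `time_pageview_phases` and `phase_durations` are made up for this example.

```python
import socket
import time


def phase_durations(t0, t_dns, t_connect, t_first_byte, t_done):
    """Turn absolute timestamps into per-phase durations in seconds."""
    return {
        "dns": t_dns - t0,
        "connect": t_connect - t_dns,
        "ttfb": t_first_byte - t_connect,
        "download": t_done - t_first_byte,
    }


def time_pageview_phases(host, port=80, path="/"):
    """Fetch one URL over plain HTTP and time each phase (rough sketch)."""
    t0 = time.time()
    # DNS: resolve the hostname to an IPv4 address.
    info = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    address = info[0][4]
    t_dns = time.time()

    # Connection: TCP handshake with the resolved address.
    sock = socket.create_connection(address, timeout=10)
    t_connect = time.time()
    try:
        request = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)
        sock.sendall(request.encode("ascii"))
        # Time to first byte: block until the server starts answering.
        sock.recv(1)
        t_first_byte = time.time()
        # Download: drain the rest of the response until the server closes.
        while sock.recv(65536):
            pass
        t_done = time.time()
    finally:
        sock.close()
    return phase_durations(t0, t_dns, t_connect, t_first_byte, t_done)


# Worked example with made-up timestamps (seconds since t=0).
example = phase_durations(0.0, 0.25, 0.5, 1.0, 2.0)
```

Calling `time_pageview_phases("website.com")` would return the same kind of dictionary for a live host.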

10

HOW TO TEST IT

(most common)
• “F*ck it, we’ll do it live!”
• Synthetic HTTP requests (“Virtual Users / VUs”)
• Synthetic clicks in real browsers (“RBUs”)
(least common)

11

VU vs RBU
VU
• Simulate individual protocol-level requests
• Low overhead
• Scripts must parallel user behavior
• Tough to get AJAX-heavy sites exactly right, maintain

RBU
• Operate a browser
• Higher overhead
• Scripts must parallel user behavior
• Accurate simulation of AJAX

12

MULTI-MECHANIZE
• FOSS (LGPL v3), based on mechanize
• Load generation framework for VUs
• Written in Python, scripted in Python
• Heavyweight for bandwidth-bound
tests, but stocked with py-goodness

13

MULTI-MECHANIZE
• Basic idea:
• Write a few scripts that simulate
user actions or paths

• Specify how you want to run them:
x VUs in parallel on script A, y on
script B, ramping up, etc.

• Run and watch
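
To make the basic idea concrete, here is a conceptual sketch of what a VU runner does: start a number of threads, stagger their starts over a ramp-up period, and have each one call a script's `run()` in a loop until the run time is over. This is not multi-mechanize's actual implementation; `run_user_group` and `FakeTransaction` are hypothetical names for this example.

```python
import threading
import time


class FakeTransaction(object):
    """Hypothetical stand-in for a test script's Transaction class."""

    def run(self):
        time.sleep(0.01)  # pretend to make a request


def run_user_group(transaction_cls, threads, run_time, rampup):
    """Run `threads` virtual users for `run_time` seconds.

    Returns a list of (start_offset, elapsed, error) tuples, one per
    transaction, where error is None on success.
    """
    results = []
    lock = threading.Lock()
    start = time.time()

    def virtual_user(delay):
        time.sleep(delay)  # staggered start: this VU's share of the ramp-up
        trans = transaction_cls()
        while time.time() - start < run_time:
            t0 = time.time()
            error = None
            try:
                trans.run()
            except Exception as exc:  # a failed assert counts as an error
                error = repr(exc)
            elapsed = time.time() - t0
            with lock:
                results.append((t0 - start, elapsed, error))

    # Spread the VU start times evenly across the ramp-up window.
    spacing = float(rampup) / threads if threads else 0.0
    workers = [
        threading.Thread(target=virtual_user, args=(i * spacing,))
        for i in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


results = run_user_group(FakeTransaction, threads=3, run_time=0.2, rampup=0.05)
```

Running several scripts at once ("x VUs on script A, y on script B") would amount to one such group per script.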
14

A SIMPLE M-M SCRIPT
get_index.py

    import requests

    class Transaction(object):
        def run(self):
            r = requests.get('http://website.com/')
            r.raw.read()

15

A SIMPLE PROJECT
• A multi-mechanize “project” is a set of
test scripts and a config that specifies
how to run them:

    dan@host:~/mm/my_project$ ls -1
    .
    ..
    config.cfg      # config file
    test_scripts/   # your tests are here
    results/        # result files go here

16

A SIMPLE PROJECT
config.cfg

    [global]
    run_time: 60
    rampup: 60
    results_ts_interval: 60
    console_logging: off
    progress_bar: on

    [user_group-1]
    threads: 25
    script: get_index.py
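
The config is an INI-style file, so its structure can be inspected with Python's standard `configparser` module (Python 3 naming). The sketch below only illustrates how the `[global]` settings and the `[user_group-N]` sections fit together; it makes no claim about how multi-mechanize itself reads the file, and `load_plan` is a made-up helper.

```python
import configparser

CONFIG_TEXT = """
[global]
run_time: 60
rampup: 60
results_ts_interval: 60
console_logging: off
progress_bar: on

[user_group-1]
threads: 25
script: get_index.py
"""


def load_plan(text):
    """Parse the config text into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Every section named user_group-* describes one group of VUs.
    groups = [
        {
            "name": section,
            "threads": parser.getint(section, "threads"),
            "script": parser.get(section, "script"),
        }
        for section in parser.sections()
        if section.startswith("user_group")
    ]
    return {
        "run_time": parser.getint("global", "run_time"),
        "rampup": parser.getint("global", "rampup"),
        "console_logging": parser.getboolean("global", "console_logging"),
        "groups": groups,
    }


plan = load_plan(CONFIG_TEXT)
total_vus = sum(group["threads"] for group in plan["groups"])
```

Read this way, the config asks for 25 VUs running `get_index.py` for 60 seconds, ramping up over those same 60 seconds.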

17

A FEW M-M FEATURES
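features.py

    import requests
    import time

    class Transaction(object):
        def __init__(self):
            # Must exist before run() stores a timer in it.
            self.custom_timers = {}

        def run(self):
            r = requests.get('http://website.com/a')
            r.raw.read()
            # A failed assert marks the transaction as an error.
            assert (r.status_code == 200), 'not 200'
            assert ('Error' not in r.text)

            # Time one request separately and report it as timer 'b'.
            t1 = time.time()
            r = requests.get('http://website.com/b')
            r.raw.read()
            latency = time.time() - t1
            self.custom_timers['b'] = latency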
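
When a script records several custom timers, the `time.time()` bookkeeping can be wrapped in a small context manager. This is a convenience sketch, not a multi-mechanize feature; `timed` is a hypothetical helper and the `time.sleep` call stands in for a real request.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(timers, name):
    """Store the wall-clock duration of the with-block in timers[name]."""
    t0 = time.time()
    try:
        yield
    finally:
        # Recorded even if the block raises, so failures are still timed.
        timers[name] = time.time() - t0


class Transaction(object):
    def __init__(self):
        self.custom_timers = {}

    def run(self):
        with timed(self.custom_timers, 'b'):
            time.sleep(0.01)  # stand-in for requests.get(...)


trans = Transaction()
trans.run()
```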
18

[ $ multimech-run example ]

19

INTERACTION
login.py

    import mechanize as m

    class MyTransaction(object):
        def run(self):
            # Configure a browser-like client.
            br = m.Browser()
            br.set_handle_equiv(True)
            br.set_handle_gzip(True)
            br.set_handle_redirect(True)
            br.set_handle_referer(True)
            br.set_handle_robots(False)
            br.set_handle_refresh(m._http.HTTPRefreshProcessor(),
                                  max_time=1)

            # Load the front page, fill in the login form, and submit it.
            _ = br.open('http://reddit.tlys.us')
            br.select_form(nr=1)
            # u and p (username and password) are not defined on the slide.
            br.form['user'] = u
            br.form['passwd'] = p
            r = br.submit()
            r.read()

20

[ $ cat more-advanced-example.py ]
[ $ multimech-run more-advanced ]

21

GETTING INSIGHT
• First, is the machine working hard?
• Inspect basic resources: CPU/RAM/IO

• Ganglia/Munin/etc.

22

GETTING INSIGHT
• Second, why is the machine working hard?

• What is my app doing?
• What are my databases doing?
• How are my caches performing?

24

REDDIT: A PYLONS APP

[ Architecture diagram. Components: nginx, uwsgi + pylons, memcached, postgresql, cassandra, queued jobs ]

25

[ Same diagram, showing one request from “request start” to “request end”. Components shown: nginx, uwsgi + pylons, memcached, queued jobs ]

26

INSIDE THE APP
PROFILING VS INSTRUMENTATION

PROFILING
• Fine-grained analysis of Python code
• Easy to set up
• No visibility into DB, cache, etc.
• Distorts app performance
• profile, cProfile

INSTRUMENTATION
• Gather curated set of information
• Requires monkey-patching (or code edits)
• Can connect DB, cache performance to app
• Little (tunable) overhead
• django-debug-toolbar, statsd, Tracelytics, New Relic

28

PROFILING 101
• ncalls: # of calls to method
• tottime: total time spent exclusively in that method
• cumtime: time spent in that method and all child calls
• Try repoze.profile for WSGI

         ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
         892048   12.643    0.000   17.676    0.000  thing.py:116(__getattr__)
     14059/2526    9.475    0.001   34.159    0.014  template_helpers.py:181(_replace_render)
         562060    7.384    0.000    7.384    0.000  {posix.stat}
  204587/163113    6.908    0.000   51.302    0.000  filters.py:111(mako_websafe)
  115192/109693    6.590    0.000    9.700    0.000  {method 'join' of 'str' objects}
        1537933    6.584    0.000   15.437    0.000  registry.py:136(__getattr__)
1679803/1404938    5.294    0.000   11.767    0.000  {hasattr}
2579769/2434607    5.173    0.000   12.713    0.000  {getattr}
            139    4.809    0.035  106.065    0.763  pages.py:1004(__init__)
           8146    3.967    0.000   15.031    0.002  traceback.py:280(extract_stack)
          43487    3.942    0.000    3.942    0.000  {method 'recv' of '_socket.socket' object
         891579    3.759    0.000   21.430    0.000  thing.py:625(__getattr__)
          72021    3.633    0.000    5.910    0.000  memcache.py:163(serverHashFunction)
            201    3.319    0.017   38.667    0.192  pages.py:336(render)
            392    3.236    0.008    3.236    0.008  {Cfilters.uspace_compress}
        1610797    3.208    0.000    3.209    0.000  registry.py:177(_current_obj)
        2017343    3.113    0.000    3.211    0.000  {isinstance}
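
A report in this format can be produced with the standard library's `cProfile` and `pstats`. The sketch below profiles a toy workload and sorts by `tottime`; `slow_helper` and `render_page` are made-up functions standing in for real application code.

```python
import cProfile
import io
import pstats


def slow_helper(n):
    """Toy CPU-bound function; most of the tottime should land here."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def render_page():
    """Toy caller; its cumtime includes every slow_helper call."""
    return [slow_helper(2000) for _ in range(50)]


profiler = cProfile.Profile()
profiler.enable()
render_page()
profiler.disable()

# Write the report to a string instead of stdout.
out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
stats.sort_stats("tottime").print_stats(5)  # top 5 entries by tottime
report = out.getvalue()
print(report)
```

Sorting by `cumtime` instead points at the expensive call trees rather than the individual hot functions.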

29

INSTRUMENTATION
• DB queries
• Cache usage
• RPC calls
• Optionally profile critical segments of code
• Exceptions
• Associate with particular codepaths or URLs
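
As a minimal sketch of instrumentation by monkey-patching, the decorator below records how long each call into a given layer takes. Everything here is hypothetical: `instrument`, `TIMINGS`, and `FakeCache` are illustration names, and real tools such as statsd clients or the products listed in this deck collect far more context.

```python
import functools
import time
from collections import defaultdict

# Durations in seconds, grouped by layer name ("cache", "db", ...).
TIMINGS = defaultdict(list)


def instrument(layer):
    """Return a decorator that records call durations under `layer`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            t0 = time.time()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the call raises.
                TIMINGS[layer].append(time.time() - t0)
        return wrapper
    return decorator


class FakeCache(object):
    """Hypothetical stand-in for a cache client."""

    def get(self, key):
        return None  # always a miss


# Monkey-patch: wrap the method without editing the class itself.
FakeCache.get = instrument("cache")(FakeCache.get)

value = FakeCache().get("front_page")
```

Tagging each measurement with the current URL or codepath is what lets the numbers be tied back to specific requests.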

30

DJANGO-DEBUG-TOOLBAR

31

TRACELYTICS/NEW RELIC

33

THANKS!
(any questions?)

• Multi-mechanize: testutils.org/multi-mechanize/

• Email me: dan.kuebrich@gmail.com
• My job: tracelytics.com

34

APPENDIX
• Multi-mechanize
• Download: testutils.org/multi-mechanize
• Development: github.com/cgoldberg/multi-mechanize
• Mechanize: wwwsearch.sourceforge.net/mechanize/
• Reddit
• Source: github.com/reddit/reddit
• Demo load test: github.com/dankosaur/reddit-loadtest
• RBU load testing
• BrowserMob: browsermob.com/performance-testing
• LoadStorm: loadstorm.com
• New, but promising: github.com/detro/ghostdriver

35

APPENDIX
• Machine monitoring:
• Ganglia: ganglia.sourceforge.net
• Munin: munin-monitoring.org
• Application monitoring:
• repoze.profile: docs.repoze.org/profile/
• Django-Debug-Toolbar: github.com/django-debug-toolbar/django-debug-toolbar
• Graphite: graphite.wikidot.com
• and statsd: github.com/etsy/statsd
• and django instrumentation: pypi.python.org/pypi/django-statsd/1.8.0
• Tracelytics: tracelytics.com

36

AND: TRACELYTICS FREE TRIAL

We’ve got a 14-day trial; with only
minutes to install, you’ll have plenty of
time for load testing. No credit card
required!

http://tracelytics.com

37
