Down the rabbit hole, profiling in Django
Down the rabbit hole


     Know your code

     Profiling Python
Get those hands dirty
So…
The car engine analogy
Python in that analogy
Scaling
Doing a lot of slow things at the same time
Profiling
An act of balancing
Types of profiling
• (code) profiling (CPU/IO-bound problems)
• memory profiling (obviously, memory-related problems)
(Code) profiling
Tools
•   cProfile
•   profile
•   hotshot (deprecated; it splits recording and analysis into separate steps)
•   line_profiler
•   trace
Getting dirty

import cProfile
cProfile.run('foo()')

Or

python -m cProfile myscript.py
More interactive


python -m cProfile myscript.py -o foo.profile

import pstats
p = pstats.Stats('foo.profile')

p.sort_stats('cumulative').print_stats(10)
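
A few more pstats calls are worth knowing; a minimal sketch using the stdlib pstats API (the 'fib' pattern is just an example name):

p.strip_dirs()                           # drop long directory prefixes from the report
p.sort_stats('tottime').print_stats(10)  # top 10 by time spent inside the function itself
p.print_callers('fib')                   # who calls the functions whose names match 'fib'?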
More complex
Source: http://www.doughellmann.com/PyMOTW/profile/

import profile
def fib(n):
    # from http://en.literateprograms.org/Fibonacci_numbers_(Python)
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

def fib_seq(n):
    seq = []
    if n > 0:
        seq.extend(fib_seq(n-1))
    seq.append(fib(n))
    return seq

print 'RAW'
print '=' * 80
profile.run('print fib_seq(20); print')
Output
$ python profile_fibonacci_raw.py
RAW
================================================================================
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]

        57356 function calls (66 primitive calls) in 0.746 CPU seconds

  Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       21    0.000    0.000    0.000    0.000 :0(append)
       20    0.000    0.000    0.000    0.000 :0(extend)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.000    0.000    0.744    0.744 <string>:1(<module>)
        1    0.000    0.000    0.746    0.746 profile:0(print fib_seq(20); print)
        0    0.000             0.000          profile:0(profiler)
 57291/21    0.743    0.000    0.743    0.035 profile_fibonacci_raw.py:13(fib)
     21/1    0.001    0.000    0.744    0.744 profile_fibonacci_raw.py:22(fib_seq)
Memoize!
@memoize
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)
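
The slide doesn't show the decorator itself. The __call__ entries in the profile below suggest a class-based cache; here is a minimal sketch (the internals are an assumption, not the deck's original code):

class memoize(object):
    """Cache a function's results, keyed by its positional arguments."""
    def __init__(self, func):
        self.func = func
        self.cache = {}

    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.func(*args)
        return self.cache[args]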
Yeah!
$ python profile_fibonacci_memoized.py
MEMOIZED
================================================================================
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]

        145 function calls (87 primitive calls) in 0.003 CPU seconds

  Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       21    0.000    0.000    0.000    0.000 :0(append)
       20    0.000    0.000    0.000    0.000 :0(extend)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.000    0.000    0.002    0.002 <string>:1(<module>)
        1    0.000    0.000    0.003    0.003 profile:0(print fib_seq(20); print)
        0    0.000             0.000          profile:0(profiler)
    59/21    0.001    0.000    0.001    0.000 profile_fibonacci.py:19(__call__)
       21    0.000    0.000    0.001    0.000 profile_fibonacci.py:26(fib)
     21/1    0.001    0.000    0.002    0.002 profile_fibonacci.py:36(fib_seq)
Line profiling
                                        Source: http://packages.python.org/line_profiler/
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   149                                           @profile
   150                                           def Proc2(IntParIO):
   151     50000        82003      1.6     13.5      IntLoc = IntParIO + 10
   152     50000        63162      1.3     10.4      while 1:
   153     50000        69065      1.4     11.4          if Char1Glob == 'A':
   154     50000        66354      1.3     10.9              IntLoc = IntLoc - 1
   155     50000        67263      1.3     11.1              IntParIO = IntLoc - IntGlob
   156     50000        65494      1.3     10.8              EnumLoc = Ident1
   157     50000        68001      1.4     11.2          if EnumLoc == Ident1:
   158     50000        63739      1.3     10.5              break
   159     50000        61575      1.2     10.1      return IntParIO
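
To get output like this, decorate the functions you care about with @profile and run the script through line_profiler's kernprof wrapper; a sketch (older releases installed the script as kernprof.py, so exact names may vary):

$ kernprof -l -v myscript.py                  # -l: line-by-line stats, -v: print them when done
$ python -m line_profiler myscript.py.lprof   # re-read a saved .lprof file later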
More complex == visualize


RunSnakeRun

KCachegrind

pyprof2calltree (workflow sketched below)
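
A typical workflow for getting cProfile data into KCachegrind, sketched from the pyprof2calltree docs (file names are placeholders):

$ python -m cProfile -o foo.profile myscript.py
$ pyprof2calltree -i foo.profile -o callgrind.foo   # convert to the calltree format
$ kcachegrind callgrind.foo                         # or: pyprof2calltree -i foo.profile -k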
What to look for?
• Things you didn’t expect ;)
• much time spent in one function
• lots of calls to the same function
Performance solutions
• caching
• getting stuff out of inner loops
• removing (debug) logging from hot code paths
Memory profiling


Django has a memory leak

    In debug mode ;)
Tools
• heapy (and pysizer)
• meliae (more like hotshot: dump first, analyze later)
Meliae!
• Memory issues tend to show up in production
  (long-lived processes). You want to grab the info
  there, then analyze it locally
• RunSnakeRun has support for it
Dumping
 Source: http://jam-bazaar.blogspot.com/2009/11/memory-debugging-with-meliae.html




from meliae import scanner
scanner.dump_all_objects('filename.json')
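
One way to grab a dump from a live, long-running process is a Unix signal handler; a minimal sketch (the signal plumbing is my own wiring, scanner.dump_all_objects() is the actual meliae call):

import os
import signal

from meliae import scanner

def dump_memory(signum, frame):
    # one dump file per process, so multiple workers don't clobber each other
    scanner.dump_all_objects('/tmp/memory-%d.json' % os.getpid())

signal.signal(signal.SIGUSR1, dump_memory)   # then trigger with: kill -USR1 <pid>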
Analyzing
>>> from meliae import loader
>>> om = loader.load('filename.json')
>>> s = om.summarize(); s

This dumps out something like:
Total 17916 objects, 96 types, Total size = 1.5MiB (1539583 bytes)
 Index   Count   %      Size   %  Cum     Max Kind
     0     701   3    546460  35   35   49292 dict
     1    7138  39    414639  26   62    4858 str
     2     208   1     94016   6   68     452 type
     3    1371   7     93228   6   74      68 code
     4    1431   7     85860   5   80      60 function
     5    1448   8     59808   3   84     280 tuple
     6     552   3     40760   2   86     684 list
     7      56   0     29152   1   88     596 StgDict
     8    2167  12     26004   1   90      12 int
     9     619   3     24760   1   91      40 wrapper_descriptor
    10     570   3     20520   1   93      36 builtin_function_or_method
   ...
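
From the referenced meliae posts, a couple of follow-up calls help attribute memory to the objects that own it (verify these against your meliae version):

>>> om.compute_parents()          # build back-references: who points at what
>>> om.collapse_instance_dicts()  # fold each instance's __dict__ into its owner
>>> s = om.summarize(); s         # summarize again with the collapsed view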
RunSnakeRun
Your production system


  A completely different story
What can you do to prevent this?
• From an API perspective:
  – Maybe run one WSGI process that is different
  – It gets only a small share of the requests (via load balancing)
  – That process takes care of the profiling
    (preferably hotshot-style: record cheaply, analyze later; see the sketch below)
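
As a sketch of that idea, you could wrap the designated process's WSGI app in a profiling middleware. The class below is hypothetical (not an existing library); only the cProfile calls are stdlib:

import cProfile
import os
import time

class ProfilingMiddleware(object):
    """Profile every request handled by this (single, designated) process."""

    def __init__(self, app, dump_dir='/tmp/profiles'):
        self.app = app
        self.dump_dir = dump_dir
        if not os.path.isdir(dump_dir):
            os.makedirs(dump_dir)

    def __call__(self, environ, start_response):
        profiler = cProfile.Profile()
        response = profiler.runcall(self.app, environ, start_response)
        # one .profile file per request; analyze later with pstats/RunSnakeRun
        path = os.path.join(self.dump_dir,
                            'request-%d-%.3f.profile' % (os.getpid(), time.time()))
        profiler.dump_stats(path)
        return response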
Or…
• Pycounters

• The author is in the house: Boaz Leskes
Finally
• You should know about this
• Part of your professional toolkit

• This should be in IDEs!
  – Komodo already has it; what about
    PyCharm? (can you blog this, Reinout? ;)
Questions?
Links
Articles:
http://www.doughellmann.com/PyMOTW/profile/
http://www.doughellmann.com/PyMOTW/trace/
http://jam-bazaar.blogspot.com/2010/08/step-by-step-meliae.html
https://code.djangoproject.com/wiki/ProfilingDjango

Videos:
http://www.youtube.com/watch?v=Iw9-GckD-gQ
http://blip.tv/pycon-us-videos-2009-2010-2011/introduction-to-python-profiling-1966784

Software:
http://www.vrplumber.com/programming/runsnakerun/
http://kcachegrind.sourceforge.net/html/Home.html
http://pypi.python.org/pypi/pyprof2calltree/
https://launchpad.net/meliae
http://pypi.python.org/pypi/line_profiler
http://pypi.python.org/pypi/Dozer


Editor's Notes

  • #4 Introduction: how many of you do profiling? Those who don't: who would say they do more complex projects? What do you do? How well do you know your code? I'm sorry, but you don't.
  • #6 Python has helped us tremendously to think on an abstract level about the problems we solve. I totally believe in being less worried about how code gets executed on a low level, because most of the time I just don't care (TM). Why don't I care? Even though I'm an engineer, I care about solving problems for people (and making a buck out of it). We do Recharted, which means our clients want to give their customers the best experience possible. That means we should return requested info in a timely manner. Ever been irritated waiting for Gmail? We had an issue (discussed in my logging presentation) with the SUDS SOAP library making a LOT of debug logging calls, which cost us 10-15 seconds on complex responses from a flight-booking SOAP API!
  • #7 Scalability vs. performance. We hear a lot about scaling, but sometimes we forget performance. Scalability means you can do the same thing for a lot of people, and that more people have only a small impact on your performance. But that still means you can have the same shitty baseline performance. Actually, it is not at all hard to scale shitty performance :)
  • #8 You can make time.sleep() scale very well (with the right server infrastructure, of course).
  • #9 So what is profiling? Basically, profiling is running your code in the interpreter in a way that records statistics during the actual run. (Yes, this has a performance impact, so you can't just do it in production; for that there are other ways.) Then you look at those statistics. This gives you a lot of insight into what happens. I know what happens in the local scope, but what actually happens from the moment a WSGI API request comes in until we deliver the response? You would be surprised how much happens in between. This is part of getting to know your code. And your code isn't just you: what about the libraries you use? The systems you interface with? That all has an impact on your performance.
  • #10 Profiling lets you zoom in on low-hanging fruit; ALWAYS balance the amount of work/code change against the relative win in performance. Most often you'll end up fixing the top two entries :)
  • #17 ncalls is the number of calls. tottime is the total time spent in the given function, excluding time spent in calls to sub-functions. percall is tottime divided by ncalls. cumtime is the total time spent in this and all subfunctions, from invocation until exit; this figure is accurate even for recursive functions. The second percall is cumtime divided by primitive calls. filename:lineno(function) identifies each function.
  • #26 Just because a language is garbage collected doesn't mean it can't leak memory: C modules can leak (harder to find); some globally available data structure can live and grow without you knowing (Django actually has a built-in memory leak while running in debug mode); and cyclic-reference funkiness can make the interpreter think memory cannot be released. Also (as with profiling), knowing the memory profile of your application helps. Maybe your application server instances are 64 MB; if a quarter of that is unnecessary stuff, you could run more instances on the same hardware! Leading towards faster world domination!
  • #32 So now you find out that your code behaves beautifully on your local machine. And then in production... borkbork.
  • #35 In conclusion: I think every serious Python developer should know about these things; it is part of your toolkit.