Using PyPy instead of Python for speed
“If you want your code to run faster, you should probably just use PyPy.”
— Guido van Rossum
@nbivald
niklas.bivald@enplore.com
PyPy is a fast (3), compliant (1) and stable (2) Python implementation with a JIT compiler
Since 2003 PyPy has received funding from the Python Software
Foundation, EU, Google and Mozilla
[Slide image: hexadecimal dump illustrating raw data]
Data ... in its rawest form
Parsers ... bespoke, in Python
Analytics
- Visualisation
- (Lots of) Statistics
- From classifications to regressions
$ python
Python 2.7.13 (default, Dec 18 2016, 07:03:39)
>>> print ("Hello World")
Hello World
>>>
>>> [x+1 for x in xrange(0,20, 2)]
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
>>>
$ python your-application.py
I'm a python application
$ pypy
Python 2.7.13 [PyPy 5.8.0 with GCC 4.2.1] on darwin
>>>> print ("Hello World")
Hello World
>>>>
>>>> [x+1 for x in xrange(0,20, 2)]
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
>>>>
$ pypy your-application.py
I'm a python application
Compliant
1
We've used PyPy in production since 2014, processing:


◦ Millions of flights
◦ Petabytes of data
◦ Thousands upon thousands of servers spun up
◦ Applications of 10,000+ lines of code
Stable
2
$ time python -c "
> l = 0
> for i in xrange(0,10000):
>     for ii in xrange(0, 10000):
>         l = i+ii
> print(l)
> "
19998
real 0m8.101s
user 0m7.999s
$ time pypy -c "
> l = 0
> for i in xrange(0,10000):
>     for ii in xrange(0, 10000):
>         l = i+ii
> print(l)
> "
19998
real 0m0.201s
user 0m0.174s
Fast
3
SYNTHETIC BENCHMARK
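The nested-loop benchmark can also be kept in a script and run unchanged under both interpreters. The following is a minimal sketch, not the code used for the slides: the function names `nested_loop` and `time_it` are hypothetical, and `range` replaces `xrange` only so that the script also runs on Python 3.

```python
import platform
import time


def nested_loop(n):
    """Do n * n integer additions, like the one-liner benchmark."""
    last = 0
    for i in range(n):
        for ii in range(n):
            last = i + ii
    return last


def time_it(func, *args):
    """Call func once and return (result, elapsed wall-clock seconds)."""
    # perf_counter exists on Python 3 only; fall back to time.time otherwise.
    timer = getattr(time, "perf_counter", time.time)
    start = timer()
    result = func(*args)
    return result, timer() - start


def report(n):
    """Return a one-line summary naming the interpreter that ran the loop."""
    result, elapsed = time_it(nested_loop, n)
    return "%s: result=%d in %.3fs" % (
        platform.python_implementation(), result, elapsed)
```

Calling `print(report(10000))` once with `python` and once with `pypy` gives one line per interpreter. `platform.python_implementation()` returns `"CPython"` or `"PyPy"`, which makes it obvious which one produced each timing.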
Each second of flight data is divided into 256 words of 12 bits each.
Each 12-bit word contains between 3 and 12 parameters. To extract them,
we iterate bit by bit over the continuous byte stream.
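The 12-bit word layout can be illustrated with a short sketch. This is not the production parser: the names `iter_words` and `extract_field` are hypothetical, the bit order (most significant bit first) is an assumption the slides do not state, and the sketch accumulates whole bytes instead of iterating literally bit by bit.

```python
def iter_words(data, word_bits=12):
    """Yield successive word_bits-wide unsigned integers from a byte stream.

    Assumes MSB-first packing with no padding between words.
    """
    acc = 0       # bit accumulator
    acc_bits = 0  # number of not-yet-emitted bits held in acc
    mask = (1 << word_bits) - 1
    for byte in bytearray(data):
        acc = (acc << 8) | byte
        acc_bits += 8
        while acc_bits >= word_bits:
            acc_bits -= word_bits
            yield (acc >> acc_bits) & mask
        acc &= (1 << acc_bits) - 1  # keep only the leftover low bits


def extract_field(word, offset, width, word_bits=12):
    """Return width bits of word, starting offset bits from the MSB."""
    shift = word_bits - offset - width
    return (word >> shift) & ((1 << width) - 1)
```

Under these assumptions, one second of data is 256 × 12 = 3072 bits, or 384 bytes, so `list(iter_words(frame))` on a 384-byte `frame` yields 256 words. Tight integer loops like this are the kind of CPU-bound work a JIT compiler tends to help with.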
$ pypy run-file.py
AircraftDfdrDataPoint: created a new dfdr datapoint
Task parse.blackbox.data succeeded in 18.82s
$ python run-file.py
AircraftDfdrDataPoint: created a new dfdr datapoint
Task parse.blackbox.data succeeded in 46.25s
Fast
3
REAL WORLD BENCHMARK
... the good
4
The Python we know ... and love
CPU-heavy tasks ... preferably > 1s
... the bad
5
No C ... unless cffi
No NumPy ... coming soon
No Py3 ... coming soon
... and the unsure
6
Web APIs? ... such as Flask
Questions?
@nbivald
niklas.bivald@enplore.com
