Async Web Frameworks in Python
Advantages and Disadvantages of Using Python’s Asynchronous Frameworks for Web Services
Ryan C. Johnson
https://www.linkedin.com/in/ryanjohnson5
Key Asynchronous Concepts
● The best introduction to the asynchronous model is by my friend and former
colleague Dave Peticolas (highly recommended):
http://krondo.com/an-introduction-to-asynchronous-programming-and-twisted/
● “The fundamental idea behind the asynchronous model is that an
asynchronous program, when faced with a task that would normally block in a
synchronous program, will instead execute some other task that can still make
progress.”
Key Asynchronous Concepts (cont.)
● Event-based
● Locus-of-control is the event loop
● I/O calls are non-blocking (rely on events to signal when data is
ready)
● Conditions for async outperforming sync (Dave Peticolas):
○ “The tasks perform lots of I/O, causing a synchronous program to
waste lots of time blocking when other tasks could be running.”
○ “There are a large number of tasks so there is likely always at least
one task that can make progress.”
○ “The tasks are largely independent from one another so there is little
need for inter-task communication (and thus for one task to wait upon
another).”
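As a rough illustration of these three conditions, the sketch below runs 100 independent, I/O-bound tasks on a single event loop. It uses the standard library's asyncio rather than Twisted, and `asyncio.sleep` stands in for a real non-blocking I/O call; the task count, the delay and the names `fake_io_task` and `run_all` are assumptions made for the example.

```python
import asyncio
import time


async def fake_io_task(task_id, delay=0.05):
    # asyncio.sleep stands in for a non-blocking I/O call (hypothetical workload).
    await asyncio.sleep(delay)
    return task_id


async def run_all(n_tasks=100):
    # The tasks are independent, so the event loop can interleave them freely.
    return await asyncio.gather(*(fake_io_task(i) for i in range(n_tasks)))


start = time.perf_counter()
results = asyncio.run(run_all())
elapsed = time.perf_counter() - start
print("completed %d tasks in %.2fs" % (len(results), elapsed))
```

Run one after another, these tasks would spend about 5 seconds waiting (100 x 0.05s); interleaved on the event loop, the total should be close to the delay of a single task.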
Example
Synchronous:
def get(...):
    ...
    results1 = io_call_1(...)
    results2 = io_call_2(...)
    ...
Asynchronous (Twisted):
@inlineCallbacks
def get(...):
    ...
    results1 = yield io_call_1(...)
    results2 = yield io_call_2(...)
    ...
The Typical Web Service Provides Ideal Conditions
● HTTP requests are handled independently from each other
● The handling of HTTP requests typically involves one or more I/O calls to
one or more databases or other web services, and in most real-world
cases, the majority of time is spent waiting on these I/O calls
● The service is constantly accepting requests or should be available for
accepting requests
Key advantages
● Efficiency
○ Handle an equivalent number of requests with fewer/smaller servers
compared to sync
○ Scalability limited by the number of open socket connections within a
single process vs. the number of concurrent threads/processes for
sync web frameworks (thousands to tens-of-thousands for async vs.
tens to hundreds for sync)
○ For I/O-bound workloads, a small server (in terms of CPU and memory)
running an async web service in a single process can match and often
outperform a larger server running a sync web service using tens to
hundreds of threads/processes
Key advantages (cont.)
● Able to handle large numbers of concurrent, long-lived requests
○ Only burn a socket, not a thread/process
○ This can be the determining factor in choosing async over sync
○ Allows efficient “push” functionality via web sockets, EventSource or
other long-lived connections
○ Gavin Roy at PyCon 2012 (then CTO of MyYearBook.com): “We do
more traffic and volume through this [Tornado] than the rest of our site
infrastructure combined...8 servers as opposed to 400-500.”
(http://pyvideo.org/video/720/more-than-just-a-pretty-web-framework-the-tornad)
Key disadvantage
Reasoning about shared state, and how it can change, is more complex in a
single async process than in a single sync process
● Must keep in mind that shared state can change between the moments of
yielding control to the event loop and returning control back to your code
Simple example
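from twisted.internet.defer import inlineCallbacks

shared = []

@inlineCallbacks
def get(self, id):
    shared.append(id)
    print('pre yield for get({id}): shared={shared}'.format(id=id, shared=shared))
    # Control returns to the event loop here; other requests may modify shared
    obj = yield async_get_from_db(id)
    print('post yield for get({id}): shared={shared}'.format(id=id, shared=shared))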
Possible sequence of events:
1. An incoming GET request is handled, calling get with id=1
2. Print pre yield for get(1): shared=[1]
3. Yield to the event loop after calling async_get_from_db(1)
4. While waiting for the result of async_get_from_db(1), the event loop handles the next request, calling get
with id=2
5. Print pre yield for get(2): shared=[1, 2]
6. Yield to the event loop after calling async_get_from_db(2)
7. While waiting for the result of async_get_from_db(2), the event loop sees that the result from
async_get_from_db(1) is ready, and returns control back to the “paused” execution of get(1)
8. Print post yield for get(1): shared=[1, 2] ← within the call to get(1) the shared state has
changed between the yield to the event loop and the return of the result
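The sequence above can be reproduced with a small, self-contained sketch. It uses the standard library's asyncio instead of Twisted so that it runs without third-party packages; `async_get_from_db` is a hypothetical stand-in that only sleeps, and the two delays are assumptions chosen so that request 1 finishes first.

```python
import asyncio

shared = []
log = []


async def async_get_from_db(id, delay):
    # Hypothetical stand-in for a database call; the delay simulates I/O latency.
    await asyncio.sleep(delay)
    return {"id": id}


async def get(id, delay):
    shared.append(id)
    log.append("pre yield for get({}): shared={}".format(id, list(shared)))
    obj = await async_get_from_db(id, delay)  # control returns to the event loop here
    log.append("post yield for get({}): shared={}".format(id, list(shared)))
    return obj


async def main():
    # Request 1 has the shorter simulated latency, so it resumes first.
    await asyncio.gather(get(1, 0.01), get(2, 0.05))


asyncio.run(main())
print("\n".join(log))
```

The third log entry is the one to watch: get(1) appended only its own id, yet it sees both ids once it resumes.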
Asynchronous Frameworks
● Implicit (yields to the event loop occur implicitly when an I/O call is made)
○ gevent
● Explicit (yields to the event loop controlled by the programmer)
○ Twisted
■ Cyclone (Tornado API on Twisted’s event loop)
■ Klein (Flask-like API on Twisted Web)
■ Tornado (in Twisted-compatibility mode)
○ Tornado
○ asyncio (Python 3.4+)
■ aiohttp
■ Tornado (running on the asyncio event loop)
Implicit Style - Advantages
● Coding syntax and style are the same as in synchronous code (when an I/O call is
made, control implicitly returns to the event loop to work on another request or event)
● Compatible with the huge ecosystem of popular synchronous Python
packages that perform I/O (e.g., SQLAlchemy)
○ This is a huge advantage over the explicit style
○ Assumes that the socket module is used for I/O, so when it is
monkey-patched (using gevent), you will no longer block on I/O calls but
instead yield control to the event loop
○ Python packages that perform I/O but don’t use the socket module can
still be used, but they will block on I/O
Implicit Style - Disadvantages
● Lack of explicit yielding syntax fails to indicate the points in the code where
shared state may change:
https://glyph.twistedmatrix.com/2014/02/unyielding.html
● Lack of control over yielding to the event loop prevents launching
multiple I/O calls before yielding (impossible to launch independent I/O
tasks in parallel)
○ In my opinion, this is the biggest disadvantage, but only if multiple and
independent I/O tasks could be performed
○ For example, the following is impossible to do using the implicit style:
@inlineCallbacks
def get(...):
    deferred1 = io_call_1(...)
    deferred2 = io_call_2(...)
    result1 = yield deferred1
    result2 = yield deferred2
Explicit Style - Advantages
● Explicit yielding syntax indicates points at which shared state may change
● Complete control over yielding to the event loop allows launching
multiple I/O calls before yielding (parallelism of independent I/O tasks)
○ In my opinion, this is the biggest advantage, but only if multiple and
independent I/O tasks could be performed
○ For example:
@inlineCallbacks
def get(...):
    deferred1 = io_call_1(...)
    deferred2 = io_call_2(...)
    result1 = yield deferred1
    result2 = yield deferred2
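To make the gain concrete, here is a minimal sketch of the same pattern using the standard library's asyncio in place of Twisted (a swap made so the example runs without third-party packages). The `io_call` helper, its 0.1s delay and the `timed` wrapper are assumptions for illustration; the point is only the difference between awaiting each call in turn and starting both calls before the first await.

```python
import asyncio
import time


async def io_call(name, delay):
    # Hypothetical I/O call; asyncio.sleep simulates waiting on a remote service.
    await asyncio.sleep(delay)
    return name


async def get_sequential():
    # Each call finishes before the next one starts: roughly 0.2s in total.
    result1 = await io_call("one", 0.1)
    result2 = await io_call("two", 0.1)
    return result1, result2


async def get_concurrent():
    # Both calls are started before the first await: roughly 0.1s in total.
    task1 = asyncio.create_task(io_call("one", 0.1))
    task2 = asyncio.create_task(io_call("two", 0.1))
    result1 = await task1
    result2 = await task2
    return result1, result2


def timed(coro_func):
    # Run a coroutine function to completion and measure the wall-clock time.
    start = time.perf_counter()
    result = asyncio.run(coro_func())
    return result, time.perf_counter() - start


seq_result, seq_elapsed = timed(get_sequential)
con_result, con_elapsed = timed(get_concurrent)
print("sequential: %.2fs, concurrent: %.2fs" % (seq_elapsed, con_elapsed))
```

Both versions return the same results; only the waiting overlaps in the second one, which is why the benefit depends on the I/O calls being independent.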
Explicit Style - Disadvantages
● Different syntax and coding style than synchronous code
● Only a much smaller and sometimes less mature ecosystem of Python
packages can be used
○ This is a huge disadvantage
○ For example, it precludes the use of SQLAlchemy
● Without Python 3.5’s async and await keywords or a generator-style
decorator like Twisted’s @inlineCallbacks or Tornado’s @coroutine, any
significant amount of code becomes an unreadable mess of callbacks
Generator-Based Decorators are Essential
def func():
    deferred = io_call()

    def on_result(result):
        print(result)

    deferred.addCallback(on_result)
    return deferred
becomes readable and very similar to the synchronous style:
@inlineCallbacks
def func():
    result = yield io_call()
    print(result)
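The same contrast can be sketched with asyncio and the async and await keywords mentioned earlier, with no Twisted dependency. Here `io_call` is a hypothetical function that returns a Future the event loop resolves after a short delay, and the `printed` list is only there to record what each style observed.

```python
import asyncio

printed = []


def io_call():
    # Hypothetical I/O call: returns a Future that the event loop resolves shortly.
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    loop.call_later(0.01, future.set_result, "data")
    return future


def func_with_callback():
    # Callback style: the result is handled in a nested function.
    future = io_call()

    def on_result(fut):
        printed.append("callback got " + fut.result())

    future.add_done_callback(on_result)
    return future


async def func_with_await():
    # Coroutine style: reads like the synchronous version.
    result = await io_call()
    printed.append("await got " + result)


async def main():
    await func_with_callback()
    await func_with_await()


asyncio.run(main())
print(printed)
```

With one callback the difference is small; it grows quickly once several dependent I/O calls have to be chained.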
Conclusions
● Large (10x to 100x) performance/efficiency gains for I/O-bound web
services
● Implicit style recommended for most cases (assuming the monkey-patch of
the socket module does the trick), as there is very little required to reap
the benefits (for example, simple configuration of uWSGI or gunicorn to
use gevent)
● Explicit style only recommended if the gains from launching multiple I/O
tasks in parallel outweigh the losses due to a smaller and sometimes
less-mature ecosystem of Python packages and a more complex coding
style
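As a sketch of how little configuration the implicit style can need, a gunicorn configuration file (which is plain Python) might look like the following. The file name, bind address and connection limit are assumptions for illustration, and the gevent package has to be installed alongside gunicorn for this worker class to load.

```python
# gunicorn.conf.py (hypothetical values; tune them for your service)
bind = "127.0.0.1:8000"      # address the service listens on
workers = 1                  # a single worker process
worker_class = "gevent"      # gevent-based workers instead of the default sync workers
worker_connections = 1000    # maximum simultaneous connections per worker
```

The service would then be started with something like `gunicorn -c gunicorn.conf.py myapp:app`, where `myapp:app` is a placeholder for your own WSGI application.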
