Building RESTful Web Services with CherryPy
Joseph Tate
Joseph.Tate@palemountain.com
http://bit.ly/OTRvDd
Background image © 2006 by Cosmic Kitty. Used under Creative Commons license.
Some Definitions: REST
REST - REpresentational State Transfer
● A style of remote access which, in contrast to RPC
protocols, manipulates resources by transferring state between
client and server instead of calling remote procedures.
● Does not define any encoding or structure. Does not define
ways of returning useful error messages.
● Uses HTTP "verbs" to perform state transfer operations.
● REST Resources are always uniquely identified by URL.
● REST IS NOT AN API; it's just an API transport layer.
Definitions: REST Examples
GET /items/
200 OK
A list of items available.
POST /items/
201 CREATED
Create a new item, and generate an ID for it.
GET /item/7/
200 OK
Retrieve a single item listed above.
PUT /item/7/
204 No Content
Write changes made on the client to the server.
DELETE /item/7/
204 No Content OR 202 Accepted
Delete the item on the server. (204 = deleted, 202 = marked for deletion)
Definitions: REST Cont'd.
What do you notice?
● REST is a set of conventions built on HTTP
○ For example, success and failure are represented as HTTP
status codes and messages.
● Resources are Unique: Addressing /item/7/ will always
return the item with ID 7.
● I like to keep index URLs plural (/items/) and retrieval URLs
singular (/item/7/).
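The conventions above can be sketched without any web framework. The following is a minimal in-memory illustration, not CherryPy code; the `ItemStore` class and its `handle` method are names invented for this sketch, and the returned tuples stand in for HTTP status codes and response bodies.

```python
import itertools


class ItemStore:
    """Maps (method, path) pairs onto CRUD+i operations over a dict."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def handle(self, method, path, body=None):
        parts = [p for p in path.split('/') if p]
        if parts == ['items']:                      # plural: index and create
            if method == 'GET':
                return 200, sorted(self._items)
            if method == 'POST':
                item_id = next(self._ids)
                self._items[item_id] = body
                return 201, '/item/%d/' % item_id   # URL of the new resource
            return 405, None
        if len(parts) == 2 and parts[0] == 'item' and parts[1].isdigit():
            item_id = int(parts[1])                 # singular: one resource
            if item_id not in self._items:
                return 404, None
            if method == 'GET':
                return 200, self._items[item_id]
            if method == 'PUT':
                self._items[item_id] = body
                return 204, None
            if method == 'DELETE':
                del self._items[item_id]
                return 204, None
            return 405, None
        return 404, None
```

Success and failure are both expressed purely as status codes here, which is the point of the convention: the client never needs a side channel to learn what happened.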
Some Definitions: RWS
RWS: RESTful Web Service
A web accessible API that uses REST principles.
○ Client is "resource" aware
○ Client makes state changes and propagates
them to the server
○ Defines encoding, encapsulation, error
handling, etc.
○ REST by itself is not an API. A RWS is, but
requires quite a bit of work to create.
Some Definitions: SOA
SOA: Service Oriented Architecture
An application architecture which breaks a
monolithic application into small, discrete, and
reusable pieces.
○ RWS is a key piece of many SOA.
○ SOA enables Rich Internet Applications,
multiple front ends, and third party
application integration.
Why CherryPy?
Zen: http://bit.ly/OAN0dC
● Robust, fast, SSL enabled, HTTP 1.1 Server. No
reverse proxying or mod_* required.
● Easy to get started
○ No external dependencies to use.
○ Extremely simple hello world.
● Very powerful extension architecture (more
later)
● Default decorators do not modify inputs or
outputs.
CherryPy Key Architecture
cherrypy.engine: Controls process
startup/teardown and event handling.
cherrypy.server: Configures and controls the WSGI
or HTTP server.
cherrypy.tools: A toolbox of utilities that are
orthogonal to processing an HTTP request.
CherryPy Hello World
import cherrypy

class HelloWorld(object):
    def index(self):
        return "Hello World!"
    index.exposed = True

cherrypy.quickstart(HelloWorld())
CherryPy Config
Confusing at first, but powerful.
● Config can be in files
○ cherrypy.config.update('cfg.ini')
● On server startup as a dictionary
○ cherrypy.tree.mount(Root(), '', cfgdict)
● On method handlers
○ Via decorators (@cherrypy.expose, etc.)
○ Via class/function attributes
■ _cp_config
● cherrypy.request.config is thread-local and request
specific; cherrypy.config holds the global settings.
CherryPy Config Example
# Enable JSON processing on input
class Root(object):
    @cherrypy.expose
    @cherrypy.tools.json_in()
    def index(self, urlparm1=None):
        data = cherrypy.request.json
        # etc.

    # Alternative configuration (instead of the two decorators)
    index.exposed = True
    index._cp_config = {
        'tools.json_in.on': True
    }
CherryPy Engine Plugins
cherrypy.engine is actually a publisher/subscriber
bus.
Plugins can run custom code at application startup,
teardown, exit, or at "engine intervals" by
subscribing to the appropriate events.
Example:
Create a scratch DB at server startup that is
destroyed at exit.
Engine Plugin Example
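import os
import sqlite3

import cherrypy
from cherrypy.process import plugins

class ScratchDB(plugins.SimplePlugin):
    def start(self):
        self.fname = 'myapp_%d.db' % os.getpid()
        self.db = sqlite3.connect(database=self.fname)
    start.priority = 80

    def stop(self):
        self.db.close()
        os.remove(self.fname)

cherrypy.engine.scratchdb = ScratchDB(cherrypy.engine)
cherrypy.engine.scratchdb.subscribe()  # register start/stop with the bus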
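The publisher/subscriber idea behind the engine can be illustrated with a toy bus. This is a framework-independent sketch, not CherryPy's implementation: `MiniBus` and `ScratchFile` are hypothetical names, and the priority handling (lowest number runs first) is an assumption modelled on the `start.priority = 80` line above.

```python
class MiniBus:
    """Toy publish/subscribe bus; listeners run in ascending priority."""

    def __init__(self):
        self._listeners = {}

    def subscribe(self, channel, callback, priority=50):
        self._listeners.setdefault(channel, []).append((priority, callback))

    def publish(self, channel, *args):
        # sorted() is stable, so equal priorities keep subscription order.
        ordered = sorted(self._listeners.get(channel, []),
                         key=lambda pair: pair[0])
        return [callback(*args) for _, callback in ordered]


class ScratchFile:
    """Hypothetical plugin: note a resource at start, release it at stop."""

    def __init__(self, bus, log):
        self.bus, self.log = bus, log

    def subscribe(self):
        self.bus.subscribe('start', self.start, priority=80)
        self.bus.subscribe('stop', self.stop)

    def start(self):
        self.log.append('created')

    def stop(self):
        self.log.append('removed')


log = []
bus = MiniBus()
ScratchFile(bus, log).subscribe()
bus.subscribe('start', lambda: log.append('early listener'), priority=10)
bus.publish('start')
bus.publish('stop')
```

The plugin never calls the server directly; it only reacts to channels, which is what lets startup and teardown code be added or removed without touching the application.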
CherryPy Tools
Most Python frameworks use decorators to enable
features and dispatching
● Problematic unit testing
● Difficult to remember syntax
● Changing application configuration means
changing CODE!
YUCK!
CherryPy Tools Cont'd
CherryPy uses config to change application
configuration by enabling/disabling tools.
● Tools execute code based on request hooks:
○ https://cherrypy.readthedocs.io/en/latest/extend.html#per-request-functions
○ tools.TOOL_NAME.on = True enables
a tool
○ tools.TOOL_NAME.<parameter> =
<value> sets parameter values.
Tool Example
def authorize_all():
    cherrypy.request.authorized = 'authorize_all'

cherrypy.tools.authorize_all = cherrypy.Tool('before_handler',
    authorize_all, priority=11)

def is_authorized():
    if not cherrypy.request.authorized:
        raise cherrypy.HTTPError("403 Forbidden",
            ','.join(cherrypy.request.unauthorized_reasons))

cherrypy.tools.is_authorized = cherrypy.Tool('before_handler',
    is_authorized, priority=49)

cherrypy.config.update({
    'tools.is_authorized.on': True,
    'tools.authorize_all.on': True
})
● REST
● Authentication
● Authorization
● Structure
● Encapsulation
● Error Handling
REST
Usually when thinking about REST you think about
CRUD+i (create, retrieve, update, delete, plus index)
In CherryPy REST is handled via a paired class setup
● Class 1 handles indexing/cataloguing and item
creation
○ GET /items/, POST /items/
● Class 2 handles retrieving, updating, and deleting
single items
○ GET /item/6/, PUT /item/6/, DELETE /item/6/
REST Cont'd
Optimizations
● Class 1 can be extended to handle batch
operations like a bulk delete
● Class 2 could grant access to individual fields for
updating or special state changes
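The paired-class idea relies on dispatching by HTTP verb: the handler is simply the method whose name matches the request method. The sketch below imitates that lookup in plain Python. It is an illustration under assumptions, not CherryPy's dispatcher; `dispatch`, `MethodNotAllowed`, `ItemIndex`, and `Item` are names made up for this example.

```python
class MethodNotAllowed(Exception):
    """Stands in for an HTTP 405 response."""


def dispatch(resource, method, *vpath, **params):
    """Call the handler named after the HTTP verb, if the resource has one."""
    handler = getattr(resource, method.upper(), None)
    if not getattr(resource, 'exposed', False) or not callable(handler):
        raise MethodNotAllowed(method)
    return handler(*vpath, **params)


class ItemIndex:
    """Class 1: index and creation."""
    exposed = True

    def __init__(self, items):
        self.items = items

    def GET(self):
        return sorted(self.items)

    def POST(self, **fields):
        new_id = max(self.items, default=0) + 1
        self.items[new_id] = fields
        return new_id


class Item:
    """Class 2: retrieve, update, and delete a single item."""
    exposed = True

    def __init__(self, items):
        self.items = items

    def GET(self, item_id):
        return self.items[int(item_id)]

    def PUT(self, item_id, **fields):
        self.items[int(item_id)].update(fields)

    def DELETE(self, item_id):
        del self.items[int(item_id)]


items = {}
index, item = ItemIndex(items), Item(items)
new_id = dispatch(index, 'POST', name='widget')
```

A verb the class does not define (for example DELETE on the index) fails cleanly, which is how unsupported operations fall out of the design for free.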
CherryPy REST (Class 1)
class ItemIndexREST(object):
    exposed = True

    @cherrypy.tools.json_out()
    def GET(self, dsid=None):
        # Return an index of items (DON'T actually do this)
        return []

    @cherrypy.tools.json_in()
    @cherrypy.tools.authorize_all()  # A registration method
    def POST(self, login=False):
        # Create the item and generate a URL to identify it.
        # `actor` stands for the newly created object (creation not shown).
        cherrypy.response.headers['Location'] = \
            self._entity_url(actor)
        cherrypy.response.status = 201
CherryPy REST (Class 2)
class ItemREST(object):
    exposed = True

    @cherrypy.tools.json_out()
    @cherrypy.tools.authorize_self()
    def GET(self, *vpath):
        item = retrieve(vpath[0])
        return item.asDict()

    @cherrypy.tools.json_in()
    @cherrypy.tools.authorize_self()
    def PUT(self, *vpath):
        item = retrieve(vpath[0])
        # Do work to save the current state
        cherrypy.response.headers['Location'] = \
            path_to_object(item)
        cherrypy.response.status = 204
CherryPy REST (Assembly)
RESTopts = {
    'tools.SASessionTool.on': True,
    'tools.SASessionTool.engine': model.engine,
    'tools.SASessionTool.scoped_session': model.DBSession,
    'tools.authenticate.on': True,
    'tools.is_authorized.on': True,
    'tools.authorize_admin.on': True,
    'tools.json_out.handler': json.json_handler,
    'tools.json_in.processor': json.json_processor,
    'request.dispatch': cherrypy.dispatch.MethodDispatcher()
}
app = cherrypy.tree.mount(actor.ItemREST(), '/item',
                          {'/': RESTopts})
app = cherrypy.tree.mount(actor.ItemIndexREST(), '/items',
                          {'/': RESTopts})
app.merge(cfile)
Identification
Unless you're providing an anonymous service, it's
important to know WHO or WHAT is accessing your
service.
Build tools to handle each authentication method,
e.g., OpenID, tokens, Basic Auth, cookies, etc.
Lots of free tools at http://tools.cherrypy.org/
(Defunct)
Authn Tool Examples
def authenticate():
    if (not hasattr(cherrypy.request, 'user') or
            cherrypy.request.user is None):
        # < Do stuff to look up your users >
        # This only authenticates. Authz must be handled separately.
        cherrypy.request.authorized = False
        cherrypy.request.unauthorized_reasons = []
        cherrypy.request.authorization_queries = []

cherrypy.tools.authenticate = \
    cherrypy.Tool('before_handler', authenticate,
                  priority=10)
Authorization
● To keep your sanity, limit via URI
○ Morphing objects by user token leads to the dark side
○ If users get different views of the same resource, prefer
/user/5/item/7/ to /item/7/.
● Make sure your authorization checking routines
are FAST
● Explicit authorization is better than implicit in code, but
harder to manage.
● Because of tools, we can add as many authz
routines as we need per given handler
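Limiting access by URI keeps the check cheap, because the owner can be read straight from the path without loading the resource. The following is a minimal sketch of that idea, assuming the /user/<id>/... layout suggested above; `owner_of` and `authorize_owner` are hypothetical helper names, not part of CherryPy.

```python
import re

# Compiled once at import time so each request only pays for a match.
_OWNER_PATH = re.compile(r'^/user/(?P<user_id>\d+)/')


def owner_of(path):
    """Return the user ID embedded in a /user/<id>/... path, else None."""
    match = _OWNER_PATH.match(path)
    return int(match.group('user_id')) if match else None


def authorize_owner(path, current_user_id):
    """True only when the path is scoped to the requesting user."""
    owner = owner_of(path)
    return owner is not None and owner == current_user_id
```

A function like this could be wrapped in a tool in the same way as the other authorization routines, setting the request's authorized flag instead of returning a boolean.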
Authz Example
def authorize_all():
    cherrypy.request.authorized = 'authorize_all'

cherrypy.tools.authorize_all = cherrypy.Tool('before_handler',
    authorize_all, priority=11)

def is_authorized():
    if not cherrypy.request.authorized:
        raise cherrypy.HTTPError("403 Forbidden",
            ','.join(cherrypy.request.unauthorized_reasons))

cherrypy.tools.is_authorized = cherrypy.Tool('before_handler',
    is_authorized, priority=49)

cherrypy.config.update({
    'tools.is_authorized.on': True,
    'tools.authorize_all.on': True
})
Structure
Spend time mapping out your
URL tree.
Can you auto discover the API?
Does it make sense?
Are your URLs really universal?
Encapsulation
● Pick something lightweight
● Human readable
● FAST
● Accessible to various clients
Typical choices are XML and, increasingly, JSON.
Encapsulation Cont'd
Generic Envelopes make for intuitive APIs.
● XML has ATOM
● JSON has Shoji
● Thrift
● Pickle?
Encapsulation Cont'd
Think about CRUD+i
What needs encapsulating?
● i - Need a list of items
○ Full items? Just pointers?
● R - Retrieval of a single item
● C, U, and D all have no result
Encapsulation via Shoji
http://www.aminus.org/rbre/shoji/shoji-draft-02.txt
Draft JSON encapsulation format mimicking the ATOM XML
protocol.
Shoji defines three types of envelopes.
Shoji Catalogs
Indexing is handled by catalogs.
{
    "element": "shoji:catalog",
    "self": "http://example.org/users",
    "entities": ["1"]
}
Shoji Catalogs Cont'd
Catalogs can have child catalogs, entities, and values
{"element": "shoji:catalog",
 "self": "http://example.org/users",
 "title": "Users Catalog",
 "description": "The set of user entities for this application.",
 "updated": "#2003-12-13T18:30:02Z#",
 "catalogs": {"bills": "bills",
              "sellers": "sellers",
              "sellers by sold count": "sellers{?sold_count}"
             },
 "entities": ["1", "2", "88374", "9843"],
 "views": {"Sold Counts": "sold_counts"}
}
Shoji Entities
Entities are the individual item envelopes
{
    "element": "shoji:entity",
    "self": "http://example.org/users/1",
    "body": {
        "last_modified": "2003-12-13 18:30:02Z",
        "first_name": "Katsuhiro",
        "last_name": "Shoji",
        "sold_count": 387
    }
}
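Producing these envelopes takes very little code. The sketch below builds a catalog and an entity using only the keys shown in the examples above; the `catalog` and `entity` helper names are invented here, and the Shoji draft should be consulted for the full set of fields.

```python
import json


def catalog(self_url, entity_ids):
    """Build a catalog envelope listing entity URLs relative to self_url."""
    return {
        'element': 'shoji:catalog',
        'self': self_url,
        'entities': [str(entity_id) for entity_id in entity_ids],
    }


def entity(self_url, body):
    """Wrap a single item's fields in an entity envelope."""
    return {'element': 'shoji:entity', 'self': self_url, 'body': dict(body)}


users = {1: {'first_name': 'Katsuhiro', 'last_name': 'Shoji',
             'sold_count': 387}}
base = 'http://example.org/users'
index_doc = json.dumps(catalog(base, users))
item_doc = json.dumps(entity('%s/%d' % (base, 1), users[1]))
```

Keeping the envelope construction in two small functions means every handler returns the same shape, which is what makes the API predictable for clients.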
Shoji Views
Item members are presented as views.
● Views are unstructured data.
● Integers, strings, lists, etc.
● Shortcuts for modifying the Entity object
Error Handling
● Even if you choose Shoji as your data encapsulation you have
to think about how errors should be handled.
● What can be handled via HTTP error codes?
○ 418 I'm a teapot (RFC 2324)
○ 400 Bad Request
○ 403 Forbidden
Error Handling Cont'd
What about validation errors?
Database errors?
Other application errors?
HTTP 500. Return a response body!
Error Handling Example
import cherrypy
import json

def error_page_default(status, message, traceback, version):
    ret = {
        'status': status,
        'version': version,
        'message': [message],
        'traceback': traceback
    }
    return json.dumps(ret)

class Root:
    _cp_config = {'error_page.default': error_page_default}

    @cherrypy.expose
    def index(self):
        raise cherrypy.HTTPError(500, "This is an error")

cherrypy.quickstart(Root())
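For validation and other application errors, one possible approach is a single function that translates exceptions into a status code plus a JSON body. This is a framework-independent sketch: `ValidationError` and `error_response` are hypothetical names, and the body reuses the 'status' and 'message' keys from the example above.

```python
import json


class ValidationError(Exception):
    """Hypothetical application error carrying per-field messages."""

    def __init__(self, errors):
        super().__init__('validation failed')
        self.errors = errors


def error_response(exc):
    """Translate an exception into (status, JSON body)."""
    if isinstance(exc, ValidationError):
        status = 400
        messages = ['%s: %s' % pair for pair in sorted(exc.errors.items())]
    elif isinstance(exc, PermissionError):
        status, messages = 403, [str(exc) or 'Forbidden']
    else:
        # Avoid leaking internals; log the real exception server-side.
        status, messages = 500, ['Internal Server Error']
    return status, json.dumps({'status': status, 'message': messages})
```

Because 'message' is always a list, a client can display one validation problem or many with the same code path.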
Other considerations?
● Do you offer client libraries?
○ You should probably do so if
■ You have special authentication
■ You have non-trivial encapsulation
● Do you publish your API? Keep it private?
● Caching?
● Sub resources?
● Adding links between entities?
Conclusions
● RWS and, by extension, SOA let you simplify backend
development
● They enable architectures such as SOFEA, SOUI, and RIA
● Keep your API discoverable, understandable, and clean.
● Easy to do in Python via CherryPy because of tools and
plugins