Advanced Task Management in Celery


           Mahendra M
           @mahendra
    https://github.com/mahendra
@mahendra
●   Python developer for 6 years
●   FOSS enthusiast/volunteer for 14 years
    ●   Bangalore LUG and Infosys LUG
    ●   FOSS.in and LinuxBangalore/200x
●   Celery user for 3 years
●   Contributions
    ●   patches, testing new releases
    ●   Zookeeper msg transport for kombu
    ●   Kafka support (in-progress)
Quick Intro to Celery
●   Asynchronous task/job queue
●   Uses distributed message passing
●   Tasks are run asynchronously on worker nodes
●   Results are passed back to the caller (if any)
Overview

                       +--> Worker 1
                       |
    Sender --> Msg Q --+--> Worker 2
                       |      ...
                       +--> Worker N
Sample Code
from celery.task import task


@task
def add(x, y):
    return x + y


# Queue the task on a worker, then block until the result arrives
result = add.delay(5, 6)
result.get()   # returns 11
Uses of Celery
●   Asynchronous task processing
●   Handling long-running / heavy jobs
    ●   Image resizing, video transcoding, PDF generation
●   Offloading heavy web backend operations
●   Scheduling tasks to be run at a particular time
    ●   Cron for Python (a minimal sketch follows)
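
A sketch of the "cron" use case, assuming the same era of Celery as the
other examples (celery.task.periodic_task); the task name and schedule are
illustrative, and the celerybeat scheduler must be running:

from celery.schedules import crontab
from celery.task import periodic_task


# Runs every day at 07:30, dispatched by celerybeat
@periodic_task(run_every=crontab(hour=7, minute=30))
def nightly_report():
    ...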
Advanced Uses
●   Task Routing
●   Task retries, timeouts and revoking
●   Task Canvas – combining tasks
    ●   Task co-ordination
    ●   Dependencies
    ●   Task trees or graphs
    ●   Batch tasks
    ●   Progress monitoring
●   Tricks
    ●   DB conflict management
Sending tasks to a particular worker

                      +-- windows --> Worker 1 (Windows)
    Sender --> Msg Q -+-- windows --> Worker 2 (Windows)
                      |     ...
                      +-- linux ----> Worker N (Linux)
Routing tasks – Use cases
●   Priority execution
●   Based on hardware capabilities
    ●   Special cards available for video capture
    ●   Making use of GPUs (CUDA)
●   Based on OS (e.g. PlayReady encryption)
●   Based on location
    ●   Moving compute closer to data (Hadoop-ish)
    ●   Sending tasks to different data centers
●   Sequencing operations (CouchDB conflicts)
Sample Code
from celery.task import task


@task(queue='windows')
def drm_encrypt(audio_file, key_phrase):
    ...


r = drm_encrypt.apply_async(args=[afile, key],
                            queue='windows')


# Start a celery worker that consumes only the 'windows' queue
$ celery worker -Q windows
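
Routing can also be configured centrally instead of per call. A minimal
sketch, assuming the Celery 3.x CELERY_ROUTES setting (the module path
'tasks.drm_encrypt' is illustrative):

# In the Celery configuration module
CELERY_ROUTES = {
    'tasks.drm_encrypt': {'queue': 'windows'},
}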
Retrying tasks
@task(default_retry_delay=60,
      max_retries=3)
def drm_encrypt(audio_file, key_phrase):
    try:
        playready.encrypt(...)
    except Exception, exc:
        raise drm_encrypt.retry(exc=exc, countdown=5)
Retrying tasks
●   You can specify the number of times a task can
    be retried
●   The cases for retrying a task must be handled in
    code; Celery will not do it automatically
●   Tasks should be designed to be idempotent
    (see the sketch below)
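
A sketch of idempotent task design: re-running after a partial failure
must not repeat completed work. The get_job / mark_encrypted helpers are
hypothetical; only the retry pattern comes from the slides above:

from celery.task import task


@task(max_retries=3)
def drm_encrypt(audio_file, key_phrase):
    job = get_job(audio_file)       # hypothetical job record
    if job.already_encrypted:       # work already done?
        return job.output_path      # then a retry is a no-op
    try:
        output = playready.encrypt(audio_file, key_phrase)
        job.mark_encrypted(output)  # record completion
        return output
    except Exception, exc:
        raise drm_encrypt.retry(exc=exc, countdown=5)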
Handling worker failures
@task(acks_late=True)
def drm_encrypt(audio_file, key_phrase):
    try:
        playready.encrypt(...)
    except Exception, exc:
        raise drm_encrypt.retry(exc=exc, countdown=5)



●   This is used where the task must be re-sent in case of
    worker or node failure
●   The ack message is sent to the message queue only after
    the task finishes executing (config sketch below)
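
Late acking can also be turned on globally. A one-line sketch, assuming
the Celery 3.x setting name:

# In the Celery configuration module: ack after execution for all tasks
CELERY_ACKS_LATE = True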
Worker processes

                      +-- windows --> Worker 1 (Windows)
    Sender --> Msg Q -+-- windows --> Worker 2 (Windows)
                      |     ...
                      +-- linux ----> Worker N (Linux)
                                        +-- Process 1
                                        +-- Process 2
                                              ...
                                        +-- Process N
Worker process
●   On every worker node, celery starts a pool of
    worker processes
●   The number is determined by the concurrency
    setting (or autodetected, for full CPU usage)
●   Each process can be configured to restart
    after running x number of tasks
    ●   Disabled by default
●   Alternatively, eventlet can be used instead of
    processes (discussed later; CLI sketch below)
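
A sketch of the corresponding worker options, assuming Celery 3.x flag
spellings (they changed in later releases):

# Fixed pool of 8 worker processes
$ celery worker --concurrency=8

# Recycle each pool process after 100 tasks
$ celery worker --maxtasksperchild=100

# Use an eventlet pool instead of processes
$ celery worker -P eventlet --concurrency=1000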
Revoking tasks
celery.control.revoke(task_id,
                      terminate=False,
                      signal='SIGKILL')
●   revoke() works by sending a broadcast
    message to all workers
●   If a task has not yet run, workers will keep this
    task_id in memory and ensure that it does not
    run
●   If a task is running, revoke() will not work
    unless terminate = True (example below)
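
The same broadcast can be issued from an AsyncResult handle. A minimal
sketch (the task invocation is illustrative):

r = drm_encrypt.apply_async(args=[afile, key])

# Revoke before it runs: workers remember the id and skip it
r.revoke()

# Forcefully stop it even if it is already executing
r.revoke(terminate=True, signal='SIGKILL')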
Task expiration
task.apply_async(expires=x)
        x can be
        * a number of seconds
        * a specific datetime()


●   Global time limits can be configured in settings
    (sketch below)
    ●   Soft time limit – the task receives an exception
        which can be used to clean up
    ●   Hard time limit – the process running the task is
        killed and replaced with another one
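
A sketch of both mechanisms, assuming Celery 3.x setting names:

from datetime import datetime, timedelta

# Discard the task if it has not started within 60 seconds
drm_encrypt.apply_async(args=[afile, key], expires=60)

# ... or by an absolute deadline
drm_encrypt.apply_async(args=[afile, key],
                        expires=datetime.utcnow() + timedelta(hours=1))

# In the Celery configuration module (values in seconds)
CELERYD_TASK_SOFT_TIME_LIMIT = 300   # raises SoftTimeLimitExceeded
CELERYD_TASK_TIME_LIMIT = 360        # kills the pool process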
Handling soft time limit
from celery.exceptions import SoftTimeLimitExceeded


@task()
def drm_encrypt(audio_file, key_phrase):
    try:
        setup_tmp_files()
        playready.encrypt(...)
    except SoftTimeLimitExceeded:
        cleanup_tmp_files()
    except Exception, exc:
        raise drm_encrypt.retry(exc=exc, countdown=5)
Task Canvas
●   Chains – linking one task to another
●   Groups – execute several tasks in parallel
●   Chord – execute a task after a set of tasks has
    finished
●   Map and starmap – similar to the map() function
●   Chunks – divide an iterable of work into chunks
●   Chunks + chord/chain can be used for map-
    reduce
                Best shown in a demo (sketch below)
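
In lieu of the demo, a sketch of the canvas primitives, assuming Celery
3.x signatures (the add and tsum tasks are illustrative):

from celery import chain, group, chord

# Chain: add(2, 2), then add 4 to the result, then add 8
chain(add.s(2, 2), add.s(4), add.s(8))()

# Group: run ten adds in parallel
group(add.s(i, i) for i in range(10))()

# Chord: run tsum once every add in the group has finished
chord(add.s(i, i) for i in range(10))(tsum.s())

# Chunks: split 100 pairs of arguments into 10 tasks of 10 pairs each
add.chunks(zip(range(100), range(100)), 10)()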
Task trees

[ task 1 ] --- spawns --- [ task 2 ] ---- spawns -->   [ task 2_1 ]
                  |                                    [ task 2_3 ]
                  |
                  +------ [ task 3 ] ---- spawns -->   [ task 3_1 ]
                  |                                    [ task 3_2 ]
                  |
                  +------ [ task 4 ] ---- links ---> [ task 5 ]
                                                         |(spawns)
                                                         |
                                                         |
                          [ task 8 ] <--- links <--- [ task 6 ]
                                                         |(spawns)
                                                     [ task 7 ]
Task Trees
●   Home-grown solution (our current approach)
    ●   Use db models and keep track of trees
●   Better approach (a stock-Celery sketch follows)
    ●   Use celery-tasktree
    ●   http://pypi.python.org/pypi/celery-tasktree
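
Without the extra dependency, small trees can also be expressed with the
stock link= option from the canvas. A minimal sketch (task_4, task_5 and
task_6 are the illustrative names from the tree diagram above):

sig5 = task_5.s()
sig5.link(task_6.s())           # task_6 runs after task_5 succeeds
task_4.apply_async(link=sig5)   # task_5 runs after task_4 succeeds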
Celery Batches
●   Collect jobs and execute them in a batch
●   Can be used for stats collection
●   Batch execution is done once
    ●   a configured timeout is reached, OR
    ●   a configured number of tasks has been received
●   Useful for reducing network and db loads
Celery Batches
from celery.contrib.batches import Batches


@task(base=Batches, flush_every=50, flush_interval=10)
def collect_stats(requests):
    items = {}
    for request in requests:
        item_id = request.kwargs['item_id']
        items[item_id] = get_obj(item_id)
        items[item_id].count += 1
    # Sync to db


collect_stats.delay(item_id=45)
collect_stats.delay(item_id=57)
Celery monitoring
●   Celery Flower
    https://github.com/mher/flower
●   Django admin monitor
●   jobtastic
    http://pypi.python.org/pypi/jobtastic
Celery deployment
●   Cyme – celery instance manager
    https://github.com/celery/cyme
●   Celery autoscaling
●   Use the celery eventlet pool where required
