Dynamic Scheduling - Federated Clusters in Mesos
Aaron Carey
Production Engineer - ILM London
acarey@ilm.com
Federated Clusters in Mesos
Why?
Who wins?
Why?
Sites in 3 time zones
Need to share render resources
Went through a project to prepare for cloud burst rendering
Renders mostly come at night (mostly)
What happens when our farm is full?
Can we burst to our other locations?
Approaches
Huawei Design
Led by the master, using a gossip protocol
Includes policy model
Master decides if a framework gets an offer
Master is in control
Based on two master plugins, consul deployment, gossip protocol
https://www.youtube.com/watch?v=kqyVQzwwD5E
http://www.slideshare.net/mKrishnaKumar1/federated-mesos-clusters-for-global-data-center-designs
Our hack design
Needs to be simple
Decisions made in the framework
Framework connects to all masters
Masters don’t care about each other
We don’t need a policy engine
Keep code out of the Master
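A minimal sketch of the "one framework, many masters" idea, assuming the legacy Mesos Python bindings (mesos.interface / mesos.native); the master endpoints and framework details are illustrative, not ILM's actual setup:

```python
from mesos.interface import Scheduler, mesos_pb2
from mesos.native import MesosSchedulerDriver


class FederatedScheduler(Scheduler):
    def resourceOffers(self, driver, offers):
        # Offers arrive independently from every master; all federation
        # logic (the penalty model below) lives here in the framework.
        for offer in offers:
            driver.declineOffer(offer.id)  # placeholder decision


# Hypothetical endpoints for three sites; the masters never talk to
# each other.
MASTERS = [
    "zk://mesos-lon:2181/mesos",
    "zk://mesos-sf:2181/mesos",
    "zk://mesos-sing:2181/mesos",
]

framework = mesos_pb2.FrameworkInfo(user="render", name="federated-demo")
scheduler = FederatedScheduler()

# One driver per master, so the framework registers with all of them.
drivers = [MesosSchedulerDriver(scheduler, framework, m) for m in MASTERS]
for d in drivers:
    d.start()  # non-blocking; each driver keeps its own master session
```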
Diversion...
A note on scheduling...
Historically, schedulers in VFX are tyrannical micromanagers
Full knowledge of the whole cluster and all tasks allows better-informed decisions
In Mesos you only know what the Master tells you
No knowledge of other frameworks
At the mercy of the Master
Offers only deal in the present
We could hoard all offers we get, but we want to play nice
We don’t know if a better offer is just around the corner
Making dynamic scheduling decisions...
Can we intelligently schedule tasks without knowing the whole cluster state?
Schedule penalty
Every datacentre has a penalty for scheduling a task
Golf rules: the lowest score wins
Penalty = Interactive Penalty + Data Penalty + Utilisation Penalty
Interactive Penalty
Framework regularly checks current latency to connected datacentres
Lo = maximum latency for interactive applications (around 35ms)
Lm = latency for datacentre m
I = 0 for non-interactive tasks, 1 for interactive
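The slide gives the inputs but the exact formula was on the slide image; one plausible reading, sketched here as an assumption, is that interactive tasks (I = 1) pay in proportion to how far the measured latency Lm exceeds the interactive ceiling Lo:

```python
L_O = 35.0  # ms, maximum tolerable latency for interactive applications

def interactivity_penalty(interactive: bool, latency_ms: float) -> float:
    # Non-interactive tasks (I = 0) pay nothing; interactive tasks are
    # penalised for every millisecond of latency above the ceiling.
    i = 1.0 if interactive else 0.0
    return i * max(0.0, latency_ms - L_O)
```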
Data Penalty
Penalty = (Total Input Data Required - Input Data Already at Location) / Bandwidth
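The data term as written is effectively the time needed to transfer the input data still missing at a location; a direct translation:

```python
def data_penalty(total_input_bytes: float,
                 bytes_at_location: float,
                 bandwidth_bytes_per_s: float) -> float:
    # (Total Input Data Required - Input Data Already at Location) / Bandwidth
    missing = max(0.0, total_input_bytes - bytes_at_location)
    return missing / bandwidth_bytes_per_s
```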
Utilisation Penalty
Framework checks current utilisation of datacentres
Utarget = target utilisation of datacentre (e.g. 95%)
Um = utilisation of datacentre m
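The slide defines Utarget and Um but not how they combine; one simple form consistent with the intent, assumed here, penalises any datacentre running at or above its target utilisation:

```python
def utilisation_penalty(u_m: float, u_target: float = 0.95) -> float:
    # Zero penalty while below the target utilisation; grows linearly
    # once the datacentre runs at or above it.
    return max(0.0, u_m - u_target)
```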
Time Penalty
Optional
Penalty decreases based on length of time in the queue
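A sketch of the optional ageing term: the longer a task waits, the more its total penalty is discounted, so queued work eventually dispatches even to an imperfect datacentre. The decay rate is an illustrative choice, not from the talk:

```python
def time_discount(queue_time_s: float,
                  rate_per_s: float = 1.0 / 3600.0) -> float:
    # Discount subtracted from the total penalty as queue time grows.
    return queue_time_s * rate_per_s
```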
Putting it together
Set a cost threshold above which jobs don’t run
Tasks will get dispatched to the datacentre with the lowest cost
Thresholding can ensure jobs wait for optimum resources without consuming all offers
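Putting the rule into code: score every datacentre, apply the optional time discount, dispatch to the cheapest, and hold the task if even the best score is over the threshold. Names and the threshold value are illustrative:

```python
from typing import Dict, Optional

THRESHOLD = 100.0  # cost above which a job waits rather than runs

def choose_datacentre(penalties: Dict[str, float],
                      discount: float = 0.0) -> Optional[str]:
    # penalties maps datacentre name -> total schedule penalty
    best = min(penalties, key=penalties.get)
    if penalties[best] - discount > THRESHOLD:
        return None  # nothing cheap enough yet; decline offers and wait
    return best

# e.g. choose_datacentre({"lon": 12.0, "sf": 140.0, "sing": 95.0}) -> "lon"
```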
Where were we?
Framework
System
What’s Next?
Peer to Peer vs Hierarchical
Get involved!
Proposal for federated clusters:
https://docs.google.com/document/d/1U4IY_ObAXUPhtTa-0Rw_5zQxHDRnJFe5uFNOQ0VUcLg/edit?usp=sharing
Federated Marathon:
https://github.com/schibsted/triathlon
Current Discussion (favouring hierarchical design):
user@mesos.apache.org
We’re Hiring
londonrecruitment@ilm.com
