Lessons from building large clusters
©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Phil Day, HP Consulting
8th November 2010
Small vs Large Clusters
Small Production Clusters and Proof of Concept
– Built and run by a few skilful people
– Can be a natural extension to conventional IT
– You know the servers by name
Large Production Clusters
– Built and run by pioneers
– Large development staff
– Major Hadoop contributors
– Understand the problems of scale
Images: Creative Commons 2.0 – Attribution Andrew Morrell (Flickr)
Large Scale Early Adopters
– Have, or want to start with, a small PoC (tens of nodes)
– Want to scale quickly to a large cluster (hundreds of nodes)
– Want the scale of large clusters, but with the build and operational model of a small one
– Want to run the cluster rather than build and develop it
– Need to integrate it with existing systems
Unfortunately not all things in life scale as well as Hadoop
Design – The Technology Challenge
Build – The Engineering Challenge
Transfer to Operations – The Service Management Challenge
Design – The Technology Challenge
Selecting all the right bits
Server Selection
– Core Nodes: Resilient, Big Memory, RAID
– Data Nodes: Not resilient, no RAID or hot swap, basic iLO
– Trade off Disks vs Cores vs Memory to match target load (see the sizing sketch below)
– Need to consider disk allocation policy
– Network redundancy is useful to avoid rack switch failures
– Edge Nodes (Data ingress/egress & Mgmt)
– Higher spec data nodes
– Help provide the “appliance” view of the cluster
– Have Hadoop installed but don’t run as part of the cluster.
Network Selection
– Dual 1Gb from data nodes to rack switches
– 10Gb from rack switches to core, and from Edge nodes
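A back-of-the-envelope way to work the disks vs cores vs memory trade-off, as a minimal Python sketch. The replication factor of 3 is the HDFS default; the 25% temp-space allowance, 2 GB per task slot, and the example node spec are rules of thumb and illustrative numbers, not recommendations:

```python
# Rough cluster sizing: how many data nodes to hold a given volume of user
# data, and how many task slots a node can sensibly run.
# Replication factor 3 is the HDFS default; the 25% temp-space allowance and
# 2 GB per task slot are rules of thumb, not measured values.
import math


def data_nodes_needed(user_data_tb, disks_per_node, tb_per_disk,
                      replication=3, temp_overhead=0.25):
    """Nodes required to store user_data_tb, including replicas and temp space."""
    raw_tb = user_data_tb * replication * (1 + temp_overhead)
    return math.ceil(raw_tb / (disks_per_node * tb_per_disk))


def task_slots_per_node(cores, ram_gb, gb_per_slot=2, os_reserve_gb=4):
    """Task slots a node can support, limited by cores or by memory."""
    return int(min(cores, (ram_gb - os_reserve_gb) // gb_per_slot))


if __name__ == "__main__":
    # Illustrative node: 12 x 2 TB disks, 12 cores, 48 GB RAM; 200 TB of user data.
    print("data nodes:", data_nodes_needed(200, 12, 2))   # -> 32
    print("slots/node:", task_slots_per_node(12, 48))     # -> 12
```

Running the numbers like this before ordering hardware makes it obvious whether the target load is bound by storage, cores, or memory.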
Build – The Engineering Challenge
Do you realise how many cardboard boxes that is?
Building at the scale of 500+ servers has its own set of problems
• Space and Environment
• Consistency of Build
• Failures during the Build
• Deployment time and the cost of rework
Two things we found very helpful:
Factory Integration Services
Cluster Management Utility
Build – HP Factory Integration Services
Reducing risk and time
• Many years' experience of building large clusters
• Site inspection
• Build, Configure, Soak Test
• Diagnose and fix DOAs (dead-on-arrival units)
• Rack and Label
• Asset tagging
• Custom build and set-up
• Pack and Ship
• On-Site build and integration
www.hp.com/go/factoryexpress
Complex solutions ...
... Made simple
Build – HP Cluster Management Utility
Rack aware deployment and monitoring
• Proven cluster deployment and management tool
• 11 years of experience
• Proven with clusters of 3500+ nodes
• Deployment
• Network and power load aware deployment
• Easily extensible
• Kickstart integration
• Monitoring
• Scalable, non-intrusive monitoring
• Collectl integration
• Administration
• Command Line or GUI
• Cluster-wide configuration (a drift-check sketch follows below)
www.hp.com/go/cmu
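Whatever tool manages cluster-wide configuration, it is worth being able to spot drift independently. A minimal sketch that fingerprints the Hadoop config files on each node and flags any host that differs from the majority; the host names, file paths, and plain-ssh transport are assumptions for illustration, not part of CMU:

```python
# Spot configuration drift: checksum the key Hadoop config files on every
# host and report any host whose fingerprint differs from the majority.
# Host names, file paths, and the use of plain ssh are illustrative.
import hashlib
import subprocess
from collections import Counter

CONFIG_FILES = [
    "/etc/hadoop/conf/core-site.xml",
    "/etc/hadoop/conf/hdfs-site.xml",
    "/etc/hadoop/conf/mapred-site.xml",
]


def fingerprint(host):
    """MD5 of the concatenated config files, read over ssh."""
    result = subprocess.run(["ssh", host, "cat"] + CONFIG_FILES,
                            capture_output=True, check=True)
    return hashlib.md5(result.stdout).hexdigest()


def report_drift(hosts):
    prints = {host: fingerprint(host) for host in hosts}
    majority, _ = Counter(prints.values()).most_common(1)[0]
    for host, fp in sorted(prints.items()):
        if fp != majority:
            print(f"{host}: configuration differs from the majority ({fp})")


if __name__ == "__main__":
    report_drift([f"datanode{i:03d}" for i in range(1, 11)])
```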
CMU Dashboard
Cluster Performance over time
[Chart: cluster-wide CPU, disk read, disk write, and network utilisation plotted against time of day (05:00–15:00), with map and reduce activity overlaid.]
Operate – the organisational challenge
How do we know when it's working?
Clusters are not just large numbers of servers
• At scale it may never be 100% up (like a network)
.... but it can be 100% down (like a server)
• Need to think more in terms of “How healthy is it?”
• Core nodes are important
• Data nodes much less so – unless they fail in patterns
• Edge nodes – somewhere in between
• Look at HDFS health for replication counts (see the sketch below)
• Nagios & ganglia
• Collectl / CMU to visualise the cluster
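One way to turn "how healthy is it?" into a checkable number is to wrap hadoop fsck in a Nagios-style probe. A minimal sketch: fsck and its summary fields (Under-replicated blocks, Corrupt blocks, Missing replicas) are standard Hadoop output, though the exact wording varies between versions, and the thresholds here are purely illustrative:

```python
# Nagios-style HDFS health probe: run hadoop fsck and grade the summary.
# Exit codes follow the Nagios convention: 0 OK, 1 WARNING, 2 CRITICAL.
import re
import subprocess
import sys


def fsck_summary(path="/"):
    """Parse the summary block printed by `hadoop fsck <path>`."""
    out = subprocess.run(["hadoop", "fsck", path],
                         capture_output=True, text=True).stdout

    def count(field):
        m = re.search(rf"{field}:\s+(\d+)", out)
        return int(m.group(1)) if m else 0

    return {
        "healthy": "is HEALTHY" in out,   # final fsck status line
        "under_replicated": count("Under-replicated blocks"),
        "corrupt": count("Corrupt blocks"),
        "missing_replicas": count("Missing replicas"),
    }


if __name__ == "__main__":
    s = fsck_summary()
    if not s["healthy"] or s["corrupt"]:
        print("CRITICAL:", s)
        sys.exit(2)
    if s["under_replicated"] > 1000 or s["missing_replicas"] > 0:  # example thresholds
        print("WARNING:", s)
        sys.exit(1)
    print("OK:", s)
    sys.exit(0)
```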
Summary
Key considerations when building a large cluster
• Use a pilot system to establish your server configuration
• Stand on the shoulders of the Pioneers
• Build and test in the factory if you can
• Consistency in the build and configuration is vital
• Cherish the NameNode, protect the Edge Nodes, and develop the right level of indifference to the Data Nodes (a failure-pattern sketch follows below)
• Practice the key recovery cases
• Match training and support to the service expectations
And remember not all things in life scale as well as Hadoop
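On "fail in patterns" and the right level of indifference: one workable rule is to ignore isolated data-node failures but alert when failures cluster in a rack or exceed a fleet-wide fraction. A minimal sketch with made-up host names, rack map, and thresholds:

```python
# Data-node failures matter when they cluster: group dead nodes by rack and
# alert only when a rack, or the fleet as a whole, crosses a threshold.
# The rack map, dead-node list, and thresholds are all made up for illustration.
from collections import defaultdict


def failure_patterns(rack_of, dead_nodes, rack_limit=0.25, fleet_limit=0.05):
    """Return alert strings for suspicious data-node failure patterns."""
    alerts = []
    per_rack = defaultdict(lambda: [0, 0])            # rack -> [dead, total]
    for node, rack in rack_of.items():
        per_rack[rack][1] += 1
        if node in dead_nodes:
            per_rack[rack][0] += 1
    for rack, (dead, total) in sorted(per_rack.items()):
        if dead / total >= rack_limit:
            alerts.append(f"{rack}: {dead}/{total} data nodes down")
    if len(dead_nodes) / len(rack_of) >= fleet_limit:
        alerts.append(f"fleet: {len(dead_nodes)}/{len(rack_of)} data nodes down")
    return alerts


if __name__ == "__main__":
    racks = {f"dn{i:03d}": f"rack{i // 20}" for i in range(100)}  # 5 racks of 20
    dead = {"dn000", "dn003", "dn007", "dn011", "dn015"}          # all in rack0
    for alert in failure_patterns(racks, dead):
        print(alert)
```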
Questions?
