Automated Deployment using Open Source
How to build a fully automated (or nearly so) deployment system using open source tools. Russell Miller SCALE 8x 2010 (16,080, in case you were wondering) [email_address]
Overview: This talk will concentrate on a CentOS/RHEL/Fedora environment.
The methods discussed in this talk will use only open source tools.  You don't have to pay for anything I mention here unless you want to.  In which case...
My paypal address is...
My qualifications: Sysadmin/Developer for 12 years
Worked on many different kinds of environments
Maintained the deployment system for a 4,000-server environment for a leading Internet Shopping Comparison company in West Los Angeles.
Instrumental in bringing up two datacenters of about 300 servers each from scratch for a leading Internet Advertising company in Irvine, including building the entire deployment system from metal to application.
The second datacenter, once all the physical hardware was there, took live traffic in less than two weeks from bare metal.
What I am not good at: Managers
Powerpoint/OpenOffice Impress presentations
Funny jokes.
Keeping an audience awake
And, of course, self-deprecating humor.
Why a deployment system? If you have one server, you don't need one.
Deploying servers manually usually means many cycles of going back and forth with the application owners to validate.  For each server.  This eats up time and means bringing up servers can take two weeks or more...
And consistency is a lost cause.
You've already lost control the minute the OS is installed.
It's too easy to miss little things.
System Administrators are TERRIBLE at documentation.
Admit it...  you are.  No use denying it.
What about server spec sheets?
NO!
Why not? It is practically impossible to get application owners to tell you what they need – if they understand what they need in the first place.
It's not their fault.  Their focus is on the application.  They are not System Administrators.  That is why you are paid.
They do not accurately represent server builds
They often contain incorrect or useless information
Frankly, they serve no function other than to waste time and sow confusion, while providing an easily shattered veneer of repeatability.
So, naturally, they're usually the first thing tried.
How does it benefit you? Very, very fast deployment times (metal to live server in less than an hour, and deploy as many servers as you can get open terminals to at once – I've done 20 in an hour.  And unlimited if you can get the serial consoles to work via expect.)

  • 31.
    The code is the documentation. Server specs can never go out of sync because the server specs are actively deployed.
  • 32.
    Very tight control over anything that is deployed to the servers.
  • 33.
    (This means that even if someone installs something you don't want them to, you can simply have it removed within 10 minutes with no manual intervention.)
  • 34.
  • 35.
    And most importantly, astonished and very happy managers.
  • 36.
    So do I have your attention? :)
  • 37.
    So which open source tools? You will need the following tools: Request Tracker (or an asset tracker with a command line interface)
  • 38.
    Nictool/djbdns (BIND and any other manager will work too, but this is what I use, because nictool has a simple schema and is scriptable)
  • 39.
  • 40.
  • 41.
    Httpd (you will see why in a moment)
  • 42.
  • 43.
    And... puppet (cfengine or another configuration management tool will probably work, but I prefer puppet.)
  • 44.
    How does this work? The simple flow is: Enter the information for the server into your Asset Tracker. The most important part is the MAC Address, though you can add other things as required. In multiple datacenters you might want a “Location” field, for example.
  • 45.
    Make sure the DNS info is properly entered into your DNS server in whatever way.
  • 46.
    Tell your DHCP server you want to allow the server to PXE boot. This can happen manually or automatically.
  • 47.
    I prefer manually simply because if you set the server up to boot automatically – you can get into a situation where the server accidentally reboots and rebuilds itself. This tends to make app owners unhappy.
  • 48.
    And.... let it install. An hour later you have a full build with no further manual intervention.
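
As an illustration of the DHCP step in the flow above, here is a minimal Python sketch that renders a host entry allowing one machine to PXE boot. It assumes ISC dhcpd syntax (the talk does not name a DHCP server). The hostname, addresses, and the function name `dhcp_host_entry` are hypothetical.

```python
def dhcp_host_entry(hostname, mac, ip, next_server, bootfile="pxelinux.0"):
    """Render an ISC dhcpd 'host' block that lets one machine PXE boot.

    All values are assumed to come from DNS and the asset tracker; the
    names used in the example call below are made up.
    """
    # Normalise the MAC to the lowercase, colon-separated form.
    mac = mac.strip().lower().replace("-", ":")
    lines = [
        "host %s {" % hostname,
        "    hardware ethernet %s;" % mac,
        "    fixed-address %s;" % ip,
        "    next-server %s;" % next_server,   # TFTP server holding pxelinux.0
        '    filename "%s";' % bootfile,
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(dhcp_host_entry("db01.example.com", "00-1A-2B-3C-4D-5E",
                          "10.0.0.21", "10.0.0.5"))
```

Adding the entry by hand (or by an explicit command) and removing it after the build is one way to keep an accidental reboot from turning into an accidental rebuild.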
  • 49.
    How does this work behind the scenes? Magic?
  • 50.
  • 51.
    Guess I'll have to tell you the super-secret explanation.
  • 52.
    Behind the scenes... The RT server is the bedrock of the system. It contains all of the information needed to successfully build the system. A command line interface is absolutely required so that the scripts that actually do the build can get access to the info.
  • 53.
    For example, at a minimum you'll want to put the MAC Address into the RT system. You may even want to populate DNS from an IP field. Every step of this process uses the info from RT.
  • 54.
    There may be site-specific stuff you need to use. Don't be afraid to add or use it. This is only a framework.
  • 55.
    Behind the scenes: You will next need a script to build the pxelinux.cfg file for PXE booting. This script pulls all of the required info out of the asset tracker (AT), such as the MAC address, location, etc., and generates the appropriate file. This is a custom script and is by no means one size fits all, but is fairly easy to write. The output of this script is a working pxelinux.cfg file.
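
A minimal Python sketch of such a generator follows. It relies on the PXELINUX convention of looking for a per-host config file named `01-` plus the MAC address in lowercase with dashes. The kernel and initrd names, the kickstart URL, the `asset` query parameter, and the function names are assumptions for illustration, not the speaker's actual script.

```python
def pxe_config_name(mac):
    """Per-host PXELINUX file name: '01-' (Ethernet) + lowercase, dash-separated MAC."""
    return "01-" + mac.strip().lower().replace(":", "-")


def pxe_config_body(asset_id, ks_url):
    """Render a config that boots the installer and points it at a kickstart URL.

    asset_id is the asset tracker ID; ks_url is a hypothetical CGI endpoint.
    """
    return "\n".join([
        "DEFAULT install",
        "PROMPT 0",
        "LABEL install",
        "    KERNEL vmlinuz",
        "    APPEND initrd=initrd.img ks=%s?asset=%d" % (ks_url, asset_id),
    ]) + "\n"


if __name__ == "__main__":
    print(pxe_config_name("00:1A:2B:3C:4D:5E"))
    print(pxe_config_body(1234, "http://build.example.com/cgi-bin/ks.cgi"))
```

The resulting file would be written into the `pxelinux.cfg/` directory on the TFTP server under the name returned by `pxe_config_name`.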
  • 56.
    What about kickstart? Oh, here comes the genius part. (And I can say that because I didn't invent it, but I have it down to an art!)
  • 57.
    The kickstart file is not a file at all. It is a CGI script. It goes to RT and DNS and gathers all of the information required, makes decisions on how to build the servers, and then custom generates a kickstart file.
  • 58.
    It should at minimum take one argument – the RT asset ID. This is a unique identifier and allows all the information to be pulled out of the asset tracker to be used in the script.
  • 59.
    Be careful! This script is one of the bedrocks of what you are trying to do. It is also dangerous. It is dangerous because you are essentially trying to generate a script in one language (kickstart/bash) using another language (perl, python, whatever), so it rapidly becomes unmaintainable.
  • 60.
    I recommend something like Template::Toolkit to make it more manageable.
  • 61.
    Build maintainability in from the beginning! You may not get another chance!
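
To show why a template helps, here is a small Python sketch that keeps the kickstart text separate from the logic. It uses the standard library's `string.Template` as a stand-in for Template::Toolkit. The field names, the mirror URL, and the trimmed set of directives are assumptions; this is not a complete kickstart file, and directives vary by release (newer releases, for instance, expect sections to be closed with `%end`).

```python
from string import Template

# The kickstart text lives in one place; the code only fills in values.
KICKSTART_TEMPLATE = Template("""\
install
url --url=$mirror_url
network --bootproto=dhcp --hostname=$hostname
%packages
@base
$extra_packages
""")


def render_kickstart(asset):
    """asset: dict of fields already fetched from the asset tracker and DNS."""
    return KICKSTART_TEMPLATE.substitute(
        mirror_url=asset["mirror_url"],
        hostname=asset["hostname"],
        extra_packages="\n".join(asset.get("packages", [])),
    )


def cgi_response(asset):
    """A CGI reply is a header block, a blank line, then the body."""
    return "Content-Type: text/plain\n\n" + render_kickstart(asset)


if __name__ == "__main__":
    print(cgi_response({
        "mirror_url": "http://mirror.example.internal/centos/5/os/x86_64",
        "hostname": "db01.example.com",
        "packages": ["puppet", "ntp"],
    }))
```

In a real CGI script the `asset` dict would be built by looking up the asset ID passed in the query string. Any shell code placed in the template would need its `$` characters escaped as `$$`.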
  • 62.
    Yum repository: You'll also need a yum repository at this point.
  • 63.
    DO NOT USE EXTERNAL REPOSITORIES.
  • 64.
    Pull down an internal mirror and use that.
  • 65.
    The reason for this is: control. If you use external repositories, you are putting control of releases and upgrades into their hands, not yours.
  • 66.
    And while CentOS, Fedora, etc., are fairly good about it, they make mistakes – and you do not want your production site to go down because of someone else's mistake. It's still your fault for not taking my advice. :)
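
As a small illustration of pointing servers at an internal mirror only, this Python sketch renders a yum `.repo` file. The repository ID, name, and mirror URL are hypothetical; the keys shown (`name`, `baseurl`, `enabled`, `gpgcheck`) are standard yum repository options.

```python
def render_repo_file(repo_id, name, baseurl, gpgcheck=True):
    """Render one yum repository definition pointing at an internal mirror."""
    return "\n".join([
        "[%s]" % repo_id,
        "name=%s" % name,
        "baseurl=%s" % baseurl,
        "enabled=1",
        "gpgcheck=%d" % (1 if gpgcheck else 0),
    ]) + "\n"


if __name__ == "__main__":
    print(render_repo_file(
        "internal-base",
        "Internal CentOS mirror - base",
        "http://mirror.example.internal/centos/5/os/x86_64/",
    ))
```

A file like this would typically be dropped into `/etc/yum.repos.d/`, with the stock external repository files removed or disabled.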
  • 67.
    Control: As you might have gathered by now, as an aside...
  • 68.
    I am a control freak.
  • 69.
    At least when it comes to System Administration.
  • 70.
    But this is a good thing...
  • 71.
    Because if you are in complete control of your environment, you reduce the chances of surprises.
  • 72.
    And surprises are your worst enemy.
  • 73.
    ... well, maybe not your WORST...
  • 74.
    Check-in: At this point, you have: An RT server with all of the info you need
  • 75.
    A pxelinux.cfg generation script that will point to a custom kickstart script, which pulls the necessary info from RT...
  • 76.
    ... which is generated by a CGI script, which pulls the necessary info from RT...
  • 77.
    And a yum repository which has all of the packages you need for a kickstart install.
  • 78.
    Congratulations... You can now build a server.
  • 79.
    But what about configuration management and application deployment?
  • 80.
    Oops. Looks like there's more to do.
  • 81.
    The rest of the story: Now that you have a server that is up and on the network, it needs to be configured.
  • 82.
    Each server will likely have a base config that every server needs. For example, snmp, ntp, etc., etc.
  • 83.
    But each server also has an individual role. Application server, database server, facebook browser, Quake Server, pr0n datastore...
  • 84.
    Can you deploy these roles automatically too?
  • 85.
  • 86.
    Configuration Management: For this task, you will want a configuration management system.
  • 87.
  • 88.
    Puppet is a configuration management system
  • 89.
    It controls what is deployed and what is NOT deployed.
  • 90.
    It can deploy a package to one or a thousand servers at the same time.
  • 91.
    And it slices, it dices, it writes bad checks...
  • 92.
    Facter: Facter is puppet's best-kept little secret.
  • 93.
    Facter executes little bits of ruby code in order to determine facts about the system. OS release is one example, etc.
  • 94.
    But the facts that it determines are not limited to that...
  • 95.
    The snippets of ruby code can also call AT and pull facts out of AT and make them available to puppet.
  • 96.
    AT and facter: This gives you the ability to dynamically decide what puppet installs simply by changing fields in the asset tracker.
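
The talk describes Ruby snippets inside Facter. As a rough sketch of the same idea in Python, the code below instead assumes a Facter version that supports executable external facts, meaning scripts that print `key=value` lines on standard output. The field labels, the `at_` prefix, and the function names are hypothetical, and the lookup against the asset tracker is left out.

```python
import re


def fact_name(field):
    """Turn an asset-tracker field label into a lowercase, underscore-separated name."""
    return re.sub(r"[^a-z0-9]+", "_", field.lower()).strip("_")


def to_fact_lines(asset_fields, prefix="at_"):
    """Format asset-tracker fields as 'key=value' lines, sorted by field label."""
    return ["%s%s=%s" % (prefix, fact_name(key), value)
            for key, value in sorted(asset_fields.items())]


if __name__ == "__main__":
    # In practice these fields would be fetched from the asset tracker's CLI.
    fields = {"Server Function": "DB server", "Network Role": "vmware"}
    print("\n".join(to_fact_lines(fields)))
```

Puppet manifests could then branch on a fact such as `at_server_function`, so changing the field in the asset tracker changes what gets installed on the next run.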
  • 97.
    Putting it all together... Here is a sample workflow for building a server. This is not theoretical; it is working at my workplace.
  • 98.
    Put server in DNS, and in AT.
  • 99.
    Set the fields appropriately. For example, Server Function is DB server, build is CentOS 5.4, network role is vmware, MAC address is correct.
  • 100.
    Run a command to add server to DHCP server (pulling info from DNS and AT).
  • 101.
  • 102.
    Wait for it to build out.
  • 103.
  • 104.
  • 105.
    Hand off fully built server.
  • 106.
    Can it be more automated? Indeed.
  • 107.
    For example, you could automate the server restart by using a serial port and expect. For many enterprise-class servers this isn't reliable, but it'll work for some. Many servers use CLP-SM. This is great for consistent command lines. Doesn't work so well for programmatic resetting. You could also set AT to automatically kick off a rebuild on setting a field.
  • 108.
    You could have AT automatically populate DNS using the SOAP client for nictool...
  • 109.
    It's not out of the realm of possibility that you could set up a server from bare metal all the way to application deployment simply by setting the fields correctly in AT and then setting a special field using this infrastructure.
  • 110.