Optimization of
modern web apps

     Eugene Lazutkin

   ClubAjax on 8/7/2012
       Dallas, TX
About me
• Eugene Lazutkin
• Open Source web developer
   • Majors: JavaScript & Python, Dojo & Django.
• Independent consultant
• Blog: lazutkin.com
• Twitter: @uhop, Google+: gplus.to/uhop
What’s new?
Browser landscape has
      changed
What else is new?

• We are transitioning from static web applications
  with JavaScript helpers to dynamic web
  applications.
   • One-page web applications.
   • Interactive grids, charts, CRUD.
   • Multimedia is on its way.
The waterfall
Important things
• Individual web requests.
• Event: DOMContentLoaded
   • DOM is fully parsed, but not laid out yet.
• Event: load
   • All external assets are loaded.
   • DOM geometry is calculated.
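A minimal sketch of hooking both events. The helper name `whenDomReady` is hypothetical, and the code only assumes a standard `document` and `window`; it also handles a script that runs after DOMContentLoaded has already fired.

```javascript
// Run `callback` once the DOM is parsed (DOMContentLoaded), or right away
// if parsing has already finished. `doc` is any document-like object.
function whenDomReady(doc, callback) {
  if (doc.readyState === "loading") {
    doc.addEventListener("DOMContentLoaded", function handler() {
      doc.removeEventListener("DOMContentLoaded", handler);
      callback();
    });
  } else {
    // "interactive" or "complete": the DOM is already parsed.
    callback();
  }
}

// Browser-only usage; skipped when no DOM is available.
if (typeof document !== "undefined" && typeof window !== "undefined") {
  whenDomReady(document, function () {
    console.log("DOM parsed: safe to query and modify nodes");
  });
  window.addEventListener("load", function () {
    console.log("All external assets loaded: geometry is calculated");
  });
}
```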
When can we use it?

• After DOMContentLoaded.
• After load.
• Sometimes between them.
   • Example: our app works, yet some images are
     being downloaded.
Problem: batching

• A network diagram frequently looks like a staircase.
• Requests are batched.
• Browser limits connections per host.
   • Usually 2-8 connections depending on browser
     and on HTTP version (1.0 or 1.1).
   • Prevents server overload.
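The staircase can be approximated with simple arithmetic. This is a rough model, not a measurement: it assumes every request takes the same time, and the function names are made up for illustration.

```javascript
// Rough staircase model: with `maxConnections` parallel slots per host,
// `resourceCount` requests are served in ceil(count / slots) batches.
function estimateBatches(resourceCount, maxConnections) {
  return Math.ceil(resourceCount / maxConnections);
}

// Estimated wall time if every request took `msPerRequest` (an assumption;
// real requests vary in size and latency).
function estimateLoadTime(resourceCount, maxConnections, msPerRequest) {
  return estimateBatches(resourceCount, maxConnections) * msPerRequest;
}

console.log(estimateBatches(30, 6));       // 5 batches
console.log(estimateLoadTime(30, 6, 100)); // 500 ms
console.log(estimateLoadTime(30, 2, 100)); // 1500 ms
```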
Problem: bandwidth


• We need to download a lot of resources.
• Slow connections limit the app.
• The less we download, the better.
Anatomy of connection
• Lifecycle (add browser delays and network latency
  liberally):
   1. Client does a DNS lookup (complex operation).
   2. Client sends a request (data, headers, cookies).
   3. Server gets it, processes it, and sends a
      response.
   4. Client receives and processes it.
Problem: connections


• Connections are expensive.
• We should reduce their number.
Solution: batching

• Sharding:
   • Serve different resources from different hosts.
• Pro: if batching is a bottleneck, that can help
  considerably.
• Con: more expensive DNS lookups.
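One way to shard is to pick the host from the resource path. The sketch below assumes three hypothetical hosts under `example.com`; the hash is a plain string hash chosen for illustration, and what matters is that the mapping is deterministic.

```javascript
// Hypothetical shard hosts; replace with hosts you actually control.
var SHARDS = ["static1.example.com", "static2.example.com", "static3.example.com"];

// Map a path to a shard deterministically, so the same resource always
// comes from the same host and stays cacheable.
function shardUrl(path, shards) {
  var hash = 0;
  for (var i = 0; i < path.length; i++) {
    hash = (hash * 31 + path.charCodeAt(i)) >>> 0; // keep an unsigned 32-bit value
  }
  return "http://" + shards[hash % shards.length] + path;
}

console.log(shardUrl("/img/logo.png", SHARDS));
console.log(shardUrl("/js/app.js", SHARDS));
```

Picking a host at random on every page view would defeat the browser cache, because the same file would appear under several URLs.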
CDN
• Can help with batching and bandwidth at the same
  time.
• Can reduce latency for geographically distributed
  clients.
• The same problems as sharding.
• You should factor in CDN service costs (usually
  inexpensive).
Solution: bandwidth
• Let’s compress all we can.
   • Images can be compressed lossily or losslessly.
   • Text (JavaScript, CSS, HTML) should be
     gzipped.
      • It can be preprocessed (minified) to be even
        more compressible.
• Use static compression whenever you can.
Solution: bandwidth

• If we bundle similar resources together, we can
  usually compress them better.
   • Merge all JavaScript.
   • Merge all CSS.
   • Use sprites instead of separate images.
• Bundling conserves connections too!
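The effect of bundling on compression is easy to check. This sketch assumes Node.js and its built-in `zlib` module; the two tiny "modules" are invented stand-ins for real files.

```javascript
var zlib = require("zlib");

// Two small, similar "modules" standing in for real source files.
var a = "function add(a, b) {\n  return a + b;\n}\n";
var b = "function sub(a, b) {\n  return a - b;\n}\n";

// Size in bytes of the gzipped text.
function gzipSize(text) {
  return zlib.gzipSync(Buffer.from(text, "utf8")).length;
}

var separate = gzipSize(a) + gzipSize(b); // two files, two gzip streams
var bundled = gzipSize(a + b);            // one merged file

console.log("separate:", separate, "bytes; bundled:", bundled, "bytes");
```

The merged file wins twice: it pays the gzip header and trailer once, and the compressor can reference text it has already seen in the first module.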
Solution: bandwidth

• It makes sense to remove all inlined JavaScript
  (<script> blocks, event handlers), and CSS
  (<style> blocks) from HTML.
• Images should be converted to sprites.
   • Example: <img> can be represented as <div>
     with a proper background image.
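A sketch of the `<img>`-to-`<div>` conversion: a helper that produces the inline style for one tile of a sprite sheet. The function name and the `sprite.png` file are hypothetical.

```javascript
// Build inline CSS that shows one tile of a sprite sheet.
// (x, y) is the tile's top-left corner inside the sheet, in pixels.
function spriteStyle(sheetUrl, x, y, width, height) {
  return "background: url(" + sheetUrl + ") " + (-x) + "px " + (-y) + "px no-repeat; " +
         "width: " + width + "px; height: " + height + "px;";
}

// A 16x16 icon located 32 pixels from the left edge of the sheet.
var style = spriteStyle("sprite.png", 32, 0, 16, 16);
console.log('<div style="' + style + '"></div>');
```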
What I use

• Both Dojo and RequireJS come with a build tool.
   • It bundles and minifies JavaScript.
   • It bundles and minifies CSS.
• SmartSprites (http://csssprites.org)
   • It can handle vertically and horizontally tiled,
     and untiled images.
Problems with
           bundling

• Third-party resources cannot be bundled easily.
• Bundled resources should have the same
  expiration.
• Dynamic data cannot be easily bundled.
Solution: connections


• We bundled all we could. Now what?
• Now it is time to go back to basics: network
  protocols starting with TCP/IP.
Really???
Oh, yes!

• A standards-compliant server sends 3 (three)
  packets and waits for an ACK.
   • It is a part of the congestion control algorithm.
• What does it mean for us?
   • Client gets 3 packets relatively fast.
   • Useful payload is just over 1.5k.
TCP 3 packets rule

• How can we use it?
   • If we send an HTML page, try to fit all
     external resource requests in the first 1.5k.
   • If you can keep your HTML page under 1.5k
     (compressed): awesome!
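A build step can check the budget automatically. The sketch below assumes Node.js with its built-in `zlib` module and takes the 1.5k figure from the slide as 1536 bytes; `fitsFirstFlight` and `compressedSize` are hypothetical names, and HTTP response headers are not counted.

```javascript
var zlib = require("zlib");

// Budget from the slide: about 1.5k of compressed HTML (1536 bytes here).
var FIRST_FLIGHT_BUDGET = 1536;

// Size of the page body after gzip.
function compressedSize(html) {
  return zlib.gzipSync(Buffer.from(html, "utf8")).length;
}

// True if the gzipped page fits into the budget (default: FIRST_FLIGHT_BUDGET).
function fitsFirstFlight(html, budget) {
  return compressedSize(html) <= (budget || FIRST_FLIGHT_BUDGET);
}

var page = '<!doctype html><html><head><link rel="stylesheet" href="x.css">' +
           '<script src="x.js"></script></head><body></body></html>';
console.log(compressedSize(page), fitsFirstFlight(page));
```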
HTTP rules!

• HTTP/1.0 creates one connection per request.
   • Expensive.
• HTTP/1.1 allows reusing the same connection to
  request different resources from the same host.
   • Double check that you use HTTP/1.1 on
     your server.
HTTP pipelining
HTTP pipelining
• Part of HTTP/1.1.
• Allows requesting several resources without waiting
  for responses.
• Resources should come in the order of their
  requests.
• Frequently turned off.
• Improves high-latency/mobile scenarios.
SPDY

• Introduced by Google.
• Will likely be a part of HTTP/2.0.
• Allows asynchronous requests/responses over a
  single connection.
• Allows server push and server hint.
Who supports SPDY?

• Implemented by Chrome/Chromium and Firefox.
• Used by Google, Twitter.
• Announced by Facebook.
• Implemented by most server vendors, including Apache,
  nginx (experimental), and most app servers like node.js.
• Server push and hint are rarely implemented.
Ideal web app
<!doctype html>
<html>
     <head>
        <link rel="stylesheet" type="text/css" href="x.css">
        <!-- images are requested from CSS as one sprite -->
        <script src="x.js"></script>
     </head>
     <body>
        <!-- HTML here may be dynamically generated -->
     </body>
</html>


                          Now what?
Where to include JS?
• Most gurus recommend including it as the last
  node in the body.
• That’s incorrect in general!
• It works only for “gradual enhancement” scripts.
   • Scripts that provide convenience, not main
     functionality.
      • Error checking, calendars, and so on.
Where to include JS?

• It is unwise to make it last if our app functionality
  depends on it.
    • It renders significant parts of our web page.
    • It requests data from a server.
    • It is the application.
• In our “ideal app” it doesn’t matter where we put it.
Can we reduce it more?
• We can inline CSS and JavaScript back.
• Images can be inlined too using “data:” URI.
• Cons:
   • Usually it violates “the same expiration” rule.
   • Prevents reuse between pages.
   • “data:” URI can increase file size (base64
     encoding adds about a third).
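The size cost of a base64 “data:” URI can be measured directly. This sketch assumes Node.js (`Buffer`); the 300 filler bytes only stand in for image data and are not a valid PNG.

```javascript
// Encode raw bytes as a base64 "data:" URI.
function toDataUri(bytes, mimeType) {
  return "data:" + mimeType + ";base64," + Buffer.from(bytes).toString("base64");
}

// 300 filler bytes standing in for a small image (not a real PNG).
var image = Buffer.alloc(300, 0xff);
var uri = toDataUri(image, "image/png");

// base64 emits 4 characters for every 3 input bytes.
console.log("raw bytes:", image.length, "URI characters:", uri.length);
```

Gzipping the enclosing CSS or HTML recovers part of this overhead, but the inlined copy is still downloaded again with every page that embeds it.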
Problem: dynamic data
• We optimized the web app. Now what?
• Usually the dynamic data requests stick out like a
  sore thumb.
   • Unlike static files, such requests do take some
     server processing:
      • SQL queries, disk I/O, internal network
        services.
Solution: dynamic data

• We can try to consolidate several requests required
  to render a page into one request.
• We can request this data first thing.
   • Literally.
   • Both XHR and <script> can be used but I
     prefer scripts with JSONP.
Data-first idea part 1

• Let’s request the data first, if it takes a long time.
• In order to be efficient we cannot rely on any other
  JavaScript libraries.
• It will be loaded in parallel with the rest.
    • Con: it will occupy a connection slot.
• The result would be stored in a variable.
Data-first idea part 2

• When our main JS is loaded we can check that
  variable.
   • If it is populated, we can wait until DOM is
     ready to render data.
   • Otherwise we can override our JSONP callback
     function, and wait for data, and for DOM.
Data-first sketch
<!doctype html>
<html>
     <head>
        <script>
           // JSONP callback: receives the data as soon as it arrives.
           function __load(data){...}
           // Start the data request before any other resource.
           var t = document.createElement("script");
           t.src = "/api?timeout=2&callback=__load";
           document.documentElement.appendChild(t);
        </script>
        <link...>
        <script...>
     </head>
     <body>
        <!-- HTML, if any -->
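The handshake between the inline bootstrap and the main script could look like the sketch below. Only `__load` comes from the slides; `__data`, `installEarlyCallback`, and `consumeEarlyData` are hypothetical names, `scope` stands in for the global object, and waiting for the DOM is left out to keep the example short.

```javascript
// Part 1: what the tiny inline bootstrap does (no libraries available yet).
// The JSONP callback just parks the data in a variable.
function installEarlyCallback(scope) {
  scope.__data = undefined;
  scope.__load = function (data) {
    scope.__data = data;
  };
}

// Part 2: what the main script does once it is loaded.
function consumeEarlyData(scope, render) {
  if (scope.__data !== undefined) {
    // The data won the race: use it now.
    render(scope.__data);
  } else {
    // The data is still in flight: override the JSONP callback.
    scope.__load = render;
  }
}

// Simulated run where the data arrives before the main script.
var scope = {};
installEarlyCallback(scope);
scope.__load({ items: [1, 2, 3] });
consumeEarlyData(scope, function (data) {
  console.log("render", data.items.length, "items");
});
```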
Cache considerations


• If we expect our users to come again, or
• If we expect them to use other pages of our web app,
• We have to work with the cache.
Server-side headers
• Determine expirations of your resources and set all
  proper HTTP headers:
   • Expires, Last-Modified, Cache-Control
   • If set properly, the browser will not even attempt
     to connect within the expiration period.
• Set ETag header.
   • Sometimes timestamp is not reliable.
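A sketch of producing these headers for a static resource, assuming Node.js and its built-in `crypto` module. The function name `cachingHeaders` and the choice of a SHA-1 content hash for the ETag are illustrative assumptions; any stable validator would do.

```javascript
var crypto = require("crypto");

// Build caching headers for a response body.
// lastModifiedMs and nowMs are timestamps in milliseconds since the epoch.
function cachingHeaders(body, lastModifiedMs, maxAgeSeconds, nowMs) {
  return {
    "Cache-Control": "public, max-age=" + maxAgeSeconds,
    "Expires": new Date(nowMs + maxAgeSeconds * 1000).toUTCString(),
    "Last-Modified": new Date(lastModifiedMs).toUTCString(),
    // A content hash is a validator that does not depend on timestamps.
    "ETag": '"' + crypto.createHash("sha1").update(body).digest("hex") + '"'
  };
}

// One week of caching for a bundled script.
console.log(cachingHeaders("var x = 1;", Date.now(), 7 * 24 * 3600, Date.now()));
```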
Server-side headers
• Proper settings reduce the number of connections.
• They allow the server to send a 304 (Not Modified)
  response instead of a potentially big resource.
• Don’t forget that some companies and ISPs run
  intermediate caches.
• Read http://lazutkin.com/blog/2007/feb/1/improving-performance/
  for more details.
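On the server, the 304 decision comes down to comparing the validators sent by the browser. The sketch below is simplified: it ignores lists of ETags and the `*` wildcard in `If-None-Match`, and `isNotModified` is a hypothetical name. Header keys are lowercase, as Node's `http` module provides them.

```javascript
// Decide whether a conditional GET can be answered with 304 Not Modified.
function isNotModified(requestHeaders, etag, lastModifiedMs) {
  var noneMatch = requestHeaders["if-none-match"];
  if (noneMatch) {
    // If-None-Match takes precedence over If-Modified-Since.
    return noneMatch === etag;
  }
  var since = Date.parse(requestHeaders["if-modified-since"] || "");
  // HTTP dates have one-second resolution, so compare whole seconds.
  return !isNaN(since) && Math.floor(lastModifiedMs / 1000) * 1000 <= since;
}

console.log(isNotModified({ "if-none-match": '"abc"' }, '"abc"', 0)); // true
console.log(isNotModified({ "if-none-match": '"old"' }, '"abc"', 0)); // false
```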
Prime cache
• Sometimes it makes sense to load files not used by
  this web page, which can be used by other pages.
• Usually it is done asynchronously several seconds
  after the page has started.
   • An invisible image can be created.
   • CSS and JS can be linked.
     • They should not interfere with the page!
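A sketch of the invisible-image variant. The helper name `primeImages`, the 3-second delay, and the URL are assumptions for illustration; the elements are never attached to the page, so they cannot interfere with it.

```javascript
// Warm the browser cache with images used by other pages.
// The <img> elements stay detached, so nothing is displayed.
function primeImages(doc, urls) {
  return urls.map(function (url) {
    var img = doc.createElement("img");
    img.src = url; // setting src starts the download
    return img;
  });
}

// Browser-only usage: wait for load, then a few more seconds.
if (typeof document !== "undefined" && typeof window !== "undefined") {
  window.addEventListener("load", function () {
    setTimeout(function () {
      primeImages(document, ["/next-page-sprite.png"]); // hypothetical URL
    }, 3000);
  });
}
```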
Or use manifest

• Part of HTML5 to facilitate offline applications.
• A text file that lists what should be downloaded and
  placed into a permanent cache, network URLs,
  and fallback URLs.
• Should be served as “text/cache-manifest”.
• Supported by FF, Cr, Opera, Safari, IE10.
Cache manifest
            example
<!doctype html>
<html manifest="cache.manifest">
     ...

  <!-- cache.manifest example content -->
  CACHE MANIFEST
  /y.js
  /y.css
  /y.jpg
  /y.html
  ...
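The example above lists only cached files. A manifest can also carry `NETWORK:` and `FALLBACK:` sections, and a small generator keeps it in sync with a build. The function `buildManifest`, the version comment, and all URLs below are hypothetical.

```javascript
// Produce cache manifest text from a simple description.
// spec.fallback is a list of [namespace, fallbackUrl] pairs.
function buildManifest(spec) {
  // Changing the version comment changes the file, which makes browsers re-fetch.
  var lines = ["CACHE MANIFEST", "# version " + spec.version, "", "CACHE:"];
  lines = lines.concat(spec.cache);
  lines.push("", "NETWORK:");
  lines = lines.concat(spec.network);
  lines.push("", "FALLBACK:");
  spec.fallback.forEach(function (pair) {
    lines.push(pair[0] + " " + pair[1]);
  });
  return lines.join("\n") + "\n";
}

console.log(buildManifest({
  version: "1",
  cache: ["/y.js", "/y.css"],
  network: ["/api"],
  fallback: [["/", "/offline.html"]]
}));
```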
Tools of trade

• Built-in debuggers of modern browsers.
   • Firebug.
• Network sniffers.
   • HTTPWatch, Fiddler.
• And...
Navigation timing


• For your debugging pleasure you can use
  Navigation Timing API.
• A lot of detailed timing information about the page load!
• Supported by FF, Cr, IE9.
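A sketch of turning the raw timestamps into durations. It reads attributes of `window.performance.timing` as defined by Navigation Timing; the helper name `summarizeTiming` and the choice of metrics are illustrative.

```javascript
// Derive a few durations (in ms) from a PerformanceTiming-like object.
function summarizeTiming(t) {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,
    connect: t.connectEnd - t.connectStart,
    firstByte: t.responseStart - t.requestStart,
    domReady: t.domContentLoadedEventStart - t.navigationStart,
    load: t.loadEventStart - t.navigationStart
  };
}

// Browser-only usage; inside the load handler loadEventStart is already set.
if (typeof window !== "undefined" && window.performance && window.performance.timing) {
  window.addEventListener("load", function () {
    console.log(summarizeTiming(window.performance.timing));
  });
}
```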
That’s all
  folks!
Picture credits


pie: http://en.wikipedia.org/wiki/File:Wikimedia_browser_share_pie_chart_3.png
really: http://www.flickr.com/photos/zpeckler/3093588439/
http pipelining: http://en.wikipedia.org/wiki/File:HTTP_pipelining2.svg
