Nginx Internals

   Joshua Zhu
   09/19/2009
Agenda
   Source code layout
   Key concepts and infrastructure
   The event-driven architecture
   HTTP request handling
   Mail proxying process
   Nginx module development
   Misc. topics
Source Code Layout
   Files
       $ find . -name "*.[hc]" -print | wc -l
        234
       $ ls src
        core event http mail misc os
   Lines of code
       $ find . -name "*.[hc]" -print | xargs wc -l | tail -n1
        110953 total
Code Organization
   core/
         The backbone and infrastructure
   event/
         The event-driven engine and modules
   http/
         The HTTP server and modules
   mail/
         The Mail proxy server and modules
   misc/
         C++ compatibility test and the Google perftools module
   os/
         OS dependent implementation files
Nginx Architecture
   Non-blocking
   Event driven
   Single threaded[*]
   One master process and several worker
    processes
   Resource efficient
   Highly modular
The Big Picture
Agenda
   Source code layout
   Key concepts and infrastructure
   The event-driven architecture
   HTTP request handling
   Mail proxying process
   Nginx module development
   Misc. topics
Memory Pool
   Avoid memory fragmentation
   Avoid memory leak
   Allocation and deallocation can be very
    fast
   Lifetime and pool size
       Cycle
       Connection
       Request
Memory Pool (cont’d)
   ngx_pool_t
       Small blocks
       Large blocks
       Free chain list
       Cleanup handler list
   API
       ngx_palloc
             memory aligned
       ngx_pnalloc
       ngx_pcalloc
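
The allocation idea behind the three calls above can be sketched as a bump allocator over one chunk. This is a minimal sketch under stated assumptions, not nginx's ngx_pool_t: the names pool_t, pool_create, pool_nalloc, pool_alloc, pool_calloc and pool_destroy are hypothetical, and large blocks, extra chunks and cleanup handlers are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-chunk pool (illustration only, not ngx_pool_t). */
typedef struct {
    unsigned char *last;  /* next free byte in the chunk */
    unsigned char *end;   /* one past the end of the chunk */
} pool_t;

/* One malloc() holds both the header and the chunk. */
static pool_t *pool_create(size_t size) {
    pool_t *p = malloc(sizeof(pool_t) + size);
    if (p == NULL) return NULL;
    p->last = (unsigned char *)(p + 1);
    p->end = p->last + size;
    return p;
}

/* Unaligned allocation: just move the "last" pointer forward. */
static void *pool_nalloc(pool_t *p, size_t n) {
    if ((size_t)(p->end - p->last) < n) return NULL; /* a real pool would add a chunk */
    void *m = p->last;
    p->last += n;
    return m;
}

/* Aligned allocation: pad "last" up to pointer alignment first. */
static void *pool_alloc(pool_t *p, size_t n) {
    size_t align = sizeof(void *);
    size_t pad = (align - (uintptr_t)p->last % align) % align;
    size_t room = (size_t)(p->end - p->last);
    if (room < pad || room - pad < n) return NULL;
    unsigned char *m = p->last + pad;
    p->last = m + n;
    return m;
}

/* Aligned and zero-filled allocation. */
static void *pool_calloc(pool_t *p, size_t n) {
    void *m = pool_alloc(p, n);
    if (m != NULL) memset(m, 0, n);
    return m;
}

/* Everything allocated from the pool is released in one call. */
static void pool_destroy(pool_t *p) { free(p); }
```

There is no per-object free: objects die together when the pool is destroyed, which is why a pool is tied to a lifetime such as a cycle, a connection or a request.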
Memory Pool Example (1 Chunk)
Memory Pool Example (2 Chunks)
Buffer Management
   Buffer
        Pointers
             memory
                    start/pos/last/end
             file
                    file_pos/file_last/file
        Flags
             last_buf
             last_in_chain
             flush
             in_file
             memory
             …
Buffer Management (cont’d)
   Buffer chain
       Singly-linked list of buffers
   Output chain
       Context
            in/free/busy chains
       Output filter
   Chain writer
       Writer context
String Utilities
   ngx_str_t
        data
        len
        sizeof() - 1
   Memory related
   String formatting
   String comparison
   String search
   Base64 encoding/decoding
   URI escaping/unescaping
   UTF-8 decoding
   String-to-number conversion
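
The "sizeof() - 1" item refers to taking the length of a string literal at compile time, without counting the terminating NUL. A minimal sketch of a length-counted string in that style; str_t, STR_LIT and str_eq are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <string.h>

/* Length-counted string: data does not have to be NUL-terminated. */
typedef struct {
    size_t len;
    unsigned char *data;
} str_t;

/* Initializer for literals: sizeof("abc") is 4, so subtract the NUL. */
#define STR_LIT(s) { sizeof(s) - 1, (unsigned char *) s }

/* Equal when the lengths match and the bytes match. */
static int str_eq(const str_t *a, const str_t *b) {
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

Because the length is explicit, a str_t can point into the middle of a larger buffer (for example a header name inside the request buffer) without copying.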
Data Structures
   Abstract data types
       Array
       List
       Queue
       Hash table
       Red black tree
       Radix tree
   Characteristic
       Set object values after the element is added
            keep interfaces clean
       Chunked memory (part)
            efficient
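
The "set values after adding" convention can be sketched with a growable array whose push operation returns a pointer to the new, uninitialized slot for the caller to fill in. This is a hedged illustration: array_t, array_init and array_push are hypothetical names, and the sketch grows with realloc() where a pool-backed container would allocate from its pool.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array of fixed-size elements. */
typedef struct {
    void  *elts;    /* element storage */
    size_t nelts;   /* elements in use */
    size_t size;    /* size of one element */
    size_t nalloc;  /* elements allocated */
} array_t;

/* n must be greater than zero. Returns 0 on success, -1 on failure. */
static int array_init(array_t *a, size_t n, size_t size) {
    a->elts = malloc(n * size);
    if (a->elts == NULL) return -1;
    a->nelts = 0;
    a->size = size;
    a->nalloc = n;
    return 0;
}

/* Reserve one slot and return its address; the caller sets the value. */
static void *array_push(array_t *a) {
    if (a->nelts == a->nalloc) {
        void *p = realloc(a->elts, 2 * a->nalloc * a->size);
        if (p == NULL) return NULL;
        a->elts = p;
        a->nalloc *= 2;
    }
    return (char *)a->elts + a->size * a->nelts++;
}
```

The push function never needs to know the element type or how to copy it, which is what keeps the interface small.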
Logging
   Error log
       Level
       Debug
   Access log
       Multiple logs
       Log format
            variables
       Per location
   Rotation
Configuration File
   Directive
        name
        type
        set
        conf
        offset
        post
   Parsing
        ngx_conf_parse
   Values
        init
        merge
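
The init/merge step for values can be sketched as follows: every field starts as "unset", and at merge time a child level inherits from its parent unless it set the field itself, falling back to a default when neither did. The names CONF_UNSET, CONF_MERGE, loc_conf_t, conf_init and conf_merge, and the default values, are assumptions made for this example.

```c
/* Marker meaning "no directive set this field". */
#define CONF_UNSET (-1)

/* Hypothetical per-location settings. */
typedef struct {
    int enable;
    int timeout;
} loc_conf_t;

/* init: mark every field as unset before parsing. */
static void conf_init(loc_conf_t *c) {
    c->enable = CONF_UNSET;
    c->timeout = CONF_UNSET;
}

/* merge one field: own value, else parent's value, else the default. */
#define CONF_MERGE(conf, prev, dflt)                               \
    do {                                                           \
        if ((conf) == CONF_UNSET) {                                \
            (conf) = ((prev) == CONF_UNSET) ? (dflt) : (prev);     \
        }                                                          \
    } while (0)

/* merge: resolve a child level against its parent level. */
static void conf_merge(const loc_conf_t *parent, loc_conf_t *child) {
    CONF_MERGE(child->enable, parent->enable, 0);
    CONF_MERGE(child->timeout, parent->timeout, 60);
}
```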
Configuration File (cont’d)
   Block
        events
        http
        server
        upstream
        location
        if
   Variables
        Built-ins
        Other types
               http_
               sent_http_
               upstream_http_
               cookie_
               arg_
Agenda
   Source code layout
   Key concepts and infrastructure
   The event-driven architecture
   HTTP request handling
   Mail proxying process
   Nginx module development
   Misc. topics
Master and Workers
   Master
       Monitor workers, respawn when a worker dies
       Handle signals and notify workers
            exit
            reconfiguration
            update
            log rotation
            …
   Worker
       Process client requests
            handle connections
       Get commands from master
Master Process Cycle
Worker Process Cycle
Inter-process Communication
   Signals
       Channel
            socketpair
            command
   Shared memory
       Connection counter
       Stat
       Atomic & spinlock
       Mutex
Event
   ngx_event_t
       Read
       Write
       Timeout
   Callbacks
   Handlers
       ngx_event_accept
       ngx_process_events_and_timers
       ngx_handle_read_event
       ngx_handle_write_event
   Posted events
       Posted accept events queue
       Posted events queue
Time Cache
   The overhead of gettimeofday()
   Time cache variables
       ngx_cached_time
       ngx_current_msec
       Time strings
            ngx_cached_err_log_time
            ngx_cached_http_time
            ngx_cached_http_log_time
   Timer resolution
       Interval timer
            setitimer()
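
The time cache idea can be sketched like this: the clock is read in one place (for example once per event-loop iteration or on a timer tick), and everything else reads the cached value. The names cached_msec, time_updates, time_update and now_msec are hypothetical; only gettimeofday() is a real call here.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Cached wall-clock time in milliseconds (plays the role of a cache variable). */
static uint64_t cached_msec;

/* Counts how often the system clock was actually read. */
static unsigned time_updates;

/* The only place that pays for gettimeofday(). */
static void time_update(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    cached_msec = (uint64_t)tv.tv_sec * 1000 + (uint64_t)tv.tv_usec / 1000;
    time_updates++;
}

/* Cheap accessor used by all other code; never touches the clock. */
static uint64_t now_msec(void) {
    return cached_msec;
}
```

The trade-off is precision: readers see a time that is at most one update interval old, which is what the timer resolution setting controls.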
Events and Timers Processing
Timer Management
   Actions
       Add a timer
       Delete a timer
       Get the minimum timer
   Red black tree[*]
       O(log n) complexity
Accept Mutex
   Thundering herd
   Serialize accept()
   Lock/unlock
   Listening sockets
   Delay
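
Serializing accept() comes down to a non-blocking try-lock: a worker that wins the lock handles the listening sockets for this iteration, and the others skip them instead of all waking up at once. A minimal sketch using C11 atomics; accept_mutex_t, accept_trylock and accept_unlock are hypothetical names, and in a real multi-process server the lock word would have to live in shared memory.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Lock word: 0 means free, otherwise the id of the owning worker. */
typedef struct {
    atomic_int owner;
} accept_mutex_t;

/* Try to take the lock without blocking; true if this worker now owns it. */
static bool accept_trylock(accept_mutex_t *m, int worker_id) {
    int expected = 0;
    return atomic_compare_exchange_strong(&m->owner, &expected, worker_id);
}

/* Release the lock; has no effect unless worker_id is the current owner. */
static void accept_unlock(accept_mutex_t *m, int worker_id) {
    int expected = worker_id;
    atomic_compare_exchange_strong(&m->owner, &expected, 0);
}
```

A worker that fails the try-lock simply continues with its existing connections and retries on a later iteration.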
I/O
   Multiplexing
        kqueue/epoll
              NGX_USE_CLEAR_EVENT (edge triggered)
        select, poll, /dev/poll
              NGX_USE_LEVEL_EVENT (level triggered)
        …
   Advanced I/O
        sendfile()
        writev()
        direct I/O
        mmap()
        AIO
        TCP/IP options
              TCP_CORK/TCP_NODELAY/TCP_DEFER_ACCEPT
Agenda
   Source code layout
   Key concepts and infrastructure
   The event-driven architecture
   HTTP request handling
   Mail proxying process
   Nginx module development
   Misc. topics
Important Structures
   Connection
       ngx_connection_t
   HTTP connection
       ngx_http_connection_t
   HTTP request
       ngx_http_request_t
            headers_in
            headers_out
            …
Virtual Servers
   Address
   Port
   Server names
   Core server conf
Locations
   Location tree
       Static
       Regex
            = ^~ ~ ~*
   Per-location configuration
       Value
            inheritance
            override
       Handler
   Named location
       try_files/post_action/error_page
HTTP Contexts
   Types
        main_conf
        srv_conf
        loc_conf
   Request
        ngx_http_get_module_main_conf
        ngx_http_get_module_srv_conf
        ngx_http_get_module_loc_conf
   Parse conf file
        ngx_http_conf_get_module_main_conf
        ngx_http_conf_get_module_srv_conf
        ngx_http_conf_get_module_loc_conf
   Module context
        ngx_http_get_module_ctx
        ngx_http_set_ctx
HTTP Handling
   Receive data
   Parse the request
   Find the virtual server
   Find the location
   Run phase handlers
   Generate the response
   Filter response headers
   Filter the response body
   Send out the output to the client
Request Parsing
   Request line
   Headers
   Interesting tricks
       Finite state machine
       ngx_strX_cmp
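
The ngx_strX_cmp trick compares several characters with one integer comparison instead of a byte-by-byte loop. The sketch below shows the idea in a portable form: it assembles the four bytes with shifts rather than reading an unaligned word, so it does not depend on byte order. load4, str4_eq, method_t and parse_method are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble 4 bytes into one 32-bit value (first byte in the low bits). */
static uint32_t load4(const unsigned char *m) {
    return (uint32_t)m[0] | ((uint32_t)m[1] << 8)
         | ((uint32_t)m[2] << 16) | ((uint32_t)m[3] << 24);
}

/* Compare 4 bytes at m against 4 characters with a single integer test. */
static int str4_eq(const unsigned char *m, unsigned char c0, unsigned char c1,
                   unsigned char c2, unsigned char c3) {
    uint32_t want = (uint32_t)c0 | ((uint32_t)c1 << 8)
                  | ((uint32_t)c2 << 16) | ((uint32_t)c3 << 24);
    return load4(m) == want;
}

typedef enum { M_UNKNOWN, M_GET, M_POST, M_HEAD } method_t;

/* Recognize the method at the start of a NUL-terminated request line. */
static method_t parse_method(const char *line) {
    const unsigned char *m = (const unsigned char *)line;
    size_t len = 0;

    while (m[len] != '\0' && m[len] != ' ') len++;
    if (m[len] != ' ') return M_UNKNOWN;      /* no space after the method */

    /* For 3-letter methods the trailing space is the 4th compared byte. */
    if (len == 3 && str4_eq(m, 'G', 'E', 'T', ' ')) return M_GET;
    if (len == 4) {
        if (str4_eq(m, 'P', 'O', 'S', 'T')) return M_POST;
        if (str4_eq(m, 'H', 'E', 'A', 'D')) return M_HEAD;
    }
    return M_UNKNOWN;
}
```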
Phases and Handlers
   Phases
        POST_READ
        SERVER_REWRITE
        FIND_CONFIG
        REWRITE
        POST_REWRITE
        PREACCESS
        ACCESS
        POST_ACCESS
        TRY_FILES
        CONTENT
        LOG
   Phase handler
        Checker
        Handler
        Next
Phases and Handlers (cont’d)
   Phase engine
        Handlers
        server_rewrite_index
        location_rewrite_index
        r->phase_handler
   Default checkers
        ngx_http_core_generic_phase
        ngx_http_core_find_config_phase
        ngx_http_core_post_rewrite_phase
        ngx_http_core_access_phase
        ngx_http_core_post_access_phase
        ngx_http_core_try_files_phase
        ngx_http_core_content_phase
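
The checker/handler/next triple and the r->phase_handler index can be sketched as a flat array walked by a small loop. This is a simplified model under stated assumptions: the types, the return codes RC_OK, RC_DECLINED and RC_AGAIN, and the demo handlers h_a to h_d are hypothetical, and only a generic checker is shown.

```c
#include <stddef.h>

enum { RC_OK, RC_DECLINED, RC_AGAIN };

typedef struct request_s request_t;
typedef struct phase_handler_s phase_handler_t;

struct phase_handler_s {
    int (*checker)(request_t *r, phase_handler_t *ph); /* decides where to go next */
    int (*handler)(request_t *r);                      /* the module's handler */
    size_t next;                                       /* first handler of the next phase */
};

struct request_s {
    size_t phase_handler;  /* index of the handler to run next */
    char   trace[8];       /* demo only: which handlers ran */
    size_t ntrace;
};

/* Generic checker: OK jumps to the next phase, DECLINED tries the next handler. */
static int generic_checker(request_t *r, phase_handler_t *ph) {
    int rc = ph->handler(r);
    if (rc == RC_OK)       { r->phase_handler = ph->next; return RC_AGAIN; }
    if (rc == RC_DECLINED) { r->phase_handler++;          return RC_AGAIN; }
    return RC_OK;          /* anything else stops the engine in this sketch */
}

/* Engine: run checkers until one says stop or the table ends (NULL checker). */
static void run_phases(request_t *r, phase_handler_t *ph) {
    while (ph[r->phase_handler].checker != NULL) {
        if (ph[r->phase_handler].checker(r, &ph[r->phase_handler]) == RC_OK) {
            return;
        }
    }
}

/* Demo handlers that record their letter in the trace. */
static int h_a(request_t *r) { r->trace[r->ntrace++] = 'a'; return RC_DECLINED; }
static int h_b(request_t *r) { r->trace[r->ntrace++] = 'b'; return RC_OK; }
static int h_c(request_t *r) { r->trace[r->ntrace++] = 'c'; return RC_DECLINED; }
static int h_d(request_t *r) { r->trace[r->ntrace++] = 'd'; return RC_DECLINED; }
```

With a table where h_a, h_b and h_c form the first phase and h_d the second, h_a declines, h_b returns OK and the engine jumps over h_c straight to h_d.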
Phases and Handlers (cont’d)
phase            modules
POST_READ        realip
SERVER_REWRITE   rewrite
REWRITE          rewrite
PREACCESS        limit_req, limit_zone, realip
ACCESS           access, auth_basic
CONTENT          autoindex, dav, gzip, index,
                 random_index, static
LOG              log
Filter Chain
   Like a singly-linked list (Chain of Responsibility pattern)
   Filter response only
        Header filter
        Body filter
   Send out the response
        ngx_http_send_header
              ngx_http_top_header_filter
        ngx_http_output_filter
              ngx_http_top_body_filter
        ngx_http_header_filter
        ngx_http_copy_filter
        ngx_http_write_filter
   Process order
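
The list-like chain and its process order can be sketched with a "top" function pointer: each filter saves the current top as its own next pointer and then installs itself as the new top, so filters registered later run earlier. This is a simplified model, not the real filter signatures; body_filter_pt, write_filter, upper_filter, append_filter, their init functions, output_filter and the sent buffer are all hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef int (*body_filter_pt)(char *buf, size_t cap);

/* Demo stand-in for "sent to the client". */
static char sent[64];

/* Last filter in the chain: writes the output. */
static int write_filter(char *buf, size_t cap) {
    (void)cap;
    strncpy(sent, buf, sizeof(sent) - 1);
    sent[sizeof(sent) - 1] = '\0';
    return 0;
}

/* Head of the chain; starts with only the write filter. */
static body_filter_pt top_body_filter = write_filter;

/* Filter 1: upper-case the body, then pass it on. */
static body_filter_pt upper_next;
static int upper_filter(char *buf, size_t cap) {
    for (char *p = buf; *p != '\0'; p++) *p = (char)toupper((unsigned char)*p);
    return upper_next(buf, cap);
}
static void upper_init(void) {
    upper_next = top_body_filter;    /* remember the old head */
    top_body_filter = upper_filter;  /* become the new head */
}

/* Filter 2: append an 'x' if there is room, then pass it on. */
static body_filter_pt append_next;
static int append_filter(char *buf, size_t cap) {
    size_t n = strlen(buf);
    if (n + 1 < cap) { buf[n] = 'x'; buf[n + 1] = '\0'; }
    return append_next(buf, cap);
}
static void append_init(void) {
    append_next = top_body_filter;
    top_body_filter = append_filter;
}

/* Entry point: hand the body to whatever filter is currently on top. */
static int output_filter(char *buf, size_t cap) {
    return top_body_filter(buf, cap);
}
```

Calling upper_init() and then append_init() yields the order append, upper, write: the appended character is upper-cased too, because the filter installed last sees the body first.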
Filter Chain Example
HTTP Handling Example
   curl -i http://localhost/
HTTP Keep-Alive
   Request memory reuse
   Connection memory shrink
   Keep-alive timeout
   Request count
Subrequest
   Filters
       Addition filter
       SSI filter
   Maximum subrequests
Internal Redirect
   Return a different URL than originally
    requested
   Examples
       try_files
       index/random_index
       post_action
       send_error_page
       upstream_process_headers
Upstream
   Hooks
        input_filter_init
        input_filter
        create_request
        reinit_request
        process_header
        abort_request
        finalize_request
        rewrite_redirect
   Modules
        FastCGI
        Proxy
        Memcached
   Event pipe
   Load balancer
Agenda
   Source code layout
   Key concepts and infrastructure
   The event-driven architecture
   HTTP request handling
   Mail proxying process
   Nginx module development
   Misc. topics
Mail Proxy
   Sequence diagram
Mail Proxy (cont’d)
   Mail session
       Command parsing
       Packets relay
   Things you can do
       Load balancing
       Authentication rewriting
       Black lists/white lists
Agenda
   Source code layout
   Key concepts and infrastructure
   The event-driven architecture
   HTTP request handling
   Mail proxying process
   Nginx module development
   Misc. topics
General Module Interface
   Context
        index & ctx_index
   Directives
   Type
        core/event/http/mail
   Hooks
        init_master
              called at master process initialization
        init_module
              called when the module is loaded
        init_process
              called at worker process initialization
        exit_process
              called at worker process termination
        exit_master
              called at master process termination
Core Module Interface
   Name
   Hooks
       create_conf
       init_conf
   Examples
       Core
       Events
       Log
       HTTP
Event Module Interface
   Name
   Hooks
       create_conf
       init_conf
       event_actions
            add
            del
            enable
            disable
            add_conn
            del_conn
            process_changes
            process_events
            init
            done
Mail Module Interface
   Protocol
       type
       init_session
       init_protocol
       parse_command
       auth_state
   create_main_conf
   init_main_conf
   create_srv_conf
   merge_srv_conf
HTTP Module Interface
   Hooks
       preconfiguration
       postconfiguration
       create_main_conf
       init_main_conf
       create_srv_conf
       merge_srv_conf
       create_loc_conf
       merge_loc_conf
A “Hello World” HTTP Module
• Creating a hello world! module
      Files
           ngx_http_hello_module.c
           config
      Build
           ./configure --add-module=/path/to/hello/module
      Configuration
           location & directive
Agenda
   Source code layout
   Key concepts and infrastructure
   The event-driven architecture
   HTTP request handling
   Mail proxying process
   Nginx module development
   Misc. topics
Auto Scripts
   Handle the differences
       OS
       Compiler
       Data types
       Libraries
   Module enable/disable
   Modules order
Reconfiguration
Hot Code Swapping
Thank You!
   My site: http://www.zhuzhaoyuan.com
   My blog: http://blog.zhuzhaoyuan.com
