Understanding HTTP and DNS Basics
Under your browser's hood lies a collection of files -- CSS, HTML, JavaScript,
videos, images, etc. -- that makes displaying the page possible. All these files were
sent from a server to your browser, the client, by an application protocol called
HTTP (yes, this is why URLs in your browser address bar start with "http://").
HTTP has been through several changes since its inception. The
protocol started in its simplest form, returning only HTML pages. In 1991, the
first documented version, HTTP/0.9, was released. HTTP/1.0 followed, formally
documented in 1996 (RFC 1945), with the ability to transmit different file types
like CSS documents, videos, scripts and images. HTTP/1.1, first specified in 1997
(RFC 2068), introduced the ability to reuse established connections for subsequent
requests, among a host of other features. Further revisions to HTTP/1.1 in 1999
(RFC 2616) resulted in what we mostly see today. The evolution of HTTP doesn't stop
there though. HTTP/2 was standardized in 2015 and is widely deployed, and the
latest version, HTTP/3, was standardized in 2022.
Devices on a network are addressed by an IP address and a port number, written
together like this:
192.168.0.1:1234
where the IP address is 192.168.0.1 and the port number is 1234. An IP address
acts as the identifier for a device or server, which can contain hundreds or
thousands of ports, each used for a different communication purpose to that device
or server.
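As a minimal sketch of the address format above, the snippet below splits an "IP:port" string into its two parts using Python's standard ipaddress module. The helper name split_endpoint is hypothetical and is used only for illustration.

```python
import ipaddress

def split_endpoint(endpoint):
    """Split 'ip:port' into a validated IPv4 address and an integer port."""
    host, _, port_text = endpoint.rpartition(":")
    port = int(port_text)
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # IPv4Address raises ValueError if the text is not a valid address.
    return ipaddress.IPv4Address(host), port

ip, port = split_endpoint("192.168.0.1:1234")
print(ip, port)  # 192.168.0.1 1234
```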
When it comes to the wider Internet, effective communication begins when each
device has a public IP address provided by an Internet Service Provider. However,
when we wish to connect to a resource like Google, we type in a URL, not an IP
address. How does the computer map www.google.com to the appropriate IP
address?
DNS
This mapping from domain name to IP address is handled by the Domain Name
System or DNS. DNS is a distributed database which translates domain names
like www.google.com to an IP address, so that the IP address can then be used to
make a request to the server. Stated differently, it keeps track of domain names and
their corresponding IP addresses on the Internet. So an address
like www.google.com might be resolved to an IP address such as 197.251.230.45.
By the way, you can also get to Google's main page by typing the IP address into
your browser's address bar. However, most people want to use a user-friendly
address like www.google.com instead of memorizing a string of digits. DNS
databases are stored on computers called DNS servers. It is important to know that
there is a very large worldwide network of hierarchically organized DNS servers,
and no single DNS server contains the complete database. If a DNS server does not
contain a requested domain name, the DNS server routes the request to another
DNS server up the hierarchy. Eventually, the address will be found in the DNS
database on a particular DNS server, and the corresponding IP address will be used
to direct the request to the server that should receive it.
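The hand-off between DNS servers can be pictured with a toy model. The sketch below is not how real DNS software works; it only mirrors the description above, where a server that lacks a name passes the lookup up the hierarchy. The class ToyDnsServer and the name intranet.example are hypothetical.

```python
class ToyDnsServer:
    """A toy stand-in for a DNS server: a partial table plus an optional parent."""

    def __init__(self, records, parent=None):
        self.records = dict(records)
        self.parent = parent

    def resolve(self, name):
        # Answer from our own partial database when we can.
        if name in self.records:
            return self.records[name]
        # Otherwise route the lookup to the next server up the hierarchy.
        if self.parent is not None:
            return self.parent.resolve(name)
        raise LookupError(f"no record for {name}")

upstream = ToyDnsServer({"www.google.com": "197.251.230.45"})
local = ToyDnsServer({"intranet.example": "192.168.0.1"}, parent=upstream)

print(local.resolve("intranet.example"))  # 192.168.0.1
print(local.resolve("www.google.com"))    # 197.251.230.45
```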
A typical interaction with the Internet through a web browser goes like this:
1. You enter a URL like http://www.google.com into your web browser's address bar.
2. The browser creates an HTTP request, which is packaged up and sent to
your device's network interface.
3. If your device already has a record of the IP address for the domain name in
its DNS cache, it will use this cached address. If the IP address isn't cached,
a DNS request will be made to the Domain Name System to obtain the IP
address for the domain.
4. The packaged-up HTTP request then goes over the Internet where it is
directed to the server with the matching IP address.
5. The remote server accepts the request and sends a response over the
Internet back to your network interface which hands it to your browser.
6. Finally, the browser displays the response in the form of a web page.
Facebook had a surprising 5.5-hour service outage in
2021. The articles "Understanding how Facebook disappeared from the
Internet" and "Why was Facebook down for five hours?" explain in detail the
role of DNS servers in the outage. It is an interesting case study, but by no
means required reading. The content goes beyond the scope of this
course, so if you choose to take a look, absorb as much as you can and
don't worry about understanding all of it.
The above set of steps is a simplification of what happens at a technical level. The
main thing to understand though is that when your browser issues a request, it's
simply sending some text to an IP address. Because the client (web browser) and
the server (recipient of the request) have an agreement, or protocol, in the form of
HTTP, the server can take apart the request, understand its components and send a
response back to the web browser. The web browser will then process the response
strings into content that you can understand. Navigating to websites like Facebook,
Google and Twitter means you've been using HTTP all along. The details were
hidden, but your browser was issuing the requests and processing the responses
automatically. The different parts of the Internet look something like:
Clients and Servers
The most common client is an application you interact with on a daily basis called
a Web Browser. Examples of web browsers include Internet Explorer, Firefox,
Safari and Chrome, including mobile versions. Web browsers are responsible for
issuing HTTP requests and processing the HTTP response in a user-friendly
manner onto your screen. Web browsers aren't the only clients around, as there are
many tools and applications that can also issue HTTP requests.
Statelessness
A protocol is said to be stateless when it's designed in such a way that each
request/response pair is completely independent of the previous one. It is important
to be aware of HTTP as a stateless protocol and the impact it has on server
resources and ease of use. In the context of HTTP, it means that the server does
not need to hang on to information, or state, between requests. As a result, when a
request breaks en route to the server, no part of the system has to do any cleanup.
Both these reasons make HTTP a resilient protocol, as well as a difficult protocol for
building stateful applications. Since HTTP, the protocol of the Internet, is inherently
stateless, web developers have to work hard to simulate a stateful
experience in web applications.
When you go to Facebook, for example, and log in, you expect to see the internal
Facebook page. That was one complete request/response cycle. You then click on
a picture -- another request/response cycle -- but you do not expect to be logged
out after that action. If HTTP is stateless, how did the application maintain state and
remember that you already input your username and password? In fact, if HTTP is
stateless, how does Facebook even know this request came from you, and how
does it differentiate data from you vs. any other user? There are tricks web
developers and frameworks employ to make it seem like the application is stateful,
but those tricks are beyond the scope of this book. The key concept to remember is
that even though you may feel the application is stateful, underneath the hood, the
web is built on HTTP, a stateless protocol. It's what makes the web so resilient,
distributed, and hard to control. It's also what makes it so difficult to secure and
build on top of.
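Statelessness can be illustrated with a deliberately simplified model. In the sketch below a request is just a Python dict and the handler is a plain function; both are assumptions made for illustration, not real HTTP. The point is that the handler remembers nothing, so its response depends only on what the current request carries.

```python
def handle(request):
    """A stateless handler: the response depends only on this one request."""
    user = request.get("user")  # hypothetical field carried by the request itself
    if user is None:
        return {"status": 200, "body": "Hello, stranger"}
    return {"status": 200, "body": f"Hello, {user}"}

first = handle({"path": "/login", "user": "alice"})
second = handle({"path": "/photos"})  # nothing ties this to the first request

print(first["body"])   # Hello, alice
print(second["body"])  # Hello, stranger
```

Because the second request carries no identifying information, the handler has no way to know it came from the same person.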
Summary
This chapter covered an oversimplified interpretation of how the Internet works
along with an explanation of a few key terms. You also learned about statelessness
and how it impacts web applications. We'll take a closer look at what an address
such as http://www.google.com is and what it's made up of in the next chapter.
What is a URL?
Introduction
When you need to locate someone's home, you need their house address. If you
want to call your friend, you need your friend's phone number. Without that
information, finding that house or calling your friend is not possible. Further, if you're
provided an address or phone number, you can immediately tell one from the other,
due to the uniformity of how an address is formatted vs. how a phone number is
formatted.
There's a similar concept for finding and accessing servers on the Internet. When
you want to check Facebook's games page, you start by launching your web
browser and navigating to http://www.facebook.com/games. The web browser
makes an HTTP request to this address, resulting in the resource being returned to
your browser. The address you entered, http://www.facebook.com/games, is known
as a Uniform Resource Locator or URL. A URL is like that address or phone number
you need in order to visit or communicate with your friend. A URL is the most
frequently used part of the general concept of a Uniform Resource Identifier or URI,
which specifies how resources are located. This book does not discuss URIs, but if
you are curious to know more, we recommend The Real Difference Between a URL
and a URI . This section looks at what a URL is, its components and what it means
to you as a web developer.
URL Components
When you see a URL, such as "http://www.example.com:88/home?item=book", it is
composed of several components. We can break this URL into 5 parts:
http : The scheme. It always comes before the colon and two
forward slashes and tells the web client how to access the resource.
In this case it tells the web client to use the Hypertext Transfer
Protocol or HTTP to make a request. Other popular URL schemes
are ftp , mailto or git . You may sometimes see this part of the URL
referred to as the "protocol". There is a connection between the
scheme and the protocol, as the scheme can indicate which protocol
(or system of rules) should be used to access the resource.
However, the correct term to use in this context is "scheme".
www.example.com : The host. It tells the client where the resource is
hosted or located.
:88 : The port or port number. It is only required if you want to use
a port other than the default.
/home : The path. It shows what local resource is being requested.
This part of the URL is optional.
?item=book : The query string, which is made up of query
parameters. It is used to send data to the server. This part of the
URL is also optional.
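The same five parts can be pulled out programmatically. This is a small sketch that uses urlsplit from Python's standard urllib.parse module as a stand-in for what a web client does when it reads a URL.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.example.com:88/home?item=book")

print(parts.scheme)    # http
print(parts.hostname)  # www.example.com
print(parts.port)      # 88
print(parts.path)      # /home
print(parts.query)     # item=book
```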
Sometimes, the path can point to a specific resource on the host. For
instance, www.example.com/home/index.html points to an HTML file located on the
example.com server.
Sometimes, we may want to include a port number in the URL. A URL of the form
http://localhost:3000/profile specifies that we want to use port 3000. The
default port number for HTTP is port 80. Even though this port number is rarely
written out, it is implied in every HTTP URL that omits one. To use
anything other than the default, one has to specify it in the URL.
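The default-port rule can be sketched in a few lines. The helper effective_port and the DEFAULT_PORTS table are hypothetical names, and the table only lists the HTTP default mentioned in this section.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80}  # only the default discussed in this section

def effective_port(url):
    """Return the port a client would connect to for the given URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port              # an explicit port always wins
    return DEFAULT_PORTS[parts.scheme]  # otherwise fall back to the default

print(effective_port("http://localhost:3000/profile"))  # 3000
print(effective_port("http://www.example.com/home"))    # 80
```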
Query Strings/Parameters
A simple URL with a query string might look like:
http://www.example.com?search=ruby&results=10
Query String Component    Description
?                         This is a reserved character that marks the start of the query string.
&                         This is a reserved character, used when adding more parameters to the query string.
Now let's take a look at an example. Suppose we had the following URL:
http://www.phoneshop.com?product=iphone&size=32gb&color=white
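To see how a URL like this breaks down into name/value pairs, here is a short sketch using parse_qsl from Python's standard urllib.parse module. It only parses the text of the URL; no request is sent.

```python
from urllib.parse import urlsplit, parse_qsl

url = "http://www.phoneshop.com?product=iphone&size=32gb&color=white"

# Everything after the ? is the query string; & separates the parameters.
query = urlsplit(url).query
pairs = parse_qsl(query)

print(query)  # product=iphone&size=32gb&color=white
print(pairs)  # [('product', 'iphone'), ('size', '32gb'), ('color', 'white')]
```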
Because query strings are passed in through the URL, they are most commonly used in
HTTP GET requests. We'll talk about the different HTTP requests later in the book,
but for now just know that whenever you type a URL into the address bar of your
browser, you're issuing an HTTP GET request. Most links also issue HTTP GET
requests, though there are some minor exceptions.
Query strings are a great way to pass additional information to the server. However,
there are some limits to their use:
Query strings have a maximum length in practice, because browsers and servers
cap the length of a URL. Therefore, if you have a lot of data to
pass on, you will not be able to do so with query strings.
The name/value pairs used in query strings are visible in the URL. For this
reason, passing sensitive information like a username or password to the
server in this manner is not recommended.
Spaces and special characters like & cannot be used unencoded in query strings. They
must be URL encoded, which we'll talk about next.
URL Encoding
URLs are designed to accept only certain characters in the standard 128-character
ASCII character set. Reserved or unsafe ASCII characters which are not
being used for their intended purpose, as well as characters not in this set, have to
be encoded. URL encoding replaces each of these non-conforming
characters with its UTF-8 bytes, writing every byte as a % symbol followed by the
two hexadecimal digits that represent that byte.
In most cases, you don't have to worry too much about UTF-8. However, it's worth
knowing that UTF-8 uses 1-4 bytes to represent every possible character in the
Unicode character set. Below are some popular encoded characters and example
URLs:
Character   Encoding        Example URL
$           %24             http://www.spam.com/i-have-%2410-million-for-you
£           %C2%A3          http://www.spam.com/big-inheritance-%C2%A3-millions
€           %E2%82%AC       http://www.spam.com/big-inheritance-%E2%82%AC-millions
𐍈           %F0%90%8D%88    http://www.symbols-of-the-world.com/hwair-%F0%90%8D%88
It's helpful to remember that only the characters in the 7-bit ASCII character set
have single-byte UTF-8 codes. Characters from the extended ASCII character sets
(see http://www.asciitable.com/) take two or more bytes. Thus, $ can be represented
as %24, while Æ is represented as %C3%86.
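The link between UTF-8 bytes and percent-encoding can be checked directly. The sketch below defines a hypothetical helper, percent_encode, that writes every UTF-8 byte of a character as % plus two hex digits, and compares the result with quote from Python's standard urllib.parse module.

```python
from urllib.parse import quote

def percent_encode(char):
    """Write each UTF-8 byte of the character as % plus two hex digits."""
    utf8 = char.encode("utf-8")
    return "".join(f"%{byte:02X}" for byte in utf8)

for ch in ["$", "£", "€"]:
    encoded = percent_encode(ch)
    # quote(..., safe="") encodes everything that is not always safe in a URL.
    print(ch, len(ch.encode("utf-8")), encoded, encoded == quote(ch, safe=""))
```

The loop prints one byte for $, two for £ and three for €, matching the number of %XX groups in each encoding.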
3. The character is reserved for special use within the URL scheme. Some
characters are reserved for a special meaning; their presence in a URL
serves a specific purpose. Characters such as /, ?, :, @, and & are all
reserved and must be encoded when they are not serving that purpose. For example,
& is reserved for use as a query string delimiter. : is also reserved to delimit
host/port components and user/password.
So what characters can be used safely within a URL? Only alphanumeric characters, the
special characters "$-_.+!*'(),", and reserved characters when used for their
reserved purposes can be used unencoded within a URL. Whenever a reserved character
is not being used for its reserved purpose, it has to be encoded.
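Here is a short sketch of what that means for a value that happens to contain a reserved character. The value "salt & pepper" and the parameter name search are made up for illustration; quote and urlencode come from Python's standard urllib.parse module.

```python
from urllib.parse import quote, urlencode

value = "salt & pepper"  # the & here is data, not a parameter separator

# Percent-encode everything that is not always safe in a URL.
print(quote(value, safe=""))         # salt%20%26%20pepper

# urlencode builds a whole query string; by default it writes spaces as +,
# the convention used for form-style query strings.
print(urlencode({"search": value}))  # search=salt+%26+pepper
```

In both forms the & inside the value becomes %26, so it can no longer be mistaken for the delimiter between parameters.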
Summary
In this chapter, we've discussed what a URL is. We also looked at
the components of a URL and concluded by exploring URL encoding. We'll dive a little
deeper into requests and responses and what they comprise after the
preparations chapter.
Preparations
With the basics of HTTP out of the way, let's get acquainted with the tools we'll use
in this book to demonstrate how HTTP works. This section goes through a few tools
you'll need to follow along. Note that you only need one of the tools listed. It is
important to note that you will be able to follow this book regardless of what tool you
use, so pick one that you're comfortable with and get started!
HTTP GUI Tools
We'll make heavy use of Graphical HTTP tools throughout this book. There are lots
of options available in this category of HTTP tools, and we'll be using RapidAPI,
also known as Paw.
That said, there are many other alternatives out there. Some other capable options
are Insomnia and Postman; they're both free and they work on OS X, Windows, and
Ubuntu.
Mac OS X/Linux:
curl ships with OS X and most GNU/Linux distributions, and you can simply
invoke it on the command line by issuing the command below.
$ curl www.google.com
Windows:
If you have version 1803 or later of Windows 10, cURL should be installed by
default. If you have an earlier version of Windows or cannot find cURL on your
version, we recommend installing a GUI version of cURL. You won't be able to
execute the command line commands, but you should still be able to perform the
appropriate actions in the GUI.
For that, we can use an HTTP tool and just like the browser did when we entered a
URL in the address bar, we can have our HTTP tool issue a request
to https://www.reddit.com . Our HTTP tool, Paw, doesn't process the response and
lets us see the raw response data, which looks something like this:
What a huge difference this raw response is from the display in your browser! If
you've never seen raw HTTP response data before, this may be quite shocking.
What you see here is, in fact, what your browser also receives, except it parses and
processes that huge blob of data into a user-friendly format.
If you're learning about HTTP in order to become a web developer, you'll need to
learn to read and process raw HTTP response data just by scanning it. Of course,
you won't be able to convert it into a high-resolution picture in your head, but you
should have a general idea of what the response is about. With enough experience,
you can dig into the raw data and do some debugging and see exactly what's in the
response.
1. Launch the Chrome browser and open the Inspector by navigating to the Chrome
Menu on the top right-hand corner of your browser. Select Tools > More
Tools > Developer Tools. There are other ways of accessing the inspector,
such as right-clicking and selecting the 'Inspect' option, or using the keyboard
shortcut Ctrl+Shift+I (or Option+Command+I on a Mac).
2. Send a new request to Reddit by entering the
address https://www.reddit.com into your browser.
3. With the inspector still open, click on the Network tab.
4. The first thing you should notice is that there are a lot of entries
there. Each entry is a separate request, which means just by visiting
the URL, your browser is making multiple requests, one for every
resource (image, file, etc.). Click on the first request for the main
page, www.reddit.com entry:
5. From here, you'll be able to see the specific request headers,
cookies as well as the raw response data:
The default sub-tab, Headers, shows the request headers sent to the server, as well
as the response headers received back from the server.
Another thing to note when using the inspector's Network tab is that, other than the
first request, there are a ton of other requests listed:
Why are these additional responses sent back, and who initiated the requests? What's
happening is that the resource we requested, the initial www.reddit.com entry,
returned some HTML. And in that HTML body are references to other resources like
images, CSS stylesheets, JavaScript files and more. Your browser, being smart and
helpful, understands that in order to produce a visually appealing presentation, it
has to go and grab all these referenced resources. Hence, the browser will make
separate requests for each resource referenced in the initial response. When you
scroll down the Network tab, you'll be able to see all the referenced resources.
These other requests are to make sure the page displays properly on your screen,
among other things. Overall, you see that the browser's inspector gives you a good
feel for these referenced resources. A pure HTTP tool, on the other hand, returns
one huge response chunk without any concern for automatically pulling in
referenced resources. A curl request will demonstrate this:
$ curl -X GET "https://www.reddit.com/" -m 30 -v
To send a User-Agent with the request, append the -A option to the command:
-A 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.101 Safari/537.36'
The -A option is used to specify a User-Agent for an HTTP request when
using curl. Its value is the User-Agent string itself, without a "User-Agent:"
prefix. Since this is another option for our command, don't forget to
add a space between -v and -A. For the sake of simplicity, we specify
the User-Agent that is listed at the end of this page. You may use your
own User-Agent as well.
What you should see is just one request and the response containing the HTML, but
no additional requests being automatically issued, like you see in a browser.
Request Methods
Let's revisit the diagram from Step 3 above, when we looked at the responses in
the Network tab. You might have noticed two columns named Method and Status.
If you don't see the Method column, it may be hidden by default. To display
the Method column, right click on Status and select Method. The Method column
should now be visible next to the Status column.
We'll spend this section looking at what the information shown in these columns
means.
Information displayed in the Method column is known as the HTTP Request
Method. You can think of this as the verb that tells the server what action to
perform on a resource. The two most common HTTP request methods you'll see
are GET and POST . When you think about retrieving information, think GET , which is
the most used HTTP request method. In the above diagram, you'll notice almost all
of the requests use GET to retrieve the resources needed to display the web page.
The Status column shows the response status for each request. We'll talk about
responses in detail later in the book. The important thing to understand is that every
request gets a response, even if the response is an error -- that's still a response.
(That's not 100% technically true as some requests can time out, but we'll set those
rare cases aside for now.)
GET Requests
GET requests are initiated by clicking a link or via the address bar of a browser.
When you type an address like https://www.reddit.com into the address bar of your
browser, you're making a GET request. You're asking the web browser to go retrieve
the resource at that address, which means we've been making GET requests
throughout this book. The same goes for interacting with links on web applications.
The default behavior of a link is to issue a GET request to a URL. Let's make a
simple GET request to https://www.reddit.com with an HTTP tool. Make sure to
select GET and enter the address:
You can view the raw HTTP response and other information sent back from the web
server on the right panel.
$ curl -X GET "https://www.reddit.com/" -m 30 -v
We can also send query strings using an HTTP tool. Let's look at another quick
example by sending a request to search for all things Michael
Jackson at https://itunes.apple.com/ with query strings. The final URL will look like
this:
https://itunes.apple.com/search?term=Michael%20Jackson
$ curl -X GET "https://itunes.apple.com/search?term=Michael%20Jackson" -m 30 -v
That's all you need to know about issuing HTTP GET requests for now. The primary
concepts are:
GET requests are used to retrieve a resource, and most links are GETs.
The response from a GET request can be anything, but if it's HTML and that
HTML references other resources, your browser will automatically request
those referenced resources. A pure HTTP tool will not.
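The second point can be sketched without a network connection. The snippet below uses HTMLParser from Python's standard library to list the resources a piece of HTML refers to, which is roughly the list a browser would go on to request. The class ResourceCollector and the sample HTML are hypothetical, and a real browser handles many more tags and attributes than the three checked here.

```python
from html.parser import HTMLParser

class ResourceCollector(HTMLParser):
    """Collect URLs of resources referenced by img, script and link tags."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and "src" in attrs:
            self.resources.append(attrs["src"])
        elif tag == "link" and "href" in attrs:
            self.resources.append(attrs["href"])

html = (
    '<html><head><link rel="stylesheet" href="/main.css">'
    '<script src="/app.js"></script></head>'
    '<body><img src="/logo.png"></body></html>'
)

collector = ResourceCollector()
collector.feed(html)
print(collector.resources)  # ['/main.css', '/app.js', '/logo.png']
```

A pure HTTP tool stops after receiving the HTML; a browser would issue one more GET request for each entry in this list.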
POST Requests
We've seen how to retrieve or ask for information from a server with GET , but what if
you need to send or submit data to the server? That's where another essential
HTTP request method comes in: POST . POST is used when you want to initiate some
action on the server, or send data to a server. Let's see an example with our HTTP
tool:
Here is the curl command:
$ curl -X POST "https://echo.epa.gov" -m 30 -v
Let's see another example of making a POST request by filling out a web form. Our
sample form looks like this in the browser:
After filling out the form, you'll be redirected to a page that looks like this:
Now let's switch over to our HTTP tool and simulate what we just did in the browser.
Instead of filling out a form in the browser, we will send a POST request to
http://al-blackjack.herokuapp.com/new_player. This is the URL that the first form
(the one where we input a name) submits to:
Note: You'll want to ensure that your Content-Type header is set
to application/x-www-form-urlencoded . If it isn't, then your POST request
won't be interpreted by the application correctly.
If you're using Paw 3, select the Form URL-Encoded tab instead of the
Text tab. If you're using Insomnia, make sure you click "Form URL
Encoded" in the Body dropdown menu. And if you're using Postman, make
sure the radio button for x-www-form-urlencoded is selected under the Body
tab.
$ curl -X POST "http://al-blackjack.herokuapp.com/new_player" -d "player_name=Albert" -m 30 -v
Notice that in the screenshot and curl command we're supplying the additional
parameter of player_name=Albert. It has the same effect as inputting the name into
the first "What's your name?" form and submitting it.
We can verify the contents using the inspector (right click and select Inspect ). You'll
see that the player_name parameter we're sending as part of the POST request is
embedded in the form via the name attribute of the input element:
But the mystery is, how is the data we're sending being submitted to the server
since it's not being sent through the URL? The answer to that is the HTTP body.
The body contains the data that is being transmitted in an HTTP message and is
optional. In other words, an HTTP message can be sent with an empty body. When
used, the body can contain HTML, images, audio and so on. You can think of the
body as the letter enclosed in an envelope, to be posted.
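To make the idea of a body concrete, the sketch below prepares the same POST request with Python's standard urllib, without sending it. The request object is only built and inspected, so the example does not depend on the sample application being reachable.

```python
from urllib.parse import urlencode
from urllib.request import Request

# The form data travels in the body, not in the URL.
body = urlencode({"player_name": "Albert"}).encode("ascii")

request = Request(
    "http://al-blackjack.herokuapp.com/new_player",
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

print(request.get_method())  # POST
print(request.full_url)      # http://al-blackjack.herokuapp.com/new_player
print(request.data)          # b'player_name=Albert'
```

Note that the URL contains no query string; the name/value pair is carried entirely by the body.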
The POST request generated by the HTTP tool or curl is the same as you filling out
the form in the browser, submitting that form, and then being redirected to the next
page. Look carefully at the raw response in the HTTP tool screenshot. The key
piece of information that redirects us to the next page is specified in the
field Location: http://al-blackjack.herokuapp.com/bet . The Location header is an
HTTP response header (yes, requests have headers too, but in this case, it's a
response header). Don't worry too much about this yet as we'll discuss headers in a
later section. Your browser sees the Location header and automatically issues a
brand new request to the specified URL, thereby initiating a new, unrelated request.
The "Make a bet" form you see is the response from that second request.
Note: If you're using some other HTTP tool, like Insomnia or Postman, you
may have to uncheck "automatically follow redirects" in order to see
the Location response header.
If you're fuzzy on the previous paragraph, read it again. It's critical to understand
that when using a browser, the browser hides a lot of the underlying HTTP
request/response cycle from you. Your browser issued the initial POST request, got a
response with a Location header, then issued another request without any action
from you, then displayed the response from that second request. Once again, if you
were using a pure HTTP tool, you'd see the Location response header from the
first POST request, but the tool would not automatically issue a second request for
you. (Some HTTP tools have this ability, if you check the "automatically follow
redirects" option.)
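The difference between the two behaviors can be modeled with canned data. Everything in the sketch below is hypothetical: FAKE_SERVER stands in for a real server, the 302 status is an assumption about what the application returns, and the paths are shortened to keep the example small. It shows a plain send, as a pure HTTP tool would do, next to a send that follows the Location header the way a browser does.

```python
# Hypothetical canned responses standing in for a real server.
FAKE_SERVER = {
    ("POST", "/new_player"): {"status": 302, "headers": {"Location": "/bet"}, "body": ""},
    ("GET", "/bet"): {"status": 200, "headers": {}, "body": "<h1>Make a bet</h1>"},
}

def send(method, path):
    """Issue exactly one request and return its response."""
    return FAKE_SERVER[(method, path)]

def send_following_redirects(method, path, max_redirects=5):
    """Issue a request, then keep requesting whatever Location points to."""
    response = send(method, path)
    hops = 0
    while response["status"] == 302 and hops < max_redirects:
        # A brand new GET request to the URL named in the Location header.
        response = send("GET", response["headers"]["Location"])
        hops += 1
    return response

print(send("POST", "/new_player")["status"])                   # 302
print(send_following_redirects("POST", "/new_player")["body"])  # <h1>Make a bet</h1>
```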
HTTP Headers
HTTP headers allow the client and the server to send additional information during
the HTTP request/response cycle. Headers are colon-separated name-value pairs
that are sent in plain text. By using the Inspector, we can see these Headers.
Below, you can see both the request as well as the response headers:
The above shows the various headers being transmitted during a request/response
cycle. Further, we can see that the request and response contain a different set of
headers under Request Headers :
Request Headers
Request headers give more information about the client and the resource to be
fetched. Some useful request headers are:
Don't bother memorizing any of the request headers, but just know that they're part of
the request being sent to the server. We'll talk about response headers in the next
chapter.
Summary
This was a brief introduction on making HTTP requests. After going through this
section, you should be comfortable with:
HTTP method
path (the resource name and any query parameters)
headers
message body (for POST requests)
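These four components map directly onto the text of an HTTP/1.1 request. The sketch below assembles that text by hand; build_request is a hypothetical helper, and the header values are chosen for illustration.

```python
def build_request(method, path, headers, body=""):
    """Assemble the raw text of an HTTP/1.1 request from its components."""
    lines = [f"{method} {path} HTTP/1.1"]                             # method and path
    lines += [f"{name}: {value}" for name, value in headers.items()]  # headers
    # A blank line separates the headers from the optional message body.
    return "\r\n".join(lines) + "\r\n\r\n" + body

body = "player_name=Albert"
raw = build_request(
    "POST",
    "/new_player",
    {
        "Host": "al-blackjack.herokuapp.com",
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(body)),
    },
    body,
)
print(raw)
```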
In the next chapter, we'll continue learning about HTTP by looking at HTTP
responses.
Processing Responses
Introduction
So far we've been sending various requests and looking at the raw HTTP data sent
back by the server. This raw data returned by the server is called a response. We'll
spend this section analyzing the various components of an HTTP response.
Status Code
The first component we'll look at is the HTTP Status Code. The status code is a
three-digit number that the server sends back after receiving a request signifying
the status of the request. The status text displayed next to the status code provides
the description of the code. It is listed under the Status column of the Inspector:
The most common response status code you'll encounter is 200 which means the
request was handled successfully. Other useful status codes are:
Status Code    Status Text              Meaning
500            Internal Server Error    The server has encountered a generic error.
As a web developer, you should know the above response status codes and their
associated meaning very well.
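If you want to look up the status text for a code, Python's standard library ships a table of them. This small sketch prints the text for the status codes discussed in this chapter.

```python
from http import HTTPStatus

for code in (200, 302, 404, 500):
    status = HTTPStatus(code)
    print(status.value, status.phrase)

# 200 OK
# 302 Found
# 404 Not Found
# 500 Internal Server Error
```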
302 Found
What happens when a resource is moved? The most common strategy is to re-route
the request from the original URL to a new URL. The general term for this kind of re-
routing is called a redirect . When your browser sees a response status code
of 302, it knows that the resource has been moved, and will automatically follow the
new re-routed URL in the Location response header. In this section, we'll focus on
redirects in the context of a Browser and an HTTP tool.
Say you want to access the account profile at GitHub, you'll have to go to the
address https://github.com/settings/profile . However, in order to have access to
the profile page, you must first be signed in. If you're not already signed in, the
browser will send you to a page to do that. After you enter your credentials, you'll be
redirected to the original page you were trying to access. This is a pretty common
workflow most web applications employ. Let's see how the browser and the HTTP
tool handle that workflow.
Compare that with the HTTP tool (note the status code), which
does not automatically follow the redirect:
Take note of the Location response header (it's a little hard to see, on line 10). You
should see
Location: https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fsettings%2Fprofile
which contains a return_to parameter with a
value of the URL where the client should be redirected to after signing in. Compare
that with the screenshot from the browser above: you'll notice that the address
https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fsettings%2Fprofile
is the same as what's in the browser address bar.
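The return_to value is URL encoded, which makes it hard to read at a glance. As a quick sketch, parse_qs from Python's standard urllib.parse module decodes it back into the original address.

```python
from urllib.parse import urlsplit, parse_qs

location = (
    "https://github.com/login"
    "?return_to=https%3A%2F%2Fgithub.com%2Fsettings%2Fprofile"
)

# parse_qs splits the query string and percent-decodes each value.
params = parse_qs(urlsplit(location).query)
print(params["return_to"][0])  # https://github.com/settings/profile
```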
Next, let's look at a 404 response status code in the browser. The server returns this
status code when the requested resource cannot be found. Remember, a resource
can be anything including audio files, CSS stylesheets, JavaScript files, images etc.
Let's send a request to retrieve our awesome image at https://www.dropbox.com by
sending a GET request to https://www.dropbox.com/awesome_file.jpg :
We see the nice default 404 page from Dropbox. Now take a look at the response of
the same request using a pure HTTP tool:
Because the resource we want does not exist, the browser shows us nicely formatted
text while the HTTP tool shows us the raw response with the status code.
A 500 status code says "there's something wrong on the server side". This is a
generic error status code and the core problem can range from a mis-configured
server setting to a misplaced comma in the application code. But whatever the
problem, it's a server side issue. Someone with access to the server will have to
debug and fix the problem, which is why sometimes you see a vague error message
asking you to contact your System Administrator. In the wild, a 500 error can be
shown in a variety of ways, just like a 404 page can. Here's an example of a generic
error page from the default Rails application:
Using an HTTP tool, we can see the status code and raw data:
Response Headers
Like request headers, we can also use the inspector to view response headers:
Response headers offer more information about the resource being sent back.
Some common response headers are:
There are a lot more response headers, but just like request headers, it's not
necessary to memorize them. They have subtle effects on the data being returned,
and in some cases, they have subtle workflow consequences (e.g., your browser
automatically following a Location response header). Just understand that response
headers contain additional meta-information about the response data being
returned.
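As a rough illustration of where response headers sit in the raw text, the following Python sketch splits a response into its status code, headers, and body. The response text, the example.com URL, and the parse_response helper are all assumptions invented for this sketch; real HTTP libraries handle many more edge cases.

```python
# A hand-written response, assumed for illustration only.
RAW_RESPONSE = (
    "HTTP/1.1 302 Found\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Location: https://example.com/login\r\n"
    "\r\n"
    "<html>redirecting</html>"
)

def parse_response(raw):
    """Split a raw response into status code, headers, and body."""
    # A blank line separates the status line and headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body

status_code, headers, body = parse_response(RAW_RESPONSE)
print(status_code)
print(headers["location"])
```

A browser that receives a response like this one would read the Location header and automatically issue a new request to that address.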
Summary
In this chapter, we've discussed the components of HTTP responses. We've also
looked at how to use the inspector to view the headers associated with an HTTP
response. Although we have only scratched the surface of the HTTP protocol, we
hope this will empower you to dig further should the need arise.
In sum, we've seen that HTTP is nothing more than an agreement in the form of
formatted text that dictates how a client and server communicate.
The most important components of an HTTP response are:
status code
headers
message body, which contains the raw response data
See if you can figure out where each of the above components is in the
screenshot below.
Each request made to a resource is treated as a brand new entity, and different
requests are not aware of each other. This statelessness is what makes HTTP and
the internet so distributed and difficult to control, but it's also the same ephemeral
attribute that makes it difficult for web developers to build stateful web applications.
As we look around the internet and use familiar applications, we feel that the
application somehow has a certain state. For example, when we log in to Facebook
or Twitter, we see our username at the top, signifying our authenticated status. If we
click around (which generates new requests to Facebook's servers) we are not
suddenly logged out; the server response contains HTML that still shows our
username, and the application seems to maintain its state.
In this chapter, we'll focus on how this happens by discussing some of the
techniques being employed by web developers to simulate a stateful experience.
Along the way, we'll also discuss some techniques used on the client to make
displaying dynamic content easy. The approaches we'll discuss are:
Sessions
Cookies
Asynchronous JavaScript calls, or AJAX
There's also a fourth approach that we won't discuss: sending stateful data as query
parameters when making a request. This approach used to be nearly universal, but
it has mostly disappeared from modern websites.
A Stateful App
Let's begin by looking at an illustration of a stateful app. When you make a request
to https://www.reddit.com , the home page shows up:
Sessions
It's obvious the stateless HTTP protocol is somehow being augmented to maintain a
sense of statefulness. With some help from the client (i.e., the browser), HTTP can
be made to act as if it were maintaining a stateful connection with the server, even
though it's not. One way to accomplish this is by having the server send some form
of a unique token to the client. Whenever a client makes a request to that server,
the client appends this token as part of the request, allowing the server to identify
clients. In web development, we call this unique token that gets passed back and
forth the session identifier.
This mechanism of passing a session id back and forth between the client and
server creates a sense of persistent connection between requests. Web developers
leverage this faux statefulness to build sophisticated applications. Each request,
however, is technically stateless and unaware of the previous or the next one.
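The token exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory SESSIONS dictionary and the start_session and current_user helpers are hypothetical names, and a real application would rely on its web framework's session support.

```python
import secrets

# In-memory session store: session id -> session data (assumed for this sketch).
SESSIONS = {}

def start_session(username):
    """Create a session on the server and return the id to send to the client."""
    session_id = secrets.token_hex(16)  # 32 random hex characters, hard to guess
    SESSIONS[session_id] = {"username": username}
    return session_id

def current_user(session_id):
    """Identify the client from the session id it sent back with a request."""
    session = SESSIONS.get(session_id)
    return session["username"] if session else None

sid = start_session("alice")     # the server hands this token to the client
print(current_user(sid))         # a later request carries the token back
print(current_user("bogus"))     # an unknown token identifies nobody
```

Each call to current_user stands in for a separate, stateless request; the only thing linking them is the token the client sends along.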
This sort of faux statefulness has several consequences. First, every request must
be inspected to see if it contains a session identifier. Second, if this request does, in
fact, contain a session id, the server must check to ensure that this session id is still
valid. The server needs to maintain some rules with regard to how to handle
session expiration and also decide how to store its session data. Third, the server
needs to retrieve the session data based on the session id. And finally, the server
needs to recreate the application state (e.g., the HTML for a web request) from the
session data and send it back to the client as the response.
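The four steps above can be sketched as a single request handler. The 30-minute lifetime, the cookie name session_id, the shape of the session record, and the handle_request function are all assumptions chosen for this sketch rather than rules that HTTP prescribes.

```python
import html
import time

SESSION_LIFETIME_SECONDS = 30 * 60  # assumed expiration rule for this sketch

def handle_request(cookies, sessions, now=None):
    """Return the HTML for one request, given its cookies and the session store."""
    now = time.time() if now is None else now

    # 1. Inspect the request for a session identifier.
    session_id = cookies.get("session_id")
    if session_id is None:
        return "<p>Please log in</p>"

    # 2. Check that the session id is known and has not expired.
    session = sessions.get(session_id)
    if session is None or now - session["created_at"] > SESSION_LIFETIME_SECONDS:
        sessions.pop(session_id, None)  # forget stale sessions
        return "<p>Session expired, please log in</p>"

    # 3. The session data has now been retrieved using the id as a key.
    # 4. Recreate the application state (here, a tiny piece of HTML).
    return f"<p>Welcome back, {html.escape(session['username'])}</p>"

store = {"abc123": {"username": "alice", "created_at": 1000.0}}
print(handle_request({}, store, now=1000.0))
print(handle_request({"session_id": "abc123"}, store, now=1060.0))
```

Note that the handler runs all four steps on every single request, which is the extra work the server takes on to simulate state.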
This means that the server has to work very hard to simulate a stateful experience,
and every request still gets its own response, even if most of that response is
identical to the previous response. For example, if you're logged into Facebook, the
server has to generate the initial page you see, and the response is a pretty
complex and expensive HTML page that your browser displays. The Facebook server has
to add up all the likes and comments for every photo and status, and present it in a
timeline for you. It's a very expensive page to generate. Now if you click on the
"like" link for a photo, Facebook has to regenerate that entire page. It has to
increment the like count for that photo you liked, and then send that HTML back as
a response, even though the vast majority of the page stayed the same.
There are many advanced techniques that servers employ to optimize sessions and,
as you can imagine, there are also many security concerns. Most of the advanced
session optimization and security concerns are out of scope of this book, but we'll
talk about one common way to store session information: in a browser cookie.
Cookies
A cookie is a piece of data that's sent from the server and stored in the client during
a request/response cycle. Cookies, or HTTP cookies, are small files stored in the
browser that contain the session information. By default, most browsers have
cookies enabled. When you access any website for the first time, the server sends
session information and sets it in your browser cookie on your local computer. Note
that the actual session data is stored on the server. The client side cookie is
compared with the server-side session data on each request to identify the current
session. This way, when you visit the same website again, your session will be
recognized because of the stored cookie with its associated information.
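To show roughly what travels over the wire, here is a small Python sketch that uses the standard library's http.cookies module to build the value of a Set-Cookie response header and to read a session id back out of a Cookie request header. The cookie name session_id, the sample values, and the two helper functions are assumptions for illustration; real sites choose their own cookie names and attributes.

```python
from http.cookies import SimpleCookie

def build_set_cookie(session_id):
    """Build the value of a Set-Cookie header the server could send."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"  # send the cookie for every path on the site
    return cookie["session_id"].OutputString()

def read_session_id(cookie_header):
    """Extract the session id from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel is not None else None

# First response: the server sets the cookie.
print("Set-Cookie:", build_set_cookie("abc123"))
# Later request: the browser sends the stored cookie back.
print(read_session_id("session_id=abc123; theme=dark"))
```

The browser stores whatever the Set-Cookie header contains and attaches it to subsequent requests to the same site, which is how the server recognizes a returning client.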
Let's see a real example of how cookies are initiated with the help of the browser
inspector. We'll make a request to the address https://www.yahoo.com . Note that if
you're following along, you may have to use a different website if you already have
an existing cookie from Yahoo.
With the inspector's Network tab open, navigate to that address and inspect the
request headers:
Let's now revisit the original example of how Reddit, or any web application, keeps
track of the fact that we maintain our authenticated status even though we issue
request after request. Remember, each request is unrelated to, and unaware of,
every other request - how does the app "remember" we're authenticated? If you're
following along, perform the following steps with the inspector open:
Where is the session data stored?
The simple answer is: on the server somewhere. Sometimes, it's just
stored in memory, while other times, it could be stored in some persistent
storage, like a database or key/value store. Where the session data is
actually stored is not too important right now. The most important thing is
to understand that the session id is stored on the client, and it is used as
a "key" to the session data stored server side. That's how web
applications work around the statelessness of HTTP.
It is important to be aware of the fact that the id sent with a session is unique and
expires in a relatively short time. In this context, it means you'll be required to log in
again after the session expires. If we log out, the session id information is gone:
This also implies that if we manually remove the session id (in the inspector, you
can right-click on cookies and delete them), then we have essentially logged out.
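A small Python sketch can show why losing the session id amounts to logging out. The SESSIONS store, the cookie name session_id, and both helper functions are hypothetical and exist only for this illustration.

```python
# Server-side store, assumed for this sketch: session id -> session data.
SESSIONS = {"abc123": {"username": "alice"}}

def is_logged_in(cookies):
    """A request counts as authenticated only if it carries a known session id."""
    return cookies.get("session_id") in SESSIONS

def log_out(cookies):
    """Remove the cookie on the client and the matching session on the server."""
    SESSIONS.pop(cookies.pop("session_id", None), None)

browser_cookies = {"session_id": "abc123"}
print(is_logged_in(browser_cookies))   # the cookie matches a stored session

# Deleting the cookie by hand, as you might in the inspector:
del browser_cookies["session_id"]
print(is_logged_in(browser_cookies))   # the server can no longer identify us
```

In the manual-deletion case the session data still exists on the server, but without the id the client has no way to point to it.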
To recap, we've seen that the session data is generated and stored on the server
side, and the session id is sent to the client in the form of a cookie. We've also
looked at how web applications take advantage of this to mimic a stateful
experience on the web.
AJAX
Last, we'll briefly mention AJAX and what it means in the HTTP request/response
cycle. AJAX is short for Asynchronous JavaScript and XML. Its main feature is that
it allows browsers to issue requests and process responses without a full page
refresh. For example, if you're logged into Facebook, the server has to generate the
initial page you see, and the response is a pretty complex and expensive HTML
page that your browser displays. The Facebook server has to add up all the likes
and comments for every photo and status, and present it in a timeline for you. As we
described earlier, it's a very expensive page to re-generate for every request
(remember, every action you take -- clicking a link, submitting a form -- issues a
new request).
When AJAX is used, all requests sent from the client are performed asynchronously,
which just means that the page doesn't refresh. Let's see an example by performing
a search on Google:
As soon as you start your search, you'll see the Network tab get
flooded with requests.
No doubt you've performed searches many times and noticed that the page doesn't
refresh. The Network tab, however, gives us some new insight into what's happening:
every letter you type is issuing a new request, which means that an AJAX request is
triggered with every key-press. The responses from these requests are being
processed by some callback. You can think of a callback as a piece of logic you
pass on to some function to be executed after a certain event has happened. In this
case, the callback is triggered when the response is returned. You can probably
guess that the callback that's processing these asynchronous requests and
responses is updating the HTML with new search results.
We won't get into what the callback looks like or how to issue an AJAX request, but
the main thing to remember is that AJAX requests are just like normal requests:
they are sent to the server with all the normal components of an HTTP request, and
the server handles them like any other request. The only difference is that instead of
the browser refreshing and processing the response, the response is processed by
a callback function, which is usually some client-side JavaScript code.
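Real AJAX code runs as JavaScript in the browser, but the callback idea itself can be sketched in Python. In this illustration, which is an analogy rather than actual AJAX, a worker thread stands in for the network, fake_search stands in for the server, and update_results plays the role of the callback; all of these names and the tiny catalog are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_search(query):
    """Stand-in for the server: return suggestions that start with the query."""
    catalog = ["http", "https", "html", "dns"]
    return [word for word in catalog if word.startswith(query)]

def run_search_demo(keystrokes):
    """Issue one background request per keystroke and collect what gets displayed."""
    results_shown = []

    def update_results(future):
        # The callback: runs once the response for a request has come back.
        results_shown.append(future.result())

    # A single worker handles the requests one at a time, in order.
    with ThreadPoolExecutor(max_workers=1) as pool:
        for typed in keystrokes:
            pool.submit(fake_search, typed).add_done_callback(update_results)
    return results_shown

for suggestions in run_search_demo(["h", "ht", "htt"]):
    print(suggestions)
```

The loop that issues the requests never waits for a response; the callback updates the displayed results whenever a response arrives, much as a page updates without a full refresh.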
Summary
This chapter covered techniques used by web developers to mimic statefulness,
despite having to work with HTTP, a stateless protocol. You learned about cookies
and sessions, and how modern web applications remember state for each client.
You also used the inspector to see cookies and the session id in action. Finally, you
learned about the role AJAX plays in displaying dynamic content in web
applications.
Security
As we've repeatedly stated throughout this book, the same attributes that make
HTTP so difficult to control also make it so difficult to secure. Now that you know
about how web applications dance their way around the statelessness of HTTP, you
can probably guess that there are some security ramifications lurking around the
corner. For example, what if someone steals my browser's session id? Does that
mean they can log in as me? Or what if I'm accessing a random website? Can it
peek into my Reddit or Facebook cookie, where my session id information for those
sites is stored? We'll spend this chapter discussing some common security issues
that crop up with HTTP. This is by no means an exhaustive list of security issues;
just common ones that any web developer would be expected to know.
HTTPS sends messages through a cryptographic protocol called TLS for encryption.
Earlier versions of HTTPS used SSL or Secure Sockets Layer until TLS was
developed. These cryptographic protocols use certificates to communicate with
remote servers and exchange security keys before data encryption happens. You
can inspect these certificates by clicking on the padlock icon that appears before
the https:// :
Although most modern browsers do some high-level check on a website's certificate
on your behalf, viewing the certificate sometimes serves as an extra security step
before interacting with the website.
Same-origin policy
The same-origin policy permits unrestricted interaction between resources
originating from the same origin, but restricts certain interactions between resources
originating from different origins. By origin, we mean the combination of the scheme,
host, and port. So http://mysite.com/doc1 has the same origin
as http://mysite.com/doc2 , but a different origin
from https://mysite.com/doc1 (different
scheme), http://mysite.com:4000/doc1 (different port),
and http://anothersite.com/doc1 (different host).
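The comparison above can be sketched in Python with urllib.parse. The origin and same_origin helpers and the default-port table are assumptions written for this illustration; browsers implement the real check internally and handle additional cases.

```python
from urllib.parse import urlsplit

# Ports implied when a URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(url_a, url_b):
    """Two URLs share an origin only if scheme, host, and port all match."""
    return origin(url_a) == origin(url_b)

base = "http://mysite.com/doc1"
for other in [
    "http://mysite.com/doc2",
    "https://mysite.com/doc1",
    "http://mysite.com:4000/doc1",
    "http://anothersite.com/doc1",
]:
    print(other, same_origin(base, other))
```

Note that the path plays no part in the comparison, which is why doc1 and doc2 on the same site share an origin.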
While secure, the same-origin policy is an issue for web developers who have a
legitimate need for making these restricted kinds of cross-origin requests. Cross-
origin resource sharing, or CORS, was developed to deal with this issue. CORS is a
mechanism that allows interactions that would normally be restricted cross-origin to
take place. It works by adding new HTTP headers that let servers serve
resources cross-origin to certain specified origins. Don't worry about what this
means right now. We'll cover it in much greater detail later in the Core Curriculum.
The same-origin policy is an important guard against session hijacking attacks (see
the next section) and serves as a cornerstone of web application security. Let's look
at some HTTP security threats and their counter-measures.
Session Hijacking
We've already seen that the session plays an important role in keeping HTTP
stateful. We also know that a session id serves as that unique token used to identify
each session. Usually, the session id is implemented as a random string and comes
in the form of a cookie stored on the computer. With the session id in place on the
client side, every time a request is sent to the server, this data is added and
used to identify the session. In fact, this is what many web applications with
authentication systems do. When a user's username and password match, the
session id is stored on their browser so that on the next request they won't have to
re-authenticate.
Unfortunately, if an attacker gets a hold of the session id, both the attacker and the
user now share the same session and both can access the web application. In
session hijacking, the user won't even know that an attacker is accessing their
session, and the attacker never needs to know the username or password.
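The reason a stolen session id is enough can be seen in a tiny Python sketch. The SESSIONS store, the sample id, and the whoami function are hypothetical; the point is only that a server which identifies clients purely by session id has nothing else to tell two senders apart.

```python
# Server-side store, assumed for this sketch: session id -> session data.
SESSIONS = {"9f2c": {"username": "alice"}}

def whoami(request_cookies):
    """Identify the requester using nothing but the session id in the request."""
    session = SESSIONS.get(request_cookies.get("session_id"))
    return session["username"] if session else "anonymous"

victim_request = {"session_id": "9f2c"}     # sent by the real user's browser
attacker_request = {"session_id": "9f2c"}   # same id, copied by an attacker

print(whoami(victim_request))
print(whoami(attacker_request))
```

Both requests resolve to the same user, and no username or password was involved in the second one.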
For example, the form below allows you to add comments, which will then be
displayed on the site.
Because it's just a normal HTML <textarea> , users are free to input anything into
the form. This means users can add raw HTML and JavaScript into the text area
and submit it to the server as well:
If the server-side code doesn't do any sanitization of input, the user input will be
injected into the page contents, and the browser will interpret the HTML and
JavaScript and execute it. In this case an alert message will pop up, which is
definitely not the desired outcome. Attackers can craft ingeniously malicious HTML
and JavaScript and be very destructive to both the server as well as future visitors
of this page. For example, an attacker can use JavaScript to grab the session id of
every future visitor of this site and then come back and assume their identity. It
could happen silently without the victims ever knowing about it. Note that the
malicious code would bypass the same-origin policy because the code lives on the
site.
Escaping
We mention the term "escaping" above. To escape a character means to
replace an HTML character with a combination of ASCII characters, which
tells the client to display that character as is, and to not process it; this
helps prevent malicious code from running on a page. These combinations
of ASCII characters are called HTML entities.
Consider the following HTML: <p>Hello World!</p> . Let's say we don't want
the browser to read this as HTML. To accomplish this, we can escape
special characters that the browser uses to detect when HTML starts and
ends, namely < and > , with the HTML entities &lt; and &gt; . If we write the
following: &lt;p&gt;Hello World!&lt;/p&gt; , then that HTML will be displayed
as plain text instead.
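As a minimal sketch of escaping on the server, Python's standard library provides html.escape, which replaces the special characters with HTML entities. The render_comment_unsafe and render_comment functions and the list-item markup are assumptions made up for this example, not the code of any particular site.

```python
import html

def render_comment_unsafe(comment):
    """Insert user input into the page as-is; the browser will interpret it."""
    return f"<li>{comment}</li>"

def render_comment(comment):
    """Escape user input first, so the browser displays it as plain text."""
    return f"<li>{html.escape(comment)}</li>"

user_input = "<p>Hello World!</p>"
print(render_comment_unsafe(user_input))
print(render_comment(user_input))
```

With the escaped version, a submitted script tag would show up on the page as visible text instead of being executed.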
Summary
In this chapter, we covered various aspects of security in web applications.
Needless to say, this is a huge topic, and we've only scratched the surface of a few
common issues. The point of this chapter is to reveal how fragile and problematic
developing and securing a web application is, and it's mostly due to working with
HTTP.