CORS for Developers

W3C Working Group Note,

This version:
http://www.w3.org/TR/2016/NOTE-cors-for-developers-1-20161015/
Editor's Draft:
https://w3c.github.io/webappsec-cors-for-developers/
Version History:
https://github.com/w3c/webappsec-cors-for-developers/commits/master/index.src.html
Issue Tracking:
Inline In Spec
Editor:
(Facebook)
Participate:
File an issue (open issues)

Abstract

This Note provides historical and instructional context for developers working with Cross-Origin Resource Sharing (CORS) and the CORS-mode of Fetch.

this whole document needs references and terms linkified

1. Introduction

One of the defining characteristics of a Web application is the way in which resources can link and communicate with each other. Over time, as browsers have evolved and become more powerful, the ways in which Web applications can interact have increased. To maintain security in this feature-rich environment, browsers use an origin-based security model and restrict how data may be shared across origins. Cross-Origin Resource Sharing (CORS) allows Web applications to share data across origins and is therefore one of the most powerful features of the Web platform, but it evolved in an environment full of existing applications that assumed this capability was impossible. Browsers had to make the impossible possible, securely, and competing proposals had to be negotiated into a single system. Like many things that are the result of evolution or compromise, there are things about CORS that may seem confusing or inconsistent. Reading the W3C CORS or WHATWG Fetch specification is probably not much help with this confusion, as those documents are oriented towards people building browsers. This document aims to explain the history and usage of Cross-Origin Resource Sharing for people developing Web applications using CORS in the browser, and exposing CORS-aware resources on their servers.

2. Cross-Origin requests at the beginning of the Web

In the early days of browsers, a resource could interact with other resources in a very small number of ways. You could use an anchor to make a GET request, which would also navigate the browser, so the previous page couldn’t see the results. You could use a form to make a GET or POST request, which would also navigate the browser. Or you could include resources into your page, like images - but you couldn’t access the content of these images or other resources included in this way. In this world of static documents, a complex security model was unnecessary.

3. Cookies, “Ambient Authority” and Cross Site Request Forgery (CSRF)

HTTP is a stateless protocol, but state can be useful. For instance, you would probably not like to have to enter your password on every single request to an authenticated application. Cookies filled this role and emerged very early in the history of the Web. A cookie allows a site to save a small amount of client side state. Cookies were the first browser feature to need a client-side access control policy. That policy stated that the browser shall only send cookies back to the site that set them, as determined by the scheme and host of the resource. So, e.g., https://bank-example.com/a.html could set a cookie that https://bank-example.com/b.html could see as part of a future request, but https://evil-example.com/bad.html could not see.

Browsers save cookies as the user navigates through many application origins and send them back any time a new request to that origin is made. This led to the first cross-origin security problems. For example, although https://evil-example.com can’t read https://bank-example.com’s cookies and access your account directly, it could create a form that sent a POST to https://bank-example.com/transferFunds with instructions to transfer money to a different account. As the request would be sent with your cookie, the bank might execute the transfer as if it were on your instructions. This kind of attack is called Cross Site Request Forgery (CSRF), a class of the “Confused Deputy” problem. CSRF exists because a cookie is present in the browser and attached to every request by default, without requiring additional user action, regardless of where the request originated. This style of access control interaction is often referred to as ambient authority. Managing ambient authority and defending against CSRF are key motivators for much of the subsequent evolution of the Web security model, including CORS.

If you are developing for the Web and are used to writing apps for a platform like Android or iOS, you may wonder why your app can’t make any request and read the results, as is possible with the platform's native APIs. The difference is in the layer at which security protections for the user are enforced. An application on a mobile operating system may typically make whatever network requests it wants and do anything with data in its own scope. The user is protected from malicious applications at the operating system layer, which isolates applications and defines boundaries between them, and between applications and global state and privileges.

A general purpose browser treats Web content from each origin as its own application and enforces isolation between them, on the user’s behalf, as a mobile operating system might between apps. This is necessary because of the ambient authority carried by things like cookies. While it may seem restrictive at first compared to how networking APIs work in a mobile operating system, it also allows rich and seamless interaction between different application domains if we follow the security rules.

4. The Cross-Origin HTTP access control table

Although it wasn’t thought of this way at the time, we can represent what the early browser allowed with cross-origin requests in terms of an access control table. Later on, we can see how CORS allows us to modify the permissions in this table. Rather than the read/write permissions you might use for a local file, for HTTP requests we will think in terms of permissions to SEND requests and READ the results. SEND permissions are the types and attributes of requests a resource is allowed to make from the browser. READ permissions are the types and attributes of responses a resource is allowed by the browser to access. The following are the things a conforming user agent allowed a resource to do, explicitly or implicitly, with cross-origin resources:

4.1. CROSS-ORIGIN SEND PERMISSIONS (pre-JavaScript browsers)

GET, with cookies ALLOWED
POST, with cookies ALLOWED
PUT, DELETE, etc., with cookies DISALLOWED
Set header value: Accept USER-CONTROLLED*
Set header value: Accept-Language USER-CONTROLLED*
Set header value: Content-Language USER-CONTROLLED*

4.2. CROSS-ORIGIN READ PERMISSIONS (pre-JavaScript browsers)

Cookies DISALLOWED
Resource body DISALLOWED
Read header values DISALLOWED

5. <script>, resources-as-applications, XHR and AJAX

With the introduction of JavaScript, resources on the Web became applications. (Plugins are their own topic, of thankfully ever-decreasing relevance.) This raised interesting challenges about the security perimeter for the web-resource-as-application: how to allow rich interactivity that is useful and powerful without breaking the security of existing services and resources? Netscape decided to go with a security model similar to that of cookies. The Same Origin Policy, as it was called, came to be the foundational strategy for client-side web application security. The security boundaries of a resource are defined by its Origin, the URI scheme/host/port tuple from which it was loaded. Under the Same Origin Policy, a resource has authority to read from and send to other resources with few restrictions when they are same-origin, but it cannot read from or send to cross-origin resources, except as previously allowed in our permission tables above. XMLHttpRequest and the async request pattern came after JavaScript and adhered to the same basic rules.

Leaks at the edges of the Same Origin Policy model

The Same Origin Policy is conceptually simple, but it has a few complications and edge cases. Firstly, resources-as-applications can query their internal state in a way that leaks some information about previously opaque requests and responses, like whether an <img> load failed or was successful, and what the dimensions of the image are, even if access to the actual pixel content is prevented cross-origin.

More importantly, because the <script> tag was not limited to only loading same-origin resources, cross-origin documents can implicitly read the response content of the script through its (often deliberate) side-effects. This led to some interesting security vulnerabilities and also some interesting ways to do cross-origin mashup applications.

On the vulnerability side, <script> introduced a new possibility for CSRF-type attacks that would actually disclose the contents of the response. For example, if https://bank-example.com/getBalanceAsync provides a resource that looks up the user’s balance using the cookies attached to the request, then returns a script that does a document.write() of that information, any other site on the internet could include that resource with a <script> tag on their own site and read your account balance. This is why you often see JavaScript resources that begin with a guard statement like for(;;) {}. This infinite loop blocks the thread and prevents access to the information that follows, except by a Same-Origin caller that can access the response contents explicitly with an async request and read past the guard.
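A same-origin caller can read past such a guard because it receives the response as text instead of executing it. The following is a minimal sketch, assuming the guard is exactly the for(;;) {} prefix mentioned above and that what follows it is JSON; stripGuard and the sample body are hypothetical, not part of any library or real service.

```javascript
// Hypothetical guard prefix; real services choose their own.
const GUARD = 'for(;;) {}';

// Remove the infinite-loop guard from a response body that was fetched as
// text, then parse what follows. Assumes the remainder is JSON.
function stripGuard(responseText) {
  if (!responseText.startsWith(GUARD)) {
    throw new Error('Response does not start with the expected guard');
  }
  return JSON.parse(responseText.slice(GUARD.length));
}

// Example: a body that a same-origin XMLHttpRequest might receive.
const guardedBody = 'for(;;) {}{"balance": 1234}';
console.log(stripGuard(guardedBody).balance);
```

A cross-origin page that includes the same resource with a <script> tag never gets this far, because the loop runs before anything after it can execute.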

The script-read loophole also became widely used to deliberately exchange information across origins with a pattern called JSONP (JSON with Padding). In this use, a JSON response is padded with a function specified by the caller to allow cross-origin data transfer. So if https://brokerage-example.com included:

<script src="https://bank-example.com/jsonp.js?callback=foo"></script>

https://bank-example.com could respond with a payload that looks like:

foo({accountNumber: 12345});

Which would invoke the foo() function in the context of the https://brokerage-example.com resource and give it access to the account number.
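To make the padding step concrete, here is a sketch of how the responding server might assemble such a payload. The function name buildJsonpPayload is hypothetical, and the callback-name check is an added precaution that the text above does not describe.

```javascript
// Build a JSONP payload: the JSON data "padded" with a call to the
// callback function named by the caller.
function buildJsonpPayload(callbackName, data) {
  // Added precaution (an assumption, not from the text above): accept only
  // plain identifiers so the query parameter cannot inject arbitrary script.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error('Invalid callback name');
  }
  return callbackName + '(' + JSON.stringify(data) + ');';
}

const payload = buildJsonpPayload('foo', { accountNumber: 12345 });
console.log(payload); // foo({"accountNumber":12345});
```

Whatever string the server returns is executed by the including page, which is why the caller has to trust the remote resource completely.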

While extremely useful, JSONP and cross-origin script inclusion generally are dangerous patterns. JSONP is really just “friendly” Cross-Site Scripting. A resource using JSONP must completely trust the remote resource not to be malicious or compromised; the payload could just as easily be an information-stealing script as an information-passing one and it will be executed in the security domain of the caller, whatever it contains. A resource providing JSONP responses must also be careful not to send them to the wrong origins. Enabling cross-origin async data exchange in a more secure manner than JSONP was a motivating use case in the design of CORS.

The browser permissions table, post-JavaScript but pre-CORS, looks as follows. The CORS specification refers to requests and responses that fall within this permission set as “simple” requests and responses, and the Fetch specification calls these methods and headers “safelisted”.

5.1. CROSS-ORIGIN SEND PERMISSIONS (Simple / Safelisted Request)

GET, with cookies ALLOWED
POST, with cookies ALLOWED
PUT, DELETE, etc., with cookies DISALLOWED
Set header value: Accept USER-CONTROLLED
Set header value: Accept-Language USER-CONTROLLED
Set header value: Content-Language USER-CONTROLLED
Set header value: Content-Type PARTIAL: application/x-www-form-urlencoded , multipart/form-data , or text/plain
Set header value: anything else DISALLOWED

User agents that implement the Fetch specification also allow the following headers to be sent as part of a simple / safelisted request, so long as the value, once parsed, is not a failure:

Set header value: DPR ALLOWED
Set header value: Downlink ALLOWED
Set header value: Save-Data ALLOWED
Set header value: Viewport-Width ALLOWED
Set header value: Width ALLOWED
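The send table above can be read as a test that decides whether a request may go out without asking permission first. The sketch below encodes that test under simplifying assumptions: it checks header names and the Content-Type media type only, and does not validate other header values. The name isSimpleRequest is hypothetical.

```javascript
// Methods and headers from the simple / safelisted tables above.
// HEAD is not listed in the tables, but the CORS and Fetch specifications
// also treat it as a simple method.
const SIMPLE_METHODS = new Set(['GET', 'HEAD', 'POST']);
const SIMPLE_HEADERS = new Set([
  'accept', 'accept-language', 'content-language',
  // Additional headers safelisted by user agents that implement Fetch.
  'dpr', 'downlink', 'save-data', 'viewport-width', 'width',
]);
const SIMPLE_CONTENT_TYPES = new Set([
  'application/x-www-form-urlencoded', 'multipart/form-data', 'text/plain',
]);

// Returns true if the request fits the simple permission table.
// `headers` is a plain object mapping header names to values.
function isSimpleRequest(method, headers) {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return false;
  return Object.entries(headers).every(([name, value]) => {
    const lower = name.toLowerCase();
    if (lower === 'content-type') {
      // Compare only the media type, ignoring parameters such as charset.
      const mediaType = value.split(';')[0].trim().toLowerCase();
      return SIMPLE_CONTENT_TYPES.has(mediaType);
    }
    return SIMPLE_HEADERS.has(lower);
  });
}

console.log(isSimpleRequest('POST', { 'Content-Type': 'text/plain' }));
console.log(isSimpleRequest('POST', { 'Content-Type': 'application/json' }));
```

Under these assumptions the first call reports a simple request and the second does not, since application/json is outside the three permitted Content-Type values.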

5.2. CROSS-ORIGIN READ PERMISSIONS (Simple / Safelisted Response)

Cookies DISALLOWED
Resource body DISALLOWED (some side effects like response status and image size may be observable from scripts)
<script> resource body PARTIAL (implicit access to declared variables and side effects of functions)
Read header values DISALLOWED

6. CORS

As JavaScript and AJAX became ubiquitous on the web and the browser became a rich client-side application platform, people wanted to be able to do more complex things with cross-origin requests (as compared to the old “simple” requests and responses).

Enter Cross-Origin Resource Sharing. The challenge when designing CORS was how to enable web applications to request and receive more cross-origin permissions, without exposing existing applications to new attacks. By the time CORS arrived on the scene there were already billions of HTTP resources in use by a wide array of applications beyond the traditional browser. Many of those resources relied implicitly on the original permission table. For example, they might expose a privileged endpoint on an intranet, but check for a special HTTP request header that only a non-browser or same-origin client could set, to protect themselves from CSRF-type attacks. The design of CORS couldn’t suddenly make all those existing applications vulnerable; it needed to evolve the web platform in a backwards-compatible way.

6.1. Identify Yourself and Get Permission: Simple Request Read Permissions

The way to expand the permissions available to invoke and read cross-origin resources safely is to identify yourself to the foreign resource and have it grant you additional privileges. This extending of the access control table is at the heart of Cross-Origin Resource Sharing.

First: how to identify yourself? The JavaScript Same Origin Policy made the Origin the main security boundary / principal for the web, so this is how you identify yourself to request new permissions. Whenever you make a request in CORS mode, the browser will add an HTTP header, “Origin”, identifying the context from which the request originated.

Next: how to get extra permissions? If you are making a cross-origin request already allowed by the pre-CORS permission table (AKA a “simple” request), all you need to do is let the browser send an Origin header with the request. When it receives such a request, the foreign resource can examine the Origin header of the request and decide to grant it additional Read permissions.

Suppose a resource with an Origin of https://example.com initiates a simple CORS request with credentials (withCredentials=true) using XMLHttpRequest or the Fetch API, so that the request carries an Origin header, and the response from the foreign resource includes the following:

Access-Control-Allow-Origin: https://example.com
Access-Control-Allow-Credentials: true
Access-Control-Expose-Headers: Special-Response-Header, Header2

The browser would add the following entries to the read permissions table for https://example.com regarding that response:

Resource body ALLOWED
Read header value: Special-Response-Header ALLOWED
Read header value: Header2 ALLOWED

The Access-Control-Allow-Origin header tells the browser which Origin can read the response body. The Access-Control-Allow-Credentials header in the response tells the browser it is OK to expose the response for a request that included cookies (more about that in a bit). The Access-Control-Expose-Headers header (which is optional in a CORS response) lists headers in the response that the resource from https://example.com is able to read.
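On the server side, granting these read permissions comes down to examining the request's Origin header and deciding which headers to add to the response. The following is a sketch only, assuming a fixed set of trusted origins; the policy object and the function name corsReadHeaders are hypothetical and not tied to any framework.

```javascript
// Hypothetical policy for one resource: which origins may read the response
// and which extra response headers they may see.
const readPolicy = {
  allowedOrigins: new Set(['https://example.com']),
  exposeHeaders: ['Special-Response-Header', 'Header2'],
};

// Given the request's Origin header value (undefined if absent), return the
// CORS response headers to add. An empty object grants nothing extra.
function corsReadHeaders(requestOrigin, policy) {
  if (!requestOrigin || !policy.allowedOrigins.has(requestOrigin)) {
    return {};
  }
  return {
    'Access-Control-Allow-Origin': requestOrigin,
    'Access-Control-Allow-Credentials': 'true',
    'Access-Control-Expose-Headers': policy.exposeHeaders.join(', '),
  };
}

console.log(corsReadHeaders('https://example.com', readPolicy));
console.log(corsReadHeaders('https://evil-example.com', readPolicy));
```

A request from an origin outside the set gets no Access-Control headers at all, so the browser keeps the default read permissions in place.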

When any matching Access-Control-Allow-Origin response is received, read permission is also granted implicitly for a number of headers deemed to be at low risk of containing sensitive information:

Read header value: Cache-Control ALLOWED
Read header value: Content-Language ALLOWED
Read header value: Content-Type ALLOWED
Read header value: Expires ALLOWED
Read header value: Last-Modified ALLOWED
Read header value: Pragma ALLOWED

A Few Caveats for CORS Read Permissions

Although the CORS specification implies that you can list multiple origins in the Access-Control-Allow-Origin header, in practice only a single value is allowed by all modern browsers. The multiple value syntax was intended to allow all origins in a redirect chain to be listed, as allowed by RFC6454, but this was never implemented.

If a cross-origin resource redirects to another resource at a new origin, the browser will set the value of the Origin header to null after redirecting. This prevents additional confused deputy attacks, but at the cost of making it difficult to transparently move CORS resources that support (cookie-based) credentials and simple requests across domains with 3xx status codes, as one can with non-CORS resources. It is possible to redirect same-origin resources to a cross-origin location (single hop only) because browsers will transparently apply the CORS algorithm to such requests and include the Origin header for the first hop.

Making it possible to transparently move CORS-enabled resources accessed via simple requests requires work and coordination by servers. For example, a resource at https://old.example.com/resource might respond to a CORS request with credentials with a redirect to https://new.example.com/resource?originalOrigin=...&credential=... which would carry the authority to access the resource in the URL parameters instead of the Cookie and Origin headers. The server at https://new.example.com would need custom logic to recognize and authenticate these parameters and perform appropriate authorization before returning Access-Control-Allow-* headers to the user agent.

Requests which require a preflight can transit multiple origins in user agents that implement recent versions of the WHATWG Fetch specification, which instructs that the preflight request be transparently re-issued to the new target using the pre-redirect value of the Origin header.

No Sharing of Cookies

Because cookies are security-critical access control tokens, browsers will not allow them to be shared. The following response headers therefore cannot be accessed, even if listed in Access-Control-Expose-Headers:

  • Set-Cookie
  • Set-Cookie2
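Putting the read rules together: the headers a script can read are the low-risk headers granted implicitly, plus anything named in Access-Control-Expose-Headers, minus the cookie headers. The sketch below models that rule only; it is not how any particular browser implements it, and readableResponseHeaders is a hypothetical name.

```javascript
// Headers readable whenever Access-Control-Allow-Origin matches.
const ALWAYS_READABLE = [
  'cache-control', 'content-language', 'content-type',
  'expires', 'last-modified', 'pragma',
];
// Headers that are never exposed, even if the server lists them.
const NEVER_READABLE = ['set-cookie', 'set-cookie2'];

// Returns the set of lower-cased response header names a script may read,
// given the value of Access-Control-Expose-Headers (possibly undefined).
function readableResponseHeaders(exposeHeadersValue) {
  const exposed = (exposeHeadersValue || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name !== '');
  const readable = new Set([...ALWAYS_READABLE, ...exposed]);
  for (const name of NEVER_READABLE) {
    readable.delete(name);
  }
  return readable;
}

const readable = readableResponseHeaders('Special-Response-Header, Set-Cookie');
console.log(readable.has('special-response-header'));
console.log(readable.has('set-cookie'));
```

In this example the custom header becomes readable while Set-Cookie stays hidden despite being listed.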

6.2. Getting Additional Send Permissions

What if you want to make a request that isn’t allowed by the simple request permissions table? To protect legacy resources that may not understand CORS, you must first ask permission to expand your send privileges. This permission granting is done with a “preflight request”. If you are writing the client side of an application, the browser will take care of most of this for you. You simply enable CORS and make your request. If, e.g., you wanted to send a cross-origin PUT request and set the header “Special-Request-Header: foobar”, the browser would automatically send a preflight request. That preflight would include the following:

Origin: https://example.com
Access-Control-Request-Method: PUT
Access-Control-Request-Headers: Special-Request-Header

If the resource is CORS-aware and supports these options, it might respond to this preflight request with:

Access-Control-Allow-Origin: https://example.com
Access-Control-Allow-Methods: PUT
Access-Control-Allow-Headers: Special-Request-Header
Access-Control-Allow-Credentials: true
Access-Control-Max-Age: 60

This adds the following entries to the permission table for https://example.com :

PUT, with cookies ALLOWED (for 60 seconds)
Set header: Special-Request-Header ALLOWED (for 60 seconds)

The browser will then transparently make the actual request if the correct permissions were granted, and return the response with any new read permissions granted by the read permission headers described above. The Access-Control-Max-Age header in the preflight response is optional, and allows the permission grant to be cached. If it is not present, a preflight must be made before each non-simple request.
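For resource owners, answering a preflight means comparing what the browser asks for against what the resource is willing to allow. The following sketch assumes a fixed policy matching the example above; the policy object and the function name preflightResponseHeaders are hypothetical, and real frameworks usually provide their own CORS handling.

```javascript
// Hypothetical policy matching the example exchange above.
const preflightPolicy = {
  allowedOrigins: new Set(['https://example.com']),
  allowedMethods: new Set(['PUT']),
  allowedHeaders: new Set(['special-request-header']), // lower-cased names
  maxAgeSeconds: 60,
};

// `requestHeaders` is a plain object with lower-cased header names.
// Returns the headers for the preflight response, or an empty object if the
// requested permissions are not granted.
function preflightResponseHeaders(requestHeaders, policy) {
  const origin = requestHeaders['origin'];
  const method = requestHeaders['access-control-request-method'];
  const requested = (requestHeaders['access-control-request-headers'] || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name !== '');

  const permitted =
    origin !== undefined && policy.allowedOrigins.has(origin) &&
    method !== undefined && policy.allowedMethods.has(method) &&
    requested.every((name) => policy.allowedHeaders.has(name));
  if (!permitted) return {};

  return {
    'Access-Control-Allow-Origin': origin,
    'Access-Control-Allow-Methods': method,
    'Access-Control-Allow-Headers': requested.join(', '),
    'Access-Control-Allow-Credentials': 'true',
    'Access-Control-Max-Age': String(policy.maxAgeSeconds),
  };
}

console.log(preflightResponseHeaders({
  'origin': 'https://example.com',
  'access-control-request-method': 'PUT',
  'access-control-request-headers': 'Special-Request-Header',
}, preflightPolicy));
```

Header names are compared case-insensitively here, so the sketch echoes them back in lower case.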

Caveats for CORS Send Permissions

For security and correctness reasons, a number of headers are forbidden to be set by a same-origin or CORS-authorized caller, even if the preflight ceremony is completed. These include:

  • Accept-Charset
  • Accept-Encoding
  • Access-Control-Request-Headers
  • Access-Control-Request-Method
  • Connection
  • Content-Length
  • Cookie
  • Cookie2
  • Date
  • DNT
  • Expect
  • Host
  • Keep-Alive
  • Origin
  • Referer
  • TE
  • Trailer
  • Transfer-Encoding
  • Upgrade
  • Via
  • Or any header beginning with Proxy- or Sec-
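The list above translates directly into a lookup plus two prefix checks. A minimal sketch, using only the names listed in this section; isForbiddenRequestHeader is a hypothetical helper, not a browser API.

```javascript
// Header names from the list above, lower-cased for comparison.
const FORBIDDEN_REQUEST_HEADERS = new Set([
  'accept-charset', 'accept-encoding', 'access-control-request-headers',
  'access-control-request-method', 'connection', 'content-length',
  'cookie', 'cookie2', 'date', 'dnt', 'expect', 'host', 'keep-alive',
  'origin', 'referer', 'te', 'trailer', 'transfer-encoding', 'upgrade',
  'via',
]);

// Returns true if a script is not allowed to set this request header.
function isForbiddenRequestHeader(name) {
  const lower = name.toLowerCase();
  return FORBIDDEN_REQUEST_HEADERS.has(lower) ||
    lower.startsWith('proxy-') ||
    lower.startsWith('sec-');
}

console.log(isForbiddenRequestHeader('Cookie'));
console.log(isForbiddenRequestHeader('Special-Request-Header'));
```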

Caveat caveats

Some server-side application frameworks, notably those that coerce HTTP headers into UNIX environment variables such as PHP or Perl CGIs, transparently translate hyphens to underscore characters when exposing headers to the application. This means that a CORS client may be able to set headers that begin with Proxy_ or Sec_ which will be indistinguishable to these applications from their hyphenated equivalents which are banned by browsers.
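The collision is easy to see once the translation is written out. The sketch below follows the CGI convention of upper-casing the header name, replacing hyphens with underscores and adding an HTTP_ prefix; the header names Sec-Example and Sec_Example are made up for illustration.

```javascript
// Convert an HTTP header name to its CGI-style environment variable name.
function cgiVariableName(headerName) {
  return 'HTTP_' + headerName.toUpperCase().replace(/-/g, '_');
}

// A browser refuses to let scripts set "Sec-Example", but "Sec_Example"
// does not start with "Sec-" and so is not covered by the prefix rule.
const bannedName = cgiVariableName('Sec-Example');
const sneakyName = cgiVariableName('Sec_Example');
console.log(bannedName === sneakyName); // true
```

An application that trusts such a variable because "browsers cannot set it" is relying on a distinction that the translation has erased.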

6.3. Anonymous Requests (or, “Access-Control-Allow-Origin: *”)

In addition to credentialed requests, CORS sought to solve one additional use case. Some resources on the web are truly public, and can be read by anyone. Before CORS, an application could access such resources by requesting that its origin server proxy them. This was safe because the proxy server had no ambient authority - that is, it couldn’t make those requests with users’ cookies, it could only return public content available to anyone.

While this worked, it was wasteful of resources, high-latency, and prevented building some classes of server-less applications in the browser. CORS allows such resources to be requested and accessed directly with anonymous requests. (In fact, anonymous requests are the default mode of CORS.)

Unlike the previous values for Access-Control-Allow-Origin headers we’ve seen, which add values to the discretionary access control table, anonymous requests follow a mandatory access control model. If a resource is marked as accessible to anonymous requests, it is only accessible to anonymous requests.

How does this work? A resource includes the following header in its response:

Access-Control-Allow-Origin: *

(and optionally an Access-Control-Expose-Headers header indicating the response headers that may be read)

Upon seeing this, the browser will grant access to the response (and indicated headers) if and only if the request was made anonymously. An anonymous request can be made by setting the withCredentials property on an XMLHttpRequest to false (the default), which causes the browser to omit the Cookie header from the request.
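With the Fetch API, the equivalent control is the credentials option passed to fetch(). The sketch below only builds the option objects so that the difference is visible; the helper names and the URL in the usage comment are placeholders.

```javascript
// Options for an anonymous CORS request with the Fetch API.
function anonymousRequestInit() {
  return {
    mode: 'cors',        // apply the CORS protocol to cross-origin requests
    credentials: 'omit', // never send cookies with this request
  };
}

// Options for a credentialed CORS request, comparable to setting
// withCredentials = true on an XMLHttpRequest.
function credentialedRequestInit() {
  return { mode: 'cors', credentials: 'include' };
}

// Usage in a browser (the URL is a placeholder):
// fetch('https://public.example.org/data.json', anonymousRequestInit())
//   .then((response) => response.json())
//   .then((data) => console.log(data));
console.log(anonymousRequestInit(), credentialedRequestInit());
```

For a cross-origin request, credentials: 'omit' behaves like withCredentials = false: no Cookie header is sent.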

document how to do this with Fetch API

Anonymous requests are also special in that responses do not change the state of the browser. If you started anonymous, you stay anonymous after an anonymous request. So, although a resource accessed without credentials using CORS might send a Set-Cookie header, such headers would be ignored. The response is not even stored in the normal browser cache, although it might be cached in a special location just for anonymous requests.

Why does * mean “anonymous requests only” and not “everyone”?

The use of the asterisk by CORS can be confusing in the context of an otherwise discretionary access control model. It is intended only to replace the proxy use case for static resources or those that respond uniformly to requests from all origins. A * says a resource is accessible to "anybody", but if your cookies identify you, you’re not “anybody” anymore - now you are somebody.

An unfortunate consequence of this model is that a CORS client must know in advance if the contract of a resource it is requesting requires anonymous or credentialed access, or it must implement client-side failure and retry logic. This can be problematic when designing something like an open data explorer when all potential data sources do not uniformly respond to anonymous requests.
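One possible shape for such failure and retry logic is sketched below, assuming the client prefers anonymous access and falls back to credentials only when that fails. The function name is hypothetical, and fetchImpl is a parameter only so that the logic can be exercised without a network.

```javascript
// Try an anonymous request first; if the browser blocks it or the server
// refuses it, retry once with credentials. `fetchImpl` defaults to the
// global fetch.
async function fetchAnonymousThenCredentialed(url, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(url, { mode: 'cors', credentials: 'omit' });
    if (response.ok) return response;
  } catch (error) {
    // fetch() rejects when the CORS check fails or on a network error;
    // fall through to the credentialed retry.
  }
  return fetchImpl(url, { mode: 'cors', credentials: 'include' });
}

// Usage in a browser (the URL is a placeholder):
// fetchAnonymousThenCredentialed('https://data.example.org/records.json')
//   .then((response) => response.json());
```

The order matters: trying anonymously first avoids sending cookies to resources that do not need them, at the price of an extra round trip for resources that do.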

The story behind *

Why does the * mode of CORS behave so differently than the credentialed mode? Given the already long history of web vulnerabilities attributable to abuse of ambient authority, there were two schools of thought about how to expand the web platform with cross-origin requests. One camp proposed extending the already-familiar cookie model to authenticate to cross-origin resources. The other camp felt that ambient authority was a mistake in the architecture of the web, and advocated for cross-origin requests free of any ambient credentials or origin-based authentication. (Such requests, they also argued, could be exempt from the Same Origin Policy entirely, as they would be equivalent to loading from an unprivileged proxy.) The XDomainRequest object in Microsoft Internet Explorer 8 and 9 retained some of this style - it had no support for credentials, only anonymous requests, although it also used the same Access-Control-[...] headers.

The familiarity and compatibility of the cookie-based credentials model of CORS eventually won more favor with developers than re-designing their applications in a capability-security style (https://www.w3.org/TR/capability-urls/). As a compromise, the anonymous request mode was retained as a “subset” of CORS. Unfortunately, the subtle differences in architectural style that persisted, and the choice of * to represent such requests, have been a consistent source of confusion.

Although the purest vision for anonymous requests wanted to do away with any need for permission grants, it was decided that an Access-Control-Allow-Origin header must still be required in order to protect legacy resources that rely on ambient authority outside of the control of the browser. Network topology is the most common example of this sort of authority. E.g. a resource on an intranet might assume that only trusted clients on the same physical network can reach it and so impose no access controls. If those same clients also fetch resources from the public internet and there were no requirement for the opt-in header, a remote resource could possibly “pivot” through a browser on the trusted network to access and exfiltrate information from those unprotected services.

Responses with * are not truly uniform, either. Because the user agent always sends the Origin header for CORS requests, even those made without credentials, a server might choose to vary its response based on that information.

7. Advice for Resource Owners

Understanding the client side operation of CORS helps inform how to send headers as a resource owner, but some additional advice applies.

7.1. Always send * for resources that respond uniformly

It is always safe to return Access-Control-Allow-Origin: * for static resources that do not respond differently based on cookies or other distinguishing characteristics of the request. Such resources could always be fetched into the context of any foreign resource via a proxy, so allowing read access is an optimization that has no additional security or privacy implications. Sending * is part of being a good citizen of the Web as it allows browser-based applications like podcast aggregators or Open Data browsers to access your resources as a native app equivalent might, without needing a proxy server.

7.2. Use Vary

If a resource requires credentials, it cannot return *; it must reflect the Origin header value from the request. This implies that, unless the response explicitly indicates it should never be cached, it should also send a Vary: Origin header to keep the browser from re-using the response and denying access to requests from other origins that might be authorized.

If a resource is intended to be readable, truly regardless of the context of the request, first it should take care not to actually vary and reveal private information based on the presence and content of a cookie. Then it should send Vary: Cookie, Origin and either Access-Control-Allow-Origin: * for requests made without cookies or reflect the request’s Origin into the Access-Control-Allow-Origin header for credentialed requests.
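The rule in the paragraph above can be sketched as a small server-side function. This assumes the resource really does return the same content regardless of cookies; the function name is hypothetical, and Access-Control-Allow-Credentials is included because a credentialed response is not exposed without it.

```javascript
// `requestHeaders` is a plain object with lower-cased header names.
// Returns the CORS-related headers for a resource that is meant to be
// readable regardless of the context of the request.
function publicResourceCorsHeaders(requestHeaders) {
  const headers = { 'Vary': 'Cookie, Origin' };
  const origin = requestHeaders['origin'];
  if (requestHeaders['cookie'] === undefined) {
    // Request made without cookies: the wildcard is sufficient.
    headers['Access-Control-Allow-Origin'] = '*';
  } else if (origin !== undefined && origin !== 'null') {
    // Credentialed request: the wildcard is not accepted, so reflect the
    // Origin. The "null" origin is skipped, following the advice on "null"
    // later in this section.
    headers['Access-Control-Allow-Origin'] = origin;
    headers['Access-Control-Allow-Credentials'] = 'true';
  }
  return headers;
}

console.log(publicResourceCorsHeaders({ origin: 'https://example.com' }));
console.log(publicResourceCorsHeaders({
  origin: 'https://example.com',
  cookie: 'session=abc',
}));
```

Sending Vary on every response, including the wildcard one, keeps caches from serving a response produced for one kind of request to the other.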

7.3. Do not blindly reflect request Origin for credentialed requests when a resource may contain secrets or PII

If your resource includes customized, per-user content, it should not expose such information cross-origin. Consider carefully if "public" content may inadvertently include things like session tokens, anti-CSRF tokens, greetings that expose personally identifying information like user names, or comments that contain unique identifiers. Such things exposed blindly cross-origin may compromise the security and privacy of your users by providing a stable identifier that can track them across all their activities on the web or allow hijacking of their account, even though cookie headers are never shared with CORS.

7.4. Avoid returning Access-Control-Allow-Origin: "null"

It may seem safe to return Access-Control-Allow-Origin: "null", but the serialization of the Origin of any resource that uses a non-hierarchical scheme (such as data: or file:), and of sandboxed documents, is defined to be "null". Many User Agents will grant such documents access to a response with an Access-Control-Allow-Origin: "null" header, and any origin can create a hostile document with a "null" Origin. The "null" value for the ACAO header should therefore be avoided.

The simple string comparison of CORS as applied to "null" is controversial. Some believe that "null" should be treated as a keyword token indicating the lack of an Origin, which, when tested, should never compare as equal to another "null" Origin. (As is the case with null values in SQL, for example.) It is unwise to build systems which rely on the "null" equals "null" comparison as this behavior may change in the future.

Issues Index

this whole document needs references and terms linkified
document how to do this with Fetch API