should we disallow "out-of-container" relative URLs? · Issue #1912 · w3c/epub-specs · GitHub
@rdeltour

Definitions

By out-of-container relative URL string, I mean a relative URL string with enough double-dot path segments ("..") to conceptually point outside the container.

For instance, "../../../../EPUB/content.xhtml" is an out-of-container relative URL string given the following container:

/
├── mimetype
├── META-INF
│   └── container.xml
└── EPUB
    └── content.xhtml

Problem

In previous versions of EPUB, the URL definition was unclear (see #1888), but I believe the intent was to disallow them.
In the #1898 proposal, out-of-container URL strings are conforming, but the base URL of the container is defined such that an out-of-container URL string is necessarily parsed into an in-container URL.

For instance, after #1898, the URL string "../../../../EPUB/content.xhtml" will be parsed to the same URL as the URL string "EPUB/content.xhtml".

But as we noted in the #1898 proposal, using an out-of-container URL string will likely lead to interoperability issues with legacy or non-conforming reading systems (RS).
In addition, as I said earlier, I believe the intent in previous versions of EPUB was to disallow them.
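
The interoperability concern can be illustrated with a minimal sketch. It assumes Python's `urllib.parse.urljoin` as a rough stand-in for the URL parser (it is not the URL Standard parser), and both base URLs are hypothetical: `ROOT_LIKE_BASE` mimics a base with nothing above it, while `NESTED_BASE` mimics what a legacy or non-conforming RS might use.

```python
from urllib.parse import urljoin

# Hypothetical stand-ins; neither URL comes from the EPUB specifications.
ROOT_LIKE_BASE = "https://example.org/"        # base with nothing above it
NESTED_BASE = "https://example.org/books/1/"   # base a legacy RS might use

LEAKY = "../../../../EPUB/content.xhtml"
PLAIN = "EPUB/content.xhtml"


def resolve(base: str, url_string: str) -> str:
    """Resolve url_string against base (urljoin approximates the URL parser)."""
    return urljoin(base, url_string)


# With a root-like base, the extra ".." segments are dropped, so both
# strings resolve to the same URL.
same_at_root = resolve(ROOT_LIKE_BASE, LEAKY) == resolve(ROOT_LIKE_BASE, PLAIN)

# With a nested base, the ".." segments climb out of "books/1/", so the
# two strings resolve to different URLs.
same_when_nested = resolve(NESTED_BASE, LEAKY) == resolve(NESTED_BASE, PLAIN)
```

Under these assumptions, `same_at_root` is true and `same_when_nested` is false, which is the kind of divergence between reading systems that the note warns about.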

Proposal

I think we should forbid out-of-container URL strings.

Here's a proposal (assuming #1898 is merged).
Replace:

In the OCF Abstract Container, when a file uses a URL string to reference another file in the container, the string MUST be a path-relative-scheme-less-URL string, optionally followed by U+0023 (#) and a URL-fragment string.

by something along the lines of:

In the OCF Abstract Container, a relative-URL string MUST be a container relative URL string.

A URL string url is a container relative URL string if it is a path-relative-scheme-less-URL string and the following steps return true:

  1. Let testURLRecord be the result of applying the URL parser to url with "https://example.org/A/" as the base URL.
  2. Let testURLString be the result of applying the URL serializer to testURLRecord.
  3. If testURLString does not start with "https://example.org/A/", then return false.
  4. Set testURLRecord to the result of applying the URL parser to url with "https://example.org/B/" as the base URL.
  5. Set testURLString to the result of applying the URL serializer to testURLRecord.
  6. If testURLString does not start with "https://example.org/B/", then return false.
  7. Return true.
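
The steps above can be sketched as follows. This is only an approximation: it uses Python's `urllib.parse` rather than the URL Standard parser, so it will not match that parser in every case (for instance, percent-encoded dot segments are not treated as ".."). The function name `is_container_relative_url_string` and the rough path-relative-scheme-less check are assumptions made for illustration.

```python
from urllib.parse import urljoin, urlsplit

# The two test base URLs used in the steps of the proposal.
TEST_BASES = ("https://example.org/A/", "https://example.org/B/")


def is_container_relative_url_string(url: str) -> bool:
    """Approximate the proposed check using urllib (hypothetical helper)."""
    parts = urlsplit(url)
    # Rough path-relative-scheme-less check: no scheme, no authority,
    # and no leading "/".
    if parts.scheme or parts.netloc or url.startswith("/"):
        return False
    # Steps 1-6: resolve against each test base; the serialized result
    # must still start with that base.
    for base in TEST_BASES:
        if not urljoin(base, url).startswith(base):
            return False
    # Step 7.
    return True
```

With this sketch, "EPUB/content.xhtml" and "EPUB/../content.xhtml" are accepted, while "../../../../EPUB/content.xhtml" and "/EPUB/content.xhtml" are rejected. The string "../A/content.xhtml" would pass with the first base alone, because it resolves back under "/A/"; the second base is what rejects it.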

Explanation

The proposal above intends to override the URL standard definition of relative-URL string, so that:

  • scheme-relative- and path-absolute-URL strings are not allowed (in other words, URL strings starting with "/" are not allowed)
  • "exceeding" or "leaky" URL strings are not allowed (in other words, URL strings with enough ".." path segments to go "outside" the container are not allowed)

The intent is that even if we refer to a broader "category" of URL strings, like a relative-URL-with-fragment string, our restrictions on relative-URL strings still apply.

In a way, this monkey patches the URL standard definition. Monkey patches are usually not considered a good thing, but I do not see how to do otherwise: for the document formats we own (e.g. the Package Document), we can easily define what a valid URL string is; but other formats used in EPUB (e.g. HTML) refer directly to the URL standard, so I don't see an alternative to tweaking the definition.

Editorial consequences

We will be able to replace all our uses of:

path-relative-scheme-less-URL string, optionally followed by U+0023 (#) and a URL-fragment string

by

relative-URL-with-fragment string (which is a bit more readable).

We may no longer need to assume the properties of the container root URL in the core spec, as they really only apply to out-of-container URLs.
We still need those in the RS spec, to specify how reading systems must process non-conforming URLs.

Metadata
    Labels

    EPUB33 (Issues fixed in the EPUB 3.3 revision), Spec-EPUB3 (The issue affects the core EPUB 3.X Recommendation)