From the PING review:
Self-contained packages have potentially huge privacy advantages, but it's not clear that the EPUB spec or current implementations realize them. Is that a goal that the community could work towards?
The current spec anticipates and requires (at least at the SHOULD level) loading of remote resources from arbitrary origins. This introduces risks of additional data collection about who is reading what book (and from where), and what part of the book is being read at a particular moment (depending on the implementation or requirements on how remote resources are loaded). The spec should also make explicit that, through remote resource loading, the author/publisher of the book may effectively be collecting data on the reader's habits, in addition to the reading system. Different levels of scripting access are defined, but it's not clear whether any such level would indicate that user reading behavior would not be disclosed.
It would be useful to specify a privacy threat model specific to EPUB, to the extent that it differs from that of the Web. Can we guarantee that reading habits will not be surveilled, by the publisher, the retailer, the reading system, or other parties? Or if that data is revealed, then we should clarify to whom and under what conditions. Book-like privacy could be achieved, but would require significant changes from the current spec and current popular implementations.
The spec suggests that the manifest is an exhaustive pre-stated list of resources, including remote resources, but it's not clear how that's intended to be handled by reading systems. Should a reading system refuse to fetch any remote resource not included in the manifest? Are remote resources intended to be the same for all readers of a book, or might they be personalized?
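To make the manifest question concrete, here is a minimal sketch of the strict reading, in which a reading system refuses any remote fetch that is not listed in the package document's manifest. It is an illustration only, not behavior the spec currently requires. The function names `remote_manifest_urls` and `may_fetch`, the sample package document, and the exact-string matching rule are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace of the EPUB package document (OPF).
OPF_NS = "{http://www.idpf.org/2007/opf}"

# Hypothetical package document, trimmed to the manifest for illustration.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="c1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="a1" href="https://example.com/audio/intro.mp3" media-type="audio/mpeg"/>
  </manifest>
</package>"""


def is_remote(url: str) -> bool:
    """Treat only http(s) URLs as remote; everything else is container-local."""
    return urlsplit(url).scheme in ("http", "https")


def remote_manifest_urls(opf_xml: str) -> set[str]:
    """Collect the remote resource URLs declared in the manifest."""
    root = ET.fromstring(opf_xml)
    return {
        item.get("href", "")
        for item in root.iter(f"{OPF_NS}item")
        if is_remote(item.get("href", ""))
    }


def may_fetch(url: str, allowed: set[str]) -> bool:
    """Assumed policy: local resources are always allowed, remote ones
    only when the exact URL was declared in the manifest."""
    if not is_remote(url):
        return True
    return url in allowed


allowed = remote_manifest_urls(SAMPLE_OPF)
print(may_fetch("chapter1.xhtml", allowed))
print(may_fetch("https://example.com/audio/intro.mp3", allowed))
print(may_fetch("https://tracker.example/pixel.gif", allowed))
```

Under this assumed policy, exact matching also means a per-reader URL (for example one carrying a reader identifier in its query string) would be refused unless that precise URL appears in the manifest, which bears on the question of whether remote resources may be personalized.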