Specify HTML numeric character reference fallback encoding for multipart upload filename characters not representable in form acceptCharset/form charset.
Rationale:
- Consistency: this makes fallback character replacement for filenames consistent with the encoding of form element names and values in multipart uploads when a source character is not representable in the acceptCharset/form charset. @annevk points out that this is exactly the "html" error handling of the Encoding Standard: https://encoding.spec.whatwg.org/#concept-encoding-process
- Predictability: this is consistent with existing behavior in at least two browsers (Firefox and Edge). I have also started an intent to implement and ship thread for this behavior in Chrome. Edit: that proposal was accepted, and I'm now working to implement it in Chrome.
- Reduced data loss: this change reduces the risk of user confusion and website malfunction when multiple files uploaded using <input type=file multiple> have distinct local filenames but identical representations after user agent-specific fallback character replacement. With this behavior standardized, web pages may even be able to portably recover useful user-visible representations of the original filenames. Some ambiguity remains with that approach, since a local filename could itself contain text matching a numeric character reference. Moving to UTF-8 for the form submission resolves the ambiguity and should be the only recommended solution for newly built web pages.
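For illustration, here is a rough sketch of the fallback described above. It is not taken from any browser's implementation: it uses Python's built-in `xmlcharrefreplace` error handler, which emits decimal `&#NNNN;` references like the Encoding Standard's "html" error mode. The helper name `encode_filename` and the sample filenames are made up for this example, and Python's `windows-1252` codec is assumed to be close enough to the Encoding Standard's for these characters.

```python
def encode_filename(name: str, charset: str) -> bytes:
    """Encode a filename in the form charset, replacing each unrepresentable
    character with a decimal numeric character reference (&#NNNN;)."""
    return name.encode(charset, errors="xmlcharrefreplace")


# Two distinct local filenames, neither representable in windows-1252.
first = encode_filename("文.txt", "windows-1252")
second = encode_filename("件.txt", "windows-1252")
print(first)   # b'&#25991;.txt'
print(second)  # b'&#20214;.txt'

# With a lossy "?" replacement instead, the two names collide.
lossy_first = "文.txt".encode("windows-1252", errors="replace")
lossy_second = "件.txt".encode("windows-1252", errors="replace")
print(lossy_first == lossy_second)  # True

# Remaining ambiguity: a file literally named "&#25991;.txt" encodes to
# the same bytes as "文.txt", so recovery on the server is best-effort.
literal = encode_filename("&#25991;.txt", "windows-1252")
print(literal == first)  # True

# Submitting the form as UTF-8 avoids the fallback entirely.
utf8 = encode_filename("文.txt", "utf-8")
print(utf8 == "文.txt".encode("utf-8"))  # True
```

The last case is why UTF-8 form submission remains the recommended approach: no replacement happens, so there is nothing to disambiguate.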
Accidentally filed here too: w3c/html#1077