editorial: mark languageCode at risk #764
Based on teleconference discussion and #608.
afc08b6 to 7471998
@rsolomakhin, big ask... but maybe we can write a small MDN description of how to achieve the same behavior using JS? That will at least give us a good story for anyone that might actually need this.
Filed bug on Gecko to remove the attribute: https://bugzilla.mozilla.org/show_bug.cgi?id=1485881
Hi @marcoscaceres,
The text looks fine but I have a suggestion: add a note that says that even if the feature is removed, the party that receives the response data can use JS or other libraries to determine the language of the text in question.
I am approving this nonetheless, and I will live if you don't include such a note.
Filed an issue in Chromium: https://crbug.com/877521. Looking into doing this work in JavaScript for an MDN article may take a while for me personally.
I still need to write up conclusions from the i18n WG meeting yesterday - might not get to that until the weekend.
No problem. It might not come up, and we can do that if it does. Happy to help too when needed.
Merging this, as it's just editorial. I'll do a new PR for removal and move the various browser bugs over to that.
For the sake of traceability, here are the conclusions of my i18n review (it's possible @aphillips might have more or better suggestions)...

First, the i18n WG guidelines for spec authors [0] don't say much about how to handle web forms. There are a few suggestions about character encoding [1] and text direction [2] in the guidelines for content authors and developers, but those are rather minimal, too. Because the Payment Request API is a strange beast (in essence it moves ecommerce checkout forms out of web content and into a browser dialog), it's likely an outlier for advice to spec developers. I've raised the issue of web forms guidance for discussion in the i18n WG.

Second, there are a few topics we might want to broach in the Payment Request API spec, such as:

(a) Recommend that the browser set a language tag for user input in the payment dialog. For instance, it could inherit the language tag from the html
(b) Recommend that the browser be able to handle a locale value that is distinct from the language tag. As noted in [4] and relevant for our use cases, "the region code is also sometimes used to indicate the physical location, market, legal, or other governing policies for the user."
(c) Require the browser to treat all input from the payment dialog as UTF-8, consistent with [1].
(d) Mention that the user can set a base direction for textual input, as described at [2].

Third, there are probably easier ways to determine the script of user-inputted text than the algorithm Rouslan provided [5] (which I take it describes what libaddressinput [6] uses). For instance, the browser could simply inspect the characters themselves to see if they are in Latin script, Japanese script, etc. (I'll grant you that mixed-script input could be a challenge, though.)

Fourth, a scenario Addison mentioned on an i18n WG call is the need for the same address in multiple forms (e.g., an English-language version for delivery from the U.S. to an import handling location in China and a Chinese-language version for final delivery to the customer). We have not designed for this yet, but might want to open a tracking bug for multiple representations of the same address.

Fifth, a related scenario might be a billing address in one script and a shipping address in another script. This is simpler than multiple representations of the same address, but still requires support for two different scripts in the same set of input forms.

We might uncover additional issues in the future, but these are the ones we've discussed so far.

[0] https://w3c.github.io/bp-i18n-specdev/#loc_forms
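To make the "inspect the characters themselves" idea concrete, here is a minimal TypeScript sketch that a merchant (or anyone receiving the response data) could run. It is not what libaddressinput does and not part of the Payment Request API; the helper names `countScripts` and `dominantScript` and the short script list are assumptions chosen for illustration. It relies only on Unicode property escapes in ECMAScript regular expressions.

```typescript
// Hypothetical helpers for illustration; not part of any spec or library.
// A small, non-exhaustive list of Unicode script names to look for.
const SCRIPT_NAMES = ["Latin", "Han", "Hiragana", "Katakana", "Hangul", "Cyrillic", "Arabic"];

// One regular expression per script, matching exactly one code point.
// Built with the RegExp constructor so the \p{...} escapes are resolved at run time.
const SCRIPT_TESTS: Array<[string, RegExp]> = SCRIPT_NAMES.map(
  (name): [string, RegExp] => [name, new RegExp("^\\p{Script=" + name + "}$", "u")]
);

// Count how many code points of `text` belong to each listed script.
// Code points outside the list are not counted; that includes the
// "Common" script (digits, spaces, punctuation).
function countScripts(text: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const ch of Array.from(text)) {
    for (const [name, re] of SCRIPT_TESTS) {
      if (re.test(ch)) {
        counts[name] = (counts[name] ?? 0) + 1;
        break;
      }
    }
  }
  return counts;
}

// Return the most frequent listed script, or undefined if none was seen.
// On a tie, the script encountered first in the text wins.
function dominantScript(text: string): string | undefined {
  const counts = countScripts(text);
  let best: string | undefined;
  let bestCount = 0;
  for (const name of Object.keys(counts)) {
    if (counts[name] > bestCount) {
      best = name;
      bestCount = counts[name];
    }
  }
  return best;
}
```

Calling `countScripts` on mixed input returns a count per script rather than a single answer, which leaves the mixed-script decision to the caller. Note that this detects script only; it says nothing reliable about language.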
Thanks so much for this input, Peter and I18n folks. Just noting for (a), you’ll be happy to hear that’s literally what we recommend in the spec:
I’ll write a full response for the other points, but the tl;dr is that at a glance we get most of what’s mentioned for free from the IDL layer (DOMStrings are already UTF-16, irrespective of payment dialog input fields). And we can defer to the merchant for script detection when they need it (hence removal of this attribute). If there is to be a script/lang detection mechanism, we should add that as native functionality to ECMAScript via the
@stpeter, @marcoscaceres Thanks for the summary. I don't necessarily see that removing

One thing about the language tag is that it should not be used to indicate region/country or jurisdiction. That should be a separate bit of data, such as an ISO-3166 code or such. The region subtag in a language tag can indicate defaults for market, legal, or other locale-affected API usages. But it is a separate thing and it is a best practice not to use it as a proxy. That is, the language of an address has nothing to do with where the address is in the world. LTLI says something about this, but the quote @stpeter cites needs more context and explanation.

When it comes to text analysis, there are a number of APIs for determining the script of content. The key thing to recall here is there is what we call the "common" script, consisting of characters shared between many different writing systems. Punctuation, for example. Understanding this reduces (but does not eliminate) cases where there are truly mixed script usages. Script is defined by Unicode and there are APIs that could be exposed in e.g. intl, although I caution that script isn't necessarily always useful in the way that this spec's usage seems to suggest.

There is a need for more general I18N documentation for things such as field handling and definition, defining locale-neutral data structures, cultural awareness, etc. The LTLI document that you mention is actually one of the items that the I18N WG prioritized just below our current work and which I hope that we can get back to once Charmod/String-Meta are out of our systems. In the meantime, happy to help.
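The point about keeping language and country separate can be illustrated with a short TypeScript sketch. The `AddressLanguageInfo` shape, its field names, and the `describeAddress` helper are hypothetical and exist only for this example; `Intl.Locale` is the standard ECMAScript API for reading the subtags of a language tag.

```typescript
// Hypothetical record shape; the field names are illustrative only.
interface AddressLanguageInfo {
  languageTag: string; // BCP 47 tag describing the language of the address text
  country: string;     // ISO 3166-1 alpha-2 code for where the address is located
}

// Read the subtags of the language tag and report them next to the
// separately stored country, without deriving one from the other.
function describeAddress(info: AddressLanguageInfo) {
  const locale = new Intl.Locale(info.languageTag);
  return {
    language: locale.language, // e.g. "en"
    script: locale.script,     // e.g. "Hant", or undefined if the tag has no script subtag
    tagRegion: locale.region,  // region subtag of the tag, or undefined
    country: info.country,     // physical location of the address
  };
}

// An English-language rendering of an address that is located in China.
const englishInChina = describeAddress({ languageTag: "en-US", country: "CN" });
```

Here `tagRegion` and `country` legitimately differ: the tag describes the text, while the country field describes the destination.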
The challenge here is having a clear algorithm to identify the language of content - in this case, an address. Does [0] provide the algorithm? Apologies if it does and I missed it. Without such an algorithm, none of us can implement At the risk of getting circular, if there is such an algorithm, whereby a string is given and out comes a language tag(s), then IMO, it should be part of
I see the point that @aphillips makes: in the Payment Request API, the end user is a producer (as defined in the string-meta spec [0]) of a string or set of strings, and this is our chance to attach metadata about the language and base direction of the string. If we don't do that when the string is created, some other consumer of the string will need to figure it out later on, and they won't have as much context as we do at string creation time...
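As a rough illustration of attaching metadata when the string is created, the TypeScript sketch below pairs a value with a language tag and a base direction so that both travel with the text. The `LocalizedString` shape and the `tagInput` helper are assumptions made for this example, not something defined by the Payment Request API.

```typescript
// Hypothetical shapes for illustration only.
type TextDirection = "ltr" | "rtl" | "auto";

interface LocalizedString {
  value: string;
  lang: string;       // BCP 47 language tag; "" when the language is unknown
  dir: TextDirection; // base direction; "auto" when it was not determined
}

// Attach language and direction at the point where the string is produced,
// when the most context (dialog language, user settings) is available.
function tagInput(value: string, lang: string, dir: TextDirection = "auto"): LocalizedString {
  return { value, lang, dir };
}

// The metadata survives serialization, so a later consumer does not have to guess.
const recipientName = tagInput("山田太郎", "ja");
const receivedName: LocalizedString = JSON.parse(JSON.stringify(recipientName));
```

A consumer that receives `receivedName` can read `lang` and `dir` directly instead of inferring them from the characters.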
Thinking about this further, I have a question for @aphillips - the answer to which might clear up some of the confusion around the current
Oh, can we please move this discussion to #608? The issue we are currently in was for the pull request to add “at risk” to the spec, and it’s been merged and closed.