Description
Summary
Up through .NET 7, we have provided UTF-16 based parsing functionality in the form of various Parse and TryParse APIs exposed as static methods on the respective types. In .NET 7, this support was expanded via the new IParsable and ISpanParsable interfaces and some additional numeric-specific APIs on INumberBase that take a NumberStyles parameter.
We have also provided Utf8Parser, which gives access to UTF-8 based parsing functionality for a limited subset of scenarios. Its core behavior differs in one respect: rather than treating the whole input as invalid once an invalid character is encountered, it treats the first invalid character as "end of input".
In .NET 8, we have again expanded the core support to provide IUtf8SpanParsable and some additional methods on INumberBase, which provide parity with the UTF-16 scenarios. However, a divergence remains: Utf8Parser still supports treating the first invalid character specially, and there is no equivalent functionality for UTF-16, or for UTF-8 when using the new interfaces/APIs.
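The divergence described above can be seen in a small sketch, assuming .NET 8 and C# 11 (for the `u8` literal). The input `123abc` is made up for illustration. Both calls use existing APIs.

```csharp
using System;
using System.Buffers.Text;
using System.Globalization;

ReadOnlySpan<byte> utf8 = "123abc"u8;

// Utf8Parser treats the first invalid byte ('a') as "end of input"
// and reports how many bytes it consumed before stopping.
bool ok1 = Utf8Parser.TryParse(utf8, out int value1, out int bytesConsumed);
Console.WriteLine($"{ok1} {value1} {bytesConsumed}"); // True 123 3

// The .NET 8 UTF-8 overload on int rejects the whole input instead,
// and has no way to report a partial parse.
bool ok2 = int.TryParse(utf8, NumberStyles.Integer, CultureInfo.InvariantCulture, out int value2);
Console.WriteLine(ok2); // False
```

A caller who wants the first behavior through the new interfaces currently has no option. That is the gap the proposed NumberStyles flag is meant to close.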
It is proposed that we then provide a new NumberStyles option that allows access to this functionality for UTF-16 or UTF-8 scenarios via the new interfaces/types. This would make Utf8Parser and Utf8Formatter both "functionally obsolete" even if we do not actually mark them as such.
API Proposal
namespace System.Globalization
{
    [Flags]
    public partial enum NumberStyles
    {
        AllowTrailingInvalidCharacters = 0x800
    }
}

namespace System.Numerics
{
    public partial interface INumberBase<TSelf>
    {
        public static virtual bool TryParse([NotNullWhen(true)] string? s, NumberStyles style, IFormatProvider? provider, [MaybeNullWhen(false)] out TSelf result, out int charsConsumed);
        public static virtual bool TryParse(ReadOnlySpan<char> s, NumberStyles style, IFormatProvider? provider, [MaybeNullWhen(false)] out TSelf result, out int charsConsumed);
        public static virtual bool TryParse(ReadOnlySpan<byte> s, NumberStyles style, IFormatProvider? provider, [MaybeNullWhen(false)] out TSelf result, out int bytesConsumed);
    }
}

We would then remove api-approved from proposals such as #73842, since they provide no new functionality. Even if this proposal is rejected, we need to determine whether APIs should be exposed on Utf8Formatter, given that it is effectively just a mirror of Utf8Parser and doesn't provide access to any functionality that cannot be achieved using IUtf8SpanFormattable.
Given that implementations are currently required to throw an ArgumentException for unsupported/unrecognized NumberStyles, we can reasonably provide a default implementation, since the only valid values for bytesConsumed/charsConsumed will be "everything" or "nothing". The implementor would be expected to override the default interface method (DIM) if and when they add support for AllowTrailingInvalidCharacters.
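The "everything or nothing" default could look roughly like the sketch below, assuming .NET 7 or later. It is written as a standalone generic helper rather than as the actual DIM, and the name `TryParseWithConsumed` is hypothetical. It forwards to the existing `INumberBase<TSelf>.TryParse` overload and reports either the full input length or zero.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;
using System.Globalization;
using System.Numerics;

bool okAll = TryParseWithConsumed<int>("42", NumberStyles.Integer, CultureInfo.InvariantCulture, out int value, out int consumedAll);
Console.WriteLine($"{okAll} {value} {consumedAll}"); // True 42 2

bool okNone = TryParseWithConsumed<int>("42x", NumberStyles.Integer, CultureInfo.InvariantCulture, out _, out int consumedNone);
Console.WriteLine($"{okNone} {consumedNone}"); // False 0

// Hypothetical helper mirroring what a default implementation might do:
// an implementation that does not understand the new flag can only
// succeed on the whole input or fail on all of it.
static bool TryParseWithConsumed<T>(
    ReadOnlySpan<char> s,
    NumberStyles style,
    IFormatProvider? provider,
    [MaybeNullWhen(false)] out T result,
    out int charsConsumed)
    where T : INumberBase<T>
{
    if (T.TryParse(s, style, provider, out result))
    {
        charsConsumed = s.Length; // "everything"
        return true;
    }

    charsConsumed = 0; // "nothing"
    return false;
}
```

A type that adds real support for AllowTrailingInvalidCharacters would replace this logic with one that stops at the first invalid character and reports the partial count.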
Alternative Names
There are several potential other names that we could choose from, ranging in verbosity and total clarity on what's being allowed:
- AllowTrailingCharacters
- AllowTrailingAny
- AllowInvalidCharacterForEndOfString
- StopOnFirstInvalidCharacter
- etc