Description
Updated by @MihaZupan on 2024-01-15.
Background and Motivation
In low-allocation scenarios, it may be useful to support escaping URI data into an existing Span<char> buffer, rather than allocating a new string.
As an example, see this performance improvement suggestion for Elasticsearch.NET. The proposed change removes allocations on a hot path for query string building. It uses the existing Uri.EscapeDataString method, which allocates a new string only when the input contains characters that need encoding. Even that allocation could be avoided if I were able to encode directly into the buffer rented from the ArrayPool.
Proposed API
namespace System
{
    partial class Uri
    {
        // Existing
        public static string EscapeDataString(string stringToEscape);
        public static string UnescapeDataString(string stringToUnescape);

        // New
        public static string EscapeDataString(ReadOnlySpan<char> charsToEscape);
        public static bool TryEscapeDataString(ReadOnlySpan<char> charsToEscape, Span<char> destination, out int charsWritten);
        public static string UnescapeDataString(ReadOnlySpan<char> charsToUnescape);
        public static bool TryUnescapeDataString(ReadOnlySpan<char> charsToUnescape, Span<char> destination, out int charsWritten);
    }
}
Usage Examples
The foreach loop from the linked issue could then be rewritten along the lines of:
foreach (var key in nv.AllKeys)
{
    if (position != 1)
        bufferSpan[position++] = '&';

    // assume we are sure we have space in the destination
    if (Uri.TryEscapeDataString(key, bufferSpan.Slice(position), out var charsWritten))
        position += charsWritten;

    var value = nv[key];
    if (!value.IsNullOrEmpty())
    {
        bufferSpan[position++] = '=';
        if (Uri.TryEscapeDataString(value, bufferSpan.Slice(position), out charsWritten))
            position += charsWritten;
    }
}
In cases where the keys/values contain reserved characters, this avoids the temporary string allocations.
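The loop above assumes the destination is always large enough. When that cannot be guaranteed, the false return of the Try method has to be handled. The following is a minimal sketch of one way to do that, assuming the proposed TryEscapeDataString overload is available with the signature shown in the Proposed API section. The EscapeSketch class, the EscapeWithPooledBuffer helper, and the initial size heuristic are hypothetical names and choices made for illustration only.

```csharp
using System;
using System.Buffers;

static class EscapeSketch
{
    // Hypothetical helper (not part of the proposal): escapes `value` into a
    // rented buffer and retries with a larger one if the destination is too small.
    public static string EscapeWithPooledBuffer(ReadOnlySpan<char> value)
    {
        // Initial guess only; an assumption of this sketch, not a guaranteed bound.
        int size = Math.Max(16, value.Length * 3);

        while (true)
        {
            char[] rented = ArrayPool<char>.Shared.Rent(size);
            try
            {
                // Assumes the proposed overload: returns false when `rented` is too small.
                if (Uri.TryEscapeDataString(value, rented, out int charsWritten))
                    return new string(rented, 0, charsWritten);
            }
            finally
            {
                // Always hand the buffer back, on success and on failure.
                ArrayPool<char>.Shared.Return(rented);
            }

            size *= 2; // destination was too small: grow and try again
        }
    }
}
```

The sketch materializes a string only to keep the example short. On a real hot path, the escaped characters would stay in the pooled buffer, as in the loop above.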
Alternative Designs
- The existing method names contain string. Is it okay for the new span-based overloads to share the same name for better discoverability, or should we consider alternative names, e.g. EscapeData/EscapeDataChars?
- The above design follows other Span<T>-based overloads I have seen in common use and assumes that the operation could fail if the destination is too small. An alternative would be to throw in that case and instead apply the following signature (where the return value indicates the chars written):
  public static int EscapeDataString(ReadOnlySpan<char> stringToEscape, Span<char> destination);
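To show how the two shapes relate, the throwing alternative can be sketched as a thin wrapper over the Try method. This assumes the proposed TryEscapeDataString overload is available as written. The ThrowingEscapeSketch class, the EscapeDataOrThrow name, and the exception type are hypothetical choices made for illustration and are not part of the proposal.

```csharp
using System;

static class ThrowingEscapeSketch
{
    // Hypothetical wrapper: returns the number of chars written and throws,
    // rather than returning false, when the destination is too small.
    public static int EscapeDataOrThrow(ReadOnlySpan<char> charsToEscape, Span<char> destination)
    {
        if (!Uri.TryEscapeDataString(charsToEscape, destination, out int charsWritten))
            throw new ArgumentException("Destination is too small.", nameof(destination));

        return charsWritten;
    }
}
```

A caller that has already sized the buffer gets slightly simpler code with this shape, while the Try shape lets callers grow the buffer without paying for an exception.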