-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Description
Background and motivation
We've added new FrozenDictionary/Set types in .NET 8. These types provide a truly immutable veneer, making a copy of all data passed in and exposing only readable surface area. Once you get one of these instances, you can be sure the data won't change out from under you (short of someone using unsafe code, at which point in general all bets are off).
This surface area is valuable for two reasons:
- You can use the APIs in places you need immutability, with no surface area that enables mutation or even suggests it, and the data structures are optimized for reading. This is as opposed to types like ImmutableDictionary/Set, which are persistent data structures optimized for creating derivatives that share as much memory as possible, and as a result have slower access complexity as well as surface area that encourages adding/removing to create new instances derived from the original.
- Because it's read-only, we can optimize the implementations to make reading as fast as possible.
I've recently had discussions with multiple devs who were excited about (1) and looking forward to using these new types in, for example, a UI framework for a client application. Unfortunately, (2) can get in the way of that, as in the limit we can end up spending significant amounts of time during dictionary/set construction finding ways to optimize subsequent reads. This can be a really good tradeoff for a long-running service, where you might be willing to spend additional seconds per collection construction to then get multiple factors of read-speed improvement. But it's a bad tradeoff if Dictionary<>/HashSet<>'s read perf is good enough, you're going to be creating these on paths where speed matters (like client app startup), you're going to be recreating them frequently, etc.
We need to accomodate (1) while making (2) possible, and we should make it a pit of success, such that you get read performance on par which the existing mutable collections while making construction as fast as possible, effectively little more than the copy necessary to ensure the data won't change.
This has another benefit, related to app size and trimming. For (2), we want to have many different concrete internal implementations that we can choose from based on the nature of the supplied data, but that increases app size. If someone cares more about app size, using a separate factory method that only has a single concrete implementation can lead to smaller app sizes.
Contributes to #77891
API Proposal
public static partial class FrozenDictionary
{
public static FrozenDictionary<TKey, TValue> ToFrozenDictionary<TKey, TValue>(this IEnumerable<KeyValuePair<TKey, TValue>> source, IEqualityComparer<TKey>? comparer = null) where TKey : notnull;
+ public static FrozenDictionary<TKey, TValue> ToFrozenDictionary<TKey, TValue>(this IEnumerable<KeyValuePair<TKey, TValue>> source, IEqualityComparer<TKey>? comparer, double readOptimizationLevel) where TKey : notnull;
public static FrozenDictionary<TKey, TSource> ToFrozenDictionary<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey>? comparer = null) where TKey : notnull;
public static FrozenDictionary<TKey, TElement> ToFrozenDictionary<TSource, TKey, TElement>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, Func<TSource, TElement> elementSelector, IEqualityComparer<TKey>? comparer = null) where TKey : notnull;
}
public static partial class FrozenSet
{
public static FrozenSet<T> ToFrozenSet<T>(this IEnumerable<T> source, IEqualityComparer<T>? comparer = null);
+ public static FrozenSet<T> ToFrozenSet<T>(this IEnumerable<T> source, IEqualityComparer<T>? comparer, double readOptimizationLevel);
}- The
readOptimizationLevelcan be in the range [0.0, 1.0], where 0.0 is to not spend any additional time optimizing subsequent reads, and 1.0 is to spend as much time as is needed to best optimize subsequent reads. - The existing overloads behave as if 0.0 is specified, but as an implementation detail in favor of trimming wouldn't just delegate to the larger overload. This is also why it's not just another optional parameter (if we decided we didn't care about trimming, it could just be an additional optional parameter on each of the existing overloads).
API Usage
class MyClass
{
private static readonly FrozenSet<string> s_lookup = ReadConfig().ToFrozenSet(comparer: null, readOptimizationLevel: 1.0); // heavily optimized at construction for subsequent reads
private readonly FrozenSet<int> _data;
public MyClass(IEnumerable<int> data) => _data = data.ToFrozenSet(); // HashSet<int>-like read perf and construction time
public IReadOnlySet<int> Data => _data; // no concern about downcasting and mutation
public bool Exists(string value) => s_lookup.Contains(value); // fastest possible read speed
}Alternative Designs
No response
Risks
No response