KEMBAR78
[API Proposal]: Provide optimized read-only collections · Issue #67209 · dotnet/runtime · GitHub
Skip to content

[API Proposal]: Provide optimized read-only collections #67209

@geeknoid

Description

@geeknoid

EDITED on 10/30/2022 by @stephentoub

Here's what I recommend we do:

namespace System.Collections.Immutable
{
    public static partial class FrozenCollectionExtensions
    {
        public static FrozenDictionary<TKey, TValue> ToFrozenDictionary<TKey, TValue>(
            this IEnumerable<KeyValuePair<TKey, TValue>> source, IEqualityComparer<TKey>? comparer = null)
            where TKey : notnull;

        public static FrozenSet<T> ToFrozenSet<T>(
            this IEnumerable<T> source, IEqualityComparer<T>? comparer = null);
    }

    public abstract class FrozenDictionary<TKey, TValue> :
        IDictionary<TKey, TValue>, IReadOnlyDictionary<TKey, TValue>, IDictionary
        where TKey : notnull
    {
        internal FrozenDictionary(); // not designed for external extensibility

        public static FrozenDictionary<TKey, TValue> Empty { get; }

        public IEqualityComparer<TKey> Comparer { get; }
        public abstract int Count { get; }

        public abstract ImmutableArray<TKey> Keys { get; }
        public abstract ImmutableArray<TValue> Values { get; }

        public bool ContainsKey(TKey key);
        public bool TryGetValue(TKey key, [MaybeNullWhen(false)] out TValue value);
        public ref readonly TValue this[TKey key] { get; }
        public abstract ref readonly TValue GetValueRefOrNullRef(TKey key);

        public void CopyTo(KeyValuePair<TKey, TValue>[] destination, int destinationIndex);
        public void CopyTo(Span<KeyValuePair<TKey, TValue>> destination);

        public abstract Enumerator GetEnumerator();
        public struct Enumerator : IEnumerator<KeyValuePair<TKey, TValue>>
        {
            public readonly KeyValuePair<TKey, TValue> Current { get; }
            public bool MoveNext();
        }
    }

    public abstract class FrozenSet<T> : ISet<T>, ICollection,
#if NET5_0_OR_GREATER
        IReadOnlySet<T>
#else
        IReadOnlyCollection<T>
#endif
    {
        internal FrozenSet(); // not designed for external extensibility

        public static FrozenSet<T> Empty { get; }

        public IEqualityComparer<T> Comparer { get; }
        public abstract int Count { get; }

        public abstract ImmutableArray<T> Items { get; }

        public bool Contains(T item);

        public void CopyTo(Span<T> destination);
        public void CopyTo(T[] destination, int destinationIndex);

        public abstract bool IsProperSubsetOf(IEnumerable<T> other);
        public abstract bool IsProperSupersetOf(IEnumerable<T> other);
        public abstract bool IsSubsetOf(IEnumerable<T> other);
        public abstract bool IsSupersetOf(IEnumerable<T> other);
        public abstract bool Overlaps(IEnumerable<T> other);
        public abstract bool SetEquals(IEnumerable<T> other);

        public abstract Enumerator GetEnumerator();
        public struct Enumerator : IEnumerator<T>
        {
            public readonly T Current { get; }
            public bool MoveNext();
        }
    }
}

A few notes for discussion:

  • Assembly. I've prototyped this to live in System.Collections.Immutable.dll. These types are immutable, differing from the existing immutable collections primarily in that the existing ones are primarily "persistent" and are designed with mutable-like members that return a new instance with the mutation rather than changing the original. Including these in S.C.Immutable.dll has the added bonus that we ship that assembly downlevel, and we'd like for these collections to be widely available.
  • Namespace. I've similarly included the types in the System.Collections.Immutable namespace, but we could choose something else, like System.Collections.Frozen. Note that the public surface area of these types depends on ImmutableArray<T>.
  • Ref-returning Members. One of the design goals here is to enable all means of reading to be efficient; that includes when the values stored in a dictionary are large structs. Thus, as with ReadOnlySpan<T>, the indexer here returns a ref readonly T rather than just a T. There aren't safety concerns here as the underlying storage is immutable. For advanced usage, this also exposes a GetValueRefOrNullRef that mirrors the same method for Dictionary<,> we expose from CollectionsMarshal (but there, we put it on CollectionsMarshal because a growing dictionary can make the ref erroneous, and that's not a problem here).
  • Abstract base types. As with types like Dictionary<,> and HashSet<>, these types are not meant to be extended arbitrarily. However, because they optimize for the exact set of inputs frozen into the collection, it's important to be able to customize the implementation based on the inputs. Thus, the collections are abstract, with a ToFrozenDictionary/Set factory that produces the right internally-derived implementation. Developers consuming the types only concern themselves with the FrozenDictionary<,>/FrozenSet<,> types and the associated extension methods, and the abstractness is just an implementation detail that's required to be public.
  • Extension methods. I've put the two ToFrozenDictionary/Set extension methods on a new FrozenCollectionExtensions type. We could instead introduce two new non-generic FrozenDictionary/Set types, with one extension method each (which is closer to how the existing immutable collections work), or we could put these on to some existing type in the same namespace.
  • Immutable interfaces. I did not make these types implement IImmutableDictionary/Set. While it'd be possible, with the "mutating" methods constructing ImmutableDictionary/Set, it doesn't seem valuable, and also seems potentially misleading.
  • Non-generic interfaces. Dictionary<,> implements IDictionary and SortedSet<> implements ICollection, so I chose to also implement those on FrozenDictionary<,> and FrozenSet<>, respectively.

Background and motivation

There are many scenarios where collections are initialized during startup, or during major state changes in the application, and are not mutated afterwards. Taking advantage of this pattern can lead to substantial improvements in overall access performance of the collections.

Additionally, the overwhelming majority of dictionaries and sets use strings and int as key types. Providing collections that are optimized for these types can also contribute to substantial performance gains relative to the canonical Dictionary<K,V> and HashSet types.

API Proposal

Please see https://github.com/geeknoid/FrozenCollections for an API and implementation which delivers substantial performance gains relative to the built-in .NET collections for long-live read-only collections. The implementation spends extra time during the construction phase of these collections in order to increase lookup performance. The README file shows current performance gains using .NET 6.

The Freezer class provides a number of extension methods which make it a snap to turn an existing dictionary or set into a frozen dictionary or set.

In addition to the freezing functionality, the frozen dictionary API also provides the GetByRef method which makes it possible to get a byref to a value in a frozen dictionary, making dictionaries holding large value types efficient.

API Usage

var d = new Dictionary<string, int>(...);
var frozen = d.Freeze();

Alternative Designs

No response

Risks

No response

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions