KEMBAR78
Use Span to fill List<T> in more ToList scenarios by brantburnett · Pull Request #86796 · dotnet/runtime · GitHub
Skip to content

Conversation

@brantburnett
Copy link
Contributor

@brantburnett brantburnett commented May 26, 2023

Adds span-based filling of List<T> during calls to ToList using CollectionsMarshal.SetCount to the following additional scenarios:

  • ToList of a Union or Distinct
  • ToList of a Lookup<TKey, TElement> including cases within grouping
  • ToList of a Select projection against an IPartition (i.e. sorted)
  • ToList of an OrderedEnumerable returning a single element (i.e. Skip/Take applied to a sorted list for a single result)
  • ToList of Append and Prepend scenarios, mostly multiple appends/prepends

Secondary improvements:

  • ToArray of an empty Lookup<TKey, TElement> now uses Array.Empty
  • ToList of multiple Prepend operations no longer allocates an intermediate array

Fixes #80760

@ghost ghost added area-System.Collections community-contribution Indicates that the PR has been added by a community member labels May 26, 2023
@ghost
Copy link

ghost commented May 26, 2023

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

Adds span-based filling of List<T> during calls to ToList using CollectionsMarshal.SetCount to the following additional scenarios:

  • ToList of a Union or Distinct
  • ToList of a Lookup<TKey, TElement> including cases within grouping
  • ToList of a Select projection against an IPartition (i.e. sorted)
  • ToList of an OrderedEnumerable returning a single element (i.e. Skip/Take applied to a sorted list for a single result)
  • ToList of a DefaultIfEmpty where the source is empty and an IIListProvider
  • ToList of Append and Prepend scenarios

Secondary improvements:

  • ToArray of an empty Lookup<TKey, TElement> now uses Array.Empty

Fixes #80760

Author: brantburnett
Assignees: -
Labels:

area-System.Collections

Milestone: -

@stephentoub
Copy link
Member

stephentoub commented May 26, 2023

Thanks. Can you share some benchmarks that demonstrate these improvements are meaningful, in particular in the places that are requiring a lot of code to be added, like in AppendPrepend1Iterator?

@brantburnett
Copy link
Contributor Author

Thanks. Can you share some benchmarks that demonstrate these improvements are meaningful, in particular in the places that are requiring a lot of code to be added, like in AppendPrepend1Iterator?

Upon more careful review, the added complications in Append/Prepend were mostly not worth it. Missing the optimizations in List<T>.AddRange was degrading performance when the source was an ICollection<T>, such as an array. I've reworked this to be much simpler and it still gets some modest gains, especially on multiple Append.

I'll add benchmarks on some of the other changes shortly.

BenchmarkDotNet=v0.13.2.2052-nightly, OS=Windows 11 (10.0.22621.1776)
Intel Core i7-10850H CPU 2.70GHz, 1 CPU, 12 logical and 6 physical cores
.NET SDK=8.0.100-preview.4.23260.5
[Host] : .NET 8.0.0 (8.0.23.25905), X64 RyuJIT AVX2
Current : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
New : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2

PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:EnableUnsafeBinaryFormatterSerialization=true IterationTime=250.0000 ms
MaxIterationCount=20 MinIterationCount=15 WarmupCount=1

Method Job Mean Error StdDev Median Min Max Ratio RatioSD Gen0 Gen1 Allocated Alloc Ratio
AppendSingleToEmptyArray_ToList Current 45.65 ns 0.258 ns 0.228 ns 45.58 ns 45.38 ns 46.12 ns 1.00 0.00 0.0190 - 120 B 1.00
AppendSingleToEmptyArray_ToList New 35.32 ns 0.243 ns 0.228 ns 35.22 ns 35.06 ns 35.79 ns 0.77 0.00 0.0190 - 120 B 1.00
AppendThreeToArray_ToList Current 115.45 ns 0.436 ns 0.364 ns 115.59 ns 114.95 ns 116.10 ns 1.00 0.00 0.1313 - 824 B 1.00
AppendThreeToArray_ToList New 101.24 ns 0.252 ns 0.210 ns 101.19 ns 100.93 ns 101.71 ns 0.88 0.00 0.1249 - 784 B 0.95
AppendThreeToEnumerable_ToList Current 366.20 ns 8.505 ns 9.794 ns 360.39 ns 356.33 ns 389.72 ns 1.00 0.00 0.2489 - 1568 B 1.00
AppendThreeToEnumerable_ToList New 340.89 ns 2.360 ns 2.092 ns 341.43 ns 336.62 ns 343.84 ns 0.92 0.03 0.2425 0.0014 1528 B 0.97
PrependSingleToEmptyArray_ToList Current 44.10 ns 0.893 ns 0.836 ns 43.74 ns 42.91 ns 45.72 ns 1.00 0.00 0.0191 - 120 B 1.00
PrependSingleToEmptyArray_ToList New 34.52 ns 0.080 ns 0.067 ns 34.53 ns 34.41 ns 34.64 ns 0.78 0.02 0.0191 - 120 B 1.00
PrependThreeToArray_ToList Current 104.97 ns 0.147 ns 0.130 ns 104.96 ns 104.71 ns 105.15 ns 1.00 0.00 0.1247 - 784 B 1.00
PrependThreeToArray_ToList New 101.45 ns 0.219 ns 0.183 ns 101.42 ns 101.22 ns 101.79 ns 0.97 0.00 0.1249 - 784 B 1.00
PrependThreeToEnumerable_ToList Current 356.76 ns 7.159 ns 7.352 ns 355.84 ns 344.37 ns 369.00 ns 1.00 0.00 0.2424 - 1528 B 1.00
PrependThreeToEnumerable_ToList New 343.00 ns 0.533 ns 0.473 ns 343.09 ns 342.02 ns 343.86 ns 0.96 0.02 0.2433 0.0014 1528 B 1.00

@brantburnett
Copy link
Contributor Author

Here are some benchmarks of some other cases.

BenchmarkDotNet=v0.13.2.2052-nightly, OS=Windows 11 (10.0.22621.1776)
Intel Core i7-10850H CPU 2.70GHz, 1 CPU, 12 logical and 6 physical cores
.NET SDK=8.0.100-preview.4.23260.5
[Host] : .NET 8.0.0 (8.0.23.25905), X64 RyuJIT AVX2
Current : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
New : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2

PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:EnableUnsafeBinaryFormatterSerialization=true IterationTime=250.0000 ms
MaxIterationCount=20 MinIterationCount=15 WarmupCount=1

Method Job input Mean Error StdDev Median Min Max Ratio RatioSD
Distinct_ToList Current Array 1,030.0 ns 16.17 ns 15.12 ns 1,026.4 ns 1,007.4 ns 1,059.2 ns 1.00 0.00
Distinct_ToList New Array 904.0 ns 8.63 ns 7.21 ns 902.9 ns 890.5 ns 918.3 ns 0.88 0.01
Distinct_ToList Current IEnumerable 1,500.5 ns 49.13 ns 56.58 ns 1,486.0 ns 1,437.1 ns 1,622.8 ns 1.00 0.00
Distinct_ToList New IEnumerable 1,402.4 ns 10.73 ns 8.96 ns 1,403.7 ns 1,387.5 ns 1,420.4 ns 0.92 0.03
Distinct_ToList Current IList 919.4 ns 6.54 ns 5.11 ns 919.9 ns 909.1 ns 927.4 ns 1.00 0.00
Distinct_ToList New IList 867.8 ns 10.52 ns 9.32 ns 864.0 ns 857.5 ns 887.2 ns 0.94 0.01
Distinct_ToList Current List 950.9 ns 17.87 ns 19.12 ns 943.2 ns 922.2 ns 991.8 ns 1.00 0.00
Distinct_ToList New List 886.4 ns 5.57 ns 4.94 ns 885.8 ns 880.3 ns 897.9 ns 0.93 0.02
GroupBy_ToList Current Array 1,942.0 ns 50.05 ns 57.64 ns 1,914.4 ns 1,862.5 ns 2,038.9 ns 1.00 0.00
GroupBy_ToList New Array 1,840.7 ns 34.90 ns 32.64 ns 1,831.2 ns 1,798.7 ns 1,898.6 ns 0.94 0.03
GroupBy_ToList Current IEnumerable 1,863.5 ns 15.90 ns 12.42 ns 1,858.5 ns 1,850.1 ns 1,889.9 ns 1.00 0.00
GroupBy_ToList New IEnumerable 1,854.5 ns 51.87 ns 59.73 ns 1,844.6 ns 1,782.0 ns 1,973.5 ns 0.99 0.03
GroupBy_ToList Current IList 1,841.0 ns 7.91 ns 7.40 ns 1,841.0 ns 1,829.5 ns 1,853.3 ns 1.00 0.00
GroupBy_ToList New IList 1,796.9 ns 5.67 ns 5.31 ns 1,796.0 ns 1,788.1 ns 1,805.2 ns 0.98 0.01
GroupBy_ToList Current List 1,887.6 ns 46.44 ns 53.48 ns 1,916.0 ns 1,790.1 ns 1,938.7 ns 1.00 0.00
GroupBy_ToList New List 1,767.3 ns 19.99 ns 17.72 ns 1,761.3 ns 1,741.7 ns 1,805.6 ns 0.92 0.02
OrderSelect_ToList Current Array 552.8 ns 9.48 ns 8.40 ns 550.3 ns 540.3 ns 575.5 ns 1.00 0.00
OrderSelect_ToList New Array 521.4 ns 15.42 ns 17.76 ns 513.1 ns 502.0 ns 544.8 ns 0.95 0.04
OrderSelect_ToList Current IEnumerable 848.0 ns 21.32 ns 23.70 ns 835.6 ns 820.8 ns 894.6 ns 1.00 0.00
OrderSelect_ToList New IEnumerable 822.6 ns 7.27 ns 6.07 ns 820.5 ns 815.5 ns 831.9 ns 0.97 0.03
OrderSelect_ToList Current IList 520.7 ns 2.68 ns 2.51 ns 520.2 ns 517.5 ns 525.3 ns 1.00 0.00
OrderSelect_ToList New IList 478.6 ns 2.94 ns 2.45 ns 478.6 ns 474.6 ns 483.3 ns 0.92 0.01
OrderSelect_ToList Current List 520.8 ns 8.18 ns 7.25 ns 517.9 ns 514.3 ns 541.3 ns 1.00 0.00
OrderSelect_ToList New List 478.7 ns 9.21 ns 9.04 ns 477.1 ns 466.7 ns 492.3 ns 0.92 0.02

@brantburnett brantburnett marked this pull request as ready for review May 28, 2023 20:37
brantburnett and others added 3 commits June 5, 2023 13:37
@brantburnett
Copy link
Contributor Author

Are there any further changes required on this?

@eiriktsarpalis eiriktsarpalis added this to the 9.0.0 milestone Aug 14, 2023
Copy link
Member

@eiriktsarpalis eiriktsarpalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @brantburnett, apologies for the delay in review. Before we get this merged, any chance you could remove the singleton optimization since it doesn't appear to be contributing substantially?

@brantburnett
Copy link
Contributor Author

LGTM @brantburnett, apologies for the delay in review. Before we get this merged, any chance you could remove the singleton optimization since it doesn't appear to be contributing substantially?

Sure, I'll try to get it done today.

@eiriktsarpalis eiriktsarpalis added the needs-author-action An issue or pull request that requires more info or actions from the author. label Oct 27, 2023
@ghost ghost removed the needs-author-action An issue or pull request that requires more info or actions from the author. label Oct 27, 2023
Copy link
Member

@eiriktsarpalis eiriktsarpalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Nov 30, 2023
@brantburnett brantburnett deleted the tolist_spanfill branch February 15, 2024 22:43
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-System.Collections community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

IListProvider<T>.ToList() implementations use list.Add(element) instead of direct array initialization.

3 participants