Faster optimized frozen dictionary creation (5/n) #87960
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
With all of the recent performance improvements, around 50% of the time for creating frozen dictionaries/sets is now spent in
FrozenHashTable.CalcNumBuckets
.The rest is now optimal, so my last effort was on optimizing
FrozenHashTable.CalcNumBuckets
.To understand my approaches it's crucial to understand what the method is supposed to do and how it was achieving this goal so far.
The method tries to find a prime number of buckets that provide less than 5% collision rate for unique hash codes.
The first thing it does, it creates a hash set of integers and iterates over all hash codes to filter out the duplicates. It's important because no increase in buckets will avoid collisions from duplicate input hash codes.
Once we have the number of unique hash codes, it iterates over a precomputed array of prime numbers and find first prime number than is bigger than the unique hash code count.
runtime/src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/FrozenHashTable.cs
Line 186 in 04133eb
Then it chooses a multiplier for the max bucket count, which is 16 for small bucket count (1_000) and 3 for larger inputs.
runtime/src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/FrozenHashTable.cs
Lines 196 to 200 in 04133eb
Then it sets the best number of collisions found so far to unique hash codes count, which means "so far the best result we found has given us 100% collision rate".
runtime/src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/FrozenHashTable.cs
Line 219 in 04133eb
And finally it iterates over all of the prime numbers from selected range, and in the nested loop it iterates over every unique hash code and marks buckets as seen.
runtime/src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/FrozenHashTable.cs
Line 223 in 04133eb
runtime/src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/FrozenHashTable.cs
Line 235 in 04133eb
When it finds a bucket that was already visited, it increases collision count.
When the collision count becomes equal to the best known collision count, it bails and continues the search for next prime number.
runtime/src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/FrozenHashTable.cs
Lines 238 to 245 in 04133eb
When none of the prime numbers meet our 5% collision rate criteria, it returns the bucket count with the lowest collision count found for all prime numbers.
The approaches that I've tried:
In the table below you can see perf numbers for #87688 as baseline, the mentioned loop reverse and proposed change: