-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Add AVX2 support to IndexOfAnyValues #78863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @dotnet/area-system-memory Issue DetailsEasier to review while ignoring whitespaceThis change makes
Chars
Bytes
|
7522873
to
56059b6
Compare
Completely independent, I just happened to have worked on both at a similar point in time. |
Any thoughts on this PR @dotnet/area-system-memory? |
Sorry, this has been on my TODO list. Can you provide a similar explanation to what you did here: #78861 (comment)? |
This file ( This PR effectively copy-pastes the existing Taking if (searchSpaceLength > 16)
{
// Process the input with 2 Vector128s (16 chars) at a time.
}
// Process the last 1-16 chars with 2 overlapped Vector128s.
return NotFound; and after this PR, it's written as if (searchSpaceLength > 16)
{
if (Avx2.IsSupported)
{
if (searchSpaceLength > 32)
{
// Process the input with 2 Vector256s (32 chars) at a time.
}
// Process the last 1-32 chars with 2 overlapped Vector256s.
return NotFound;
}
else
{
// Process the input with 2 Vector128s (16 chars) at a time.
}
}
// Process the last 1-16 chars with 2 overlapped Vector128s.
return NotFound; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and thank you for the explanation. I assume all the helper methods are just copy pasted from the Vector128 implementations?
{ | ||
private const int MaxNeedleLength = 10; | ||
private const int MaxHaystackLength = 40; | ||
private const int MaxHaystackLength = 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this number come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are used by the test when running a few million random inputs through the implementation to check random edge cases.
This number should be large enough so that we test all sorts of code paths - e.g. 100 is enough for
// Process the input with 2 Vector256s (32 chars) at a time.
+
// Process the input with 2 Vector256s (32 chars) at a time.
+
// Process the last 1-32 chars with 2 overlapped Vector256s.
so if the "2 Vector256s at a time" step left something broken, we'd catch it.
On the other hand it doesn't make sense to increase it too much as we'd just end up testing fewer variants of short inputs - and there's nothing in the code that would behave fundamentally differently for inputs of 1k vs 10k characters.
Yes*, with the exception that we call this |
The build failure is known according to Build Analysis |
Easier to review while ignoring whitespace
This change makes
IndexOfAnyValues
noticeably faster than a regularIndexOf(char)
without #78861 :)Chars
Bytes