Broaden use of SearchValues in TryFindNextPossibleStartingPosition in Regex by stephentoub · Pull Request #89205 · dotnet/runtime · GitHub

@stephentoub
Member

Replaces #89140. That PR changed how we emit custom IndexOf helpers, such that if we could fully enumerate the set and it contained a small enough number of characters, we would have the whole thing vectorized by first doing an ASCII-based search and then falling back to a probabilistic map search. But we then decided to move that approach down into SearchValues itself, via #89155. This means we can simplify TryFindNextPossibleStartingPosition in Regex to not track AsciiSet specially and instead just increase the number of characters we query the set for (from 5 to 128). That way, we'll use SearchValues rather than emitting our own helper up until a (semi-arbitrary) point where we deem it impossible or infeasible to enumerate all the chars that make up the set.
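The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SetSearchSketch`, `TryEnumerateSet`, `BuildFinder`, and `MaxEnumeratedChars` are hypothetical names, and the set is modeled as a simple predicate rather than Regex's internal set representation. Only `SearchValues.Create` and the `IndexOfAny` overload that accepts a `SearchValues<char>` are real .NET 8 APIs.

```csharp
using System;
using System.Buffers;

public static class SetSearchSketch
{
    // Hypothetical cap, mirroring the "from 5 to 128" change in the description.
    private const int MaxEnumeratedChars = 128;

    // Hypothetical stand-in for enumerating a regex set: collects every char
    // matching 'inSet' into 'buffer'. Returns false if the set has more members
    // than the buffer can hold, i.e. it is not feasible to enumerate.
    private static bool TryEnumerateSet(Func<char, bool> inSet, Span<char> buffer, out int count)
    {
        count = 0;
        for (int c = 0; c <= char.MaxValue; c++)
        {
            if (!inSet((char)c))
            {
                continue;
            }

            if (count == buffer.Length)
            {
                return false;
            }

            buffer[count++] = (char)c;
        }

        return true;
    }

    // Builds a function that returns the index of the first char in the set, or -1.
    public static Func<string, int> BuildFinder(Func<char, bool> inSet)
    {
        Span<char> buffer = stackalloc char[MaxEnumeratedChars];

        if (TryEnumerateSet(inSet, buffer, out int count))
        {
            // Small enough to enumerate: let SearchValues choose the search strategy,
            // whether or not every char in the set is ASCII.
            SearchValues<char> values = SearchValues.Create(buffer.Slice(0, count));
            return text => text.AsSpan().IndexOfAny(values);
        }

        // Too many chars to enumerate: fall back to a scalar scan, standing in
        // for the custom helper that Regex would otherwise emit.
        return text =>
        {
            for (int i = 0; i < text.Length; i++)
            {
                if (inSet(text[i]))
                {
                    return i;
                }
            }

            return -1;
        };
    }
}
```

For example, `SetSearchSketch.BuildFinder(c => c == 'a' || c == '\u00E9')` takes the `SearchValues` path even though the set mixes ASCII and non-ASCII chars, while `SetSearchSketch.BuildFinder(char.IsLetter)` exceeds the cap and takes the fallback.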

… Regex

SearchValues has been updated to have an ASCII fast-path for sets of values that are not exclusively ASCII. This means we can simplify TryFindNextPossibleStartingPosition in Regex to not track AsciiSet specially and instead just increase the number of characters we query the set for (from 5 to 128). That way, we'll use SearchValues rather than emitting our own helper up until a (semi-arbitrary) point where we deem it impossible or infeasible to enumerate all the chars that make up the set.
@stephentoub stephentoub added this to the 8.0.0 milestone Jul 19, 2023