KEMBAR78
Use IndexOfAny{Except}InRange in RegexCompiler / source generator by stephentoub · Pull Request #76859 · dotnet/runtime · GitHub
Skip to content

Conversation

@stephentoub
Copy link
Member

Depends on #76803.

This augments our existing use of IndexOf, IndexOfAny, and IndexOfAnyExcept to also support IndexOfAnyInRange and IndexOfAnyExceptInRange. That means, for example, we can now efficiently find the start of a pattern like [0-9]{5}, via a vectorized search, whereas previously it'd require iterating character by character in a scalar loop.

As part of this, I changed some tuples to instead be named structs. They were becoming unwieldy, and we expect we'll be adding even more here as additional IndexOf variants become available.

@stephentoub stephentoub added this to the 8.0.0 milestone Oct 11, 2022
@stephentoub stephentoub requested a review from joperezr October 11, 2022 04:58
@ghost ghost assigned stephentoub Oct 11, 2022
@ghost
Copy link

ghost commented Oct 11, 2022

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

Depends on #76803.

This augments our existing use of IndexOf, IndexOfAny, and IndexOfAnyExcept to also support IndexOfAnyInRange and IndexOfAnyExceptInRange. That means, for example, we can now efficiently find the start of a pattern like [0-9]{5}, via a vectorized search, whereas previously it'd require iterating character by character in a scalar loop.

As part of this, I changed some tuples to instead be named structs. They were becoming unwieldy, and we expect we'll be adding even more here as additional IndexOf variants become available.

Author: stephentoub
Assignees: -
Labels:

area-System.Text.RegularExpressions, tenet-performance

Milestone: 8.0.0

Copy link
Member

@joperezr joperezr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the tests we have currently exercise this code already for both source generator and compiled engine.

Left a few nits, but this otherwise LGTM assuming CI will be green once the change in System.Memory goes in.

@joperezr
Copy link
Member

BTW, just curious, did you happen to run our perf tests to measure any gains that we might have gotten from this?

@stephentoub stephentoub added the blocked Issue/PR is blocked on something - see comments label Oct 12, 2022
This augments our existing use of IndexOf, IndexOfAny, and IndexOfAnyExcept to also support IndexOfAnyInRange and IndexOfAnyExceptInRange.  That means, for example, we can now efficiently find the start of a pattern like `[0-9]{5}`, via a vectorized search, whereas previously it'd require iterating character by character in a scalar loop.

As part of this, I changed some tuples to instead be named structs.  They were becoming unwieldy, and we expect we'll be adding even more here as additional IndexOf variants become available.
And add a bit more test coverage
@stephentoub
Copy link
Member Author

BTW, just curious, did you happen to run our perf tests to measure any gains that we might have gotten from this?

I didn't. I don't think any of our perf tests have patterns impacted by this. I'll pay attention to any improvements/regression reports, though, in case one slips through.

@stephentoub
Copy link
Member Author

I assume the tests we have currently exercise this code already for both source generator and compiled engine.

Yup. Though I added a few more tests to cover a few gaps.

@AndyAyersMS
Copy link
Member

@ghost ghost locked as resolved and limited conversation to collaborators Dec 3, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants