<regex>: Remove non-standard _Uelem from matcher
#5671
StephanTLavavej merged 8 commits into microsoft:main from muellerj2:regex-remove-uelem-from-matcher on Aug 16, 2025
StephanTLavavej approved these changes on Aug 11, 2025
I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.
😻 🎉 😸
Resolves #995 by eliminating `_Uelem` completely.

This PR does not quite complete support for custom character types: The internals of `regex_search` still place some additional requirements on such types. But except for ADL resilience, the support should be complete for `regex_match`, as the passing test confirms.

`<regex>` changes
- Remove `_Uelem` from `_Is_word` by converting to `unsigned char` instead, combined with a check that re-conversion to the character type yields the same character again.
- `_STD`-qualify some `_Is_word` calls.
- Remove `_Uelem` from `_Lookup_range` by relying on the `lt()` function in the char traits type instead.
- […] `char` and `wchar_t` that avoid calling the `lt()` function for the standard traits type.
- Remove `_Uelem` from `_Do_class()` by converting to `unsigned char` instead and checking that re-conversion to the character type yields the same character again.
- […] `unsigned int` before comparing with the line terminator code points.
- […] `unsigned int`.
- […] `char` and `wchar_t`, but in this case rather because their logic is noticeably simpler.

Test changes
The test coverage for custom character types largely remains rudimentary, but I added some test coverage that matching behaves as it should in places where the parser and matcher rely on potentially narrowing conversions.

The test covers the matcher changes in this PR: word boundaries (`_Is_word`), single character matching (`_Matcher2::_Do_class`), character ranges (`_Lookup_range`) and line terminators (`_Is_ecmascript_line_terminator`). Additionally, some of the character ranges are chosen to validate that the implementation of `_Builder::_Add_range` remains correct for custom character types.

Beyond this, I made the following changes to tests:
- […] `<xstring>`: Suppress code analysis warning C6510 for `basic_string` #5563. `wrapped_wchar` was turned into a template `wrapped_character<Elem>` so that it can be used with an `unsigned long long` character type as well.
- I removed `operator wchar_t()` from `wrapped_character<Elem>` and replaced it by friend functions `convert_to<target_type>`, which are now called by the test traits classes. […] `<regex>` as it removes implicit conversions from the custom types that only existed to support the implementation of the test traits classes.
- […] `unsigned long long`-like character type.
- `transform_primary` in the test regex traits was replaced by a dummy implementation to allow the test to run under /clr:pure.
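
For readers unfamiliar with the two techniques named under the `<regex>` changes, here is a minimal sketch of the idea, not the code in this PR. It assumes the character type supports `static_cast` to and from `unsigned char` and `operator==`. The names `narrow_if_lossless`, `in_closed_range` and `ull_traits` are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical helper: narrow a character to unsigned char and report whether
// converting back yields the same character, i.e. whether nothing was lost.
template <class Elem>
bool narrow_if_lossless(const Elem ch, unsigned char& narrowed) {
    narrowed = static_cast<unsigned char>(ch);
    return static_cast<Elem>(narrowed) == ch;
}

// Hypothetical range test that orders characters only through the traits'
// lt() function instead of converting them to an unsigned integer type.
template <class Traits>
bool in_closed_range(const typename Traits::char_type ch, const typename Traits::char_type first,
    const typename Traits::char_type last) {
    return !Traits::lt(ch, first) && !Traits::lt(last, ch);
}

// Hypothetical minimal traits for an unsigned long long character type;
// only the members used by the sketch are provided.
struct ull_traits {
    using char_type = unsigned long long;

    static bool lt(const char_type left, const char_type right) noexcept {
        return left < right;
    }
};
```

With these definitions, `narrow_if_lossless(L'a', n)` succeeds and stores `'a'` in `n`, while `narrow_if_lossless(0x100000061ULL, n)` fails because the value narrows to `0x61` and no longer compares equal after re-conversion. That second case is the kind of potentially narrowing conversion the new test is meant to exercise.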