KEMBAR78
`<regex>`: Compare against `char` directly by StephanTLavavej · Pull Request #5675 · microsoft/STL · GitHub
Skip to content

Conversation

@StephanTLavavej
Copy link
Member

@StephanTLavavej StephanTLavavej commented Aug 13, 2025

Fixes #5673 and fixes VSO-2548012 in conjunction with small Build2 changes.

Build2 is doing something extremely complicated and fragile with user-defined character types and regex. When we construct user-defined _Elem from ordinary chars, that triggers assertions in their machinery, even though the resulting homogeneous comparisons between _Elem and _Elem are arguably more cromulent.

This PR slightly backs off from the purity of such comparisons, resulting in code that's somewhat closer to our old way of doing things, but still avoiding the internal _Uelem machinery. Now we directly compare _Elem to char. Build2 has heterogeneous comparison operators that handle this.

It will still be necessary to update Build2 to round-trip their user-defined character type instead of asserting, but that's a less invasive change: build2/build2#478

This makes no difference to the vast, vast majority of users who simply use char or wchar_t.

@StephanTLavavej StephanTLavavej added the bug Something isn't working label Aug 13, 2025
@StephanTLavavej StephanTLavavej requested a review from a team as a code owner August 13, 2025 05:05
@StephanTLavavej StephanTLavavej added the regex meow is a substring of homeowner label Aug 13, 2025
@github-project-automation github-project-automation bot moved this to Initial Review in STL Code Reviews Aug 13, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Work In Progress in STL Code Reviews Aug 13, 2025
Copy link
Member

@davidmrdavid davidmrdavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just some questions

Copy link
Contributor

@muellerj2 muellerj2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as well.

I also briefly checked the remaining static_cast's to _Elem in <regex>, but these are either meaningful round-trip casts or they appear in instances where we can't avoid casting to _Elem at some point (at least as long as we don't want to support heterogeneous char_traits or regex_traits implementations, which even Build2 does not provide.)

@StephanTLavavej StephanTLavavej moved this from Work In Progress to Ready To Merge in STL Code Reviews Aug 13, 2025
@StephanTLavavej
Copy link
Member Author

@davidmrdavid @muellerj2 I've pushed changes to the test, improving the casts and centralizing the operators. Thanks for the reviews! 😻

@StephanTLavavej StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews Aug 15, 2025
@StephanTLavavej
Copy link
Member Author

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej
Copy link
Member Author

I've pushed a non-trivial merge with main, reverse-mirroring what I did in MSVC-PR-662263.

This fixes the conflict in GH_000995_regex_custom_char_types with #5671 (which had moved and altered code in the test). Now, signed_wchar_enum and ullong_enum are gaining comparison operators through the same technique, wrapped_character's operators have been adjusted now that it's templated on Elem, and they're all implemented in terms of each other. (I suppose I was ultimately persuaded that it's the simpler thing to do, thanks for the review comment.)

@StephanTLavavej StephanTLavavej merged commit a2686a0 into microsoft:main Aug 16, 2025
39 checks passed
@github-project-automation github-project-automation bot moved this from Merging to Done in STL Code Reviews Aug 16, 2025
@StephanTLavavej StephanTLavavej deleted the regex-char-comparisons branch August 16, 2025 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working regex meow is a substring of homeowner

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

<regex>: After _Uelem changes, Build2 asserts in the Real World Code test suite

3 participants