KEMBAR78
Win8 baseline: Improve STL Hardening codegen size by StephanTLavavej · Pull Request #5433 · microsoft/STL · GitHub
Skip to content

Conversation

@StephanTLavavej
Copy link
Member

@StephanTLavavej StephanTLavavej commented Apr 24, 2025

🗺️ Overview

  • Will be merged after Win8 baseline: Drop support for Win7 / Server 2008 R2 #5432.
  • Followup to STL Hardening #5274.
  • For MSVC, this uses __fastfail:
    • "Support for the native fast fail mechanism began in Windows 8."
    • I am manually declaring the intrinsic because it's currently declared by the large <intrin.h>. We generally try to avoid duplicating intrinsic declarations (due to calling convention madness) but in this one case I think it's fine. Because <yvals.h> is very central and special, moving the declaration into <intrin0.h> might still be overly impactful.
    • I also need to hardcode FAST_FAIL_INVALID_ARG because there's no way we can drag in <winnt.h>.
  • For Clang, this uses __builtin_verbose_trap:
    • This is the same mechanism used by libc++ Hardening.
    • It's cool in that it emits a single instruction, but when debugging info is enabled, it records the reason without bloating the codegen itself. Conveniently, we're already passing messages down to this layer.
  • With STL Hardening enabled, I measured v[idx] on x64 with /O2:
    • This reduced codegen size from 56 bytes to 29 bytes by avoiding the _invoke_watson() function call.
    • Clang codegen was similarly improved.
  • In <ranges>, we need to avoid a Clang error: "argument to __builtin_verbose_trap must be a pointer to a constant string".
  • We need to work around VSO-2457624 "/clr /std:c++20 silent bad codegen with the __fastfail intrinsic".

📜 Changelog

  • The STL no longer supports targeting Windows 7 and Server 2008 R2:

…tin_verbose_trap`.

+ MSVC [`__fastfail`](https://learn.microsoft.com/en-us/cpp/intrinsics/fastfail?view=msvc-170): "Support for the native fast fail mechanism began in Windows 8."
+ Clang [`__builtin_verbose_trap`](https://clang.llvm.org/docs/LanguageExtensions.html#builtin-verbose-trap)
+ I measured `v[idx]` on x64 with `/O2`, and this reduced codegen size from 56 bytes to 29 bytes. Clang codegen was similarly improved.
+ In `<ranges>`, we need to avoid a Clang error: argument to `__builtin_verbose_trap` must be a pointer to a constant string
@StephanTLavavej StephanTLavavej added the performance Must go faster label Apr 24, 2025
@StephanTLavavej StephanTLavavej requested a review from a team as a code owner April 24, 2025 23:46
@github-project-automation github-project-automation bot moved this to Initial Review in STL Code Reviews Apr 24, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Final Review in STL Code Reviews Apr 25, 2025
@StephanTLavavej
Copy link
Member Author

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej
Copy link
Member Author

I pushed a commit to work around /clr silent bad codegen.

@StephanTLavavej StephanTLavavej moved this from Final Review to Merging in STL Code Reviews Apr 29, 2025
@StephanTLavavej StephanTLavavej merged commit b6bf6d9 into microsoft:main Apr 29, 2025
39 checks passed
@StephanTLavavej StephanTLavavej deleted the doom-2-fastfail branch April 29, 2025 11:29
@github-project-automation github-project-automation bot moved this from Merging to Done in STL Code Reviews Apr 29, 2025
@StephanTLavavej StephanTLavavej mentioned this pull request May 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Must go faster

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants