KEMBAR78
Handle side-effects in impBoxPatternMatch by EgorBo · Pull Request #90496 · dotnet/runtime · GitHub
Skip to content

Conversation

@EgorBo
Copy link
Member

@EgorBo EgorBo commented Aug 13, 2023

Closes #90492
Contributes to #9120

The main motivation is to get rid of [MethodImpl(MethodImplOptions.AggressiveOptimization)] from AsyncTaskMethodBuilderT.cs it was placed there to avoid unnecessary allocation Tier0 used to produce while Tier-FullOpts did not. New tier0 codegen for that method:

; Assembly listing for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Threading.Tasks.VoidTaskResult]:AwaitUnsafeOnCompleted[System.Runtime.CompilerServices.TaskAwaiter](byref,System.Runtime.CompilerServices.IAsyncStateMachineBox) (Tier0)

G_M46438_IG01:  ;; offset=0x0000
       push     rbp
       sub      rsp, 80
       lea      rbp, [rsp+0x50]
       vxorps   xmm4, xmm4, xmm4
       vmovdqa  xmmword ptr [rbp-0x30], xmm4
       vmovdqa  xmmword ptr [rbp-0x20], xmm4
       vmovdqa  xmmword ptr [rbp-0x10], xmm4
       mov      bword ptr [rbp+0x10], rcx
       mov      gword ptr [rbp+0x18], rdx
						;; size=37 bbWeight=1 PerfScore 10.08
G_M46438_IG02:  ;; offset=0x0025
       xor      ecx, ecx
       mov      gword ptr [rbp-0x08], rcx
       mov      rcx, bword ptr [rbp+0x10]
       mov      rcx, gword ptr [rcx]
       mov      gword ptr [rbp-0x30], rcx
       mov      rcx, bword ptr [rbp+0x10]
       mov      bword ptr [rbp-0x10], rcx
       mov      rcx, bword ptr [rbp-0x10]
       mov      rcx, gword ptr [rcx]
       mov      rdx, gword ptr [rbp+0x18]
       mov      r8d, 1
       call     [System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(System.Threading.Tasks.Task,System.Runtime.CompilerServices.IAsyncStateMachineBox,bool)]
       nop      
						;; size=49 bbWeight=1 PerfScore 14.75
G_M46438_IG03:  ;; offset=0x0056
       add      rsp, 80
       pop      rbp
       ret      
						;; size=6 bbWeight=1 PerfScore 1.75

; Total bytes of code 92

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 13, 2023
@ghost ghost assigned EgorBo Aug 13, 2023
@ghost
Copy link

ghost commented Aug 13, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Closes #90492

The changes are not big if you ignore white-spaces in code-review tab

Author: EgorBo
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@stephentoub
Copy link
Member

can we also delete

[MethodImpl(MethodImplOptions.AggressiveOptimization)] // workaround boxing allocations in Tier0: https://github.com/dotnet/runtime/issues/9120
?

@EgorBo
Copy link
Member Author

EgorBo commented Aug 13, 2023

can we also delete

[MethodImpl(MethodImplOptions.AggressiveOptimization)] // workaround boxing allocations in Tier0: https://github.com/dotnet/runtime/issues/9120

?

so there is still one allocation I can't yet eliminate

Looks like it can be removed, Tier0 codegen:

; Assembly listing for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Threading.Tasks.VoidTaskResult]:AwaitUnsafeOnCompleted[System.Runtime.CompilerServices.TaskAwaiter](byref,System.Runtime.CompilerServices.IAsyncStateMachineBox) (Tier0)

G_M46438_IG01:  ;; offset=0x0000
       push     rbp
       sub      rsp, 80
       lea      rbp, [rsp+0x50]
       vxorps   xmm4, xmm4, xmm4
       vmovdqa  xmmword ptr [rbp-0x30], xmm4
       vmovdqa  xmmword ptr [rbp-0x20], xmm4
       vmovdqa  xmmword ptr [rbp-0x10], xmm4
       mov      bword ptr [rbp+0x10], rcx
       mov      gword ptr [rbp+0x18], rdx
						;; size=37 bbWeight=1 PerfScore 10.08
G_M46438_IG02:  ;; offset=0x0025
       xor      ecx, ecx
       mov      gword ptr [rbp-0x08], rcx
       mov      rcx, bword ptr [rbp+0x10]
       mov      rcx, gword ptr [rcx]
       mov      gword ptr [rbp-0x30], rcx
       mov      rcx, bword ptr [rbp+0x10]
       mov      bword ptr [rbp-0x10], rcx
       mov      rcx, bword ptr [rbp-0x10]
       mov      rcx, gword ptr [rcx]
       mov      rdx, gword ptr [rbp+0x18]
       mov      r8d, 1
       call     [System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(System.Threading.Tasks.Task,System.Runtime.CompilerServices.IAsyncStateMachineBox,bool)]
       nop      
						;; size=49 bbWeight=1 PerfScore 14.75
G_M46438_IG03:  ;; offset=0x0056
       add      rsp, 80
       pop      rbp
       ret      
						;; size=6 bbWeight=1 PerfScore 1.75

; Total bytes of code 92

Previous Tier0 codegen: https://gist.github.com/EgorBo/7da915602c4d826e0e7d0d981aafc08d

@EgorBo
Copy link
Member Author

EgorBo commented Aug 14, 2023

@MihuBot --tier0

@EgorBo
Copy link
Member Author

EgorBo commented Aug 14, 2023

@MihuBot

@EgorBo
Copy link
Member Author

EgorBo commented Aug 14, 2023

@jakobbotsch @dotnet/jit-contrib PTAL, I recommend disabling "whitespaces" for review.

impBoxPatternMatch used to be conservative about side-effects, now I spill them as separate statements.
Diffs aren't big but at least I was able to remove AggressiveOpt.

impResolveToken(codeAddr + 1, &isInstResolvedToken, CORINFO_TOKENKIND_Casting);
}
else if ((foldAsHelper == CORINFO_HELP_BOX_NULLABLE) &&
((impStackTop().val->gtFlags & GTF_SIDE_EFFECT) == 0))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't do it for nullable due to empty diffs for it (would have to add tests as the logic is a bit different).

Comment on lines 2875 to 2879
GenTree* op = impPopStack().val;
if ((op->gtFlags & GTF_SIDE_EFFECT) != 0)
{
JITDUMP("\n Importing BOX; BR_TRUE/FALSE as %sconstant\n",
treeToNullcheck == nullptr ? "" : "nullcheck+");
impPopStack();

GenTree* result = gtNewIconNode(1);

if (treeToNullcheck != nullptr)
{
GenTree* nullcheck = gtNewNullCheck(treeToNullcheck, compCurBB);
result = gtNewOperNode(GT_COMMA, TYP_INT, nullcheck, result);
}

impPushOnStack(result, typeInfo(TYP_INT));
return 0;
impStoreTemp(lvaGrabTemp(true DEBUGARG("spill side effects")), op, CHECK_SPILL_ALL);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of an unusual way of doing this in the importer, use impSpillSideEffect instead?

Also we already do this from the caller for BoxPatterns::IsByRefLike. Should it be unified?

Copy link
Member Author

@EgorBo EgorBo Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for the hint. addressed.

Also we already do this from the caller for BoxPatterns::IsByRefLike. Should it be unified?

Looks like that one has to be always done regardless wether we match a pattern or not while here we only do this when we match it. Presumably the byref case is rare so it's ok to call it twice for simplicity?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like that one has to be always done regardless wether we match a pattern or not while here we only do this when we match it.

I don't think so (note that the comment is misleading -- we BADCODE today on unmatched patterns, we do not generate code to throw InvalidProgramException at runtime). But I'm ok with leaving it to avoid further risk and potentially cleaning it up in .NET 9, assuming you want this in .NET 8.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, yeah let's clean it up in .NET 9.0, it has a potential to slightly refactor the whole thing

impPushOnStack(result, typeInfo(TYP_INT));
return 0;
}
impSpillSideEffects(true, CHECK_SPILL_ALL DEBUGARG("spilling side-effects"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you check diffs for this? I'm not sure we want to spill all side effects, it should be enough to spill the top one (and the slots it interferes with). I also don't think we need to pass true. Not sure if there is any significant difference given the side-effecting trees we would expect at the top of the stack here, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diffs - I've also checked that all side-effects are preserved in my tests

@EgorBo EgorBo merged commit 8323e58 into dotnet:main Aug 14, 2023
@EgorBo EgorBo deleted the handle-sideeffects-in-impBoxPatternMatch branch August 14, 2023 14:50
@ghost ghost locked as resolved and limited conversation to collaborators Sep 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allocation for a simple isinst pattern in Tier0

3 participants