-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Improve constant folding for some frozen objects #86318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsJust an excercise to improve constant folding in VN around frozen objects (can be useful for simialr cases). As a demonstration: void Foo() => Bar("TruE");
void Bar(string str)
{
if (bool.TryParse(str, out bool value))
Console.WriteLine(value);
} New codegen: ; Method Program:Foo():this
sub rsp, 56
xor eax, eax
mov qword ptr [rsp+20H], rax
mov byte ptr [rsp+30H], 0
mov dword ptr [rsp+30H], 1
movzx rcx, byte ptr [rsp+30H]
call [System.Console:WriteLine(bool)]
nop
add rsp, 56
ret
; Total bytes of code: 41 (old codegen was terrible)
|
This is great improvement. And also why old codegen is this much bad, provided the value passed to function is constant string and not even dynamic one! 🙄😕 |
It's not bad, it's just too big (a lot of stuff inlined), boolean parsing is done via two OridnalIgnoreCase comparisons, e.g.: internal static bool IsFalseStringIgnoreCase(ReadOnlySpan<char> value)
{
return value.Equals("False", StringComparison.OrdinalIgnoreCase);
} emits: ; Method Program:IsFalseStringIgnoreCase(System.ReadOnlySpan`1[ushort]):bool
G_M35886_IG01: ;; offset=0000H
;; size=0 bbWeight=1 PerfScore 0.00
G_M35886_IG02: ;; offset=0000H
mov rax, bword ptr [rcx]
cmp dword ptr [rcx+08H], 5
jne SHORT G_M35886_IG04
;; size=9 bbWeight=1 PerfScore 6.00
G_M35886_IG03: ;; offset=0009H
mov rdx, 0x20002000200020
or rdx, qword ptr [rax]
mov rcx, 0x73006C00610066
xor rdx, rcx
mov eax, dword ptr [rax+06H]
or eax, 0x200020
xor eax, 0x650073
or rax, rdx
sete al
movzx rax, al
jmp SHORT G_M35886_IG05
;; size=50 bbWeight=0.25 PerfScore 2.44
G_M35886_IG04: ;; offset=003BH
xor eax, eax
;; size=2 bbWeight=0.25 PerfScore 0.06
G_M35886_IG05: ;; offset=003DH
ret
;; size=1 bbWeight=1 PerfScore 1.00
; Total bytes of code: 62 |
@jakobbotsch @SingleAccretion PTAL since you're familiar with this pattern (when we accumulate a final byte offset out of ADD(ADD(ADD... trees) @dotnet/jit-contrib There are jit-diff improvements, but SPMI is empty because of missing contexts (this opt opens new oportunities to fold frozen objects but we need a JIT-VM api call) |
return TryParseUncommon(value, out result); | ||
|
||
[MethodImpl(MethodImplOptions.NoInlining)] | ||
static bool TryParseUncommon(ReadOnlySpan<char> value, out bool result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TryParseUncommon
used to be inlineable in some cases (e.g. without pgo data)
{ | ||
if (!TryParse(value, out bool result)) | ||
{ | ||
ThrowHelper.ThrowFormatException_BadBoolean(value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced throw with a throw helper to make it inlineable (at least for constant input)
Judging by your snippet, Rust doesn't support ignore case parsing so the codegen you see is responsible to return some parsing error for you |
Yeah, my mistake, I did not look correctly, that it gave an error! Sometimes compiler explorer windows are so confusing! 😅😅 |
@EgorBo I have one doubt, in the below example, can't we generate the same codegen for |
these are two different methods which inline differently. Also, this PR fixes constant folding |
Ohh, I see. I thought it would be the same as those constant strings are valid boolean values, so we can optimize the codegen for TryParse, or maybe Short-Circuting the call to Parse directly from that method once we found that specific input string. But it's ok, there might be some reason of not doing that.
I wish, there was an easier way to see codegen of the given program against the PR on browser itself😅😅 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jit changes LGTM.
cc @dotnet/jit-contrib for a green sign off |
@dotnet/jit-contrib @jakobbotsch Ping, this PR also fixes a cross-compile bug @yowl hit in the NativeAOT-LLVM branch |
if (vnStore->IsVNConstantNonHandle(funcApp.m_args[0]) && (vnStore->TypeOfVN(funcApp.m_args[0]) == TYP_I_IMPL)) | ||
{ | ||
*pObj = vnStore->ConstantObjHandle(funcApp.m_args[0]); | ||
*byteOffset = vnStore->ConstantValue<ssize_t>(funcApp.m_args[1]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised we haven't hit this assert with this previous code:
runtime/src/coreclr/jit/valuenum.h
Lines 1080 to 1099 in bda1f6c
T val1 = reinterpret_cast<T*>(c->m_defs)[offset]; | |
T val2 = SafeGetConstantValue<T>(c, offset); | |
// Detect if there is a mismatch between the VN storage type and explicitly | |
// passed-in type T. | |
bool mismatch = false; | |
if (varTypeIsFloating(c->m_typ)) | |
{ | |
mismatch = (memcmp(&val1, &val2, sizeof(val1)) != 0); | |
} | |
else | |
{ | |
mismatch = (val1 != val2); | |
} | |
if (mismatch) | |
{ | |
assert( | |
!"Called ConstantValue<T>(vn), but type(T) != type(vn); Use CoercedConstantValue instead."); | |
} |
I would expect it to hit for 64-bit to 32-bit crossgen2 compilations. It seems to indicate that this code is never hit for crossgen2 32-bit compilations? Is that expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect it to hit for 64-bit to 32-bit crossgen2 compilations.
That's correct - it will be hit if you replay a 32 bit collection with a cross-targeting Jit today. Evidently, we just don't have people (or machines) doing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crossgen doesn't have frozen objects so this path indeed has never been executed, but we could see it on NativeAOT, but I'm not sure we cover the cross compilation case on CI for it
Thanks! |
Just an exercise to improve constant folding in VN around frozen objects. As a demonstration:
New codegen:
old codegen was terrible