Cache ROS constructed from arrays of constants (remaining types) #69820
Will this automatically work for collection expressions as well? Worth testing? #Resolved
    return true;
}

if (inPlaceTarget is not null)
📝 I'm not 100% sure about this part. Without it, we leave something on the stack which results in an assertion when the subsequent statement gets emitted. #Resolved
I'm not 100% sure about this part.
When we are relying on a constructor, I think we can handle the inPlaceTarget case the same way it is handled in if (specialElementType.SizeInBytes() == 1) above. We should also be able to handle it in emitAsCachedArrayFromBlob, but we never reach it due to a similar early check. Might be worth adjusting that as well.
@dotnet/roslyn-compiler for review. Thanks
    return false;
}

if (inPlaceTarget is null && !used)
Perhaps the name should be adjusted now #Closed
Refers to: src/Compilers/CSharp/Portable/CodeGen/EmitArrayInitializer.cs:398 in 1f8253c.
var rosArrayCtor = (MethodSymbol?)Binder.GetWellKnownTypeMember(_module.Compilation, WellKnownMember.System_ReadOnlySpan_T__ctor_Array, _diagnostics, syntax: wrappedExpression.Syntax, isOptional: true);
if (rosArrayCtor is null)
// Emit: new ReadOnlySpan<T>(PrivateImplementationDetails.ArrayField ??= RuntimeHelpers.InitializeArray(new int[Length], PrivateImplementationDetails.DataField));
bool emitAsCachedArrayFromBlob(NamedTypeSymbol spanType, BoundExpression wrappedExpression, int elementCount, ImmutableArray<byte> data, ref ArrayTypeSymbol arrayType, TypeSymbol elementType)
The main benefit of this extraction is that it makes the codeflow in the main body of the method clearer. The cases become clearer and in particular this last branch is a single case. I've made sure to keep the diff very clean to minimize review overhead.
if (constants.IsEmpty)
{
    emitEmptyReadonlySpan(spanType, arrayCreation, used, inPlaceTarget);
Moving that optimization earlier in the existing code affects some scenarios. Since that is not the purpose of the PR, I kept the original optimization in its existing location.
Moving that optimization earlier in the existing code affects some scenarios.
It doesn't look this way to me. We can discuss offline in more detail.
}

var initializers = initializer.Initializers;
if (initializers.Any(static init => init.ConstantValueOpt == null))
Tagging @stephentoub for thoughts
This will hold on to strings forever, at the same time this could be the only time they are used.
As string literals, won't they already be held on to forever as part of interning?
Even without that, this doesn't seem different to me from, say, lambda/delegate caching, where the first time a static lambda is used we cache a delegate to it, and we'll hold onto that delegate forever even if we never use it again.
As string literals, won't they already be held on to forever as part of interning?
To be honest, I do not know the answer. And whether the answer is the same for all flavors of frameworks out there.
this doesn't seem different to me from, say, lambda/delegate caching
Strings could be quite big though. And there could be a lot of them in a single initialization. Also, we do not cache delegates when they are created by using new. So, there is some control over that form of caching.
To be honest, I do not know the answer. And whether the answer is the same for all flavors of frameworks out there.
It should be true for both .NET Framework and .NET Core: every time we encounter a string literal we add it to a global hashmap (the string interning Stephen mentioned above), where it's essentially rooted forever. The exception is unloadable ALCs, but I guess that should not be a problem here either. E.g.
void Test(bool cond)
{
    if (cond)
        Console.WriteLine("true!!");
    else
        Console.WriteLine("false!!");
}
When the JIT compiles this method (on its first execution), it creates string objects for both literals even if one of them (e.g. "false!!") is never used. We might make this more efficient in the future, but it is the current behavior of .NET Framework and .NET Core.
I would even say that the behavior change caused by extraction of a local, or an inline of a local, might come as a big surprise.
There are plenty of situations where that's the case, including in the brand new collection expressions feature.
We would need to revisit every single use of this in dotnet/runtime where the source might be compiled downlevel, as this could regress all of them. Are you planning to do that?
If we decide that the behavior change, which, if I remember correctly, was introduced without much discussion at the time (and likely specifically for the benefit of a single component in development at the time), was a mistake and should be changed, then we will have to decide what to do with the component.
There are plenty of situations where that's the case, including in the brand new collection expressions feature.
Doesn't mean it is a good thing. The nature of differences is not the same, and the impact is quite different. Each situation is somewhat unique.
I agree there is a non-zero chance this causes unexpected behavior for a customer. There are definitely customers out there that generate large string[] for initialization purposes that are effectively single use. I cannot specifically remember a case where it would be combined with ReadOnlySpan<string> such that it would trigger this optimization, but it's certainly possible.
There are other optimizations we've taken in the past that had the potential to negatively impact customer scenarios. Even simple optimizations like increased method group to delegate caching broke partner teams. It is always going to be a trade-off.
The criteria I usually consider are:
1. Is this on the whole going to be an improvement? In this case I believe the answer is yes: overall it's going to produce significant wins compared to the potential downside.
2. Is there a reasonable and documented way the user can undo the optimization if it's found to be negative? Consider method group caching as an example, where the undo operation was just to make the delegate allocation explicit: = new Action(Method) vs. = Method. What is the undo operation here? I believe assigning to an intermediate local would subvert the optimization. Is that what we want to document? Whatever the answer is, I would like for it to be explicitly listed in the PR / issue for customers to see.
3. Are we violating anything in the language specification? For method group to delegate allocation we had to go back and confirm with LDM that they were okay with this change.
Assuming we have resolutions for (2) and (3), I would overall be in favor of moving forward. I think we should consider an entry in the breaking change list though.
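To make the "undo" discussion concrete, here is a minimal sketch of the two opt-outs mentioned above. The type and method names (UndoSketch, Method, Direct, ViaLocal) are hypothetical. The intermediate-local opt-out is only what the comment above suggests might work; it is not confirmed behavior of the compiler.

```csharp
using System;

public static class UndoSketch
{
    static void Method() { }

    // Method group conversion: the compiler is allowed to cache the delegate.
    public static Action Cached() => Method;

    // Explicit allocation: the documented way to opt out of delegate caching.
    public static Action Uncached() => new Action(Method);

    // Direct conversion of an array of constants to a span:
    // the shape this PR considers for caching.
    public static ReadOnlySpan<string> Direct() => new string[] { "x", "y" };

    // Assumption from the discussion above (unconfirmed): going through an
    // intermediate local would keep a fresh array allocation on every call.
    public static ReadOnlySpan<string> ViaLocal()
    {
        string[] tmp = new string[] { "x", "y" };
        return tmp;
    }
}
```

Both span-returning methods produce the same observable contents; the difference under discussion is only whether the backing array is allocated once and rooted, or allocated per call.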
Was asked if I could make it clearer which optimization path I preferred: weak reference or not.
I would lean towards starting with the non-weak reference approach. My rationale is:
- Customers who find the behavior undesirable can use the undo mechanism
- If this does produce enough negative customer feedback, we could flip to the WeakReference approach in an update / servicing fix.
ImmutableArray<ConstantValue> constants = initializers.Select(static init => init.ConstantValueOpt!).ToImmutableArray();

if (constants.IsEmpty)
public override bool Equals((ImmutableArray<ConstantValue> Constants, ushort Value) x, (ImmutableArray<ConstantValue> Constants, ushort Value) y) =>
    x.Value == y.Value &&
    ByteSequenceComparer.Equals(x.Constants, y.Constants);
That was just wrong :-/ We're dealing with ConstantValue not bytes here. This was definitely not doing what I intended... Thanks for catching this.
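For illustration only, an element-wise comparer over a key of this shape could look roughly like the sketch below. It uses object as a stand-in for the compiler-internal ConstantValue type, and the class name ConstantsKeyComparer is hypothetical; this is not the fix that was actually committed. It also assumes the arrays are never default (uninitialized).

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

// Hypothetical sketch: compares keys element by element instead of as bytes.
public sealed class ConstantsKeyComparer
    : IEqualityComparer<(ImmutableArray<object> Constants, ushort Value)>
{
    public bool Equals(
        (ImmutableArray<object> Constants, ushort Value) x,
        (ImmutableArray<object> Constants, ushort Value) y)
    {
        if (x.Value != y.Value || x.Constants.Length != y.Constants.Length)
        {
            return false;
        }

        for (int i = 0; i < x.Constants.Length; i++)
        {
            // object.Equals handles null elements on either side.
            if (!object.Equals(x.Constants[i], y.Constants[i]))
            {
                return false;
            }
        }

        return true;
    }

    public int GetHashCode((ImmutableArray<object> Constants, ushort Value) obj)
    {
        var hash = new HashCode();
        hash.Add(obj.Value);
        foreach (var constant in obj.Constants)
        {
            hash.Add(constant);
        }

        return hash.ToHashCode();
    }
}
```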
switch (constant.Discriminator)
{
    case ConstantValueTypeDiscriminator.Nothing:
        return _singleZeroByteArray;
Yes, for null constants. I'll add a comment
for null constants. I'll add a comment
Consider using ConstantValueTypeDiscriminator.Null instead. "Nothing" is a VB term.
static byte[] getBytes(ConstantValue constant)
{
    Debug.Assert(Enum.GetValues(typeof(ConstantValueTypeDiscriminator)).Cast<ConstantValueTypeDiscriminator>().Max()
        == ConstantValueTypeDiscriminator.DateTime);
It doesn't look like we depend on this condition below. If we are not handling a case, the code is going to throw below regardless of underlying value of ConstantValueTypeDiscriminator.DateTime. Instead, it might be better to add a comment in ConstantValueTypeDiscriminator that new values are likely to need a special treatment in this function. #Closed
Sure, I'll shift to a comment.
That said this assertion would fail for existing tests as soon as a new ConstantValueTypeDiscriminator enum entry was added, before the right test was crafted. That would have also raised the right alarm.
    return Encoding.Unicode.GetBytes(constant.StringValue!);

case ConstantValueTypeDiscriminator.NInt:
    return BitConverter.GetBytes(constant.Int32Value);
The order of bytes in the array returned by the GetBytes method depends on whether the computer architecture is little-endian or big-endian. So, builds on different machines might produce different bytes for the same value. Is this a concern given the purpose of this helper? #Closed
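One way to sidestep the endianness concern, shown here only as a sketch and not as what the PR ended up doing, is to write the bytes with an explicit byte order via System.Buffers.Binary.BinaryPrimitives. The helper name StableBytes is hypothetical.

```csharp
using System;
using System.Buffers.Binary;

public static class StableBytes
{
    // Always little-endian, regardless of BitConverter.IsLittleEndian
    // on the machine running the build.
    public static byte[] GetBytes(int value)
    {
        var bytes = new byte[sizeof(int)];
        BinaryPrimitives.WriteInt32LittleEndian(bytes, value);
        return bytes;
    }

    // Same idea for 64-bit values such as DateTime.Ticks.
    public static byte[] GetBytes(long value)
    {
        var bytes = new byte[sizeof(long)];
        BinaryPrimitives.WriteInt64LittleEndian(bytes, value);
        return bytes;
    }
}
```

With this shape, the bytes for a given constant are the same on every host, which matters if they feed a hash that ends up in a generated field name.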
    return bytes;
case ConstantValueTypeDiscriminator.DateTime:
    return BitConverter.GetBytes(constant.DateTimeValue.Ticks);
}
""";
var compilation = CreateCompilationWithMscorlibAndSpan(src);
var verifier = CompileAndVerify(compilation, verify: Verification.Skipped);
    CompileAndVerify(compilation, expectedOutput: "1 42", verify: Verification.Skipped).VerifyDiagnostics();
}

[Fact, WorkItem("https://github.com/dotnet/roslyn/issues/69472")]
WorkItem("#69472")
Did the behavior of this test or the next test change with this PR? #Resolved
No, these two tests behave the same as before. This was to ensure there was no impact on existing scenarios. We probably have these two covered already in some fashion, but it was easier this way
    IL_000b: ret
}
""");
}
}
""";
var compilation = CreateCompilationWithMscorlibAndSpan(src);
var verifier = CompileAndVerify(compilation, expectedOutput: "ran", verify: Verification.Skipped);
PEVerify failed for assembly 'C:\Users\jcouv\AppData\Local\Temp\RoslynTests':
[ : C::MString][mdToken=0x6000006][offset 0x00000010] Cannot change initonly field outside its .ctor.
[ : C::MObject][mdToken=0x6000007][offset 0x00000010] Cannot change initonly field outside its .ctor.
[ : C::MC][mdToken=0x6000008][offset 0x00000010] Cannot change initonly field outside its .ctor.
[ : C::MC][mdToken=0x6000008][offset 0x00000015][found ref array mdarray 'System.Object[]'][expected ref array mdarray 'C[]'] Unexpected array type on the stack.
and
System.Exception : IL Verify failed unexpectedly:
[MString]: Cannot change initonly field outside its .ctor. { Offset = 0x10 }
[MString]: Return type is ByRef, TypedReference, ArgHandle, or ArgIterator. { Offset = 0x1a }
[MObject]: Cannot change initonly field outside its .ctor. { Offset = 0x10 }
[MObject]: Return type is ByRef, TypedReference, ArgHandle, or ArgIterator. { Offset = 0x1a }
[MC]: Cannot change initonly field outside its .ctor. { Offset = 0x10 }
[MC]: Unexpected type on the stack. { Offset = 0x15, Found = ref 'object[]', Expected = ref '[02d07b78-95a0-409e-bff4-1cbc39fcff23]C[]' }
[MC]: Return type is ByRef, TypedReference, ArgHandle, or ArgIterator. { Offset = 0x1a }
[ : C::MString][mdToken=0x6000006][offset 0x00000010] Cannot change initonly field outside its .ctor.
This looks like a real problem. It looks like we shouldn't be marking the field readonly.
[ : C::MC][mdToken=0x6000008][offset 0x00000015][found ref array mdarray 'System.Object[]'][expected ref array mdarray 'C[]'] Unexpected array type on the stack.
This is concerning as well.
- At the very least, it looks like we should pay attention to IsPeVerifyCompatEnabled after all (and test that as well).
- Are we confident that we are not going to run into trouble trying to consume the span? Could that trigger some hard runtime type check failure in some scenarios, for example, when we try to store an element reference in a ref readonly local, etc.?
It looks like we should verify expected verification failure reasons in the tests targeting scenarios affected by this PR.
Made the field writable and removed arbitrary reference type scenario, so there's no PEVerify regression (so didn't add any tests targeting IsPeVerifyCompatEnabled case).
I added some usage of the ROS with a ref readonly local.
public class C
{
    public static System.ReadOnlySpan<string> MString() => new string[] { null, null, null };
    public static System.ReadOnlySpan<string> MString2() => new string[] { null, null, null };
{
    // Code size 27 (0x1b)
    .maxstack 2
    IL_0000: ldsfld "{{type}}[] <PrivateImplementationDetails>.709E80C88487A2411E1EE4DFB9F22A861492D20C4765150C0C794ABD70F8147C_B{{typeCode}}"
It looks like there are some issues to follow up on
}

public override ImmutableArray<byte> MappedData => default(ImmutableArray<byte>);
public override bool IsReadOnly => false;
📝 Note this affects the caching field for blob wrappers too (existing logic, see tryEmitAsCachedArrayFromBlob which involves an assignment to the caching field, so it cannot be read only). But somehow those writes were not flagged by ILVerify warnings (see MultipleArrays_InPlaceAndUsed).
[Main]: Unexpected type on the stack. { Offset = 0x8, Found = address of '<PrivateImplementationDetails>+__StaticArrayInitTypeSize=3', Expected = Native Int }
""";

var verifier = CompileAndVerify(compilation, expectedOutput: "3402", verify: Verification.Fails with { ILVerifyMessage = ilVerifyMessage, PEVerifyMessage = peVerifyMessage });
@jcouv It looks like correctness legs fail due to a formatting error
if (elementType.IsReferenceType && elementType.SpecialType != SpecialType.System_String)
{
    return false;
Yes, we get here for object (see ReadOnlySpanFromArrayOfConstants_Null) and it's intentional
LGTM (commit 14)
Closes #69472 by caching arrays of constants when used in ROS construction for remaining types.
For something like:
we'll produce something like:
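As a rough, hand-written approximation of the idea (not the compiler's actual output), the caching can be pictured as follows. The names Source, LoweredSketch and s_cachedNames are hypothetical; in the real lowering the field lives on the compiler-generated <PrivateImplementationDetails> type and, per the review above, is not marked readonly because it is assigned lazily.

```csharp
using System;

public static class Source
{
    // What the user writes: a span constructed from an array of constants.
    public static ReadOnlySpan<string> Names() => new string[] { "a", "b", "c" };
}

public static class LoweredSketch
{
    // Stand-in for the generated caching field. It must be writable,
    // since it is assigned on first use rather than in a static constructor.
    private static string[] s_cachedNames;

    // Approximate shape: new ReadOnlySpan<T>(field ??= new T[] { ... })
    public static ReadOnlySpan<string> Names() =>
        new ReadOnlySpan<string>(s_cachedNames ??= new string[] { "a", "b", "c" });
}
```

In the sketch, repeated calls to LoweredSketch.Names() wrap the same array instance, which is the allocation saving the PR is after and also the reason the array stays rooted for the lifetime of the type.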