-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Avoid runtime "atomic write" check in ConcurrentDictionary for shared generics #77005
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid runtime "atomic write" check in ConcurrentDictionary for shared generics #77005
Conversation
…ictionary ConcurrentDictionary caches an s_isValueWriteAtomic in a static readonly, but this trick does not work for shared generics. Instead add an eager check for the intrinsified cases and fall back to the old check otherwise.
|
Tagging subscribers to this area: @dotnet/area-system-collections Issue DetailsConcurrentDictionary caches an s_isValueWriteAtomic in a static readonly, but this trick does not work for shared generics. Instead add an eager check for the intrinsified cases and fall back to the old check otherwise. [Benchmark]
public int BenchIntString()
{
ConcurrentDictionary<int, string> cd = new();
for (int i = 0; i < 1000000; i++)
cd[0] = "";
return cd.Count;
}
[Benchmark]
public int BenchStringInt()
{
ConcurrentDictionary<string, int> cd = new();
for (int i = 0; i < 1000000; i++)
cd[""] = i;
return cd.Count;
}
[Benchmark]
public int BenchStringString()
{
ConcurrentDictionary<string, string> cd = new();
for (int i = 0; i < 1000000; i++)
cd[""] = "";
return cd.Count;
}
[Benchmark]
public int BenchIntInt()
{
ConcurrentDictionary<int, int> cd = new();
for (int i = 0; i < 1000000; i++)
cd[0] = i;
return cd.Count;
}
[Benchmark]
public int BenchGuidInt()
{
ConcurrentDictionary<Guid, int> cd = new();
Guid g = Guid.NewGuid();
for (int i = 0; i < 1000000; i++)
cd[g] = i;
return cd.Count;
}
|
| return true; | ||
| } | ||
|
|
||
| return s_isValueUnderlyingTypeWriteAtomic; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, why is this fallback field still needed? Couldn't it have simply been false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's needed to avoid regressing enums -- they also return true via this check.
I can get a few more cases here by extracting the field into a separate generic class I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a comment explaining why that fallback is necessary would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've essentially kept the existing code now except extracted it into a new generic class only keyed on the value type. Then I have inlined the shared generic case since there are some limitations around inlining with different shared generics.
Can you take another look? Do you want me to add comment that the check also works for enums?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I'm guessing you've reverted to type codes for some of the primitives because of these limitations?
Could you update your benchmark results just in case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The typeof checks were unnecessary once I factored the field out into a new static class on only the value type. We can rely on JIT baking in the value of the static readonly field in tier-1 code as long as TValue is a value type, so we only need the reference type check up front after that.
Updated the benchmarks.
|
Why would the |
...ries/System.Collections.Concurrent/src/System/Collections/Concurrent/ConcurrentDictionary.cs
Show resolved
Hide resolved
Yep, there seems to be some variance, I commented on it above -- it seems to swing between 0.95 and 1.05 ratio. BDN also complains about modality so there may be some data alignment effects going on. |
|
Are there cases that this is going to regress? For example, |
| } | ||
| } | ||
|
|
||
| internal static class ConcurrentDictionaryTypeProps<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that you are paying several hundred bytes in type system data structures per ConcurrentDictionary instantiation for this optimization. I do not think it is a problem. ConcurrentDictionary instantiations are pretty expensive. Just pointing it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my initial attempt I did not have the extra class: 6fde9d8
Of course that does not get as many cases, but I guess there is trade-off as you say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course that does not get as many cases
Can you elaborate? Other than extra fields for the same TValue answer being stored for different TKeys, and for reference types having a field, what "cases" suffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConcurrentDictionary<shared_for_generics, not_shared_for_generics>, like ConcurrentDictionary<object, int>.
When the static field is part of the ConcurrentDictionary, these cases require generic lookup. The JIT is not able to see that the value depends on not_shared_for_generics only.
When the static field is in separate type, these cases do not require generic lookup and the tiered JIT can treat the readonly field as constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Thanks.
Yes:
But it turns out it is because we cannot inline *************** Inline @[000110] Finishing PHASE Post-import [no changes]
Inlining [000110] failed, so bashing STMT00030 to NOP
INLINER: during 'fgInline' result 'failed this call site' reason 'ldfld needs helper' for 'System.Collections.Concurrent.ConcurrentDictionary`2[int,System.ValueTuple`2[System.__Canon,System.__Canon]]:TryAddInternal(int,System.Nullable`1[int],System.ValueTuple`2[System.__Canon,System.__Canon],bool,bool,byref):bool:this' calling 'System.Collections.Concurrent.ConcurrentDictionary`2[int,System.ValueTuple`2[System.__Canon,System.__Canon]]:get_IsValueWriteAtomic():bool'This is actually a bit odd to me given that we can inline it fine in the other cases, but I assume there's a good reason. cc @EgorBo anyway for visibility. If I inline it by hand performance is the same (or very close). I guess I will go back to that and add a comment about it. |
|
I guess it makes sense since in the other cases we do not need to access the field at runtime. I will inline it by hand. |
ConcurrentDictionary caches an s_isValueWriteAtomic in a static readonly, but this trick does not work for shared generics. Instead add an eager check for the reference types and extract the static readonly field so that its value can be baked in even if the key is System.__Canon.
(The
BenchGuidIntone swings, I can get between 0.95-1.05 ratio by running them more times).