-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Improve Regex cache speed when cache is large #27278
Conversation
|
Results. If CacheSize 2% of input (40_000). Then 3.25 faster, +878Kb. But if CacheSize = 15 (Default) almost no impact. I don't understand but it looks like using cache more slows it down actually. |
| s_livecode.AddFirst(current); | ||
| return current.Value; | ||
| } | ||
| var entry = s_livecode_dict[key]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will do two dictionary lookups. It'd be better to use TryGetValue.
| s_livecode.AddFirst(current); | ||
| return current.Value; | ||
| } | ||
| var entry = s_livecode_dict[key]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
| var last = s_livecode.Last; | ||
| s_livecode_dict.Remove(last.Value._key); | ||
| s_livecode.RemoveLast(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this is uses tabs instead of spaces
|
|
||
| // it wasn't in the cache, so we'll add a new one. Shortcut out for the case where cacheSize is zero. | ||
| if (s_cacheSize != 0) | ||
| else if(s_cacheSize != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: missing space between if and (
| internal bool _refsInitialized = false; | ||
|
|
||
| internal static LinkedList<CachedCodeEntry> s_livecode = new LinkedList<CachedCodeEntry>();// the cache of code and factories that are currently loaded | ||
| internal static Dictionary<CachedCodeEntryKey, CachedCodeEntry> s_livecode_dict = new Dictionary<CachedCodeEntryKey, CachedCodeEntry>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might also look at getting rid of s_livecode altogether and just making the CachedCodeEntry themselves into nodes in a linked list (then storing a head and last field directly).
| while (s_livecode.Count > s_cacheSize) | ||
| s_livecode.RemoveLast(); | ||
| { | ||
| var last = s_livecode.Last; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: var => LinkedListNode
| } | ||
| var entry = s_livecode_dict[key]; | ||
| s_livecode.Remove(entry); | ||
| s_livecode.AddFirst(entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each time we do this Remove/AddFirst, we end up allocating another node in the linked list. That's one of the reasons I suggest just making the entries themselves into nodes in their own list, and avoiding LinkedList<T>.
|
Can you clarify what the purpose of the LinkedList is now that you have a Dictionary? It seems to me you can get the cached value out of the Dictionary so the list doesn't add anything. What am I missing? |
|
@artkpv is this the code you are benchmarking? https://github.com/dotnet/corefx/issues/24425#issuecomment-361280655 I am curious how much of the improvement (for your 2%) case is faster lookup in the cache, and how much is faster Regex.Match because you are using a cached regex. Maybe it doesn't matter, but if you care, you could use a profiler and see how much time was spent in cache lookup specifically. |
It's an LRU cache, so you need to know the order in which the items are used. |
|
@stephentoub Code style issues fixed. @danmosemsft No, it is PR #27202 . The test in this PR also.
Didn't understand. Regex.Match also uses the cache, right? Through ctor. So should be no difference. Or I miss something? |
|
Here is the version for the linked list built into Regex. I'm checking the code for correctness again. The test results (2% case): |
|
macOS runs are hitting some NRE: Note that Regex tests did not hit this, suggesting we are missing coverage in the regex tests? |
14aeb25 to
f84fb9d
Compare
|
@danmosemsft I've added some unit tests though they only check that it doesn't throw. Probably we need some api for the cache. Regarding the error, I'm not sure how it happened. I'm checking the code again. I guess it was line 364 other thread changed cache size to 0 while the linked list was being updated at line 1119. So I put it into |
| lock (s_livecode) | ||
| { | ||
| for (LinkedListNode<CachedCodeEntry> current = s_livecode.First; current != null; current = current.Next) | ||
| s_livecode.TryGetValue(key, out var entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a common case (possibly the most common?), the regex will be the last used regex. Would it make sense to test whether s_livecode_first == entry (and return immediately if true) before looking up in the dictionary, etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you do this then CachePromoteLinkedListEntry would be repeating that test, so you would probably move the check out into the other caller also (CacheCode ) even though in that code path it's unlikely to be true - since we aren't asked to cache anything if it was found in the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, reasonable, in case of the same regex made many times. Adding this
| return current.Value; | ||
| } | ||
| CachePromoteLinkedListEntry(entry); | ||
| return entry; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, if you want you could remove this return entry; and change the return null; to return entry;
|
|
||
| internal static LinkedList<CachedCodeEntry> s_livecode = new LinkedList<CachedCodeEntry>();// the cache of code and factories that are currently loaded | ||
| // the cache of code and factories that are currently loaded: | ||
| internal static Dictionary<CachedCodeEntryKey, CachedCodeEntry> s_livecode = new Dictionary<CachedCodeEntryKey, CachedCodeEntry>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the default is going to be very common, and is not too large, you probably want to pre size:
internal static Dictionary<CachedCodeEntryKey, CachedCodeEntry> s_livecode = new Dictionary<CachedCodeEntryKey, CachedCodeEntry>(s_cacheSize);
|
This will conflict with #27473. That's fine whoever is last has to fix a simple conflict. It would probably make sense to run tests on this several times before merging ( |
| [Fact] | ||
| public void Ctor_Cache_Shrink_cache_does_not_throw() | ||
| { | ||
| int originalCacheSize = Regex.CacheSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: preferably resetting the cache size should be in a finally block, in case one test fails it should not break the others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
Tested it again. Results below. Added unit test for the limit, assertions. The case that runs a bit slower (~40 msec) is 40_000, with 1600 unique, 15 cache. Others either run the same or quicker (large cache). Update. Next i'm adding some checks on the switch as i'm not sure about this. |
|
I'm currently looking over your changes and saw some code styling nits and a missing license header. I hope you don't mind that I pushed into your branch. |
|
I think you went a bit overboard with the amount of Debug.Assert calls. I recommend either removing a few which don't bring any value or adding messages to the calls which don't use the string overload. |
96546e0 to
a8b0126
Compare
| } | ||
|
|
||
| lock (s_livecode) | ||
| return TryGetCacheValueSmall(key, out entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Attributing methods with AggressiveInlining can be very dangerous and regress perf in corner cases which are usually not tested. I factored out the retrieval for small caches so that the the caller's native size stays sane.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IL Size is now 30 bytes.
|
Note: The numbers for the concurrent tests could be spoiled because of the task creation inside the tight loop. I removed the costly LINQ operations which should remove some noise. |
|
Results with BenchmarkDotNet: BenchmarkDotNet=v0.10.13, OS=Windows 10.0.17128
Intel Core i7-6600U CPU 2.60GHz (Skylake), 1 CPU, 4 logical cores and 2 physical cores
Frequency=2742185 Hz, Resolution=364.6727 ns, Timer=TSC
[Host] : .NET Core ? (CoreCLR 4.6.26228.08, CoreFX 4.6.26424.0), 64bit RyuJIT
Job=.NET Core 2.1 regex Runtime=Core Toolchain=InProcessToolchain
Before
After
Run yourself
|
|
The improvements especially in multi threaded scenarios are very promising. I'm very happy to merge this asap. |
0bb146e to
ddb071c
Compare
ddb071c to
733e524
Compare
|
I'm particularly glad to see wins even for the default cache size. 🍾 🥂 |
|
@dotnet-bot test Windows x64 Debug Build |
|
Nice work @artkpv appreciate it. |
|
Hmm @ViktorHofer I interpreted your comment above as a signoff but now I see you didn't actually hit signoff. All good? |
|
My intention was to wait for @artkpv's response but we can tackle further optimization in a separate PR. |
|
Great to see it merged! 😄👏 Really glad it ships and runs in Corefx! @ViktorHofer thanks for the testing and those commits. |
|
If you want to help further we have some other open issues for Regex i.e. https://github.com/dotnet/corefx/issues/7569 & https://github.com/dotnet/corefx/issues/26787 |
* Regex Cache changed to use Dict. PerTest added * Regex perf test for cache, refs dotnet/corefx#24425 * Getting CacheSize back in case of exception * Adjusting input not to hit cache each call * Code style issues * Linked list built into Regex * Cache size change in critial region. Adding tests * Check the first first. Default init. Refactoring. Tests fixed * Optimizations: reordering checks, methods calls * Eq op fixed. Comments fixed * Avoiding lock, refs dotnet/corefx#26364 * Merge. Removing message suppresion * Linked list fix * Indention fixed. Commenting * Unit tests fixed * Avoiding extra check. Access to public in internal class. Commenting. * Adding perf tests for various CacheSize * Perf test: cleaning up test * Testing Cache.Equals. Enhancing tests. Refactoring: renaming * Tests to run in other process * (To continue) Addressing most frequent cases to avoid lock * Perf tests enhanced. Addressing most frequent cases to avoid lock * Avoiding lock. Tests turning. * Assertions added. Check added for Most Recent inside lock. Removing nullifying * Perf test cases added, tuned * Perf test: disable cache to get clearer results * Fixing assertion error * Perf test: multithreading * clean up * Switching to Dictionary at 10, using Linked list before * Limit to 11 * Bug fix: clearing prev hashes. Switch at 10 * Fixing possible bug with CacheSize change. Adding tests * Removing assertion * Check for cache size change near limit * License header, code styling, nits * Realign inline methods for Cache retrieval * right size multi threaded bench costs per thread * Fix cache num retrieval on netfx Commit migrated from dotnet/corefx@a96dd02
Fixes https://github.com/dotnet/corefx/issues/24425
Fixes https://github.com/dotnet/corefx/issues/26364
A try to speed up cache using
Dictionary<CachedCodeEntryKey, CachedCodeEntry>, refs #24425