Force all deserialized objects to the oldest GC generation #19681
Conversation
I just realized I did my measurements with the fixed-format cache, but I guess the numbers will be similar for the JSON cache.
Together with the fixed-format cache, `import torch` with a warm cache was ~90% faster than before for me, based on a quick experiment!
```python
# a hack, but it gives huge performance wins for large third-party
# libraries, like torch.
gc.collect()
gc.disable()
```
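The fragment above is only the first half of the trick; re-enabling the collector and the `freeze()`/`unfreeze()` step happen after deserialization. A minimal sketch of the full pattern, where `deserialize_fresh_sccs` and `process_fresh_sccs` are hypothetical stand-ins for mypy's cache-loading step, not the actual function names:

```python
import gc

def deserialize_fresh_sccs(sccs):
    """Hypothetical stand-in for mypy's cache deserialization."""
    ...

def process_fresh_sccs(sccs):
    # Deserializing the cache allocates a huge number of objects, so the
    # collector would otherwise do a lot of pointless work searching for
    # garbage that isn't there.
    gc.collect()   # clear existing garbage first
    gc.disable()   # no young-generation collections while we allocate
    try:
        deserialize_fresh_sccs(sccs)
    finally:
        gc.enable()
        # freeze() moves every tracked object into the permanent
        # generation; unfreeze() then drops them all into the oldest
        # generation, so future young-generation passes skip them.
        gc.freeze()
        gc.unfreeze()
```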
Could we get here multiple times if there are multiple dirty sub-DAGs? If so, do you think it will be a problem?
A quick workaround would be to do this at most N times per run (possibly N=1).
Yeah, I was thinking about this. FWIW, I don't think it will be a problem, since freeze/unfreeze are quite fast. We may also accidentally move some objects from the stale SCCs processed earlier into the oldest generation, but that is probably not so bad. That said, I think it is fine to start with just one pass per run and increase the limit as we get more data. (With `mypy -c 'import torch'` we enter here only once.)
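A minimal sketch of the "at most N passes per run" workaround discussed above; the counter and limit names are made up for illustration:

```python
import gc

_freeze_passes_done = 0        # hypothetical module-level counter
MAX_FREEZE_PASSES_PER_RUN = 1  # hypothetical limit (N=1 as suggested)

def maybe_freeze_deserialized_objects() -> None:
    """Run the freeze()/unfreeze() pass at most N times per run."""
    global _freeze_passes_done
    if _freeze_passes_done >= MAX_FREEZE_PASSES_PER_RUN:
        return
    _freeze_passes_done += 1
    gc.freeze()
    gc.unfreeze()
```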
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
I am not sure what is happening, but for some reason after the GC `freeze()`/`unfreeze()` hack #19681 was merged, compiled tests run twice as slow (on the GH runner; I also see a much smaller but still visible slowdown locally). I have two theories:

* The constant overhead we add outweighs the savings when running thousands of tiny builds.
* The 8% of extra memory we use pushes us over the runner's limit, because we were already very close to it.

In any case, I propose to try disabling this hack in most tests and see if it helps.
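One way to disable the hack in most tests would be a simple environment-variable kill switch; `MYPY_DISABLE_GC_FREEZE_HACK` is an invented name for this sketch, not an actual mypy option:

```python
import gc
import os

def gc_freeze_hack_enabled() -> bool:
    # Hypothetical kill switch; the variable name is made up.
    return os.environ.get("MYPY_DISABLE_GC_FREEZE_HACK") != "1"

def apply_gc_hack_if_enabled() -> None:
    if not gc_freeze_hack_enabled():
        return
    gc.collect()
    gc.disable()
```

The test harness could then set the variable once and keep the per-build overhead out of the thousands of tiny builds.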
This is a hack, but it gives a ~30% perf win for `mypy -c 'import torch'` on a warm run. This should not increase memory consumption too much, since we shouldn't create any cyclic garbage during deserialization (we do create some cyclic references, like `TypeInfo` -> `SymbolTable` -> `Instance` -> `TypeInfo`, but those are genuine long-lived objects).
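A small self-contained demonstration (not mypy code) of what the `freeze()`/`unfreeze()` pair does, using `gc.get_freeze_count()` to observe the permanent generation:

```python
import gc

# Simulate a burst of freshly deserialized objects.
data = [[i] for i in range(100_000)]

print(gc.get_freeze_count())  # 0: the permanent generation starts empty
gc.freeze()                   # move every tracked object to the permanent generation
print(gc.get_freeze_count())  # large: all tracked objects are now frozen
gc.unfreeze()                 # drop them into the oldest generation
print(gc.get_freeze_count())  # 0 again: the objects now live in generation 2
```

Once the objects sit in the oldest generation, the frequent generation-0 and generation-1 collections no longer traverse them, which is where the warm-run win comes from.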