-
-
Notifications
You must be signed in to change notification settings - Fork 33.2k
gh-132917: Check resident set size (RSS) before GC trigger. #133399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🤖 New build scheduled with the buildbot fleet by @nascheme for commit 5e965ab 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133399%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments below, but looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I like this approach is simple and elegant and will work most of the time. I am slightly worried about Python objects that hold other resources that are not just memory not being accounted for and maybe how SWAP can affect this numbers.
For example in my mac laptop and I can prevent a GC run by opening many chrome tabs and forcing a swap to disk causing the RSS to drop and raise again. I don't think is a problem but maybe in the future we can use heap memory instead or a combined value?
Yeah, that's probably a good idea. Instead of |
…thongh-133399) For the free-threaded build, check the process resident set size (RSS) increase before triggering a full automatic garbage collection. If the RSS has not increased 10% since the last collection then it is deferred.
This greatly reduces the time to run the script in GH-132917. For the default build, I get 0.19 seconds to run. For free-threaded build main branch, 1.9 seconds. With this PR applied, 0.2 seconds.
I've tried to implement a
get_current_rss()
function that works on Linux, MacOS, Windows, FreeBSD and OpenBSD. If the platform doesn't have an implementation or the function fails,gc_should_collect()
reverts to the old behavior.I did not implement this for the default build GC yet. I think we should try it in 3.14 for only the free-threaded build. If it seems to work okay, we can add a similar RSS check for the default GC collection threshold check.
As conditions, I'm using RSS increase of 10% or young object count increase of 40 x threshold. I think this is quite conservative in terms of running the GC often enough.