KEMBAR78
gh-100637: Fix int and bool __sizeof__ calculation to include the 1 element ob_digit array for 0 and False by ionite34 · Pull Request #100663 · python/cpython · GitHub
Skip to content

Conversation

@ionite34
Copy link
Contributor

@ionite34 ionite34 commented Jan 1, 2023

Fixes behavior where int and bool __sizeof__ under-reports true size by 4 bytes as it did not take into account the size 1 ob_digit array for 0 and False.

Changes

  • 88e249b Use Py_MAX to use the maximum of 1 or the absolute ob_size size.
offsetof(PyLongObject, ob_digit) + Py_MAX(Py_ABS(Py_SIZE(self)), 1)*sizeof(digit)

Testing

  • 391ec87 Update sys.getsizeof unit test for int 0 to use the same size as the cases for -1 and 1
  • 88e249b Add a sys.getsizeof unit test for False matching the previous test case for True

Fixes behavior where int (and subtypes like bool) __sizeof__ under-reports true size as it did not take into account the size 1 `ob_digit` array for empty ints.
@bedevere-bot
Copy link

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@ghost
Copy link

ghost commented Jan 1, 2023

All commit authors signed the Contributor License Agreement.
CLA signed

Copy link
Member

@mdickinson mdickinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, and works as advertised in manual testing - thank you!

I've made one nitpick level suggestion for a stylistic change.

(Actually, thinking about style, I'm not sure why the body of this function doesn't simply consist of a single return - the res temporary seems unnecessary.)

Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Copy link
Member

@mdickinson mdickinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I'll leave this open for a bit in case anyone else has comments (especially because it's New Year's Day and some potential commenters may be AFK), but I think this is ready to merge.

@ionite34
Copy link
Contributor Author

ionite34 commented Jan 1, 2023

(Actually, thinking about style, I'm not sure why the body of this function doesn't simply consist of a single return - the res temporary seems unnecessary.)

Thanks for the review and line split suggestion! I'm not really sure about the res over plain return either. Maybe there was a diff minimizing rationale for that original res that I'm not aware of. Currently seems fine though.

@mdickinson mdickinson merged commit d7e7f79 into python:main Jan 2, 2023
@ionite34
Copy link
Contributor Author

ionite34 commented Jan 3, 2023

Would this be something that needs backport to 3.10/3.11? Not sure what the policy is for that.

@ionite34 ionite34 deleted the fix-int-sizeof branch January 3, 2023 02:10
@mdickinson mdickinson added the needs backport to 3.11 only security fixes label Jan 3, 2023
@miss-islington
Copy link
Contributor

Thanks @ionite34 for the PR, and @mdickinson for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jan 3, 2023
…he 1 element ob_digit array for 0 and False (pythonGH-100663)

Fixes behaviour where int (and subtypes like bool) __sizeof__ under-reports true size as it did not take into account the size 1 `ob_digit` array for the zero int.

(cherry picked from commit d7e7f79)

Co-authored-by: Ionite <dev@ionite.io>
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
@bedevere-bot
Copy link

GH-100717 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Jan 3, 2023
@mdickinson
Copy link
Member

mdickinson commented Jan 3, 2023

Would this be something that needs backport to 3.10/3.11? Not sure what the policy is for that.

It's a judgement call, I think: it's arguably a bugfix, but an almost entirely inconsequential one - I can't imagine much real world code that would benefit, even though it's nice to have the fix going forward.

I'll backport to 3.11: Python 3.11 was the version that deliberately started always allocating at least one digit (the relevant commit was 15d50d7). That isn't true in Python 3.10.

(3.10 code:

result = PyObject_Malloc(offsetof(PyLongObject, ob_digit) +
size*sizeof(digit));
)
(3.11 code:
Py_ssize_t ndigits = size ? size : 1;
/* Number of bytes needed is: offsetof(PyLongObject, ob_digit) +
sizeof(digit)*size. Previous incarnations of this code used
sizeof(PyVarObject) instead of the offsetof, but this risks being
incorrect in the presence of padding between the PyVarObject header
and the digits. */
result = PyObject_Malloc(offsetof(PyLongObject, ob_digit) +
ndigits*sizeof(digit));
)

mdickinson added a commit that referenced this pull request Jan 3, 2023
…the 1 element ob_digit array for 0 and False (GH-100663) (#100717)

gh-100637: Fix int and bool __sizeof__ calculation to include the 1 element ob_digit array for 0 and False (GH-100663)

Fixes behaviour where int (and subtypes like bool) __sizeof__ under-reports true size as it did not take into account the size 1 `ob_digit` array for the zero int.

(cherry picked from commit d7e7f79)

Co-authored-by: Ionite <dev@ionite.io>
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants