KEMBAR78
gh-104263: Rely on Py_NAN and introduce Py_INFINITY by seberg · Pull Request #104202 · python/cpython · GitHub
Skip to content

Conversation

@seberg
Copy link
Contributor

@seberg seberg commented May 5, 2023

It seems to me code all around relies on both Py_NAN and Py_HUGE_VAL being correct anyway. The actual value for Py_NAN is subtly incorrect on MIPS (depending on settings) or at least nonstandard, which seems to confuse some builtin functions.
(Probably it is signalling, but NumPy saw this with fmin, which probably should also ignore signalling NaNs, see also numpy/numpy#23158).

The guards about _PY_SHORT_FLOAT_REPR making sense are relatively unrelated to NAN and INF being available.

Nevertheless, I currently hide the Py_NAN definition if that is not set, since I am not sure what good alternative there is to be certain that Py_NAN is well defined.
OTOH, I do suspect there is no platform where it is not and it should probably be changed?!


@stefanor reported that this NAN value creates odd issues with NaN (or system fmin) for NumPy. I don't have MIPS, and don't plan to confirm that this fixes the issue (even if it is a bit of taping over).

For me the main point is that it deletes a bunch of code that seems unnecessary.

It seems to me code all around relies on both being correct anyway.
The actual value for Py_NAN is subtly incorrect on MIPS (depending
on settings) or at least nonstandard, which seems to confuse some
builtin functions.
(Probably it is signalling, but NumPy saw this with fmin, which probably
should also ignore signalling NaNs, see also numpy/numpy#23158).

The guards about `_PY_SHORT_FLOAT_REPR` making sense are relatively
unrelated to NAN and INF being available.

Nevertheless, I currently hide the Py_NAN definition if that is not
set, since I am not sure what good alternative there is to be certain
that Py_NAN is well defined.
OTOH, I do suspect there is no platform where it is not and it should
probably be changed?!
@JelleZijlstra
Copy link
Member

I'm not familiar with floating-point behavior, but from the linked numpy issue it seems there is an arguable, obscure CPython bug. Could you open an issue so we can discuss whether CPython needs to change, and add tests for the behavior that we should want?

@sunmy2019 sunmy2019 changed the title MAINT: Rely on Py_NAN and Py_HUGE_VAL being correctly defined gh-104263: MAINT: Rely on Py_NAN and Py_HUGE_VAL being correctly defined May 7, 2023
@mdickinson
Copy link
Member

mdickinson commented May 7, 2023

Hmm. There's some history here. The avoidance of Py_NaN and Py_HUGE_VAL was deliberate, at least in part to avoid operations that might set floating-point exceptions.

I don't think there's any need to change the infinity-handling code here; that bit pattern should be correct regardless of machine; could we stick to just looking at the NaN operations?

One other thing: Py_NaN will not necessarily have the same sign as the value computed from the bit pattern, and I'd expect people to notice and complain if the sign bit changes (arguably they shouldn't, but that's a different discussion). I'd prefer to be in control of the sign (and the payload) here, and the Py_NaN macro doesn't give us that control.

Instead of this approach, could we maybe introduce a preprocessor #define that specifies how the parity of bit 51 relates to quiet / signalling, and use that in constructing the quiet NaN from a bit pattern?


Gah. Bit 51, not bit 52; edited the above text to correct.

seberg added 2 commits May 7, 2023 14:26
We can rely on both as Python now forces IEEE compliance and C99 so
that both should always be well defined and there is no need for
`math.nan` not being defined.
@seberg seberg changed the title gh-104263: MAINT: Rely on Py_NAN and Py_HUGE_VAL being correctly defined MAINT: Rely on Py_NAN and introduce Py_INFINITY May 7, 2023
@seberg seberg changed the title MAINT: Rely on Py_NAN and introduce Py_INFINITY gh-104263: Rely on Py_NAN and introduce Py_INFINITY May 7, 2023
@mdickinson mdickinson self-requested a review May 8, 2023 08:40
return -1;
}
if (PyModule_AddObject(mod, "inf", PyFloat_FromDouble(m_inf())) < 0) {
if (PyModule_AddObject(mod, "inf", PyFloat_FromDouble(Py_HUGE_VAL)) < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably use Py_INFINITY here, since we're changing this line anyway. (I don't want to clutter this PR by changing all instances of Py_HUGE_VAL to Py_INFINITY; that can happen in a separate cleanup PR.)

@mdickinson mdickinson added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 8, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @mdickinson for commit fd39437 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 8, 2023
@mdickinson
Copy link
Member

FTR, my comment above is out of date; see the linked issue for discussion.

@mdickinson
Copy link
Member

@seberg I've started a buildbot run. I'm happy with the shape of this PR if the buildbots agree.

We're missing a news entry; do you want to add one, or shall I? (I guess most of this could be classed as internal refactoring, which doesn't really need a news entry, but the MIPS fix probably deserves something.)

@seberg
Copy link
Contributor Author

seberg commented May 8, 2023

Happy either way, you probably got a more precise angle on what to write.

The Py_INFINITY definition may be the most interesting part for a news entry (but the MIPS fix doesn't hurt).

@mdickinson
Copy link
Member

Updated to add a news entry and fix another couple of Py_HUGE_VAL instances. I'll start another buildbot run, and merge if the buildbots look happy. @seberg Thank you for the PR!

@mdickinson mdickinson added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 8, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @mdickinson for commit fabf7d3 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 8, 2023
mdickinson
mdickinson previously approved these changes May 8, 2023
Copy link
Member

@mdickinson mdickinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, subject to buildbot happiness.

@mdickinson mdickinson added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 8, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @mdickinson for commit e8c36a2 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 8, 2023
@mdickinson
Copy link
Member

Okay, we're getting relevant buildbot failures. E.g., from here

======================================================================
FAIL: test_nan_signs (test.test_float.InfNanTest.test_nan_signs)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\buildarea\pull_request.bolen-windows10\build\Lib\test\test_float.py", line 1045, in test_nan_signs
    self.assertEqual(copysign(1.0, float('nan')), 1.0)
AssertionError: -1.0 != 1.0
----------------------------------------------------------------------

Looks like we will need that fabs call around the NAN macro (which isn't really surprising - there's nothing in the C standard that says that NAN should have any particular sign bit).

The question is whether we want to put the fabs in just for the float("nan") and math.nan uses (and complex equivalents) and leave Py_NAN as the "system" NaN, or include the fabs call in the Py_NAN macro.

@mdickinson mdickinson dismissed their stale review May 8, 2023 10:38

Dismissing review until we can get tests passing on all buildbots

@seberg
Copy link
Contributor Author

seberg commented May 8, 2023

Hmmmpf, I underestimated the complexity... Would be tempted to include it in the macro but don't have a real opinion or gut feeling on it (assuming any reasonable compiler optimizes it well).
The reason is that there are three places where it matters probably and 2 of them use -Py_NAN (all other places are probably errors anyway). OTOH, of course in typical use it shouldn't matter at all.

(NumPy has a similar use of -NPY_NAN although not sure it matters. Actually NumPy currently still hard-codes the NANf value and casts it. Somehow the cast from float to double apparently "fixes" the bit pattern on MIPS...)

@mdickinson
Copy link
Member

I'm leaning towards leaving Py_NAN as a simple wrapper around NAN (or __builtin_nan) and only adding the fabs where we need it - i.e., for math.nan and float("nan").

My general philosophy on this is that most of the time we shouldn't care too much about signs of NaNs. float("nan") is an exception because it seems like too much of a violation of the POLS to have float("nan") give a negative NaN while float("-nan") gives a positive NaN. For math.nan, in the absence of backwards compatibility constraints I think I'd actually prefer math.nan just to return exactly what C's NAN does, but I also think that that would be an unacceptable backwards compatibility break, so we're stuck with forcing a positive NaN in this case.

@mdickinson
Copy link
Member

FWIW, I believe that one time that people end up caring about signs of Python NaNs is when the NaN in question hits a C-level printf (in some extension module) and the printf implementation in question likes to put negative signs on negative NaNs. (The C standard doesn't seem to have anything to say about what printf should do here, and actual implementations seem to differ.)

// Use C99 NAN constant: quiet Not-A-Number.
// NAN is a float, Py_NAN is a double: cast to double.
# define Py_NAN ((double)NAN)
# endif
Copy link
Contributor Author

@seberg seberg May 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, did the minimal thing and added fabs specifically where it seems possibly interesting (there is the Unpack2 for float16 unpacking that doesn't test yet, but used -Py_NAN explicitly).

I was assuming the __builtin_nan is just predating NAN in C99 requirement, but can undo.

For what its worth, NumPy creates NaNs all over with the same system dependent sign bit I suspect, and nobody ever complained.

EDIT/NOTE: Buildbot tests should be rerun probably.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, agreed that getting rid of that use of __builtin_nan looks like the right thing to do - I would have done that in a followup PR if you hadn't done it here.

@JelleZijlstra JelleZijlstra added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 9, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @JelleZijlstra for commit a35ae19 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 9, 2023
@mdickinson
Copy link
Member

Thanks for the updates; that matches what I had in mind. Watching the buildbots (thanks @JelleZijlstra for triggering) ...

Copy link
Member

@mdickinson mdickinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though let's not merge until the buildbots finish. (I'm seeing two buildbot failures so far, but they're both clearly unconnected to this PR.)

@mdickinson
Copy link
Member

Buildbots are happy, and so am I. Merging. Thank you, @seberg!

I'll try to find time shortly (though not today) to replace remaining occurrences of Py_HUGE_VAL with Py_INFINITY (at least, in those places where infinity is clearly what's intended, which I suspect is all of them).

@mdickinson mdickinson merged commit 7a3b035 into python:main May 10, 2023
@seberg seberg deleted the maint-double-nan branch May 10, 2023 16:45
carljm added a commit to carljm/cpython that referenced this pull request May 10, 2023
* main:
  pythonGH-102181: Improve specialization stats for SEND (pythonGH-102182)
  pythongh-103000: Optimise `dataclasses.asdict` for the common case (python#104364)
  pythongh-103538: Remove unused TK_AQUA code (pythonGH-103539)
  pythonGH-87695: Fix OSError from `pathlib.Path.glob()` (pythonGH-104292)
  pythongh-104263: Rely on Py_NAN and introduce Py_INFINITY (pythonGH-104202)
  pythongh-104010: Separate and improve docs for `typing.get_origin` and `typing.get_args` (python#104013)
  pythongh-101819: Adapt _io._BufferedIOBase_Type methods to Argument Clinic (python#104355)
  pythongh-103960: Dark mode: invert image brightness (python#103983)
  pythongh-104252: Immortalize Py_EMPTY_KEYS (pythongh-104253)
  pythongh-101819: Clean up _io windows console io after pythongh-104197 (python#104354)
  pythongh-101819: Harden _io init (python#104352)
  pythongh-103247: clear the module cache in a test in test_importlib/extensions/test_loader.py (pythonGH-104226)
  pythongh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (python#103849)
  pythongh-74895: adjust tests to work on Solaris (python#104326)
  pythongh-101819: Refactor _io in preparation for module isolation (python#104334)
  pythongh-90953: Don't use deprecated AST nodes in clinic.py (python#104322)
  pythongh-102327: Extend docs for "url" and "headers" parameters to HTTPConnection.request()
  pythongh-104328: Fix typo in ``typing.Generic`` multiple inheritance error message (python#104335)
carljm added a commit to carljm/cpython that referenced this pull request May 11, 2023
* main: (27 commits)
  pythongh-87849: fix SEND specialization family definition (pythonGH-104268)
  pythongh-101819: Adapt _io.IOBase.seek and _io.IOBase.truncate to Argument Clinic (python#104384)
  pythongh-101819: Adapt _io._Buffered* methods to Argument Clinic (python#104367)
  pythongh-101819: Refactor `_io` futher in preparation for module isolation (python#104369)
  pythongh-101819: Adapt _io.TextIOBase methods to Argument Clinic (python#104383)
  pythongh-101117: Improve accuracy of sqlite3.Cursor.rowcount docs (python#104287)
  pythonGH-92184: Convert os.altsep to '/' in filenames when creating ZipInfo objects (python#92185)
  pythongh-104357: fix inlined comprehensions that close over iteration var (python#104368)
  pythonGH-90208: Suppress OSError exceptions from `pathlib.Path.glob()` (pythonGH-104141)
  pythonGH-102181: Improve specialization stats for SEND (pythonGH-102182)
  pythongh-103000: Optimise `dataclasses.asdict` for the common case (python#104364)
  pythongh-103538: Remove unused TK_AQUA code (pythonGH-103539)
  pythonGH-87695: Fix OSError from `pathlib.Path.glob()` (pythonGH-104292)
  pythongh-104263: Rely on Py_NAN and introduce Py_INFINITY (pythonGH-104202)
  pythongh-104010: Separate and improve docs for `typing.get_origin` and `typing.get_args` (python#104013)
  pythongh-101819: Adapt _io._BufferedIOBase_Type methods to Argument Clinic (python#104355)
  pythongh-103960: Dark mode: invert image brightness (python#103983)
  pythongh-104252: Immortalize Py_EMPTY_KEYS (pythongh-104253)
  pythongh-101819: Clean up _io windows console io after pythongh-104197 (python#104354)
  pythongh-101819: Harden _io init (python#104352)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants