KEMBAR78
[don't merge...] Add thread-safety to some reference counting by da-woods · Pull Request #6214 · cython/cython · GitHub
Skip to content

Conversation

@da-woods
Copy link
Contributor

Add a compiler directive to turn it off. Also turn it off in the common case where a variable is completely local to a function without a parallel block.

Implement DECREF_SET (and similar) using atomic swap so that we know that the variable getting decrefed is the same one that got swapped out. I've used the
(currently private) atomic operations built in to
Python since it seemed like the easiest option.

Also handle del x.

This is by no means complete (e.g. I think we'll have to wrap reads in an atomic read too).

Add a compiler directive to turn it off. Also turn it
off in the common case where a variable is completely
local to a function without a parallel block.

Implement DECREF_SET (and similar) using atomic swap so
that we know that the variable getting decrefed is the
same one that got swapped out. I've used the
(currently private) atomic operations built in to
Python since it seemed like the easiest option.

Also handle `del x`.

This is by no means complete (e.g. I think we'll have
to wrap reads in an atomic read too).
@da-woods
Copy link
Contributor Author

da-woods commented May 26, 2024

Edit: I've created simpler bug report python/cpython#119585 which may be easier to work with that whatever Cython generates.


I'm running into an error when testing it with refnanny and I think it's probably not our fault.

@colesbury @lysnikolaou would one of you be able to have a look and tell me if this is me being stupid or a genuine issue:

# distutils: extra_compile_args=-fopenmp -DCYTHON_REFNANNY=1 -Og
# distutils: extra_link_args=-fopenmp

import cython
from cython.parallel import prange, threadid, parallel

class C:
    def __init__(self):
        print(f"Hi everybody {threadid()}")

    def __del__(self):
        print(f"Bye everybody {threadid()}")

def prange_with_gil(n: cython.int):
    i: cython.int

    for i in prange(n, num_threads=3, nogil=True):
        with cython.gil:
            s = C()
            if i % 2 == 0:
                try:
                    del s
                except UnboundLocalError:
                    pass  # can fail if another thread got there first

This can be compiled with path/to/freethreading_python cythonize -if test_file.pyx and run with

>>> import test_file
>>> test_file.prange_with_gil(100)

Essentially what happens is:

  1. At the end of the prange loop Cython does __Pyx_PyGILState_Release(__pyx_gilstate_save); - that's just equivalent of PyGILState_Release here.

  2. Python does _Py_brc_remove_thread(tstate); (in PyThreadState_Clear, called from PyGILState_Release).

  3. Python does merge_queued_objects(&brc->local_objects_to_merge);

  4. The first queued object is a C object. So we call the C.__del__.

  5. Here there's a small Cython bug that ultimately leads to the crash (but I believe shouldn't do). Cython sets up the "refnanny context" for the destructor. However it isn't sure if it has the GIL while doing this (obviously it does, so this is a bug). So it therefore does:

    PyGILState_STATE __pyx_gilstate_save = PyGILState_Ensure();\
    __pyx_refnanny = __Pyx_RefNanny->SetupContext((name), (__LINE__), (__FILE__));\
    PyGILState_Release(__pyx_gilstate_save);\
    
  6. In PyGILState_Release we end up in merge_queued_objects(&brc->local_objects_to_merge); again with the same list of queued objects. From then on we're in an infinite loop of calling C.__del__ over and over again.

  7. Eventually it crashes with a segmentation fault (I think just because we run out of C stack space)

(If it is an issue, let me know if you want me to file a Python bug for it, and whether I need to provide a Cython-free way of reproducing it... I suspect I can but not 100% sure).

@da-woods da-woods changed the title Add thread-safety to some reference counting [don't merge...] Add thread-safety to some reference counting May 27, 2024
@da-woods
Copy link
Contributor Author

Thinking about it more, I don't think we can do reads with atomics (because we can't do read and incref in a single step). That suggests to me that this scheme can't work and that we actually need locking (either on a per-scope or per-variable basis). I'll close this and rethink it.

@da-woods da-woods closed this May 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant