KEMBAR78
gh-133982: Test _pyio.BytesIO in free-threaded tests by cmaloney · Pull Request #136218 · python/cpython · GitHub
Skip to content

Conversation

@cmaloney
Copy link
Contributor

@cmaloney cmaloney commented Jul 2, 2025

The test was only checking one of the two I/O implementations. Ideally the two implementations should match behavior (and guarantees) in free-threaded Python.

Followed https://py-free-threading.github.io/porting/#general-considerations-for-porting as a general guide for "make multi-threaded safe". I have a general project to build benchmarks around I/O in my backlog (python/pyperformance#399) where I will likely work on optimizing _io / _pyio / _experimentalio performance down the line including in threaded contexts. For now though, goal is simple functional thread safety iterating to better.

_pyio.BytesIO has two parts to its state, _pos and _buffer that get updated independently at times (ex. seek just changes _pos) but often together (ex. write updates _pos, maybe extends _buffer, and copies data into _buffer). When updated together multiple threads simultaneously operating could cause issues, so introduced a lock self._lock to cover those cases.

The test was only checking one of the two I/O implementations. Ideally
the two implementations should match.
@cmaloney
Copy link
Contributor Author

cmaloney commented Jul 2, 2025

The ThreadSanitizer data race failure looks real, investigating

@cmaloney cmaloney marked this pull request as draft July 2, 2025 22:27
@cmaloney
Copy link
Contributor Author

cmaloney commented Jul 3, 2025

Updated to lock operations that effect multiple members which need to stay "in sync" (ex. buffer length + position in buffer during write). Believe this is ready for review.

@cmaloney cmaloney marked this pull request as ready for review July 3, 2025 04:42
@cmaloney cmaloney changed the title gh-133982: Test _pyio.BytesIO in free-threaded gh-133982: Test _pyio.BytesIO in free-threaded tests Jul 3, 2025
@ZeroIntensity
Copy link
Member

ZeroIntensity commented Jul 3, 2025

cc @corona10 @kumaraditya303 (related to the problems in gh-135410).

Copy link
Member

@corona10 corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the current C implementation of BytesIO thread-safe?

I couldn’t find any mention in the documentation (https://docs.python.org/3/library/io.html#binary-i-o) explicitly stating whether BytesIO is thread-safe. If the C implementation is already thread-safe, I’d like to suggest updating the documentation to clarify that.

@cmaloney
Copy link
Contributor Author

cmaloney commented Jul 3, 2025

The C implementation (_io) was made thread safe in GH-132616, the _pyio version was not updated at that time. I don't believe _pyio is in any current CPython benchmarks so this shouldn't be critical for the performance metric.

Added a note about thread safety to the docs. Would be nice if there was a standard sphinx tag / annotation that could be added to objects to mark them as "safe to interact with from multiple threads in free-threaded build"

@cmaloney
Copy link
Contributor Author

cmaloney commented Jul 3, 2025

merged main to rerun/work around flaky test gh-136186

Copy link
Member

@corona10 corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, yeah we don't have to think deeply about fallback implementation.

@corona10 corona10 merged commit 48cb9b6 into python:main Jul 4, 2025
44 checks passed
@cmaloney cmaloney deleted the pyio_bytesio_threading branch July 4, 2025 02:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

tests Tests in the Lib/test dir

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants