-
-
Notifications
You must be signed in to change notification settings - Fork 33.2k
gh-131591: Allow pdb to attach to a running process #132451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Oh, one missing feature: this doesn't currently send a SIGINT to the PID when the client gets a ctrl-c. That shouldn't be tough to add, at least for Linux and macOS - I'm not sure if something equivalent is doable on Windows or not? EDIT: This has been implemented in a way that works for all platforms. |
@gaogaotiantian as promised this is the remote PDB initial implementation. Let’s try to get it ready for beta freeze 🤘 I suggest to try to get the basics landed first so we can iterate in parallel between all of us instead of dying on this PR |
I agree that we should merge the basic stuff in first. It's a relatively long PR so I'll need some time :) |
Feel free to push to the PR if you have small nits or thinks like that to avoid review roundtripping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think the approach is good. I'm not the socket expert but it seems solid to me. I would like the pdb part to be even simpler - it would be much better to provide some basic version then add features when requested, than to promise a lot and realize some of them are too complicated to make it correct. The users would already be thrilled to have a basic attachable debugger that functions, even without some rarely used commands.
It would be a nice chance to understand what the users are actually using as well. Unsupported features are much better than buggy ones.
Lib/pdb.py
Outdated
if opts.pid: | ||
# If attaching to a remote pid, unrecognized arguments are not allowed. | ||
# This will raise an error if there are extra unrecognized arguments. | ||
parser.parse_args() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think attach is so different than the rest, we should only have one entry to parse and do the attach. parser.parse_args()
is also a bit magical here. I think we should combine this piece and the one below and just check if -m
or args
exists. If not, print some easy to understand message. Then do attach. Put the attaching code together and as a short-cut.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also considered putting -p
and -m
in a ArgumentParser.add_mutually_exclusive_group
but decided to err on the side of simplicity. That said, if you have a strong opinion on how best to do this, I'm happy to follow your lead. Want to push something that you're happy with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking of combining the two parts together:
if opts.pid:
# If attaching to a remote pid, unrecognized arguments are not allowed.
# This will raise an error if there are extra unrecognized arguments.
opts = parser.parse_args()
if opts.module:
parser.error("argument -m: not allowed with argument --pid")
attach(opts.pid, opts.commands)
return
# Existing code
opts, args = parser.parse_known_args()
Lib/pdb.py
Outdated
ns: dict[str, typing.Any] | ||
|
||
|
||
class _RemotePdb(Pdb): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we call the client _PdbClient
, why not _PdbServer
for this? I think it would clearer that they are a pair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think _RemotePdb
makes it clearer that it is-a Pdb
in the OOP sense.
_PdbServer
doesn't imply anything about whether it uses inheritance or composition, but _RemotePdb
does, I think.
That said, it's a private class, so there's not much need to bikeshed on the name. If you prefer _PdbServer
we can switch it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we really do this kind of OOP naming in stdlib. _RemotePdb
is easily confused with a pdb that can remote to other processes. I still prefer _PdbServer
as it's a clear pair with _PdbClient
and it's obvious that it runs in the same process.
Lib/pdb.py
Outdated
] | ||
# fmt: on | ||
|
||
self._send(commands_entry={"bpnum": bnum, "terminators": end_cmds}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to send these end commands back and forth? They are constants. Also end
is a very important one (arguably the most important one as it's in the example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're constants, but they're not necessarily the same constants for the client and the server, since they can be running different versions of Python.
If the attaching client is running 3.14.9 and the remote PDB server is running 3.14.0, it's possible that the client will think that some command should terminate the command list, and send a command list to the server that the server thinks is incomplete because it doesn't recognize that terminator. If that happened, we'd be out of sync - the server would still be waiting for more lines, and the client would think that it's done sending lines.
That said, perhaps I'm being overly defensive, and we shouldn't expect any new command to ever be added after 3.X final. But the risk of getting it wrong is leaving you with a broken debugger session that might only be able to be fixed by killing the client.
Another possible option is to be defensive in the:
for line in commands:
if self.handle_command_def(line):
break
loop below. We could say that loop must break, and that if it doesn't, it can only mean that we got a list that was unterminated, and so we should roll back to the old set of commands. If you prefer that over sending list of early-command-terminators back over the wire, I'm fine with that.
That'd just be something like (pseudocode):
saved = old_commands_for_breakpoint(bnum)
for line in commands:
if self.handle_command_def(line):
break
else:
self.error("Incomplete command list!")
set_commands_for_breakpoint(bnum, saved)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the server would still be waiting for more lines, and the client would think that it's done sending lines.
That won't happen. When the client sent the commands as a list, the server will just finish the list and consider it done for the commands. Nothing worse would happen.
Also, I don't expect the list of resuming commands to change. And as discussed elsewhere, I don't think we should support cross version debugging.
I just tested it and CTRL+D
does not work either (to terminate the commands).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That won't happen. When the client sent the commands as a list, the server will just finish the list and consider it done for the commands. Nothing worse would happen.
Ah, you're right - I misremembered and thought that _RemotePdb
sets self.commands_defining = True
, but it doesn't (it used to in one of my earlier iterations).
OK, in that case, the only reason I can think of to send the list of resuming commands from the remote to the client is aliases. Currently I'm not handling that, and I'm just hardcoding, but in regular PDB you can:
(Pdb) alias blah cont
(Pdb) import functools
(Pdb) b functools.cache
Breakpoint 1 at /opt/bb/lib/python3.13/functools.py:676
(Pdb) commands
(com) p "hello from breakpoint"
(com) blah
(Pdb)
Note that blah
ended the command list. The only way we can support that for remote PDB is to send the list of resuming commands from the server to the client. But, I also don't care much if we support that. As you say, if we don't, the only impact is that the user needs to a) use the regular command instead of the alias, or b) add an end
command.
That seems fine to me. I think that having the resuming commands end the command list is a weird feature anyway, but I was aiming for feature parity.
So: if you care about aliases for resuming commands ending the commands list, I'll keep this code and update it to also send the list of aliases that expand to one of the commands in the list.
If you don't care about aliases for resuming commands ending the command list, I'm happy to hardcode the list client-side instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tested it and
CTRL+D
does not work either (to terminate the commands).
Not for regular PDB either, so that's fine.
But ctrl-c just kills the whole remote pdb client, that's wrong.
Also
end
is a very important one
lol oops, that's what I get for focusing on edge cases 🤣
Fixed 'end' and ctrl-c in e44a670
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the only "correct" way to do this is to send the commands to the server one by one - you'll never get the correct result with client parsing. For example, x = 1;;continue
should end the command prompt, and the current client does not do that. Unless you parse everything exactly as the server does, you won't get the same result.
For your blah
example - the current code does not handle it, and it would be complicated to make it work properly. If we want to make it right, we need to simulate the send/receive line by line.
The worse case in the current implementation is that the user needs to end the command with an end
- that's not too bad. We can do it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example,
x = 1;;continue
should end the command prompt, and the current client does not do that.
Oof, yeah. I didn't think about ;;
If we want to make it right, we need to simulate the send/receive line by line.
I can do that. It wouldn't be too tough, and it would wind up looking a lot like interact
- set a flag for the "mode", and make it be possible for the client to return to command mode (on ctrl-c or ctrl-d) or for the server to (when the command list is terminated).
I understand how to do this, and it wouldn't be difficult, but it would be almost exactly as complex as interact
, for better or worse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 600aa05
These were removed from the Pdb base class in b577460
The only thing _RemotePdb needs to customize is how the recursive debugger is constructed. Have the base class provide a customization point that we can use.
This makes us robust against changes in the temporary script.
To be honest, I think the TCP socket here is not good for the user. Here's the detail For now, we use So I think we may need to allow people setup a port allocate range to avoid the common port usage range. But here's a better way to do this. Maybe we can let people choose to use the Unix Domain Socket(aka UDS) or the TCP Socket. For now, most of the OS platform has supported the UDS (even if Windows since 17063), So we can use UDS first and fallback to TCP if the OS platform not support UDS yet) |
The common port range is avoided automatically on every operating system I know of.
On Linux that implementation-defined port range is configurable via /proc/sys/net/ipv4/ip_local_port_range but the default range is 32768 to 60999. On Windows the port range is 49152 to 65535. I'm having trouble finding official docs for macOS, but StackOverflow says that it's configurable like on Linux, with a default range of 49152 to 65535.
Statically configuring a service to start on a port that's allocated to the pool of ephemeral ports is incorrect. The configuration should be corrected, rather than assuming that no other software on the machine ever uses the widely portable feature of supplying |
This is correct, I agree with your idea. But my concern is the different Linux distro may have different default local port range out of box. If we still use TCP rather than the UDS, I think we may need mention this behavior on the documentation. But in my side, I think the random server port is not a common usage for socket programming. When we try to communicate between the processes, UDS should be a better way to reach the target. It's more effective and nor more port conflict |
The base class sometimes calls these methods with an exception object.
Okay I tried to read the protocol for interact and breakpoint commands. Here are my thoughts. I think the protocol could be simpler and easier to maintain if we reuse more of the existing structure of pdb. We can't fully mimic a stdin/stdout because that could be difficult to parse, but I don't think we need to make the state of pdb controllable from both client and server. I don't think the client needs to send different commands for "interact", "commands" and "breakpoint commands" (which is very unfortunate because it's called I think Then, with the In order to do this, we need to give the client the capability to send "control signals" to server, mainly I believe with this protocol, we can do minimum changes to make |
Okay sure. I'm good with creating an issue for the bug so we won't forget it and working on landing this PR. |
Oh, we definitely need a whatsnew entry for it. |
I am working on more tests, will be pushing them soon |
@gaogaotiantian what's new entry added and added some tests that cover most of what is not super complicated to test. Do you mind formally approving so I know you are fine to land and then we can open new issues for the different things we want to improve. @godlygeek will open a issue as soon as we land for the infinite loop problem. |
Lib/test/test_remote_pdb.py
Outdated
def test_protocol_version(self): | ||
"""Test that incompatible protocol versions are properly detected.""" | ||
# Create a script using an incompatible protocol version | ||
script = f""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you do textwrap.dedent()
for all the scripts? So it's visually indented properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gaogaotiantian done
Is the test flaky? That's something I really want to avoid because everyone is going to be disrupted by flaky tests. I'll run the tests a few times and see if it's stable. |
It failed for a second time, that is worth checking. It's the keyboard interrupt one. |
Will check tomorrow |
28e5992
to
659556f
Compare
@gaogaotiantian Ok the test is fixed. I plan to land this tomorrow unless you have any other concern. |
Sure let's land this first and polish it later. We still have some work to do but we don't have to do them all together immediately. Thanks for the work, this is a very cool feature for pdb. |
Awesome! Thanks for the thorough review @gaogaotiantian! This was not easy (it's 1,330 of very tricky code and more than a hundred comments) and you have been not only very thorough, patient and understanding and I am happy we have been able to navigate a plan. Let's iterate together on polishing the feature! ❤️ |
Initial proof-of-concept and design. This is lacking tests and documentation, but otherwise works well, and it would be very helpful if anyone interested can try it out!
It has been tested on macOS and Linux, though not yet on Windows.