Description
I was recently playing with bytecode instructions and found that dis.Instruction did almost what I needed, but not quite -- I ended up reimplementing it, mostly reinventing the wheel. I propose improving this class in dis.py by adding the following:
- start_offset: start of the instruction including EXTENDED_ARG prefixes
- jump_target: bytecode offset where a jump goes (can be a property computed from opcode and arg)
- baseopname, baseopcode: name and opcode of the "family head" for specialized instructions (can be properties)
- cache_offset, end_offset: start and end of the cache entries (can be properties)
- oparg: alias for arg (in most places we seem to use oparg)
Of these, only start_offset needs to be a new data field -- the others can be computed properties. For my application, start_offset was important to have (though admittedly I could have done without if I had treated EXTENDED_ARG as a NOP). It just feels wrong that offset points to the opcode but oparg includes the extended arg -- this means one has to explicitly represent EXTENDED_ARG instructions in a sequence of instructions even though their effect (arg) is included in the Instruction.
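To illustrate the start_offset idea, here is a minimal sketch that derives it from the existing dis.get_instructions output. The helper name with_start_offsets is hypothetical and not part of dis; the sketch assumes only that EXTENDED_ARG prefixes are yielded as separate instructions immediately before the opcode they extend.

```python
import dis

EXTENDED_ARG = dis.opmap["EXTENDED_ARG"]


def with_start_offsets(code):
    """Yield (start_offset, instruction) pairs, folding EXTENDED_ARG prefixes.

    Hypothetical helper: start_offset is the offset of the first
    EXTENDED_ARG prefix, or the instruction's own offset if it has none.
    """
    start = None
    for instr in dis.get_instructions(code):
        if start is None:
            # First code unit of a (possibly prefixed) instruction.
            start = instr.offset
        if instr.opcode == EXTENDED_ARG:
            # Prefix only; its effect is already folded into the arg
            # of the instruction that follows.
            continue
        yield start, instr
        start = None


if __name__ == "__main__":
    # 300 distinct names force name indexes above 255, which need EXTENDED_ARG.
    src = "\n".join(f"x{i} = {i}" for i in range(300))
    for start, instr in with_start_offsets(compile(src, "<demo>", "exec")):
        if start != instr.offset:
            print(start, instr.offset, instr.opname, instr.arg)
```

With this pairing, a consumer no longer needs to represent EXTENDED_ARG entries explicitly: each yielded instruction carries both where it begins and where its opcode sits.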
I also added a __str__ method to my Instruction class that shows the entire instruction -- this could call the _disassemble method with default arguments. Here I made one improvement over _disassemble: if the opname is longer than _OPNAME_WIDTH but the arg is shorter than _OPARG_WIDTH, I let the opname overflow into the space reserved for the oparg, leaving at least one space in between. This makes for fewer misaligned opargs, since the opname is left-justified and the oparg is right-justified.
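The overflow rule can be sketched roughly as follows. This is not the code in dis.py: format_op is a hypothetical helper, and the widths 20 and 5 are assumed stand-ins for _OPNAME_WIDTH and _OPARG_WIDTH.

```python
OPNAME_WIDTH = 20  # assumed stand-in for dis._OPNAME_WIDTH
OPARG_WIDTH = 5    # assumed stand-in for dis._OPARG_WIDTH


def format_op(opname, arg):
    """Format 'opname arg' so the arg stays right-aligned when it can.

    Hypothetical helper: a long opname may spill into the arg column as
    long as at least one space still separates it from the arg.
    """
    if arg is None:
        return opname
    argtext = str(arg)
    # Combined width of both columns plus the single separating space.
    total = OPNAME_WIDTH + 1 + OPARG_WIDTH
    if len(opname) + 1 + len(argtext) <= total:
        # Right-justify the arg against the end of the arg column.
        return opname + argtext.rjust(total - len(opname))
    # Too long to align: fall back to a single separating space.
    return opname + " " + argtext


if __name__ == "__main__":
    for name, arg in [("LOAD_FAST", 1), ("LOAD_FAST__LOAD_CONST", 3)]:
        print(repr(format_op(name, arg)))
```

For short opnames this produces the same text as left-justifying the opname and right-justifying the arg in their own columns; only the overflow case differs.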
@isidentical @serhiy-storchaka @iritkatriel @markshannon @brandtbucher