Description
Currently, our JIT backend mostly just takes the machine code that Clang gives us, and emits it without modification. The one exception to this is removing zero-length jumps to the next uop at the end of each blob of machine code, which is both fragile and extremely limited in what it can do.
For example, consider _GUARD_NOS_NULL. This is the (16-byte) sequence that Clang gives us:
cmpq $0x1, -0x10(%r13)
je _JIT_CONTINUE
jmp _JIT_JUMP_TARGET
And this is the (11-byte) sequence we want:
cmpq $0x1, -0x10(%r13)
jne _JIT_JUMP_TARGET
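The byte counts above can be checked with a little arithmetic. This is only a sketch: it assumes both jumps use the near (32-bit displacement) encodings, since their targets are not known until the code is emitted, and the constant names are made up for illustration.

```python
# Sizes in bytes of the x86-64 encodings involved (assuming rel32 jump forms).
CMPQ_IMM8_DISP8 = 5  # REX prefix + opcode + ModRM + 8-bit displacement + 8-bit immediate
JCC_REL32 = 6        # two-byte opcode (0F 8x) + 32-bit displacement
JMP_REL32 = 5        # one-byte opcode (E9) + 32-bit displacement

# cmpq; je _JIT_CONTINUE; jmp _JIT_JUMP_TARGET
before = CMPQ_IMM8_DISP8 + JCC_REL32 + JMP_REL32
# cmpq; jne _JIT_JUMP_TARGET
after = CMPQ_IMM8_DISP8 + JCC_REL32

print(before, after)
```

Under these assumptions, the saving is exactly the unconditional jump that the inverted branch makes unnecessary.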
We should do a bit more here, and doing more is a lot easier if we're modifying textual assembly at build time. Thankfully, this is pretty straightforward: we just compile to assembly using Clang (-S), modify it, and finish compiling using Clang again.
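As a rough sketch of that three-step flow (not the actual build script), the two Clang invocations could be driven like this. The helper names, file names, and the -O3 flag are assumptions for illustration; the real build passes many more flags.

```python
import pathlib
import subprocess


def compile_commands(source, assembly, obj, clang="clang", flags=("-O3",)):
    """Build the two Clang command lines: C -> assembly, then assembly -> object."""
    to_assembly = [clang, *flags, "-S", "-o", assembly, source]
    to_object = [clang, "-c", "-o", obj, assembly]
    return to_assembly, to_object


def build(source, assembly, obj, rewrite=lambda text: text):
    """Compile to textual assembly, rewrite it in place, then assemble it.

    `rewrite` is a hypothetical hook: any function from assembly text to
    assembly text. The default leaves the assembly unchanged.
    """
    to_assembly, to_object = compile_commands(source, assembly, obj)
    subprocess.run(to_assembly, check=True)
    path = pathlib.Path(assembly)
    path.write_text(rewrite(path.read_text()))
    subprocess.run(to_object, check=True)
```

For example, `build("stencil.c", "stencil.s", "stencil.o", rewrite=my_optimizer)` would run the whole pipeline, where `my_optimizer` stands in for whatever transformation is applied to the assembly.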
We should intentionally not take on too much complexity here, especially since we support a few different platforms. There's a lot that we can do by only reasoning about labels, jumps, branches, and returns and treating all other sequences of instructions as black boxes. Once we have the assembly parsed into basic blocks, we can do things like:
- Inverting the direction of branches (from branch-hot/jump-cold to branch-cold/jump-hot), like in _GUARD_NOS_NULL above, is straightforward.
- Having the assembler encode all _JIT_CONTINUE jumps during this step (by just adding the label at the end of the assembly) instead of doing it at runtime. It will also use more efficient "short" jump encodings most of the time, as an additional benefit.
- Removing zero length jumps, as we do now, is trivial.
- Later: Splitting the stencils into "hot" (core uop logic) and "cold" (deopts, error handling, etc) code. The JIT will emit all "hot" code for a trace, followed by all cold code for a trace, keeping the cold code out-of-line.
I have a branch to do all but the last of these, and will open a PR soon.
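To make the first three ideas concrete, here is a minimal sketch of what such a pass might look like for x86-64. It is not the code from the branch: it works as a line-level peephole rather than on real basic blocks, the branch table is deliberately incomplete, and every name in it (`optimize`, `INVERSE`, and so on) is hypothetical.

```python
import re

# Hypothetical, incomplete table of x86-64 conditional branches and their
# inverses. A real tool needs the full set, and one table per architecture.
_PAIRS = [("je", "jne"), ("jl", "jge"), ("jg", "jle"), ("jb", "jae"), ("ja", "jbe")]
INVERSE = {}
for a, b in _PAIRS:
    INVERSE[a] = b
    INVERSE[b] = a

_INSTRUCTION = re.compile(r"\s*(\w+)\s+([\w.$]+)\s*$")
_LABEL = re.compile(r"\s*([\w.$]+):\s*$")


def _jump(line):
    """Return (mnemonic, target) for a jmp or a known branch, else None."""
    m = _INSTRUCTION.match(line)
    if m and (m[1] == "jmp" or m[1] in INVERSE):
        return m[1], m[2]
    return None


def _label(line):
    """Return the label name if the line is a label definition, else None."""
    m = _LABEL.match(line)
    return m[1] if m else None


def optimize(lines, continue_label="_JIT_CONTINUE"):
    # Define the continue label at the end, so the assembler can resolve
    # jumps to it and fall-through becomes possible.
    lines = [*lines, f"{continue_label}:"]
    out = []
    i = 0
    while i < len(lines):
        first = _jump(lines[i])
        second = _jump(lines[i + 1]) if i + 1 < len(lines) else None
        next_label = _label(lines[i + 1]) if i + 1 < len(lines) else None
        after_label = _label(lines[i + 2]) if i + 2 < len(lines) else None
        # "branch T; jmp U; T:" becomes "inverse-branch U; T:".
        if (first and first[0] != "jmp" and second and second[0] == "jmp"
                and after_label == first[1]):
            out.append(f"    {INVERSE[first[0]]} {second[1]}")
            i += 2
            continue
        # "jmp T; T:" is a zero-length jump, so drop it.
        if first and first[0] == "jmp" and next_label == first[1]:
            i += 1
            continue
        out.append(lines[i])
        i += 1
    return out


guard = [
    "    cmpq $0x1, -0x10(%r13)",
    "    je _JIT_CONTINUE",
    "    jmp _JIT_JUMP_TARGET",
]
print("\n".join(optimize(guard)))
```

Run on the _GUARD_NOS_NULL sequence from the description, this sketch produces the cmpq followed by a single jne to _JIT_JUMP_TARGET, with the _JIT_CONTINUE label at the end. Treating everything that is not a label or a jump as an opaque line mirrors the "black box" approach described above.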
Linked PRs
- GH-135904: Optimize the JIT's assembly control flow #135905
- GH-135904: Improve the JIT's performance on macOS #136528
- GH-135904: Add tests for the JIT build process #136766
- GH-135904: Implement assembler optimization for AArch64. #139855
- GH-135904: JIT compiler: Support 19 bit branch instructions on AArch64 for Mach-O. #140453