KEMBAR78
[wasm] Implement partial backward branch support in the Jiterpreter by kg · Pull Request #82756 · dotnet/runtime · GitHub
Skip to content

Conversation

kg
Copy link
Member

@kg kg commented Feb 28, 2023

This PR adds partial support for backward branches to the jiterpreter.

Due to WASM's quirky approach to constrained control flow and loops, the code it generates is very suboptimal, but it still produces a measurable speed-up (~22sec/iter -> ~20sec/iter for one of the benchmarks that regressed, for example) and is a decent starting point to improve on.

This PR also removes a write barrier from ldelem_ref that didn't need to be there and was very expensive. That takes the regressed benchmark down further from ~20sec/iter to ~3sec/iter.

Additional statistics and a runtime option are added to go with the new backward branch support.

More detail on how it works:

  1. For any methods that contain a trace entry point, the interpreter maintains a table listing all the bblocks that are targeted by a backwards branch.
  2. If a method has one or more backward branch targets in its table, when the jiterpreter generates traces in that method it wraps them in a wasm loop instruction so that we can transfer control back to the top.
  3. The jiterpreter scans the backward branch table when emitting instructions, and when it hits something it knows is a backwards branch target, it ensures that a branch target block starts there. Each encountered backwards branch target is added to a list.
  4. Any time the jiterpreter encounters a backwards branch, if the target is in the list of branch targets we've already encountered, eip is updated and control is sent back to the top of the trace by branching to the loop. After that, execution will skip over blocks until it reaches the branch target.

Attentive readers will note that this algorithm is inefficient because we have to scan over blocks until we find the branch target - there's only one loop for the entire trace. I tried creating a separate loop for each backwards branch target (which would allow us to jump directly to it) but its interaction with forward branches (which require the ability to jump forward to the end of a control region) made it too hard to get that working, at least initially.

A better implementation of control flow would probably allow direct branching for both forwards and backwards control flow, or at least reduce the cost of the branches from what it is currently. But that implementation would likely require a second pass in the trace compiler and building some sort of CFG on the fly so I haven't started on it yet.

kg added 5 commits February 27, 2023 14:51
… looping

Remove unnecessary slow write barrier from ldelem_ref
Update heuristic for backward branches
Add back branch success rate statistic
@ghost
Copy link

ghost commented Feb 28, 2023

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

This PR adds partial support for backward branches to the jiterpreter.

Due to WASM's quirky approach to constrained control flow and loops, the code it generates is very suboptimal, but it still produces a measurable speed-up (~22sec/iter -> ~20sec/iter for one of the benchmarks that regressed, for example) and is a decent starting point to improve on.

This PR also removes a write barrier from ldelem_ref that didn't need to be there and was very expensive. That takes the regressed benchmark down further from ~20sec/iter to ~3sec/iter.

Additional statistics and a runtime option are added to go with the new backward branch support.

More detail on how it works:

  1. For any methods that contain a trace entry point, the interpreter maintains a table listing all the bblocks that are targeted by a backwards branch.
  2. If a method has one or more backward branch targets in its table, when the jiterpreter generates traces in that method it wraps them in a wasm loop instruction so that we can transfer control back to the top.
  3. The jiterpreter scans the backward branch table when emitting instructions, and when it hits something it knows is a backwards branch target, it ensures that a branch target block starts there. Each encountered backwards branch target is added to a list.
  4. Any time the jiterpreter encounters a backwards branch, if the target is in the list of branch targets we've already encountered, eip is updated and control is sent back to the top of the trace by branching to the loop. After that, execution will skip over blocks until it reaches the branch target.

Attentive readers will note that this algorithm is inefficient because we have to scan over blocks until we find the branch target - there's only one loop for the entire trace. I tried creating a separate loop for each backwards branch target (which would allow us to jump directly to it) but its interaction with forward branches (which require the ability to jump forward to the end of a control region) made it too hard to get that working, at least initially.

A better implementation of control flow would probably allow direct branching for both forwards and backwards control flow, or at least reduce the cost of the branches from what it is currently. But that implementation would likely require a second pass in the trace compiler and building some sort of CFG on the fly so I haven't started on it yet.

Author: kg
Assignees: -
Labels:

arch-wasm, area-Codegen-Jiterpreter-mono

Milestone: -

Don't generate a loop for trace if all the back branch offsets are before its start offset
@kg kg merged commit ab5e28c into dotnet:main Feb 28, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Mar 31, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

arch-wasm WebAssembly architecture

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants