KEMBAR78
Arm64: Align methods containing loops to 32B by kunalspathak · Pull Request #59828 · dotnet/runtime · GitHub
Skip to content

Conversation

@kunalspathak
Copy link
Contributor

From Arm Neoverse N1 Software Optimization Guide it is beneficial to align methods at 32B boundary. This will further work well when we start loop alignment at 32B boundaries.

image

As a continuation of #2249, align bigger methods having loops to 32B alignment boundary.

  • I have also added a tighter check on the code size we emit for Arm64 to make sure that there is no misestimation happening for Arm64.
  • There was an assert that would validate the idCodeSize() before printing the instruction which was necessary because the callers never pass the sz and it

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 30, 2021
@ghost
Copy link

ghost commented Sep 30, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

From Arm Neoverse N1 Software Optimization Guide it is beneficial to align methods at 32B boundary. This will further work well when we start loop alignment at 32B boundaries.

image

As a continuation of #2249, align bigger methods having loops to 32B alignment boundary.

  • I have also added a tighter check on the code size we emit for Arm64 to make sure that there is no misestimation happening for Arm64.
  • There was an assert that would validate the idCodeSize() before printing the instruction which was necessary because the callers never pass the sz and it
Author: kunalspathak
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@kunalspathak
Copy link
Contributor Author

@dotnet/jit-contrib , @jkotas

@echesakov
Copy link
Contributor

echesakov commented Sep 30, 2021

I have also added a tighter check on the code size we emit for Arm64 to make sure that there is no misestimation happening for Arm64.

@kunalspathak Have we ever seen an overestimation issue on Arm64? I thought we shouldn't given that all the instructions have the same size.

@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Sep 30, 2021
@kunalspathak
Copy link
Contributor Author

Have we ever seen an overestimation issue on Arm64? I thought we shouldn't

That's right, but the assert we had was little lose. I updated it for Arm64 to make sure that the memory consumed size and the code size matches.

Copy link
Contributor

@echesakov echesakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

If I read this right, we don't attempt to align methods with R2R?

#ifdef TARGET_XARCH
// For x64/x86, align methods that are "optimizations enabled" to 32 byte boundaries if
#if defined(TARGET_XARCH) || defined(TARGET_ARM64)
// For x64/x86/arm64, align methods that are "optimizations enabled" to 32 byte boundaries if
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, does it make any sense to align methods without loops in terms of performance? I just wonder if we should also enable it for "hot" methods with PGO/callcount data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that but we need to validate if it pays off. Remember that aligning methods to higher boundary is not free and introduces memory fragmentation.

@kunalspathak
Copy link
Contributor Author

If I read this right, we don't attempt to align methods with R2R?

Correct, that is a separate work item.

@kunalspathak
Copy link
Contributor Author

Failures unrelated.

@kunalspathak kunalspathak merged commit 4963df9 into dotnet:main Oct 1, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Nov 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants