Fold some HW nodes for constant input by EgorBo · Pull Request #78929 · dotnet/runtime · GitHub

Conversation

@EgorBo
Member

@EgorBo EgorBo commented Nov 28, 2022

This PR sets up initial infrastructure to fold GT_HWINTRINSIC nodes with constant inputs in the VN phase. I've added Lzcnt and Or, Xor, And as examples, e.g.:

class Prog
{
    Vector128<int> Test1() =>  Vector128.Create(42) | Vector128.Create(1); // is folded to {43;43;43;43}

    uint Test2() => Lzcnt.LeadingZeroCount(314); // is folded to 23
}

We can extend the list on demand.
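
For readers who want to check the folded value for Test2 by hand, here is a minimal C++ sketch of what folding a leading-zero count over a constant amounts to. It is not the JIT's code: `LeadingZeroCount32` and `RunLzcntExamples` are hypothetical names introduced only for illustration, and the compiler's `static_assert` stands in for the VN-phase fold.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the JIT's implementation): counts the leading zero
// bits of a 32-bit value and returns 32 for zero, as the LZCNT instruction does.
constexpr uint32_t LeadingZeroCount32(uint32_t value)
{
    uint32_t count = 0;
    for (uint32_t mask = 0x80000000u; mask != 0 && (value & mask) == 0; mask >>= 1)
    {
        count++;
    }
    return count;
}

// 314 has 9 significant bits, so 32 - 9 = 23 leading zeros.
// The compiler evaluates this at build time, analogous to a constant fold.
static_assert(LeadingZeroCount32(314) == 23, "matches the folded value for Test2");

// Hypothetical usage: the same helper also works at run time.
void RunLzcntExamples()
{
    assert(LeadingZeroCount32(1) == 31);
    assert(LeadingZeroCount32(0) == 32);
}
```

The loop is deliberately simple; a real fold would only need to produce the same value for a known constant operand.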

Main motivation was this case (together with #78783) where I needed BitOperations.Log2 to be foldable for constant input (uses Lzcnt under the hood):

[image]
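
To illustrate why a foldable Lzcnt makes Log2 foldable too, here is a small C++ sketch. It assumes Log2 is computed as `31 ^ lzcnt(value | 1)`; that formula and the names `LeadingZeroCount32` and `Log2ViaLzcnt` are assumptions made for this example, not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: leading zero count of a 32-bit value (32 for zero).
constexpr uint32_t LeadingZeroCount32(uint32_t value)
{
    uint32_t count = 0;
    for (uint32_t mask = 0x80000000u; mask != 0 && (value & mask) == 0; mask >>= 1)
    {
        count++;
    }
    return count;
}

// Assumed shape of an lzcnt-based Log2. OR-ing in the lowest bit maps an input
// of 0 to a result of 0, and keeps the count in [0, 31], where "31 ^ count"
// equals "31 - count".
constexpr int Log2ViaLzcnt(uint32_t value)
{
    return 31 ^ static_cast<int>(LeadingZeroCount32(value | 1u));
}

// Once the count is a compile-time constant, the whole expression is one too.
static_assert(Log2ViaLzcnt(314) == 8, "floor(log2(314)) is 8");
static_assert(Log2ViaLzcnt(0) == 0, "zero maps to zero by construction");

// Hypothetical usage at run time.
void RunLog2Examples()
{
    assert(Log2ViaLzcnt(1) == 0);
    assert(Log2ViaLzcnt(1024) == 10);
}
```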

diffs - only in tests for now, but I need this for my upcoming optimizations in C#

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 28, 2022
@ghost ghost assigned EgorBo Nov 28, 2022
@ghost

ghost commented Nov 28, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


@EgorBo EgorBo marked this pull request as ready for review November 28, 2022 22:33
@EgorBo
Member Author

EgorBo commented Nov 29, 2022

@dotnet/jit-contrib @jakobbotsch can you please take a look? I decided to keep only LZCNT for now (because I needed it) and left room for future folds on demand, including for vectors.

Comment on lines 6064 to 6072
// E.g. for "Sse2.Or(cnsVec1, cnsVec2)" it looks like this:
// case NI_SSE2_Or:
// {
//     simd16_t v1 = GetConstantSimd16(arg0VN);
//     simd16_t v2 = GetConstantSimd16(arg1VN);
//     v1.i64[0] |= v2.i64[0];
//     v1.i64[1] |= v2.i64[1];
//     return VNForSimd16Con(v1);
// }
Member

Is this comment missing some context? Seems out of place

Member Author

I left an example of folding a binary HWIntrinsic VNFunc for anyone who decides to implement it here. (E.g. LLVM folds all vector ops.)
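
As a standalone illustration of the kind of binary fold discussed in this thread, here is a C++ sketch that folds And, Or and Xor over two constant 16-byte vectors lane by lane. `Simd16`, `BinaryOp`, `Broadcast` and `FoldBinary` are hypothetical stand-ins invented for this example; they are not the JIT's types or functions, and the sketch uses four 32-bit lanes rather than the two 64-bit halves in the quoted comment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 16-byte SIMD constant, viewed as four 32-bit lanes.
struct Simd16
{
    uint32_t u32[4];
};

// Hypothetical set of foldable bitwise operations.
enum class BinaryOp { And, Or, Xor };

// Builds a vector with the same value in every lane.
constexpr Simd16 Broadcast(uint32_t v)
{
    return Simd16{{v, v, v, v}};
}

// Folds a bitwise operation over two constant vectors, one lane at a time.
constexpr Simd16 FoldBinary(BinaryOp op, Simd16 a, Simd16 b)
{
    Simd16 r{};
    for (int i = 0; i < 4; i++)
    {
        switch (op)
        {
            case BinaryOp::And: r.u32[i] = a.u32[i] & b.u32[i]; break;
            case BinaryOp::Or:  r.u32[i] = a.u32[i] | b.u32[i]; break;
            case BinaryOp::Xor: r.u32[i] = a.u32[i] ^ b.u32[i]; break;
        }
    }
    return r;
}

// 42 | 1 == 43 in every lane, matching the {43;43;43;43} result in the description.
static_assert(FoldBinary(BinaryOp::Or, Broadcast(42), Broadcast(1)).u32[0] == 43,
              "constant operands fold at compile time");

// Hypothetical usage at run time: check every lane.
void RunVectorFoldExamples()
{
    Simd16 r = FoldBinary(BinaryOp::Or, Broadcast(42), Broadcast(1));
    for (int i = 0; i < 4; i++)
    {
        assert(r.u32[i] == 43);
    }
}
```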

Member

Do you plan to add it? Otherwise I don't really see a reason to create this empty skeleton which might just stay like this forever.

Member Author

@EgorBo EgorBo Nov 30, 2022

No near-term plans, although I did see hits in libraries.pmi for intrinsics with two VNConstant args.
OK, I'll remove it.

@EgorBo EgorBo merged commit 92ad3c3 into dotnet:main Dec 1, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Dec 31, 2022