KEMBAR78
Hardware accelerated implementation of Quaternion multiply by TJHeuvel · Pull Request #96624 · dotnet/runtime · GitHub
Skip to content

Conversation

@TJHeuvel
Copy link
Contributor

@TJHeuvel TJHeuvel commented Jan 8, 2024

I saw on the Discord that Quaternion multiply wasnt yet vectorised, this PR adds that. The non accelerated path is the exact same as before, and the accelerated path returns the same results.

I saw on the Discord that Quaternion multiply wasnt yet vectorised, this PR adds that. The non accelerated path is the exact same as before, and the accelerated path appears to return the same results.
@ghost ghost added community-contribution Indicates that the PR has been added by a community member area-System.Numerics labels Jan 8, 2024
@ghost
Copy link

ghost commented Jan 8, 2024

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Issue Details

I saw on the Discord that Quaternion multiply wasnt yet vectorised, this PR adds that. The non accelerated path is the exact same as before, and the accelerated path returns the same results.

Author: TJHeuvel
Assignees: -
Labels:

area-System.Numerics, community-contribution

Milestone: -

…rnion.cs


Copy  paste error!

Co-authored-by: Austin Wise <AustinWise@gmail.com>
Comment on lines 191 to 205
if (Vector.IsHardwareAccelerated)
{
// Concatenate rotation is actually q2 * q1 instead of q1 * q2.
// So that's why value2 goes q1 and value1 goes q2.
float q1w = value2.W;
float q2w = value1.W;

Vector3 q1V = new Vector3(value2.X, value2.Y, value2.Z);
Vector3 q2V = new Vector3(value1.X, value1.Y, value1.Z);

return new Quaternion(
q1w * q2V + q2w * q1V + Vector3.Cross(q1V, q2V),
q1w * q2w - Vector3.Dot(q1V, q2V)
);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect something like the following to perform better:

if (Vector128.IsHardwareAccelerated)
{
    var left = value1.AsVector128();
    var right = value2.AsVector128();

    var result = right * left.GetElement(3);
    result += (Vector128.Shuffle(right, Vector128.Create(3, 2, 1, 0)) * left.GetElement(0)) * Vector128.Create(+1.0f, -1.0f, +1.0f, -1.0f);
    result += (Vector128.Shuffle(right, Vector128.Create(2, 3, 0, 1)) * left.GetElement(1)) * Vector128.Create(+1.0f, +1.0f, -1.0f, -1.0f);
    result += (Vector128.Shuffle(right, Vector128.Create(1, 0, 3, 2)) * left.GetElement(2)) * Vector128.Create(-1.0f, +1.0f, +1.0f, -1.0f);
    return result.AsQuaternion();
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that is indeed quite a lot faster, and the results still looked correct.

| Method                 | Mean      | Error     | StdDev    |
|----------------------- |----------:|----------:|----------:|
| MulOperator            | 4.5998 ns | 0.0843 ns | 0.0788 ns |
| VectorMul              | 2.9167 ns | 0.0145 ns | 0.0128 ns |
| IntrinsicMulGetElement | 0.8482 ns | 0.0113 ns | 0.0106 ns |

I've modified it to use the GetElementUnsafe method, i saw other Numerics types use this too.

Tanner provided an even quicker version. I've changed it to use GetElementUnsafe as i saw other API's doing the same, and we can reasonably argue these fields to exist.
…rnion.cs

Co-authored-by: Tanner Gooding <tagoo@outlook.com>
@tannergooding
Copy link
Member

Failures are unrelated issues that are already tracked.

@tannergooding tannergooding merged commit db7d269 into dotnet:main Jan 17, 2024
tmds pushed a commit to tmds/runtime that referenced this pull request Jan 23, 2024
)

* Vector based implementation of Quaternion multiply

I saw on the Discord that Quaternion multiply wasnt yet vectorised, this PR adds that. The non accelerated path is the exact same as before, and the accelerated path appears to return the same results.

* Convert tabs to spaces

* Update src/libraries/System.Private.CoreLib/src/System/Numerics/Quaternion.cs

Copy  paste error!

Co-authored-by: Austin Wise <AustinWise@gmail.com>

* Even faster version

Tanner provided an even quicker version. I've changed it to use GetElementUnsafe as i saw other API's doing the same, and we can reasonably argue these fields to exist.

* Update src/libraries/System.Private.CoreLib/src/System/Numerics/Quaternion.cs

Co-authored-by: Tanner Gooding <tagoo@outlook.com>

* Fix incorrect non-hardware accelerated version

---------

Co-authored-by: Austin Wise <AustinWise@gmail.com>
Co-authored-by: Tanner Gooding <tagoo@outlook.com>
@cincuranet
Copy link
Contributor

Looks like we have some small number of regressions for this (rest is not interesting): dotnet/perf-autofiling-issues#27922

cc @TJHeuvel @tannergooding

@github-actions github-actions bot locked and limited conversation to collaborators Mar 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-System.Numerics community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants