Consider implementing ARM64 `__load_acquire`/`__stlr` intrinsics · Issue #62103 · llvm/llvm-project · GitHub
@StephanTLavavej

As of VS 2022 17.6 Preview 3, MSVC supports the following ARM64 intrinsics used by its STL:

unsigned __int8  __load_acquire8 (const volatile unsigned __int8  * _Target);
unsigned __int16 __load_acquire16(const volatile unsigned __int16 * _Target);
unsigned __int32 __load_acquire32(const volatile unsigned __int32 * _Target);
unsigned __int64 __load_acquire64(const volatile unsigned __int64 * _Target);

void __stlr8 (volatile unsigned __int8  * _Target, unsigned __int8  _Value);
void __stlr16(volatile unsigned __int16 * _Target, unsigned __int16 _Value);
void __stlr32(volatile unsigned __int32 * _Target, unsigned __int32 _Value);
void __stlr64(volatile unsigned __int64 * _Target, unsigned __int64 _Value);

According to my understanding, the `__load_acquire` intrinsics emit either the `ldar` or the `ldapr` instruction (according to criteria that are beyond my cat-sized brain 🐱 🧠), while the `__stlr` intrinsics emit the `stlr` instruction. These are significantly more efficient than what was previously possible.
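For illustration only, here is a minimal sketch of the semantics being requested, written with the GCC/Clang `__atomic` builtins rather than the MSVC intrinsics. The names `load_acquire32_sketch`, `store_release32_sketch`, and `round_trip_sketch` are hypothetical and are not part of any compiler or of MSVC's STL. On AArch64 targets, an acquire load written this way is typically lowered to `ldar` (or `ldapr` when the RCpc extension is enabled) and a release store to `stlr`, but that depends on the compiler and target flags, so treat it as an assumption rather than a guarantee.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for __load_acquire32: an acquire load from plain
// (non-std::atomic) memory, expressed with a GCC/Clang builtin.
inline std::uint32_t load_acquire32_sketch(const volatile std::uint32_t* target) {
    return __atomic_load_n(target, __ATOMIC_ACQUIRE);
}

// Hypothetical stand-in for __stlr32: a release store to plain memory.
inline void store_release32_sketch(volatile std::uint32_t* target,
                                   std::uint32_t value) {
    __atomic_store_n(target, value, __ATOMIC_RELEASE);
}

// Single-threaded round trip: it only checks that the stored value is the
// one read back, and says nothing about cross-thread ordering.
inline std::uint32_t round_trip_sketch(std::uint32_t value) {
    volatile std::uint32_t slot = 0;
    store_release32_sketch(&slot, value);
    const std::uint32_t seen = load_acquire32_sketch(&slot);
    assert(seen == value);
    return seen;
}
```

The 8-, 16-, and 64-bit variants would follow the same pattern with the corresponding fixed-width types. Whether Clang should implement the intrinsics this way is a separate question; this only shows the load-acquire/store-release behavior they are understood to provide.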

Currently, MSVC's STL is using its classic (slower) codepaths for Clang/LLVM ARM64. It would be nice if Clang added support for the new faster intrinsics.
