Decouple Atomic from host on MSVC #43
Conversation
Looks good overall pending the first review comment; please ping me when you un-WIP this.
```cpp
    return __cxx_atomic_alignment_unwrap(detail::__atomic_load_n_cuda(&__a->__a_value, __order, detail::__scope_tag<_Sco>()));
#else
    return __cxx_atomic_alignment_unwrap(::std::atomic_load_explicit(&__a->__a_value, (::std::memory_order)__order));
    alignas(_Tp) unsigned char __buf[sizeof(_Tp)];
```
@griwes it looks like I tore off another alignment unwrap here.
Are these char* output buffers a code smell? They were added to fix initialization warnings.
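For context, here is a minimal sketch of the pattern in question (the names and plumbing are illustrative, not the actual libcudacxx internals): the generic `__atomic_load` builtin writes its result into caller-provided storage, so an aligned byte buffer sidesteps default-initializing a `_Tp` just to overwrite it, which is where the initialization warnings came from.

```cpp
// Hypothetical standalone version of the aligned-buffer load; the real
// code sits inside the __cxx_atomic_* layer.
template <class _Tp>
_Tp __load_via_buffer(_Tp* __ptr, int __order) {
    // Uninitialized, suitably aligned byte storage: no _Tp default
    // constructor runs, so no "may be used uninitialized" warning.
    alignas(_Tp) unsigned char __buf[sizeof(_Tp)];
    auto* __dest = reinterpret_cast<_Tp*>(__buf);
    // The generic builtin stores the loaded value into *__dest.
    __atomic_load(__ptr, __dest, __order);
    return *__dest;
}
```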
Why not use __atomic_load_n here instead?
__atomic_load_n does not work for non-integral non-pointer types: https://gcc.godbolt.org/z/osrcs1
Maybe I'm mistaken in some way about its usage?
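A minimal illustration of the limitation (hypothetical types, mirroring the godbolt link): the `_n` variants of the GCC atomic builtins accept only integral and pointer types, while the generic `__atomic_load` takes an output pointer and works for any trivially copyable type of a supported size.

```cpp
struct Pair { int first; int second; };  // trivially copyable, 8 bytes

Pair load_pair(Pair* p) {
    // Does not compile: __atomic_load_n requires an integral or pointer type.
    // Pair v = __atomic_load_n(p, __ATOMIC_SEQ_CST);

    // The generic form works: the result is written through the second argument.
    Pair v;
    __atomic_load(p, &v, __ATOMIC_SEQ_CST);
    return v;
}
```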
Force-pushed 149fcee to 264a9f2.
Testing on a GV100 looks good: no failures, with 1 unsupported test. I'll kick off a CI run now. I've replaced the […]; I'm unsure whether that is the correct thing to do, however.
```cpp
    return detail::__atomic_fetch_add_cuda(&__a->__a_value, __delta, __order, detail::__scope_tag<_Sco>());
#else
    return ::std::atomic_fetch_add_explicit(&__a->__a_value, __delta, (::std::memory_order)__order);
    return __atomic_fetch_add(&__a->__a_value, __delta * __skip_amt<_Tp*>::value, __order);
```
It doesn't seem right to me that we should need the skip amount in this layer. The layer below should be doing that.
Basically, every deviation between the CUDA_ARCH side and this side looks like a bug to me.
> It doesn't seem right to me that we should need the skip amount in this layer. The layer below should be doing that.

I'd agree, but there is no layer below at the compiler-intrinsic level for GCC; at that point, incrementing by sizeof(_Tp) is necessary (see the sketch below). https://github.com/NVIDIA/libcudacxx/blob/main/libcxx/include/atomic#L846

> Basically, every deviation between the CUDA_ARCH side and this side looks like a bug to me.

@griwes, @jrhemstad, and I had a meeting today about how we could resolve some of this with a better platform-layering framework. There are some neat ideas on the table for making this nesting-doll arrangement a bit cleaner.

It would help to know ahead of time which things are being done wrong.
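To make the skip-amount point concrete, here is a sketch under the assumption above (illustrative names, not the library's code): GCC's raw `__atomic_fetch_add` on a pointer advances it by a number of bytes, unlike `std::atomic<T*>::fetch_add`, which advances by elements, so the byte-level layer has to scale by `sizeof(T)` itself.

```cpp
#include <cstddef>

// Hypothetical wrapper showing the scaling that __skip_amt encodes.
template <class T>
T* fetch_add_elements(T** slot, std::ptrdiff_t delta) {
    // std::atomic<T*>::fetch_add(delta) would move delta *elements*;
    // the raw builtin moves delta *bytes*, hence the sizeof(T) factor.
    return __atomic_fetch_add(slot, delta * static_cast<std::ptrdiff_t>(sizeof(T)),
                              __ATOMIC_SEQ_CST);
}
```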
I think this looks good.
…/sub; make msvc sfinae a tad less repetitive
…clude to <atomic>
Force-pushed 264a9f2 to 7d802a6.
Atomic decouple builds cleanly on CI. SC: 29322243.2