CUDA official GCC conflicts #25054

albestro · 2021-07-23T09:31:31Z

Looking at the CUDA conflicts declaration, I realized that there is a mismatch between the CUDA versions and the officially supported GCC versions.

In particular, targeting CUDA 11 on generic x86_64, the official documentation for the various minor versions (11.0, 11.1.0, 11.2.0, 11.3.0, 11.4.0) reports GCC 9.x as the supported version in every case.

From this, together with the following notes excerpted from the official documentation:

(2) Note that starting with CUDA 11.0, the minimum recommended GCC compiler is at least GCC 5 [ed GCC 6 for Cuda 11.4.0] due to C++11 requirements in CUDA libraries e.g. cuFFT and CUB

(3) Minor versions of the following compilers listed: of GCC, ICC, PGI and XLC, as host compilers for nvcc are supported.

I would say that:

CUDA 11 (at the time of writing) works with GCC up to version 9 (all minor versions included);
CUDA [11.0, 11.4) requires GCC 5 as the minimum version;
CUDA 11.4 requires GCC 6 as the minimum version.

As additional information, I quickly checked crt/host_config.h in the CUDA version I have right now (11.0), which contains the following snippet:

#if __GNUC__ > 9
#error -- unsupported GNU version! gcc versions later than 9 are not supported!
#endif /* __GNUC__ > 9 */

which looks quite strict in not supporting newer versions.
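As a rough illustration of how that guard can be read programmatically, here is a minimal Python sketch. The helper name `max_supported_gcc` is hypothetical (it is not part of CUDA or Spack), and the sketch assumes the guard has exactly the `#if __GNUC__ > N` shape quoted above; real headers may wrap it in additional conditions.

```python
import re

# Hypothetical helper: find the newest GCC major version accepted by a
# host_config.h guard of the form `#if __GNUC__ > N`.
GUARD = re.compile(r"#\s*if\s+__GNUC__\s*>\s*(\d+)")


def max_supported_gcc(header_text):
    """Return N from the first `#if __GNUC__ > N` guard, or None if absent."""
    match = GUARD.search(header_text)
    return int(match.group(1)) if match else None


# The snippet quoted above, as found in the CUDA 11.0 header.
sample = """
#if __GNUC__ > 9
#error -- unsupported GNU version! gcc versions later than 9 are not supported!
#endif /* __GNUC__ > 9 */
"""

print(max_supported_gcc(sample))
```

For the CUDA 11.0 snippet this yields 9, i.e. GCC 10 and newer are rejected by the header.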

As a last note, I looked at https://gist.github.com/ax3l/9489132, which is referenced just above the declaration of the CUDA conflicts in Spack, and it says:

[...] Sometimes it is possible to hack the requirements there to get some newer versions working, too :)

which may be (at least partially) at odds with the crt/host_config.h check above. Moreover, the gist has a table listing CUDA compatibility with the different compilers, but it looks incomplete and not fully correct (e.g. it reports 11.1.0 NVCC:11.1.74 as compatible with GCC (5-)6-10.0, which AFAIK is incorrect).

The content of the gist may be useful, and it may be worth putting it somewhere it can be easily updated/fixed (thanks @haampie for the suggestion).

haampie · 2021-07-23T10:06:11Z

lib/spack/spack/build_systems/cuda.py

-    conflicts('%gcc@11:', when='+cuda ^cuda@:11.1.0 target=x86_64:')
+    conflicts('%gcc@:4', when='+cuda ^cuda@11.0.0: target=x86_64:')
+    conflicts('%gcc@:5', when='+cuda ^cuda@11.4.0: target=x86_64:')
+    conflicts('%gcc@10:', when='+cuda ^cuda@11.0.0: target=x86_64:')


This would also mean that newer versions of CUDA would conflict with newer GCC versions, so it's better to have a lower bound on GCC plus an upper bound on CUDA, or the other way around.

We should just make sure that whenever a new CUDA minor version is released, we actually bump the upper bound for CUDA in the conflict rule.

haampie · 2021-07-23T10:22:00Z

I ran this script on x86_64:

$ get_headers.sh
#!/bin/bash -e

cat <<EOF |
8.0-devel-ubuntu16.04
9.0-devel-ubuntu16.04
9.1-devel-ubuntu16.04
9.2-devel-ubuntu16.04
10.0-devel-ubuntu18.04
10.1-devel-ubuntu18.04
10.2-devel-ubuntu18.04
11.0.3-devel-ubuntu18.04
11.1.1-devel-ubuntu18.04
11.2.0-devel-ubuntu18.04
11.2.1-devel-ubuntu18.04
11.2.2-devel-ubuntu18.04
11.3.0-devel-ubuntu18.04
11.3.1-devel-ubuntu18.04
11.4.0-devel-ubuntu18.04
11.5.0-devel-ubuntu18.04
11.6.0-devel-ubuntu18.04
EOF

while read tag
do
    mkdir -p "$tag"
    echo "$tag"
    docker run --rm "nvidia/cuda:$tag" bash -c 'cat /usr/local/cuda-*.*/targets/x86_64-linux/include/host_config.h /usr/local/cuda-*.*/targets/x86_64-linux/include/crt/host_config.h' > "$tag/host_config.h" || true
done

and grepping that header file I get:

$ grep unsupported */host_config.h | grep -E '(gcc|clang)' | sort -h
8.0     #error -- unsupported GNU version! gcc versions later than 5 are not supported!
9.0     #error -- unsupported GNU version! gcc versions later than 6 are not supported!
9.1     #error -- unsupported GNU version! gcc versions later than 6 are not supported!
9.2     #error -- unsupported GNU version! gcc versions later than 7 are not supported!
10.0    #error -- unsupported GNU version! gcc versions later than 7 are not supported!
10.1    #error -- unsupported clang version! clang version must be less than 9 and greater than 3.2
10.1    #error -- unsupported GNU version! gcc versions later than 8 are not supported!
10.2    #error -- unsupported clang version! clang version must be less than 9 and greater than 3.2
10.2    #error -- unsupported GNU version! gcc versions later than 8 are not supported!
11.0.3  #error -- unsupported clang version! clang version must be less than 10 and greater than 3.2
11.0.3  #error -- unsupported GNU version! gcc versions later than 9 are not supported!
11.1.1  #error -- unsupported clang version! clang version must be less than 11 and greater than 3.2
11.1.1  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.2.0  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.2.0  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.2.1  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.2.1  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.2.2  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.2.2  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.3.0  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.3.0  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.3.1  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.3.1  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.4.0  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.4.0  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.5.0  #error -- unsupported clang version! clang version must be less than 13 and greater than 3.2
11.5.0  #error -- unsupported GNU version! gcc versions later than 11 are not supported!
11.6.0  #error -- unsupported clang version! clang version must be less than 14 and greater than 3.2
11.6.0  #error -- unsupported GNU version! gcc versions later than 11 are not supported!

So for GCC:

    conflicts( '%gcc@6:', when='+cuda ^cuda@:8.0')
    conflicts( '%gcc@7:', when='+cuda ^cuda@:9.1')
    conflicts( '%gcc@8:', when='+cuda ^cuda@:10.0')
    conflicts( '%gcc@9:', when='+cuda ^cuda@:10.2')
    conflicts('%gcc@10:', when='+cuda ^cuda@:11.0')
    conflicts('%gcc@11:', when='+cuda ^cuda@:11.4')
    conflicts('%gcc@12:', when='+cuda ^cuda@:11.6')

And clang:

    conflicts( '%clang@9:', when='+cuda ^cuda@:10.2')
    conflicts('%clang@10:', when='+cuda ^cuda@:11.0')
    conflicts('%clang@11:', when='+cuda ^cuda@:11.1')
    conflicts('%clang@12:', when='+cuda ^cuda@:11.4')
    conflicts('%clang@13:', when='+cuda ^cuda@:11.5')
    conflicts('%clang@14:', when='+cuda ^cuda@:11.6')
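The GCC list above can be derived mechanically from the grep output. Below is a minimal Python sketch of that derivation; the table and the function name `upper_bound_conflicts` are illustrative only (not Spack code), patch versions are collapsed to major.minor as in the rules above, and the table is assumed to be sorted by CUDA version with non-decreasing limits.

```python
# Newest supported GCC major per CUDA release, copied from the grep output above
# (patch versions collapsed to major.minor).
MAX_GCC = [
    ("8.0", 5), ("9.0", 6), ("9.1", 6), ("9.2", 7), ("10.0", 7),
    ("10.1", 8), ("10.2", 8), ("11.0", 9), ("11.1", 10), ("11.2", 10),
    ("11.3", 10), ("11.4", 10), ("11.5", 11), ("11.6", 11),
]


def upper_bound_conflicts(table, compiler="gcc"):
    """Emit one conflicts() line per run of CUDA releases sharing a limit.

    For each run, the first unsupported compiler major (limit + 1) conflicts
    with every CUDA version up to the last release in that run.
    """
    lines = []
    for i, (cuda, limit) in enumerate(table):
        last_of_run = i + 1 == len(table) or table[i + 1][1] != limit
        if last_of_run:
            lines.append(
                "conflicts('%{0}@{1}:', when='+cuda ^cuda@:{2}')".format(
                    compiler, limit + 1, cuda
                )
            )
    return lines


for line in upper_bound_conflicts(MAX_GCC):
    print(line)
```

Each rule pairs a lower bound on the compiler with an upper bound on CUDA, so a CUDA release newer than the last table entry is not constrained until the table is extended.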

Should we just specify this on the minor versions only, @ax3l? That would simplify life a bit...

albestro · 2021-07-23T11:45:46Z

$ grep unsupported */host_config.h | grep -E '(gcc|clang)' | sort -h
8.0     #error -- unsupported GNU version! gcc versions later than 5 are not supported!
9.0     #error -- unsupported GNU version! gcc versions later than 6 are not supported!
9.1     #error -- unsupported GNU version! gcc versions later than 6 are not supported!
9.2     #error -- unsupported GNU version! gcc versions later than 7 are not supported!
10.0    #error -- unsupported GNU version! gcc versions later than 7 are not supported!
10.1    #error -- unsupported clang version! clang version must be less than 9 and greater than 3.2
10.1    #error -- unsupported GNU version! gcc versions later than 8 are not supported!
10.2    #error -- unsupported clang version! clang version must be less than 9 and greater than 3.2
10.2    #error -- unsupported GNU version! gcc versions later than 8 are not supported!
11.0.3  #error -- unsupported clang version! clang version must be less than 10 and greater than 3.2
11.0.3  #error -- unsupported GNU version! gcc versions later than 9 are not supported!
11.1.1  #error -- unsupported clang version! clang version must be less than 11 and greater than 3.2
11.1.1  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.2.0  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.2.0  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.2.1  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.2.1  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.2.2  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.2.2  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.3.0  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.3.0  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.3.1  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.3.1  #error -- unsupported GNU version! gcc versions later than 10 are not supported!
11.4.0  #error -- unsupported clang version! clang version must be less than 12 and greater than 3.2
11.4.0  #error -- unsupported GNU version! gcc versions later than 10 are not supported!

Nice job @haampie!

I quickly checked the documentation of previous CUDA versions (<11) and, at least for GCC, the range of supported versions is not stated as explicitly. In particular, I'm not really sure about the minimum GCC requirement for CUDA<11 (it may be related to C++11), but it is definitely stated for CUDA 11 (GCC 5, and GCC 6 starting from 11.4).

At the very least we should fix the allowed GCC range for CUDA 11. I don't know if you also want to touch the others (IMHO it is fine to fix them as per the output provided by @haampie, which means a range for Clang and an open range with an upper bound for GCC on CUDA<11).

I'm waiting for feedback from others (@ax3l?) on how to proceed; as soon as we agree, I'll update the code changes in this PR.

haampie · 2021-07-26T09:20:33Z

@albestro let's get a PR in that fixes the issue with CUDA 11.x and review the other versions in a separate thread.

So these lower bounds for GCC:

    conflicts('%gcc@:4', when='+cuda ^cuda@11.0:')
    conflicts('%gcc@:5', when='+cuda ^cuda@11.4:')

and these upper bounds for GCC:

    conflicts('%gcc@10:', when='+cuda ^cuda@:11.0')
    conflicts('%gcc@11:', when='+cuda ^cuda@:11.4')

They hold for x86_64, ppc64le, and arm64, since the host_config.h header is exactly the same on these architectures.
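To make the combined effect of the four rules concrete, here is a small plain-Python sketch that evaluates them for a given GCC/CUDA pair. It only approximates Spack's range matching by comparing on major.minor, and the function names are hypothetical.

```python
def parse(version):
    """Turn '11.4.0' into (11, 4, 0) so versions compare as tuples."""
    return tuple(int(part) for part in version.split("."))


def gcc_conflicts_with_cuda(gcc, cuda):
    """Apply the four CUDA 11.x rules listed above; True means 'conflict'."""
    gcc_major = parse(gcc)[0]
    cuda_mm = parse(cuda)[:2]  # compare on major.minor only (an approximation)
    if cuda_mm >= (11, 0) and gcc_major <= 4:
        return True  # %gcc@:4  with cuda@11.0:
    if cuda_mm >= (11, 4) and gcc_major <= 5:
        return True  # %gcc@:5  with cuda@11.4:
    if cuda_mm <= (11, 0) and gcc_major >= 10:
        return True  # %gcc@10: with cuda@:11.0
    if cuda_mm <= (11, 4) and gcc_major >= 11:
        return True  # %gcc@11: with cuda@:11.4
    return False


for gcc, cuda in [("9.3.0", "11.0.3"), ("10.2.0", "11.0.3"), ("5.4.0", "11.4.0")]:
    print(gcc, cuda, gcc_conflicts_with_cuda(gcc, cuda))
```

Under these rules, GCC 9.3.0 is accepted with CUDA 11.0.3, while GCC 10.2.0 with CUDA 11.0.3 and GCC 5.4.0 with CUDA 11.4.0 are both rejected.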

my current script

downloading

#!/usr/bin/env bash

set -e

cat <<-EOF |
nvidia/cuda-arm64:11.0.3-devel-ubuntu18.04
nvidia/cuda-arm64:11.1.1-devel-ubuntu18.04
nvidia/cuda-arm64:11.2.0-devel-ubuntu18.04
nvidia/cuda-arm64:11.2.1-devel-ubuntu18.04
nvidia/cuda-arm64:11.2.2-devel-ubuntu18.04
nvidia/cuda-arm64:11.3.0-devel-ubuntu18.04
nvidia/cuda-arm64:11.3.1-devel-ubuntu18.04
nvidia/cuda-arm64:11.4.0-devel-ubuntu18.04
nvidia/cuda-ppc64le:8.0-devel-ubuntu16.04
nvidia/cuda-ppc64le:9.0-devel-ubuntu16.04
nvidia/cuda-ppc64le:9.1-devel-ubuntu16.04
nvidia/cuda-ppc64le:9.2-devel-ubuntu16.04
nvidia/cuda-ppc64le:10.0-devel-ubuntu18.04
nvidia/cuda-ppc64le:10.1-devel-ubuntu18.04
nvidia/cuda-ppc64le:10.2-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.0.3-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.1.1-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.2.0-devel
nvidia/cuda-ppc64le:11.2.1-devel
nvidia/cuda-ppc64le:11.2.2-devel
nvidia/cuda-ppc64le:11.3.0-devel-centos8
nvidia/cuda-ppc64le:11.3.1-devel
nvidia/cuda-ppc64le:11.4.0-devel
nvidia/cuda:8.0-devel-ubuntu16.04
nvidia/cuda:9.0-devel-ubuntu16.04
nvidia/cuda:9.1-devel-ubuntu16.04
nvidia/cuda:9.2-devel-ubuntu16.04
nvidia/cuda:10.0-devel-ubuntu18.04
nvidia/cuda:10.1-devel-ubuntu18.04
nvidia/cuda:10.2-devel-ubuntu18.04
nvidia/cuda:11.0.3-devel-ubuntu18.04
nvidia/cuda:11.1.1-devel-ubuntu18.04
nvidia/cuda:11.2.0-devel-ubuntu18.04
nvidia/cuda:11.2.1-devel-ubuntu18.04
nvidia/cuda:11.2.2-devel-ubuntu18.04
nvidia/cuda:11.3.0-devel-ubuntu18.04
nvidia/cuda:11.3.1-devel-ubuntu18.04
nvidia/cuda:11.4.0-devel-ubuntu18.04
EOF

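while read -r image
do
	echo "$image"
	mkdir -p "$image"
	rootfs="/dev/shm/rootfs"
	# Start from an empty rootfs for every image
	unshare -r rm -rf "$rootfs" && mkdir "$rootfs"
	# Export the image's filesystem without running a container
	docker export "$(docker create "$image")" | tar -C "$rootfs" -xf -
	# Not every CUDA version ships both files, so tolerate a missing one
	cat "$rootfs"/usr/local/cuda-*.*/targets/*/include/host_config.h "$rootfs"/usr/local/cuda-*.*/targets/*/include/crt/host_config.h > "$image/host_config.h" || true
done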

comparing header files

#!/usr/bin/env bash

set -e

cat <<-EOF |
nvidia/cuda-arm64:11.0.3-devel-ubuntu18.04   nvidia/cuda:11.0.3-devel-ubuntu18.04
nvidia/cuda-arm64:11.1.1-devel-ubuntu18.04   nvidia/cuda:11.1.1-devel-ubuntu18.04
nvidia/cuda-arm64:11.2.0-devel-ubuntu18.04   nvidia/cuda:11.2.0-devel-ubuntu18.04
nvidia/cuda-arm64:11.2.1-devel-ubuntu18.04   nvidia/cuda:11.2.1-devel-ubuntu18.04
nvidia/cuda-arm64:11.2.2-devel-ubuntu18.04   nvidia/cuda:11.2.2-devel-ubuntu18.04
nvidia/cuda-arm64:11.3.0-devel-ubuntu18.04   nvidia/cuda:11.3.0-devel-ubuntu18.04
nvidia/cuda-arm64:11.3.1-devel-ubuntu18.04   nvidia/cuda:11.3.1-devel-ubuntu18.04
nvidia/cuda-arm64:11.4.0-devel-ubuntu18.04   nvidia/cuda:11.4.0-devel-ubuntu18.04
nvidia/cuda-ppc64le:8.0-devel-ubuntu16.04    nvidia/cuda:8.0-devel-ubuntu16.04
nvidia/cuda-ppc64le:9.0-devel-ubuntu16.04    nvidia/cuda:9.0-devel-ubuntu16.04
nvidia/cuda-ppc64le:9.1-devel-ubuntu16.04    nvidia/cuda:9.1-devel-ubuntu16.04
nvidia/cuda-ppc64le:9.2-devel-ubuntu16.04    nvidia/cuda:9.2-devel-ubuntu16.04
nvidia/cuda-ppc64le:10.0-devel-ubuntu18.04   nvidia/cuda:10.0-devel-ubuntu18.04
nvidia/cuda-ppc64le:10.1-devel-ubuntu18.04   nvidia/cuda:10.1-devel-ubuntu18.04
nvidia/cuda-ppc64le:10.2-devel-ubuntu18.04   nvidia/cuda:10.2-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.0.3-devel-ubuntu18.04 nvidia/cuda:11.0.3-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.1.1-devel-ubuntu18.04 nvidia/cuda:11.1.1-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.2.0-devel             nvidia/cuda:11.2.0-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.2.1-devel             nvidia/cuda:11.2.1-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.2.2-devel             nvidia/cuda:11.2.2-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.3.0-devel-centos8     nvidia/cuda:11.3.0-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.3.1-devel             nvidia/cuda:11.3.1-devel-ubuntu18.04
nvidia/cuda-ppc64le:11.4.0-devel             nvidia/cuda:11.4.0-devel-ubuntu18.04
EOF

# Each input line holds two image names; compare their extracted headers
while read -r left right
do
    diff "$left/host_config.h" "$right/host_config.h"
done

Running the latter, I don't get any diff, so all header files are the same across archs.
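For reference, the same comparison can be sketched in Python with the standard-library `filecmp` module. The function name `differing_pairs` is hypothetical, and the sketch assumes the directory layout produced by the download script (one `host_config.h` per image directory).

```python
import filecmp
from pathlib import Path


def differing_pairs(pairs, root="."):
    """Return the (left, right) image pairs whose host_config.h files differ."""
    different = []
    for left, right in pairs:
        a = Path(root) / left / "host_config.h"
        b = Path(root) / right / "host_config.h"
        # shallow=False compares file contents rather than only os.stat() data.
        if not filecmp.cmp(a, b, shallow=False):
            different.append((left, right))
    return different


# Example call, assuming the headers were extracted as in the script above:
# differing_pairs([("nvidia/cuda-arm64:11.4.0-devel-ubuntu18.04",
#                   "nvidia/cuda:11.4.0-devel-ubuntu18.04")])
```

An empty result corresponds to the "no diff" outcome reported here.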

albestro · 2021-07-26T12:38:10Z

I've removed the existing duplication for CUDA 11 across the available platforms (x86_64 and ppc64le) and added a small section on top.

In the end, this PR updates the CUDA 11 compatibility with GCC: the lower bound is raised to GCC 6 for CUDA >= 11.4, and the upper bound is relaxed so that GCC 10.x is accepted up to CUDA 11.4.

I also put a note there about keeping the upper bound up to date. After our discussions, it looks to me like the same decision had probably been taken before (keeping an upper bound so as not to constrain newer CUDA versions), but the bound was then not updated when new CUDA versions were released.

Is there any suggestion on how/where to put a reminder for this, i.e. "do this and that when the CUDA version is updated"?

@haampie @ax3l

ax3l

Thanks a lot, and I also appreciate the great scripting.

I would also mark GCC 10 as a conflict for CUDA <11.4.1 due to an incompatibility in a stdlib, even though Nvidia advertises otherwise:
https://gist.github.com/ax3l/9489132#gistcomment-3860114

Let's get this in? cc @haampie

ax3l · 2021-09-04T07:09:28Z

This probably needs a rebase. Sorry for being so busy.

albestro · 2021-09-08T15:59:41Z

@haampie @ax3l Just rebased. Please check that I did it as expected.

I also partially rephrased the note in the comment to make it clearer. Please check that comment too.

ax3l

Thx! :)

spackbot-app bot added build-systems conflicts labels Jul 23, 2021

haampie reviewed Jul 23, 2021

alalazo assigned haampie Jul 26, 2021

alalazo requested a review from ax3l July 26, 2021 08:19

albestro force-pushed the alby/cuda_gcc_conflicts_mapping branch from d37128b to 16977d1 Compare July 26, 2021 12:30

albestro requested a review from haampie July 26, 2021 16:29

ax3l previously approved these changes Sep 4, 2021

albestro added 3 commits September 8, 2021 17:36

update CUDA 11 / GCC compatibility range

151d546

additional unofficial conflict

d03c102

minor changes to comments

8ccae20

albestro dismissed ax3l’s stale review via 8ccae20 September 8, 2021 15:55

albestro force-pushed the alby/cuda_gcc_conflicts_mapping branch from 16977d1 to 8ccae20 Compare September 8, 2021 15:55

haampie approved these changes Sep 8, 2021

ax3l approved these changes Sep 9, 2021

ax3l merged commit 59d8031 into spack:develop Sep 9, 2021

albestro deleted the alby/cuda_gcc_conflicts_mapping branch September 10, 2021 04:48

haampie mentioned this pull request Jan 18, 2022

CUDA: add v11.6.0 #28439

Merged

albestro mentioned this pull request Feb 18, 2022

Add back CUDA conflicts for GCC and Clang + Add CUDA 11.4.3 and 11.4.4 #29076

Merged
