Deprecate and Replace `cub::BFE` by fbusato · Pull Request #4031 · NVIDIA/cccl · GitHub

Conversation

@fbusato
Contributor

@fbusato fbusato commented Mar 6, 2025

Fixes #4025

Description

Deprecate cub::BFE and replace it with the new <cuda/bit> functionality; see nvidia.github.io/cccl/libcudacxx/extended_api/bit.html

The initial version of this PR surfaced the following problems:

  • MSVC triggers an unused-variable warning in bitfield.h
  • catch2_test_device_segmented_radix_sort_keys.cu includes tests with a bit width of 0
  • catch2_test_block_radix_sort.cu includes tests with a bit width of 0
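To make the zero-bit-width items above concrete, here is a minimal host-side sketch of what a bitfield extraction computes. The helper `extract_bits` and its assertions are hypothetical and written for illustration only; this is not the CCCL implementation of `cuda::bitfield_extract`, and the exact preconditions of the library function should be taken from its documentation. The sketch assumes that a width of 0 is treated as a precondition violation, which matches what the range checks reported in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper for illustration, not the CCCL implementation.
// Returns `width` bits of `value`, starting at bit `start` (bit 0 is the
// least significant bit), shifted down to the low end of the result.
inline std::uint32_t extract_bits(std::uint32_t value, int start, int width)
{
  const int digits = std::numeric_limits<std::uint32_t>::digits; // 32

  // Assumed preconditions: the field is non-empty and lies inside the word.
  assert(start >= 0 && start < digits);
  assert(width > 0 && start + width <= digits);

  // Build a mask of `width` ones; a full-width shift would be undefined,
  // so that case is handled separately.
  const std::uint32_t mask =
    (width == digits) ? ~std::uint32_t{0} : ((std::uint32_t{1} << width) - 1u);

  return (value >> start) & mask;
}
```

For example, `extract_bits(0xABCD, 4, 8)` yields `0xBC`. A call with `width == 0` trips the second assertion in a debug build, which is the kind of configuration the two test files listed above were generating.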

Performance comparison of PTX BFE vs. cuda::bitfield_extract with SM80. TL;DR: cuda::bitfield_extract is on par or slightly faster (by about 1% at most).

[0] NVIDIA RTX A6000

| T{ct} | OffsetT{ct} | Elements{io} | Entropy | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|-------|-------------|--------------|---------|----------|-----------|----------|-----------|------|-------|--------|
| I8 | I32 | 2^28 | 1 | 3.667 ms | 0.91% | 3.646 ms | 0.85% | -21.558 us | -0.59% | SAME |
| I8 | I32 | 2^28 | 0.544 | 3.589 ms | 1.07% | 3.573 ms | 0.96% | -15.195 us | -0.42% | SAME |
| I8 | I32 | 2^28 | 0.201 | 3.572 ms | 0.43% | 3.566 ms | 0.73% | -5.587 us | -0.16% | SAME |
| I8 | I64 | 2^28 | 1 | 3.891 ms | 0.60% | 3.887 ms | 0.57% | -4.424 us | -0.11% | SAME |
| I8 | I64 | 2^28 | 0.544 | 3.816 ms | 0.55% | 3.812 ms | 0.63% | -3.316 us | -0.09% | SAME |
| I8 | I64 | 2^28 | 0.201 | 3.824 ms | 0.29% | 3.811 ms | 0.28% | -13.288 us | -0.35% | FAST |
| I16 | I32 | 2^28 | 1 | 8.461 ms | 0.45% | 8.441 ms | 0.51% | -20.436 us | -0.24% | SAME |
| I16 | I32 | 2^28 | 0.544 | 8.349 ms | 0.37% | 8.304 ms | 0.33% | -45.869 us | -0.55% | FAST |
| I16 | I32 | 2^28 | 0.201 | 8.239 ms | 0.37% | 8.173 ms | 0.36% | -66.614 us | -0.81% | FAST |
| I16 | I64 | 2^28 | 1 | 8.561 ms | 0.53% | 8.504 ms | 0.47% | -57.116 us | -0.67% | FAST |
| I16 | I64 | 2^28 | 0.544 | 8.410 ms | 0.37% | 8.335 ms | 0.59% | -75.033 us | -0.89% | FAST |
| I16 | I64 | 2^28 | 0.201 | 8.271 ms | 0.37% | 8.184 ms | 0.38% | -86.897 us | -1.05% | FAST |
| I32 | I32 | 2^28 | 1 | 14.839 ms | 0.40% | 14.847 ms | 0.39% | 8.064 us | 0.05% | SAME |
| I32 | I32 | 2^28 | 0.544 | 14.883 ms | 0.53% | 14.896 ms | 0.53% | 13.054 us | 0.09% | SAME |
| I32 | I32 | 2^28 | 0.201 | 14.871 ms | 0.51% | 14.878 ms | 0.51% | 6.633 us | 0.04% | SAME |
| I32 | I64 | 2^28 | 1 | 14.848 ms | 0.58% | 14.855 ms | 0.63% | 6.919 us | 0.05% | SAME |
| I32 | I64 | 2^28 | 0.544 | 14.906 ms | 0.60% | 14.905 ms | 0.55% | -1.031 us | -0.01% | SAME |
| I32 | I64 | 2^28 | 0.201 | 14.900 ms | 0.45% | 14.912 ms | 0.52% | 11.272 us | 0.08% | SAME |
| I64 | I32 | 2^28 | 1 | 56.083 ms | 0.16% | 56.086 ms | 0.17% | 2.822 us | 0.01% | SAME |
| I64 | I32 | 2^28 | 0.544 | 55.997 ms | 0.49% | 55.995 ms | 0.50% | -1.792 us | -0.00% | SAME |
| I64 | I32 | 2^28 | 0.201 | 55.845 ms | 0.36% | 55.846 ms | 0.36% | 0.610 us | 0.00% | SAME |
| I64 | I64 | 2^28 | 1 | 56.104 ms | 0.51% | 56.102 ms | 0.51% | -1.453 us | -0.00% | SAME |
| I64 | I64 | 2^28 | 0.544 | 56.008 ms | 0.50% | 56.008 ms | 0.50% | 0.083 us | 0.00% | SAME |
| I64 | I64 | 2^28 | 0.201 | 55.854 ms | 0.36% | 55.851 ms | 0.36% | -2.967 us | -0.01% | SAME |
| I128 | I32 | 2^28 | 1 | 217.763 ms | 0.03% | 217.760 ms | 0.02% | -3.585 us | -0.00% | SAME |
| I128 | I32 | 2^28 | 0.544 | 217.483 ms | 0.29% | 217.490 ms | 0.29% | 7.103 us | 0.00% | SAME |
| I128 | I32 | 2^28 | 0.201 | 217.180 ms | 0.03% | 217.195 ms | 0.03% | 15.138 us | 0.01% | SAME |
| I128 | I64 | 2^28 | 1 | 217.926 ms | 0.23% | 217.915 ms | 0.23% | -11.061 us | -0.01% | SAME |
| I128 | I64 | 2^28 | 0.544 | 217.440 ms | 0.29% | 217.449 ms | 0.29% | 9.034 us | 0.00% | SAME |
| I128 | I64 | 2^28 | 0.201 | 217.128 ms | 0.03% | 217.127 ms | 0.03% | -1.376 us | -0.00% | SAME |
| F32 | I32 | 2^28 | 1 | 14.925 ms | 1.21% | 14.930 ms | 1.28% | 4.824 us | 0.03% | SAME |
| F32 | I32 | 2^28 | 0.544 | 14.933 ms | 0.57% | 14.930 ms | 0.55% | -2.420 us | -0.02% | SAME |
| F32 | I32 | 2^28 | 0.201 | 14.952 ms | 0.48% | 14.953 ms | 0.51% | 0.608 us | 0.00% | SAME |
| F32 | I64 | 2^28 | 1 | 14.837 ms | 0.67% | 14.844 ms | 0.78% | 7.563 us | 0.05% | SAME |
| F32 | I64 | 2^28 | 0.544 | 14.917 ms | 0.68% | 14.914 ms | 0.58% | -2.788 us | -0.02% | SAME |
| F32 | I64 | 2^28 | 0.201 | 14.966 ms | 0.50% | 14.967 ms | 0.46% | 0.405 us | 0.00% | SAME |
| F64 | I32 | 2^28 | 1 | 56.097 ms | 0.17% | 56.100 ms | 0.17% | 2.593 us | 0.00% | SAME |
| F64 | I32 | 2^28 | 0.544 | 56.032 ms | 0.50% | 56.034 ms | 0.49% | 1.903 us | 0.00% | SAME |
| F64 | I32 | 2^28 | 0.201 | 55.862 ms | 0.35% | 55.860 ms | 0.35% | -1.766 us | -0.00% | SAME |
| F64 | I64 | 2^28 | 1 | 56.130 ms | 0.50% | 56.132 ms | 0.51% | 2.047 us | 0.00% | SAME |
| F64 | I64 | 2^28 | 0.544 | 56.035 ms | 0.50% | 56.033 ms | 0.49% | -2.434 us | -0.00% | SAME |
| F64 | I64 | 2^28 | 0.201 | 55.869 ms | 0.35% | 55.871 ms | 0.35% | 1.195 us | 0.00% | SAME |

Summary

  • Total Matches: 42
    • Pass (diff <= min_noise): 36
    • Unknown (infinite noise): 0
    • Failure (diff > min_noise): 6
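The Status column and the summary counts appear to follow the rule stated in the legend: a row passes when the relative difference stays within the smaller of the two noise values. The sketch below reconstructs that rule from the legend and the table rows only; the function `classify` is hypothetical, it is not the comparison tool's actual code, and the `SLOW` label for positive out-of-noise differences is an assumption (no such row appears in this table).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>

// Hypothetical classifier reconstructed from the summary legend.
// All inputs are percentages as printed in the table (e.g. -0.35 for -0.35%).
inline std::string classify(double diff_pct, double ref_noise_pct, double cmp_noise_pct)
{
  // The legend compares the difference against the minimum noise.
  const double min_noise = std::min(ref_noise_pct, cmp_noise_pct);

  if (std::fabs(diff_pct) <= min_noise)
  {
    return "SAME"; // counted as "Pass (diff <= min_noise)"
  }
  // Outside the noise band: negative means the compared run took less time.
  // "SLOW" is an assumed label for the opposite case.
  return diff_pct < 0.0 ? "FAST" : "SLOW";
}
```

Applied to the table, `classify(-0.59, 0.91, 0.85)` gives `SAME` (first I8/I32 row) and `classify(-0.35, 0.29, 0.28)` gives `FAST` (I8/I64 at entropy 0.201). Under this reading, the 6 entries counted as "Failure" are the rows marked FAST, meaning they differ from the reference by more than the noise, in the faster direction.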

@fbusato fbusato added the 3.0 Targeted for 3.0 release label Mar 6, 2025
@fbusato fbusato requested a review from bernhardmgruber March 6, 2025 01:22
@fbusato fbusato self-assigned this Mar 6, 2025
@fbusato fbusato requested a review from a team as a code owner March 6, 2025 01:22
@fbusato fbusato added this to CCCL Mar 6, 2025
@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 6, 2025
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Mar 6, 2025
@github-actions
Contributor

github-actions bot commented Mar 6, 2025

🟨 CI finished in 1h 41m: Pass: 84%/93 | Total: 2d 21h | Avg: 44m 42s | Max: 1h 25m | Hits: 60%/115695
  • 🟨 cub: Pass: 75%/45 | Total: 1d 21h | Avg: 1h 00m | Max: 1h 25m | Hits: 25%/40744

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  74%/43  | Total:  1d 18h | Avg: 59m 49s | Max:  1h 25m | Hits:  25%/38308 
      🟩 arm64              Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 10m | Hits:  26%/2436  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 07m | Hits:  26%/2104  
      🔍 nvcc               Pass:  74%/43  | Total:  1d 19h | Avg:  1h 00m | Max:  1h 25m | Hits:  25%/38640 
    🔍 sm: 90 🔍
      🔍 90                 Pass:  33%/3   | Total:  1h 07m | Avg: 22m 23s | Max: 28m 55s | Hits:  26%/1218  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 17m | Avg:  1h 17m | Max:  1h 17m | Hits:  26%/1218  
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total:  5h 48m | Avg:  1h 09m | Max:  1h 15m | Hits:  26%/4880  
      🟩 12.5               Pass: 100%/2   | Total:  2h 34m | Avg:  1h 17m | Max:  1h 20m | Hits:  17%/2254  
      🟨 12.8               Pass:  73%/38  | Total:  1d 12h | Avg: 58m 03s | Max:  1h 25m | Hits:  26%/33610 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 07m | Hits:  26%/2104  
      🟨 nvcc12.0           Pass:  80%/5   | Total:  5h 48m | Avg:  1h 09m | Max:  1h 15m | Hits:  26%/4880  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 34m | Avg:  1h 17m | Max:  1h 20m | Hits:  17%/2254  
      🟨 nvcc12.8           Pass:  72%/36  | Total:  1d 10h | Avg: 57m 41s | Max:  1h 25m | Hits:  26%/31506 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  4h 39m | Avg:  1h 09m | Max:  1h 13m | Hits:  27%/4880  
      🟩 Clang15            Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m | Hits:  27%/2436  
      🟩 Clang16            Pass: 100%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 13m | Hits:  27%/2436  
      🟩 Clang17            Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m | Hits:  27%/2436  
      🟨 Clang18            Pass:  71%/7   | Total:  6h 02m | Avg: 51m 44s | Max:  1h 08m | Hits:  26%/5758  
      🟩 GCC7               Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 09m | Hits:  26%/2440  
      🟩 GCC8               Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m | Hits:  26%/1220  
      🟩 GCC9               Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 08m | Hits:  26%/2440  
      🟩 GCC10              Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 07m | Hits:  26%/2440  
      🟩 GCC11              Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 06m | Hits:  26%/2436  
      🟩 GCC12              Pass: 100%/2   | Total:  2h 22m | Avg:  1h 11m | Max:  1h 12m | Hits:  26%/2436  
      🟨 GCC13              Pass:  45%/11  | Total:  7h 09m | Avg: 39m 00s | Max:  1h 17m | Hits:  26%/6090  
      🟥 MSVC14.29          Pass:   0%/2   | Total:  2h 35m | Avg:  1h 17m | Max:  1h 19m
      🟨 MSVC14.42          Pass:  50%/2   | Total:  2h 50m | Avg:  1h 25m | Max:  1h 25m | Hits:  12%/1042  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 34m | Avg:  1h 17m | Max:  1h 20m | Hits:  17%/2254  
    🟨 cxx_family
      🟨 Clang              Pass:  88%/17  | Total: 17h 29m | Avg:  1h 01m | Max:  1h 13m | Hits:  26%/17946 
      🟨 GCC                Pass:  72%/22  | Total: 19h 38m | Avg: 53m 35s | Max:  1h 17m | Hits:  26%/19502 
      🟨 MSVC               Pass:  25%/4   | Total:  5h 26m | Avg:  1h 21m | Max:  1h 25m | Hits:  12%/1042  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 34m | Avg:  1h 17m | Max:  1h 20m | Hits:  17%/2254  
    🟨 gpu
      🟨 h100               Pass:  33%/3   | Total:  1h 07m | Avg: 22m 23s | Max: 28m 55s | Hits:  26%/1218  
      🟨 rtx2080            Pass:  91%/34  | Total:  1d 15h | Avg:  1h 10m | Max:  1h 25m | Hits:  25%/37090 
      🟨 rtxa6000           Pass:  25%/8   | Total:  4h 06m | Avg: 30m 47s | Max:  1h 08m | Hits:  26%/2436  
    🟨 jobs
      🟨 Build              Pass:  91%/37  | Total:  1d 18h | Avg:  1h 09m | Max:  1h 25m | Hits:  25%/40744 
      🟥 DeviceLaunch       Pass:   0%/1   | Total: 22m 15s | Avg: 22m 15s | Max: 22m 15s
      🟥 GraphCapture       Pass:   0%/1   | Total: 17m 31s | Avg: 17m 31s | Max: 17m 31s
      🟥 HostLaunch         Pass:   0%/3   | Total:  1h 10m | Avg: 23m 21s | Max: 23m 58s
      🟥 TestGPU            Pass:   0%/3   | Total: 43m 14s | Avg: 14m 24s | Max: 15m 27s
    🟨 std
      🟨 17                 Pass:  85%/20  | Total: 23h 19m | Avg:  1h 09m | Max:  1h 25m | Hits:  26%/20465 
      🟨 20                 Pass:  68%/25  | Total: 21h 49m | Avg: 52m 23s | Max:  1h 25m | Hits:  25%/20279 
    
  • 🟨 thrust: Pass: 93%/45 | Total: 22h 48m | Avg: 30m 24s | Max: 1h 01m | Hits: 79%/74643

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  93%/43  | Total: 21h 54m | Avg: 30m 34s | Max:  1h 01m | Hits:  79%/71088 
      🟩 arm64              Pass: 100%/2   | Total: 53m 40s | Avg: 26m 50s | Max: 28m 45s | Hits:  77%/3555  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 47m 45s | Avg: 23m 52s | Max: 24m 04s | Hits:  77%/3554  
      🔍 nvcc               Pass:  93%/43  | Total: 22h 00m | Avg: 30m 42s | Max:  1h 01m | Hits:  79%/71089 
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  7h 31m | Avg: 26m 31s | Max: 31m 14s | Hits:  79%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  9h 11m | Avg: 26m 16s | Max: 34m 15s | Hits:  81%/37338 
      🔍 MSVC               Pass:  40%/5   | Total:  4h 24m | Avg: 52m 56s | Max:  1h 01m | Hits:  62%/3542  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 40m | Avg: 50m 25s | Max: 51m 47s | Hits:  63%/3554  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 28m 37s | Avg: 14m 18s | Max: 16m 52s | Hits:  88%/3556  
      🔍 rtx2080            Pass:  90%/33  | Total: 18h 24m | Avg: 33m 27s | Max:  1h 01m | Hits:  76%/53324 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 55m | Avg: 23m 32s | Max:  1h 00m | Hits:  85%/17763 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  92%/38  | Total: 21h 10m | Avg: 33m 26s | Max:  1h 01m | Hits:  75%/62206 
      🟩 TestCPU            Pass: 100%/3   | Total: 52m 53s | Avg: 17m 37s | Max: 36m 23s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 41s | Avg: 11m 10s | Max: 11m 45s | Hits:  99%/7111  
    🔍 std: 17 🔍
      🔍 17                 Pass:  85%/20  | Total: 11h 48m | Avg: 35m 26s | Max:  1h 01m | Hits:  76%/30218 
      🟩 20                 Pass: 100%/23  | Total: 10h 20m | Avg: 26m 59s | Max:  1h 00m | Hits:  80%/40869 
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total:  3h 08m | Avg: 37m 45s | Max:  1h 01m | Hits:  77%/7110  
      🟩 12.5               Pass: 100%/2   | Total:  1h 40m | Avg: 50m 25s | Max: 51m 47s | Hits:  63%/3554  
      🟨 12.8               Pass:  94%/38  | Total: 17h 58m | Avg: 28m 23s | Max:  1h 00m | Hits:  80%/63979 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 47m 45s | Avg: 23m 52s | Max: 24m 04s | Hits:  77%/3554  
      🟨 nvcc12.0           Pass:  80%/5   | Total:  3h 08m | Avg: 37m 45s | Max:  1h 01m | Hits:  77%/7110  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 40m | Avg: 50m 25s | Max: 51m 47s | Hits:  63%/3554  
      🟨 nvcc12.8           Pass:  94%/36  | Total: 17h 10m | Avg: 28m 38s | Max:  1h 00m | Hits:  80%/60425 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 00m | Avg: 30m 14s | Max: 30m 56s | Hits:  77%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 58m 04s | Avg: 29m 02s | Max: 29m 47s | Hits:  77%/3554  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 00m | Avg: 30m 27s | Max: 31m 14s | Hits:  77%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 59m 53s | Avg: 29m 56s | Max: 30m 18s | Hits:  77%/3554  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 31m | Avg: 21m 36s | Max: 30m 59s | Hits:  83%/12439 
      🟩 GCC7               Pass: 100%/2   | Total:  1h 05m | Avg: 32m 50s | Max: 34m 15s | Hits:  76%/3556  
      🟩 GCC8               Pass: 100%/1   | Total: 31m 24s | Avg: 31m 24s | Max: 31m 24s | Hits:  76%/1778  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 02m | Avg: 31m 26s | Max: 32m 51s | Hits:  76%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 59m 52s | Avg: 29m 56s | Max: 30m 29s | Hits:  76%/3556  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 00m | Avg: 30m 15s | Max: 30m 51s | Hits:  76%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 59m 48s | Avg: 29m 54s | Max: 30m 21s | Hits:  76%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 31m | Avg: 21m 09s | Max: 32m 28s | Hits:  86%/17780 
      🟥 MSVC14.29          Pass:   0%/2   | Total:  1h 54m | Avg: 57m 19s | Max:  1h 01m
      🟨 MSVC14.42          Pass:  66%/3   | Total:  2h 30m | Avg: 50m 00s | Max:  1h 00m | Hits:  62%/3542  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 40m | Avg: 50m 25s | Max: 51m 47s | Hits:  63%/3554  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 38m 36s | Avg: 19m 18s | Max: 27m 21s | Hits:  88%/3556  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 28m 37s | Avg: 14m 18s | Max: 16m 52s | Hits:  88%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total: 30m 57s | Avg: 30m 57s | Max: 30m 57s | Hits:  76%/1778  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 41s | Avg: 7m 50s | Max: 12m 53s | Hits: 97%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 41s | Avg:  7m 50s | Max: 12m 53s | Hits:  97%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 41s | Avg:  7m 50s | Max: 12m 53s | Hits:  97%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 41s | Avg:  7m 50s | Max: 12m 53s | Hits:  97%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 41s | Avg:  7m 50s | Max: 12m 53s | Hits:  97%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 41s | Avg:  7m 50s | Max: 12m 53s | Hits:  97%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 41s | Avg:  7m 50s | Max: 12m 53s | Hits:  97%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 41s | Avg:  7m 50s | Max: 12m 53s | Hits:  97%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 48s | Avg:  2m 48s | Max:  2m 48s | Hits:  96%/154   
      🟩 Test               Pass: 100%/1   | Total: 12m 53s | Avg: 12m 53s | Max: 12m 53s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 03m | Avg: 1h 03m | Max: 1h 03m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@fbusato
Contributor Author

fbusato commented Mar 6, 2025

Using cuda::bitfield_extract with range checks enabled exposed a potential bug in the segmented radix sort test.
The test could generate begin_bit == end_bit, which translates to num_bits == 0. I fixed the test by skipping this configuration, and I moved some computation after the bit check to improve execution time.
@elstehle could you please review my changes?
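The fix described in this comment can be sketched roughly as follows. All names here (`CaseStats`, `has_bits_to_sort`, `run_case`) are hypothetical stand-ins invented for illustration, and the counters replace the real key generation and sort calls; the actual test code in the PR differs. The point is only the ordering: the bit range is validated before any expensive setup, and an empty range is skipped.

```cpp
#include <cassert>

// Hypothetical bookkeeping that stands in for the test's real work.
struct CaseStats
{
  int setups  = 0; // times the expensive setup ran
  int sorts   = 0; // times the sort under test ran
  int skipped = 0; // configurations skipped because the bit range was empty
};

// True when the half-open bit range [begin_bit, end_bit) contains at least one bit.
inline bool has_bits_to_sort(int begin_bit, int end_bit)
{
  return end_bit > begin_bit;
}

inline void run_case(int begin_bit, int end_bit, CaseStats& stats)
{
  // Check the bit range first, so an empty range costs no setup time.
  if (!has_bits_to_sort(begin_bit, end_bit))
  {
    ++stats.skipped;
    return;
  }

  ++stats.setups; // placeholder for generating keys and the reference result

  const int num_bits = end_bit - begin_bit;
  assert(num_bits > 0); // holds because of the early return above
  (void) num_bits;      // avoids an unused-variable warning when asserts are disabled

  ++stats.sorts; // placeholder for invoking the sort under test
}
```

With this ordering, `run_case(3, 3, stats)` only increments `skipped`, while `run_case(0, 8, stats)` runs the setup and the sort once each.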

@fbusato fbusato requested a review from elstehle March 6, 2025 17:39
@elstehle
Contributor

elstehle commented Mar 6, 2025

Thanks! I'm out until Monday. Will review then. Could you, in the meantime, please run the benchmarks for radix sort and share the results here?

@fbusato
Contributor Author

fbusato commented Mar 6, 2025

I can take a look

@github-actions
Contributor

github-actions bot commented Mar 6, 2025

🟨 CI finished in 1h 15m: Pass: 90%/93 | Total: 18h 36m | Avg: 12m 00s | Max: 1h 00m | Hits: 97%/121785
  • 🟨 cub: Pass: 86%/45 | Total: 10h 31m | Avg: 14m 01s | Max: 37m 47s | Hits: 97%/46834

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  86%/43  | Total: 10h 11m | Avg: 14m 13s | Max: 37m 47s | Hits:  96%/44398 
      🟩 arm64              Pass: 100%/2   | Total: 20m 04s | Avg: 10m 02s | Max: 10m 18s | Hits:  98%/2436  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 15m 34s | Avg:  7m 47s | Max:  7m 53s | Hits:  99%/2104  
      🔍 nvcc               Pass:  86%/43  | Total: 10h 15m | Avg: 14m 19s | Max: 37m 47s | Hits:  96%/44730 
    🔍 sm: 90 🔍
      🔍 90                 Pass:  66%/3   | Total: 44m 03s | Avg: 14m 41s | Max: 23m 58s | Hits:  99%/2436  
      🟩 90;90a;100         Pass: 100%/1   | Total: 10m 34s | Avg: 10m 34s | Max: 10m 34s | Hits:  98%/1218  
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total:  1h 15m | Avg: 15m 02s | Max: 32m 33s | Hits:  98%/4880  
      🟩 12.5               Pass: 100%/2   | Total: 28m 31s | Avg: 14m 15s | Max: 14m 49s | Hits:  97%/2254  
      🟨 12.8               Pass:  86%/38  | Total:  8h 47m | Avg: 13m 53s | Max: 37m 47s | Hits:  96%/39700 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 15m 34s | Avg:  7m 47s | Max:  7m 53s | Hits:  99%/2104  
      🟨 nvcc12.0           Pass:  80%/5   | Total:  1h 15m | Avg: 15m 02s | Max: 32m 33s | Hits:  98%/4880  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 28m 31s | Avg: 14m 15s | Max: 14m 49s | Hits:  97%/2254  
      🟨 nvcc12.8           Pass:  86%/36  | Total:  8h 32m | Avg: 14m 13s | Max: 37m 47s | Hits:  96%/37596 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 41m 23s | Avg: 10m 20s | Max: 11m 01s | Hits:  99%/4880  
      🟩 Clang15            Pass: 100%/2   | Total: 20m 53s | Avg: 10m 26s | Max: 11m 06s | Hits:  99%/2436  
      🟩 Clang16            Pass: 100%/2   | Total: 19m 42s | Avg:  9m 51s | Max: 10m 01s | Hits:  99%/2436  
      🟩 Clang17            Pass: 100%/2   | Total: 20m 32s | Avg: 10m 16s | Max: 10m 49s | Hits:  99%/2436  
      🟨 Clang18            Pass:  85%/7   | Total:  1h 22m | Avg: 11m 43s | Max: 21m 54s | Hits:  99%/6976  
      🟩 GCC7               Pass: 100%/2   | Total: 20m 56s | Avg: 10m 28s | Max: 10m 58s | Hits:  98%/2440  
      🟩 GCC8               Pass: 100%/1   | Total:  9m 43s | Avg:  9m 43s | Max:  9m 43s | Hits:  98%/1220  
      🟩 GCC9               Pass: 100%/2   | Total: 21m 22s | Avg: 10m 41s | Max: 10m 45s | Hits:  98%/2440  
      🟩 GCC10              Pass: 100%/2   | Total: 19m 56s | Avg:  9m 58s | Max: 10m 05s | Hits:  98%/2440  
      🟩 GCC11              Pass: 100%/2   | Total: 20m 03s | Avg: 10m 01s | Max: 10m 14s | Hits:  98%/2436  
      🟩 GCC12              Pass: 100%/2   | Total: 20m 02s | Avg: 10m 01s | Max: 10m 06s | Hits:  98%/2436  
      🟨 GCC13              Pass:  81%/11  | Total:  2h 45m | Avg: 15m 00s | Max: 23m 58s | Hits:  99%/10962 
      🟥 MSVC14.29          Pass:   0%/2   | Total:  1h 10m | Avg: 35m 10s | Max: 37m 47s
      🟨 MSVC14.42          Pass:  50%/2   | Total:  1h 10m | Avg: 35m 28s | Max: 36m 10s | Hits:  15%/1042  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 28m 31s | Avg: 14m 15s | Max: 14m 49s | Hits:  97%/2254  
    🟨 cxx_family
      🟨 Clang              Pass:  94%/17  | Total:  3h 04m | Avg: 10m 51s | Max: 21m 54s | Hits:  99%/19164 
      🟨 GCC                Pass:  90%/22  | Total:  4h 37m | Avg: 12m 35s | Max: 23m 58s | Hits:  98%/24374 
      🟨 MSVC               Pass:  25%/4   | Total:  2h 21m | Avg: 35m 19s | Max: 37m 47s | Hits:  15%/1042  
      🟩 NVHPC              Pass: 100%/2   | Total: 28m 31s | Avg: 14m 15s | Max: 14m 49s | Hits:  97%/2254  
    🟨 jobs
      🟨 Build              Pass:  91%/37  | Total:  7h 59m | Avg: 12m 57s | Max: 37m 47s | Hits:  96%/40744 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 41s | Avg: 21m 41s | Max: 21m 41s | Hits:  99%/1218  
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 15s | Avg: 17m 15s | Max: 17m 15s | Hits:  99%/1218  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 09m | Avg: 23m 06s | Max: 23m 58s | Hits:  99%/3654  
      🟥 TestGPU            Pass:   0%/3   | Total: 44m 04s | Avg: 14m 41s | Max: 15m 43s
    🟨 gpu
      🟨 h100               Pass:  66%/3   | Total: 44m 03s | Avg: 14m 41s | Max: 23m 58s | Hits:  99%/2436  
      🟨 rtx2080            Pass:  91%/34  | Total:  7h 30m | Avg: 13m 15s | Max: 37m 47s | Hits:  96%/37090 
      🟨 rtxa6000           Pass:  75%/8   | Total:  2h 16m | Avg: 17m 03s | Max: 23m 28s | Hits:  99%/7308  
    🟨 std
      🟨 17                 Pass:  85%/20  | Total:  4h 41m | Avg: 14m 05s | Max: 37m 47s | Hits:  98%/20465 
      🟨 20                 Pass:  88%/25  | Total:  5h 49m | Avg: 13m 59s | Max: 36m 10s | Hits:  95%/26369 
    
  • 🟨 thrust: Pass: 93%/45 | Total: 6h 49m | Avg: 9m 05s | Max: 32m 25s | Hits: 98%/74643

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  93%/43  | Total:  6h 39m | Avg:  9m 17s | Max: 32m 25s | Hits:  98%/71088 
      🟩 arm64              Pass: 100%/2   | Total:  9m 46s | Avg:  4m 53s | Max:  5m 14s | Hits:  99%/3555  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 47s | Avg:  5m 23s | Max:  5m 33s | Hits: 100%/3554  
      🔍 nvcc               Pass:  93%/43  | Total:  6h 38m | Avg:  9m 15s | Max: 32m 25s | Hits:  98%/71089 
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  1h 37m | Avg:  5m 45s | Max: 10m 14s | Hits: 100%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  2h 19m | Avg:  6m 38s | Max: 11m 29s | Hits:  99%/37338 
      🔍 MSVC               Pass:  40%/5   | Total:  2h 23m | Avg: 28m 40s | Max: 32m 25s | Hits:  70%/3542  
      🟩 NVHPC              Pass: 100%/2   | Total: 28m 11s | Avg: 14m 05s | Max: 14m 09s | Hits:  99%/3554  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 16m 37s | Avg:  8m 18s | Max: 11m 29s | Hits:  99%/3556  
      🔍 rtx2080            Pass:  90%/33  | Total:  4h 24m | Avg:  8m 00s | Max: 28m 07s | Hits:  99%/53324 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 08m | Avg: 12m 49s | Max: 32m 25s | Hits:  94%/17763 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  92%/38  | Total:  5h 16m | Avg:  8m 20s | Max: 29m 37s | Hits:  99%/62206 
      🟩 TestCPU            Pass: 100%/3   | Total: 47m 59s | Avg: 15m 59s | Max: 32m 25s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 14s | Avg: 11m 03s | Max: 11m 29s | Hits:  99%/7111  
    🔍 std: 17 🔍
      🔍 17                 Pass:  85%/20  | Total:  3h 05m | Avg:  9m 16s | Max: 28m 07s | Hits:  99%/30218 
      🟩 20                 Pass: 100%/23  | Total:  3h 26m | Avg:  8m 57s | Max: 32m 25s | Hits:  97%/40869 
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total: 48m 33s | Avg:  9m 42s | Max: 28m 07s | Hits:  99%/7110  
      🟩 12.5               Pass: 100%/2   | Total: 28m 11s | Avg: 14m 05s | Max: 14m 09s | Hits:  99%/3554  
      🟨 12.8               Pass:  94%/38  | Total:  5h 32m | Avg:  8m 44s | Max: 32m 25s | Hits:  98%/63979 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 47s | Avg:  5m 23s | Max:  5m 33s | Hits: 100%/3554  
      🟨 nvcc12.0           Pass:  80%/5   | Total: 48m 33s | Avg:  9m 42s | Max: 28m 07s | Hits:  99%/7110  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 28m 11s | Avg: 14m 05s | Max: 14m 09s | Hits:  99%/3554  
      🟨 nvcc12.8           Pass:  94%/36  | Total:  5h 21m | Avg:  8m 55s | Max: 32m 25s | Hits:  98%/60425 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 21m 00s | Avg:  5m 15s | Max:  5m 39s | Hits: 100%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 10m 55s | Avg:  5m 27s | Max:  5m 30s | Hits: 100%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 27s | Avg:  5m 43s | Max:  5m 55s | Hits: 100%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 10m 57s | Avg:  5m 28s | Max:  5m 43s | Hits: 100%/3554  
      🟩 Clang18            Pass: 100%/7   | Total: 43m 38s | Avg:  6m 14s | Max: 10m 14s | Hits: 100%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 58s | Avg:  5m 29s | Max:  5m 36s | Hits:  99%/3556  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 27s | Avg:  5m 27s | Max:  5m 27s | Hits:  99%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 10m 47s | Avg:  5m 23s | Max:  5m 48s | Hits:  99%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 11m 13s | Avg:  5m 36s | Max:  5m 50s | Hits:  99%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 11m 51s | Avg:  5m 55s | Max:  5m 58s | Hits:  99%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 12m 10s | Avg:  6m 05s | Max:  6m 20s | Hits:  99%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 17m | Avg:  7m 42s | Max: 11m 29s | Hits:  99%/17780 
      🟥 MSVC14.29          Pass:   0%/2   | Total: 54m 48s | Avg: 27m 24s | Max: 28m 07s
      🟨 MSVC14.42          Pass:  66%/3   | Total:  1h 28m | Avg: 29m 31s | Max: 32m 25s | Hits:  70%/3542  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 28m 11s | Avg: 14m 05s | Max: 14m 09s | Hits:  99%/3554  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 32s | Avg:  8m 46s | Max: 11m 09s | Hits:  99%/3556  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 16m 37s | Avg:  8m 18s | Max: 11m 29s | Hits:  99%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 00s | Avg:  6m 00s | Max:  6m 00s | Hits:  99%/1778  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 05s | Avg: 7m 32s | Max: 12m 46s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 46s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 46s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 46s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 46s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 46s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 46s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 46s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 19s | Avg:  2m 19s | Max:  2m 19s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 12m 46s | Avg: 12m 46s | Max: 12m 46s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 00m | Avg: 1h 00m | Max: 1h 00m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@fbusato fbusato requested a review from a team as a code owner March 6, 2025 22:24
@fbusato fbusato requested a review from ericniebler March 6, 2025 22:24
@github-actions
Contributor

github-actions bot commented Mar 7, 2025

🟨 CI finished in 1h 33m: Pass: 98%/158 | Total: 2d 13h | Avg: 23m 16s | Max: 1h 16m | Hits: 84%/246942
  • 🟨 cub: Pass: 93%/45 | Total: 1d 12h | Avg: 48m 07s | Max: 1h 16m | Hits: 80%/49960

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  93%/43  | Total:  1d 10h | Avg: 47m 48s | Max:  1h 16m | Hits:  80%/47524 
      🟩 arm64              Pass: 100%/2   | Total:  1h 50m | Avg: 55m 08s | Max: 56m 14s | Hits:  85%/2436  
    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/5   | Total:  4h 41m | Avg: 56m 13s | Max:  1h 06m | Hits:  73%/5922  
      🟩 12.5               Pass: 100%/2   | Total:  1h 52m | Avg: 56m 04s | Max: 56m 35s | Hits:  83%/2254  
      🔍 12.8               Pass:  92%/38  | Total:  1d 05h | Avg: 46m 38s | Max:  1h 16m | Hits:  81%/41784 
    🔍 cudacxx: nvcc12.8 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 57m | Avg: 58m 31s | Max:  1h 00m | Hits:  85%/2104  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 41m | Avg: 56m 13s | Max:  1h 06m | Hits:  73%/5922  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 52m | Avg: 56m 04s | Max: 56m 35s | Hits:  83%/2254  
      🔍 nvcc12.8           Pass:  91%/36  | Total:  1d 03h | Avg: 45m 59s | Max:  1h 16m | Hits:  81%/39680 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 31s | Max:  1h 00m | Hits:  85%/2104  
      🔍 nvcc               Pass:  93%/43  | Total:  1d 10h | Avg: 47m 38s | Max:  1h 16m | Hits:  80%/47856 
    🚨 jobs: TestGPU 🚨
      🟩 Build              Pass: 100%/37  | Total:  1d 09h | Avg: 54m 18s | Max:  1h 16m | Hits:  78%/43870 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 14s | Avg: 21m 14s | Max: 21m 14s | Hits:  99%/1218  
      🟩 GraphCapture       Pass: 100%/1   | Total: 19m 23s | Avg: 19m 23s | Max: 19m 23s | Hits:  99%/1218  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 09m | Avg: 23m 14s | Max: 24m 03s | Hits:  99%/3654  
      🔥 TestGPU            Pass:   0%/3   | Total: 45m 53s | Avg: 15m 17s | Max: 17m 37s
    🔍 sm: 90 🔍
      🔍 90                 Pass:  66%/3   | Total: 59m 47s | Avg: 19m 55s | Max: 24m 03s | Hits:  92%/2436  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m | Hits:  85%/1218  
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 18h 32m | Avg: 55m 37s | Max:  1h 15m | Hits:  74%/23591 
      🔍 20                 Pass:  88%/25  | Total: 17h 33m | Avg: 42m 08s | Max:  1h 16m | Hits:  85%/26369 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 30m | Avg: 52m 35s | Max: 55m 11s | Hits:  85%/4880  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 39m | Avg: 49m 57s | Max: 51m 23s | Hits:  85%/2436  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 39m | Avg: 49m 52s | Max: 50m 48s | Hits:  85%/2436  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 41m | Avg: 50m 56s | Max: 51m 45s | Hits:  85%/2436  
      🟨 Clang18            Pass:  85%/7   | Total:  5h 10m | Avg: 44m 17s | Max:  1h 00m | Hits:  88%/6976  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 45m | Avg: 52m 36s | Max: 54m 03s | Hits:  85%/2440  
      🟩 GCC8               Pass: 100%/1   | Total: 50m 33s | Avg: 50m 33s | Max: 50m 33s | Hits:  85%/1220  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 50m | Avg: 55m 22s | Max: 56m 56s | Hits:  85%/2440  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 55m | Avg: 57m 50s | Max:  1h 01m | Hits:  73%/2440  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 39m | Avg: 49m 41s | Max: 50m 56s | Hits:  85%/2436  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 45m | Avg: 52m 33s | Max: 53m 55s | Hits:  85%/2436  
      🟨 GCC13              Pass:  81%/11  | Total:  5h 57m | Avg: 32m 27s | Max:  1h 00m | Hits:  91%/10962 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 10m | Hits:  15%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 31m | Avg:  1h 15m | Max:  1h 16m | Hits:  15%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 52m | Avg: 56m 04s | Max: 56m 35s | Hits:  83%/2254  
    🟨 cxx_family
      🟨 Clang              Pass:  94%/17  | Total: 13h 41m | Avg: 48m 21s | Max:  1h 00m | Hits:  86%/19164 
      🟨 GCC                Pass:  90%/22  | Total: 15h 43m | Avg: 42m 53s | Max:  1h 01m | Hits:  87%/24374 
      🟩 MSVC               Pass: 100%/4   | Total:  4h 48m | Avg:  1h 12m | Max:  1h 16m | Hits:  15%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 52m | Avg: 56m 04s | Max: 56m 35s | Hits:  83%/2254  
    🟨 gpu
      🟨 h100               Pass:  66%/3   | Total: 59m 47s | Avg: 19m 55s | Max: 24m 03s | Hits:  92%/2436  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 07h | Avg: 55m 25s | Max:  1h 16m | Hits:  77%/40216 
      🟨 rtxa6000           Pass:  75%/8   | Total:  3h 41m | Avg: 27m 41s | Max: 52m 18s | Hits:  94%/7308  
    
  • 🟩 thrust: Pass: 100%/45 | Total: 12h 21m | Avg: 16m 28s | Max: 39m 26s | Hits: 92%/79956

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 23m 57s | Avg: 11m 58s | Max: 12m 41s | Hits:  97%/3556  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 11h 56m | Avg: 16m 39s | Max: 39m 26s | Hits:  92%/76401 
      🟩 arm64              Pass: 100%/2   | Total: 25m 36s | Avg: 12m 48s | Max: 13m 15s | Hits:  94%/3555  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 33m | Avg: 18m 42s | Max: 33m 29s | Hits:  89%/8881  
      🟩 12.5               Pass: 100%/2   | Total: 53m 57s | Avg: 26m 58s | Max: 28m 28s | Hits:  93%/3554  
      🟩 12.8               Pass: 100%/38  | Total:  9h 54m | Avg: 15m 38s | Max: 39m 26s | Hits:  92%/67521 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 26m 38s | Avg: 13m 19s | Max: 13m 39s | Hits:  94%/3554  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 33m | Avg: 18m 42s | Max: 33m 29s | Hits:  89%/8881  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 53m 57s | Avg: 26m 58s | Max: 28m 28s | Hits:  93%/3554  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  9h 27m | Avg: 15m 45s | Max: 39m 26s | Hits:  92%/63967 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 26m 38s | Avg: 13m 19s | Max: 13m 39s | Hits:  94%/3554  
      🟩 nvcc               Pass: 100%/43  | Total: 11h 55m | Avg: 16m 37s | Max: 39m 26s | Hits:  92%/76402 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 57m 05s | Avg: 14m 16s | Max: 15m 20s | Hits:  94%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 28m 31s | Avg: 14m 15s | Max: 14m 28s | Hits:  94%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 29m 32s | Avg: 14m 46s | Max: 14m 49s | Hits:  94%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 27m 50s | Avg: 13m 55s | Max: 14m 31s | Hits:  94%/3554  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 25m | Avg: 12m 11s | Max: 14m 48s | Hits:  96%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 30m 44s | Avg: 15m 22s | Max: 15m 36s | Hits:  94%/3556  
      🟩 GCC8               Pass: 100%/1   | Total: 13m 50s | Avg: 13m 50s | Max: 13m 50s | Hits:  94%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 31m 33s | Avg: 15m 46s | Max: 16m 09s | Hits:  94%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 30m 15s | Avg: 15m 07s | Max: 15m 37s | Hits:  94%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 29m 57s | Avg: 14m 58s | Max: 15m 10s | Hits:  94%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 29m 06s | Avg: 14m 33s | Max: 14m 36s | Hits:  94%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 58m | Avg: 11m 50s | Max: 15m 09s | Hits:  96%/17780 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 08m | Avg: 34m 15s | Max: 35m 02s | Hits:  66%/3542  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 46m | Avg: 35m 39s | Max: 39m 26s | Hits:  67%/5313  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 53m 57s | Avg: 26m 58s | Max: 28m 28s | Hits:  93%/3554  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  3h 48m | Avg: 13m 25s | Max: 15m 20s | Hits:  95%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  4h 43m | Avg: 13m 31s | Max: 16m 09s | Hits:  95%/37338 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 55m | Avg: 35m 05s | Max: 39m 26s | Hits:  67%/8855  
      🟩 NVHPC              Pass: 100%/2   | Total: 53m 57s | Avg: 26m 58s | Max: 28m 28s | Hits:  93%/3554  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 19m 09s | Avg:  9m 34s | Max: 11m 16s | Hits:  97%/3556  
      🟩 rtx2080            Pass: 100%/33  | Total:  9h 20m | Avg: 16m 58s | Max: 35m 02s | Hits:  92%/58637 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 42m | Avg: 16m 15s | Max: 39m 26s | Hits:  92%/17763 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 10h 49m | Avg: 17m 05s | Max: 39m 26s | Hits:  91%/67519 
      🟩 TestCPU            Pass: 100%/3   | Total: 48m 32s | Avg: 16m 10s | Max: 33m 06s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 43m 27s | Avg: 10m 51s | Max: 11m 16s | Hits:  99%/7111  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 19m 09s | Avg:  9m 34s | Max: 11m 16s | Hits:  97%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total: 13m 21s | Avg: 13m 21s | Max: 13m 21s | Hits:  94%/1778  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  6h 08m | Avg: 18m 25s | Max: 35m 02s | Hits:  90%/35531 
      🟩 20                 Pass: 100%/23  | Total:  5h 49m | Avg: 15m 11s | Max: 39m 26s | Hits:  93%/40869 
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 9h 09m | Avg: 12m 46s | Max: 36m 52s | Hits: 79%/104996

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  8h 42m | Avg: 12m 44s | Max: 36m 52s | Hits:  79%/99253 
      🟩 arm64              Pass: 100%/2   | Total: 27m 01s | Avg: 13m 30s | Max: 23m 23s | Hits:  65%/5743  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 18m | Avg: 15m 39s | Max: 27m 04s | Hits:  70%/13988 
      🟩 12.5               Pass: 100%/2   | Total: 45m 24s | Avg: 22m 42s | Max: 36m 52s | Hits:  62%/5688  
      🟩 12.8               Pass: 100%/36  | Total:  7h 05m | Avg: 11m 49s | Max: 30m 45s | Hits:  81%/85320 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 44m 29s | Avg: 22m 14s | Max: 23m 56s | Hits:  27%/5704  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 18m | Avg: 15m 39s | Max: 27m 04s | Hits:  70%/13988 
      🟩 nvcc12.5           Pass: 100%/2   | Total: 45m 24s | Avg: 22m 42s | Max: 36m 52s | Hits:  62%/5688  
      🟩 nvcc12.8           Pass: 100%/34  | Total:  6h 21m | Avg: 11m 12s | Max: 30m 45s | Hits:  85%/79616 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 44m 29s | Avg: 22m 14s | Max: 23m 56s | Hits:  27%/5704  
      🟩 nvcc               Pass: 100%/41  | Total:  8h 24m | Avg: 12m 18s | Max: 36m 52s | Hits:  82%/99292 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 18m 19s | Avg:  4m 34s | Max:  5m 29s | Hits:  97%/11376 
      🟩 Clang15            Pass: 100%/2   | Total: 29m 57s | Avg: 14m 58s | Max: 25m 05s | Hits:  65%/5700  
      🟩 Clang16            Pass: 100%/2   | Total:  9m 12s | Avg:  4m 36s | Max:  4m 39s | Hits:  99%/5700  
      🟩 Clang17            Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  4m 50s | Hits:  99%/5700  
      🟩 Clang18            Pass: 100%/6   | Total:  1h 34m | Avg: 15m 40s | Max: 23m 56s | Hits:  56%/14275 
      🟩 GCC7               Pass: 100%/2   | Total: 41m 17s | Avg: 20m 38s | Max: 20m 42s | Hits:  34%/5638  
      🟩 GCC8               Pass: 100%/1   | Total:  7m 00s | Avg:  7m 00s | Max:  7m 00s | Hits:  91%/2829  
      🟩 GCC9               Pass: 100%/2   | Total: 24m 56s | Avg: 12m 28s | Max: 21m 05s | Hits:  65%/5650  
      🟩 GCC10              Pass: 100%/2   | Total:  9m 38s | Avg:  4m 49s | Max:  5m 32s | Hits:  96%/5706  
      🟩 GCC11              Pass: 100%/2   | Total: 29m 49s | Avg: 14m 54s | Max: 22m 49s | Hits:  61%/5702  
      🟩 GCC12              Pass: 100%/2   | Total: 11m 21s | Avg:  5m 40s | Max:  7m 22s | Hits:  95%/5702  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 48m | Avg: 10m 48s | Max: 30m 45s | Hits:  83%/14536 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 54m 24s | Avg: 27m 12s | Max: 27m 20s | Hits:  98%/5364  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 56m 34s | Avg: 28m 17s | Max: 30m 30s | Hits:  95%/5430  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 45m 24s | Avg: 22m 42s | Max: 36m 52s | Hits:  62%/5688  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  2h 40m | Avg: 10m 03s | Max: 25m 05s | Hits:  80%/42751 
      🟩 GCC                Pass: 100%/21  | Total:  3h 52m | Avg: 11m 03s | Max: 30m 45s | Hits:  76%/45763 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 50m | Avg: 27m 44s | Max: 30m 30s | Hits:  97%/10794 
      🟩 NVHPC              Pass: 100%/2   | Total: 45m 24s | Avg: 22m 42s | Max: 36m 52s | Hits:  62%/5688  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 17s | Avg:  8m 08s | Max: 11m 51s | Hits:  98%/2961  
      🟩 rtx2080            Pass: 100%/41  | Total:  8h 52m | Avg: 12m 59s | Max: 36m 52s | Hits:  78%/102035
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  7h 53m | Avg: 12m 48s | Max: 36m 52s | Hits:  79%/104956
      🟩 NVRTC              Pass: 100%/2   | Total: 35m 16s | Avg: 17m 38s | Max: 18m 13s | Hits:  90%/40    
      🟩 Test               Pass: 100%/3   | Total: 37m 54s | Avg: 12m 38s | Max: 16m 39s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 14s | Avg:  2m 14s | Max:  2m 14s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 35m 16s | Avg: 17m 38s | Max: 18m 13s | Hits:  90%/40    
      🟩 90                 Pass: 100%/2   | Total: 16m 17s | Avg:  8m 08s | Max: 11m 51s | Hits:  98%/2961  
      🟩 90;90a;100         Pass: 100%/1   | Total: 30m 45s | Avg: 30m 45s | Max: 30m 45s | Hits:  30%/2961  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  4h 25m | Avg: 12m 37s | Max: 27m 20s | Hits:  80%/56127 
      🟩 20                 Pass: 100%/21  | Total:  4h 41m | Avg: 13m 25s | Max: 36m 52s | Hits:  77%/48869 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 2h 20m | Avg: 6m 23s | Max: 14m 07s | Hits: 94%/11722

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  2h 05m | Avg:  6m 58s | Max: 14m 07s | Hits:  94%/9406  
      🟩 arm64              Pass: 100%/4   | Total: 15m 18s | Avg:  3m 49s | Max:  4m 04s | Hits:  96%/2316  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 14m 07s | Avg: 14m 07s | Max: 14m 07s | Hits:  57%/277   
      🟩 12.5               Pass: 100%/2   | Total: 12m 51s | Avg:  6m 25s | Max:  6m 46s | Hits:  92%/742   
      🟩 12.8               Pass: 100%/19  | Total:  1h 53m | Avg:  5m 59s | Max: 14m 05s | Hits:  95%/10703 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 14m 07s | Avg: 14m 07s | Max: 14m 07s | Hits:  57%/277   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 51s | Avg:  6m 25s | Max:  6m 46s | Hits:  92%/742   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 53m | Avg:  5m 59s | Max: 14m 05s | Hits:  95%/10703 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  2h 20m | Avg:  6m 23s | Max: 14m 07s | Hits:  94%/11722 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  4m 05s | Avg:  4m 05s | Max:  4m 05s | Hits:  96%/581   
      🟩 Clang15            Pass: 100%/1   | Total:  4m 25s | Avg:  4m 25s | Max:  4m 25s | Hits:  96%/579   
      🟩 Clang16            Pass: 100%/1   | Total:  4m 34s | Avg:  4m 34s | Max:  4m 34s | Hits:  96%/579   
      🟩 Clang17            Pass: 100%/1   | Total:  4m 23s | Avg:  4m 23s | Max:  4m 23s | Hits:  96%/579   
      🟩 Clang18            Pass: 100%/4   | Total: 23m 48s | Avg:  5m 57s | Max: 11m 57s | Hits:  97%/2316  
      🟩 GCC10              Pass: 100%/1   | Total:  4m 26s | Avg:  4m 26s | Max:  4m 26s | Hits:  96%/581   
      🟩 GCC11              Pass: 100%/1   | Total:  4m 38s | Avg:  4m 38s | Max:  4m 38s | Hits:  96%/579   
      🟩 GCC12              Pass: 100%/2   | Total: 16m 47s | Avg:  8m 23s | Max: 12m 18s | Hits:  97%/1158  
      🟩 GCC13              Pass: 100%/6   | Total: 33m 33s | Avg:  5m 35s | Max: 14m 05s | Hits:  96%/3474  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 14m 07s | Avg: 14m 07s | Max: 14m 07s | Hits:  57%/277   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 13m 08s | Avg: 13m 08s | Max: 13m 08s | Hits:  57%/277   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 51s | Avg:  6m 25s | Max:  6m 46s | Hits:  92%/742   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 41m 15s | Avg:  5m 09s | Max: 11m 57s | Hits:  96%/4634  
      🟩 GCC                Pass: 100%/10  | Total: 59m 24s | Avg:  5m 56s | Max: 14m 05s | Hits:  96%/5792  
      🟩 MSVC               Pass: 100%/2   | Total: 27m 15s | Avg: 13m 37s | Max: 14m 07s | Hits:  57%/554   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 51s | Avg:  6m 25s | Max:  6m 46s | Hits:  92%/742   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 17m 49s | Avg:  8m 54s | Max: 14m 05s | Hits:  97%/1158  
      🟩 rtx2080            Pass: 100%/20  | Total:  2h 02m | Avg:  6m 08s | Max: 14m 07s | Hits:  94%/10564 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 42m | Avg:  5m 23s | Max: 14m 07s | Hits:  93%/9985  
      🟩 Test               Pass: 100%/3   | Total: 38m 20s | Avg: 12m 46s | Max: 14m 05s | Hits:  99%/1737  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 21m 35s | Avg:  7m 11s | Max: 14m 05s | Hits:  97%/1737  
      🟩 90a                Pass: 100%/1   | Total:  4m 00s | Avg:  4m 00s | Max:  4m 00s | Hits:  96%/579   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 18m 08s | Avg:  4m 32s | Max:  6m 46s | Hits:  95%/2108  
      🟩 20                 Pass: 100%/18  | Total:  2h 02m | Avg:  6m 48s | Max: 14m 07s | Hits:  94%/9614  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 35s | Avg: 7m 47s | Max: 13m 09s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 35s | Avg:  7m 47s | Max: 13m 09s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 35s | Avg:  7m 47s | Max: 13m 09s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 35s | Avg:  7m 47s | Max: 13m 09s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 35s | Avg:  7m 47s | Max: 13m 09s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 35s | Avg:  7m 47s | Max: 13m 09s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 35s | Avg:  7m 47s | Max: 13m 09s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 35s | Avg:  7m 47s | Max: 13m 09s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 26s | Avg:  2m 26s | Max:  2m 26s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 13m 09s | Avg: 13m 09s | Max: 13m 09s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 03m | Avg: 1h 03m | Max: 1h 03m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 158)

# Runner
111 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

@github-actions
Contributor

github-actions bot commented Mar 7, 2025

🟩 CI finished in 1h 18m: Pass: 100%/158 | Total: 1d 05h | Avg: 11m 10s | Max: 1h 01m | Hits: 91%/250596
  • 🟩 cub: Pass: 100%/45 | Total: 11h 53m | Avg: 15m 50s | Max: 49m 04s | Hits: 92%/53614

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 11h 35m | Avg: 16m 09s | Max: 49m 04s | Hits:  91%/51178 
      🟩 arm64              Pass: 100%/2   | Total: 18m 12s | Avg:  9m 06s | Max:  9m 32s | Hits:  98%/2436  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 17m | Avg: 15m 25s | Max: 35m 36s | Hits:  83%/5922  
      🟩 12.5               Pass: 100%/2   | Total: 31m 55s | Avg: 15m 57s | Max: 16m 17s | Hits:  97%/2254  
      🟩 12.8               Pass: 100%/38  | Total: 10h 04m | Avg: 15m 54s | Max: 49m 04s | Hits:  92%/45438 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 16m 35s | Avg:  8m 17s | Max:  8m 27s | Hits:  98%/2104  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 17m | Avg: 15m 25s | Max: 35m 36s | Hits:  83%/5922  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 31m 55s | Avg: 15m 57s | Max: 16m 17s | Hits:  97%/2254  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  9h 47m | Avg: 16m 19s | Max: 49m 04s | Hits:  92%/43334 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 16m 35s | Avg:  8m 17s | Max:  8m 27s | Hits:  98%/2104  
      🟩 nvcc               Pass: 100%/43  | Total: 11h 36m | Avg: 16m 12s | Max: 49m 04s | Hits:  91%/51510 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 41m 11s | Avg: 10m 17s | Max: 11m 04s | Hits:  98%/4880  
      🟩 Clang15            Pass: 100%/2   | Total: 19m 59s | Avg:  9m 59s | Max: 10m 12s | Hits:  98%/2436  
      🟩 Clang16            Pass: 100%/2   | Total: 21m 20s | Avg: 10m 40s | Max: 10m 48s | Hits:  98%/2436  
      🟩 Clang17            Pass: 100%/2   | Total: 19m 59s | Avg:  9m 59s | Max: 10m 07s | Hits:  98%/2436  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 30m | Avg: 12m 55s | Max: 22m 50s | Hits:  99%/8194  
      🟩 GCC7               Pass: 100%/2   | Total: 21m 08s | Avg: 10m 34s | Max: 11m 03s | Hits:  98%/2440  
      🟩 GCC8               Pass: 100%/1   | Total: 10m 27s | Avg: 10m 27s | Max: 10m 27s | Hits:  98%/1220  
      🟩 GCC9               Pass: 100%/2   | Total: 22m 13s | Avg: 11m 06s | Max: 11m 42s | Hits:  98%/2440  
      🟩 GCC10              Pass: 100%/2   | Total: 59m 43s | Avg: 29m 51s | Max: 49m 04s | Hits:  95%/2440  
      🟩 GCC11              Pass: 100%/2   | Total: 20m 58s | Avg: 10m 29s | Max: 10m 34s | Hits:  98%/2436  
      🟩 GCC12              Pass: 100%/2   | Total: 21m 59s | Avg: 10m 59s | Max: 11m 38s | Hits:  98%/2436  
      🟩 GCC13              Pass: 100%/11  | Total:  3h 04m | Avg: 16m 48s | Max: 25m 05s | Hits:  99%/13398 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 12m | Avg: 36m 18s | Max: 37m 01s | Hits:  15%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  1h 14m | Avg: 37m 08s | Max: 37m 40s | Hits:  15%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 31m 55s | Avg: 15m 57s | Max: 16m 17s | Hits:  97%/2254  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  3h 13m | Avg: 11m 21s | Max: 22m 50s | Hits:  98%/20382 
      🟩 GCC                Pass: 100%/22  | Total:  5h 41m | Avg: 15m 31s | Max: 49m 04s | Hits:  98%/26810 
      🟩 MSVC               Pass: 100%/4   | Total:  2h 26m | Avg: 36m 43s | Max: 37m 40s | Hits:  15%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total: 31m 55s | Avg: 15m 57s | Max: 16m 17s | Hits:  97%/2254  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total: 52m 51s | Avg: 17m 37s | Max: 24m 04s | Hits:  99%/3654  
      🟩 rtx2080            Pass: 100%/34  | Total:  8h 24m | Avg: 14m 50s | Max: 49m 04s | Hits:  89%/40216 
      🟩 rtxa6000           Pass: 100%/8   | Total:  2h 35m | Avg: 19m 29s | Max: 25m 05s | Hits:  99%/9744  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  8h 51m | Avg: 14m 22s | Max: 49m 04s | Hits:  90%/43870 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 00s | Avg: 21m 00s | Max: 21m 00s | Hits:  99%/1218  
      🟩 GraphCapture       Pass: 100%/1   | Total: 19m 43s | Avg: 19m 43s | Max: 19m 43s | Hits:  99%/1218  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 11m | Avg: 23m 47s | Max: 25m 05s | Hits:  99%/3654  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 09m | Avg: 23m 05s | Max: 24m 42s | Hits:  99%/3654  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 52m 51s | Avg: 17m 37s | Max: 24m 04s | Hits:  99%/3654  
      🟩 90;90a;100         Pass: 100%/1   | Total: 11m 10s | Avg: 11m 10s | Max: 11m 10s | Hits:  98%/1218  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  5h 32m | Avg: 16m 37s | Max: 49m 04s | Hits:  87%/23591 
      🟩 20                 Pass: 100%/25  | Total:  6h 20m | Avg: 15m 13s | Max: 37m 40s | Hits:  96%/30023 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 6h 51m | Avg: 9m 08s | Max: 36m 10s | Hits: 96%/79956

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 33s | Avg:  8m 46s | Max: 11m 16s | Hits:  99%/3556  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  6h 41m | Avg:  9m 20s | Max: 36m 10s | Hits:  96%/76401 
      🟩 arm64              Pass: 100%/2   | Total:  9m 38s | Avg:  4m 49s | Max:  5m 08s | Hits:  99%/3555  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 45m 32s | Avg:  9m 06s | Max: 25m 36s | Hits:  94%/8881  
      🟩 12.5               Pass: 100%/2   | Total: 28m 08s | Avg: 14m 04s | Max: 14m 35s | Hits:  99%/3554  
      🟩 12.8               Pass: 100%/38  | Total:  5h 37m | Avg:  8m 53s | Max: 36m 10s | Hits:  96%/67521 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 03s | Hits: 100%/3554  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 45m 32s | Avg:  9m 06s | Max: 25m 36s | Hits:  94%/8881  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 28m 08s | Avg: 14m 04s | Max: 14m 35s | Hits:  99%/3554  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  5h 27m | Avg:  9m 06s | Max: 36m 10s | Hits:  96%/63967 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 03s | Hits: 100%/3554  
      🟩 nvcc               Pass: 100%/43  | Total:  6h 41m | Avg:  9m 20s | Max: 36m 10s | Hits:  96%/76402 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 20m 35s | Avg:  5m 08s | Max:  5m 29s | Hits: 100%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 11m 27s | Avg:  5m 43s | Max:  5m 56s | Hits: 100%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 38s | Avg:  5m 49s | Max:  5m 50s | Hits: 100%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 10s | Avg:  5m 35s | Max:  5m 53s | Hits: 100%/3554  
      🟩 Clang18            Pass: 100%/7   | Total: 43m 34s | Avg:  6m 13s | Max: 10m 05s | Hits: 100%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  5m 14s | Hits:  99%/3556  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 18s | Avg:  5m 18s | Max:  5m 18s | Hits:  99%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 11m 01s | Avg:  5m 30s | Max:  5m 32s | Hits:  99%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 10m 54s | Avg:  5m 27s | Max:  5m 35s | Hits:  99%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 12m 15s | Avg:  6m 07s | Max:  6m 08s | Hits:  99%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 11m 52s | Avg:  5m 56s | Max:  6m 17s | Hits:  99%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 17m | Avg:  7m 43s | Max: 11m 26s | Hits:  99%/17780 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 54m 38s | Avg: 27m 19s | Max: 29m 02s | Hits:  70%/3542  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 31m | Avg: 30m 36s | Max: 36m 10s | Hits:  70%/5313  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 28m 08s | Avg: 14m 04s | Max: 14m 35s | Hits:  99%/3554  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 38m | Avg:  5m 47s | Max: 10m 05s | Hits: 100%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  2h 18m | Avg:  6m 35s | Max: 11m 26s | Hits:  99%/37338 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 26m | Avg: 29m 17s | Max: 36m 10s | Hits:  70%/8855  
      🟩 NVHPC              Pass: 100%/2   | Total: 28m 08s | Avg: 14m 04s | Max: 14m 35s | Hits:  99%/3554  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 15m 57s | Avg:  7m 58s | Max: 10m 48s | Hits:  99%/3556  
      🟩 rtx2080            Pass: 100%/33  | Total:  4h 25m | Avg:  8m 02s | Max: 29m 02s | Hits:  97%/58637 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 10m | Avg: 13m 02s | Max: 36m 10s | Hits:  94%/17763 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  5h 16m | Avg:  8m 19s | Max: 29m 02s | Hits:  96%/67519 
      🟩 TestCPU            Pass: 100%/3   | Total: 51m 46s | Avg: 17m 15s | Max: 36m 10s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 43m 35s | Avg: 10m 53s | Max: 11m 26s | Hits:  99%/7111  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 15m 57s | Avg:  7m 58s | Max: 10m 48s | Hits:  99%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 34s | Avg:  6m 34s | Max:  6m 34s | Hits:  99%/1778  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 06m | Avg:  9m 18s | Max: 29m 02s | Hits:  95%/35531 
      🟩 20                 Pass: 100%/23  | Total:  3h 27m | Avg:  9m 02s | Max: 36m 10s | Hits:  97%/40869 
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 7h 22m | Avg: 10m 16s | Max: 30m 20s | Hits: 87%/104996

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  7h 12m | Avg: 10m 33s | Max: 30m 20s | Hits:  86%/99253 
      🟩 arm64              Pass: 100%/2   | Total:  9m 13s | Avg:  4m 36s | Max:  5m 28s | Hits:  96%/5743  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 01m | Avg: 12m 19s | Max: 25m 34s | Hits:  84%/13988 
      🟩 12.5               Pass: 100%/2   | Total: 20m 51s | Avg: 10m 25s | Max: 12m 21s | Hits:  94%/5688  
      🟩 12.8               Pass: 100%/36  | Total:  5h 59m | Avg:  9m 59s | Max: 30m 20s | Hits:  87%/85320 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 42m 22s | Avg: 21m 11s | Max: 22m 02s | Hits:  27%/5704  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 01m | Avg: 12m 19s | Max: 25m 34s | Hits:  84%/13988 
      🟩 nvcc12.5           Pass: 100%/2   | Total: 20m 51s | Avg: 10m 25s | Max: 12m 21s | Hits:  94%/5688  
      🟩 nvcc12.8           Pass: 100%/34  | Total:  5h 17m | Avg:  9m 19s | Max: 30m 20s | Hits:  91%/79616 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 42m 22s | Avg: 21m 11s | Max: 22m 02s | Hits:  27%/5704  
      🟩 nvcc               Pass: 100%/41  | Total:  6h 39m | Avg:  9m 44s | Max: 30m 20s | Hits:  90%/99292 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 28m 08s | Avg:  7m 02s | Max:  8m 55s | Hits:  93%/11376 
      🟩 Clang15            Pass: 100%/2   | Total: 10m 42s | Avg:  5m 21s | Max:  6m 12s | Hits:  96%/5700  
      🟩 Clang16            Pass: 100%/2   | Total: 10m 53s | Avg:  5m 26s | Max:  6m 28s | Hits:  96%/5700  
      🟩 Clang17            Pass: 100%/2   | Total:  9m 04s | Avg:  4m 32s | Max:  4m 41s | Hits:  99%/5700  
      🟩 Clang18            Pass: 100%/6   | Total:  1h 06m | Avg: 11m 01s | Max: 22m 02s | Hits:  69%/14275 
      🟩 GCC7               Pass: 100%/2   | Total: 27m 51s | Avg: 13m 55s | Max: 21m 21s | Hits:  61%/5638  
      🟩 GCC8               Pass: 100%/1   | Total:  4m 08s | Avg:  4m 08s | Max:  4m 08s | Hits:  99%/2829  
      🟩 GCC9               Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 25m 09s | Hits:  65%/5650  
      🟩 GCC10              Pass: 100%/2   | Total: 10m 07s | Avg:  5m 03s | Max:  5m 59s | Hits:  96%/5706  
      🟩 GCC11              Pass: 100%/2   | Total:  8m 19s | Avg:  4m 09s | Max:  4m 12s | Hits:  99%/5702  
      🟩 GCC12              Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max:  6m 15s | Hits:  93%/5702  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 35m | Avg:  9m 33s | Max: 22m 11s | Hits:  84%/14536 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 51m 47s | Avg: 25m 53s | Max: 26m 13s | Hits:  99%/5364  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 57m 16s | Avg: 28m 38s | Max: 30m 20s | Hits:  93%/5430  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 20m 51s | Avg: 10m 25s | Max: 12m 21s | Hits:  94%/5688  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  2h 04m | Avg:  7m 48s | Max: 22m 02s | Hits:  86%/42751 
      🟩 GCC                Pass: 100%/21  | Total:  3h 07m | Avg:  8m 54s | Max: 25m 09s | Hits:  84%/45763 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 49m | Avg: 27m 15s | Max: 30m 20s | Hits:  96%/10794 
      🟩 NVHPC              Pass: 100%/2   | Total: 20m 51s | Avg: 10m 25s | Max: 12m 21s | Hits:  94%/5688  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 10s | Avg:  8m 05s | Max: 12m 02s | Hits:  99%/2961  
      🟩 rtx2080            Pass: 100%/41  | Total:  7h 05m | Avg: 10m 23s | Max: 30m 20s | Hits:  87%/102035
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  6h 18m | Avg: 10m 13s | Max: 30m 20s | Hits:  87%/104956
      🟩 NVRTC              Pass: 100%/2   | Total: 31m 18s | Avg: 15m 39s | Max: 15m 45s | Hits:  90%/40    
      🟩 Test               Pass: 100%/3   | Total: 30m 07s | Avg: 10m 02s | Max: 12m 02s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 14s | Avg:  2m 14s | Max:  2m 14s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 31m 18s | Avg: 15m 39s | Max: 15m 45s | Hits:  90%/40    
      🟩 90                 Pass: 100%/2   | Total: 16m 10s | Avg:  8m 05s | Max: 12m 02s | Hits:  99%/2961  
      🟩 90;90a;100         Pass: 100%/1   | Total:  4m 55s | Avg:  4m 55s | Max:  4m 55s | Hits:  99%/2961  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  4h 16m | Avg: 12m 12s | Max: 26m 56s | Hits:  83%/56127 
      🟩 20                 Pass: 100%/21  | Total:  3h 03m | Avg:  8m 44s | Max: 30m 20s | Hits:  92%/48869 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 2h 02m | Avg: 5m 35s | Max: 14m 17s | Hits: 97%/11722

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  1h 51m | Avg:  6m 11s | Max: 14m 17s | Hits:  97%/9406  
      🟩 arm64              Pass: 100%/4   | Total: 11m 32s | Avg:  2m 53s | Max:  2m 56s | Hits:  99%/2316  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 12m 44s | Avg: 12m 44s | Max: 12m 44s | Hits:  59%/277   
      🟩 12.5               Pass: 100%/2   | Total: 10m 34s | Avg:  5m 17s | Max:  5m 19s | Hits:  96%/742   
      🟩 12.8               Pass: 100%/19  | Total:  1h 39m | Avg:  5m 14s | Max: 14m 17s | Hits:  98%/10703 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 12m 44s | Avg: 12m 44s | Max: 12m 44s | Hits:  59%/277   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 10m 34s | Avg:  5m 17s | Max:  5m 19s | Hits:  96%/742   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 39m | Avg:  5m 14s | Max: 14m 17s | Hits:  98%/10703 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  2h 02m | Avg:  5m 35s | Max: 14m 17s | Hits:  97%/11722 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  3m 24s | Avg:  3m 24s | Max:  3m 24s | Hits: 100%/581   
      🟩 Clang15            Pass: 100%/1   | Total:  3m 26s | Avg:  3m 26s | Max:  3m 26s | Hits: 100%/579   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 37s | Avg:  3m 37s | Max:  3m 37s | Hits: 100%/579   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 27s | Avg:  3m 27s | Max:  3m 27s | Hits: 100%/579   
      🟩 Clang18            Pass: 100%/4   | Total: 21m 21s | Avg:  5m 20s | Max: 12m 03s | Hits: 100%/2316  
      🟩 GCC10              Pass: 100%/1   | Total:  3m 21s | Avg:  3m 21s | Max:  3m 21s | Hits:  99%/581   
      🟩 GCC11              Pass: 100%/1   | Total:  3m 28s | Avg:  3m 28s | Max:  3m 28s | Hits:  99%/579   
      🟩 GCC12              Pass: 100%/2   | Total: 16m 01s | Avg:  8m 00s | Max: 12m 28s | Hits:  99%/1158  
      🟩 GCC13              Pass: 100%/6   | Total: 29m 17s | Avg:  4m 52s | Max: 14m 17s | Hits:  99%/3474  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 44s | Avg: 12m 44s | Max: 12m 44s | Hits:  59%/277   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s | Hits:  59%/277   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 10m 34s | Avg:  5m 17s | Max:  5m 19s | Hits:  96%/742   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 35m 15s | Avg:  4m 24s | Max: 12m 03s | Hits: 100%/4634  
      🟩 GCC                Pass: 100%/10  | Total: 52m 07s | Avg:  5m 12s | Max: 14m 17s | Hits:  99%/5792  
      🟩 MSVC               Pass: 100%/2   | Total: 24m 59s | Avg: 12m 29s | Max: 12m 44s | Hits:  59%/554   
      🟩 NVHPC              Pass: 100%/2   | Total: 10m 34s | Avg:  5m 17s | Max:  5m 19s | Hits:  96%/742   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 17m 22s | Avg:  8m 41s | Max: 14m 17s | Hits:  99%/1158  
      🟩 rtx2080            Pass: 100%/20  | Total:  1h 45m | Avg:  5m 16s | Max: 12m 44s | Hits:  97%/10564 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 24m | Avg:  4m 25s | Max: 12m 44s | Hits:  97%/9985  
      🟩 Test               Pass: 100%/3   | Total: 38m 48s | Avg: 12m 56s | Max: 14m 17s | Hits:  99%/1737  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 20m 21s | Avg:  6m 47s | Max: 14m 17s | Hits:  99%/1737  
      🟩 90a                Pass: 100%/1   | Total:  3m 08s | Avg:  3m 08s | Max:  3m 08s | Hits:  99%/579   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 13m 58s | Avg:  3m 29s | Max:  5m 19s | Hits:  99%/2108  
      🟩 20                 Pass: 100%/18  | Total:  1h 48m | Avg:  6m 03s | Max: 14m 17s | Hits:  97%/9614  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 43s | Avg: 7m 51s | Max: 13m 20s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 43s | Avg:  7m 51s | Max: 13m 20s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 43s | Avg:  7m 51s | Max: 13m 20s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 43s | Avg:  7m 51s | Max: 13m 20s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 43s | Avg:  7m 51s | Max: 13m 20s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 43s | Avg:  7m 51s | Max: 13m 20s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 43s | Avg:  7m 51s | Max: 13m 20s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 43s | Avg:  7m 51s | Max: 13m 20s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 23s | Avg:  2m 23s | Max:  2m 23s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 13m 20s | Avg: 13m 20s | Max: 13m 20s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 01m | Avg: 1h 01m | Max: 1h 01m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 158)

# Runner
111 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

Comment on lines 135 to 136
const int begin_bit = GENERATE_COPY(take(2, random(0, key_size - 1)));
const int end_bit = GENERATE_COPY(take(2, random(begin_bit + 1, key_size)));
Contributor

@elstehle could you please have a quick look whether this fix is correct? We previously tested begin_bit == end_bit sometimes. Was this an invalid scenario?

Contributor Author

If begin_bit == end_bit is valid, then the kernel should not be called (I guess).

Contributor

I think begin_bit == end_bit is generally valid; it's similar to num_items == 0.

[...] then the kernel should not be called (I guess)

We can skip any kernel invocation only if the user invoked DeviceRadixSort via the DoubleBuffer interface. Otherwise, the user will expect the output to end up in d_{keys,values}_out, in which case we need to copy the "sorted" output there.

@fbusato
Contributor Author

fbusato commented Mar 7, 2025

added performance comparison in the description

Contributor

@elstehle elstehle left a comment

Apart from the begin_bit==end_bit constraint, this looks good. Can we lift that new constraint? Otherwise, this will be a breaking change as it narrows the usage scenarios.

@bernhardmgruber
Contributor

Btw, I already included this change in the incoming CCCL 3.0 migration guide: #4069

@fbusato
Contributor Author

fbusato commented Mar 10, 2025

@elstehle sorry, what do you mean by lifting the new constraint? What do you think about returning from the host call when begin_bit == end_bit?

@elstehle
Contributor

@elstehle sorry, what do you mean by lifting the new constraint?

I assumed that the new bitfield_extract didn't support 0-sized bit-fields and that, for this reason, we decided to drop the radix sort tests where begin_bit == end_bit. If this is the case, we break users who previously invoked radix sort with begin_bit == end_bit, as that is no longer supported from this PR forward. Sorry if my understanding is wrong here.

What do you think about returning from the host call when begin_bit == end_bit?

Is your question "What should we do if the user passes begin_bit == end_bit?" We can skip the kernel invocation only if the user invoked DeviceRadixSort via the DoubleBuffer interface, because only in that case can we indicate that the sorted output is to be found in the input buffer. Otherwise, the user will expect the output to end up in d_{keys,values}_out, in which case we will need to either (a) copy the input to d_{keys,values}_out or (b) run a single "radix sort" pass to get the outputs written to d_{keys,values}_out. I'm leaning towards (b), as it has the advantage that we're not superfluously compiling an extra copy kernel. But I'm not sure whether the new bitfield_extract supports 0-sized bit-fields today. If not, I wouldn't want to regress the performance of that function just to cover the 0-sized bit-field scenario.
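
As a rough host-only illustration of option (a) for the separate input/output buffer case, the sketch below copies the keys to the output and returns early when the bit range is empty. The helper name `sort_keys_by_bits` is hypothetical, and `std::stable_sort` is only a stand-in for the radix sort passes; none of this is the CUB implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical host-side stand-in for a radix sort over the key bits
// [begin_bit, end_bit). Assumes 0 <= begin_bit <= end_bit <= 32.
inline void sort_keys_by_bits(const std::vector<std::uint32_t>& keys_in,
                              std::vector<std::uint32_t>& keys_out,
                              int begin_bit,
                              int end_bit)
{
  // Option (a): the output buffer always receives the data.
  keys_out = keys_in;
  if (begin_bit == end_bit)
  {
    // Empty bit range: all keys compare equal, so the input order is the result.
    return;
  }
  const int width = end_bit - begin_bit;
  // Build the digit mask without shifting by the full word width.
  const std::uint32_t mask =
    (width >= 32) ? ~std::uint32_t{0} : ((std::uint32_t{1} << width) - 1);
  auto digit = [=](std::uint32_t key) { return (key >> begin_bit) & mask; };
  // A stable sort on the extracted digit mimics what the radix passes produce.
  std::stable_sort(keys_out.begin(), keys_out.end(),
                   [&](std::uint32_t a, std::uint32_t b) { return digit(a) < digit(b); });
}
```

With a DoubleBuffer-style interface the early return would not need the copy at all, since the caller can be told that the result lives in the input buffer.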

@fbusato
Contributor Author

fbusato commented Mar 10, 2025

bitfield_extract doesn't support 0-bit width. Would making 0-bit radix sort an error in CCCL 3.x be going too far?
I would prefer (a). I don't think the 0-bit case is that important from a performance point of view.

@elstehle
Contributor

bitfield_extract doesn't support 0-bit width. Would making 0-bit radix sort an error in CCCL 3.x be going too far? I would prefer (a). I don't think the 0-bit case is that important from a performance point of view.

After reconsidering, I would actually prefer failing explicitly when begin_bit == end_bit, rather than (a).

Broadly, we have two types of users:

  1. Those who will never invoke radix sort with begin_bit == end_bit.
  2. Those who might invoke radix sort with begin_bit == end_bit.

The priority should be to avoid regressing user group (1), whether that means preventing performance degradation or avoiding the compilation of an extra copy kernel. If we can achieve that while still accommodating user group (2), I’m in favor. Otherwise, I’d prefer to fail in cases where begin_bit == end_bit and recommend that users guard their radix sort invocation to ensure begin_bit < end_bit.
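
The recommended user-side guard could look roughly like the sketch below. `guarded_radix_sort` and the `sort_fn` callable are hypothetical names that stand in for whatever radix sort call the user actually makes; the only point is that the sort is invoked solely when begin_bit < end_bit.

```cpp
// Hypothetical guard around a user's radix sort invocation.
// `sort_fn` is any callable taking (begin_bit, end_bit); it is an
// illustration only and not part of CUB.
template <typename SortFn>
bool guarded_radix_sort(int begin_bit, int end_bit, SortFn&& sort_fn)
{
  if (begin_bit >= end_bit)
  {
    // Empty (or invalid) bit range: skip the sort and let the caller decide
    // what to do, e.g. copy the input to the output buffer.
    return false;
  }
  sort_fn(begin_bit, end_bit);
  return true;
}
```

A caller that gets `false` back knows no sort ran and can handle the empty range itself.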

@github-actions
Contributor

🟨 CI finished in 1h 37m: Pass: 93%/93 | Total: 2d 21h | Avg: 44m 56s | Max: 1h 24m | Hits: 63%/125451
  • 🟨 cub: Pass: 93%/45 | Total: 1d 21h | Avg: 1h 00m | Max: 1h 24m | Hits: 40%/50488

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  93%/43  | Total:  1d 18h | Avg: 59m 54s | Max:  1h 24m | Hits:  41%/48052 
      🟩 arm64              Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 07m | Hits:  26%/2436  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 09m | Hits:  26%/2104  
      🔍 nvcc               Pass:  93%/43  | Total:  1d 18h | Avg: 59m 55s | Max:  1h 24m | Hits:  41%/48384 
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total: 17h 19m | Avg:  1h 01m | Max:  1h 09m | Hits:  35%/20382 
      🟩 GCC                Pass: 100%/22  | Total: 19h 50m | Avg: 54m 07s | Max:  1h 14m | Hits:  46%/26810 
      🔍 MSVC               Pass:  25%/4   | Total:  5h 26m | Avg:  1h 21m | Max:  1h 24m | Hits:  12%/1042  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 17m | Hits:  23%/2254  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/3   | Total:  1h 18m | Avg: 26m 18s | Max: 31m 40s | Hits:  75%/3654  
      🔍 rtx2080            Pass:  91%/34  | Total:  1d 15h | Avg:  1h 09m | Max:  1h 24m | Hits:  26%/37090 
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 25m | Avg: 33m 14s | Max:  1h 08m | Hits:  81%/9744  
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  91%/37  | Total:  1d 18h | Avg:  1h 08m | Max:  1h 24m | Hits:  26%/40744 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 56s | Avg: 21m 56s | Max: 21m 56s | Hits:  99%/1218  
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 31s | Avg: 16m 31s | Max: 16m 31s | Hits:  99%/1218  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 11m | Avg: 23m 55s | Max: 24m 06s | Hits:  99%/3654  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 06m | Avg: 22m 17s | Max: 23m 16s | Hits:  99%/3654  
    🔍 std: 17 🔍
      🔍 17                 Pass:  85%/20  | Total: 23h 24m | Avg:  1h 10m | Max:  1h 24m | Hits:  26%/20465 
      🟩 20                 Pass: 100%/25  | Total: 21h 45m | Avg: 52m 12s | Max:  1h 23m | Hits:  49%/30023 
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total:  5h 54m | Avg:  1h 10m | Max:  1h 15m | Hits:  26%/4880  
      🟩 12.5               Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 17m | Hits:  23%/2254  
      🟨 12.8               Pass:  94%/38  | Total:  1d 12h | Avg: 57m 58s | Max:  1h 24m | Hits:  42%/43354 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 09m | Hits:  26%/2104  
      🟨 nvcc12.0           Pass:  80%/5   | Total:  5h 54m | Avg:  1h 10m | Max:  1h 15m | Hits:  26%/4880  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 17m | Hits:  23%/2254  
      🟨 nvcc12.8           Pass:  94%/36  | Total:  1d 10h | Avg: 57m 29s | Max:  1h 24m | Hits:  43%/41250 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  4h 26m | Avg:  1h 06m | Max:  1h 08m | Hits:  27%/4880  
      🟩 Clang15            Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 07m | Hits:  27%/2436  
      🟩 Clang16            Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 06m | Hits:  27%/2436  
      🟩 Clang17            Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 06m | Hits:  27%/2436  
      🟩 Clang18            Pass: 100%/7   | Total:  6h 18m | Avg: 54m 03s | Max:  1h 09m | Hits:  48%/8194  
      🟩 GCC7               Pass: 100%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 12m | Hits:  26%/2440  
      🟩 GCC8               Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m | Hits:  26%/1220  
      🟩 GCC9               Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 13m | Hits:  26%/2440  
      🟩 GCC10              Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 14m | Hits:  26%/2440  
      🟩 GCC11              Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m | Hits:  26%/2436  
      🟩 GCC12              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 08m | Hits:  26%/2436  
      🟩 GCC13              Pass: 100%/11  | Total:  7h 16m | Avg: 39m 38s | Max:  1h 11m | Hits:  66%/13398 
      🟥 MSVC14.29          Pass:   0%/2   | Total:  2h 40m | Avg:  1h 20m | Max:  1h 24m
      🟨 MSVC14.42          Pass:  50%/2   | Total:  2h 46m | Avg:  1h 23m | Max:  1h 23m | Hits:  12%/1042  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 17m | Hits:  23%/2254  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 18m | Avg: 26m 18s | Max: 31m 40s | Hits:  75%/3654  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m | Hits:  26%/1218  
    
  • 🟨 thrust: Pass: 93%/45 | Total: 23h 11m | Avg: 30m 55s | Max: 59m 12s | Hits: 79%/74643

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  93%/43  | Total: 22h 16m | Avg: 31m 05s | Max: 59m 12s | Hits:  79%/71088 
      🟩 arm64              Pass: 100%/2   | Total: 54m 23s | Avg: 27m 11s | Max: 28m 12s | Hits:  77%/3555  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 50m 45s | Avg: 25m 22s | Max: 26m 32s | Hits:  77%/3554  
      🔍 nvcc               Pass:  93%/43  | Total: 22h 20m | Avg: 31m 10s | Max: 59m 12s | Hits:  79%/71089 
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  7h 36m | Avg: 26m 51s | Max: 39m 35s | Hits:  79%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  9h 23m | Avg: 26m 50s | Max: 35m 08s | Hits:  81%/37338 
      🔍 MSVC               Pass:  40%/5   | Total:  4h 23m | Avg: 52m 36s | Max: 59m 12s | Hits:  62%/3542  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 48m | Avg: 54m 11s | Max: 54m 33s | Hits:  64%/3554  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 28m 37s | Avg: 14m 18s | Max: 17m 17s | Hits:  88%/3556  
      🔍 rtx2080            Pass:  90%/33  | Total: 18h 51m | Avg: 34m 17s | Max: 59m 12s | Hits:  76%/53324 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 51m | Avg: 23m 06s | Max: 56m 46s | Hits:  85%/17763 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  92%/38  | Total: 21h 38m | Avg: 34m 09s | Max: 59m 12s | Hits:  75%/62206 
      🟩 TestCPU            Pass: 100%/3   | Total: 48m 42s | Avg: 16m 14s | Max: 33m 39s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 34s | Avg: 11m 08s | Max: 11m 34s | Hits:  99%/7111  
    🔍 std: 17 🔍
      🔍 17                 Pass:  85%/20  | Total: 12h 00m | Avg: 36m 02s | Max: 59m 12s | Hits:  76%/30218 
      🟩 20                 Pass: 100%/23  | Total: 10h 33m | Avg: 27m 31s | Max: 56m 46s | Hits:  80%/40869 
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total:  3h 08m | Avg: 37m 36s | Max: 59m 12s | Hits:  77%/7110  
      🟩 12.5               Pass: 100%/2   | Total:  1h 48m | Avg: 54m 11s | Max: 54m 33s | Hits:  64%/3554  
      🟨 12.8               Pass:  94%/38  | Total: 18h 14m | Avg: 28m 48s | Max: 58m 15s | Hits:  80%/63979 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 50m 45s | Avg: 25m 22s | Max: 26m 32s | Hits:  77%/3554  
      🟨 nvcc12.0           Pass:  80%/5   | Total:  3h 08m | Avg: 37m 36s | Max: 59m 12s | Hits:  77%/7110  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 48m | Avg: 54m 11s | Max: 54m 33s | Hits:  64%/3554  
      🟨 nvcc12.8           Pass:  94%/36  | Total: 17h 24m | Avg: 29m 00s | Max: 58m 15s | Hits:  80%/60425 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 56m | Avg: 29m 14s | Max: 31m 50s | Hits:  77%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 57m 07s | Avg: 28m 33s | Max: 29m 27s | Hits:  77%/3554  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 09m | Avg: 34m 50s | Max: 39m 35s | Hits:  77%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 57m 47s | Avg: 28m 53s | Max: 29m 16s | Hits:  77%/3554  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 34m | Avg: 22m 07s | Max: 31m 09s | Hits:  83%/12439 
      🟩 GCC7               Pass: 100%/2   | Total:  1h 05m | Avg: 32m 47s | Max: 33m 20s | Hits:  76%/3556  
      🟩 GCC8               Pass: 100%/1   | Total: 29m 06s | Avg: 29m 06s | Max: 29m 06s | Hits:  76%/1778  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 09m | Avg: 34m 39s | Max: 35m 02s | Hits:  76%/3556  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 39s | Max: 33m 17s | Hits:  76%/3556  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 00s | Max: 31m 04s | Hits:  76%/3556  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 38s | Max: 32m 04s | Hits:  76%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 30m | Avg: 21m 05s | Max: 35m 08s | Hits:  86%/17780 
      🟥 MSVC14.29          Pass:   0%/2   | Total:  1h 57m | Avg: 58m 43s | Max: 59m 12s
      🟨 MSVC14.42          Pass:  66%/3   | Total:  2h 25m | Avg: 48m 31s | Max: 56m 46s | Hits:  62%/3542  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 48m | Avg: 54m 11s | Max: 54m 33s | Hits:  64%/3554  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 37m 28s | Avg: 18m 44s | Max: 26m 02s | Hits:  88%/3556  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 28m 37s | Avg: 14m 18s | Max: 17m 17s | Hits:  88%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total: 32m 10s | Avg: 32m 10s | Max: 32m 10s | Hits:  76%/1778  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 17m 07s | Avg: 8m 33s | Max: 14m 31s | Hits: 96%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 17m 07s | Avg:  8m 33s | Max: 14m 31s | Hits:  96%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 17m 07s | Avg:  8m 33s | Max: 14m 31s | Hits:  96%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 17m 07s | Avg:  8m 33s | Max: 14m 31s | Hits:  96%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 17m 07s | Avg:  8m 33s | Max: 14m 31s | Hits:  96%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 17m 07s | Avg:  8m 33s | Max: 14m 31s | Hits:  96%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 17m 07s | Avg:  8m 33s | Max: 14m 31s | Hits:  96%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 17m 07s | Avg:  8m 33s | Max: 14m 31s | Hits:  96%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 36s | Avg:  2m 36s | Max:  2m 36s | Hits:  95%/160   
      🟩 Test               Pass: 100%/1   | Total: 14m 31s | Avg: 14m 31s | Max: 14m 31s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 00m | Avg: 1h 00m | Max: 1h 00m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@fbusato fbusato enabled auto-merge (squash) March 11, 2025 22:17
@github-actions
Contributor

🟨 CI finished in 1h 21m: Pass: 94%/158 | Total: 2d 10h | Avg: 22m 23s | Max: 1h 18m | Hits: 88%/240766
  • 🟨 cub: Pass: 82%/45 | Total: 1d 11h | Avg: 47m 38s | Max: 1h 18m | Hits: 78%/43870

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  81%/43  | Total:  1d 09h | Avg: 47m 14s | Max:  1h 18m | Hits:  78%/41434 
      🟩 arm64              Pass: 100%/2   | Total:  1h 52m | Avg: 56m 25s | Max: 57m 41s | Hits:  85%/2436  
    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/5   | Total:  4h 32m | Avg: 54m 27s | Max:  1h 08m | Hits:  73%/5922  
      🟩 12.5               Pass: 100%/2   | Total:  1h 52m | Avg: 56m 24s | Max: 57m 23s | Hits:  83%/2254  
      🔍 12.8               Pass:  78%/38  | Total:  1d 05h | Avg: 46m 17s | Max:  1h 18m | Hits:  79%/35694 
    🔍 cudacxx: nvcc12.8 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 57m | Avg: 58m 50s | Max:  1h 00m | Hits:  85%/2104  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 32m | Avg: 54m 27s | Max:  1h 08m | Hits:  73%/5922  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 52m | Avg: 56m 24s | Max: 57m 23s | Hits:  83%/2254  
      🔍 nvcc12.8           Pass:  77%/36  | Total:  1d 03h | Avg: 45m 35s | Max:  1h 18m | Hits:  78%/33590 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 50s | Max:  1h 00m | Hits:  85%/2104  
      🔍 nvcc               Pass:  81%/43  | Total:  1d 09h | Avg: 47m 07s | Max:  1h 18m | Hits:  78%/41766 
    🔍 sm: 90 🔍
      🔍 90                 Pass:  33%/3   | Total:  1h 02m | Avg: 20m 56s | Max: 24m 05s | Hits:  85%/1218  
      🟩 90;90a;100         Pass: 100%/1   | Total: 57m 24s | Avg: 57m 24s | Max: 57m 24s | Hits:  85%/1218  
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 18h 17m | Avg: 54m 53s | Max:  1h 15m | Hits:  76%/23591 
      🔍 20                 Pass:  68%/25  | Total: 17h 26m | Avg: 41m 50s | Max:  1h 18m | Hits:  81%/20279 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 24m | Avg: 51m 03s | Max: 52m 14s | Hits:  85%/4880  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 41m | Avg: 50m 42s | Max: 53m 36s | Hits:  85%/2436  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 37m | Avg: 48m 49s | Max: 49m 33s | Hits:  85%/2436  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 37m | Avg: 48m 47s | Max: 49m 13s | Hits:  85%/2436  
      🟨 Clang18            Pass:  71%/7   | Total:  5h 15m | Avg: 45m 02s | Max:  1h 00m | Hits:  85%/5758  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 38m | Avg: 49m 16s | Max: 50m 05s | Hits:  85%/2440  
      🟩 GCC8               Pass: 100%/1   | Total: 49m 16s | Avg: 49m 16s | Max: 49m 16s | Hits:  85%/1220  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 41m | Avg: 50m 52s | Max: 51m 15s | Hits:  85%/2440  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 44m | Avg: 52m 18s | Max: 55m 46s | Hits:  85%/2440  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 36m | Avg: 48m 21s | Max: 48m 42s | Hits:  85%/2436  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 47m | Avg: 53m 48s | Max: 54m 13s | Hits:  85%/2436  
      🟨 GCC13              Pass:  45%/11  | Total:  6h 02m | Avg: 32m 57s | Max: 57m 24s | Hits:  85%/6090  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 15m | Hits:  15%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 29m | Avg:  1h 14m | Max:  1h 18m | Hits:  15%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 52m | Avg: 56m 24s | Max: 57m 23s | Hits:  83%/2254  
    🟨 cxx_family
      🟨 Clang              Pass:  88%/17  | Total: 13h 36m | Avg: 48m 00s | Max:  1h 00m | Hits:  85%/17946 
      🟨 GCC                Pass:  72%/22  | Total: 15h 21m | Avg: 41m 52s | Max: 57m 24s | Hits:  85%/19502 
      🟩 MSVC               Pass: 100%/4   | Total:  4h 53m | Avg:  1h 13m | Max:  1h 18m | Hits:  15%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 52m | Avg: 56m 24s | Max: 57m 23s | Hits:  83%/2254  
    🟨 gpu
      🟨 h100               Pass:  33%/3   | Total:  1h 02m | Avg: 20m 56s | Max: 24m 05s | Hits:  85%/1218  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 06h | Avg: 54m 36s | Max:  1h 18m | Hits:  78%/40216 
      🟨 rtxa6000           Pass:  25%/8   | Total:  3h 44m | Avg: 28m 03s | Max: 52m 46s | Hits:  85%/2436  
    🟨 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 09h | Avg: 53m 36s | Max:  1h 18m | Hits:  78%/43870 
      🟥 DeviceLaunch       Pass:   0%/1   | Total: 21m 18s | Avg: 21m 18s | Max: 21m 18s
      🟥 GraphCapture       Pass:   0%/1   | Total: 17m 28s | Avg: 17m 28s | Max: 17m 28s
      🟥 HostLaunch         Pass:   0%/3   | Total:  1h 10m | Avg: 23m 36s | Max: 24m 28s
      🟥 TestGPU            Pass:   0%/3   | Total: 50m 37s | Avg: 16m 52s | Max: 18m 00s
    
  • 🟨 cudax: Pass: 95%/22 | Total: 2h 21m | Avg: 6m 27s | Max: 14m 41s | Hits: 94%/11143

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  94%/18  | Total:  2h 06m | Avg:  7m 01s | Max: 14m 41s | Hits:  93%/8827  
      🟩 arm64              Pass: 100%/4   | Total: 15m 26s | Avg:  3m 51s | Max:  4m 04s | Hits:  96%/2316  
    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/1   | Total: 12m 18s | Avg: 12m 18s | Max: 12m 18s | Hits:  57%/277   
      🟩 12.5               Pass: 100%/2   | Total: 13m 03s | Avg:  6m 31s | Max:  6m 45s | Hits:  92%/742   
      🔍 12.8               Pass:  94%/19  | Total:  1h 56m | Avg:  6m 08s | Max: 14m 41s | Hits:  95%/10124 
    🔍 cudacxx: nvcc12.8 🔍
      🟩 nvcc12.0           Pass: 100%/1   | Total: 12m 18s | Avg: 12m 18s | Max: 12m 18s | Hits:  57%/277   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 13m 03s | Avg:  6m 31s | Max:  6m 45s | Hits:  92%/742   
      🔍 nvcc12.8           Pass:  94%/19  | Total:  1h 56m | Avg:  6m 08s | Max: 14m 41s | Hits:  95%/10124 
    🔍 cxx: Clang18 🔍
      🟩 Clang14            Pass: 100%/1   | Total:  4m 18s | Avg:  4m 18s | Max:  4m 18s | Hits:  96%/581   
      🟩 Clang15            Pass: 100%/1   | Total:  4m 22s | Avg:  4m 22s | Max:  4m 22s | Hits:  96%/579   
      🟩 Clang16            Pass: 100%/1   | Total:  4m 17s | Avg:  4m 17s | Max:  4m 17s | Hits:  96%/579   
      🟩 Clang17            Pass: 100%/1   | Total:  4m 35s | Avg:  4m 35s | Max:  4m 35s | Hits:  96%/579   
      🔍 Clang18            Pass:  75%/4   | Total: 23m 51s | Avg:  5m 57s | Max: 12m 14s | Hits:  96%/1737  
      🟩 GCC10              Pass: 100%/1   | Total:  4m 08s | Avg:  4m 08s | Max:  4m 08s | Hits:  96%/581   
      🟩 GCC11              Pass: 100%/1   | Total:  4m 34s | Avg:  4m 34s | Max:  4m 34s | Hits:  96%/579   
      🟩 GCC12              Pass: 100%/2   | Total: 18m 37s | Avg:  9m 18s | Max: 14m 16s | Hits:  96%/1158  
      🟩 GCC13              Pass: 100%/6   | Total: 33m 10s | Avg:  5m 31s | Max: 14m 02s | Hits:  96%/3474  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 18s | Avg: 12m 18s | Max: 12m 18s | Hits:  57%/277   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 14m 41s | Avg: 14m 41s | Max: 14m 41s | Hits:  57%/277   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 13m 03s | Avg:  6m 31s | Max:  6m 45s | Hits:  92%/742   
    🔍 cxx_family: Clang 🔍
      🔍 Clang              Pass:  87%/8   | Total: 41m 23s | Avg:  5m 10s | Max: 12m 14s | Hits:  96%/4055  
      🟩 GCC                Pass: 100%/10  | Total:  1h 00m | Avg:  6m 02s | Max: 14m 16s | Hits:  96%/5792  
      🟩 MSVC               Pass: 100%/2   | Total: 26m 59s | Avg: 13m 29s | Max: 14m 41s | Hits:  57%/554   
      🟩 NVHPC              Pass: 100%/2   | Total: 13m 03s | Avg:  6m 31s | Max:  6m 45s | Hits:  92%/742   
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 17m 56s | Avg:  8m 58s | Max: 14m 02s | Hits:  97%/1158  
      🔍 rtx2080            Pass:  95%/20  | Total:  2h 03m | Avg:  6m 11s | Max: 14m 41s | Hits:  93%/9985  
    🔍 jobs: Test 🔍
      🟩 Build              Pass: 100%/19  | Total:  1h 41m | Avg:  5m 20s | Max: 14m 41s | Hits:  93%/9985  
      🔍 Test               Pass:  66%/3   | Total: 40m 32s | Avg: 13m 30s | Max: 14m 16s | Hits:  98%/1158  
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/4   | Total: 17m 21s | Avg:  4m 20s | Max:  6m 18s | Hits:  95%/2108  
      🔍 20                 Pass:  94%/18  | Total:  2h 04m | Avg:  6m 55s | Max: 14m 41s | Hits:  93%/9035  
    🟨 cudacxx_family
      🟨 nvcc               Pass:  95%/22  | Total:  2h 21m | Avg:  6m 27s | Max: 14m 41s | Hits:  94%/11143 
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 21m 23s | Avg:  7m 07s | Max: 14m 02s | Hits:  97%/1737  
      🟩 90a                Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s | Hits:  96%/579   
    
  • 🟩 thrust: Pass: 100%/45 | Total: 12h 41m | Avg: 16m 55s | Max: 38m 59s | Hits: 91%/79956

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 24m 59s | Avg: 12m 29s | Max: 13m 42s | Hits:  97%/3556  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 12h 15m | Avg: 17m 06s | Max: 38m 59s | Hits:  91%/76401 
      🟩 arm64              Pass: 100%/2   | Total: 26m 31s | Avg: 13m 15s | Max: 13m 19s | Hits:  94%/3555  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 32m | Avg: 18m 24s | Max: 33m 51s | Hits:  89%/8881  
      🟩 12.5               Pass: 100%/2   | Total: 52m 17s | Avg: 26m 08s | Max: 27m 00s | Hits:  93%/3554  
      🟩 12.8               Pass: 100%/38  | Total: 10h 17m | Avg: 16m 15s | Max: 38m 59s | Hits:  91%/67521 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 25m 41s | Avg: 12m 50s | Max: 12m 54s | Hits:  94%/3554  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 32m | Avg: 18m 24s | Max: 33m 51s | Hits:  89%/8881  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 52m 17s | Avg: 26m 08s | Max: 27m 00s | Hits:  93%/3554  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  9h 52m | Avg: 16m 26s | Max: 38m 59s | Hits:  91%/63967 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 25m 41s | Avg: 12m 50s | Max: 12m 54s | Hits:  94%/3554  
      🟩 nvcc               Pass: 100%/43  | Total: 12h 16m | Avg: 17m 07s | Max: 38m 59s | Hits:  91%/76402 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 55m 41s | Avg: 13m 55s | Max: 15m 12s | Hits:  94%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 28m 09s | Avg: 14m 04s | Max: 14m 41s | Hits:  94%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 15m 12s | Hits:  94%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 27m 34s | Avg: 13m 47s | Max: 14m 02s | Hits:  94%/3554  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 25m | Avg: 12m 10s | Max: 14m 27s | Hits:  96%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 29m 13s | Avg: 14m 36s | Max: 14m 55s | Hits:  94%/3556  
      🟩 GCC8               Pass: 100%/1   | Total: 13m 38s | Avg: 13m 38s | Max: 13m 38s | Hits:  94%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 31m 41s | Avg: 15m 50s | Max: 15m 57s | Hits:  94%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 28m 04s | Avg: 14m 02s | Max: 14m 05s | Hits:  94%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 27m 41s | Avg: 13m 50s | Max: 14m 11s | Hits:  94%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 28m 53s | Avg: 14m 26s | Max: 14m 34s | Hits:  94%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  2h 31m | Avg: 15m 06s | Max: 37m 54s | Hits:  91%/17780 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 08m | Avg: 34m 09s | Max: 34m 28s | Hits:  66%/3542  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 45m | Avg: 35m 16s | Max: 38m 59s | Hits:  67%/5313  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 52m 17s | Avg: 26m 08s | Max: 27m 00s | Hits:  93%/3554  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  3h 45m | Avg: 13m 15s | Max: 15m 12s | Hits:  95%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  5h 10m | Avg: 14m 46s | Max: 37m 54s | Hits:  93%/37338 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 54m | Avg: 34m 49s | Max: 38m 59s | Hits:  67%/8855  
      🟩 NVHPC              Pass: 100%/2   | Total: 52m 17s | Avg: 26m 08s | Max: 27m 00s | Hits:  93%/3554  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 19m 04s | Avg:  9m 32s | Max: 10m 48s | Hits:  97%/3556  
      🟩 rtx2080            Pass: 100%/33  | Total:  9h 09m | Avg: 16m 39s | Max: 34m 28s | Hits:  92%/58637 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 13m | Avg: 19m 19s | Max: 38m 59s | Hits:  87%/17763 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 10h 39m | Avg: 16m 49s | Max: 38m 59s | Hits:  91%/67519 
      🟩 TestCPU            Pass: 100%/3   | Total:  1h 18m | Avg: 26m 17s | Max: 37m 54s | Hits:  73%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 43m 54s | Avg: 10m 58s | Max: 11m 34s | Hits:  99%/7111  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 19m 04s | Avg:  9m 32s | Max: 10m 48s | Hits:  97%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total: 13m 54s | Avg: 13m 54s | Max: 13m 54s | Hits:  94%/1778  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  5h 59m | Avg: 17m 58s | Max: 34m 28s | Hits:  90%/35531 
      🟩 20                 Pass: 100%/23  | Total:  6h 17m | Avg: 16m 24s | Max: 38m 59s | Hits:  91%/40869 
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 6h 51m | Avg: 9m 34s | Max: 36m 07s | Hits: 90%/105477

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  6h 41m | Avg:  9m 48s | Max: 36m 07s | Hits:  90%/99708 
      🟩 arm64              Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  6m 02s | Hits:  93%/5769  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 45m 12s | Avg:  9m 02s | Max: 28m 27s | Hits:  96%/14053 
      🟩 12.5               Pass: 100%/2   | Total: 44m 46s | Avg: 22m 23s | Max: 36m 07s | Hits:  63%/5714  
      🟩 12.8               Pass: 100%/36  | Total:  5h 21m | Avg:  8m 56s | Max: 26m 48s | Hits:  91%/85710 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 42m 06s | Avg: 21m 03s | Max: 22m 18s | Hits:  27%/5730  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 45m 12s | Avg:  9m 02s | Max: 28m 27s | Hits:  96%/14053 
      🟩 nvcc12.5           Pass: 100%/2   | Total: 44m 46s | Avg: 22m 23s | Max: 36m 07s | Hits:  63%/5714  
      🟩 nvcc12.8           Pass: 100%/34  | Total:  4h 39m | Avg:  8m 13s | Max: 26m 48s | Hits:  95%/79980 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 42m 06s | Avg: 21m 03s | Max: 22m 18s | Hits:  27%/5730  
      🟩 nvcc               Pass: 100%/41  | Total:  6h 09m | Avg:  9m 00s | Max: 36m 07s | Hits:  93%/99747 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 17m 01s | Avg:  4m 15s | Max:  4m 35s | Hits:  99%/11428 
      🟩 Clang15            Pass: 100%/2   | Total:  9m 09s | Avg:  4m 34s | Max:  4m 37s | Hits:  99%/5726  
      🟩 Clang16            Pass: 100%/2   | Total: 16m 01s | Avg:  8m 00s | Max:  9m 02s | Hits:  84%/5726  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 01s | Avg:  5m 30s | Max:  6m 37s | Hits:  94%/5726  
      🟩 Clang18            Pass: 100%/6   | Total:  1h 15m | Avg: 12m 34s | Max: 22m 18s | Hits:  66%/14340 
      🟩 GCC7               Pass: 100%/2   | Total:  8m 56s | Avg:  4m 28s | Max:  5m 13s | Hits:  93%/5664  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 42s | Avg:  5m 42s | Max:  5m 42s | Hits:  88%/2842  
      🟩 GCC9               Pass: 100%/2   | Total: 12m 09s | Avg:  6m 04s | Max:  8m 37s | Hits:  91%/5676  
      🟩 GCC10              Pass: 100%/2   | Total:  7m 54s | Avg:  3m 57s | Max:  4m 06s | Hits:  98%/5732  
      🟩 GCC11              Pass: 100%/2   | Total: 10m 24s | Avg:  5m 12s | Max:  6m 18s | Hits:  94%/5728  
      🟩 GCC12              Pass: 100%/2   | Total:  8m 15s | Avg:  4m 07s | Max:  4m 19s | Hits:  98%/5728  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 17m | Avg:  7m 42s | Max: 15m 59s | Hits:  98%/14601 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 54m 23s | Avg: 27m 11s | Max: 28m 27s | Hits:  98%/5390  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 53m 28s | Avg: 26m 44s | Max: 26m 48s | Hits:  98%/5456  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 44m 46s | Avg: 22m 23s | Max: 36m 07s | Hits:  63%/5714  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  2h 08m | Avg:  8m 02s | Max: 22m 18s | Hits:  85%/42946 
      🟩 GCC                Pass: 100%/21  | Total:  2h 10m | Avg:  6m 12s | Max: 15m 59s | Hits:  96%/45971 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 47m | Avg: 26m 57s | Max: 28m 27s | Hits:  98%/10846 
      🟩 NVHPC              Pass: 100%/2   | Total: 44m 46s | Avg: 22m 23s | Max: 36m 07s | Hits:  63%/5714  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 17m 49s | Avg:  8m 54s | Max: 13m 44s | Hits:  98%/2974  
      🟩 rtx2080            Pass: 100%/41  | Total:  6h 33m | Avg:  9m 36s | Max: 36m 07s | Hits:  90%/102503
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 39m | Avg:  9m 11s | Max: 36m 07s | Hits:  90%/105437
      🟩 NVRTC              Pass: 100%/2   | Total: 31m 02s | Avg: 15m 31s | Max: 15m 59s | Hits:  90%/40    
      🟩 Test               Pass: 100%/3   | Total: 38m 38s | Avg: 12m 52s | Max: 15m 52s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 11s | Avg:  2m 11s | Max:  2m 11s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 31m 02s | Avg: 15m 31s | Max: 15m 59s | Hits:  90%/40    
      🟩 90                 Pass: 100%/2   | Total: 17m 49s | Avg:  8m 54s | Max: 13m 44s | Hits:  98%/2974  
      🟩 90;90a;100         Pass: 100%/1   | Total:  4m 50s | Avg:  4m 50s | Max:  4m 50s | Hits:  98%/2974  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  3h 19m | Avg:  9m 29s | Max: 28m 27s | Hits:  92%/56387 
      🟩 20                 Pass: 100%/21  | Total:  3h 30m | Avg: 10m 00s | Max: 36m 07s | Hits:  87%/49090 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 17m 10s | Avg: 8m 35s | Max: 14m 45s | Hits: 98%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 17m 10s | Avg:  8m 35s | Max: 14m 45s | Hits:  98%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 17m 10s | Avg:  8m 35s | Max: 14m 45s | Hits:  98%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 17m 10s | Avg:  8m 35s | Max: 14m 45s | Hits:  98%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 17m 10s | Avg:  8m 35s | Max: 14m 45s | Hits:  98%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 17m 10s | Avg:  8m 35s | Max: 14m 45s | Hits:  98%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 17m 10s | Avg:  8m 35s | Max: 14m 45s | Hits:  98%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 17m 10s | Avg:  8m 35s | Max: 14m 45s | Hits:  98%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 25s | Avg:  2m 25s | Max:  2m 25s | Hits:  98%/160   
      🟩 Test               Pass: 100%/1   | Total: 14m 45s | Avg: 14m 45s | Max: 14m 45s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 01m | Avg: 1h 01m | Max: 1h 01m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    

👃 Inspect Changes

Modifications in project?

|     | Project |
|-----|---------|
|     | CCCL Infrastructure |
| +/- | libcu++ |
| +/- | CUB |
|     | Thrust |
|     | CUDA Experimental |
|     | python |
|     | CCCL C Parallel Library |
|     | Catch2Helper |

Modifications in project or dependencies?

|     | Project |
|-----|---------|
|     | CCCL Infrastructure |
| +/- | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| +/- | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

🏃‍ Runner counts (total jobs: 158)

| #   | Runner |
|-----|--------|
| 111 | linux-amd64-cpu16 |
| 15  | windows-amd64-cpu16 |
| 10  | linux-arm64-cpu16 |
| 8   | linux-amd64-gpu-rtx2080-latest-1 |
| 6   | linux-amd64-gpu-rtxa6000-latest-1 |
| 5   | linux-amd64-gpu-h100-latest-1 |
| 3   | linux-amd64-gpu-rtx4090-latest-1 |

@fbusato requested a review from a team as a code owner on March 12, 2025 22:38
@fbusato requested a review from gonidelis on March 12, 2025 22:38
@github-actions
Contributor

🟨 CI finished in 1h 17m: Pass: 96%/158 | Total: 2d 10h | Avg: 22m 18s | Max: 1h 15m | Hits: 92%/242650
  • 🟨 cub: Pass: 93%/45 | Total: 1d 11h | Avg: 47m 56s | Max: 1h 15m | Hits: 86%/50488

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  93%/43  | Total:  1d 10h | Avg: 47m 38s | Max:  1h 15m | Hits:  86%/48052 
      🟩 arm64              Pass: 100%/2   | Total:  1h 49m | Avg: 54m 31s | Max: 55m 12s | Hits:  85%/2436  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 55m | Avg: 57m 50s | Max: 57m 57s | Hits:  85%/2104  
      🔍 nvcc               Pass:  93%/43  | Total:  1d 10h | Avg: 47m 28s | Max:  1h 15m | Hits:  86%/48384 
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total: 13h 44m | Avg: 48m 29s | Max: 57m 57s | Hits:  87%/20382 
      🟩 GCC                Pass: 100%/22  | Total: 15h 27m | Avg: 42m 08s | Max: 57m 57s | Hits:  89%/26810 
      🔍 MSVC               Pass:  25%/4   | Total:  4h 48m | Avg:  1h 12m | Max:  1h 15m | Hits:  15%/1042  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 57m | Avg: 58m 31s | Max:  1h 02m | Hits:  83%/2254  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/3   | Total:  1h 07m | Avg: 22m 32s | Max: 24m 46s | Hits:  94%/3654  
      🔍 rtx2080            Pass:  91%/34  | Total:  1d 07h | Avg: 54m 44s | Max:  1h 15m | Hits:  83%/37090 
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 48m | Avg: 28m 32s | Max: 50m 15s | Hits:  96%/9744  
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  91%/37  | Total:  1d 09h | Avg: 53m 33s | Max:  1h 15m | Hits:  83%/40744 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 22m 12s | Avg: 22m 12s | Max: 22m 12s | Hits:  99%/1218  
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 58s | Avg: 16m 58s | Max: 16m 58s | Hits:  99%/1218  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 11m | Avg: 23m 52s | Max: 24m 46s | Hits:  99%/3654  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 04m | Avg: 21m 35s | Max: 23m 12s | Hits:  99%/3654  
    🔍 std: 17 🔍
      🔍 17                 Pass:  85%/20  | Total: 18h 12m | Avg: 54m 37s | Max:  1h 15m | Hits:  85%/20465 
      🟩 20                 Pass: 100%/25  | Total: 17h 44m | Avg: 42m 35s | Max:  1h 15m | Hits:  87%/30023 
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total:  4h 32m | Avg: 54m 24s | Max:  1h 06m | Hits:  85%/4880  
      🟩 12.5               Pass: 100%/2   | Total:  1h 57m | Avg: 58m 31s | Max:  1h 02m | Hits:  83%/2254  
      🟨 12.8               Pass:  94%/38  | Total:  1d 05h | Avg: 46m 31s | Max:  1h 15m | Hits:  86%/43354 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 55m | Avg: 57m 50s | Max: 57m 57s | Hits:  85%/2104  
      🟨 nvcc12.0           Pass:  80%/5   | Total:  4h 32m | Avg: 54m 24s | Max:  1h 06m | Hits:  85%/4880  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 57m | Avg: 58m 31s | Max:  1h 02m | Hits:  83%/2254  
      🟨 nvcc12.8           Pass:  94%/36  | Total:  1d 03h | Avg: 45m 54s | Max:  1h 15m | Hits:  87%/41250 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 23m | Avg: 50m 51s | Max: 53m 04s | Hits:  85%/4880  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 42m | Avg: 51m 02s | Max: 52m 14s | Hits:  85%/2436  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 44m | Avg: 52m 24s | Max: 53m 13s | Hits:  85%/2436  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 42m | Avg: 51m 03s | Max: 52m 41s | Hits:  85%/2436  
      🟩 Clang18            Pass: 100%/7   | Total:  5h 11m | Avg: 44m 32s | Max: 57m 57s | Hits:  89%/8194  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 43m | Avg: 51m 30s | Max: 53m 27s | Hits:  85%/2440  
      🟩 GCC8               Pass: 100%/1   | Total: 48m 12s | Avg: 48m 12s | Max: 48m 12s | Hits:  85%/1220  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 46m | Avg: 53m 04s | Max: 55m 05s | Hits:  85%/2440  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 38m | Avg: 49m 06s | Max: 49m 24s | Hits:  85%/2440  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 41m | Avg: 50m 49s | Max: 52m 31s | Hits:  85%/2436  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 42m | Avg: 51m 28s | Max: 51m 35s | Hits:  85%/2436  
      🟩 GCC13              Pass: 100%/11  | Total:  6h 07m | Avg: 33m 22s | Max: 57m 57s | Hits:  93%/13398 
      🟥 MSVC14.29          Pass:   0%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 15m
      🟨 MSVC14.42          Pass:  50%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 15m | Hits:  15%/1042  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 31s | Max:  1h 02m | Hits:  83%/2254  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 07m | Avg: 22m 32s | Max: 24m 46s | Hits:  94%/3654  
      🟩 90;90a;100         Pass: 100%/1   | Total: 57m 57s | Avg: 57m 57s | Max: 57m 57s | Hits:  85%/1218  
    
  • 🟨 thrust: Pass: 93%/45 | Total: 12h 48m | Avg: 17m 04s | Max: 38m 06s | Hits: 93%/74643

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  93%/43  | Total: 12h 22m | Avg: 17m 15s | Max: 38m 06s | Hits:  93%/71088 
      🟩 arm64              Pass: 100%/2   | Total: 26m 04s | Avg: 13m 02s | Max: 13m 31s | Hits:  94%/3555  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 27m 38s | Avg: 13m 49s | Max: 14m 39s | Hits:  94%/3554  
      🔍 nvcc               Pass:  93%/43  | Total: 12h 20m | Avg: 17m 13s | Max: 38m 06s | Hits:  93%/71089 
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  3h 47m | Avg: 13m 23s | Max: 15m 36s | Hits:  95%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  5h 04m | Avg: 14m 28s | Max: 32m 48s | Hits:  93%/37338 
      🔍 MSVC               Pass:  40%/5   | Total:  3h 03m | Avg: 36m 36s | Max: 38m 06s | Hits:  68%/3542  
      🟩 NVHPC              Pass: 100%/2   | Total: 53m 19s | Avg: 26m 39s | Max: 27m 24s | Hits:  93%/3554  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 20m 35s | Avg: 10m 17s | Max: 11m 36s | Hits:  97%/3556  
      🔍 rtx2080            Pass:  90%/33  | Total:  9h 46m | Avg: 17m 45s | Max: 38m 06s | Hits:  93%/53324 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 41m | Avg: 16m 08s | Max: 36m 30s | Hits:  92%/17763 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  92%/38  | Total: 11h 14m | Avg: 17m 44s | Max: 38m 06s | Hits:  92%/62206 
      🟩 TestCPU            Pass: 100%/3   | Total: 49m 26s | Avg: 16m 28s | Max: 33m 38s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 39s | Avg: 11m 09s | Max: 11m 36s | Hits:  99%/7111  
    🔍 std: 17 🔍
      🔍 17                 Pass:  85%/20  | Total:  6h 14m | Avg: 18m 44s | Max: 38m 06s | Hits:  94%/30218 
      🟩 20                 Pass: 100%/23  | Total:  6h 07m | Avg: 15m 59s | Max: 36m 30s | Hits:  91%/40869 
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total:  1h 36m | Avg: 19m 22s | Max: 37m 59s | Hits:  94%/7110  
      🟩 12.5               Pass: 100%/2   | Total: 53m 19s | Avg: 26m 39s | Max: 27m 24s | Hits:  93%/3554  
      🟨 12.8               Pass:  94%/38  | Total: 10h 17m | Avg: 16m 15s | Max: 38m 06s | Hits:  92%/63979 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 27m 38s | Avg: 13m 49s | Max: 14m 39s | Hits:  94%/3554  
      🟨 nvcc12.0           Pass:  80%/5   | Total:  1h 36m | Avg: 19m 22s | Max: 37m 59s | Hits:  94%/7110  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 53m 19s | Avg: 26m 39s | Max: 27m 24s | Hits:  93%/3554  
      🟨 nvcc12.8           Pass:  94%/36  | Total:  9h 50m | Avg: 16m 23s | Max: 38m 06s | Hits:  92%/60425 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 55m 46s | Avg: 13m 56s | Max: 14m 09s | Hits:  94%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 27m 43s | Avg: 13m 51s | Max: 14m 07s | Hits:  94%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 27m 31s | Avg: 13m 45s | Max: 13m 47s | Hits:  94%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 28m 40s | Avg: 14m 20s | Max: 14m 39s | Hits:  94%/3554  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 27m | Avg: 12m 34s | Max: 15m 36s | Hits:  96%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 31m 01s | Avg: 15m 30s | Max: 15m 48s | Hits:  94%/3556  
      🟩 GCC8               Pass: 100%/1   | Total: 14m 33s | Avg: 14m 33s | Max: 14m 33s | Hits:  94%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 30m 35s | Avg: 15m 17s | Max: 15m 57s | Hits:  94%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 29m 03s | Avg: 14m 31s | Max: 15m 08s | Hits:  94%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 46m 48s | Avg: 23m 24s | Max: 32m 48s | Hits:  71%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 31m 37s | Avg: 15m 48s | Max: 16m 32s | Hits:  94%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  2h 00m | Avg: 12m 03s | Max: 14m 05s | Hits:  96%/17780 
      🟥 MSVC14.29          Pass:   0%/2   | Total:  1h 14m | Avg: 37m 24s | Max: 37m 59s
      🟨 MSVC14.42          Pass:  66%/3   | Total:  1h 48m | Avg: 36m 04s | Max: 38m 06s | Hits:  68%/3542  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 53m 19s | Avg: 26m 39s | Max: 27m 24s | Hits:  93%/3554  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 25m 21s | Avg: 12m 40s | Max: 14m 05s | Hits:  97%/3556  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 20m 35s | Avg: 10m 17s | Max: 11m 36s | Hits:  97%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total: 13m 14s | Avg: 13m 14s | Max: 13m 14s | Hits:  94%/1778  
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 6h 32m | Avg: 9m 07s | Max: 34m 16s | Hits: 93%/105477

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  6h 24m | Avg:  9m 23s | Max: 34m 16s | Hits:  92%/99708 
      🟩 arm64              Pass: 100%/2   | Total:  7m 39s | Avg:  3m 49s | Max:  3m 56s | Hits:  98%/5769  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 40m 31s | Avg:  8m 06s | Max: 24m 33s | Hits:  98%/14053 
      🟩 12.5               Pass: 100%/2   | Total: 43m 12s | Avg: 21m 36s | Max: 34m 16s | Hits:  67%/5714  
      🟩 12.8               Pass: 100%/36  | Total:  5h 08m | Avg:  8m 34s | Max: 29m 02s | Hits:  93%/85710 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 42m 32s | Avg: 21m 16s | Max: 22m 06s | Hits:  27%/5730  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 40m 31s | Avg:  8m 06s | Max: 24m 33s | Hits:  98%/14053 
      🟩 nvcc12.5           Pass: 100%/2   | Total: 43m 12s | Avg: 21m 36s | Max: 34m 16s | Hits:  67%/5714  
      🟩 nvcc12.8           Pass: 100%/34  | Total:  4h 26m | Avg:  7m 49s | Max: 29m 02s | Hits:  98%/79980 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 42m 32s | Avg: 21m 16s | Max: 22m 06s | Hits:  27%/5730  
      🟩 nvcc               Pass: 100%/41  | Total:  5h 50m | Avg:  8m 32s | Max: 34m 16s | Hits:  96%/99747 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 17m 29s | Avg:  4m 22s | Max:  4m 46s | Hits:  99%/11428 
      🟩 Clang15            Pass: 100%/2   | Total:  9m 27s | Avg:  4m 43s | Max:  4m 44s | Hits:  99%/5726  
      🟩 Clang16            Pass: 100%/2   | Total: 12m 34s | Avg:  6m 17s | Max:  8m 06s | Hits:  92%/5726  
      🟩 Clang17            Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  4m 41s | Hits:  99%/5726  
      🟩 Clang18            Pass: 100%/6   | Total:  1h 09m | Avg: 11m 32s | Max: 22m 06s | Hits:  70%/14340 
      🟩 GCC7               Pass: 100%/2   | Total:  7m 35s | Avg:  3m 47s | Max:  3m 49s | Hits:  99%/5664  
      🟩 GCC8               Pass: 100%/1   | Total:  3m 56s | Avg:  3m 56s | Max:  3m 56s | Hits:  99%/2842  
      🟩 GCC9               Pass: 100%/2   | Total:  8m 03s | Avg:  4m 01s | Max:  4m 09s | Hits:  99%/5676  
      🟩 GCC10              Pass: 100%/2   | Total:  8m 17s | Avg:  4m 08s | Max:  4m 11s | Hits:  98%/5732  
      🟩 GCC11              Pass: 100%/2   | Total:  8m 25s | Avg:  4m 12s | Max:  4m 20s | Hits:  98%/5728  
      🟩 GCC12              Pass: 100%/2   | Total:  8m 20s | Avg:  4m 10s | Max:  4m 14s | Hits:  99%/5728  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 19m | Avg:  7m 59s | Max: 17m 30s | Hits:  98%/14601 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 51m 10s | Avg: 25m 35s | Max: 26m 37s | Hits:  98%/5390  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 55m 38s | Avg: 27m 49s | Max: 29m 02s | Hits:  98%/5456  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 43m 12s | Avg: 21m 36s | Max: 34m 16s | Hits:  67%/5714  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  1h 58m | Avg:  7m 22s | Max: 22m 06s | Hits:  88%/42946 
      🟩 GCC                Pass: 100%/21  | Total:  2h 04m | Avg:  5m 55s | Max: 17m 30s | Hits:  99%/45971 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 46m | Avg: 26m 42s | Max: 29m 02s | Hits:  98%/10846 
      🟩 NVHPC              Pass: 100%/2   | Total: 43m 12s | Avg: 21m 36s | Max: 34m 16s | Hits:  67%/5714  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 17m 59s | Avg:  8m 59s | Max: 13m 27s | Hits:  98%/2974  
      🟩 rtx2080            Pass: 100%/41  | Total:  6h 14m | Avg:  9m 08s | Max: 34m 16s | Hits:  92%/102503
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 20m | Avg:  8m 39s | Max: 34m 16s | Hits:  93%/105437
      🟩 NVRTC              Pass: 100%/2   | Total: 33m 08s | Avg: 16m 34s | Max: 17m 30s | Hits:  90%/40    
      🟩 Test               Pass: 100%/3   | Total: 36m 28s | Avg: 12m 09s | Max: 13m 32s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 17s | Avg:  2m 17s | Max:  2m 17s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 33m 08s | Avg: 16m 34s | Max: 17m 30s | Hits:  90%/40    
      🟩 90                 Pass: 100%/2   | Total: 17m 59s | Avg:  8m 59s | Max: 13m 27s | Hits:  98%/2974  
      🟩 90;90a;100         Pass: 100%/1   | Total:  4m 50s | Avg:  4m 50s | Max:  4m 50s | Hits:  98%/2974  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  3h 07m | Avg:  8m 57s | Max: 26m 37s | Hits:  95%/56387 
      🟩 20                 Pass: 100%/21  | Total:  3h 22m | Avg:  9m 37s | Max: 34m 16s | Hits:  90%/49090 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 2h 07m | Avg: 5m 48s | Max: 14m 07s | Hits: 97%/11722

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  1h 54m | Avg:  6m 21s | Max: 14m 07s | Hits:  96%/9406  
      🟩 arm64              Pass: 100%/4   | Total: 13m 14s | Avg:  3m 18s | Max:  3m 28s | Hits:  99%/2316  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 11m 24s | Avg: 11m 24s | Max: 11m 24s | Hits:  59%/277   
      🟩 12.5               Pass: 100%/2   | Total: 11m 29s | Avg:  5m 44s | Max:  5m 51s | Hits:  96%/742   
      🟩 12.8               Pass: 100%/19  | Total:  1h 44m | Avg:  5m 31s | Max: 14m 07s | Hits:  98%/10703 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 11m 24s | Avg: 11m 24s | Max: 11m 24s | Hits:  59%/277   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 11m 29s | Avg:  5m 44s | Max:  5m 51s | Hits:  96%/742   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 44m | Avg:  5m 31s | Max: 14m 07s | Hits:  98%/10703 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  2h 07m | Avg:  5m 48s | Max: 14m 07s | Hits:  97%/11722 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s | Hits:  99%/581   
      🟩 Clang15            Pass: 100%/1   | Total:  3m 53s | Avg:  3m 53s | Max:  3m 53s | Hits:  99%/579   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 51s | Avg:  3m 51s | Max:  3m 51s | Hits:  99%/579   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 58s | Avg:  3m 58s | Max:  3m 58s | Hits:  99%/579   
      🟩 Clang18            Pass: 100%/4   | Total: 21m 34s | Avg:  5m 23s | Max: 11m 34s | Hits:  99%/2316  
      🟩 GCC10              Pass: 100%/1   | Total:  3m 36s | Avg:  3m 36s | Max:  3m 36s | Hits:  99%/581   
      🟩 GCC11              Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s | Hits:  99%/579   
      🟩 GCC12              Pass: 100%/2   | Total: 16m 41s | Avg:  8m 20s | Max: 12m 38s | Hits:  99%/1158  
      🟩 GCC13              Pass: 100%/6   | Total: 31m 05s | Avg:  5m 10s | Max: 14m 07s | Hits:  99%/3474  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 24s | Avg: 11m 24s | Max: 11m 24s | Hits:  59%/277   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 12m 43s | Avg: 12m 43s | Max: 12m 43s | Hits:  59%/277   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 11m 29s | Avg:  5m 44s | Max:  5m 51s | Hits:  96%/742   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 37m 08s | Avg:  4m 38s | Max: 11m 34s | Hits:  99%/4634  
      🟩 GCC                Pass: 100%/10  | Total: 55m 04s | Avg:  5m 30s | Max: 14m 07s | Hits:  99%/5792  
      🟩 MSVC               Pass: 100%/2   | Total: 24m 07s | Avg: 12m 03s | Max: 12m 43s | Hits:  59%/554   
      🟩 NVHPC              Pass: 100%/2   | Total: 11m 29s | Avg:  5m 44s | Max:  5m 51s | Hits:  96%/742   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 17m 31s | Avg:  8m 45s | Max: 14m 07s | Hits:  99%/1158  
      🟩 rtx2080            Pass: 100%/20  | Total:  1h 50m | Avg:  5m 30s | Max: 12m 43s | Hits:  97%/10564 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 29m | Avg:  4m 42s | Max: 12m 43s | Hits:  96%/9985  
      🟩 Test               Pass: 100%/3   | Total: 38m 19s | Avg: 12m 46s | Max: 14m 07s | Hits:  99%/1737  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 20m 54s | Avg:  6m 58s | Max: 14m 07s | Hits:  99%/1737  
      🟩 90a                Pass: 100%/1   | Total:  3m 17s | Avg:  3m 17s | Max:  3m 17s | Hits:  99%/579   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 15m 50s | Avg:  3m 57s | Max:  5m 51s | Hits:  98%/2108  
      🟩 20                 Pass: 100%/18  | Total:  1h 51m | Avg:  6m 13s | Max: 14m 07s | Hits:  96%/9614  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 16m 34s | Avg: 8m 17s | Max: 14m 25s | Hits: 98%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 16m 34s | Avg:  8m 17s | Max: 14m 25s | Hits:  98%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 16m 34s | Avg:  8m 17s | Max: 14m 25s | Hits:  98%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 16m 34s | Avg:  8m 17s | Max: 14m 25s | Hits:  98%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 16m 34s | Avg:  8m 17s | Max: 14m 25s | Hits:  98%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 16m 34s | Avg:  8m 17s | Max: 14m 25s | Hits:  98%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 16m 34s | Avg:  8m 17s | Max: 14m 25s | Hits:  98%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 16m 34s | Avg:  8m 17s | Max: 14m 25s | Hits:  98%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 09s | Avg:  2m 09s | Max:  2m 09s | Hits:  98%/160   
      🟩 Test               Pass: 100%/1   | Total: 14m 25s | Avg: 14m 25s | Max: 14m 25s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 01m | Avg: 1h 01m | Max: 1h 01m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 158)

# Runner
111 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

@github-actions
Contributor

🟩 CI finished in 1h 24m: Pass: 100%/158 | Total: 2d 10h | Avg: 22m 21s | Max: 1h 22m | Hits: 91%/251089
  • 🟩 cub: Pass: 100%/45 | Total: 1d 12h | Avg: 48m 44s | Max: 1h 22m | Hits: 82%/53614

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 10h | Avg: 48m 26s | Max:  1h 22m | Hits:  81%/51178 
      🟩 arm64              Pass: 100%/2   | Total:  1h 51m | Avg: 55m 30s | Max: 57m 22s | Hits:  85%/2436  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 34m | Avg: 54m 50s | Max:  1h 07m | Hits:  73%/5922  
      🟩 12.5               Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m | Hits:  83%/2254  
      🟩 12.8               Pass: 100%/38  | Total:  1d 05h | Avg: 47m 18s | Max:  1h 22m | Hits:  83%/45438 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 03m | Hits:  85%/2104  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 34m | Avg: 54m 50s | Max:  1h 07m | Hits:  73%/5922  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m | Hits:  83%/2254  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  1d 03h | Avg: 46m 32s | Max:  1h 22m | Hits:  82%/43334 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 03m | Hits:  85%/2104  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 10h | Avg: 48m 09s | Max:  1h 22m | Hits:  81%/51510 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 22m | Avg: 50m 43s | Max: 52m 09s | Hits:  85%/4880  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 36m | Avg: 48m 27s | Max: 49m 15s | Hits:  85%/2436  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 41m | Avg: 50m 55s | Max: 52m 04s | Hits:  85%/2436  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 39m | Avg: 49m 58s | Max: 51m 48s | Hits:  85%/2436  
      🟩 Clang18            Pass: 100%/7   | Total:  5h 21m | Avg: 45m 51s | Max:  1h 03m | Hits:  89%/8194  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 47m | Avg: 53m 31s | Max: 55m 32s | Hits:  85%/2440  
      🟩 GCC8               Pass: 100%/1   | Total: 50m 18s | Avg: 50m 18s | Max: 50m 18s | Hits:  85%/1220  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 41m | Avg: 50m 45s | Max: 51m 11s | Hits:  85%/2440  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 39m | Avg: 49m 40s | Max: 50m 58s | Hits:  85%/2440  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 40m | Avg: 50m 00s | Max: 51m 14s | Hits:  85%/2436  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 50m | Avg: 55m 08s | Max:  1h 00m | Hits:  73%/2436  
      🟩 GCC13              Pass: 100%/11  | Total:  6h 22m | Avg: 34m 44s | Max:  1h 02m | Hits:  93%/13398 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 22m | Avg:  1h 11m | Max:  1h 14m | Hits:  15%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 36m | Avg:  1h 18m | Max:  1h 22m | Hits:  15%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m | Hits:  83%/2254  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 13h 42m | Avg: 48m 23s | Max:  1h 03m | Hits:  87%/20382 
      🟩 GCC                Pass: 100%/22  | Total: 15h 50m | Avg: 43m 12s | Max:  1h 02m | Hits:  88%/26810 
      🟩 MSVC               Pass: 100%/4   | Total:  4h 58m | Avg:  1h 14m | Max:  1h 22m | Hits:  15%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m | Hits:  83%/2254  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 07m | Avg: 22m 39s | Max: 24m 10s | Hits:  94%/3654  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 07h | Avg: 55m 28s | Max:  1h 22m | Hits:  77%/40216 
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 59m | Avg: 29m 56s | Max: 53m 27s | Hits:  96%/9744  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 09h | Avg: 54m 25s | Max:  1h 22m | Hits:  78%/43870 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 23m 22s | Avg: 23m 22s | Max: 23m 22s | Hits:  99%/1218  
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 59s | Avg: 16m 59s | Max: 16m 59s | Hits:  99%/1218  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 11m | Avg: 23m 57s | Max: 25m 20s | Hits:  99%/3654  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 07m | Avg: 22m 38s | Max: 24m 53s | Hits:  99%/3654  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 07m | Avg: 22m 39s | Max: 24m 10s | Hits:  94%/3654  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m | Hits:  85%/1218  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 18h 29m | Avg: 55m 29s | Max:  1h 14m | Hits:  74%/23591 
      🟩 20                 Pass: 100%/25  | Total: 18h 03m | Avg: 43m 21s | Max:  1h 22m | Hits:  87%/30023 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 12h 38m | Avg: 16m 50s | Max: 40m 20s | Hits: 91%/79956

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 25m 11s | Avg: 12m 35s | Max: 13m 56s | Hits:  97%/3556  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 12h 11m | Avg: 17m 00s | Max: 40m 20s | Hits:  91%/76401 
      🟩 arm64              Pass: 100%/2   | Total: 26m 43s | Avg: 13m 21s | Max: 13m 57s | Hits:  94%/3555  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 32m | Avg: 18m 24s | Max: 32m 23s | Hits:  89%/8881  
      🟩 12.5               Pass: 100%/2   | Total: 53m 24s | Avg: 26m 42s | Max: 27m 11s | Hits:  93%/3554  
      🟩 12.8               Pass: 100%/38  | Total: 10h 12m | Avg: 16m 07s | Max: 40m 20s | Hits:  92%/67521 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 26m 45s | Avg: 13m 22s | Max: 13m 38s | Hits:  94%/3554  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 32m | Avg: 18m 24s | Max: 32m 23s | Hits:  89%/8881  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 53m 24s | Avg: 26m 42s | Max: 27m 11s | Hits:  93%/3554  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  9h 45m | Avg: 16m 16s | Max: 40m 20s | Hits:  91%/63967 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 26m 45s | Avg: 13m 22s | Max: 13m 38s | Hits:  94%/3554  
      🟩 nvcc               Pass: 100%/43  | Total: 12h 11m | Avg: 17m 00s | Max: 40m 20s | Hits:  91%/76402 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 57m 23s | Avg: 14m 20s | Max: 15m 30s | Hits:  94%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 27m 12s | Avg: 13m 36s | Max: 13m 45s | Hits:  94%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 27m 54s | Avg: 13m 57s | Max: 14m 04s | Hits:  94%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 28m 40s | Avg: 14m 20s | Max: 14m 59s | Hits:  94%/3554  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 26m | Avg: 12m 20s | Max: 14m 28s | Hits:  96%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 29m 50s | Avg: 14m 55s | Max: 15m 09s | Hits:  94%/3556  
      🟩 GCC8               Pass: 100%/1   | Total: 14m 03s | Avg: 14m 03s | Max: 14m 03s | Hits:  94%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 32m 15s | Avg: 16m 07s | Max: 17m 05s | Hits:  94%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 30m 08s | Avg: 15m 04s | Max: 15m 38s | Hits:  94%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 30m 38s | Avg: 15m 19s | Max: 15m 42s | Hits:  94%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 36m 48s | Avg: 18m 24s | Max: 20m 58s | Hits:  80%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  2h 04m | Avg: 12m 24s | Max: 15m 07s | Hits:  96%/17780 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 07m | Avg: 33m 35s | Max: 34m 47s | Hits:  66%/3542  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 52m | Avg: 37m 26s | Max: 40m 20s | Hits:  67%/5313  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 53m 24s | Avg: 26m 42s | Max: 27m 11s | Hits:  93%/3554  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  3h 47m | Avg: 13m 23s | Max: 15m 30s | Hits:  95%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  4h 57m | Avg: 14m 10s | Max: 20m 58s | Hits:  94%/37338 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 59m | Avg: 35m 53s | Max: 40m 20s | Hits:  67%/8855  
      🟩 NVHPC              Pass: 100%/2   | Total: 53m 24s | Avg: 26m 42s | Max: 27m 11s | Hits:  93%/3554  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 20m 10s | Avg: 10m 05s | Max: 12m 01s | Hits:  97%/3556  
      🟩 rtx2080            Pass: 100%/33  | Total:  9h 30m | Avg: 17m 17s | Max: 37m 35s | Hits:  91%/58637 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 47m | Avg: 16m 45s | Max: 40m 20s | Hits:  92%/17763 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 11h 02m | Avg: 17m 25s | Max: 40m 20s | Hits:  91%/67519 
      🟩 TestCPU            Pass: 100%/3   | Total: 50m 41s | Avg: 16m 53s | Max: 34m 23s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 45m 06s | Avg: 11m 16s | Max: 12m 01s | Hits:  99%/7111  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 20m 10s | Avg: 10m 05s | Max: 12m 01s | Hits:  97%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total: 15m 07s | Avg: 15m 07s | Max: 15m 07s | Hits:  94%/1778  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  6h 07m | Avg: 18m 23s | Max: 37m 35s | Hits:  90%/35531 
      🟩 20                 Pass: 100%/23  | Total:  6h 05m | Avg: 15m 52s | Max: 40m 20s | Hits:  92%/40869 
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 6h 03m | Avg: 8m 27s | Max: 27m 14s | Hits: 94%/105477

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  5h 56m | Avg:  8m 41s | Max: 27m 14s | Hits:  94%/99708 
      🟩 arm64              Pass: 100%/2   | Total:  7m 40s | Avg:  3m 50s | Max:  3m 58s | Hits:  98%/5769  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 40m 34s | Avg:  8m 06s | Max: 24m 55s | Hits:  98%/14053 
      🟩 12.5               Pass: 100%/2   | Total: 18m 02s | Avg:  9m 01s | Max:  9m 12s | Hits:  98%/5714  
      🟩 12.8               Pass: 100%/36  | Total:  5h 05m | Avg:  8m 28s | Max: 27m 14s | Hits:  93%/85710 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 44m 52s | Avg: 22m 26s | Max: 24m 00s | Hits:  27%/5730  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 40m 34s | Avg:  8m 06s | Max: 24m 55s | Hits:  98%/14053 
      🟩 nvcc12.5           Pass: 100%/2   | Total: 18m 02s | Avg:  9m 01s | Max:  9m 12s | Hits:  98%/5714  
      🟩 nvcc12.8           Pass: 100%/34  | Total:  4h 20m | Avg:  7m 39s | Max: 27m 14s | Hits:  98%/79980 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 44m 52s | Avg: 22m 26s | Max: 24m 00s | Hits:  27%/5730  
      🟩 nvcc               Pass: 100%/41  | Total:  5h 18m | Avg:  7m 46s | Max: 27m 14s | Hits:  98%/99747 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 17m 45s | Avg:  4m 26s | Max:  4m 49s | Hits:  99%/11428 
      🟩 Clang15            Pass: 100%/2   | Total:  9m 28s | Avg:  4m 44s | Max:  4m 44s | Hits:  99%/5726  
      🟩 Clang16            Pass: 100%/2   | Total:  9m 22s | Avg:  4m 41s | Max:  4m 51s | Hits:  99%/5726  
      🟩 Clang17            Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  4m 59s | Hits:  99%/5726  
      🟩 Clang18            Pass: 100%/6   | Total:  1h 06m | Avg: 11m 07s | Max: 24m 00s | Hits:  70%/14340 
      🟩 GCC7               Pass: 100%/2   | Total:  7m 49s | Avg:  3m 54s | Max:  4m 12s | Hits:  99%/5664  
      🟩 GCC8               Pass: 100%/1   | Total:  3m 58s | Avg:  3m 58s | Max:  3m 58s | Hits:  99%/2842  
      🟩 GCC9               Pass: 100%/2   | Total: 10m 55s | Avg:  5m 27s | Max:  7m 10s | Hits:  93%/5676  
      🟩 GCC10              Pass: 100%/2   | Total:  8m 35s | Avg:  4m 17s | Max:  4m 22s | Hits:  98%/5732  
      🟩 GCC11              Pass: 100%/2   | Total:  8m 09s | Avg:  4m 04s | Max:  4m 14s | Hits:  98%/5728  
      🟩 GCC12              Pass: 100%/2   | Total:  8m 41s | Avg:  4m 20s | Max:  4m 24s | Hits:  98%/5728  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 18m | Avg:  7m 52s | Max: 17m 08s | Hits:  98%/14601 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 51m 40s | Avg: 25m 50s | Max: 26m 45s | Hits:  98%/5390  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 54m 22s | Avg: 27m 11s | Max: 27m 14s | Hits:  98%/5456  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 18m 02s | Avg:  9m 01s | Max:  9m 12s | Hits:  98%/5714  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  1h 52m | Avg:  7m 03s | Max: 24m 00s | Hits:  89%/42946 
      🟩 GCC                Pass: 100%/21  | Total:  2h 06m | Avg:  6m 02s | Max: 17m 08s | Hits:  98%/45971 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 46m | Avg: 26m 30s | Max: 27m 14s | Hits:  98%/10846 
      🟩 NVHPC              Pass: 100%/2   | Total: 18m 02s | Avg:  9m 01s | Max:  9m 12s | Hits:  98%/5714  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 13m 22s | Hits:  98%/2974  
      🟩 rtx2080            Pass: 100%/41  | Total:  5h 46m | Avg:  8m 26s | Max: 27m 14s | Hits:  94%/102503
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  4h 57m | Avg:  8m 02s | Max: 27m 14s | Hits:  94%/105437
      🟩 NVRTC              Pass: 100%/2   | Total: 32m 49s | Avg: 16m 24s | Max: 17m 08s | Hits:  90%/40    
      🟩 Test               Pass: 100%/3   | Total: 31m 16s | Avg: 10m 25s | Max: 13m 22s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 08s | Avg:  2m 08s | Max:  2m 08s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 32m 49s | Avg: 16m 24s | Max: 17m 08s | Hits:  90%/40    
      🟩 90                 Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 13m 22s | Hits:  98%/2974  
      🟩 90;90a;100         Pass: 100%/1   | Total:  4m 32s | Avg:  4m 32s | Max:  4m 32s | Hits:  98%/2974  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  3h 12m | Avg:  9m 11s | Max: 27m 14s | Hits:  94%/56387 
      🟩 20                 Pass: 100%/21  | Total:  2h 48m | Avg:  8m 02s | Max: 27m 08s | Hits:  94%/49090 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 2h 11m | Avg: 5m 57s | Max: 14m 03s | Hits: 97%/11722

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  1h 57m | Avg:  6m 33s | Max: 14m 03s | Hits:  96%/9406  
      🟩 arm64              Pass: 100%/4   | Total: 13m 11s | Avg:  3m 17s | Max:  3m 22s | Hits:  99%/2316  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s | Hits:  59%/277   
      🟩 12.5               Pass: 100%/2   | Total: 11m 48s | Avg:  5m 54s | Max:  6m 00s | Hits:  96%/742   
      🟩 12.8               Pass: 100%/19  | Total:  1h 47m | Avg:  5m 39s | Max: 14m 03s | Hits:  98%/10703 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s | Hits:  59%/277   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 11m 48s | Avg:  5m 54s | Max:  6m 00s | Hits:  96%/742   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 47m | Avg:  5m 39s | Max: 14m 03s | Hits:  98%/10703 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  2h 11m | Avg:  5m 57s | Max: 14m 03s | Hits:  97%/11722 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  3m 47s | Avg:  3m 47s | Max:  3m 47s | Hits:  99%/581   
      🟩 Clang15            Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s | Hits:  99%/579   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 46s | Avg:  3m 46s | Max:  3m 46s | Hits:  99%/579   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 54s | Avg:  3m 54s | Max:  3m 54s | Hits:  99%/579   
      🟩 Clang18            Pass: 100%/4   | Total: 23m 28s | Avg:  5m 52s | Max: 12m 57s | Hits:  99%/2316  
      🟩 GCC10              Pass: 100%/1   | Total:  4m 15s | Avg:  4m 15s | Max:  4m 15s | Hits:  96%/581   
      🟩 GCC11              Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s | Hits:  99%/579   
      🟩 GCC12              Pass: 100%/2   | Total: 17m 12s | Avg:  8m 36s | Max: 13m 24s | Hits:  99%/1158  
      🟩 GCC13              Pass: 100%/6   | Total: 31m 04s | Avg:  5m 10s | Max: 14m 03s | Hits:  99%/3474  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s | Hits:  59%/277   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 12m 23s | Avg: 12m 23s | Max: 12m 23s | Hits:  59%/277   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 11m 48s | Avg:  5m 54s | Max:  6m 00s | Hits:  96%/742   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 38m 37s | Avg:  4m 49s | Max: 12m 57s | Hits:  99%/4634  
      🟩 GCC                Pass: 100%/10  | Total: 56m 23s | Avg:  5m 38s | Max: 14m 03s | Hits:  98%/5792  
      🟩 MSVC               Pass: 100%/2   | Total: 24m 20s | Avg: 12m 10s | Max: 12m 23s | Hits:  59%/554   
      🟩 NVHPC              Pass: 100%/2   | Total: 11m 48s | Avg:  5m 54s | Max:  6m 00s | Hits:  96%/742   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 17m 32s | Avg:  8m 46s | Max: 14m 03s | Hits:  99%/1158  
      🟩 rtx2080            Pass: 100%/20  | Total:  1h 53m | Avg:  5m 40s | Max: 13m 24s | Hits:  96%/10564 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 30m | Avg:  4m 46s | Max: 12m 23s | Hits:  96%/9985  
      🟩 Test               Pass: 100%/3   | Total: 40m 24s | Avg: 13m 28s | Max: 14m 03s | Hits:  99%/1737  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 20m 56s | Avg:  6m 58s | Max: 14m 03s | Hits:  99%/1737  
      🟩 90a                Pass: 100%/1   | Total:  3m 27s | Avg:  3m 27s | Max:  3m 27s | Hits:  99%/579   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 15m 40s | Avg:  3m 55s | Max:  5m 48s | Hits:  98%/2108  
      🟩 20                 Pass: 100%/18  | Total:  1h 55m | Avg:  6m 24s | Max: 14m 03s | Hits:  96%/9614  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 22m 06s | Avg: 11m 03s | Max: 19m 46s | Hits: 98%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 22m 06s | Avg: 11m 03s | Max: 19m 46s | Hits:  98%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 22m 06s | Avg: 11m 03s | Max: 19m 46s | Hits:  98%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 22m 06s | Avg: 11m 03s | Max: 19m 46s | Hits:  98%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 22m 06s | Avg: 11m 03s | Max: 19m 46s | Hits:  98%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 22m 06s | Avg: 11m 03s | Max: 19m 46s | Hits:  98%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 22m 06s | Avg: 11m 03s | Max: 19m 46s | Hits:  98%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 22m 06s | Avg: 11m 03s | Max: 19m 46s | Hits:  98%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 20s | Avg:  2m 20s | Max:  2m 20s | Hits:  98%/160   
      🟩 Test               Pass: 100%/1   | Total: 19m 46s | Avg: 19m 46s | Max: 19m 46s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 03m | Avg: 1h 03m | Max: 1h 03m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 158)

# Runner
111 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

@fbusato fbusato merged commit 3260f76 into NVIDIA:main Mar 13, 2025
171 of 172 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Mar 13, 2025
@fbusato fbusato deleted the deprecate-cub-bfe-bfi branch March 20, 2025 22:06
davebayer pushed a commit to davebayer/cccl that referenced this pull request Apr 7, 2025

Labels

3.0: Targeted for 3.0 release

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

[FEA]: Deprecate/Replace cub::BFE

4 participants