Adds support for large number of segments and large number of items to `DeviceSegmentedRadixSort` by elstehle · Pull Request #3402 · NVIDIA/cccl · GitHub

Conversation

@elstehle
Contributor

@elstehle elstehle commented Jan 15, 2025

Description

Closes #3133

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@copy-pr-bot
Contributor

copy-pr-bot bot commented Jan 15, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@elstehle
Contributor Author

/ok to test

@elstehle elstehle marked this pull request as ready for review January 16, 2025 05:56
@elstehle elstehle requested review from a team as code owners January 16, 2025 05:56
@elstehle elstehle mentioned this pull request Jan 16, 2025
25 tasks
@elstehle elstehle force-pushed the enh/large-seg-support-seg-radix-sort branch from 6236d63 to 102f93c Compare January 16, 2025 06:10
@github-actions
Contributor

🟨 CI finished in 1h 50m: Pass: 98%/78 | Total: 1d 06h | Avg: 23m 29s | Max: 1h 01m | Hits: 400%/12760
  • 🟨 cub: Pass: 97%/38 | Total: 23h 26m | Avg: 37m 01s | Max: 1h 01m | Hits: 525%/3540

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/36  | Total: 22h 03m | Avg: 36m 46s | Max:  1h 01m | Hits: 525%/3540  
      🟩 arm64              Pass: 100%/2   | Total:  1h 22m | Avg: 41m 25s | Max: 41m 32s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  2h 53m | Avg: 34m 40s | Max: 37m 21s | Hits: 533%/885   
      🟩 12.5               Pass: 100%/2   | Total:  1h 18m | Avg: 39m 06s | Max: 40m 14s
      🔍 12.6               Pass:  96%/31  | Total: 19h 15m | Avg: 37m 15s | Max:  1h 01m | Hits: 522%/2655  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 43m | Avg: 51m 31s | Max: 52m 24s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 53m | Avg: 34m 40s | Max: 37m 21s | Hits: 533%/885   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 18m | Avg: 39m 06s | Max: 40m 14s
      🔍 nvcc12.6           Pass:  96%/29  | Total: 17h 32m | Avg: 36m 16s | Max:  1h 01m | Hits: 522%/2655  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 43m | Avg: 51m 31s | Max: 52m 24s
      🔍 nvcc               Pass:  97%/36  | Total: 21h 43m | Avg: 36m 12s | Max:  1h 01m | Hits: 525%/3540  
    🔍 cxx: GCC13 🔍
      🟩 Clang14            Pass: 100%/4   | Total:  2h 23m | Avg: 35m 51s | Max: 37m 44s
      🟩 Clang15            Pass: 100%/1   | Total: 34m 55s | Avg: 34m 55s | Max: 34m 55s
      🟩 Clang16            Pass: 100%/1   | Total: 34m 36s | Avg: 34m 36s | Max: 34m 36s
      🟩 Clang17            Pass: 100%/1   | Total: 35m 13s | Avg: 35m 13s | Max: 35m 13s
      🟩 Clang18            Pass: 100%/7   | Total:  4h 41m | Avg: 40m 11s | Max: 52m 24s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 11m | Avg: 35m 43s | Max: 37m 21s
      🟩 GCC8               Pass: 100%/1   | Total: 34m 30s | Avg: 34m 30s | Max: 34m 30s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 09m | Avg: 34m 51s | Max: 35m 02s
      🟩 GCC10              Pass: 100%/1   | Total: 36m 30s | Avg: 36m 30s | Max: 36m 30s
      🟩 GCC11              Pass: 100%/1   | Total: 34m 23s | Avg: 34m 23s | Max: 34m 23s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 18m | Avg: 26m 00s | Max: 39m 50s
      🔍 GCC13              Pass:  87%/8   | Total:  4h 29m | Avg: 33m 41s | Max: 51m 09s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 31m | Avg: 45m 54s | Max:  1h 01m | Hits: 527%/1770  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 53m | Avg: 56m 33s | Max: 58m 12s | Hits: 522%/1770  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 18m | Avg: 39m 06s | Max: 40m 14s
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/14  | Total:  8h 49m | Avg: 37m 49s | Max: 52m 24s
      🔍 GCC                Pass:  94%/18  | Total:  9h 54m | Avg: 33m 00s | Max: 51m 09s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 24m | Avg: 51m 13s | Max:  1h 01m | Hits: 525%/3540  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 18m | Avg: 39m 06s | Max: 40m 14s
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 38m 12s | Avg: 19m 06s | Max: 31m 12s
      🔍 v100               Pass:  97%/36  | Total: 22h 48m | Avg: 38m 00s | Max:  1h 01m | Hits: 525%/3540  
    🔍 jobs: TestGPU 🔍
      🟩 Build              Pass: 100%/31  | Total: 19h 25m | Avg: 37m 35s | Max:  1h 01m | Hits: 525%/3540  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 51m 09s | Avg: 51m 09s | Max: 51m 09s
      🟩 GraphCapture       Pass: 100%/1   | Total: 33m 48s | Avg: 33m 48s | Max: 33m 48s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 45m | Avg: 35m 03s | Max: 37m 29s
      🔍 TestGPU            Pass:  50%/2   | Total: 51m 32s | Avg: 25m 46s | Max: 26m 07s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/14  | Total:  9h 15m | Avg: 39m 40s | Max:  1h 01m | Hits: 526%/2655  
      🔍 20                 Pass:  95%/24  | Total: 14h 11m | Avg: 35m 28s | Max: 58m 12s | Hits: 522%/885   
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 38m 12s | Avg: 19m 06s | Max: 31m 12s
      🟩 90a                Pass: 100%/1   | Total:  7m 11s | Avg:  7m 11s | Max:  7m 11s
    
  • 🟩 thrust: Pass: 100%/37 | Total: 6h 28m | Avg: 10m 30s | Max: 38m 11s | Hits: 352%/9220

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 19m 09s | Avg:  9m 34s | Max: 12m 55s
    🟩 cpu
      🟩 amd64              Pass: 100%/35  | Total:  6h 19m | Avg: 10m 50s | Max: 38m 11s | Hits: 352%/9220  
      🟩 arm64              Pass: 100%/2   | Total:  9m 33s | Avg:  4m 46s | Max:  4m 55s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 51m 21s | Avg: 10m 16s | Max: 31m 15s | Hits: 353%/1844  
      🟩 12.5               Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 14m 24s
      🟩 12.6               Pass: 100%/30  | Total:  5h 08m | Avg: 10m 17s | Max: 38m 11s | Hits: 352%/7376  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 51s | Avg:  4m 55s | Max:  4m 59s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 51m 21s | Avg: 10m 16s | Max: 31m 15s | Hits: 353%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 14m 24s
      🟩 nvcc12.6           Pass: 100%/28  | Total:  4h 58m | Avg: 10m 40s | Max: 38m 11s | Hits: 352%/7376  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 51s | Avg:  4m 55s | Max:  4m 59s
      🟩 nvcc               Pass: 100%/35  | Total:  6h 18m | Avg: 10m 49s | Max: 38m 11s | Hits: 352%/9220  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 21m 31s | Avg:  5m 22s | Max:  5m 36s
      🟩 Clang15            Pass: 100%/1   | Total:  5m 25s | Avg:  5m 25s | Max:  5m 25s
      🟩 Clang16            Pass: 100%/1   | Total:  5m 22s | Avg:  5m 22s | Max:  5m 22s
      🟩 Clang17            Pass: 100%/1   | Total:  5m 17s | Avg:  5m 17s | Max:  5m 17s
      🟩 Clang18            Pass: 100%/7   | Total: 46m 17s | Avg:  6m 36s | Max: 12m 23s
      🟩 GCC7               Pass: 100%/2   | Total: 10m 28s | Avg:  5m 14s | Max:  5m 49s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 03s | Avg:  5m 03s | Max:  5m 03s
      🟩 GCC9               Pass: 100%/2   | Total: 10m 14s | Avg:  5m 07s | Max:  5m 25s
      🟩 GCC10              Pass: 100%/1   | Total:  5m 21s | Avg:  5m 21s | Max:  5m 21s
      🟩 GCC11              Pass: 100%/1   | Total:  5m 57s | Avg:  5m 57s | Max:  5m 57s
      🟩 GCC12              Pass: 100%/1   | Total:  6m 18s | Avg:  6m 18s | Max:  6m 18s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 01m | Avg:  7m 40s | Max: 13m 06s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 00m | Avg: 30m 26s | Max: 31m 15s | Hits: 352%/3688  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 50m | Avg: 36m 48s | Max: 38m 11s | Hits: 352%/5532  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 14m 24s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total:  1h 23m | Avg:  5m 59s | Max: 12m 23s
      🟩 GCC                Pass: 100%/16  | Total:  1h 44m | Avg:  6m 33s | Max: 13m 06s
      🟩 MSVC               Pass: 100%/5   | Total:  2h 51m | Avg: 34m 15s | Max: 38m 11s | Hits: 352%/9220  
      🟩 NVHPC              Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 14m 24s
    🟩 gpu
      🟩 v100               Pass: 100%/37  | Total:  6h 28m | Avg: 10m 30s | Max: 38m 11s | Hits: 352%/9220  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total:  4h 57m | Avg:  9m 35s | Max: 38m 11s | Hits: 349%/7376  
      🟩 TestCPU            Pass: 100%/3   | Total: 53m 13s | Avg: 17m 44s | Max: 37m 08s | Hits: 365%/1844  
      🟩 TestGPU            Pass: 100%/3   | Total: 38m 24s | Avg: 12m 48s | Max: 13m 06s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 33s | Avg:  4m 33s | Max:  4m 33s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total:  2h 43m | Avg: 11m 39s | Max: 35m 07s | Hits: 352%/5532  
      🟩 20                 Pass: 100%/21  | Total:  3h 26m | Avg:  9m 49s | Max: 38m 11s | Hits: 353%/3688  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 18s | Avg: 4m 39s | Max: 7m 26s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  1m 52s | Avg:  1m 52s | Max:  1m 52s
      🟩 Test               Pass: 100%/1   | Total:  7m 26s | Avg:  7m 26s | Max:  7m 26s
    
  • 🟩 python: Pass: 100%/1 | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 78)

# Runner
53 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@elstehle elstehle force-pushed the enh/large-seg-support-seg-radix-sort branch from 77790f5 to 1785fb4 Compare January 23, 2025 11:25
@github-actions
Contributor

🟩 CI finished in 3h 04m: Pass: 100%/78 | Total: 1d 15h | Avg: 30m 39s | Max: 1h 06m | Hits: 385%/12708
  • 🟩 cub: Pass: 100%/38 | Total: 1d 04h | Avg: 45m 13s | Max: 1h 06m | Hits: 488%/3528

    🟩 cpu
      🟩 amd64              Pass: 100%/36  | Total:  1d 02h | Avg: 44m 45s | Max:  1h 06m | Hits: 488%/3528  
      🟩 arm64              Pass: 100%/2   | Total:  1h 47m | Avg: 53m 31s | Max: 56m 02s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 07m | Avg: 49m 24s | Max:  1h 01m | Hits: 488%/882   
      🟩 12.5               Pass: 100%/2   | Total:  1h 45m | Avg: 52m 30s | Max: 54m 48s
      🟩 12.6               Pass: 100%/31  | Total: 22h 46m | Avg: 44m 04s | Max:  1h 06m | Hits: 488%/2646  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 51m | Avg: 55m 48s | Max: 56m 25s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 07m | Avg: 49m 24s | Max:  1h 01m | Hits: 488%/882   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 45m | Avg: 52m 30s | Max: 54m 48s
      🟩 nvcc12.6           Pass: 100%/29  | Total: 20h 54m | Avg: 43m 16s | Max:  1h 06m | Hits: 488%/2646  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 51m | Avg: 55m 48s | Max: 56m 25s
      🟩 nvcc               Pass: 100%/36  | Total:  1d 02h | Avg: 44m 38s | Max:  1h 06m | Hits: 488%/3528  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 04m | Avg: 46m 06s | Max: 48m 34s
      🟩 Clang15            Pass: 100%/1   | Total: 43m 02s | Avg: 43m 02s | Max: 43m 02s
      🟩 Clang16            Pass: 100%/1   | Total: 43m 59s | Avg: 43m 59s | Max: 43m 59s
      🟩 Clang17            Pass: 100%/1   | Total: 43m 49s | Avg: 43m 49s | Max: 43m 49s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 21m | Avg: 45m 53s | Max: 56m 25s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 32m | Avg: 46m 10s | Max: 47m 53s
      🟩 GCC8               Pass: 100%/1   | Total: 43m 11s | Avg: 43m 11s | Max: 43m 11s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 32m | Avg: 46m 07s | Max: 48m 39s
      🟩 GCC10              Pass: 100%/1   | Total: 43m 09s | Avg: 43m 09s | Max: 43m 09s
      🟩 GCC11              Pass: 100%/1   | Total: 43m 50s | Avg: 43m 50s | Max: 43m 50s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 38m | Avg: 32m 53s | Max: 48m 07s
      🟩 GCC13              Pass: 100%/8   | Total:  5h 10m | Avg: 38m 45s | Max: 55m 35s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 01m | Hits: 488%/1764  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 06m | Hits: 488%/1764  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 45m | Avg: 52m 30s | Max: 54m 48s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total: 10h 36m | Avg: 45m 28s | Max: 56m 25s
      🟩 GCC                Pass: 100%/18  | Total: 12h 03m | Avg: 40m 11s | Max: 55m 35s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 13m | Avg:  1h 03m | Max:  1h 06m | Hits: 488%/3528  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 45m | Avg: 52m 30s | Max: 54m 48s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 50m 32s | Avg: 25m 16s | Max: 31m 30s
      🟩 v100               Pass: 100%/36  | Total:  1d 03h | Avg: 46m 20s | Max:  1h 06m | Hits: 488%/3528  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total:  1d 00h | Avg: 47m 37s | Max:  1h 06m | Hits: 488%/3528  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 55m 35s | Avg: 55m 35s | Max: 55m 35s
      🟩 GraphCapture       Pass: 100%/1   | Total: 33m 25s | Avg: 33m 25s | Max: 33m 25s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 38m | Avg: 32m 53s | Max: 36m 40s
      🟩 TestGPU            Pass: 100%/2   | Total: 54m 22s | Avg: 27m 11s | Max: 27m 40s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 50m 32s | Avg: 25m 16s | Max: 31m 30s
      🟩 90a                Pass: 100%/1   | Total: 18m 23s | Avg: 18m 23s | Max: 18m 23s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total: 11h 53m | Avg: 50m 59s | Max:  1h 05m | Hits: 488%/2646  
      🟩 20                 Pass: 100%/24  | Total: 16h 44m | Avg: 41m 51s | Max:  1h 06m | Hits: 488%/882   
    
  • 🟩 thrust: Pass: 100%/37 | Total: 10h 07m | Avg: 16m 25s | Max: 39m 03s | Hits: 345%/9180

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 23m 51s | Avg: 11m 55s | Max: 12m 32s
    🟩 cpu
      🟩 amd64              Pass: 100%/35  | Total:  9h 43m | Avg: 16m 40s | Max: 39m 03s | Hits: 345%/9180  
      🟩 arm64              Pass: 100%/2   | Total: 23m 51s | Avg: 11m 55s | Max: 12m 19s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 25m | Avg: 17m 02s | Max: 34m 34s | Hits: 340%/1836  
      🟩 12.5               Pass: 100%/2   | Total: 55m 49s | Avg: 27m 54s | Max: 27m 59s
      🟩 12.6               Pass: 100%/30  | Total:  7h 46m | Avg: 15m 33s | Max: 39m 03s | Hits: 346%/7344  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 24m 11s | Avg: 12m 05s | Max: 12m 43s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 25m | Avg: 17m 02s | Max: 34m 34s | Hits: 340%/1836  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 55m 49s | Avg: 27m 54s | Max: 27m 59s
      🟩 nvcc12.6           Pass: 100%/28  | Total:  7h 22m | Avg: 15m 47s | Max: 39m 03s | Hits: 346%/7344  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 24m 11s | Avg: 12m 05s | Max: 12m 43s
      🟩 nvcc               Pass: 100%/35  | Total:  9h 43m | Avg: 16m 40s | Max: 39m 03s | Hits: 345%/9180  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 50m 48s | Avg: 12m 42s | Max: 13m 25s
      🟩 Clang15            Pass: 100%/1   | Total: 13m 06s | Avg: 13m 06s | Max: 13m 06s
      🟩 Clang16            Pass: 100%/1   | Total: 14m 09s | Avg: 14m 09s | Max: 14m 09s
      🟩 Clang17            Pass: 100%/1   | Total: 13m 32s | Avg: 13m 32s | Max: 13m 32s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 26m | Avg: 12m 23s | Max: 15m 14s
      🟩 GCC7               Pass: 100%/2   | Total: 26m 55s | Avg: 13m 27s | Max: 13m 42s
      🟩 GCC8               Pass: 100%/1   | Total: 13m 10s | Avg: 13m 10s | Max: 13m 10s
      🟩 GCC9               Pass: 100%/2   | Total: 27m 08s | Avg: 13m 34s | Max: 14m 10s
      🟩 GCC10              Pass: 100%/1   | Total: 14m 10s | Avg: 14m 10s | Max: 14m 10s
      🟩 GCC11              Pass: 100%/1   | Total: 13m 31s | Avg: 13m 31s | Max: 13m 31s
      🟩 GCC12              Pass: 100%/1   | Total: 13m 09s | Avg: 13m 09s | Max: 13m 09s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 30m | Avg: 11m 20s | Max: 13m 01s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 09m | Avg: 34m 48s | Max: 35m 02s | Hits: 340%/3672  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 45m | Avg: 35m 00s | Max: 39m 03s | Hits: 348%/5508  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 55m 49s | Avg: 27m 54s | Max: 27m 59s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total:  2h 58m | Avg: 12m 44s | Max: 15m 14s
      🟩 GCC                Pass: 100%/16  | Total:  3h 18m | Avg: 12m 25s | Max: 14m 10s
      🟩 MSVC               Pass: 100%/5   | Total:  2h 54m | Avg: 34m 55s | Max: 39m 03s | Hits: 345%/9180  
      🟩 NVHPC              Pass: 100%/2   | Total: 55m 49s | Avg: 27m 54s | Max: 27m 59s
    🟩 gpu
      🟩 v100               Pass: 100%/37  | Total: 10h 07m | Avg: 16m 25s | Max: 39m 03s | Hits: 345%/9180  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total:  8h 40m | Avg: 16m 47s | Max: 39m 03s | Hits: 340%/7344  
      🟩 TestCPU            Pass: 100%/3   | Total: 46m 08s | Avg: 15m 22s | Max: 30m 31s | Hits: 365%/1836  
      🟩 TestGPU            Pass: 100%/3   | Total: 40m 47s | Avg: 13m 35s | Max: 15m 14s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  8m 13s | Avg:  8m 13s | Max:  8m 13s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total:  4h 23m | Avg: 18m 49s | Max: 35m 27s | Hits: 340%/5508  
      🟩 20                 Pass: 100%/21  | Total:  5h 20m | Avg: 15m 14s | Max: 39m 03s | Hits: 352%/3672  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 12m 24s | Avg: 6m 12s | Max: 10m 17s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 07s | Avg:  2m 07s | Max:  2m 07s
      🟩 Test               Pass: 100%/1   | Total: 10m 17s | Avg: 10m 17s | Max: 10m 17s
    
  • 🟩 python: Pass: 100%/1 | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 78)

# Runner
53 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@elstehle
Contributor Author

pre-commit.ci autofix

@copy-pr-bot
Contributor

copy-pr-bot bot commented Jan 27, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@elstehle
Contributor Author

elstehle commented Feb 7, 2025

/ok to test

@elstehle
Contributor Author

/ok to test

@elstehle
Contributor Author

/ok to test

@elstehle
Contributor Author

/ok to test

@github-actions
Contributor

🟨 CI finished in 1h 20m: Pass: 61%/97 | Total: 1d 23h | Avg: 29m 32s | Max: 1h 08m | Hits: 90%/89387
  • 🟨 cub: Pass: 17%/45 | Total: 1d 07h | Avg: 42m 23s | Max: 1h 08m | Hits: 77%/8526

    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 49m | Avg: 54m 46s | Max: 55m 46s | Hits:  89%/2104  
      🔍 nvcc               Pass:  13%/43  | Total:  1d 05h | Avg: 41m 48s | Max:  1h 08m | Hits:  72%/6422  
    🟨 ctk
      🟨 12.0               Pass:  20%/5   | Total:  4h 24m | Avg: 52m 55s | Max:  1h 03m | Hits:  76%/1042  
      🟩 12.6               Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 08m | Hits:  67%/2254  
      🟨 12.8               Pass:  13%/38  | Total:  1d 01h | Avg: 39m 43s | Max:  1h 02m | Hits:  81%/5230  
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 49m | Avg: 54m 46s | Max: 55m 46s | Hits:  89%/2104  
      🟨 nvcc12.0           Pass:  20%/5   | Total:  4h 24m | Avg: 52m 55s | Max:  1h 03m | Hits:  76%/1042  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 08m | Hits:  67%/2254  
      🟨 nvcc12.8           Pass:   8%/36  | Total: 23h 20m | Avg: 38m 53s | Max:  1h 02m | Hits:  76%/3126  
    🟨 cxx
      🟥 Clang14            Pass:   0%/4   | Total:  4h 01m | Avg:  1h 00m | Max:  1h 03m
      🟥 Clang15            Pass:   0%/2   | Total:  1h 57m | Avg: 58m 57s | Max:  1h 00m
      🟥 Clang16            Pass:   0%/2   | Total:  1h 55m | Avg: 57m 53s | Max:  1h 00m
      🟥 Clang17            Pass:   0%/2   | Total:  1h 54m | Avg: 57m 27s | Max: 59m 00s
      🟨 Clang18            Pass:  28%/7   | Total:  4h 44m | Avg: 40m 35s | Max:  1h 02m | Hits:  89%/2104  
      🟥 GCC7               Pass:   0%/2   | Total:  1h 30m | Avg: 45m 24s | Max: 47m 03s
      🟥 GCC8               Pass:   0%/1   | Total: 45m 38s | Avg: 45m 38s | Max: 45m 38s
      🟥 GCC9               Pass:   0%/2   | Total:  1h 29m | Avg: 44m 33s | Max: 44m 58s
      🟥 GCC10              Pass:   0%/2   | Total:  1h 25m | Avg: 42m 59s | Max: 43m 36s
      🟥 GCC11              Pass:   0%/2   | Total:  1h 30m | Avg: 45m 04s | Max: 47m 17s
      🟥 GCC12              Pass:   0%/2   | Total:  1h 27m | Avg: 43m 31s | Max: 43m 54s
      🟥 GCC13              Pass:   0%/11  | Total:  3h 13m | Avg: 17m 34s | Max: 43m 20s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 46m | Avg: 53m 19s | Max: 54m 13s | Hits:  76%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  1h 51m | Avg: 55m 41s | Max: 57m 04s | Hits:  75%/2084  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 08m | Hits:  67%/2254  
    🟨 cxx_family
      🟨 Clang              Pass:  11%/17  | Total: 14h 34m | Avg: 51m 25s | Max:  1h 03m | Hits:  89%/2104  
      🟥 GCC                Pass:   0%/22  | Total: 11h 22m | Avg: 31m 00s | Max: 47m 17s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 38m | Avg: 54m 30s | Max: 57m 04s | Hits:  76%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 08m | Hits:  67%/2254  
    🟨 cpu
      🟨 amd64              Pass:  18%/43  | Total:  1d 06h | Avg: 42m 04s | Max:  1h 08m | Hits:  77%/8526  
      🟥 arm64              Pass:   0%/2   | Total:  1h 38m | Avg: 49m 11s | Max: 56m 48s
    🟨 gpu
      🟥 h100               Pass:   0%/3   | Total: 22m 56s | Avg:  7m 38s | Max: 22m 56s
      🟨 rtx2080            Pass:  23%/34  | Total:  1d 05h | Avg: 52m 19s | Max:  1h 08m | Hits:  77%/8526  
      🟥 rtxa6000           Pass:   0%/8   | Total:  1h 45m | Avg: 13m 12s | Max:  1h 02m
    🟨 jobs
      🟨 Build              Pass:  21%/37  | Total:  1d 07h | Avg: 51m 33s | Max:  1h 08m | Hits:  77%/8526  
      🟥 DeviceLaunch       Pass:   0%/1  
      🟥 GraphCapture       Pass:   0%/1  
      🟥 HostLaunch         Pass:   0%/3  
      🟥 TestGPU            Pass:   0%/3  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total: 22m 56s | Avg:  7m 38s | Max: 22m 56s
      🟥 90;90a;100         Pass:   0%/1   | Total: 42m 09s | Avg: 42m 09s | Max: 42m 09s
    🟨 std
      🟨 17                 Pass:  25%/20  | Total: 17h 25m | Avg: 52m 17s | Max:  1h 04m | Hits:  77%/5305  
      🟨 20                 Pass:  12%/25  | Total: 14h 21m | Avg: 34m 28s | Max:  1h 08m | Hits:  77%/3221  
    
  • 🟩 thrust: Pass: 100%/45 | Total: 14h 14m | Avg: 18m 59s | Max: 49m 17s | Hits: 91%/80541

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 24m 04s | Avg: 12m 02s | Max: 12m 47s | Hits:  95%/3582  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 13h 51m | Avg: 19m 20s | Max: 49m 17s | Hits:  91%/76960 
      🟩 arm64              Pass: 100%/2   | Total: 23m 01s | Avg: 11m 30s | Max: 13m 43s | Hits:  92%/3581  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 57m | Avg: 23m 35s | Max: 49m 17s | Hits:  89%/8946  
      🟩 12.6               Pass: 100%/2   | Total:  1h 25m | Avg: 42m 46s | Max: 45m 38s | Hits:  86%/3580  
      🟩 12.8               Pass: 100%/38  | Total: 10h 51m | Avg: 17m 07s | Max: 48m 54s | Hits:  92%/68015 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 24m 49s | Avg: 12m 24s | Max: 12m 51s | Hits:  93%/3580  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 57m | Avg: 23m 35s | Max: 49m 17s | Hits:  89%/8946  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  1h 25m | Avg: 42m 46s | Max: 45m 38s | Hits:  86%/3580  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 10h 26m | Avg: 17m 23s | Max: 48m 54s | Hits:  92%/64435 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 24m 49s | Avg: 12m 24s | Max: 12m 51s | Hits:  93%/3580  
      🟩 nvcc               Pass: 100%/43  | Total: 13h 49m | Avg: 19m 17s | Max: 49m 17s | Hits:  91%/76961 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 57m 47s | Avg: 14m 26s | Max: 16m 33s | Hits:  91%/7160  
      🟩 Clang15            Pass: 100%/2   | Total: 29m 37s | Avg: 14m 48s | Max: 16m 07s | Hits:  91%/3580  
      🟩 Clang16            Pass: 100%/2   | Total: 31m 21s | Avg: 15m 40s | Max: 15m 58s | Hits:  91%/3580  
      🟩 Clang17            Pass: 100%/2   | Total: 28m 18s | Avg: 14m 09s | Max: 14m 39s | Hits:  91%/3580  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 21m | Avg: 11m 41s | Max: 15m 26s | Hits:  94%/12530 
      🟩 GCC7               Pass: 100%/2   | Total: 33m 03s | Avg: 16m 31s | Max: 18m 07s | Hits:  90%/3582  
      🟩 GCC8               Pass: 100%/1   | Total: 15m 01s | Avg: 15m 01s | Max: 15m 01s | Hits:  90%/1791  
      🟩 GCC9               Pass: 100%/2   | Total: 39m 01s | Avg: 19m 30s | Max: 19m 57s | Hits:  89%/3582  
      🟩 GCC10              Pass: 100%/2   | Total: 32m 17s | Avg: 16m 08s | Max: 16m 58s | Hits:  90%/3582  
      🟩 GCC11              Pass: 100%/2   | Total: 35m 38s | Avg: 17m 49s | Max: 19m 39s | Hits:  90%/3582  
      🟩 GCC12              Pass: 100%/2   | Total: 40m 13s | Avg: 20m 06s | Max: 20m 33s | Hits:  89%/3582  
      🟩 GCC13              Pass: 100%/10  | Total:  2h 07m | Avg: 12m 44s | Max: 18m 43s | Hits:  95%/17910 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 35m | Avg: 47m 41s | Max: 49m 17s | Hits:  83%/3568  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 02m | Avg: 40m 41s | Max: 48m 54s | Hits:  88%/5352  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  1h 25m | Avg: 42m 46s | Max: 45m 38s | Hits:  86%/3580  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  3h 48m | Avg: 13m 27s | Max: 16m 33s | Hits:  92%/30430 
      🟩 GCC                Pass: 100%/21  | Total:  5h 22m | Avg: 15m 21s | Max: 20m 33s | Hits:  92%/37611 
      🟩 MSVC               Pass: 100%/5   | Total:  3h 37m | Avg: 43m 29s | Max: 49m 17s | Hits:  86%/8920  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 25m | Avg: 42m 46s | Max: 45m 38s | Hits:  86%/3580  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 25s | Avg:  8m 12s | Max: 11m 33s | Hits:  99%/3582  
      🟩 rtx2080            Pass: 100%/33  | Total: 11h 06m | Avg: 20m 11s | Max: 49m 17s | Hits:  89%/59066 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 52m | Avg: 17m 12s | Max: 48m 54s | Hits:  95%/17893 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 12h 46m | Avg: 20m 10s | Max: 49m 17s | Hits:  90%/68013 
      🟩 TestCPU            Pass: 100%/3   | Total: 43m 35s | Avg: 14m 31s | Max: 27m 09s | Hits:  99%/5365  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 16s | Avg: 11m 04s | Max: 11m 33s | Hits:  99%/7163  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 16m 25s | Avg:  8m 12s | Max: 11m 33s | Hits:  99%/3582  
      🟩 90;90a;100         Pass: 100%/1   | Total: 15m 58s | Avg: 15m 58s | Max: 15m 58s | Hits:  90%/1791  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  7h 31m | Avg: 22m 35s | Max: 49m 17s | Hits:  89%/35791 
      🟩 20                 Pass: 100%/23  | Total:  6h 18m | Avg: 16m 27s | Max: 48m 54s | Hits:  93%/41168 
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 15m 48s | Avg: 3m 57s | Max: 4m 40s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 20s | Avg:  4m 40s | Max:  4m 40s
      🟩 arm64              Pass: 100%/2   | Total:  6m 28s | Avg:  3m 14s | Max:  3m 14s
    🟩 ctk
      🟩 12.6               Pass: 100%/4   | Total: 15m 48s | Avg:  3m 57s | Max:  4m 40s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/4   | Total: 15m 48s | Avg:  3m 57s | Max:  4m 40s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 15m 48s | Avg:  3m 57s | Max:  4m 40s
    🟩 cxx
      🟩 NVHPC25.1          Pass: 100%/4   | Total: 15m 48s | Avg:  3m 57s | Max:  4m 40s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 15m 48s | Avg:  3m 57s | Max:  4m 40s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 15m 48s | Avg:  3m 57s | Max:  4m 40s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 15m 48s | Avg:  3m 57s | Max:  4m 40s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  7m 54s | Avg:  3m 57s | Max:  4m 40s
      🟩 20                 Pass: 100%/2   | Total:  7m 54s | Avg:  3m 57s | Max:  4m 40s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 22m 16s | Avg: 11m 08s | Max: 20m 12s | Hits: 98%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 22m 16s | Avg: 11m 08s | Max: 20m 12s | Hits:  98%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 22m 16s | Avg: 11m 08s | Max: 20m 12s | Hits:  98%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 22m 16s | Avg: 11m 08s | Max: 20m 12s | Hits:  98%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 22m 16s | Avg: 11m 08s | Max: 20m 12s | Hits:  98%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 22m 16s | Avg: 11m 08s | Max: 20m 12s | Hits:  98%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 22m 16s | Avg: 11m 08s | Max: 20m 12s | Hits:  98%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 22m 16s | Avg: 11m 08s | Max: 20m 12s | Hits:  98%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 04s | Avg:  2m 04s | Max:  2m 04s | Hits:  98%/160   
      🟩 Test               Pass: 100%/1   | Total: 20m 12s | Avg: 20m 12s | Max: 20m 12s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 05m | Avg: 1h 05m | Max: 1h 05m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- stdpar
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 97)

# Runner
68 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-arm64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@elstehle
Contributor Author

/ok to test

@github-actions
Contributor

🟩 CI finished in 1h 14m: Pass: 100%/97 | Total: 1d 01h | Avg: 15m 30s | Max: 1h 12m | Hits: 99%/134512
  • 🟩 cub: Pass: 100%/45 | Total: 17h 06m | Avg: 22m 48s | Max: 43m 59s | Hits: 98%/53651

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 16h 03m | Avg: 22m 25s | Max: 37m 58s | Hits:  98%/51213 
      🟩 arm64              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 03s | Max: 43m 59s | Hits:  98%/2438  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 01m | Avg: 24m 21s | Max: 37m 44s | Hits:  98%/5926  
      🟩 12.6               Pass: 100%/2   | Total: 21m 33s | Avg: 10m 46s | Max: 11m 09s | Hits:  98%/2254  
      🟩 12.8               Pass: 100%/38  | Total: 14h 42m | Avg: 23m 13s | Max: 43m 59s | Hits:  98%/45471 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 49s | Avg:  4m 54s | Max:  4m 55s | Hits: 100%/2104  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 01m | Avg: 24m 21s | Max: 37m 44s | Hits:  98%/5926  
      🟩 nvcc12.6           Pass: 100%/2   | Total: 21m 33s | Avg: 10m 46s | Max: 11m 09s | Hits:  98%/2254  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 14h 32m | Avg: 24m 14s | Max: 43m 59s | Hits:  98%/43367 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 49s | Avg:  4m 54s | Max:  4m 55s | Hits: 100%/2104  
      🟩 nvcc               Pass: 100%/43  | Total: 16h 56m | Avg: 23m 37s | Max: 43m 59s | Hits:  98%/51547 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 28m | Avg: 37m 08s | Max: 37m 55s | Hits:  98%/4884  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 14m | Avg: 37m 06s | Max: 37m 31s | Hits:  98%/2438  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 13m | Avg: 36m 47s | Max: 37m 44s | Hits:  98%/2438  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 14m | Avg: 37m 08s | Max: 37m 27s | Hits:  98%/2438  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 56m | Avg: 25m 08s | Max: 43m 59s | Hits:  99%/8199  
      🟩 GCC7               Pass: 100%/2   | Total: 30m 28s | Avg: 15m 14s | Max: 15m 39s | Hits:  98%/2442  
      🟩 GCC8               Pass: 100%/1   | Total: 15m 44s | Avg: 15m 44s | Max: 15m 44s | Hits:  98%/1221  
      🟩 GCC9               Pass: 100%/2   | Total: 29m 40s | Avg: 14m 50s | Max: 15m 00s | Hits:  98%/2442  
      🟩 GCC10              Pass: 100%/2   | Total: 23m 41s | Avg: 11m 50s | Max: 14m 12s | Hits:  98%/2442  
      🟩 GCC11              Pass: 100%/2   | Total: 28m 36s | Avg: 14m 18s | Max: 14m 23s | Hits:  98%/2438  
      🟩 GCC12              Pass: 100%/2   | Total: 24m 42s | Avg: 12m 21s | Max: 15m 11s | Hits:  98%/2438  
      🟩 GCC13              Pass: 100%/11  | Total:  3h 50m | Avg: 20m 54s | Max: 29m 30s | Hits:  96%/13409 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 36m 19s | Avg: 18m 09s | Max: 18m 34s | Hits:  99%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 38m 41s | Avg: 19m 20s | Max: 19m 22s | Hits:  99%/2084  
      🟩 NVHPC25.1          Pass: 100%/2   | Total: 21m 33s | Avg: 10m 46s | Max: 11m 09s | Hits:  98%/2254  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  9h 06m | Avg: 32m 09s | Max: 43m 59s | Hits:  99%/20397 
      🟩 GCC                Pass: 100%/22  | Total:  6h 22m | Avg: 17m 24s | Max: 29m 30s | Hits:  97%/26832 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 15m | Avg: 18m 45s | Max: 19m 22s | Hits:  99%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total: 21m 33s | Avg: 10m 46s | Max: 11m 09s | Hits:  98%/2254  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 18m | Avg: 26m 02s | Max: 29m 30s | Hits:  88%/3657  
      🟩 rtx2080            Pass: 100%/34  | Total: 12h 34m | Avg: 22m 11s | Max: 43m 59s | Hits:  98%/40242 
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 13m | Avg: 24m 09s | Max: 37m 40s | Hits:  99%/9752  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 13h 51m | Avg: 22m 28s | Max: 43m 59s | Hits:  97%/43899 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 26m 09s | Avg: 26m 09s | Max: 26m 09s | Hits:  99%/1219  
      🟩 GraphCapture       Pass: 100%/1   | Total: 19m 40s | Avg: 19m 40s | Max: 19m 40s | Hits:  99%/1219  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 21m | Avg: 27m 16s | Max: 29m 30s | Hits:  99%/3657  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 07m | Avg: 22m 21s | Max: 24m 47s | Hits:  99%/3657  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 18m | Avg: 26m 02s | Max: 29m 30s | Hits:  88%/3657  
      🟩 90;90a;100         Pass: 100%/1   | Total:  9m 58s | Avg:  9m 58s | Max:  9m 58s | Hits:  98%/1219  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  6h 56m | Avg: 20m 49s | Max: 37m 58s | Hits:  98%/23606 
      🟩 20                 Pass: 100%/25  | Total: 10h 09m | Avg: 24m 22s | Max: 43m 59s | Hits:  97%/30045 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 6h 13m | Avg: 8m 18s | Max: 27m 00s | Hits: 99%/80541

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 04s | Avg:  8m 32s | Max: 11m 04s | Hits:  99%/3582  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  6h 04m | Avg:  8m 27s | Max: 27m 00s | Hits:  99%/76960 
      🟩 arm64              Pass: 100%/2   | Total:  9m 50s | Avg:  4m 55s | Max:  5m 14s | Hits:  99%/3581  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 38m 59s | Avg:  7m 47s | Max: 19m 34s | Hits:  99%/8946  
      🟩 12.6               Pass: 100%/2   | Total: 29m 14s | Avg: 14m 37s | Max: 15m 25s | Hits:  99%/3580  
      🟩 12.8               Pass: 100%/38  | Total:  5h 05m | Avg:  8m 02s | Max: 27m 00s | Hits:  99%/68015 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 06s | Avg:  5m 03s | Max:  5m 13s | Hits: 100%/3580  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 38m 59s | Avg:  7m 47s | Max: 19m 34s | Hits:  99%/8946  
      🟩 nvcc12.6           Pass: 100%/2   | Total: 29m 14s | Avg: 14m 37s | Max: 15m 25s | Hits:  99%/3580  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  4h 55m | Avg:  8m 12s | Max: 27m 00s | Hits:  99%/64435 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 06s | Avg:  5m 03s | Max:  5m 13s | Hits: 100%/3580  
      🟩 nvcc               Pass: 100%/43  | Total:  6h 03m | Avg:  8m 27s | Max: 27m 00s | Hits:  99%/76961 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 20m 03s | Avg:  5m 00s | Max:  5m 17s | Hits: 100%/7160  
      🟩 Clang15            Pass: 100%/2   | Total: 11m 04s | Avg:  5m 32s | Max:  5m 36s | Hits: 100%/3580  
      🟩 Clang16            Pass: 100%/2   | Total: 10m 40s | Avg:  5m 20s | Max:  5m 21s | Hits: 100%/3580  
      🟩 Clang17            Pass: 100%/2   | Total: 10m 57s | Avg:  5m 28s | Max:  5m 32s | Hits: 100%/3580  
      🟩 Clang18            Pass: 100%/7   | Total: 42m 59s | Avg:  6m 08s | Max: 10m 18s | Hits: 100%/12530 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 26s | Avg:  5m 13s | Max:  5m 35s | Hits:  99%/3582  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 17s | Avg:  5m 17s | Max:  5m 17s | Hits:  99%/1791  
      🟩 GCC9               Pass: 100%/2   | Total: 10m 41s | Avg:  5m 20s | Max:  5m 42s | Hits:  99%/3582  
      🟩 GCC10              Pass: 100%/2   | Total: 11m 14s | Avg:  5m 37s | Max:  5m 49s | Hits:  99%/3582  
      🟩 GCC11              Pass: 100%/2   | Total: 11m 55s | Avg:  5m 57s | Max:  6m 09s | Hits:  99%/3582  
      🟩 GCC12              Pass: 100%/2   | Total: 11m 38s | Avg:  5m 49s | Max:  5m 55s | Hits:  99%/3582  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 16m | Avg:  7m 38s | Max: 11m 21s | Hits:  99%/17910 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 40m 40s | Avg: 20m 20s | Max: 21m 06s | Hits:  99%/3568  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 10m | Avg: 23m 31s | Max: 27m 00s | Hits:  99%/5352  
      🟩 NVHPC25.1          Pass: 100%/2   | Total: 29m 14s | Avg: 14m 37s | Max: 15m 25s | Hits:  99%/3580  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 35m | Avg:  5m 37s | Max: 10m 18s | Hits: 100%/30430 
      🟩 GCC                Pass: 100%/21  | Total:  2h 17m | Avg:  6m 33s | Max: 11m 21s | Hits:  99%/37611 
      🟩 MSVC               Pass: 100%/5   | Total:  1h 51m | Avg: 22m 14s | Max: 27m 00s | Hits:  99%/8920  
      🟩 NVHPC              Pass: 100%/2   | Total: 29m 14s | Avg: 14m 37s | Max: 15m 25s | Hits:  99%/3580  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 15m 58s | Avg:  7m 59s | Max: 11m 19s | Hits:  99%/3582  
      🟩 rtx2080            Pass: 100%/33  | Total:  4h 02m | Avg:  7m 20s | Max: 21m 06s | Hits:  99%/59066 
      🟩 rtx4090            Pass: 100%/10  | Total:  1h 55m | Avg: 11m 33s | Max: 27m 00s | Hits:  99%/17893 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  4h 47m | Avg:  7m 34s | Max: 22m 45s | Hits:  99%/68013 
      🟩 TestCPU            Pass: 100%/3   | Total: 42m 01s | Avg: 14m 00s | Max: 27m 00s | Hits:  99%/5365  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 02s | Avg: 11m 00s | Max: 11m 21s | Hits:  99%/7163  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 15m 58s | Avg:  7m 59s | Max: 11m 19s | Hits:  99%/3582  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 34s | Avg:  6m 34s | Max:  6m 34s | Hits:  99%/1791  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  2h 43m | Avg:  8m 09s | Max: 21m 06s | Hits:  99%/35791 
      🟩 20                 Pass: 100%/23  | Total:  3h 13m | Avg:  8m 24s | Max: 27m 00s | Hits:  99%/41168 
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 15m 39s | Avg: 3m 54s | Max: 4m 37s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 08s | Avg:  4m 34s | Max:  4m 37s
      🟩 arm64              Pass: 100%/2   | Total:  6m 31s | Avg:  3m 15s | Max:  3m 16s
    🟩 ctk
      🟩 12.6               Pass: 100%/4   | Total: 15m 39s | Avg:  3m 54s | Max:  4m 37s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/4   | Total: 15m 39s | Avg:  3m 54s | Max:  4m 37s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 15m 39s | Avg:  3m 54s | Max:  4m 37s
    🟩 cxx
      🟩 NVHPC25.1          Pass: 100%/4   | Total: 15m 39s | Avg:  3m 54s | Max:  4m 37s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 15m 39s | Avg:  3m 54s | Max:  4m 37s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 15m 39s | Avg:  3m 54s | Max:  4m 37s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 15m 39s | Avg:  3m 54s | Max:  4m 37s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  7m 47s | Avg:  3m 53s | Max:  4m 31s
      🟩 20                 Pass: 100%/2   | Total:  7m 52s | Avg:  3m 56s | Max:  4m 37s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 16m 30s | Avg: 8m 15s | Max: 14m 24s | Hits: 98%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 16m 30s | Avg:  8m 15s | Max: 14m 24s | Hits:  98%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 16m 30s | Avg:  8m 15s | Max: 14m 24s | Hits:  98%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 16m 30s | Avg:  8m 15s | Max: 14m 24s | Hits:  98%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 16m 30s | Avg:  8m 15s | Max: 14m 24s | Hits:  98%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 16m 30s | Avg:  8m 15s | Max: 14m 24s | Hits:  98%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 16m 30s | Avg:  8m 15s | Max: 14m 24s | Hits:  98%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 16m 30s | Avg:  8m 15s | Max: 14m 24s | Hits:  98%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 06s | Avg:  2m 06s | Max:  2m 06s | Hits:  98%/160   
      🟩 Test               Pass: 100%/1   | Total: 14m 24s | Avg: 14m 24s | Max: 14m 24s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 12m | Avg: 1h 12m | Max: 1h 12m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 12m | Avg:  1h 12m | Max:  1h 12m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 12m | Avg:  1h 12m | Max:  1h 12m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 12m | Avg:  1h 12m | Max:  1h 12m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 12m | Avg:  1h 12m | Max:  1h 12m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 12m | Avg:  1h 12m | Max:  1h 12m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 12m | Avg:  1h 12m | Max:  1h 12m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 12m | Avg:  1h 12m | Max:  1h 12m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 12m | Avg:  1h 12m | Max:  1h 12m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- stdpar
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 97)

# Runner
68 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-arm64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@elstehle
Contributor Author

/ok to test

@github-actions
Contributor

🟩 CI finished in 1h 19m: Pass: 100%/97 | Total: 1d 19h | Avg: 26m 48s | Max: 1h 06m | Hits: 94%/134641
  • 🟩 cub: Pass: 100%/45 | Total: 1d 05h | Avg: 39m 36s | Max: 1h 01m | Hits: 92%/53780

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 04h | Avg: 39m 24s | Max:  1h 01m | Hits:  92%/51336 
      🟩 arm64              Pass: 100%/2   | Total:  1h 28m | Avg: 44m 00s | Max: 55m 27s | Hits:  90%/2444  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 42m | Avg: 44m 29s | Max: 54m 02s | Hits:  90%/5940  
      🟩 12.6               Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m | Hits:  89%/2260  
      🟩 12.8               Pass: 100%/38  | Total: 23h 59m | Avg: 37m 52s | Max:  1h 00m | Hits:  92%/45580 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 58m | Avg: 59m 11s | Max:  1h 00m | Hits:  91%/2108  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 42m | Avg: 44m 29s | Max: 54m 02s | Hits:  90%/5940  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m | Hits:  89%/2260  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 22h 00m | Avg: 36m 41s | Max: 55m 27s | Hits:  92%/43472 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 58m | Avg: 59m 11s | Max:  1h 00m | Hits:  91%/2108  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 03h | Avg: 38m 42s | Max:  1h 01m | Hits:  92%/51672 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 28m | Avg: 52m 12s | Max: 54m 02s | Hits:  90%/4896  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 41m | Avg: 50m 48s | Max: 52m 10s | Hits:  90%/2444  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 39m | Avg: 49m 34s | Max: 50m 03s | Hits:  90%/2444  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 37m | Avg: 48m 32s | Max: 48m 49s | Hits:  90%/2444  
      🟩 Clang18            Pass: 100%/7   | Total:  5h 27m | Avg: 46m 46s | Max:  1h 00m | Hits:  93%/8218  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 06m | Avg: 33m 12s | Max: 34m 42s | Hits:  90%/2448  
      🟩 GCC8               Pass: 100%/1   | Total: 35m 26s | Avg: 35m 26s | Max: 35m 26s | Hits:  90%/1224  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 11m | Avg: 35m 46s | Max: 39m 32s | Hits:  90%/2448  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 05s | Max: 35m 02s | Hits:  90%/2448  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 33s | Max: 31m 09s | Hits:  90%/2444  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 12s | Max: 31m 34s | Hits:  90%/2444  
      🟩 GCC13              Pass: 100%/11  | Total:  4h 45m | Avg: 25m 56s | Max: 32m 33s | Hits:  95%/13442 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 26m | Avg: 43m 06s | Max: 43m 27s | Hits:  90%/2088  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  1h 32m | Avg: 46m 24s | Max: 48m 15s | Hits:  90%/2088  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m | Hits:  89%/2260  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 13h 54m | Avg: 49m 03s | Max:  1h 00m | Hits:  91%/20446 
      🟩 GCC                Pass: 100%/22  | Total: 10h 48m | Avg: 29m 28s | Max: 39m 32s | Hits:  92%/26898 
      🟩 MSVC               Pass: 100%/4   | Total:  2h 59m | Avg: 44m 45s | Max: 48m 15s | Hits:  90%/4176  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m | Hits:  89%/2260  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 02m | Avg: 20m 51s | Max: 26m 57s | Hits:  96%/3666  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 00h | Avg: 43m 53s | Max:  1h 01m | Hits:  90%/40338 
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 47m | Avg: 28m 25s | Max: 50m 05s | Hits:  97%/9776  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 02h | Avg: 42m 55s | Max:  1h 01m | Hits:  90%/44004 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 26m 43s | Avg: 26m 43s | Max: 26m 43s | Hits:  99%/1222  
      🟩 GraphCapture       Pass: 100%/1   | Total: 19m 40s | Avg: 19m 40s | Max: 19m 40s | Hits:  99%/1222  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 20m | Avg: 26m 43s | Max: 27m 05s | Hits:  99%/3666  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 07m | Avg: 22m 30s | Max: 23m 42s | Hits:  99%/3666  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 02m | Avg: 20m 51s | Max: 26m 57s | Hits:  96%/3666  
      🟩 90;90a;100         Pass: 100%/1   | Total: 29m 54s | Avg: 29m 54s | Max: 29m 54s | Hits:  90%/1222  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 14h 14m | Avg: 42m 43s | Max: 59m 50s | Hits:  90%/23662 
      🟩 20                 Pass: 100%/25  | Total: 15h 28m | Avg: 37m 07s | Max:  1h 01m | Hits:  93%/30118 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 11h 57m | Avg: 15m 57s | Max: 33m 32s | Hits: 95%/80541

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 25m 17s | Avg: 12m 38s | Max: 14m 13s | Hits:  97%/3582  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 11h 32m | Avg: 16m 05s | Max: 33m 32s | Hits:  95%/76960 
      🟩 arm64              Pass: 100%/2   | Total: 25m 52s | Avg: 12m 56s | Max: 13m 09s | Hits:  94%/3581  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 25m | Avg: 17m 11s | Max: 28m 48s | Hits:  94%/8946  
      🟩 12.6               Pass: 100%/2   | Total: 55m 36s | Avg: 27m 48s | Max: 29m 58s | Hits:  93%/3580  
      🟩 12.8               Pass: 100%/38  | Total:  9h 36m | Avg: 15m 10s | Max: 33m 32s | Hits:  95%/68015 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 26m 06s | Avg: 13m 03s | Max: 13m 06s | Hits:  94%/3580  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 25m | Avg: 17m 11s | Max: 28m 48s | Hits:  94%/8946  
      🟩 nvcc12.6           Pass: 100%/2   | Total: 55m 36s | Avg: 27m 48s | Max: 29m 58s | Hits:  93%/3580  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  9h 10m | Avg: 15m 17s | Max: 33m 32s | Hits:  95%/64435 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 26m 06s | Avg: 13m 03s | Max: 13m 06s | Hits:  94%/3580  
      🟩 nvcc               Pass: 100%/43  | Total: 11h 31m | Avg: 16m 05s | Max: 33m 32s | Hits:  95%/76961 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 55m 53s | Avg: 13m 58s | Max: 14m 40s | Hits:  94%/7160  
      🟩 Clang15            Pass: 100%/2   | Total: 29m 00s | Avg: 14m 30s | Max: 15m 43s | Hits:  94%/3580  
      🟩 Clang16            Pass: 100%/2   | Total: 29m 33s | Avg: 14m 46s | Max: 15m 18s | Hits:  94%/3580  
      🟩 Clang17            Pass: 100%/2   | Total: 30m 06s | Avg: 15m 03s | Max: 15m 17s | Hits:  94%/3580  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 25m | Avg: 12m 12s | Max: 14m 29s | Hits:  96%/12530 
      🟩 GCC7               Pass: 100%/2   | Total: 30m 54s | Avg: 15m 27s | Max: 16m 21s | Hits:  94%/3582  
      🟩 GCC8               Pass: 100%/1   | Total: 14m 08s | Avg: 14m 08s | Max: 14m 08s | Hits:  94%/1791  
      🟩 GCC9               Pass: 100%/2   | Total: 29m 11s | Avg: 14m 35s | Max: 15m 02s | Hits:  94%/3582  
      🟩 GCC10              Pass: 100%/2   | Total: 29m 41s | Avg: 14m 50s | Max: 15m 27s | Hits:  94%/3582  
      🟩 GCC11              Pass: 100%/2   | Total: 30m 18s | Avg: 15m 09s | Max: 15m 28s | Hits:  94%/3582  
      🟩 GCC12              Pass: 100%/2   | Total: 30m 41s | Avg: 15m 20s | Max: 15m 45s | Hits:  94%/3582  
      🟩 GCC13              Pass: 100%/10  | Total:  2h 00m | Avg: 12m 05s | Max: 15m 48s | Hits:  96%/17910 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 58m 21s | Avg: 29m 10s | Max: 29m 33s | Hits:  93%/3568  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 28m | Avg: 29m 22s | Max: 33m 32s | Hits:  95%/5352  
      🟩 NVHPC25.1          Pass: 100%/2   | Total: 55m 36s | Avg: 27m 48s | Max: 29m 58s | Hits:  93%/3580  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  3h 50m | Avg: 13m 31s | Max: 15m 43s | Hits:  95%/30430 
      🟩 GCC                Pass: 100%/21  | Total:  4h 45m | Avg: 13m 36s | Max: 16m 21s | Hits:  95%/37611 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 26m | Avg: 29m 17s | Max: 33m 32s | Hits:  95%/8920  
      🟩 NVHPC              Pass: 100%/2   | Total: 55m 36s | Avg: 27m 48s | Max: 29m 58s | Hits:  93%/3580  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 19m 50s | Avg:  9m 55s | Max: 11m 02s | Hits:  97%/3582  
      🟩 rtx2080            Pass: 100%/33  | Total:  9h 07m | Avg: 16m 35s | Max: 29m 58s | Hits:  94%/59066 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 30m | Avg: 15m 04s | Max: 33m 32s | Hits:  97%/17893 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 10h 32m | Avg: 16m 39s | Max: 33m 32s | Hits:  94%/68013 
      🟩 TestCPU            Pass: 100%/3   | Total: 41m 12s | Avg: 13m 44s | Max: 25m 37s | Hits:  99%/5365  
      🟩 TestGPU            Pass: 100%/4   | Total: 43m 47s | Avg: 10m 56s | Max: 11m 23s | Hits:  99%/7163  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 19m 50s | Avg:  9m 55s | Max: 11m 02s | Hits:  97%/3582  
      🟩 90;90a;100         Pass: 100%/1   | Total: 13m 24s | Avg: 13m 24s | Max: 13m 24s | Hits:  94%/1791  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  5h 54m | Avg: 17m 43s | Max: 29m 58s | Hits:  94%/35791 
      🟩 20                 Pass: 100%/23  | Total:  5h 38m | Avg: 14m 42s | Max: 33m 32s | Hits:  96%/41168 
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 16m 09s | Avg: 4m 02s | Max: 4m 49s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 19s | Avg:  4m 39s | Max:  4m 49s
      🟩 arm64              Pass: 100%/2   | Total:  6m 50s | Avg:  3m 25s | Max:  3m 30s
    🟩 ctk
      🟩 12.6               Pass: 100%/4   | Total: 16m 09s | Avg:  4m 02s | Max:  4m 49s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/4   | Total: 16m 09s | Avg:  4m 02s | Max:  4m 49s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 16m 09s | Avg:  4m 02s | Max:  4m 49s
    🟩 cxx
      🟩 NVHPC25.1          Pass: 100%/4   | Total: 16m 09s | Avg:  4m 02s | Max:  4m 49s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 16m 09s | Avg:  4m 02s | Max:  4m 49s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 16m 09s | Avg:  4m 02s | Max:  4m 49s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 16m 09s | Avg:  4m 02s | Max:  4m 49s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  8m 19s | Avg:  4m 09s | Max:  4m 49s
      🟩 20                 Pass: 100%/2   | Total:  7m 50s | Avg:  3m 55s | Max:  4m 30s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 16m 45s | Avg: 8m 22s | Max: 14m 36s | Hits: 98%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 16m 45s | Avg:  8m 22s | Max: 14m 36s | Hits:  98%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 16m 45s | Avg:  8m 22s | Max: 14m 36s | Hits:  98%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 16m 45s | Avg:  8m 22s | Max: 14m 36s | Hits:  98%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 16m 45s | Avg:  8m 22s | Max: 14m 36s | Hits:  98%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 16m 45s | Avg:  8m 22s | Max: 14m 36s | Hits:  98%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 16m 45s | Avg:  8m 22s | Max: 14m 36s | Hits:  98%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 16m 45s | Avg:  8m 22s | Max: 14m 36s | Hits:  98%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 09s | Avg:  2m 09s | Max:  2m 09s | Hits:  98%/160   
      🟩 Test               Pass: 100%/1   | Total: 14m 36s | Avg: 14m 36s | Max: 14m 36s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 06m | Avg: 1h 06m | Max: 1h 06m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- stdpar
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 97)

# Runner
68 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-arm64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@elstehle
Contributor Author

/ok to test

@github-actions
Contributor

🟩 CI finished in 1h 29m: Pass: 100%/97 | Total: 1d 20h | Avg: 27m 37s | Max: 1h 07m | Hits: 92%/134281
  • 🟩 cub: Pass: 100%/45 | Total: 1d 06h | Avg: 40m 24s | Max: 1h 03m | Hits: 88%/53780

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 04h | Avg: 40m 12s | Max:  1h 03m | Hits:  88%/51336 
      🟩 arm64              Pass: 100%/2   | Total:  1h 29m | Avg: 44m 43s | Max: 55m 13s | Hits:  90%/2444  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 42m | Avg: 44m 25s | Max: 53m 28s | Hits:  90%/5940  
      🟩 12.6               Pass: 100%/2   | Total:  1h 54m | Avg: 57m 10s | Max: 57m 12s | Hits:  89%/2260  
      🟩 12.8               Pass: 100%/38  | Total:  1d 00h | Avg: 38m 59s | Max:  1h 03m | Hits:  88%/45580 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 03m | Hits:  91%/2108  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 42m | Avg: 44m 25s | Max: 53m 28s | Hits:  90%/5940  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  1h 54m | Avg: 57m 10s | Max: 57m 12s | Hits:  89%/2260  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 22h 36m | Avg: 37m 41s | Max: 55m 13s | Hits:  88%/43472 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 03m | Hits:  91%/2108  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 04h | Avg: 39m 22s | Max: 57m 12s | Hits:  88%/51672 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 23m | Avg: 50m 48s | Max: 53m 28s | Hits:  90%/4896  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 45m | Avg: 52m 51s | Max: 53m 05s | Hits:  90%/2444  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 37m | Avg: 48m 44s | Max: 48m 51s | Hits:  90%/2444  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 37m | Avg: 48m 49s | Max: 49m 01s | Hits:  90%/2444  
      🟩 Clang18            Pass: 100%/7   | Total:  5h 29m | Avg: 47m 07s | Max:  1h 03m | Hits:  93%/8218  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 05m | Avg: 32m 55s | Max: 34m 02s | Hits:  90%/2448  
      🟩 GCC8               Pass: 100%/1   | Total: 32m 28s | Avg: 32m 28s | Max: 32m 28s | Hits:  90%/1224  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 08m | Avg: 34m 26s | Max: 35m 09s | Hits:  90%/2448  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 06s | Max: 33m 55s | Hits:  90%/2448  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 46s | Max: 30m 55s | Hits:  90%/2444  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 38s | Max: 34m 34s | Hits:  90%/2444  
      🟩 GCC13              Pass: 100%/11  | Total:  5h 28m | Avg: 29m 52s | Max: 54m 51s | Hits:  81%/13442 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 30m | Avg: 45m 25s | Max: 47m 32s | Hits:  90%/2088  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  1h 30m | Avg: 45m 09s | Max: 46m 08s | Hits:  90%/2088  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  1h 54m | Avg: 57m 10s | Max: 57m 12s | Hits:  89%/2260  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 13h 53m | Avg: 49m 03s | Max:  1h 03m | Hits:  91%/20446 
      🟩 GCC                Pass: 100%/22  | Total: 11h 28m | Avg: 31m 18s | Max: 54m 51s | Hits:  86%/26898 
      🟩 MSVC               Pass: 100%/4   | Total:  3h 01m | Avg: 45m 17s | Max: 47m 32s | Hits:  90%/4176  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 10s | Max: 57m 12s | Hits:  89%/2260  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 15m | Avg: 25m 17s | Max: 26m 49s | Hits:  71%/3666  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 01h | Avg: 44m 35s | Max:  1h 03m | Hits:  88%/40338 
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 46m | Avg: 28m 19s | Max: 48m 15s | Hits:  97%/9776  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 03h | Avg: 43m 52s | Max:  1h 03m | Hits:  86%/44004 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 27m 38s | Avg: 27m 38s | Max: 27m 38s | Hits:  99%/1222  
      🟩 GraphCapture       Pass: 100%/1   | Total: 20m 09s | Avg: 20m 09s | Max: 20m 09s | Hits:  99%/1222  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 20m | Avg: 26m 52s | Max: 27m 38s | Hits:  99%/3666  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 06m | Avg: 22m 12s | Max: 23m 15s | Hits:  99%/3666  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 15m | Avg: 25m 17s | Max: 26m 49s | Hits:  71%/3666  
      🟩 90;90a;100         Pass: 100%/1   | Total: 54m 51s | Avg: 54m 51s | Max: 54m 51s | Hits:  16%/1222  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 14h 20m | Avg: 43m 00s | Max:  1h 01m | Hits:  90%/23662 
      🟩 20                 Pass: 100%/25  | Total: 15h 58m | Avg: 38m 19s | Max:  1h 03m | Hits:  87%/30118 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 12h 40m | Avg: 16m 53s | Max: 34m 51s | Hits: 94%/80181

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 25m 28s | Avg: 12m 44s | Max: 13m 59s | Hits:  97%/3566  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 12h 13m | Avg: 17m 03s | Max: 34m 51s | Hits:  94%/76616 
      🟩 arm64              Pass: 100%/2   | Total: 26m 40s | Avg: 13m 20s | Max: 13m 41s | Hits:  94%/3565  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 32m | Avg: 18m 25s | Max: 32m 52s | Hits:  94%/8906  
      🟩 12.6               Pass: 100%/2   | Total: 55m 15s | Avg: 27m 37s | Max: 29m 06s | Hits:  93%/3564  
      🟩 12.8               Pass: 100%/38  | Total: 10h 12m | Avg: 16m 07s | Max: 34m 51s | Hits:  94%/67711 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 28m 22s | Avg: 14m 11s | Max: 14m 12s | Hits:  94%/3564  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 32m | Avg: 18m 25s | Max: 32m 52s | Hits:  94%/8906  
      🟩 nvcc12.6           Pass: 100%/2   | Total: 55m 15s | Avg: 27m 37s | Max: 29m 06s | Hits:  93%/3564  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  9h 44m | Avg: 16m 14s | Max: 34m 51s | Hits:  94%/64147 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 28m 22s | Avg: 14m 11s | Max: 14m 12s | Hits:  94%/3564  
      🟩 nvcc               Pass: 100%/43  | Total: 12h 11m | Avg: 17m 01s | Max: 34m 51s | Hits:  94%/76617 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 57m 52s | Avg: 14m 28s | Max: 14m 43s | Hits:  94%/7128  
      🟩 Clang15            Pass: 100%/2   | Total: 28m 31s | Avg: 14m 15s | Max: 14m 24s | Hits:  94%/3564  
      🟩 Clang16            Pass: 100%/2   | Total: 28m 16s | Avg: 14m 08s | Max: 14m 13s | Hits:  94%/3564  
      🟩 Clang17            Pass: 100%/2   | Total: 29m 29s | Avg: 14m 44s | Max: 14m 51s | Hits:  94%/3564  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 26m | Avg: 12m 25s | Max: 14m 12s | Hits:  96%/12474 
      🟩 GCC7               Pass: 100%/2   | Total: 30m 19s | Avg: 15m 09s | Max: 15m 22s | Hits:  94%/3566  
      🟩 GCC8               Pass: 100%/1   | Total: 15m 32s | Avg: 15m 32s | Max: 15m 32s | Hits:  94%/1783  
      🟩 GCC9               Pass: 100%/2   | Total: 30m 19s | Avg: 15m 09s | Max: 15m 20s | Hits:  94%/3566  
      🟩 GCC10              Pass: 100%/2   | Total: 30m 16s | Avg: 15m 08s | Max: 15m 39s | Hits:  94%/3566  
      🟩 GCC11              Pass: 100%/2   | Total: 28m 59s | Avg: 14m 29s | Max: 14m 35s | Hits:  94%/3566  
      🟩 GCC12              Pass: 100%/2   | Total: 29m 45s | Avg: 14m 52s | Max: 15m 48s | Hits:  94%/3566  
      🟩 GCC13              Pass: 100%/10  | Total:  2h 30m | Avg: 15m 05s | Max: 33m 01s | Hits:  92%/17830 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 04m | Avg: 32m 23s | Max: 32m 52s | Hits:  93%/3552  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 33m | Avg: 31m 01s | Max: 34m 51s | Hits:  95%/5328  
      🟩 NVHPC25.1          Pass: 100%/2   | Total: 55m 15s | Avg: 27m 37s | Max: 29m 06s | Hits:  93%/3564  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  3h 51m | Avg: 13m 35s | Max: 14m 51s | Hits:  95%/30294 
      🟩 GCC                Pass: 100%/21  | Total:  5h 16m | Avg: 15m 03s | Max: 33m 01s | Hits:  93%/37443 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 37m | Avg: 31m 34s | Max: 34m 51s | Hits:  95%/8880  
      🟩 NVHPC              Pass: 100%/2   | Total: 55m 15s | Avg: 27m 37s | Max: 29m 06s | Hits:  93%/3564  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 29m 54s | Avg: 14m 57s | Max: 18m 18s | Hits:  87%/3566  
      🟩 rtx2080            Pass: 100%/33  | Total:  9h 35m | Avg: 17m 26s | Max: 33m 01s | Hits:  94%/58802 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 34m | Avg: 15m 29s | Max: 34m 51s | Hits:  97%/17813 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 11h 10m | Avg: 17m 39s | Max: 34m 51s | Hits:  93%/67709 
      🟩 TestCPU            Pass: 100%/3   | Total: 44m 22s | Avg: 14m 47s | Max: 28m 30s | Hits:  99%/5341  
      🟩 TestGPU            Pass: 100%/4   | Total: 45m 10s | Avg: 11m 17s | Max: 11m 45s | Hits:  99%/7131  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 29m 54s | Avg: 14m 57s | Max: 18m 18s | Hits:  87%/3566  
      🟩 90;90a;100         Pass: 100%/1   | Total: 33m 01s | Avg: 33m 01s | Max: 33m 01s | Hits:  75%/1783  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  5h 59m | Avg: 17m 57s | Max: 32m 52s | Hits:  94%/35631 
      🟩 20                 Pass: 100%/23  | Total:  6h 15m | Avg: 16m 20s | Max: 34m 51s | Hits:  94%/40984 
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 16m 46s | Avg: 4m 11s | Max: 4m 58s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 54s | Avg:  4m 57s | Max:  4m 58s
      🟩 arm64              Pass: 100%/2   | Total:  6m 52s | Avg:  3m 26s | Max:  3m 28s
    🟩 ctk
      🟩 12.6               Pass: 100%/4   | Total: 16m 46s | Avg:  4m 11s | Max:  4m 58s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/4   | Total: 16m 46s | Avg:  4m 11s | Max:  4m 58s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 16m 46s | Avg:  4m 11s | Max:  4m 58s
    🟩 cxx
      🟩 NVHPC25.1          Pass: 100%/4   | Total: 16m 46s | Avg:  4m 11s | Max:  4m 58s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 16m 46s | Avg:  4m 11s | Max:  4m 58s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 16m 46s | Avg:  4m 11s | Max:  4m 58s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 16m 46s | Avg:  4m 11s | Max:  4m 58s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  8m 24s | Avg:  4m 12s | Max:  4m 56s
      🟩 20                 Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  4m 58s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 17m 30s | Avg: 8m 45s | Max: 14m 57s | Hits: 96%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 14m 57s | Hits:  96%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 14m 57s | Hits:  96%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 14m 57s | Hits:  96%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 14m 57s | Hits:  96%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 14m 57s | Hits:  96%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 14m 57s | Hits:  96%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 14m 57s | Hits:  96%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 33s | Avg:  2m 33s | Max:  2m 33s | Hits:  94%/160   
      🟩 Test               Pass: 100%/1   | Total: 14m 57s | Avg: 14m 57s | Max: 14m 57s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 07m | Avg: 1h 07m | Max: 1h 07m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    


@elstehle
Contributor Author

/ok to test

@github-actions
Contributor

🟩 CI finished in 1h 10m: Pass: 100%/97 | Total: 16h 59m | Avg: 10m 30s | Max: 1h 09m | Hits: 99%/134281
  • 🟩 cub: Pass: 100%/45 | Total: 8h 57m | Avg: 11m 56s | Max: 54m 14s | Hits: 99%/53780

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  8h 45m | Avg: 12m 13s | Max: 54m 14s | Hits:  98%/51336 
      🟩 arm64              Pass: 100%/2   | Total: 11m 50s | Avg:  5m 55s | Max:  6m 18s | Hits:  99%/2444  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 41m 54s | Avg:  8m 22s | Max: 18m 33s | Hits:  99%/5940  
      🟩 12.6               Pass: 100%/2   | Total: 21m 17s | Avg: 10m 38s | Max: 10m 41s | Hits:  98%/2260  
      🟩 12.8               Pass: 100%/38  | Total:  7h 54m | Avg: 12m 28s | Max: 54m 14s | Hits:  98%/45580 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 57s | Avg:  4m 58s | Max:  5m 00s | Hits: 100%/2108  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 41m 54s | Avg:  8m 22s | Max: 18m 33s | Hits:  99%/5940  
      🟩 nvcc12.6           Pass: 100%/2   | Total: 21m 17s | Avg: 10m 38s | Max: 10m 41s | Hits:  98%/2260  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  7h 44m | Avg: 12m 53s | Max: 54m 14s | Hits:  98%/43472 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 57s | Avg:  4m 58s | Max:  5m 00s | Hits: 100%/2108  
      🟩 nvcc               Pass: 100%/43  | Total:  8h 47m | Avg: 12m 16s | Max: 54m 14s | Hits:  98%/51672 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 24m 05s | Avg:  6m 01s | Max:  6m 28s | Hits: 100%/4896  
      🟩 Clang15            Pass: 100%/2   | Total: 13m 08s | Avg:  6m 34s | Max:  6m 41s | Hits: 100%/2444  
      🟩 Clang16            Pass: 100%/2   | Total: 13m 08s | Avg:  6m 34s | Max:  6m 39s | Hits: 100%/2444  
      🟩 Clang17            Pass: 100%/2   | Total: 12m 20s | Avg:  6m 10s | Max:  6m 10s | Hits: 100%/2444  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 15m | Avg: 10m 47s | Max: 25m 16s | Hits: 100%/8218  
      🟩 GCC7               Pass: 100%/2   | Total: 12m 48s | Avg:  6m 24s | Max:  6m 46s | Hits:  99%/2448  
      🟩 GCC8               Pass: 100%/1   | Total:  6m 28s | Avg:  6m 28s | Max:  6m 28s | Hits:  99%/1224  
      🟩 GCC9               Pass: 100%/2   | Total: 13m 03s | Avg:  6m 31s | Max:  6m 57s | Hits:  99%/2448  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 38s | Max: 54m 14s | Hits:  84%/2448  
      🟩 GCC11              Pass: 100%/2   | Total: 14m 08s | Avg:  7m 04s | Max:  7m 24s | Hits:  99%/2444  
      🟩 GCC12              Pass: 100%/2   | Total: 14m 06s | Avg:  7m 03s | Max:  7m 04s | Hits:  99%/2444  
      🟩 GCC13              Pass: 100%/11  | Total:  2h 58m | Avg: 16m 13s | Max: 27m 17s | Hits:  99%/13442 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 39m 04s | Avg: 19m 32s | Max: 20m 31s | Hits:  99%/2088  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 38m 42s | Avg: 19m 21s | Max: 19m 57s | Hits:  99%/2088  
      🟩 NVHPC25.1          Pass: 100%/2   | Total: 21m 17s | Avg: 10m 38s | Max: 10m 41s | Hits:  98%/2260  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 18m | Avg:  8m 07s | Max: 25m 16s | Hits: 100%/20446 
      🟩 GCC                Pass: 100%/22  | Total:  5h 00m | Avg: 13m 39s | Max: 54m 14s | Hits:  98%/26898 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 17m | Avg: 19m 26s | Max: 20m 31s | Hits:  99%/4176  
      🟩 NVHPC              Pass: 100%/2   | Total: 21m 17s | Avg: 10m 38s | Max: 10m 41s | Hits:  98%/2260  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total: 53m 24s | Avg: 17m 48s | Max: 26m 27s | Hits:  99%/3666  
      🟩 rtx2080            Pass: 100%/34  | Total:  5h 25m | Avg:  9m 35s | Max: 54m 14s | Hits:  98%/40338 
      🟩 rtxa6000           Pass: 100%/8   | Total:  2h 38m | Avg: 19m 46s | Max: 27m 17s | Hits:  99%/9776  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 45m | Avg:  9m 20s | Max: 54m 14s | Hits:  98%/44004 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 27m 16s | Avg: 27m 16s | Max: 27m 16s | Hits:  99%/1222  
      🟩 GraphCapture       Pass: 100%/1   | Total: 19m 53s | Avg: 19m 53s | Max: 19m 53s | Hits:  99%/1222  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 19m | Avg: 26m 20s | Max: 27m 17s | Hits:  99%/3666  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 05m | Avg: 21m 58s | Max: 22m 28s | Hits:  99%/3666  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 53m 24s | Avg: 17m 48s | Max: 26m 27s | Hits:  99%/3666  
      🟩 90;90a;100         Pass: 100%/1   | Total:  7m 14s | Avg:  7m 14s | Max:  7m 14s | Hits:  99%/1222  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 39m | Avg: 10m 59s | Max: 54m 14s | Hits:  98%/23662 
      🟩 20                 Pass: 100%/25  | Total:  5h 17m | Avg: 12m 42s | Max: 27m 17s | Hits:  99%/30118 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 6h 20m | Avg: 8m 26s | Max: 26m 16s | Hits: 99%/80181

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 18m 01s | Avg:  9m 00s | Max: 11m 30s | Hits:  99%/3566  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  6h 09m | Avg:  8m 36s | Max: 26m 16s | Hits:  99%/76616 
      🟩 arm64              Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  5m 22s | Hits:  99%/3565  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 39m 30s | Avg:  7m 54s | Max: 19m 17s | Hits:  99%/8906  
      🟩 12.6               Pass: 100%/2   | Total: 31m 06s | Avg: 15m 33s | Max: 15m 38s | Hits:  99%/3564  
      🟩 12.8               Pass: 100%/38  | Total:  5h 09m | Avg:  8m 08s | Max: 26m 16s | Hits:  99%/67711 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 36s | Avg:  5m 18s | Max:  5m 30s | Hits: 100%/3564  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 39m 30s | Avg:  7m 54s | Max: 19m 17s | Hits:  99%/8906  
      🟩 nvcc12.6           Pass: 100%/2   | Total: 31m 06s | Avg: 15m 33s | Max: 15m 38s | Hits:  99%/3564  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  4h 58m | Avg:  8m 18s | Max: 26m 16s | Hits:  99%/64147 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 36s | Avg:  5m 18s | Max:  5m 30s | Hits: 100%/3564  
      🟩 nvcc               Pass: 100%/43  | Total:  6h 09m | Avg:  8m 35s | Max: 26m 16s | Hits:  99%/76617 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 20m 21s | Avg:  5m 05s | Max:  5m 28s | Hits:  99%/7128  
      🟩 Clang15            Pass: 100%/2   | Total: 11m 34s | Avg:  5m 47s | Max:  6m 01s | Hits: 100%/3564  
      🟩 Clang16            Pass: 100%/2   | Total: 10m 42s | Avg:  5m 21s | Max:  5m 23s | Hits: 100%/3564  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 13s | Avg:  5m 36s | Max:  5m 39s | Hits: 100%/3564  
      🟩 Clang18            Pass: 100%/7   | Total: 43m 53s | Avg:  6m 16s | Max: 10m 27s | Hits: 100%/12474 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 33s | Avg:  5m 16s | Max:  5m 20s | Hits:  99%/3566  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 27s | Avg:  5m 27s | Max:  5m 27s | Hits:  99%/1783  
      🟩 GCC9               Pass: 100%/2   | Total: 11m 04s | Avg:  5m 32s | Max:  5m 38s | Hits:  99%/3566  
      🟩 GCC10              Pass: 100%/2   | Total: 12m 09s | Avg:  6m 04s | Max:  6m 18s | Hits:  99%/3566  
      🟩 GCC11              Pass: 100%/2   | Total: 11m 14s | Avg:  5m 37s | Max:  5m 44s | Hits:  99%/3566  
      🟩 GCC12              Pass: 100%/2   | Total: 11m 56s | Avg:  5m 58s | Max:  5m 58s | Hits:  99%/3566  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 17m | Avg:  7m 47s | Max: 11m 44s | Hits:  99%/17830 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 40m 44s | Avg: 20m 22s | Max: 21m 27s | Hits:  99%/3552  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 10m | Avg: 23m 22s | Max: 26m 16s | Hits:  99%/5328  
      🟩 NVHPC25.1          Pass: 100%/2   | Total: 31m 06s | Avg: 15m 33s | Max: 15m 38s | Hits:  99%/3564  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 37m | Avg:  5m 44s | Max: 10m 27s | Hits:  99%/30294 
      🟩 GCC                Pass: 100%/21  | Total:  2h 20m | Avg:  6m 41s | Max: 11m 44s | Hits:  99%/37443 
      🟩 MSVC               Pass: 100%/5   | Total:  1h 50m | Avg: 22m 10s | Max: 26m 16s | Hits:  99%/8880  
      🟩 NVHPC              Pass: 100%/2   | Total: 31m 06s | Avg: 15m 33s | Max: 15m 38s | Hits:  99%/3564  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 47s | Avg:  8m 23s | Max: 11m 42s | Hits:  99%/3566  
      🟩 rtx2080            Pass: 100%/33  | Total:  4h 08m | Avg:  7m 30s | Max: 21m 40s | Hits:  99%/58802 
      🟩 rtx4090            Pass: 100%/10  | Total:  1h 55m | Avg: 11m 31s | Max: 26m 16s | Hits:  99%/17813 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  4h 53m | Avg:  7m 43s | Max: 22m 10s | Hits:  99%/67709 
      🟩 TestCPU            Pass: 100%/3   | Total: 41m 18s | Avg: 13m 46s | Max: 26m 16s | Hits:  99%/5341  
      🟩 TestGPU            Pass: 100%/4   | Total: 45m 23s | Avg: 11m 20s | Max: 11m 44s | Hits:  99%/7131  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 16m 47s | Avg:  8m 23s | Max: 11m 42s | Hits:  99%/3566  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 27s | Avg:  6m 27s | Max:  6m 27s | Hits:  99%/1783  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  2h 45m | Avg:  8m 17s | Max: 21m 40s | Hits:  99%/35631 
      🟩 20                 Pass: 100%/23  | Total:  3h 16m | Avg:  8m 31s | Max: 26m 16s | Hits:  99%/40984 
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 15m 46s | Avg: 3m 56s | Max: 4m 44s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 19s | Avg:  4m 39s | Max:  4m 44s
      🟩 arm64              Pass: 100%/2   | Total:  6m 27s | Avg:  3m 13s | Max:  3m 14s
    🟩 ctk
      🟩 12.6               Pass: 100%/4   | Total: 15m 46s | Avg:  3m 56s | Max:  4m 44s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/4   | Total: 15m 46s | Avg:  3m 56s | Max:  4m 44s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 15m 46s | Avg:  3m 56s | Max:  4m 44s
    🟩 cxx
      🟩 NVHPC25.1          Pass: 100%/4   | Total: 15m 46s | Avg:  3m 56s | Max:  4m 44s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 15m 46s | Avg:  3m 56s | Max:  4m 44s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 15m 46s | Avg:  3m 56s | Max:  4m 44s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 15m 46s | Avg:  3m 56s | Max:  4m 44s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  7m 48s | Avg:  3m 54s | Max:  4m 35s
      🟩 20                 Pass: 100%/2   | Total:  7m 58s | Avg:  3m 59s | Max:  4m 44s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 16m 41s | Avg: 8m 20s | Max: 14m 21s | Hits: 98%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 16m 41s | Avg:  8m 20s | Max: 14m 21s | Hits:  98%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 16m 41s | Avg:  8m 20s | Max: 14m 21s | Hits:  98%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 16m 41s | Avg:  8m 20s | Max: 14m 21s | Hits:  98%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 16m 41s | Avg:  8m 20s | Max: 14m 21s | Hits:  98%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 16m 41s | Avg:  8m 20s | Max: 14m 21s | Hits:  98%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 16m 41s | Avg:  8m 20s | Max: 14m 21s | Hits:  98%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 16m 41s | Avg:  8m 20s | Max: 14m 21s | Hits:  98%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 20s | Avg:  2m 20s | Max:  2m 20s | Hits:  98%/160   
      🟩 Test               Pass: 100%/1   | Total: 14m 21s | Avg: 14m 21s | Max: 14m 21s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 09m | Avg: 1h 09m | Max: 1h 09m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 09m | Avg:  1h 09m | Max:  1h 09m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 09m | Avg:  1h 09m | Max:  1h 09m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 09m | Avg:  1h 09m | Max:  1h 09m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 09m | Avg:  1h 09m | Max:  1h 09m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 09m | Avg:  1h 09m | Max:  1h 09m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 09m | Avg:  1h 09m | Max:  1h 09m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 09m | Avg:  1h 09m | Max:  1h 09m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 09m | Avg:  1h 09m | Max:  1h 09m
    


@elstehle elstehle merged commit e3c22d1 into NVIDIA:main Mar 19, 2025
109 of 111 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Mar 19, 2025
davebayer pushed a commit to davebayer/cccl that referenced this pull request Apr 7, 2025
…o `DeviceSegmentedRadixSort` (NVIDIA#3402)

* adds benchmarks for segmented radix sort

* fixes num items and num segments in seg radix sort

* improves tests

* clean up benchmark

* limits the segment size to int_max

* improves documentation

* guards larger-than-int single seg test

* fixes graph launch tests

* resolves merge conflicts

* [pre-commit.ci] auto code formatting

* fixes style

* resolves merge conflicts

* reverts multi-dim grid launch

* resolves merge conflicts from latest c.parallel

* uses streaming approach over partitions of segments

* fixes tests

* updates benchmark

* fixes include

* fixes docs

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
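
The commit list above mentions limiting the segment size to int_max and using a streaming approach over partitions of segments. As a rough illustration only, here is a minimal host-side sketch of what splitting a 64-bit segment count into partitions that each fit in an `int` could look like. The names `SegmentPartition` and `partition_segments` are hypothetical and are not taken from CUB; the actual dispatch logic in this PR may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper type (not part of CUB): a half-open range
// [begin, end) of segment indices handled by one pass.
struct SegmentPartition
{
  std::int64_t begin;
  std::int64_t end;
};

// Splits `num_segments` into consecutive partitions holding at most
// `max_per_pass` segments each. The default cap of INT_MAX is an assumption
// made for this sketch, so that per-pass segment indices fit in an int.
std::vector<SegmentPartition> partition_segments(
  std::int64_t num_segments,
  std::int64_t max_per_pass = std::numeric_limits<int>::max())
{
  assert(num_segments >= 0 && max_per_pass > 0);

  std::vector<SegmentPartition> partitions;
  std::int64_t begin = 0;
  while (begin < num_segments)
  {
    // Clamp the last partition to the remaining number of segments.
    const std::int64_t count = std::min(max_per_pass, num_segments - begin);
    partitions.push_back({begin, begin + count});
    begin += count;
  }
  return partitions;
}
```

A caller would then loop over the returned partitions and issue one sort pass per partition, offsetting the begin/end offset iterators by `begin`. How the passes are launched and ordered on a stream is left out of this sketch.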

Development

Successfully merging this pull request may close these issues.

Add support for large num_items to device_segmented_radix_sort.cuh
