Allow cuda::par*.on() to take cuda::stream_ref by bernhardmgruber · Pull Request #4225 · NVIDIA/cccl · GitHub

@bernhardmgruber
Contributor

Fixes: #4150
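
For readers without the diff at hand, here is a minimal sketch of the idea in plain C++, so it compiles without the CUDA toolkit. Every name in it (`raw_stream`, `stream_ref`, `bound_policy`, `policy`, `par`, `both_call_styles_agree`) is a hypothetical stand-in, loosely playing the roles of `cudaStream_t`, `cuda::stream_ref` and the `cuda::par*` policy objects. It is not the Thrust implementation and may differ from what this PR actually changed; it only illustrates why a single `on()` that accepts a non-owning stream wrapper can serve callers holding either a raw handle or a `stream_ref`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a raw stream handle (role of cudaStream_t).
using raw_stream = std::uintptr_t;

// Hypothetical non-owning wrapper (role of cuda::stream_ref).
class stream_ref {
public:
    // Implicit on purpose, so a raw handle converts at the call site.
    constexpr stream_ref(raw_stream s) noexcept : stream_(s) {}
    constexpr raw_stream get() const noexcept { return stream_; }

private:
    raw_stream stream_;
};

// Hypothetical result of binding a policy to a stream.
struct bound_policy {
    raw_stream stream;
};

// Hypothetical policy object (role of cuda::par / cuda::par_nosync).
struct policy {
    // One overload taking the wrapper covers both call styles:
    // a raw handle converts implicitly, a stream_ref is passed as-is.
    constexpr bound_policy on(stream_ref s) const noexcept {
        return bound_policy{s.get()};
    }
};

constexpr policy par{};

// Returns true when binding via a raw handle and via a stream_ref
// yields the same underlying stream.
inline bool both_call_styles_agree(raw_stream raw) {
    stream_ref ref{raw};
    assert(par.on(ref).stream == raw);
    return par.on(raw).stream == par.on(ref).stream;
}
```

In this sketch, `par.on(raw)` and `par.on(ref)` go through the same overload, so the policy type does not need one `on()` per stream representation.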

@bernhardmgruber requested review from a team as code owners on March 21, 2025 05:02
@github-project-automation (bot) moved this to Todo in CCCL on Mar 21, 2025
@cccl-authenticator-app (bot) moved this from Todo to In Review in CCCL on Mar 21, 2025
@github-actions
Contributor

🟨 CI finished in 1h 14m: Pass: 2%/119 | Total: 8h 13m | Avg: 4m 08s | Max: 1h 14m | Hits: 98%/320
  • 🟥 cub: Pass: 0%/45 | Total: 2h 37m | Avg: 3m 29s | Max: 18m 02s

    🟥 cpu
      🟥 amd64              Pass:   0%/43  | Total:  2h 32m | Avg:  3m 33s | Max: 18m 02s
      🟥 arm64              Pass:   0%/2   | Total:  4m 08s | Avg:  2m 04s | Max:  2m 06s
    🟥 ctk
      🟥 12.0               Pass:   0%/5   | Total: 26m 03s | Avg:  5m 12s | Max: 16m 35s
      🟥 12.6               Pass:   0%/2   | Total: 10m 26s | Avg:  5m 13s | Max:  5m 17s
      🟥 12.8               Pass:   0%/38  | Total:  2h 00m | Avg:  3m 10s | Max: 18m 02s
    🟥 cudacxx
      🟥 ClangCUDA18        Pass:   0%/2   | Total:  5m 17s | Avg:  2m 38s | Max:  2m 42s
      🟥 nvcc12.0           Pass:   0%/5   | Total: 26m 03s | Avg:  5m 12s | Max: 16m 35s
      🟥 nvcc12.6           Pass:   0%/2   | Total: 10m 26s | Avg:  5m 13s | Max:  5m 17s
      🟥 nvcc12.8           Pass:   0%/36  | Total:  1h 55m | Avg:  3m 12s | Max: 18m 02s
    🟥 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/2   | Total:  5m 17s | Avg:  2m 38s | Max:  2m 42s
      🟥 nvcc               Pass:   0%/43  | Total:  2h 31m | Avg:  3m 31s | Max: 18m 02s
    🟥 cxx
      🟥 Clang14            Pass:   0%/4   | Total:  9m 54s | Avg:  2m 28s | Max:  2m 32s
      🟥 Clang15            Pass:   0%/2   | Total:  5m 05s | Avg:  2m 32s | Max:  2m 33s
      🟥 Clang16            Pass:   0%/2   | Total:  5m 15s | Avg:  2m 37s | Max:  2m 38s
      🟥 Clang17            Pass:   0%/2   | Total:  5m 07s | Avg:  2m 33s | Max:  2m 34s
      🟥 Clang18            Pass:   0%/7   | Total: 12m 37s | Avg:  1m 48s | Max:  2m 42s
      🟥 GCC7               Pass:   0%/2   | Total:  4m 39s | Avg:  2m 19s | Max:  2m 27s
      🟥 GCC8               Pass:   0%/1   | Total:  2m 22s | Avg:  2m 22s | Max:  2m 22s
      🟥 GCC9               Pass:   0%/2   | Total:  4m 50s | Avg:  2m 25s | Max:  2m 27s
      🟥 GCC10              Pass:   0%/2   | Total:  5m 04s | Avg:  2m 32s | Max:  2m 35s
      🟥 GCC11              Pass:   0%/2   | Total:  4m 56s | Avg:  2m 28s | Max:  2m 28s
      🟥 GCC12              Pass:   0%/2   | Total:  4m 40s | Avg:  2m 20s | Max:  2m 23s
      🟥 GCC13              Pass:   0%/11  | Total: 12m 49s | Avg:  1m 09s | Max:  2m 56s
      🟥 MSVC14.29          Pass:   0%/2   | Total: 34m 02s | Avg: 17m 01s | Max: 17m 27s
      🟥 MSVC14.42          Pass:   0%/2   | Total: 35m 19s | Avg: 17m 39s | Max: 18m 02s
      🟥 NVHPC25.1          Pass:   0%/2   | Total: 10m 26s | Avg:  5m 13s | Max:  5m 17s
    🟥 cxx_family
      🟥 Clang              Pass:   0%/17  | Total: 37m 58s | Avg:  2m 14s | Max:  2m 42s
      🟥 GCC                Pass:   0%/22  | Total: 39m 20s | Avg:  1m 47s | Max:  2m 56s
      🟥 MSVC               Pass:   0%/4   | Total:  1h 09m | Avg: 17m 20s | Max: 18m 02s
      🟥 NVHPC              Pass:   0%/2   | Total: 10m 26s | Avg:  5m 13s | Max:  5m 17s
    🟥 gpu
      🟥 h100               Pass:   0%/3   | Total:  2m 35s | Avg:  0m 51s | Max:  2m 35s
      🟥 rtx2080            Pass:   0%/34  | Total:  2h 28m | Avg:  4m 22s | Max: 18m 02s
      🟥 rtxa6000           Pass:   0%/8   | Total:  5m 36s | Avg:  0m 42s | Max:  2m 56s
    🟥 jobs
      🟥 Build              Pass:   0%/37  | Total:  2h 37m | Avg:  4m 14s | Max: 18m 02s
      🟥 DeviceLaunch       Pass:   0%/1  
      🟥 GraphCapture       Pass:   0%/1  
      🟥 HostLaunch         Pass:   0%/3  
      🟥 TestGPU            Pass:   0%/3  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total:  2m 35s | Avg:  0m 51s | Max:  2m 35s
      🟥 90;90a;100         Pass:   0%/1   | Total:  2m 42s | Avg:  2m 42s | Max:  2m 42s
    🟥 std
      🟥 17                 Pass:   0%/20  | Total:  1h 36m | Avg:  4m 48s | Max: 17m 27s
      🟥 20                 Pass:   0%/25  | Total:  1h 00m | Avg:  2m 26s | Max: 18m 02s
    
  • 🟥 thrust: Pass: 0%/45 | Total: 2h 36m | Avg: 3m 28s | Max: 18m 52s

    🟥 cmake_options
      🟥 -DTHRUST_DISPATCH_TYPE=Force32bit Pass:   0%/2   | Total:  2m 29s | Avg:  1m 14s | Max:  2m 29s
    🟥 cpu
      🟥 amd64              Pass:   0%/43  | Total:  2h 32m | Avg:  3m 32s | Max: 18m 52s
      🟥 arm64              Pass:   0%/2   | Total:  4m 01s | Avg:  2m 00s | Max:  2m 03s
    🟥 ctk
      🟥 12.0               Pass:   0%/5   | Total: 25m 31s | Avg:  5m 06s | Max: 16m 30s
      🟥 12.6               Pass:   0%/2   | Total:  9m 51s | Avg:  4m 55s | Max:  5m 05s
      🟥 12.8               Pass:   0%/38  | Total:  2h 01m | Avg:  3m 11s | Max: 18m 52s
    🟥 cudacxx
      🟥 ClangCUDA18        Pass:   0%/2   | Total:  5m 03s | Avg:  2m 31s | Max:  2m 34s
      🟥 nvcc12.0           Pass:   0%/5   | Total: 25m 31s | Avg:  5m 06s | Max: 16m 30s
      🟥 nvcc12.6           Pass:   0%/2   | Total:  9m 51s | Avg:  4m 55s | Max:  5m 05s
      🟥 nvcc12.8           Pass:   0%/36  | Total:  1h 56m | Avg:  3m 13s | Max: 18m 52s
    🟥 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/2   | Total:  5m 03s | Avg:  2m 31s | Max:  2m 34s
      🟥 nvcc               Pass:   0%/43  | Total:  2h 31m | Avg:  3m 31s | Max: 18m 52s
    🟥 cxx
      🟥 Clang14            Pass:   0%/4   | Total:  9m 15s | Avg:  2m 18s | Max:  2m 22s
      🟥 Clang15            Pass:   0%/2   | Total:  5m 02s | Avg:  2m 31s | Max:  2m 35s
      🟥 Clang16            Pass:   0%/2   | Total:  4m 56s | Avg:  2m 28s | Max:  2m 29s
      🟥 Clang17            Pass:   0%/2   | Total:  5m 02s | Avg:  2m 31s | Max:  2m 32s
      🟥 Clang18            Pass:   0%/7   | Total: 11m 56s | Avg:  1m 42s | Max:  2m 34s
      🟥 GCC7               Pass:   0%/2   | Total:  4m 11s | Avg:  2m 05s | Max:  2m 06s
      🟥 GCC8               Pass:   0%/1   | Total:  2m 10s | Avg:  2m 10s | Max:  2m 10s
      🟥 GCC9               Pass:   0%/2   | Total:  4m 47s | Avg:  2m 23s | Max:  2m 24s
      🟥 GCC10              Pass:   0%/2   | Total:  4m 44s | Avg:  2m 22s | Max:  2m 28s
      🟥 GCC11              Pass:   0%/2   | Total:  4m 35s | Avg:  2m 17s | Max:  2m 24s
      🟥 GCC12              Pass:   0%/2   | Total:  4m 41s | Avg:  2m 20s | Max:  2m 21s
      🟥 GCC13              Pass:   0%/10  | Total: 14m 17s | Avg:  1m 25s | Max:  2m 29s
      🟥 MSVC14.29          Pass:   0%/2   | Total: 35m 22s | Avg: 17m 41s | Max: 18m 52s
      🟥 MSVC14.42          Pass:   0%/3   | Total: 35m 47s | Avg: 11m 55s | Max: 18m 31s
      🟥 NVHPC25.1          Pass:   0%/2   | Total:  9m 51s | Avg:  4m 55s | Max:  5m 05s
    🟥 cxx_family
      🟥 Clang              Pass:   0%/17  | Total: 36m 11s | Avg:  2m 07s | Max:  2m 35s
      🟥 GCC                Pass:   0%/21  | Total: 39m 25s | Avg:  1m 52s | Max:  2m 29s
      🟥 MSVC               Pass:   0%/5   | Total:  1h 11m | Avg: 14m 13s | Max: 18m 52s
      🟥 NVHPC              Pass:   0%/2   | Total:  9m 51s | Avg:  4m 55s | Max:  5m 05s
    🟥 gpu
      🟥 h100               Pass:   0%/2   | Total:  2m 21s | Avg:  1m 10s | Max:  2m 21s
      🟥 rtx2080            Pass:   0%/33  | Total:  2h 08m | Avg:  3m 53s | Max: 18m 52s
      🟥 rtx4090            Pass:   0%/10  | Total: 25m 51s | Avg:  2m 35s | Max: 18m 31s
    🟥 jobs
      🟥 Build              Pass:   0%/38  | Total:  2h 36m | Avg:  4m 07s | Max: 18m 52s
      🟥 TestCPU            Pass:   0%/3  
      🟥 TestGPU            Pass:   0%/4  
    🟥 sm
      🟥 90                 Pass:   0%/2   | Total:  2m 21s | Avg:  1m 10s | Max:  2m 21s
      🟥 90;90a;100         Pass:   0%/1   | Total:  2m 28s | Avg:  2m 28s | Max:  2m 28s
    🟥 std
      🟥 17                 Pass:   0%/20  | Total:  1h 35m | Avg:  4m 46s | Max: 18m 52s
      🟥 20                 Pass:   0%/23  | Total: 58m 39s | Avg:  2m 33s | Max: 18m 31s
    
  • 🟥 cudax: Pass: 0%/22 | Total: 1h 09m | Avg: 3m 09s | Max: 9m 07s

    🟥 cudacxx_family
      🟥 nvcc               Pass:   0%/22  | Total:  1h 09m | Avg:  3m 09s | Max:  9m 07s
    🟥 cpu
      🟥 amd64              Pass:   0%/18  | Total:  1h 00m | Avg:  3m 20s | Max:  9m 07s
      🟥 arm64              Pass:   0%/4   | Total:  9m 31s | Avg:  2m 22s | Max:  2m 28s
    🟥 ctk
      🟥 12.0               Pass:   0%/1   | Total:  9m 01s | Avg:  9m 01s | Max:  9m 01s
      🟥 12.6               Pass:   0%/2   | Total: 10m 30s | Avg:  5m 15s | Max:  5m 20s
      🟥 12.8               Pass:   0%/19  | Total: 50m 03s | Avg:  2m 38s | Max:  9m 07s
    🟥 cudacxx
      🟥 nvcc12.0           Pass:   0%/1   | Total:  9m 01s | Avg:  9m 01s | Max:  9m 01s
      🟥 nvcc12.6           Pass:   0%/2   | Total: 10m 30s | Avg:  5m 15s | Max:  5m 20s
      🟥 nvcc12.8           Pass:   0%/19  | Total: 50m 03s | Avg:  2m 38s | Max:  9m 07s
    🟥 cxx
      🟥 Clang14            Pass:   0%/1   | Total:  2m 49s | Avg:  2m 49s | Max:  2m 49s
      🟥 Clang15            Pass:   0%/1   | Total:  2m 58s | Avg:  2m 58s | Max:  2m 58s
      🟥 Clang16            Pass:   0%/1   | Total:  2m 47s | Avg:  2m 47s | Max:  2m 47s
      🟥 Clang17            Pass:   0%/1   | Total:  2m 56s | Avg:  2m 56s | Max:  2m 56s
      🟥 Clang18            Pass:   0%/4   | Total:  7m 44s | Avg:  1m 56s | Max:  2m 56s
      🟥 GCC10              Pass:   0%/1   | Total:  2m 40s | Avg:  2m 40s | Max:  2m 40s
      🟥 GCC11              Pass:   0%/1   | Total:  3m 01s | Avg:  3m 01s | Max:  3m 01s
      🟥 GCC12              Pass:   0%/2   | Total:  2m 48s | Avg:  1m 24s | Max:  2m 48s
      🟥 GCC13              Pass:   0%/6   | Total: 13m 13s | Avg:  2m 12s | Max:  2m 58s
      🟥 MSVC14.39          Pass:   0%/1   | Total:  9m 01s | Avg:  9m 01s | Max:  9m 01s
      🟥 MSVC14.42          Pass:   0%/1   | Total:  9m 07s | Avg:  9m 07s | Max:  9m 07s
      🟥 NVHPC25.1          Pass:   0%/2   | Total: 10m 30s | Avg:  5m 15s | Max:  5m 20s
    🟥 cxx_family
      🟥 Clang              Pass:   0%/8   | Total: 19m 14s | Avg:  2m 24s | Max:  2m 58s
      🟥 GCC                Pass:   0%/10  | Total: 21m 42s | Avg:  2m 10s | Max:  3m 01s
      🟥 MSVC               Pass:   0%/2   | Total: 18m 08s | Avg:  9m 04s | Max:  9m 07s
      🟥 NVHPC              Pass:   0%/2   | Total: 10m 30s | Avg:  5m 15s | Max:  5m 20s
    🟥 gpu
      🟥 h100               Pass:   0%/2   | Total:  2m 45s | Avg:  1m 22s | Max:  2m 45s
      🟥 rtx2080            Pass:   0%/20  | Total:  1h 06m | Avg:  3m 20s | Max:  9m 07s
    🟥 jobs
      🟥 Build              Pass:   0%/19  | Total:  1h 09m | Avg:  3m 39s | Max:  9m 07s
      🟥 Test               Pass:   0%/3  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total:  5m 32s | Avg:  1m 50s | Max:  2m 47s
      🟥 90a                Pass:   0%/1   | Total:  2m 58s | Avg:  2m 58s | Max:  2m 58s
    🟥 std
      🟥 17                 Pass:   0%/4   | Total: 12m 47s | Avg:  3m 11s | Max:  5m 20s
      🟥 20                 Pass:   0%/18  | Total: 56m 47s | Avg:  3m 09s | Max:  9m 07s
    
  • 🟥 stdpar: Pass: 0%/4 | Total: 18m 26s | Avg: 4m 36s | Max: 5m 27s

    🟥 ctk
      🟥 12.6               Pass:   0%/4   | Total: 18m 26s | Avg:  4m 36s | Max:  5m 27s
    🟥 cudacxx
      🟥 nvcc12.6           Pass:   0%/4   | Total: 18m 26s | Avg:  4m 36s | Max:  5m 27s
    🟥 cudacxx_family
      🟥 nvcc               Pass:   0%/4   | Total: 18m 26s | Avg:  4m 36s | Max:  5m 27s
    🟥 cxx
      🟥 NVHPC25.1          Pass:   0%/4   | Total: 18m 26s | Avg:  4m 36s | Max:  5m 27s
    🟥 cxx_family
      🟥 NVHPC              Pass:   0%/4   | Total: 18m 26s | Avg:  4m 36s | Max:  5m 27s
    🟥 gpu
      🟥 rtx2080            Pass:   0%/4   | Total: 18m 26s | Avg:  4m 36s | Max:  5m 27s
    🟥 jobs
      🟥 Build              Pass:   0%/4   | Total: 18m 26s | Avg:  4m 36s | Max:  5m 27s
    🟥 cpu
      🟥 amd64              Pass:   0%/2   | Total: 10m 12s | Avg:  5m 06s | Max:  5m 27s
      🟥 arm64              Pass:   0%/2   | Total:  8m 14s | Avg:  4m 07s | Max:  4m 09s
    🟥 std
      🟥 17                 Pass:   0%/2   | Total:  8m 54s | Avg:  4m 27s | Max:  4m 45s
      🟥 20                 Pass:   0%/2   | Total:  9m 32s | Avg:  4m 46s | Max:  5m 27s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 17m 46s | Avg: 8m 53s | Max: 15m 19s | Hits: 98%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 17m 46s | Avg:  8m 53s | Max: 15m 19s | Hits:  98%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 17m 46s | Avg:  8m 53s | Max: 15m 19s | Hits:  98%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 17m 46s | Avg:  8m 53s | Max: 15m 19s | Hits:  98%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 17m 46s | Avg:  8m 53s | Max: 15m 19s | Hits:  98%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 17m 46s | Avg:  8m 53s | Max: 15m 19s | Hits:  98%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 17m 46s | Avg:  8m 53s | Max: 15m 19s | Hits:  98%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 17m 46s | Avg:  8m 53s | Max: 15m 19s | Hits:  98%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 27s | Avg:  2m 27s | Max:  2m 27s | Hits:  98%/160   
      🟩 Test               Pass: 100%/1   | Total: 15m 19s | Avg: 15m 19s | Max: 15m 19s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 14m | Avg: 1h 14m | Max: 1h 14m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 14m | Avg:  1h 14m | Max:  1h 14m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 14m | Avg:  1h 14m | Max:  1h 14m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 14m | Avg:  1h 14m | Max:  1h 14m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 14m | Avg:  1h 14m | Max:  1h 14m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 14m | Avg:  1h 14m | Max:  1h 14m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 14m | Avg:  1h 14m | Max:  1h 14m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 14m | Avg:  1h 14m | Max:  1h 14m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 14m | Avg:  1h 14m | Max:  1h 14m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
+/- Thrust
+/- CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- stdpar
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 119)

# Runner
81 linux-amd64-cpu16
11 windows-amd64-cpu16
10 linux-arm64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-amd64-gpu-rtx2080-latest-1
4 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

@github-actions
Contributor

🟩 CI finished in 1h 32m: Pass: 100%/119 | Total: 2d 17h | Avg: 32m 46s | Max: 1h 12m | Hits: 79%/145863
  • 🟩 cub: Pass: 100%/45 | Total: 1d 14h | Avg: 51m 19s | Max: 1h 12m | Hits: 75%/53780

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 12h | Avg: 51m 11s | Max:  1h 12m | Hits:  75%/51336 
      🟩 arm64              Pass: 100%/2   | Total:  1h 48m | Avg: 54m 12s | Max:  1h 01m | Hits:  69%/2444  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 46m | Avg: 57m 18s | Max:  1h 01m | Hits:  70%/5940  
      🟩 12.6               Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 12m | Hits:  69%/2260  
      🟩 12.8               Pass: 100%/38  | Total:  1d 07h | Avg: 49m 27s | Max:  1h 05m | Hits:  76%/45580 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 04m | Hits:  75%/2108  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 46m | Avg: 57m 18s | Max:  1h 01m | Hits:  70%/5940  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 12m | Hits:  69%/2260  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  1d 05h | Avg: 48m 39s | Max:  1h 05m | Hits:  76%/43472 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 04m | Hits:  75%/2108  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 12h | Avg: 50m 44s | Max:  1h 12m | Hits:  75%/51672 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  4h 03m | Avg:  1h 00m | Max:  1h 01m | Hits:  69%/4896  
      🟩 Clang15            Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m | Hits:  69%/2444  
      🟩 Clang16            Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 00m | Hits:  69%/2444  
      🟩 Clang17            Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 05m | Hits:  69%/2444  
      🟩 Clang18            Pass: 100%/7   | Total:  6h 03m | Avg: 51m 59s | Max:  1h 04m | Hits:  80%/8218  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 46m | Avg: 53m 11s | Max: 54m 09s | Hits:  69%/2448  
      🟩 GCC8               Pass: 100%/1   | Total: 50m 56s | Avg: 50m 56s | Max: 50m 56s | Hits:  69%/1224  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 38m | Avg: 49m 15s | Max: 50m 46s | Hits:  69%/2448  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 43m | Avg: 51m 54s | Max: 54m 56s | Hits:  69%/2448  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 36m | Avg: 48m 12s | Max: 49m 18s | Hits:  69%/2444  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 46m | Avg: 53m 18s | Max: 53m 22s | Hits:  69%/2444  
      🟩 GCC13              Pass: 100%/11  | Total:  6h 36m | Avg: 36m 02s | Max: 50m 34s | Hits:  85%/13442 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 55m | Avg: 57m 55s | Max: 58m 44s | Hits:  76%/2088  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  1h 53m | Avg: 56m 48s | Max: 57m 55s | Hits:  75%/2088  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 12m | Hits:  69%/2260  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 16h 17m | Avg: 57m 30s | Max:  1h 05m | Hits:  73%/20446 
      🟩 GCC                Pass: 100%/22  | Total: 15h 59m | Avg: 43m 35s | Max: 54m 56s | Hits:  77%/26898 
      🟩 MSVC               Pass: 100%/4   | Total:  3h 49m | Avg: 57m 21s | Max: 58m 44s | Hits:  76%/4176  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 12m | Hits:  69%/2260  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 22m | Avg: 27m 29s | Max: 30m 51s | Hits:  89%/3666  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 08h | Avg: 57m 12s | Max:  1h 12m | Hits:  70%/40338 
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 42m | Avg: 35m 15s | Max: 58m 35s | Hits:  92%/9776  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 10h | Avg: 56m 08s | Max:  1h 12m | Hits:  70%/44004 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 33m 08s | Avg: 33m 08s | Max: 33m 08s | Hits:  99%/1222  
      🟩 GraphCapture       Pass: 100%/1   | Total: 26m 00s | Avg: 26m 00s | Max: 26m 00s | Hits:  99%/1222  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 32m | Avg: 30m 45s | Max: 33m 14s | Hits:  99%/3666  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 20m | Avg: 26m 53s | Max: 28m 16s | Hits:  99%/3666  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 22m | Avg: 27m 29s | Max: 30m 51s | Hits:  89%/3666  
      🟩 90;90a;100         Pass: 100%/1   | Total: 47m 09s | Avg: 47m 09s | Max: 47m 09s | Hits:  69%/1222  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 19h 03m | Avg: 57m 11s | Max:  1h 12m | Hits:  70%/23662 
      🟩 20                 Pass: 100%/25  | Total: 19h 25m | Avg: 46m 37s | Max:  1h 11m | Hits:  79%/30118 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 22h 28m | Avg: 29m 57s | Max: 58m 22s | Hits: 80%/79911

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 41m 16s | Avg: 20m 38s | Max: 29m 11s | Hits:  88%/3554  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 21h 34m | Avg: 30m 06s | Max: 58m 22s | Hits:  80%/76358 
      🟩 arm64              Pass: 100%/2   | Total: 54m 01s | Avg: 27m 00s | Max: 28m 23s | Hits:  77%/3553  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 51m | Avg: 34m 22s | Max: 44m 36s | Hits:  78%/8876  
      🟩 12.6               Pass: 100%/2   | Total:  1h 49m | Avg: 54m 47s | Max: 58m 22s | Hits:  65%/3552  
      🟩 12.8               Pass: 100%/38  | Total: 17h 46m | Avg: 28m 04s | Max: 53m 16s | Hits:  81%/67483 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 50m 03s | Avg: 25m 01s | Max: 25m 54s | Hits:  77%/3552  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 51m | Avg: 34m 22s | Max: 44m 36s | Hits:  78%/8876  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  1h 49m | Avg: 54m 47s | Max: 58m 22s | Hits:  65%/3552  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 16h 56m | Avg: 28m 14s | Max: 53m 16s | Hits:  81%/63931 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 50m 03s | Avg: 25m 01s | Max: 25m 54s | Hits:  77%/3552  
      🟩 nvcc               Pass: 100%/43  | Total: 21h 38m | Avg: 30m 11s | Max: 58m 22s | Hits:  80%/76359 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 01m | Avg: 30m 27s | Max: 31m 25s | Hits:  77%/7104  
      🟩 Clang15            Pass: 100%/2   | Total: 57m 54s | Avg: 28m 57s | Max: 29m 58s | Hits:  77%/3552  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 03m | Avg: 31m 35s | Max: 33m 07s | Hits:  77%/3552  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 01m | Avg: 30m 32s | Max: 30m 49s | Hits:  77%/3552  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 31m | Avg: 21m 41s | Max: 28m 38s | Hits:  83%/12432 
      🟩 GCC7               Pass: 100%/2   | Total:  1h 03m | Avg: 31m 56s | Max: 32m 47s | Hits:  77%/3554  
      🟩 GCC8               Pass: 100%/1   | Total: 29m 12s | Avg: 29m 12s | Max: 29m 12s | Hits:  77%/1777  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 07m | Avg: 33m 48s | Max: 34m 42s | Hits:  77%/3554  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 17s | Max: 31m 35s | Hits:  77%/3554  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 08s | Max: 32m 14s | Hits:  77%/3554  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 07m | Avg: 33m 34s | Max: 33m 54s | Hits:  77%/3554  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 33m | Avg: 21m 21s | Max: 31m 48s | Hits:  86%/17770 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 28m | Avg: 44m 15s | Max: 44m 36s | Hits:  81%/3540  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 08m | Avg: 42m 45s | Max: 53m 16s | Hits:  81%/5310  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  1h 49m | Avg: 54m 47s | Max: 58m 22s | Hits:  65%/3552  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 35m | Avg: 26m 48s | Max: 33m 07s | Hits:  80%/30192 
      🟩 GCC                Pass: 100%/21  | Total:  9h 26m | Avg: 26m 58s | Max: 34m 42s | Hits:  81%/37317 
      🟩 MSVC               Pass: 100%/5   | Total:  3h 36m | Avg: 43m 21s | Max: 53m 16s | Hits:  81%/8850  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 49m | Avg: 54m 47s | Max: 58m 22s | Hits:  65%/3552  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 29m 08s | Avg: 14m 34s | Max: 16m 36s | Hits:  88%/3554  
      🟩 rtx2080            Pass: 100%/33  | Total: 18h 22m | Avg: 33m 25s | Max: 58m 22s | Hits:  76%/58604 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 36m | Avg: 21m 38s | Max: 49m 13s | Hits:  91%/17753 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 20h 58m | Avg: 33m 06s | Max: 58m 22s | Hits:  76%/67481 
      🟩 TestCPU            Pass: 100%/3   | Total: 41m 32s | Avg: 13m 50s | Max: 25m 48s | Hits:  99%/5323  
      🟩 TestGPU            Pass: 100%/4   | Total: 48m 30s | Avg: 12m 07s | Max: 12m 32s | Hits:  99%/7107  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 29m 08s | Avg: 14m 34s | Max: 16m 36s | Hits:  88%/3554  
      🟩 90;90a;100         Pass: 100%/1   | Total: 30m 52s | Avg: 30m 52s | Max: 30m 52s | Hits:  77%/1777  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 11h 38m | Avg: 34m 55s | Max: 58m 22s | Hits:  76%/35511 
      🟩 20                 Pass: 100%/23  | Total: 10h 08m | Avg: 26m 27s | Max: 51m 13s | Hits:  82%/40846 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 2h 16m | Avg: 6m 12s | Max: 15m 14s | Hits: 95%/11852

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  2h 01m | Avg:  6m 45s | Max: 15m 14s | Hits:  95%/9512  
      🟩 arm64              Pass: 100%/4   | Total: 14m 51s | Avg:  3m 42s | Max:  3m 51s | Hits:  95%/2340  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total:  9m 51s | Avg:  9m 51s | Max:  9m 51s | Hits:  89%/282   
      🟩 12.6               Pass: 100%/2   | Total: 14m 00s | Avg:  7m 00s | Max:  7m 07s | Hits:  90%/754   
      🟩 12.8               Pass: 100%/19  | Total:  1h 52m | Avg:  5m 56s | Max: 15m 14s | Hits:  96%/10816 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total:  9m 51s | Avg:  9m 51s | Max:  9m 51s | Hits:  89%/282   
      🟩 nvcc12.6           Pass: 100%/2   | Total: 14m 00s | Avg:  7m 00s | Max:  7m 07s | Hits:  90%/754   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 52m | Avg:  5m 56s | Max: 15m 14s | Hits:  96%/10816 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  2h 16m | Avg:  6m 12s | Max: 15m 14s | Hits:  95%/11852 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  4m 39s | Avg:  4m 39s | Max:  4m 39s | Hits:  95%/587   
      🟩 Clang15            Pass: 100%/1   | Total:  4m 11s | Avg:  4m 11s | Max:  4m 11s | Hits:  95%/585   
      🟩 Clang16            Pass: 100%/1   | Total:  4m 16s | Avg:  4m 16s | Max:  4m 16s | Hits:  95%/585   
      🟩 Clang17            Pass: 100%/1   | Total:  4m 31s | Avg:  4m 31s | Max:  4m 31s | Hits:  95%/585   
      🟩 Clang18            Pass: 100%/4   | Total: 24m 22s | Avg:  6m 05s | Max: 12m 35s | Hits:  96%/2340  
      🟩 GCC10              Pass: 100%/1   | Total:  4m 15s | Avg:  4m 15s | Max:  4m 15s | Hits:  95%/587   
      🟩 GCC11              Pass: 100%/1   | Total:  4m 15s | Avg:  4m 15s | Max:  4m 15s | Hits:  95%/585   
      🟩 GCC12              Pass: 100%/2   | Total: 17m 50s | Avg:  8m 55s | Max: 13m 04s | Hits:  97%/1170  
      🟩 GCC13              Pass: 100%/6   | Total: 34m 13s | Avg:  5m 42s | Max: 15m 14s | Hits:  96%/3510  
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 51s | Avg:  9m 51s | Max:  9m 51s | Hits:  89%/282   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 12s | Avg: 10m 12s | Max: 10m 12s | Hits:  89%/282   
      🟩 NVHPC25.1          Pass: 100%/2   | Total: 14m 00s | Avg:  7m 00s | Max:  7m 07s | Hits:  90%/754   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 41m 59s | Avg:  5m 14s | Max: 12m 35s | Hits:  96%/4682  
      🟩 GCC                Pass: 100%/10  | Total:  1h 00m | Avg:  6m 03s | Max: 15m 14s | Hits:  96%/5852  
      🟩 MSVC               Pass: 100%/2   | Total: 20m 03s | Avg: 10m 01s | Max: 10m 12s | Hits:  89%/564   
      🟩 NVHPC              Pass: 100%/2   | Total: 14m 00s | Avg:  7m 00s | Max:  7m 07s | Hits:  90%/754   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 18m 55s | Avg:  9m 27s | Max: 15m 14s | Hits:  97%/1170  
      🟩 rtx2080            Pass: 100%/20  | Total:  1h 57m | Avg:  5m 53s | Max: 13m 04s | Hits:  95%/10682 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 35m | Avg:  5m 02s | Max: 10m 12s | Hits:  94%/10097 
      🟩 Test               Pass: 100%/3   | Total: 40m 53s | Avg: 13m 37s | Max: 15m 14s | Hits:  99%/1755  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 22m 40s | Avg:  7m 33s | Max: 15m 14s | Hits:  96%/1755  
      🟩 90a                Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s | Hits:  95%/585   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 18m 18s | Avg:  4m 34s | Max:  7m 07s | Hits:  94%/2132  
      🟩 20                 Pass: 100%/18  | Total:  1h 58m | Avg:  6m 34s | Max: 15m 14s | Hits:  95%/9720  
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 16m 28s | Avg: 4m 07s | Max: 4m 54s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 41s | Avg:  4m 50s | Max:  4m 54s
      🟩 arm64              Pass: 100%/2   | Total:  6m 47s | Avg:  3m 23s | Max:  3m 24s
    🟩 ctk
      🟩 12.6               Pass: 100%/4   | Total: 16m 28s | Avg:  4m 07s | Max:  4m 54s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/4   | Total: 16m 28s | Avg:  4m 07s | Max:  4m 54s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 16m 28s | Avg:  4m 07s | Max:  4m 54s
    🟩 cxx
      🟩 NVHPC25.1          Pass: 100%/4   | Total: 16m 28s | Avg:  4m 07s | Max:  4m 54s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 16m 28s | Avg:  4m 07s | Max:  4m 54s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 16m 28s | Avg:  4m 07s | Max:  4m 54s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 16m 28s | Avg:  4m 07s | Max:  4m 54s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  8m 11s | Avg:  4m 05s | Max:  4m 47s
      🟩 20                 Pass: 100%/2   | Total:  8m 17s | Avg:  4m 08s | Max:  4m 54s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 18m 44s | Avg: 9m 22s | Max: 16m 16s | Hits: 98%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 18m 44s | Avg:  9m 22s | Max: 16m 16s | Hits:  98%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 18m 44s | Avg:  9m 22s | Max: 16m 16s | Hits:  98%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 18m 44s | Avg:  9m 22s | Max: 16m 16s | Hits:  98%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 18m 44s | Avg:  9m 22s | Max: 16m 16s | Hits:  98%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 18m 44s | Avg:  9m 22s | Max: 16m 16s | Hits:  98%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 18m 44s | Avg:  9m 22s | Max: 16m 16s | Hits:  98%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 18m 44s | Avg:  9m 22s | Max: 16m 16s | Hits:  98%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 28s | Avg:  2m 28s | Max:  2m 28s | Hits:  98%/160   
      🟩 Test               Pass: 100%/1   | Total: 16m 16s | Avg: 16m 16s | Max: 16m 16s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 10m | Avg: 1h 10m | Max: 1h 10m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m
    

@bernhardmgruber merged commit 45cb0b2 into NVIDIA:main on Mar 21, 2025
133 of 134 checks passed
@github-project-automation (bot) moved this from In Review to Done in CCCL on Mar 21, 2025
@bernhardmgruber deleted the exec_stream_ref branch on March 21, 2025 18:47
davebayer pushed a commit to davebayer/cccl that referenced this pull request Apr 7, 2025
Also drop `__host__ __device__` for stream_ref constructor

Fixes: NVIDIA#4150

Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>
Linked issue: Thrust execution_policy::on should take a stream_ref in addition to a stream