Comparing v3.0.2...v3.0.3 · NVIDIA/cccl · GitHub
base repository: NVIDIA/cccl
base: v3.0.2
compare: v3.0.3
  • 12 commits
  • 112 files changed
  • 9 contributors

Commits on Aug 8, 2025

  1. Backport #5442 to branch/3.0x (#5469)

    * cuda.cccl: Update dependencies to enable running on CUDA 13 driver (#5442)
    
    * Add numba-cuda as a dependency
    
    * Replace use of pynvjitlink patch
    
    * Update pyproject.toml
    
    There's a bug in cuda-bindings 12.9.0 that prevents us from using CUDA 13 driver
    
    * Pathfinder update
    
    * Remove all uses of CUDA_ENABLE_PYNVJITLINK. Remove pynvjitlink dependency. Add numba-cuda lower bound.
    
    * Remove all empty imports sections
    
    * Restore some imports sections
    
    * It's lib, not library
    
    ---------
    
    Co-authored-by: Ashwin Srinath <shwina@users.noreply.github.com>
    
    * We need cuda.core?
    
    * We need cuda.core?
    
    * Add begin/end markers to warp merge sort API example
    
    ---------
    
    Co-authored-by: Ashwin Srinath <shwina@users.noreply.github.com>
    shwina and shwina authored Aug 8, 2025
    668d07f
  2. Fix grid dependency sync in cub::DeviceMergeSort (#5456) (#5461)

    The sync was too late and did not guard loading from merge_partitions, leading to a data race
    bernhardmgruber authored Aug 8, 2025
    a7a0fd5

Commits on Aug 9, 2025

  1. Fix SMEM alignment in DeviceTransform (#5463)

    This is a partial backport of #5414
    
    Fixes NvBug 5320137 and NVIDIA/cccl_private#453
    bernhardmgruber authored Aug 9, 2025
    cd84a78

Commits on Aug 12, 2025

  1. Bump branch/3.0.x to 3.0.3. (#5502)

    Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    github-actions[bot] authored Aug 12, 2025
    e8e3a5f
  2. 043f422

Commits on Aug 28, 2025

  1. Update PTX ISA version for CUDA 13 (#5676) (#5700)

    Co-authored-by: David Bayer <48736217+davebayer@users.noreply.github.com>
    miscco and davebayer authored Aug 28, 2025
    70062d3

Commits on Sep 9, 2025

  1. Backport some MSVC test fixes to 3.0 (#5819)

    We are getting QA bugs because of those failures; most are MSVC limitations that are popping up again.
    
    There is one fix in the definition of `indirectly_writable` for an MSVC oddity
    miscco authored Sep 9, 2025
    978484a

Commits on Sep 17, 2025

  1. Work around submdspan compiler issue on MSVC (#5885) (#5903)

    We were experiencing segfaults with `submdspan` on MSVC
    
    Thanks to some friendly help from the compiler team, we have a workaround for those crashes
    miscco authored Sep 17, 2025
    8050285

Commits on Sep 24, 2025

  1. Backport pin of llvmlite dependency to branch/3.0x (#6000)

    * Backport pin of llvmlite dependency to branch/3.0x
    
    * Pin llvmlite dependency in pyproject.toml for cuda_coop
    
    Added llvmlite dependency with a note to remove it later.
    
    * [pre-commit.ci] auto code formatting
    
    ---------
    
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    shwina and pre-commit-ci[bot] authored Sep 24, 2025
    218ac88

Commits on Oct 1, 2025

  1. Ensure that we are actually calling the cuda APIs ... (#4570) (#6098)

    (cherry picked from commit 65c0373)
    
    Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>
    davebayer and miscco authored Oct 1, 2025
    4c8d218

Commits on Oct 2, 2025

  1. 667a95f

Commits on Oct 6, 2025

  1. [Backport 3.0.x] Use proper qualification in allocate.h (#4796) (#6126)

    * Use proper qualification in allocate.h (#4796)
    
    Fixes #4793 ([BUG]: Ambiguous call to `__do_deallocate_handle_size` when compiling with clang and libc++)
    
    * Keep sink for `__align`
    
    ---------
    
    Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>
    wmaxey and miscco authored Oct 6, 2025
    8c04b65
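
    The "proper qualification" fix above can be illustrated with a minimal sketch. It assumes the ambiguity came from argument-dependent lookup (ADL) finding same-named helpers in two namespaces; the commit message does not spell out the mechanism. The namespaces `vendor_a` and `vendor_b`, the helper `release`, and the types `tag` and `handle` are hypothetical stand-ins, not names from allocate.h.

```cpp
// Hypothetical stand-in for one standard library implementation.
namespace vendor_a {
struct tag {};

template <class T>
int release(T*, tag) { return 1; }
} // namespace vendor_a

// Hypothetical stand-in for a second library that ships a helper
// with the same name and the same signature.
namespace vendor_b {
struct handle {};

template <class T>
int release(T*, vendor_a::tag) { return 2; }

inline int call_qualified(handle* h) {
  // An unqualified call, release(h, vendor_a::tag{}), would find
  // vendor_b::release by ordinary lookup and vendor_a::release through
  // ADL on the tag argument. Both candidates match equally well, so the
  // compiler reports an ambiguous call.
  //
  // Qualifying the name disables ADL and selects exactly one helper.
  return vendor_b::release(h, vendor_a::tag{});
}
} // namespace vendor_b
```

    With the qualified call, `vendor_b::call_qualified` always dispatches to `vendor_b::release`, regardless of which other namespaces the argument types drag in.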