Add thrust::offset_iterator
#4073
/ok to test
6779e58 to 3427e90
🟨 CI finished in 1h 40m: Pass: 51%/93 | Total: 2d 14h | Avg: 40m 29s | Max: 1h 26m | Hits: 35%/53934

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

🏃 Runner counts (total jobs: 93)
| # | Runner |
|---|---|
| 66 | linux-amd64-cpu16 |
| 9 | windows-amd64-cpu16 |
| 6 | linux-amd64-gpu-rtxa6000-latest-1 |
| 4 | linux-arm64-cpu16 |
| 3 | linux-amd64-gpu-h100-latest-1 |
| 3 | linux-amd64-gpu-rtx4090-latest-1 |
| 2 | linux-amd64-gpu-rtx2080-latest-1 |
3427e90 to e214f13
🟨 CI finished in 1h 07m: Pass: 91%/93 | Total: 1d 10h | Avg: 22m 15s | Max: 1h 00m | Hits: 96%/120182

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

8ab29ce to d307ca9
🟨 CI finished in 1h 28m: Pass: 93%/93 | Total: 1d 01h | Avg: 16m 20s | Max: 1h 26m | Hits: 94%/125555

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

d307ca9 to 64ffdb8
🟩 CI finished in 1h 09m: Pass: 100%/93 | Total: 17h 09m | Avg: 11m 04s | Max: 1h 00m | Hits: 94%/134475

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

🟩 CI finished in 1h 10m: Pass: 100%/93 | Total: 17h 35m | Avg: 11m 20s | Max: 1h 00m | Hits: 94%/134475

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

649f5ee to 062bfb1
🟩 CI finished in 1h 14m: Pass: 100%/93 | Total: 20h 18m | Avg: 13m 06s | Max: 1h 09m | Hits: 93%/134475

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

Co-authored-by: Elias Stehle <3958403+elstehle@users.noreply.github.com>
🟩 CI finished in 1h 07m: Pass: 100%/93 | Total: 17h 29m | Avg: 11m 16s | Max: 1h 03m | Hits: 94%/134475

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

* Add thrust::offset_iterator
* Support custom offset types in offset_iterator
* Move CUB-using test to CUDA
* Remove mutation
* Add example loading offset via transform_iterator and extend doc
* Fx1# Please enter the commit message for your changes. Lines starting
* MSVC workaround
* Update after discussion with elstehle
* Add select example with offset_iterator
* Apply suggestions from code review

Co-authored-by: Elias Stehle <3958403+elstehle@users.noreply.github.com>

---------

Co-authored-by: Elias Stehle <3958403+elstehle@users.noreply.github.com>
Should the

There is no such function. I assume you mean the deduction guide here: https://github.com/NVIDIA/cccl/pull/4073/files#diff-41c2e21b5544528c5931e9128123ef37f48ec73afc7a138a1b4c00395d56d752R186-R187 It serves the same purpose though. Constructing an

Thanks! It is my first time seeing deduction guides in real code.
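
For readers who have not met deduction guides before, here is a minimal C++17 sketch of the language feature. The type `with_offset` and its members are hypothetical and are not taken from this PR; the sketch only shows how a guide lets the compiler pick template arguments from constructor-style arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical aggregate for illustration only; not a Thrust type.
template <typename Base, typename Offset = std::ptrdiff_t>
struct with_offset
{
  Base base;
  Offset offset;
};

// User-defined deduction guide: writing `with_offset w{b, o};` deduces
// with_offset<decltype(b), decltype(o)> (after decay), so the caller does
// not have to spell out the template arguments.
template <typename Base, typename Offset>
with_offset(Base, Offset) -> with_offset<Base, Offset>;
```

With this guide in place, `with_offset w{ptr, 2u};` names no template arguments, which is the job a `make_*`-style factory function would otherwise do.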
Adds a new `offset_iterator` to Thrust that bundles a base iterator and an offset. The offset is either a value or an `indirectly_readable`. Dereferencing and comparing iterators will apply the offset to the base iterator. If the offset is `indirectly_readable`, the offset value will be loaded as required.

This iterator has two use cases with different mechanics:

`+=`) in host code inside an `offset_iterator`. Advancing the offset iterator will advance the offset and not the base iterator. This makes it possible to use such iterators in host code as well.

Related: #3767
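
The mechanics described above can be illustrated with a small, self-contained sketch. This is not the implementation added by the PR and does not use Thrust: `offset_iterator_sketch` and `sum_range` are hypothetical names, and only the value-offset case is covered (the `indirectly_readable` case is left out). It assumes a base iterator that supports `base + n` and `==`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, NOT thrust::offset_iterator: value offsets only.
template <typename BaseIt>
struct offset_iterator_sketch
{
  BaseIt base;           // never modified after construction
  std::ptrdiff_t offset; // the only state that advancing changes

  // Dereferencing applies the offset to the base iterator.
  decltype(auto) operator*() const { return *(base + offset); }

  // Advancing moves the offset, not the base iterator.
  offset_iterator_sketch& operator+=(std::ptrdiff_t n)
  {
    offset += n;
    return *this;
  }
  offset_iterator_sketch& operator++()
  {
    ++offset;
    return *this;
  }

  // Comparison also applies the offset to the base iterator first.
  friend bool operator==(const offset_iterator_sketch& a, const offset_iterator_sketch& b)
  {
    return a.base + a.offset == b.base + b.offset;
  }
  friend bool operator!=(const offset_iterator_sketch& a, const offset_iterator_sketch& b)
  {
    return !(a == b);
  }
};

// Sums [first, last) using only the operations defined above.
template <typename It>
int sum_range(It first, It last)
{
  int total = 0;
  for (; first != last; ++first)
  {
    total += *first;
  }
  return total;
}
```

For example, two sketch iterators over the same `const int*` base with offsets 1 and 4 delimit elements 1 through 3 of the underlying array, and the `base` member is unchanged after `+=`.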