Fix the sampler and update the triton/cuda kernels #146

nvchenghaoz · 2025-09-26T22:50:39Z

@coderabbitai summary

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>

* Fix the bamba unit test
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* none: Add triton backend for ssm_transform and cuda backend for conv
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* Fully Use the TRT LLM kernels
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* Add fake version for ssm transform op
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* Fix the datatype error in fake op
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* Fix the conv test error
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* Fix the triton ssm error
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* Fix the DemoLLM sampler mismatch
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* Update the implementation for triton/cuda kernels
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* Fix the d2d memcpy for decode
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
* Revert the generator and remove the redundant code
  Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

---------

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>

nvchenghaoz and others added 12 commits September 22, 2025 14:16

Fix the bamba unit test

22ade41

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

none: Add triton backend for ssm_transform and cuda backend for conv

2344404

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Fully Use the TRT LLM kernels

1bbcf19

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Add fake version for ssm transform op

65083c2

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Fix the datatype error in fake op

8cfb07b

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Fix the conv test error

f6c7aec

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Fix the triton ssm error

08aada6

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Fix the DemoLLM sampler mismatch

199cdcb

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Update the implementation for triton/cuda kernels

a4307d3

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

ensure cudagraph compatibility

2d02923

Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>

Fix the d2d memcpy for decode

33b6206

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Revert the generator and remove the redundant code

920fa1e

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

lucaslie approved these changes Sep 26, 2025

nvchenghaoz merged commit 4b50b3e into feat/ad_linear_attention Sep 26, 2025
2 of 3 checks passed
