[PT2]: Add Static Dispatch Kernel for scale_gradient by kqfu · Pull Request #160454 · pytorch/pytorch · GitHub
Conversation

@kqfu
Contributor

@kqfu kqfu commented Aug 12, 2025

Summary: Add Static Dispatch Kernel for scale_gradient

Test Plan:

MODEL_TYPE=dpa_product_first_ctr_model
MODEL_ENTITY_ID=892669089
SNAPSHOT_ID=37
OTHER_MODEL_ENTITY_ID=892669089
OTHER_SNAPSHOT_ID=36

MODULES=(mix prepare_float_features object user)
SUFFIXES=(.predictor.local .predictor.precompute.prepare_float_features .predictor.precompute.remote_object_only .predictor.precompute.remote_request_only)

for i in "${!MODULES[@]}"; do
  MODULE=${MODULES[i]}
  SUFFIX=${SUFFIXES[i]}
  buck2 run mode/opt caffe2/torch/fb/model_transform/fx2trt/packaging:load_net_predictor -- --loadMode=BenchmarkAB --inputNetFile=/data/users/$USER/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID}/${MODEL_ENTITY_ID}_${SNAPSHOT_ID}${SUFFIX} --otherNetFile=/data/users/$USER/models/${OTHER_MODEL_ENTITY_ID}/${OTHER_SNAPSHOT_ID}/${OTHER_MODEL_ENTITY_ID}_${OTHER_SNAPSHOT_ID}${SUFFIX} --moduleName=${MODULE} --submodToDevice "" --benchmarkDontRebatchSamples=true --doNotRandomizeSampleInputs=true
done

Rollback Plan:

Reviewed By: henryoier

Differential Revision: D80062244
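
The test plan pairs each entry of `MODULES` with the entry of `SUFFIXES` at the same index, and compares snapshot 37 against snapshot 36 of the same model. A minimal dry-run sketch of that pairing is below. It prints the two net files per module instead of invoking `buck2`. The `net_file` helper, the `MODELS_ROOT` variable, and the length check are illustrative assumptions, not part of the original test plan.

```shell
#!/usr/bin/env bash
# Dry run of the A/B test plan: show which files each module would compare.
set -euo pipefail

MODEL_ENTITY_ID=892669089
SNAPSHOT_ID=37
OTHER_MODEL_ENTITY_ID=892669089
OTHER_SNAPSHOT_ID=36
# Assumption: same root directory as in the test plan; falls back if USER is unset.
MODELS_ROOT="/data/users/${USER:-unknown}/models"

MODULES=(mix prepare_float_features object user)
SUFFIXES=(.predictor.local .predictor.precompute.prepare_float_features .predictor.precompute.remote_object_only .predictor.precompute.remote_request_only)

# Hypothetical helper: build <root>/<entity>/<snapshot>/<entity>_<snapshot><suffix>.
net_file() {
  local entity=$1 snapshot=$2 suffix=$3
  printf '%s/%s/%s/%s_%s%s\n' "$MODELS_ROOT" "$entity" "$snapshot" "$entity" "$snapshot" "$suffix"
}

# The loop indexes both arrays with the same i, so they must be the same length.
if [ "${#MODULES[@]}" -ne "${#SUFFIXES[@]}" ]; then
  echo "MODULES and SUFFIXES must have the same number of entries" >&2
  exit 1
fi

for i in "${!MODULES[@]}"; do
  echo "module=${MODULES[i]}"
  echo "  input: $(net_file "$MODEL_ENTITY_ID" "$SNAPSHOT_ID" "${SUFFIXES[i]}")"
  echo "  other: $(net_file "$OTHER_MODEL_ENTITY_ID" "$OTHER_SNAPSHOT_ID" "${SUFFIXES[i]}")"
done
```

Running this before the real benchmark is a cheap way to confirm that every module maps to the intended predictor file.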

@pytorch-bot

pytorch-bot bot commented Aug 12, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160454

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit a407a08 with merge base dae7710:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D80062244

@github-actions
Contributor

Attention! native_functions.yaml was changed

If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs, one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.


Caused by:

@kqfu kqfu force-pushed the export-D80062244 branch from 5463b49 to 016fd7c Compare August 13, 2025 16:47
kqfu added a commit to kqfu/pytorch that referenced this pull request Aug 13, 2025
@kqfu kqfu force-pushed the export-D80062244 branch from 016fd7c to ee2c5bd Compare August 13, 2025 16:52
kqfu added a commit to kqfu/pytorch that referenced this pull request Aug 13, 2025
@kqfu kqfu force-pushed the export-D80062244 branch from ee2c5bd to bdb0998 Compare August 13, 2025 21:34
kqfu added a commit to kqfu/pytorch that referenced this pull request Aug 13, 2025
kqfu added a commit to kqfu/pytorch that referenced this pull request Aug 13, 2025
@kqfu kqfu force-pushed the export-D80062244 branch from bdb0998 to a16df43 Compare August 13, 2025 21:41
@linux-foundation-easycla

linux-foundation-easycla bot commented Aug 13, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: kqfu / name: Kevin Fu (a407a08)

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 14, 2025
@iremyux
Collaborator

iremyux commented Aug 14, 2025

Adding the ciflow/win-arm64 label to trigger Windows Arm64 CI for testing purposes; this has nothing to do with this PR specifically. (It should not affect acceptance of the PR even if it fails.)

@iremyux iremyux added the ciflow/win-arm64 Trigger Windows Arm64 CI Workflows label Aug 14, 2025
kqfu added a commit to kqfu/pytorch that referenced this pull request Aug 14, 2025
kqfu added a commit to kqfu/pytorch that referenced this pull request Aug 14, 2025
kqfu added a commit to kqfu/pytorch that referenced this pull request Aug 14, 2025
@kqfu kqfu force-pushed the export-D80062244 branch from a16df43 to 15ec71f Compare August 14, 2025 17:28
kqfu added a commit to kqfu/pytorch that referenced this pull request Aug 14, 2025
@kqfu kqfu force-pushed the export-D80062244 branch from 15ec71f to 3406649 Compare August 14, 2025 17:32
@kqfu
Contributor Author

kqfu commented Aug 15, 2025

/easycla

@kqfu kqfu force-pushed the export-D80062244 branch from 3406649 to a407a08 Compare August 15, 2025 00:28
@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 job has failed: windows-arm64-build-test / test

Details for Dev Infra team Raised by workflow job

@clee2000
Contributor

@pytorchbot merge -f "i don't think the win arm failure is related"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

can-gaa-hou pushed a commit to can-gaa-hou/pytorch that referenced this pull request Aug 22, 2025
Summary: Add Static Dispatch Kernel for scale_gradient

Test Plan:
```
MODEL_TYPE=dpa_product_first_ctr_model
MODEL_ENTITY_ID=892669089
SNAPSHOT_ID=37
OTHER_MODEL_ENTITY_ID=892669089
OTHER_SNAPSHOT_ID=36

MODULES=(mix prepare_float_features object user)
SUFFIXES=(.predictor.local .predictor.precompute.prepare_float_features .predictor.precompute.remote_object_only .predictor.precompute.remote_request_only)

for i in "${!MODULES[@]}"; do
  MODULE=${MODULES[i]}
  SUFFIX=${SUFFIXES[i]}
  buck2 run mode/opt caffe2/torch/fb/model_transform/fx2trt/packaging:load_net_predictor -- --loadMode=BenchmarkAB --inputNetFile=/data/users/$USER/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID}/${MODEL_ENTITY_ID}_${SNAPSHOT_ID}${SUFFIX} --otherNetFile=/data/users/$USER/models/${OTHER_MODEL_ENTITY_ID}/${OTHER_SNAPSHOT_ID}/${OTHER_MODEL_ENTITY_ID}_${OTHER_SNAPSHOT_ID}${SUFFIX} --moduleName=${MODULE} --submodToDevice "" --benchmarkDontRebatchSamples=true --doNotRandomizeSampleInputs=true
done
```

Rollback Plan:

Reviewed By: henryoier

Differential Revision: D80062244

Pull Request resolved: pytorch#160454
Approved by: https://github.com/henryoier
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
Labels

  • ciflow/trunk: Trigger trunk jobs on your pull request
  • ciflow/win-arm64: Trigger Windows Arm64 CI Workflows
  • fb-exported
  • Merged
  • release notes: quantization (release notes category)

6 participants