[PT2]: Add Static Dispatch Kernel for scale_gradient #160454

kqfu · 2025-08-12T19:58:56Z

Summary: Add Static Dispatch Kernel for scale_gradient

Test Plan:

MODEL_TYPE=dpa_product_first_ctr_model
MODEL_ENTITY_ID=892669089
SNAPSHOT_ID=37
OTHER_MODEL_ENTITY_ID=892669089
OTHER_SNAPSHOT_ID=36

MODULES=(mix prepare_float_features object user)
SUFFIXES=(.predictor.local .predictor.precompute.prepare_float_features .predictor.precompute.remote_object_only .predictor.precompute.remote_request_only)

for i in "${!MODULES[@]}"; do
  MODULE=${MODULES[i]}
  SUFFIX=${SUFFIXES[i]}
  buck2 run mode/opt caffe2/torch/fb/model_transform/fx2trt/packaging:load_net_predictor -- --loadMode=BenchmarkAB --inputNetFile=/data/users/$USER/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID}/${MODEL_ENTITY_ID}_${SNAPSHOT_ID}${SUFFIX} --otherNetFile=/data/users/$USER/models/${OTHER_MODEL_ENTITY_ID}/${OTHER_SNAPSHOT_ID}/${OTHER_MODEL_ENTITY_ID}_${OTHER_SNAPSHOT_ID}${SUFFIX} --moduleName=${MODULE} --submodToDevice "" --benchmarkDontRebatchSamples=true --doNotRandomizeSampleInputs=true
done
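
The loop above pairs each module with the file suffix at the same index, so it can be useful to check which snapshot files each iteration would compare before starting a long benchmark. The following is only a dry-run sketch under stated assumptions: it prints the two paths per module and never invokes buck2. The helper names `net_path` and `dry_run` are hypothetical and not part of this PR, and the module/suffix pairs are listed in a here-document (instead of two parallel arrays) so the sketch also runs under plain POSIX `sh`.

```shell
#!/bin/sh
# Dry-run sketch: print the paths the A/B benchmark loop would pass to
# --inputNetFile and --otherNetFile, without invoking buck2.
# net_path and dry_run are hypothetical helper names, not part of the PR.

MODEL_ENTITY_ID=892669089
SNAPSHOT_ID=37
OTHER_MODEL_ENTITY_ID=892669089
OTHER_SNAPSHOT_ID=36

# net_path ENTITY_ID SNAPSHOT_ID SUFFIX
# Builds a path using the directory layout shown in the test plan.
net_path() {
  printf '/data/users/%s/models/%s/%s/%s_%s%s\n' \
    "${USER:-unknown}" "$1" "$2" "$1" "$2" "$3"
}

# Each here-document line pairs a module name with its file suffix,
# mirroring MODULES[i] and SUFFIXES[i] in the test plan.
dry_run() {
  while read -r module suffix; do
    echo "module=$module"
    echo "  input=$(net_path "$MODEL_ENTITY_ID" "$SNAPSHOT_ID" "$suffix")"
    echo "  other=$(net_path "$OTHER_MODEL_ENTITY_ID" "$OTHER_SNAPSHOT_ID" "$suffix")"
  done <<'EOF'
mix .predictor.local
prepare_float_features .predictor.precompute.prepare_float_features
object .predictor.precompute.remote_object_only
user .predictor.precompute.remote_request_only
EOF
}

dry_run
```

Keeping each module and its suffix on one line makes a mismatched pairing easier to spot than with two separate arrays; whether the printed files actually exist on disk still has to be checked separately.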

Rollback Plan:

Reviewed By: henryoier

Differential Revision: D80062244

pytorch-bot · 2025-08-12T19:58:59Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160454

❌ 1 New Failure

As of commit a407a08 with merge base dae7710:

NEW FAILURE - The following job has failed:

windows-arm64-build-test / test (gh)
ModuleNotFoundError: No module named 'torch'

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-08-12T19:59:07Z

This pull request was exported from Phabricator. Differential Revision: D80062244

github-actions · 2025-08-12T20:02:51Z

Attention! native_functions.yaml was changed

If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs, one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.

Caused by:

aten/src/ATen/native/native_functions.yaml

facebook-github-bot · 2025-08-13T16:52:51Z

This pull request was exported from Phabricator. Differential Revision: D80062244

facebook-github-bot · 2025-08-13T21:41:36Z

This pull request was exported from Phabricator. Differential Revision: D80062244

linux-foundation-easycla · 2025-08-13T21:41:51Z

The committers listed above are authorized under a signed CLA.

✅ login: kqfu / name: Kevin Fu (a407a08)

iremyux · 2025-08-14T14:30:22Z

Adding the ciflow/win-arm64 label to trigger the Windows Arm64 CI for testing purposes; this has nothing to do with this PR specifically. (It should not affect acceptance of the PR even if it fails.)

facebook-github-bot · 2025-08-14T17:32:25Z

This pull request was exported from Phabricator. Differential Revision: D80062244

kqfu · 2025-08-15T00:03:54Z

/easycla

kqfu · 2025-08-15T00:03:57Z

/easycla

kqfu · 2025-08-15T00:25:50Z

/easycla

facebook-github-bot · 2025-08-15T00:28:13Z

This pull request was exported from Phabricator. Differential Revision: D80062244

facebook-github-bot · 2025-08-15T02:32:59Z

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

pytorchmergebot · 2025-08-15T02:34:50Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

pytorchmergebot · 2025-08-15T02:35:09Z

Merge failed

Reason: 1 jobs have failed, first few of them are: windows-arm64-build-test / test

clee2000 · 2025-08-15T03:40:39Z

@pytorchbot merge -f "i don't think the win arm failure is related"

pytorchmergebot · 2025-08-15T03:42:22Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Summary: Add Static Dispatch Kernel for scale_gradient

Test Plan:
```
MODEL_TYPE=dpa_product_first_ctr_model
MODEL_ENTITY_ID=892669089
SNAPSHOT_ID=37
OTHER_MODEL_ENTITY_ID=892669089
OTHER_SNAPSHOT_ID=36

MODULES=(mix prepare_float_features object user)
SUFFIXES=(.predictor.local .predictor.precompute.prepare_float_features .predictor.precompute.remote_object_only .predictor.precompute.remote_request_only)

for i in "${!MODULES[@]}"; do
  MODULE=${MODULES[i]}
  SUFFIX=${SUFFIXES[i]}
  buck2 run mode/opt caffe2/torch/fb/model_transform/fx2trt/packaging:load_net_predictor -- --loadMode=BenchmarkAB --inputNetFile=/data/users/$USER/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID}/${MODEL_ENTITY_ID}_${SNAPSHOT_ID}${SUFFIX} --otherNetFile=/data/users/$USER/models/${OTHER_MODEL_ENTITY_ID}/${OTHER_SNAPSHOT_ID}/${OTHER_MODEL_ENTITY_ID}_${OTHER_SNAPSHOT_ID}${SUFFIX} --moduleName=${MODULE} --submodToDevice "" --benchmarkDontRebatchSamples=true --doNotRandomizeSampleInputs=true
done
```

Rollback Plan:

Reviewed By: henryoier

Differential Revision: D80062244

Pull Request resolved: pytorch#160454

Approved by: https://github.com/henryoier

pytorch-bot bot added the release notes: quantization release notes category label Aug 12, 2025

facebook-github-bot added the fb-exported label Aug 12, 2025

kqfu force-pushed the export-D80062244 branch from 5463b49 to 016fd7c on August 13, 2025 16:47

kqfu force-pushed the export-D80062244 branch from 016fd7c to ee2c5bd on August 13, 2025 16:52

kqfu force-pushed the export-D80062244 branch from ee2c5bd to bdb0998 on August 13, 2025 21:34

kqfu force-pushed the export-D80062244 branch from bdb0998 to a16df43 on August 13, 2025 21:41

henryoier approved these changes Aug 14, 2025

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 14, 2025

henryoier approved these changes Aug 14, 2025

iremyux added the ciflow/win-arm64 Trigger Windows Arm64 CI Workflows label Aug 14, 2025

kqfu force-pushed the export-D80062244 branch from a16df43 to 15ec71f on August 14, 2025 17:28

kqfu force-pushed the export-D80062244 branch from 15ec71f to 3406649 on August 14, 2025 17:32

kqfu force-pushed the export-D80062244 branch from 3406649 to a407a08 on August 15, 2025 00:28

pytorchmergebot added the merging label Aug 15, 2025

pytorchmergebot removed the merging label Aug 15, 2025

pytorchmergebot added the merging label Aug 15, 2025

pytorchmergebot closed this in 55061c9 Aug 15, 2025

pytorchmergebot added Merged and removed merging labels Aug 15, 2025
