Fix ConvolutionBinaryInplace using target node #114436
Conversation
Fixes #113440 [ghstack-poisoned]
There's a bug in convolution_unary where it uses the target of ConvolutionBinaryInplace even though it needs to use input[1]. I'm having a hard time finding where this happens, so I'm partially reverting this part until I can find it. cc: @Chillee EDIT: updated the PR with the correct fix
This IR node mutates in place; it needs to use the argument, not the target. Fixes #113440 [ghstack-poisoned]
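To make the distinction concrete, here is a minimal, self-contained sketch of the idea; the class and attribute names are hypothetical stand-ins, not the actual Inductor IR:

```python
# Hypothetical sketch (not the real Inductor IR): an in-place extern kernel
# must surface the argument it mutated, not a fresh buffer derived from the
# call target.
class Buffer:
    def __init__(self, name):
        self.name = name

class InplaceExternKernelSketch:
    def __init__(self, target_name, mutated_input):
        self.target_name = target_name      # the extern op being called
        self.mutated_input = mutated_input  # the buffer written in place

    def realize_output(self):
        # Wrong: allocate a new buffer keyed off the target, e.g.
        #   return Buffer(self.target_name)
        # Right: the op's result *is* the argument it mutated in place.
        return self.mutated_input

other = Buffer("buf3")
node = InplaceExternKernelSketch("convolution_binary_inplace", other)
assert node.realize_output() is other   # the result aliases the mutated input
```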
Thanks for the fix. The UT passes after this fix, but the model still fails. For example:

    python benchmarks/dynamo/torchbench.py --performance --float32 -dcpu -n50 --no-skip --dashboard --only resnet50 --inference --freezing --timeout 9000 --backend=inductor --output=test.csv

Error message:

        return getattr(self, n.op)(n.target, args, kwargs)
      File "/home/leslie/inductor/pytorch/torch/_inductor/graph.py", line 589, in call_function
        return target(*args, **kwargs)
      File "/home/leslie/inductor/pytorch/torch/_inductor/fx_passes/mkldnn_fusion.py", line 397, in fn
        assert isinstance(other, ir.TensorBox)
    torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised: AssertionError:

I think this is because we need a `TensorBox` instead of an `ExternKernelAlloc`. Instead of the current fix, how about we change `return packed` (Line 5158 in 7daeb65) to return `packed.inputs[0]`, since the return value will be wrapped into a `TensorBox`?
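A toy sketch of the type mismatch behind that suggestion; the classes and the function name `lower_conv_binary_inplace` are stand-ins assumed for illustration, not the actual lowering code:

```python
class TensorBox:              # stands in for ir.TensorBox
    pass

class ConvBinaryInplaceNode:  # stands in for the ExternKernelAlloc-style node
    def __init__(self, inputs):
        self.inputs = inputs

def lower_conv_binary_inplace(inputs):
    packed = ConvBinaryInplaceNode(inputs)
    # Returning `packed` would hand an ExternKernelAlloc-like node to the
    # fusion pass, tripping its `assert isinstance(other, ir.TensorBox)`.
    # Returning the mutated input keeps the TensorBox type the pass expects.
    return packed.inputs[0]

other = TensorBox()
result = lower_conv_binary_inplace([other, TensorBox()])
assert isinstance(result, TensorBox)
```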
cc @chuanqi129, since more than one model failed in the fp32/bf16 testing, please help confirm that all of the failures are fixed by this PR.
This IR node mutates in place; it needs to use the argument, not the target. Fixes #113440 [ghstack-poisoned]
@leslie-fang-intel I updated the PR; can you check whether this fixes the other issue? If not, please share the full assertion failure.
cc @chuanqi129, please help to confirm
Thanks @oulgen. Hmm, I think we should not use Line 3748 in 7daeb65
@leslie-fang-intel I don't quite follow what you mean. The op mutates input[1], and previously via MutationLayout it was returning input[1]. So, returning input[1] is the correct result.
I think
This IR node mutates in place; it needs to use the argument, not the target. Fixes #113440 [ghstack-poisoned]
@leslie-fang-intel Thanks! I missed the buffer wrapping when I moved the code around. Updated the code.
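For reference, a rough illustration of the kind of wrapping that was missing, using stand-in classes rather than the real ir.TensorBox / ir.StorageBox types:

```python
class StorageBox:
    def __init__(self, data):
        self.data = data

class TensorBox:
    def __init__(self, data):
        self.data = data

def wrap_result(raw_buffer):
    # The raw buffer returned by the in-place op is re-wrapped into the box
    # types the rest of the lowering expects; without this, callers that
    # assert on TensorBox fail.
    return TensorBox(StorageBox(raw_buffer))

assert isinstance(wrap_result("buf0"), TensorBox)
```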
Please double-check that ConvolutionBinaryInplace sets get_mutation_names() properly.
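A hedged sketch of what that check is about; the method name comes from the comment above, while the class and body shown here are assumptions about how an in-place node would report its mutated buffer, not the actual implementation:

```python
# Hypothetical sketch: an in-place extern kernel node advertises which buffer
# it mutates so the scheduler can order reads and writes correctly.
class Buffer:
    def __init__(self, name):
        self.name = name
    def get_name(self):
        return self.name

class ConvolutionBinaryInplaceSketch:
    def __init__(self, inputs):
        # After the constructor's argument reordering (discussed later in this
        # thread), the in-place-mutated tensor sits at inputs[0].
        self.inputs = inputs

    def get_mutation_names(self):
        # Report the buffer written in place; an empty list here would let the
        # scheduler treat the op as pure and reorder it incorrectly.
        return [self.inputs[0].get_name()]

node = ConvolutionBinaryInplaceSketch([Buffer("buf0"), Buffer("buf1")])
assert node.get_mutation_names() == ["buf0"]
```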
@chuanqi129 is working on verifying this fix against the failures we saw in the fp32/bf16 static/dynamic shape tests. I expect he will post the results early next week.
@pytorchbot merge
@pytorchbot abort
❌ 🤖 pytorchbot command failed: Try
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Actually @jansel is correct: since init reorders the arguments, the mutation actually happens on input[0].
The init function reorders the arguments, so the mutation actually happens on argument input[0]. I am not sure if there's a good way to test this, unfortunately. Added tests on #114436. [ghstack-poisoned]
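A toy illustration of the reordering point; the constructor signature and names are hypothetical, not copied from the real ConvolutionBinaryInplace:

```python
class ConvolutionBinaryInplaceSketch:
    def __init__(self, x, other):
        # The constructor swaps (x, other) into the layout the extern kernel
        # expects, so the tensor mutated in place ends up at inputs[0] even
        # though the caller passed it second.
        self.inputs = [other, x]

    def mutated_input(self):
        return self.inputs[0]

node = ConvolutionBinaryInplaceSketch(x="conv_activation", other="accumulated_into")
assert node.mutated_input() == "accumulated_into"
```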
#114501 fixes the mutation tracking bug. My apologies. I tried to cancel the merge but couldn't figure out how to do it.
@eellison once mentioned that closing and reopening the PR should stop the merge, although I never tried it myself.
The init function reorders the arguments, so the mutation actually happens on argument input[0]. I am not sure if there's a good way to test this, unfortunately. Added tests on #114436. Pull Request resolved: #114501. Approved by: https://github.com/leslie-fang-intel, https://github.com/aakhundov
Stack from ghstack (oldest at bottom):
This IR node mutates in place; it needs to use the argument, not the target.
Fixes #113440
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler