[PT2E Quantization] Fix RecursionError when prepare_pt2e graph with concat of the same node by siahuat0727 · Pull Request #129567 · pytorch/pytorch

@siahuat0727
Contributor

Fixes #129038
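
For readers unfamiliar with the failure mode, here is a minimal, torch-free sketch of how annotating a concat whose inputs are the same node could produce a self-referencing shared spec, and hence a RecursionError when the spec is resolved. It is not the code in xnnpack_quantizer_utils.py: `SharedSpec`, `annotate_cat`, and `resolve` are hypothetical names, and the `skip_same_input` guard is only an assumption about the general shape of such a fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedSpec:
    """Hypothetical stand-in meaning: share qparams with edge (src, user)."""
    edge: tuple


def annotate_cat(inputs, cat="cat", skip_same_input=True):
    """Build an {input: spec} map for a concat node (illustrative only)."""
    first = inputs[0]
    spec_map = {first: "int8_act_spec"}  # concrete spec for the first input
    for inp in inputs[1:]:
        if skip_same_input and inp == first:
            # Same node fed to cat twice: keep the concrete spec instead of
            # overwriting it with a spec that points back at itself.
            continue
        spec_map[inp] = SharedSpec(edge=(first, cat))
    return spec_map


def resolve(spec_map, node):
    """Follow shared specs until a concrete spec is reached."""
    spec = spec_map[node]
    if isinstance(spec, SharedSpec):
        src, _user = spec.edge
        return resolve(spec_map, src)
    return spec
```

Without the guard, `annotate_cat(["x", "x"], skip_same_input=False)` maps `"x"` to a spec that refers back to `"x"`, so `resolve` never reaches a concrete spec and Python raises RecursionError. With the guard, `"x"` keeps its concrete spec.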

@pytorch-bot

pytorch-bot bot commented Jun 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/129567

Note: Links to docs will display an error until the docs builds have been completed.

❌ 20 New Failures, 24 Cancelled Jobs

As of commit 5a856ce with merge base 9ca749d:

NEW FAILURES - The following jobs have failed:

  • inductor-rocm / rocm6.2-py3.10-inductor / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / cuda12.4-py3.10-gcc9-sm75 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-focal-cuda11.8-py3.10-gcc9 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-focal-cuda12.4-py3.10-gcc9 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-focal-cuda12.4-py3.10-gcc9-sm86 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-focal-py3_9-clang9-xla / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-focal-py3.11-clang10 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-focal-py3.12-clang10 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-focal-py3.9-clang10 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-focal-py3.9-clang10-onnx / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-focal-rocm6.2-py3.10 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-jammy-cuda11.8-cudnn9-py3.9-clang12 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-jammy-py3-clang12-executorch / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-jammy-py3-clang12-mobile-build / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-jammy-py3.10-clang15-asan / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-jammy-py3.9-gcc11 / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-jammy-py3.9-gcc11-mobile-lightweight-dispatch-build / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-jammy-py3.9-gcc11-no-ops / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / linux-jammy-py3.9-gcc11-pch / build (gh)
    ##[error]Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/upload-sccache-stats'. Did you forget to run actions/checkout before running your local action?
  • pull / win-vs2019-cpu-py3 / build (gh)
    sccache: error: couldn't connect to server

CANCELLED JOBS - The following jobs were cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@siahuat0727 force-pushed the fix-pt2e-concat-same-node branch from 0c7b4a1 to b45e5d3 on June 26, 2024 13:23
@siahuat0727 changed the title from "Fix pt2e concat same node" to "[PT2E Quantization] Fix RecursionError when prepare_pt2e graph with concat of the same node" on Jun 26, 2024
@siahuat0727
Contributor Author

Hi @jerryzh168, I need your help and review on this PR. Thank you.

@siahuat0727
Contributor Author

Hi @jerryzh168,

Would you be able to review this PR when you have a chance? Thank you!

Contributor

@jerryzh168 jerryzh168 left a comment

Sorry, just saw this. Thanks for fixing this, @siahuat0727. Can you add a test in https://github.com/pytorch/pytorch/blob/main/test/quantization/pt2e/test_xnnpack_quantizer.py?

@siahuat0727
Contributor Author

@jerryzh168 Sure

@siahuat0727 force-pushed the fix-pt2e-concat-same-node branch from b45e5d3 to 3c3231e on July 19, 2024 13:55
@siahuat0727
Contributor Author

siahuat0727 commented Jul 19, 2024

@jerryzh168 Done. Could you help review the test? Thanks!

@siahuat0727
Contributor Author

Hi @jerryzh168, I noticed that the checks have been triggered and have successfully passed. Would appreciate your review and any further actions that may be needed. Thanks!

@siahuat0727
Contributor Author

@jerryzh168 Hi, looking forward to your help

@siahuat0727
Contributor Author

Hi @jerryzh168, would you mind merging this? It seems that I am not authorized.

Contributor

@jerryzh168 jerryzh168 left a comment

Sorry for the delay, I haven't been looking at mentions in my notifications. Thanks for the fix!

@jerryzh168
Contributor

@pytorchbot merge

@pytorch-bot

pytorch-bot bot commented Oct 23, 2024

Pull workflow has not been scheduled for the PR yet. It could be because author doesn't have permissions to run those or skip-checks keywords were added to PR/commits, aborting merge. Please get/give approval for the workflows and/or remove skip ci decorators before next merge attempt. If you think this is a mistake, please contact PyTorch Dev Infra.

@jerryzh168
Contributor

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict pull/129567/head returned non-zero exit code 1

Rebasing (1/2)
Auto-merging torch/ao/quantization/quantizer/xnnpack_quantizer_utils.py
CONFLICT (content): Merge conflict in torch/ao/quantization/quantizer/xnnpack_quantizer_utils.py
error: could not apply 0309dec555... Fix pt2e concat same node
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Could not apply 0309dec555... Fix pt2e concat same node

Raised by https://github.com/pytorch/pytorch/actions/runs/11473256650

@jerryzh168
Contributor

@siahuat0727 could you manually rebase the PR to the latest main?

@siahuat0727 force-pushed the fix-pt2e-concat-same-node branch from 9f5bcc5 to f86ca90 on October 23, 2024 05:35
@siahuat0727
Contributor Author

Hi @jerryzh168, I've just rebased the branch onto viable/strict. Is it correct?

@siahuat0727
Contributor Author

@jerryzh168

@jerryzh168
Contributor

> Hi @jerryzh168, I've just rebased the branch onto viable/strict. Is it correct?

I think just rebasing on main would be better.

@siahuat0727 force-pushed the fix-pt2e-concat-same-node branch from f86ca90 to 8fbdac1 on October 30, 2024 08:30
@siahuat0727
Contributor Author

Hi @jerryzh168, I've rebased the PR on main.

@siahuat0727
Contributor Author

@jerryzh168

@jerryzh168
Contributor

Sorry for the delay, I think we can merge.

@siahuat0727
Contributor Author

Hi @jerryzh168, there’s a failing check, but it seems unrelated to this PR. Is there anything I should do to move the merge forward?

@jerryzh168
Contributor

Can you just run lintrunner -a?

@siahuat0727
Contributor Author

> can you just run lintrunner -a

Hi @jerryzh168, here is the result:

➜  pytorch git:(fix-pt2e-concat-same-node) lintrunner -a
Warning: Could not find a lintrunner config at: '.lintrunner.private.toml'. Continuing without using configuration file.
  FLAKE8 success!
  CLANGFORMAT success!
  MYPY success!
  CLANGTIDY success!
  TYPEIGNORE success!
  TYPENOSKIP success!
  NOQA success!
  NATIVEFUNCTIONS success!
  NEWLINE success!
  SPACES success!
  INCLUDE success!
  TABS success!
  ERROR_PRONE_ISINSTANCE success!
  PYBIND11_SPECIALIZATION success!
  PYPIDEP success!
  EXEC success!
  RAWCUDA success!
  RAWCUDADEVICE success!
  ROOT_LOGGING success!
  PYBIND11_INCLUDE success!
  MYPYSTRICT success!
  DEPLOY_DETECTION success!
  ACTIONLINT success!
  TESTOWNERS success!
  TEST_HAS_MAIN success!
  CUBINCLUDE success!
  CMAKE success!
  SHELLCHECK success!
  CALL_ONCE success!
  ONCE_FLAG success!
  CONTEXT_DECORATOR success!
  WORKFLOWSYNC success!
  PYFMT success!
  COPYRIGHT success!
  LINTRUNNER_VERSION success!
  BAZEL_LINTER success!
  RUFF success!
  META_NO_CREATE_UNBACKED success!
  ATEN_CPU_GPU_AGNOSTIC success!
  MERGE_CONFLICTLESS_CSV success!
ok No lint issues.
Successfully applied all patches.
➜  pytorch git:(fix-pt2e-concat-same-node) git status   
On branch fix-pt2e-concat-same-node
Your branch is up to date with 'origin/fix-pt2e-concat-same-node'.

nothing to commit, working tree clean
➜  pytorch git:(fix-pt2e-concat-same-node) git l|head -1
8fbdac1c00a Add test for concatenation the same node

@siahuat0727
Contributor Author

Running lintrunner -m origin/main -a modified some code, but it is still running. I will update the PR later.

@siahuat0727
Contributor Author

Done. @jerryzh168

@siahuat0727
Contributor Author

@jerryzh168

@jerryzh168
Contributor

@pytorchbot merge

@pytorch-bot bot added the ciflow/trunk label on Nov 26, 2024
@pytorchmergebot
Collaborator

PR targets viable/strict rather than main, refusing merge request

@jerryzh168
Contributor

@siahuat0727 it looks like you have to open a PR against main, not viable/strict

@siahuat0727
Contributor Author

Got it, sorry. I had noticed that a section in Contributing.md suggested merging to viable/strict. I opened a new PR.

pytorchmergebot pushed a commit that referenced this pull request Nov 29, 2024
…oncat of the same node (#141651)

Fixes #129038

Related PR #129567

Here is the new PR against main, thanks! @jerryzh168

Pull Request resolved: #141651
Approved by: https://github.com/jerryzh168
GeorgeWigley pushed a commit to graphcore/pytorch-fork that referenced this pull request Nov 29, 2024
…oncat of the same node (pytorch#141651)

Fixes pytorch#129038

Related PR pytorch#129567

Here is the new PR against main, thanks! @jerryzh168

Pull Request resolved: pytorch#141651
Approved by: https://github.com/jerryzh168
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
…oncat of the same node (pytorch#141651)

Fixes pytorch#129038

Related PR pytorch#129567

Here is the new PR against main, thanks! @jerryzh168

Pull Request resolved: pytorch#141651
Approved by: https://github.com/jerryzh168

Labels

ciflow/trunk (Trigger trunk jobs on your pull request), open source, release notes: AO frontend, release notes: quantization (release notes category)

4 participants