Migrate nonzero from TH to ATen (CPU) #58811
💊 CI failures summary and remediations. As of commit 815dad5 (more details on the Dr. CI page):
🕵️ 2 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages:
```cpp
AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND3(
    kHalf, kBFloat16, kBool, self.scalar_type(), "nonzero_count_cpu", [&] {
      at::parallel_for(0, iter.numel(), internal::GRAIN_SIZE, [&] (int64_t begin, int64_t end) {
```
at::parallel_for adds some non-trivial overhead; it's typically beneficial to skip it when the size is < GRAIN_SIZE.
I've changed at::parallel_for so it calls the lambda directly without entering the parallel region in that case. I hope that's acceptable, as it's much cleaner on the caller's end.
Yeah, that's good. Interesting, ParallelNative and ParallelTBB already had this shortcut.
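For readers following along, a minimal sketch of the early-exit behavior being discussed; this is illustrative only, not ATen's actual `at::parallel_for`, and the `GRAIN_SIZE` constant here is just a stand-in:

```cpp
// Illustrative sketch, not ATen's real at::parallel_for: when the range fits
// in a single grain, call the lambda directly instead of entering the
// parallel region, so small inputs pay no thread-pool overhead.
#include <cstdint>

namespace sketch {

constexpr int64_t GRAIN_SIZE = 32768;  // stand-in for at::internal::GRAIN_SIZE

template <typename F>
void parallel_for(int64_t begin, int64_t end, int64_t grain_size, const F& f) {
  if (begin >= end) {
    return;
  }
  if (end - begin < grain_size) {
    f(begin, end);  // serial fast path: no parallel region
    return;
  }
  // ... otherwise split [begin, end) into grain-sized chunks and dispatch
  // them to the thread pool (omitted here) ...
  f(begin, end);
}

}  // namespace sketch
```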
```cpp
// Convert thread-local counts to cumulative sum
for (size_t i = 1; i < thread_count_nonzero.size(); ++i) {
  thread_count_nonzero[i] += thread_count_nonzero[i - 1];
```
is DimVector initialized to 0? Otherwise thread_count_nonzero[0] is uninitialized?
Yes, DimVector is like std::vector in that it will always zero-initialize the array.
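To illustrate the point in a self-contained way (using `std::vector` instead of `DimVector`; the `num_threads + 1` sizing here is an assumption for the example, not the PR's exact code): a sized constructor value-initializes its elements, so the first entry is a well-defined 0 before the cumulative sum runs.

```cpp
// Self-contained illustration, not the PR code: sized construction
// value-initializes, so entry 0 is a well-defined zero and the prefix-sum
// loop turns per-thread counts into per-thread output offsets.
#include <cassert>
#include <cstdint>
#include <vector>

int main() {
  const int num_threads = 8;  // hypothetical thread count
  std::vector<int64_t> thread_count_nonzero(num_threads + 1);  // all zeros
  assert(thread_count_nonzero[0] == 0);

  // Pretend each thread t stored its count at [t + 1], then accumulate:
  for (size_t i = 1; i < thread_count_nonzero.size(); ++i) {
    thread_count_nonzero[i] += thread_count_nonzero[i - 1];
  }
  return 0;
}
```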
```cpp
const auto self_sizes = self.sizes();
const auto total_nonzero = thread_count_nonzero.back();
const int64_t ndim = self_sizes.size();
resize_output(result, {total_nonzero, ndim});
```
Currently the contiguity of the tensor returned by nonzero differs between CUDA and CPU, see #46224. It would be good to fix it here, but that would require the result tensor to be ({ndim, total_nonzero}).t(). Proper handling of result if it was passed in by the user is also a bit annoying.
This was easier than I expected. It turns out resize_output returns true if the tensor was resized, so I just re-stride the output in that case.
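Roughly the shape of that fix (a sketch with assumed names, not the PR's exact code): only when `resize_output` reports that it actually (re)allocated the output do we swap in column-major strides, so a user-provided `out` tensor keeps its own layout.

```cpp
// Sketch only (assumed names): give a freshly resized result the transposed
// layout so CPU matches CUDA's nonzero output (see gh-46224), equivalent to
// allocating {ndim, total_nonzero} and returning its .t() view.
#include <ATen/ATen.h>
#include <ATen/native/Resize.h>

void fixup_nonzero_result(at::Tensor& result, int64_t total_nonzero, int64_t ndim) {
  if (at::native::resize_output(result, {total_nonzero, ndim})) {
    result.as_strided_({total_nonzero, ndim}, {1, total_nonzero});
  }
}
```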
```cpp
TORCH_INTERNAL_ASSERT(begin == thread_begin[tid]);

// +1 faster than additional condition check inside loop
c10::SmallVector<int64_t, 33> sizes(ndim + 1, -1);
```
why 33?
It means no allocation is done unless ndim > 32. With DimVector's default static size that would be ndim > 4 which is more likely to happen.
While nominally we don't have a limit on the number of dimensions, de-facto on GPU it's 25 (MAX_DIMS in OffsetCalculator.cuh), so supporting more than that doesn't make sense
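For context on the trade-off being discussed, an illustrative snippet (the "ndim > 4" figure follows from DimVector's inline capacity of 5):

```cpp
// Illustration of the inline-storage trade-off: SmallVector<T, N> only
// heap-allocates once it outgrows N inline elements, so N = 33 covers
// ndim + 1 for any input with up to 32 dimensions without an allocation,
// whereas at::DimVector (SmallVector<int64_t, 5>) would allocate for any
// ndim > 4.
#include <cstdint>
#include <c10/util/SmallVector.h>

void make_sizes(int64_t ndim) {
  c10::SmallVector<int64_t, 33> sizes(ndim + 1, -1);  // no heap alloc if ndim <= 32
  (void)sizes;
}
```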
```cpp
const auto in_stride = strides[0];
const auto out_stride1 = out_accessor.stride(1);
const auto out_stride0 = out_accessor.stride(0) - ndim * out_stride1;
const auto ndim = out_accessor.size(1);
```
why do you need this ndim? It's the same as the captured one?
I've added a comment. The local variables allow the compiler to keep them in registers; otherwise its alias analysis isn't good enough to realize the int64_t pointer we're writing through won't alias them.
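A generic illustration of that pattern (not the PR's actual kernel; all names here are made up for the example): copying loop-invariant values into locals before the hot loop lets the compiler keep them in registers, since stores through an `int64_t*` could otherwise be assumed to alias `int64_t` values reachable through the accessor.

```cpp
// Hoisting loop-invariant values into locals so the compiler need not reload
// them after every store through `out` (which it must otherwise assume may
// alias them).
#include <cstdint>

void write_indices(int64_t* out, const int64_t* src_strides,
                   int64_t ndim_outer, int64_t count) {
  const int64_t ndim = ndim_outer;        // local copy, stays in a register
  const int64_t stride = src_strides[0];  // likewise

  for (int64_t i = 0; i < count; ++i) {
    for (int64_t d = 0; d < ndim; ++d) {
      out[i * ndim + d] = i * stride + d;  // stand-in for the real index math
    }
  }
}
```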
```cpp
int64_t* out = out_ptr;

for (int64_t i = 0; i < n2; ++i) {
  const char* ptr = data[0] + i * strides[1];
```
So for a 1-d tensor strides[1] will never be used, as i will only be 0. We should think about how we can make it possible to not pack strides to 2*ntensors(), but I agree that for now it's unsafe (re: other TI PRs).
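For reference, a simplified sketch of the two-level loop shape being discussed (assumed convention, not TensorIterator's exact API): with one operand, `strides[0]` is the inner-dimension stride and `strides[1]` the outer one, so a 1-D iteration has `n2 == 1` and never reads `strides[1]`.

```cpp
// Simplified loop2d-style iteration over a single operand's raw bytes.
#include <cstdint>

void loop2d(char* const* data, const int64_t* strides, int64_t n1, int64_t n2) {
  for (int64_t i = 0; i < n2; ++i) {
    const char* ptr = data[0] + i * strides[1];  // outer-dimension stride
    for (int64_t j = 0; j < n1; ++j) {
      // ... inspect the element at `ptr` here ...
      ptr += strides[0];  // inner-dimension stride
    }
  }
}
```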
Can you please run the benchmarks with the latest version?
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Here are the updated benchmark numbers:
@peterbell10 android tests are failing with:
Is there a list of "selected" kernels somewhere? I'm adding new kernels, so I assume those need to be added to whatever list.
I've run the suggested tracer updates internally, let's see if it fixes it.
Sorry, that suggestion still didn't work. I'm working with the mobile team to try to get it resolved.
Sorry @peterbell10, the XLA error is real (probably related to the new dispatch for count_nonzero), reverting.
This pull request has been reverted by 657b75d.
Summary: Resubmit of #58811, Closes gh-24745

The existing PR (gh-50655) has been stalled because `TensorIterator` doesn't guarantee iteration order in the same way that `TH_TENSOR_APPLY` does. For contiguous test cases this isn't an issue; but it breaks down for example with channels last format. I resolve this by adding a new `TensorIteratorConfig` parameter, `enforce_linear_iteration`, which disables dimension reordering. I've also added a test case for non-contiguous tensors to verify this works.

This PR also significantly improves performance by adding multithreading support to the algorithm. As part of this, I wrote a custom `count_nonzero` that gives per-thread counts, which is necessary to write the outputs in the right location.

| Shape      |  Before | After (1 thread) | After (8 threads) |
|:----------:|--------:|-----------------:|------------------:|
| 256,128,32 | 2610 us |          2150 us |            551 us |
| 128,128,32 | 1250 us |          1020 us |            197 us |
| 64,128,32  |  581 us |           495 us |             99 us |
| 32,128,32  |  292 us |           255 us |             83 us |
| 16,128,32  |  147 us |           126 us |             75 us |
| 8,128,32   |   75 us |            65 us |             65 us |
| 4,128,32   |   39 us |            33 us |             33 us |
| 2,128,32   |   20 us |            18 us |             18 us |
| 1,128,32   |   11 us |             9 us |              9 us |

Pull Request resolved: #59149
Reviewed By: mruberry
Differential Revision: D28817466
Pulled By: ngimel
fbshipit-source-id: f08f6c003c339368fd53dabd28e9ada9e59de732
Summary: Closes pytorchgh-24745

The existing PR (pytorchgh-50655) has been stalled because `TensorIterator` doesn't guarantee iteration order in the same way that `TH_TENSOR_APPLY` does. For contiguous test cases this isn't an issue; but it breaks down for example with channels last format. I resolve this by adding a new `TensorIteratorConfig` parameter, `enforce_linear_iteration`, which disables dimension reordering. I've also added a test case for non-contiguous tensors to verify this works.

This PR also significantly improves performance by adding multithreading support to the algorithm. As part of this, I wrote a custom `count_nonzero` that gives per-thread counts, which is necessary to write the outputs in the right location.

| Shape      |  Before | After (1 thread) | After (8 threads) |
|:----------:|--------:|-----------------:|------------------:|
| 256,128,32 | 2610 us |          2220 us |            496 us |
| 128,128,32 | 1250 us |           976 us |            175 us |
| 64,128,32  |  581 us |           486 us |             88 us |
| 32,128,32  |  292 us |           245 us |             80 us |
| 16,128,32  |  147 us |           120 us |             71 us |
| 8,128,32   |   75 us |            61 us |             61 us |
| 4,128,32   |   39 us |            32 us |             32 us |
| 2,128,32   |   20 us |            17 us |             17 us |
| 1,128,32   |   11 us |             9 us |              9 us |

Pull Request resolved: pytorch#58811
Reviewed By: anjali411
Differential Revision: D28700259
Pulled By: ngimel
fbshipit-source-id: 9b279ca7c36d8e348b7e5e4be0dd159e05aee159
Closes gh-24745
The existing PR (gh-50655) has been stalled because `TensorIterator` doesn't guarantee iteration order in the same way that `TH_TENSOR_APPLY` does. For contiguous test cases this isn't an issue; but it breaks down for example with channels last format. I resolve this by adding a new `TensorIteratorConfig` parameter, `enforce_linear_iteration`, which disables dimension reordering. I've also added a test case for non-contiguous tensors to verify this works.

This PR also significantly improves performance by adding multithreading support to the algorithm. As part of this, I wrote a custom `count_nonzero` that gives per-thread counts, which is necessary to write the outputs in the right location.
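To make the two-pass strategy described above concrete, here is a rough, self-contained sketch that uses plain `std::thread` and a 1-D float input in place of `at::parallel_for` and `TensorIterator`; the function name, the flat input, and the chunking scheme are illustrative assumptions, not the PR's actual kernel.

```cpp
// Two-pass parallel nonzero sketch: count per thread, prefix-sum the counts
// into output offsets, then let each thread write its slice independently.
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

std::vector<int64_t> nonzero_indices(const std::vector<float>& input,
                                     int num_threads) {
  const int64_t n = static_cast<int64_t>(input.size());
  const int64_t chunk = (n + num_threads - 1) / num_threads;

  // Pass 1: each thread counts the nonzeros in its own slice.
  std::vector<int64_t> counts(num_threads + 1, 0);
  {
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
      workers.emplace_back([&, t] {
        const int64_t begin = std::min<int64_t>(int64_t{t} * chunk, n);
        const int64_t end = std::min<int64_t>(begin + chunk, n);
        int64_t c = 0;
        for (int64_t i = begin; i < end; ++i) c += (input[i] != 0.0f);
        counts[t + 1] = c;
      });
    }
    for (auto& w : workers) w.join();
  }

  // Prefix sum: counts[t] becomes thread t's starting offset in the output.
  std::partial_sum(counts.begin(), counts.end(), counts.begin());

  // Pass 2: each thread writes indices starting at its own offset, so the
  // output needs no synchronization.
  std::vector<int64_t> out(counts.back());
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([&, t] {
      const int64_t begin = std::min<int64_t>(int64_t{t} * chunk, n);
      const int64_t end = std::min<int64_t>(begin + chunk, n);
      int64_t pos = counts[t];
      for (int64_t i = begin; i < end; ++i) {
        if (input[i] != 0.0f) out[pos++] = i;
      }
    });
  }
  for (auto& w : workers) w.join();
  return out;
}
```

The key design point mirrored from the PR description is that the counting pass is what makes lock-free parallel writes possible: once each thread knows exactly how many nonzeros precede its slice, it can write its results directly into the right region of the output.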