[Testing] Adding reference tests to `OpInfo` class by krshrimali · Pull Request #59369 · pytorch/pytorch · GitHub

Conversation

@krshrimali (Contributor) commented Jun 3, 2021

This PR adds a `ref` argument to the `OpInfo` base class. The idea is to add reference checks for all eligible ops. For more discussion, please check #58294. A minimal sketch of the idea follows the task list below.

  • Migrate (but do not yet remove) and modify helper functions from the `UnaryUfuncOpInfo` class to the `OpInfo` base class.
  • Test the reference checks on multiple ops (and decide on a list of different, eligible ops for this).
  • Handle possible edge cases (for example, uint64 exists in NumPy but isn't implemented in PyTorch, and this needs handling -- more on this later). Update: we decided that these reference tests should only check values, not types.
  • Create a sample PR for a single op (one from each category?) showing how to add reference functions to eligible ops. Update: this is being done in this PR itself.
  • Remove reference tests from test_unary_ufuncs.py and verify that nothing breaks. (Update: we won't be touching the unary ufunc reference tests in this PR.)
  • Add comments; remove unnecessary prints/comments (added for debugging).

Note: To keep the PR description short, examples of edge cases encountered have been mentioned in the comments below.
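
To make the idea concrete, here is a minimal, self-contained sketch of the kind of value check a `ref` enables. `MiniOpInfo` and `check_against_ref` are hypothetical names for illustration, not the actual `OpInfo` implementation:

import numpy as np
import torch

# Hypothetical, simplified sketch: pair a torch op with a NumPy reference
# and compare *values only* (per the update above, dtypes are not compared).
class MiniOpInfo:
    def __init__(self, name, op, ref):
        self.name = name
        self.op = op    # the torch callable under test
        self.ref = ref  # the NumPy reference implementation

def check_against_ref(opinfo, t, *args):
    actual = opinfo.op(t, *args)
    expected = opinfo.ref(t.cpu().numpy(), *args)
    # values-only comparison; dtype differences are deliberately allowed
    np.testing.assert_allclose(actual.cpu().numpy(), expected)

roll_info = MiniOpInfo('roll', torch.roll, np.roll)
check_against_ref(roll_info, torch.arange(6), 2)  # passes silently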

cc: @mruberry @pmeier @kshitij12345

@facebook-github-bot (Contributor) commented Jun 3, 2021

💊 CI failures summary and remediations

As of commit e82dd34 (more details on the Dr. CI page and at hud.pytorch.org/pr/59369):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-scanned failure(s)

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

@krshrimali marked this pull request as draft June 3, 2021 11:06
@krshrimali (Contributor, Author) commented

Examples of a few edge cases encountered so far:

  • torch.pow, ref: np.power:

NumPy does not support raising integer inputs to negative integer powers (a ValueError is raised, see the traceback below). PyTorch, on the other hand, raises no warning or error and instead returns tensor(0).

======================================================================
ERROR: test_reference_testing_pow_test_check_pow_cpu_int8 (__main__.TestOpInfoCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/krshrimali/Documents/Projects/Quansight/pytorch/torch/testing/_internal/common_device_type.py", line 292, in instantiated_test
    result = test_fn(self, *args)
  File "/home/krshrimali/Documents/Projects/Quansight/pytorch/torch/testing/_internal/common_device_type.py", line 266, in test_wrapper
    return test(*args, **kwargs)
  File "/home/krshrimali/Documents/Projects/Quansight/pytorch/torch/testing/_internal/common_utils.py", line 626, in wrapper
    fn(*args, **kwargs)
  File "/home/krshrimali/Documents/Projects/Quansight/pytorch/test/test_ops.py", line 132, in test_reference_testing
    self._test_reference_numerics(dtype, op, tensors)
  File "/home/krshrimali/Documents/Projects/Quansight/pytorch/test/test_ops.py", line 112, in _test_reference_numerics
    expected = op.ref(sample_numpy_input, *numpy_args, **numpy_kwargs)
ValueError: Integers to negative integer powers are not allowed.

----------------------------------------------------------------------
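
For contrast, the PyTorch side of the divergence (a quick repro of the behavior described above; exact formatting may vary by version):

>>> import torch
>>> torch.pow(torch.tensor(2), torch.tensor(-2))
tensor(0)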

There are 2 possible solutions to this:

  1. Don't do reference checking for pow. (OK to begin with, but there may be more ops whose behavior diverges from NumPy's, and this needs to be handled.)
  2. Add an option/decorator/argument to run these tests only for specific dtypes. (Covers more ops, but will need some time spent figuring out for which dtypes the implementation diverges between NumPy and PyTorch.)

(I can only think of these 2 right now; more suggestions and opinions are welcome.)

  • torch.split, ref: np.split:

The implementation of split diverges from NumPy's split. See #50012. For such ops, I suggest simply not adding reference checks.

So far, of the 3 ops tested (split, pow, and roll), only the above 2 have shown any divergent behavior. More ops remain to be tested.

@kshitij12345 (Collaborator) left a comment

Thanks @krshrimali! Had a cursory look. Made a few comments and suggestions.

Apologies, but I am a bit confused about the roadmap.

Will this PR try to add multiple operators, or does it plan to introduce the infrastructure, add references for a few operators, and then gradually update the remaining operators in follow-up PRs? (The second one makes more sense to me.)

@krshrimali (Contributor, Author) commented

> Will this PR try to add multiple operators, or does it plan to introduce the infrastructure, add references for a few operators, and then gradually update the remaining operators in follow-up PRs? (The second one makes more sense to me.)

Thanks, it's a good question. I'll make the description clearer, but this is how I see the roadmap (suggestions/feedback welcome):

  • This PR will just add the reference infrastructure to the `OpInfo` class.
  • But to have a usable infrastructure, I'll test it on a few ops (2-4) on my local system to make sure it comfortably covers enough cases. (I understand we'll know more once all the eligible ops use this infra.) This will not be included in the PR, though.
  • Once this PR is merged, follow-up PRs will gradually update the `OpInfo` framework (as you rightly mentioned).

Does this sound good? @kshitij12345

@kshitij12345 (Collaborator) commented

> But to have a usable infrastructure, I'll test it on a few ops (2-4) on my local system to make sure it comfortably covers enough cases. (I understand we'll know more once all the eligible ops use this infra.) This will not be included in the PR, though.

Since you'll already be testing them locally, it would make sense to add them to this PR. This would serve two purposes:

  • It will actually test the new reference tests on CI.
  • It will serve as an example to add these references so that other community members can also help you in this endeavor.

Otherwise sounds good to me. Thanks!

@krshrimali (Contributor, Author) commented

While testing other ops, here are a few more observations:

Consider the subtract function in both NumPy and PyTorch. The sample_inputs func will usually return a combination of tensors and scalars for all the possible arguments of the op; one case I found interesting is this:

>>> torch.subtract(torch.tensor(1, dtype=torch.uint8), 3)
tensor(254, dtype=torch.uint8)
>>> np.subtract(np.array(1, dtype=np.uint8), 3)
-2

The assertion that the two results are equal will obviously fail, since both the values and the dtypes differ. One solution would be to specify the dtype for such ops, as in np.subtract(1, 3, dtype=np.uint8), but this doesn't seem appropriate for these reasons:

  1. Not all NumPy ops have a dtype argument; np.amin, for example.
  2. Explicitly passing the output dtype for an op would essentially disable type assertions, since we would always be passing the dtype to the reference function.

Since this appears to be an issue specifically with scalars/numbers, one solution could be to convert each scalar to a tensor (with dtype equal to the input dtype, here uint8) and then rely on the library's type promotion logic to figure out the output dtype:

>>> np.subtract(np.array(1, dtype=np.uint8), np.array(3, dtype=np.uint8))
254

But this solution won't work for the inputs torch.tensor(1, dtype=torch.bool) and 1: since we plan to convert the scalar to a tensor (per the solution proposed above), 1 would be converted to torch.tensor(1, dtype=torch.bool) (True). Consider add here (since subtract won't work on 2 boolean inputs in either NumPy or PyTorch):

>>> torch.add(torch.tensor(1, dtype=torch.bool), 1)
tensor(2)
>>> np.add(np.array(1, dtype=np.bool_), np.array(1, dtype=np.bool_))
True

Here are a few ideas I have in mind:

  1. Only run the reference checks for non-scalar inputs. (Tested on multiple ops: argmin, argmax, amin, amax, linalg.cholesky, dot, and cdist -- scipy.spatial.distance.cdist.)
  2. Disable tests for particular dtypes for each op. These could be ignored specifically for scalar inputs rather than for all inputs.

I'm leaning towards the 1st solution (only running reference checks for non-scalar inputs), but there may be a better solution that I can't think of right now. Any inputs/suggestions are welcome. @mruberry @pmeier @kshitij12345
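
A rough, self-contained sketch of what the 1st idea could look like (has_scalar_args and maybe_check_reference are hypothetical helper names for illustration; the real test lives in test/test_ops.py):

from numbers import Number

import numpy as np
import torch

def has_scalar_args(args):
    # Python bool/int/float/complex all register as numbers.Number
    return any(isinstance(a, Number) for a in args)

def maybe_check_reference(torch_op, np_ref, t, *args):
    if has_scalar_args(args):
        return  # skip: scalar type promotion diverges across libraries
    actual = torch_op(t, *args)
    expected = np_ref(t.cpu().numpy(),
                      *(a.cpu().numpy() if isinstance(a, torch.Tensor) else a
                        for a in args))
    np.testing.assert_allclose(actual.cpu().numpy(), expected)

# Tensor-only inputs are checked; the scalar case from above is skipped.
maybe_check_reference(torch.subtract, np.subtract,
                      torch.tensor(1, dtype=torch.uint8),
                      torch.tensor(3, dtype=torch.uint8))
maybe_check_reference(torch.subtract, np.subtract,
                      torch.tensor(1, dtype=torch.uint8), 3)  # skipped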

@mruberry (Collaborator) commented Jun 9, 2021

Hey @krshrimali! Thanks for taking a close look at what references might look like, and what issues we may have when implementing references.

A few concepts for us to work with:

  • a reference doesn't have to be a function in NumPy; it can also be a function that wraps a NumPy (or SciPy) function to implement the desired behavior. Sometimes NumPy ops don't work the way we'd expect.
  • type promotion is a pain. In the future I expect we'll implement type promotion as specified by the Python Array API (https://data-apis.org/array-api/latest/API_specification/type_promotion.html?highlight=type%20promotion), but PyTorch's current type promotion is different. In the case of subtract above, I think NumPy's type promotion might be divergent from the Python Array API's, too?
  • instead of worrying about type promotion in this PR, maybe there are 1-2 simple references we can add that test the functionality this PR wants to implement?
  • in the future, we might be able to deal with type promotion discrepancies by implementing a type promotion wrapper for NumPy operators. This might use torch.result_type, for example, to pre-cast all the inputs to the NumPy operation to the appropriate dtype (a rough sketch follows this list).
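
Such a wrapper could look roughly like the sketch below (an assumed illustration, not part of this PR; torch.result_type is a real API, and restricting it to two inputs is a simplification):

import numpy as np
import torch

def with_torch_type_promotion(np_op):
    """Wrap a binary NumPy op so its inputs are pre-cast to the dtype
    PyTorch's type promotion would produce (simplified two-input sketch)."""
    scalar_types = (bool, int, float, complex)
    def wrapped(a, b):
        # torch.result_type accepts tensors and Python scalars directly
        ta = a if isinstance(a, scalar_types) else torch.as_tensor(a)
        tb = b if isinstance(b, scalar_types) else torch.as_tensor(b)
        result_dtype = torch.result_type(ta, tb)
        np_dtype = torch.empty(0, dtype=result_dtype).numpy().dtype
        return np_op(np.asarray(a, dtype=np_dtype), np.asarray(b, dtype=np_dtype))
    return wrapped

ref_subtract = with_torch_type_promotion(np.subtract)
# Now matches torch.subtract(torch.tensor(1, dtype=torch.uint8), 3) -> tensor(254)
print(ref_subtract(np.array(1, dtype=np.uint8), 3))  # 254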

Does that make sense? What are your thoughts?

@mruberry (Collaborator) commented Jun 20, 2021

Made some simplifying comments, @krshrimali. Once they're incorporated I think this PR should be close to being ready; be sure to check that the lint issue is resolved, too. Other failures look unrelated.

@krshrimali (Contributor, Author) commented Jun 20, 2021

> Made some simplifying comments, @krshrimali. Once they're incorporated I think this PR should be close to being ready; be sure to check that the lint issue is resolved, too. Other failures look unrelated.

Thanks @mruberry for the review. I've addressed the comments and revised this PR to test amax, amin, and roll (amax and amin cover the lambda-wrapper example, while roll is a simple op with np.roll as its reference). Waiting for the CI to pass.
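
For context, the lambda-wrapper pattern mentioned here could look something like this sketch (torch.amax's dim/keepdim arguments map onto np.amax's axis/keepdims):

import numpy as np

# Sketch of a lambda-style reference: a thin wrapper rather than np.amax
# itself, translating torch-style argument names to NumPy's.
amax_ref = lambda a, dim=None, keepdim=False: np.amax(a, axis=dim, keepdims=keepdim)

print(amax_ref(np.arange(6).reshape(2, 3), dim=1))  # [2 5]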

@mruberry (Collaborator) commented

Remember to flake8 locally before submitting with a command like python -m flake8 <file>.

This has a conflict that needs to be resolved.

@krshrimali (Contributor, Author) commented

> Remember to flake8 locally before submitting with a command like python -m flake8 <file>.

Sure, thanks for the suggestion! Will take care of this.

> This has a conflict that needs to be resolved.

Thanks, I've resolved this @mruberry.

@mruberry (Collaborator) commented

Mac build should be fixed now; would you rebase this to retrigger that and the other tests? Sorry about that

@krshrimali (Contributor, Author) commented

> Mac build should be fixed now; would you rebase this to retrigger that and the other tests? Sorry about that

Done, thanks @mruberry! :)

@krshrimali (Contributor, Author) commented

The test failure looks unrelated to this PR. The rest of the tests seem to pass; please let me know if there are any suggested changes whenever you find time. Thanks, @mruberry! :)

@mruberry (Collaborator) left a comment

Nice work, @krshrimali!

@facebook-github-bot (Contributor) commented

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented

@mruberry merged this pull request in 0802022.

@krshrimali (Contributor, Author) commented

Thanks @mruberry, @kshitij12345, @pmeier and @rgommers for helping review the PR. 🎉

facebook-github-bot pushed a commit that referenced this pull request Jul 15, 2021
Summary:
This PR adds `complex64` dtype testing, following conversation from pytorch/xla#3019. Original PR that added OpInfo reference testing: #59369.

cc: mruberry kshitij12345

Pull Request resolved: #61627

Reviewed By: gchanan

Differential Revision: D29710560

Pulled By: mruberry

fbshipit-source-id: 55b2e5ff47f031069335a0c75a45d4f4885ef9ac