Status: Closed
Labels: module: cuda, module: tests, triaged
Description
🐛 Bug
TestCommonCUDA.test_dtypes_matmul_cuda fails.
The failure is also seen on V100, A100, and RTX 3090 GPUs. The earliest observed failure was on 6/18/2021.
Line 5944 seems related to #60157:
pytorch/torch/testing/_internal/common_methods_invocations.py, lines 5939 to 5954 at af3f7a2:

```python
OpInfo('matmul',
       dtypes=floating_types(),
       dtypesIfCPU=all_types_and_complex(),
       dtypesIfCUDA=floating_and_complex_types_and(torch.float16, *[torch.bfloat16] if CUDA11OrLater else []),
       dtypesIfROCM=floating_types_and(torch.half, torch.bfloat16),
       backward_dtypesIfCUDA=floating_and_complex_types_and(torch.float16),
       assert_autodiffed=True,
       sample_inputs_func=sample_inputs_matmul,
       skips=(
           # FIXME: bfloat16 backward support likely depends on CUDA11+
           # and SM53+
           SkipInfo('TestCommon', 'test_dtypes', active_if=IS_WINDOWS),
           # matmul does not correctly warn when resizing out= inputs
           SkipInfo('TestCommon', 'test_out'),
           SkipInfo('TestCommon', 'test_conj_view', device_type='cpu'),
       )),
```
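Since the `dtypesIfCUDA` line already gates `torch.bfloat16` on `CUDA11OrLater`, one plausible fix is to gate the backward dtypes the same way. This is only a sketch of a possible patch, not a confirmed fix:

```python
# Hypothetical patch sketch: gate bfloat16 in the backward dtypes the same
# way the forward dtypesIfCUDA entry already does. The SM53+ condition
# mentioned in the FIXME above may also need its own runtime check.
backward_dtypesIfCUDA=floating_and_complex_types_and(
    torch.float16, *[torch.bfloat16] if CUDA11OrLater else []),
```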
To Reproduce
Steps to reproduce the behavior:
python test/test_ops.py -v -k test_dtypes_matmul_cuda
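Independent of the test harness, a minimal standalone probe (my own sketch, not test-suite code) can confirm whether matmul's bfloat16 backward actually runs on a given GPU:

```python
# Standalone probe: does matmul backward run for a given dtype on CUDA?
import torch

def backward_runs(dtype, device="cuda"):
    a = torch.randn(2, 3, device=device, dtype=dtype, requires_grad=True)
    b = torch.randn(3, 4, device=device, dtype=dtype, requires_grad=True)
    try:
        # .abs() keeps the scalar real, so backward() needs no explicit grad
        torch.matmul(a, b).sum().abs().backward()
        return True
    except RuntimeError:
        return False

# Prints True on CUDA 11 with an SM53+ GPU, matching the "detected"
# backward dtypes reported in the error message below.
print(backward_runs(torch.bfloat16))
```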
Error message:
$ python test/test_ops.py -v -k test_dtypes_matmul_cuda
Test results will be stored in test-reports/python-unittest/.home.xwang.Developer.pytorch.test.test_ops
Running tests...
----------------------------------------------------------------------
test_dtypes_matmul_cuda (__main__.TestCommonCUDA) ... FAIL (2.224s)
======================================================================
ERROR [2.224s]: test_dtypes_matmul_cuda (__main__.TestCommonCUDA)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/xwang/Developer/pytorch/torch/testing/_internal/common_utils.py", line 1051, in wrapper
method(*args, **kwargs)
File "/home/xwang/Developer/pytorch/torch/testing/_internal/common_utils.py", line 1051, in wrapper
method(*args, **kwargs)
File "/home/xwang/Developer/pytorch/torch/testing/_internal/common_device_type.py", line 380, in instantiated_test
result = test_fn(self, *args)
File "/home/xwang/Developer/pytorch/torch/testing/_internal/common_device_type.py", line 354, in test_wrapper
return test(*args, **kwargs)
File "/home/xwang/Developer/pytorch/torch/testing/_internal/common_device_type.py", line 746, in dep_fn
return fn(slf, device, *args, **kwargs)
File "/home/xwang/Developer/pytorch/torch/testing/_internal/common_device_type.py", line 894, in only_fn
return fn(self, device, *args, **kwargs)
File "/home/xwang/Developer/pytorch/test/test_ops.py", line 166, in test_dtypes
self.assertEqual(supported_backward_dtypes, claimed_backward_supported, msg=msg)
File "/home/xwang/Developer/pytorch/torch/testing/_internal/common_utils.py", line 1386, in assertEqual
super().assertEqual(x, y, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
AssertionError: Items in the first set but not the second:
torch.bfloat16 : Attempted to compare [set] types: Expected: {torch.complex128, torch.bfloat16, torch.float16, torch.float32, torch.float64, torch.complex64}; Actual: {torch.complex128, torch.float16, torch.float32, torch.complex64, torch.float64}.
The supported backward dtypes for matmul on cuda according to its OpInfo are
{torch.complex128, torch.float16, torch.float32, torch.complex64, torch.float64}, but the detected supported backward dtypes are {torch.complex128, torch.bfloat16, torch.float16, torch.float32, torch.float64, torch.complex64}.
The following backward dtypes should be added to the OpInfo: {torch.bfloat16}.
----------------------------------------------------------------------
Ran 1 test in 2.225s
FAILED (errors=1)
Generating XML reports...

Expected behavior
The test should pass.
Environment
Collecting environment information...
PyTorch version: 1.10.0a0+git01e0296
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Manjaro Linux (x86_64)
GCC version: (GCC) 10.2.0
Clang version: Could not collect
CMake version: version 3.20.2
Libc version: glibc-2.33
Python version: 3.9 (64-bit runtime)
Python platform: Linux-5.10.36-2-MANJARO-x86_64-with-glibc2.33
Is CUDA available: True
CUDA runtime version: 11.3.58
GPU models and configuration:
GPU 0: GeForce RTX 2070 SUPER
GPU 1: GeForce GTX 1070 Ti
Nvidia driver version: 460.80
cuDNN version: Probably one of the following:
/usr/lib/libcudnn.so.8.2.0
/usr/lib/libcudnn_adv_infer.so.8.2.0
/usr/lib/libcudnn_adv_train.so.8.2.0
/usr/lib/libcudnn_cnn_infer.so.8.2.0
/usr/lib/libcudnn_cnn_train.so.8.2.0
/usr/lib/libcudnn_ops_infer.so.8.2.0
/usr/lib/libcudnn_ops_train.so.8.2.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.10.0a0+git01e0296
[pip3] torchvision==0.10.0a0+7d955df
[conda] Could not collect