[ROCM] enable fft tests by micmelesse · Pull Request #60313 · pytorch/pytorch · GitHub

Conversation

@micmelesse (Contributor) commented Jun 18, 2021

This PR enables fft tests on ROCM. It contains a function that generates valid input for fft tests that call hipfftExecC2R or hipfftExecZ2D. With this helper function, we are able to fix a number of fft tests. This brings to a close the series of fft PRs enabling fft tests on ROCM.
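
A minimal sketch of the idea behind that helper, with a hypothetical signature (the actual _generate_valid_rocfft_input in the PR may differ):

import numpy as np
import torch

def _generate_valid_rocfft_input(shape, dtype=np.float64):
    # Sketch only: complex-to-real transforms (hipfftExecC2R / hipfftExecZ2D)
    # expect complex-Hermitian input, so build one by applying a forward
    # real-to-complex FFT to a real-valued signal of the requested shape.
    np_input_real = np.random.randn(*shape).astype(dtype)
    # rfftn output of a real array is Hermitian-symmetric by construction.
    return torch.from_numpy(np.fft.rfftn(np_input_real))

(As discussed later in this thread, torch.from_numpy always yields a CPU tensor; #61073 moves this generation to torch so the device is preserved.)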

@facebook-github-bot (Contributor) commented Jun 18, 2021

💊 CI failures summary and remediations

As of commit 9aa3713 (more details on the Dr. CI page and at hud.pytorch.org/pr/60313):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@github-actions github-actions bot added the module: rocm AMD GPU support for Pytorch label Jun 18, 2021
@micmelesse micmelesse marked this pull request as ready for review June 23, 2021 16:21
@codecov bot commented Jun 23, 2021

Codecov Report

Merging #60313 (9aa3713) into master (8cba365) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #60313      +/-   ##
==========================================
- Coverage   76.23%   76.23%   -0.01%     
==========================================
  Files        2059     2059              
  Lines      204892   204892              
==========================================
- Hits       156201   156200       -1     
- Misses      48691    48692       +1     

@jeffdaily (Collaborator) left a comment

LGTM - CI is passing

@malfet (Contributor) left a comment

If the test input is not valid for some reason, please fix it in the codepath for all accelerators.
Otherwise, raise unittest.SkipTest if complex input is not supported by rocFFT.
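
For reference, the SkipTest mechanism referred to here looks roughly like the following; the condition is only a placeholder (and, as clarified below, rocFFT does in fact support complex input):

import unittest
import torch

def _skip_if_unsupported_on_rocfft(config_is_supported: bool):
    # Raising unittest.SkipTest inside a test marks it as skipped rather than
    # failed; torch.version.hip is non-None only on ROCm builds.
    if torch.version.hip is not None and not config_is_supported:
        raise unittest.SkipTest("input configuration not supported by rocFFT")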

@micmelesse (Contributor, Author) replied, quoting the review above:

If the test input is not valid for some reason, please fix it in the codepath for all accelerators.
Otherwise, raise unittest.SkipTest if complex input is not supported by rocFFT.

@malfet rocFFT supports complex input. What that check does is determine whether the input will invoke hipfftExecC2R or hipfftExecZ2D and, if it does, apply a certain modification to the input.
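
In other words, the check boils down to detecting when a complex-to-real transform would be dispatched (single-precision complex input maps to hipfftExecC2R, double-precision to hipfftExecZ2D); a hypothetical sketch of such a predicate:

import torch

def _dispatches_to_c2r(op_name: str, input: torch.Tensor) -> bool:
    # Hypothetical sketch: complex-to-real FFT ops with complex input end up
    # in hipfftExecC2R (complex64) or hipfftExecZ2D (complex128) on ROCm and
    # therefore need complex-Hermitian input.
    c2r_ops = {"fft.irfft", "fft.irfft2", "fft.irfftn", "fft.hfft"}
    return op_name in c2r_ops and input.is_complex()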

micmelesse and others added 2 commits June 24, 2021 13:07
@micmelesse (Contributor, Author)

@malfet I added your suggestions and everything seems to pass.

@facebook-github-bot (Contributor)

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor)

@malfet merged this pull request in ce232e7.

# generate Hermitian symmetric input using rfftn
rfft_output = np.fft.rfftn(np_input_real)

return torch.from_numpy(rfft_output)
@peterbell10 (Collaborator) commented on this snippet:

@micmelesse from_numpy will always return a CPU tensor, right? So this never runs any tests on rocm.
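
(For illustration: torch.from_numpy shares memory with the numpy array, so the result is always a CPU tensor regardless of which device the test targets.)

import numpy as np
import torch

t = torch.from_numpy(np.fft.rfftn(np.random.randn(4, 4)))
print(t.device)  # cpu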

@micmelesse (Contributor, Author) replied:

@peterbell10 I didn't notice that in the documentation. Hopefully, the op forces it to the GPU. Either way, I will create a PR today to explicitly force the tensor to the GPU.

@peterbell10 (Collaborator) replied:

The ops do not force inputs to the correct device or dtype; they just call the corresponding torch function.

@micmelesse (Contributor, Author) replied:

This was fixed in #61073.

facebook-github-bot pushed a commit that referenced this pull request Jul 13, 2021
Summary:
This PR fixes a bug in #60313 where the tensors generated by _generate_valid_rocfft_input are on the CPU instead of the GPU. The bug was due to generating tensors with numpy and converting them to PyTorch with torch.from_numpy, which always leaves the generated tensors on the CPU. We now generate the tensors with PyTorch itself, which carries over the device of the input tensors to the generated tensor.

Pull Request resolved: #61073

Reviewed By: H-Huang

Differential Revision: D29668418

Pulled By: malfet

fbshipit-source-id: ce2025c26d079c15603a89b9bf7878f48d73155e
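
Based on that summary, a hedged sketch of the #61073 approach (the exact code there may differ): generate the Hermitian-symmetric input with torch.fft instead of numpy so the result inherits the requested device.

import torch

def _generate_valid_rocfft_input(shape, dtype, device):
    # Sketch: building the input with torch keeps it on `device`
    # (cpu, cuda, or rocm), unlike the numpy + torch.from_numpy path.
    real_dtype = torch.float32 if dtype == torch.complex64 else torch.float64
    real_signal = torch.randn(shape, dtype=real_dtype, device=device)
    # rfftn of a real tensor yields a complex-Hermitian half-spectrum.
    return torch.fft.rfftn(real_signal)
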
@jithunnair-amd (Collaborator)

@peterbell10 Are you the owner of these fft unit tests? I'd like to request that the "workaround" in this PR be considered a proper fix for the inputs to these unit tests. The original unit test inputs were actually exercising undefined behaviour even according to the cuFFT documentation, as noted here: https://docs.nvidia.com/cuda/cufft/index.html#fft-types

Note: Complex-to-real (C2R) transforms accept complex-Hermitian input, which requires the 0th element (and the N/2th element if N is even) to be real-valued, i.e. its imaginary part should be zero. Otherwise, the behavior of the transform is undefined.

Given that, wouldn't it be an improvement to the unit tests to provide inputs that have defined behaviour instead of relying on undefined behaviour that could change down the line?

If needed, we can go over the technical details in a call.

cc @malfet @micmelesse
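
A quick illustration of the constraint quoted above, using torch.fft (illustrative only): for a length-N real signal, the half-spectrum produced by rfft has real-valued 0th (DC) and N/2th (Nyquist, when N is even) bins, which is exactly the kind of input for which a C2R transform is well-defined.

import torch

n = 8
spectrum = torch.fft.rfft(torch.randn(n))        # Hermitian half-spectrum of a real signal
print(spectrum[0].imag, spectrum[n // 2].imag)   # ~0: DC and Nyquist bins are real
# Arbitrary complex values in those bins would fall under the undefined
# behaviour described in the cuFFT documentation quoted above.
print(torch.fft.irfft(spectrum, n=n).shape)      # torch.Size([8])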
