Fix cuBLAS arguments for fp16 dot by apaszke · Pull Request #3660 · pytorch/pytorch · GitHub

Conversation

@apaszke
Copy link
Contributor

@apaszke apaszke commented Nov 12, 2017

The result type has to be fp16 for an fp16 dot product. See the docs of cublasDotEx (look for "datatypes combinations currently supported").


 #ifdef CUDA_HALF_TENSOR
-float THCudaBlas_Hdot(THCState *state, int64_t n, half *x, int64_t incx, half *y, int64_t incy)
+half THCudaBlas_Hdot(THCState *state, int64_t n, half *x, int64_t incx, half *y, int64_t incy)
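The diff above changes the return type of THCudaBlas_Hdot from float to half, matching the cublasDotEx documentation, which lists fp16 inputs with an fp16 result (and fp32 internal compute) as the supported half-precision combination. As an illustrative sketch only (NumPy, not the cuBLAS call itself), the same convention appears in NumPy, where a dot product of two fp16 vectors yields an fp16 scalar rather than an fp32 one:

```python
import numpy as np

# A dot product of two fp16 vectors produces an fp16 scalar,
# mirroring the corrected result type of THCudaBlas_Hdot.
x = np.array([1.0, 2.0, 3.0], dtype=np.float16)
y = np.array([4.0, 5.0, 6.0], dtype=np.float16)

result = np.dot(x, y)
print(result, result.dtype)  # 32.0 float16
```

Returning float from the wrapper while cuBLAS writes a half into the result buffer would misinterpret the bits, which is why the signature change matters and not just the cuBLAS argument list.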


@apaszke
Copy link
Contributor Author

apaszke commented Nov 29, 2017

@soumith can you please review again? I added tests for CUDA half <-> CPU float comparison.

@soumith soumith merged commit 6ae0d47 into master Nov 29, 2017
@soumith
Copy link
Member

soumith commented Nov 29, 2017

looks good!

@colesbury colesbury deleted the half_dot_fix branch December 2, 2017 21:27
colesbury added a commit to colesbury/pytorch that referenced this pull request Feb 1, 2018
The test_cuda.py setup purports to test half tensors, but actually just
re-tests FloatTensors because the keys in type_map were str instead of
type. Testing HalfTensors is more complicated, requiring changes to
precision and requires excluding some unimplemented methods.

We should fully test half CUDA tensors. This change just deletes the
duplicate tests of FloatTensor.
@soumith soumith added the 0.3.1 label Feb 4, 2018
colesbury added a commit that referenced this pull request Feb 7, 2018
The test_cuda.py setup purports to test half tensors, but actually just
re-tests FloatTensors because the keys in type_map were str instead of
type. Testing HalfTensors is more complicated, requiring changes to
precision and requires excluding some unimplemented methods.

We should fully test half CUDA tensors. This change just deletes the
duplicate tests of FloatTensor.
soumith pushed a commit that referenced this pull request Feb 7, 2018
* Fix cuBLAS arguments for fp16 dot

* Enable FloatTensor <-> CUDA HalfTensor checks in test_cuda.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

4 participants