Upgrade to DLPack 1.0. #145000

Conversation
See artifacts and rendered test results at hud.pytorch.org/pr/145000. Note: links to docs will display an error until the docs builds have been completed.

As of commit c95743c with merge base 2dfc0e3: ❌ 1 new failure, 1 pending.
This PR makes the necessary changes in order to upgrade PyTorch DLPack
support to version 1.0. In summary, we add support for the following:
- Support both `DLManagedTensor` and `DLManagedTensorVersioned` when
producing and consuming DLPack capsules
- New parameter for `__dlpack__` method: `max_version`
- Version checks:
  - Fall back to the old implementation if `max_version` is not given, or if
    the requested version is lower than 1.0
  - Check that the to-be-consumed capsule is of version up to 1.X
In order to accommodate these new specifications, this PR adds the
following main changes:
- `torch._C._to_dlpack_versioned` Python API (Module.cpp): new Python
API for creating a versioned DLPack capsule (called by `__dlpack__`
method)
- `DLPackTraits<T>` class (DLConvertor.h): select the correct
  traits (e.g. capsule name, conversion functions) depending on which
  DLPack tensor class is being used
- `toDLPackImpl<T>` function (DLConvertor.cpp): populates the
  common fields of both classes
- `fromDLPackImpl<T>` function (DLConvertor.cpp): constructs a tensor
  from a DLPack capsule
- `fillVersion<T>` function (DLConvertor.cpp): populates the version
  field for `DLManagedTensorVersioned` (no-op for `DLManagedTensor`)
- `tensor_fromDLPackImpl<T>` function (tensor_new.cpp): outer function
  for constructing a tensor out of a DLPack capsule that also marks the
  capsule as used
ghstack-source-id: 3ca1169
Pull Request resolved: #145000
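The version negotiation the description outlines can be sketched in plain Python. This is an illustrative model only (the real logic lives in C++ and `torch/_tensor.py`); `export_dlpack` and the exporter callables are stand-ins invented for this sketch, not PyTorch APIs.

```python
def export_dlpack(tensor, max_version=None, *, legacy_export, versioned_export):
    """Pick the capsule flavor based on the consumer's `max_version`.

    `legacy_export` / `versioned_export` are stand-in callables for the
    DLManagedTensor and DLManagedTensorVersioned paths, respectively.
    """
    # No `max_version`, or a requested major version below 1: fall back
    # to the old (unversioned) path, as the description above specifies.
    if max_version is None or max_version[0] < 1:
        return legacy_export(tensor)
    # Consumer understands DLPack >= 1.0: produce the versioned capsule.
    return versioned_export(tensor)


# Tiny demo with exporters that just tag the result with the capsule name
# each class uses ("dltensor" vs. "dltensor_versioned" per the DLPack spec).
legacy = lambda t: ("dltensor", t)
versioned = lambda t: ("dltensor_versioned", t)

print(export_dlpack([1.0], None, legacy_export=legacy, versioned_export=versioned)[0])
print(export_dlpack([1.0], (0, 8), legacy_export=legacy, versioned_export=versioned)[0])
print(export_dlpack([1.0], (1, 0), legacy_export=legacy, versioned_export=versioned)[0])
```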
ghstack-source-id: e58ba67
Pull Request resolved: #145000
ghstack-source-id: 063107b
Pull Request resolved: #145000
Change sounds ok, but what are the expectations in terms of BC when interacting with libraries that haven't upgraded to the latest DLPack yet? Is it ok for all Tensors to be of the new version?
aten/src/ATen/DLConvertor.h
Outdated
```cpp
TORCH_API Tensor fromDLPack(DLManagedTensor* src);
TORCH_API Tensor
fromDLPack(DLManagedTensor* src, std::function<void(void*)> deleter);
TORCH_API DLManagedTensorVersioned* toDLPack(const Tensor& src);
```
Is this API used by C++ libraries? This would be a BC-breaking change for these users right?
I think it's being used by PyTorch/XLA. Yes, it's definitely BC-breaking. I was thinking that, since DLPack 1.0 should be the new default, the old version should have the name with a suffix. However, now that you brought this up, not being BC-breaking sounds more important.
In summary, I will change the names so that we are not BC-breaking.
aten/src/ATen/DLConvertor.h
Outdated
```cpp
TORCH_API DLManagedTensor* toDLPackUnversioned(const Tensor& src);
TORCH_API Tensor fromDLPack(
    DLManagedTensorVersioned* src,
    std::optional<std::function<void(void*)>> deleter = std::nullopt);
```
`std::function` is optional by default (it has an empty state) and doesn't require `std::optional`, right?
torch/__init__.py
Outdated
```diff
     solve,
 )
-from torch.utils.dlpack import from_dlpack, to_dlpack
+from torch.utils.dlpack import from_dlpack, to_dlpack, to_dlpack_unversioned
```
While I understand this matches current code, I don't think we want to have this as a new top level API? Being part of torch.utils.dlpack is enough?
That comment sounds right - I don't think it should be needed. For the introduction strategy to a versioned protocol, see the explanation and prototype code (if max_version is None: ....) at https://data-apis.org/array-api/latest/API_specification/generated/array_api.array.__dlpack__.html
If it's useful/needed to have a Python function to use here so that can be called from within Tensor.__dlpack__, then making it a private function by prepending an underscore should be fine I think.
torch/csrc/Module.cpp
Outdated
```diff
     METH_NOARGS,
     nullptr},
 {"_to_dlpack", THPModule_toDLPack, METH_O, nullptr},
+{"_to_dlpack_unversioned", THPModule_toDLPackUnversioned, METH_O, nullptr},
```
Why do we still need this one?
This does look necessary to me. For DLPack to change the ABI once, there's a dance that needs doing to be not-super-disruptive: continue returning the old (0.8) version, unless the consumer indicates it can handle the new (1.X) version by passing in max_version=(1, 0) (or (1, x) in the future).
```python
try:
    # Try running __dlpack__ while specifying `max_version` argument.
    dlpack = ext_tensor.__dlpack__(**kwargs)
except TypeError:
```
Is it guaranteed that they will fail with TypeError?
That should be the case - it's the standard error in Python for using a non-existing keyword.
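A minimal, self-contained illustration of that fallback (the classes here are toy stand-ins, not PyTorch types): calling a pre-1.0 `__dlpack__` with the new keyword raises `TypeError`, which the consumer catches to retry with the old convention.

```python
class OldProducer:
    # Pre-1.0 signature: no `max_version` keyword.
    def __dlpack__(self, stream=None):
        return "legacy-capsule"


class NewProducer:
    # 1.0-aware signature.
    def __dlpack__(self, stream=None, max_version=None):
        if max_version is not None and max_version >= (1, 0):
            return "versioned-capsule"
        return "legacy-capsule"


def consume(ext_tensor):
    try:
        # Try running __dlpack__ while specifying the `max_version` argument.
        return ext_tensor.__dlpack__(max_version=(1, 0))
    except TypeError:
        # Producer predates DLPack 1.0: retry with the old calling convention.
        return ext_tensor.__dlpack__()


print(consume(OldProducer()))   # legacy-capsule
print(consume(NewProducer()))   # versioned-capsule
```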
Thanks for tackling this @ysiraichi! This looks like a good start. A few high-level comments:
- As noted inline, `__dlpack__` gained 3 new keywords; so far this adds only `max_version`. I suspect it's safer to add all at once, because other libraries are probably going to assume that if `max_version` is present, the 1.0 support is complete.
- It would be good to have new Python-level tests in test/test_dlpack.py. That will also make the changes in logic easier to review.
- There's one testing TODO for a very old numpy version there that may be nice to take along (Line 239 in e9f6e27):
  `# TODO: add interchange tests once NumPy 1.22 (dlpack support) is required`
- I think this is inactionable right now, but adding for completeness: DLPack gained a new `DLPACK_FLAG_BITMASK_READ_ONLY` field, which in the future can feed into PyTorch's copy-on-write (COW) feature.
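For completeness, a sketch of how a consumer might test that flag bit. The constant's value here (bit 0) is assumed to match the DLPack header; how PyTorch would map the flag onto COW is not part of this PR.

```python
# Assumed to match dlpack.h: #define DLPACK_FLAG_BITMASK_READ_ONLY (1UL << 0UL)
DLPACK_FLAG_BITMASK_READ_ONLY = 1 << 0


def is_read_only(flags: int) -> bool:
    """Check the read-only bit in DLManagedTensorVersioned's `flags` field."""
    return bool(flags & DLPACK_FLAG_BITMASK_READ_ONLY)


print(is_read_only(0))                              # False
print(is_read_only(DLPACK_FLAG_BITMASK_READ_ONLY))  # True
```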
torch/_tensor.py
Outdated
```
max_version (tuple[int, int] or None): An optional Python tuple with
    2 integers, representing the maximum version the caller supports. If
    None is passed, then PyTorch will fallback to DLPack 0.X, where versions
```
I'd rephrase as "If None (default), PyTorch will use DLPack 0.8".
torch/_tensor.py
Outdated
```diff
     __torch_dispatch__ = _C._disabled_torch_dispatch_impl

-    def __dlpack__(self, stream=None):
+    def __dlpack__(self, stream=None, max_version=None):
```
Note that there are also new dl_device and copy keywords (API docs).
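The full 1.0-era signature, per the array API standard the comment refers to, takes all four as keyword-only arguments. A stub for reference (the body is not an implementation; PyTorch's own signature may differ):

```python
import inspect


class ArrayLike:
    # Keyword names per the array API standard's DLPack protocol; stub body.
    def __dlpack__(self, *, stream=None, max_version=None, dl_device=None, copy=None):
        raise NotImplementedError


print(list(inspect.signature(ArrayLike.__dlpack__).parameters))
# ['self', 'stream', 'max_version', 'dl_device', 'copy']
```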
@albanD @rgommers Since you have already reviewed this PR, I was thinking about adding the new […]. Let me know if that works for you.

@ysiraichi a stack with two separate PRs, with the […]

Separate PR on the stack sounds ok.
ghstack-source-id: 2b774ce
Pull Request resolved: #145000
Gentle ping @ysiraichi, any chance we can get this work wrapped up in the near future? Note that we just tagged DLPack v1.1: https://github.com/dmlc/dlpack/releases/tag/v1.1, which added a few more dtype enums.

Yeah. I'm still working on this. Have 2 more PRs to be opened.

@pytorchbot successfully started a revert job. Check the current status here.

This reverts commit 6e185c5. Reverted #145000 on behalf of https://github.com/atalman due to failing internal tests ([comment](#145000 (comment)))

@ysiraichi your PR has been successfully reverted.

@atalman Could you share the internal errors?

I believe this needs to be imported internally and landed internally, since we are missing some internal changes. The error is during compilation; I believe there is code that's using older constructs:

I see.

This is an executorch build issue for some reason.

LLM proposes a fix, which is kind of sus; not sure if this fixes the problem. NARRATOR: It did not work.
aten/src/ATen/DLConvertor.h
Outdated
```cpp
TORCH_API ScalarType toScalarType(const DLDataType& dtype);
TORCH_API DLManagedTensor* toDLPack(const Tensor& src);
TORCH_API Tensor fromDLPack(DLManagedTensor* src);
TORCH_API DLManagedTensorVersioned* toDLPackVersioned(const Tensor& src);
```
This fixes internal failures:
```diff
-TORCH_API DLManagedTensorVersioned* toDLPackVersioned(const Tensor& src);
+TORCH_API struct DLManagedTensorVersioned* toDLPackVersioned(const Tensor& src);
```
cc @ysiraichi, could you please try the suggested change? I can help import this PR internally and make sure the signal is green before merging.
Done.
Thanks!
I don't have a devserver on hand, so trying to go with the regular land!
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

Merge failed. Reason: 1 job has failed: trunk / win-vs2022-cpu-py3 / test (default, 3, 3, windows.4xlarge.nonephemeral). Details for Dev Infra team: raised by workflow job.

@pytorchbot merge -i

Merge started. Your change will be merged while ignoring the following 1 check: trunk / win-vs2022-cpu-py3 / test (default, 3, 3, windows.4xlarge.nonephemeral). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

This landed smoothly, thanks @ysiraichi for the change. We can merge the rest of the stack now!

Just want to follow up with #145000 (comment) that we have 1.1, which comes with F8 and F4 data types; it would be great to get help on landing support for them, hopefully as a matter of updating […]

I'm sorry, but I'm not able to work on this anymore. Feel free to pick this up.
Stack from ghstack (oldest at bottom):

- `BufferError` for DLPack buffer-related errors. #150691

cc @ezyang @gchanan