Remove dependency on numpy for serialization for XLA/open registration devices without numpy by mikaylagawarecki · Pull Request #137444 · pytorch/pytorch · GitHub

Conversation

@mikaylagawarecki
Contributor

@mikaylagawarecki mikaylagawarecki commented Oct 7, 2024

Related: pytorch/xla#7799 (comment)

Follow ups: Do the same for maia and mtia

Motivation

With the move to weights_only by default, we are making an explicit decision not to allowlist GLOBALs required to deserialize numpy tensors by default. The implication is that backends relying on numpy for serialization will fail loudly when torch.load flips weights_only.

However, this dependency on numpy is legacy and is not actually needed anymore, so we can remove it, which aligns with our `weights_only` strategy.
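For context, the `weights_only` mechanism restricts which pickle GLOBALs may be resolved during `torch.load`. The idea can be sketched with the stdlib alone; the class name and allowlist below are illustrative stand-ins, not PyTorch's actual implementation:

```python
import io
import pickle

class AllowlistUnpickler(pickle.Unpickler):
    """Resolve only globals that appear in an explicit allowlist."""
    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) not in self.ALLOWED:
            raise pickle.UnpicklingError(
                f"GLOBAL {module}.{name} is not allowlisted"
            )
        return super().find_class(module, name)

# Plain containers use dedicated pickle opcodes, so no GLOBAL lookup happens:
payload = pickle.dumps({"weights": [1.0, 2.0]})
print(AllowlistUnpickler(io.BytesIO(payload)).load())

# A payload that needs a GLOBAL outside the allowlist fails loudly, which is
# what a backend whose tensors pickle through numpy types would hit:
import fractions
bad = pickle.dumps(fractions.Fraction(1, 2))
try:
    AllowlistUnpickler(io.BytesIO(bad)).load()
except pickle.UnpicklingError as e:
    print("rejected:", e)
```

Removing the numpy dependency means these backends never emit a numpy GLOBAL in the first place, so nothing needs to be allowlisted.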

Why is this ok?

The following comment, which explains why numpy was chosen for serialization, is legacy:

pytorch/torch/_tensor.py

Lines 303 to 312 in c87c9f0

# Note: Numpy array is chosen to be the rebuild component for XLA, MTIA, MAIA Tensors.
# We considered a few options:
# 1. CPU tensor can't be used here.
#    Otherwise in torch.load CPU storage is reconstructed with randomly
#    initialized data, moved onto the backend device, and then the storage is
#    updated to the serialized content. This works perfectly for CPU/CUDA but
#    not for these backends; their tensors are disconnected from storage, so
#    they don't get the update.
# 2. A Python list is not a good fit for performance reasons.
#    `tolist()` converts every single element in the tensor into a Python
#    object and serializes them one by one.
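Point 2 above can be illustrated with the stdlib: pickling a Python list serializes every element as a separate object, whereas a contiguous buffer (here `array.array`, standing in for a tensor's storage) is dumped as raw bytes in one shot. The sizes below are only a proxy; the bigger cost of `tolist()` is creating one Python object per element.

```python
import pickle
from array import array

values = array("d", range(100_000))      # 100k float64s in one contiguous buffer

as_list = pickle.dumps(values.tolist())  # one pickled object per element
as_buffer = pickle.dumps(values)         # the buffer is serialized as raw bytes

print(len(as_list), len(as_buffer))      # the buffer form is more compact
```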

We no longer do the following, though it was the case five years ago in the PR that added this comment:

> CPU storage is reconstructed with randomly initialized data, moved onto backend device, and then storage is updated to the serialized content

Instead, CPU storage is now constructed with the data from the file and then moved onto the backend device.
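The current flow can be sketched with plain `pickle`. Every name below is a hypothetical stand-in (a toy "device tensor"), not PyTorch's actual API, but the shape of the rebuild is the same: serialize a CPU copy of the data, and on load reconstruct it CPU-side before moving it to the device.

```python
import pickle

class DeviceTensor:
    """Illustrative stand-in for a backend (e.g. XLA) tensor."""
    def __init__(self, data, device):
        self.data = list(data)   # pretend this lives on an accelerator
        self.device = device

    def cpu_copy(self):
        return list(self.data)   # materialize the contents on the CPU host

    def __reduce__(self):
        # Serialize the CPU copy plus the device name; no numpy involved.
        return (_rebuild_from_cpu, (self.cpu_copy(), self.device))

def _rebuild_from_cpu(cpu_data, device):
    # CPU data is read straight from the file, then "moved" onto the device.
    return DeviceTensor(cpu_data, device)

t = DeviceTensor([1.0, 2.0, 3.0], device="xla:0")
t2 = pickle.loads(pickle.dumps(t))
print(t2.data, t2.device)  # [1.0, 2.0, 3.0] xla:0
```

This sidesteps the storage-update problem from option 1 in the old comment: the device tensor is built directly from fully initialized CPU data, so nothing needs to be patched after the fact.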

Old behavior (legacy_load): https://github.com/ailzhang/pytorch/blob/67adda891a839691790a0dcd99062430050eff3b/torch/serialization.py#L620

Stack from ghstack (oldest at bottom):

@pytorch-bot

pytorch-bot bot commented Oct 7, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/137444

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9c0b63b with merge base f80ed0b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

mikaylagawarecki added a commit that referenced this pull request Oct 7, 2024
…n devices without numpy

ghstack-source-id: d2fd34d
Pull Request resolved: #137444
@mikaylagawarecki mikaylagawarecki added release notes: python_frontend python frontend release notes category topic: not user facing topic category labels Oct 7, 2024
Collaborator

@albanD albanD left a comment


Sounds good!

mikaylagawarecki added a commit that referenced this pull request Oct 9, 2024
…n devices without numpy

ghstack-source-id: 7883a35
Pull Request resolved: #137444
@mikaylagawarecki
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 9, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

mikaylagawarecki added a commit that referenced this pull request Oct 16, 2024
…tion"

See rationale in #137444 description

mikaylagawarecki added a commit that referenced this pull request Oct 16, 2024
See rationale in #137444 description

mikaylagawarecki added a commit that referenced this pull request Oct 22, 2024
…tion"

See rationale in #137444 description

mikaylagawarecki added a commit that referenced this pull request Oct 22, 2024
See rationale in #137444 description

mikaylagawarecki added a commit that referenced this pull request Oct 25, 2024
…tion"

See rationale in #137444 description

mikaylagawarecki added a commit that referenced this pull request Oct 25, 2024
See rationale in #137444 description
pytorchmergebot pushed a commit that referenced this pull request Oct 28, 2024
See rationale in #137444 description

Pull Request resolved: #137600
Approved by: https://github.com/albanD
rahulsingh-intel pushed a commit to rahulsingh-intel/pytorch that referenced this pull request Oct 29, 2024
rahulsingh-intel pushed a commit to rahulsingh-intel/pytorch that referenced this pull request Nov 5, 2024
@github-actions github-actions bot deleted the gh/mikaylagawarecki/267/head branch November 9, 2024 02:02
nautsimon added a commit to nautsimon/pytorch that referenced this pull request Jul 9, 2025
Summary:
NumPy-based tensor rebuilding from serialization has been deprecated by other backends (e.g. [XLA](pytorch#137444)). The new flow constructs CPU storage with data from the file and then moves it to the target backend device.

Furthermore, relying on numpy for serialization will fail loudly when torch.load flips weights_only.

Reviewed By: andyanwang

Differential Revision: D77843238
pytorchmergebot pushed a commit that referenced this pull request Jul 11, 2025
Summary:
NumPy-based tensor rebuilding from serialization has been deprecated by other backends (e.g. [XLA](#137444)). The new flow constructs CPU storage with data from the file and then moves it to the target backend device.

Furthermore, relying on numpy for serialization will fail loudly when torch.load flips weights_only.

Reviewed By: andyanwang

Differential Revision: D77843238

Pull Request resolved: #157884
Approved by: https://github.com/albanD
