Have __sizeof__ account for size of stored elements by lantiga · Pull Request #3821 · pytorch/pytorch · GitHub

Conversation

@lantiga
Contributor

@lantiga lantiga commented Nov 22, 2017

This PR addresses #3655.

__sizeof__ now accounts for the size of data in Tensor as well as Storage:

s0 = torch.randn(0).__sizeof__()      # 80L
s10 = torch.randn(10).__sizeof__()    # 120L
s100 = torch.randn(100).__sizeof__()  # 480L
(s100 - s0) / (s10 - s0)              # 10

s0 = torch.randn(0).storage().__sizeof__()      # 40L
s10 = torch.randn(10).storage().__sizeof__()    # 80L
s100 = torch.randn(100).storage().__sizeof__()  # 440L
(s100 - s0) / (s10 - s0)                        # 10

Note that the reported size is approximate: it accounts only for the size of the data and disregards other attributes (e.g. the tensor's Size).
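
As a quick sanity check (a minimal sketch assuming the default FloatTensor, i.e. 4 bytes per element; the overhead constants vary by build), the per-element cost implied by __sizeof__ should match element_size():

import torch

overhead = torch.randn(0).__sizeof__()   # fixed object overhead
t = torch.randn(10)
per_elem = (t.__sizeof__() - overhead) / t.numel()
print(per_elem)            # 4.0 for FloatTensor
print(t.element_size())    # 4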

@colesbury
Member

This includes the size of the Storage in the tensor, but the Python documentation for sys.getsizeof says:

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

That means that the recipe for computing memory referenced in the Python documentation will overstate the memory usage:

https://code.activestate.com/recipes/577504/
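
For a concrete picture of that overstatement (a rough sketch of the double counting, not the recipe itself; the figures are illustrative), summing getsizeof over a Tensor and the Storage it refers to now counts the data bytes twice:

import sys
import torch

t = torch.randn(100)
s = t.storage()

print(sys.getsizeof(t))   # object overhead + 400 data bytes
print(sys.getsizeof(s))   # object overhead + the same 400 data bytes
print(sys.getsizeof(t) + sys.getsizeof(s))   # the data is counted twice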

I don't think this will change the result of the code snippet from the original PyTorch discussion either:
https://discuss.pytorch.org/t/data-type-when-change-tensor-array-to-numpy-array/9786

import sys
import torch

test0 = torch.rand(100, 100)
test1 = test0.numpy()
print(sys.getsizeof(test1))

test1 is a NumPy array. Its size is reported as 112 because it's a view and test1.base is intentionally not included in __sizeof__.
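
For comparison, NumPy applies the same rule (a small sketch; exact header sizes are platform- and version-dependent): an array that owns its buffer reports the buffer in __sizeof__, while a view reports only its own header:

import sys
import numpy as np

base = np.random.rand(100, 100)   # owns its 80000-byte buffer
view = base[:50]                  # a view; the buffer belongs to view.base

print(sys.getsizeof(base))   # header + 80000 bytes
print(sys.getsizeof(view))   # header only, roughly 112 bytes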

@colesbury
Member

On the other hand, perhaps including the Storage size in the Tensor's __sizeof__ is useful enough for debugging that it's not worth trying to strictly adhere to the Python documentation's spec.

@lantiga
Contributor Author

lantiga commented Nov 22, 2017

Thank you @colesbury. I'm good with sticking to the specification.
However, Storage is used explicitly so seldom in PyTorch, unlike in Lua Torch, that it feels analogous to the data buffer in NumPy.
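
To make the analogy concrete (a sketch assuming the behavior from this PR; data_ptr() is used just to show the sharing), several tensors can view one Storage, much as several NumPy arrays can share one base buffer:

import torch

t = torch.randn(4, 4)
v = t.view(16)   # a view: same underlying Storage, different shape/strides

print(t.data_ptr() == v.data_ptr())    # True, one shared buffer
print(t.__sizeof__(), v.__sizeof__())  # each includes the same 64 data bytes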

@lantiga
Contributor Author

lantiga commented Nov 22, 2017

Done

@ezyang ezyang merged commit 9b31280 into pytorch:master Nov 27, 2017
@soumith soumith added the 0.3.1 label Feb 4, 2018
soumith pushed a commit that referenced this pull request Feb 7, 2018
* Have __sizeof__ account for size of stored elements

* Conform to sizeof specification