KEMBAR78
Fix use after free when advanced indexing tensors with tensors by zou3519 · Pull Request #4559 · pytorch/pytorch · GitHub
Skip to content

Conversation

@zou3519
Copy link
Contributor

@zou3519 zou3519 commented Jan 9, 2018

When indexing tensors with tensors, like the following code, a use after free occurs. This doesn't actually impact anything in practice (nothing crashes or errors out) but is theoretically undefined behavior.

Advanced indexing on Variables doesn't exhibit this behavior because it's a different codepath.

Sample code:

import torch
x = torch.randn(2, 2)
i = torch.Tensor([1]).long()
x[i,:]

Explanation

In THPTensor_(_convertToTensorIndexers), a vector<THPIndexTensor> is created by constructing THPTensors from sequences/tensors/etc. Each THPIndexTensor is then freed with the following:

for (auto& idx : indexers) {
  THIndexTensor_(free)(LIBRARY_STATE idx->cdata);
  Py_DECREF(idx);
}

This is a problem because Py_DECREF(idx) will turn idx->ob_refcnt to 0 since this function created the relevant THPIndexTensors and owns them, causing THPTensor_(dealloc) to be called. THPTensor_(dealloc) already has a line that calls THIndexTensor_(free)(LIBRARY_STATE idx->cdata).

So THIndexTensor_(free)(LIBRARY_STATE idx->cdata) gets called twice on the same cdata. After the first call frees cdata, the second attempts to access flags/members of cdata to determine if it should free it.

In `THPTensor_(_convertToTensorIndexers)`, a `vector<THPIndexTensor>` is
created by constructing `THPTensor`s from sequences/tensors/etc. Each
`THPIndexTensor` is then freed with the following:

```
for (auto& idx : indexers) {
  THIndexTensor_(free)(LIBRARY_STATE idx->cdata);
  Py_DECREF(idx);
}
```

This is a problem because `Py_DECREF(idx)` will turn `idx->ob_refcnt` to 0 since this function
created the relevant `THPIndexTensor`s and owns them, causing `THPTensor_(dealloc)` to be
called. `THPTensor_(dealloc)` already has a line that calls
`THIndexTensor_(free)(LIBRARY_STATE idx->cdata)`.

So `THIndexTensor_(free)(LIBRARY_STATE idx->cdata)` gets called twice on the same
`cdata`. After the first call frees `cdata`, the second attempts to access flags/members of `cdata` to
determine if it should free it.
// Clean up Indexers
for (auto& idx : indexers) {
THIndexTensor_(free)(LIBRARY_STATE idx->cdata);
Py_DECREF(idx);

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

@soumith soumith merged commit 7b31d33 into pytorch:master Jan 11, 2018
@soumith soumith added the 0.3.1 label Feb 7, 2018
soumith pushed a commit that referenced this pull request Feb 7, 2018
In `THPTensor_(_convertToTensorIndexers)`, a `vector<THPIndexTensor>` is
created by constructing `THPTensor`s from sequences/tensors/etc. Each
`THPIndexTensor` is then freed with the following:

```
for (auto& idx : indexers) {
  THIndexTensor_(free)(LIBRARY_STATE idx->cdata);
  Py_DECREF(idx);
}
```

This is a problem because `Py_DECREF(idx)` will turn `idx->ob_refcnt` to 0 since this function
created the relevant `THPIndexTensor`s and owns them, causing `THPTensor_(dealloc)` to be
called. `THPTensor_(dealloc)` already has a line that calls
`THIndexTensor_(free)(LIBRARY_STATE idx->cdata)`.

So `THIndexTensor_(free)(LIBRARY_STATE idx->cdata)` gets called twice on the same
`cdata`. After the first call frees `cdata`, the second attempts to access flags/members of `cdata` to
determine if it should free it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants