Fix refcycles in DataParallel scatter and gather by zou3519 · Pull Request #4988 · pytorch/pytorch · GitHub

zou3519 commented Feb 1, 2018

Addresses #4865

DataParallel's scatter and gather methods contain reference cycles; this PR removes them. I think this is the explanation.
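To illustrate how an inner helper can end up in a reference cycle, here is a minimal sketch. It is not the PyTorch source: `make_recursive_map` is a hypothetical name, and `scatter_map` is only a stand-in for a recursive inner function. Because the inner function calls itself, it looks its own name up through a closure cell, and that cell in turn holds the function.

```python
def make_recursive_map():
    # Hypothetical stand-in for a recursive inner helper; not PyTorch's code.
    def scatter_map(obj):
        if isinstance(obj, (list, tuple)):
            # The recursive call resolves `scatter_map` through a closure cell.
            return [scatter_map(o) for o in obj]
        return obj
    return scatter_map


fn = make_recursive_map()
cell = fn.__closure__[0]

# The chain is: function -> __closure__ -> cell -> the same function.
points_back = cell.cell_contents is fn
```

Plain reference counting cannot free such a pair once the enclosing call returns; it has to wait for the cyclic garbage collector, which also delays freeing anything else the closure captured.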

There's something up with Python 2.7 that creates reference cycles when a module is replicated, so the unit test I wrote is only for Python 3. As far as I can tell, those reference cycles are created internally by Python so I don't think there's anything we can do about that.

Test Plan

A unit test covers the multi-GPU behavior under Python 3.

zou3519 changed the title from "Dataparallel leak" to "Fix refcycles in DataParallel scatter and gather" on Feb 1, 2018

zou3519 changed the title from "Fix refcycles in DataParallel scatter and gather" to "[wip] Fix refcycles in DataParallel scatter and gather" on Feb 1, 2018
zou3519 changed the title from "[wip] Fix refcycles in DataParallel scatter and gather" to "Fix refcycles in DataParallel scatter and gather" on Feb 5, 2018

zou3519 commented Feb 5, 2018

I added a better fix: it turns out that setting the inner function to None before returning clears the closure cell that takes part in the reference cycle. See this for an explanation.

    try:
        return scatter_map(inputs)
    finally:
        scatter_map = None
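The effect of the `try`/`finally` pattern can be checked with a weak reference. The sketch below is an illustration under assumptions, not the code merged in this PR: `scatter_like`, `inner_function_freed`, and the `clear_inner` flag are hypothetical names added so that both behaviors can be compared with the cyclic garbage collector switched off.

```python
import gc
import weakref


def scatter_like(inputs, clear_inner):
    """Hypothetical stand-in for a scatter-style helper; not PyTorch's code."""
    def scatter_map(obj):
        if isinstance(obj, (list, tuple)):
            return [scatter_map(o) for o in obj]
        return obj

    # A weak reference lets the caller see whether the inner function survives.
    probe = weakref.ref(scatter_map)
    try:
        return scatter_map(inputs), probe
    finally:
        if clear_inner:
            # Emptying the cell breaks the function <-> cell cycle.
            scatter_map = None


def inner_function_freed(clear_inner):
    """Return True if the inner function is freed by refcounting alone."""
    gc.collect()
    gc.disable()  # rule out the cycle collector during the check
    try:
        _result, probe = scatter_like([1, (2, 3)], clear_inner)
        return probe() is None
    finally:
        gc.enable()
```

With `clear_inner=True` the inner function should be released as soon as the call returns; with `clear_inner=False` it lingers until a collection pass runs, which is the behavior the fix is meant to avoid.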

soumith merged commit 885c874 into pytorch:master on Feb 5, 2018
soumith added the 0.3.1 label on Feb 5, 2018
soumith pushed a commit that referenced this pull request Feb 7, 2018
* Eliminate reference cycles in scatter_gather

* Test for refcycles

* Better fix

* Add comments