KEMBAR78
Make mkldnn Stream object thread_local and enable mkldnn thread-safe by bdaskalov · Pull Request #17022 · pytorch/pytorch · GitHub
Skip to content

Conversation

@bdaskalov
Copy link
Contributor

This PR fixes following issue: #16828

It is a combination of two things:

  1. MKLDNN streams are not thread-safe but are currently shared between different threads. This change makes them thread_local
  2. By default MKLDNN primitives can share global memory and can't be invoked from multiple threads. This PR enables the MKLDNN_ENABLE_CONCURRENT_EXEC cmake configuration option that makes them thread-safe.

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what this means if we link against an MKLDNN that was build by someone else? Because it seems like this is a build time parameter.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@bdaskalov
Copy link
Contributor Author

@ezyang Looking at the MKLDNN source code it seems like the only place this is used is in scratchpad.cpp where it changes the underlying implementation. I don't think there is a way to know how a prebuild implementation is configured since it is not exposed anywhere in the public interface or the headers.

Can you retry the failing CI runs? They all seem to be broken for unreleated reasons but I don't have the rights to rerun them.

Also I already backported this change to 1.0.1. I can make a PR for that too if you guys plan on doing a bugfix release?

zdevito pushed a commit to zdevito/ATen that referenced this pull request Feb 14, 2019
…(#17022)

Summary:
This PR fixes following issue: pytorch/pytorch#16828

It is a combination of two things:
1) MKLDNN streams are not thread-safe but are currently shared between different threads. This change makes them thread_local
2) By default MKLDNN primitives can share global memory and can't be invoked from multiple threads. This PR enables the MKLDNN_ENABLE_CONCURRENT_EXEC cmake configuration option that makes them thread-safe.
Pull Request resolved: pytorch/pytorch#17022

Differential Revision: D14069052

Pulled By: ezyang

fbshipit-source-id: f8f7fcb86c40f5d751fb35dfccc2f802b6e137c6
@soumith
Copy link
Member

soumith commented Feb 14, 2019

@bdaskalov we dont plan to do v1.0.2, the next will be v1.1 which will be cut off of master.

@bdaskalov
Copy link
Contributor Author

bdaskalov commented Feb 14, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants