KEMBAR78
Add mutex for CPU RNG and move TH to C++ by yf225 · Pull Request #4041 · pytorch/pytorch · GitHub
Skip to content

Conversation

@yf225
Copy link
Contributor

@yf225 yf225 commented Dec 5, 2017

This fixes #3794.

Manually tested and there is no perf regression coming from the change.

@colesbury
Copy link
Member

This will break the Windows build and is not exception safe. We should use std::mutex from C++11 instead of pthread mutexes. That will probably require changing at least THTensorRandom.c to C++.

@colesbury
Copy link
Member

To clarify on the exception issue: a lot of functions in TH can call THError, including the TH_TENSOR_APPLY macros. In PyTorch, THError throws a C++ exception. If that happens while the pthread lock is held, it would not be released. That would lead to deadlocks if the random function were called again.

I suspect that moving all of TH to C++ would be a pain. You can avoid the C++ types in the THRandom.h header by using an opaque struct in the header file.

Copy link
Member

@soumith soumith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see @colesbury 's comments

@soumith
Copy link
Member

soumith commented Jan 23, 2018

@yf225 can you finish this this week, i think folks are blocked on it.

@yf225
Copy link
Contributor Author

yf225 commented Jan 23, 2018

@colesbury What do we think of using different generators for multi-threads (as @Stonesjtu suggested in #3794 (comment))? Perf-wise it might be faster than using mutex because we are not waiting for the lock at all, not sure about implementation yet.

@soumith
Copy link
Member

soumith commented Jan 23, 2018

@yf225 the PR corresponding to that was closed. We should just finish this as it currently stands and merge.

@apaszke
Copy link
Contributor

apaszke commented Jan 23, 2018

I think thread local generators are not a good idea because it’s easy to forget to seed a worker thread

@yf225 yf225 force-pushed the rng_mutex branch 2 times, most recently from 285e369 to c76ce0d Compare January 24, 2018 01:24

This comment was marked as off-topic.

This comment was marked as off-topic.

@yf225
Copy link
Contributor Author

yf225 commented Jan 24, 2018

@colesbury It seems that if I do it as an opaque struct pointer and use new to initialize the struct in the THRandom.cpp file, then I am not sure if I should call delete on this pointer in THGenerator_free, because this THGenerator can be copied elsewhere and deleting the pointer here will destroy the mutex reference in those copies.

I guess moving TH to C++ land might be a cleaner way to go, although the code change would be quite big.

@ezyang
Copy link
Contributor

ezyang commented Jan 24, 2018

Another issue is that if you want to actually acquire the mutex from C you are going to have to work hard to get unwinding to actually free it when necessary.

This comment was marked as off-topic.

@colesbury
Copy link
Member

This still has the issue that it's not exception safe. An exception thrown after the lock call will not release the lock, causing deadlocks the next time a random function is called.

We should covert the C files to C++ and use std::lock_guard.

@colesbury
Copy link
Member

We don't need to convert all of TH in one go. I think converting THTensorRandom.c to .cpp should be sufficient.

@yf225
Copy link
Contributor Author

yf225 commented Jan 24, 2018

@colesbury I think if we put the std::mutex in THRandom.h, then any file that includes this header would need to be converted to C++ as well?

@yf225
Copy link
Contributor Author

yf225 commented Jan 24, 2018

@colesbury If we change THTensorRandom.c to .cpp, we probably also need to change THTensor.c (that contains #include "generic/THTensorRandom.c") to .cpp, and that will require us to change many of other dependent TH files to C++ as well.

Haven't figured out what's the appropriate solution to this yet, or if adding try-catch block to catch THError would be a feasible approach.

@colesbury
Copy link
Member

@yf225, yes. I think your current trick in THRandom.cpp can work. Move the definition of mutex_s to a separate header file that's included by THRandom.cpp and THTensorRandom.cpp. THRandom.h can keep the opaque definition.

As Adam Paszke points out, you can also simplify the definition of mutex_s a bit.

@colesbury
Copy link
Member

I don't think a try-catch block is a good solution. We should be moving towards using C++, even if it requires more work.

@yf225
Copy link
Contributor Author

yf225 commented Jan 25, 2018

After changing things to C++, https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/test/scalar_tensor_test.cpp#L214 seems to fail when lhs is empty and rhs is a tensor with single element 0.

The error message is:

terminate called after throwing an instance of 'std::runtime_error'
 what():  invalid argument 2: out of range: 0 out of 0 at /private/home/willfeng/pytorch/aten/src/TH/generic/THTensorMath.cpp:382
../aten/tools/run_tests.sh: line 14: 23523 Aborted                 $BUILD_ROOT/src/ATen/test/scalar_tensor_test

In the normal case, TRY_CATCH_ELSE is expected to catch this std::runtime_error thrown from THTensorMath.cpp, but it wasn't the case here. I manually tested it and it seems that the catch block was not entered at all.

Any suggestions on how I should go about debugging this?

EDIT: fixed in #4857.

@yf225
Copy link
Contributor Author

yf225 commented Jan 26, 2018

There are a lot of unused variable warnings in TH after moving things to C++. I can fix them here or submit another PR for it.

@yf225 yf225 changed the title Add mutex for CPU RNG Add mutex for CPU RNG and move TH to C++ Jan 26, 2018
@yf225
Copy link
Contributor Author

yf225 commented Jan 29, 2018

@colesbury @apaszke Wondering is there anything else I should do for this PR?

self->left = 1;
self->seeded = 0;
self->normal_is_valid = 0;
self->mutex = new std::mutex();

This comment was marked as off-topic.


void THTensor_(geometric)(THTensor *self, THGenerator *_generator, double p)
{
std::lock_guard<std::mutex> lock(*(_generator->mutex));

This comment was marked as off-topic.

THArgCheck(THTensor_(isContiguous)(self), 1, "RNG state needs to be contiguous");
rng_state = (THGenerator *)THTensor_(data)(self);
THGenerator_copy(rng_state, _generator);
rng_state->mutex = NULL; // mutex should not be part of the generator state

This comment was marked as off-topic.

_PS256_CONST_TYPE(inv_mant_mask, int, ~0x7f800000);

_PS256_CONST_TYPE(sign_mask, int, 0x80000000);
_PS256_CONST_TYPE(sign_mask, int, (int)0x80000000);

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

THArgCheck(THTensor_(isContiguous)(self), 1, "RNG state needs to be contiguous");
rng_state = (THGenerator *)THTensor_(data)(self);
THGenerator_copy(rng_state, _generator);
rng_state->mutex = NULL; // mutex should not be part of the generator state

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

Copy link
Member

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. Three comments inline.

_PS256_CONST_TYPE(inv_mant_mask, int, ~0x7f800000);

_PS256_CONST_TYPE(sign_mask, int, 0x80000000);
_PS256_CONST_TYPE(sign_mask, int, (int)0x80000000);

This comment was marked as off-topic.

{
memcpy(self, from, sizeof(THGenerator));
THGeneratorState_copy(&self->gen_state, &from->gen_state);
new (&self->mutex) std::mutex();

This comment was marked as off-topic.

{
static const size_t size = sizeof(THGenerator);
THGenerator *rng_state;
static const size_t size = sizeof(THGeneratorState);

This comment was marked as off-topic.

Copy link
Member

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@soumith soumith merged commit 96239dd into pytorch:master Jan 31, 2018
@yf225 yf225 mentioned this pull request Jan 31, 2018
ssnl added a commit to ssnl/pytorch that referenced this pull request Jan 31, 2018
soumith pushed a commit that referenced this pull request Jan 31, 2018
* Revert "Clarify grad_input_mask documentation in derivatives.yaml (#4963)"

This reverts commit 6f3266b.

* Revert "fix triu and tril for zero-strided inputs on gpu (#4962)"

This reverts commit 6c197c2.

* Revert "Add mutex for CPU RNG and move TH to C++ (#4041)"

This reverts commit 96239dd.

* Revert "Support multivariate TransformedDistributions (#4937)"

This reverts commit ca5071d.

* Revert "Only check that arguments are Variables in VariableType (#4943)"

This reverts commit d444379.

* Revert "torch.set_num_threads sets MKL option too (#4949)"

This reverts commit 2aaeec0.
@alsrgv
Copy link

alsrgv commented Feb 1, 2018

I think this breaks C FFI plugins since those are built with C, rather than C++. Specifically, I get this error:

packages/torch/utils/ffi/../../lib/include/TH/THRandom.h:6:17: fatal error: mutex: No such file or directory
     #include <mutex>
                     ^
    compilation terminated.
    error: command 'gcc' failed with exit status 1

@alsrgv
Copy link

alsrgv commented Feb 1, 2018

CC @yf225 @colesbury @soumith ^^

@ezyang
Copy link
Contributor

ezyang commented Feb 1, 2018

Drat! I guess, if we want to support FFI, we're going to need some C extern blocks and a C-compatible header for FFI plugins to import T_T

@alsrgv
Copy link

alsrgv commented Feb 1, 2018

Yeah, I actually wanted to write my header file in C++, but parser didn't support extern "C" and/or namespace constructs.

@apaszke
Copy link
Contributor

apaszke commented Feb 1, 2018

There’s a new C++-based extensions system underway. It might be simpler to recommend moving to that, but we should figure out an easy way to migrate (since this is breaking)

@alsrgv
Copy link

alsrgv commented Feb 1, 2018

@apaszke, I'd love to try new C++ extension out when it's ready. That said, would you support the C way for a while for backward compatibility?

@soumith
Copy link
Member

soumith commented Feb 1, 2018

yes we will support the C extension. We'll issue fixes shortly on this.

@soumith
Copy link
Member

soumith commented Feb 2, 2018

@alsrgv now fixed in master

@alsrgv
Copy link

alsrgv commented Feb 2, 2018

@soumith, thanks for the quick turnaround!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

make CPU RNG thread-safe

7 participants