KEMBAR78
libshm retry on EINTR by ssnl · Pull Request #15964 · pytorch/pytorch · GitHub
Skip to content

Conversation

@ssnl
Copy link
Collaborator

@ssnl ssnl commented Jan 11, 2019

fixes #14314

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@soumith
Copy link
Member

soumith commented Jan 11, 2019

@ssnl a test errored out on OSX:

Jan 11 10:22:01 test_batch_sampler (__main__.TestDataLoader) ... ERROR

@ssnl
Copy link
Collaborator Author

ssnl commented Jan 12, 2019

@soumith Fixed!

//
// All functions used in `libshm` (so far) indicate error by returning `-1`. If
// you want to use a function with a different error reporting mechanism, you
// need to port `SYSCHECK` from `torch/lib/c10d/Utils.hpp`.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This refers to the modified macro in #15986

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Jan 15, 2019
Summary:
In #15964, I learned that `errno` is only meaningful if the function call fails. E.g., on some macos, a successful `fork()` sets `errno` to `EINVAL` in child process. This commit changes the `SYSCALL` macro so error checking is only done when an error happens. This means checking whether `rv == -1` for most calls, but is checking `rv == nullptr` for `inet_ntop`.

Now `SYSCALL` accepts a second argument `success_cond`, which should be an expression returning whether the call succeeded. `SYSCHECK_ERR_RETURN_NEG1` is the shorthand for checking if rv is `-1`.

Any suggestion on better macro names is welcomed.
Pull Request resolved: #15986

Reviewed By: janewangfb

Differential Revision: D13661790

Pulled By: pietern

fbshipit-source-id: 9551b14b9f88805454a7bfb8e4d39e0f3aed8131
@ssnl ssnl deleted the libshm_eintr branch January 16, 2019 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

libshm doesn't retry on EINTR at all

4 participants