Fix underflow issue with dirichlet sample #17488
Conversation
```diff
 BaseSampler<accscalar_t, decltype(normal_lambda)> standard_normal(normal_lambda);
 auto sample = sample_gamma<scalar_t, accscalar_t, decltype(uniform_lambda), decltype(normal_lambda)>(alpha, standard_uniform, standard_normal);
-auto min_value = std::numeric_limits<scalar_t>::lowest();
+auto min_value = std::numeric_limits<scalar_t>::min();
```
I think this was a mistake, since `lowest` would be a negative value. I was seeing 0s on CUDA.
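To make the distinction concrete: for floating-point types, `std::numeric_limits<T>::lowest()` is the most negative finite value, while `std::numeric_limits<T>::min()` is the smallest positive normalized value. A minimal stand-alone sketch (not the PR's actual code; `clamp_above` is a hypothetical helper mimicking the clamp in the sampler):

```cpp
#include <cassert>
#include <limits>

// Clamp a sample from below, as the gamma sampler does to keep results
// strictly positive. With lowest() as the bound (a huge negative number),
// a degenerate sample of 0 passes through unchanged; with min() (the
// smallest positive normalized float), it is pushed above zero.
template <typename scalar_t>
scalar_t clamp_above(scalar_t sample, scalar_t min_value) {
  return sample < min_value ? min_value : sample;
}
```

For example, `clamp_above(0.0f, std::numeric_limits<float>::lowest())` stays `0.0f`, while `clamp_above(0.0f, std::numeric_limits<float>::min())` is strictly positive, which matches the 0s observed on CUDA before the fix.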
Looks like the test failure wasn't totally fixed.

Thanks for pointing that out. I couldn't see this on mac, but I am getting this error on my linux workstation. Will debug and push a fix for this.

That was a legitimate error.

Looks like the latest test failures are real:

It seems like we might have a bias in the number of samples towards 0s and 1s. I have decreased the precision, but just noting that the imbalance issue reported in #15738 for GPU isn't fixed with this PR and still exists for certain builds. I'll need more help debugging that, because I can't seem to reproduce it.

The recent failures are unrelated to this PR; they are build issues.

@fritzo I'll merge this on your say-so.
LGTM. Thanks for fixing this!
@ezyang I am happy with the tests and math, but my C++ and CUDA are rusty. I see no errors, but I defer to you for C++/CUDA nitpicking.
@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@li-roy We can't land this patch until BC is restored.

@neerajprad You don't need to do anything, but this diff no longer applies on master due to a BC-breaking change to

I see that this was recently changed in #17527. Should I push a fix for this?

@ezyang In the meantime, this can just be changed to `ret.scalar_type()`, I think.

@pytorchbot rebase this please
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Good to go.
Summary: Addresses #15738, using fritzo's suggestion. This adds a `torch._sample_dirichlet` method in `Distributions.cpp` and `Distributions.cu`.
- For CPU, this leads to no perf hit, since all we do is promote `alpha` to double when getting the gamma samples (the gamma sampler uses `accscalar_t`, which is double on CPU) and cast back to float32 on return.
- I have added an analogous method for CUDA as well, but the default sampler for CUDA uses `scalar_t` for efficiency, so I have kept it as that. With this, I no longer see the bias towards 1 reported in #15738 with `float32`, but there is a spurious mode at 0.5, as would be expected. Users would need to explicitly use `float64` on GPU to avoid the spurious mode at 0.5. (EDIT: see note below; the bias issue appears to persist for certain builds.)

Added some tests and checked that there is no perf regression. My experience with C++ is very limited, so apologies in advance if I missed something basic. cc. ailzhang, fritzo, fmassa

Pull Request resolved: pytorch/pytorch#17488
Differential Revision: D14410301
Pulled By: ezyang
fbshipit-source-id: 62b2f694b4642685eab06db96d74ce28e05c3992
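The CPU-side idea described in the summary (draw the gamma samples in double precision, normalize, then cast back to float32) can be sketched in stand-alone form. This is a simplified illustration, not PyTorch's actual kernel: `std::gamma_distribution` stands in for PyTorch's `sample_gamma`, and `sample_dirichlet` is a hypothetical helper name.

```cpp
#include <cmath>
#include <random>
#include <vector>

// Sketch of a Dirichlet sampler that accumulates in double to avoid the
// float32 underflow that produced biased samples: each component is a
// gamma draw in double precision, normalized by the double-precision sum,
// and only the final ratio is cast back to float32.
std::vector<float> sample_dirichlet(const std::vector<float>& alpha,
                                    std::mt19937& gen) {
  std::vector<double> gammas(alpha.size());
  double total = 0.0;
  for (size_t i = 0; i < alpha.size(); ++i) {
    // Promote alpha to double before sampling, mirroring the CPU path.
    std::gamma_distribution<double> g(static_cast<double>(alpha[i]), 1.0);
    gammas[i] = g(gen);
    total += gammas[i];
  }
  std::vector<float> out(alpha.size());
  for (size_t i = 0; i < alpha.size(); ++i) {
    out[i] = static_cast<float>(gammas[i] / total);  // cast back to float32
  }
  return out;
}
```

Because the normalization happens in double, small gamma draws that would round to 0 in float32 still contribute a strictly positive ratio, so the resulting components stay in (0, 1) and sum to 1 up to float rounding.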