Add tf.stop_gradient in tf.summary.histogram by allenlavoie · Pull Request #5311 · tensorflow/tensorboard

Conversation

@allenlavoie
Member

  • Motivation for features / changes

Working around a user issue. It's intended as a temporary fix, but keeping it long-term doesn't have any downsides that I know of.

  • Technical description of changes

Adds tf.stop_gradient in tf.summary.histogram. The function is not differentiable, and building gradient graphs for its internal tf.cond ops has been causing some issues. This should be a no-op for most users, possibly with a tiny positive performance impact, and it works around an error for those using XLA with multiple/persistent tf.GradientTapes (XlaDynamicUpdateSlice does not have a gradient defined; there is a separate TensorFlow bug tracking the missing gradient, and defining one is the more satisfying medium-term solution). A rough sketch of the change is included at the end of this description.

  • Screenshots of UI changes

N/A; this just changes the implementation of a summary API.

  • Detailed steps to verify changes work correctly (as executed by you)

Two folks (at Alphabet) have confirmed that this works around an issue they reported for now. It's a bit tricky to reproduce; we might be able to do a unit test with XLA+CPU in the TensorBoard repo, although the original report relates to TPU usage.

  • Alternate designs / implementations considered

The op should have a gradient defined, and will eventually.
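
For illustration, here is a rough sketch of the change (a toy stand-in, not the actual TensorBoard implementation; `histogram_sketch` and its simplified bucketing are assumptions made for the example):

```python
import tensorflow as tf

def histogram_sketch(data, nbins=30):
    """Toy stand-in for tf.summary.histogram's bucketing step."""
    # The summary now detaches its input, so a tf.GradientTape never records
    # the bucketing ops (tf.cond, dynamic-update-slice, ...) and never needs
    # gradients for them. The summary is not differentiable anyway.
    data = tf.stop_gradient(tf.reshape(tf.cast(data, tf.float64), [-1]))
    value_range = [tf.reduce_min(data), tf.reduce_max(data)]
    return tf.histogram_fixed_width(data, value_range, nbins=nbins)

x = tf.Variable([1.0, 2.0, 3.0])
with tf.GradientTape(persistent=True) as tape:
    loss = tf.reduce_sum(x * x)
    _ = histogram_sketch(x)  # recording stops at tf.stop_gradient

print(tape.gradient(loss, x))  # [2., 4., 6.] -- training gradients are unaffected
```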

@google-cla google-cla bot added the cla: yes label Sep 14, 2021
Contributor

@stephanwlee stephanwlee left a comment

The change seems logical. Wondering, is this unique to histograms, or does it affect other Summary APIs as well? Also, would you mind giving a test a stab?

we might be able to do a unit test with XLA+CPU in the TensorBoard repo

If this is not true, I am okay with just moving forward as is.

@allenlavoie
Member Author

Sure, added a test. It's quite weird since summaries can't actually run with XLA CPU/GPU (no outside compilation support), but it does fail before the stop_gradient change.

As far as I know the issue is limited to histograms; it's the combination of a particular slice op and tf.cond. But I haven't looked thoroughly. Like I said, the more satisfying solution is to define a gradient for the op; this change just buys some time.
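
Not the test from this PR, just a toy illustration of why detaching the input buys time: everything downstream of tf.stop_gradient is treated as a constant by the tape, so no gradient is ever requested for ops on the summary path (including ops with no registered gradient).

```python
import tensorflow as tf

x = tf.constant([1.0, 2.0, 3.0])
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    detached = tf.stop_gradient(x)  # what the summary now does internally
    summary_side = tf.reduce_max(detached) - tf.reduce_min(detached)
    loss = tf.reduce_sum(x * x)

print(tape.gradient(loss, x))          # [2., 4., 6.] -- unchanged
print(tape.gradient(summary_side, x))  # None -- the tape sees no path through stop_gradient
```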

@stephanwlee stephanwlee merged commit 54ef54b into tensorflow:master Sep 15, 2021
@nfelt
Contributor

nfelt commented Sep 21, 2021

As far as I know the issue is limited to histograms; it's the combination of a particular slice op and tf.cond. But I haven't looked thoroughly. Like I said, the more satisfying solution is to define a gradient for the op; this change just buys some time.

FWIW, we're trying to simplify this op for TPU-related reasons anyway (#2885), so that would presumably also address the issue if it removes the offending slice + cond combination.

yatbear pushed a commit to yatbear/tensorboard that referenced this pull request Mar 27, 2023
dna2github pushed a commit to dna2fork/tensorboard that referenced this pull request May 1, 2023