KEMBAR78
Cache bleach.Cleaner for faster conversion of markdown to safe html by bmd3k · Pull Request #3599 · tensorflow/tensorboard · GitHub
Skip to content

Conversation

@bmd3k
Copy link
Contributor

@bmd3k bmd3k commented May 7, 2020

  • Motivation for features / changes

Calls to tensorboard.dev's /data/plugin/scalars/tags endpoint can be surprisingly slow if there are many tags.

The following call spends ~3.5 seconds in the code that converts tag description markdown into safe html, even though the description tags are empty:

http://tensorboard.dev/experiment/UlN9loBoTwGczi5FlWbYEQ/data/plugin/scalars/tags

It is the call to bleach.clean() that takes up almost all of this time because it is always reinstantiating bleach.Cleaner.

  • Technical description of changes

Similar to recent changes that cache Markdown instances per thread, we now cache bleach.Cleaner instances per thread.

bmd3k added 2 commits May 6, 2020 11:54
Avoid expensive initialization of bleach Cleaner. Saves about 3.5
seconds of processing time in the following call on tensorboard.dev:

/experiment/UlN9loBoTwGczi5FlWbYEQ/data/plugin/scalars/tags
@bmd3k bmd3k marked this pull request as ready for review May 7, 2020 13:44
@bmd3k bmd3k requested a review from bileschi May 7, 2020 13:45
Copy link
Collaborator

@bileschi bileschi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome thanks!

Copy link
Contributor

@nfelt nfelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If empty descriptions were part of the problem here, maybe it's worth just adding an early exit to markdown_to_safe_html that skips the markdown conversion and bleach clean entirely and returns the empty string? That does seem like a very common case for descriptions that's worth optimizing for, since then we could potentially avoid ever initializing these objects in most request threads.

@bmd3k
Copy link
Contributor Author

bmd3k commented May 7, 2020

If empty descriptions were part of the problem here, maybe it's worth just adding an early exit to markdown_to_safe_html that skips the markdown conversion and bleach clean entirely and returns the empty string? That does seem like a very common case for descriptions that's worth optimizing for, since then we could potentially avoid ever initializing these objects in most request threads.

I was considering that as a next step. The change as is should considerably improve performance both for empty descriptions and non-empty descriptions. In my limited testing the early-exit-on-empty-markdown approach gains a little more performance improvement but not much.

@bmd3k bmd3k merged commit 3f58947 into tensorflow:master May 7, 2020
caisq pushed a commit to caisq/tensorboard that referenced this pull request May 19, 2020
…ensorflow#3599)

Similar to recent changes that cache Markdown instances per thread, we now cache bleach.Cleaner instances per thread.

Calls to tensorboard.dev's /data/plugin/scalars/tags endpoint can be surprisingly slow if there are many tags.

The following call spends ~3.5 seconds in the code that converts tag description markdown into safe html, even though the description tags are empty:

http://tensorboard.dev/experiment/UlN9loBoTwGczi5FlWbYEQ/data/plugin/scalars/tags

It is the call to bleach.clean() that takes up almost all of this time because it is always reinstantiating bleach.Cleaner.
@caisq caisq mentioned this pull request May 20, 2020
caisq pushed a commit that referenced this pull request May 27, 2020
…3599)

Similar to recent changes that cache Markdown instances per thread, we now cache bleach.Cleaner instances per thread.

Calls to tensorboard.dev's /data/plugin/scalars/tags endpoint can be surprisingly slow if there are many tags.

The following call spends ~3.5 seconds in the code that converts tag description markdown into safe html, even though the description tags are empty:

http://tensorboard.dev/experiment/UlN9loBoTwGczi5FlWbYEQ/data/plugin/scalars/tags

It is the call to bleach.clean() that takes up almost all of this time because it is always reinstantiating bleach.Cleaner.
@bmd3k bmd3k deleted the cache-bleach-cleaner branch June 23, 2021 14:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants