util: add `timing.log_latency` by wchargin · Pull Request #3490 · tensorflow/tensorboard · GitHub

Conversation

@wchargin
Contributor

@wchargin wchargin commented Apr 6, 2020

Summary:
This patch adds a decorator and context manager for latency profiling.
Any file can just add `@timing.log_latency` to a function definition or
introduce a `with timing.log_latency("some_name_here"):` context
anywhere. The output structure reveals the nesting of the call graph.

The implementation is thread-safe, and prints the current thread’s name
and ID so that the logs can easily be disentangled. (Basically, just
find a line that you’re interested in, note the thread ID, and pass the
log through `grep THREAD_ID` to get a coherent picture.)

I’ve added this as a global build dep on `//tensorboard` so that it’s
easy to patch timing in for debugging without worrying about build deps.
Any `log_latency` uses that are actually committed should be accompanied
by a strict local dep.

This patch combines previous independent drafts by @nfelt and @wchargin.

Test Plan:
Simple API usage tests included. As a manual test, adding latency
decoration to a few places in the text plugin’s Markdown rendering
pipeline generates output like this:

```
I0417 13:31:01.586319 140068322420480 timing.py:118] Thread-2[7f64329a2700] ENTER markdown_to_safe_html
I0417 13:31:01.586349 140068322420480 timing.py:118] Thread-2[7f64329a2700]   ENTER Markdown.convert
I0417 13:31:01.586455 140068322420480 timing.py:118] Thread-2[7f64329a2700]   LEAVE Markdown.convert - 0.000105s elapsed
I0417 13:31:01.586492 140068322420480 timing.py:118] Thread-2[7f64329a2700]   ENTER bleach.clean
I0417 13:31:01.586783 140068322420480 timing.py:118] Thread-2[7f64329a2700]   LEAVE bleach.clean - 0.000288s elapsed
I0417 13:31:01.586838 140068322420480 timing.py:118] Thread-2[7f64329a2700] LEAVE markdown_to_safe_html - 0.000520s elapsed
```

This is only emitted when --verbosity 0 is passed, not by default.

Co-authored-by: Nick Felt <nfelt@users.noreply.github.com>
wchargin-branch: timing-log-latency

Summary:
This patch adds a decorator and context manager for latency profiling.
Any file can just add `@timing.log_latency` to a function definition or
introduce a `with timing.log_latency("some_name_here"):` context
anywhere. The output structure reveals the nesting of the call graph.

The implementation is thread-safe, and prints the current thread’s name
and ID so that the logs can easily be disentangled. (Basically, just
find a line that you’re interested in, note the thread ID, and pass the
log through `grep THREAD_ID` to get a coherent picture.)
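
For example, assuming the timing output has been captured to a file (the name `tensorboard.log` here is purely illustrative, as are the thread IDs), filtering to one thread’s trace looks like:

```shell
# Build a small interleaved log in the shape that timing.py emits, then
# filter it down to a single thread's trace by grepping for its ID.
cat > tensorboard.log <<'EOF'
Thread-1[7f00aaaa0000] ENTER markdown_to_safe_html
Thread-2[7f00bbbb0000] ENTER bleach.clean
Thread-1[7f00aaaa0000] LEAVE markdown_to_safe_html - 0.000520s elapsed
Thread-2[7f00bbbb0000] LEAVE bleach.clean - 0.000288s elapsed
EOF
grep 7f00aaaa0000 tensorboard.log
```

This prints only the two Thread-1 lines, giving a coherent per-thread picture.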

I’ve added this as a global build dep on `//tensorboard` so that it’s
easy to patch timing in for debugging without worrying about build deps.
Any `log_latency` uses that are actually committed should be accompanied
by a strict local dep.

This patch combines previous independent drafts by @nfelt and @wchargin.

Test Plan:
No automated tests included; I didn’t feel that they’d be worth their
weight. As a manual test, adding latency decoration to a few places in
the text plugin’s Markdown rendering pipeline generates output thus:

```
I0406 15:49:41.078654 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER text_array_to_html
I0406 15:49:41.078930 139740956759808 timing.py:90] Thread-2[7f17fa169700]   ENTER markdown_to_safe_html
I0406 15:49:41.079075 139740956759808 timing.py:90] Thread-2[7f17fa169700]     ENTER _markdown.convert
I0406 15:49:41.079684 139740956759808 timing.py:98] Thread-2[7f17fa169700]     EXIT _markdown.convert - 0.000507s elapsed
I0406 15:49:41.079845 139740956759808 timing.py:90] Thread-2[7f17fa169700]     ENTER bleach.clean
I0406 15:49:41.081181 139740956759808 timing.py:98] Thread-2[7f17fa169700]     EXIT bleach.clean - 0.001214s elapsed
I0406 15:49:41.081432 139740956759808 timing.py:98] Thread-2[7f17fa169700]   EXIT markdown_to_safe_html - 0.002366s elapsed
I0406 15:49:41.081580 139740956759808 timing.py:90] Thread-2[7f17fa169700]   ENTER markdown_to_safe_html
I0406 15:49:41.081701 139740956759808 timing.py:90] Thread-2[7f17fa169700]     ENTER _markdown.convert
I0406 15:49:41.082247 139740956759808 timing.py:98] Thread-2[7f17fa169700]     EXIT _markdown.convert - 0.000450s elapsed
I0406 15:49:41.082415 139740956759808 timing.py:90] Thread-2[7f17fa169700]     ENTER bleach.clean
I0406 15:49:41.083828 139740956759808 timing.py:98] Thread-2[7f17fa169700]     EXIT bleach.clean - 0.001243s elapsed
I0406 15:49:41.084016 139740956759808 timing.py:98] Thread-2[7f17fa169700]   EXIT markdown_to_safe_html - 0.002335s elapsed
I0406 15:49:41.084219 139740956759808 timing.py:90] Thread-2[7f17fa169700]   ENTER markdown_to_safe_html
I0406 15:49:41.084347 139740956759808 timing.py:90] Thread-2[7f17fa169700]     ENTER _markdown.convert
I0406 15:49:41.084871 139740956759808 timing.py:98] Thread-2[7f17fa169700]     EXIT _markdown.convert - 0.000427s elapsed
I0406 15:49:41.085012 139740956759808 timing.py:90] Thread-2[7f17fa169700]     ENTER bleach.clean
I0406 15:49:41.086526 139740956759808 timing.py:98] Thread-2[7f17fa169700]     EXIT bleach.clean - 0.001410s elapsed
I0406 15:49:41.086712 139740956759808 timing.py:98] Thread-2[7f17fa169700]   EXIT markdown_to_safe_html - 0.002386s elapsed
I0406 15:49:41.086869 139740956759808 timing.py:90] Thread-2[7f17fa169700]   ENTER markdown_to_safe_html
I0406 15:49:41.086991 139740956759808 timing.py:90] Thread-2[7f17fa169700]     ENTER _markdown.convert
I0406 15:49:41.087421 139740956759808 timing.py:98] Thread-2[7f17fa169700]     EXIT _markdown.convert - 0.000337s elapsed
I0406 15:49:41.087568 139740956759808 timing.py:90] Thread-2[7f17fa169700]     ENTER bleach.clean
I0406 15:49:41.088584 139740956759808 timing.py:98] Thread-2[7f17fa169700]     EXIT bleach.clean - 0.000902s elapsed
I0406 15:49:41.088749 139740956759808 timing.py:98] Thread-2[7f17fa169700]   EXIT markdown_to_safe_html - 0.001778s elapsed
I0406 15:49:41.088971 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT text_array_to_html - 0.010081s elapsed
```

This is only emitted when `--verbosity 0` is passed, not by default.

Co-authored-by: Nick Felt <nfelt@users.noreply.github.com>
wchargin-branch: timing-log-latency
@wchargin wchargin requested a review from nfelt April 6, 2020 22:59
@wchargin wchargin added theme:performance Performance, scalability, large data sizes, slowness, etc. and removed cla: yes labels Apr 6, 2020
wchargin added 5 commits April 6, 2020 19:21
wchargin-branch: timing-log-latency
wchargin-source: c8c141575d6b40c050fa84221398ef786551a10c
wchargin-branch: timing-log-latency
wchargin-source: ad8a0506bc8cee7f03cc40c19a55950a51af2e46
wchargin-branch: timing-log-latency
wchargin-source: 43713e571bd0a34da530f98e9b81a140c2de7647
wchargin-branch: timing-log-latency
wchargin-source: 995b6c1cf935c2c42cfdb8d66a3c8d1d73a9995a
wchargin-branch: timing-log-latency
wchargin-source: 2bf968e9b5d9016c3e3f97e49e2e0416a4a79b4c
Contributor

@nfelt nfelt left a comment

Thanks for porting this and adding the nice features!



def _log(log_level, msg, *args):
# Forwarding method to ensure that all logging statements
Contributor

Tired: fixing the issue cleanly with a nice detailed comment
Wired: leaving a cryptic block of ASCII art at the top of the file in order to push all the logging statements down into the 3-digit line number range

Contributor Author

3-digit

Why stop there?

pass

def __call__(self, func):
qualname = getattr(func, "__qualname__", func)
Contributor

Seems slightly better to fall back to `str(func)` explicitly so we're always passing a string as the name? Unless you wanted to avoid the overhead?

Contributor Author

Yeah, I was trying to avoid the overhead—changed to None-sentinel
check, which should be fast enough.


def __call__(self, func):
qualname = getattr(func, "__qualname__", func)
return _log_latency(qualname, self._log_level)(func)
Contributor

FWIW, this was a head-scratcher until I found this part of the contextlib docs:

contextmanager() uses ContextDecorator so the context managers it creates can be used as decorators as well as in with statements. When used as a decorator, a new generator instance is implicitly created on each function call (this allows the otherwise “one-shot” context managers created by contextmanager() to meet the requirement that context managers support multiple invocations in order to be used as decorators).

This was non-obvious to me (especially the nuance about how it manages to recreate the context manager via internal machinery) and would be helpful to at least have a comment - or maybe just write out the decorator by hand to make it obvious that the context manager is being regenerated on each function call. The fact that these context managers are also decorators feels mostly like a convenience when actually using them as decorators directly; we don't need that convenience (since we're using it in the context of our own decorator) so we could just be explicit about wrapping the decorated function in a with block.

Also, the fact that this is defined in an object that is also a context manager, although I see why it's there (to guard against incorrect usage of `with log_latency():`), feels extra confusing. Reintroducing the `@log_latency` form of the decorator might make sense just so that we can make the first parameter to `log_latency()` mandatory and avoid this class of error entirely, even if it makes the contract slightly more subtle?

Or alternatively, we could perhaps at least combine this context manager with the overall `log_latency()` and always go through this context-manager-decorator rather than taking the shortcut of returning `_log_latency(...)` as a context-manager-decorator sometimes and other times returning this object. E.g. the following:

```
class log_latency(object):
    def __init__(self, region_name=None, log_level=None):
        self._region_name = region_name
        self._log_level = log_level
        self._context_manager = _log_latency(region_name, log_level)

    def __enter__(self):
        if self._region_name is None:
            raise TypeError(...)
        self._context_manager.__enter__()
        return self

    def __exit__(self, *exc):
        return self._context_manager.__exit__(*exc)

    def __call__(self, func):
        qualname = getattr(func, "__qualname__", func)
        return _log_latency(qualname, self._log_level)(func)
```

Contributor Author

Reintroducing the @log_latency form of the decorator might make
sense just so that we can make the first parameter to log_latency()
mandatory and avoid this class of error entirely, even if it makes the
contract slightly more subtle?

Okay, done. The analogous error that we no longer have a nice message
for is `with log_latency:`, which seems less likely (more obviously
wrong at the call site) and would require even more hackery to fix
(something like `log_latency = _LogLatency()` or a metaclass). This is
fine enough.

Contributor Author

Oh, meant to note—

The fact that these context managers are also decorators feels mostly
like a convenience when actually using them as decorators directly; we
don't need that convenience (since we're using it in the context of
our own decorator) so we could just be explicit about wrapping the
decorated function in a with block.

—in the revised version I’m still using this convenience, because it’s
simpler than doing the `functools.wraps`/`*args, **kwargs` dance
ourselves, but I’ve spelled it out a bit more to make it clearer. Agreed
that the contextlib magic is a bit magic.

Contributor

Thanks! This version is vastly easier for me to wrap my head around :)

(I guess `with log_latency:` will just raise an `AttributeError: __enter__` - but if Python 3.7 feels that's a good enough way to indicate "you tried to use something as a context manager and it wasn't one" then I think this is good enough for us.)

@wchargin
Contributor Author

Thanks for the review! I’ll update this when I get a chance.

wchargin-branch: timing-log-latency
wchargin-source: 3f46341d518e4bbb495c73977ddaa15dd9d57dbf
Contributor Author

@wchargin wchargin left a comment

Addressed. Also changed “EXIT” to “LEAVE” in the log output so that
the messages at start and end of a region line up in monospaced fonts.




Comment on lines +74 to +76
qualname = getattr(function_to_decorate, "__qualname__", None)
if qualname is None:
qualname = str(function_to_decorate)
Contributor Author

It’s not super easy to get a handle to a function without a
`__qualname__` to test this case, but with at least current versions of
numpy the following suffices as a manual test:

```
>>> import logging
>>> import numpy as np
>>> from tensorboard.util import timing
>>>
>>> np.__version__
'1.18.2'
>>> np.sin.__qualname__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'numpy.ufunc' object has no attribute '__qualname__'
>>> sin = timing.log_latency(np.sin, log_level=logging.WARNING)
>>> sin([1, 2, 3])
MainThread[7f81fc8d7740] ENTER <ufunc 'sin'>
MainThread[7f81fc8d7740] LEAVE <ufunc 'sin'> - 0.000302s elapsed
array([0.84147098, 0.90929743, 0.14112001])
```

(Don’t want to commit this in case the upstream bug is ever fixed.)

@wchargin wchargin merged commit 3878480 into master Apr 17, 2020
@wchargin wchargin deleted the wchargin-timing-log-latency branch April 17, 2020 21:05
Contributor

@nfelt nfelt left a comment

Thanks again for writing this up! And nice use of LEAVE - I spent a brief while trying to think if there was an equal-length pair of words to use but that one didn't occur to me 🥝



caisq pushed a commit to caisq/tensorboard that referenced this pull request May 19, 2020
caisq pushed a commit that referenced this pull request May 27, 2020