util: add timing.log_latency #3490
Conversation
Summary:
This patch adds a decorator and context manager for latency profiling.
Any file can just add `@timing.log_latency` to a function definition or
introduce a `with timing.log_latency("some_name_here"):` context
anywhere. The output structure reveals the nesting of the call graph.
The implementation is thread-safe, and prints the current thread’s name
and ID so that the logs can easily be disentangled. (Basically, just
find a line that you’re interested in, note the thread ID, and pass the
log through `grep THREAD_ID` to get a coherent picture.)
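For concreteness, typical usage would look something like this (a minimal sketch; the function and region names here are hypothetical):

```
from tensorboard.util import timing

@timing.log_latency
def render(text):
    ...

def handle_request(text):
    # Named region around a block of work; nested regions show up
    # indented-in-time in the ENTER/EXIT log structure.
    with timing.log_latency("handle_request_render"):
        return render(text)
```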
I’ve added this as a global build dep on `//tensorboard` so that it’s
easy to patch timing in for debugging without worrying about build deps.
Any `log_latency` uses that are actually committed should be accompanied
by a strict local dep.
This patch combines previous independent drafts by @nfelt and @wchargin.
Test Plan:
No automated tests included; I didn’t feel that they’d be worth their
weight. As a manual test, adding latency decoration to a few places in
the text plugin’s Markdown rendering pipeline generates output thus:
```
I0406 15:49:41.078654 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER text_array_to_html
I0406 15:49:41.078930 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER markdown_to_safe_html
I0406 15:49:41.079075 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER _markdown.convert
I0406 15:49:41.079684 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT _markdown.convert - 0.000507s elapsed
I0406 15:49:41.079845 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER bleach.clean
I0406 15:49:41.081181 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT bleach.clean - 0.001214s elapsed
I0406 15:49:41.081432 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT markdown_to_safe_html - 0.002366s elapsed
I0406 15:49:41.081580 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER markdown_to_safe_html
I0406 15:49:41.081701 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER _markdown.convert
I0406 15:49:41.082247 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT _markdown.convert - 0.000450s elapsed
I0406 15:49:41.082415 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER bleach.clean
I0406 15:49:41.083828 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT bleach.clean - 0.001243s elapsed
I0406 15:49:41.084016 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT markdown_to_safe_html - 0.002335s elapsed
I0406 15:49:41.084219 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER markdown_to_safe_html
I0406 15:49:41.084347 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER _markdown.convert
I0406 15:49:41.084871 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT _markdown.convert - 0.000427s elapsed
I0406 15:49:41.085012 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER bleach.clean
I0406 15:49:41.086526 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT bleach.clean - 0.001410s elapsed
I0406 15:49:41.086712 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT markdown_to_safe_html - 0.002386s elapsed
I0406 15:49:41.086869 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER markdown_to_safe_html
I0406 15:49:41.086991 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER _markdown.convert
I0406 15:49:41.087421 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT _markdown.convert - 0.000337s elapsed
I0406 15:49:41.087568 139740956759808 timing.py:90] Thread-2[7f17fa169700] ENTER bleach.clean
I0406 15:49:41.088584 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT bleach.clean - 0.000902s elapsed
I0406 15:49:41.088749 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT markdown_to_safe_html - 0.001778s elapsed
I0406 15:49:41.088971 139740956759808 timing.py:98] Thread-2[7f17fa169700] EXIT text_array_to_html - 0.010081s elapsed
```
This is only emitted when `--verbosity 0` is passed, not by default.
Co-authored-by: Nick Felt <nfelt@users.noreply.github.com>
wchargin-branch: timing-log-latency
wchargin-branch: timing-log-latency wchargin-source: c8c141575d6b40c050fa84221398ef786551a10c
wchargin-branch: timing-log-latency wchargin-source: ad8a0506bc8cee7f03cc40c19a55950a51af2e46
wchargin-branch: timing-log-latency wchargin-source: 43713e571bd0a34da530f98e9b81a140c2de7647
wchargin-branch: timing-log-latency wchargin-source: 995b6c1cf935c2c42cfdb8d66a3c8d1d73a9995a
wchargin-branch: timing-log-latency wchargin-source: 2bf968e9b5d9016c3e3f97e49e2e0416a4a79b4c
Thanks for porting this and adding the nice features!
tensorboard/util/timing.py

```
def _log(log_level, msg, *args):
    # Forwarding method to ensure that all logging statements
```
Tired: fixing the issue cleanly with a nice detailed comment
Wired: leaving a cryptic block of ASCII art at the top of the file in order to push all the logging statements down into the 3-digit line number range
> 3-digit

Why stop there?
tensorboard/util/timing.py (outdated)

```
def __call__(self, func):
    qualname = getattr(func, "__qualname__", func)
```
Seems slightly better to fall back to `str(func)` explicitly so we're always passing a string as the name? Unless you wanted to avoid the overhead?
Yeah, I was trying to avoid the overhead—changed to a `None`-sentinel
check, which should be fast enough.
tensorboard/util/timing.py (outdated)

```
def __call__(self, func):
    qualname = getattr(func, "__qualname__", func)
    return _log_latency(qualname, self._log_level)(func)
```
FWIW, this was a head-scratcher until I found this part of the contextlib docs:

> contextmanager() uses ContextDecorator so the context managers it creates can be used as decorators as well as in with statements. When used as a decorator, a new generator instance is implicitly created on each function call (this allows the otherwise "one-shot" context managers created by contextmanager() to meet the requirement that context managers support multiple invocations in order to be used as decorators).

This was non-obvious to me (especially the nuance about how it manages to recreate the context manager via internal machinery), and it would be helpful to at least have a comment - or maybe just write out the decorator by hand to make it obvious that the context manager is being regenerated on each function call.
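To illustrate the behavior in question, here is a minimal sketch independent of this PR's code:

```
import contextlib

@contextlib.contextmanager
def announce(name):
    print("ENTER", name)
    try:
        yield
    finally:
        print("LEAVE", name)

@announce("f")  # the generator-based context manager doubles as a decorator
def f():
    pass

f()  # prints ENTER f / LEAVE f
f()  # works again: a fresh generator is implicitly created on each call
```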
Also, the fact that this is defined in an object that is also a context manager, although I see why it's there (to guard against incorrect usage of `with log_latency():`), feels extra confusing. Reintroducing the `@log_latency` form of the decorator might make sense just so that we can make the first parameter to `log_latency()` mandatory and avoid this class of error entirely, even if it makes the contract slightly more subtle?
Or alternatively, we could perhaps at least combine this context manager with the overall `log_latency()` and always go through this context-manager-decorator, rather than taking the shortcut of returning `_log_latency(...)` as a context-manager-decorator sometimes and other times returning this object. E.g. the following:
```
class log_latency(object):
    def __init__(self, region_name=None, log_level=None, reusable=True):
        self._region_name = region_name
        self._log_level = log_level
        self._context_manager = _log_latency(region_name, log_level)

    def __enter__(self):
        if self._region_name is None:
            raise TypeError(...)
        self._context_manager.__enter__()
        return self

    def __exit__(self, *exc):
        return self._context_manager.__exit__(*exc)

    def __call__(self, func):
        qualname = getattr(func, "__qualname__", func)
        return _log_latency(qualname, self._log_level)(func)
```
> Reintroducing the `@log_latency` form of the decorator might make
> sense just so that we can make the first parameter to `log_latency()`
> mandatory and avoid this class of error entirely, even if it makes the
> contract slightly more subtle?

Okay, done. The analogous error that we no longer have a nice message
for is `with log_latency:`, which seems less likely (more obviously
wrong at the call site) and would require even more hackery to fix
(something like `log_latency = _LogLatency()` or a metaclass). This is
fine enough.
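For orientation, the resulting calling convention presumably dispatches on whether the first argument is callable. Something like this sketch (not the PR's literal code; `_log_latency` stands in for the internal context-manager factory seen in the diff):

```
import logging

def log_latency(region_name_or_func, log_level=logging.INFO):
    if callable(region_name_or_func):
        # Bare @log_latency on a function: derive a region name from it.
        func = region_name_or_func
        qualname = getattr(func, "__qualname__", None)
        if qualname is None:
            qualname = str(func)
        return _log_latency(qualname, log_level)(func)
    # log_latency("region"): return the underlying context manager,
    # which contextlib also lets us use as a decorator.
    return _log_latency(region_name_or_func, log_level)
```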
Oh, meant to note—
> The fact that these context managers are also decorators feels mostly
> like a convenience when actually using them as decorators directly; we
> don't need that convenience (since we're using it in the context of
> our own decorator) so we could just be explicit about wrapping the
> decorated function in a `with` block.
—in the revised version I’m still using this convenience, because it’s
simpler than doing the `functools.wraps`/`*args, **kwargs` dance
ourselves, but I’ve spelled it out a bit more to make it clearer. Agreed
that the contextlib magic is a bit magic.
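For reference, the hand-rolled version that the contextlib convenience saves us from writing would look roughly like this (a sketch; `_log_latency` again stands in for the internal context-manager factory):

```
import functools

def _decorate_by_hand(region_name, log_level):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Explicitly enter/exit the timing region around each call.
            with _log_latency(region_name, log_level):
                return func(*args, **kwargs)
        return wrapper
    return decorator
```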
Thanks! This version is vastly easier for me to wrap my head around :)
(I guess `with log_latency:` will just raise an `AttributeError: __enter__` - but if Python 3.7 feels that's a good enough way to indicate "you tried to use something as a context manager and it wasn't one" then I think this is good enough for us.)
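(A quick sketch of that failure mode, assuming Python 3.7 semantics; newer Pythons raise a TypeError with a clearer message instead:)

```
>>> class log_latency:  # hypothetical stand-in, not the real class
...     pass
...
>>> with log_latency:
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: __enter__
```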
Thanks for the review! I’ll update this when I get a chance.
wchargin-branch: timing-log-latency wchargin-source: 3f46341d518e4bbb495c73977ddaa15dd9d57dbf
Addressed. Also changed “EXIT” to “LEAVE” in the log output so that
the messages at start and end of a region line up in monospaced fonts.
tensorboard/util/timing.py

```
qualname = getattr(function_to_decorate, "__qualname__", None)
if qualname is None:
    qualname = str(function_to_decorate)
```
It’s not super easy to get a handle to a function without a
`__qualname__` to test this case, but with at least current versions of
numpy the following suffices as a manual test:
```
>>> import logging
>>> import numpy as np
>>> from tensorboard.util import timing
>>>
>>> np.__version__
'1.18.2'
>>> np.sin.__qualname__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'numpy.ufunc' object has no attribute '__qualname__'
>>> sin = timing.log_latency(np.sin, log_level=logging.WARNING)
>>> sin([1, 2, 3])
MainThread[7f81fc8d7740] ENTER <ufunc 'sin'>
MainThread[7f81fc8d7740] LEAVE <ufunc 'sin'> - 0.000302s elapsed
array([0.84147098, 0.90929743, 0.14112001])
```

(Don’t want to commit this in case the upstream bug is ever fixed.)
Thanks again for writing this up! And nice use of `LEAVE` - I spent a brief while trying to think if there was an equal-length pair of words to use but that one didn't occur to me 🥝
Summary:
This patch adds a decorator and context manager for latency profiling.
Any file can just add `@timing.log_latency` to a function definition or
introduce a `with timing.log_latency("some_name_here"):` context
anywhere. The output structure reveals the nesting of the call graph.
The implementation is thread-safe, and prints the current thread’s name
and ID so that the logs can easily be disentangled. (Basically, just
find a line that you’re interested in, note the thread ID, and pass the
log through `grep THREAD_ID` to get a coherent picture.)
I’ve added this as a global build dep on `//tensorboard` so that it’s
easy to patch timing in for debugging without worrying about build deps.
Any `log_latency` uses that are actually committed should be accompanied
by a strict local dep.
This patch combines previous independent drafts by @nfelt and @wchargin.
Test Plan:
Simple API usage tests included. As a manual test, adding latency
decoration to a few places in the text plugin’s Markdown rendering
pipeline generates output like this:
```
I0417 13:31:01.586319 140068322420480 timing.py:118] Thread-2[7f64329a2700] ENTER markdown_to_safe_html
I0417 13:31:01.586349 140068322420480 timing.py:118] Thread-2[7f64329a2700] ENTER Markdown.convert
I0417 13:31:01.586455 140068322420480 timing.py:118] Thread-2[7f64329a2700] LEAVE Markdown.convert - 0.000105s elapsed
I0417 13:31:01.586492 140068322420480 timing.py:118] Thread-2[7f64329a2700] ENTER bleach.clean
I0417 13:31:01.586783 140068322420480 timing.py:118] Thread-2[7f64329a2700] LEAVE bleach.clean - 0.000288s elapsed
I0417 13:31:01.586838 140068322420480 timing.py:118] Thread-2[7f64329a2700] LEAVE markdown_to_safe_html - 0.000520s elapsed
```
This is only emitted when `--verbosity 0` is passed, not by default.
Co-authored-by: Nick Felt <nfelt@users.noreply.github.com>
wchargin-branch: timing-log-latency