Make tensorboard.compat.{tf,tf2} lazily loaded by nfelt · Pull Request #1781 · tensorflow/tensorboard · GitHub

Conversation

@nfelt
Contributor

@nfelt nfelt commented Jan 26, 2019

This PR makes tensorboard.compat.tf into a lazy-loading module for accessing the TF API (either real or stub), and introduces a similar tensorboard.compat.tf2 lazy-loading module for accessing the TF 2.0 API (either as tf or tf.compat.v2).

The first change (lazifying tensorboard.compat.tf) is necessary so that the tensorboard.summary.v2 summary API can be imported without immediately importing TensorFlow, since we want TF to be able to import that module without triggering a circular dependency. Previously, the V2 summary API transitively imported various pieces from under tensorboard.compat, and that module's __init__.py file opportunistically imported tensorflow at module load time to provide tensorboard.compat.tf. Now that import is deferred until tf is actually used, and we avoid using it from within the V2 summary APIs. I've added summary_dep_test.py to enforce this so we don't regress.

To implement this lazy-loading behavior, I refactored LazyLoader to let you customize the module loading with your own function, and made it usable as a decorator for convenience. I also refactored the existing LazyLoader uses in tensorboard/__init__.py to use this, and replaced tensorboard.compat.import_tf_v2() with a lazy loading tensorboard.compat.tf2, which in turn eliminates the need to call import_tf_v2() on-demand in the various places that use it (necessary because we don't always have a TF 2.0 API available).
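The lazy-module pattern described above can be sketched roughly like this (a minimal illustration, not the actual tensorboard.lazy code; `json` stands in for TensorFlow so the snippet runs anywhere, and caching is omitted for brevity):

```python
import importlib
import types


def lazy_load(name):
    """Return a decorator that wraps a load function in a lazy module proxy."""
    def wrapper(load_fn):
        class LazyModule(types.ModuleType):
            def __getattr__(self, attr):
                # The real import happens here, on first attribute access.
                return getattr(load_fn(), attr)
        return LazyModule(name)
    return wrapper


# Hypothetical stand-in: `json` plays the role of tensorflow so the
# sketch is runnable without TF installed.
@lazy_load("tf")
def tf():
    return importlib.import_module("json")
```

Until some attribute of `tf` is touched, nothing is imported, which is what breaks the circular dependency.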

from . import tensor_shape # noqa

# Set a fake __version__ to help distinguish this as our own stub API.
__version__ = 'stub'
Contributor

Is it strange that this makes tf.__version__ and tf.version.VERSION
different? Do we want to change the latter as well?

Contributor Author

The stub doesn't actually have a version module so I'm not quite sure I follow? I picked __version__ because it's the one place that's consistent across TF 1.x and 2.x, since 2.0 only has tf.version.VERSION, but most of 1.x used tf.VERSION.

Contributor

Ah, got it. Thanks.

def __getattr__(self, item):
  module = self._load()
  return getattr(module, item)
if not self._cached_module:
Contributor

At a glance, it looks like it should be fine that this isn’t threadsafe,
but if there are problems that have to do with thread safety combined
with lazy imports then that sounds like a nightmare to debug. Up to you
whether to add (potentially double-checked) locking or not.

Contributor Author

This was also arguably an existing problem (in the previous code, the load function could be called many times, so the caching here was just for efficiency, not correctness), but I'll grant that inviting users to supply their own load functions makes it more of a concern, so done.

load_fn: callable that actually does the import and returns the module
"""
super(LazyLoader, self).__init__(name)
self._load_fn = load_fn
Contributor

If I understand correctly, this does have the effect that _load_fn and
_cached_module leak onto the actual module being imported, and in the
event that the underlying module actually defines those properties they
will not be accessible (because of the __getattr__/__getattribute__
distinction). This might sound like “who would ever do that”, but it can
happen if you try to lazily load a module that is itself lazily loaded:

>>> from tensorboard import lazy
>>> @lazy.lazy_load("inner")
... def inner():
...   import sys
...   return sys
... 
>>> @lazy.lazy_load("outer")
... def outer():
...   return inner
... 
>>> outer.stdout
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "tensorboard/lazy.py", line 55, in __getattr__
    return getattr(self._cached_module, item)
AttributeError: 'NoneType' object has no attribute 'stdout'
>>> inner.stdout
<open file '<stdout>', mode 'w' at 0x7fa4487f5150>
>>> outer.stdout
<open file '<stdout>', mode 'w' at 0x7fa4487f5150>
>>> 

This seems a bit insidious. Can we fix the leaky abstraction? For
instance, we could store _load_fn and _cached_module in a closure
value instead of on the object itself.
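The closure-based fix suggested here can be sketched as follows (illustrative names, not the actual tensorboard.lazy API; `sys` stands in for a real import). With loader state in a closure rather than on the proxy object, the nested lazy-module case resolves correctly:

```python
import sys
import types


def lazy_load(name):
    """Keep loader state in a closure so nothing leaks onto the proxy."""
    def wrapper(load_fn):
        cache = []  # closure cell instead of self._cached_module

        def load():
            if not cache:
                cache.append(load_fn())
            return cache[0]

        class LazyModule(types.ModuleType):
            def __getattr__(self, attr):
                # No _load_fn/_cached_module attributes on `self`, so a
                # lazily loaded module can itself be lazily loaded.
                return getattr(load(), attr)

        return LazyModule(name)
    return wrapper


@lazy_load("inner")
def inner():
    return sys  # stand-in for a real import


@lazy_load("outer")
def outer():
    return inner  # a lazy module wrapping another lazy module
```

Here `outer.maxsize` delegates through `inner`'s own `__getattr__` instead of tripping over a leaked `_cached_module` attribute.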

Contributor

Noted offline that this is not new in this PR, so it’s fine with me if
you don’t address it here and one of us fixes it in a subsequent commit.

Contributor Author

Refactored to use a closure.

from __future__ import division
from __future__ import print_function

import importlib as _importlib
Contributor

nit: This exposes tensorboard._importlib; could resolve by lazily
importing importlib inside program() and summary() if you want.

Contributor Author

Fun fact: we're currently exposing the symbols absolute_import, division, and print_function as well :p

The main reason to use importlib was avoiding an inline import that requires a lint suppression internally, so at that point we can just directly import.

@wchargin
Contributor

Oh, forgot to note: before the underlying module is actually imported,
the repr of tensorboard.compat.tf reads

<module 'tensorboard.compat.tf' (built-in)>

though it provides the correct path after the import is forced. I don’t
see a way to get around this (because we can’t access mod.__file__
without importing mod), so no change requested.

Contributor Author

@nfelt nfelt left a comment

PTAL. I also added a custom __repr__() so it doesn't misleadingly claim to be a built-in, since I'd also noticed that.
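For reference, a custom __repr__ along these lines might look like this (a hypothetical sketch, not the exact code from the PR):

```python
import types


class LazyModule(types.ModuleType):
    """Minimal module proxy whose repr doesn't claim to be a built-in."""

    def __repr__(self):
        # Replaces the misleading default "<module 'name' (built-in)>".
        return "<lazy module %r (not yet loaded)>" % self.__name__
```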





Contributor

@wchargin wchargin left a comment

Thanks a ton! This looks better. Request response to one inline, then
it’ll look good to me.

If you don’t mind, I might submit follow-up PRs to

  • add a test that the nested imports work,
  • further improve the repr so that it prints the underlying
    __file__ when the module is loaded, and
  • detect the deadlock noted in an inline comment and raise an error
    instead—

but you’ve done enough work on this so I don’t want to block you. :-)




def _return_once(f):
"""Decorator that calls f() once, then returns that value repeatedly."""
Contributor

Though used safely* within this module, this decorator seems dangerous in
general; it passes through arguments to f the first time, then
completely ignores them at all subsequent invocations because the cached
result already exists.

Given that the local uses are safe, I’d be fine with either of:

  • documenting this with a warning, and changing the headline of the
    docstring from calls f() to calls `f`; or

  • refactoring so that the decorator actually only calls f() (no
    args), and the self argument to load_once is passed in
    separately:

    lazy_module_box = []
    
    def load_once():
      module = load_fn()
      lazy_module_box[0].__dict__.update(module.__dict__)
      return module
    
    class LazyModule(types.ModuleType):
      # …
      pass
    
    lazy_module = LazyModule(name)
    lazy_module_box.append(lazy_module)
    return lazy_module

(You could also remove the decorator altogether, but I agree that it’s
nice to have the locking logic encapsulated.)

* Barring clients who do things like type(mod).__dir__(not_mod),
which I think is fine to ignore.

Contributor Author

It was meant to just be an implementation detail so I wasn't worrying about this, but fair enough. I rewrote the decorator to be a standard memoize that only supports a single hashable argument since that's all that we need.

if cache[0] == not_called:
  with lock:
    if cache[0] == not_called:
      cache[0] = f(*args, **kwargs)
Contributor

This call can deadlock if the load_fn references the very module
that it corresponds to:

>>> from tensorboard import lazy
>>> @lazy.lazy_load("foo")
... def foo():
...   print("loading")
...   return foo.x
... 
>>> foo.y
loading
^C^C^C^\Quit

I can’t think of a valid reason to do this, so no change requested (if
you don’t mind, I’ll submit a follow-up PR to detect this and raise a
nice error message instead of hanging).

Contributor Author

Actually this is a good point, since this could also be triggered by an import cycle:

@lazy.lazy_load("inner")
def inner():
  return outer.foo

@lazy.lazy_load("outer")
def outer():
  return inner

And I've had enough headaches with those without deadlock too :p Changed to use a reentrant lock so you just get a recursion depth exceeded error instead, WDYT?

Contributor

this could also be triggered by an import cycle

Yep, good point.

Changed to use a reentrant lock so you just get a recursion depth
exceeded error instead, WDYT?

Yep, using an RLock is the right fix IMO, and a recursion depth error
is fine for now. Thanks!
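A tiny illustration of why the RLock matters: the same thread can re-acquire it, so a self-referential load surfaces as a RecursionError instead of a silent hang (hypothetical names):

```python
import threading

# With a plain threading.Lock, a second acquisition from the same thread
# blocks forever. An RLock tracks ownership, so re-entrant acquisition
# succeeds.
rlock = threading.RLock()


def reenter(depth):
    with rlock:  # re-acquired by the same thread on each recursive call
        if depth > 0:
            return reenter(depth - 1)
        return "ok"
```

With `threading.Lock()` in place of the RLock, even `reenter(1)` would deadlock.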

Contributor Author

@nfelt nfelt left a comment

PTAL. I also tweaked repr() as you mentioned. I'll leave it to you to write the nested case test in a followup if you so desire :)




  module = self._load()
  return getattr(module, item)

def __repr__(self):
  if hasattr(load_once, 'loaded'):
Contributor

Why not initialize load_once.loaded = False at the beginning, then
just use a field dereference instead of a hasattr (which always makes
me nervous, though it’s safe here)?
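The suggested up-front initialization might look like this (hypothetical names, just to illustrate the hasattr-free field read):

```python
def load_once():
    # Hypothetical loader; the real one would import and return a module.
    load_once.loaded = True
    return "module"


load_once.loaded = False  # initialize the flag up front


def lazy_repr():
    # Plain attribute read; no hasattr() needed.
    return "loaded" if load_once.loaded else "not yet loaded"
```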

Contributor Author

Done. I didn't do loaded = False because now part of the definition/contract of the function exists outside the function body where it's easier for it to get accidentally removed, but I don't feel strongly about it.

"""Memoizing decorator for f, which must have exactly 1 hashable argument."""
nothing = object() # Unique "no value" sentinel object.
cache = {}
# Use a reentrank lock so that if f references the resulting wrapper we die
Contributor

nit: reentrant (sp.)

Contributor Author

Done.


Contributor

@wchargin wchargin left a comment

Looks great. Thanks!

Contributor Author

@nfelt nfelt left a comment

PTAL



@wchargin
Contributor

(re-approved)

@nfelt nfelt merged commit 767360e into tensorflow:master Jan 29, 2019
@nfelt nfelt deleted the compat-tf-lazy branch January 29, 2019 02:47
wchargin added a commit that referenced this pull request Jan 30, 2019
Summary:
This commit adds a test case for [behavior implemented in #1781][1].

[1]: #1781 (comment)

Test Plan:
`bazel test //tensorboard:lazy_test` passes at HEAD, but fails after
checking out `lazy.py` from 976064d.

wchargin-branch: lazy-composition
wchargin added a commit that referenced this pull request Jan 30, 2019
Summary:
This tests behavior [described in #1781][2], improving the error message
along the way.

[2]: #1781 (comment)

Test Plan:
`bazel test //tensorboard:lazy_test` passes at HEAD, but changing the
implementation of `_memoize` to use `Lock` instead of `RLock` causes a
deadlock (and so the test times out and fails eventually).

wchargin-branch: lazy-cycles
wchargin added a commit that referenced this pull request Jan 30, 2019
Summary:
It was [noted in #1781][3] that the `__repr__` of `LazyModule`s deserves
a modicum of special attention. This commit tests that, and improves the
`__repr__` for already-loaded lazy modules so that it’s clear that they
are still going through the `LazyModule` proxy. (Should there be a bug
in the proxy interface—say, someone expects `__setattr__` to also pass
through to the underlying module—this would make it much easier to track
down.)

[3]: #1781 (review)

Test Plan:
`bazel test //tensorboard:lazy_test` passes.

wchargin-branch: lazy-repr