[DebuggerV2] Display detailed tensor debug-values in graph- and eager-execution components #3541

caisq · 2020-04-23T03:40:46Z

Motivation for features / changes
- Let the relatively new GraphExecutionComponent display detailed information about debugger-instrumented tensors, such as dtype, rank, size, shape, breakdown of numerical values by type (inf, nan, zero, etc.), and whether any inf/nan exists.
- In the older ExecutionDataComponent for eager tensors, replace the old Angular code that serves a similar purpose with the same component used by GraphExecutionComponent. This improves consistency and reduces maintenance overhead going forward.
Technical description of changes
- Add type DebugTensorValue to store/debugger_types.ts. The new interface defines the possible types of data available from various TensorDebugModes (an existing enum).
- Add helper function parseDebugTensorValue() in a new file store/debug_tensor_value.ts to parse an array representation of TensorDebugMode-specific debugger-generated tensor data into a DebugTensorValue.
- Add DebugTensorValueComponent and its various subcomponents in new folder views/debug_tensor_value/.
- Use DebugTensorValueComponent from GraphExecutionComponent and ExecutionDataComponent.
Screenshots of UI changes
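- CURT_HEALTH mode: ![image](https://user-images.githubusercontent.com/16824702/80056423-31c27100-84f2-11ea-91db-b441f94d11c8.png)
- CONCISE_HEALTH mode: ![image](https://user-images.githubusercontent.com/16824702/80056486-5a4a6b00-84f2-11ea-8142-6c5e282f06c5.png)
- SHAPE mode: ![image](https://user-images.githubusercontent.com/16824702/80056523-72ba8580-84f2-11ea-9126-2c27c9b922b1.png)
- FULL_HEALTH mode: ![image](https://user-images.githubusercontent.com/16824702/80056591-94b40800-84f2-11ea-9592-fd61c53a2316.png)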
Detailed steps to verify changes work correctly (as executed by you)
- Unit tests are added for the new helper method and new component and its subcomponents
- Manual verification against logdirs with real tfdbg2 data.

stephanwlee · 2020-04-24T04:00:25Z

tensorboard/plugins/debugger_v2/tf_debugger_v2_plugin/store/debug_tensor_value.ts

+  tensorDebugMode: TensorDebugMode,
+  array: number[] | null
+): DebugTensorValue {
+  switch (+tensorDebugMode) {


do you need + part?

Nope. Removed it.

stephanwlee · 2020-04-24T04:03:03Z

tensorboard/plugins/debugger_v2/tf_debugger_v2_plugin/store/debug_tensor_value.ts

+ */
+export function parseDebugTensorValue(
+  tensorDebugMode: TensorDebugMode,
+  array: number[] | null


Didn't read the full code yet, so this may be misinformed: does it make sense to type the value more precisely? Wherever this value is coming from, it looks like it is possible to use discriminated union types instead of coercing the types throughout.

Good point. See my reply to your other comment about the possibility of wrong array length below. After adding the length checks the coercion here is no longer necessary. Removed them.

I am not sure what you mean but at least for some of the types, you can type as such:

interface NoTensor {
  mode: TensorDebugMode.NO_TENSOR,
  array: null,
}
interface CurtHealth {
  mode: TensorDebugMode.CURT_HEALTH,
  array: [number, number],
}
interface ConciseHealth {
  mode: TensorDebugMode.CONCISE_HEALTH,
  array: [
    number, /* what is this */
    number, /* what is this */
    number, /* what is this */
    number, /* what is this */
    number,
  ],
}

Thanks. I like the extra compile-time type safety it gives us. Done.
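The typing suggested in this thread can be sketched as a discriminated union. This is a minimal illustration, not the code that landed in the PR: the numeric enum values, the `RawDebugValue` alias, and the `describeRaw()` helper are assumptions made for the example.

```typescript
// Assumed enum values; the real TensorDebugMode enum lives in
// store/debugger_types.ts and is not shown in this thread.
enum TensorDebugMode {
  NO_TENSOR = 1,
  CURT_HEALTH = 2,
  CONCISE_HEALTH = 3,
}

interface NoTensor {
  mode: TensorDebugMode.NO_TENSOR;
  array: null;
}
interface CurtHealth {
  mode: TensorDebugMode.CURT_HEALTH;
  array: [number, number];
}
interface ConciseHealth {
  mode: TensorDebugMode.CONCISE_HEALTH;
  array: [number, number, number, number, number];
}

// Hypothetical union type: `mode` is the discriminant.
type RawDebugValue = NoTensor | CurtHealth | ConciseHealth;

// Hypothetical helper showing that `raw.array` is narrowed per case,
// so no type coercion is needed inside the switch.
function describeRaw(raw: RawDebugValue): string {
  switch (raw.mode) {
    case TensorDebugMode.NO_TENSOR:
      // raw.array has type null here.
      return 'no tensor data';
    case TensorDebugMode.CURT_HEALTH:
      // raw.array has type [number, number] here.
      return `curt health, ${raw.array.length} elements`;
    case TensorDebugMode.CONCISE_HEALTH:
      // raw.array is a 5-element tuple here.
      return `concise health, ${raw.array.length} elements`;
  }
}
```

With this shape, passing a two-element array together with `TensorDebugMode.CONCISE_HEALTH` is rejected at compile time rather than at run time.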

stephanwlee · 2020-04-24T04:33:58Z

tensorboard/plugins/debugger_v2/tf_debugger_v2_plugin/store/debug_tensor_value_test.ts

+  });
+
+  describe('CONCISE_HEALTH', () => {
+    it('all healthy', () => {


these test specs are describing context, not behavior.

I did a sweeping update of all the it() names in this file so that they describe behavior.

stephanwlee · 2020-04-24T04:34:49Z

tensorboard/plugins/debugger_v2/tf_debugger_v2_plugin/store/debug_tensor_value_test.ts

+      ]);
+      expect(debugValue).toEqual({
+        size: 1000,
+        numPositiveInfs: 22,


I find it weird that certain properties are missing. Can we just set them to 0 or does 0 signify something else?

As described in the doc string of this function, this function will omit 0-size categories. The rationale is that the return value is meant to be consumed by the UI, which needs to be concise. For instance, if a tensor has only nans but no infs, we want to show something like

size=100, nan x 30

instead of

size=100, nan x 30, -inf x 0, +inf x 0,

which is space-consuming and distracting. I expanded the comment a little to clarify that.
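The effect of omitting zero-count categories on the display can be sketched as follows. The `HealthSummary` interface and `formatHealthSummary()` are hypothetical and are not part of the PR; `size` and `numPositiveInfs` appear in the test above, while `numNaNs` and `numNegativeInfs` are assumed names.

```typescript
// Hypothetical subset of the parsed value. Count fields are optional
// and are left undefined when the corresponding count is zero.
interface HealthSummary {
  size: number;
  numNaNs?: number;
  numNegativeInfs?: number;
  numPositiveInfs?: number;
}

// Builds a compact summary string, skipping categories that are absent.
function formatHealthSummary(value: HealthSummary): string {
  const parts: string[] = [`size=${value.size}`];
  if (value.numNaNs !== undefined) {
    parts.push(`nan x ${value.numNaNs}`);
  }
  if (value.numNegativeInfs !== undefined) {
    parts.push(`-inf x ${value.numNegativeInfs}`);
  }
  if (value.numPositiveInfs !== undefined) {
    parts.push(`+inf x ${value.numPositiveInfs}`);
  }
  return parts.join(', ');
}
```

Because absent fields are simply skipped, the consumer does not need to compare each count against zero before rendering it.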

stephanwlee · 2020-04-24T04:36:11Z

tensorboard/plugins/debugger_v2/tf_debugger_v2_plugin/store/debug_tensor_value.ts

+ *   (e.g., counts of -inf, +inf and nan), the corresponding fields
+ *   in the returned object will be defined only if the count is non-zero.
+ */
+export function parseDebugTensorValue(


high level question: I find this rather tedious. What happens if we just have hard dependency on jspb or OSS equivalent?

This is a little tedious. But this is the only place where this logic needs to exist in TensorBoard. The background is that, for efficiency, DebugNumericsSummaryV2 ops return a fixed-size 1D tensor for all instrumented tensors. The advantages of this format are that:

- It is smaller in size compared to other options such as a proto or serialized proto.
- All such small tensors can potentially be stacked and transferred over device boundaries as a whole in future performance optimizations.

As such, the 1D vector must have a contract in what each number means (given the TensorDebugMode). This function here knows and carries out this contract.
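A contract of this kind can be sketched for a single mode. This is a minimal sketch under stated assumptions, not the implementation in this PR: the element order `[tensorId, size, numNegativeInfs, numPositiveInfs, numNaNs]`, the `ConciseHealthValue` interface, and the `parseConciseHealth()` name are all assumed for illustration.

```typescript
// Hypothetical result shape. Only `size` and `numPositiveInfs` are
// names that appear in the PR's tests; the others are assumptions.
interface ConciseHealthValue {
  size: number;
  numNegativeInfs?: number;
  numPositiveInfs?: number;
  numNaNs?: number;
}

// Assumed layout of the 1D vector for CONCISE_HEALTH:
//   [tensorId, size, numNegativeInfs, numPositiveInfs, numNaNs]
const CONCISE_HEALTH_LENGTH = 5;

function parseConciseHealth(array: number[] | null): ConciseHealthValue {
  // Reject input that does not match the fixed-size contract.
  if (array === null || array.length !== CONCISE_HEALTH_LENGTH) {
    const got = array === null ? 'null' : `length ${array.length}`;
    throw new Error(
      `CONCISE_HEALTH expects ${CONCISE_HEALTH_LENGTH} elements, got ${got}`
    );
  }
  const value: ConciseHealthValue = {size: array[1]};
  // Define a count field only when the count is non-zero.
  if (array[2] > 0) {
    value.numNegativeInfs = array[2];
  }
  if (array[3] > 0) {
    value.numPositiveInfs = array[3];
  }
  if (array[4] > 0) {
    value.numNaNs = array[4];
  }
  return value;
}
```

Keeping the index-to-meaning mapping in one function means the rest of the frontend only sees named fields.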

I am not too convinced about the performance argument. Protobuf, sure, may not be as efficient at representing such values as what you did here but it surely is closer to this than, say, JSON (jspb encodes values as array of arrays; it does not transfer binaries).

I am still okay with this because the jspb dependency (Closure version) on the FE is quite unpleasant.

stephanwlee · 2020-04-24T04:44:39Z

...s/debugger_v2/tf_debugger_v2_plugin/views/debug_tensor_value/debug_tensor_value_component.ts

+      </div>
+      <div
+        *ngIf="numPositiveFinites !== undefined && numPositiveFinites > 0"
+        z


stephanwlee · 2020-04-24T04:51:25Z

...s/debugger_v2/tf_debugger_v2_plugin/views/debug_tensor_value/debug_tensor_value_component.ts

+  styles: [
+    `
+      :host {
+        display: inline-block;


It is wrong to have a block element (flex) inside an inline-block. Just make the host inline-flex instead, and remove div.flexbox?

Revised the CSS here and in related places.

stephanwlee · 2020-04-24T04:55:43Z

...ugger_v2/tf_debugger_v2_plugin/views/debug_tensor_value/debug_tensor_value_component_test.ts

+      expect(tagElements.length).toEqual(3);
+      expect(countElements.length).toEqual(3);


toEqual -> toBe

Done. Fixed all instances like this in this file.

stephanwlee · 2020-04-24T05:16:22Z

...gins/debugger_v2/tf_debugger_v2_plugin/views/execution_data/execution_data_component.ng.html

+          class="output-slot-container"
+        >
+          <div class="output-slot-number">
+            Output slot {{ i }}:


i + 1? or do you want it to read Output slot 0?

The rationale is that TensorFlow's tf.Tensor objects have 0-based slot indices in their names, like unique:0 and unique:1. See: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/framework/ops.py#L462. We want to be consistent with that. Added a comment here.
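The naming convention referred to here can be illustrated with a small helper. `tensorName()` is hypothetical and exists only for this example; it shows why a 0-based label such as "Output slot 0" lines up with names like unique:0.

```typescript
// Hypothetical helper: builds a TensorFlow-style tensor name from an op
// name and a 0-based output slot index.
function tensorName(opName: string, outputSlot: number): string {
  if (!Number.isInteger(outputSlot) || outputSlot < 0) {
    throw new RangeError(`Invalid output slot: ${outputSlot}`);
  }
  return `${opName}:${outputSlot}`;
}

// The two outputs of an op named "unique" occupy slots 0 and 1.
const names = [0, 1].map((slot) => tensorName('unique', slot));
```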

stephanwlee · 2020-04-24T05:17:53Z

...gins/debugger_v2/tf_debugger_v2_plugin/views/graph_executions/graph_executions_component.css

  width: 100%;
 }

+.tensor-debug-info {


I don't see any components using this in this PR, but please always keep in mind that an inline element cannot have a block element inside.

Removed these two unused classes.

caisq

Thanks for the review!


caisq · 2020-04-24T20:02:00Z

...s/debugger_v2/tf_debugger_v2_plugin/views/debug_tensor_value/debug_tensor_value_component.ts

+      </div>
+      <div
+        *ngIf="numPositiveFinites !== undefined && numPositiveFinites > 0"
+        z


caisq · 2020-04-24T20:44:26Z

tensorboard/plugins/debugger_v2/tf_debugger_v2_plugin/store/debug_tensor_value_test.ts

+    });
+
+    it('has pos inf', () => {
+      const debugValue = parseDebugTensorValue(TensorDebugMode.CONCISE_HEALTH, [


Yeah, good question. I should add a check for the array length for each switch case in parseDebugTensorValue(). Done, and I also added unit tests for those error conditions.



caisq added 24 commits April 16, 2020 16:35

[DebuggerV2] Flesh out graph execution data display

6150f26

Flesh out scrolling effect; Improve CSS

d7dd9c5

Add unit tests for selectors

4b31ddb

Add unit tests for reducers

20daabf

Adjust CSS

5eb5e10

Add unit tests for effect

d1ef4ab

Add container tests

4df718f

Fix loading spinner css

964c20c

Revert extraneous change

9e37e35

Tweak some comments

ab58b65

[DebuggerV2] Add tensor-debug info to GraphExecutionComponent

8893db2

Merge branch 'master' into dbg2-graph-exec-1c

1d7b8d2

WIP: Add debug_tensor_value.ts

a6d5e56

Add logic and tests for FULL_TENSOR and FULL_HEALTH; doc string

14613da

Add undefined filling for shape

64282ad

Refactoring into DebugTensorValueComponent

02c8e32

Refactor DebugTensorValueComponent into separate folder

df83903

Adjust breakdown component CSS

3820d93

Add DebugTensorShapeComponent

ccbc314

Add DebugTensorHasInfOrNaNComponent

1e90a8e

Switch ExecutionDataComponent to using DebugTensorValueCompoennt

4f2bcdc

Adding unit tests for debug-tensor-value components

7f71cbb

Add more unit tests for debug-tensor-value components

606f80f

Add more unit tests

c4ab22c

googlebot added the cla: yes label Apr 23, 2020

caisq changed the title ~~[DebuggerV2] Add component to display detailed tensor debug-values~~ [DebuggerV2] Display detailed tensor debug-values in graph- and eager-execution components Apr 23, 2020

caisq added 2 commits April 22, 2020 23:51

Fix typos

0dc3281

Fix more typos

8634808

caisq marked this pull request as ready for review April 23, 2020 04:16

caisq requested a review from stephanwlee April 23, 2020 04:17

stephanwlee reviewed Apr 24, 2020

Address Stephan's comments

0873c95

caisq commented Apr 24, 2020

caisq requested a review from stephanwlee April 24, 2020 23:46

stephanwlee approved these changes Apr 27, 2020

caisq added 3 commits April 27, 2020 10:14

Address 2nd round of comments

a67a8f5

Remove cruft

b33d934

Add missing BUILD dependency

b6ce533

caisq merged commit 27c2747 into tensorflow:master Apr 27, 2020
