KEMBAR78
Aggregate numeric values when slicing (What-If Tool) by grovina · Pull Request #2678 · tensorflow/tensorboard · GitHub
Skip to content

Conversation

@grovina
Copy link
Contributor

@grovina grovina commented Sep 19, 2019

  • Motivation for features / changes

When slicing by a numeric feature, aggregate values in intervals instead of creating a slice for each different value.

  • Technical description of changes

The current implementation creates one slice for each different value of the feature. This might result in a huge number of slices that become unmanageable (imagine a slicing by features like "age" or "salary", which can have hundreds of unique values).
To work around this issue, we aggregate the values into buckets (like a histogram), then show aggregated statistics and performance results for each bucket.
Nothing changes for categorical features.

  • Screenshots of UI changes

Screenshot from 2019-09-18 17-54-25

  • Detailed steps to verify changes work correctly (as executed by you)

For both multi and age demos, wait for data to load and then:

  1. Open the Performance tab
  2. Slice by different features and check that:
    • Categorical features behave just like before (compare with demos in prod at https://pair-code.github.io/what-if-tool)
    • Numeric features with up to 10 unique values also remain the same
    • Numeric features with more than 10 unique values are aggregated in buckets, and results are coherent with the correct totals.
  • Alternate designs / implementations considered

I opted for the equally-sized intervals (as opposed to the equally-populated ones, for example), mainly because they should be simpler to understand and because one of comparative columns is precisely the number of examples in each slice. The number of slices is controlled by a variable that could become a setting if needed.
As for notations, I also considered other a simple a — b, but the mathematical intervals are particularly precise and seem to be accessible enough.
Finally, generating the bucket edges beforehand was important to make the values look ok, since we would oftentimes get things like [17.43333333333, 20] printed otherwise. I tried to simply calculate them ad hoc, but having consistent and simple display turned out to be more complex than expected.

cc @jameswex and @tolga-b

This is a new boolean field `isNumeric` in `distanceStats_` to allow us
to check is a feature is numeric instead of looking for the existance of `.stdDev`.
Each slice of the dataset is identified by a key. Currently, when
slicing by a feature, each value becomes a slice and therefore a key.
To facilitate how we handle with generating slices, and particularly to
allow us to group multiple values into the same key (buckets for
numerical features), we wrap the slice key generation in a separate
function.
This commit doesn't change any aggregating behaviour but makes way to
the next steps.
We iterate over the dataset using the slice key function to attribute a
key to each example. This avoids the usage of the `undefined` dummy
keys, since all cases will be naturally covered by seeing all examples.
Since we are now using a single function to handle slice keys throughout
the code, we can change the key attribution logics in a consistent way.
We create buckets for each aggregated feature, so that we can associate
each slice of that feature with an interval instead of a value.
The getSliceKey_ function now checks if the feature has bucket to be
aggregated and returns the slice key of the appropriate interval in the
positive case. It uses the value of the feature as key otherwise (just as
before).
`numBuckets` controls in how many buckets the range of values for a
feature should be divided. This number is also used to avoid aggregating
in buckets when we have less unique values than buckets.
When slicing by numeric features aggregated in buckets, displaying
the buckets in order is a particularly useful feature.
Alphabetical ordering is not enough for this purpose: we need to
parse the number in the interval and order it by its value.
Also, let's make this ordering the default one when aggregating.
Copy link
Contributor

@jameswex jameswex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for this! really excited for this feature. some minor comments

numBuckets: {
type: Number,
value: 10,
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add this as a numeric paper-input element below the slice by dropdown (when a numeric feature is selected) with a range of 2-10 and default value of 2? can use a <template is="dom-if...> wrapping the element so it only appears in that case. same thing with another dropdown under the secondary slice if that is numeric too. then ensure the buckets are recalculated whenever numPrimaryBuckets or numSecondaryBuckets changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
Screenshot from 2019-09-19 16-36-07

type: Boolean,
value: false,
},
featureBucketEdges_: Object,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add short comment on what it is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

const feature1 = this.selectedBreakdownFeature;
if (feature1 == '') {
this.selectedSecondBreakdownFeature = '';
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably still want this logic here, which clears out the secondary slice selection when the primary slice selection is cleared. otherwise if you have both slices set, then clear the primary, the view goes back to showing only overall stats, but then when you select a slice again, the secondary slice is already selected as whatever it was last and you don't just see the primary slice you selected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. Thanks for catching this!

}
}
this.visdata.forEach(
function(item) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use arrow notation to simplify this, remove need for .bind(this) :
this.visdata.forEach(item => { blah blah.... return blah;});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! 👍

if (edges) {
for (let i = 1; i < edges.length; i++) {
if (
example[feature] < edges[i] ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this handle when an example is missing the feature value for the selected feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When example[feature] is undefined, this condition is false and we end up returning undefined (after this loop). This results in a key called "undefined" where all these values are grouped.
I think I can make it more explicit and human-friendly, though, by treating it separately.

}.bind(this)
);
const thresholds = Object.values(thresholdsMap);
this.set('selectedFeatureSort', 'Alphabetical');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why reset the sort order when slices change? i can imagine wanting to keep viewing slices by something like false positive rate as i change slicing criteria.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It makes sense.

* Calculate edges between buckets for aggregating numeric features.
* We do this beforehand to round numbers and avoid ugly interval labels.
*/
calculateFeatureBucketEdges_: function() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

* Add explanation to `featureBucketEdges_`.
* Treat unselection of slicing feature.
* Use arrow function to simplify `forEach` using outer `this`.
When slicing by a feature, group missing values in a '?' key.
When slicing by a numeric feature, values can be aggregated to buckets.
We're adding an input field to control in how many buckets the values
will be aggregated. This only appears for numeric features.
Copy link
Contributor Author

@grovina grovina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There you go, James :)

const feature1 = this.selectedBreakdownFeature;
if (feature1 == '') {
this.selectedSecondBreakdownFeature = '';
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. Thanks for catching this!

type: Boolean,
value: false,
},
featureBucketEdges_: Object,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

}
}
this.visdata.forEach(
function(item) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! 👍

}.bind(this)
);
const thresholds = Object.values(thresholdsMap);
this.set('selectedFeatureSort', 'Alphabetical');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It makes sense.

* Calculate edges between buckets for aggregating numeric features.
* We do this beforehand to round numbers and avoid ugly interval labels.
*/
calculateFeatureBucketEdges_: function() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

if (edges) {
for (let i = 1; i < edges.length; i++) {
if (
example[feature] < edges[i] ||
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When example[feature] is undefined, this condition is false and we end up returning undefined (after this loop). This results in a key called "undefined" where all these values are grouped.
I think I can make it more explicit and human-friendly, though, by treating it separately.

numBuckets: {
type: Number,
value: 10,
},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
Screenshot from 2019-09-19 16-36-07

@jameswex
Copy link
Contributor

Thanks. Only issue is that the you don't necessary end up with equally-sized intervals when slicing by a numeric feature. Slice by age with 4 buckets and some buckets are width 10 and some width 20 (and the initial one is width 13 due to the min age of 17. Can the logic be adjusted so they all end up of width 10 or 20? Or we would need to document this in the tool to avoid confusion

Make it slightly more precise than the order of magnitude of the
standard deviation for the values of that feature.
@grovina
Copy link
Contributor Author

grovina commented Sep 19, 2019

Thanks. Only issue is that the you don't necessary end up with equally-sized intervals when slicing by a numeric feature. Slice by age with 4 buckets and some buckets are width 10 and some width 20 (and the initial one is width 13 due to the min age of 17. Can the logic be adjusted so they all end up of width 10 or 20? Or we would need to document this in the tool to avoid confusion

Hey James, this is purely due to rounding the edges of the buckets. Increasing precision by 1 order of magnitude should solve it:
Screenshot from 2019-09-19 17-52-52

Note that the 3rd bucket ([49, 64)) has width 15, and the others 16: that's pretty much as good as it gets for reasonably rounded numbers. What that rounding does is to make sure these oscillations are smaller than the order of magnitude of the std. deviation (actually 1 order below it after that last commit).

@jameswex jameswex requested a review from nfelt September 19, 2019 17:01
@jameswex
Copy link
Contributor

@nfelt please review or assign for review

@jameswex
Copy link
Contributor

@stephanwlee please review or assign for review, thx

@stephanwlee stephanwlee requested review from psybuzz and removed request for stephanwlee September 23, 2019 15:04
@nfelt
Copy link
Contributor

nfelt commented Sep 23, 2019

@jameswex is there a reason this needs an additional review? I think your approval is sufficient to merge this since it only affects WIT code.

@jameswex
Copy link
Contributor

@jameswex is there a reason this needs an additional review? I think your approval is sufficient to merge this since it only affects WIT code.

Historically we've given TB a chance to review our code before we commit to the repo, but I'm ok merging this with just my approval if you are.

@nfelt
Copy link
Contributor

nfelt commented Sep 23, 2019

Fair enough - I appreciate the notice! But I think going forward it's fine to merge after your review for PRs that only affect the WIT plugin codebase. Of course if there are bigger structural changes or cases where you want specific reviewer expertise you're still welcome to request review, but don't feel like you need to be blocked on a TB core team reviewer for changes like this one.

@jameswex jameswex merged commit 8ecff39 into tensorflow:master Sep 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants