Hash tensor data during deduplication by VikParuchuri · Pull Request #932 · huggingface/optimum

Conversation

@VikParuchuri
Contributor

What does this PR do?

When exporting a causal LM model to ONNX, the past-key-values variant and the regular variant of the model are merged. The merge loops through every tensor in the graph and stores the values as dict keys. This requires massive amounts of memory (>100 GB for an 11 GB model) and prevents the merge from completing.

This is due to `tensor_data` in `_find_duplicate_initializers` being converted from bytes to a tuple. I assume there is a good reason to convert it to a tuple (maybe for comparison across dtypes).

As such, this PR doesn't remove that conversion; instead, it converts the tuple to a string and hashes it, using SHA-512 to reduce the probability of a hash collision. Deduplication then needs very little memory.
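
A minimal sketch of the idea (illustrative names, not the exact optimum code), assuming `tensor_data` is the tuple built in `_find_duplicate_initializers`:

```python
import hashlib

def tensor_key(tensor_data):
    # str() gives a stable text representation of the tuple; SHA-512 then
    # reduces it to a fixed 64-byte dict key, regardless of tensor size.
    return hashlib.sha512(str(tensor_data).encode("utf-8")).digest()
```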

You could also:

  • Convert to str without hashing, removing the possibility of collisions - this takes more memory than hashing
  • Remove the initial tuple conversion entirely - also more memory than hashing
  • Trade memory for compute by looping through each pair of tensors and comparing them - much slower than hashing

I have done some basic manual testing, and this method seems to work.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@fxmarty
Contributor

fxmarty commented Mar 29, 2023

Thanks a lot for the PR! It's an issue I've had with large models but did not look into; this will surely help!

@fxmarty fxmarty requested review from JingyaHuang and fxmarty March 29, 2023 20:44
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Mar 29, 2023

The documentation is not available anymore as the PR was closed or merged.

@fxmarty
Contributor

fxmarty commented Mar 30, 2023

@VikParuchuri This step, `str(tensor_data).encode("utf-8")`, is quite slow. I wonder if there is anything better we could hash than this? I suspect that for large models this will be prohibitively slow.
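
As a rough illustration of where the time goes (a hypothetical micro-benchmark, not a measurement of the actual export code):

```python
import hashlib
import timeit

import numpy as np

# Hypothetical stand-in for an initializer's data and its tuple conversion.
data = np.random.rand(512, 512).astype(np.float32)
tensor_data = tuple(data.flatten().tolist())

def slow():
    # Formats every element as text before hashing.
    return hashlib.sha512(str(tensor_data).encode("utf-8")).digest()

def fast():
    # Hashes the raw buffer directly, skipping the formatting step.
    return hashlib.sha512(data.tobytes()).digest()

print(timeit.timeit(slow, number=10), timeit.timeit(fast, number=10))
```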

Hashing `tensor_data` directly does not seem to work; I get `object supporting the buffer API required`. This is kind of surprising, as it's of type `<class 'bytes'>`, and what you get after the encode is also bytes. Any idea @michaelbenayoun?

The reason we put that in a tuple is that some initializers have names like `onnx::MatMul_3741` that may differ between the two graphs, so we need to look at the actual data to check whether they are the same.
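
In other words, deduplication has to key on tensor content rather than on those auto-generated names. A hedged sketch of that idea (hypothetical helper, assuming initializers store their payload in `raw_data`; this is not the actual `_find_duplicate_initializers` implementation):

```python
from collections import defaultdict

def find_duplicates(initializers):
    """Group ONNX TensorProto initializers that hold identical data."""
    groups = defaultdict(list)
    for init in initializers:
        # Key on dtype, shape, and raw bytes; the name is deliberately
        # ignored since names like onnx::MatMul_3741 differ across graphs.
        key = (init.data_type, tuple(init.dims), init.raw_data)
        groups[key].append(init.name)
    return [names for names in groups.values() if len(names) > 1]
```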

@fxmarty fxmarty requested a review from michaelbenayoun March 30, 2023 07:36
@michaelbenayoun
Member

Hi @VikParuchuri,

That's great thanks!

> This is due to `tensor_data` in `_find_duplicate_initializers` being converted from bytes to a tuple. I assume there is a good reason to convert it to a tuple (maybe for comparison across dtypes).

I actually do not remember why `tensor_data` was made a tuple. Have you tried removing this? I tried, and it seems to work, at least for the one model I tested.

@fxmarty
Contributor

fxmarty commented Mar 30, 2023

For reference: https://github.com/huggingface/optimum/pull/587/files#r1152974762. I think it's safe to keep it as a tuple and use `to_array` for now.
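
A hedged sketch of that direction, combining `onnx.numpy_helper.to_array` with hashing of the raw buffer (the helper name here is illustrative):

```python
import hashlib

from onnx import numpy_helper

def initializer_digest(initializer):
    # to_array materializes the TensorProto whatever its storage format
    # (raw_data, float_data, ...), so the digest depends only on the values.
    array = numpy_helper.to_array(initializer)
    # tobytes() exposes a contiguous buffer that hashlib accepts directly.
    return hashlib.sha512(array.tobytes()).digest()
```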

@michaelbenayoun
Member

@fxmarty I am not talking about the tuple in the dictionary, but about `tensor_data`.

@VikParuchuri
Contributor Author

@fxmarty Your method seems to work! I updated the PR.

@fxmarty
Contributor

fxmarty commented Mar 30, 2023

@VikParuchuri Great! Could you do `pip install -e .[quality]` and `make style`?

@VikParuchuri
Contributor Author

Done

Contributor

@fxmarty fxmarty left a comment


LGTM, thank you for the fix!

@fxmarty fxmarty merged commit bcfe24e into huggingface:main Mar 30, 2023