AOTI Minifier by yushangdi · Pull Request #139351 · pytorch/pytorch

Conversation

@yushangdi (Contributor) commented Oct 31, 2024

@pytorch-bot bot commented Oct 31, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/139351

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 2787ffb with merge base b8cf324:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@yushangdi yushangdi marked this pull request as ready for review October 31, 2024 22:24
@yushangdi yushangdi requested a review from desertfire October 31, 2024 22:24
@yushangdi yushangdi changed the title Aoti package minifier AOTI Minifier Oct 31, 2024
# GPU Hardware Info:
# NVIDIA PG509-210 : 8
exported_program = torch.export.load('/data/users/shangdiy/pytorch/torch_compile_debug/run_2024_10_31_16_48_02_720863-pid_3598491/minifier/checkpoints/exported_program.pt2')
@desertfire (Contributor) commented:

I expect the repro file to contain trimmed model code that runs the full export-compile-run flow and then reproduces the problem. Loading another .pt2 file is not intuitive here.
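
For reference, a minimal sketch of the self-contained flow described here, assuming a toy module and a recent torch where aoti_compile_and_package accepts just the ExportedProgram (all names illustrative):

import torch
import torch._inductor

class Repro(torch.nn.Module):
    def forward(self, x):
        return x.relu() + 1

args = (torch.randn(4, 4),)

# export -> AOTI compile -> load -> run, all in one script
ep = torch.export.export(Repro(), args)
package_path = torch._inductor.aoti_compile_and_package(ep)
compiled = torch._inductor.aoti_load_package(package_path)
result = compiled(*args)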

@yushangdi (Contributor, Author) replied:

> I expect the repro file to contain trimmed model code that runs the full export-compile-run flow and then reproduces the problem. Loading another .pt2 file is not intuitive here.

@desertfire The .pt2 file is a trimmed model. I'm not sure how we can just include the model code instead of using an exported_program (the existing torch.compile minifier does this, but it's buggy). How can we convert the model code into a string and preserve the model safely?

If the trimmed model is as simple as a few nodes, that's possible. But if the trimmed model is more complicated, e.g. contains submodules or parameters, then converting it into a string and loading the inputs/state_dict is no longer trivial.

Another point is that the result of the minifier is a GraphModule object. It seems a bit roundabout to first convert the GraphModule into some nn.Module code and then export it back to a GraphModule.

@desertfire (Contributor) replied:

When we use TORCH_COMPILE_DEBUG=1, it will generate fx_graph_runnable.py. Maybe we can do something similar here?

with self.fopen("fx_graph_runnable.py") as fd:
    save_dir = None
    if torch._inductor.config.trace.save_real_tensors:
        inputs = torch._subclasses.fake_utils.try_convert_fake_to_real(inputs)
        save_dir = os.path.dirname(fd.name)

    # don't try to use stable hash torchinductor compilation if saving real
    # tensors, and avoid recursively trying to save real tensors inside of
    # the inductor compilation regardless
    stable_hash = torch._inductor.config.trace.save_real_tensors
    with torch._inductor.config.patch(
        {"trace.enabled": False, "trace.save_real_tensors": False}
    ):
        save_graph_repro(
            fd,
            gm,
            inputs,
            "inductor",
            save_dir=save_dir,
            stable_hash=stable_hash,
        )

@yushangdi (Contributor, Author) replied Nov 5, 2024:

> When we use TORCH_COMPILE_DEBUG=1, it will generate fx_graph_runnable.py. Maybe we can do something similar here?


yeah, we can do this, but I think this is not as robust as storing an exported program. This uses torch._dynamo.repro.after_aot.save_graph_repro under the hood, which only works for flattened graphs. It doesn't work out-of-the-box if there are submodules in the graph.

But maybe this is fine since AOTI runs on ep.module(), which is already flattened, so the repro is always a flattened graph?
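
For context, a small sketch of the flattening point made above (toy module, not from the PR):

import torch

class Outer(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.inner = torch.nn.Linear(4, 4)  # nested submodule

    def forward(self, x):
        return self.inner(x).relu()

ep = torch.export.export(Outer(), (torch.randn(2, 4),))
# ep.module() returns a single flat GraphModule: the call into self.inner
# is inlined into aten ops, so the minifier never sees nested submodules.
gm = ep.module()
print(gm.graph)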

@desertfire (Contributor) replied:

> so the repro is always a flattened graph?

I think so.

My main concern with generating .pt2 is OSS users' ability to see the generated code. Some OSS users don't want to share their whole model code, but if they see the generated code is small enough, they are more willing to share it as a repro. With .pt2 they can, in theory, unzip and examine the minimized code, but that extra step discourages people from sharing.

For that reason, I think maybe we can just print the minimized exported graph as a string comment in this file, so users can immediately see what is contained in the graph.
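
A minimal sketch of that idea, assuming ep is the minified ExportedProgram (the helper name is illustrative, not the PR's actual code):

import torch

def graph_as_comment(ep: torch.export.ExportedProgram) -> str:
    # str(ep) renders the exported program, including its FX graph and
    # signature; prefixing every line with '#' turns it into a comment
    # block that can sit at the top of the generated repro file.
    return "\n".join(f"# {line}" for line in str(ep).splitlines())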

@yushangdi (Contributor, Author) replied:

> > so the repro is always a flattened graph?
>
> I think so.
>
> My main concern with generating .pt2 is OSS users' ability to see the generated code. Some OSS users don't want to share their whole model code, but if they see the generated code is small enough, they are more willing to share it as a repro. With .pt2 they can, in theory, unzip and examine the minimized code, but that extra step discourages people from sharing.
>
> For that reason, I think maybe we can just print the minimized exported graph as a string comment in this file, so users can immediately see what is contained in the graph.

I modified the PR to return the exported graph as a string now. The doc is also updated.


if load_and_run:
    compiled_model = aoti_load_package(package_path)
    aoti_result = compiled_model(*args)
@desertfire (Contributor) commented:

We could compare with the eager result here and thus enable an accuracy minifier. OK to do it in a follow-up PR.

@yushangdi (Contributor, Author) replied:

> We could compare with the eager result here and thus enable an accuracy minifier. OK to do it in a follow-up PR.

Yeah, I can do it in a follow-up.
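
A hedged sketch of the comparison proposed above, assuming model is the original eager module, args its sample inputs, and package_path the compiled package (not the PR's actual implementation):

import torch
from torch._inductor import aoti_load_package

compiled_model = aoti_load_package(package_path)
aoti_result = compiled_model(*args)

# Run the same inputs eagerly and flag divergence beyond default tolerances;
# a failure here is what an accuracy minifier would bisect on.
eager_result = model(*args)
torch.testing.assert_close(aoti_result, eager_result)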

verbose_progress = False

# dump an aoti minifier if program errors
dump_aoti_minifier: bool = os.environ.get("DUMP_AOTI_MINIFIER", "0") == "1"
@desertfire (Contributor) commented:

I would recommend moving this under aot_inductor.
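
For reference, a hedged usage sketch of the flag as defined in the snippet above; since the config line evaluates the environment variable at import time, it must be set before torch is imported:

import os

# Must be set before importing torch: the config above evaluates
# os.environ.get("DUMP_AOTI_MINIFIER", "0") when the module is imported.
os.environ["DUMP_AOTI_MINIFIER"] = "1"

import torch
# ... run the failing export / AOTI-compile flow; on error, the minifier
# dump is triggered.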

@yushangdi (Contributor, Author) commented:

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 7, 2024
@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@henrylhtsang (Contributor) commented:

Thanks for the work. Not sure how much work that would be, but is it possible to print some metadata for the input arguments, so users can try to reproduce them without needing to load local files?

e.g. for tensors, print the shape and dtype; for Python constants, print everything.

@yushangdi (Contributor, Author) replied:

> Thanks for the work. Not sure how much work that would be, but is it possible to print some metadata for the input arguments, so users can try to reproduce them without needing to load local files?
>
> e.g. for tensors, print the shape and dtype; for Python constants, print everything.

Thanks for the suggestion! Yeah I can add that, it shouldn't be too hard.
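
A minimal sketch of the metadata dump suggested here, assuming args are the repro inputs (the helper name is illustrative, not what the PR ended up with):

import torch

def describe_input(arg: object) -> str:
    if isinstance(arg, torch.Tensor):
        # For tensors, shape/dtype/device is enough to regenerate a
        # stand-in input (e.g. with torch.randn) without local files.
        return f"Tensor(shape={tuple(arg.shape)}, dtype={arg.dtype}, device={arg.device})"
    # Python constants are small enough to print verbatim.
    return repr(arg)

# Emitted as comments at the top of the repro file:
for i, arg in enumerate(args):
    print(f"# arg {i}: {describe_input(arg)}")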

pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
See documentation at https://docs-preview.pytorch.org/pytorch/pytorch/139351/torch.compiler_aot_inductor_minifier.html.

Add a minifier for AOTI.

Test Plan:
python test/inductor/test_minifier.py

Pull Request resolved: pytorch#139351
Approved by: https://github.com/desertfire
@github-actions github-actions bot deleted the aoti_package_minifier branch December 9, 2024 02:14