split single_gpu and multi_gpu #17083
Conversation
The documentation is not available anymore as the PR was closed or merged.
Thanks for your PR! Did you launch it as a trial to see if it works? I see the following list on line 303 would also need to be updated:

```yaml
[setup, run_tests_gpu, run_examples_gpu, run_pipelines_tf_gpu, run_pipelines_torch_gpu, run_all_tests_torch_cuda_extensions_gpu]
```
You are right, that line should be changed. I haven't launched it (just tried with a dummy example). I will launch it now.
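For concreteness, here is a minimal sketch of what the updated list could look like once the GPU tests are split. The job names `run_tests_single_gpu` and `run_tests_multi_gpu` are illustrative assumptions, not names taken from the diff; the squashed commit message below only says `needs` was updated in `send_result`:

```yaml
# Illustrative sketch only, assuming run_tests_gpu was split into
# run_tests_single_gpu and run_tests_multi_gpu: the result-reporting job
# must then depend on both split jobs instead of the old single job.
send_result:
  needs: [setup, run_tests_single_gpu, run_tests_multi_gpu, run_examples_gpu, run_pipelines_tf_gpu, run_pipelines_torch_gpu, run_all_tests_torch_cuda_extensions_gpu]
```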
You can launch it with only 1-2 models in each run, for example by shrinking the model list used in the matrix (see the sketch below). This way you'll test the full behavior without having 12-hour-long iterations.
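A hedged sketch of that trick, assuming the job's matrix of model folders normally comes from the `setup` job's output (the folder names below are placeholders):

```yaml
strategy:
  matrix:
    # Full scheduled run: take the matrix generated by the setup job.
    # folders: ${{ fromJson(needs.setup.outputs.matrix) }}
    # Quick trial: hardcode 1-2 model folders so each iteration stays short.
    folders: ["models/bert", "models/gpt2"]
```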
It took some time, but the run looks good: https://github.com/huggingface/transformers/actions/runs/2276209307
Force-pushed from a9c25d1 to 493b384.
Looks good, thanks @ydshieh!
* split single_gpu and multi_gpu
* update needs in send_result

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
What does this PR do?
Fix the scheduled CI issue caused by the 256-job limit that GitHub Actions places on jobs generated from a matrix.
Note that the graph on the workflow run page no longer shows single-gpu and multi-gpu nodes, but the job names in the list on the left side do include the matrix values.
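For background, GitHub Actions caps a single matrix at 256 generated jobs, so one job whose matrix crosses machine types with every model folder can overflow that limit. A rough sketch of the splitting idea, with job names and structure as assumptions rather than the exact diff:

```yaml
# Before (illustrative): one matrix crossing machine types with all model
# folders can generate more than the 256 jobs a single matrix allows.
# run_tests_gpu:
#   strategy:
#     matrix:
#       machine_type: [single-gpu, multi-gpu]
#       folders: ${{ fromJson(needs.setup.outputs.matrix) }}

# After (illustrative): one job per machine type, so each matrix holds only
# the folders list and stays under the cap.
run_tests_single_gpu:
  needs: setup
  runs-on: [self-hosted, single-gpu]
  strategy:
    fail-fast: false
    matrix:
      folders: ${{ fromJson(needs.setup.outputs.matrix) }}
  # steps omitted

run_tests_multi_gpu:
  needs: setup
  runs-on: [self-hosted, multi-gpu]
  strategy:
    fail-fast: false
    matrix:
      folders: ${{ fromJson(needs.setup.outputs.matrix) }}
  # steps omitted
```

With this split, each job's matrix scales only with the number of model folders, so adding models no longer multiplies against the machine-type axis.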