Fixup no_trainer save logic by muellerzr · Pull Request #16968 · huggingface/transformers · GitHub

Conversation

@muellerzr

Fix save logic in all no_trainer examples

What does this add?

This PR fixes a bug pointed out in huggingface/accelerate#322, where the save and load logic incorrectly skipped over steps in the training loop.

It also changes the internals slightly so that saved checkpoints are named correctly (previously the names always started at epoch_0, even if we resumed from epoch 1).
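As a rough illustration of the behavior described above (not the actual diff in this PR), here is a minimal, framework-free sketch of resuming a training loop. It assumes checkpoint folders are named `epoch_{n}` (saved at the end of epoch `n`) or `step_{k}` (saved after `k` total steps); `parse_checkpoint_name` and `train` are hypothetical helpers written only for this example.

```python
def parse_checkpoint_name(name, steps_per_epoch):
    """Return (starting_epoch, resume_step) for a checkpoint folder name.

    Assumed naming (hypothetical): "epoch_{n}" is saved at the end of epoch n,
    "step_{k}" is saved after k total optimizer steps.
    """
    kind, _, number = name.partition("_")
    number = int(number)
    if kind == "epoch":
        # Epoch n is complete, so training continues at the start of epoch n + 1.
        return number + 1, 0
    if kind == "step":
        # Split the global step count into finished epochs and an in-epoch offset.
        return number // steps_per_epoch, number % steps_per_epoch
    raise ValueError(f"Unrecognized checkpoint name: {name!r}")


def train(num_epochs, steps_per_epoch, resume_from=None):
    """Simulate the loop; return the (epoch, step) pairs run and checkpoint names."""
    starting_epoch, resume_step = 0, 0
    if resume_from is not None:
        starting_epoch, resume_step = parse_checkpoint_name(resume_from, steps_per_epoch)

    processed, saved = [], []
    for epoch in range(starting_epoch, num_epochs):
        for step in range(steps_per_epoch):
            # Only the first resumed epoch has batches to skip.
            if epoch == starting_epoch and step < resume_step:
                continue
            processed.append((epoch, step))
        # Name the checkpoint after the real epoch index, not a counter
        # that restarts at 0 after resuming.
        saved.append(f"epoch_{epoch}")
    return processed, saved
```

With this sketch, `train(3, 4, resume_from="epoch_0")` starts at epoch 1 and writes `epoch_1` and `epoch_2` rather than restarting the names at `epoch_0`.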

@muellerzr added the PyTorch and Examples labels Apr 27, 2022
@muellerzr requested review from sgugger and removed request for sgugger April 27, 2022 15:52
@muellerzr marked this pull request as draft April 27, 2022 15:54
@muellerzr marked this pull request as ready for review April 27, 2022 18:17
@muellerzr requested a review from sgugger April 27, 2022 18:20
@sgugger left a comment

Thanks for fixing all of those!

@muellerzr merged commit 60e1d88 into main Apr 27, 2022
@muellerzr deleted the muellerzr-bugfix-examples branch April 27, 2022 18:46
chamidullinr pushed a commit to chamidullinr/transformers that referenced this pull request Apr 28, 2022
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022