Fixup no_trainer examples scripts and add more tests #16765
Merged
Fixup `no_trainer` Examples and Bolster their tests

What does this add?

This changes the logging behavior inside the `no_trainer` scripts, slightly changes how the initial configuration is stored, and adds tests for the tracking API.

Who is it for?

Users of `transformers` who want to try out `Accelerate` quickly

Why is this needed?

I was made aware that the `no_trainer` scripts were laggy in sending logs to Weights & Biases. This was due to the step being passed in as a parameter, which delayed when the logs got uploaded.

To match the original Accelerate scripts, the step is now passed in as a `"step"` entry in the overall dictionary logged via `accelerate.log()`.

`TensorBoard` also does not like it when `Enum`s are logged, so there is a manual adjustment right before saving the hyperparameters to get the enum value from the LR Scheduler type.

Finally, as `TensorBoard` is a test requirement, I added tests for tracking inside the no_trainer tests, as `TensorBoard` is also how we test that behavior in the CI in Accelerate proper.
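As a rough illustration of the two adjustments described above, here is a minimal sketch using only the standard library. It is not the code from this PR: `SchedulerType`, `sanitize_config`, and `build_log_payload` are hypothetical names, and the metric values are made up. It shows an enum member being replaced by its underlying value before the configuration is stored, and the step being carried inside the logged dictionary rather than passed separately.

```python
from enum import Enum


class SchedulerType(Enum):
    # Hypothetical stand-in for the LR scheduler enum used by the scripts.
    LINEAR = "linear"
    COSINE = "cosine"


def sanitize_config(config: dict) -> dict:
    """Replace Enum members with their underlying values so trackers receive plain types."""
    return {key: (val.value if isinstance(val, Enum) else val) for key, val in config.items()}


def build_log_payload(metrics: dict, completed_steps: int) -> dict:
    """Carry the step inside the logged dictionary instead of as a separate argument."""
    return {**metrics, "step": completed_steps}


# Illustrative values only.
config = sanitize_config({"learning_rate": 5e-5, "lr_scheduler_type": SchedulerType.LINEAR})
payload = build_log_payload({"train_loss": 0.42, "epoch": 1}, completed_steps=100)
```

In a real script, `config` would presumably be what gets handed to the tracker at initialization and `payload` what gets passed to the logging call; those calls are left out here so the sketch stays dependency-free.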