Fix PT TF ViTMAE by ydshieh · Pull Request #16766 · huggingface/transformers · GitHub

@ydshieh
Collaborator

@ydshieh ydshieh commented Apr 13, 2022

What does this PR do?

Fix PT/TF ViTMAE: apply some settings in both the PT and TF models (instead of in only one of them). Otherwise, the PT/TF equivalence tests for these models won't use something like std = 0.02 and will get larger (init) weights, which leads to a larger diff in outputs.
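
To illustrate why larger init weights give a larger output diff, here is a minimal pure-Python sketch (not code from this PR). It assumes a single dense layer and a tiny input perturbation standing in for framework-level numerical noise; the helper names, the dimension, the noise level, and the comparison std of 1.0 are all hypothetical choices for illustration.

```python
import random


def linear(weights, x):
    """Plain matrix-vector product, standing in for one dense layer."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def max_output_diff(std, dim=16, noise=1e-6, seed=0):
    """Max absolute output difference caused by a tiny input discrepancy."""
    rng = random.Random(seed)
    # Weights drawn from N(0, std^2), mimicking a normal initializer.
    weights = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(dim)]
    x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    # Small perturbation standing in for numerical noise between two
    # implementations (an assumption of this sketch, not measured data).
    x_noisy = [v + noise * rng.uniform(-1.0, 1.0) for v in x]
    out_a = linear(weights, x)
    out_b = linear(weights, x_noisy)
    return max(abs(a - b) for a, b in zip(out_a, out_b))


diff_small = max_output_diff(std=0.02)  # std mentioned in the description
diff_large = max_output_diff(std=1.0)   # arbitrary larger std for contrast
print(diff_small, diff_large)
```

Because the layer is linear, the output diff scales with the weight std, so the same tiny discrepancy is amplified more when the init weights are larger.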

Also, the eps for the layer norm layers should be the same in PT and TF.
(Not a big deal in practice, since the values here are 1e-5 vs. 1e-12, but it also affects the tests.)
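
A rough sketch of how the eps mismatch can show up, assuming a textbook layer norm without learned scale or bias (this is not the models' code, and the input values are made up). The effect is visible mainly when the input variance is small enough to be comparable to the larger eps.

```python
import math


def layer_norm(x, eps):
    """Textbook layer norm over one vector, without learned scale/bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def max_eps_diff(x):
    """Max absolute difference between eps=1e-5 and eps=1e-12 outputs."""
    out_a = layer_norm(x, eps=1e-5)
    out_b = layer_norm(x, eps=1e-12)
    return max(abs(a - b) for a, b in zip(out_a, out_b))


# Hypothetical inputs: same shape, different scale.
small_scale_diff = max_eps_diff([0.01, -0.02, 0.03, 0.00])
large_scale_diff = max_eps_diff([1.0, -2.0, 3.0, 0.0])
print(small_scale_diff, large_scale_diff)
```

With the larger-scale input the two eps values give nearly identical outputs, which matches the remark that this is not a big deal in practice; with the small-scale input the gap is large enough to matter for a tight equivalence check.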

@ydshieh ydshieh marked this pull request as ready for review April 13, 2022 19:32
@ydshieh ydshieh requested review from NielsRogge, gante and sgugger and removed request for sgugger April 13, 2022 19:51
Collaborator

@sgugger sgugger left a comment

Thanks for fixing!

Member

@gante gante left a comment

👍

@stas00
Contributor

stas00 commented Apr 15, 2022

Let's merge since CI fails quite a lot on this one

@ydshieh
Collaborator Author

ydshieh commented Apr 15, 2022

> Let's merge since CI fails quite a lot on this one

Ok!

@ydshieh ydshieh merged commit ee209d4 into huggingface:main Apr 15, 2022
@ydshieh ydshieh deleted the fix_pt_tf_vit_mae branch April 15, 2022 04:37
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>