documentation: some minor clean up by mingboiz · Pull Request #16850 · huggingface/transformers

@mingboiz
Contributor

What does this PR do?

This PR carries over some minor documentation changes from PR #15529 that were unrelated to debertav2, so I'm opening it separately just to keep the repo history clean. Please let me know if this is alright 😄

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Apr 20, 2022

The documentation is not available anymore as the PR was closed or merged.

Member

@LysandreJik LysandreJik left a comment

LGTM, thank you @mingboiz!

src/transformers/models/wav2vec2/tokenization_wav2vec2.py
src/transformers/models/wav2vec2_with_lm/processing_wav2vec2_with_lm.py
src/transformers/models/wavlm/modeling_wavlm.py
src/transformers/models/ctrl/modeling_ctrl.py
Member

Is this an intended change?

Contributor Author

@mingboiz mingboiz Apr 22, 2022

Yes. During the final merge, this line was removed from utils/documentation_tests.txt.

I wasn't able to track down exactly which commit in the linked PR caused this change, but I guess it happened because the line was originally added during one of the several rebases on main needed to pass CI (the PR was started on 4.17.dev0 😅), so it showed up as an additional change unrelated to debertav2.

So I removed that line in the final commit of that PR and am now restoring it here for parity!

Member

Indeed!

- index of the token comprising a given character or the span of characters corresponding to a given token). Currently
- no "Fast" implementation is available for the SentencePiece-based tokenizers (for T5, ALBERT, CamemBERT, XLM-RoBERTa
- and XLNet models).
+ index of the token comprising a given character or the span of characters corresponding to a given token).
Contributor Author

This was also a change unrelated to debertav2 in the linked PR. "Fast" implementations are now available for the tokenizers of the mentioned models, so I'm making the change here!

@LysandreJik
Member

Thank you!

@LysandreJik LysandreJik merged commit 10dfa12 into huggingface:main Apr 26, 2022
chamidullinr pushed a commit to chamidullinr/transformers that referenced this pull request Apr 28, 2022
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022