Add Doc Test GPT-J by ArEnSc · Pull Request #16507 · huggingface/transformers · GitHub

Conversation

@ArEnSc
Contributor

@ArEnSc ArEnSc commented Mar 31, 2022

What does this PR do?

Fixes the broken doc tests for GPT-J
Part of the documentation sprint work.

Before submitting

Who can review?

@ydshieh
@sgugger

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Mar 31, 2022

The documentation is not available anymore as the PR was closed or merged.

@ydshieh
Collaborator

ydshieh commented Mar 31, 2022

@ArEnSc

Thank you for working on GPT-J.
Maybe you can just put an empty string as the expected value for now. Once the other parts are done,
please ping me and I will try to get the expected values and put them in :)
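The placeholder workflow described here follows how Python's `doctest` module validates docstring examples: the line after each `>>>` call is the expected output, and a blank or wrong value simply fails until it is filled in with the real result. A minimal stdlib sketch of the mechanism (the `shift_right` function is hypothetical, not from this PR):

```python
import doctest

def shift_right(x, n):
    """Shift ``x`` right by ``n`` bits.

    The line after each ``>>>`` call is the expected output that
    doctest compares against what the call actually produces.

    >>> shift_right(8, 2)
    2
    """
    return x >> n

# Collect and run the docstring examples; ``failures`` stays at 0 as
# long as each example's actual output matches its expected line.
runner = doctest.DocTestRunner(verbose=False)
for test in doctest.DocTestFinder().find(shift_right):
    runner.run(test)
```

transformers' doc tests are built on the same mechanism, so leaving the expected value empty keeps the example in place while marking it as not yet verified.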

@ArEnSc
Contributor Author

ArEnSc commented Apr 1, 2022

@ArEnSc

Thank you for working on GPT-J. Maybe you can just put an empty string as the expected value for now. Once the other parts are done, please ping me and I will try to get the expected values and put them in :)

Yep, please do and let me know, and we can close this =)

@ydshieh
Collaborator

ydshieh commented Apr 5, 2022

Hi, @ArEnSc

After discussing with the team, we found that there are no real checkpoints for a fine-tuned GPT-J model on downstream tasks. (There is one for text sequence classification, but it is for the Korean language.)

If you still want to work on this GPT-J doctest, the best we can do for now is to use a tiny model (created for testing purposes):

https://huggingface.co/hf-internal-testing/tiny-random-gptj

With this one, there won't be any OOM issues. Let me know if you want to continue, and if so, don't hesitate to reach out if you have any issues using this tiny model checkpoint.

Thanks!

@ArEnSc
Contributor Author

ArEnSc commented Apr 5, 2022

Hi, @ArEnSc

After discussing with the team, we found that there are no real checkpoints for a fine-tuned GPT-J model on downstream tasks. (There is one for text sequence classification, but it is for the Korean language.)

If you still want to work on this GPT-J doctest, the best we can do for now is to use a tiny model (created for testing purposes):

https://huggingface.co/hf-internal-testing/tiny-random-gptj

With this one, there won't be any OOM issues. Let me know if you want to continue, and if so, don't hesitate to reach out if you have any issues using this tiny model checkpoint.

Thanks!

Will do! I'll do this after work today.

@ydshieh
Collaborator

ydshieh commented Apr 8, 2022

Hi, @ArEnSc

Thank you for the effort.

I need to investigate first why it is non-deterministic, and see if there is a way to fix it.
We strongly prefer not to disable doctests for parts of a model.
(Otherwise, our team needs to discuss it to justify the decision.)

I will take a look at this soon!

@ydshieh
Collaborator

ydshieh commented Apr 8, 2022

Hi, @ArEnSc

I uploaded the checkpoints

"ydshieh/tiny-random-gptj-for-sequence-classification"
"ydshieh/tiny-random-gptj-for-question-answering"

I tested with tiny-random-gptj-for-question-answering and the results are now deterministic. Could you please use them, and let me know if they work well :-)
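The fix here, saved tiny checkpoints instead of randomly initialized weights, is what makes the doctest outputs reproducible: any value that comes from an unseeded random initialization changes between runs and can never match a hard-coded expected string. A stdlib sketch of the principle (the `init_weights` helper is hypothetical, not the transformers API):

```python
import random

def init_weights(seed=None):
    """Return three pseudo-random 'weights'; a fixed seed pins them down."""
    rng = random.Random(seed)
    return [round(rng.random(), 4) for _ in range(3)]

# A fixed seed (the analogue of loading a saved tiny checkpoint) makes
# two runs identical, so a hard-coded expected output can match; an
# unseeded initialization (seed=None) would differ on every run.
a = init_weights(seed=0)
b = init_weights(seed=0)
assert a == b
```

The same reasoning explains why the earlier randomly initialized tiny model produced flaky doctest results.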

@ArEnSc
Contributor Author

ArEnSc commented Apr 11, 2022

@ydshieh I think we should be good to go here; let me know =)
Edit: (# limitations under the License..) <-- I'll fix this in a bit. Had to trigger CI somehow.

Collaborator

@ydshieh ydshieh left a comment

Ran locally; tests pass.
LGTM. Let's wait for @patil-suraj's final review :-) before merging.

if self.args.inference:
    if self.args.memory:
        memory, inference_summary = self.inference_memory(model_name, batch_size, sequence_length)
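For context, the hunk under review only gates memory benchmarking behind two flags. A standalone sketch of that control flow (the `Args` dataclass and the stub measurement below are hypothetical stand-ins for the real benchmark classes):

```python
from dataclasses import dataclass

@dataclass
class Args:
    inference: bool = True
    memory: bool = True

def inference_memory(model_name, batch_size, sequence_length):
    # Stand-in for the real measurement; returns (peak_bytes, summary).
    return 1024, f"{model_name}: bs={batch_size}, seq={sequence_length}"

args = Args()
# The nested ifs in the hunk reduce to a single conjunction: memory is
# only measured when both inference and memory benchmarking are enabled.
if args.inference and args.memory:
    memory, inference_summary = inference_memory("tiny-random-gptj", 1, 8)
```

Since the hunk itself is unchanged by the PR, the reviewers asked that no edits (even whitespace) land in this file.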

Collaborator

No need for this newline :-) Let's not change this file 🙏

Contributor

This should be addressed.

Contributor Author

Strange, I will fix that soon.

@ydshieh ydshieh requested review from patil-suraj and sgugger and removed request for sgugger April 11, 2022 20:12
@ydshieh
Collaborator

ydshieh commented Apr 11, 2022

@ArEnSc Would you mind removing the comments with the (very) long error messages? Just to make this page easier to load :-) Thanks!

@ArEnSc
Contributor Author

ArEnSc commented Apr 12, 2022

@ydshieh looks like it's good to go!

Contributor

@patil-suraj patil-suraj left a comment

Thanks a lot for working on this! Good to merge once the comment on the benchmark_utils.py file is addressed :)

if self.args.inference:
    if self.args.memory:
        memory, inference_summary = self.inference_memory(model_name, batch_size, sequence_length)

Contributor

This should be addressed.

@ArEnSc
Contributor Author

ArEnSc commented Apr 12, 2022

@ydshieh I think this is done after CI passes =)

@ArEnSc
Contributor Author

ArEnSc commented Apr 12, 2022

@ydshieh we are good to merge! =)

@ydshieh
Collaborator

ydshieh commented Apr 13, 2022

@ydshieh we are good to merge! =)

Yes! Merged now.
Thanks a lot for working on this doctest for GPT-J, @ArEnSc 🚀 🎉

@ydshieh ydshieh merged commit 06b4aac into huggingface:main Apr 13, 2022
@ydshieh ydshieh changed the title [WIP] Add Doc Test GPT-J Add Doc Test GPT-J Apr 13, 2022
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* Required the values GPTJ unfortunately cannot run the model =)

* Added the file to the doc tests

* Run Fixup and Style

* Fixed with the test versions of gptj. Ran Style and Fixup.

* Trigger ci

* A Minor Change to License

* Fixed spacing added to the benchmark_utils. Then refactored tests to const variables.

* Removed strings that were included as default parameters anyways.

Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>