Generation: get special tokens from model config #30899

zucchini-nlp · 2024-05-19T13:18:57Z

What does this PR do?

Fixes #30892. The error described in the linked issue was caused by passing a GenerationConfig object instead of kwargs. If a user passes a GenerationConfig, we do not update it with the model's generation config, and thus cannot get the special tokens used by the model. There were two options:

1. Update the user's GenerationConfig with the model's generation config in _prepare_generation_config
2. Try to get special tokens from the passed config and, if they are not set there, fall back to self.config

I think the first option can interfere with what the user wants when generating, for example by adding do_sample=True from the model's generation config, so I went with option 2.
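
A toy sketch of the trade-off between the two options, using plain dicts in place of GenerationConfig objects. The helper names, the SPECIAL_TOKEN_NAMES tuple, and the assumption that user-set values win during a merge are all illustrative; this is not the transformers implementation.

```python
# Toy illustration only: plain dicts stand in for GenerationConfig objects.
# All names below are hypothetical and not part of transformers.
SPECIAL_TOKEN_NAMES = ("bos_token_id", "eos_token_id", "pad_token_id", "decoder_start_token_id")


def merge_model_defaults(user_cfg, model_cfg):
    """Option 1: fill every field the user left unset from the model's defaults."""
    merged = dict(model_cfg)
    merged.update({k: v for k, v in user_cfg.items() if v is not None})
    return merged


def fill_special_tokens_only(user_cfg, model_cfg):
    """Option 2: fall back to the model's values for special tokens and nothing else."""
    resolved = dict(user_cfg)
    for name in SPECIAL_TOKEN_NAMES:
        if resolved.get(name) is None and model_cfg.get(name) is not None:
            resolved[name] = model_cfg[name]
    return resolved


model_cfg = {"do_sample": True, "decoder_start_token_id": 0, "pad_token_id": 0}
user_cfg = {"max_new_tokens": 5}

option_1 = merge_model_defaults(user_cfg, model_cfg)
option_2 = fill_special_tokens_only(user_cfg, model_cfg)
```

In this sketch, option_1 picks up do_sample=True even though the user never asked for sampling, while option_2 only gains the special token ids.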

HuggingFaceDocBuilderDev · 2024-05-19T13:37:57Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

Are you sure this is the only argument that we need to get from the model's generation config for BC?

gante

I see what went wrong in my previous PR 😭 Thank you for taking this issue @zucchini-nlp!

For full BC, we need further changes. The removed _get_decoder_start_token_id function used bos_token_id as a fallback, and bos_token_id could be loaded from the config (like decoder_start_token_id). Furthermore, this logic was only used when self.config.is_encoder_decoder is True.

So the needed changes for BC are:

1. First, check whether self.config.is_encoder_decoder is True.
2. If 1. is true, get BOTH decoder_start_token_id and bos_token_id, with a fallback to self.generation_config.
3. If 1. is false, don't use a fallback to self.generation_config.
4. The line decoder_start_token_id = decoder_start_token_id if decoder_start_token_id is not None else bos_token_id can stay as it is :)
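
The steps above could be sketched roughly as follows. ToyGenerationConfig and resolve_decoder_start_token_id are hypothetical stand-ins written for illustration, not the code that was merged; the only line taken from the discussion is the final fallback from decoder_start_token_id to bos_token_id.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyGenerationConfig:
    # Hypothetical stand-in for GenerationConfig; only the two fields discussed here.
    decoder_start_token_id: Optional[int] = None
    bos_token_id: Optional[int] = None


def resolve_decoder_start_token_id(
    user_config: ToyGenerationConfig,
    model_generation_config: ToyGenerationConfig,
    is_encoder_decoder: bool,
) -> Optional[int]:
    """Resolve the decoder start token following the steps listed in the review."""
    decoder_start_token_id = user_config.decoder_start_token_id
    bos_token_id = user_config.bos_token_id

    # Only encoder-decoder models fall back to the model's own generation config.
    if is_encoder_decoder:
        if decoder_start_token_id is None:
            decoder_start_token_id = model_generation_config.decoder_start_token_id
        if bos_token_id is None:
            bos_token_id = model_generation_config.bos_token_id

    # Final fallback, unchanged from the line quoted in the review.
    return decoder_start_token_id if decoder_start_token_id is not None else bos_token_id
```

With an empty user config and model defaults of decoder_start_token_id=0, the encoder-decoder path resolves to 0, while the decoder-only path returns None because no fallback is applied.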

gante · 2024-05-20T12:17:09Z

> Are you sure this is the only argument that we need to get from the model's generation config for BC?

@ArthurZucker see my comment above :D

zucchini-nlp · 2024-05-20T12:54:59Z

@ArthurZucker @gante done. I brought back _get_decoder_start_token_id from the previous version and checked that it works both when the user passes decoder_start_token_id and when the user does not pass anything.

gante

Thank you for iterating on the fix 💛

(for a perfect PR, only a test is missing, to ensure we don't repeat the mistake)

ArthurZucker

Approving and merging to patch transformers today

ArthurZucker · 2024-05-22T16:15:01Z

src/transformers/generation/utils.py

+        elif bos_token_id is not None:
+            return bos_token_id
+        else:
+            return


Suggested change

-            return
+            return None

explicit but I guess it's the same
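
A minimal check of the point made in this comment: in Python, a bare return and return None produce the same value, so the suggestion only affects readability. The two function names are hypothetical.

```python
def bare_return():
    # A bare `return` yields None implicitly.
    return


def explicit_return():
    # Same result, stated explicitly.
    return None


same_result = bare_return() is explicit_return()
```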

* fix * let's do this way? * codestyle * update * add tests

fix

8c09fb6

zucchini-nlp requested a review from gante May 19, 2024 13:18

younesbelkada mentioned this pull request May 20, 2024

transformers 4.41.0 breaks generate() for T5 #30892

Closed

4 tasks

zucchini-nlp added 2 commits May 20, 2024 11:08

let's do this way?

dd36f2d

codestyle

35ce59f

ArthurZucker reviewed May 20, 2024


gante reviewed May 20, 2024


update

f69e76f

gante approved these changes May 20, 2024


add tests

35a5946

ArthurZucker approved these changes May 22, 2024


ArthurZucker reviewed May 22, 2024


ArthurZucker merged commit b1065aa into huggingface:main May 22, 2024
21 checks passed

ArthurZucker pushed a commit that referenced this pull request May 22, 2024

Generation: get special tokens from model config (#30899)

9d05459


itazap pushed a commit that referenced this pull request May 24, 2024

Generation: get special tokens from model config (#30899)

96d211c

