In ch06, why does the pretrained model repeat the input? #462

tt7533 started this conversation in General

tt7533

Dec 31, 2024


In ch06, I noticed that calling generate_text_simple with text2 on the pretrained model (i.e., before finetuning) simply repeats the input text2 word for word, over and over.

Is there an explanation for why this happens?

Replies: 1 comment

Hi there, this is normal behavior for several pretrained LLMs that haven't undergone finetuning yet, especially smaller ones. I'm not sure exactly why this happens (it may be an artifact of repetitive structures in the training data, or the LLM may not be good at handling longer contexts).

rasbt
Jan 4, 2025
Maintainer
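
For readers who want to see how a repetition loop can persist once it starts, here is a minimal toy sketch. It is not the ch06 code and not GPT-2: `greedy_generate` and `toy_scores` are hypothetical names invented for illustration, and the "model" is just a hand-written table of next-token scores. It assumes decoding is greedy (always picking the highest-scoring next token), which may be one ingredient of the behavior described in the question; it does not explain why a real pretrained model prefers to echo its input in the first place.

```python
def greedy_generate(next_token_scores, start_tokens, max_new_tokens):
    """Greedy decoding over a toy lookup table (hypothetical stand-in for an LLM).

    next_token_scores maps the last token to a dict of {candidate: score}.
    At each step the highest-scoring candidate is appended, with no sampling.
    """
    tokens = list(start_tokens)
    for _ in range(max_new_tokens):
        scores = next_token_scores[tokens[-1]]
        # Deterministic choice: always take the top-scoring next token.
        next_token = max(scores, key=scores.get)
        tokens.append(next_token)
    return tokens


# Hand-written scores, chosen so the most likely continuation
# of the prompt's last token leads back to the prompt's first token.
toy_scores = {
    "is": {"the": 0.6, "this": 0.4},
    "the": {"text": 0.7, "spam": 0.3},
    "text": {"spam": 0.8, "is": 0.2},
    "spam": {"is": 0.9, "the": 0.1},
}

prompt = ["is", "the", "text", "spam"]
output = greedy_generate(toy_scores, prompt, max_new_tokens=8)
print(" ".join(output))
```

Because every step is deterministic, once the toy model re-enters a state it has already visited, it has no way to leave the cycle. Sampling-based decoding (for example with a temperature) would break the determinism in this toy, though whether it changes the behavior of the actual pretrained model in ch06 is something to check empirically.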