- Notifications
You must be signed in to change notification settings - Fork6.1k
[docs] updateguidance_scale
docstring for guidance_distilled models.#11935
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
HuggingFaceDocBuilderDev commentedJul 15, 2025
The docs for this PR livehere. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@@ -691,7 +691,11 @@ def __call__( | |||
Guidance](https://huggingface.co/papers/2207.12598). `guidance_scale` is defined as `w` of equation 2. | |||
of [Imagen Paper](https://huggingface.co/papers/2205.11487). Guidance scale is enabled by setting | |||
`guidance_scale > 1`. Higher guidance scale encourages to generate images that are closely linked to | |||
the text `prompt`, usually at the expense of lower image quality. | |||
the text `prompt`, usually at the expense of lower image quality. In case of Flux, which is a guidance- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Does "mimic" mean it still produces the same result as true CFG? If the effects are the same (despite the implementation), I'm not too sure the end-user will care or notice.
Maybe it'd be better to make a note of it on the Fluxmodel card?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We don't know the true CFG results I think. Maybe "resembles" is a better phrase?
If the effects are the same (despite the implementation), I'm not too sure the end-user will care or notice.
I think docstrings are important and it can be confusing to users if we're not putting the right phrases here.
Maybe it'd be better to make a note of it on the Fluxmodel card?
Here, I disagree. I think just clarifying it at the docstring level is more than sufficient w.r.t the info already available in the model card (for example, we already mention that Dev is a guidance-distlled model).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Sure, maybe lets go with something like this then?
"Guidance-distilled models don't implement true classifier-free guidance and forguidance_scale > 1
, it only resembles it."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
resembles/mimics is very confusing terminology to me.
doing CFG on a guidance-distilled model with cfg_scale=X and embedded_cfg_scale=Y is effectively/approximately the same as doing CFG on base model with cfg_scale=X*Y based on how the math works out. this can easily be validated by running inference with same seed and making sure product of true and embedded scale is the same value. the results will not be the exact same but will be similar-ish (because distilled model is a noisy approximator of base model outputs)
better not to change explantation imo. if we want to, we can just say it is guidance-distilled, and leave the interested user to google it and find necessary information
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Hmm. I hear both of you. I have had complaints over DMs multiple times regarding this.
How about:
Guidance-distilled models (such as …) don't implement true classifier-free guidance and for guidance_scale > 1, it approximates its effects. Refer tohttps://arxiv.org/abs/2210.03142 for more details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@stevhliu WDYT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
How about:
"Guidance-distilled models approximates true classifier-free guidance forguidance_scale > 1
. Refer to thepaper to learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yes that should be fine.@a-r-r-o-w good with you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
SGTM. It's a bit weird here though thatguidance_scale
actually means the embedded guidance scale, whereas we havetrue_cfg_scale
to actually meanguidance_scale
😞
Maybe clarifying this is very important
@@ -691,7 +691,11 @@ def __call__( | |||
Guidance](https://huggingface.co/papers/2207.12598). `guidance_scale` is defined as `w` of equation 2. | |||
of [Imagen Paper](https://huggingface.co/papers/2205.11487). Guidance scale is enabled by setting | |||
`guidance_scale > 1`. Higher guidance scale encourages to generate images that are closely linked to | |||
the text `prompt`, usually at the expense of lower image quality. | |||
the text `prompt`, usually at the expense of lower image quality. In case of Flux, which is a guidance- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Sure, maybe lets go with something like this then?
"Guidance-distilled models don't implement true classifier-free guidance and forguidance_scale > 1
, it only resembles it."
`guidance_scale > 1` just mimics it. In case of Flux, which is a guidance- distilled model, | ||
`guidance_scale > 1` doesn't implement true classifier-free guidance. Specifying `guidance_scale > 1` | ||
just mimics it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This seems like a duplicate sentence
`guidance_scale > 1`justmimicsit.IncaseofFlux,whichisaguidance-distilledmodel, | |
`guidance_scale > 1`doesn'timplementtrueclassifier-freeguidance.Specifying`guidance_scale > 1` | |
justmimicsit. | |
`guidance_scale > 1`justmimicsit. |
What does this PR do?
Was feeling bored on the training journeys hence decided to open PRs. I am sure I am missing some other pipelines. What are those?