add Beta Schedule #811
Conversation
Green-Sky commented Sep 10, 2025
BETA sigmas for 8 steps: from initial code.
Green-Sky commented Sep 10, 2025 • edited
What I realized is that this scheduler allows for way too many variations that make no real sense. The paper only really uses alpha/beta 0.5, and diffusion practitioners seem to almost always use 0.6 for both. What I also noticed while looking at the functions is that the curve looks almost exactly like a smoothstep/smootherstep (or rather the inverse of it). More:
(Chroma1-HD-Flash-Q4_K_S)
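The smoothstep resemblance can be checked numerically for the alpha = beta = 0.5 case, because Beta(0.5, 0.5) is the arcsine distribution and its quantile has the closed form sin^2(pi * u / 2). The following is a minimal sketch, not the code from this PR; the function names are made up for illustration. On a uniform grid the two curves stay within about 0.01 of each other.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper names; none of these come from denoiser.hpp.

inline double clamp01(double v) {
    return std::min(1.0, std::max(0.0, v));
}

// Classic Hermite smoothstep on [0, 1]: 3x^2 - 2x^3.
inline double smoothstep01(double x) {
    x = clamp01(x);
    return x * x * (3.0 - 2.0 * x);
}

// Closed-form inverse of smoothstep01 on [0, 1].
inline double inverse_smoothstep01(double y) {
    y = clamp01(y);
    return 0.5 - std::sin(std::asin(1.0 - 2.0 * y) / 3.0);
}

// Quantile (inverse CDF) of Beta(0.5, 0.5), the arcsine distribution.
inline double arcsine_quantile(double u) {
    const double pi = std::acos(-1.0);
    const double s = std::sin(0.5 * pi * clamp01(u));
    return s * s;
}

// Largest absolute gap between smoothstep and the arcsine quantile,
// sampled on a uniform grid of n + 1 points.
inline double max_gap(int n) {
    double worst = 0.0;
    for (int i = 0; i <= n; ++i) {
        const double u = static_cast<double>(i) / n;
        worst = std::max(worst, std::fabs(smoothstep01(u) - arcsine_quantile(u)));
    }
    return worst;
}
```

This only covers 0.5/0.5; for 0.6/0.6 there is no such closed form, so the comparison there would need an actual incomplete-beta evaluation.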
phil2sat commented Sep 10, 2025
That's the way to go... no speed loss, no gain, only some more lines.
Green-Sky commented Sep 10, 2025 • edited
Force-pushed: f382a48 to 5635b0e
phil2sat commented Sep 10, 2025
I don't think that makes a huge difference. I've never seen beta configurable or set to anything other than 0.6. I guess 0.5 or 0.55 makes a difference of 3 in 1M pixels.
wbruna commented Sep 10, 2025
Just came up with an alternative, too: wbruna@2050ffe (looks like the same algorithm):
Force-pushed: 5635b0e to d68873f
phil2sat commented Sep 10, 2025
For me that looks exactly different, lol, so like my first fake implementation without boost or simple. Compare it with the current implementation: it's exactly what boost does, but without the boost dependency and at the same speed. You posted two different pics. At first glance they seem the same, but compare it with simple; I guess that's more simple than beta.
phil2sat commented Sep 10, 2025
Maybe it's also time for a simple comeback: https://github.com/user-attachments/files/22256634/simple_beta.tar.gz
wbruna commented Sep 10, 2025
Well... yeah, of course they are. But the difference is in the finishing steps, so that points to a precision issue. If we mindlessly crank up the precision:

```diff
diff --git a/denoiser.hpp b/denoiser.hpp
index d841f03..541bb99 100644
--- a/denoiser.hpp
+++ b/denoiser.hpp
@@ -280,8 +280,8 @@ struct BetaDist {
         double x = u < 0.5 ? u * u : 1.0 - (1.0 - u) * (1.0 - u);
-        const int max_iterations = 50;
-        const double tolerance = 1e-12;
+        const int max_iterations = 1000;
+        const double tolerance = 1e-20;
         for (int i = 0; i < max_iterations; ++i) {
             double err = beta_cdf(x) - u;
@@ -333,8 +333,8 @@ private:
     double incomplete_beta(double a, double b, double x) {
         double f = 1.0, c = 1.0, d = 0.0;
-        const int max_iterations = 200;
-        const double tolerance = 1e-15;
+        const int max_iterations = 1000;
+        const double tolerance = 1e-20;
         for (int i = 0; i <= max_iterations; ++i) {
             int m = i / 2;
```

... we get the same sha256 between images generated by Boost and this implementation. Let me just clarify why I posted it as-is:
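A note on what a tolerance of 1e-20 means for `double`: adjacent doubles near 0.1 are about 1e-17 apart, so a threshold that small can only be met when the tested quantity hits exactly zero, and otherwise the loop runs until `max_iterations`. The sketch below illustrates this under stated assumptions. It is not the solver from denoiser.hpp: it swaps in plain bisection with a bracket-width stopping rule, uses the closed-form Beta(0.5, 0.5) CDF to avoid any incomplete-beta code, and `invert_cdf` is a hypothetical helper name.

```cpp
#include <cmath>

// Closed-form CDF of Beta(0.5, 0.5) (arcsine distribution).
inline double arcsine_cdf(double x) {
    const double pi = std::acos(-1.0);
    return 2.0 / pi * std::asin(std::sqrt(x));
}

struct SolveResult {
    double x;        // midpoint of the final bracket
    int iterations;  // number of bisection steps actually taken
};

// Hypothetical helper: invert the CDF by bisection on [0, 1].
// Stops when the bracket is no wider than `tolerance`,
// or after `max_iterations` steps, whichever comes first.
inline SolveResult invert_cdf(double u, double tolerance, int max_iterations) {
    double lo = 0.0, hi = 1.0;
    int i = 0;
    for (; i < max_iterations && hi - lo > tolerance; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (arcsine_cdf(mid) < u) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return {0.5 * (lo + hi), i};
}
```

With `u = 0.25`, `invert_cdf(u, 1e-12, 1000)` stops after 40 halvings, while `invert_cdf(u, 1e-20, 1000)` uses all 1000 iterations because the bracket cannot shrink below one double spacing. Both return the same value to well within 1e-12, which is consistent with the extra iterations only affecting the last few bits.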
Green-Sky commented Sep 10, 2025 • edited
I suspect a Cubic Bezier fit might be another simple solution. A visualization of what I meant: https://thebookofshaders.com/edit.php?log=160414041933 A similar function is also called a gain function.
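For reference, here is a minimal sketch of the two curve families mentioned here. The names `bezier01` and `gain` and the parameter choices are illustrative assumptions, not fitted values and not code from this PR. The Bezier is the one-dimensional form (the parameter t is used directly as the input, unlike a CSS-style 2D easing curve that has to be solved for t). With inner control values 0 and 1 it reduces exactly to smoothstep, 3t^2 - 2t^3.

```cpp
#include <cmath>

// One-dimensional cubic Bezier running from 0 to 1.
// p1 and p2 are the two inner control values (hypothetical tuning knobs).
inline double bezier01(double t, double p1, double p2) {
    const double s = 1.0 - t;
    return 3.0 * s * s * t * p1 + 3.0 * s * t * t * p2 + t * t * t;
}

// Gain function on [0, 1], symmetric around (0.5, 0.5).
// k = 1 is the identity, k > 1 flattens the ends and steepens the middle,
// and k < 1 does the opposite; gain(., 1/k) is the inverse of gain(., k).
inline double gain(double x, double k) {
    const double a = 0.5 * std::pow(2.0 * (x < 0.5 ? x : 1.0 - x), k);
    return x < 0.5 ? a : 1.0 - a;
}
```

A single exponent `k` (or two control values) would replace the separate alpha and beta parameters, which fits the observation that practitioners rarely use anything but symmetric settings.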
phil2sat commented Sep 11, 2025 • edited
For me the question is this: after testing the current pull, the code is about twice as long as the boost version. On a modern GPU I guess there is absolutely zero speed gain or loss, even on my GPU. So extra checking for boost is, in my opinion, not necessary, as the current implementation exactly clones what boost does. And I don't really know if this simple math is slower than following a pointer to an external lib, or than linking statically and making the binary twice as big; even with my slow GPU they run at the same speed (I didn't check the size difference). For me it works and I'm fine with it. So what's next, maybe "perturbed attention guidance" (PAG)? It makes SDXL much better.
phil2sat commented Sep 12, 2025 • edited
Some comparison: Chroma V47, heun, 8 steps.
Sorry for the low res, but it takes ages. @Green-Sky, about your idea with Bezier: the first test looks promising, but I have to tweak it a little. Details are finer, but there is a little noise in the hair. I'll have to generate larger resolutions later, once it is fine-tuned. I think I have maxed out detail vs. noise/artifacts. Same parameters, heun, bezier, 1024x1024; can't get higher, dang.














Submitted by @phil2sat in #777
TODO: