SDXL example bugfixes and enhancements #1947
Conversation
Remove the <4.43.0 version limit on transformers, since it made no apparent difference in my testing.
A set of if/elif/else statements neglected to account for the dml case: when the user requested dml, execution fell through to the final else block, which switched things to cpu. This commit changes the final else block to elif provider == "cpu". That way, if the provider is neither cuda, nor qnn, nor cpu, the only remaining possibility is dml (which should be the default, per the earlier comment near line 326), and no changes are made.
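A hedged sketch of the corrected branching; the real code edits an Olive config, and the keys/values here are hypothetical stand-ins for illustration:

def adjust_config_for_provider(config: dict, provider: str) -> dict:
    if provider == "cuda":
        config["execution_provider"] = "CUDAExecutionProvider"
    elif provider == "qnn":
        config["execution_provider"] = "QNNExecutionProvider"
    elif provider == "cpu":  # was a bare `else`, which wrongly caught dml as well
        config["execution_provider"] = "CPUExecutionProvider"
    # No trailing else: the only remaining provider is dml, the default
    # (per the comment near line 326), which needs no changes.
    return config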
Line 422 was using Path(__file__).resolve().parent, but this is exactly what script_dir (defined on line 373) is for.
Some computers (such as mine) are capable of running conversion, optimization, and inference, but choke on evaluation. Add a command-line flag to skip evaluation, so that optimization can be completed.
ByteSize() used to be quick but capable of wildly wrong results with Protobuf 3.x. Now with newer Protobuf versions, it seems more consistent, but is very costly to run, to the tune of ~60GB RAM/swap and >5 minutes for a ~10GB unet file (YMMV). Therefore, change the code here to only run ByteSize() if save_as_external_data is not already enabled.
SDXL unets end up ~5GB after fp16 optimization, and thus always need to use external data files to avoid Protobuf's 2GB limit on the main model.onnx file. Olive has code that checks if a model is >2GB even if save_as_external_data is not enabled, but that ByteSize() check is rather costly to run on a large unet file, so avoiding it where possible saves time and RAM/swapping.
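A minimal sketch of the guard, assuming `model` is an onnx.ModelProto and the flag name matches the surrounding code:

import onnx

def needs_external_data(model: onnx.ModelProto, save_as_external_data: bool) -> bool:
    # If external data is already requested, skip the expensive check entirely.
    if save_as_external_data:
        return True
    # ByteSize() walks the whole proto, which can cost ~60GB RAM/swap and
    # minutes of runtime on a ~10GB unet, so only call it when necessary.
    return model.ByteSize() > 2 * 1024**3  # Protobuf's 2GB single-file limit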
Currently, the folder tree for the optimized model is created by copying the unoptimized model's folder tree, then overwriting the unoptimized files with optimized versions from the run cache. Weights.pb files are ignored/excluded during the tree copy, which is fine. However, there is a scenario (which actually occurs with the default SDXL 1.0 model used by the example) where text_encoder_2 can be >2GB when unoptimized, but <2GB when optimized. In this case, the unoptimized model will have a model.onnx.data external data file, whereas the optimized model will not. The unoptimized external file gets copied over to the optimized tree, but is not overwritten when the optimized model files are copied in, since there isn't an optimized version of it. You end up with a vestigial 3GB file that isn't actually used by the optimized model.

Therefore, this commit changes the tree-copy code to also ignore model.onnx and model.onnx.data files (in addition to the currently ignored weights.pb files). The unoptimized onnx files don't need to be copied over anyway, since they are overwritten when the optimized models are copied in.
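Roughly what the adjusted tree copy looks like, as a sketch (the directory layout here is an assumption, not the example's exact paths):

import shutil
from pathlib import Path

unoptimized_dir = Path("models/unoptimized/stable-diffusion-xl-base-1.0")  # assumed layout
optimized_dir = Path("models/optimized/stable-diffusion-xl-base-1.0")

# Skip weights.pb as before, and now also the onnx graphs and external-data
# files: the graphs are overwritten by the optimized copies anyway, and a
# leftover model.onnx.data would be a multi-GB file the optimized model never reads.
shutil.copytree(
    unoptimized_dir,
    optimized_dir,
    ignore=shutil.ignore_patterns("weights.pb", "model.onnx", "model.onnx.data"),
)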
Add --single_local_file option for optimizing a local .safetensors file. When the flag is used along with --optimize, the script checks the local folder for a .safetensors file with a name matching --model_id to run optimization on. Has no effect when used without --optimize.
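A sketch of how the lookup might work; the function and error handling are assumptions, not the PR's exact code:

from pathlib import Path

def find_local_safetensors(script_dir: Path, model_id: str) -> Path:
    # Look in the script's folder for a .safetensors file matching --model_id.
    candidate = script_dir / f"{model_id}.safetensors"
    if not candidate.is_file():
        raise FileNotFoundError(f"No {model_id}.safetensors found in {script_dir}")
    return candidate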
- Modify the update_steps function to take a step_callback_pipeline parameter and to return latents.
- For the two result = pipeline() calls, change callback=update_steps to callback_on_step_end=update_steps (sketched below). The former seems to have been deprecated in favor of the latter in the Hugging Face diffusers SDXL pipeline: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py#L1027
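A sketch of the new hook shape. The signature follows diffusers' callback_on_step_end convention; the progress-update body and the surrounding names are assumptions:

def update_steps(step_callback_pipeline, step, timestep, callback_kwargs):
    # Advance a progress bar here; steps are 0-indexed.
    print(f"finished step {step + 1}")
    # The hook must return the callback_kwargs dict (it carries the latents).
    return callback_kwargs

# Usage (assuming `pipeline` is a loaded diffusers SDXL pipeline):
result = pipeline(prompt, callback_on_step_end=update_steps)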
The GUI's progress bar was stopping one notch short of full. Subtract 1 from the calculation to make it fill up properly and align with the command-line progress bar.
Refiner inference was already being handled by code in run_inference_loop(), so run_refiner_inference_loop() was unused vestigial code. This commit removes the latter.
Previously, run_inference_loop() implicitly assumed that --num_images would not exceed --batch_size. This commit rearranges the code to support --num_images larger than --batch_size, running additional batches until the requested number of images is reached.
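A sketch of the rearranged loop; the saving logic and parameter names are assumptions:

import math

def run_inference_loop(pipeline, prompt, num_images, batch_size):
    images_saved = 0
    # ceil(num_images / batch_size) batches, matching the Readme's wording.
    for _ in range(math.ceil(num_images / batch_size)):
        result = pipeline(prompt, num_images_per_prompt=batch_size)
        for image in result.images:
            if images_saved < num_images:
                image.save(f"result_{images_saved}.png")
                images_saved += 1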
Previously, the interactive GUI controls were at the bottom of the window. But this tends to put them offscreen if there is more than one row of images, and they cannot be brought back into view by click-dragging the window titlebar, since the mouse pointer cannot drag higher than the top of the screen. Maximizing the window isn't possible either, since the maximize button is disabled. Therefore, this commit moves the GUI controls to the top of the window so they can remain reachable. (This includes the text prompt field, the "Generate" button, and the progress bar.)
Previously, in the interactive GUI, pressing "Enter" in the text field did nothing. This commit enables pressing "Enter" in the text field to start generation, in addition to the existing way of clicking the "Generate" button. This helps in particular if/when the window is wider than the screen, where the "Generate" button may be offscreen.
If the user does not specify values for these flags, they are set to the value of --image_size, which defaults to 768. Internal functions now take width and height as params instead of size. I also updated the help text to note that size/width/height values must be multiples of 32 (otherwise an error occurs).
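A sketch of the flag defaulting; the argparse wiring is assumed, and only the flag names and the 768 default come from the PR:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--image_size", default=768, type=int, help="Must be a multiple of 32.")
parser.add_argument("--image_width", default=None, type=int, help="Multiple of 32; defaults to --image_size.")
parser.add_argument("--image_height", default=None, type=int, help="Multiple of 32; defaults to --image_size.")
args = parser.parse_args()

# Fall back to the square --image_size when width/height are not given.
width = args.image_width if args.image_width is not None else args.image_size
height = args.image_height if args.image_height is not None else args.image_size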
Try to obey 120-character style guide
Missed a long comment line. Shorten it to comply with 120-character style guide.
The SDXL example used to be located in examples/directml/stable_diffusion_xl, but was recently moved to examples/stable_diffusion_xl.
Added info on --single_local_file and --skip_evaluation options.
@microsoft-github-policy-service agree
for image_index in range(batch_size):
    if images_saved < num_images:
nitpick: Wouldn't this just be the same as for image_index in range(len(result.images))?
Looks like it works the same, yes. I've made the suggested change.
  olive_run(olive_config)
- footprints_file_path = Path(__file__).resolve().parent / "footprints" / submodel_name / "footprints.json"
+ footprints_file_path = script_dir / "footprints" / submodel_name / "footprints.json"
  with footprints_file_path.open("r") as footprint_file:
      footprints = json.load(footprint_file)
olive_run now returns an object that has the API to query for output data like footprints and models. I would recommend using the API rather than hardcoding the paths.
Thanks for the doc link! That can probably be arranged, but would take me more study/testing to figure out. (May be a week or weeks, depending on life.)
prompt_textbox = tk.Entry(window)
prompt_textbox.insert(tk.END, prompt)
prompt_textbox.place(x=0, y=0, width=window_width - button_width, height=button_height)
prompt_textbox.bind('<Return>', on_generate_click)
Check warning (Code scanning / lintrunner): RUFF/Q000. See https://docs.astral.sh/ruff/rules/bad-quotes-inline-string
I've made this change, and others suggested by lintrunner.
Bugfixes:

Enhancements:
- --skip_evaluation option, since some systems (like mine) can complete all operations EXCEPT evaluation. A workaround for Issue 510, Issue 1170, and Issue 1194.
- --single_local_file option for optimizing a local .safetensors model file.
- Support for --num_images larger than --batch_size. This should be the intended behavior, based on the relevant Readme saying "The minimum number of inferences will be ceil(num_images / batch_size)". Fixes Issue 1164.
- --image_width and --image_height flags.

Documentation updates:
I got all this into a working state months ago on my machine, but didn't have time to split it into logically separated, digestible commits until now. Hope this format is satisfactory! Let me know if anything else is needed from me, as I'm basically new to GitHub.