SDXL example bugfixes and enhancements #1947
Conversation
Remove the <4.43.0 version limit on transformers, since it made no apparent difference in my testing.
A set of if/elif/else statements neglected to account for the dml case: when the user requested dml, execution fell through to the final else block, which switched things to cpu. This commit changes the final else block to elif provider == "cpu". That way, if the provider is neither cuda, nor qnn, nor cpu, the only remaining possibility is dml (which should be the default, per the earlier comment near line 326), and no changes are made.
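A hedged sketch of the corrected branching; the real code edits an Olive config, and the keys/values here are hypothetical stand-ins for illustration:

def adjust_config_for_provider(config: dict, provider: str) -> dict:
    if provider == "cuda":
        config["execution_provider"] = "CUDAExecutionProvider"
    elif provider == "qnn":
        config["execution_provider"] = "QNNExecutionProvider"
    elif provider == "cpu":  # was a bare `else`, which wrongly caught dml as well
        config["execution_provider"] = "CPUExecutionProvider"
    # No trailing else: the only remaining provider is dml, the default
    # (per the comment near line 326), which needs no changes.
    return config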
Line 422 was using Path(__file__).resolve().parent, but this is exactly what script_dir (defined on line 373) is for.
Some computers (such as mine) are capable of running conversion, optimization, and inference, but choke on evaluation. Add a command-line flag to skip evaluation, so that optimization can be completed.
ByteSize() used to be quick but capable of wildly wrong results with Protobuf 3.x. Now with newer Protobuf versions, it seems more consistent, but is very costly to run, to the tune of ~60GB RAM/swap and >5 minutes for a ~10GB unet file (YMMV). Therefore, change the code here to only run ByteSize() if save_as_external_data is not already enabled.
SDXL unets end up ~5GB after fp16 optimization, and thus always need to use external data files to avoid Protobuf's 2GB limit on the main model.onnx file. Olive has code that checks if a model is >2GB even if save_as_external_data is not enabled, but that ByteSize() check is rather costly to run on a large unet file, so avoiding it where possible saves time and RAM/swapping.
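A minimal sketch of the guard, assuming `model` is an onnx.ModelProto and the flag name matches the surrounding code:

import onnx

def needs_external_data(model: onnx.ModelProto, save_as_external_data: bool) -> bool:
    # If external data is already requested, skip the expensive check entirely.
    if save_as_external_data:
        return True
    # ByteSize() walks the whole proto, which can cost ~60GB RAM/swap and
    # minutes of runtime on a ~10GB unet, so only call it when necessary.
    return model.ByteSize() > 2 * 1024**3  # Protobuf's 2GB single-file limit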
Currently, the folder tree for the optimized model is created by copying the unoptimized model's folder tree, then overwriting the unoptimized files with optimized versions from the run cache. Weights.pb files are ignored/excluded during the tree copy, which is fine. However, there is a scenario (which actually occurs with the default SDXL 1.0 model used by the example) where text_encoder_2 can be >2GB when unoptimized, but <2GB when optimized. In this case, the unoptimized model will have a model.onnx.data external data file, whereas the optimized model will not. The unoptimized external file gets copied over to the optimized tree, but is not overwritten when the optimized model files are copied in, since there isn't an optimized version of it. You end up with a vestigial 3GB file that isn't actually used by the optimized model.

Therefore, this commit changes the tree-copy code to also ignore model.onnx and model.onnx.data files (in addition to the currently ignored weights.pb files). The unoptimized onnx files don't need to be copied over anyway, since they are overwritten when the optimized models are copied in.
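Roughly what the adjusted tree copy looks like, as a sketch (the directory layout here is an assumption, not the example's exact paths):

import shutil
from pathlib import Path

unoptimized_dir = Path("models/unoptimized/stable-diffusion-xl-base-1.0")  # assumed layout
optimized_dir = Path("models/optimized/stable-diffusion-xl-base-1.0")

# Skip weights.pb as before, and now also the onnx graphs and external-data
# files: the graphs are overwritten by the optimized copies anyway, and a
# leftover model.onnx.data would be a multi-GB file the optimized model never reads.
shutil.copytree(
    unoptimized_dir,
    optimized_dir,
    ignore=shutil.ignore_patterns("weights.pb", "model.onnx", "model.onnx.data"),
)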
Add --single_local_file option for optimizing a local .safetensors file. When the flag is used along with --optimize, the script checks the local folder for a .safetensors file with a name matching --model_id to run optimization on. Has no effect when used without --optimize.
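A sketch of how the lookup might work; the function and error handling are assumptions, not the PR's exact code:

from pathlib import Path

def find_local_safetensors(script_dir: Path, model_id: str) -> Path:
    # Look in the script's folder for a .safetensors file matching --model_id.
    candidate = script_dir / f"{model_id}.safetensors"
    if not candidate.is_file():
        raise FileNotFoundError(f"No {model_id}.safetensors found in {script_dir}")
    return candidate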
- Modify the update_steps function to take a step_callback_pipeline parameter and to return latents.
- For the two result = pipeline() calls, change callback=update_steps to callback_on_step_end=update_steps (sketched below). The former seems to have been deprecated in favor of the latter in the Hugging Face diffusers SDXL pipeline: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py#L1027
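A sketch of the new hook shape. The signature follows diffusers' callback_on_step_end convention; the progress-update body and the surrounding names are assumptions:

def update_steps(step_callback_pipeline, step, timestep, callback_kwargs):
    # Advance a progress bar here; steps are 0-indexed.
    print(f"finished step {step + 1}")
    # The hook must return the callback_kwargs dict (it carries the latents).
    return callback_kwargs

# Usage (assuming `pipeline` is a loaded diffusers SDXL pipeline):
result = pipeline(prompt, callback_on_step_end=update_steps)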
The GUI's progress bar was stopping one notch short of full. Subtract 1 from the calculation to make it fill up properly and align with the command-line progress bar.
Refiner inference was already being handled by code in run_inference_loop(), so run_refiner_inference_loop() was unused vestigial code. This commit removes the latter.
Previously, run_inference_loop() implicitly assumed that --num_images would not exceed --batch_size. This commit rearranges the code to support --num_images larger than --batch_size, running additional batches until the requested number of images is reached.
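A sketch of the rearranged loop; the saving logic and parameter names are assumptions:

import math

def run_inference_loop(pipeline, prompt, num_images, batch_size):
    images_saved = 0
    # ceil(num_images / batch_size) batches, matching the Readme's wording.
    for _ in range(math.ceil(num_images / batch_size)):
        result = pipeline(prompt, num_images_per_prompt=batch_size)
        for image in result.images:
            if images_saved < num_images:
                image.save(f"result_{images_saved}.png")
                images_saved += 1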
Previously, the interactive GUI controls were at the bottom of the window. But this tends to put them offscreen if there is more than one row of images, and they cannot be brought back into view by click-dragging the window titlebar, since the mouse pointer cannot drag higher than the top of the screen. Maximizing the window isn't possible either, since the maximize button is disabled. Therefore, this commit moves the GUI controls to the top of the window so they can remain reachable. (This includes the text prompt field, the "Generate" button, and the progress bar.)
Previously, in the interactive GUI, pressing "Enter" in the text field did nothing. This commit enables pressing "Enter" in the text field to start generation, in addition to the existing way of clicking the "Generate" button. This helps in particular if/when the window is wider than the screen, where the "Generate" button may be offscreen.
If the user does not specify values for these flags, they are set to the value of --image_size, which defaults to 768. Internal functions now take width and height as params instead of size. I also updated the help text to note that size/width/height values must be multiples of 32 (otherwise an error occurs).
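A sketch of the flag defaulting; the argparse wiring is assumed, and only the flag names and the 768 default come from the PR:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--image_size", default=768, type=int, help="Must be a multiple of 32.")
parser.add_argument("--image_width", default=None, type=int, help="Multiple of 32; defaults to --image_size.")
parser.add_argument("--image_height", default=None, type=int, help="Multiple of 32; defaults to --image_size.")
args = parser.parse_args()

# Fall back to the square --image_size when width/height are not given.
width = args.image_width if args.image_width is not None else args.image_size
height = args.image_height if args.image_height is not None else args.image_size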
Try to obey 120-character style guide
Missed a long comment line. Shorten it to comply with 120-character style guide.
The SDXL example used to be located in examples/directml/stable_diffusion_xl, but was recently moved to examples/stable_diffusion_xl.
Added info on --single_local_file and --skip_evaluation options.
@microsoft-github-policy-service agree
for image_index in range(batch_size):
    if images_saved < num_images:
nitpick: Wouldn't this just be the same as for image_index in range(len(result.images))?
Looks like it works the same, yes. I've made the suggested change.
  olive_run(olive_config)
- footprints_file_path = Path(__file__).resolve().parent / "footprints" / submodel_name / "footprints.json"
+ footprints_file_path = script_dir / "footprints" / submodel_name / "footprints.json"
  with footprints_file_path.open("r") as footprint_file:
      footprints = json.load(footprint_file)
olive_run now returns an object that has the API to query for output data like footprints and models. I would recommend using the API rather than hardcoding the paths.
Thanks for the doc link! That can probably be arranged, but would take me more study/testing to figure out. (May be a week or weeks, depending on life.)
prompt_textbox = tk.Entry(window)
prompt_textbox.insert(tk.END, prompt)
prompt_textbox.place(x=0, y=0, width=window_width - button_width, height=button_height)
prompt_textbox.bind('<Return>', on_generate_click)
Check warning (Code scanning / lintrunner): RUFF/Q000. See https://docs.astral.sh/ruff/rules/bad-quotes-inline-string
I've made this change, and others suggested by lintrunner.
Bugfixes:

Enhancements:
- --skip_evaluation option, since some systems (like mine) can complete all operations EXCEPT evaluation. A workaround for Issue 510, Issue 1170, and Issue 1194.
- --single_local_file option for optimizing a local .safetensors model file.
- Support for --num_images larger than --batch_size. This should be the intended behavior, based on the relevant Readme saying "The minimum number of inferences will be ceil(num_images / batch_size)". Fixes Issue 1164.
- --image_width and --image_height flags.

Documentation updates:
I got all this into a working state months ago on my machine, but didn't have time to split it into logically separated, digestible commits until now. Hope this format is satisfactory! Let me know if anything else is needed from me, as I'm basically new to GitHub.