SDXL example bugfixes and enhancements #1947


Open
lostdisc wants to merge 23 commits into microsoft:main from lostdisc:sdxl-example-fixes

Conversation

lostdisc

Bugfixes:

  • Fixed a bug where the dml provider was being changed to cpu
  • Avoid running the costly ByteSize() check if save_as_external_data is already enabled. Related to Issue 1165.
  • Enable save_as_external_data for the optimization pass in config_unet.json, since the unet always exceeds the 2GB Protobuf limit. Addresses Issue 1165.
  • Fixed a bug where vestigial unoptimized model.onnx.data files were left behind in optimized model folders
  • Fixed an error/crash in interactive GUI mode near the start of generation
  • Fixed a visual glitch where the interactive GUI progress bar stopped short of full
  • Removed the unused vestigial function run_refiner_inference_loop(). Its functionality was already handled by code in run_inference_loop().

Enhancements:

  • Added a --skip_evaluation option, since some systems (like mine) can complete all operations EXCEPT evaluation. A workaround for Issue 510, Issue 1170, and Issue 1194.
  • Added a --single_local_file option for optimizing a local .safetensors model file
  • Support --num_images larger than --batch_size. This should be the intended behavior, based on the relevant readme saying "The minimum number of inferences will be ceil(num_images / batch_size)". Fixes Issue 1164.
  • Added optional --image_width and --image_height flags
  • In the interactive GUI, pressing the Enter key in the input text field now starts generation
  • Moved the interactive GUI controls from bottom to top, so they stay onscreen if the image window is taller than the screen

Documentation updates:

  • Updated the SDXL example link in examples.md to point to the correct folder
  • Updated the SDXL example's DirectML readme with info on the new options/flags

I got all this into a working state months ago on my machine, but didn't have time to split it into logically separated, digestible commits until now. I hope this format is satisfactory! Let me know if anything else is needed from me, as I'm basically new to GitHub.

…nsformers
Remove the <4.43.0 version limit on transformers, since it made no apparent difference in my testing.
A set of if/elif/else statements neglected to account for the case of using dml, and instead changed the provider to cpu in the final else block when the user tried using dml. This commit changes the final else block to elif provider == "cpu". That way, if the provider is neither cuda, nor qnn, nor cpu, the last possibility is dml (which should be the default according to an earlier comment at line 326), and no changes are made.
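The corrected branching can be sketched as follows. This is a minimal reconstruction, not the exact Olive source: the function name and the execution-provider strings are illustrative.

```python
# Sketch of the fixed provider dispatch. Previously, a bare `else:` block
# forced cpu, which clobbered the dml case; making it `elif provider == "cpu"`
# leaves dml (the default) untouched.
def resolve_provider(provider: str) -> str:
    """Map a --provider flag to an ONNX Runtime execution provider (illustrative)."""
    if provider == "cuda":
        ep = "CUDAExecutionProvider"
    elif provider == "qnn":
        ep = "QNNExecutionProvider"
    elif provider == "cpu":  # was a bare `else:` that also caught dml
        ep = "CPUExecutionProvider"
    else:  # neither cuda, qnn, nor cpu: must be dml, so make no changes
        ep = "DmlExecutionProvider"
    return ep
```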
Line 422 was using Path(__file__).resolve().parent, but this is exactly what script_dir (defined at line 373) is for.
Some computers (such as mine) are capable of running conversion, optimization, and inference, but choke on evaluation. Add a command-line flag to skip evaluation so that optimization can complete.
ByteSize() used to be quick but capable of wildly wrong results with Protobuf 3.x. With newer Protobuf versions it seems more consistent, but it is very costly to run, to the tune of ~60GB of RAM/swap and >5 minutes for a ~10GB unet file (YMMV). Therefore, change the code here to only run ByteSize() if save_as_external_data is not already enabled.
SDXL unets end up ~5GB after fp16 optimization, and thus always need external data files to avoid Protobuf's 2GB limit on the main model.onnx file. Olive has code that checks whether a model is >2GB even if save_as_external_data is not enabled, but that ByteSize() check is rather costly to run on a large unet file, so avoiding it where possible saves time and RAM/swap.
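The short-circuit described in the two notes above can be sketched like this. The helper and its parameters are hypothetical; only the ordering of the checks reflects the commit.

```python
PROTOBUF_LIMIT = 2**31 - 1  # Protobuf caps a single serialized file at 2GB

def needs_external_data(save_as_external_data: bool, byte_size_fn) -> bool:
    """Decide whether to split weights into an external data file.

    Illustrative helper: if external data is already requested, skip the
    expensive size measurement entirely (ByteSize() on a ~10GB unet can take
    minutes and tens of GB of RAM); otherwise fall back to measuring.
    """
    if save_as_external_data:
        return True  # we will split anyway, so no need to measure
    return byte_size_fn() > PROTOBUF_LIMIT
```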
…der tree
Currently, the folder tree for the optimized model is created by copying the unoptimized model's folder tree, then overwriting the unoptimized files with optimized versions from the run cache. weights.pb files are ignored/excluded during the tree copy, which is fine. However, there is a scenario (which actually occurs with the default SDXL 1.0 model used by the example) where text_encoder_2 can be >2GB unoptimized but <2GB optimized. In that case, the unoptimized model has a model.onnx.data external data file, whereas the optimized model does not. The unoptimized external file gets copied over to the optimized tree, but is never overwritten when the optimized model files are copied in, since there is no optimized version of it. You end up with a vestigial 3GB file that isn't actually used by the optimized model. Therefore, this commit changes the tree-copy code to also ignore model.onnx and model.onnx.data files (in addition to the currently-ignored weights.pb files). The unoptimized onnx files don't need to be copied over, since they are overwritten anyway when the optimized models are copied in.
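A minimal sketch of the adjusted tree copy, assuming shutil.copytree is the copy mechanism (the helper name is made up; the ignore patterns are the ones named in the commit):

```python
import shutil
from pathlib import Path

def copy_model_tree(src: Path, dst: Path) -> None:
    """Copy the unoptimized model tree, skipping files the optimized copies
    will replace (illustrative helper). Skipping model.onnx / model.onnx.data
    avoids leaving a stale multi-GB external data file behind when the
    optimized model no longer needs one.
    """
    shutil.copytree(
        src,
        dst,
        ignore=shutil.ignore_patterns("weights.pb", "model.onnx", "model.onnx.data"),
    )
```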
Add a --single_local_file option for optimizing a local .safetensors file. When the flag is used along with --optimize, the script checks the local folder for a .safetensors file with a name matching --model_id and runs optimization on it. Has no effect when used without --optimize.
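The lookup could be sketched like this, assuming the file is expected next to the script; the function name and signature are hypothetical, not the PR's code:

```python
from pathlib import Path

def find_local_safetensors(model_id: str, folder: Path = Path(".")) -> Path:
    """Locate a local .safetensors file whose name matches --model_id
    (illustrative sketch of the --single_local_file lookup)."""
    candidate = folder / f"{Path(model_id).name}.safetensors"
    if not candidate.is_file():
        raise FileNotFoundError(f"no local model file found at {candidate}")
    return candidate
```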
- Modify the update_steps function to take a step_callback_pipeline parameter and to return latents.
- For the two result = pipeline() calls, change callback=update_steps into callback_on_step_end=update_steps. The former seems to have been deprecated in favor of the latter in the Hugging Face diffusers SDXL pipeline: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py#L1027
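For reference, the newer diffusers callback shape looks roughly like this. The diffusers convention is for callback_on_step_end to receive the pipeline plus a dict of tensors and to return that dict; the body here is a stand-in for the GUI update, not the PR's actual code:

```python
def update_steps(pipeline, step, timestep, callback_kwargs):
    """callback_on_step_end-style callback (sketch).

    `callback_kwargs` carries tensors such as "latents"; the callback must
    return the (possibly modified) dict so the pipeline can continue.
    """
    latents = callback_kwargs.get("latents")  # available if requested
    # ... update a GUI progress bar here using `step` ...
    return callback_kwargs

# Usage (hedged sketch):
#   result = pipeline(prompt, callback_on_step_end=update_steps)
```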
The GUI's progress bar was stopping one notch short of full. Subtract 1 from the calculation so it fills up properly and aligns with the command-line progress bar.
Refiner inference was already handled by code in run_inference_loop(), so run_refiner_inference_loop() was unused vestigial code. This commit removes the latter.
Previously, run_inference_loop() implicitly assumed that --num_images would be no larger than --batch_size. This commit rearranges the code to support --num_images larger than --batch_size by running more batches until the requested number of images is reached.
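The batching arithmetic can be sketched as follows (hypothetical helper; the real loop drives the pipeline once per batch and saves images until the target is reached):

```python
import math

def plan_batches(num_images: int, batch_size: int) -> list:
    """Return how many images to keep from each batch so that --num_images
    can exceed --batch_size (illustrative). The number of inference runs is
    ceil(num_images / batch_size), matching the readme's stated minimum.
    """
    num_batches = math.ceil(num_images / batch_size)
    sizes = []
    remaining = num_images
    for _ in range(num_batches):
        sizes.append(min(batch_size, remaining))
        remaining -= batch_size
    return sizes
```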
Previously, the interactive GUI controls were at the bottom of the window. This tends to put them offscreen if there is more than one row of images, and they cannot be brought back into view by click-dragging the window titlebar, since the mouse pointer cannot drag higher than the top of the screen. Maximizing the window isn't possible either, since the maximize button is disabled. Therefore, this commit moves the GUI controls (the text prompt field, the "Generate" button, and the progress bar) to the top of the window so they remain reachable.
…t generation, besides clicking "Generate" button
Previously, in the interactive GUI, pressing Enter in the text field did nothing. This commit makes pressing Enter in the text field start generation, in addition to the existing way of clicking the "Generate" button. This helps in particular when the window is wider than the screen and the "Generate" button may be offscreen.
If the user does not specify values for these flags, they are set to the value of --image_size, which defaults to 768. Internal functions now take width and height as parameters instead of size. I also updated the help text to note that size/width/height values must be multiples of 32 (otherwise an error occurs).
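The multiple-of-32 requirement mentioned above could be checked like this (illustrative validator, not the PR's code; the flag names are the ones added by the commit):

```python
def validate_dimension(value: int, flag_name: str) -> int:
    """Reject sizes the pipeline cannot handle: width/height must be
    multiples of 32, per the help-text note (illustrative check)."""
    if value % 32 != 0:
        raise ValueError(f"--{flag_name} must be a multiple of 32, got {value}")
    return value
```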
Try to obey 120-character style guide
Missed a long comment line. Shorten it to comply with the 120-character style guide.
The SDXL example used to be located in examples/directml/stable_diffusion_xl, but was recently moved to examples/stable_diffusion_xl.
Added info on --single_local_file and --skip_evaluation options.
@lostdisc (Author)

@microsoft-github-policy-service agree

Comment on lines 97 to 98
for image_index in range(batch_size):
if images_saved < num_images:
Contributor:

nitpick: Wouldn't this just be the same as for image_index in range(len(result.images))?

Author:

Looks like it works the same, yes. I've made the suggested change.

Comment on lines 434 to 438
olive_run(olive_config)

- footprints_file_path = Path(__file__).resolve().parent / "footprints" / submodel_name / "footprints.json"
+ footprints_file_path = script_dir / "footprints" / submodel_name / "footprints.json"
with footprints_file_path.open("r") as footprint_file:
footprints = json.load(footprint_file)
Contributor:

olive_run now returns an object that has the API to query for output data like footprints and models. I would recommend using the API rather than hardcoding the paths.

Author:

Thanks for the doc link! That can probably be arranged, but would take me more study/testing to figure out. (May be a week or weeks, depending on life.)

prompt_textbox = tk.Entry(window)
prompt_textbox.insert(tk.END, prompt)
prompt_textbox.place(x=0, y=0, width=window_width - button_width, height=button_height)
prompt_textbox.bind('<Return>', on_generate_click)

Check warning

Code scanning / lintrunner

RUFF/Q000 Warning

Single quotes found but double quotes preferred.
See https://docs.astral.sh/ruff/rules/bad-quotes-inline-string
Author:

I've made this change, and others suggested by lintrunner.

Reviewers

@xiaoyu-work left review comments

@shaahji requested changes

Requested changes must be addressed to merge this pull request.

Development

Successfully merging this pull request may close these issues.

"num_images" doesn't work for the example of directml stable_diffusion_xl.
3 participants
@lostdisc, @xiaoyu-work, @shaahji
