Merge branch 'dev' into master

pull/4177/head
Vladimir Mandic 2025-09-01 11:41:07 -04:00 committed by GitHub
commit e40275741d
226 changed files with 5840 additions and 763 deletions

View File

@ -38,6 +38,7 @@ ignore-paths=/usr/lib/.*$,
pipelines/flex2,
pipelines/f_lite,
pipelines/hidream,
pipelines/hdm,
pipelines/meissonic,
pipelines/omnigen2,
pipelines/segmoe,

View File

@ -20,6 +20,7 @@ exclude = [
"pipelines/meissonic",
"pipelines/omnigen2",
"pipelines/hdm",
"pipelines/segmoe",
"scripts/lbm",

View File

@ -1,5 +1,62 @@
# Change Log for SD.Next
## Update for 2025-08-30
- **Models**
- **Chroma** final versions: [Chroma1-HD](https://huggingface.co/lodestones/Chroma1-HD), [Chroma1-Base](https://huggingface.co/lodestones/Chroma1-Base) and [Chroma1-Flash](https://huggingface.co/lodestones/Chroma1-Flash)
- **Qwen-Image** [InstantX ControlNet Union](https://huggingface.co/InstantX/Qwen-Image-ControlNet-Union) support
*note*: Qwen-Image is already a very large model and the ControlNet adds 3.5GB on top of it, so quantization and offloading are highly recommended! See the loading sketch at the end of this entry.
- [Nunchaku-Qwen-Image-Lightning](https://huggingface.co/nunchaku-tech/nunchaku-qwen-image)
if you have a compatible NVIDIA GPU, Nunchaku is the fastest quantization engine; it is currently available for Flux.1, SANA and Qwen-Image models
*note*: release version of `nunchaku==0.3.2` does NOT include support, so you need to build [nunchaku](https://nunchaku.tech/docs/nunchaku/installation/installation.html) from source
- [HunyuanDiT ControlNet](https://huggingface.co/Tencent-Hunyuan/HYDiT-ControlNet-v1.2) Canny, Depth, Pose
- [KBlueLeaf/HDM-xut-340M-anime](https://huggingface.co/KBlueLeaf/HDM-xut-340M-anime)
highly experimental: HDM (*Home-made Diffusion Model*) is a project investigating a specialized training recipe for pretraining a T2I model at home, based on a super-light architecture
requires: generator=cpu, dtype=float16, offload=none (see the loading sketch at the end of this entry)
- updated [SD.Next Model Samples Gallery](https://vladmandic.github.io/sd-samples/compare.html)
- **UI**
- default to **ModernUI**
StandardUI is still available via *settings -> user interface -> theme type*
- mobile-friendly!
- make hints touch-friendly: touch-and-hold to display a hint
- improved image scaling in img2img and control interfaces
- add base model type to networks display, thanks @Artheriax
- additional hints to ui, thanks @Artheriax
- add video support to gallery, thanks @CalamitousFelicitousness
- additional artwork for reference models in networks, thanks @liutyi
- improve ui hints display
- restyled all toolbuttons to be modernui native
- reordered system settings
- configurable horizontal vs vertical panel layout
in *settings -> user interface -> panel min width*
*example*: if the panel width is less than the specified value, the layout switches to vertical
- configurable grid image size
in *settings -> user interface -> grid image size*
- **Offloading**
- enable offload during pre-forward by default
- improve offloading of models with multiple DiTs
- improve offloading of models with implicit vae processing
- improve offloading of models with controlnet
- **SDNQ**
- add quantized matmul support for all quantization types and group sizes
- **Other**
- refactor reuse-seed and add functionality to all tabs
- **Fixes**
- normalize path handling when deleting images
- remove samplers filtering
- fix hidden model tags in networks display
- fix networks reference models display on windows
- fix handling of pre-quantized `flux` models
- fix `wan` to use correct pipeline for i2v models
- fix `qwen-image` with hires
- fix `omnigen-2` failure
- fix `auraflow` quantization
- fix `kandinsky-3` noise
- fix `infiniteyou` pipeline offloading
- fix `skyreels-v2` image-to-video
- fix `flex2` img2img denoising strength
- fix segfault on startup with rocm 6.4.3 and torch 2.8
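
As referenced above: a minimal sketch of the recommended loading setup, assuming plain diffusers usage rather than SD.Next's internal loader; the model id, prompt, and call pattern are illustrative.

```python
# Minimal sketch of the memory recommendations above, assuming plain diffusers usage;
# SD.Next's own loading and offload logic differ.
import torch
from diffusers import DiffusionPipeline

# large models such as Qwen-Image benefit from reduced precision plus offloading
pipe = DiffusionPipeline.from_pretrained('Qwen/Qwen-Image', torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # moves submodules to the GPU only while they run

# HDM-style requirement: a CPU-side generator for reproducible seeds
generator = torch.Generator(device='cpu').manual_seed(42)
image = pipe(prompt='a calm mountain lake at dawn', generator=generator).images[0]
image.save('output.png')
```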
## Update for 2025-08-20
A quick service release with several important hotfixes, improved localization support and adding new **Qwen** model variants...

View File

@ -43,12 +43,15 @@ All individual features are not listed here, instead check [ChangeLog](CHANGELOG
<br>
*Main interface using **StandardUI***:
![screenshot-standardui](https://github.com/user-attachments/assets/cab47fe3-9adb-4d67-aea9-9ee738df5dcc)
**Desktop** interface
<div align="center">
<img src="https://github.com/user-attachments/assets/d6119a63-6ee5-4597-95f6-29ed0701d3b5" alt="screenshot-modernui-desktop" width="90%">
</div>
*Main interface using **ModernUI***:
![screenshot-modernui](https://github.com/user-attachments/assets/39e3bc9a-a9f7-4cda-ba33-7da8def08032)
**Mobile** interface
<div align="center">
<img src="https://github.com/user-attachments/assets/ced9fe0c-d2c2-46d1-94a7-8f9f2307ce38" alt="screenshot-modernui-mobile" width="35%">
</div>
For screenshots and information on other available themes, see [Themes](https://vladmandic.github.io/sdnext-docs/Themes/)

View File

@ -7,19 +7,16 @@ Main ToDo list can be found at [GitHub projects](https://github.com/users/vladma
- Remote TE
- Mobile ModernUI
- [Canvas](https://konvajs.org/)
- [Modular pipelines and guiders](https://github.com/huggingface/diffusers/issues/11915)
- Refactor: Sampler options
- Refactor: [GGUF](https://huggingface.co/docs/diffusers/main/en/quantization/gguf)
- Feature: Diffusers [group offloading](https://github.com/vladmandic/sdnext/issues/4049)
- Feature: Common repo for `T5` and `CLIP`
- Feature: LoRA add OMI format support for SD35/FLUX.1
- Video: Generic API support
- Video: LTX TeaCache and others
- Video: LTX API
- Video: LTX PromptEnhance
- Video: LTX Conditioning preprocess
- [WanAI-2.1 VACE](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B) ([diffusers PR](https://github.com/huggingface/diffusers/pull/11582))
- [Cosmos-Predict2-Video](https://huggingface.co/nvidia/Cosmos-Predict2-2B-Video2World) ([diffusers PR](https://github.com/huggingface/diffusers/pull/11695))
### Blocked items
@ -30,6 +27,8 @@ Main ToDo list can be found at [GitHub projects](https://github.com/users/vladma
### Under Consideration
- [X-Omni](https://github.com/X-Omni-Team/X-Omni/blob/main/README.md)
- [DiffSynth Studio](https://github.com/modelscope/DiffSynth-Studio)
- [IPAdapter negative guidance](https://github.com/huggingface/diffusers/discussions/7167)
- [IPAdapter composition](https://huggingface.co/ostris/ip-composition-adapter)
- [STG](https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#spatiotemporal-skip-guidance)

View File

@ -8,6 +8,7 @@ const { GoogleGenerativeAI } = require('@google/generative-ai');
const api_key = process.env.GOOGLE_AI_API_KEY;
const model = 'gemini-2.5-flash';
const prompt = `
// eslint-disable-next-line max-len
Translate attached JSON from English to {language} using the following rules: fields id, label and reload should be preserved from the original; field localized should be a translated version of field label; and field hint should be translated in-place. If a field is less than 3 characters, do not translate it and keep it as-is. Every JSON entry should have id, label, localized, reload and hint fields. Output should be pure JSON without any additional text. For better translation quality, the context of the text is related to Stable Diffusion and the topic of Generative AI.`;
const languages = {
hr: 'Croatian',
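
For reference, the same call pattern sketched in Python using the `google-generativeai` package; the repo's actual script is the Node.js version above, so treat this as an assumption-level equivalent.

```python
# Sketch: Python equivalent of the Node.js translation call above,
# assuming the google-generativeai package; not part of the repo.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ['GOOGLE_AI_API_KEY'])
model = genai.GenerativeModel('gemini-2.5-flash')

# the prompt rules (preserve id/label/reload, translate localized and hint) are defined above
prompt = 'Translate attached JSON from English to Croatian using the following rules: ...'
response = model.generate_content(prompt)
print(response.text)  # expected to be pure JSON per the prompt rules
```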

@ -1 +1 @@
Subproject commit da4ccd4aa75e3b42937674ba23d406a02783df4f
Subproject commit 3a5df9fc03c1d61d7d70413f7a2f78a4b4552ae2

Binary files not shown (nine image previews removed; 89 KiB to 301 KiB each).

View File

@ -1,7 +0,0 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
<filter id='shadow' color-interpolation-filters="sRGB">
<feDropShadow flood-color="black" dx="0" dy="0" flood-opacity="0.9" stdDeviation="0.5"/>
<feDropShadow flood-color="black" dx="0" dy="0" flood-opacity="0.9" stdDeviation="0.5"/>
</filter>
<path style="filter:url(#shadow);" fill="#FFFFFF" d="M13.18 19C13.35 19.72 13.64 20.39 14.03 21H5C3.9 21 3 20.11 3 19V5C3 3.9 3.9 3 5 3H19C20.11 3 21 3.9 21 5V11.18C20.5 11.07 20 11 19.5 11C19.33 11 19.17 11 19 11.03V5H5V19H13.18M11.21 15.83L9.25 13.47L6.5 17H13.03C13.14 15.54 13.73 14.22 14.64 13.19L13.96 12.29L11.21 15.83M19 13.5V12L16.75 14.25L19 16.5V15C20.38 15 21.5 16.12 21.5 17.5C21.5 17.9 21.41 18.28 21.24 18.62L22.33 19.71C22.75 19.08 23 18.32 23 17.5C23 15.29 21.21 13.5 19 13.5M19 20C17.62 20 16.5 18.88 16.5 17.5C16.5 17.1 16.59 16.72 16.76 16.38L15.67 15.29C15.25 15.92 15 16.68 15 17.5C15 19.71 16.79 21.5 19 21.5V23L21.25 20.75L19 18.5V20Z" />
</svg>

Binary files not shown (the removed SVG above was 989 B; two image previews removed, 122 KiB and 255 KiB).

View File

@ -32,6 +32,7 @@
{"id":"","label":"","localized":"","reload":"","hint":"Sort by time, descending"}
],
"main": [
{"id":"","label":"SD.Next","localized":"","reload":"","hint":"SD.Next<br>All-in-one WebUI for AI generative image and video creation"},
{"id":"","label":"Prompt","localized":"","reload":"","hint":"Describe image you want to generate"},
{"id":"","label":"Start","localized":"","reload":"","hint":"Start"},
{"id":"","label":"End","localized":"","reload":"","hint":"End"},
@ -41,10 +42,14 @@
{"id":"","label":"Text","localized":"","reload":"","hint":"Create image from text"},
{"id":"","label":"Image","localized":"","reload":"","hint":"Create image from image"},
{"id":"","label":"Control","localized":"","reload":"","hint":"Create image with full guidance"},
{"id":"","label":"Process","localized":"","reload":"","hint":"Process existing image"},
{"id":"","label":"Images","localized":"","reload":"","hint":"Create images<br>Unified interface<br>Supports T2I and I2I<br>With optional control guidance"},
{"id":"","label":"T2I","localized":"","reload":"","hint":"Create image from text<br>Legacy interface that mimics original text-to-image interface and behavior"},
{"id":"","label":"I2I","localized":"","reload":"","hint":"Create image from image<br>Legacy interface that mimics original image-to-image interface and behavior"},
{"id":"","label":"Process","localized":"","reload":"","hint":"Process existing image<br>Can be used to upscale images, remove backgrounds, obfuscate NSFW content, apply various filters and effects"},
{"id":"","label":"Caption","localized":"","reload":"","hint":"Analyze existing images and create text descriptions"},
{"id":"","label":"Interrogate","localized":"","reload":"","hint":"Run interrogate to get description of your image"},
{"id":"","label":"Models","localized":"","reload":"","hint":"Download, convert or merge your models and manage models metadata"},
{"id":"","label":"Sampler","localized":"","reload":"","hint":"Settings related to sampler and seed selection and configuration. Samplers guide the process of turning noise into an image over multiple steps."},
{"id":"","label":"Agent Scheduler","localized":"","reload":"","hint":"Enqueue your generate requests and run them in the background"},
{"id":"","label":"AgentScheduler","localized":"","reload":"","hint":"Enqueue your generate requests and run them in the background"},
{"id":"","label":"System","localized":"","reload":"","hint":"System settings and information"},
@ -97,7 +102,7 @@
{"id":"","label":"Denoise","localized":"","reload":"","hint":"Denoising settings. Higher denoise means that more of existing image content is allowed to change during generate"},
{"id":"","label":"Mask","localized":"","reload":"","hint":"Image masking and mask options"},
{"id":"","label":"Input","localized":"","reload":"","hint":"Selection of input media"},
{"id":"","label":"Video","localized":"","reload":"","hint":"Create video using guidance"},
{"id":"","label":"Video","localized":"","reload":"","hint":"Create videos using different methods<br>Supports text-to-image, image-to-image first-last-frame, etc."},
{"id":"","label":"Control elements","localized":"","reload":"","hint":"Control elements are advanced models that can guide generation towards desired outcome"},
{"id":"","label":"IP adapter","localized":"","reload":"","hint":"Guide generation towards desired outcome using IP adapters plugin models"},
{"id":"","label":"IP adapters","localized":"","reload":"","hint":"IP adapters are plugin models that can guide generation towards desired outcome"},
@ -192,7 +197,14 @@
{"id":"","label":"Control Only","localized":"","reload":"","hint":"This uses only the Control input below as the source for any ControlNet or IP Adapter type tasks based on any of our various options."},
{"id":"","label":"Init Image Same As Control","localized":"","reload":"","hint":"Will additionally treat any image placed into the Control input window as a source for img2img type tasks, an image to modify for example."},
{"id":"","label":"Separate Init Image","localized":"","reload":"","hint":"Creates an additional window next to Control input labeled Init input, so you can have a separate image for both Control operations and an init source."},
{"id":"","label":"Override settings","localized":"","reload":"","hint":"If generation parameters deviate from your system settings override settings populated with those settings to override your system configuration for this workflow"}
{"id":"","label":"Override settings","localized":"","reload":"","hint":"If generation parameters deviate from your system settings override settings populated with those settings to override your system configuration for this workflow"},
{"id":"","label":"sigma method","localized":"","reload":"","hint":"Controls how noise levels (sigmas) are distributed across diffusion steps. Options:\n- default: the model default\n- karras: smoother noise schedule, higher quality with fewer steps\n- beta: based on beta schedule values\n- exponential: exponential decay of noise\n- lambdas: experimental, balances signal-to-noise\n- flowmatch: tuned for flow-matching models"},
{"id":"","label":"timestep spacing","localized":"","reload":"","hint":"Determines how timesteps are spaced across the diffusion process. Options:\n- default: the model default\n- leading: creates evenly spaced steps\n- linspace: includes the first and last steps and evenly selects the remaining intermediate steps\n- trailing: only includes the last step and evenly selects the remaining intermediate steps starting from the end"},
{"id":"","label":"beta schedule","localized":"","reload":"","hint":"Defines how beta (noise strength per step) grows. Options:\n- default: the model default\n- linear: evenly decays noise per step\n- scaled: squared version of linear, used only by Stable Diffusion\n- cosine: smoother decay, often better results with fewer steps\n- sigmoid: sharp transition, experimental"},
{"id":"","label":"prediction method","localized":"","reload":"","hint":"Defines what the model predicts at each step. Options:\n- default: the model default\n- epsilon: noise (most common for Stable Diffusion)\n- sample: direct denoised image prediction, also called as x0 prediction\n- v_prediction: velocity prediction, used by CosXL and NoobAI VPred models\n- flow_prediction: used with newer flow-matching models like SD3 and Flux"},
{"id":"","label":"sampler order","localized":"","reload":"","hint":"Order of solver updates in the sampler. Higher order improves stability/accuracy but increases compute cost."},
{"id":"","label":"flow shift","localized":"","reload":"","hint":"Adjustment for flow-based samplers. Shifts noise distribution during generation, useful for fine-tuning balance between detail and consistency."},
{"id":"","label":"resize mode","localized":"","reload":"","hint":"Defines how the input is resized or adapted in second-pass refinement:\n- none: no resizing, keep original resolution\n- fixed: force resize to target resolution (may distort)\n- crop: center-crop to fit target while keeping aspect ratio\n- fill: resize to fit and pad empty space with borders\n- outpaint: extend canvas beyond image borders\n- context aware: smart resize that blends or adapts surrounding areas"}
],
"other": [
{"id":"","label":"Install","localized":"","reload":"","hint":"Install"},
@ -358,15 +370,26 @@
{"id":"","label":"ONNX allow fallback to CPU","localized":"","reload":"","hint":"Allow fallback to CPU when selected execution provider failed"},
{"id":"","label":"ONNX cache converted models","localized":"","reload":"","hint":"Save the models that are converted to ONNX format as a cache. You can manage them in ONNX tab"},
{"id":"","label":"ONNX unload base model when processing refiner","localized":"","reload":"","hint":"Unload base model when the refiner is being converted/optimized/processed"},
{"id":"","label":"Inference-mode","localized":"","reload":"","hint":"Use torch.inference_mode"},
{"id":"","label":"no-grad","localized":"","reload":"","hint":"Use torch.no_grad"},
{"id":"","label":"Model compile precompile","localized":"","reload":"","hint":"Run model compile immediately on model load instead of first use"},
{"id":"","label":"Use zeros for prompt padding","localized":"","reload":"","hint":"Force full zero tensor when prompt is empty to remove any residual noise"},
{"id":"","label":"Include invisible watermark","localized":"","reload":"","hint":"Add invisible watermark to image by altering some pixel values"},
{"id":"","label":"invisible watermark string","localized":"","reload":"","hint":"Watermark string to add to image. Keep very short to avoid image corruption."},
{"id":"","label":"show log view","localized":"","reload":"","hint":"Show log view at the bottom of the main window"},
{"id":"","label":"Log view update period","localized":"","reload":"","hint":"Log view update period, in milliseconds"},
{"id":"","label":"PAG layer names","localized":"","reload":"","hint":"Space separated list of layers<br>Available: d[0-5], m[0], u[0-8]<br>Default: m0"}
{"id":"","label":"PAG layer names","localized":"","reload":"","hint":"Space separated list of layers<br>Available: d[0-5], m[0], u[0-8]<br>Default: m0"},
{"id":"","label":"prompt attention normalization","localized":"","reload":"","hint":"Balances prompt token weights to avoid overly strong/weak influence. Helps stabilize outputs."},
{"id":"","label":"ck flash attention","localized":"","reload":"","hint":"Custom Flash Attention kernel. Very fast, but may be unstable or hardware-dependent."},
{"id":"","label":"flash attention","localized":"","reload":"","hint":"Highly optimized attention algorithm. Greatly reduces VRAM use and speeds up inference, but can be non-deterministic."},
{"id":"","label":"memory attention","localized":"","reload":"","hint":"Uses less VRAM by chunking attention computation. Slower but allows bigger batches or images."},
{"id":"","label":"math attention","localized":"","reload":"","hint":"Fallback pure-math attention implementation. Stable and predictable but very slow."},
{"id":"","label":"dynamic attention","localized":"","reload":"","hint":"Adjusts attention computation dynamically per step. Saves VRAM but slows generation."},
{"id":"","label":"sage attention","localized":"","reload":"","hint":"Experimental attention optimization method. May improve speed but less tested and can cause bugs."},
{"id":"","label":"batch matrix-matrix","localized":"","reload":"","hint":"Standard batched matrix multiplication for attention. Reliable but not VRAM-efficient."},
{"id":"","label":"split attention","localized":"","reload":"","hint":"Splits attention layers into smaller chunks. Helps with very large images at the cost of slower inference."},
{"id":"","label":"deterministic mode","localized":"","reload":"","hint":"Forces deterministic output across runs. Useful for reproducibility, but may disable some optimizations."},
{"id":"","label":"no-grad","localized":"","reload":"","hint":"Disables gradient tracking with torch.no_grad. Reduces memory usage and speeds up inference."},
{"id":"","label":"Inference-mode","localized":"","reload":"","hint":"Like no-grad but stricter. Ensures model runs only in inference mode for safety and speed."},
{"id":"","label":"cudamallocasync","localized":"","reload":"","hint":"Uses CUDA async memory allocator. Improves performance and VRAM fragmentation, but may cause instability on some GPUs."}
],
"missing": [
{"id":"","label":"1st stage","localized":"","reload":"","hint":"1st stage"},
@ -455,7 +478,6 @@
{"id":"","label":"batch interogate","localized":"","reload":"","hint":"batch interogate"},
{"id":"","label":"batch interrogate","localized":"","reload":"","hint":"batch interrogate"},
{"id":"","label":"batch mask directory","localized":"","reload":"","hint":"batch mask directory"},
{"id":"","label":"batch matrix-matrix","localized":"","reload":"","hint":"batch matrix-matrix"},
{"id":"","label":"batch mode uses sequential seeds","localized":"","reload":"","hint":"batch mode uses sequential seeds"},
{"id":"","label":"batch output directory","localized":"","reload":"","hint":"batch output directory"},
{"id":"","label":"batch uses original name","localized":"","reload":"","hint":"batch uses original name"},
@ -466,7 +488,6 @@
{"id":"","label":"beta block weight preset","localized":"","reload":"","hint":"beta block weight preset"},
{"id":"","label":"beta end","localized":"","reload":"","hint":"beta end"},
{"id":"","label":"beta ratio","localized":"","reload":"","hint":"beta ratio"},
{"id":"","label":"beta schedule","localized":"","reload":"","hint":"beta schedule"},
{"id":"","label":"beta start","localized":"","reload":"","hint":"beta start"},
{"id":"","label":"bh1","localized":"","reload":"","hint":"bh1"},
{"id":"","label":"bh2","localized":"","reload":"","hint":"bh2"},
@ -495,7 +516,6 @@
{"id":"","label":"chunk size","localized":"","reload":"","hint":"chunk size"},
{"id":"","label":"civitai model type","localized":"","reload":"","hint":"civitai model type"},
{"id":"","label":"civitai token","localized":"","reload":"","hint":"civitai token"},
{"id":"","label":"ck flash attention","localized":"","reload":"","hint":"ck flash attention"},
{"id":"","label":"ckpt","localized":"","reload":"","hint":"ckpt"},
{"id":"","label":"cleanup temporary folder on startup","localized":"","reload":"","hint":"cleanup temporary folder on startup"},
{"id":"","label":"clip model","localized":"","reload":"","hint":"clip model"},
@ -563,7 +583,6 @@
{"id":"","label":"create zip archive","localized":"","reload":"","hint":"create zip archive"},
{"id":"","label":"cross-attention","localized":"","reload":"","hint":"cross-attention"},
{"id":"","label":"cudagraphs","localized":"","reload":"","hint":"cudagraphs"},
{"id":"","label":"cudamallocasync","localized":"","reload":"","hint":"cudamallocasync"},
{"id":"","label":"custom pipeline","localized":"","reload":"","hint":"custom pipeline"},
{"id":"","label":"dark","localized":"","reload":"","hint":"dark"},
{"id":"","label":"dc solver","localized":"","reload":"","hint":"dc solver"},
@ -591,7 +610,6 @@
{"id":"","label":"depth threshold","localized":"","reload":"","hint":"depth threshold"},
{"id":"","label":"description","localized":"","reload":"","hint":"description"},
{"id":"","label":"details","localized":"","reload":"","hint":"details"},
{"id":"","label":"deterministic mode","localized":"","reload":"","hint":"deterministic mode"},
{"id":"","label":"device info","localized":"","reload":"","hint":"device info"},
{"id":"","label":"diffusers","localized":"","reload":"","hint":"diffusers"},
{"id":"","label":"dilate","localized":"","reload":"","hint":"dilate"},
@ -635,7 +653,6 @@
{"id":"","label":"duration","localized":"","reload":"","hint":"duration"},
{"id":"","label":"dwpose","localized":"","reload":"","hint":"dwpose"},
{"id":"","label":"dynamic","localized":"","reload":"","hint":"dynamic"},
{"id":"","label":"dynamic attention","localized":"","reload":"","hint":"dynamic attention"},
{"id":"","label":"dynamic attention slicing rate in gb","localized":"","reload":"","hint":"dynamic attention slicing rate in gb"},
{"id":"","label":"dynamic attention trigger rate in gb","localized":"","reload":"","hint":"dynamic attention trigger rate in gb"},
{"id":"","label":"edge","localized":"","reload":"","hint":"edge"},
@ -682,9 +699,7 @@
{"id":"","label":"filename","localized":"","reload":"","hint":"filename"},
{"id":"","label":"first-block cache enabled","localized":"","reload":"","hint":"first-block cache enabled"},
{"id":"","label":"fixed unet precision","localized":"","reload":"","hint":"fixed unet precision"},
{"id":"","label":"flash attention","localized":"","reload":"","hint":"flash attention"},
{"id":"","label":"flavors","localized":"","reload":"","hint":"flavors"},
{"id":"","label":"flow shift","localized":"","reload":"","hint":"flow shift"},
{"id":"","label":"folder","localized":"","reload":"","hint":"folder"},
{"id":"","label":"folder for control generate","localized":"","reload":"","hint":"folder for control generate"},
{"id":"","label":"folder for control grids","localized":"","reload":"","hint":"folder for control grids"},
@ -848,7 +863,6 @@
{"id":"","label":"mask only","localized":"","reload":"","hint":"mask only"},
{"id":"","label":"mask strength","localized":"","reload":"","hint":"mask strength"},
{"id":"","label":"masked","localized":"","reload":"","hint":"masked"},
{"id":"","label":"math attention","localized":"","reload":"","hint":"math attention"},
{"id":"","label":"max faces","localized":"","reload":"","hint":"max faces"},
{"id":"","label":"max flavors","localized":"","reload":"","hint":"max flavors"},
{"id":"","label":"max guidance","localized":"","reload":"","hint":"max guidance"},
@ -866,7 +880,6 @@
{"id":"","label":"medium","localized":"","reload":"","hint":"medium"},
{"id":"","label":"mediums","localized":"","reload":"","hint":"mediums"},
{"id":"","label":"memory","localized":"","reload":"","hint":"memory"},
{"id":"","label":"memory attention","localized":"","reload":"","hint":"memory attention"},
{"id":"","label":"memory limit","localized":"","reload":"","hint":"memory limit"},
{"id":"","label":"memory optimization","localized":"","reload":"","hint":"memory optimization"},
{"id":"","label":"merge alpha","localized":"","reload":"","hint":"merge alpha"},
@ -987,7 +1000,6 @@
{"id":"","label":"postprocessing operation order","localized":"","reload":"","hint":"postprocessing operation order"},
{"id":"","label":"power","localized":"","reload":"","hint":"power"},
{"id":"","label":"predefined question","localized":"","reload":"","hint":"predefined question"},
{"id":"","label":"prediction method","localized":"","reload":"","hint":"prediction method"},
{"id":"","label":"preset","localized":"","reload":"","hint":"preset"},
{"id":"","label":"preset block merge","localized":"","reload":"","hint":"preset block merge"},
{"id":"","label":"preview","localized":"","reload":"","hint":"preview"},
@ -998,7 +1010,6 @@
{"id":"","label":"processor move to cpu after use","localized":"","reload":"","hint":"processor move to cpu after use"},
{"id":"","label":"processor settings","localized":"","reload":"","hint":"processor settings"},
{"id":"","label":"processor unload after use","localized":"","reload":"","hint":"processor unload after use"},
{"id":"","label":"prompt attention normalization","localized":"","reload":"","hint":"prompt attention normalization"},
{"id":"","label":"prompt ex","localized":"","reload":"","hint":"prompt ex"},
{"id":"","label":"prompt processor","localized":"","reload":"","hint":"prompt processor"},
{"id":"","label":"prompt strength","localized":"","reload":"","hint":"prompt strength"},
@ -1035,13 +1046,12 @@
{"id":"","label":"reprocess face","localized":"","reload":"","hint":"reprocess face"},
{"id":"","label":"reprocess refine","localized":"","reload":"","hint":"reprocess refine"},
{"id":"","label":"request browser notifications","localized":"","reload":"","hint":"request browser notifications"},
{"id":"","label":"rescale","localized":"","reload":"","hint":"rescale"},
{"id":"","label":"rescale","localized":"","reload":"","hint":"rescale betas with zero terminal snr"},
{"id":"","label":"rescale betas with zero terminal snr","localized":"","reload":"","hint":"rescale betas with zero terminal snr"},
{"id":"","label":"reset anchors","localized":"","reload":"","hint":"reset anchors"},
{"id":"","label":"residual diff threshold","localized":"","reload":"","hint":"residual diff threshold"},
{"id":"","label":"resize background color","localized":"","reload":"","hint":"resize background color"},
{"id":"","label":"resize method","localized":"","reload":"","hint":"resize method"},
{"id":"","label":"resize mode","localized":"","reload":"","hint":"resize mode"},
{"id":"","label":"resize scale","localized":"","reload":"","hint":"resize scale"},
{"id":"","label":"restart step","localized":"","reload":"","hint":"restart step"},
{"id":"","label":"restore faces: codeformer","localized":"","reload":"","hint":"restore faces: codeformer"},
@ -1057,13 +1067,11 @@
{"id":"","label":"run benchmark","localized":"","reload":"","hint":"run benchmark"},
{"id":"","label":"sa solver","localized":"","reload":"","hint":"sa solver"},
{"id":"","label":"safetensors","localized":"","reload":"","hint":"safetensors"},
{"id":"","label":"sage attention","localized":"","reload":"","hint":"sage attention"},
{"id":"","label":"same as primary","localized":"","reload":"","hint":"same as primary"},
{"id":"","label":"same latent","localized":"","reload":"","hint":"same latent"},
{"id":"","label":"sample","localized":"","reload":"","hint":"sample"},
{"id":"","label":"sampler","localized":"","reload":"","hint":"sampler"},
{"id":"","label":"sampler dynamic shift","localized":"","reload":"","hint":"sampler dynamic shift"},
{"id":"","label":"sampler order","localized":"","reload":"","hint":"sampler order"},
{"id":"","label":"sampler shift","localized":"","reload":"","hint":"sampler shift"},
{"id":"","label":"sana: use complex human instructions","localized":"","reload":"","hint":"sana: use complex human instructions"},
{"id":"","label":"saturation","localized":"","reload":"","hint":"saturation"},
@ -1129,7 +1137,6 @@
{"id":"","label":"sigma","localized":"","reload":"","hint":"sigma"},
{"id":"","label":"sigma churn","localized":"","reload":"","hint":"sigma churn"},
{"id":"","label":"sigma max","localized":"","reload":"","hint":"sigma max"},
{"id":"","label":"sigma method","localized":"","reload":"","hint":"sigma method"},
{"id":"","label":"sigma min","localized":"","reload":"","hint":"sigma min"},
{"id":"","label":"sigma noise","localized":"","reload":"","hint":"sigma noise"},
{"id":"","label":"sigma tmin","localized":"","reload":"","hint":"sigma tmin"},
@ -1148,7 +1155,6 @@
{"id":"","label":"spatial frequency","localized":"","reload":"","hint":"spatial frequency"},
{"id":"","label":"specify model revision","localized":"","reload":"","hint":"specify model revision"},
{"id":"","label":"specify model variant","localized":"","reload":"","hint":"specify model variant"},
{"id":"","label":"split attention","localized":"","reload":"","hint":"split attention"},
{"id":"","label":"stable-fast","localized":"","reload":"","hint":"stable-fast"},
{"id":"","label":"standard","localized":"","reload":"","hint":"standard"},
{"id":"","label":"start","localized":"","reload":"","hint":"start"},
@ -1209,7 +1215,6 @@
{"id":"","label":"timestep","localized":"","reload":"","hint":"timestep"},
{"id":"","label":"timestep skip end","localized":"","reload":"","hint":"timestep skip end"},
{"id":"","label":"timestep skip start","localized":"","reload":"","hint":"timestep skip start"},
{"id":"","label":"timestep spacing","localized":"","reload":"","hint":"timestep spacing"},
{"id":"","label":"timesteps","localized":"","reload":"","hint":"timesteps"},
{"id":"","label":"timesteps override","localized":"","reload":"","hint":"timesteps override"},
{"id":"","label":"timesteps presets","localized":"","reload":"","hint":"timesteps presets"},

Binary files not shown (one 108 KiB image removed, one 101 KiB image modified, one 157 KiB image removed).

View File

@ -1,4 +1,3 @@
{
"Tempest-by-Vlad XL": {
"path": "tempestByVlad_baseV01.safetensors@https://civitai.com/api/download/models/1301775",
@ -140,40 +139,47 @@
"extras": "sampler: Default, cfg_scale: 4.5"
},
"lodestones Chroma Unlocked HD": {
"lodestones Chroma1 HD": {
"path": "lodestones/Chroma1-HD",
"preview": "lodestones--Chroma-HD.jpg",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and Id love to hear your thoughts! Your input and feedback are really appreciated.",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. This is the high-res fine-tune of the Chroma1-Base at a 1024x1024 resolution.",
"skip": true,
"extras": "sampler: Default, cfg_scale: 3.5"
"extras": ""
},
"lodestones Chroma Unlocked HD Annealed": {
"path": "vladmandic/chroma-unlocked-v50-annealed",
"preview": "lodestones--Chroma-annealed.jpg",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and Id love to hear your thoughts! Your input and feedback are really appreciated.",
"lodestones Chroma1 Base": {
"path": "lodestones/Chroma1-Base",
"preview": "lodestones--Chroma-Base.jpg",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. This is the core 512x512 model. It's a solid, all-around foundation for pretty much any creative project.",
"skip": true,
"extras": "sampler: Default, cfg_scale: 3.5"
"extras": ""
},
"lodestones Chroma Unlocked HD Flash": {
"lodestones Chroma1 Flash": {
"path": "lodestones/Chroma1-Flash",
"preview": "lodestones--Chroma-flash.jpg",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and Id love to hear your thoughts! Your input and feedback are really appreciated.",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. A fine-tuned version of the Chroma1-Base made to find the best way to make these flow matching models faster.",
"skip": true,
"extras": "sampler: Default, cfg_scale: 1.0"
"extras": ""
},
"lodestones Chroma Unlocked v48": {
"lodestones Chroma1 v50 Preview Annealed": {
"path": "vladmandic/chroma-unlocked-v50-annealed",
"preview": "lodestones--Chroma-annealed.jpg",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. Re-tweaked variant with extra noise added.",
"skip": true,
"extras": ""
},
"lodestones Chroma1 v48 Preview": {
"path": "vladmandic/chroma-unlocked-v48",
"preview": "lodestones--Chroma.jpg",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and Id love to hear your thoughts! Your input and feedback are really appreciated.",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. Last raw version of Chroma before final finetuning.",
"skip": true,
"extras": "sampler: Default, cfg_scale: 1.0"
"extras": ""
},
"lodestones Chroma Unlocked v48 Detail Calibrated": {
"lodestones Chroma1 v48 Preview Calibrated": {
"path": "vladmandic/chroma-unlocked-v48-detail-calibrated",
"preview": "lodestones--Chroma-detail.jpg",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and Id love to hear your thoughts! Your input and feedback are really appreciated.",
"desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. Its fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. Last raw version of Chroma before final finetuning but with some detail calibration.",
"skip": true,
"extras": "sampler: Default, cfg_scale: 1.0"
"extras": ""
},
"Qwen-Image": {
@ -185,12 +191,12 @@
},
"Qwen-Image-Edit": {
"path": "Qwen/Qwen-Image-Edit",
"preview": "Qwen--Qwen-Image.jpg",
"preview": "Qwen--Qwen-Image-Edit.jpg",
"desc": " Qwen-Image-Edit, the image editing version of Qwen-Image. Built upon our 20B Qwen-Image model, Qwen-Image-Edit successfully extends Qwen-Images unique text rendering capabilities to image editing tasks, enabling precise text editing.",
"skip": true,
"extras": ""
},
"Qwen-Lightning": {
"Qwen-Image-Lightning": {
"path": "vladmandic/Qwen-Lightning",
"preview": "Qwen-Lightning.jpg",
"desc": " Qwen-Lightning is step-distilled from Qwen-Image to allow for generation in 8 steps.",
@ -281,13 +287,13 @@
"NVLabs Sana 1.5 1.6B 1k": {
"path": "Efficient-Large-Model/SANA1.5_1.6B_1024px_diffusers",
"desc": "Sana is an efficient model with scaling of training-time and inference time techniques. SANA-1.5 delivers: efficient model growth from 1.6B Sana-1.0 model to 4.8B, achieving similar or better performance than training from scratch and saving 60% training cost; efficient model depth pruning, slimming any model size as you want; powerful VLM selection based inference scaling, smaller model+inference scaling > larger model.",
"preview": "Efficient-Large-Model--Sana15_1600M_1024px_diffusers.jpg",
"preview": "Efficient-Large-Model--SANA1.5_1.6B_1024px_diffusers.jpg",
"skip": true
},
"NVLabs Sana 1.5 4.8B 1k": {
"path": "Efficient-Large-Model/SANA1.5_4.8B_1024px_diffusers",
"desc": "Sana is an efficient model with scaling of training-time and inference time techniques. SANA-1.5 delivers: efficient model growth from 1.6B Sana-1.0 model to 4.8B, achieving similar or better performance than training from scratch and saving 60% training cost; efficient model depth pruning, slimming any model size as you want; powerful VLM selection based inference scaling, smaller model+inference scaling > larger model.",
"preview": "Efficient-Large-Model--Sana15_4800M_1024px_diffusers.jpg",
"preview": "Efficient-Large-Model--SANA1.5_4.8B_1024px_diffusers.jpg",
"skip": true
},
"NVLabs Sana 1.5 1.6B 1k Sprint": {
@ -299,25 +305,25 @@
"NVLabs Sana 1.0 1.6B 4k": {
"path": "Efficient-Large-Model/Sana_1600M_4Kpx_BF16_diffusers",
"desc": "Sana is a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU.",
"preview": "Efficient-Large-Model--Sana15_1600M_4Kpx_diffusers.jpg",
"preview": "Efficient-Large-Model--Sana_1600M_4Kpx_BF16_diffusers.jpg",
"skip": true
},
"NVLabs Sana 1.0 1.6B 2k": {
"path": "Efficient-Large-Model/Sana_1600M_2Kpx_BF16_diffusers",
"desc": "Sana is a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU.",
"preview": "Efficient-Large-Model--Sana1_1600M_2Kpx_diffusers.jpg",
"preview": "Efficient-Large-Model--Sana_1600M_2Kpx_BF16_diffusers.jpg",
"skip": true
},
"NVLabs Sana 1.0 1.6B 1k": {
"path": "Efficient-Large-Model/Sana_1600M_1024px_diffusers",
"desc": "Sana is a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU.",
"preview": "Efficient-Large-Model--Sana1_1600M_1024px_diffusers.jpg",
"preview": "Efficient-Large-Model--Sana_1600M_1024px_diffusers.jpg",
"skip": true
},
"NVLabs Sana 1.0 0.6B 0.5k": {
"path": "Efficient-Large-Model/Sana_600M_512px_diffusers",
"desc": "Sana is a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU.",
"preview": "Efficient-Large-Model--Sana1_600M_1024px_diffusers.jpg",
"preview": "Efficient-Large-Model--Sana_600M_512px_diffusers.jpg",
"skip": true
},
@ -340,7 +346,6 @@
"preview": "Shitao--OmniGen-v1.jpg",
"skip": true
},
"VectorSpaceLab OmniGen v2": {
"path": "OmniGen2/OmniGen2",
"desc": "OmniGen2 is a powerful and efficient unified multimodal model. Unlike OmniGen v1, OmniGen2 features two distinct decoding pathways for text and image modalities, utilizing unshared parameters and a decoupled image tokenizer.",
@ -462,7 +467,6 @@
"skip": true,
"extras": "sampler: Default"
},
"AlphaVLLM Lumina 2": {
"path": "Alpha-VLLM/Lumina-Image-2.0",
"desc": "A Unified and Efficient Image Generative Model. Lumina-Image-2.0 is a 2 billion parameter flow-based diffusion transformer capable of generating images from text descriptions.",
@ -553,9 +557,10 @@
"extras": "sampler: Default"
},
"Playground v2.5": {
"path": "playground-v2.5-1024px-aesthetic.fp16.safetensors@https://huggingface.co/playgroundai/playground-v2.5-1024px-aesthetic/resolve/main/playground-v2.5-1024px-aesthetic.fp16.safetensors?download=true",
"desc": "Playground v2.5 is a diffusion-based text-to-image generative model, and a successor to Playground v2. Playground v2.5 is the state-of-the-art open-source model in aesthetic quality. Our user studies demonstrate that our model outperforms SDXL, Playground v2, PixArt-α, DALL-E 3, and Midjourney 5.2.",
"path": "playgroundai/playground-v2.5-1024px-aesthetic",
"desc": "Playground v2.5 is a diffusion-based text-to-image generative model, and a successor to Playground v2. Playground v2.5 is the state-of-the-art open-source model in aesthetic quality.",
"preview": "playgroundai--playground-v2_5-1024px-aesthetic.jpg",
"variant": "fp16",
"extras": "sampler: DPM++ 2M EDM"
},
@ -604,6 +609,7 @@
"preview": "MeissonFlow--Meissonic.jpg",
"skip": true
},
"aMUSEd 256": {
"path": "huggingface/amused/amused-256",
"skip": true,
@ -624,6 +630,7 @@
"preview": "warp-ai--wuerstchen.jpg",
"extras": "sampler: Default, cfg_scale: 4.0, image_cfg_scale: 0.0"
},
"KOALA 700M": {
"path": "huggingface/etri-vilab/koala-700m-llava-cap",
"variant": "fp16",
@ -632,22 +639,34 @@
"preview": "etri-vilab--koala-700m-llava-cap.jpg",
"extras": "sampler: Default"
},
"HDM-XUT 340M Anime": {
"path": "KBlueLeaf/HDM-xut-340M-anime",
"skip": true,
"desc": "HDM(Home made Diffusion Model) is a project to investigate specialized training recipe/scheme for pretraining T2I model at home which require the training setup should be exectuable on customer level hardware or cheap enough second handed server hardware.",
"preview": "KBlueLeaf--HDM-xut-340M-anime.jpg",
"extras": ""
},
"Tsinghua UniDiffuser": {
"path": "thu-ml/unidiffuser-v1",
"desc": "UniDiffuser is a unified diffusion framework to fit all distributions relevant to a set of multi-modal data in one transformer. UniDiffuser is able to perform image, text, text-to-image, image-to-text, and image-text pair generation by setting proper timesteps without additional overhead.\nSpecifically, UniDiffuser employs a variation of transformer, called U-ViT, which parameterizes the joint noise prediction network. Other components perform as encoders and decoders of different modalities, including a pretrained image autoencoder from Stable Diffusion, a pretrained image ViT-B/32 CLIP encoder, a pretrained text ViT-L CLIP encoder, and a GPT-2 text decoder finetuned by ourselves.",
"preview": "thu-ml--unidiffuser-v1.jpg",
"extras": "width: 512, height: 512, sampler: Default"
},
"SalesForce BLIP-Diffusion": {
"path": "salesforce/blipdiffusion",
"desc": "BLIP-Diffusion, a new subject-driven image generation model that supports multimodal control which consumes inputs of subject images and text prompts. Unlike other subject-driven generation models, BLIP-Diffusion introduces a new multimodal encoder which is pre-trained to provide subject representation.",
"preview": "salesforce--blipdiffusion.jpg"
},
"InstaFlow 0.9B": {
"path": "XCLiu/instaflow_0_9B_from_sd_1_5",
"desc": "InstaFlow is an ultra-fast, one-step image generator that achieves image quality close to Stable Diffusion. This efficiency is made possible through a recent Rectified Flow technique, which trains probability flows with straight trajectories, hence inherently requiring only a single step for fast inference.",
"preview": "XCLiu--instaflow_0_9B_from_sd_1_5.jpg"
},
"DeepFloyd IF Medium": {
"path": "DeepFloyd/IF-I-M-v1.0",
"desc": "DeepFloyd-IF is a pixel-based text-to-image triple-cascaded diffusion model, that can generate pictures with new state-of-the-art for photorealism and language understanding. The result is a highly efficient model that outperforms current state-of-the-art models, achieving a zero-shot FID-30K score of 6.66 on the COCO dataset. It is modular and composed of frozen text mode and three pixel cascaded diffusion modules, each designed to generate images of increasing resolution: 64x64, 256x256, and 1024x1024.",

Binary files not shown (three image previews removed: 78 KiB, 78 KiB and 160 KiB).

View File

@ -447,6 +447,8 @@ def git(arg: str, folder: str = None, ignore: bool = False, optional: bool = Fal
stdout += ('\n' if len(stdout) > 0 else '') + result.stderr.decode(encoding="utf8", errors="ignore")
stdout = stdout.strip()
if result.returncode != 0 and not ignore:
if folder is None:
folder = 'root'
if "couldn't find remote ref" in stdout: # not a git repo
log.error(f'Git: folder="{folder}" could not identify repository')
elif "no submodule mapping found" in stdout:
@ -601,7 +603,7 @@ def check_diffusers():
if args.skip_git:
install('diffusers')
return
sha = '4fcd0bc7ebb934a1559d0b516f09534ba22c8a0d' # diffusers commit hash
sha = '9b721db205729d5a6e97a72312c3a0f4534064f1' # diffusers commit hash
pkg = pkg_resources.working_set.by_key.get('diffusers', None)
minor = int(pkg.version.split('.')[1] if pkg is not None else -1)
cur = opts.get('diffusers_version', '') if minor > -1 else ''
@ -622,18 +624,22 @@ def check_transformers():
t_start = time.time()
if args.skip_all or args.skip_git or args.experimental:
return
pkg = pkg_resources.working_set.by_key.get('transformers', None)
pkg_transformers = pkg_resources.working_set.by_key.get('transformers', None)
pkg_tokenizers = pkg_resources.working_set.by_key.get('tokenizers', None)
if args.use_directml:
target = '4.52.4'
target_transformers = '4.52.4'
target_tokenizers = '0.21.4'
else:
target = '4.55.2'
if (pkg is None) or ((pkg.version != target) and (not args.experimental)):
if pkg is None:
log.info(f'Transformers install: version={target}')
target_transformers = '4.56.0'
target_tokenizers = '0.22.0'
if (pkg_transformers is None) or ((pkg_transformers.version != target_transformers) or (pkg_tokenizers is None) or ((pkg_tokenizers.version != target_tokenizers) and (not args.experimental))):
if pkg_transformers is None:
log.info(f'Transformers install: version={target_transformers}')
else:
log.info(f'Transformers update: current={pkg.version} target={target}')
log.info(f'Transformers update: current={pkg_transformers.version} target={target_transformers}')
pip('uninstall --yes transformers', ignore=True, quiet=True, uv=False)
pip(f'install --upgrade transformers=={target}', ignore=False, quiet=True, uv=False)
pip(f'install --upgrade tokenizers=={target_tokenizers}', ignore=False, quiet=True, uv=False)
pip(f'install --upgrade transformers=={target_transformers}', ignore=False, quiet=True, uv=False)
ts('transformers', t_start)
@ -768,10 +774,6 @@ def install_rocm_zluda():
# older rocm (5.7) uses torch 2.3 or older
torch_command = os.environ.get('TORCH_COMMAND', f'torch torchvision --index-url https://download.pytorch.org/whl/rocm{rocm.version}')
if device is not None and rocm.version != "6.2" and rocm.get_blaslt_enabled():
log.debug(f'ROCm hipBLASLt: arch={device.name} available={device.blaslt_supported}')
rocm.set_blaslt_enabled(device.blaslt_supported)
if device is None or os.environ.get("HSA_OVERRIDE_GFX_VERSION", None) is not None:
log.info(f'ROCm: HSA_OVERRIDE_GFX_VERSION auto config skipped: device={device.name if device is not None else None} version={os.environ.get("HSA_OVERRIDE_GFX_VERSION", None)}')
else:
@ -1271,21 +1273,6 @@ def install_optional():
ts('optional', t_start)
def install_sentencepiece():
if installed('sentencepiece', quiet=True):
pass
elif int(sys.version_info.minor) >= 13:
backup_cmake_policy = os.environ.get('CMAKE_POLICY_VERSION_MINIMUM', None)
backup_cxxflags = os.environ.get('CXXFLAGS', None)
os.environ.setdefault('CMAKE_POLICY_VERSION_MINIMUM', '3.5')
os.environ.setdefault('CXXFLAGS', '-include cstdint')
install('git+https://github.com/google/sentencepiece#subdirectory=python', 'sentencepiece')
os.environ.setdefault('CMAKE_POLICY_VERSION_MINIMUM', backup_cmake_policy)
os.environ.setdefault('CXXFLAGS', backup_cxxflags)
else:
install('sentencepiece', 'sentencepiece')
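
Aside: the removed helper above restores the environment with `os.environ.setdefault`, which never overwrites an existing value and raises on `None`; below is a sketch of a safer save-and-restore pattern. The `install` call in the usage comment is the installer's own helper shown above, used here only hypothetically.

```python
# Sketch: temporary env-var overrides with proper restore, avoiding the
# os.environ.setdefault pitfall in the removed helper above.
import os
from contextlib import contextmanager

@contextmanager
def env_override(**overrides):
    saved = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)  # variable was unset before; unset it again
            else:
                os.environ[key] = value    # restore the previous value

# hypothetical usage, mirroring the removed sentencepiece build path:
# with env_override(CMAKE_POLICY_VERSION_MINIMUM='3.5', CXXFLAGS='-include cstdint'):
#     install('git+https://github.com/google/sentencepiece#subdirectory=python', 'sentencepiece')
```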
def install_requirements():
t_start = time.time()
if args.profile:

View File

@ -98,7 +98,7 @@ table.settings-value-table td { padding: 0.4em; border: 1px solid #ccc; max-widt
.extra-network-cards .card:hover .overlay { background: rgba(0, 0, 0, 0.40); }
.extra-network-cards .card:hover .preview { box-shadow: none; filter: grayscale(100%); }
.extra-network-cards .card:hover .overlay { background: rgba(0, 0, 0, 0.40); }
.extra-network-cards .card .tags { margin: 4px; display: none; overflow-wrap: break-word; }
.extra-network-cards .card .tags { margin: 4px; display: none; overflow-wrap: anywhere; }
.extra-network-cards .card .tag { padding: 2px; margin: 2px; background: var(--neutral-700); cursor: pointer; display: inline-block; }
.extra-network-cards .card .actions > span { padding: 4px; }
.extra-network-cards .card:hover .actions { display: block; }
@ -128,8 +128,8 @@ div:has(>#tab-browser-folders) { flex-grow: 0 !important; background-color: var(
/* loader */
.splash { position: fixed; top: 0; left: 0; width: 100vw; height: 100vh; z-index: 1000; display: block; text-align: center; }
.motd { margin-top: 2em; color: var(--body-text-color-subdued); font-family: monospace; font-variant: all-petite-caps; }
.splash-img { margin: 10% auto 0 auto; width: 512px; background-repeat: no-repeat; height: 512px; animation: color 10s infinite alternate; max-width: 80vw; background-size: contain; }
.motd { margin-top: 2em; color: var(--body-text-color-subdued); font-family: monospace; font-variant: all-petite-caps; font-size: 1.2em; }
.splash-img { margin: 10% auto 0 auto; width: 512px; background-repeat: no-repeat; height: 512px; animation: hue 5s infinite alternate; max-width: 80vw; background-size: contain; }
.loading { color: white; position: absolute; top: 20%; left: 50%; transform: translateX(-50%); }
.loader { width: 300px; height: 300px; border: var(--spacing-md) solid transparent; border-radius: 50%; border-top: var(--spacing-md) solid var(--primary-600); animation: spin 4s linear infinite; position: relative; }
.loader::before, .loader::after { content: ""; position: absolute; top: 6px; bottom: 6px; left: 6px; right: 6px; border-radius: 50%; border: var(--spacing-md) solid transparent; }
@ -137,4 +137,4 @@ div:has(>#tab-browser-folders) { flex-grow: 0 !important; background-color: var(
.loader::after { border-top-color: var(--primary-300); animation: spin 1.5s linear infinite; }
@keyframes move { from { background-position-x: 0, -40px; } to { background-position-x: 0, 40px; } }
@keyframes spin { from { transform: rotate(0deg); } to { transform: rotate(360deg); } }
@keyframes color { from { filter: hue-rotate(0deg) } to { filter: hue-rotate(360deg) } }
@keyframes hue { from { filter: hue-rotate(0deg) } to { filter: hue-rotate(360deg) } }

View File

@ -953,7 +953,7 @@ svg.feather.feather-image,
}
/* No Preview Card Styles */
.extra-network-cards .card:has(>img[src*="card-no-preview.png"])::before {
.extra-network-cards .card:has(>img[src*="missing.png"])::before {
content: '';
position: absolute;
width: 100%;
@ -1007,11 +1007,11 @@ svg.feather.feather-image,
}
.splash-img {
margin: 0;
margin: 10% auto 0 auto;
width: 512px;
height: 512px;
background-repeat: no-repeat;
animation: color 8s infinite alternate, move 3s infinite alternate;
animation: hue 5s infinite alternate;
}
.loading {

View File

@ -107,7 +107,7 @@ async function modelCardClick(id) {
downloads: data.downloads?.toString() || '',
creator,
desc: data.desc || 'no description available',
image: images.length > 0 ? images[0] : '/sdapi/v1/network/thumb?filename=html/card-no-preview.png',
image: images.length > 0 ? images[0] : '/sdapi/v1/network/thumb?filename=html/missing.png',
versions: versionsHTML || '',
});
el.innerHTML = modelHTML;

View File

@ -14,8 +14,13 @@ const getENActiveTab = () => {
else if (gradioApp().getElementById('extras_image')?.checkVisibility()) tabName = 'process';
else if (gradioApp().getElementById('interrogate_image')?.checkVisibility()) tabName = 'caption';
else if (gradioApp().getElementById('tab-gallery-search')?.checkVisibility()) tabName = 'gallery';
if (tabName in ['process', 'caption', 'gallery']) tabName = lastTab;
else lastTab = tabName;
if (['process', 'caption', 'gallery'].includes(tabName)) {
tabName = lastTab;
} else if (tabName !== '') {
lastTab = tabName;
}
if (tabName !== '') return tabName;
// legacy method
if (gradioApp().getElementById('tab_txt2img')?.style.display === 'block') tabName = 'txt2img';
@ -277,8 +282,31 @@ function extraNetworksSearchButton(event) {
const tabName = getENActiveTab();
const searchTextarea = gradioApp().querySelector(`#${tabName}_extra_search textarea`);
const button = event.target;
searchTextarea.value = `${button.textContent.trim()}/`;
updateInput(searchTextarea);
if (searchTextarea) {
searchTextarea.value = `${button.textContent.trim()}/`;
updateInput(searchTextarea);
} else {
console.error(`Could not find the search textarea for the tab: ${tabName}`);
}
}
function extraNetworksFilterVersion(event) {
// log('extraNetworksFilterVersion', event);
const version = event.target.textContent.trim();
const activeTab = gradioApp().querySelector('.extra-networks-tab:not([style*="display: none"])');
if (!activeTab) return;
const cardContainer = activeTab.querySelector('.extra-network-cards');
if (!cardContainer) return;
if (cardContainer.dataset.activeVersion === version) {
cardContainer.dataset.activeVersion = '';
cardContainer.querySelectorAll('.card').forEach((card) => card.style.display = '');
} else {
cardContainer.dataset.activeVersion = version;
cardContainer.querySelectorAll('.card').forEach((card) => {
if (card.dataset.version === version) card.style.display = '';
else card.style.display = 'none';
});
}
}
let desiredStyle = '';

View File

@ -1197,7 +1197,7 @@ table.settings-value-table td {
}
.extra-networks .search textarea {
width: calc(120px / 1.1);
width: calc(140px / 1.1);
resize: none;
margin-right: 2px;
}
@ -1233,7 +1233,7 @@ table.settings-value-table td {
padding: 3px 3px 3px 12px;
text-align: left;
text-indent: -6px;
width: 120px;
width: 140px;
width: 100%;
}
@ -1249,7 +1249,7 @@ table.settings-value-table td {
.extra-network-subdirs {
background: var(--input-background-fill);
border-radius: 4px;
min-width: max(15%, 120px);
min-width: max(15%, 140px);
overflow-x: hidden;
overflow-y: auto;
padding-top: 0.5em;
@ -1376,7 +1376,17 @@ table.settings-value-table td {
display: block;
}
.extra-network-cards .card:has(>img[src*="card-no-preview.png"])::before {
.extra-network-cards .card:hover {
z-index: 100;
position: relative;
}
.extra-network-cards .card:hover .tags {
display: block;
z-index: 101; /* Optional: ensure tags are above everything */
}
.extra-network-cards .card:has(>img[src*="missing.png"])::before {
background-color: var(--data-color);
content: '';
height: 100%;
@ -1461,7 +1471,6 @@ table.settings-value-table td {
overflow-y: auto;
}
.extra-details td:first-child {
font-weight: bold;
vertical-align: top;
@ -1471,6 +1480,29 @@ table.settings-value-table td {
max-height: 50vh;
}
.network-folder::before {
content: "󰉖 ";
margin-right: 0.8em;
}
.network-reference {
filter: contrast(0.9);
}
.network-reference::before {
content: "󰴊 ";
margin-right: 0.8em;
}
.network-model {
opacity: 0.6;
}
.network-model::before {
content: "󰴉 ";
margin-right: 0.8em;
}
.input-accordion-checkbox {
display: none !important;
}
@ -1716,6 +1748,13 @@ background: var(--background-color)
width: max-content;
}
#tab-gallery-files gallery-file {
/* Add a horizontal gutter between items, matching existing small row spacing */
display: inline-block;
margin-right: 0.2em;
vertical-align: top; /* keep rows aligned on the top edge */
}
#tab-gallery-files {
display: block;
height: 75vh;
@ -1813,6 +1852,24 @@ div:has(>#tab-gallery-folders) {
object-fit: contain;
}
/* Gallery video preview matches image preview sizing and layout */
#tab-gallery-video {
height: 63vh;
}
/* Ensure the <video> element fills the preview column and preserves its aspect ratio */
#tab-gallery-video video {
width: 100%;
height: 100% !important;
object-fit: contain;
background: none;
}
/* Gradio container around the video should not add extra spacing */
#tab-gallery-video .wrap {
height: 100%;
}
.gallery-sort {
background: var(--input-background-fill) !important;
margin: 0 !important;
@ -1867,70 +1924,73 @@ div:has(>#tab-gallery-folders) {
}
.splash {
display: block;
height: 100vh;
left: 0;
position: fixed;
text-align: center;
top: 0;
left: 0;
width: 100vw;
height: 100vh;
z-index: 1000;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
background-color: rgba(0, 0, 0, 0.8);
}
.motd {
margin-top: 1em;
color: var(--body-text-color-subdued);
font-family: monospace;
font-variant: all-petite-caps;
margin-top: 2em;
font-size: 1.2em;
}
.splash-img {
animation: color 10s infinite alternate;
background-repeat: no-repeat;
background-size: contain;
height: 512px;
margin: 10% auto 0 auto;
max-width: 80vw;
width: 512px;
height: 512px;
background-repeat: no-repeat;
animation: hue 5s infinite alternate;
}
.loading {
color: white;
left: 50%;
position: absolute;
top: 20%;
transform: translateX(-50%);
box-sizing: border-box;
top: 85%;
font-size: 1.5em;
}
.loader {
animation: spin 4s linear infinite;
width: 100px;
height: 100px;
border: var(--spacing-md) solid transparent;
border-radius: 50%;
border-top: var(--spacing-md) solid var(--primary-600);
height: 300px;
position: relative;
width: 300px;
animation: spin 2s linear infinite, hue 5s infinite alternate;
box-sizing: border-box;
}
.loader::before, .loader::after {
border: var(--spacing-md) solid transparent;
border-radius: 50%;
bottom: 6px;
.loader::before,
.loader::after {
content: "";
left: 6px;
position: absolute;
right: 6px;
top: 6px;
bottom: 6px;
left: 6px;
right: 6px;
border-radius: 50%;
border: var(--spacing-md) solid transparent;
animation: hue 5s infinite alternate;
}
.loader::before {
animation: 3s spin linear infinite;
border-top-color: var(--primary-900);
animation: spin 3s linear infinite;
}
.loader::after {
animation: spin 1.5s linear infinite;
border-top-color: var(--primary-300);
animation: spin 1.5s linear infinite;
}
.docs-search textarea {
@ -2087,35 +2147,21 @@ div:has(>#tab-gallery-folders) {
filter: blur(0);
}
@keyframes move {
from {
background-position-x: 0, -40px;
}
to {
background-position-x: 0, 40px;
}
}
@keyframes spin {
from {
transform: rotate(0deg);
}
to {
transform: rotate(360deg);
}
}
@keyframes color {
from {
filter: hue-rotate(0deg)
@keyframes hue {
0% {
filter: hue-rotate(0deg);
}
to {
filter: hue-rotate(360deg)
100% {
filter: hue-rotate(360deg);
}
}

View File

@ -9,10 +9,12 @@ const localeData = {
type: 2,
hint: null,
btn: null,
expandTimeout: null, // New property for expansion timeout
expandTimeout: null, // Property for expansion timeout
currentElement: null, // Track current element for expansion
observer: null, // MutationObserver for DOM changes
};
let localeTimeout = null;
const isTouchDevice = 'ontouchstart' in window;
async function cycleLocale() {
clearTimeout(localeTimeout);
@ -62,43 +64,49 @@ async function tooltipCreate() {
if (window.opts.tooltips === 'None') localeData.type = 0;
if (window.opts.tooltips === 'Browser default') localeData.type = 1;
if (window.opts.tooltips === 'UI tooltips') localeData.type = 2;
if (localeData.type === 2) { // setup event delegation for tooltips instead of individual listeners
if (isTouchDevice) {
gradioApp().addEventListener('touchstart', tooltipShowDelegated); // eslint-disable-line no-use-before-define
gradioApp().addEventListener('touchend', tooltipHideDelegated); // eslint-disable-line no-use-before-define
}
gradioApp().addEventListener('pointerover', tooltipShowDelegated); // eslint-disable-line no-use-before-define
gradioApp().addEventListener('pointerout', tooltipHideDelegated); // eslint-disable-line no-use-before-define
}
if (!localeData.observer) initializeDOMObserver(); // eslint-disable-line no-use-before-define
}
async function expandTooltip(element, longHint) {
if (localeData.currentElement === element && localeData.hint.classList.contains('tooltip-show')) {
// Hide the progress ring
const ring = localeData.hint.querySelector('.tooltip-progress-ring');
if (ring) {
ring.style.opacity = '0';
}
// Expand the container
if (ring) ring.style.opacity = '0';
localeData.hint.classList.add('tooltip-expanded');
// After container starts expanding, reveal the long content
setTimeout(() => {
const longContent = localeData.hint.querySelector('.long-content');
if (longContent) {
longContent.classList.add('show');
}
if (longContent) longContent.classList.add('show');
}, 100);
}
}
async function tooltipShowDelegated(e) { // use event delegation to handle dynamically created elements
if (e.target.dataset && e.target.dataset.hint) tooltipShow(e); // eslint-disable-line no-use-before-define
}
async function tooltipHideDelegated(e) {
if (e.target.dataset && e.target.dataset.hint) tooltipHide(e); // eslint-disable-line no-use-before-define
}
async function tooltipShow(e) {
// Clear any existing expansion timeout
if (localeData.expandTimeout) {
if (localeData.expandTimeout) { // clear any existing expansion timeout
clearTimeout(localeData.expandTimeout);
localeData.expandTimeout = null;
}
// Remove expanded class and reset current element
localeData.hint.classList.remove('tooltip-expanded');
localeData.hint.classList.remove('tooltip-expanded'); // remove expanded class and reset current element
localeData.currentElement = e.target;
if (e.target.dataset.hint) {
// Create progress ring SVG
const progressRing = `
const progressRing = /* create progress ring SVG */ `
<div class="tooltip-progress-ring">
<svg viewBox="0 0 12 12">
<circle class="ring-background" cx="6" cy="6" r="5"></circle>
@ -106,8 +114,7 @@ async function tooltipShow(e) {
</svg>
</div>
`;
// Set up the complete content structure from the start
// set up the complete content structure from the start
let content = `
<div class="tooltip-header">
<b>${e.target.textContent}</b>
@ -116,21 +123,12 @@ async function tooltipShow(e) {
<div class="separator"></div>
${e.target.dataset.hint}
`;
// Add long content if available, but keep it hidden
if (e.target.dataset.longHint) {
content += `<div class="long-content"><div class="separator"></div>${e.target.dataset.longHint}</div>`;
}
// Add reload notice if needed
if (e.target.dataset.reload) {
if (e.target.dataset.longHint) content += `<div class="long-content"><div class="separator"></div>${e.target.dataset.longHint}</div>`; // add long content if available, but keep it hidden
if (e.target.dataset.reload) { // add reload notice if needed
const reloadType = e.target.dataset.reload;
let reloadText = '';
if (reloadType === 'model') {
reloadText = 'Requires model reload';
} else if (reloadType === 'server') {
reloadText = 'Requires server restart';
}
if (reloadType === 'model') reloadText = 'Requires model reload';
else if (reloadType === 'server') reloadText = 'Requires server restart';
if (reloadText) {
content += `
<div class="tooltip-reload-notice">
@ -144,40 +142,28 @@ async function tooltipShow(e) {
localeData.hint.innerHTML = content;
localeData.hint.classList.add('tooltip-show');
if (e.clientX > window.innerWidth / 2) {
localeData.hint.classList.add('tooltip-left');
} else {
localeData.hint.classList.remove('tooltip-left');
}
if (e.clientX > window.innerWidth / 2) localeData.hint.classList.add('tooltip-left');
else localeData.hint.classList.remove('tooltip-left');
// Set up expansion timer if long hint is available
if (e.target.dataset.longHint) {
// Start progress ring animation
const ring = localeData.hint.querySelector('.tooltip-progress-ring');
if (e.target.dataset.longHint) { // set up expansion timer if long hint is available
const ring = localeData.hint.querySelector('.tooltip-progress-ring'); // start progress ring animation
const ringProgress = localeData.hint.querySelector('.ring-progress');
if (ring && ringProgress) {
// Show the ring and start animation
setTimeout(() => {
ring.classList.add('active');
ringProgress.classList.add('animate');
}, 100);
}
localeData.expandTimeout = setTimeout(() => {
expandTooltip(e.target, e.target.dataset.longHint);
}, 3000);
localeData.expandTimeout = setTimeout(() => expandTooltip(e.target, e.target.dataset.longHint), 3000);
}
}
}
async function tooltipHide(e) {
// Clear expansion timeout when hiding
if (localeData.expandTimeout) {
clearTimeout(localeData.expandTimeout);
localeData.expandTimeout = null;
}
localeData.hint.classList.remove('tooltip-show', 'tooltip-expanded');
localeData.currentElement = null;
}
@ -294,8 +280,6 @@ async function setHint(el, entry) {
el.dataset.hint = entry.hint;
if (entry.longHint && entry.longHint.length > 0) el.dataset.longHint = entry.longHint;
if (entry.reload && entry.reload.length > 0) el.dataset.reload = entry.reload;
el.addEventListener('mouseover', tooltipShow);
el.addEventListener('mouseout', tooltipHide);
} else {
// tooltips disabled
}
@ -345,6 +329,7 @@ async function setHints(analyze = false) {
localeData.initial = false;
const t1 = performance.now();
// localeData.btn.style.backgroundColor = localeData.locale !== 'en' ? 'var(--primary-500)' : '';
log('touchDevice', isTouchDevice);
log('setHints', { type: localeData.type, locale: localeData.locale, elements: elements.length, localized, hints, data: localeData.data.length, override: overrideData.length, time: Math.round(t1 - t0) });
// sortUIElements();
if (analyze) {
@ -359,3 +344,80 @@ const analyzeHints = async () => {
localeData.data = [];
await setHints(true);
};
// Apply hints to a single element immediately
async function applyHintToElement(el) {
if (!localeData.data || localeData.data.length === 0) return;
if (!el.textContent) return;
// check if element matches our selector criteria
const isValidElement = el.tagName === 'BUTTON'
|| el.tagName === 'H2'
|| (el.tagName === 'SPAN' && (el.parentElement?.tagName === 'LABEL' || el.parentElement?.classList.contains('label-wrap')));
if (!isValidElement) return;
let found; // find matching hint data
if (el.dataset.original) found = localeData.data.find((l) => l.label.toLowerCase().trim() === el.dataset.original.toLowerCase().trim());
else found = localeData.data.find((l) => l.label.toLowerCase().trim() === el.textContent.toLowerCase().trim());
if (found?.localized?.length > 0) { // apply localization if found
if (!el.dataset.original) el.dataset.original = el.textContent;
replaceTextContent(el, found.localized);
}
if (found?.hint?.length > 0) setHint(el, found); // apply hint if found
}
// Initialize MutationObserver for immediate hint application
function initializeDOMObserver() {
if (localeData.observer) {
localeData.observer.disconnect();
}
localeData.observer = new MutationObserver((mutations) => {
// Process added nodes immediately
for (const mutation of mutations) {
if (mutation.type === 'childList') {
for (const node of mutation.addedNodes) {
if (node.nodeType === Node.ELEMENT_NODE) {
// Apply hints to the node itself
applyHintToElement(node);
// Apply hints to all relevant children
const elements = [
...Array.from(node.querySelectorAll('button')),
...Array.from(node.querySelectorAll('h2')),
...Array.from(node.querySelectorAll('label > span')),
...Array.from(node.querySelectorAll('.label-wrap > span')),
];
// Include the node itself if it matches
if (node.matches && (
node.matches('button')
|| node.matches('h2')
|| node.matches('label > span')
|| node.matches('.label-wrap > span')
)) {
elements.push(node);
}
// Apply hints immediately to all found elements
elements.forEach((el) => applyHintToElement(el));
}
}
}
}
});
// Start observing the entire gradio app for changes
const targetNode = gradioApp();
if (targetNode) {
localeData.observer.observe(targetNode, {
childList: true,
subtree: true,
});
}
}
// Export for external use if needed
const forceReapplyHints = () => setHints();

View File

@ -8,18 +8,18 @@ async function initStartup() {
if (window.setupLogger) await setupLogger();
// all items here are non-blocking async calls
initModels();
getUIDefaults();
initPromptChecker();
initContextMenu();
initDragDrop();
initAccordions();
initSettings();
initImageViewer();
initGallery();
initiGenerationParams();
initChangelog();
setupControlUI();
await initModels();
await getUIDefaults();
await initPromptChecker();
await initContextMenu();
await initDragDrop();
await initAccordions();
await initSettings();
await initImageViewer();
await initGallery();
await initiGenerationParams();
await initChangelog();
await setupControlUI();
// reconnect server session
await reconnectUI();

View File

@ -262,7 +262,6 @@ def main():
installer.check_onnx()
installer.check_transformers()
installer.check_diffusers()
installer.install_sentencepiece()
installer.check_modified_files()
if args.test:
installer.log.info('Startup: test mode')

Multiple binary image files changed; previews not shown.
View File

@ -168,7 +168,7 @@ def atomic_civit_search_metadata(item, results):
# log.error(f'CivitAI search metadata: item={item} {e}')
return
has_meta = os.path.isfile(meta) and os.stat(meta).st_size > 0
if ('card-no-preview.png' in item['preview'] or not has_meta) and os.path.isfile(item['filename']):
if ('missing.png' in item['preview'] or not has_meta) and os.path.isfile(item['filename']):
sha = item.get('hash', None)
found = False
result = {
@ -260,7 +260,7 @@ def civit_search_metadata(title: str = None):
if type(title) == str:
if page.title != title:
continue
if page.name == 'style':
if page.name == 'style' or page.name == 'wildcards':
continue
for item in page.list_items():
if item is None:

View File

@ -200,7 +200,7 @@ def create_model_cards(all_models: list[Model]) -> str:
if image.url and len(image.url) > 0 and not image.url.lower().endswith('.mp4'):
previews.append(image.url)
if len(previews) == 0:
previews = ['/sdapi/v1/network/thumb?filename=html/card-no-preview.png']
previews = ['/sdapi/v1/network/thumb?filename=html/missing.png']
all_cards += card.format(id=model.id, name=model.name, type=model.type, preview=previews[0])
html = details + cards.format(cards=all_cards)
return html

View File

@ -7,6 +7,7 @@ from modules.processing_class import StableDiffusionProcessingControl
from modules import shared, images, masking, sd_models
from modules.timer import process as process_timer
from modules.control import util
from modules.control import processors as control_processors
debug = os.environ.get('SD_CONTROL_DEBUG', None) is not None
@ -108,7 +109,7 @@ def preprocess_image(
if processed_image is not None:
processed_images.append(processed_image)
if shared.opts.control_unload_processor and process.processor_id is not None:
processors.config[process.processor_id]['dirty'] = True # to force reload
control_processors.config[process.processor_id]['dirty'] = True # to force reload
process.model = None
# blend processed images
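Note: setting the `dirty` flag on the processor's config entry defers the actual reload until the processor is next requested, rather than unloading it immediately. A minimal sketch of that lazy-reload pattern; the names `config`, `get_processor`, and `load_model` here are illustrative, not the actual SD.Next API:

```python
# illustrative lazy-reload registry; not the actual SD.Next processor cache
config = {
    'canny': {'class': 'CannyDetector', 'model': None, 'dirty': False},
}

def load_model(class_name: str):
    return f'<{class_name} instance>'  # stand-in for the real constructor

def get_processor(processor_id: str):
    entry = config[processor_id]
    if entry['model'] is None or entry['dirty']:  # rebuild when unloaded or marked dirty
        entry['model'] = load_model(entry['class'])
        entry['dirty'] = False
    return entry['model']

config['canny']['dirty'] = True  # force reload on next use, as in the change above
print(get_processor('canny'))
```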

View File

@ -102,6 +102,15 @@ predefined_sd3 = {
"Alimama Inpainting SD35": 'alimama-creative/SD3-Controlnet-Inpainting',
"Alimama SoftEdge SD35": 'alimama-creative/SD3-Controlnet-Softedge',
}
predefined_qwen = {
"InstantX Union Qwen": 'InstantX/Qwen-Image-ControlNet-Union',
}
predefined_hunyuandit = {
"HunyuanDiT Canny": 'Tencent-Hunyuan/HunyuanDiT-v1.2-ControlNet-Diffusers-Canny',
"HunyuanDiT Pose": 'Tencent-Hunyuan/HunyuanDiT-v1.2-ControlNet-Diffusers-Pose',
"HunyuanDiT Depth": 'Tencent-Hunyuan/HunyuanDiT-v1.2-ControlNet-Diffusers-Depth',
}
variants = {
'NoobAI Canny XL': 'fp16',
'NoobAI Lineart Anime XL': 'fp16',
@ -116,6 +125,8 @@ all_models.update(predefined_sd15)
all_models.update(predefined_sdxl)
all_models.update(predefined_f1)
all_models.update(predefined_sd3)
all_models.update(predefined_qwen)
all_models.update(predefined_hunyuandit)
cache_dir = 'models/control/controlnet'
load_lock = threading.Lock()
@ -150,6 +161,10 @@ def api_list_models(model_type: str = None):
model_list += list(predefined_f1)
if model_type == 'sd3' or model_type == 'all':
model_list += list(predefined_sd3)
if model_type == 'qwen' or model_type == 'all':
model_list += list(predefined_qwen)
if model_type == 'hunyuandit' or model_type == 'all':
model_list += list(predefined_hunyuandit)
model_list += sorted(find_models())
return model_list
@ -170,6 +185,10 @@ def list_models(refresh=False):
models = ['None'] + list(predefined_f1) + sorted(find_models())
elif modules.shared.sd_model_type == 'sd3':
models = ['None'] + list(predefined_sd3) + sorted(find_models())
elif modules.shared.sd_model_type == 'qwen':
models = ['None'] + list(predefined_qwen) + sorted(find_models())
elif modules.shared.sd_model_type == 'hunyuandit':
models = ['None'] + list(predefined_hunyuandit) + sorted(find_models())
else:
log.warning(f'Control {what} model list failed: unknown model type')
models = ['None'] + sorted(predefined_sd15) + sorted(predefined_sdxl) + sorted(predefined_f1) + sorted(predefined_sd3) + sorted(find_models())
@ -222,12 +241,18 @@ class ControlNet():
elif shared.sd_model_type == 'sd3':
from diffusers import SD3ControlNetModel as cls
config = 'InstantX/SD3-Controlnet-Canny'
elif shared.sd_model_type == 'qwen':
from diffusers import QwenImageControlNetModel as cls
config = 'InstantX/Qwen-Image-ControlNet-Union'
elif shared.sd_model_type == 'hunyuandit':
from diffusers import HunyuanDiT2DControlNetModel as cls
config = 'Tencent-Hunyuan/HunyuanDiT-v1.2-ControlNet-Diffusers-Canny'
else:
log.error(f'Control {what}: type={shared.sd_model_type} unsupported model')
return None, None
return cls, config
def load_safetensors(self, model_id, model_path, cls, config):
def load_safetensors(self, model_id, model_path, cls, config): # pylint: disable=unused-argument
name = os.path.splitext(model_path)[0]
config_path = None
if not os.path.exists(model_path):
@ -302,6 +327,7 @@ class ControlNet():
errors.display(e, 'Control')
if self.model is None:
return
self.model.offload_never = True
if self.dtype is not None:
self.model.to(self.dtype)
if "Control" in opts.sdnq_quantize_weights:
@ -422,6 +448,30 @@ class ControlNetPipeline():
controlnet=controlnets, # can be a list
)
sd_models.move_model(self.pipeline, pipeline.device)
elif detect.is_qwen(pipeline) and len(controlnets) > 0:
from diffusers import QwenImageControlNetPipeline
self.pipeline = QwenImageControlNetPipeline(
vae=pipeline.vae,
text_encoder=pipeline.text_encoder,
tokenizer=pipeline.tokenizer,
transformer=pipeline.transformer,
scheduler=pipeline.scheduler,
controlnet=controlnets[0] if isinstance(controlnets, list) else controlnets, # can be a list
)
elif detect.is_hunyuandit(pipeline) and len(controlnets) > 0:
from diffusers import HunyuanDiTControlNetPipeline
self.pipeline = HunyuanDiTControlNetPipeline(
vae=pipeline.vae,
text_encoder=pipeline.text_encoder,
tokenizer=pipeline.tokenizer,
text_encoder_2=pipeline.text_encoder_2,
tokenizer_2=pipeline.tokenizer_2,
transformer=pipeline.transformer,
scheduler=pipeline.scheduler,
safety_checker=None,
feature_extractor=None,
controlnet=controlnets[0] if isinstance(controlnets, list) else controlnets, # can be a list
)
elif len(loras) > 0:
self.pipeline = pipeline
for lora in loras:
@ -442,17 +492,19 @@ class ControlNetPipeline():
if dtype is not None:
self.pipeline = self.pipeline.to(dtype)
controlnet = None # free up memory
controlnets = None
sd_models.copy_diffuser_options(self.pipeline, pipeline)
if opts.diffusers_offload_mode == 'none':
sd_models.move_model(self.pipeline, devices.device)
from modules.sd_models import set_diffuser_offload
set_diffuser_offload(self.pipeline, 'model')
sd_models.clear_caches()
sd_models.set_diffuser_offload(self.pipeline, 'model')
t1 = time.time()
debug_log(f'Control {what} pipeline: class={self.pipeline.__class__.__name__} time={t1-t0:.2f}')
def restore(self):
if self.pipeline is not None:
if self.pipeline is not None and hasattr(self.pipeline, 'unload_lora_weights'):
self.pipeline.unload_lora_weights()
self.pipeline = None
return self.orig_pipeline
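For context, the new Qwen-Image ControlNet path mirrors how the same models are driven through diffusers directly. A minimal standalone sketch, assuming a diffusers release with Qwen-Image ControlNet support; the generation parameters and file names are illustrative:

```python
import torch
from diffusers import QwenImageControlNetModel, QwenImageControlNetPipeline
from diffusers.utils import load_image

# load the union controlnet and attach it to the base qwen-image pipeline
controlnet = QwenImageControlNetModel.from_pretrained(
    'InstantX/Qwen-Image-ControlNet-Union', torch_dtype=torch.bfloat16)
pipe = QwenImageControlNetPipeline.from_pretrained(
    'Qwen/Qwen-Image', controlnet=controlnet, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # qwen-image is large; offloading is recommended

control_image = load_image('control-canny.png')  # pre-processed control input
result = pipe(
    prompt='a cozy cabin in a snowy forest',
    control_image=control_image,
    controlnet_conditioning_scale=1.0,
    num_inference_steps=30,
).images[0]
result.save('output.png')
```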

View File

@ -20,3 +20,11 @@ def is_f1(model):
def is_sd3(model):
return is_compatible(model, pattern='StableDiffusion3Pipeline')
def is_qwen(model):
return is_compatible(model, pattern='Qwen')
def is_hunyuandit(model):
return is_compatible(model, pattern='HunyuanDiT')
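The `is_compatible` helper referenced here is defined earlier in this module and is not part of the diff; conceptually it matches the loaded pipeline's class name against the given pattern. A hedged sketch of that idea, not the actual implementation:

```python
def is_compatible_sketch(model, pattern: str) -> bool:
    # illustrative only: match when the pipeline class name contains the pattern
    return model is not None and pattern in model.__class__.__name__

class QwenImagePipeline:  # stand-in pipeline class for demonstration
    pass

print(is_compatible_sketch(QwenImagePipeline(), 'Qwen'))  # True
```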

View File

@ -134,7 +134,7 @@ class Script(scripts_manager.Script):
app = get_app('buffalo_l')
from modules.face.faceid import face_id
processed_images = face_id(p, app=app, source_images=input_images, model=ip_model, override=ip_override, cache=ip_cache, scale=ip_strength, structure=ip_structure) # run faceid pipeline
processed = processing.Processed(p, images_list=processed_images, seed=p.seed, subseed=p.subseed, index_of_first_image=0) # manually created processed object
processed = processing.get_processed(p, images_list=processed_images, seed=p.seed, subseed=p.subseed, index_of_first_image=0) # manually created processed object
elif mode == 'PhotoMaker': # photomaker creates pipeline and triggers original process_images
from modules.face.insightface import get_app
app = get_app('buffalo_l')

View File

@ -40,7 +40,7 @@ def create_ui(prompt, negative, styles, _overrides):
with gr.Accordion(label="Video", open=False):
with gr.Row():
mp4_codec = gr.Dropdown(label="FP codec", choices=['none', 'libx264'], value='libx264', type='value')
ui_common.create_refresh_button(mp4_codec, get_codecs)
ui_common.create_refresh_button(mp4_codec, get_codecs, elem_id="framepack_mp4_codec_refresh")
mp4_ext = gr.Textbox(label="FP format", value='mp4', elem_id="framepack_mp4_ext")
mp4_opt = gr.Textbox(label="FP options", value='crf:16', elem_id="framepack_mp4_opt")
with gr.Row():
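The refresh button now carries an explicit `elem_id`, since gradio element ids should be unique for frontend bindings to resolve reliably. A minimal hedged sketch of the pattern outside SD.Next; the component wiring and `list_codecs` stand-in are illustrative:

```python
import gradio as gr

def list_codecs():
    return gr.update(choices=['none', 'libx264'])  # stand-in for get_codecs()

with gr.Blocks() as demo:
    codec = gr.Dropdown(label='FP codec', choices=['none', 'libx264'], value='libx264')
    refresh = gr.Button('Refresh', elem_id='framepack_mp4_codec_refresh')  # unique id per control
    refresh.click(fn=list_codecs, outputs=codec)
```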

Some files were not shown because too many files have changed in this diff.