diff --git a/.pylintrc b/.pylintrc index a24604e3c..5bd9493dd 100644 --- a/.pylintrc +++ b/.pylintrc @@ -38,6 +38,7 @@ ignore-paths=/usr/lib/.*$, pipelines/flex2, pipelines/f_lite, pipelines/hidream, + pipelines/hdm, pipelines/meissonic, pipelines/omnigen2, pipelines/segmoe, diff --git a/.ruff.toml b/.ruff.toml index d6c39bb1a..f5b4d3c2b 100644 --- a/.ruff.toml +++ b/.ruff.toml @@ -20,6 +20,7 @@ exclude = [ "pipelines/meissonic", "pipelines/omnigen2", + "pipelines/hdm", "pipelines/segmoe", "scripts/lbm", diff --git a/CHANGELOG.md b/CHANGELOG.md index c6d5695b0..9bb328452 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,62 @@ # Change Log for SD.Next +## Update for 2025-08-30 + +- **Models** + - **Chroma** final versions: [Chroma1-HD](https://huggingface.co/lodestones/Chroma1-HD), [Chroma1-Base](https://huggingface.co/lodestones/Chroma1-Base) and [Chroma1-Flash](https://huggingface.co/lodestones/Chroma1-Flash) + - **Qwen-Image** [InstantX ControlNet Union](https://huggingface.co/InstantX/Qwen-Image-ControlNet-Union) support + *note* qwen-image is already a very large model and controlnet adds 3.5GB on top of that so quantization and offloading are highly recommended! 
+ - [Nunchaku-Qwen-Image-Lightning](https://huggingface.co/nunchaku-tech/nunchaku-qwen-image)
+ if you have a compatible NVIDIA GPU, Nunchaku is the fastest quantization engine, currently available for Flux.1, SANA and Qwen-Image models
+ *note*: the release version of `nunchaku==0.3.2` does NOT include support, so you need to build [nunchaku](https://nunchaku.tech/docs/nunchaku/installation/installation.html) from source
+ - [HunyuanDiT ControlNet](https://huggingface.co/Tencent-Hunyuan/HYDiT-ControlNet-v1.2) Canny, Depth, Pose
+ - [KBlueLeaf/HDM-xut-340M-anime](https://huggingface.co/KBlueLeaf/HDM-xut-340M-anime)
+ highly experimental: HDM *Home-made-Diffusion-Model* is a project investigating a specialized training recipe/scheme for pretraining a T2I model at home, based on a super-light architecture
+ requires: generator=cpu, dtype=float16, offload=none
+ - updated [SD.Next Model Samples Gallery](https://vladmandic.github.io/sd-samples/compare.html)
+- **UI**
+ - default to **ModernUI**
+ standard UI is still available via *settings -> user interface -> theme type*
+ - mobile-friendly!
+ - make hints touch-friendly: hold touch to display hint
+ - improved image scaling in img2img and control interfaces
+ - add base model type to networks display, thanks @Artheriax
+ - additional hints to ui, thanks @Artheriax
+ - add video support to gallery, thanks @CalamitousFelicitousness
+ - additional artwork for reference models in networks, thanks @liutyi
+ - improve ui hints display
+ - restyled all toolbuttons to be modernui native
+ - reordered system settings
+ - configurable horizontal vs vertical panel layout
+ in settings -> user interface -> panel min width
+ *example*: if panel width is less than the specified value, layout switches to vertical
+ - configurable grid image size
+ in *settings -> user interface -> grid image size*
+- **Offloading**
+ - enable offload during pre-forward by default
+ - improve offloading of models with multiple DiTs
+ - improve offloading of models with implicit vae processing
+ - improve offloading of models with controlnet
+- **SDNQ**
+ - add quantized matmul support for all quantization types and group sizes
+- **Other**
+ - refactor reuse-seed and add functionality to all tabs
+- **Fixes**
+ - normalize path handling when deleting images
+ - remove samplers filtering
+ - fix hidden model tags in networks display
+ - fix networks reference models display on windows
+ - fix handling of pre-quantized `flux` models
+ - fix `wan`: use correct pipeline for i2v models
+ - fix `qwen-image` with hires
+ - fix `omnigen-2` failure
+ - fix `auraflow` quantization
+ - fix `kandinsky-3` noise
+ - fix `infiniteyou` pipeline offloading
+ - fix `skyreels-v2` image-to-video
+ - fix `flex2` img2img denoising strength
+ - fix segfault on startup with rocm 6.4.3 and torch 2.8
+

## Update for 2025-08-20

A quick service release with several important hotfixes, improved localization support and adding new **Qwen** model variants...
diff --git a/README.md b/README.md index 1095c1c20..e37f77479 100644 --- a/README.md +++ b/README.md @@ -43,12 +43,15 @@ All individual features are not listed here, instead check [ChangeLog](CHANGELOG
-*Main interface using **StandardUI***: -![screenshot-standardui](https://github.com/user-attachments/assets/cab47fe3-9adb-4d67-aea9-9ee738df5dcc) +**Desktop** interface +
+screenshot-modernui-desktop +
-*Main interface using **ModernUI***: - -![screenshot-modernui](https://github.com/user-attachments/assets/39e3bc9a-a9f7-4cda-ba33-7da8def08032) +**Mobile** interface +
+screenshot-modernui-mobile +
For screenshots and information on other available themes, see [Themes](https://vladmandic.github.io/sdnext-docs/Themes/)

diff --git a/TODO.md b/TODO.md
index 2b99c8c72..c480e4f67 100644
--- a/TODO.md
+++ b/TODO.md
@@ -7,19 +7,16 @@ Main ToDo list can be found at [GitHub projects](https://github.com/users/vladma
 - Remote TE
 - Mobile ModernUI
 - [Canvas](https://konvajs.org/)
-- [Modular pipelines and guiders](https://github.com/huggingface/diffusers/issues/11915)
 - Refactor: Sampler options
 - Refactor: [GGUF](https://huggingface.co/docs/diffusers/main/en/quantization/gguf)
 - Feature: Diffusers [group offloading](https://github.com/vladmandic/sdnext/issues/4049)
-- Feature: Common repo for `T5` and `CLiP`
 - Feature: LoRA add OMI format support for SD35/FLUX.1
 - Video: Generic API support
 - Video: LTX TeaCache and others
 - Video: LTX API
 - Video: LTX PromptEnhance
 - Video: LTX Conditioning preprocess
-- [WanAI-2.1 VACE](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B)(https://github.com/huggingface/diffusers/pull/11582)
 - [Cosmos-Predict2-Video](https://huggingface.co/nvidia/Cosmos-Predict2-2B-Video2World)(https://github.com/huggingface/diffusers/pull/11695)

### Blocked items

@@ -30,6 +27,8 @@ Main ToDo list can be found at [GitHub projects](https://github.com/users/vladma

### Under Consideration

+- [X-Omni](https://github.com/X-Omni-Team/X-Omni/blob/main/README.md)
+- [DiffSynth Studio](https://github.com/modelscope/DiffSynth-Studio)
 - [IPAdapter negative guidance](https://github.com/huggingface/diffusers/discussions/7167)
 - [IPAdapter composition](https://huggingface.co/ostris/ip-composition-adapter)
 - [STG](https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#spatiotemporal-skip-guidance)
diff --git a/cli/localize.js b/cli/localize.js
index e1b0aba5b..da870d455 100755
--- a/cli/localize.js
+++ b/cli/localize.js
@@ -8,6 +8,7 @@ const { GoogleGenerativeAI } = require('@google/generative-ai');
 const api_key = process.env.GOOGLE_AI_API_KEY;
const model = 'gemini-2.5-flash'; const prompt = ` +// eslint-disable-next-line max-len Translate attached JSON from English to {language} using following rules: fields id, label and reload should be preserved from original, field localized should be a translated version of field label and field hint should be translated in-place. if field is less than 3 characters, do not translate it and keep it as is. Every JSON entry should have id, label, localized, reload and hint fields. Output should be pure JSON without any additional text. To better match translation, context of the text is related to Stable Diffusion and topic of Generative AI.`; const languages = { hr: 'Croatian', diff --git a/extensions-builtin/sdnext-modernui b/extensions-builtin/sdnext-modernui index da4ccd4aa..3a5df9fc0 160000 --- a/extensions-builtin/sdnext-modernui +++ b/extensions-builtin/sdnext-modernui @@ -1 +1 @@ -Subproject commit da4ccd4aa75e3b42937674ba23d406a02783df4f +Subproject commit 3a5df9fc03c1d61d7d70413f7a2f78a4b4552ae2 diff --git a/html/amethyst-nightfall.jpg b/html/amethyst-nightfall.jpg deleted file mode 100644 index 216f2f765..000000000 Binary files a/html/amethyst-nightfall.jpg and /dev/null differ diff --git a/html/black-orange.jpg b/html/black-orange.jpg deleted file mode 100644 index 90c4bca0a..000000000 Binary files a/html/black-orange.jpg and /dev/null differ diff --git a/html/black-teal.jpg b/html/black-teal.jpg deleted file mode 100644 index 06f604f86..000000000 Binary files a/html/black-teal.jpg and /dev/null differ diff --git a/html/emerald-paradise.jpg b/html/emerald-paradise.jpg deleted file mode 100644 index 73a4b7a5f..000000000 Binary files a/html/emerald-paradise.jpg and /dev/null differ diff --git a/html/gradio-base.jpg b/html/gradio-base.jpg deleted file mode 100644 index 97a29c107..000000000 Binary files a/html/gradio-base.jpg and /dev/null differ diff --git a/html/gradio-default.jpg b/html/gradio-default.jpg deleted file mode 100644 index bb9097be4..000000000 
Binary files a/html/gradio-default.jpg and /dev/null differ diff --git a/html/gradio-glass.jpg b/html/gradio-glass.jpg deleted file mode 100644 index 984a509bf..000000000 Binary files a/html/gradio-glass.jpg and /dev/null differ diff --git a/html/gradio-monochrome.jpg b/html/gradio-monochrome.jpg deleted file mode 100644 index ab2134fb4..000000000 Binary files a/html/gradio-monochrome.jpg and /dev/null differ diff --git a/html/gradio-soft.jpg b/html/gradio-soft.jpg deleted file mode 100644 index 1ba487a7c..000000000 Binary files a/html/gradio-soft.jpg and /dev/null differ diff --git a/html/image-update.svg b/html/image-update.svg deleted file mode 100644 index 3abf12df0..000000000 --- a/html/image-update.svg +++ /dev/null @@ -1,7 +0,0 @@ - - - - - - - diff --git a/html/invoked.jpg b/html/invoked.jpg deleted file mode 100644 index fd0c1a965..000000000 Binary files a/html/invoked.jpg and /dev/null differ diff --git a/html/light-teal.jpg b/html/light-teal.jpg deleted file mode 100644 index f09546a02..000000000 Binary files a/html/light-teal.jpg and /dev/null differ diff --git a/html/locale_en.json b/html/locale_en.json index 04dc20618..8113a811b 100644 --- a/html/locale_en.json +++ b/html/locale_en.json @@ -32,6 +32,7 @@ {"id":"","label":"","localized":"","reload":"","hint":"Sort by time, descending"} ], "main": [ + {"id":"","label":"SD.Next","localized":"","reload":"","hint":"SD.Next
All-in-one WebUI for AI generative image and video creation"}, {"id":"","label":"Prompt","localized":"","reload":"","hint":"Describe image you want to generate"}, {"id":"","label":"Start","localized":"","reload":"","hint":"Start"}, {"id":"","label":"End","localized":"","reload":"","hint":"End"}, @@ -41,10 +42,14 @@ {"id":"","label":"Text","localized":"","reload":"","hint":"Create image from text"}, {"id":"","label":"Image","localized":"","reload":"","hint":"Create image from image"}, {"id":"","label":"Control","localized":"","reload":"","hint":"Create image with full guidance"}, - {"id":"","label":"Process","localized":"","reload":"","hint":"Process existing image"}, + {"id":"","label":"Images","localized":"","reload":"","hint":"Create images
Unified interface
Supports T2I and I2I
With optional control guidance"}, + {"id":"","label":"T2I","localized":"","reload":"","hint":"Create image from text
Legacy interface that mimics original text-to-image interface and behavior"}, + {"id":"","label":"I2I","localized":"","reload":"","hint":"Create image from image
Legacy interface that mimics original image-to-image interface and behavior"}, + {"id":"","label":"Process","localized":"","reload":"","hint":"Process existing image
Can be used to upscale images, remove backgrounds, obfuscate NSFW content, apply various filters and effects"}, {"id":"","label":"Caption","localized":"","reload":"","hint":"Analyze existing images and create text descriptions"}, {"id":"","label":"Interrogate","localized":"","reload":"","hint":"Run interrogate to get description of your image"}, {"id":"","label":"Models","localized":"","reload":"","hint":"Download, convert or merge your models and manage models metadata"}, + {"id":"","label":"Sampler","localized":"","reload":"","hint":"Settings related to sampler and seed selection and configuration. Samplers guide the process of turning noise into an image over multiple steps."}, {"id":"","label":"Agent Scheduler","localized":"","reload":"","hint":"Enqueue your generate requests and run them in the background"}, {"id":"","label":"AgentScheduler","localized":"","reload":"","hint":"Enqueue your generate requests and run them in the background"}, {"id":"","label":"System","localized":"","reload":"","hint":"System settings and information"}, @@ -97,7 +102,7 @@ {"id":"","label":"Denoise","localized":"","reload":"","hint":"Denoising settings. Higher denoise means that more of existing image content is allowed to change during generate"}, {"id":"","label":"Mask","localized":"","reload":"","hint":"Image masking and mask options"}, {"id":"","label":"Input","localized":"","reload":"","hint":"Selection of input media"}, - {"id":"","label":"Video","localized":"","reload":"","hint":"Create video using guidance"}, + {"id":"","label":"Video","localized":"","reload":"","hint":"Create videos using different methods
Supports text-to-video, image-to-video, first-last-frame, etc."},
 {"id":"","label":"Control elements","localized":"","reload":"","hint":"Control elements are advanced models that can guide generation towards desired outcome"},
 {"id":"","label":"IP adapter","localized":"","reload":"","hint":"Guide generation towards desired outcome using IP adapters plugin models"},
 {"id":"","label":"IP adapters","localized":"","reload":"","hint":"IP adapters are plugin models that can guide generation towards desired outcome"},
@@ -192,7 +197,14 @@
 {"id":"","label":"Control Only","localized":"","reload":"","hint":"This uses only the Control input below as the source for any ControlNet or IP Adapter type tasks based on any of our various options."},
 {"id":"","label":"Init Image Same As Control","localized":"","reload":"","hint":"Will additionally treat any image placed into the Control input window as a source for img2img type tasks, an image to modify for example."},
 {"id":"","label":"Separate Init Image","localized":"","reload":"","hint":"Creates an additional window next to Control input labeled Init input, so you can have a separate image for both Control operations and an init source."},
- {"id":"","label":"Override settings","localized":"","reload":"","hint":"If generation parameters deviate from your system settings override settings populated with those settings to override your system configuration for this workflow"}
+ {"id":"","label":"Override settings","localized":"","reload":"","hint":"If generation parameters deviate from your system settings, override settings are populated with those values to override your system configuration for this workflow"},
+ {"id":"","label":"sigma method","localized":"","reload":"","hint":"Controls how noise levels (sigmas) are distributed across diffusion steps. 
Options:\n- default: the model default\n- karras: smoother noise schedule, higher quality with fewer steps\n- beta: based on beta schedule values\n- exponential: exponential decay of noise\n- lambdas: experimental, balances signal-to-noise\n- flowmatch: tuned for flow-matching models"},
+ {"id":"","label":"timestep spacing","localized":"","reload":"","hint":"Determines how timesteps are spaced across the diffusion process. Options:\n- default: the model default\n- leading: creates evenly spaced steps\n- linspace: includes the first and last steps and evenly selects the remaining intermediate steps\n- trailing: only includes the last step and evenly selects the remaining intermediate steps starting from the end"},
+ {"id":"","label":"beta schedule","localized":"","reload":"","hint":"Defines how beta (noise strength per step) grows. Options:\n- default: the model default\n- linear: evenly decays noise per step\n- scaled: squared version of linear, used only by Stable Diffusion\n- cosine: smoother decay, often better results with fewer steps\n- sigmoid: sharp transition, experimental"},
+ {"id":"","label":"prediction method","localized":"","reload":"","hint":"Defines what the model predicts at each step. Options:\n- default: the model default\n- epsilon: noise (most common for Stable Diffusion)\n- sample: direct denoised image prediction, also called x0 prediction\n- v_prediction: velocity prediction, used by CosXL and NoobAI VPred models\n- flow_prediction: used with newer flow-matching models like SD3 and Flux"},
+ {"id":"","label":"sampler order","localized":"","reload":"","hint":"Order of solver updates in the sampler. Higher order improves stability/accuracy but increases compute cost."},
+ {"id":"","label":"flow shift","localized":"","reload":"","hint":"Adjustment for flow-based samplers. 
Shifts noise distribution during generation, useful for fine-tuning balance between detail and consistency."}, + {"id":"","label":"resize mode","localized":"","reload":"","hint":"Defines how the input is resized or adapted in second-pass refinement:\n- none: no resizing, keep original resolution\n- fixed: force resize to target resolution (may distort)\n- crop: center-crop to fit target while keeping aspect ratio\n- fill: resize to fit and pad empty space with borders\n- outpaint: extend canvas beyond image borders\n- context aware: smart resize that blends or adapts surrounding areas"} ], "other": [ {"id":"","label":"Install","localized":"","reload":"","hint":"Install"}, @@ -358,15 +370,26 @@ {"id":"","label":"ONNX allow fallback to CPU","localized":"","reload":"","hint":"Allow fallback to CPU when selected execution provider failed"}, {"id":"","label":"ONNX cache converted models","localized":"","reload":"","hint":"Save the models that are converted to ONNX format as a cache. You can manage them in ONNX tab"}, {"id":"","label":"ONNX unload base model when processing refiner","localized":"","reload":"","hint":"Unload base model when the refiner is being converted/optimized/processed"}, - {"id":"","label":"Inference-mode","localized":"","reload":"","hint":"Use torch.inference_mode"}, - {"id":"","label":"no-grad","localized":"","reload":"","hint":"Use torch.no_grad"}, {"id":"","label":"Model compile precompile","localized":"","reload":"","hint":"Run model compile immediately on model load instead of first use"}, {"id":"","label":"Use zeros for prompt padding","localized":"","reload":"","hint":"Force full zero tensor when prompt is empty to remove any residual noise"}, {"id":"","label":"Include invisible watermark","localized":"","reload":"","hint":"Add invisible watermark to image by altering some pixel values"}, {"id":"","label":"invisible watermark string","localized":"","reload":"","hint":"Watermark string to add to image. 
Keep very short to avoid image corruption."}, {"id":"","label":"show log view","localized":"","reload":"","hint":"Show log view at the bottom of the main window"}, {"id":"","label":"Log view update period","localized":"","reload":"","hint":"Log view update period, in milliseconds"}, - {"id":"","label":"PAG layer names","localized":"","reload":"","hint":"Space separated list of layers
Available: d[0-5], m[0], u[0-8]
Default: m0"} + {"id":"","label":"PAG layer names","localized":"","reload":"","hint":"Space separated list of layers
Available: d[0-5], m[0], u[0-8]
Default: m0"}, + {"id":"","label":"prompt attention normalization","localized":"","reload":"","hint":"Balances prompt token weights to avoid overly strong/weak influence. Helps stabilize outputs."}, + {"id":"","label":"ck flash attention","localized":"","reload":"","hint":"Custom Flash Attention kernel. Very fast, but may be unstable or hardware-dependent."}, + {"id":"","label":"flash attention","localized":"","reload":"","hint":"Highly optimized attention algorithm. Greatly reduces VRAM use and speeds up inference, but can be non-deterministic."}, + {"id":"","label":"memory attention","localized":"","reload":"","hint":"Uses less VRAM by chunking attention computation. Slower but allows bigger batches or images."}, + {"id":"","label":"math attention","localized":"","reload":"","hint":"Fallback pure-math attention implementation. Stable and predictable but very slow."}, + {"id":"","label":"dynamic attention","localized":"","reload":"","hint":"Adjusts attention computation dynamically per step. Saves VRAM but slows generation."}, + {"id":"","label":"sage attention","localized":"","reload":"","hint":"Experimental attention optimization method. May improve speed but less tested and can cause bugs."}, + {"id":"","label":"batch matrix-matrix","localized":"","reload":"","hint":"Standard batched matrix multiplication for attention. Reliable but not VRAM-efficient."}, + {"id":"","label":"split attention","localized":"","reload":"","hint":"Splits attention layers into smaller chunks. Helps with very large images at the cost of slower inference."}, + {"id":"","label":"deterministic mode","localized":"","reload":"","hint":"Forces deterministic output across runs. Useful for reproducibility, but may disable some optimizations."}, + {"id":"","label":"no-grad","localized":"","reload":"","hint":"Disables gradient tracking with torch.no_grad. 
Reduces memory usage and speeds up inference."},
+ {"id":"","label":"Inference-mode","localized":"","reload":"","hint":"Like no-grad but stricter. Ensures model runs only in inference mode for safety and speed."},
+ {"id":"","label":"cudamallocasync","localized":"","reload":"","hint":"Uses CUDA async memory allocator. Improves performance and reduces VRAM fragmentation, but may cause instability on some GPUs."}
 ],
 "missing": [
 {"id":"","label":"1st stage","localized":"","reload":"","hint":"1st stage"},
@@ -455,7 +478,6 @@
 {"id":"","label":"batch interogate","localized":"","reload":"","hint":"batch interogate"},
 {"id":"","label":"batch interrogate","localized":"","reload":"","hint":"batch interrogate"},
 {"id":"","label":"batch mask directory","localized":"","reload":"","hint":"batch mask directory"},
- {"id":"","label":"batch matrix-matrix","localized":"","reload":"","hint":"batch matrix-matrix"},
 {"id":"","label":"batch mode uses sequential seeds","localized":"","reload":"","hint":"batch mode uses sequential seeds"},
 {"id":"","label":"batch output directory","localized":"","reload":"","hint":"batch output directory"},
 {"id":"","label":"batch uses original name","localized":"","reload":"","hint":"batch uses original name"},
@@ -466,7 +488,6 @@
 {"id":"","label":"beta block weight preset","localized":"","reload":"","hint":"beta block weight preset"},
 {"id":"","label":"beta end","localized":"","reload":"","hint":"beta end"},
 {"id":"","label":"beta ratio","localized":"","reload":"","hint":"beta ratio"},
- {"id":"","label":"beta schedule","localized":"","reload":"","hint":"beta schedule"},
 {"id":"","label":"beta start","localized":"","reload":"","hint":"beta start"},
 {"id":"","label":"bh1","localized":"","reload":"","hint":"bh1"},
 {"id":"","label":"bh2","localized":"","reload":"","hint":"bh2"},
@@ -495,7 +516,6 @@
 {"id":"","label":"chunk size","localized":"","reload":"","hint":"chunk size"},
 {"id":"","label":"civitai model type","localized":"","reload":"","hint":"civitai model 
type"}, {"id":"","label":"civitai token","localized":"","reload":"","hint":"civitai token"}, - {"id":"","label":"ck flash attention","localized":"","reload":"","hint":"ck flash attention"}, {"id":"","label":"ckpt","localized":"","reload":"","hint":"ckpt"}, {"id":"","label":"cleanup temporary folder on startup","localized":"","reload":"","hint":"cleanup temporary folder on startup"}, {"id":"","label":"clip model","localized":"","reload":"","hint":"clip model"}, @@ -563,7 +583,6 @@ {"id":"","label":"create zip archive","localized":"","reload":"","hint":"create zip archive"}, {"id":"","label":"cross-attention","localized":"","reload":"","hint":"cross-attention"}, {"id":"","label":"cudagraphs","localized":"","reload":"","hint":"cudagraphs"}, - {"id":"","label":"cudamallocasync","localized":"","reload":"","hint":"cudamallocasync"}, {"id":"","label":"custom pipeline","localized":"","reload":"","hint":"custom pipeline"}, {"id":"","label":"dark","localized":"","reload":"","hint":"dark"}, {"id":"","label":"dc solver","localized":"","reload":"","hint":"dc solver"}, @@ -591,7 +610,6 @@ {"id":"","label":"depth threshold","localized":"","reload":"","hint":"depth threshold"}, {"id":"","label":"description","localized":"","reload":"","hint":"description"}, {"id":"","label":"details","localized":"","reload":"","hint":"details"}, - {"id":"","label":"deterministic mode","localized":"","reload":"","hint":"deterministic mode"}, {"id":"","label":"device info","localized":"","reload":"","hint":"device info"}, {"id":"","label":"diffusers","localized":"","reload":"","hint":"diffusers"}, {"id":"","label":"dilate","localized":"","reload":"","hint":"dilate"}, @@ -635,7 +653,6 @@ {"id":"","label":"duration","localized":"","reload":"","hint":"duration"}, {"id":"","label":"dwpose","localized":"","reload":"","hint":"dwpose"}, {"id":"","label":"dynamic","localized":"","reload":"","hint":"dynamic"}, - {"id":"","label":"dynamic attention","localized":"","reload":"","hint":"dynamic attention"}, 
{"id":"","label":"dynamic attention slicing rate in gb","localized":"","reload":"","hint":"dynamic attention slicing rate in gb"}, {"id":"","label":"dynamic attention trigger rate in gb","localized":"","reload":"","hint":"dynamic attention trigger rate in gb"}, {"id":"","label":"edge","localized":"","reload":"","hint":"edge"}, @@ -682,9 +699,7 @@ {"id":"","label":"filename","localized":"","reload":"","hint":"filename"}, {"id":"","label":"first-block cache enabled","localized":"","reload":"","hint":"first-block cache enabled"}, {"id":"","label":"fixed unet precision","localized":"","reload":"","hint":"fixed unet precision"}, - {"id":"","label":"flash attention","localized":"","reload":"","hint":"flash attention"}, {"id":"","label":"flavors","localized":"","reload":"","hint":"flavors"}, - {"id":"","label":"flow shift","localized":"","reload":"","hint":"flow shift"}, {"id":"","label":"folder","localized":"","reload":"","hint":"folder"}, {"id":"","label":"folder for control generate","localized":"","reload":"","hint":"folder for control generate"}, {"id":"","label":"folder for control grids","localized":"","reload":"","hint":"folder for control grids"}, @@ -848,7 +863,6 @@ {"id":"","label":"mask only","localized":"","reload":"","hint":"mask only"}, {"id":"","label":"mask strength","localized":"","reload":"","hint":"mask strength"}, {"id":"","label":"masked","localized":"","reload":"","hint":"masked"}, - {"id":"","label":"math attention","localized":"","reload":"","hint":"math attention"}, {"id":"","label":"max faces","localized":"","reload":"","hint":"max faces"}, {"id":"","label":"max flavors","localized":"","reload":"","hint":"max flavors"}, {"id":"","label":"max guidance","localized":"","reload":"","hint":"max guidance"}, @@ -866,7 +880,6 @@ {"id":"","label":"medium","localized":"","reload":"","hint":"medium"}, {"id":"","label":"mediums","localized":"","reload":"","hint":"mediums"}, {"id":"","label":"memory","localized":"","reload":"","hint":"memory"}, - 
{"id":"","label":"memory attention","localized":"","reload":"","hint":"memory attention"}, {"id":"","label":"memory limit","localized":"","reload":"","hint":"memory limit"}, {"id":"","label":"memory optimization","localized":"","reload":"","hint":"memory optimization"}, {"id":"","label":"merge alpha","localized":"","reload":"","hint":"merge alpha"}, @@ -987,7 +1000,6 @@ {"id":"","label":"postprocessing operation order","localized":"","reload":"","hint":"postprocessing operation order"}, {"id":"","label":"power","localized":"","reload":"","hint":"power"}, {"id":"","label":"predefined question","localized":"","reload":"","hint":"predefined question"}, - {"id":"","label":"prediction method","localized":"","reload":"","hint":"prediction method"}, {"id":"","label":"preset","localized":"","reload":"","hint":"preset"}, {"id":"","label":"preset block merge","localized":"","reload":"","hint":"preset block merge"}, {"id":"","label":"preview","localized":"","reload":"","hint":"preview"}, @@ -998,7 +1010,6 @@ {"id":"","label":"processor move to cpu after use","localized":"","reload":"","hint":"processor move to cpu after use"}, {"id":"","label":"processor settings","localized":"","reload":"","hint":"processor settings"}, {"id":"","label":"processor unload after use","localized":"","reload":"","hint":"processor unload after use"}, - {"id":"","label":"prompt attention normalization","localized":"","reload":"","hint":"prompt attention normalization"}, {"id":"","label":"prompt ex","localized":"","reload":"","hint":"prompt ex"}, {"id":"","label":"prompt processor","localized":"","reload":"","hint":"prompt processor"}, {"id":"","label":"prompt strength","localized":"","reload":"","hint":"prompt strength"}, @@ -1035,13 +1046,12 @@ {"id":"","label":"reprocess face","localized":"","reload":"","hint":"reprocess face"}, {"id":"","label":"reprocess refine","localized":"","reload":"","hint":"reprocess refine"}, {"id":"","label":"request browser 
notifications","localized":"","reload":"","hint":"request browser notifications"}, - {"id":"","label":"rescale","localized":"","reload":"","hint":"rescale"}, + {"id":"","label":"rescale","localized":"","reload":"","hint":"rescale betas with zero terminal snr"}, {"id":"","label":"rescale betas with zero terminal snr","localized":"","reload":"","hint":"rescale betas with zero terminal snr"}, {"id":"","label":"reset anchors","localized":"","reload":"","hint":"reset anchors"}, {"id":"","label":"residual diff threshold","localized":"","reload":"","hint":"residual diff threshold"}, {"id":"","label":"resize background color","localized":"","reload":"","hint":"resize background color"}, {"id":"","label":"resize method","localized":"","reload":"","hint":"resize method"}, - {"id":"","label":"resize mode","localized":"","reload":"","hint":"resize mode"}, {"id":"","label":"resize scale","localized":"","reload":"","hint":"resize scale"}, {"id":"","label":"restart step","localized":"","reload":"","hint":"restart step"}, {"id":"","label":"restore faces: codeformer","localized":"","reload":"","hint":"restore faces: codeformer"}, @@ -1057,13 +1067,11 @@ {"id":"","label":"run benchmark","localized":"","reload":"","hint":"run benchmark"}, {"id":"","label":"sa solver","localized":"","reload":"","hint":"sa solver"}, {"id":"","label":"safetensors","localized":"","reload":"","hint":"safetensors"}, - {"id":"","label":"sage attention","localized":"","reload":"","hint":"sage attention"}, {"id":"","label":"same as primary","localized":"","reload":"","hint":"same as primary"}, {"id":"","label":"same latent","localized":"","reload":"","hint":"same latent"}, {"id":"","label":"sample","localized":"","reload":"","hint":"sample"}, {"id":"","label":"sampler","localized":"","reload":"","hint":"sampler"}, {"id":"","label":"sampler dynamic shift","localized":"","reload":"","hint":"sampler dynamic shift"}, - {"id":"","label":"sampler order","localized":"","reload":"","hint":"sampler order"}, 
{"id":"","label":"sampler shift","localized":"","reload":"","hint":"sampler shift"}, {"id":"","label":"sana: use complex human instructions","localized":"","reload":"","hint":"sana: use complex human instructions"}, {"id":"","label":"saturation","localized":"","reload":"","hint":"saturation"}, @@ -1129,7 +1137,6 @@ {"id":"","label":"sigma","localized":"","reload":"","hint":"sigma"}, {"id":"","label":"sigma churn","localized":"","reload":"","hint":"sigma churn"}, {"id":"","label":"sigma max","localized":"","reload":"","hint":"sigma max"}, - {"id":"","label":"sigma method","localized":"","reload":"","hint":"sigma method"}, {"id":"","label":"sigma min","localized":"","reload":"","hint":"sigma min"}, {"id":"","label":"sigma noise","localized":"","reload":"","hint":"sigma noise"}, {"id":"","label":"sigma tmin","localized":"","reload":"","hint":"sigma tmin"}, @@ -1148,7 +1155,6 @@ {"id":"","label":"spatial frequency","localized":"","reload":"","hint":"spatial frequency"}, {"id":"","label":"specify model revision","localized":"","reload":"","hint":"specify model revision"}, {"id":"","label":"specify model variant","localized":"","reload":"","hint":"specify model variant"}, - {"id":"","label":"split attention","localized":"","reload":"","hint":"split attention"}, {"id":"","label":"stable-fast","localized":"","reload":"","hint":"stable-fast"}, {"id":"","label":"standard","localized":"","reload":"","hint":"standard"}, {"id":"","label":"start","localized":"","reload":"","hint":"start"}, @@ -1209,7 +1215,6 @@ {"id":"","label":"timestep","localized":"","reload":"","hint":"timestep"}, {"id":"","label":"timestep skip end","localized":"","reload":"","hint":"timestep skip end"}, {"id":"","label":"timestep skip start","localized":"","reload":"","hint":"timestep skip start"}, - {"id":"","label":"timestep spacing","localized":"","reload":"","hint":"timestep spacing"}, {"id":"","label":"timesteps","localized":"","reload":"","hint":"timesteps"}, {"id":"","label":"timesteps 
override","localized":"","reload":"","hint":"timesteps override"}, {"id":"","label":"timesteps presets","localized":"","reload":"","hint":"timesteps presets"}, diff --git a/html/midnight-barbie.jpg b/html/midnight-barbie.jpg deleted file mode 100644 index c4182e8f1..000000000 Binary files a/html/midnight-barbie.jpg and /dev/null differ diff --git a/html/card-no-preview.png b/html/missing.png similarity index 100% rename from html/card-no-preview.png rename to html/missing.png diff --git a/html/orchid-dreams.jpg b/html/orchid-dreams.jpg deleted file mode 100644 index 8a62eeb0a..000000000 Binary files a/html/orchid-dreams.jpg and /dev/null differ diff --git a/html/reference.json b/html/reference.json index 5d634e68d..ae8c37e85 100644 --- a/html/reference.json +++ b/html/reference.json @@ -1,4 +1,3 @@ - { "Tempest-by-Vlad XL": { "path": "tempestByVlad_baseV01.safetensors@https://civitai.com/api/download/models/1301775", @@ -140,40 +139,47 @@ "extras": "sampler: Default, cfg_scale: 4.5" }, - "lodestones Chroma Unlocked HD": { + "lodestones Chroma1 HD": { "path": "lodestones/Chroma1-HD", "preview": "lodestones--Chroma-HD.jpg", - "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and I’d love to hear your thoughts! Your input and feedback are really appreciated.", + "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. 
This is the high-res fine-tune of the Chroma1-Base at a 1024x1024 resolution.", "skip": true, - "extras": "sampler: Default, cfg_scale: 3.5" + "extras": "" }, - "lodestones Chroma Unlocked HD Annealed": { - "path": "vladmandic/chroma-unlocked-v50-annealed", - "preview": "lodestones--Chroma-annealed.jpg", - "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and I’d love to hear your thoughts! Your input and feedback are really appreciated.", + "lodestones Chroma1 Base": { + "path": "lodestones/Chroma1-Base", + "preview": "lodestones--Chroma-Base.jpg", + "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. This is the core 512x512 model. It's a solid, all-around foundation for pretty much any creative project.", "skip": true, - "extras": "sampler: Default, cfg_scale: 3.5" + "extras": "" }, - "lodestones Chroma Unlocked HD Flash": { + "lodestones Chroma1 Flash": { "path": "lodestones/Chroma1-Flash", "preview": "lodestones--Chroma-flash.jpg", - "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and I’d love to hear your thoughts! Your input and feedback are really appreciated.", + "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. 
A fine-tuned version of the Chroma1-Base made to find the best way to make these flow matching models faster.", "skip": true, - "extras": "sampler: Default, cfg_scale: 1.0" + "extras": "" }, - "lodestones Chroma Unlocked v48": { + "lodestones Chroma1 v50 Preview Annealed": { + "path": "vladmandic/chroma-unlocked-v50-annealed", + "preview": "lodestones--Chroma-annealed.jpg", + "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. Re-tweaked variant with extra noise added.", + "skip": true, + "extras": "" + }, + "lodestones Chroma1 v48 Preview": { "path": "vladmandic/chroma-unlocked-v48", "preview": "lodestones--Chroma.jpg", - "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and I’d love to hear your thoughts! Your input and feedback are really appreciated.", + "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. Last raw version of Chroma before final finetuning.", "skip": true, - "extras": "sampler: Default, cfg_scale: 1.0" + "extras": "" }, - "lodestones Chroma Unlocked v48 Detail Calibrated": { + "lodestones Chroma1 v48 Preview Calibrated": { "path": "vladmandic/chroma-unlocked-v48-detail-calibrated", "preview": "lodestones--Chroma-detail.jpg", - "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. The model is still training right now, and I’d love to hear your thoughts! Your input and feedback are really appreciated.", + "desc": "Chroma is a 8.9B parameter model based on FLUX.1-schnell. 
It’s fully Apache 2.0 licensed, ensuring that anyone can use, modify, and build on top of it—no corporate gatekeeping. Last raw version of Chroma before final finetuning but with some detail calibration.", "skip": true, - "extras": "sampler: Default, cfg_scale: 1.0" + "extras": "" }, "Qwen-Image": { @@ -185,12 +191,12 @@ }, "Qwen-Image-Edit": { "path": "Qwen/Qwen-Image-Edit", - "preview": "Qwen--Qwen-Image.jpg", + "preview": "Qwen--Qwen-Image-Edit.jpg", "desc": " Qwen-Image-Edit, the image editing version of Qwen-Image. Built upon our 20B Qwen-Image model, Qwen-Image-Edit successfully extends Qwen-Image’s unique text rendering capabilities to image editing tasks, enabling precise text editing.", "skip": true, "extras": "" }, - "Qwen-Lightning": { + "Qwen-Image-Lightning": { "path": "vladmandic/Qwen-Lightning", "preview": "Qwen-Lightning.jpg", "desc": " Qwen-Lightning is step-distilled from Qwen-Image to allow for generation in 8 steps.", @@ -281,13 +287,13 @@ "NVLabs Sana 1.5 1.6B 1k": { "path": "Efficient-Large-Model/SANA1.5_1.6B_1024px_diffusers", "desc": "Sana is an efficient model with scaling of training-time and inference time techniques. SANA-1.5 delivers: efficient model growth from 1.6B Sana-1.0 model to 4.8B, achieving similar or better performance than training from scratch and saving 60% training cost; efficient model depth pruning, slimming any model size as you want; powerful VLM selection based inference scaling, smaller model+inference scaling > larger model.", - "preview": "Efficient-Large-Model--Sana15_1600M_1024px_diffusers.jpg", + "preview": "Efficient-Large-Model--SANA1.5_1.6B_1024px_diffusers.jpg", "skip": true }, "NVLabs Sana 1.5 4.8B 1k": { "path": "Efficient-Large-Model/SANA1.5_4.8B_1024px_diffusers", "desc": "Sana is an efficient model with scaling of training-time and inference time techniques. 
SANA-1.5 delivers: efficient model growth from 1.6B Sana-1.0 model to 4.8B, achieving similar or better performance than training from scratch and saving 60% training cost; efficient model depth pruning, slimming any model size as you want; powerful VLM selection based inference scaling, smaller model+inference scaling > larger model.", - "preview": "Efficient-Large-Model--Sana15_4800M_1024px_diffusers.jpg", + "preview": "Efficient-Large-Model--SANA1.5_4.8B_1024px_diffusers.jpg", "skip": true }, "NVLabs Sana 1.5 1.6B 1k Sprint": { @@ -299,25 +305,25 @@ "NVLabs Sana 1.0 1.6B 4k": { "path": "Efficient-Large-Model/Sana_1600M_4Kpx_BF16_diffusers", "desc": "Sana is a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU.", - "preview": "Efficient-Large-Model--Sana15_1600M_4Kpx_diffusers.jpg", + "preview": "Efficient-Large-Model--Sana_1600M_4Kpx_BF16_diffusers.jpg", "skip": true }, "NVLabs Sana 1.0 1.6B 2k": { "path": "Efficient-Large-Model/Sana_1600M_2Kpx_BF16_diffusers", "desc": "Sana is a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU.", - "preview": "Efficient-Large-Model--Sana1_1600M_2Kpx_diffusers.jpg", + "preview": "Efficient-Large-Model--Sana_1600M_2Kpx_BF16_diffusers.jpg", "skip": true }, "NVLabs Sana 1.0 1.6B 1k": { "path": "Efficient-Large-Model/Sana_1600M_1024px_diffusers", "desc": "Sana is a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. 
Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU.", - "preview": "Efficient-Large-Model--Sana1_1600M_1024px_diffusers.jpg", + "preview": "Efficient-Large-Model--Sana_1600M_1024px_diffusers.jpg", "skip": true }, "NVLabs Sana 1.0 0.6B 0.5k": { "path": "Efficient-Large-Model/Sana_600M_512px_diffusers", "desc": "Sana is a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU.", - "preview": "Efficient-Large-Model--Sana1_600M_1024px_diffusers.jpg", + "preview": "Efficient-Large-Model--Sana_600M_512px_diffusers.jpg", "skip": true }, @@ -340,7 +346,6 @@ "preview": "Shitao--OmniGen-v1.jpg", "skip": true }, - "VectorSpaceLab OmniGen v2": { "path": "OmniGen2/OmniGen2", "desc": "OmniGen2 is a powerful and efficient unified multimodal model. Unlike OmniGen v1, OmniGen2 features two distinct decoding pathways for text and image modalities, utilizing unshared parameters and a decoupled image tokenizer.", @@ -462,7 +467,6 @@ "skip": true, "extras": "sampler: Default" }, - "AlphaVLLM Lumina 2": { "path": "Alpha-VLLM/Lumina-Image-2.0", "desc": "A Unified and Efficient Image Generative Model. Lumina-Image-2.0 is a 2 billion parameter flow-based diffusion transformer capable of generating images from text descriptions.", @@ -553,9 +557,10 @@ "extras": "sampler: Default" }, "Playground v2.5": { - "path": "playground-v2.5-1024px-aesthetic.fp16.safetensors@https://huggingface.co/playgroundai/playground-v2.5-1024px-aesthetic/resolve/main/playground-v2.5-1024px-aesthetic.fp16.safetensors?download=true", - "desc": "Playground v2.5 is a diffusion-based text-to-image generative model, and a successor to Playground v2. Playground v2.5 is the state-of-the-art open-source model in aesthetic quality. 
Our user studies demonstrate that our model outperforms SDXL, Playground v2, PixArt-α, DALL-E 3, and Midjourney 5.2.", + "path": "playgroundai/playground-v2.5-1024px-aesthetic", + "desc": "Playground v2.5 is a diffusion-based text-to-image generative model, and a successor to Playground v2. Playground v2.5 is the state-of-the-art open-source model in aesthetic quality.", "preview": "playgroundai--playground-v2_5-1024px-aesthetic.jpg", + "variant": "fp16", "extras": "sampler: DPM++ 2M EDM" }, @@ -604,6 +609,7 @@ "preview": "MeissonFlow--Meissonic.jpg", "skip": true }, + "aMUSEd 256": { "path": "huggingface/amused/amused-256", "skip": true, @@ -624,6 +630,7 @@ "preview": "warp-ai--wuerstchen.jpg", "extras": "sampler: Default, cfg_scale: 4.0, image_cfg_scale: 0.0" }, + "KOALA 700M": { "path": "huggingface/etri-vilab/koala-700m-llava-cap", "variant": "fp16", @@ -632,22 +639,34 @@ "preview": "etri-vilab--koala-700m-llava-cap.jpg", "extras": "sampler: Default" }, + + "HDM-XUT 340M Anime": { + "path": "KBlueLeaf/HDM-xut-340M-anime", + "skip": true, + "desc": "HDM (Home-made Diffusion Model) is a project to investigate a specialized training recipe/scheme for pretraining a T2I model at home, which requires the training setup to be executable on consumer-level hardware or cheap enough second-hand server hardware.", + "preview": "KBlueLeaf--HDM-xut-340M-anime.jpg", + "extras": "" + }, + "Tsinghua UniDiffuser": { "path": "thu-ml/unidiffuser-v1", "desc": "UniDiffuser is a unified diffusion framework to fit all distributions relevant to a set of multi-modal data in one transformer. UniDiffuser is able to perform image, text, text-to-image, image-to-text, and image-text pair generation by setting proper timesteps without additional overhead.\nSpecifically, UniDiffuser employs a variation of transformer, called U-ViT, which parameterizes the joint noise prediction network. 
Other components perform as encoders and decoders of different modalities, including a pretrained image autoencoder from Stable Diffusion, a pretrained image ViT-B/32 CLIP encoder, a pretrained text ViT-L CLIP encoder, and a GPT-2 text decoder finetuned by ourselves.", "preview": "thu-ml--unidiffuser-v1.jpg", "extras": "width: 512, height: 512, sampler: Default" }, + "SalesForce BLIP-Diffusion": { "path": "salesforce/blipdiffusion", "desc": "BLIP-Diffusion, a new subject-driven image generation model that supports multimodal control which consumes inputs of subject images and text prompts. Unlike other subject-driven generation models, BLIP-Diffusion introduces a new multimodal encoder which is pre-trained to provide subject representation.", "preview": "salesforce--blipdiffusion.jpg" }, + "InstaFlow 0.9B": { "path": "XCLiu/instaflow_0_9B_from_sd_1_5", "desc": "InstaFlow is an ultra-fast, one-step image generator that achieves image quality close to Stable Diffusion. This efficiency is made possible through a recent Rectified Flow technique, which trains probability flows with straight trajectories, hence inherently requiring only a single step for fast inference.", "preview": "XCLiu--instaflow_0_9B_from_sd_1_5.jpg" }, + "DeepFloyd IF Medium": { "path": "DeepFloyd/IF-I-M-v1.0", "desc": "DeepFloyd-IF is a pixel-based text-to-image triple-cascaded diffusion model, that can generate pictures with new state-of-the-art for photorealism and language understanding. The result is a highly efficient model that outperforms current state-of-the-art models, achieving a zero-shot FID-30K score of 6.66 on the COCO dataset. 
It is modular and composed of a frozen text model and three pixel cascaded diffusion modules, each designed to generate images of increasing resolution: 64x64, 256x256, and 1024x1024.", diff --git a/html/simple-dark.jpg b/html/simple-dark.jpg deleted file mode 100644 index 0aa0f2450..000000000 Binary files a/html/simple-dark.jpg and /dev/null differ diff --git a/html/simple-light.jpg b/html/simple-light.jpg deleted file mode 100644 index aa85547c3..000000000 Binary files a/html/simple-light.jpg and /dev/null differ diff --git a/html/timeless-beige.jpg b/html/timeless-beige.jpg deleted file mode 100644 index e8c591379..000000000 Binary files a/html/timeless-beige.jpg and /dev/null differ diff --git a/installer.py b/installer.py index 8b32ed927..d59908bf0 100644 --- a/installer.py +++ b/installer.py @@ -447,6 +447,8 @@ def git(arg: str, folder: str = None, ignore: bool = False, optional: bool = Fal stdout += ('\n' if len(stdout) > 0 else '') + result.stderr.decode(encoding="utf8", errors="ignore") stdout = stdout.strip() if result.returncode != 0 and not ignore: + if folder is None: + folder = 'root' if "couldn't find remote ref" in stdout: # not a git repo log.error(f'Git: folder="{folder}" could not identify repository') elif "no submodule mapping found" in stdout: @@ -601,7 +603,7 @@ def check_diffusers(): if args.skip_git: install('diffusers') return - sha = '4fcd0bc7ebb934a1559d0b516f09534ba22c8a0d' # diffusers commit hash + sha = '9b721db205729d5a6e97a72312c3a0f4534064f1' # diffusers commit hash pkg = pkg_resources.working_set.by_key.get('diffusers', None) minor = int(pkg.version.split('.')[1] if pkg is not None else -1) cur = opts.get('diffusers_version', '') if minor > -1 else '' @@ -622,18 +624,22 @@ def check_transformers(): t_start = time.time() if args.skip_all or args.skip_git or args.experimental: return - pkg = pkg_resources.working_set.by_key.get('transformers', None) + pkg_transformers = pkg_resources.working_set.by_key.get('transformers', None) + 
pkg_tokenizers = pkg_resources.working_set.by_key.get('tokenizers', None) if args.use_directml: - target = '4.52.4' + target_transformers = '4.52.4' + target_tokenizers = '0.21.4' else: - target = '4.55.2' - if (pkg is None) or ((pkg.version != target) and (not args.experimental)): - if pkg is None: - log.info(f'Transformers install: version={target}') + target_transformers = '4.56.0' + target_tokenizers = '0.22.0' + if (pkg_transformers is None) or (pkg_tokenizers is None) or (((pkg_transformers.version != target_transformers) or (pkg_tokenizers.version != target_tokenizers)) and (not args.experimental)): + if pkg_transformers is None: + log.info(f'Transformers install: version={target_transformers}') else: - log.info(f'Transformers update: current={pkg.version} target={target}') + log.info(f'Transformers update: current={pkg_transformers.version} target={target_transformers}') pip('uninstall --yes transformers', ignore=True, quiet=True, uv=False) - pip(f'install --upgrade transformers=={target}', ignore=False, quiet=True, uv=False) + pip(f'install --upgrade tokenizers=={target_tokenizers}', ignore=False, quiet=True, uv=False) + pip(f'install --upgrade transformers=={target_transformers}', ignore=False, quiet=True, uv=False) ts('transformers', t_start) @@ -768,10 +774,6 @@ def install_rocm_zluda(): # older rocm (5.7) uses torch 2.3 or older torch_command = os.environ.get('TORCH_COMMAND', f'torch torchvision --index-url https://download.pytorch.org/whl/rocm{rocm.version}') - if device is not None and rocm.version != "6.2" and rocm.get_blaslt_enabled(): - log.debug(f'ROCm hipBLASLt: arch={device.name} available={device.blaslt_supported}') - rocm.set_blaslt_enabled(device.blaslt_supported) - if device is None or os.environ.get("HSA_OVERRIDE_GFX_VERSION", None) is not None: log.info(f'ROCm: HSA_OVERRIDE_GFX_VERSION auto config skipped: device={device.name if device is not None else None} version={os.environ.get("HSA_OVERRIDE_GFX_VERSION", None)}') else: @@ -1271,21 
+1273,6 @@ def install_optional(): ts('optional', t_start) -def install_sentencepiece(): - if installed('sentencepiece', quiet=True): - pass - elif int(sys.version_info.minor) >= 13: - backup_cmake_policy = os.environ.get('CMAKE_POLICY_VERSION_MINIMUM', None) - backup_cxxflags = os.environ.get('CXXFLAGS', None) - os.environ.setdefault('CMAKE_POLICY_VERSION_MINIMUM', '3.5') - os.environ.setdefault('CXXFLAGS', '-include cstdint') - install('git+https://github.com/google/sentencepiece#subdirectory=python', 'sentencepiece') - os.environ.setdefault('CMAKE_POLICY_VERSION_MINIMUM', backup_cmake_policy) - os.environ.setdefault('CXXFLAGS', backup_cxxflags) - else: - install('sentencepiece', 'sentencepiece') - - def install_requirements(): t_start = time.time() if args.profile: diff --git a/javascript/base.css b/javascript/base.css index f6a7f7d09..cc2e22061 100644 --- a/javascript/base.css +++ b/javascript/base.css @@ -98,7 +98,7 @@ table.settings-value-table td { padding: 0.4em; border: 1px solid #ccc; max-widt .extra-network-cards .card:hover .overlay { background: rgba(0, 0, 0, 0.40); } .extra-network-cards .card:hover .preview { box-shadow: none; filter: grayscale(100%); } .extra-network-cards .card:hover .overlay { background: rgba(0, 0, 0, 0.40); } -.extra-network-cards .card .tags { margin: 4px; display: none; overflow-wrap: break-word; } +.extra-network-cards .card .tags { margin: 4px; display: none; overflow-wrap: anywhere; } .extra-network-cards .card .tag { padding: 2px; margin: 2px; background: var(--neutral-700); cursor: pointer; display: inline-block; } .extra-network-cards .card .actions > span { padding: 4px; } .extra-network-cards .card:hover .actions { display: block; } @@ -128,8 +128,8 @@ div:has(>#tab-browser-folders) { flex-grow: 0 !important; background-color: var( /* loader */ .splash { position: fixed; top: 0; left: 0; width: 100vw; height: 100vh; z-index: 1000; display: block; text-align: center; } -.motd { margin-top: 2em; color: 
var(--body-text-color-subdued); font-family: monospace; font-variant: all-petite-caps; } -.splash-img { margin: 10% auto 0 auto; width: 512px; background-repeat: no-repeat; height: 512px; animation: color 10s infinite alternate; max-width: 80vw; background-size: contain; } +.motd { margin-top: 2em; color: var(--body-text-color-subdued); font-family: monospace; font-variant: all-petite-caps; font-size: 1.2em; } +.splash-img { margin: 10% auto 0 auto; width: 512px; background-repeat: no-repeat; height: 512px; animation: hue 5s infinite alternate; max-width: 80vw; background-size: contain; } .loading { color: white; position: absolute; top: 20%; left: 50%; transform: translateX(-50%); } .loader { width: 300px; height: 300px; border: var(--spacing-md) solid transparent; border-radius: 50%; border-top: var(--spacing-md) solid var(--primary-600); animation: spin 4s linear infinite; position: relative; } .loader::before, .loader::after { content: ""; position: absolute; top: 6px; bottom: 6px; left: 6px; right: 6px; border-radius: 50%; border: var(--spacing-md) solid transparent; } @@ -137,4 +137,4 @@ div:has(>#tab-browser-folders) { flex-grow: 0 !important; background-color: var( .loader::after { border-top-color: var(--primary-300); animation: spin 1.5s linear infinite; } @keyframes move { from { background-position-x: 0, -40px; } to { background-position-x: 0, 40px; } } @keyframes spin { from { transform: rotate(0deg); } to { transform: rotate(360deg); } } -@keyframes color { from { filter: hue-rotate(0deg) } to { filter: hue-rotate(360deg) } } +@keyframes hue { from { filter: hue-rotate(0deg) } to { filter: hue-rotate(360deg) } } diff --git a/javascript/black-teal-reimagined.css b/javascript/black-teal-reimagined.css index 1e7d4dc0b..dc3d4d0ce 100644 --- a/javascript/black-teal-reimagined.css +++ b/javascript/black-teal-reimagined.css @@ -953,7 +953,7 @@ svg.feather.feather-image, } /* No Preview Card Styles */ -.extra-network-cards 
.card:has(>img[src*="card-no-preview.png"])::before { +.extra-network-cards .card:has(>img[src*="missing.png"])::before { content: ''; position: absolute; width: 100%; @@ -1007,11 +1007,11 @@ svg.feather.feather-image, } .splash-img { - margin: 0; + margin: 10% auto 0 auto; width: 512px; height: 512px; background-repeat: no-repeat; - animation: color 8s infinite alternate, move 3s infinite alternate; + animation: hue 5s infinite alternate; } .loading { diff --git a/javascript/civitai.js b/javascript/civitai.js index 31f1fc890..71e957380 100644 --- a/javascript/civitai.js +++ b/javascript/civitai.js @@ -107,7 +107,7 @@ async function modelCardClick(id) { downloads: data.downloads?.toString() || '', creator, desc: data.desc || 'no description available', - image: images.length > 0 ? images[0] : '/sdapi/v1/network/thumb?filename=html/card-no-preview.png', + image: images.length > 0 ? images[0] : '/sdapi/v1/network/thumb?filename=html/missing.png', versions: versionsHTML || '', }); el.innerHTML = modelHTML; diff --git a/javascript/extraNetworks.js b/javascript/extraNetworks.js index e13323307..5026ef6e2 100644 --- a/javascript/extraNetworks.js +++ b/javascript/extraNetworks.js @@ -14,8 +14,13 @@ const getENActiveTab = () => { else if (gradioApp().getElementById('extras_image')?.checkVisibility()) tabName = 'process'; else if (gradioApp().getElementById('interrogate_image')?.checkVisibility()) tabName = 'caption'; else if (gradioApp().getElementById('tab-gallery-search')?.checkVisibility()) tabName = 'gallery'; - if (tabName in ['process', 'caption', 'gallery']) tabName = lastTab; - else lastTab = tabName; + + if (['process', 'caption', 'gallery'].includes(tabName)) { + tabName = lastTab; + } else if (tabName !== '') { + lastTab = tabName; + } + if (tabName !== '') return tabName; // legacy method if (gradioApp().getElementById('tab_txt2img')?.style.display === 'block') tabName = 'txt2img'; @@ -277,8 +282,31 @@ function extraNetworksSearchButton(event) { const tabName = 
getENActiveTab(); const searchTextarea = gradioApp().querySelector(`#${tabName}_extra_search textarea`); const button = event.target; - searchTextarea.value = `${button.textContent.trim()}/`; - updateInput(searchTextarea); + if (searchTextarea) { + searchTextarea.value = `${button.textContent.trim()}/`; + updateInput(searchTextarea); + } else { + console.error(`Could not find the search textarea for the tab: ${tabName}`); + } +} + +function extraNetworksFilterVersion(event) { + // log('extraNetworksFilterVersion', event); + const version = event.target.textContent.trim(); + const activeTab = gradioApp().querySelector('.extra-networks-tab:not([style*="display: none"])'); + if (!activeTab) return; + const cardContainer = activeTab.querySelector('.extra-network-cards'); + if (!cardContainer) return; + if (cardContainer.dataset.activeVersion === version) { + cardContainer.dataset.activeVersion = ''; + cardContainer.querySelectorAll('.card').forEach((card) => card.style.display = ''); + } else { + cardContainer.dataset.activeVersion = version; + cardContainer.querySelectorAll('.card').forEach((card) => { + if (card.dataset.version === version) card.style.display = ''; + else card.style.display = 'none'; + }); + } } let desiredStyle = ''; diff --git a/javascript/sdnext.css b/javascript/sdnext.css index adbeb8d9d..5c29863ab 100644 --- a/javascript/sdnext.css +++ b/javascript/sdnext.css @@ -1197,7 +1197,7 @@ table.settings-value-table td { } .extra-networks .search textarea { - width: calc(120px / 1.1); + width: calc(140px / 1.1); resize: none; margin-right: 2px; } @@ -1233,7 +1233,7 @@ table.settings-value-table td { padding: 3px 3px 3px 12px; text-align: left; text-indent: -6px; - width: 120px; + width: 140px; width: 100%; } @@ -1249,7 +1249,7 @@ table.settings-value-table td { .extra-network-subdirs { background: var(--input-background-fill); border-radius: 4px; - min-width: max(15%, 120px); + min-width: max(15%, 140px); overflow-x: hidden; overflow-y: auto; 
padding-top: 0.5em; @@ -1376,7 +1376,17 @@ table.settings-value-table td { display: block; } -.extra-network-cards .card:has(>img[src*="card-no-preview.png"])::before { +.extra-network-cards .card:hover { + z-index: 100; + position: relative; +} + +.extra-network-cards .card:hover .tags { + display: block; + z-index: 101; /* Optional: ensure tags are above everything */ +} + +.extra-network-cards .card:has(>img[src*="missing.png"])::before { background-color: var(--data-color); content: ''; height: 100%; @@ -1461,7 +1471,6 @@ table.settings-value-table td { overflow-y: auto; } - .extra-details td:first-child { font-weight: bold; vertical-align: top; @@ -1471,6 +1480,29 @@ table.settings-value-table td { max-height: 50vh; } +.network-folder::before { + content: "󰉖 "; + margin-right: 0.8em; +} + +.network-reference { + filter: contrast(0.9); +} + +.network-reference::before { + content: "󰴊 "; + margin-right: 0.8em; +} + +.network-model { + opacity: 0.6; +} + +.network-model::before { + content: "󰴉 "; + margin-right: 0.8em; +} + .input-accordion-checkbox { display: none !important; } @@ -1716,6 +1748,13 @@ background: var(--background-color) width: max-content; } +#tab-gallery-files gallery-file { + /* Add a vertical gutter between items (left/right), matching existing small row spacing */ + display: inline-block; + margin-right: 0.2em; + vertical-align: top; /* keep rows aligned on the top edge */ +} + #tab-gallery-files { display: block; height: 75vh; @@ -1813,6 +1852,24 @@ div:has(>#tab-gallery-folders) { object-fit: contain; } +/* Gallery video preview matches image preview sizing and layout */ +#tab-gallery-video { + height: 63vh; +} + +/* Ensure the