Comprehensive test script for all Caption API endpoints:
- GET/POST /sdapi/v1/interrogate (OpenCLIP/DeepBooru)
- POST /sdapi/v1/vqa (VLM captioning)
- GET /sdapi/v1/vqa/models, /sdapi/v1/vqa/prompts
- POST /sdapi/v1/tagger
- GET /sdapi/v1/tagger/models
Usage: python cli/test-caption-api.py [--url URL] [--image PATH]
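For reference, a minimal sketch of the kind of round-trip the script performs against the VQA endpoints; the payload field names ("image", "model", "question") and the response shapes are assumptions inferred from the endpoint list above, not excerpts from cli/test-caption-api.py:

```python
import base64
import requests

url = "http://127.0.0.1:7860"  # default server address; override with --url

with open("test.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# list available VLM models, then caption with the first one
models = requests.get(f"{url}/sdapi/v1/vqa/models", timeout=60).json()  # assumed to return a list
payload = {"image": image_b64, "model": models[0], "question": "Describe the image"}
res = requests.post(f"{url}/sdapi/v1/vqa", json=payload, timeout=300)
print(res.status_code, res.json())
```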
Add comprehensive caption/interrogate API with documentation:
- GET /sdapi/v1/interrogate: List available interrogation models
- POST /sdapi/v1/interrogate: Interrogate with OpenCLIP/BLIP/DeepDanbooru
- POST /sdapi/v1/vqa: Caption with Vision-Language Models (VLM)
- GET /sdapi/v1/vqa: List available VLM models
- POST /sdapi/v1/vqa/batch: Batch caption multiple images
- POST /sdapi/v1/tagger: Tag images with WaifuDiffusion/DeepBooru
Updates:
- Add detailed docstrings with usage examples
- Fix analyze_image response parsing for Gradio update dicts
- Add request/response models for all endpoints
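As an illustration of the documented interrogate flow, a hedged example in the classic A1111 API style; the exact request/response models (field names, "caption" key) are assumptions rather than this fork's verbatim schema:

```python
import base64
import requests

url = "http://127.0.0.1:7860"

with open("test.png", "rb") as f:
    img = base64.b64encode(f.read()).decode()

models = requests.get(f"{url}/sdapi/v1/interrogate", timeout=60).json()  # list models
res = requests.post(f"{url}/sdapi/v1/interrogate",
                    json={"image": img, "model": "clip"}, timeout=300)
print(res.json())  # e.g. {"caption": "a photo of ..."}
```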
Add comprehensive tooltips to Caption tab UI elements in locale_en.json:
- Add new "llm" section for shared LLM/VLM parameters:
System prompt, Prefill, Top-K, Top-P, Temperature, Num Beams,
Use Samplers, Thinking Mode, Keep Thinking Trace, Keep Prefill
- Add new "caption" section for caption-specific settings:
VLM, OpenCLIP, and Tagger tab labels and all their parameters
including thresholds, tag formatting, batch options
- Consolidate accordion labels in ui_caption.py:
"Caption: Advanced Options" and "Caption: Batch" shared across
VLM, OpenCLIP, and Tagger tabs (localized to "Advanced Options"
and "Batch" in UI)
- Remove duplicate entries from missing section
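Illustrative shape of the two new sections; the entry schema shown (label/localized/hint) and the hint strings are assumptions for illustration, not the shipped locale_en.json content:

```json
{
  "llm": [
    {"label": "Temperature", "localized": "", "hint": "Sampling temperature; higher values yield more varied captions"},
    {"label": "Top-K", "localized": "", "hint": "Sample only from the K most likely tokens"}
  ],
  "caption": [
    {"label": "Caption: Advanced Options", "localized": "Advanced Options", "hint": "Shared advanced settings for the VLM, OpenCLIP, and Tagger tabs"}
  ]
}
```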
- Remove modules/facelib from .ruff.toml and .pylintrc exclusions
(folder was deleted, no longer needs to be excluded)
- Fix sdnext-modernui submodule pointer to match origin/dev
(was accidentally rolled back 6 commits in original PR)
- Change AxisOption type from str to bool with [False, True] choices
- Simplify apply_detailer() to accept bool directly
- Fix log message from "face-restore" to "detailer"
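Sketch of the before/after wiring; the AxisOption stand-in below follows the common (label, type, apply, choices) pattern of xyz-grid scripts, and the axis label is hypothetical:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class AxisOption:  # stand-in for the real xyz-grid class
    label: str
    type: type
    apply: Callable
    choices: Optional[Callable] = None

def apply_detailer(p, x, xs):
    p.detailer_enabled = x  # x arrives as a real bool; no string parsing needed

# before: AxisOption('[Detailer] Enabled', str, apply_detailer, choices=lambda: ['false', 'true'])
opt = AxisOption('[Detailer] Enabled', bool, apply_detailer, choices=lambda: [False, True])
```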
- Build CLIP with --no-build-isolation to use venv's setuptools 69.5.1
instead of pip pulling setuptools 82.0.0 (which removed pkg_resources)
into an isolated build environment
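A self-contained sketch of the equivalent pip invocation; the helper name install_clip is hypothetical and the real installer.py code path differs:

```python
import subprocess
import sys

def install_clip(no_build_isolation: bool = True):
    """Hypothetical helper; the real installer.py code path differs."""
    cmd = [sys.executable, '-m', 'pip', 'install']
    if no_build_isolation:
        # reuse the venv's setuptools 69.5.1 (which still ships pkg_resources)
        # instead of letting pip pull setuptools 82.0.0 into an isolated build env
        cmd.append('--no-build-isolation')
    cmd.append('git+https://github.com/openai/CLIP.git')
    subprocess.run(cmd, check=True)
```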
- Add cleanup_broken_packages() to remove dist-info directories with
missing RECORD files before falling back from uv to pip, preventing
cascading install failures when uv partially installs packages
- Add no_build_isolation parameter to install() function
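A minimal sketch of the cleanup step from the first bullet above, assuming the standard site-packages layout; the real cleanup_broken_packages() may differ in detail:

```python
import os
import shutil
import site

def cleanup_broken_packages():
    """Remove dist-info directories whose RECORD file is missing.

    uv can leave half-installed packages behind; pip then fails on them,
    so they are removed before falling back from uv to pip.
    """
    for folder in site.getsitepackages():
        if not os.path.isdir(folder):
            continue
        for entry in os.listdir(folder):
            if not entry.endswith('.dist-info'):
                continue
            path = os.path.join(folder, entry)
            if not os.path.isfile(os.path.join(path, 'RECORD')):
                shutil.rmtree(path, ignore_errors=True)
```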
- Remove GFPGAN and CodeFormer sections from Modern UI extras template
- Remove CodeFormer S-Lab license block from licenses.html
- Update Postprocessing hint in locale_en.json
- Remove completed TODO items for CodeFormer/GFPGAN removal
- Remove GFPGAN pip install from installer.py optional requirements
- Remove 'gfpgan' from modules_to_remove cleanup list in launch.py
- Remove --codeformer-models-path and --gfpgan-models-path CLI args
- Remove GFPGAN model directory migration from modelloader.py
- Remove codeformer, restoreformer, GFPGANv1.4, and GPEN-BFR ONNX
model URLs from the predefined list
- Remove the .fp16 ONNX restorer code path that bypassed detailer
processing to run face restoration directly
- Remove /sdapi/v1/face-restorers route from api.py
- Remove get_restorers() function from endpoints.py
- Remove gfpgan_visibility, codeformer_visibility, codeformer_weight
fields from ReqProcess model
- Remove GFPGAN and CodeFormer entries from run_extras() signature
and create_args_for_run dict in postprocessing.py
- Remove CodeFormer/GFPGAN import and setup from webui.py initialize()
- Remove face_restorers list, codeformer/gfpgan model path settings,
and face restore UI settings section from shared.py
- Remove restore_faces parameter from StableDiffusionProcessing
- Remove face_restoration import and restore_faces processing block
from processing.py
Remove all vendored face restoration code that is no longer maintained:
- modules/postprocess/codeformer_model.py, codeformer_arch.py, vqgan_arch.py
- modules/postprocess/gfpgan_model.py, restorer.py
- modules/face_restoration.py (base class and dispatcher)
- scripts/postprocessing_codeformer.py, postprocessing_gfpgan.py
- modules/facelib/ (vendored face detection/parsing library)
These were the only two backends registered in shared.face_restorers,
making the entire face restoration infrastructure dead code.
Nunchaku's SDXL UNet does not support offloading and raises
NotImplementedError when offload=True is passed. Skip the parameter
for SDXL and log a warning instead of crashing.
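The shape of the guard, with hypothetical names (build_offload_kwargs, model_type); the actual loader code differs:

```python
import logging

log = logging.getLogger(__name__)

def build_offload_kwargs(model_type: str, offload: bool) -> dict:
    # Nunchaku's SDXL UNet raises NotImplementedError when offload=True is
    # passed, so drop the flag for SDXL and warn instead of crashing
    if offload and model_type == 'sdxl':
        log.warning('Nunchaku: offloading not supported for SDXL, ignoring offload=True')
        return {}
    return {'offload': True} if offload else {}
```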
Add 4-step distilled Nunchaku SVDQuant entries for Qwen-Lightning and
Qwen-Lightning-Edit alongside the existing 8-step variants. Step count
is now shown in the reference name (e.g. "Qwen-Lightning (4-step)").
- Add subfolder parameter to load_qwen_nunchaku to distinguish
4-step (nunchaku-4step) from 8-step (nunchaku) variants
- Route to correct safetensors: lightningv1.0-4steps vs
lightningv1.1-8steps for gen, lightningv1.0-4steps vs
lightningv1.0-8steps for edit
- Strip the nunchaku subfolder from the model path before calling pipeline
from_pretrained, since that folder does not exist in the base HuggingFace repos
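Condensed sketch of the routing described above; the function and argument names (pick_lightning_weights, kind) are simplifications of the real load_qwen_nunchaku code:

```python
def pick_lightning_weights(kind: str, subfolder: str) -> str:
    """Map task + subfolder to the correct Lightning safetensors variant."""
    four_step = subfolder == 'nunchaku-4step'
    if kind == 'gen':  # Qwen-Lightning
        return 'lightningv1.0-4steps' if four_step else 'lightningv1.1-8steps'
    else:              # 'edit' -> Qwen-Lightning-Edit
        return 'lightningv1.0-4steps' if four_step else 'lightningv1.0-8steps'
```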
Filter out reference entries tagged "nunchaku" from Extra Networks
when the active backend is not CUDA, since Nunchaku requires NVIDIA
GPUs. Entries remain in shared.reference_models for programmatic
lookup but are not yielded to the UI.
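Sketch of the UI-side filter; the tag check and backend probe approximate the real code, which leaves shared.reference_models itself untouched:

```python
def list_reference_items(reference_models: dict, backend: str):
    """Yield reference entries for the Extra Networks UI, skipping nunchaku
    entries on non-CUDA backends (Nunchaku requires NVIDIA GPUs)."""
    for name, entry in reference_models.items():
        if 'nunchaku' in entry.get('tags', []) and backend != 'cuda':
            continue  # still available for programmatic lookup, just not listed
        yield name, entry
```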
- Rename HuggingFace org from nunchaku-tech to nunchaku-ai across all
nunchaku model repos (flux, sdxl, sana, z-image, qwen, t5)
- Add a per-torch-version nunchaku version mapping instead of a single
global version, with robust torch version parsing
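Sketch of the mapping and the tolerant parser; the version pairs below are placeholders, not the shipped table:

```python
NUNCHAKU_BY_TORCH = {  # torch (major, minor) -> nunchaku version; values illustrative
    (2, 6): '0.3.1',
    (2, 7): '1.0.0',
    (2, 8): '1.0.1',
}

def torch_minor(version: str) -> tuple:
    """Robustly reduce strings like '2.7.1+cu128' to (2, 7)."""
    base = version.split('+')[0]
    parts = base.split('.')
    try:
        return int(parts[0]), int(parts[1])
    except (ValueError, IndexError):
        return (0, 0)

nunchaku_version = NUNCHAKU_BY_TORCH.get(torch_minor('2.7.1+cu128'))
```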
- Add 'Fill (Nunchaku)' and 'Depth (Nunchaku)' options to Flux Tools
dropdown, loading models with +nunchaku suffix for SVDQuant quantization
- Mark Fill and Depth nunchaku reference entries as hidden so they remain
available for check_nunchaku() lookup but don't appear in Extra Networks
- Filter hidden reference models in ui_extra_networks_checkpoints
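Sketch of the hidden-entry filter in ui_extra_networks_checkpoints; the 'hidden' key matches the description above, while the surrounding iteration is approximate:

```python
def visible_reference_items(reference_models: dict):
    for name, entry in reference_models.items():
        if entry.get('hidden'):
            continue  # still resolvable by check_nunchaku(), just not listed in the UI
        yield name, entry
```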
Replace manual Model/TE checkboxes in Quantization Settings with a
dedicated "Nunchaku" tab in the Extra Networks menu where users can
directly select nunchaku-quantized model variants. Detection now uses
a +nunchaku path marker for disambiguation.
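A one-line sketch of that disambiguation; the real check is likely richer, but the marker makes quantized variants self-describing:

```python
def is_nunchaku(model_path: str) -> bool:
    # '+nunchaku' in the stored path marks an SVDQuant variant selected
    # from the Extra Networks "Nunchaku" tab
    return '+nunchaku' in (model_path or '')
```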