- Remove _get_device_dtype() indirection, inline device/dtype at call sites
- Remove commented-out fallback blocks and try/finally wrappers
- Add modules/sharpfin to ruff and pylint excludes in pyproject.toml
- Fix import ordering in joytag.py and pixelart.py
Changes based on vladmandic and Disty0 feedback:
- Fix logging: use direct `from installer import log` instead of lazy _get_log()
- Remove unused is_available() function
- Remove defensive getattr() calls in _resolve_kernel/_resolve_linearize
- Simplify _get_device_dtype() to use devices module directly
- Refactor to_pil() with single Image.fromarray() call and explicit mode
- Add cross-platform fallback: sharpfin only runs on CUDA and falls back to
  PIL/F.interpolate on other devices (CPU, MPS, OpenVINO); see the dispatch sketch after this list
- Replace lambdas with functools.partial in functional.py for torch.compile safety (sketch after this list)
- Add modules/sharpfin to pylint ignore-paths (vendored code)
- Remove superfluous SimpleNamespace import in cli/api-caption.py, use Map instead
- Drop _ prefix from internal helper functions in modules/api/caption.py
- Move DeepDanbooru model path to top-level models folder instead of nesting under CLIP
- Rename shadowing import in waifudiffusion batch to avoid F823/E0606
- Fix import order in cli/api-caption.py (stdlib before third-party)
- Rename local variable shadowing function name in cli/api-caption.py
- Remove unnecessary global statement in devices.bypass_sdpa_hijacks
- Add _load_blip_model helper with explicit cache_dir so downloads
go to hfcache_dir instead of default HF cache
- Pre-load BLIP model/processor before creating Interrogator config
to control download location and avoid redundant loads
- Set clip_model_path on config for CLIP model cache location
- Add cache_dir to Moondream model and tokenizer loading
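For the cross-platform fallback above, the dispatch reduces to a device-type check at the call site. A minimal sketch, under assumed names: `modules.sharpfin.resize` is a hypothetical entry point, not the actual sharpfin API.

```python
import torch
import torch.nn.functional as F

def resize_tensor(image: torch.Tensor, size: tuple) -> torch.Tensor:
    # sharpfin ships CUDA-only kernels; everywhere else (CPU, MPS, OpenVINO)
    # degrade gracefully to plain F.interpolate.
    if image.device.type == 'cuda':
        from modules.sharpfin import resize as sharpfin_resize  # hypothetical entry point
        return sharpfin_resize(image, size)
    return F.interpolate(image, size=size, mode='bicubic', antialias=True)
```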
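For the lambda replacement: torch.compile treats every lambda as a fresh closure object, while functools.partial over a known function gives Dynamo a stable, traceable target. Illustrative pattern only, not the actual functional.py code:

```python
from functools import partial
import torch.nn.functional as F

# Before: a new closure object per definition site, which Dynamo may fail
# to reuse across compiled graphs.
# downsample = lambda x: F.interpolate(x, scale_factor=0.5, mode='bicubic')

# After: partial binds the same underlying function with fixed kwargs,
# giving torch.compile a stable callable identity.
downsample = partial(F.interpolate, scale_factor=0.5, mode='bicubic')
```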
Move all caption/interrogate/tagger/VQA API code out of the monolithic
endpoints.py and models.py into a new self-contained modules/api/caption.py,
following the loras.py / nudenet.py self-registering pattern.
- Move 15 Pydantic models (ReqCaption, ResCaption, ReqVQA, ResVQA,
ReqTagger, ResTagger, dispatch union types, etc.) from models.py
- Move 11 handler functions from endpoints.py
- Deduplicate ~150 lines via shared _do_openclip, _do_tagger, _do_vqa
core functions called by both direct and dispatch endpoints
- Add register_api() that registers all 8 caption routes
- Add promptgen field to ResVLMPrompts (bug fix: handler returned it
but response model silently dropped it)
- Improve all endpoint docstrings and Field descriptions for API docs
- Add use_safetensors=True to all 16 model from_pretrained calls to
avoid downloading redundant .bin files alongside safetensors
- Add device property to JoyTag VisionModel so move_model can relocate
  it to CUDA (fixes 'ViT object has no attribute device'); property sketch after this list
- Fix Pix2Struct dtype mismatch by casting float inputs to the model dtype
  while preserving integer tensor types (cast sketch after this list)
- Patch AutoConfig.register with exist_ok=True during Ovis loading to
handle duplicate aimv2 registration on model reload
- Detect Qwen VL fine-tune architecture from config model_type instead
of repo name, fixing ToriiGate and similar third-party fine-tunes
- Change UI default task from Short Caption to Normal Caption, and
preserve it on model switch instead of resetting to Use Prompt
- Add dual-prefill testing across 5 VQA test methods using a shared
_check_prefill helper
- Fix pre-existing ruff W605 in strip_think_xml_tags docstring
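For the JoyTag fix, the usual pattern is to expose a device property derived from the parameters, since nn.Module has no built-in one. A sketch; the real VisionModel wiring differs:

```python
import torch
from torch import nn

class VisionModel(nn.Module):  # sketch of the JoyTag ViT wrapper
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(768, 768)  # placeholder layer

    @property
    def device(self) -> torch.device:
        # Generic helpers like move_model query `model.device` the way they
        # do for transformers models; derive it from the parameters.
        return next(self.parameters()).device
```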
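For the Pix2Struct fix, the cast must skip integer tensors (input_ids, attention_mask) or generation breaks. A sketch with a hypothetical helper name:

```python
import torch

def cast_to_model_dtype(inputs: dict, dtype: torch.dtype) -> dict:
    # Cast floating-point tensors (e.g. flattened_patches) to the model
    # dtype while leaving integer tensors (input_ids, attention_mask) alone.
    return {
        key: value.to(dtype) if torch.is_tensor(value) and torch.is_floating_point(value) else value
        for key, value in inputs.items()
    }
```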
Fix update_caption_params(): it was setting caption_max_length, chunk_size, and
flavor_intermediate_count on the Interrogator instance, but the clip_interrogator
library reads them from self.config, so the overrides were silently ignored.
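The fix is to write the overrides where the library actually looks. A sketch of the corrected helper; the signature is assumed, the config attribute names come from clip_interrogator:

```python
from clip_interrogator import Interrogator

def update_caption_params(ci: Interrogator, max_length: int, chunk_size: int, intermediate_count: int) -> None:
    # The library reads these from ci.config, not from the instance, so
    # setting e.g. ci.caption_max_length was a silent no-op.
    ci.config.caption_max_length = max_length
    ci.config.chunk_size = chunk_size
    ci.config.flavor_intermediate_count = intermediate_count
```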
- Add parse_florence_detections() and format_florence_response() to
vqa_detection for handling Florence-2 detection output formats
- Add bypass_sdpa_hijacks() context manager to devices.py for models
  incompatible with SageAttention or other SDPA hijacks (sketch after this list)
- Add OpenCLIP model offload support when caption_offload is enabled
- Remove caption_openclip_min_length from settings, API models, endpoints, and UI
(clip_interrogator library has no min_length support; parameter was never functional)
- Split vlm_prompts_florence into base Florence prompts and PromptGen-only prompts
(GENERATE_TAGS, Analyze, Mixed Caption require MiaoshouAI PromptGen fine-tune)
- Add 'promptgen' category to /vqa/prompts API endpoint
- Fix gaze detection: move DETECT_GAZE check before generic 'detect ' prefix
to prevent "Detect Gaze" matching as detect target="Gaze"
- Update test suite: remove min_length tests, fix min_flavors to use mode='best',
add acceptance-only notes, fix thinking trace detection, improve bracket/OCR tests,
split Florence/PromptGen test coverage
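One way such a context manager can be built (a sketch; the actual devices.py implementation may track the hijack differently, but a module-level save like this is also what makes the removed `global` statement unnecessary):

```python
from contextlib import contextmanager
import torch

# Saved at import time, before any SDPA hijack is installed
_original_sdpa = torch.nn.functional.scaled_dot_product_attention

@contextmanager
def bypass_sdpa_hijacks():
    # Temporarily restore stock SDPA for models that misbehave under
    # SageAttention or other scaled_dot_product_attention replacements.
    hijacked = torch.nn.functional.scaled_dot_product_attention
    torch.nn.functional.scaled_dot_product_attention = _original_sdpa
    try:
        yield
    finally:
        torch.nn.functional.scaled_dot_product_attention = hijacked
```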
Update cli/test-caption-api.py:
- Update test structure for new caption API endpoints
- Fix Moondream gaze detection test prompt to use 'Detect Gaze'
instead of 'Where is the person looking?' to match handler trigger
- Improve test result categorization and tracking
- Rename cli/api-interrogate.py to cli/api-caption.py
- Update cli/options.py, cli/process.py for new module paths
- Update cli/test-tagger.py for caption module imports
Move all caption-related modules from modules/interrogate/ to modules/caption/
for better naming consistency:
- Rename deepbooru, deepseek, joycaption, joytag, moondream3, openclip, tagger,
vqa, vqa_detection, waifudiffusion modules
- Add new caption.py dispatcher module
- Remove old interrogate.py (functionality moved to caption.py)
- Update cli/api-interrogate.py to use /sdapi/v1/tagger for DeepBooru
- Handle tagger response format, which is either a scores dict or a tags string (helper sketch below)
- Remove DeepBooru test from interrogate endpoint tests
- Update API model descriptions to reference tagger for anime tagging
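Normalizing the two response shapes in the CLI is a small helper (hypothetical name):

```python
def tags_from_response(caption) -> str:
    # /sdapi/v1/tagger returns either a {tag: score} dict (show_scores=True)
    # or a preformatted tag string; normalize both to a comma-separated string.
    if isinstance(caption, dict):
        return ', '.join(sorted(caption, key=caption.get, reverse=True))
    return caption
```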
DeepBooru/DeepDanbooru should only be accessed via the tagger endpoint.
The interrogate endpoint is now exclusively for OpenCLIP/BLIP.
- Remove DeepDanbooru handling from post_interrogate
- Update docstring to reference tagger endpoint for anime tagging
- Simplify code by removing if/else branching
Add model architecture coverage tests:
- VQA model family detection for 19 architectures
- Florence special prompts test (<OD>, <OCR>, <CAPTION>, etc.)
- Moondream detection features test
- VQA architecture capabilities test
- Tagger model types and WD version comparison tests
Improve test validation:
- Add is_meaningful_answer() to reject degenerate responses like "." (sketch after this list)
- Verify parameters have actual effect (not just accepted)
- Show actual output traces in PASS/FAIL messages
- Fix prefill tests to verify keep_prefill behavior
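A plausible shape for the validator (thresholds are assumptions, not the actual test code):

```python
def is_meaningful_answer(text: str) -> bool:
    # Reject degenerate responses such as '.', whitespace, or bare
    # punctuation that would otherwise count as a PASS.
    stripped = text.strip()
    return len(stripped) > 2 and any(ch.isalnum() for ch in stripped)
```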
Add configurable timeout:
- Default timeout increased to 300s for slow models
- Add --timeout CLI argument for customization
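In the test script this is plain argparse plus a shared requests timeout (illustrative):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--timeout', type=int, default=300, help='per-request timeout in seconds; slow VLMs can need several minutes on first load')
args = parser.parse_args()

# every request then passes the shared timeout, e.g.:
# res = requests.post(url, json=payload, timeout=args.timeout)
```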
Other improvements:
- Add JoyCaption to recognized model families
- Reduce the set of BLIP models exercised to avoid reloading large models
- Better detection result validation for annotated images
Add prompt field to VQA endpoint and advanced settings to OpenCLIP endpoint
to achieve full parity between UI and API capabilities.
VLM endpoint changes:
- Add prompt field for custom text input (required for 'Use Prompt' task)
- Pass prompt to vqa.interrogate instead of hardcoded empty string
OpenCLIP endpoint changes:
- Add 7 optional per-request override fields: min_length, max_length,
chunk_size, min_flavors, max_flavors, flavor_count, num_beams
- Add get_clip_setting() helper for override support in openclip.py (sketch after this list)
- Apply overrides via update_interrogate_params() before interrogation
All new fields are optional with None defaults for backwards compatibility.
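The override helper likely reduces to a request-first lookup. A hedged sketch; the shared.opts option naming here is an assumption:

```python
from modules import shared  # SD.Next server-side options

def get_clip_setting(req, name: str, default=None):
    # Per-request override wins; otherwise fall back to the server-wide
    # setting (option key naming is assumed, not the actual schema).
    value = getattr(req, name, None)
    if value is not None:
        return value
    return getattr(shared.opts, f'interrogate_clip_{name}', default)
```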
Update API model field descriptions to match the hints in locale_en.json
for consistency between UI and API documentation.
Updated models:
- ReqInterrogate: clip_model, blip_model, mode
- ReqVQA: model, question, system
- ReqTagger: model, threshold, character_threshold, max_tags,
include_rating, sort_alpha, use_spaces, escape_brackets,
exclude_tags, show_scores
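These description strings map directly onto Pydantic Field metadata. An illustrative fragment with assumed defaults; the authoritative versions live in the API models:

```python
from typing import Optional
from pydantic import BaseModel, Field

class ReqTagger(BaseModel):
    model: str = Field(default='wd-v1-4-vit-tagger-v2', description='WaifuDiffusion/DeepBooru tagger model to use')
    threshold: float = Field(default=0.35, description='Minimum confidence for general tags')
    character_threshold: float = Field(default=0.85, description='Minimum confidence for character tags')
    max_tags: Optional[int] = Field(default=None, description='Limit the number of returned tags')
```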
Comprehensive test script for all Caption API endpoints:
- GET/POST /sdapi/v1/interrogate (OpenCLIP/DeepBooru)
- POST /sdapi/v1/vqa (VLM captioning)
- GET /sdapi/v1/vqa/models, /sdapi/v1/vqa/prompts
- POST /sdapi/v1/tagger
- GET /sdapi/v1/tagger/models
Usage: python cli/test-caption-api.py [--url URL] [--image PATH]
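A minimal call against one of these endpoints looks like the following; payload fields are illustrative, the script holds the authoritative versions:

```python
import base64
import requests

with open('test.png', 'rb') as f:  # any local test image
    image = base64.b64encode(f.read()).decode()

res = requests.post('http://127.0.0.1:7860/sdapi/v1/tagger', json={'image': image}, timeout=300)
print(res.json())
```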
Add comprehensive caption/interrogate API with documentation:
- GET /sdapi/v1/interrogate: List available interrogation models
- POST /sdapi/v1/interrogate: Interrogate with OpenCLIP/BLIP/DeepDanbooru
- POST /sdapi/v1/vqa: Caption with Vision-Language Models (VLM)
- GET /sdapi/v1/vqa: List available VLM models
- POST /sdapi/v1/vqa/batch: Batch caption multiple images
- POST /sdapi/v1/tagger: Tag images with WaifuDiffusion/DeepBooru
Updates:
- Add detailed docstrings with usage examples
- Fix analyze_image response parsing for Gradio update dicts (unwrap sketch below)
- Add request/response models for all endpoints
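Gradio handlers can return update dicts instead of raw values; the unwrap the analyze_image fix needs looks like this (sketch):

```python
def unwrap_gradio_update(value):
    # gr.update(...) returns {'__type__': 'update', 'value': ...}; extract
    # the payload before building the API response.
    if isinstance(value, dict) and value.get('__type__') == 'update':
        return value.get('value')
    return value
```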
Add comprehensive tooltips to Caption tab UI elements in locale_en.json:
- Add new "llm" section for shared LLM/VLM parameters:
System prompt, Prefill, Top-K, Top-P, Temperature, Num Beams,
Use Samplers, Thinking Mode, Keep Thinking Trace, Keep Prefill
- Add new "caption" section for caption-specific settings:
VLM, OpenCLIP, Tagger tab labels and all their parameters
including thresholds, tag formatting, batch options
- Consolidate accordion labels in ui_caption.py:
"Caption: Advanced Options" and "Caption: Batch" shared across
VLM, OpenCLIP, and Tagger tabs (localized to "Advanced Options"
and "Batch" in UI)
- Remove duplicate entries from missing section