- Add use_safetensors=True to all 16 from_pretrained model calls to
avoid downloading redundant .bin weight files alongside the safetensors
files (sketched after this list)
- Add a device property to the JoyTag VisionModel so move_model can
relocate it to CUDA (fixes 'ViT' object has no attribute 'device';
property sketched below)
- Fix a Pix2Struct dtype mismatch by casting floating-point inputs to
the model dtype while preserving integer tensor types (cast sketched below)
- Patch AutoConfig.register with exist_ok=True during Ovis loading to
handle duplicate aimv2 registration on model reload (patch sketched below)
- Detect the Qwen VL fine-tune architecture from the config model_type
instead of the repo name, fixing ToriiGate and similar third-party
fine-tunes (detection sketched below)
- Change UI default task from Short Caption to Normal Caption, and
preserve it on model switch instead of resetting to Use Prompt
- Add dual-prefill testing across 5 VQA test methods using a shared
_check_prefill helper (one possible shape sketched below)
- Fix pre-existing ruff W605 (invalid escape sequence) in the
strip_think_xml_tags docstring (fix sketched below)
- Remove caption_openclip_min_length from settings, API models, endpoints, and UI
(clip_interrogator library has no min_length support; parameter was never functional)
- Split vlm_prompts_florence into base Florence prompts and PromptGen-only prompts
(GENERATE_TAGS, Analyze, Mixed Caption require MiaoshouAI PromptGen fine-tune)
- Add 'promptgen' category to /vqa/prompts API endpoint
- Fix gaze detection: check DETECT_GAZE before the generic 'detect '
prefix so "Detect Gaze" no longer matches as a generic detect with
target="Gaze" (ordering sketched below)
- Update test suite: remove min_length tests, fix min_flavors to use mode='best',
add acceptance-only notes, fix thinking trace detection, improve bracket/OCR tests,
split Florence/PromptGen test coverage
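
For the use_safetensors change, a minimal sketch (the repo id is a
placeholder; each of the 16 loaders passes the same flag):

```python
from transformers import AutoModelForVision2Seq

# use_safetensors=True makes transformers resolve only *.safetensors weights
# instead of also pulling the duplicate pytorch_model.bin files from the hub
model = AutoModelForVision2Seq.from_pretrained(
    "org/some-vlm",  # placeholder repo id
    use_safetensors=True,
)
```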
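For the JoyTag fix, one way to expose a device property on the wrapper
(only the property is the point; the class body is elided):

```python
import torch
from torch import nn

class VisionModel(nn.Module):  # JoyTag ViT wrapper; body elided
    @property
    def device(self) -> torch.device:
        # nn.Module defines no .device attribute; derive it from the first
        # parameter so callers such as move_model can read and compare it
        return next(self.parameters()).device
```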
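For the Pix2Struct fix, a sketch of the cast (the helper name is
hypothetical; the rule is the one above: floats follow the model dtype,
integers keep theirs):

```python
import torch

def cast_to_model_dtype(inputs: dict, dtype: torch.dtype) -> dict:
    # Float tensors (e.g. flattened_patches) must match the model dtype,
    # while integer tensors (e.g. attention_mask) must keep their type
    return {
        key: value.to(dtype)
        if torch.is_tensor(value) and value.is_floating_point()
        else value
        for key, value in inputs.items()
    }
```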
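For the Ovis fix, a sketch of the temporary patch (load_ovis is a
hypothetical stand-in for the actual load call):

```python
from functools import partial
from transformers import AutoConfig

# Ovis's remote code registers the "aimv2" config type at import time,
# so a second load of the same model raises a duplicate-registration error
_original_register = AutoConfig.register
AutoConfig.register = partial(_original_register, exist_ok=True)
try:
    model = load_ovis()  # hypothetical stand-in for the real from_pretrained call
finally:
    AutoConfig.register = _original_register  # restore the unpatched method
```

Restoring the original method in the finally block keeps the relaxed
exist_ok behavior scoped to the Ovis load only.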
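For the fine-tune detection, a sketch reading model_type from the config
(the accepted set of model_type values is an assumption):

```python
from transformers import AutoConfig

def is_qwen_vl(repo_id: str) -> bool:
    # ToriiGate and similar fine-tunes lack "qwen" in the repo name,
    # but their config.json still declares the base architecture
    config = AutoConfig.from_pretrained(repo_id, trust_remote_code=True)
    return getattr(config, "model_type", "") in {"qwen2_vl", "qwen2_5_vl"}
```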
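For the dual-prefill tests, one plausible shape for the shared helper,
assuming "dual" means asserting both the keep-prefill and strip-prefill
behaviors (only the _check_prefill name comes from the change):

```python
def _check_prefill(self, answer: str, prefill: str, keep_prefill: bool):
    # Shared by the five VQA tests: with Keep Prefill on, the response
    # must start with the prefill text; with it off, it must be stripped
    if keep_prefill:
        self.assertTrue(answer.startswith(prefill), answer[:80])
    else:
        self.assertFalse(answer.startswith(prefill), answer[:80])
```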
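For the W605 fix: ruff flags escape sequences like a bare \s inside a
normal docstring; making the docstring raw resolves it (the docstring
text here is illustrative):

```python
def strip_think_xml_tags(text: str) -> str:
    r"""Strip <think>...</think> blocks and the trailing \s* whitespace."""
    # The r-prefix stops \s from being parsed as an invalid string escape
    ...
```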
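For the gaze fix, a sketch of the corrected ordering (function name and
return shape are hypothetical):

```python
def parse_detect_task(prompt: str):
    p = prompt.strip().lower()
    if p == "detect gaze":        # the specific task must be checked ...
        return ("DETECT_GAZE", None)
    if p.startswith("detect "):   # ... before the generic prefix fires
        return ("DETECT", p[len("detect "):])
    return (None, None)
```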
Add comprehensive tooltips to Caption tab UI elements in locale_en.json:
- Add new "llm" section for shared LLM/VLM parameters:
System prompt, Prefill, Top-K, Top-P, Temperature, Num Beams,
Use Samplers, Thinking Mode, Keep Thinking Trace, Keep Prefill
- Add new "caption" section for caption-specific settings:
VLM, OpenCLiP, Tagger tab labels and all their parameters
including thresholds, tag formatting, batch options
- Consolidate accordion labels in ui_caption.py:
"Caption: Advanced Options" and "Caption: Batch" shared across
VLM, OpenCLiP, and Tagger tabs (localized to "Advanced Options"
and "Batch" in UI)
- Remove duplicate entries from missing section
Hide all CLiP, VLM, and Tagger settings from the Settings > Interrogate page
while keeping them in shared.opts for persistence. The Caption tab UI becomes
the single control point, with change handlers that save directly to config.
Changes:
- Hide OpenCLiP, VLM, and Tagger settings with visible=False
- Add change handlers to save settings when UI controls change (wiring
sketched after this list)
- Rename "Booru Tags" tab to "Tagger", update choice labels
- Update interrogate.py to use unified tagger interface with all settings
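
A minimal sketch of the change-handler wiring, assuming an A1111-style
shared.opts store (the option name and the set/save signatures are
assumptions):

```python
import gradio as gr
from modules import shared

def save_option(name: str):
    def handler(value):
        # Write straight to config so the hidden Settings > Interrogate
        # entries stay in sync with the Caption tab controls
        shared.opts.set(name, value)
        shared.opts.save(shared.config_filename)
    return handler

threshold = gr.Slider(0.0, 1.0, value=0.35, label="Threshold")
threshold.change(fn=save_option("interrogate_tagger_threshold"), inputs=[threshold], outputs=[])
```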
Add DeepBooru as a model option alongside WD14 models in the Booru Tags
tab, with dynamic UI that disables inapplicable controls.
Changes:
- Create modules/interrogate/tagger.py as a unified adapter module
(shape sketched after this list)
- Add batch, load/unload, get_models functions to deepbooru.py
- Update ui_caption.py to use unified tagger interface
- Consolidate shared tagger settings in shared.py
- Add implementation plan for future settings consolidation
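
A sketch of the adapter's shape (only tagger.py and deepbooru.py come
from the change; the wd14 module and the function signatures are
assumptions):

```python
# modules/interrogate/tagger.py -- unified adapter, sketched
from modules import deepbooru
from modules.interrogate import wd14  # assumed module name

def get_models() -> list[str]:
    return ["DeepBooru"] + wd14.get_models()

def tag(image, model_name: str, **kwargs) -> str:
    # One entry point; route to the backend the model name implies
    if model_name == "DeepBooru":
        return deepbooru.tag(image, **kwargs)
    return wd14.tag(image, model=model_name, **kwargs)
```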
UI behavior (dropdown handling sketched after this list):
- Model dropdown shows DeepBooru + all WD14 models
- Character threshold and include rating disabled for DeepBooru
- All controls re-enable when WD14 model selected
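
A sketch of the dropdown handling (component names and the model list
are placeholders):

```python
import gradio as gr

with gr.Blocks() as demo:
    model = gr.Dropdown(choices=["DeepBooru", "wd14-vit-v2"], value="DeepBooru", label="Model")
    char_threshold = gr.Slider(0.0, 1.0, value=0.85, label="Character threshold")
    include_rating = gr.Checkbox(label="Include rating")

    def on_model_change(name: str):
        # Character threshold and include-rating only apply to WD14 models
        wd14 = name != "DeepBooru"
        return gr.update(interactive=wd14), gr.update(interactive=wd14)

    model.change(fn=on_model_change, inputs=[model], outputs=[char_threshold, include_rating])
```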
Add SmilingWolf's WD14/WaifuDiffusion tagger models for anime/illustration
tagging as a new "Booru Tags" tab in the Caption panel.
- Support 9 models (v2 and v3 variants) via HuggingFace
- ONNX backend chosen because the safetensors v3 variants exhibit
unacceptable accuracy loss (inference sketched after this list)
- Separate thresholds for general/character tags
- Batch processing with progress bar
- Consolidate debug env var to SD_INTERROGATE_DEBUG
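
A hedged sketch of the dual-threshold scoring (the file name, tag index
layout, and default cutoffs are placeholders):

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("wd14-v3.onnx", providers=["CPUExecutionProvider"])

def score(batch: np.ndarray, tags: list[str], general_idx: list[int],
          character_idx: list[int], general_threshold: float = 0.35,
          character_threshold: float = 0.85) -> list[str]:
    # One forward pass, then separate cutoffs for the two tag groups
    input_name = session.get_inputs()[0].name
    probs = session.run(None, {input_name: batch})[0][0]
    general = [tags[i] for i in general_idx if probs[i] >= general_threshold]
    characters = [tags[i] for i in character_idx if probs[i] >= character_threshold]
    return general + characters
```

Character tags get a stricter cutoff because false-positive character
names are more disruptive than false-positive general tags.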
Refactor VQA module from module-level globals to a VQA class singleton
pattern with self-contained per-model loading methods.
Changes:
- Add VQA class with model/processor state and detection data storage
(shape sketched after this list)
- Extract load methods for clean model pre-loading via UI
- Change interrogate to return a string only; detection data is stored
on the instance
- Add vqa_draw.py for bounding box/point annotation utilities (stub;
further transfer of drawing functions to follow)
- Update moondream3.py to store detection data on VQA singleton
- Update endpoints.py and ui_caption.py for new return type
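
A sketch of the singleton's shape (attribute and method names beyond
those in the list above are assumptions):

```python
class VQA:
    def __init__(self):
        self.model = None
        self.processor = None
        self.detection = None  # boxes/points from the last interrogate call

    def load_moondream3(self):
        ...  # self-contained per-model loader, callable from the UI

    def interrogate(self, image, question: str) -> str:
        answer, detection = self._generate(image, question)  # hypothetical dispatch
        self.detection = detection  # kept on the instance, no longer returned
        return answer  # endpoints and UI now receive a plain string

    def _generate(self, image, question: str):
        ...  # route to the loaded model; elided

vqa = VQA()  # module-level singleton replacing the old globals
```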
- Add interrogate_vlm_thinking_mode setting to save checkbox state
- Update ui_caption to restore Thinking Mode preference on load
- Add blank line before 'Answer:' label for visual separation
- Remove '\n\n' replacement in clean() that stripped blank lines
- Fix Qwen reasoning detection when the <think> tag is in the prompt
rather than the response (handling sketched after this list)
- Add reasoning icon to Moondream 2 and 3 model names
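
For the Qwen fix, a sketch covering both placements (the function shape
is hypothetical; the fixed case is a <think> tag supplied by the prompt,
leaving only </think> in the response):

```python
def split_reasoning(prompt: str, response: str):
    # Normal case: the model emitted its own <think>...</think> block
    if "<think>" in response:
        thinking, _, answer = response.partition("</think>")
        return thinking.replace("<think>", "").strip(), answer.strip()
    # Fixed case: <think> came from the prompt, so the response carries
    # only the closing tag; previously this went undetected
    if "<think>" in prompt and "</think>" in response:
        thinking, _, answer = response.partition("</think>")
        return thinking.strip(), answer.strip()
    return None, response
```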
- Add load_model() function to pre-load the VLM into memory
- Add unload_model() function to free the VLM from memory (both
sketched below)
- Add Load/Unload buttons to Caption tab UI
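
A sketch of the pair (vqa is the singleton from the refactor above; the
load entry point is an assumption):

```python
import gc
import torch

def load_model(model_name: str):
    # Pre-load so the first caption request doesn't pay the load cost
    vqa.load(model_name)  # hypothetical entry point on the VQA singleton

def unload_model():
    vqa.model = None
    vqa.processor = None
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # release cached VRAM after dropping references
```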