Commit Graph

Author	SHA1	Message	Date
vladmandic	3a65d561a7	add google-veo-3.1 Signed-off-by: vladmandic <mandic00@live.com>	2025-12-09 19:14:08 +01:00
vladmandic	acca58f50c	add kandinsky5 Signed-off-by: vladmandic <mandic00@live.com>	2025-12-09 09:47:22 +01:00
vladmandic	f91af19094	update video models Signed-off-by: vladmandic <mandic00@live.com>	2025-12-09 09:22:28 +01:00
Disty0	1c2a81ee2d	Make SDNQDequantizer a dataclass	2025-12-08 22:29:45 +03:00
vladmandic	3f161b5532	lint moondream Signed-off-by: vladmandic <mandic00@live.com>	2025-12-08 18:16:00 +01:00
vladmandic	69f0d6bf5d	lint Signed-off-by: vladmandic <mandic00@live.com>	2025-12-08 18:12:47 +01:00
Vladimir Mandic	5a1d60e1b9	Merge pull request #4448 from CalamitousFelicitousness/feat/vqa-prefill-thinking-moondream3 VQA Refactor	2025-12-08 17:43:48 +01:00
Disty0	d4e2cbb826	SDNQ fix torch.compile always being active	2025-12-08 18:15:08 +03:00
Disty0	3ae7ecdbad	SDNQ fix quantization_device getting ignored on post load quant	2025-12-08 01:29:52 +03:00
Disty0	064b64c76c	cleanup	2025-12-08 01:14:19 +03:00
Disty0	6e05a12a49	SDNQ post process pre-quants after load	2025-12-08 01:08:53 +03:00
Disty0	0835ca6f66	SDNQ add explicit model.quantization_method = QuantizationMethod.SDNQ	2025-12-08 00:46:40 +03:00
Disty0	7a6356f8eb	SDNQ fix transformers v5 and check for torch._dynamo.config.disable	2025-12-08 00:36:15 +03:00
Disty0	4f90054bf7	SDNQ transformers v5 support	2025-12-07 21:37:41 +03:00
Vladimir Mandic	469962cc9c	Merge pull request #4453 from awsr/python-datetime-compat Fix timestamp formatting for thumbnails	2025-12-07 06:49:38 +01:00
awsr	f01e977695	Fix timestamp formatting for thumbnails	2025-12-06 18:34:15 -08:00
vladmandic	7bd04e0b5c	add /detailers api endpoint Signed-off-by: vladmandic <mandic00@live.com>	2025-12-06 12:33:52 +01:00
CalamitousFelicitousness	a51e1501d6	fix(vqa): no moondream3 compile during explicit load - Initialize KV caches before moving model to device - Disable flex_attention decoding to avoid torch.compile hang - Remove unused compile step (controlled by cuda_compile setting) The flex_attention's create_block_mask triggers torch compilation which can hang the system when called during model preload.	2025-12-06 02:26:34 +00:00
CalamitousFelicitousness	7714f71994	feat(vqa): un/load support and extract detection Make external VQA handlers (moondream3, joytag, joycaption, deepseek) compatible with VQA load/unload mechanism for consistent model lifecycle. - Added vqa_detection.py, add shared detection helpers - Add load and unload functions to all external handlers - Replace device_map="auto" with sd_models.move_model in joycaption - Update dispatcher and moondream handlers to use shared helpers	2025-12-05 23:52:02 +00:00
CalamitousFelicitousness	5193285bc7	refactor(vqa): convert to class-based singleton Refactor VQA module from module-level globals to a VQA class singleton pattern with self-contained per-model loading methods. Changes: - Add VQA class with model/processor state and detection data storage - Extract load methods for clean model pre-loading via UI - Interrogate to return string only; store detection data on instance - Add vqa_draw.py for bounding box/point annotation utilities Stub, further transfer of drawing functions to follow - Update moondream3.py to store detection data on VQA singleton - Update endpoints.py and ui_caption.py for new return type	2025-12-05 20:53:18 +00:00
Disty0	1cfb61809f	cleanup	2025-12-05 18:40:49 +03:00
Disty0	5b86bef796	SDNQ add longcat keys	2025-12-05 18:37:20 +03:00
CalamitousFelicitousness	d1b1d574a6	fix(vqa): add graceful error for empty "Use Prompt" task Replace silent fallback to "Describe the image" with explicit error when user selects "Use Prompt" but leaves the prompt field empty. Follows the same pattern as missing image validation.	2025-12-05 01:48:07 +00:00
CalamitousFelicitousness	a8a9e6d836	fix(vqa): separate Moondream 2 and 3 task prompts Moondream 3 does not support gaze detection (detect_gaze method), so "Detect Gaze" task is now only shown for Moondream 2.	2025-12-05 01:38:28 +00:00
CalamitousFelicitousness	195161c436	fix(settings): hide VLM prefill/thinking settings from Settings UI These settings are accessible from the Caption tab and can be saved as defaults via "Set UI defaults", so they don't need to appear in Settings > Interrogate.	2025-12-05 00:54:24 +00:00
CalamitousFelicitousness	2b6226b62b	feat(vqa): persist thinking mode and improve reasoning output formatting - Add interrogate_vlm_thinking_mode setting to save checkbox state - Update ui_caption to restore Thinking Mode preference on load - Add blank line before 'Answer:' label for visual separation - Remove '\n\n' replacement in clean() that stripped blank lines - Fix Qwen reasoning detection when <think> tag is in prompt, not response - Add reasoning icon to Moondream 2 and 3 model names	2025-12-05 00:00:25 +00:00
CalamitousFelicitousness	a4b5e84a13	feat(vqa): enhance Moondream 2 with reasoning mode, gaze detection, and annotations - Add thinking_mode/reasoning parameter to enable reasoning mode - Add Detect Gaze task with placeholder hint - Parse point/detect results to return annotation data for visualization - Handle keep_thinking setting: format as "Reasoning:\n...\nAnswer:\n..." or discard - Add comprehensive debug logging throughout handler	2025-12-05 00:00:25 +00:00
CalamitousFelicitousness	c75a09be83	fix(vqa): handle Moondream point and detect tasks Add handlers for "Point at..." and "Detect..." tasks in moondream() that were falling through to answer_question() and failing.	2025-12-05 00:00:25 +00:00
CalamitousFelicitousness	506515b018	feat(vqa): add load/unload model buttons to Caption tab - Add load_model() function to pre-load VLM into memory - Add unload_model() function to free VLM from memory - Add Load/Unload buttons to Caption tab UI	2025-12-05 00:00:25 +00:00
CalamitousFelicitousness	a90d85ddfd	feat(ui): add dynamic task selection based on VLM model - Rename "Predefined question" to "Task" - Task dropdown updates choices when model changes - Prompt placeholder updates based on selected task - Model-specific tasks: Florence-2 gets detection tasks, Moondream gets point/detect	2025-12-05 00:00:25 +00:00
CalamitousFelicitousness	4df6aa7944	fix(ui): set prefill text to empty by default	2025-12-05 00:00:25 +00:00
CalamitousFelicitousness	0d88fcd396	feat(ui): add prefill and thinking controls to Caption tab Add minimal UI controls to expose new VQA functionality: - Prefill Text input for guiding VLM responses - Thinking Mode checkbox for reasoning models - Keep Thinking Trace checkbox for output retention - Keep Prefill checkbox for output retention - Annotated Image output panel for detection visualization - Updated button handlers to pass new parameters	2025-12-05 00:00:24 +00:00
CalamitousFelicitousness	c2810dfee2	fix(api): update VQA API endpoint for tuple return format Update interrogate API endpoint to handle the new (text, image) tuple return format from VQA interrogate function.	2025-12-05 00:00:24 +00:00
CalamitousFelicitousness	27fa48cc99	feat(vqa): major VQA handler refactor with prefill, thinking, and visualization Comprehensive overhaul of the VQA interrogation system including: - Prefill text support for guiding VLM responses - Thinking mode support with tag cleanup/retention - Dynamic prompt/task selection based on model type - Bounding box visualization for detection results - Debug infrastructure (SD_VQA_DEBUG env var) - New model support: MiMo-VL, Nidum Gemma, Allura Gemma - Model-specific prompt lists (Florence, Moondream)	2025-12-05 00:00:24 +00:00
CalamitousFelicitousness	0a322c0faf	feat(vqa): add Moondream 3 Preview handler Add support for Moondream 3 Preview VLM with: - Text query, caption, point, and detect capabilities - Bounding box visualization for object detection - Max pixels setting for resolution control - Device offloading support	2025-12-05 00:00:24 +00:00
CalamitousFelicitousness	c024c0c9c6	feat(settings): add VLM prefill and thinking retention options Add new VLM configuration options: - interrogate_vlm_keep_prefill: Keep prefill text in output - interrogate_vlm_keep_thinking: Keep reasoning trace in output Also adjust defaults: - Change interrogate_clip_flavor_count: 16 -> 1024 with updated range - Change interrogate_vlm_prompt default to first item ("Use Prompt")	2025-12-05 00:00:24 +00:00
CalamitousFelicitousness	85cd222793	fix(vqa): sort CLiP analysis results and add text output Improvements to the OpenCLIP interrogation: - Sort all ranking dicts by similarity score (descending) - Add format_category() helper for text formatting - Add formatted text output for CLIP labels textbox - Return additional text update in analyze_image()	2025-12-02 21:48:09 +00:00
CalamitousFelicitousness	eb832a4850	fix(vqa): respect offload setting in JoyCaption, add max_pixels Two fixes for the JoyCaption handler: - Only offload model if shared.opts.interrogate_offload is True - Add max_pixels=1024*1024 to AutoProcessor for consistent image handling	2025-12-02 21:46:09 +00:00
CalamitousFelicitousness	766cb49928	feat(ui): add vision and reasoning symbols, fix dropdown fonts Add new Font Awesome symbols for model capability indicators: - vision symbol (eye icon) for vision-capable VLM models - reasoning symbol (lightbulb icon) for thinking/reasoning models Also fix dropdown font styling by adding NotoSans font-family.	2025-12-02 21:43:13 +00:00
vladmandic	d3a2f6c7ed	fix loading local prequant models Signed-off-by: vladmandic <mandic00@live.com>	2025-12-02 20:53:19 +01:00
vladmandic	0ad40d2b8b	lint Signed-off-by: vladmandic <mandic00@live.com>	2025-12-02 12:25:04 +01:00
vladmandic	39bced0987	Merge branch 'dev' of https://github.com/vladmandic/sdnext into dev	2025-12-02 10:40:31 +01:00
vladmandic	903d47f9e6	add zimage and f2 to lora overrides Signed-off-by: vladmandic <mandic00@live.com>	2025-12-02 10:40:27 +01:00
Vladimir Mandic	3b4f909862	Merge pull request #4436 from CalamitousFelicitousness/runai-update Update runai-model-streamer logging integration	2025-12-02 03:59:38 -05:00
Vladimir Mandic	1673380b94	Merge pull request #4430 from awsr/fix_show_progress show_progress requires "full", "minimal", or "hidden"	2025-12-02 03:50:34 -05:00
Vladimir Mandic	de3ebf470d	Merge pull request #4428 from awsr/revert-for-now Revert changes that require at least Python version 3.12	2025-12-02 03:49:20 -05:00
CalamitousFelicitousness	55c089ae48	Update runai-model-streamer logging integration - Remove stdout redirect monkeypatch (fixed in runai v0.15.1 via PR #97) - Add RUNAI_STREAMER_LOG_LEVEL controlled by SD_LOAD_DEBUG - Add one-time runai config log when hijack is activated - Add `loader=runai\|default` to model loading logs - Remove per-file logging clutter from sd_hijack_safetensors.py	2025-12-02 02:01:51 +00:00
Disty0	7aa1bfdc70	Add get_modules_to_not_convert from transformers v5	2025-12-02 01:01:51 +03:00
Disty0	d9bc31e7da	Cleanup	2025-11-29 01:46:04 +03:00
Disty0	01a0f6b356	Warn and disable quantized matmul if triton is not available	2025-11-29 01:34:54 +03:00

7603 Commits (3a65d561a70f60d2c67f607d2b00a944c7c427ed)