61 Commits (3a65d561a70f60d2c67f607d2b00a944c7c427ed)

Author SHA1 Message Date
vladmandic 69f0d6bf5d lint
Signed-off-by: vladmandic <mandic00@live.com>
2025-12-08 18:12:47 +01:00
CalamitousFelicitousness 7714f71994 feat(vqa): un/load support and extract detection
Make external VQA handlers (moondream3, joytag, joycaption, deepseek)
compatible with VQA load/unload mechanism for consistent model lifecycle.

- Add vqa_detection.py with shared detection helpers
- Add load and unload functions to all external handlers
- Replace device_map="auto" with sd_models.move_model in joycaption
- Update dispatcher and moondream handlers to use shared helpers
2025-12-05 23:52:02 +00:00
CalamitousFelicitousness 5193285bc7 refactor(vqa): convert to class-based singleton
Refactor VQA module from module-level globals to a VQA class singleton
pattern with self-contained per-model loading methods.

Changes:
- Add VQA class with model/processor state and detection data storage
- Extract load methods for clean model pre-loading via UI
- Make interrogate return a string only; store detection data on the instance
- Add vqa_draw.py for bounding box/point annotation utilities
  (stub; further transfer of drawing functions to follow)
- Update moondream3.py to store detection data on VQA singleton
- Update endpoints.py and ui_caption.py for new return type
2025-12-05 20:53:18 +00:00
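As a rough illustration of the refactor described in 5193285bc7, the sketch below shows what a class-based singleton that keeps model/processor state and detection data on the instance might look like. Every name here (`get_instance`, `load`, `unload`, `interrogate`, `last_detection_data`) is hypothetical and chosen for illustration; none of it is taken from the actual commit, and the model loading is replaced by placeholders.

```python
from typing import Any, Optional


class VQA:
    """Hypothetical sketch of a VQA singleton; not the project's actual class."""

    _instance: Optional["VQA"] = None

    def __init__(self) -> None:
        self.model: Any = None
        self.processor: Any = None
        self.loaded_name: Optional[str] = None
        # Detection results are stored here instead of being returned.
        self.last_detection_data: Optional[dict] = None

    @classmethod
    def get_instance(cls) -> "VQA":
        # Create the shared instance lazily on first access.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def load(self, name: str) -> None:
        # Placeholder: a real loader would build the model and processor.
        if self.loaded_name != name:
            self.model = object()
            self.processor = object()
            self.loaded_name = name

    def unload(self) -> None:
        # Drop references so the memory can be reclaimed.
        self.model = None
        self.processor = None
        self.loaded_name = None

    def interrogate(self, question: str) -> str:
        # Placeholder answer; detection data stays on the instance.
        self.last_detection_data = {"boxes": []}
        return f"answer to: {question}"
```

A caller would read the returned string for display and, if needed, fetch `VQA.get_instance().last_detection_data` separately for annotation.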
CalamitousFelicitousness d1b1d574a6 fix(vqa): add graceful error for empty "Use Prompt" task
Replace silent fallback to "Describe the image" with explicit error
when user selects "Use Prompt" but leaves the prompt field empty.
Follows the same pattern as missing image validation.
2025-12-05 01:48:07 +00:00
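The validation described in d1b1d574a6 could be sketched as follows. This is a minimal illustration under assumptions: the function name `resolve_prompt`, the `(prompt, error)` return shape, and the error text are all hypothetical, and the real handler may report the error differently.

```python
from typing import Optional, Tuple


def resolve_prompt(task: str, prompt: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (prompt, error); exactly one of the two is None. Hypothetical helper."""
    if task == "Use Prompt":
        if not prompt or not prompt.strip():
            # Explicit error instead of silently falling back to a default prompt.
            return None, "Error: 'Use Prompt' selected but the prompt field is empty"
        return prompt.strip(), None
    # For any other task, the task name itself is used as the prompt.
    return task, None
```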
CalamitousFelicitousness a8a9e6d836 fix(vqa): separate Moondream 2 and 3 task prompts
Moondream 3 does not support gaze detection (detect_gaze method),
so "Detect Gaze" task is now only shown for Moondream 2.
2025-12-05 01:38:28 +00:00
CalamitousFelicitousness 2b6226b62b feat(vqa): persist thinking mode and improve reasoning output formatting
- Add interrogate_vlm_thinking_mode setting to save checkbox state
- Update ui_caption to restore Thinking Mode preference on load
- Add blank line before 'Answer:' label for visual separation
- Remove '\n\n' replacement in clean() that stripped blank lines
- Fix Qwen reasoning detection when <think> tag is in prompt, not response
- Add reasoning icon to Moondream 2 and 3 model names
2025-12-05 00:00:25 +00:00
CalamitousFelicitousness a4b5e84a13 feat(vqa): enhance Moondream 2 with reasoning mode, gaze detection, and annotations
- Add thinking_mode/reasoning parameter to enable reasoning mode
- Add Detect Gaze task with placeholder hint
- Parse point/detect results to return annotation data for visualization
- Handle keep_thinking setting: format as "Reasoning:\n...\nAnswer:\n..." or discard
- Add comprehensive debug logging throughout handler
2025-12-05 00:00:25 +00:00
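The keep_thinking handling mentioned in a4b5e84a13 might be approximated like this. The function name `format_response` and its parameters are assumptions made for the example; only the "Reasoning:\n...\nAnswer:\n..." layout comes from the commit message.

```python
def format_response(answer: str, reasoning: str = "", keep_thinking: bool = False) -> str:
    """Hypothetical helper: keep or discard reasoning text around an answer."""
    answer = answer.strip()
    reasoning = reasoning.strip()
    if keep_thinking and reasoning:
        # Layout taken from the commit message.
        return f"Reasoning:\n{reasoning}\nAnswer:\n{answer}"
    # Reasoning is discarded when keep_thinking is off or nothing was produced.
    return answer
```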
CalamitousFelicitousness c75a09be83 fix(vqa): handle Moondream point and detect tasks
Add handlers for "Point at..." and "Detect..." tasks in moondream()
that were falling through to answer_question() and failing.
2025-12-05 00:00:25 +00:00
CalamitousFelicitousness 506515b018 feat(vqa): add load/unload model buttons to Caption tab
- Add load_model() function to pre-load VLM into memory
- Add unload_model() function to free VLM from memory
- Add Load/Unload buttons to Caption tab UI
2025-12-05 00:00:25 +00:00
CalamitousFelicitousness 27fa48cc99 feat(vqa): major VQA handler refactor with prefill, thinking, and visualization
Comprehensive overhaul of the VQA interrogation system including:
- Prefill text support for guiding VLM responses
- Thinking mode support with tag cleanup/retention
- Dynamic prompt/task selection based on model type
- Bounding box visualization for detection results
- Debug infrastructure (SD_VQA_DEBUG env var)
- New model support: MiMo-VL, Nidum Gemma, Allura Gemma
- Model-specific prompt lists (Florence, Moondream)
2025-12-05 00:00:24 +00:00
Vladimir Mandic f2835499b1 kanvas bindings
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-07 12:21:48 -05:00
Vladimir Mandic 58581896f5 cleanup
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-26 10:01:24 -04:00
CalamitousFelicitousness 33f335a98c VQA class fix f-statement fix 2025-10-26 06:39:05 +00:00
CalamitousFelicitousness 25607693ca Merge branch 'dev' into qwen3-vl 2025-10-26 06:16:38 +00:00
CalamitousFelicitousness 80bb331169 Prompt enhance resizing and Qwen VL fix 2025-10-26 06:01:33 +00:00
CalamitousFelicitousness 3fc9efa9ee Add remaining Qwen3VL models up to 8B 2025-10-26 02:53:34 +00:00
CalamitousFelicitousness 1b80147881 Add Qwen3-VL-4B-Instruct 2025-10-25 22:12:20 +01:00
CalamitousFelicitousness c5d937b9c4 Fix typo in Qwen2.5 VL 4B to 3B
https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct was wrongly named 4B in the Captioning menu.
2025-10-25 20:26:38 +01:00
Vladimir Mandic 3e47f3dd9a video prompt enhance
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-05 20:17:32 -04:00
Disty0 81bb2b99ef update florence promptgen repo ids 2025-10-01 21:43:02 +03:00
Vladimir Mandic 22074f4727 cleanup vqa
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-01 12:02:55 -04:00
Vladimir Mandic 5d0a3e5e8a fix microsoft-florence
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-01 10:58:52 -04:00
Vladimir Mandic d351fdb98f add more job state updates and update history tab
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-09-13 10:54:04 -04:00
Vladimir Mandic 175e9cbe29 cleanup/refactor state history
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-09-12 16:12:45 -04:00
Vladimir Mandic d665ac254e add apple-fastvlm
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-09-05 14:25:37 -04:00
Vladimir Mandic 05dd0096c9 set default vqa model
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-09-04 08:38:29 -04:00
Vladimir Mandic 863e172aad add Qwen/Qwen2.5-VL-3B-Instruct
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-08-12 15:09:08 -04:00
Vladimir Mandic fa44521ea3 offload-never and offload-always per-module and new highvram profile
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-31 11:40:24 -04:00
Vladimir Mandic f243c35892 improve traceback display
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-21 19:07:35 -04:00
Vladimir Mandic 287c3600d7 torch compile for llm
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-20 12:07:30 -04:00
Vladimir Mandic c559e26616 add builtin framepack
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-08 15:47:07 -04:00
Vladimir Mandic b625884031 add gemma3n to caption/vlm and promptenhance
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-07 10:01:02 -04:00
Vladimir Mandic e8b5ea3847 major refactor: remove backend original
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-05 13:16:46 -04:00
Vladimir Mandic 1b4e1ff0ef enable quants for vlm-captioning
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-06-29 11:48:05 -04:00
Vladimir Mandic 78330142ae add moondream2, sdnq xyzgrid timing info
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-06-27 09:41:32 -04:00
Vladimir Mandic 5b486a6ef1 sdnq add xyz grid support, improve offloading compatibility
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-06-25 15:32:37 -04:00
Vladimir Mandic f0d81ee1e0 cleanup
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-05-10 08:28:19 -04:00
Vladimir Mandic 6489e4c37d prompt-enhance api support and img2img support
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-05-08 15:31:07 -04:00
Vladimir Mandic d12cfdb537 add vlm prompt enhancer
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-05-05 12:39:45 -04:00
Disty0 dca11dd806 Add jxl to image extension lists 2025-05-01 16:02:50 +03:00
Vladimir Mandic d1c3b97c65 add prompt enhance
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-28 14:05:28 -04:00
Vladimir Mandic 8bcc4527ea add vlm ByteDance/Sa2VA 1b and 4b
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-26 11:10:23 -04:00
Vladimir Mandic a91c95870d remote vae encode
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-15 17:03:37 -04:00
Vladimir Mandic dbfd59434f add gemma3
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-15 15:30:57 -04:00
Disty0 30afbd036d Fix circular import between sd_models.py and shared.py 2025-03-11 17:03:33 +03:00
Vladimir Mandic 041f0bbf97 regenerate locales and rerun all translations
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-18 10:32:39 -05:00
Vladimir Mandic 1ee151b533 update diffusers
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-18 08:23:33 -05:00
Vladimir Mandic b6990151c4 caption tab modernui support
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-17 10:59:22 -05:00
Vladimir Mandic a4b3dc269e modernize clip interrogate
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-16 19:37:09 -05:00
Vladimir Mandic f3dd9b9646 vlm advanced settings and batch processing
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-15 14:34:28 -05:00