- Add _load_blip_model helper with explicit cache_dir so downloads
go to hfcache_dir instead of the default HF cache
- Pre-load BLIP model/processor before creating Interrogator config
to control download location and avoid redundant loads
- Set clip_model_path on config for CLIP model cache location
- Add cache_dir to Moondream model and tokenizer loading
- Add use_safetensors=True to all 16 model from_pretrained calls to
avoid downloading redundant .bin files alongside safetensors
- Add device property to JoyTag VisionModel so move_model can relocate
it to CUDA (fixes 'ViT object has no attribute device')
- Fix Pix2Struct dtype mismatch by casting float inputs to model dtype
while preserving integer tensor types
- Patch AutoConfig.register with exist_ok=True during Ovis loading to
handle duplicate aimv2 registration on model reload
- Detect Qwen VL fine-tune architecture from config model_type instead
of repo name, fixing ToriiGate and similar third-party fine-tunes
- Change UI default task from Short Caption to Normal Caption, and
preserve it on model switch instead of resetting to Use Prompt
- Add dual-prefill testing across 5 VQA test methods using a shared
_check_prefill helper
- Fix pre-existing ruff W605 (invalid escape sequence) in strip_think_xml_tags docstring
- Remove caption_openclip_min_length from settings, API models, endpoints, and UI
(clip_interrogator library has no min_length support; parameter was never functional)
- Split vlm_prompts_florence into base Florence prompts and PromptGen-only prompts
(GENERATE_TAGS, Analyze, Mixed Caption require MiaoshouAI PromptGen fine-tune)
- Add 'promptgen' category to /vqa/prompts API endpoint
- Fix gaze detection: move DETECT_GAZE check before generic 'detect ' prefix
so "Detect Gaze" is no longer parsed as a generic detect with target="Gaze"
- Update test suite: remove min_length tests, fix min_flavors to use mode='best',
add acceptance-only notes, fix thinking trace detection, improve bracket/OCR tests,
split Florence/PromptGen test coverage
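
The gaze detection fix above comes down to match ordering: the specific task has to be tested before the generic 'detect ' prefix, because "Detect Gaze" satisfies both. A minimal sketch of that ordering follows. The function name, the return shape, and matching on the literal string "detect gaze" are assumptions for illustration, not the actual code in the vqa module:

```python
from typing import Optional, Tuple


def parse_detection_prompt(prompt: str) -> Tuple[str, Optional[str]]:
    """Classify a prompt as gaze detection, generic detection, or neither.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    text = prompt.strip()
    lowered = text.lower()

    # Specific task first: "detect gaze" also starts with "detect ",
    # so it must be handled before the generic prefix check below.
    if lowered == "detect gaze":
        return ("gaze", None)

    # Generic form: everything after the prefix is the detection target.
    prefix = "detect "
    if lowered.startswith(prefix):
        return ("detect", text[len(prefix):].strip())

    return ("other", None)
```

With the two checks swapped, "Detect Gaze" would fall into the generic branch and come back as ("detect", "Gaze"), which is the behavior the fix removes.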

Move all caption-related modules from modules/interrogate/ to modules/caption/
for better naming consistency:
- Rename deepbooru, deepseek, joycaption, joytag, moondream3, openclip, tagger,
vqa, vqa_detection, waifudiffusion modules
- Add new caption.py dispatcher module
- Remove old interrogate.py (functionality moved to caption.py)
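
For readers unfamiliar with the pattern, a dispatcher module like the one added above typically maps a backend name to the function that handles it. The sketch below is a generic illustration only: the registry, the decorator, the stub backends, and the caption signature are all hypothetical and do not reflect the real caption.py API:

```python
from typing import Callable, Dict

# Hypothetical registry: backend name -> captioning callable.
_BACKENDS: Dict[str, Callable[[str], str]] = {}


def register(name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    """Decorator that records a backend under the given name."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        _BACKENDS[name] = fn
        return fn
    return decorator


@register("tagger")
def _tagger_stub(image_path: str) -> str:
    # Stand-in for a real tagger backend.
    return f"tags for {image_path}"


@register("vqa")
def _vqa_stub(image_path: str) -> str:
    # Stand-in for a real VQA backend.
    return f"caption for {image_path}"


def caption(image_path: str, backend: str) -> str:
    """Route the request to the registered backend, or fail loudly."""
    try:
        handler = _BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown caption backend: {backend!r}") from None
    return handler(image_path)
```

Keeping the routing in one place means callers only import the dispatcher, so individual backend modules can be moved or renamed without touching call sites.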