- Add use_safetensors=True to all 16 model from_pretrained calls to
avoid downloading redundant .bin files alongside safetensors
- Add device property to JoyTag VisionModel so move_model can relocate
it to CUDA (fixes 'ViT object has no attribute device')
- Fix Pix2Struct dtype mismatch by casting float inputs to model dtype
while preserving integer tensor types
- Patch AutoConfig.register with exist_ok=True during Ovis loading to
handle duplicate aimv2 registration on model reload
- Detect Qwen VL fine-tune architecture from config model_type instead
of repo name, fixing ToriiGate and similar third-party fine-tunes
- Change UI default task from Short Caption to Normal Caption, and
preserve it on model switch instead of resetting to Use Prompt
- Add dual-prefill testing across 5 VQA test methods using a shared
_check_prefill helper
- Fix pre-existing ruff W605 in strip_think_xml_tags docstring
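
The Qwen VL detection change above can be illustrated with a minimal sketch. It assumes the architecture is chosen from the "model_type" field of the model's config.json; the helper name detect_qwen_vl_family and the family labels are hypothetical and are not taken from the actual code.

```python
from typing import Optional

# Assumed mapping from config.json "model_type" values to a family label.
# The labels are illustrative only; the real loader may use other names.
QWEN_VL_FAMILIES = {
    "qwen2_vl": "Qwen2-VL",
    "qwen2_5_vl": "Qwen2.5-VL",
}


def detect_qwen_vl_family(config: dict) -> Optional[str]:
    """Return the Qwen VL family for a parsed config.json, or None.

    Only the config's model_type is inspected, so a third-party
    fine-tune whose repo name does not contain "qwen" still resolves.
    """
    model_type = str(config.get("model_type", "")).lower()
    return QWEN_VL_FAMILIES.get(model_type)
```

With this approach, a fine-tune published under an unrelated repo name resolves to the same family as its base model, as long as its config.json keeps the original model_type.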

Move all caption-related modules from modules/interrogate/ to modules/caption/
for better naming consistency:
- Rename deepbooru, deepseek, joycaption, joytag, moondream3, openclip, tagger,
vqa, vqa_detection, waifudiffusion modules
- Add new caption.py dispatcher module
- Remove old interrogate.py (functionality moved to caption.py)
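
As a rough illustration of what a dispatcher module could look like, the sketch below routes a caption request to a registered backend by name. Everything in it (the registry, the register decorator, the caption signature, and the stub backend) is a hypothetical assumption for illustration and does not describe the actual contents of caption.py.

```python
from typing import Callable, Dict

# Hypothetical registry: backend name -> callable(image, **kwargs) -> str.
_BACKENDS: Dict[str, Callable[..., str]] = {}


def register(name: str):
    """Decorator that registers a caption backend under the given name."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        _BACKENDS[name] = fn
        return fn
    return decorator


def caption(image, backend: str, **kwargs) -> str:
    """Dispatch a caption request to the backend registered as `backend`."""
    try:
        fn = _BACKENDS[backend]
    except KeyError:
        raise ValueError(f"Unknown caption backend: {backend!r}") from None
    return fn(image, **kwargs)


@register("tagger")
def _tagger_stub(image, **kwargs) -> str:
    # Stand-in for a real backend; returns fixed tags for demonstration.
    return "tag1, tag2"
```

A single entry point like this lets callers depend on one module while the individual backends stay in their own files.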