mirror of https://github.com/vladmandic/automatic
parent `1c2a81ee2d` · commit `f91af19094`

**CHANGELOG.md**
# Change Log for SD.Next
## Update for 2025-12-09
### TBD
Merge commit: `f903a36d9`
### Highlights for 2025-12-09
New native [kanvas](https://vladmandic.github.io/sdnext-docs/Kanvas/) module for image manipulation that fully replaces the *img2img*, *inpaint* and *outpaint* controls, plus a massive update to **Captioning/VQA** models and features
New generation of the **Flux.2** large image model, the new **Z-Image** model that is creating a lot of buzz, a first cloud model with **Google's Nano Banana** *2.5 Flash and 3.0 Pro*, and the new **Photoroom PRX** model
Also new are **HunyuanVideo 1.5** and **Kandinsky 5 Pro** video models, plus a lot of internal improvements and fixes

[ReadMe](https://github.com/vladmandic/automatic/blob/master/README.md) | [ChangeLog](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md) | [Docs](https://vladmandic.github.io/sdnext-docs/) | [WiKi](https://github.com/vladmandic/automatic/wiki) | [Discord](https://discord.com/invite/sd-next-federal-batch-inspectors-1101998836328697867) | [Sponsor](https://github.com/sponsors/vladmandic)
### Details for 2025-12-09
- **Models**
  - [Black Forest Labs FLUX.2 Dev](https://bfl.ai/blog/flux-2) and prequantized variation [SDNQ-SVD-Uint4](https://huggingface.co/Disty0/FLUX.2-dev-SDNQ-uint4-svd-r32)
    **FLUX.2-Dev** is a brand new model from BFL that uses a large 32B DiT together with a Mistral 24B text encoder
    *note*: set the `GOOGLE_API_KEY` environment variable with your key to use the Nano Banana cloud model
  - [Photoroom PRX 1024 Beta](https://huggingface.co/Photoroom/prx-1024-t2i-beta)
    PRX (Photoroom Experimental) is a small 1.3B-parameter t2i model trained entirely from scratch; it uses the T5-Gemma text encoder
  - [HunyuanVideo 1.5](https://huggingface.co/tencent/HunyuanVideo-1.5) in T2V and I2V variants, both standard and distilled, at both 720p and 480p resolutions
    **HunyuanVideo 1.5** improves on the previous 1.0 version with better quality and higher-resolution outputs; it uses the Qwen2.5-VL text encoder
    distilled variants provide faster generation with slightly reduced quality
  - [Kandinsky 5.0 Pro Video](https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Pro-sft-5s-Diffusers) in T2V and I2V variants
    a larger and more powerful 19B version of the previously released 2B Lite models
- **Kanvas**: new module for native canvas-based image manipulation
  kanvas is a full replacement for the *img2img*, *inpaint* and *outpaint* controls
  see the [docs](https://vladmandic.github.io/sdnext-docs/Kanvas/) for details
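The Nano Banana note above boils down to an environment check before the cloud model can be used; a minimal sketch (the helper name is hypothetical, only the `GOOGLE_API_KEY` variable name comes from the changelog):

```python
import os

def get_google_api_key() -> str:
    # Hypothetical helper: read the key SD.Next expects from the environment;
    # only the GOOGLE_API_KEY variable name is from the changelog note
    key = os.environ.get("GOOGLE_API_KEY", "")
    if not key:
        raise RuntimeError("set GOOGLE_API_KEY to use the Nano Banana cloud model")
    return key
```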
**TODO.md**
- <https://github.com/users/vladmandic/projects>
## Internal

- Reimplement llama remover for kanvas
- Deploy: Create executable for SD.Next
- Feature: Integrate natural language image search
  [ImageDB](https://github.com/vladmandic/imagedb)
- [SmoothCache](https://github.com/huggingface/diffusers/issues/11135)
- [STG](https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#spatiotemporal-skip-guidance)
- [Video Inpaint Pipeline](https://github.com/huggingface/diffusers/pull/12506)
- [Sonic Inpaint](https://github.com/ubc-vision/sonic)
### New models / Pipelines

TODO: *Prioritize*!
- [Kandinsky 5 Pro and Lite](https://github.com/huggingface/diffusers/pull/12664)
- [HunyuanVideo-1.5](https://github.com/huggingface/diffusers/pull/12696)
- [NewBie Image Exp0.1](https://github.com/huggingface/diffusers/pull/12803)
- [Sana-I2V](https://github.com/huggingface/diffusers/pull/12634#issuecomment-3540534268)
- [Bria FIBO](https://huggingface.co/briaai/FIBO)
- [Bytedance Lynx](https://github.com/bytedance/lynx)
The commit also extends `guess_by_name(fn, current_guess)` (filename-based model-type detection) with a Kandinsky 5.0 check (excerpt; the elif chain continues on both sides):

```python
        new_guess = 'Kandinsky 2.2'
    elif 'kandinsky-3' in fn.lower():
        new_guess = 'Kandinsky 3.0'
    elif 'kandinsky-5.0' in fn.lower():
        new_guess = 'Kandinsky 5.0'
    elif 'hunyuanimage3' in fn.lower() or 'hunyuanimage-3' in fn.lower():
        new_guess = 'HunyuanImage3'
    elif 'hunyuanimage' in fn.lower():
```
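A self-contained sketch of that filename-based guess, runnable on its own (the wrapper and fallback behavior are illustrative; only the substring checks come from the source):

```python
def guess_by_name(fn: str, current_guess: str) -> str:
    # Illustrative standalone version: return a model-type guess based on
    # substrings in the filename, falling back to the caller's current guess
    name = fn.lower()
    if 'kandinsky-3' in name:
        return 'Kandinsky 3.0'
    if 'kandinsky-5.0' in name:
        return 'Kandinsky 5.0'
    if 'hunyuanimage3' in name or 'hunyuanimage-3' in name:
        return 'HunyuanImage3'
    return current_guess

print(guess_by_name('models/kandinsky-5.0-lite.safetensors', 'Unknown'))  # → Kandinsky 5.0
```

Note that the more specific `kandinsky-5.0` substring never matches the `kandinsky-3` check first, so the ordering here is safe.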
The video model registry (a Python model table; the file name is not preserved in this diff) gains the HunyuanVideo 1.5 variants and versioned names for the existing 1.0 entries; post-merge excerpt:

```python
        'None': [],
        'Hunyuan Video': [
            Model(name='None'),
            Model(name='Hunyuan Video 1.5 T2V 720p',
                  url='https://huggingface.co/tencent/HunyuanVideo-1.5',
                  vae_remote=False,
                  repo='hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-720p_t2v',
                  repo_cls=getattr(diffusers, 'HunyuanVideo15Pipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLTextModel', None),
                  dit_cls=getattr(diffusers, 'HunyuanVideo15Transformer3DModel', None)),
            Model(name='Hunyuan Video 1.5 I2V 720p',
                  url='https://huggingface.co/tencent/HunyuanVideo-1.5',
                  vae_remote=False,
                  repo='hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-720p_i2v',
                  repo_cls=getattr(diffusers, 'HunyuanVideo15ImageToVideoPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLTextModel', None),
                  dit_cls=getattr(diffusers, 'HunyuanVideo15Transformer3DModel', None)),
            Model(name='Hunyuan Video 1.5 I2V 720p Distilled',
                  url='https://huggingface.co/tencent/HunyuanVideo-1.5',
                  vae_remote=False,
                  repo='hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-720p_i2v_distilled',
                  repo_cls=getattr(diffusers, 'HunyuanVideo15ImageToVideoPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLTextModel', None),
                  dit_cls=getattr(diffusers, 'HunyuanVideo15Transformer3DModel', None)),
            Model(name='Hunyuan Video 1.5 T2V 480p',
                  url='https://huggingface.co/tencent/HunyuanVideo-1.5',
                  vae_remote=False,
                  repo='hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-480p_t2v',
                  repo_cls=getattr(diffusers, 'HunyuanVideo15Pipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLTextModel', None),
                  dit_cls=getattr(diffusers, 'HunyuanVideo15Transformer3DModel', None)),
            Model(name='Hunyuan Video 1.5 T2V 480p Distilled',
                  url='https://huggingface.co/tencent/HunyuanVideo-1.5',
                  vae_remote=False,
                  repo='hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-480p_t2v_distilled',
                  repo_cls=getattr(diffusers, 'HunyuanVideo15Pipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLTextModel', None),
                  dit_cls=getattr(diffusers, 'HunyuanVideo15Transformer3DModel', None)),
            Model(name='Hunyuan Video 1.5 I2V 480p',
                  url='https://huggingface.co/tencent/HunyuanVideo-1.5',
                  vae_remote=False,
                  repo='hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-480p_i2v',
                  repo_cls=getattr(diffusers, 'HunyuanVideo15ImageToVideoPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLTextModel', None),
                  dit_cls=getattr(diffusers, 'HunyuanVideo15Transformer3DModel', None)),
            Model(name='Hunyuan Video 1.5 I2V 480p Distilled',
                  url='https://huggingface.co/tencent/HunyuanVideo-1.5',
                  vae_remote=False,
                  repo='hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-480p_i2v_distilled',
                  repo_cls=getattr(diffusers, 'HunyuanVideo15ImageToVideoPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLTextModel', None),
                  dit_cls=getattr(diffusers, 'HunyuanVideo15Transformer3DModel', None)),
            Model(name='Hunyuan Video 1.0 T2V',
                  url='https://huggingface.co/tencent/HunyuanVideo',
                  vae_remote=True,
                  repo='hunyuanvideo-community/HunyuanVideo',
                  repo_cls=getattr(diffusers, 'HunyuanVideoPipeline', None),
                  te_cls=getattr(transformers, 'LlamaModel', None),
                  dit_cls=getattr(diffusers, 'HunyuanVideoTransformer3DModel', None)),
            Model(name='Hunyuan Video 1.0 I2V',  # https://github.com/huggingface/diffusers/pull/10983
                  url='https://huggingface.co/tencent/HunyuanVideo-I2V',
                  vae_remote=True,
                  repo='hunyuanvideo-community/HunyuanVideo-I2V',
                  repo_cls=getattr(diffusers, 'HunyuanVideoImageToVideoPipeline', None),
                  te_cls=getattr(transformers, 'LlavaForConditionalGeneration', None),
                  dit_cls=getattr(diffusers, 'HunyuanVideoTransformer3DModel', None)),
            Model(name='SkyReels Hunyuan 1.0 T2V',  # https://github.com/huggingface/diffusers/pull/10837
                  url='https://huggingface.co/Skywork/SkyReels-V1-Hunyuan-T2V',
                  vae_remote=True,
                  repo='hunyuanvideo-community/HunyuanVideo',
```
The SkyReels entries continue in the next hunk:

```python
                  dit='Skywork/SkyReels-V1-Hunyuan-T2V',
                  dit_folder=None,
                  dit_cls=getattr(diffusers, 'HunyuanVideoTransformer3DModel', None)),
            Model(name='SkyReels Hunyuan 1.0 I2V',  # https://github.com/huggingface/diffusers/pull/10837
                  url='https://huggingface.co/Skywork/SkyReels-V1-Hunyuan-I2V',
                  vae_remote=True,
                  repo='hunyuanvideo-community/HunyuanVideo',
```
@ -67,7 +116,7 @@ try:
|
|||
dit='Skywork/SkyReels-V1-Hunyuan-I2V',
|
||||
dit_folder=None,
|
||||
dit_cls=getattr(diffusers, 'HunyuanVideoTransformer3DModel', None)),
|
||||
Model(name='Fast Hunyuan T2V', # https://github.com/hao-ai-lab/FastVideo/blob/8a77cf22c9b9e7f931f42bc4b35d21fd91d24e45/fastvideo/models/hunyuan/inference.py#L213
|
||||
Model(name='Fast Hunyuan 1.0 T2V', # https://github.com/hao-ai-lab/FastVideo/blob/8a77cf22c9b9e7f931f42bc4b35d21fd91d24e45/fastvideo/models/hunyuan/inference.py#L213
|
||||
url='https://huggingface.co/FastVideo/FastHunyuan',
|
||||
vae_remote=True,
|
||||
repo='hunyuanvideo-community/HunyuanVideo',
|
||||
|
|
The `Kandinsky` registry moves all Lite repos from the `ai-forever` org to `kandinskylab` and adds the new Pro 5s T2V/I2V entries; post-merge excerpt:

```python
        ],
        'Kandinsky': [
            Model(name='Kandinsky 5.0 Lite 5s SFT T2V',
                  url='https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Lite-sft-5s-Diffusers',
                  repo='kandinskylab/Kandinsky-5.0-T2V-Lite-sft-5s-Diffusers',
                  repo_cls=getattr(diffusers, 'Kandinsky5T2VPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLForConditionalGeneration', None),
                  dit_cls=getattr(diffusers, 'Kandinsky5Transformer3DModel', None)),
            Model(name='Kandinsky 5.0 Lite 5s CFG-distilled T2V',
                  url='https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Lite-nocfg-5s-Diffusers',
                  repo='kandinskylab/Kandinsky-5.0-T2V-Lite-nocfg-5s-Diffusers',
                  repo_cls=getattr(diffusers, 'Kandinsky5T2VPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLForConditionalGeneration', None),
                  dit_cls=getattr(diffusers, 'Kandinsky5Transformer3DModel', None)),
            Model(name='Kandinsky 5.0 Lite 5s Steps-distilled T2V',
                  url='https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Lite-distilled16steps-5s-Diffusers',
                  repo='kandinskylab/Kandinsky-5.0-T2V-Lite-distilled16steps-5s-Diffusers',
                  repo_cls=getattr(diffusers, 'Kandinsky5T2VPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLForConditionalGeneration', None),
                  dit_cls=getattr(diffusers, 'Kandinsky5Transformer3DModel', None)),
            Model(name='Kandinsky 5.0 Lite 10s SFT T2V',
                  url='https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Lite-sft-10s-Diffusers',
                  repo='kandinskylab/Kandinsky-5.0-T2V-Lite-sft-10s-Diffusers',
                  repo_cls=getattr(diffusers, 'Kandinsky5T2VPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLForConditionalGeneration', None),
                  dit_cls=getattr(diffusers, 'Kandinsky5Transformer3DModel', None)),
            Model(name='Kandinsky 5.0 Lite 10s CFG-distilled T2V',
                  url='https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Lite-nocfg-10s-Diffusers',
                  repo='kandinskylab/Kandinsky-5.0-T2V-Lite-nocfg-10s-Diffusers',
                  repo_cls=getattr(diffusers, 'Kandinsky5T2VPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLForConditionalGeneration', None),
                  dit_cls=getattr(diffusers, 'Kandinsky5Transformer3DModel', None)),
            Model(name='Kandinsky 5.0 Lite 10s Steps-distilled T2V',
                  url='https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Lite-distilled16steps-10s-Diffusers',
                  repo='kandinskylab/Kandinsky-5.0-T2V-Lite-distilled16steps-10s-Diffusers',
                  repo_cls=getattr(diffusers, 'Kandinsky5T2VPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLForConditionalGeneration', None),
                  dit_cls=getattr(diffusers, 'Kandinsky5Transformer3DModel', None)),
            Model(name='Kandinsky 5.0 Pro 5s SFT T2V',
                  url='https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Pro-sft-5s-Diffusers',
                  repo='kandinskylab/Kandinsky-5.0-T2V-Pro-sft-5s-Diffusers',
                  repo_cls=getattr(diffusers, 'Kandinsky5T2VPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLForConditionalGeneration', None),
                  dit_cls=getattr(diffusers, 'Kandinsky5Transformer3DModel', None)),
            Model(name='Kandinsky 5.0 Pro 5s SFT I2V',
                  url='https://huggingface.co/kandinskylab/Kandinsky-5.0-I2V-Pro-sft-5s-Diffusers',
                  repo='kandinskylab/Kandinsky-5.0-I2V-Pro-sft-5s-Diffusers',
                  repo_cls=getattr(diffusers, 'Kandinsky5I2VPipeline', None),
                  te_cls=getattr(transformers, 'Qwen2_5_VLForConditionalGeneration', None),
                  dit_cls=getattr(diffusers, 'Kandinsky5Transformer3DModel', None)),
        ],
    }
    t1 = time.time()
```
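A note on the `getattr(diffusers, 'SomeClass', None)` pattern used throughout the registry: it lets the model table load even on diffusers/transformers versions that predate a given pipeline class, resolving missing classes to `None` instead of raising `AttributeError`. A minimal sketch of the idea (the stub module below is illustrative, not a real library):

```python
import types

# Stand-in for an older library build that lacks a newer pipeline class
old_diffusers = types.SimpleNamespace(DiffusionPipeline=object)

# A class that exists resolves normally...
cls = getattr(old_diffusers, 'DiffusionPipeline', None)
# ...while a class added in a newer release resolves to None instead of raising
missing = getattr(old_diffusers, 'HunyuanVideo15Pipeline', None)

print(cls is object, missing is None)  # → True True
```

Callers can then check for `None` and simply skip registering (or hide) entries the installed library cannot serve.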