jumbo update: add wan22-vace, improve offloading, add offloading-using-streams, change logging-levels, refactor some default packages

Signed-off-by: Vladimir Mandic <mandic00@live.com>
pull/4216/head
Vladimir Mandic 2025-09-17 12:32:27 -04:00
parent 6a021e7743
commit cc6101ecb2
17 changed files with 156 additions and 76 deletions

View File

@@ -1,5 +1,19 @@
# Change Log for SD.Next
## Update for 2025-09-17
- **Models**
  - [WAN 2.2 14B VACE](https://huggingface.co/alibaba-pai/Wan2.2-VACE-Fun-A14B)
    available for *text-to-image*, *text-to-video* and *image-to-video* workflows
- **Offloading**
  - improve offloading for pipelines with multiple stages such as *wan-2.2-14b*
  - add timers to measure onload/offload times during generate
  - experimental offloading using `torch.streams`
    enable in settings -> model offloading
- **Other**
  - **logging**: enable `debug`, `docs` and `api-docs` by default
  - refactor to use new libraries
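The experimental `torch.streams` offload noted above can be sketched roughly as follows; `move_module_async` is an illustrative name (not SD.Next's API), and the helper falls back to a blocking move when CUDA is unavailable:

```python
import torch

def move_module_async(module: torch.nn.Module, device: torch.device, stream=None) -> torch.nn.Module:
    # hypothetical helper: perform the weight copy on a side CUDA stream
    # so it can overlap with compute running on the default stream
    if stream is None or not torch.cuda.is_available():
        return module.to(device)  # blocking fallback (CPU-only or no stream given)
    with torch.cuda.stream(stream):
        module = module.to(device, non_blocking=True)
    # the default stream must wait for the copy before touching the weights
    torch.cuda.current_stream().wait_stream(stream)
    return module
```

SD.Next's own implementation creates the stream lazily with `torch.cuda.Stream(device=devices.device)` and reuses it for every offload, as the `move_module_to_cpu` changes later in this diff show.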
## Update for 2025-09-15
### Highlights for 2025-09-15

View File

@@ -6,6 +6,7 @@
![Last update](https://img.shields.io/github/last-commit/vladmandic/sdnext?svg=true)
![License](https://img.shields.io/github/license/vladmandic/sdnext?svg=true)
[![Discord](https://img.shields.io/discord/1101998836328697867?logo=Discord&svg=true)](https://discord.gg/VjvR2tabEX)
[![DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/vladmandic/sdnext)
[![Sponsors](https://img.shields.io/static/v1?label=Sponsor&message=%E2%9D%A4&logo=GitHub&color=%23fe8e86)](https://github.com/sponsors/vladmandic)
[Docs](https://vladmandic.github.io/sdnext-docs/) | [Wiki](https://github.com/vladmandic/sdnext/wiki) | [Discord](https://discord.gg/VjvR2tabEX) | [Changelog](CHANGELOG.md)

View File

@@ -15,6 +15,7 @@ Main ToDo list can be found at [GitHub projects](https://github.com/users/vladma
### Under Consideration
- [Inf-DiT](https://github.com/zai-org/Inf-DiT)
- [X-Omni](https://github.com/X-Omni-Team/X-Omni/blob/main/README.md)
- [DiffSynth Studio](https://github.com/modelscope/DiffSynth-Studio)
- [IPAdapter negative guidance](https://github.com/huggingface/diffusers/discussions/7167)
@@ -36,13 +37,14 @@ Main ToDo list can be found at [GitHub projects](https://github.com/users/vladma
### New models
- [HunyuanImage](https://huggingface.co/tencent/HunyuanImage-2.1)
- [Phantom HuMo](https://github.com/Phantom-video/Phantom)
- [Lumina-DiMOO](https://huggingface.co/Alpha-VLLM/Lumina-DiMOO)
- [Wan2.2 Fun](https://huggingface.co/collections/alibaba-pai/wan22-fun-68958eabec343b948f1225c5) (includes VACE, Control, etc)
- [Magi](https://github.com/SandAI-org/MAGI-1) ([diffusers PR](https://github.com/huggingface/diffusers/pull/11713))
- [SEVA](https://github.com/huggingface/diffusers/pull/11440)
- [Ming](https://github.com/inclusionAI/Ming)
- [Liquid](https://github.com/FoundationVision/Liquid)
- [Step1X](https://github.com/stepfun-ai/Step1X-Edit)
- [LucyEdit](https://github.com/huggingface/diffusers/pull/12340)
- [SD3 UltraEdit](https://github.com/HaozheZhao/UltraEdit)
- [WAN2GP](https://github.com/deepbeepmeep/Wan2GP)
- [SelfForcing](https://github.com/guandeh17/Self-Forcing)

@@ -1 +1 @@
Subproject commit 430c8140c0cdda5e82a80f44ee6961e0d803e516
Subproject commit 049386fd9d5c2d9b30cc9136be1e1bcda13d81c6

View File

@@ -20,10 +20,11 @@
"HiDream-I1-Full": "models/Reference/HiDream-I1 Full",
"lodestones--Chroma1-Base": "models/Reference/lodestones--Chroma-Base.jpg",
"lodestones--Chroma1-HD": "models/Reference/lodestones--Chroma-HD.jpg",
"chroma-unlocked-v50": "models/Reference/lodestones Chroma Unlocked HD",
"chroma-unlocked-v50-annealed": "models/Reference/lodestones Chroma Unlocked HD",
"chroma-unlocked-v50": "models/Reference/lodestones--Chroma-detail.jpg",
"chroma-unlocked-v50-annealed": "models/Reference/lodestones--Chroma-annealed.jpg",
"vladmandic--Qwen-Lightning": "models/Reference/Qwen-Lightning.jpg",
"vladmandic--Qwen-Lightning-Edit": "models/Reference/Qwen-Lightning.jpg",
"Wan-AI--Wan2.2-T2V-A14B-Diffusers": "models/Reference/Wan2.2-T2V-A14B.jpg",
"Wan-AI--Wan2.1-T2V-14B-Diffusers": "models/Reference/Wan-AI--Wan2.1.jpg"
"Wan-AI--Wan2.2-T2V-A14B-Diffusers": "models/Reference/Wan-AI--Wan2.2-T2V-A14B-Diffusers.jpg",
"Wan-AI--Wan2.1-T2V-14B-Diffusers": "models/Reference/Wan-AI--Wan2.1-T2V-14B-Diffusers.jpg",
"linoyts--Wan2.2-VACE-Fun-14B-diffusers": "models/Reference/Wan-AI--Wan2.2-T2V-A14B-Diffusers.jpg"
}

View File

@@ -139,6 +139,42 @@
"extras": "sampler: Default, cfg_scale: 4.5"
},
"Qwen-Image": {
"path": "Qwen/Qwen-Image",
"preview": "Qwen--Qwen-Image.jpg",
"desc": "Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.",
"skip": true,
"extras": ""
},
"Qwen-Image-Edit": {
"path": "Qwen/Qwen-Image-Edit",
"preview": "Qwen--Qwen-Image-Edit.jpg",
"desc": "Qwen-Image-Edit, the image editing version of Qwen-Image. Built upon our 20B Qwen-Image model, Qwen-Image-Edit successfully extends Qwen-Image's unique text rendering capabilities to image editing tasks, enabling precise text editing.",
"skip": true,
"extras": ""
},
"Qwen-Image-Lightning": {
"path": "vladmandic/Qwen-Lightning",
"preview": "vladmandic--Qwen-Lightning.jpg",
"desc": "Qwen-Lightning is step-distilled from Qwen-Image to allow for generation in 8 steps.",
"skip": true,
"extras": "steps: 8"
},
"Qwen-Image-Distill": {
"path": "SahilCarterr/Qwen-Image-Distill-Full",
"preview": "SahilCarterr--Qwen-Image-Distill-Full.jpg",
"desc": "Qwen-Image-Distill is a distilled and accelerated version of Qwen-Image by DiffSynth-Studio.",
"skip": true,
"extras": "steps: 15"
},
"Qwen-Image-Lightning-Edit": {
"path": "vladmandic/Qwen-Lightning-Edit",
"preview": "vladmandic--Qwen-Lightning-Edit.jpg",
"desc": "Qwen-Lightning-Edit is step-distilled from Qwen-Image-Edit to allow for generation in 8 steps.",
"skip": true,
"extras": "steps: 8"
},
"lodestones Chroma1 HD": {
"path": "lodestones/Chroma1-HD",
"preview": "lodestones--Chroma-HD.jpg",
@@ -182,42 +218,6 @@
"extras": ""
},
"Qwen-Image": {
"path": "Qwen/Qwen-Image",
"preview": "Qwen--Qwen-Image.jpg",
"desc": " Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.",
"skip": true,
"extras": ""
},
"Qwen-Image-Edit": {
"path": "Qwen/Qwen-Image-Edit",
"preview": "Qwen--Qwen-Image-Edit.jpg",
"desc": "Qwen-Image-Edit, the image editing version of Qwen-Image. Built upon our 20B Qwen-Image model, Qwen-Image-Edit successfully extends Qwen-Images unique text rendering capabilities to image editing tasks, enabling precise text editing.",
"skip": true,
"extras": ""
},
"Qwen-Image-Lightning": {
"path": "vladmandic/Qwen-Lightning",
"preview": "vladmandic--Qwen-Lightning.jpg",
"desc": "Qwen-Lightning is step-distilled from Qwen-Image to allow for generation in 8 steps.",
"skip": true,
"extras": "steps: 8"
},
"Qwen-Image-Distill": {
"path": "SahilCarterr/Qwen-Image-Distill-Full",
"preview": "SahilCarterr--Qwen-Image-Distill-Full.jpg",
"desc": "Qwen-Image-Distill is a distilled and accelerated version of Qwen-Image by DiffSynth-Studio.",
"skip": true,
"extras": "steps: 15"
},
"Qwen-Image-Lightning-Edit": {
"path": "vladmandic/Qwen-Lightning-Edit",
"preview": "vladmandic--Qwen-Lightning-Edit.jpg",
"desc": " Qwen-Lightning-Edit is step-distilled from Qwen-Image-Edit to allow for generation in 8 steps.",
"skip": true,
"extras": "steps: 8"
},
"Ostris Flex.2 Preview": {
"path": "ostris/Flex.2-preview",
"preview": "ostris--Flex.2-preview.jpg",
@@ -268,6 +268,13 @@
"skip": true,
"extras": "sampler: Default"
},
"Wan-AI Wan2.2 14B VACE": {
"path": "linoyts/Wan2.2-VACE-Fun-14B-diffusers",
"preview": "Wan-AI--Wan2.2-T2V-A14B-Diffusers.jpg",
"desc": "Wan2.2, offering more powerful capabilities, better performance, and superior visual quality. With Wan2.2, we have focused on incorporating the following technical innovations: MoE Architecture, Data Scaling, Cinematic Aesthetics, Efficient High-Definition Hybrid",
"skip": true,
"extras": "sampler: Default"
},
"Freepik F-Lite": {
"path": "Freepik/F-Lite",

View File

@@ -605,7 +605,7 @@ def check_diffusers():
if args.skip_git:
install('diffusers')
return
sha = '5e181eddfe7e44c1444a2511b0d8e21d177850a0' # diffusers commit hash
sha = 'efb7a299af46d739dec6a57a5d2814165fba24b5' # diffusers commit hash
pkg = pkg_resources.working_set.by_key.get('diffusers', None)
minor = int(pkg.version.split('.')[1] if pkg is not None else -1)
cur = opts.get('diffusers_version', '') if minor > -1 else ''
@@ -1636,7 +1636,6 @@ def add_args(parser):
group_http.add_argument("--cors-regex", type=str, default=os.environ.get("SD_CORSREGEX", None), help="Allowed CORS origins as regular expression, default: %(default)s")
group_http.add_argument('--subpath', type=str, default=os.environ.get("SD_SUBPATH", None), help='Customize the URL subpath for usage with reverse proxy')
group_http.add_argument("--autolaunch", default=os.environ.get("SD_AUTOLAUNCH", False), action='store_true', help="Open the UI URL in the system's default browser upon launch")
group_http.add_argument('--docs', default=not os.environ.get("SD_NODOCS", False), action='store_true', help = "Mount API docs, default: %(default)s")
group_http.add_argument("--auth", type=str, default=os.environ.get("SD_AUTH", None), help='Set access authentication like "user:pwd,user:pwd""')
group_http.add_argument("--auth-file", type=str, default=os.environ.get("SD_AUTHFILE", None), help='Set access authentication using file, default: %(default)s')
group_http.add_argument("--allowed-paths", nargs='+', default=[], type=str, required=False, help="add additional paths to paths allowed for web access")
@@ -1660,7 +1659,7 @@ def add_args(parser):
group_log.add_argument('--debug', default=not os.environ.get("SD_NODEBUG",False), action='store_true', help="Run with debug logging, default: %(default)s")
group_log.add_argument("--trace", default=os.environ.get("SD_TRACE", False), action='store_true', help="Run with trace logging, default: %(default)s")
group_log.add_argument("--profile", default=os.environ.get("SD_PROFILE", False), action='store_true', help="Run profiler, default: %(default)s")
group_log.add_argument('--docs', default=os.environ.get("SD_DOCS", False), action='store_true', help="Mount API docs, default: %(default)s")
group_log.add_argument('--docs', default=not os.environ.get("SD_NODOCS", False), action='store_true', help = "Mount API docs, default: %(default)s")
group_log.add_argument("--api-log", default=not os.environ.get("SD_NOAPILOG", False), action='store_true', help="Log all API requests")
group_nargs = parser.add_argument_group('Other')

View File

@@ -128,6 +128,8 @@ def task_specific_kwargs(p, model):
}
if model.__class__.__name__ == 'WanImageToVideoPipeline' and hasattr(p, 'init_images') and len(p.init_images) > 0:
task_args['image'] = p.init_images[0]
if model.__class__.__name__ == 'WanVACEPipeline' and hasattr(p, 'init_images') and len(p.init_images) > 0:
task_args['reference_images'] = p.init_images
if debug_enabled:
debug_log(f'Process task specific args: {task_args}')
@@ -307,6 +309,14 @@ def set_pipeline_args(p, model, prompts:list, negative_prompts:list, prompts_2:t
args['control_strength'] = p.denoising_strength
args['width'] = p.width
args['height'] = p.height
if 'WanVACEPipeline' in model.__class__.__name__:
if isinstance(args['prompt'], list):
args['prompt'] = args['prompt'][0] if len(args['prompt']) > 0 else ''
if isinstance(args.get('negative_prompt', None), list):
args['negative_prompt'] = args['negative_prompt'][0] if len(args['negative_prompt']) > 0 else ''
if isinstance(args['generator'], list) and len(args['generator']) > 0:
args['generator'] = args['generator'][0]
# set callbacks
if 'prior_callback_steps' in possible: # Wuerstchen / Cascade
args['prior_callback_steps'] = 1
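The `WanVACEPipeline` branch above unwraps batched arguments because the pipeline expects a single prompt and generator rather than lists. A standalone sketch of that normalization (the helper name is ours):

```python
def normalize_wan_vace_args(args: dict) -> dict:
    # WanVACEPipeline takes scalar prompt/negative_prompt/generator,
    # so unwrap single-element lists and default empty ones to ''
    for key in ("prompt", "negative_prompt"):
        if isinstance(args.get(key), list):
            args[key] = args[key][0] if len(args[key]) > 0 else ""
    gen = args.get("generator")
    if isinstance(gen, list) and len(gen) > 0:
        args["generator"] = gen[0]
    return args
```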

View File

@@ -487,7 +487,7 @@ def validate_pipeline(p: processing.StableDiffusionProcessing):
if m.repo_cls is not None:
models_cls.append(m.repo_cls.__name__)
is_video_model = shared.sd_model.__class__.__name__ in models_cls
override_video_pipelines = ['WanPipeline', 'WanImageToVideoPipeline']
override_video_pipelines = ['WanPipeline', 'WanImageToVideoPipeline', 'WanVACEPipeline']
is_video_pipeline = ('video' in p.__class__.__name__.lower()) or (shared.sd_model.__class__.__name__ in override_video_pipelines)
if is_video_model and not is_video_pipeline:
shared.log.error(f'Mismatch: type={shared.sd_model_type} cls={shared.sd_model.__class__.__name__} request={p.__class__.__name__} video model with non-video pipeline')

View File

@@ -372,7 +372,7 @@ def calculate_base_steps(p, use_denoise_start, use_refiner_start):
if len(getattr(p, 'timesteps', [])) > 0:
return None
cls = shared.sd_model.__class__.__name__
if 'Flex' in cls or 'Kontext' in cls or 'Edit' in cls:
if 'Flex' in cls or 'Kontext' in cls or 'Edit' in cls or 'Wan' in cls:
steps = p.steps
elif not is_txt2img():
if cls in sd_models.i2i_pipes:
@@ -393,7 +393,7 @@ def calculate_base_steps(p, use_denoise_start, use_refiner_start):
def calculate_hires_steps(p):
cls = shared.sd_model.__class__.__name__
if 'Flex' in cls or 'HiDreamImageEditingPipeline' in cls or 'Kontext' in cls:
if 'Flex' in cls or 'Kontext' in cls or 'Edit' in cls or 'Wan' in cls:
steps = p.steps
elif p.hr_second_pass_steps > 0:
steps = (p.hr_second_pass_steps // p.denoising_strength) + 1
@@ -407,7 +407,7 @@ def calculate_hires_steps(p):
def calculate_refiner_steps(p):
cls = shared.sd_model.__class__.__name__
if 'Flex' in cls or 'HiDreamImageEditingPipeline' in cls or 'Kontext' in cls:
if 'Flex' in cls or 'Kontext' in cls or 'Edit' in cls or 'Wan' in cls:
steps = p.steps
elif "StableDiffusionXL" in shared.sd_refiner.__class__.__name__:
if p.refiner_start > 0 and p.refiner_start < 1:
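The three step calculators changed above share one rule: Flex/Kontext/Edit and (now) Wan pipelines run the requested step count unchanged, while other pipelines scale steps by denoising strength. A condensed sketch, with function and parameter names of our own choosing:

```python
def effective_steps(cls_name: str, steps: int, pass_steps: int, denoise: float) -> int:
    # edit-style and Wan pipelines consume the requested step count as-is
    if any(tag in cls_name for tag in ("Flex", "Kontext", "Edit", "Wan")):
        return steps
    # otherwise scale the per-pass step count by denoising strength
    base = pass_steps if pass_steps > 0 else steps
    return int(base // denoise) + 1
```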

View File

@@ -18,6 +18,7 @@ offload_post = ['h1']
offload_hook_instance = None
balanced_offload_exclude = ['CogView4Pipeline', 'MeissonicPipeline']
accelerate_dtype_byte_size = None
move_stream = None
def dtype_byte_size(dtype: torch.dtype):
@@ -176,7 +177,7 @@ class OffloadHook(accelerate.hooks.ModelHook):
self.last_post = None
self.last_cls = None
gpu = f'{(shared.gpu_memory * shared.opts.diffusers_offload_min_gpu_memory):.2f}-{(shared.gpu_memory * shared.opts.diffusers_offload_max_gpu_memory):.2f}:{shared.gpu_memory:.2f}'
shared.log.info(f'Offload: type=balanced op=init watermark={self.min_watermark}-{self.max_watermark} gpu={gpu} cpu={shared.cpu_memory:.3f} limit={shared.opts.cuda_mem_fraction:.2f} always={self.offload_always} never={self.offload_never} pre={shared.opts.diffusers_offload_pre}')
shared.log.info(f'Offload: type=balanced op=init watermark={self.min_watermark}-{self.max_watermark} gpu={gpu} cpu={shared.cpu_memory:.3f} limit={shared.opts.cuda_mem_fraction:.2f} always={self.offload_always} never={self.offload_never} pre={shared.opts.diffusers_offload_pre} streams={shared.opts.diffusers_offload_streams}')
self.validate()
super().__init__()
@@ -210,8 +211,12 @@ class OffloadHook(accelerate.hooks.ModelHook):
def pre_forward(self, module, *args, **kwargs):
_id = id(module)
if (self.last_pre != _id) and (module.__class__.__name__ != self.last_cls) and self.offload_allowed(module): # offload every other module first time when new module starts pre-forward
do_offload = (self.last_pre != _id) or (module.__class__.__name__ != self.last_cls)
if do_offload and self.offload_allowed(module): # offload every other module first time when new module starts pre-forward
if shared.opts.diffusers_offload_pre:
t0 = time.time()
debug_move(f'Offload: type=balanced op=pre module={module.__class__.__name__}')
for pipe in get_pipe_variants():
for module_name in get_module_names(pipe):
@@ -220,15 +225,16 @@
if (_id != id(module_instance)) and (module_cls not in self.offload_never) and (not devices.same_device(module_instance.device, devices.cpu)):
apply_balanced_offload_to_module(module_instance, op='pre')
self.last_cls = module.__class__.__name__
self.last_pre = _id
process_timer.add('offload', time.time() - t0)
if not devices.same_device(module.device, devices.device): # move-to-device
t0 = time.time()
device_index = torch.device(devices.device).index
if device_index is None:
device_index = 0
max_memory = { device_index: self.gpu, "cpu": self.cpu }
device_map = getattr(module, "balanced_offload_device_map", None)
if device_map is None or max_memory != getattr(module, "balanced_offload_max_memory", None):
if (device_map is None) or (max_memory != getattr(module, "balanced_offload_max_memory", None)):
device_map = accelerate.infer_auto_device_map(module, max_memory=max_memory)
offload_dir = getattr(module, "offload_dir", os.path.join(shared.opts.accelerate_offload_path, module.__class__.__name__))
if devices.backend == "directml":
@@ -241,13 +247,15 @@
module._hf_hook.execution_device = torch.device(devices.device) # pylint: disable=protected-access
module.balanced_offload_device_map = device_map
module.balanced_offload_max_memory = max_memory
process_timer.add('onload', time.time() - t0)
if debug:
for pipe in get_pipe_variants():
for _i, pipe in enumerate(get_pipe_variants()):
for module_name in get_module_names(pipe):
module_instance = getattr(pipe, module_name, None)
shared.log.trace(f'Offload: type=balanced op=pre:status module={module_instance.__class__.__name__} device={module_instance.device} dtype={module_instance.dtype}')
shared.log.trace(f'Offload: type=balanced op=pre:status forward={module.__class__.__name__} module={module_name} class={module_instance.__class__.__name__} pipe={_i} device={module_instance.device} dtype={module_instance.dtype}')
self.last_pre = _id
return args, kwargs
def post_forward(self, module, output):
@@ -283,6 +291,7 @@ def get_module_names(pipe=None, exclude=[]):
modules_names = get_signature(pipe).keys()
modules_names = [m for m in modules_names if m not in exclude and not m.startswith('_')]
modules_names = [m for m in modules_names if isinstance(getattr(pipe, m, None), torch.nn.Module)]
modules_names = list(sorted(set(modules_names)))
return modules_names
@@ -308,6 +317,17 @@ def get_module_sizes(pipe=None, exclude=[]):
def move_module_to_cpu(module, op='unk', force:bool=False):
def do_move(module):
if shared.opts.diffusers_offload_streams:
global move_stream # pylint: disable=global-statement
if move_stream is None:
move_stream = torch.cuda.Stream(device=devices.device)
with torch.cuda.stream(move_stream):
module = module.to(devices.cpu)
else:
module = module.to(devices.cpu)
return module
try:
module_name = getattr(module, "module_name", module.__class__.__name__)
module_size = offload_hook_instance.offload_map.get(module_name, offload_hook_instance.model_size())
@@ -318,17 +338,17 @@ def move_module_to_cpu(module, op='unk', force:bool=False):
op = f'{op}:skip'
if force:
op = f'{op}:force'
module = module.to(devices.cpu)
module = do_move(module)
used_gpu -= module_size
elif module_cls in offload_hook_instance.offload_never:
op = f'{op}:never'
elif module_cls in offload_hook_instance.offload_always:
op = f'{op}:always'
module = module.to(devices.cpu)
module = do_move(module)
used_gpu -= module_size
elif perc_gpu > shared.opts.diffusers_offload_min_gpu_memory:
op = f'{op}:mem'
module = module.to(devices.cpu)
module = do_move(module)
used_gpu -= module_size
if debug:
quant = getattr(module, "quantization_method", None)

View File

@@ -170,6 +170,7 @@ options_templates.update(options_section(('offload', "Model Offloading"), {
"diffusers_offload_nonblocking": OptionInfo(False, "Non-blocking move operations"),
"offload_balanced_sep": OptionInfo("<h2>Balanced Offload</h2>", "", gr.HTML),
"diffusers_offload_pre": OptionInfo(True, "Offload during pre-forward"),
"diffusers_offload_streams": OptionInfo(False, "Offload using streams"),
"diffusers_offload_min_gpu_memory": OptionInfo(startup_offload_min_gpu, "Offload low watermark", gr.Slider, {"minimum": 0, "maximum": 1, "step": 0.01 }),
"diffusers_offload_max_gpu_memory": OptionInfo(startup_offload_max_gpu, "Offload GPU high watermark", gr.Slider, {"minimum": 0.1, "maximum": 1, "step": 0.01 }),
"diffusers_offload_max_cpu_memory": OptionInfo(0.90, "Offload CPU high watermark", gr.Slider, {"minimum": 0, "maximum": 1, "step": 0.01, "visible": False }),

View File

@@ -14,6 +14,8 @@ version_map = {
"SDXL Hyper": "SD XL",
"StableDiffusion3": "SD 3",
"StableDiffusionXL": "SD XL",
"WanToVideo": "Wan",
"WanVACE": "Wan",
}
class ExtraNetworksPageCheckpoints(ui_extra_networks.ExtraNetworksPage):

View File

@@ -169,6 +169,12 @@ models = {
te_cls=transformers.T5EncoderModel,
dit_cls=diffusers.WanTransformer3DModel,
dit_folder=("transformer", "transformer_2")),
Model(name='WAN 2.2 14B VACE',
url='https://huggingface.co/Wan-AI/Wan2.2-14B-VACE-T2V-Diffusers',
repo='linoyts/Wan2.2-VACE-Fun-14B-diffusers',
repo_cls=diffusers.WanVACEPipeline,
te_cls=transformers.T5EncoderModel,
dit_cls=diffusers.WanVACETransformer3DModel),
Model(name='WAN 2.1 1.3B T2V',
url='https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B-Diffusers',
repo='Wan-AI/Wan2.1-T2V-1.3B-Diffusers',
@@ -204,13 +210,13 @@
repo='Wan-AI/Wan2.1-VACE-1.3B-diffusers',
repo_cls=diffusers.WanVACEPipeline,
te_cls=transformers.T5EncoderModel,
dit_cls=diffusers.WanTransformer3DModel),
dit_cls=diffusers.WanVACETransformer3DModel),
Model(name='WAN 2.1 VACE 14B',
url='https://huggingface.co/Wan-AI/Wan2.1-VACE-14B-diffusers',
repo='Wan-AI/Wan2.1-VACE-14B-diffusers',
repo_cls=diffusers.WanVACEPipeline,
te_cls=transformers.T5EncoderModel,
dit_cls=diffusers.WanTransformer3DModel),
dit_cls=diffusers.WanVACETransformer3DModel),
],
'SkyReels V2': [
Model(name='None'),

View File

@@ -32,19 +32,22 @@ def set_overrides(p: processing.StableDiffusionProcessingVideo, selected: Model)
if selected.name == 'Latte 1 T2V':
p.task_args['enable_temporal_attentions'] = True
p.task_args['video_length'] = 16 * (max(p.frames // 16, 1))
# SkyReels
if 'SkyReelsV2DiffusionForcing' in cls:
p.task_args['overlap_history'] = 17
# LTX
if cls == 'LTXImageToVideoPipeline' or cls == 'LTXConditionPipeline':
p.task_args['generator'] = None
if cls == 'LTXConditionPipeline':
p.task_args['strength'] = p.denoising_strength
if 'LTX' in cls:
p.task_args['width'] = 32 * (p.width // 32)
p.task_args['height'] = 32 * (p.height // 32)
# WAN
if 'Wan' in cls:
p.task_args['width'] = 16 * (p.width // 16)
p.task_args['height'] = 16 * (p.height // 16)
p.frames = 4 * (max(p.frames // 4, 1)) + 1
# LTX
if 'LTX' in cls:
p.task_args['width'] = 32 * (p.width // 32)
p.task_args['height'] = 32 * (p.height // 32)
if 'SkyReelsV2DiffusionForcing' in cls:
p.task_args['overlap_history'] = 17
# WAN VACE
if 'WanVACEPipeline' in cls:
p.task_args['reference_images'] = 1
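The Wan override above snaps spatial dimensions to multiples of 16 and forces a frame count of the form 4k+1 (likely a constraint of the temporal compression in Wan's VAE). As a tiny helper, with a name of our own:

```python
def snap_wan_dims(width: int, height: int, frames: int) -> tuple:
    # spatial dims must be divisible by 16; frame count must be 4*k + 1
    return (16 * (width // 16), 16 * (height // 16), 4 * max(frames // 4, 1) + 1)
```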

View File

@@ -53,7 +53,10 @@ def generate(*args, **kwargs):
p.do_not_save_grid = True
p.do_not_save_samples = not save_frames
p.outpath_samples = shared.opts.outdir_samples or shared.opts.outdir_video
if 'I2V' in model:
if 'T2V' in model:
if init_image is not None:
shared.log.warning('Video: op=T2V init image not supported')
elif 'I2V' in model:
if init_image is None:
return video_utils.queue_err('init image not set')
p.task_args['image'] = images.resize_image(resize_mode=2, im=init_image, width=p.width, height=p.height, upscaler_name=None, output_type='pil')
@@ -66,9 +69,10 @@ def generate(*args, **kwargs):
p.task_args['image'] = images.resize_image(resize_mode=2, im=init_image, width=p.width, height=p.height, upscaler_name=None, output_type='pil')
p.task_args['last_image'] = images.resize_image(resize_mode=2, im=last_image, width=p.width, height=p.height, upscaler_name=None, output_type='pil')
shared.log.debug(f'Video: op=FLF2V init={init_image} last={last_image} resized={p.task_args["image"]}')
elif 'T2V' in model:
elif 'VACE' in model:
if init_image is not None:
shared.log.warning('Video: op=T2V init image not supported')
p.task_args['reference_images'] = [images.resize_image(resize_mode=2, im=init_image, width=p.width, height=p.height, upscaler_name=None, output_type='pil')]
shared.log.debug(f'Video: op=VACE reference={init_image} resized={p.task_args["reference_images"]}')
else:
shared.log.warning(f'Video: unknown model type "{model}"')

View File

@@ -8,6 +8,11 @@ def load_transformer(repo_id, diffusers_load_config={}, subfolder='transformer')
load_args, quant_args = model_quant.get_dit_args(diffusers_load_config, module='Model', device_map=True)
fn = None
if 'VACE' in repo_id:
transformer_cls = diffusers.WanVACETransformer3DModel
else:
transformer_cls = diffusers.WanTransformer3DModel
if shared.opts.sd_unet is not None and shared.opts.sd_unet != 'Default':
from modules import sd_unet
if shared.opts.sd_unet not in list(sd_unet.unet_dict):
@@ -20,7 +25,7 @@ def load_transformer(repo_id, diffusers_load_config={}, subfolder='transformer')
transformer = None
elif fn is not None and 'safetensors' in fn.lower():
shared.log.debug(f'Load model: type=WanAI {subfolder}="{fn}" quant="{model_quant.get_quant(repo_id)}" args={load_args}')
transformer = diffusers.WanTransformer3DModel.from_single_file(
transformer = transformer_cls.from_single_file(
fn,
cache_dir=shared.opts.hfcache_dir,
**load_args,
@@ -28,7 +33,7 @@ def load_transformer(repo_id, diffusers_load_config={}, subfolder='transformer')
)
else:
shared.log.debug(f'Load model: type=WanAI {subfolder}="{repo_id}" quant="{model_quant.get_quant_type(quant_args)}" args={load_args}')
transformer = diffusers.WanTransformer3DModel.from_pretrained(
transformer = transformer_cls.from_pretrained(
repo_id,
subfolder=subfolder,
cache_dir=shared.opts.hfcache_dir,
@@ -60,7 +65,7 @@ def load_wan(checkpoint_info, diffusers_load_config={}):
repo_id = sd_models.path_to_repo(checkpoint_info)
sd_models.hf_auth_check(checkpoint_info)
if 'a14b' in repo_id.lower():
if 'a14b' in repo_id.lower() or 'fun-14b' in repo_id.lower():
if shared.opts.model_wan_stage == 'high noise' or shared.opts.model_wan_stage == 'first':
transformer = load_transformer(repo_id, diffusers_load_config, 'transformer')
transformer_2 = None
@@ -83,13 +88,18 @@ def load_wan(checkpoint_info, diffusers_load_config={}):
boundary_ratio = shared.opts.model_wan_boundary if transformer_2 is not None else None
if 'Wan2.2-I2V' in repo_id:
cls = diffusers.WanImageToVideoPipeline
pipe_cls = diffusers.WanImageToVideoPipeline
diffusers.pipelines.auto_pipeline.AUTO_IMAGE2IMAGE_PIPELINES_MAPPING["wanai"] = diffusers.WanImageToVideoPipeline
elif 'Wan2.2-VACE' in repo_id:
pipe_cls = diffusers.WanVACEPipeline
diffusers.pipelines.auto_pipeline.AUTO_TEXT2IMAGE_PIPELINES_MAPPING["wanai"] = diffusers.WanVACEPipeline
diffusers.pipelines.auto_pipeline.AUTO_IMAGE2IMAGE_PIPELINES_MAPPING["wanai"] = diffusers.WanVACEPipeline
diffusers.pipelines.auto_pipeline.AUTO_INPAINT_PIPELINES_MAPPING["wanai"] = diffusers.WanVACEPipeline
else:
cls = diffusers.WanPipeline
pipe_cls = diffusers.WanPipeline
diffusers.pipelines.auto_pipeline.AUTO_TEXT2IMAGE_PIPELINES_MAPPING["wanai"] = diffusers.WanPipeline
shared.log.debug(f'Load model: type=WanAI model="{checkpoint_info.name}" repo="{repo_id}" cls={cls.__name__} offload={shared.opts.diffusers_offload_mode} dtype={devices.dtype} args={load_args} stage="{shared.opts.model_wan_stage}" boundary={boundary_ratio}')
pipe = cls.from_pretrained(
shared.log.debug(f'Load model: type=WanAI model="{checkpoint_info.name}" repo="{repo_id}" cls={pipe_cls.__name__} offload={shared.opts.diffusers_offload_mode} dtype={devices.dtype} args={load_args} stage="{shared.opts.model_wan_stage}" boundary={boundary_ratio}')
pipe = pipe_cls.from_pretrained(
repo_id,
transformer=transformer,
transformer_2=transformer_2,
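The pipeline-class selection in `load_wan` reduces to a substring dispatch on the repo id; sketched here with classes returned by name for brevity (the helper itself is ours):

```python
def pick_wan_pipeline(repo_id: str) -> str:
    # mirrors the branching above: I2V -> image-to-video, VACE -> VACE, else T2V
    if "Wan2.2-I2V" in repo_id:
        return "WanImageToVideoPipeline"
    if "Wan2.2-VACE" in repo_id:
        return "WanVACEPipeline"
    return "WanPipeline"
```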