108 Commits (cc6f9500a1b972e9dca14e769f4b70a8927ffa43)

Author SHA1 Message Date
rattus 535c16ce6e
Widen OOM_EXCEPTION to AcceleratorError form (#12835)
PyTorch only filters for OOMs in its own allocators; however, there are
paths that can OOM on allocators made outside the PyTorch allocators.
These manifest as an AcceleratorError, as PyTorch does not have universal
error translation to its OOM type on exception. Handle it. A log I have
for this also shows a double report of the error async, so call the
async discarder to clean up and make these OOMs look like OOMs.
2026-03-10 00:41:02 -04:00
comfyanonymous a50c32d63f
Disable sage attention on ace step 1.5 (#12297) 2026-02-04 22:15:30 -05:00
mengqin 0357ed7ec4
Add support for sage attention 3 in comfyui, enable via new cli arg (#11026)
* Add support for sage attention 3 in comfyui, enable via new cli arg
--use-sage-attention3

* Fix some bugs found in PR review. The N dimension at which Sage
Attention 3 takes effect is reduced to 1024 (although the improvement is
not significant at this scale).

* Remove the Sage Attention3 switch, but retain the attention function
registration.

* Fix a ruff check issue in attention.py
2025-12-30 22:53:52 -05:00
rattus 277237ccc1
attention: use flag based OOM fallback (#11038)
A caught exception references all local variables for the lifetime of
the exception context. Just set a flag in the handler, then check it
afterwards so the exception is dropped before falling back.
2025-12-02 17:24:19 -05:00
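
The flag-based fallback described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `MemoryError` stands in for the backend OOM exception, and `fast_attention`, `slow_attention`, and `attention_with_fallback` are hypothetical names.

```python
def fast_attention(x):
    # Hypothetical fast path that runs out of memory on large inputs.
    if len(x) > 4:
        raise MemoryError("simulated OOM")
    return [v * 2 for v in x]


def slow_attention(x):
    # Hypothetical low-memory path that produces the same result.
    return [v * 2 for v in x]


def attention_with_fallback(x):
    oom = False
    try:
        return fast_attention(x)
    except MemoryError:
        # Only record that the OOM happened. Running the fallback inside
        # this handler would keep the exception, its traceback, and every
        # local variable the traceback references alive while the
        # fallback is trying to allocate memory.
        oom = True
    # The handler has exited, so the exception is no longer being held.
    if oom:
        return slow_attention(x)
```

Calling `attention_with_fallback(list(range(6)))` takes the simulated OOM path and still returns the doubled list.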
blepping 1a85483da1
Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884)
Correctly handle the case where w0 is passed by kwargs in BatchedBrownianTree
2025-09-15 20:05:03 -04:00
Jedrzej Kosinski f228367c5e
Make ModuleNotFoundError ImportError instead (#9850) 2025-09-13 21:34:21 -04:00
Jedrzej Kosinski d7f40442f9
Enable Runtime Selection of Attention Functions (#9639)
* Looking into a @wrap_attn decorator to look for 'optimized_attention_override' entry in transformer_options

* Created logging code for this branch so that it can be used to track down all the code paths where transformer_options would need to be added

* Fix memory usage issue with inspect

* Made WAN attention receive transformer_options, test node added to wan to test out attention override later

* Added **kwargs to all attention functions so transformer_options could potentially be passed through

* Make sure wrap_attn doesn't make itself recurse infinitely, attempt to load SageAttention and FlashAttention if not enabled so that they can be marked as available or not, create registry for available attention

* Turn off attention logging for now, make AttentionOverrideTestNode have a dropdown with available attention (this is a test node only)

* Make flux work with optimized_attention_override

* Add logs to verify optimized_attention_override is passed all the way into attention function

* Make Qwen work with optimized_attention_override

* Made hidream work with optimized_attention_override

* Made wan patches_replace work with optimized_attention_override

* Made SD3 work with optimized_attention_override

* Made HunyuanVideo work with optimized_attention_override

* Made Mochi work with optimized_attention_override

* Made LTX work with optimized_attention_override

* Made StableAudio work with optimized_attention_override

* Made optimized_attention_override work with ACE Step

* Made Hunyuan3D work with optimized_attention_override

* Make CosmosPredict2 work with optimized_attention_override

* Made CosmosVideo work with optimized_attention_override

* Made Omnigen 2 work with optimized_attention_override

* Made StableCascade work with optimized_attention_override

* Made AuraFlow work with optimized_attention_override

* Made Lumina work with optimized_attention_override

* Made Chroma work with optimized_attention_override

* Made SVD work with optimized_attention_override

* Fix WanI2VCrossAttention so that it expects to receive transformer_options

* Fixed Wan2.1 Fun Camera transformer_options passthrough

* Fixed WAN 2.1 VACE transformer_options passthrough

* Add optimized to get_attention_function

* Disable attention logs for now

* Remove attention logging code

* Remove _register_core_attention_functions, as we wouldn't want someone to call that, just in case

* Satisfy ruff

* Remove AttentionOverrideTest node, that's something to cook up for later
2025-09-12 18:07:38 -04:00
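
The override mechanism this commit describes (a `wrap_attn` decorator that looks for an `optimized_attention_override` entry in `transformer_options`, guards against recursing into itself, and a registry of available attention functions) might look something like the sketch below. Only `wrap_attn`, `transformer_options`, and `optimized_attention_override` come from the commit message; the registry name, the guard key, and the example functions are assumptions made for illustration.

```python
import functools

# Hypothetical registry of attention functions available at runtime.
REGISTERED_ATTENTION = {}


def register_attention(name):
    def decorator(func):
        REGISTERED_ATTENTION[name] = func
        return func
    return decorator


def wrap_attn(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Guard key (assumed name): when present, an override is already
        # running, so call the real function instead of dispatching again.
        if kwargs.get("_inside_attn_wrapper"):
            return func(*args, **kwargs)
        options = kwargs.get("transformer_options") or {}
        override = options.get("optimized_attention_override")
        if override is None:
            return func(*args, **kwargs)
        return override(func, *args, _inside_attn_wrapper=True, **kwargs)
    return wrapper


@register_attention("basic")
@wrap_attn
def attention_basic(q, k, v, **kwargs):
    # Placeholder body; a real attention function would compute on tensors.
    return ("basic", q, k, v)


def my_override(original, q, k, v, **kwargs):
    # Calls a wrapped function from the registry; the guard key carried in
    # kwargs keeps the wrapper from invoking this override a second time.
    name, *rest = REGISTERED_ATTENTION["basic"](q, k, v, **kwargs)
    return ("override->" + name, *rest)
```

Passing `transformer_options={"optimized_attention_override": my_override}` to `attention_basic` routes the call through the override, while a call without it runs the function directly.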
comfyanonymous 9df8792d4b
Make last PR not crash comfy on old pytorch. (#9324) 2025-08-13 15:12:41 -04:00
contentis 3da5a07510
SDPA backend priority (#9299) 2025-08-13 14:53:27 -04:00
Kohaku-Blueleaf 520eb77b72
LoRA Trainer: LoRA training node in weight adapter scheme (#8446) 2025-06-13 19:25:59 -04:00
comfyanonymous 5a87757ef9
Better error if sageattention is installed but a dependency is missing. (#8264) 2025-05-24 06:43:12 -04:00
Raphael Walker 89e4ea0175
Add activations_shape info in UNet models (#7482)
* Add activations_shape info in UNet models

* activations_shape should be a list
2025-04-04 21:27:54 -04:00
comfyanonymous e471c726e5 Fallback to pytorch attention if sage attention fails. 2025-03-22 15:45:56 -04:00
FeepingCreature 9c98c6358b
Tolerate missing `@torch.library.custom_op` (#7234)
This can happen on Pytorch versions older than 2.4.
2025-03-14 09:51:26 -04:00
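
One way to tolerate a missing decorator such as `torch.library.custom_op` is to fall back to a no-op decorator factory when the attribute does not exist. The sketch below is an assumption about the general approach, not the code from the commit; it uses a plain namespace object in place of `torch.library` so that it runs without PyTorch, and `optional_decorator_factory` and `example::double` are hypothetical names.

```python
import types


def optional_decorator_factory(namespace, name):
    # Return namespace.<name> if it exists, else a factory whose
    # decorators leave the function unchanged.
    factory = getattr(namespace, name, None)
    if factory is not None:
        return factory

    def noop_factory(*args, **kwargs):
        def passthrough(func):
            return func
        return passthrough

    return noop_factory


# Stand-in for an older library version that lacks `custom_op`.
old_library = types.SimpleNamespace()
custom_op = optional_decorator_factory(old_library, "custom_op")


@custom_op("example::double", mutates_args=())
def double(x):
    # Runs as an ordinary function because the decorator was unavailable.
    return x * 2
```

With the fallback in place the decorated function behaves as a normal Python function on versions where the decorator is absent.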
FeepingCreature 7aceb9f91c
Add --use-flash-attention flag. (#7223)
* Add --use-flash-attention flag.
This is useful on AMD systems, as FA builds are still 10% faster than Pytorch cross-attention.
2025-03-14 03:22:41 -04:00
comfyanonymous 96d891cb94 Speedup on some models by not upcasting bfloat16 to float32 on mac. 2025-02-24 05:41:32 -05:00
comfyanonymous aff16532d4 Remove some useless code. 2025-02-22 04:45:14 -05:00
Dr.Lt.Data 0a0df5f136
better guide message for sageattention (#6634) 2025-02-02 09:26:47 -05:00
comfyanonymous 129d8908f7 Add argument to skip the output reshaping in the attention functions. 2025-01-10 06:27:37 -05:00
comfyanonymous 37e5390f5f Add: --use-sage-attention to enable SageAttention.
You need to have the library installed first.
2024-12-18 01:56:10 -05:00
comfyanonymous 19ee5d9d8b Don't expand mask when not necessary.
Expanding seems to slow down inference.
2024-12-16 18:22:50 -05:00
Raphael Walker 61b50720d0
Add support for attention masking in Flux (#5942)
* fix attention OOM in xformers

* allow passing attention mask in flux attention

* allow an attn_mask in flux

* attn masks can be done using replace patches instead of a separate dict

* fix return types

* fix return order

* enumerate

* patch the right keys

* arg names

* fix a silly bug

* fix xformers masks

* replace match with if, elif, else

* mask with image_ref_size

* remove unused import

* remove unused import 2

* fix pytorch/xformers attention

This corrects a weird inconsistency with skip_reshape.
It also allows masks of various shapes to be passed, which will be
automatically expanded (in a memory-efficient way) to a size that is
compatible with xformers or pytorch sdpa respectively.

* fix mask shapes
2024-12-16 18:21:17 -05:00
Chenlei Hu d9d7f3c619
Lint all unused variables (#5989)
* Enable F841

* Autofix

* Remove all unused variable assignment
2024-12-12 17:59:16 -05:00
comfyanonymous 2fd9c1308a Fix mask issue in some attention functions. 2024-11-22 02:10:09 -05:00
comfyanonymous 07f6eeaa13 Fix mask issue with attention_xformers. 2024-11-20 17:07:46 -05:00
comfyanonymous fabf449feb Mochi VAE encoder. 2024-11-01 17:33:09 -04:00
comfyanonymous 33fb282d5c Fix issue. 2024-08-14 02:51:47 -04:00
comfyanonymous bb1969cab7 Initial support for the stable audio open model. 2024-06-15 12:14:56 -04:00
comfyanonymous 0920e0e5fe Remove some unused imports. 2024-05-27 19:08:27 -04:00
comfyanonymous 8508df2569 Work around black image bug on Mac 14.5 by forcing attention upcasting. 2024-05-21 16:56:33 -04:00
comfyanonymous 83d969e397 Disable xformers when tracing model. 2024-05-21 13:55:49 -04:00
comfyanonymous 1900e5119f Fix potential issue. 2024-05-20 08:19:54 -04:00
comfyanonymous 0bdc2b15c7 Cleanup. 2024-05-18 10:11:44 -04:00
comfyanonymous 98f828fad9 Remove unnecessary code. 2024-05-18 09:36:44 -04:00
comfyanonymous 46daf0a9a7 Add debug options to force on and off attention upcasting. 2024-05-16 04:09:41 -04:00
comfyanonymous ec6f16adb6 Fix SAG. 2024-05-14 18:02:27 -04:00
comfyanonymous bb4940d837 Only enable attention upcasting on models that actually need it. 2024-05-14 17:00:50 -04:00
comfyanonymous b0ab31d06c Refactor attention upcasting code part 1. 2024-05-14 12:47:31 -04:00
comfyanonymous 2aed53c4ac Workaround xformers bug. 2024-04-30 21:23:40 -04:00
comfyanonymous 2a813c3b09 Switch some more prints to logging. 2024-03-11 16:34:58 -04:00
comfyanonymous 6bcf57ff10 Fix attention masks properly for multiple batches. 2024-02-17 16:15:18 -05:00
comfyanonymous f8706546f3 Fix attention mask batch size in some attention functions. 2024-02-17 15:22:21 -05:00
comfyanonymous 3b9969c1c5 Properly fix attention masks in CLIP with batches. 2024-02-17 12:13:13 -05:00
comfyanonymous 89507f8adf Remove some unused imports. 2024-01-25 23:42:37 -05:00
comfyanonymous 6a7bc35db8 Use basic attention implementation for small inputs on old pytorch. 2024-01-09 13:46:52 -05:00
comfyanonymous c6951548cf Update optimized_attention_for_device function for new functions that
support masked attention.
2024-01-07 13:52:08 -05:00
comfyanonymous aaa9017302 Add attention mask support to sub quad attention. 2024-01-07 04:13:58 -05:00
comfyanonymous 0c2c9fbdfa Support attention mask in split attention. 2024-01-06 13:16:48 -05:00
comfyanonymous 3ad0191bfb Implement attention mask on xformers. 2024-01-06 04:33:03 -05:00
comfyanonymous a5056cfb1f Remove useless code. 2023-12-15 01:28:16 -05:00