Commit Graph

83 Commits (0a6746898d6864d65e2fc7504e5e875f8c19c0ba)

Author SHA1 Message Date
comfyanonymous e9aae31fa2
Z Image model. (#10892) 2025-11-25 18:41:45 -05:00
comfyanonymous d196a905bb
Lower vram usage for flux 2 text encoder. (#10887) 2025-11-25 14:58:39 -05:00
comfyanonymous dff996ca39
Fix crash. (#10885) 2025-11-25 14:30:24 -05:00
comfyanonymous 6b573ae0cb
Flux 2 (#10879) 2025-11-25 10:50:19 -05:00
comfyanonymous 25022e0b09
Cleanup and fix issues with text encoder quants. (#10872) 2025-11-25 01:48:53 -05:00
comfyanonymous 943b3b615d
HunyuanVideo 1.5 (#10819)
* init

* update

* Update model.py

* Update model.py

* remove print

* Fix text encoding

* Prevent empty negative prompt

Really doesn't work otherwise

* fp16 works

* I2V

* Update model_base.py

* Update nodes_hunyuan.py

* Better latent rgb factors

* Use the correct sigclip output...

* Support HunyuanVideo1.5 SR model

* whitespaces...

* Proper latent channel count

* SR model fixes

This also still needs timestep scheduling based on the noise scale; it can already be used with two samplers.

* vae_refiner: roll the convolution through temporal

Work in progress.

Roll the convolution through time using 2-latent-frame chunks and a
FIFO queue for the convolution seams.

* Support HunyuanVideo15 latent resampler

* fix

* Some cleanup

Co-Authored-By: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>

* Proper hyvid15 I2V channels

Co-Authored-By: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>

* Fix TokenRefiner for fp16

Otherwise x.sum has infs. To be safe, the cast is only applied when the input is fp16; I don't know if that's necessary.

* Bugfix for the HunyuanVideo15 SR model

* vae_refiner: roll the convolution through temporal II

Roll the convolution through time using 2-latent-frame chunks and a
FIFO queue for the convolution seams.

Added encoder support, lowered to 1 latent frame to save more
VRAM, and made it work for Hunyuan Image 3.0 (the code is shared).

Fixed names and cleaned up the code. A sketch of this chunked
rolling scheme follows this entry.

* Allow any number of input frames in VAE.

* Better VAE encode mem estimation.

* Lowvram fix.

* Fix hunyuan image 2.1 refiner.

* Fix mistake.

* Name changes.

* Rename.

* Whitespace.

* Fix.

* Fix.

---------

Co-authored-by: kijai <40791699+kijai@users.noreply.github.com>
Co-authored-by: Rattus <rattus128@gmail.com>
2025-11-20 22:44:43 -05:00
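
A minimal sketch of the chunked temporal rolling described in the vae_refiner commits above, assuming a temporal convolution with no temporal padding; `ChunkedTemporalConv` and the default chunk size are illustrative, not the actual ComfyUI implementation:

```python
import torch
import torch.nn as nn

class ChunkedTemporalConv(nn.Module):
    """Run a temporal convolution chunk-by-chunk, carrying a FIFO cache
    of trailing frames so the seams between chunks convolve exactly as
    if the whole sequence had been processed at once."""

    def __init__(self, channels, kernel_t=3):
        super().__init__()
        # No temporal padding: seams are handled by the FIFO cache.
        self.conv = nn.Conv3d(channels, channels, (kernel_t, 3, 3), padding=(0, 1, 1))
        self.kernel_t = kernel_t

    def forward(self, x, chunk_frames=2):
        # x: (batch, channels, time, height, width)
        cache = x[:, :, :0]  # empty FIFO cache of seam frames
        outputs = []
        for start in range(0, x.shape[2], chunk_frames):
            chunk = x[:, :, start:start + chunk_frames]
            joined = torch.cat([cache, chunk], dim=2)
            # Keep the last kernel_t - 1 frames to seed the next chunk.
            cache = joined[:, :, joined.shape[2] - (self.kernel_t - 1):]
            if joined.shape[2] >= self.kernel_t:
                outputs.append(self.conv(joined))
        return torch.cat(outputs, dim=2)
```

Peak memory then scales with the chunk size rather than the full frame count, which is the VRAM saving these commits describe; lowering from 2 latent frames to 1 saves more at the cost of more, smaller convolutions.
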
comfyanonymous 17027f2a6a
Add a way to disable the final norm in the llama based TE models. (#10794) 2025-11-18 22:36:03 -05:00
comfyanonymous 8aea746212
Implement gemma 3 as a text encoder. (#10241)
Not useful yet.
2025-10-06 22:08:08 -04:00
comfyanonymous 1e098d6132
Don't add template to qwen2.5vl when template is in prompt. (#10043)
Make the hunyuan image refiner template_end 36.
2025-09-26 18:34:17 -04:00
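
A guess at the shape of that check (`<|im_start|>` is Qwen's chat-template delimiter; the exact detection in ComfyUI may differ):

```python
QWEN_TEMPLATE = ("<|im_start|>system\n{system}<|im_end|>\n"
                 "<|im_start|>user\n{prompt}<|im_end|>\n"
                 "<|im_start|>assistant\n")

def wrap_prompt(prompt, system=""):
    # If the caller already wrote the chat template into the prompt,
    # pass it through untouched instead of nesting two templates.
    if "<|im_start|>" in prompt:
        return prompt
    return QWEN_TEMPLATE.format(system=system, prompt=prompt)
```
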
comfyanonymous 1fee8827cb
Support for qwen edit plus model. Use the new TextEncodeQwenImageEditPlus. (#9986) 2025-09-22 16:49:48 -04:00
Kimbing Ng e5e70636e7
Remove single quote pattern to avoid wrong matches (#9842) 2025-09-13 16:59:19 -04:00
comfyanonymous 85e34643f8
Support hunyuan image 2.1 regular model. (#9792) 2025-09-10 02:05:07 -04:00
contentis 97652d26b8
Add explicit casting in apply_rope for Qwen VL (#9759) 2025-09-08 15:08:18 -04:00
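
One plausible shape of that fix, following the pairwise rope rotation used elsewhere in the codebase: cast explicitly to float32 for the rotation instead of relying on implicit dtype promotion (a sketch, not the exact patch):

```python
import torch

def apply_rope(x, freqs_cis):
    # Reshape the last dim into (pairs, 1, 2) and rotate each pair,
    # casting explicitly to float32 so fp16/bf16 inputs don't lose
    # precision (or silently mismatch dtypes) inside the rotation.
    x_ = x.to(torch.float32).reshape(*x.shape[:-1], -1, 1, 2)
    f = freqs_cis.to(torch.float32)
    out = f[..., 0] * x_[..., 0] + f[..., 1] * x_[..., 1]
    return out.reshape(*x.shape).type_as(x)
```
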
comfyanonymous 5a8f502db5
Disable prompt weights for qwen. (#9438) 2025-08-20 01:08:11 -04:00
comfyanonymous dfa791eb4b
Rope fix for qwen vl. (#9435) 2025-08-19 20:47:42 -04:00
comfyanonymous 4977f203fa
P2 of qwen edit model. (#9412)
* P2 of qwen edit model.

* Typo.

* Fix normal qwen.

* Fix.

* Make the TextEncodeQwenImageEdit also set the ref latent.

If you don't want it to set the ref latent and would rather use the
ReferenceLatent node with your own latent, just disconnect the VAE.
2025-08-18 22:38:34 -04:00
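
The described behavior amounts to the node encoding the reference latent itself only when a VAE is connected; roughly (simplified from the node logic, assuming ComfyUI's node_helpers):

```python
import node_helpers  # ComfyUI helper module

def encode_qwen_image_edit(clip, vae, image, prompt):
    tokens = clip.tokenize(prompt)
    conditioning = clip.encode_from_tokens_scheduled(tokens)
    if vae is not None and image is not None:
        # With a VAE connected, the node sets the reference latent itself.
        # Disconnect the VAE to supply your own via ReferenceLatent instead.
        conditioning = node_helpers.conditioning_set_values(
            conditioning, {"reference_latents": [vae.encode(image)]}, append=True)
    return conditioning
```
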
comfyanonymous c012400240
Initial support for qwen image model. (#9179) 2025-08-04 22:53:25 -04:00
comfyanonymous 938d3e8216
Remove windows line endings. (#8866) 2025-07-11 02:37:51 -04:00
comfyanonymous 170c7bb90c
Fix contiguous issue with pytorch nightly. (#8729) 2025-06-29 06:38:40 -04:00
comfyanonymous ec70ed6aea
Omnigen2 model implementation. (#8669) 2025-06-25 19:35:57 -04:00
comfyanonymous f2289a1f59
Delete useless file. (#8327) 2025-05-29 08:29:37 -04:00
comfyanonymous 5d3cc85e13
Make japanese hiragana and katakana characters work with ACE. (#7997) 2025-05-08 03:32:36 -04:00
comfyanonymous 16417b40d9
Initial ACE-Step model implementation. (#7972) 2025-05-07 08:33:34 -04:00
comfyanonymous 08ff5fa08a Cleanup chroma PR. 2025-04-30 20:57:30 -04:00
Silver 4ca3d84277
Support for Chroma - Flux1 Schnell distilled with CFG (#7355)
* Upload files for Chroma Implementation

* Remove trailing whitespace

* trim more trailing whitespace..oops

* remove unused imports

* Add supported_inference_dtypes

* Set min_length to 0 and remove attention_mask=True

* Set min_length to 1

* get_modulations added from blepping, plus minor changes

* Add lora conversion if statement in lora.py

* Update supported_models.py

* update model_base.py

* add upstream commits

* set ModelType.FLOW, which makes the beta scheduler work properly

* Adjust memory usage factor and remove unnecessary code

* fix mistake

* reduce code duplication

* remove unused imports

* refactor for upstream sync

* sync chroma-support with upstream via syncbranch patch

* Update sd.py

* Add Chroma as option for the OptimalStepsScheduler node
2025-04-30 20:57:00 -04:00
comfyanonymous 23e39f2ba7
Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803) 2025-04-25 19:36:00 -04:00
comfyanonymous fd27494441 Use empty t5 of size 128 for hidream, seems to give closer results. 2025-04-19 19:49:40 -04:00
power88 f43e1d7f41
Hidream: Allow loading hidream text encoders in CLIPLoader and DualCLIPLoader (#7676)
* Hidream: Allow partial loading text encoders

* reformat code for ruff check.
2025-04-19 19:47:30 -04:00
comfyanonymous 636d4bfb89 Fix hard crash when the spiece tokenizer path is bad. 2025-04-19 15:55:43 -04:00
comfyanonymous 9899d187b1 Limit T5 to 128 tokens for HiDream: #7620 2025-04-16 18:07:55 -04:00
comfyanonymous 9ad792f927 Basic support for hidream i1 model. 2025-04-15 17:35:05 -04:00
comfyanonymous 6fc5dbd52a Cleanup. 2025-04-15 12:13:28 -04:00
comfyanonymous 3e8155f7a3 More flexible long clip support.
Add clip g long clip support.

Text encoder refactor.

Support llama models with different vocab sizes.
2025-04-15 10:32:21 -04:00
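
Supporting llama variants with different vocab sizes mostly means sizing the embedding from the checkpoint rather than from a hardcoded constant; a minimal sketch (the key name is illustrative):

```python
import torch.nn as nn

def build_embeddings(state_dict, key="model.embed_tokens.weight"):
    # Infer (vocab_size, hidden) from the checkpoint tensor so llama
    # variants with different vocabularies load without special cases.
    vocab_size, hidden_size = state_dict[key].shape
    return nn.Embedding(vocab_size, hidden_size)
```
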
comfyanonymous be4e760648 Add an image_interleave option to the Hunyuan image to video encode node.
See the tooltip for what it does.
2025-03-07 19:56:26 -05:00
comfyanonymous 29a70ca101 Support HunyuanVideo image to video model. 2025-03-06 03:07:15 -05:00
comfyanonymous 85ef295069 Make applying embeddings more efficient.
Adding new tokens no longer makes a whole copy of the embeddings weight
which can be massive on certain models.
2025-03-05 17:34:38 -05:00
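
The inefficiency being removed is the old pattern of concatenating custom-embedding rows onto a full copy of the weight matrix; a sketch of a copy-free lookup, with hypothetical names (`extra_rows` holds only the newly added tokens):

```python
import torch
import torch.nn.functional as F

def embed_with_extra_tokens(weight, extra_rows, token_ids):
    # Ids below vocab_size are looked up in the original (possibly huge)
    # weight; ids at or above it index the small extra_rows tensor. No
    # copy of the full embedding matrix is ever made.
    vocab_size = weight.shape[0]
    out = torch.empty(token_ids.shape + weight.shape[1:],
                      dtype=weight.dtype, device=weight.device)
    base = token_ids < vocab_size
    out[base] = F.embedding(token_ids[base], weight)
    out[~base] = extra_rows[token_ids[~base] - vocab_size]
    return out
```
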
comfyanonymous 65042f7d39 Make it easier to set a custom template for hunyuan video. 2025-03-04 09:26:05 -05:00
comfyanonymous 3ea3bc8546 Fix wan issues when prompt length is long. 2025-02-26 20:34:02 -05:00
comfyanonymous 63023011b9 WIP support for Wan t2v model. 2025-02-25 17:20:35 -05:00
comfyanonymous f40076096e Cleanup some lumina te code. 2025-02-25 04:10:26 -05:00
comfyanonymous e5ea112a90 Support Lumina 2 model. 2025-02-04 04:16:30 -05:00
comfyanonymous 44e19a28d3 Use maximum negative value instead of -inf for masks in text encoders.
This is probably more correct.
2025-02-02 09:46:00 -05:00
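
The distinction matters because a row that is entirely -inf softmaxes to NaN, while a row of large-but-finite negatives softmaxes to a uniform distribution; a sketch:

```python
import torch

def additive_attention_mask(pad_mask, dtype):
    # pad_mask: (batch, seq_len) bool, True where a token is real.
    # torch.finfo(dtype).min instead of float("-inf"): a fully padded
    # row then yields finite (uniform) attention instead of NaNs.
    mask = torch.zeros(pad_mask.shape, dtype=dtype)
    mask = mask.masked_fill(~pad_mask, torch.finfo(dtype).min)
    return mask[:, None, None, :]  # broadcast over heads and query positions
```
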
comfyanonymous 2ff3104f70 WIP support for Nvidia Cosmos 7B and 14B text to world (video) models. 2025-01-10 09:14:16 -05:00
comfyanonymous d0f3752e33 Properly calculate inner dim for t5 model.
This is required to support some different types of t5 models.
2025-01-07 17:33:03 -05:00
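
The subtlety is that T5's attention width is num_heads * d_kv, which happens to equal d_model in the common configs but not in all of them (the original t5-11b has d_model 1024 but 128 heads of size 128). A one-line illustration:

```python
def t5_inner_dim(num_heads, d_kv):
    # Size the q/k/v/o projections from the config's own head fields,
    # never by assuming inner_dim == d_model.
    return num_heads * d_kv

assert t5_inner_dim(12, 64) == 768        # t5-base: equals d_model (768)
assert t5_inner_dim(128, 128) == 16384    # t5-11b: d_model is only 1024
```
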
City bddb02660c
Add PixArt model support (#6055)
* PixArt initial version

* PixArt Diffusers convert logic

* pos_emb and interpolation logic

* Reduce duplicate code

* Formatting

* Use optimized attention

* Edit empty token logic

* Basic PixArt LoRA support

* Fix aspect ratio logic

* PixArtAlpha text encode with conds

* Use same detection key logic for PixArt diffusers
2024-12-20 15:25:00 -05:00
comfyanonymous a4f59bc65e Pick attention implementation based on device in llama code. 2024-12-18 01:30:20 -05:00
comfyanonymous ca457f7ba1 Properly tokenize the template for hunyuan video. 2024-12-17 16:22:02 -05:00
comfyanonymous d6656b0c0c Support llama hunyuan video text encoder in scaled fp8 format. 2024-12-17 04:19:22 -05:00
comfyanonymous bda1482a27 Basic Hunyuan Video model support. 2024-12-16 19:35:40 -05:00
Chenlei Hu d9d7f3c619
Lint all unused variables (#5989)
* Enable F841

* Autofix

* Remove all unused variable assignment
2024-12-12 17:59:16 -05:00