ComfyUI

Commit Graph

Author	SHA1	Message	Date
Jedrzej Kosinski	1b96430c60	Merge master into worksplit-multigpu (#13546 ) * fix: pin SQLAlchemy>=2.0 in requirements.txt (fixes #13036) (#13316) * Refactor io to IO in nodes_ace.py (#13485) * Bump comfyui-frontend-package to 1.42.12 (#13489) * Make the ltx audio vae more native. (#13486) * feat(api-nodes): add automatic downscaling of videos for ByteDance 2 nodes (#13465) * Support standalone LTXV audio VAEs (#13499) * [Partner Nodes] added 4K resolution for Veo models; added Veo 3 Lite model (#13330) * feat(api nodes): added 4K resolution for Veo models; added Veo 3 Lite model Signed-off-by: bigcat88 <bigcat88@icloud.com> * increase poll_interval from 5 to 9 --------- Signed-off-by: bigcat88 <bigcat88@icloud.com> Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com> * Bump comfyui-frontend-package to 1.42.14 (#13493) * Add gpt-image-2 as version option (#13501) * Allow logging in comfy app files. (#13505) * chore: update workflow templates to v0.9.59 (#13507) * fix(veo): reject 4K resolution for veo-3.0 models in Veo3VideoGenerationNode (#13504) The tooltip on the resolution input states that 4K is not available for veo-3.1-lite or veo-3.0 models, but the execute guard only rejected the lite combination. Selecting 4K with veo-3.0-generate-001 or veo-3.0-fast-generate-001 would fall through and hit the upstream API with an invalid request. Broaden the guard to match the documented behavior and update the error message accordingly. 
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com> * feat: RIFE and FILM frame interpolation model support (CORE-29) (#13258) * initial RIFE support * Also support FILM * Better RAM usage, reduce FILM VRAM peak * Add model folder placeholder * Fix oom fallback frame loss * Remove torch.compile for now * Rename model input * Shorter input type name --------- * fix: use Parameter assignment for Stable_Zero123 cc_projection weights (fixes #13492) (#13518) On Windows with aimdo enabled, disable_weight_init.Linear uses lazy initialization that sets weight and bias to None to avoid unnecessary memory allocation. This caused a crash when copy_() was called on the None weight attribute in Stable_Zero123.__init__. Replace copy_() with direct torch.nn.Parameter assignment, which works correctly on both Windows (aimdo enabled) and other platforms. * Derive InterruptProcessingException from BaseException (#13523) * bump manager version to 4.2.1 (#13516) * ModelPatcherDynamic: force cast stray weights on comfy layers (#13487) the mixed_precision ops can have input_scale parameters that are used in tensor math but arent a weight or bias so dont get proper VRAM management. Treat these as force-castable parameters like the non comfy weight, random params are buffers already are. 
* Update logging level for invalid version format (#13526) * [Partner Nodes] add SD2 real human support (#13509) * feat(api-nodes): add SD2 real human support Signed-off-by: bigcat88 <bigcat88@icloud.com> * fix: add validation before uploading Assets Signed-off-by: bigcat88 <bigcat88@icloud.com> * Add asset_id and group_id displaying on the node Signed-off-by: bigcat88 <bigcat88@icloud.com> * extend poll_op to use instead of custom async cycle Signed-off-by: bigcat88 <bigcat88@icloud.com> * added the polling for the "Active" status after asset creation Signed-off-by: bigcat88 <bigcat88@icloud.com> * updated tooltip for group_id * allow usage of real human in the ByteDance2FirstLastFrame node * add reference count limits * corrected price in status when input assets contain video Signed-off-by: bigcat88 <bigcat88@icloud.com> --------- Signed-off-by: bigcat88 <bigcat88@icloud.com> * feat: SAM (segment anything) 3.1 support (CORE-34) (#13408) * [Partner Nodes] GPTImage: fix price badges, add new resolutions (#13519) * fix(api-nodes): fixed price badges, add new resolutions Signed-off-by: bigcat88 <bigcat88@icloud.com> * proper calculate the total run cost when "n > 1" Signed-off-by: bigcat88 <bigcat88@icloud.com> --------- Signed-off-by: bigcat88 <bigcat88@icloud.com> * chore: update workflow templates to v0.9.61 (#13533) * chore: update embedded docs to v0.4.4 (#13535) * add 4K resolution to Kling nodes (#13536) Signed-off-by: bigcat88 <bigcat88@icloud.com> * Fix LTXV Reference Audio node (#13531) * comfy-aimdo 0.2.14: Hotfix async allocator estimations (#13534) This was doing an over-estimate of VRAM used by the async allocator when lots of little small tensors were in play. Also change the versioning scheme to == so we can roll forward aimdo without worrying about stable regressions downstream in comfyUI core. 
* Disable sageattention for SAM3 (#13529) Causes Nans * execution: Add anti-cycle validation (#13169) Currently if the graph contains a cycle, the just inifitiate recursions, hits a catch all then throws a generic error against the output node that seeded the validation. Instead, fail the offending cycling mode chain and handlng it as an error in its own right. Co-authored-by: guill <jacob.e.segal@gmail.com> * chore: update workflow templates to v0.9.62 (#13539) --------- Signed-off-by: bigcat88 <bigcat88@icloud.com> Co-authored-by: Octopus <liyuan851277048@icloud.com> Co-authored-by: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com> Co-authored-by: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com> Co-authored-by: Alexander Piskun <13381981+bigcat88@users.noreply.github.com> Co-authored-by: Jukka Seppänen <40791699+kijai@users.noreply.github.com> Co-authored-by: AustinMroz <austin@comfy.org> Co-authored-by: Daxiong (Lin) <contact@comfyui-wiki.com> Co-authored-by: Matt Miller <matt@miller-media.com> Co-authored-by: blepping <157360029+blepping@users.noreply.github.com> Co-authored-by: Dr.Lt.Data <128333288+ltdrdata@users.noreply.github.com> Co-authored-by: rattus <46076784+rattus128@users.noreply.github.com> Co-authored-by: guill <jacob.e.segal@gmail.com>	2026-04-23 19:20:14 -07:00
Jedrzej Kosinski	aa464b36b3	Multi-GPU device selection for loader nodes + CUDA context fixes (#13483 ) * Fix Hunyuan 3D 2.1 multi-GPU worksplit: use cond_or_uncond instead of hardcoded chunk(2) Amp-Thread-ID: https://ampcode.com/threads/T-019da964-2cc8-77f9-9aae-23f65da233db Co-authored-by: Amp <amp@ampcode.com> * Add GPU device selection to all loader nodes - Add get_gpu_device_options() and resolve_gpu_device_option() helpers in model_management.py for vendor-agnostic GPU device selection - Add device widget to CheckpointLoaderSimple, UNETLoader, VAELoader - Expand device options in CLIPLoader, DualCLIPLoader, LTXAVTextEncoderLoader from [default, cpu] to include gpu:0, gpu:1, etc. on multi-GPU systems - Wire load_diffusion_model_state_dict and load_state_dict_guess_config to respect model_options['load_device'] - Graceful fallback: unrecognized devices (e.g. gpu:1 on single-GPU) silently fall back to default Amp-Thread-ID: https://ampcode.com/threads/T-019daa41-f394-731a-8955-4cff4f16283a Co-authored-by: Amp <amp@ampcode.com> * Add VALIDATE_INPUTS to skip device combo validation for workflow portability When a workflow saved on a 2-GPU machine (with device=gpu:1) is loaded on a 1-GPU machine, the combo validation would reject the unknown value. VALIDATE_INPUTS with the device parameter bypasses combo validation for that input only, allowing resolve_gpu_device_option to handle the graceful fallback at runtime. Amp-Thread-ID: https://ampcode.com/threads/T-019daa41-f394-731a-8955-4cff4f16283a Co-authored-by: Amp <amp@ampcode.com> * Set CUDA device context in outer_sample to match model load_device Custom CUDA kernels (comfy_kitchen fp8 quantization) use torch.cuda.current_device() for DLPack tensor export. When a model is loaded on a non-default GPU (e.g. cuda:1), the CUDA context must match or the kernel fails with 'Can't export tensors on a different CUDA device index'. Save and restore the previous device around sampling. 
Amp-Thread-ID: https://ampcode.com/threads/T-019daa41-f394-731a-8955-4cff4f16283a Co-authored-by: Amp <amp@ampcode.com> * Fix code review bugs: negative index guard, CPU offload_device, checkpoint te_model_options - resolve_gpu_device_option: reject negative indices (gpu:-1) - UNETLoader: set offload_device when cpu is selected - CheckpointLoaderSimple: pass te_model_options for CLIP device, set offload_device for cpu, pass load_device to VAE - load_diffusion_model_state_dict: respect offload_device from model_options - load_state_dict_guess_config: respect offload_device, pass load_device to VAE Amp-Thread-ID: https://ampcode.com/threads/T-019daa41-f394-731a-8955-4cff4f16283a Co-authored-by: Amp <amp@ampcode.com> * Fix CUDA device context for CLIP encoding and VAE encode/decode Add torch.cuda.set_device() calls to match model's load device in: - CLIP.encode_from_tokens: fixes 'Can't export tensors on a different CUDA device index' when CLIP is loaded on a non-default GPU - CLIP.encode_from_tokens_scheduled: same fix for the hooks code path - CLIP.generate: same fix for text generation - VAE.decode: fixes VAE decoding on non-default GPU - VAE.encode: fixes VAE encoding on non-default GPU Same pattern as the existing outer_sample fix in samplers.py - saves and restores previous CUDA device in a try/finally block. Amp-Thread-ID: https://ampcode.com/threads/T-019dabdc-8feb-766f-b4dc-f46ef4d8ff57 Co-authored-by: Amp <amp@ampcode.com> * Extract cuda_device_context manager, fix tiled VAE methods Add model_management.cuda_device_context() — a context manager that saves/restores torch.cuda.current_device when operating on a non-default GPU. Replaces 6 copies of the manual save/set/restore boilerplate. 
Refactored call sites: - CLIP.encode_from_tokens - CLIP.encode_from_tokens_scheduled (hooks path) - CLIP.generate - VAE.decode - VAE.encode - samplers.outer_sample Bug fixes (newly wrapped): - VAE.decode_tiled: was missing device context entirely, would fail on non-default GPU when called from 'VAE Decode (Tiled)' node - VAE.encode_tiled: same issue for 'VAE Encode (Tiled)' node Amp-Thread-ID: https://ampcode.com/threads/T-019dabdc-8feb-766f-b4dc-f46ef4d8ff57 Co-authored-by: Amp <amp@ampcode.com> * Restore CheckpointLoaderSimple, add CheckpointLoaderDevice Revert CheckpointLoaderSimple to its original form (no device input) so it remains the simple default loader. Add new CheckpointLoaderDevice node (advanced/loaders) with separate model_device, clip_device, and vae_device inputs for per-component GPU placement in multi-GPU setups. Amp-Thread-ID: https://ampcode.com/threads/T-019dabdc-8feb-766f-b4dc-f46ef4d8ff57 Co-authored-by: Amp <amp@ampcode.com> --------- Co-authored-by: Amp <amp@ampcode.com>	2026-04-23 19:10:33 -07:00
rattus	7b8b3673ff	comfy-aimdo: 0.0.214 (#13532 ) Cut pre-release 0.0.214 off aimdo master to pickup async mem accounting fix.	2026-04-23 19:09:56 -07:00
Jedrzej Kosinski	b502bcfff9	Merge remote-tracking branch 'origin/master' into worksplit-multigpu	2026-04-20 02:38:33 -07:00
Jedrzej Kosinski	37deccb0d4	Fix Hunyuan 3D 2.1 multi-GPU worksplit: use cond_or_uncond instead of hardcoded chunk(2) (#13478 )	2026-04-20 02:37:18 -07:00
comfyanonymous	fc5f4a996b	Add link to Intel portable to Readme. (#13477 )	2026-04-19 20:26:12 -04:00
Abdul Rehman	138571da95	fix: append directory type annotation to internal files endpoint response (#13078 ) (#13305 )	2026-04-18 23:21:22 -04:00
comfyanonymous	3d816db07f	Some optimizations to make Ernie inference a bit faster. (#13472 )	2026-04-18 23:02:29 -04:00
Jukka Seppänen	b9dedea57d	feat: SUPIR model support (CORE-17) (#13250 )	2026-04-18 23:02:01 -04:00
comfyanonymous	3086026401	ComfyUI v0.19.3	2026-04-17 13:35:01 -04:00
Alexander Piskun	9635c2ec9b	fix(api-nodes): make "obj" output optional in Hunyuan3D Text and Image to 3D (#13449 ) Signed-off-by: bigcat88 <bigcat88@icloud.com> Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>	2026-04-18 01:31:37 +08:00
Daxiong (Lin)	f8d92cf313	chore: update workflow templates to v0.9.57 (#13455 )	2026-04-17 12:16:39 -05:00
Alexander Piskun	4f48be4138	feat(api-nodes): add new "arrow-1.1" and "arrow-1.1-max" SVG models (#13447 ) Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-04-17 12:02:06 -05:00
Alexander Piskun	541fd10bbe	fix(api-nodes): corrected StabilityAI price badges (#13454 ) Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-04-17 11:44:08 -05:00
rattus	05f7531148	nodes_textgen: Implement use_default_template for LTX (#13451 )	2026-04-17 12:20:09 -04:00
comfyanonymous	c033bbf516	ComfyUI v0.19.2	2026-04-17 00:26:35 -04:00
comfyanonymous	1391579c33	Add JsonExtractString node. (#13435 )	2026-04-17 00:20:16 -04:00
Alexander Piskun	d0c53c50c2	feat(api-nodes): add 1080p resolution for SeeDance 2.0 model (#13437 ) Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-04-16 20:32:04 -05:00
Bedovyy	b41ab53b6f	Use `ErnieTEModel_` not `ErnieTEModel`. (#13431 )	2026-04-16 10:11:58 -04:00
rattus	f0d550bd02	Minor updates for worksplit_gpu with comfy-aimdo (#13419 ) * main: init all visible cuda devices in aimdo * mp: call vbars_analyze for the GPU in question * requirements: bump aimdo to pre-release version	2026-04-15 22:49:01 -07:00
comfyanonymous	e9a2d1e4cc	Add a way to disable default template in text gen node. (#13424 )	2026-04-15 22:59:08 -04:00
Jun Yamog	1de83f91c3	Fix OOM regression in _apply() for quantized models during inference (#13372 ) Skip unnecessary clone of inference-mode tensors when already inside torch.inference_mode(), matching the existing guard in set_attr_param. The unconditional clone introduced in `20561aa9` caused transient VRAM doubling during model movement for FP8/quantized models.	2026-04-15 02:10:36 -07:00
comfyanonymous	8f374716ee	ComfyUI v0.19.1	2026-04-14 22:56:13 -04:00
comfyanonymous	cb0bbde402	Fix ernie on devices that don't support fp64. (#13414 )	2026-04-14 22:54:47 -04:00
Daxiong (Lin)	7ce3f64c78	Update workflow templates to v0.9.54 (#13412 )	2026-04-14 17:35:27 -07:00
comfyanonymous	c5569e8627	Add string output to preview text node. (#13406 )	2026-04-14 14:42:23 -04:00
Comfy Org PR Bot	c16db7fd69	Bump comfyui-frontend-package to 1.42.11 (#13398 )	2026-04-14 14:13:35 -04:00
Daxiong (Lin)	fed4ac031a	chore: update workflow templates to v0.9.50 (#13399 )	2026-04-14 14:24:37 +08:00
Alexander Piskun	35dfcbbb28	[Partner Nodes] add Sonilo Audio nodes (#13391 ) * feat(api-nodes): add Sonilo nodes Signed-off-by: bigcat88 <bigcat88@icloud.com> * fix: do not spam frontend with each chunk arrival Signed-off-by: bigcat88 <bigcat88@icloud.com> * updated pricing badge Signed-off-by: bigcat88 <bigcat88@icloud.com> --------- Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-04-13 22:21:01 -07:00
comfyanonymous	722bc73319	Make text generation work with ministral model. (#13395 ) Needs template before it works properly.	2026-04-13 20:43:57 -04:00
comfyanonymous	402ff1cdb7	Fix issue with ernie image. (#13393 )	2026-04-13 16:38:42 -04:00
comfyanonymous	acd718598e	ComfyUI v0.19.0	2026-04-13 03:02:36 -04:00
Daxiong (Lin)	559501e4b8	chore: update workflow templates to v0.9.47 (#13385 )	2026-04-12 23:19:09 -07:00
Alexander Piskun	ee2db7488d	feat(api-nodes): add SeeDance 2.0 nodes (#13364 ) Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-04-12 19:26:19 -10:00
comfyanonymous	c2657d5fb9	Fix typo. (#13382 )	2026-04-12 23:37:13 -04:00
comfyanonymous	971932346a	Update quant doc so it's not completely wrong. (#13381 ) There is still more that needs to be fixed.	2026-04-12 23:27:38 -04:00
comfyanonymous	31283d2892	Implement Ernie Image model. (#13369 )	2026-04-11 22:29:31 -04:00
comfyanonymous	55ebd287ee	Add a supports_fp64 function. (#13368 )	2026-04-11 21:06:36 -04:00
comfyanonymous	a2840e7552	Make ImageUpscaleWithModel node work with intermediate device and dtype. (#13357 )	2026-04-10 21:48:26 -04:00
Jukka Seppänen	a134423890	SDPose: resize input always (#13349 )	2026-04-10 11:26:55 -10:00
Daxiong (Lin)	b920bdd77d	chore: update workflow templates to v0.9.45 (#13353 )	2026-04-10 15:50:40 -04:00
Alexander Piskun	5410ed34f5	fix(api-nodes): fix GrokVideoReferenceNode price badge (#13354 )	2026-04-10 08:01:15 -10:00
Terry Jia	e6be419a30	should use 0 as defalut for brightness (#13345 )	2026-04-09 21:58:05 -04:00
comfyanonymous	3d4aca8084	Bump comfyui-frontend-package version to 1.42.10 (#13346 )	2026-04-09 21:56:49 -04:00
Jedrzej Kosinski	48deb15c0e	Simplify multigpu dispatch: run all devices on pool threads (#13340 ) Benchmarked hybrid (main thread + pool) vs all-pool on 2x RTX 4090 with SD1.5 and NetaYume models. No meaningful performance difference (within noise). All-pool is simpler: eliminates the main_device special case, main_batch_tuple deferred execution, and the 3-way branch in the dispatch loop.	2026-04-09 01:15:57 -07:00
comfyanonymous	2d861fb146	Basic intel standalone package .bat (#13333 )	2026-04-08 21:39:29 -04:00
Jedrzej Kosinski	4b93c4360f	Implement persistent thread pool for multi-GPU CFG splitting (#13329 ) Replace per-step thread create/destroy in _calc_cond_batch_multigpu with a persistent MultiGPUThreadPool. Each worker thread calls torch.cuda.set_device() once at startup, preserving compiled kernel caches across diffusion steps. - Add MultiGPUThreadPool class in comfy/multigpu.py - Create pool in CFGGuider.outer_sample(), shut down in finally block - Main thread handles its own device batch directly for zero overhead - Falls back to sequential execution if no pool is available	2026-04-08 05:39:07 -07:00
Jedrzej Kosinski	da3864436c	Merge remote-tracking branch 'origin/master' into worksplit-multigpu	2026-04-08 05:08:38 -07:00
huemin	b615af1c65	Add support for small flux.2 decoder (#13314 )	2026-04-07 03:44:18 -04:00
comfyanonymous	40862c0776	Support Ace Step 1.5 XL model. (#13317 )	2026-04-07 03:13:47 -04:00
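
The loader-node commit above (aa464b36b3) describes device options of the form default, cpu, gpu:0, gpu:1, where unrecognized values such as gpu:1 on a single-GPU machine fall back to the default and negative indices such as gpu:-1 are rejected. The following is a minimal sketch of that resolution logic, not the actual model_management.py code. The function names, the gpu_count parameter, the choice to list gpu:N entries only when more than one GPU is present, and the use of None to mean "use the default device" are all assumptions made for illustration.

```python
from typing import List, Optional


def list_device_options(gpu_count: int) -> List[str]:
    """Hypothetical helper: build the choices shown in a device widget."""
    options = ["default", "cpu"]
    # Assumption: per-GPU entries are only offered on multi-GPU systems.
    if gpu_count > 1:
        options += [f"gpu:{i}" for i in range(gpu_count)]
    return options


def resolve_device_option(option: str, gpu_count: int) -> Optional[str]:
    """Hypothetical helper: map a widget value to a usable device name.

    Returns "cpu", "gpu:N" for a valid index, or None, which here stands
    for "fall back to the default device".
    """
    if option == "cpu":
        return "cpu"
    if option.startswith("gpu:"):
        index_text = option[len("gpu:"):]
        # Plain ASCII digits only, so "-1", "" and "abc" are all rejected.
        if index_text.isascii() and index_text.isdigit():
            index = int(index_text)
            if index < gpu_count:
                return f"gpu:{index}"
    # "default", unknown strings and out-of-range indices all land here.
    return None
```

With this shape, a workflow saved with gpu:1 on a two-GPU machine resolves to None on a single-GPU machine instead of raising, which matches the portability behavior the commit message describes.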

5182 Commits (worksplit-multigpu)