Commit Graph

254 Commits (10bbbed218458b8a899aac2140ec738d8d716f05)

Author SHA1 Message Date
Vladimir Mandic 4b95d72d45 video tab layout
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-18 14:07:52 -04:00
Vladimir Mandic 1ebd96fdc6 add kandinsky5-lite t2v
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-18 12:15:36 -04:00
Disty0 4a70e82b0c ROCm always use numpy on cholesky 2025-09-28 14:05:20 +03:00
Disty0 a47959b114 move ROCm Windows hijacks outside of torch install 2025-09-28 13:33:15 +03:00
Disty0 6766563510 Add info log for Building CK Flash attention 2025-09-10 19:25:07 +03:00
Disty0 bc7c89c070 add typing to sdpa hijacks 2025-09-10 04:43:57 +03:00
Disty0 c51552af90 Enable triton flash atten option for rocm linux too 2025-09-03 16:26:00 +03:00
Disty0 c42e0e0b37 Cleanup 2025-09-03 16:18:02 +03:00
Disty0 266c9c0d3d Move Zluda Triton flash atten hijack to Triton Flash attention option 2025-09-03 16:16:41 +03:00
Vladimir Mandic fa44521ea3 offload-never and offload-always per-module and new highvram profile
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-31 11:40:24 -04:00
Vladimir Mandic 04af23a3bc refactore pipeline apply/unapply optional components & features
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-26 20:04:07 -04:00
Disty0 ad716b118b fix enable_gqa with dyn atten 2025-07-26 01:49:17 +03:00
Vladimir Mandic a5b77b8ee2 remove dead code
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-05 16:47:25 -04:00
Disty0 c9f49720c5 Cleanup 2025-06-25 23:37:41 +03:00
Disty0 dd84fb541f Always set sdpa params 2025-06-11 21:43:48 +03:00
Disty0 7679028c1a Override CPU to use FP32 by default 2025-06-06 15:33:51 +03:00
chrismuzyn 299d189276 When using the openvino backend, do not look for an nvidia gpu. 2025-05-12 19:14:26 -04:00
Disty0 b0e5a6c4df Add devices.has_triton() and enable NNCF compile if triton is available 2025-05-09 22:24:36 +03:00
Disty0 dfebc909eb Disable cuDNN benchmark on ROCm and add cudnn_benchmark_limit option 2025-05-08 13:27:06 +03:00
Disty0 90f887ac4a Add dim checks to ck flash atten and fix dim check on dyn atten 2025-03-25 03:50:21 +03:00
Seunghoon Lee 0c890b50e0 proper zluda detection 2025-03-20 23:03:23 +09:00
Disty0 1e0f512ccb ROCm disable FP16 for gfx1102 2025-03-19 15:42:36 +03:00
Disty0 878cab085f Reverse the sdpa hijcak order 2025-02-14 19:56:39 +03:00
Disty0 f94196bcd1 Rename ROCm Flash atten hijack to CK Flash atten and enable AOTriton memory and flash atten by default 2025-02-13 22:01:06 +03:00
Vladimir Mandic 49712ab9e7 update requirements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-13 12:03:08 -05:00
Vladimir Mandic e28b8cd920 add torch gc debug
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-01-31 13:12:59 -05:00
Disty0 d2fee97e24 Update changelog 2025-01-31 20:27:22 +03:00
Disty0 039746914f Add check for missing cuda and ipex params 2025-01-31 19:12:56 +03:00
Vladimir Mandic 3dcb70e8a2 device init logging
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-01-31 09:31:12 -05:00
Vladimir Mandic 1697fb1508 add tunable ops path
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-01-30 13:35:09 -05:00
Vladimir Mandic 0ea7840608 add tunable ops
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-01-30 13:08:49 -05:00
Disty0 a770b1c888 More correct Dynamic Atten SDPA implementation and deprecate IPEX Diffusers attention 2025-01-25 21:33:42 +03:00
Disty0 af35296a68 IPEX 4GB alloc detection and log driver version 2025-01-22 18:15:25 +03:00
Vladimir Mandic bb97e695da log cleanup
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-12-31 09:15:36 -05:00
Vladimir Mandic 910f5d0a73 lora direct on-demand apply/unapply
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-12-29 12:38:19 -05:00
Vladimir Mandic 20f2554cec add sd35-ipadapter and more balanced offload optimizations
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-12-20 10:22:42 -05:00
Vladimir Mandic e9f951b2c5 offload logging
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-12-11 14:20:01 -05:00
Vladimir Mandic 9a588d9c91 update balanced offload
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-12-11 12:06:03 -05:00
Vladimir Mandic 023b13b6cb balanced offload improvements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-12-01 15:34:25 -05:00
Vladimir Mandic b7aff134a2 add low/high threshold to balanced offload
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-11-30 19:03:51 -05:00
Vladimir Mandic b74166f9cb detailer add augment setting
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-11-29 07:18:07 -05:00
Vladimir Mandic dbb9ba0890 cuda memory limits
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-26 07:51:40 -04:00
Seunghoon Lee a76893bd72 add cdna check 2024-10-26 14:09:50 +09:00
Seunghoon Lee 81bd236cc3 zluda&rocm bf16 test 2024-10-26 13:59:21 +09:00
Disty0 d459acfcca Cleanup 2024-10-24 20:08:49 +03:00
Disty0 3b916d5e48 Zluda guess the GPU arch with the device name 2024-10-24 18:53:17 +03:00
Disty0 801ebdd080 Treat Zluda as a different backend and auto disable BF16 for Zluda and ROCm on RDNA1-2 2024-10-24 15:06:39 +03:00
Vladimir Mandic 0587f0be0c gc logging
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-23 07:58:09 -04:00
Vladimir Mandic 64f363283f messages,stats,save
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-20 09:26:07 -04:00
Disty0 6c11002420 PyTorch 2.5 XPU support 2024-10-17 23:11:52 +03:00