75 Commits (a141e8c9e32fa393b77b99e43c59ff7b964ae4e0)

Author SHA1 Message Date
Vladimir Mandic ae4591ac0b reimplement torchao quantization
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-18 09:34:04 -04:00
Vladimir Mandic 6bb688c371 add set_accelerate
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-14 13:57:05 -04:00
Vladimir Mandic ea0dfebe2d better handle any quant lib requirements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-12 13:36:16 -04:00
Disty0 012a7f3572 Update OpenVINO to 2024.3.0 2024-09-13 03:57:06 +03:00
Vladimir Mandic f2c5cbbb36 lint updates and diffusers installer 2024-09-06 14:10:53 -04:00
Vladimir Mandic 85b26e03ff minor updates 2024-09-06 10:13:32 -04:00
Vladimir Mandic 5ed58ac7cc end-to-end update flux, see changelog and wiki 2024-08-28 08:04:24 -04:00
Disty0 963940b9ae Fix no half vae 2024-08-21 22:45:02 +03:00
Disty0 b706083541 Quanto Activations fix Diffuser's model offload bug 2024-08-21 20:48:32 +03:00
Disty0 e40e13a330 Quanto fix Flux activations 2024-08-21 20:04:05 +03:00
Disty0 c3ff21c15e Quanto freeze the model before calibration 2024-08-21 19:18:57 +03:00
Disty0 694d25c161 Fix quanto 2024-08-21 19:17:04 +03:00
Disty0 16d6c03d45 Optimum Quanto activations support 2024-08-21 17:30:45 +03:00
Disty0 3f5c3ba0d8 Add warning to Quanto with balanced and sequential offload 2024-08-16 02:58:43 +03:00
Disty0 f3f721e39a Quanto disable gemm kernels 2024-08-14 20:26:46 +03:00
Disty0 e3b087b6c0 Add balanced offload mode and make offload modes a single choice list 2024-08-11 17:27:30 +03:00
Disty0 7eacec4c39 Quant send to gpu with shuffle option on high vram systems 2024-08-04 23:01:58 +03:00
Disty0 dc9e60aa67 Quant add shuffle models option 2024-08-04 04:46:06 +03:00
Disty0 bb707e4509 FLUX support 2024-08-02 18:22:06 +03:00
Disty0 9965ef75e7 De-dupe Cascade 2024-08-01 18:12:02 +03:00
Disty0 b50a8601fe Fix T5 INT8 and add QINT8 2024-07-30 18:23:21 +03:00
Disty0 6c75bcca0a Optimum Quanto support 2024-07-30 17:35:56 +03:00
Disty0 9c1c8feeb8 NNCF fix AuraFlow 2024-07-22 23:02:30 +03:00
Vladimir Mandic 7a163a34f2 check deepcache 2024-06-28 10:37:43 -04:00
Disty0 0aaabfc2e6 NNCF fix Lora support without reloading 2024-06-21 15:18:17 +03:00
Disty0 bf9565cb46 NNCF compression support on CPU and add INT8 option for T5 2024-06-19 21:23:47 +03:00
Disty0 77a3f0ab2f Cleanup 2024-06-16 21:49:41 +03:00
Disty0 4c7b4f382e Fix NNCF with T5 2024-06-16 21:47:20 +03:00
Disty0 042cac8846 Stable Cascade fix NNCF compress 2024-05-29 16:48:41 +03:00
Vladimir Mandic 9a7a5ba81c lint cleanup 2024-05-28 10:48:27 -04:00
Disty0 47806837e9 Cleanup compile code 2024-05-20 01:18:01 +03:00
Disty0 5ae658d91a Cleanup 2024-05-19 23:32:15 +03:00
Disty0 b7246ef4e6 Stable Cascade compile fixes 2024-05-19 23:20:04 +03:00
Vladimir Mandic b137f67edc lint changes 2024-05-07 09:56:32 -04:00
Disty0 29e5d88e37 Add migraphx compile backend 2024-04-05 18:13:20 +03:00
Vladimir Mandic 25bc3c9bb6 Merge pull request #3000 from aifartist/dev
Partial support for onediff
2024-03-25 15:00:43 -04:00
aifartist 58fefbeb65 Partial support for onediff 2024-03-18 16:34:50 -07:00
Disty0 164ada5805 VRAM efficient loading and compile 2024-03-14 01:42:36 +03:00
Disty0 327bea1eeb NNCF force eval and fix embeddings 2024-03-10 23:51:59 +03:00
Vladimir Mandic fc6f891b7a check for sfast with sag 2024-03-09 08:50:28 -05:00
Vladimir Mandic 7e7ed3b7fe improve model offload compatibility 2024-03-08 08:29:03 -05:00
Vladimir Mandic 456af9abdb refactor compile out of processing 2024-02-19 18:05:25 -05:00
Disty0 ae10ae6997 Cleanup 2024-02-14 15:12:50 +03:00
Disty0 7d25ba4734 Wuerstchen fixes 2024-02-14 13:14:08 +03:00
Disty0 86e8791ed0 Update IPEX Optimize logging 2024-02-13 19:44:21 +03:00
Disty0 c24539be73 Fix NNCF compatibility with model cpu offload 2024-02-12 15:09:29 +03:00
Vladimir Mandic f3be294d53 add deep-cache support 2024-02-08 12:56:06 -05:00
Disty0 ad47d81da7 OpenVINO fix cache and Lora loading 2024-02-06 19:18:49 +03:00
Seunghoon Lee 6a6d282a5d fix 2024-02-01 01:13:25 +09:00
Seunghoon Lee 286ec8e753 Integrate Olive into compile backend. 2024-02-01 01:13:19 +09:00