Commit Graph

107 Commits (10bbbed218458b8a899aac2140ec738d8d716f05)

Author SHA1 Message Date
Disty0 25fc0094a9 SDNQ use quantize_device and return_device args and fix decompress_fp32 always being on 2025-06-14 21:29:08 +03:00
Disty0 2ba64abcde Cleanup 2025-06-14 00:54:18 +03:00
Disty0 cb4684cbeb SDNQ add separate quant mode option for Text Encoders 2025-06-13 12:42:57 +03:00
Disty0 5eed9135e3 Split SDNQ into multiple files and linting 2025-06-10 03:18:25 +03:00
Disty0 976f0ba61f Cleanup 2025-06-05 20:59:58 +03:00
Disty0 d8e8f47ce5 SDNQ add an option to toggle quantize with GPU 2025-05-28 15:18:39 +03:00
Disty0 4ed15f5cce SDNQ revert device_map = gpu 2025-05-27 23:32:58 +03:00
Disty0 d3e3fb98b0 Don't override user set device_map 2025-05-27 21:45:52 +03:00
Disty0 b1b29e9001 SDNQ disable device_map = gpu with TE and LLM 2025-05-27 21:32:32 +03:00
Disty0 3618e39cff SDNQ use device_map = gpu 2025-05-27 19:46:30 +03:00
Disty0 dece497f10 Refactor SDNQ to use weights_dtype and rename decompress_int8_matmul to use_quantized_matmul 2025-05-27 15:49:21 +03:00
Disty0 4453efee76 Rename NNCF to SDNQ and rename quant schemes 2025-05-26 02:39:51 +03:00
Disty0 2d79380bd7 NNCF implement better layer hijacks and remove all NNCF imports 2025-05-26 01:12:28 +03:00
Vladimir Mandic bfda37903c update nncf linting and changelog
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-05-13 12:11:10 -04:00
Disty0 f4e3a81a84 NNCF experimental direct INT8 MatMul support 2025-05-12 21:41:49 +03:00
Disty0 9cfdc3c079 Remove NNCF device hijack 2025-05-11 18:30:10 +03:00
Disty0 1ee9832e05 NNCF silence the pytorch version warning 2025-05-09 23:16:55 +03:00
Vladimir Mandic 83fc68ece3 update requirements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-05-06 09:47:13 -04:00
Vladimir Mandic c6cc1476c6 add hidream-e1
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-04-29 10:08:51 -04:00
Disty0 9e133a5044 Cleanup 2025-04-23 04:16:24 +03:00
Disty0 bb0329f54f Update and refactor NNCF and add more quant options 2025-04-23 02:03:30 +03:00
Disty0 2264d8087b Pre-load support for NNCF 2025-04-22 04:35:36 +03:00
Vladimir Mandic 0fe0707cdd add lodestone-chroma
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-04-21 19:29:09 -04:00
Disty0 434bb660ce Move post load quant functions to a single function in model_quant 2025-04-20 17:04:25 +03:00
Vladimir Mandic 75ebf1e196 hidream add llm info to metadata
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-04-17 14:44:37 -04:00
Disty0 a6ea26fb8d Use transformers QuantoConfig for TE and LLM 2025-04-16 19:16:35 +03:00
Vladimir Mandic 15f8e70e89 add nunchaku prototype
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-04-15 14:39:24 -04:00
Vladimir Mandic 0f595d4cc5 cleanup multiple model loaders
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-04-11 22:16:05 -04:00
Disty0 72918eb94d Fix quanto needing to re-quant when moving to gpu 2025-04-11 21:50:06 +03:00
Disty0 8e567f3ab0 Fix and update TorchAO 2025-04-11 20:08:03 +03:00
Disty0 bd1d8d44dc Fix quanto with transformers 2025-04-11 19:15:23 +03:00
Vladimir Mandic 84fe068070 custom model loader
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-04-08 14:56:07 -04:00
Vladimir Mandic 6430f7006f add monitor cli option and finish lora refactor
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-04-01 13:39:47 -04:00
Vladimir Mandic f4fdd496b9 more granular quantization modules options
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-28 14:46:52 -04:00
Vladimir Mandic d1c3b97c65 add prompt enhance
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-28 14:05:28 -04:00
Vladimir Mandic 4f56f4aa33 add new optimum-quanto on-the-fly and simplify quantization loading
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-16 21:45:05 -04:00
Vladimir Mandic a5d3a68107 add cogview4
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-15 13:20:17 -04:00
Vladimir Mandic 49712ab9e7 update requirements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-13 12:03:08 -05:00
Vladimir Mandic a7ccea60ff add lumina2
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-12 08:54:00 -05:00
Vladimir Mandic 61a98b0b7b fix bnb version
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-01 16:09:31 -05:00
Vladimir Mandic 06ba03cf80 settings option to disable reference models
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-01-23 15:19:43 -05:00
Disty0 9b579bfd96 Move quant functions to model_quant.py 2025-01-23 21:50:26 +03:00
Vladimir Mandic 5a7c1f50c1 add native torch fp8 storage dtype
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-01-23 12:07:08 -05:00
Disty0 e58b9b50e9 Don't force bnb version outside of cuda 2025-01-23 01:10:53 +03:00
Disty0 5d950a0164 Fix quanto logging with model offload 2024-12-30 16:44:07 +03:00
Vladimir Mandic fd7fe8cea5 add torchao
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-12-17 13:29:36 -05:00
Vladimir Mandic 8f21e96f73 update bnb and increase ui timeouts
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-12-11 15:22:51 -05:00
Vladimir Mandic 944408e93b warn on quanto with offload
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-12-10 10:39:13 -05:00
Vladimir Mandic 164ce252dc add sd35 controlnets
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-11-28 08:46:10 -05:00
Vladimir Mandic 62c53942e0 post release jumbo patch
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-11-22 12:55:18 -05:00
Vladimir Mandic 168e9445d1 add bnb and quanto version info
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-11-11 16:16:22 -05:00
Vladimir Mandic abfb197504 sd35 all-in-one safetensors support
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-30 19:30:52 -04:00
Vladimir Mandic 7b150ba361 fix bnb loader
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-29 12:36:03 -04:00
Vladimir Mandic 58220b6497 flux enabled bnb quant on-the-fly
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-29 12:30:20 -04:00
Vladimir Mandic 6760632f38 major model load refactor
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-26 13:22:29 -04:00
Vladimir Mandic f191134aa6 support bnb quantization during load
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-22 15:43:09 -04:00
Vladimir Mandic ea0dfebe2d better handle any quant lib requirements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-12 13:36:16 -04:00