Disty0
|
c25d15398f
|
SDNQ pre-mode don't quant if weights_dtype == 'none'
|
2025-06-26 01:48:01 +03:00 |
Disty0
|
81e55f0459
|
Fix SDNQ pre-mode
|
2025-06-25 22:58:40 +03:00 |
Vladimir Mandic
|
5b486a6ef1
|
sdnq add xyz grid support, improve offloading compatibility
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-06-25 15:32:37 -04:00 |
Disty0
|
87e6d3f4fc
|
SDNQ add modules_to_not_convert and don't quant _keep_in_fp32_modules layers in post mode
|
2025-06-20 20:41:11 +03:00 |
Disty0
|
2c4850cc2b
|
Log more info with SDNQ
|
2025-06-18 18:03:46 +03:00 |
Disty0
|
86cd272b96
|
SDNQ fix DoRA
|
2025-06-18 16:24:42 +03:00 |
Enes Sadık Özbek
|
4b3ce06916
|
Initial support for Chroma
|
2025-06-18 00:38:17 +00:00 |
Disty0
|
25fc0094a9
|
SDNQ use quantize_device and return_device args and fix decompress_fp32 always being on
|
2025-06-14 21:29:08 +03:00 |
Disty0
|
2ba64abcde
|
Cleanup
|
2025-06-14 00:54:18 +03:00 |
Disty0
|
cb4684cbeb
|
SDNQ add separate quant mode option for Text Encoders
|
2025-06-13 12:42:57 +03:00 |
Disty0
|
5eed9135e3
|
Split SDNQ into multiple files and linting
|
2025-06-10 03:18:25 +03:00 |
Disty0
|
976f0ba61f
|
Cleanup
|
2025-06-05 20:59:58 +03:00 |
Disty0
|
d8e8f47ce5
|
SDNQ add an option to toggle quantize with GPU
|
2025-05-28 15:18:39 +03:00 |
Disty0
|
4ed15f5cce
|
SDNQ revert device_map = gpu
|
2025-05-27 23:32:58 +03:00 |
Disty0
|
d3e3fb98b0
|
Don't override user set device_map
|
2025-05-27 21:45:52 +03:00 |
Disty0
|
b1b29e9001
|
SDNQ disable device_map = gpu with TE and LLM
|
2025-05-27 21:32:32 +03:00 |
Disty0
|
3618e39cff
|
SDNQ use device_map = gpu
|
2025-05-27 19:46:30 +03:00 |
Disty0
|
dece497f10
|
Refactor SDNQ to use weights_dtype and rename decompress_int8_matmul to use_quantized_matmul
|
2025-05-27 15:49:21 +03:00 |
Disty0
|
4453efee76
|
Rename NNCF to SDNQ and rename quant schemes
|
2025-05-26 02:39:51 +03:00 |
Disty0
|
2d79380bd7
|
NNCF implement better layer hijacks and remove all NNCF imports
|
2025-05-26 01:12:28 +03:00 |
Vladimir Mandic
|
bfda37903c
|
update nncf linting and changelog
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-05-13 12:11:10 -04:00 |
Disty0
|
f4e3a81a84
|
NNCF experimental direct INT8 MatMul support
|
2025-05-12 21:41:49 +03:00 |
Disty0
|
9cfdc3c079
|
Remove NNCF device hijack
|
2025-05-11 18:30:10 +03:00 |
Disty0
|
1ee9832e05
|
NNCF silence the pytorch version warning
|
2025-05-09 23:16:55 +03:00 |
Vladimir Mandic
|
83fc68ece3
|
update requirements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-05-06 09:47:13 -04:00 |
Vladimir Mandic
|
c6cc1476c6
|
add hidream-e1
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-04-29 10:08:51 -04:00 |
Disty0
|
9e133a5044
|
Cleanup
|
2025-04-23 04:16:24 +03:00 |
Disty0
|
bb0329f54f
|
Update and refactor NNCF and add more quant options
|
2025-04-23 02:03:30 +03:00 |
Disty0
|
2264d8087b
|
Pre-load support for NNCF
|
2025-04-22 04:35:36 +03:00 |
Vladimir Mandic
|
0fe0707cdd
|
add lodestone-chroma
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-04-21 19:29:09 -04:00 |
Disty0
|
434bb660ce
|
Move post load quant functions to a single function in model_quant
|
2025-04-20 17:04:25 +03:00 |
Vladimir Mandic
|
75ebf1e196
|
hidream add llm info to metadata
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-04-17 14:44:37 -04:00 |
Disty0
|
a6ea26fb8d
|
Use transformers QuantoConfig for TE and LLM
|
2025-04-16 19:16:35 +03:00 |
Vladimir Mandic
|
15f8e70e89
|
add nunchaku prototype
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-04-15 14:39:24 -04:00 |
Vladimir Mandic
|
0f595d4cc5
|
cleanup multiple model loaders
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-04-11 22:16:05 -04:00 |
Disty0
|
72918eb94d
|
Fix quanto needing to re-quant when moving to gpu
|
2025-04-11 21:50:06 +03:00 |
Disty0
|
8e567f3ab0
|
Fix and update TorchAO
|
2025-04-11 20:08:03 +03:00 |
Disty0
|
bd1d8d44dc
|
Fix quanto with transformers
|
2025-04-11 19:15:23 +03:00 |
Vladimir Mandic
|
84fe068070
|
custom model loader
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-04-08 14:56:07 -04:00 |
Vladimir Mandic
|
6430f7006f
|
add monitor cli option and finish lora refactor
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-04-01 13:39:47 -04:00 |
Vladimir Mandic
|
f4fdd496b9
|
more granular quantization modules options
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-03-28 14:46:52 -04:00 |
Vladimir Mandic
|
d1c3b97c65
|
add prompt enhance
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-03-28 14:05:28 -04:00 |
Vladimir Mandic
|
4f56f4aa33
|
add new optimum-quanto on-the-fly and simplify quantization loading
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-03-16 21:45:05 -04:00 |
Vladimir Mandic
|
a5d3a68107
|
add cogview4
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-03-15 13:20:17 -04:00 |
Vladimir Mandic
|
49712ab9e7
|
update requirements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-02-13 12:03:08 -05:00 |
Vladimir Mandic
|
a7ccea60ff
|
add lumina2
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-02-12 08:54:00 -05:00 |
Vladimir Mandic
|
61a98b0b7b
|
fix bnb version
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-02-01 16:09:31 -05:00 |
Vladimir Mandic
|
06ba03cf80
|
settings option to disable reference models
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-01-23 15:19:43 -05:00 |
Disty0
|
9b579bfd96
|
Move quant functions to model_quant.py
|
2025-01-23 21:50:26 +03:00 |
Vladimir Mandic
|
5a7c1f50c1
|
add native torch fp8 storage dtype
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-01-23 12:07:08 -05:00 |
Disty0
|
e58b9b50e9
|
Don't force bnb version outside of cuda
|
2025-01-23 01:10:53 +03:00 |
Disty0
|
5d950a0164
|
Fix quanto logging with model offload
|
2024-12-30 16:44:07 +03:00 |
Vladimir Mandic
|
fd7fe8cea5
|
add torchao
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-12-17 13:29:36 -05:00 |
Vladimir Mandic
|
8f21e96f73
|
update bnb and increase ui timeouts
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-12-11 15:22:51 -05:00 |
Vladimir Mandic
|
944408e93b
|
warn on quanto with offload
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-12-10 10:39:13 -05:00 |
Vladimir Mandic
|
164ce252dc
|
add sd35 controlnets
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-11-28 08:46:10 -05:00 |
Vladimir Mandic
|
62c53942e0
|
post release jumbo patch
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-11-22 12:55:18 -05:00 |
Vladimir Mandic
|
168e9445d1
|
add bnb and quanto version info
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-11-11 16:16:22 -05:00 |
Vladimir Mandic
|
abfb197504
|
sd35 all-in-one safetensors support
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-10-30 19:30:52 -04:00 |
Vladimir Mandic
|
7b150ba361
|
fix bnb loader
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-10-29 12:36:03 -04:00 |
Vladimir Mandic
|
58220b6497
|
flux enabled bnb quant on-the-fly
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-10-29 12:30:20 -04:00 |
Vladimir Mandic
|
6760632f38
|
major model load refactor
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-10-26 13:22:29 -04:00 |
Vladimir Mandic
|
f191134aa6
|
support bnb quantization during load
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-10-22 15:43:09 -04:00 |
Vladimir Mandic
|
ea0dfebe2d
|
better handle any quant lib requirements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2024-10-12 13:36:16 -04:00 |