Commit Graph

198 Commits (3a65d561a70f60d2c67f607d2b00a944c7c427ed)

Author SHA1 Message Date
Disty0 1c2a81ee2d Make SDNQDequantizer a dataclass 2025-12-08 22:29:45 +03:00
vladmandic 69f0d6bf5d lint Signed-off-by: vladmandic <mandic00@live.com> 2025-12-08 18:12:47 +01:00
Disty0 d4e2cbb826 SDNQ fix torch.compile always being active 2025-12-08 18:15:08 +03:00
Disty0 3ae7ecdbad SDNQ fix quantization_device getting ignored on post load quant 2025-12-08 01:29:52 +03:00
Disty0 064b64c76c cleanup 2025-12-08 01:14:19 +03:00
Disty0 6e05a12a49 SDNQ post process pre-quants after load 2025-12-08 01:08:53 +03:00
Disty0 0835ca6f66 SDNQ add explicit model.quantization_method = QuantizationMethod.SDNQ 2025-12-08 00:46:40 +03:00
Disty0 7a6356f8eb SDNQ fix transformers v5 and check for torch._dynamo.config.disable 2025-12-08 00:36:15 +03:00
Disty0 4f90054bf7 SDNQ transformers v5 support 2025-12-07 21:37:41 +03:00
Disty0 1cfb61809f cleanup 2025-12-05 18:40:49 +03:00
Disty0 5b86bef796 SDNQ add longcat keys 2025-12-05 18:37:20 +03:00
vladmandic 0ad40d2b8b lint Signed-off-by: vladmandic <mandic00@live.com> 2025-12-02 12:25:04 +01:00
Disty0 7aa1bfdc70 Add get_modules_to_not_convert from transformers v5 2025-12-02 01:01:51 +03:00
Disty0 d9bc31e7da Cleanup 2025-11-29 01:46:04 +03:00
Disty0 01a0f6b356 Warn and disable quantized matmul if triton is not available 2025-11-29 01:34:54 +03:00
Disty0 3e52009a4f SDNQ assert Triton for quantized matmul 2025-11-29 00:54:19 +03:00
Disty0 aaef4992c3 SDNQ fix svd + fp8 tw and fp16 mm 2025-11-28 22:31:09 +03:00
Disty0 a46f32b354 pull sdnq version from .common 2025-11-28 01:10:05 +03:00
Disty0 55cf627ac6 add version to sdnq 2025-11-28 00:45:24 +03:00
Disty0 368eb3103a cleanup 2025-11-27 18:40:15 +03:00
Disty0 73e4d1e379 Pass torch_dtype to sdnq loader 2025-11-27 18:37:35 +03:00
Disty0 7b2a8e3f87 cleanup 2025-11-27 18:26:14 +03:00
Disty0 ff4c254930 Auto handle tied weights with new transformers 2025-11-27 18:24:55 +03:00
CalamitousFelicitousness 9dd537072c Fix import path for SDNQ options and handle Qwen models in load_sdnq_model 2025-11-27 14:53:03 +00:00
Disty0 131c51918b SDNQ fix model_loader 2025-11-27 14:51:45 +03:00
Disty0 ed6f977218 SDNQ fix z_image matmul 2025-11-27 14:19:29 +03:00
Disty0 16c429711c update lumina and z_image keys 2025-11-26 23:22:44 +03:00
Disty0 679060bd00 SDNQ add lumina and z_image keys 2025-11-26 22:51:15 +03:00
Disty0 48b5d56ba4 Enable or disable quantized matmul on pre-quant models 2025-11-26 21:08:15 +03:00
Disty0 70b96daa63 cleanup 2025-11-25 23:02:01 +03:00
Disty0 da0df35106 fix typo 2025-11-25 21:58:53 +03:00
Disty0 da3c439059 SDNQ fix _tied_weights_keys is dict case 2025-11-25 19:37:46 +03:00
Disty0 aeb71d172e SDNQ add Flux2Transformer2DModel keys 2025-11-25 19:22:02 +03:00
vladmandic 9658a330b2 lint Signed-off-by: vladmandic <mandic00@live.com> 2025-11-23 13:29:03 -05:00
Disty0 41ef28bb78 SDNQ don't divide group_size 2025-11-22 16:44:13 +03:00
Disty0 25d05b1445 SDNQ catch all exceptions on triton import 2025-11-22 14:48:55 +03:00
Disty0 4e4f49b38d update sdnq loader 2025-11-22 03:45:27 +03:00
Disty0 b6e9332cfe SDNQ de-couple matmul dtype and add fp16 matmul 2025-11-22 02:16:20 +03:00
Disty0 5308630b3a SDNQ use dequantize_fp32 with uint16 + torch_dtype = fp16 2025-11-18 23:53:27 +03:00
Disty0 49cd85d388 SDNQ add training related changes 2025-11-18 22:46:14 +03:00
Disty0 3fbfae5963 cleanup 2025-11-18 02:37:10 +03:00
Disty0 1745ed53f8 Refactor SDNQDequantizer 2025-11-18 01:42:58 +03:00
Disty0 3a4d7795d8 SDNQ fix weights_dtype getting overwritten on post load quant 2025-11-14 16:51:10 +03:00
Disty0 6f33ec3357 SDNQ use the model quant params instead of user settings on Lora 2025-11-10 00:12:38 +03:00
Disty0 0e8429dbd8 Cleanup 2025-11-07 18:49:29 +03:00
Disty0 93f28f07ac Make SDNQ not depended on quantization_config.json and fix invalid quantization_config getting attached to the model on load 2025-11-07 18:11:21 +03:00
Disty0 a4378a79e4 fix typo 2025-11-04 14:30:52 +03:00
Disty0 8ad53ed4b3 SDNQ update keys 2025-11-04 14:29:44 +03:00
Disty0 76d699dc09 SDNQ add common keys 2025-10-31 00:21:54 +03:00
Disty0 da3d183f96 add Emu3ForCausalLM keys 2025-10-30 23:44:05 +03:00