55 Commits (3a8ef750687d8d0d4e2575bd16f796b8dfc15cdb)

Author SHA1 Message Date
Vladimir Mandic bfe014f5da modernize typing 2026-02-19 09:15:37 +01:00
Disty0 784cda80aa update sdnq 2026-01-14 16:23:26 +03:00
Disty0 47dcab3522 update sdnq 2026-01-09 00:34:32 +03:00
vladmandic 4e8b0f83b4 lint (Signed-off-by: vladmandic <mandic00@live.com>) 2026-01-01 16:33:49 +01:00
Disty0 4a4784eafa SDNQ add new stack of custom floating point types and remove irrelevant qtypes from the ui list 2025-12-26 20:09:17 +03:00
Disty0 ce8b6d138c SDNQ remove forced uint4 from convs and cleanup 2025-12-13 01:32:52 +03:00
Disty0 d4e2cbb826 SDNQ fix torch.compile always being active 2025-12-08 18:15:08 +03:00
Disty0 064b64c76c cleanup 2025-12-08 01:14:19 +03:00
Disty0 6e05a12a49 SDNQ post process pre-quants after load 2025-12-08 01:08:53 +03:00
vladmandic 0ad40d2b8b lint (Signed-off-by: vladmandic <mandic00@live.com>) 2025-12-02 12:25:04 +01:00
Disty0 d9bc31e7da Cleanup 2025-11-29 01:46:04 +03:00
Disty0 01a0f6b356 Warn and disable quantized matmul if triton is not available 2025-11-29 01:34:54 +03:00
Disty0 55cf627ac6 add version to sdnq 2025-11-28 00:45:24 +03:00
Disty0 73e4d1e379 Pass torch_dtype to sdnq loader 2025-11-27 18:37:35 +03:00
Disty0 7b2a8e3f87 cleanup 2025-11-27 18:26:14 +03:00
Disty0 ff4c254930 Auto handle tied weights with new transformers 2025-11-27 18:24:55 +03:00
CalamitousFelicitousness 9dd537072c Fix import path for SDNQ options and handle Qwen models in load_sdnq_model 2025-11-27 14:53:03 +00:00
Disty0 131c51918b SDNQ fix model_ oader 2025-11-27 14:51:45 +03:00
Disty0 ed6f977218 SDNQ fix z_image matmul 2025-11-27 14:19:29 +03:00
Disty0 48b5d56ba4 Enable or disable quantized matmul on pre-quant models 2025-11-26 21:08:15 +03:00
Disty0 da0df35106 fix typo 2025-11-25 21:58:53 +03:00
Disty0 4e4f49b38d update sdnq loader 2025-11-22 03:45:27 +03:00
Disty0 b6e9332cfe SDNQ de-couple matmul dtype and add fp16 matmul 2025-11-22 02:16:20 +03:00
Disty0 1745ed53f8 Refactor SDNQDequantizer 2025-11-18 01:42:58 +03:00
Disty0 0e8429dbd8 Cleanup 2025-11-07 18:49:29 +03:00
Disty0 93f28f07ac Make SDNQ not depended on quantization_config.json and fix invalid quantization_config getting attached to the model on load 2025-11-07 18:11:21 +03:00
Vladimir Mandic 5ab9a5a15d add sota model loader: runai streamer (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-10-27 14:20:10 -04:00
Disty0 b627617d14 SDNQ fix enable matmul after load 2025-10-19 17:25:02 +03:00
Disty0 758b006104 cleanup 2025-10-19 02:00:16 +03:00
Disty0 ef72edf18f SDNQ improve svd and low bit matmul perf 2025-10-19 00:06:07 +03:00
Disty0 845869079d Fix sdnq unset config 2025-10-14 17:58:09 +03:00
Disty0 b601f0d402 SDNQ expose svd_steps and update module skip keys 2025-10-14 00:15:09 +03:00
Disty0 9a8ba0fc90 SDNQ unset device specific configs on save 2025-10-11 19:24:09 +03:00
Disty0 c7aba8589b SDNQ fix Qwen loading 2025-10-11 00:05:09 +03:00
Disty0 2a3deaa064 Check T5 keys before override 2025-10-09 22:46:27 +03:00
Disty0 6995d8c3c6 SDNQ fix T5 loading 2025-10-09 22:42:20 +03:00
Disty0 612df3abbb cleanup 2025-10-09 20:09:34 +03:00
Disty0 a9de8ef152 cleanup 2025-10-09 19:58:57 +03:00
Disty0 e19fb2d833 SDNQ keep the quant configs inside the module subfolder, add dtype cast and don't send to GPU 2025-10-09 19:34:48 +03:00
Vladimir Mandic 70defe6d06 handle load shards (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-10-09 11:29:36 -04:00
Vladimir Mandic 6907fcd320 speedup prequant model load (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-10-08 13:47:36 -04:00
Disty0 bdcd07f713 Add add_module_skip_keys to pre-load quant too 2025-10-08 01:11:40 +03:00
Disty0 7fdf400e8b cleanup 2025-10-08 00:41:04 +03:00
Disty0 df03ea9ba8 SDNQ add sdnq_post_load_quant and update Qwen keys 2025-10-08 00:29:36 +03:00
Vladimir Mandic 962cb7115d infra for full-model load/save with quant (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-10-07 14:30:45 -04:00
Vladimir Mandic 7fdc880a73 sdnq patches (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-10-07 09:43:34 -04:00
Disty0 1cd7b6d63a fix upcast scale check 2025-10-07 01:27:54 +03:00
Disty0 aa0c10440f SDNQ make the loader don't touch the model options by default 2025-10-07 00:15:23 +03:00
Disty0 c931bf9efa SDNQ add dtype casting to loader 2025-10-06 17:44:52 +03:00
Disty0 5c042c5fb8 cleanup 2025-10-06 11:30:26 +03:00