Commit Graph

14 Commits (e8a158f4f5e10742c316e9b308ce9241b2c98430)

Author SHA1 Message Date
Disty0 784cda80aa update sdnq 2026-01-14 16:23:26 +03:00
Disty0 db59d2b507 SDNQ handle packed floats in fp mm 2025-12-27 16:29:18 +03:00
Disty0 b6e9332cfe SDNQ de-couple matmul dtype and add fp16 matmul 2025-11-22 02:16:20 +03:00
Disty0 1745ed53f8 Refactor SDNQDequantizer 2025-11-18 01:42:58 +03:00
Disty0 f12caf81f9 SDNQ skip bad layers on svd and fix svd with dequantize_fp32 2025-10-17 17:25:50 +03:00
Disty0 c7aba8589b SDNQ fix Qwen loading 2025-10-11 00:05:09 +03:00
Disty0 9e52d0c1fb SDNQ add SVDQuant quantization method 2025-10-05 22:50:30 +03:00
Disty0 54acf1760b Make SDNQ scales compatible with balanced offload 2025-10-03 18:13:55 +03:00
Disty0 6b67a9d0c4 SDNQ add check_mats to matmul 2025-09-30 01:58:13 +03:00
Disty0 e6715ba8d3 Cleanup SDNQ compile 2025-09-19 19:29:36 +03:00
Disty0 8460be662c SDNQ use inplace transpose and use view instead of reshape 2025-08-17 05:07:55 +03:00
Disty0 9992338187 sdnq fix convs 2025-08-11 23:24:13 +03:00
Disty0 dc7b25d387 Cleanup SDNQ and add SDNQ_USE_TENSORWISE_FP8_MATMUL env var 2025-08-11 14:50:17 +03:00
Disty0 c3d007b02c SDNQ split forward.py into layers and cleanup 2025-08-02 17:36:55 +03:00