Commit Graph

59 Commits (9de84792b4c4ae644fe84df73e97ecffac01a163)

Author SHA1 Message Date
Disty0 a8de3f7282 SDNQ add quantized matmul support for all quantization types and group sizes 2025-08-29 22:26:47 +03:00
Disty0 dc7b25d387 Cleanup SDNQ and add SDNQ_USE_TENSORWISE_FP8_MATMUL env var 2025-08-11 14:50:17 +03:00
Disty0 c3d007b02c SDNQ split forward.py into layers and cleanup 2025-08-02 17:36:55 +03:00
Vladimir Mandic 2656d3aa68 lint (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-07-24 15:42:29 -04:00
Disty0 444974a6ff cleanup 2025-07-23 19:02:44 +03:00
Disty0 7a08e1a7f2 SDNQ always use custom tensorwise fp8 matmul 2025-07-23 19:01:10 +03:00
Disty0 e43d1d2ba7 SDNQ use strings as target_dtype 2025-06-25 23:25:49 +03:00
Disty0 33fadf946b SDNQ add 7 bit support 2025-06-10 11:33:06 +03:00
Disty0 5eed9135e3 Split SDNQ into multiple files and linting 2025-06-10 03:18:25 +03:00