Commit Graph

70 Commits (3a8ef750687d8d0d4e2575bd16f796b8dfc15cdb)

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| Vladimir Mandic | e6d97f4d44 | monkeypatch numpy for gradio (Signed-off-by: Vladimir Mandic <mandic00@live.com>) | 2025-07-07 19:45:30 -04:00 |
| Disty0 | cf90e5621a | Add `_skip_layerwise_casting_patterns` to SDNQ skip list | 2025-07-04 00:04:01 +03:00 |
| Vladimir Mandic | c4d9338d2e | major refactoring of modules (Signed-off-by: Vladimir Mandic <mandic00@live.com>) | 2025-07-03 09:18:38 -04:00 |
| Disty0 | dc8fd006b2 | Add `modules_to_not_convert` to pre-mode quants | 2025-06-26 02:47:10 +03:00 |
| Disty0 | bbf986d3a5 | Fix LTXVideo | 2025-06-23 17:37:20 +03:00 |
| Disty0 | 87e6d3f4fc | SDNQ add `modules_to_not_convert` and don't quant `_keep_in_fp32_modules` layers in post mode | 2025-06-20 20:41:11 +03:00 |
| Disty0 | 7fc7797a1d | SDNQ fix group size calc on odd shapes | 2025-06-19 11:36:41 +03:00 |
| Disty0 | 86cd272b96 | SDNQ fix Dora | 2025-06-18 16:24:42 +03:00 |
| Disty0 | e657cf790d | SDNQ fix int8 matmul with qwen | 2025-06-18 02:12:34 +03:00 |
| Disty0 | 26800a1ef9 | Cleanup sdnq | 2025-06-17 02:05:13 +03:00 |
| Disty0 | 25fc0094a9 | SDNQ use `quantize_device` and `return_device` args and fix `decompress_fp32` always being on | 2025-06-14 21:29:08 +03:00 |
| Disty0 | c01802d9ff | SDNQ fix transformers llm | 2025-06-14 01:13:51 +03:00 |
| Disty0 | 8f8e5ce1b0 | Cleanup x2 | 2025-06-14 01:08:25 +03:00 |
| Disty0 | 2ba64abcde | Cleanup | 2025-06-14 00:54:18 +03:00 |
| Disty0 | 5e013fb154 | SDNQ optimize input quantization and use the word quantize instead of compress | 2025-06-12 12:06:57 +03:00 |
| Disty0 | 2d05396b4e | SDNQ simplify sym scale formula | 2025-06-12 02:26:04 +03:00 |
| Disty0 | 5cefa64a60 | SDNQ update accepted dtypes | 2025-06-11 20:58:54 +03:00 |
| Disty0 | 78f99abec8 | SDNQ use `group_size / 2` for convs | 2025-06-10 15:29:24 +03:00 |
| Disty0 | 5bd7a08877 | don't use inplace ops in quant layer | 2025-06-10 03:29:07 +03:00 |
| Disty0 | 5eed9135e3 | Split SDNQ into multiple files and linting | 2025-06-10 03:18:25 +03:00 |