Commit Graph

20 Commits (1b4e1ff0ef60c27fe81f7189909af2ec4eef3a76)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Disty0 | dc8fd006b2 | Add modules_to_not_convert to pre-mode quants | 2025-06-26 02:47:10 +03:00 |
| Disty0 | e43d1d2ba7 | SDNQ use strings as target_dtype | 2025-06-25 23:25:49 +03:00 |
| Disty0 | bbf986d3a5 | Fix LTXVideo | 2025-06-23 17:37:20 +03:00 |
| Disty0 | 87e6d3f4fc | SDNQ add modules_to_not_convert and don't quant _keep_in_fp32_modules layers in post mode | 2025-06-20 20:41:11 +03:00 |
| Disty0 | 7fc7797a1d | SDNQ fix group size calc on odd shapes | 2025-06-19 11:36:41 +03:00 |
| Disty0 | 86cd272b96 | SDNQ fix Dora | 2025-06-18 16:24:42 +03:00 |
| Disty0 | e657cf790d | SDNQ fix int8 matmul with qwen | 2025-06-18 02:12:34 +03:00 |
| Disty0 | 26800a1ef9 | Cleanup sdnq | 2025-06-17 02:05:13 +03:00 |
| Disty0 | d31df8c1eb | SDNQ fuse bias into dequantizer with matmul | 2025-06-14 22:10:10 +03:00 |
| Disty0 | 25fc0094a9 | SDNQ use quantize_device and return_device args and fix decompress_fp32 always being on | 2025-06-14 21:29:08 +03:00 |
| Disty0 | c01802d9ff | SDNQ fix transformers llm | 2025-06-14 01:13:51 +03:00 |
| Disty0 | 8f8e5ce1b0 | Cleanup x2 | 2025-06-14 01:08:25 +03:00 |
| Disty0 | 2ba64abcde | Cleanup | 2025-06-14 00:54:18 +03:00 |
| Disty0 | 5e013fb154 | SDNQ optimize input quantization and use the word quantize instead of compress | 2025-06-12 12:06:57 +03:00 |
| Disty0 | 2d05396b4e | SDNQ simplify sym scale formula | 2025-06-12 02:26:04 +03:00 |
| Disty0 | 5cefa64a60 | SDNQ update accepted dtypes | 2025-06-11 20:58:54 +03:00 |
| Disty0 | 78f99abec8 | SDNQ use group_size / 2 for convs | 2025-06-10 15:29:24 +03:00 |
| Disty0 | 33fadf946b | SDNQ add 7 bit support | 2025-06-10 11:33:06 +03:00 |
| Disty0 | 5bd7a08877 | don't use inplace ops in quant layer | 2025-06-10 03:29:07 +03:00 |
| Disty0 | 5eed9135e3 | Split SDNQ into multiple files and linting | 2025-06-10 03:18:25 +03:00 |
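Several commits above concern SDNQ's symmetric quantization ("SDNQ simplify sym scale formula", "SDNQ optimize input quantization"). As background, here is a minimal sketch of what symmetric integer quantization does in general. This is an illustration only, not SDNQ's actual code; the function names `sym_quantize`/`sym_dequantize` and the flat-list layout are assumptions made for clarity.

```python
def sym_quantize(values, bits=8):
    """Symmetric quantization of a list of floats to signed integers.

    Hypothetical illustration, not SDNQ's implementation: the scale is
    the maximum absolute value divided by the largest positive integer
    representable at the given bit width.
    """
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit
    absmax = max(abs(v) for v in values)
    scale = absmax / qmax if absmax > 0 else 1.0
    # Round to nearest integer and clamp to the signed range.
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return q, scale

def sym_dequantize(q, scale):
    """Recover approximate floats from the quantized integers."""
    return [v * scale for v in q]
```

Because the scheme is symmetric, only a single scale per group is stored (no zero point), which is what makes the scale formula simple enough to be a one-liner.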