Author	SHA1	Message	Date
Disty0	f12caf81f9	SDNQ skip bad layers on svd and fix svd with dequantize_fp32	2025-10-17 17:25:50 +03:00
Disty0	c7aba8589b	SDNQ fix Qwen loading	2025-10-11 00:05:09 +03:00
Disty0	9e52d0c1fb	SDNQ add SVDQuant quantization method	2025-10-05 22:50:30 +03:00
Disty0	99113947bf	SDNQ add RDNA2 INT8 support via Triton	2025-10-04 18:31:25 +03:00
Disty0	54acf1760b	Make SDNQ scales compatible with balanced offload	2025-10-03 18:13:55 +03:00
Disty0	6b67a9d0c4	SDNQ add check_mats to matmul	2025-09-30 01:58:13 +03:00
Disty0	e6715ba8d3	Cleanup SDNQ compile	2025-09-19 19:29:36 +03:00
Disty0	a12edc1e90	SDNQ use nan_to_num_ with fp8 quantization in case of zeros	2025-09-15 20:22:39 +03:00
Vladimir Mandic	9743c8e4bf	keep previous processed state Signed-off-by: Vladimir Mandic <mandic00@live.com>	2025-08-31 15:20:15 -04:00
Disty0	a8de3f7282	SDNQ add quantized matmul support for all quantization types and group sizes	2025-08-29 22:26:47 +03:00
Disty0	f324b7c0e5	SDNQ remove unnecessary .contiguous()	2025-08-21 02:21:05 +03:00
Disty0	8460be662c	SDNQ use inplace transpose and use view instead of reshape	2025-08-17 05:07:55 +03:00
Disty0	dc7b25d387	Cleanup SDNQ and add SDNQ_USE_TENSORWISE_FP8_MATMUL env var	2025-08-11 14:50:17 +03:00
Disty0	3f45c4e570	Cleanup SDNQ and skip transpose on packed int8 matmul	2025-08-10 19:31:34 +03:00
Disty0	22d86acda3	Make SDNQ MatMul listen to the dequantize fp32 setting	2025-08-09 01:10:07 +03:00
Disty0	c3d007b02c	SDNQ split forward.py into layers and cleanup	2025-08-02 17:36:55 +03:00