Commit Graph

42 Commits (3af9d3bb501c1ef34bb9a2fa0df9bb75bc3830e2)

Author SHA1 Message Date
Disty0 f324b7c0e5 SDNQ remove unnecessary .contiguous() 2025-08-21 02:21:05 +03:00
Disty0 e49814098e Add sdnq_modules_dtype_dict 2025-08-20 14:58:54 +03:00
Disty0 0946710662 Add sdnq_modules_to_not_convert to UI settings 2025-08-20 04:38:20 +03:00
Disty0 47ff01fd3b SDNQ add "*" support and upcast only the first and last layer's img_mod to 6 bit with Qwen Image 2025-08-20 03:24:19 +03:00
Disty0 47154db8b1 SDNQ Flux lora, use shape from sdnq_dequantizer 2025-08-18 19:53:57 +03:00
Vladimir Mandic fc547a3ccd sdnq with diffusers lora loader 2025-08-18 10:29:01 -04:00
Disty0 85d494ee84 revert .t_().contiguous().t_() 2025-08-17 05:19:44 +03:00
Disty0 8460be662c SDNQ use inplace transpose and use view instead of reshape 2025-08-17 05:07:55 +03:00
Disty0 8ca74d0cd2 SDNQ rename unused param_name arg to op 2025-08-13 22:10:30 +03:00
Disty0 cb0c5414a3 SDNQ use uint with minimum_bits <= 4 2025-08-13 00:37:44 +03:00
Disty0 7085db9add Update changelog 2025-08-13 00:17:15 +03:00
Disty0 15cb8fe9f8 SDNQ add modules_dtype_dict and fix Qwen Image with quants less than 5 bits 2025-08-13 00:07:36 +03:00
Disty0 f45e3342e6 Cleanup 2025-08-11 15:11:29 +03:00
Disty0 afb3a5a06d SDNQ move non_blocking to quant config 2025-08-11 15:07:02 +03:00
Disty0 3f45c4e570 Cleanup SDNQ and skip transpose on packed int8 matmul 2025-08-10 19:31:34 +03:00
Disty0 69db77e365 SDNQ remove eps 2025-08-09 02:39:01 +03:00
Disty0 aa0652caa9 SDNQ fix new transformers 2025-08-07 00:18:24 +03:00
Disty0 ab8badfe0d SDNQ use non-blocking ops 2025-08-06 17:07:36 +03:00
Disty0 1d5dce1fb1 cleanup 2025-08-02 17:41:53 +03:00
Disty0 c3d007b02c SDNQ split forward.py into layers and cleanup 2025-08-02 17:36:55 +03:00
Vladimir Mandic fa44521ea3 offload-never and offload-always per-module and new highvram profile 2025-07-31 11:40:24 -04:00
Disty0 30af1f8fb0 Use inference_context with SDNQ 2025-07-14 13:07:44 +03:00
Vladimir Mandic e6d97f4d44 monkeypatch numpy for gradio 2025-07-07 19:45:30 -04:00
Disty0 cf90e5621a Add _skip_layerwise_casting_patterns to SDNQ skip list 2025-07-04 00:04:01 +03:00
Vladimir Mandic c4d9338d2e major refactoring of modules 2025-07-03 09:18:38 -04:00
Disty0 dc8fd006b2 Add modules_to_not_convert to pre-mode quants 2025-06-26 02:47:10 +03:00
Disty0 bbf986d3a5 Fix LTXVideo 2025-06-23 17:37:20 +03:00
Disty0 87e6d3f4fc SDNQ add modules_to_not_convert and don't quant _keep_in_fp32_modules layers in post mode 2025-06-20 20:41:11 +03:00
Disty0 7fc7797a1d SDNQ fix group size calc on odd shapes 2025-06-19 11:36:41 +03:00
Disty0 86cd272b96 SDNQ fix Dora 2025-06-18 16:24:42 +03:00
Disty0 e657cf790d SDNQ fix int8 matmul with qwen 2025-06-18 02:12:34 +03:00
Disty0 26800a1ef9 Cleanup sdnq 2025-06-17 02:05:13 +03:00
Disty0 25fc0094a9 SDNQ use quantize_device and return_device args and fix decompress_fp32 always being on 2025-06-14 21:29:08 +03:00
Disty0 c01802d9ff SDNQ fix transformers llm 2025-06-14 01:13:51 +03:00
Disty0 8f8e5ce1b0 Cleanup x2 2025-06-14 01:08:25 +03:00
Disty0 2ba64abcde Cleanup 2025-06-14 00:54:18 +03:00
Disty0 5e013fb154 SDNQ optimize input quantization and use the word quantize instead of compress 2025-06-12 12:06:57 +03:00
Disty0 2d05396b4e SDNQ simplify sym scale formula 2025-06-12 02:26:04 +03:00
Disty0 5cefa64a60 SDNQ update accepted dtypes 2025-06-11 20:58:54 +03:00
Disty0 78f99abec8 SDNQ use group_size / 2 for convs 2025-06-10 15:29:24 +03:00
Disty0 5bd7a08877 don't use inplace ops in quant layer 2025-06-10 03:29:07 +03:00
Disty0 5eed9135e3 Split SDNQ into multiple files and linting 2025-06-10 03:18:25 +03:00