Commit Graph

71 Commits (5d2f5dd6e7dbacd3e2ba648517548b43fdde80cc)

Author SHA1 Message Date
Disty0 71f7474de2 Unify quant options 2025-06-27 21:05:14 +03:00
Disty0 3b8ced444c Add auto quantization mode 2025-06-27 18:54:15 +03:00
Disty0 0e4d712f27 Chroma fixes 2025-06-26 21:25:50 +03:00
Enes Sadık Özbek e91208bea9 Merge branch 'dev' into feature/chroma-support 2025-06-26 17:02:00 +03:00
Disty0 0f6eb624c9 Use llm_int8_skip_modules with bnb 2025-06-26 03:10:26 +03:00
Disty0 dc8fd006b2 Add modules_to_not_convert to pre-mode quants 2025-06-26 02:47:10 +03:00
Enes Sadık Özbek 21bdde12d3 Merge branch 'dev' into feature/chroma-support 2025-06-26 01:56:34 +03:00
Disty0 c25d15398f SDNQ pre-mode don't quant if weights_dtype == 'none' 2025-06-26 01:48:01 +03:00
Disty0 81e55f0459 Fix SDNQ pre-mode 2025-06-25 22:58:40 +03:00
Vladimir Mandic 5b486a6ef1 sdnq add xyz grid support, improve offloading compatibility (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-06-25 15:32:37 -04:00
Disty0 87e6d3f4fc SDNQ add modules_to_not_convert and don't quant _keep_in_fp32_modules layers in post mode 2025-06-20 20:41:11 +03:00
Disty0 2c4850cc2b Log more info with SDNQ 2025-06-18 18:03:46 +03:00
Disty0 86cd272b96 SDNQ fix Dora 2025-06-18 16:24:42 +03:00
Enes Sadık Özbek 4b3ce06916 Initial support for Chroma 2025-06-18 00:38:17 +00:00
Disty0 25fc0094a9 SDNQ use quantize_device and return_device args and fix decompress_fp32 always being on 2025-06-14 21:29:08 +03:00
Disty0 2ba64abcde Cleanup 2025-06-14 00:54:18 +03:00
Disty0 cb4684cbeb SDNQ add separate quant mode option for Text Encoders 2025-06-13 12:42:57 +03:00
Disty0 5eed9135e3 Split SDNQ into multiple files and linting 2025-06-10 03:18:25 +03:00
Disty0 976f0ba61f Cleanup 2025-06-05 20:59:58 +03:00
Disty0 d8e8f47ce5 SDNQ add an option to toggle quantize with GPU 2025-05-28 15:18:39 +03:00
Disty0 4ed15f5cce SDNQ revert device_map = gpu 2025-05-27 23:32:58 +03:00
Disty0 d3e3fb98b0 Don't override user set device_map 2025-05-27 21:45:52 +03:00
Disty0 b1b29e9001 SDNQ disable device_map = gpu with TE and LLM 2025-05-27 21:32:32 +03:00
Disty0 3618e39cff SDNQ use device_map = gpu 2025-05-27 19:46:30 +03:00
Disty0 dece497f10 Refactor SDNQ to use weights_dtype and rename decompress_int8_matmul to use_quantized_matmul 2025-05-27 15:49:21 +03:00
Disty0 4453efee76 Rename NNCF to SDNQ and rename quant schemes 2025-05-26 02:39:51 +03:00
Disty0 2d79380bd7 NNCF implement better layer hijacks and remove all NNCF imports 2025-05-26 01:12:28 +03:00
Vladimir Mandic bfda37903c update nncf linting and changelog 2025-05-13 12:11:10 -04:00
Disty0 f4e3a81a84 NNCF experimental direct INT8 MatMul support 2025-05-12 21:41:49 +03:00
Disty0 9cfdc3c079 Remove NNCF device hijack 2025-05-11 18:30:10 +03:00
Disty0 1ee9832e05 NNCF silence the pytorch version warning 2025-05-09 23:16:55 +03:00
Vladimir Mandic 83fc68ece3 update requirements 2025-05-06 09:47:13 -04:00
Vladimir Mandic c6cc1476c6 add hidream-e1 2025-04-29 10:08:51 -04:00
Disty0 9e133a5044 Cleanup 2025-04-23 04:16:24 +03:00
Disty0 bb0329f54f Update and refactor NNCF and add more quant options 2025-04-23 02:03:30 +03:00
Disty0 2264d8087b Pre-load support for NNCF 2025-04-22 04:35:36 +03:00
Vladimir Mandic 0fe0707cdd add lodestone-chroma 2025-04-21 19:29:09 -04:00
Disty0 434bb660ce Move post load quant functions to a single function in model_quant 2025-04-20 17:04:25 +03:00
Vladimir Mandic 75ebf1e196 hidream add llm info to metadata 2025-04-17 14:44:37 -04:00
Disty0 a6ea26fb8d Use transformers QuantoConfig for TE and LLM 2025-04-16 19:16:35 +03:00
Vladimir Mandic 15f8e70e89 add nunchaku prototype 2025-04-15 14:39:24 -04:00
Vladimir Mandic 0f595d4cc5 cleanup multiple model loaders 2025-04-11 22:16:05 -04:00
Disty0 72918eb94d Fix quanto needing to re-quant when moving to gpu 2025-04-11 21:50:06 +03:00
Disty0 8e567f3ab0 Fix and update TorchAO 2025-04-11 20:08:03 +03:00
Disty0 bd1d8d44dc Fix quanto with transformers 2025-04-11 19:15:23 +03:00
Vladimir Mandic 84fe068070 custom model loader 2025-04-08 14:56:07 -04:00
Vladimir Mandic 6430f7006f add monitor cli option and finish lora refactor 2025-04-01 13:39:47 -04:00
Vladimir Mandic f4fdd496b9 more granular quantization modules options 2025-03-28 14:46:52 -04:00
Vladimir Mandic d1c3b97c65 add prompt enhance 2025-03-28 14:05:28 -04:00
Vladimir Mandic 4f56f4aa33 add new optimum-quanto on-the-fly and simplify quantization loading 2025-03-16 21:45:05 -04:00