Commit Graph

95 Commits (78c2a629b6d4711b1da27b6d7dc1dad2e11f7beb)

Author SHA1 Message Date
Vladimir Mandic 78c2a629b6 add experimental tensorrt quantization
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-09-05 10:43:05 -04:00
Vladimir Mandic 2124ab6879 trt experiment
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-09-04 10:53:32 -04:00
Disty0 e49814098e Add sdnq_modules_dtype_dict 2025-08-20 14:58:54 +03:00
Disty0 0946710662 Add sdnq_modules_to_not_convert to UI settings 2025-08-20 04:38:20 +03:00
Disty0 8ca74d0cd2 SDNQ rename unused param_name arg to op 2025-08-13 22:10:30 +03:00
Disty0 15cb8fe9f8 SDNQ add modules_dtype_dict and fix Qwen Image with quants less than 5 bits 2025-08-13 00:07:36 +03:00
Vladimir Mandic 87bd347116 cleanup flux loader
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-08-11 16:05:39 -04:00
Disty0 afb3a5a06d SDNQ move non_blocking to quant config 2025-08-11 15:07:02 +03:00
Vladimir Mandic 6a6605191f configurable image fit in all image views
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-08-06 11:33:50 -04:00
Vladimir Mandic 7bba30e797 sdnq obey diffusers_to_gpu
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-08-06 10:12:27 -04:00
Vladimir Mandic ba4bff08d6 remove ldsr and refactor sdnq device map
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-08-06 09:27:54 -04:00
Vladimir Mandic 4be093b80f add diffusers_offload_nonblocking setting
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-08-01 16:38:31 -04:00
Vladimir Mandic b291c337a1 refactor internal post loop
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-08-01 10:45:39 -04:00
Vladimir Mandic fa44521ea3 offload-never and offload-always per-module and new highvram profile
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-31 11:40:24 -04:00
Vladimir Mandic 052f097956 lint
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-30 13:14:04 -04:00
Disty0 f02cfeaef9 Rename SDNQ TE dtype default to Same as model 2025-07-27 23:06:41 +03:00
Vladimir Mandic b5a87c4828 modify installer checks
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-26 18:40:28 -04:00
Vladimir Mandic ed1e59464e update requirements
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-06 14:53:25 -04:00
Vladimir Mandic 2b9056179d add lbm background replace with relightining
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-04 15:33:16 -04:00
Disty0 4aeb9f004c Cleanup layerwise 2025-07-04 06:12:25 +03:00
Disty0 9d5571f4f8 Use layerwise casting with Flux FP8 models 2025-07-04 04:26:43 +03:00
Disty0 fe9a3b8506 Add device map support to Flux and Chroma and add custom UNet support to Chroma 2025-07-04 02:22:42 +03:00
Disty0 cf90e5621a Add _skip_layerwise_casting_patterns to SDNQ skip list 2025-07-04 00:04:01 +03:00
Vladimir Mandic c4d9338d2e major refactoring of modules
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-03 09:18:38 -04:00
Disty0 71f7474de2 Unify quant options 2025-06-27 21:05:14 +03:00
Disty0 3b8ced444c Add auto quantization mode 2025-06-27 18:54:15 +03:00
Disty0 0e4d712f27 Chroma fixes 2025-06-26 21:25:50 +03:00
Enes Sadık Özbek e91208bea9 Merge branch 'dev' into feature/chroma-support 2025-06-26 17:02:00 +03:00
Disty0 0f6eb624c9 Use llm_int8_skip_modules with bnb 2025-06-26 03:10:26 +03:00
Disty0 dc8fd006b2 Add modules_to_not_convert to pre-mode quants 2025-06-26 02:47:10 +03:00
Enes Sadık Özbek 21bdde12d3 Merge branch 'dev' into feature/chroma-support 2025-06-26 01:56:34 +03:00
Disty0 c25d15398f SDNQ pre-mode don't quant if weights_dtype == 'none' 2025-06-26 01:48:01 +03:00
Disty0 81e55f0459 Fix SDNQ pre-mode 2025-06-25 22:58:40 +03:00
Vladimir Mandic 5b486a6ef1 sdnq add xyz grid support, improve offloading compatibility
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-06-25 15:32:37 -04:00
Disty0 87e6d3f4fc SDNQ add modules_to_not_convert and don't quant _keep_in_fp32_modules layers in post mode 2025-06-20 20:41:11 +03:00
Disty0 2c4850cc2b Log more info with SDNQ 2025-06-18 18:03:46 +03:00
Disty0 86cd272b96 SDNQ fix Dora 2025-06-18 16:24:42 +03:00
Enes Sadık Özbek 4b3ce06916 Initial support for Chroma 2025-06-18 00:38:17 +00:00
Disty0 25fc0094a9 SDNQ use quantize_device and return_device args and fix decompress_fp32 always being on 2025-06-14 21:29:08 +03:00
Disty0 2ba64abcde Cleanup 2025-06-14 00:54:18 +03:00
Disty0 cb4684cbeb SNDQ add separate quant mode option for Text Encoders 2025-06-13 12:42:57 +03:00
Disty0 5eed9135e3 Split SDNQ into multiple files and linting 2025-06-10 03:18:25 +03:00
Disty0 976f0ba61f Cleanup 2025-06-05 20:59:58 +03:00
Disty0 d8e8f47ce5 SDNQ add an option to toggle quantize with GPU 2025-05-28 15:18:39 +03:00
Disty0 4ed15f5cce SDNQ revert device_map = gpu 2025-05-27 23:32:58 +03:00
Disty0 d3e3fb98b0 Don't override user set device_map 2025-05-27 21:45:52 +03:00
Disty0 b1b29e9001 SDNQ disable device_map = gpu with TE and LLM 2025-05-27 21:32:32 +03:00
Disty0 3618e39cff SDNQ use device_map = gpu 2025-05-27 19:46:30 +03:00
Disty0 dece497f10 Refactor SDNQ to use weights_dtype and rename decompress_int8_matmul to use_quantized_matmul 2025-05-27 15:49:21 +03:00
Disty0 4453efee76 Rename NNCF to SDNQ and rename quant schemes 2025-05-26 02:39:51 +03:00