change offload and upcast defaults

Signed-off-by: Vladimir Mandic <mandic00@live.com>
pull/3620/head
Vladimir Mandic 2024-12-05 07:58:52 -05:00
parent 6f6ba0b598
commit 2965045993
6 changed files with 20 additions and 13 deletions


@@ -47,11 +47,17 @@
 - Flux: all-in-one safetensors
   example: <https://civitai.com/models/646328?modelVersionId=1040235>
 - Flux: do not recast quants
-- **Offload** improvements:
-  - faster and more compatible *balanced* mode
+- **Memory** improvements:
+  - faster and more compatible *balanced offload* mode
+  - balanced offload: units are now in percentage instead of bytes
+  - balanced offload: add both high and low watermark
+    *note*: balanced offload is the recommended offload method when using any large model such as sd35 or flux
+    defaults are a 25% low watermark (skip offload if memory usage is below 25%) and a 70% high watermark (must offload if memory usage is above 70%)
+  - change in behavior:
+    `lowvram` triggers *sequential offload* and is also automatically triggered on systems with <=4GB vram
+    all other systems use *balanced offload* by default (can be changed in settings)
+    previous behavior was to use *model offload* on systems with <=8GB or with `medvram` set, and no offload by default
+  - VAE upcast is now disabled by default on all systems
+    if you have issues with image decode, you'll need to enable it manually
 - **UI**:
   - improved stats on generate completion
   - improved live preview display and performance
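The low/high watermark behavior described in the changelog can be sketched as follows. This is an illustrative standalone snippet, not the actual SD.Next implementation; the function name `should_offload` and the midpoint policy between the watermarks are assumptions.

```python
# Sketch of the balanced-offload watermark decision (hypothetical names,
# not the real SD.Next code). Defaults match the changelog: 25% / 70%.
LOW_WATERMARK = 0.25   # skip offload if memory usage is below 25%
HIGH_WATERMARK = 0.70  # must offload if memory usage is above 70%

def should_offload(used_bytes: int, total_bytes: int) -> bool:
    """Decide whether to offload under balanced mode, given current usage."""
    usage = used_bytes / total_bytes
    if usage <= LOW_WATERMARK:
        return False  # plenty of headroom: keep weights on the GPU
    if usage >= HIGH_WATERMARK:
        return True   # memory pressure: offload is mandatory
    # between the watermarks offloading is discretionary; as a simple
    # illustrative policy, offload once past the midpoint
    return usage >= (LOW_WATERMARK + HIGH_WATERMARK) / 2

print(should_offload(2 * 1024**3, 16 * 1024**3))   # 12.5% used -> False
print(should_offload(12 * 1024**3, 16 * 1024**3))  # 75% used -> True
```

Expressing the thresholds as percentages rather than bytes (as this release does) makes the same defaults work across GPUs with very different VRAM sizes.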


@@ -14,7 +14,7 @@
     "DownEncoderBlock2D",
     "DownEncoderBlock2D"
   ],
-  "force_upcast": true,
+  "force_upcast": false,
   "in_channels": 3,
   "latent_channels": 16,
   "latents_mean": null,


@@ -14,6 +14,7 @@
     "DownEncoderBlock2D",
     "DownEncoderBlock2D"
   ],
+  "force_upcast": false,
   "in_channels": 3,
   "latent_channels": 4,
   "layers_per_block": 2,


@@ -15,7 +15,7 @@
     "DownEncoderBlock2D",
     "DownEncoderBlock2D"
   ],
-  "force_upcast": true,
+  "force_upcast": false,
   "in_channels": 3,
   "latent_channels": 16,
   "latents_mean": null,


@@ -15,7 +15,7 @@
     "DownEncoderBlock2D",
     "DownEncoderBlock2D"
   ],
-  "force_upcast": true,
+  "force_upcast": false,
   "in_channels": 3,
   "latent_channels": 4,
   "layers_per_block": 2,


@@ -432,15 +432,15 @@ def get_default_modes():
             cmd_opts.lowvram = True
             default_offload_mode = "sequential"
             log.info(f"Device detect: memory={gpu_memory:.1f} optimization=lowvram")
-        elif gpu_memory <= 8:
-            cmd_opts.medvram = True
-            default_offload_mode = "model"
-            log.info(f"Device detect: memory={gpu_memory:.1f} optimization=medvram")
+        # elif gpu_memory <= 8:
+        #     cmd_opts.medvram = True
+        #     default_offload_mode = "model"
+        #     log.info(f"Device detect: memory={gpu_memory:.1f} optimization=medvram")
         else:
-            default_offload_mode = "none"
-            log.info(f"Device detect: memory={gpu_memory:.1f} optimization=none")
+            default_offload_mode = "balanced"
+            log.info(f"Device detect: memory={gpu_memory:.1f} optimization=balanced")
     elif cmd_opts.medvram:
-        default_offload_mode = "model"
+        default_offload_mode = "balanced"
     elif cmd_opts.lowvram:
         default_offload_mode = "sequential"
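The net effect of the diff above is a new default-mode mapping: sequential offload only for `lowvram` or very small GPUs, balanced offload everywhere else. A simplified standalone sketch (the function name and flat signature are illustrative, not the actual `get_default_modes` code):

```python
# Hypothetical condensed version of the new default-offload selection.
def pick_default_offload(gpu_memory: float, medvram: bool = False, lowvram: bool = False) -> str:
    if lowvram or gpu_memory <= 4:
        return "sequential"  # explicit lowvram, or <=4GB VRAM detected
    if medvram:
        return "balanced"    # was "model" before this commit
    return "balanced"        # was "none" for >8GB GPUs before this commit

print(pick_default_offload(4.0))                 # sequential
print(pick_default_offload(24.0))                # balanced
print(pick_default_offload(8.0, medvram=True))   # balanced
```

Note that the old `gpu_memory <= 8` branch is commented out rather than deleted, so mid-range GPUs no longer get `medvram` set implicitly; they fall through to balanced offload.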