Disty0
5edf481c8d
Add Torch GC threshold slider
2023-09-28 14:38:22 +03:00
Vladimir Mandic
0afcfe6097
logger early init
2023-09-23 23:44:34 -04:00
Disty0
550b7056ac
IPEX fix SDPA and reduce torch_gc force to 90%
2023-09-18 15:36:14 +03:00
Vladimir Mandic
484dae8dbd
upgrade diffusers
2023-09-14 09:38:17 -04:00
Vladimir Mandic
76c444fbc8
cleanup
2023-09-13 11:48:13 -04:00
Vladimir Mandic
f8fcb6f853
fix original hires non-latent
2023-09-10 18:30:20 -04:00
Vladimir Mandic
250d1bf2fb
update hints
2023-09-10 13:05:31 -04:00
Disty0
34ee67477e
Fix BF16 and FP32 logging
2023-09-08 23:49:49 +03:00
Vladimir Mandic
29d88cf557
cleanup logging
2023-09-08 13:29:33 -04:00
Vladimir Mandic
f36c1eb476
jumbo patch
2023-09-08 13:01:20 -04:00
Vladimir Mandic
8fd96d0f30
catch directml and ipex initialization errors
2023-09-07 07:27:54 -04:00
Vladimir Mandic
df65df3f36
minor fixes
2023-08-30 09:45:47 -04:00
Vladimir Mandic
48c0ce9b2b
fix model lookups
2023-08-27 08:01:29 +00:00
Vladimir Mandic
6a4d4ea5b7
update logging and model hashing
2023-08-22 18:28:09 +00:00
Disty0
f9718f068c
Separate OpenVINO from IPEX
2023-08-19 17:52:15 +03:00
Disty0
209f9a19c6
IPEX fixes
2023-08-16 18:56:50 +03:00
Vladimir Mandic
88fff06c9e
downgrade warn to info
2023-08-13 10:58:02 +00:00
Vladimir Mandic
4234555566
update
2023-08-05 14:37:09 +00:00
Seunghoon Lee
0f44332e5c
Make sequential CPU offload available for non-CUDA
...
Add settings override for DirectML.
Move `devices.set_cuda_params()` to correct line.
2023-07-28 23:11:57 +09:00
Disty0
f38d5a91bf
Move ipex fixes into its own folder
2023-07-28 10:58:45 +03:00
Disty0
38dcca7399
ipex cleanup
2023-07-28 01:35:02 +03:00
Nuullll
6acb3ef131
[IPEX] Fix batch_norm for Tiled VAE
...
Tiled VAE invokes `torch.nn.functional.batch_norm` without providing the
`weight` and `bias` parameters, so the torch backend creates default empty
tensors for them but bails out with a "tensor does not have a device" error.
This patch overrides the `weight` and `bias` parameters with all-ones and
all-zeros tensors when they are `None`.
2023-07-27 22:56:19 +08:00
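The fix described in that commit body can be sketched as a generic wrapper (a hypothetical `patch_batch_norm` helper; plain Python lists stand in for tensors so the default-filling pattern is clear):

```python
# Hypothetical sketch of the Tiled VAE fix: wrap a batch_norm-style function
# so that a missing weight/bias defaults to all-ones / all-zeros instead of
# None, which made the IPEX backend bail out.
def patch_batch_norm(batch_norm):
    def wrapped(x, running_mean, running_var, weight=None, bias=None, **kw):
        n = len(running_mean)
        if weight is None:
            weight = [1.0] * n  # all-ones, matching the patch's behavior
        if bias is None:
            bias = [0.0] * n    # all-zeros
        return batch_norm(x, running_mean, running_var, weight, bias, **kw)
    return wrapped
```

In the real patch the wrapped callable replaces `torch.nn.functional.batch_norm` and the defaults are tensors on the input's device.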
Seunghoon Lee
43b9c52bd4
Fix image corruption in half mode with embeddings.
...
(DirectML)
2023-07-25 17:02:55 +09:00
Seunghoon Lee
d4aa840a77
DirectML update.
...
DirectML reuses GPU memory instead of returning it,
so it also prints "practical" GPU memory utilization.
2023-07-24 16:10:01 +09:00
Vladimir Mandic
b31fa98669
fixes...
2023-07-21 09:28:02 -04:00
Disty0
7959dbceac
ipex fix cuda error with openpose
2023-07-21 15:28:07 +03:00
Disty0
57d1d3ed16
Fix Kandinsky safety_checker and compile
2023-07-20 14:29:15 +03:00
Disty0
88794e3724
ipex fix cuda error when using pin memory
2023-07-17 17:14:13 +03:00
Vladimir Mandic
e2b33b81d3
fix diffusers samplers
2023-07-15 22:40:03 -04:00
Disty0
f773c782fa
ipex cleanup
2023-07-16 01:39:40 +03:00
Seunghoon Lee
0a52c44e73
DirectML rework & provide GPU memory usage (AMD only).
2023-07-15 18:55:38 +09:00
Disty0
14d1136fe7
Fix ipex memstats
2023-07-14 18:09:07 +03:00
Disty0
2a9133bfec
IPEX rework
2023-07-14 17:33:24 +03:00
Disty0
c3a4293f22
Disable torch_gc for IPEX in WSL2
2023-07-12 13:02:42 +03:00
Disty0
2bce86a50a
Replace empty_cache with torch_gc
2023-07-12 12:45:21 +03:00
Disty0
562ca33275
Fix Diffusers _conv_forward dtype error with IPEX
2023-07-12 02:03:45 +03:00
Vladimir Mandic
db30f5faec
update changelog
2023-07-08 14:22:51 -04:00
Vladimir Mandic
2a21196061
Merge branch 'master' into dev
2023-07-08 13:35:25 -04:00
Vladimir Mandic
89a7ea6a3f
overall quality fixes
2023-07-08 09:49:41 -04:00
Disty0
205b516487
Fix diffusers_sdxl on ipex
2023-07-07 22:41:26 +03:00
Disty0
3bcca6f92b
Patch torch.Generator again
2023-07-06 02:51:27 +03:00
Disty0
422c60c787
Patch torch.Generator
2023-07-05 20:49:39 +03:00
Disty0
99284ff020
Cleanup
2023-07-05 12:43:15 +03:00
Disty0
a62d9b0ca4
Cleanup
2023-07-05 12:39:34 +03:00
Nuullll
860bf8e2bf
[IPEX] Support SDE samplers
...
This is a workaround: the `torch.Generator()` API doesn't support the `xpu`
backend at the moment, so it is replaced with the `torch.xpu.Generator()`
API provided by IPEX.
2023-07-05 15:48:58 +08:00
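The workaround in that commit body amounts to choosing the generator constructor by device. A minimal sketch (the `make_generator` helper and `torch_module` parameter are illustrative, not the repo's real API):

```python
# Hypothetical sketch: use torch.xpu.Generator (provided by IPEX) when the
# device is xpu, since torch.Generator() had no xpu backend at the time.
def make_generator(torch_module, device):
    if str(device).startswith("xpu"):
        return torch_module.xpu.Generator(device)
    return torch_module.Generator(device)
```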
Disty0
45d50bd106
Remove cpu=xpu with ipex
2023-07-05 00:12:07 +03:00
Disty0
966eed8dd9
Autodetect IPEX
2023-07-04 23:37:36 +03:00
Vladimir Mandic
b216a35ddd
update diffusers and extra networks
2023-07-04 09:28:48 -04:00
Vladimir Mandic
2a41bf1406
fix styles
2023-06-27 09:04:42 -04:00
Disty0
102503a3a4
Fix ControlNet and change to sub-quad on ipex
2023-06-27 15:17:13 +03:00
Disty0
618097dac2
GradScaler patch for IPEX
2023-06-15 01:19:35 +03:00
Vladimir Mandic
cb307399dd
jumbo merge
2023-06-13 11:59:56 -04:00
Disty0
c9e58c9604
Fix train for IPEX
2023-06-12 00:21:32 +03:00
Disty0
f63dd1c92e
Fix torch.linalg.solve with IPEX & Diffusers UniPC
2023-06-10 22:01:09 +03:00
Disty0
3bef3e3eee
Train patches for IPEX
2023-06-07 17:25:11 +03:00
Vladimir Mandic
efbe364f7d
js optimizations
2023-06-05 14:26:01 -04:00
Disty0
c52fb69dde
Fix bf16 test
2023-06-05 20:49:18 +03:00
Vladimir Mandic
c0a824d8c6
add extra networks to xyz
2023-06-05 10:32:08 -04:00
Disty0
8bef48e501
Fix GroupNorm.forward with IPEX
2023-06-04 12:22:56 +03:00
Disty0
4265692505
Fix GradScaler doesn't exist for XPU
2023-06-03 17:02:44 +03:00
Vladimir Mandic
1f988d1df6
cleanup
2023-06-02 19:39:44 -04:00
Vince Navarro
c30eb90aff
Remove stray print
2023-06-01 17:28:13 -04:00
Vince Navarro
523dbaf8dc
Add XPU support for --device-id
2023-06-01 16:42:21 -04:00
Vladimir Mandic
9bf0b1ae1f
allow experimental to override precision
2023-05-28 07:46:47 -04:00
Vladimir Mandic
f8884bc051
fix hip detection
2023-05-25 09:13:57 -04:00
Vladimir Mandic
9e66d88e21
add mps defaults
2023-05-24 15:21:49 -04:00
Vladimir Mandic
684851ae34
set default optimizer
2023-05-24 13:50:01 -04:00
Vladimir Mandic
0acc7d3b86
fix redirector
2023-05-24 08:49:33 -04:00
Vladimir Mandic
ea0780339a
fixes
2023-05-21 08:17:36 -04:00
Vladimir Mandic
9033499e08
add manual seed
2023-05-19 08:34:43 -04:00
Vladimir Mandic
0ccda9bc8b
jumbo patch
2023-05-17 14:15:55 -04:00
Vladimir Mandic
5250ba4be3
force no-half with directml
2023-05-16 21:20:36 -04:00
Vladimir Mandic
8350b93a5c
add force latent sampler
2023-05-15 09:32:20 -04:00
Vladimir Mandic
5134471bc8
dml autocast
2023-05-14 13:24:59 -04:00
Vladimir Mandic
618a1703ae
update cudnn benchmark setting
2023-05-14 12:28:37 -04:00
Vladimir Mandic
760f5fb89a
add extra debug messages
2023-05-14 12:26:15 -04:00
Vladimir Mandic
a652270999
fix
2023-05-13 12:26:00 -04:00
Vladimir Mandic
a2923064a5
update cudnn
2023-05-13 11:52:31 -04:00
Vladimir Mandic
d96ab6a1ae
update directml
2023-05-13 11:21:11 -04:00
Vladimir Mandic
a2485cf7ef
update
2023-05-12 21:12:24 -04:00
Vladimir Mandic
1921504e64
enable dynamo compile
2023-05-12 15:58:00 -04:00
Vladimir Mandic
daf90cb6b4
add performance note
2023-05-12 14:23:51 -04:00
Vladimir Mandic
62dda471a3
process images in threads
2023-05-12 14:21:26 -04:00
Vladimir Mandic
1943bfea88
use cudnn workaround
2023-05-11 22:24:12 -04:00
Vladimir Mandic
e038bf1549
aggressive gc
2023-05-10 16:03:55 -04:00
Vladimir Mandic
41182009cb
switch some cmdopts to opts
2023-05-08 09:27:50 -04:00
Vladimir Mandic
1360c6422a
add fp16 test
2023-05-08 09:27:50 -04:00
Disty0
8171d57c36
Remove unnecessary IPEX imports
2023-05-04 02:34:34 +03:00
Vladimir Mandic
5d8c787a7b
restart server redesign
2023-05-03 17:20:22 -04:00
Disty0
53f3567224
Use cmd_args parser instead of launch.py
2023-05-03 21:25:23 +03:00
Disty0
7577a09528
Add IPEX Optimizers and use XPU instead of CPU when using IPEX
2023-05-03 18:12:38 +03:00
Disty0
de8d0bef9f
More patches and Import IPEX after Torch
2023-04-30 18:19:37 +03:00
Disty0
a720a670e8
More patches and less import shared
2023-04-30 16:01:17 +03:00
Disty0
b075d3c8fd
Intel ARC Support
2023-04-30 15:13:56 +03:00
Seunghoon Lee
d2d5011bd3
Implement memory estimation for AMDGPUs.
...
Stable.
2023-04-26 17:44:32 +09:00
Seunghoon Lee
a49a8f8b46
First DirectML implementation.
...
Unstable and not tested.
2023-04-25 01:43:19 +09:00
Vladimir Mandic
61e9a1970c
add exception around torch properties
2023-04-22 08:35:17 -04:00
Vladimir Mandic
cf277e7326
fix dtype logic
2023-04-21 15:04:05 -04:00
Vladimir Mandic
57204b3d70
disable xformers/sdp if cannot be used
2023-04-21 11:32:19 -04:00
Vladimir Mandic
7939a1649d
parse model preload
2023-04-20 23:19:25 -04:00
Vladimir Mandic
0e7144186d
jumbo patch
2023-04-20 11:20:27 -04:00
Vladimir Mandic
e14cba0771
add lycoris folder
2023-04-15 12:25:59 -04:00
Vladimir Mandic
81b8294e93
switch cmdflags to settings
2023-04-12 10:40:11 -04:00
brkirch
1b8af15f13
Refactor Mac specific code to a separate file
...
Move most Mac related code to a separate file, don't even load it unless web UI is run under macOS.
2023-02-01 14:05:56 -05:00
brkirch
2217331cd1
Refactor MPS fixes to CondFunc
2023-02-01 06:36:22 -05:00
brkirch
7738c057ce
MPS fix is still needed :(
...
Apparently I did not test with large enough images to trigger the bug with torch.narrow on MPS
2023-02-01 05:23:58 -05:00
AUTOMATIC1111
fecb990deb
Merge pull request #7309 from brkirch/fix-embeddings
...
Fix embeddings, upscalers, and refactor `--upcast-sampling`
2023-01-28 18:44:36 +03:00
brkirch
f9edd578e9
Remove MPS fix no longer needed for PyTorch
...
The torch.narrow fix was required for nightly PyTorch builds for a while to prevent a hard crash, but newer nightly builds don't have this issue.
2023-01-28 04:16:27 -05:00
brkirch
ada17dbd7c
Refactor conditional casting, fix upscalers
2023-01-28 04:16:25 -05:00
AUTOMATIC
9beb794e0b
clarify the option to disable NaN check.
2023-01-27 13:08:00 +03:00
AUTOMATIC
d2ac95fa7b
remove the need to place configs near models
2023-01-27 11:28:12 +03:00
brkirch
e3b53fd295
Add UI setting for upcasting attention to float32
...
Adds "Upcast cross attention layer to float32" option in Stable Diffusion settings. This allows for generating images using SD 2.1 models without --no-half or xFormers.
In order to make upcasting cross attention layer optimizations possible it is necessary to indent several sections of code in sd_hijack_optimizations.py so that a context manager can be used to disable autocast. Also, even though Stable Diffusion (and Diffusers) only upcast q and k, unfortunately my findings were that most of the cross attention layer optimizations could not function unless v is upcast also.
2023-01-25 01:13:04 -05:00
brkirch
84d9ce30cb
Add option for float32 sampling with float16 UNet
...
This also handles type casting so that ROCm and MPS torch devices work correctly without --no-half. One cast is required for deepbooru in deepbooru_model.py, some explicit casting is required for img2img and inpainting. depth_model can't be converted to float16 or it won't work correctly on some systems (it's known to have issues on MPS) so in sd_models.py model.depth_model is removed for model.half().
2023-01-25 01:13:02 -05:00
AUTOMATIC1111
aa60fc6660
Merge pull request #6922 from brkirch/cumsum-fix
...
Improve cumsum fix for MPS
2023-01-19 13:18:34 +03:00
brkirch
a255dac4f8
Fix cumsum for MPS in newer torch
...
The prior fix assumed that testing int16 was enough to determine if a fix is needed, but a recent fix for cumsum has int16 working but not bool.
2023-01-17 20:54:18 -05:00
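The detection logic that commit describes can be sketched as follows (a hypothetical `needs_cumsum_fix` helper; the real code runs a tiny cumsum on the MPS device per dtype):

```python
# Sketch: the fix is needed unless cumsum works for *both* int16 and bool,
# since a newer torch fixed int16 on MPS while bool still failed.
def needs_cumsum_fix(cumsum_ok):
    # cumsum_ok(dtype_name) -> bool is a probe supplied by the caller
    return not (cumsum_ok("int16") and cumsum_ok("bool"))
```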
AUTOMATIC
c361b89026
disable the new NaN check for the CI
2023-01-17 11:05:01 +03:00
AUTOMATIC
9991967f40
Add a check and explanation for tensor with all NaNs.
2023-01-16 22:59:46 +03:00
brkirch
8111b5569d
Add support for PyTorch nightly and local builds
2023-01-05 20:54:52 -05:00
brkirch
16b4509fa6
Add numpy fix for MPS on PyTorch 1.12.1
...
When saving training results with torch.save(), an exception is thrown:
"RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead."
So for MPS, check if Tensor.requires_grad and detach() if necessary.
2022-12-17 04:22:58 -05:00
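The check that commit body describes is a small guard before `.numpy()`. A sketch, with a hypothetical `safe_numpy` helper name:

```python
# Sketch of the MPS workaround: detach a tensor that requires grad before
# calling .numpy(), which raised on PyTorch 1.12.1.
def safe_numpy(tensor):
    if getattr(tensor, "requires_grad", False):
        tensor = tensor.detach()
    return tensor.numpy()
```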
AUTOMATIC
b6e5edd746
add built-in extension system
...
add support for adding upscalers in extensions
move LDSR, ScuNET and SwinIR to built-in extensions
2022-12-03 18:06:33 +03:00
AUTOMATIC
46b0d230e7
add comment for #4407 and remove seemingly unnecessary cudnn.enabled
2022-12-03 16:01:23 +03:00
AUTOMATIC
2651267e3a
fix #4407 breaking UI entirely for card other than ones related to the PR
2022-12-03 15:57:52 +03:00
AUTOMATIC1111
681c0003df
Merge pull request #4407 from yoinked-h/patch-1
...
Fix issue with 16xx cards
2022-12-03 10:30:34 +03:00
brkirch
0fddb4a1c0
Rework MPS randn fix, add randn_like fix
...
torch.manual_seed() already sets a CPU generator, so there is no reason to create a CPU generator manually. torch.randn_like also needs a MPS fix for k-diffusion, but a torch hijack with randn_like already exists so it can also be used for that.
2022-11-30 10:33:42 -05:00
AUTOMATIC1111
cc90dcc933
Merge pull request #4918 from brkirch/pytorch-fixes
...
Fixes for PyTorch 1.12.1 when using MPS
2022-11-27 13:47:01 +03:00
AUTOMATIC
5b2c316890
eliminate duplicated code from #5095
2022-11-27 13:08:54 +03:00
Matthew McGoogan
c67c40f983
torch.cuda.empty_cache() defaults to the cuda:0 device unless another device is explicitly set first. Update torch_gc() to use the device set by --device-id, if specified, to avoid OOM edge cases on multi-GPU systems.
2022-11-26 23:25:16 +00:00
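The shape of that change can be sketched as follows (hypothetical signature; the real `torch_gc()` reads the device from command-line options rather than a parameter):

```python
# Sketch: run empty_cache() inside a device context built from --device-id,
# so the cache of the configured GPU is cleared rather than cuda:0's.
def torch_gc(torch_module, device_id=None):
    device = f"cuda:{device_id}" if device_id is not None else "cuda:0"
    with torch_module.cuda.device(device):
        torch_module.cuda.empty_cache()
    return device
```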
brkirch
e247b7400a
Add fixes for PyTorch 1.12.1
...
Fix typo "MasOS" -> "macOS"
If MPS is available and PyTorch is an earlier version than 1.13:
* Monkey patch torch.Tensor.to to ensure all tensors sent to MPS are contiguous
* Monkey patch torch.nn.functional.layer_norm to ensure input tensor is contiguous (required for this program to work with MPS on unmodified PyTorch 1.12.1)
2022-11-21 02:07:19 -05:00
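Both monkey patches in that commit body share one pattern: make the tensor contiguous before handing it on. A generic sketch (hypothetical wrapper name; the real patches target `torch.Tensor.to` and `torch.nn.functional.layer_norm`):

```python
# Sketch of the contiguity patch pattern: ensure the input tensor is
# contiguous before calling an op that MPS on PyTorch 1.12.1 required
# contiguous input for.
def force_contiguous(fn):
    def wrapped(tensor, *args, **kwargs):
        if not tensor.is_contiguous():
            tensor = tensor.contiguous()
        return fn(tensor, *args, **kwargs)
    return wrapped
```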
brkirch
abfa22c16f
Revert "MPS Upscalers Fix"
...
This reverts commit 768b95394a8500da639b947508f78296524f1836.
2022-11-17 00:08:21 -05:00
AUTOMATIC
0ab0a50f9a
change formatting to match the main program in devices.py
2022-11-12 10:00:49 +03:00
源文雨
1130d5df66
Update devices.py
2022-11-12 11:09:28 +08:00
源文雨
76ab31e188
Fix wrong mps selection below macOS 12.3
2022-11-12 11:02:40 +08:00
pepe10-gpu
62e9fec3df
actual better fix
...
thanks C43H66N12O12S2
2022-11-08 15:19:09 -08:00
pepe10-gpu
29eff4a194
terrible hack
2022-11-07 18:06:48 -08:00
pepe10-gpu
cd6c55c1ab
16xx card fix
...
cudnn
2022-11-06 17:05:51 -08:00
brkirch
faed465a0b
MPS Upscalers Fix
...
Get ESRGAN, SCUNet, and SwinIR working correctly on MPS by ensuring memory is contiguous for tensor views before sending to MPS device.
2022-10-25 09:42:53 +03:00
brkirch
4c24347e45
Remove BSRGAN from --use-cpu, add SwinIR
2022-10-25 09:42:53 +03:00
AUTOMATIC
50b5504401
remove parsing command line from devices.py
2022-10-22 14:04:14 +03:00
Extraltodeus
57eb54b838
implement CUDA device selection by ID
2022-10-22 00:11:07 +02:00
brkirch
fdef8253a4
Add 'interrogate' and 'all' choices to --use-cpu
...
* Add 'interrogate' and 'all' choices to --use-cpu
* Change type for --use-cpu argument to str.lower, so that choices are case insensitive
2022-10-14 16:31:39 +03:00
AUTOMATIC
7349088d32
--no-half-vae
2022-10-10 16:16:29 +03:00
brkirch
e9e2a7ec9a
Merge branch 'master' into cpu-cmdline-opt
2022-10-04 07:42:53 -04:00
AUTOMATIC
6c6ae28bf5
send all three of GFPGAN's and codeformer's models to CPU memory instead of just one for #1283
2022-10-04 12:32:22 +03:00
brkirch
27ddc24fde
Add BSRGAN to --use-cpu
2022-10-04 05:18:17 -04:00
brkirch
eeab7aedf5
Add --use-cpu command line option
...
Remove MPS detection to use CPU for GFPGAN / CodeFormer and add a --use-cpu command line option.
2022-10-04 04:24:35 -04:00
brkirch
b88e4ea7d6
Merge branch 'master' into master
2022-10-04 01:04:19 -04:00
AUTOMATIC
820f1dc96b
initial support for training textual inversion
2022-10-02 15:03:39 +03:00
brkirch
bdaa36c844
When device is MPS, use CPU for GFPGAN instead
...
GFPGAN will not work if the device is MPS, so default to CPU instead.
2022-09-30 23:53:25 -04:00
AUTOMATIC
9d40212485
first attempt to produce correct seeds in batch
2022-09-13 21:49:58 +03:00
AUTOMATIC
c7e0e28ccd
changes for #294
2022-09-12 20:09:32 +03:00