Commit Graph

67 Commits (0a49961baaaa594313baffba7a000e4e98712c71)

Author SHA1 Message Date
Disty0 9a7c765506 Run torch.xpu.memory_allocated with device 2023-06-23 11:08:11 +03:00
Vladimir Mandic 9334b2f21c jumbo merge part three 2023-06-14 13:54:23 -04:00
Vladimir Mandic 1d9e490ef9 ruff linting fixes 2023-06-13 12:22:39 -04:00
Vladimir Mandic cb307399dd jumbo merge 2023-06-13 11:59:56 -04:00
Vladimir Mandic 0cca4d452a add saving from process tab 2023-06-07 17:35:27 -04:00
Vladimir Mandic d25b020f61 update 2023-06-02 12:29:21 -04:00
Disty0 562947c944 Use proper device names instead of "xpu" 2023-06-02 01:14:56 +03:00
Vladimir Mandic 54257dd226 refactoring for pylint 2023-05-28 17:09:58 -04:00
Vladimir Mandic 0ccda9bc8b jumbo patch 2023-05-17 14:15:55 -04:00
Disty0 5c9894724c Fix memory monitoring when using IPEX 2023-05-05 12:26:47 +03:00
Disty0 8171d57c36 Remove unnecessary IPEX imports 2023-05-04 02:34:34 +03:00
Vladimir Mandic cb4cff3929 redesign logging 2023-05-02 13:57:16 -04:00
Vladimir Mandic deb0546b46 update requirements 2023-05-01 18:54:50 -04:00
Disty0 de8d0bef9f More patches and Import IPEX after Torch 2023-04-30 18:19:37 +03:00
Disty0 b075d3c8fd Intel ARC Support 2023-04-30 15:13:56 +03:00
Seunghoon Lee df0e89be48 fix.
Unstable & needs more testing.
2023-04-26 12:45:44 +09:00
Seunghoon Lee 8b75033a11 fix 2023-04-26 12:34:27 +09:00
Seunghoon Lee 09ae33cdf7 Implement torch.dml.
VERY UNSTABLE & NOT TESTED.
2023-04-26 12:21:44 +09:00
Seunghoon Lee db56da075a need full precision for model & vae.
Stable & tested.
2023-04-25 23:04:52 +09:00
Seunghoon Lee a49a8f8b46 First DirectML implementation.
Unstable and not tested.
2023-04-25 01:43:19 +09:00
Vladimir Mandic ed8819b8fc lycoris, strong linting, model keyword, circular imports 2023-04-15 10:28:31 -04:00
Vladimir Mandic 81b8294e93 switch cmdflags to settings 2023-04-12 10:40:11 -04:00
Vladimir Mandic f181885f0c Merge pull request #57 from AUTOMATIC1111/master
merge from upstream
2023-03-25 08:47:00 -04:00
FNSpd 280ed8f00f Update sd_hijack_optimizations.py 2023-03-24 16:29:16 +04:00
FNSpd c84c9df737 Update sd_hijack_optimizations.py 2023-03-21 14:50:22 +04:00
Vladimir Mandic f6679fcc77 add global exception handler 2023-03-17 10:08:07 -04:00
Pam 8d7fa2f67c sdp_attnblock_forward hijack 2023-03-10 22:48:41 +05:00
Pam 37acba2633 argument to disable memory efficient for sdp 2023-03-10 12:19:36 +05:00
Pam fec0a89511 scaled dot product attention 2023-03-07 00:33:13 +05:00
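The three commits above move attention to PyTorch 2.0's built-in scaled dot product attention, with an argument to disable its memory-efficient backend. A minimal sketch of that pattern, assuming PyTorch 2.0; the function and flag names are illustrative, not the repository's actual option names:

```python
import torch
import torch.nn.functional as F

def sdp_attention(q, k, v, disable_mem_efficient=False):
    # Route attention through torch 2.0's fused SDP kernels; the
    # memory-efficient backend can be switched off, mirroring the
    # "argument to disable memory efficient for sdp" commit.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=True,
        enable_math=True,
        enable_mem_efficient=not disable_mem_efficient,
    ):
        return F.scaled_dot_product_attention(q, k, v)
```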
brkirch e3b53fd295 Add UI setting for upcasting attention to float32
Adds "Upcast cross attention layer to float32" option in Stable Diffusion settings. This allows for generating images using SD 2.1 models without --no-half or xFormers.

To make the upcast cross attention layer optimizations possible, several sections of code in sd_hijack_optimizations.py had to be indented so that a context manager can be used to disable autocast. Also, even though Stable Diffusion (and Diffusers) only upcast q and k, my findings were that, unfortunately, most of the cross attention layer optimizations could not function unless v is also upcast.
2023-01-25 01:13:04 -05:00
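A minimal sketch of the upcast described in the commit above, assuming a baddbmm-style attention over 3D (batch, tokens, dim) tensors: autocast is disabled via a context manager and q, k, and v are all promoted to float32 before the attention math. The function name and shapes are illustrative, not the repository's exact code:

```python
import torch

def upcast_attention(q, k, v):
    # Disable autocast, upcast q/k/v to float32, do the attention math
    # in full precision, then cast the result back to the input dtype.
    dtype = q.dtype
    with torch.autocast(q.device.type, enabled=False):
        q, k, v = q.float(), k.float(), v.float()
        scale = q.shape[-1] ** -0.5
        sim = torch.baddbmm(
            torch.empty(q.shape[0], q.shape[1], k.shape[1],
                        dtype=q.dtype, device=q.device),
            q, k.transpose(-1, -2), beta=0, alpha=scale,
        )
        out = torch.bmm(sim.softmax(dim=-1), v)
    return out.to(dtype)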
AUTOMATIC 59146621e2 better support for xformers flash attention on older versions of torch 2023-01-23 16:40:20 +03:00
Takuma Mori 3262e825cc add --xformers-flash-attention option & impl 2023-01-21 17:42:04 +09:00
AUTOMATIC 40ff6db532 extra networks UI
rework of hypernets: rather than via settings, hypernets are added directly to the prompt as <hypernet:name:weight>
2023-01-21 08:36:07 +03:00
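The <hypernet:name:weight> prompt syntax above is easy to illustrate; here is a hedged sketch of a parser for it, where the regex and helper are illustrative and not the repository's actual extra-networks parser:

```python
import re

HYPERNET_RE = re.compile(r"<hypernet:(?P<name>[^:>]+):(?P<weight>[0-9.]+)>")

def extract_hypernets(prompt):
    # Pull <hypernet:name:weight> tags out of the prompt, returning the
    # cleaned prompt plus the (name, weight) pairs that were found.
    found = [(m["name"], float(m["weight"]))
             for m in HYPERNET_RE.finditer(prompt)]
    return HYPERNET_RE.sub("", prompt), found

# extract_hypernets("a castle <hypernet:anime:0.6>")
# -> ('a castle ', [('anime', 0.6)])
```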
brkirch c18add68ef Added license 2023-01-06 16:42:47 -05:00
brkirch b95a4c0ce5 Change sub-quad chunk threshold to use percentage 2023-01-06 01:01:51 -05:00
brkirch d782a95967 Add Birch-san's sub-quadratic attention implementation 2023-01-06 00:14:13 -05:00
brkirch 35b1775b32 Use other MPS optimization for large q.shape[0] * q.shape[1]
Check if q.shape[0] * q.shape[1] is 2**18 or larger and use the lower memory usage MPS optimization if it is. This should prevent most crashes that were occurring at certain resolutions (e.g. 1024x1024, 2048x512, 512x2048).

Also included is a change that checks slice_size and prevents it from being divisible by 4096, which otherwise causes a crash at 1024x512 or 512x1024 resolution.
2022-12-20 21:30:00 -05:00
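A sketch of the two checks the commit above describes; the helper names are illustrative, and decrementing is just one way to move slice_size off the boundary:

```python
def use_low_memory_mps_path(q):
    # Prefer the lower-memory MPS optimization once the product of the
    # first two dimensions of q reaches 2**18, the size at which crashes
    # were observed (e.g. 1024x1024, 2048x512, 512x2048).
    return q.shape[0] * q.shape[1] >= 2 ** 18

def adjust_slice_size(slice_size):
    # Keep slice_size from being divisible by 4096, which the commit
    # reports as crashing at 1024x512 or 512x1024.
    if slice_size > 0 and slice_size % 4096 == 0:
        slice_size -= 1
    return slice_size
```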
AUTOMATIC 505ec7e4d9 cleanup some unneeded imports for hijack files 2022-12-10 09:17:39 +03:00
AUTOMATIC 7dbfd8a7d8 do not replace entire unet for the resolution hack 2022-12-10 09:14:45 +03:00
Billy Cao adb6cb7619 Patch UNet Forward to support resolutions that are not multiples of 64
Also modified the UI to no longer step in increments of 64
2022-11-23 18:11:24 +08:00
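The commit above does not spell out the mechanism, but a common way to make a UNet accept such resolutions is to pad the latent up to a compatible size and crop the output afterwards. A hedged sketch under that assumption, not necessarily this commit's exact approach:

```python
import torch.nn.functional as F

def unet_forward_padded(unet, x, *args, **kwargs):
    # Pad the latent so each spatial side is a multiple of 8 (64 pixels
    # after the 8x VAE downscale), run the UNet, then crop back.
    h, w = x.shape[-2:]
    pad_h, pad_w = (8 - h % 8) % 8, (8 - w % 8) % 8
    if pad_h or pad_w:
        x = F.pad(x, (0, pad_w, 0, pad_h), mode="reflect")
    out = unet(x, *args, **kwargs)
    return out[..., :h, :w]
```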
Cheka 2fd7935ef4 Remove wrong self reference in CUDA support for invokeai 2022-10-19 09:35:53 +03:00
C43H66N12O12S2 c71008c741 Update sd_hijack_optimizations.py 2022-10-18 11:53:04 +03:00
C43H66N12O12S2 84823275e8 readd xformers attnblock 2022-10-18 11:53:04 +03:00
C43H66N12O12S2 2043c4a231 delete xformers attnblock 2022-10-18 11:53:04 +03:00
brkirch 861db783c7 Use apply_hypernetwork function 2022-10-11 17:24:00 +03:00
brkirch 574c8e554a Add InvokeAI and lstein to credits, add back CUDA support 2022-10-11 17:24:00 +03:00
brkirch 98fd5cde72 Add check for psutil 2022-10-11 17:24:00 +03:00
brkirch c0484f1b98 Add cross-attention optimization from InvokeAI
* Add cross-attention optimization from InvokeAI (~30% speed improvement on MPS)
* Add command line option for it
* Make it default when CUDA is unavailable
2022-10-11 17:24:00 +03:00
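A sketch of the selection behavior the bullets above describe: an explicit command line choice wins, and otherwise the InvokeAI optimization becomes the default whenever CUDA is unavailable. The function and option names are illustrative:

```python
import torch

def choose_cross_attention_optimization(cli_choice=None):
    # Honor an explicit command line option; otherwise default to the
    # InvokeAI optimization when CUDA is unavailable (e.g. on MPS).
    if cli_choice is not None:
        return cli_choice
    return "invokeai" if not torch.cuda.is_available() else "default"
```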
AUTOMATIC 873efeed49 rename hypernetwork dir to hypernetworks to prevent a clash with an old filename kept by people who use zip instead of git clone 2022-10-11 15:51:30 +03:00
AUTOMATIC 530103b586 fixes related to merge 2022-10-11 14:53:02 +03:00