finish v1.1.0, remove a bunch of features (commit `8cc867f20a`, parent `4bbeccf33b`) — README.md

This extension aims to integrate [AnimateDiff](https://github.com/guoyww/AnimateDiff) into Stable Diffusion WebUI.
This extension implements AnimateDiff in a different way. It does not require you to clone the whole SD1.5 repository. It also applies (probably) the smallest possible modification to `ldm`, so you do not need to reload your model weights if you don't want to.

It essentially injects multiple motion modules into the SD1.5 UNet. It does not work for other variations of SD, such as SD2.1 and SDXL.
You might also be interested in another extension I created: [Segment Anything for Stable Diffusion WebUI](https://github.com/continue-revolution/sd-webui-segment-anything).
## How to Use
1. Install this extension via link.
2. Download motion modules from [Google Drive](https://drive.google.com/drive/folders/1EqLC65eR1-W-sGD0Im7fkED6c8GkiNFI) | [HuggingFace](https://huggingface.co/guoyww/animatediff) | [CivitAI](https://civitai.com/models/108836) | [Baidu NetDisk](https://pan.baidu.com/s/18ZpcSM6poBqxWNHtnyMcxg?pwd=et8y). You only need one of `mm_sd_v14.ckpt` | `mm_sd_v15.ckpt`. Put the model weights under `sd-webui-animatediff/model/`. DO NOT change the model filename.
3. Go to txt2img if you want to try txt2gif, or img2img if you want to try img2gif.
4. Choose an SD1.5 checkpoint, write prompts, set configurations such as image width/height, and click `Generate`. You should see the output GIF in the output gallery. Batch size on WebUI is repurposed as the GIF frame number: 1 full GIF is generated in 1 batch. If you want to generate multiple GIFs at once, change the batch count instead of the batch size.
5. You can access GIF output at `stable-diffusion-webui/outputs/{txt2img or img2img}-images/AnimateDiff`. The Gradio gallery might show only one frame of your GIF. You can also use the download/save buttons on WebUI, just like you save your images. Individual image frames are saved at `stable-diffusion-webui/outputs/{txt2img or img2img}-images/{date}`.
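The folder layout from step 2 can be sanity-checked before launching WebUI. This is an illustrative helper of my own, not part of the extension; the directory path and checkpoint filenames follow step 2 above.

```python
from pathlib import Path

# Checkpoint filenames from step 2; the extension expects them unchanged.
EXPECTED = {"mm_sd_v14.ckpt", "mm_sd_v15.ckpt"}

def find_motion_modules(model_dir: str) -> list[str]:
    """Return the expected motion-module checkpoints present in model_dir."""
    found = {p.name for p in Path(model_dir).glob("*.ckpt")}
    return sorted(found & EXPECTED)
```

If this returns an empty list for `sd-webui-animatediff/model`, the weights are missing or have been renamed.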

## Update

- `2023/07/20` [v1.1.0](https://github.com/continue-revolution/sd-webui-animatediff/releases/tag/v1.1.0): remove auto-download, remove xformers, remove instructions on gradio UI, refactor README, add [sponsor](#sponsor) QR code.

## TODO

- [ ] try other attention optimization (e.g. sdp)
- [ ] fix matrix incompatible issue
- [ ] check VRAM usage
- [ ] fix all problems reported at GitHub issues and Reddit
- [ ] refactor README as a step-by-step guidance

## FAQ

1. Q: Can I reproduce the result created by the original authors?

   A: Unfortunately, you cannot. A1111 implements the generation of random tensors in a completely different way, and it is not possible to produce exactly the same random tensors as the original authors without an extremely large code modification.
2. Q: I am using a remote server which blocks Google. What should I do?

   A: You will have to find a way to download the motion modules locally and re-upload them to your server. I provide a [Baidu NetDisk link](https://pan.baidu.com/s/18ZpcSM6poBqxWNHtnyMcxg?pwd=et8y).
3. Q: How much VRAM do I need?

   A: Currently, you can run WebUI with this extension on an NVIDIA 3090. I cannot guarantee any other GPU. Actual VRAM usage depends on your image size and video frame number; you can reduce either to lower VRAM usage. I will add some sample VRAM requirements here later.

4. Q: Can I generate a video instead of a GIF?

   A: Unfortunately, you cannot. A whole batch of images passes through a transformer module at once, which prevents us from generating videos sequentially. We look forward to future developments in deep learning for video generation.

5. Q: Can I use SDXL to generate GIFs?

   A: At least at this time, you cannot. This extension essentially injects motion modules into the SD1.5 UNet, which does not work for other variations of SD, such as SD2.1 and SDXL. I'm not sure what would happen if you force-added motion modules to SD2.1 or SDXL; future experiments are needed.

6. Q: Can I use this extension to do gif2gif?

   A: Due to the 1-batch behavior of AnimateDiff, it is probably not possible to support gif2gif. However, I need to discuss this with the authors of AnimateDiff.
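For intuition on the 1-batch behavior above: a temporal attention layer mixes information across the frame axis, so every frame of the GIF must be in the batch at once. This is a toy sketch of my own, not the extension's actual motion module.

```python
import torch
from torch import nn
from einops import rearrange

class TemporalAttention(nn.Module):
    """Toy temporal layer: attends across frames at each spatial token."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor, video_length: int) -> torch.Tensor:
        # x: (batch * frames, tokens, dim), as produced by the spatial UNet blocks
        tokens = x.shape[1]
        # fold spatial tokens into the batch axis and attend over the frame axis
        x = rearrange(x, "(b f) t d -> (b t) f d", f=video_length)
        out, _ = self.attn(x, x, x)
        return rearrange(out, "(b t) f d -> (b f) t d", t=tokens)
```

Because this layer attends over all `video_length` frames, frames cannot be produced one at a time, which is also why WebUI's batch size is repurposed as the frame count.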
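As a rough back-of-envelope for the VRAM question (my own illustration with assumed numbers, not a measurement): a naive, unsliced self-attention over the latent tokens of every frame materialises a score matrix whose size grows linearly with frame count and quadratically with the latent token count, which is why image size and frame number dominate VRAM usage.

```python
def attn_scores_bytes(width: int, height: int, frames: int,
                      heads: int = 8, bytes_per_el: int = 2) -> int:
    """Rough fp16 size of one naive attention score matrix over latent tokens.

    Assumes the usual 8x VAE downscale and one token per latent pixel;
    illustrative only, real usage depends on the attention implementation.
    """
    tokens = (width // 8) * (height // 8)
    return frames * heads * tokens * tokens * bytes_per_el

# 512x512, 16 frames: one naive attention map alone is ~4 GiB
print(attn_scores_bytes(512, 512, 16) / 2**30)  # prints 4.0
```

Halving either resolution axis or the frame count shrinks this term accordingly, which matches the advice in FAQ 3.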

## Sample

![image](https://user-images.githubusercontent.com/63914308/249620302-32f10fbb-0a31-4059-b14e-0cf5d78be2d8.gif)

![image](https://user-images.githubusercontent.com/63914308/252645742-e04bbaef-2c8a-4b6e-a7b2-6e047badcd9e.gif)
## Sponsor

You can sponsor me via WeChat or Alipay.

| WeChat | Alipay |
| --- | --- |
|  |  |
The commit also touches a Python attention module. Its import block after the change (hunk `@ -5,7 +5,6 @@`; the one removed line appears to be `from modules import shared`, which is no longer needed once the xformers path is disabled below):

```python
import torch.nn.functional as F
from torch import nn

from ldm.modules.attention import FeedForward

from einops import rearrange, repeat
import math
```
In `CrossAttention.__init__` (hunk `@ -287,7 +286,8 @@`), the xformers toggle is replaced by a hard-coded `False`, with the old line kept as a comment:

```python
# You can set slice_size with `set_attention_slice`
self.sliceable_head_dim = heads
self._slice_size = None
# self._use_memory_efficient_attention_xformers = shared.xformers_available
self._use_memory_efficient_attention_xformers = False
self.added_kv_proj_dim = added_kv_proj_dim

if norm_num_groups is not None:
```
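The TODO list above mentions trying other attention optimizations such as sdp. This is a minimal sketch of what that swap could look like, assuming PyTorch 2.x (`torch.nn.functional.scaled_dot_product_attention`); it is my illustration of the fused kernel that could replace the disabled xformers path, not the extension's code.

```python
import torch
import torch.nn.functional as F

def sdp_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                  heads: int) -> torch.Tensor:
    """Multi-head attention via the PyTorch 2.x fused sdp kernel.

    q, k, v: (batch, tokens, dim); dim must be divisible by heads.
    """
    b, t, d = q.shape
    # split heads: (batch, heads, tokens, head_dim)
    q, k, v = (x.view(b, -1, heads, d // heads).transpose(1, 2)
               for x in (q, k, v))
    # fused kernel picks a memory-efficient implementation automatically
    out = F.scaled_dot_product_attention(q, k, v)
    # merge heads back: (batch, tokens, dim)
    return out.transpose(1, 2).reshape(b, t, d)
```

Unlike a hand-rolled softmax, the fused kernel avoids materialising the full score matrix, which addresses the same memory concern xformers did.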