Dump U-Net
What is this
This is an extension for stable-diffusion-webui that adds a custom script which lets you observe U-Net feature maps.
What this can do
This extension can
- visualize intermediate outputs of the model: the features of each U-Net block and of the attention layers.
- apply per-block prompts: generate images while changing the prompt in each block of the U-Net.
- visualize the difference in features caused by the per-block prompts of item 2.
Feature extraction
Use the image below as an example.
Model: waifu-diffusion-v1-3-float16 (84692140)
Prompt: a cute girl, pink hair
Sampling steps: 20
Sampling Method: DPM++ 2M Karras
Size: 512x512
CFG Scale: 7
Seed: 1719471015
Feature extraction from U-Net
For example, the following images are generated.
Grayscale output OUT11, steps 20, Black/White, Sigmoid(1,0)

Colored output OUT11, steps 20, Custom, Sigmoid(1,0), H=(2+v)/3, S=1.0, V=0.5

UI description
- Extract U-Net features
- If checked, U-Net feature extraction is enabled.
- Layers
- Specify the blocks to extract. Comma and hyphen delimiters can be used. `IN11`, `M00` and `OUT00` are connected.
- Image saving steps
- Specify the steps at which extraction is performed.
- Colorization
- Specify how to colorize the output images.
- Dump Setting
- Configure the "binary-dump" settings.
- Selected Layer Info
- Details of the input/output of the blocks specified in the `Layers` section.
In the `Layers` section you can use the grammar below:
single block: IN00
You can use IN00, IN01, ..., IN11, M00, OUT00, OUT01 ..., OUT11.
multiple blocks: IN00, IN01, M00
Comma separated block names.
range: IN00-OUT11
Hyphen separated block names.
Edges are included in the range.
IN11, M00 and OUT00 are connected.
range with steps: IN00-OUT11(+2)
`(+digits)` after the range defines the step.
`+1` is the same as a normal range.
`+2` means "every other block".
For instance, `IN00-OUT11(+2)` means:
IN00, IN02, IN04, IN06, IN08, IN10,
M00,
OUT01, OUT03, OUT05, OUT07, OUT09, OUT11
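The layer grammar above can be sketched in Python as follows. This is a minimal illustration and not the extension's actual parser: the function name `expand_layers` and the regular expression are assumptions made for this example. It relies only on the block order stated above (IN00 to IN11, then M00, then OUT00 to OUT11).

```python
import re

# Block order assumed by the range syntax: IN00..IN11, M00, OUT00..OUT11.
BLOCKS = (
    [f"IN{i:02d}" for i in range(12)]
    + ["M00"]
    + [f"OUT{i:02d}" for i in range(12)]
)

# One item: a block name, optionally followed by "-<block>" and "(+<step>)".
_ITEM = re.compile(r"^(\w+)(?:\s*-\s*(\w+)(?:\s*\(\+(\d+)\))?)?$")


def expand_layers(spec: str) -> list[str]:
    """Expand a spec such as "IN00, IN05-M00(+2)" into a list of block names.

    Hypothetical helper for illustration; the extension may parse differently.
    """
    result = []
    for item in spec.split(","):
        match = _ITEM.match(item.strip())
        if match is None:
            raise ValueError(f"invalid block spec: {item!r}")
        first, last, step = match.groups()
        start = BLOCKS.index(first)        # ValueError for unknown block names
        stop = BLOCKS.index(last or first)
        step = int(step or 1)
        if step < 1 or stop < start:
            raise ValueError(f"invalid range: {item!r}")
        # Both edges are included in the range.
        for name in BLOCKS[start : stop + 1 : step]:
            if name not in result:
                result.append(name)
    return result
```

With this reading, `expand_layers("IN00-OUT11(+2)")` returns the thirteen blocks listed in the example above, and `expand_layers("IN11-OUT00")` returns `IN11`, `M00`, `OUT00`.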
Colorization
- Colorize method
- Specifies the colorization method. Let `v` be the feature value.
`White/Black` shows a white pixel for large `|v|` and a black pixel for small `|v|`.
`Red/Blue` shows a red pixel for large `v` and a blue pixel for small `|v|`.
`Custom` computes the color from `v`. You can use the RGB or HSL color space.
- Value transform
- Feature values are not suitable for direct use as colors. This section specifies how feature values are converted to pixel values.
`Auto [0,1]` converts the values linearly to `[0,1]` using the minimum and maximum of the given feature values.
`Auto [-1,1]` converts the values to `[-1,1]` in the same way.
`Linear` first clamps feature values to the specified `Clamp min./max.` range, then converts them linearly to `[0,1]` when `Colorize method` is `White/Black` and to `[-1,1]` otherwise.
`Sigmoid` is a sigmoid function with the specified gain and x-offset. The output is in the range `[0,1]` when `Colorize method` is `White/Black`, and `[-1,1]` otherwise.
- Color space
- Write code that converts `v`, as transformed by `Value transform`, to a pixel value. `v` is given in `[0,1]` or `[-1,1]` according to `Colorize method` and `Value transform`. The result is clipped to `[0,1]`.
The code is executed with the `numpy` module as the global environment. For example, `abs(v)` means `numpy.abs(v)`.
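The value transforms and a custom color expression can be sketched as below. This is an illustration under stated assumptions, not the extension's code: all function names are hypothetical, the sketch works on plain lists of floats rather than tensors, the way gain and x-offset enter the sigmoid is assumed to be `gain * (v - offset)`, and the `[-1,1]` variants are assumed to be a simple rescaling of the `[0,1]` result. The color function follows the `H=(2+v)/3, S=1.0` example shown earlier, treating the third component (0.5) as HSL lightness.

```python
import colorsys
import math


def auto_unit(values):
    """Auto [0,1]: rescale linearly using the min and max of the given values."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    return [(v - lo) / span for v in values]


def linear(values, clamp_min, clamp_max, signed=False):
    """Linear: clamp to [clamp_min, clamp_max], then rescale to [0,1] or [-1,1]."""
    out = []
    for v in values:
        clamped = min(max(v, clamp_min), clamp_max)
        t = (clamped - clamp_min) / (clamp_max - clamp_min)
        out.append(2.0 * t - 1.0 if signed else t)
    return out


def sigmoid(values, gain=1.0, offset=0.0, signed=False):
    """Sigmoid with gain and x-offset (assumed form: gain * (v - offset))."""
    out = []
    for v in values:
        # sigmoid(x) == (1 + tanh(x / 2)) / 2, which does not overflow.
        s = 0.5 * (1.0 + math.tanh(0.5 * gain * (v - offset)))
        out.append(2.0 * s - 1.0 if signed else s)
    return out


def hsl_color(v):
    """Map v in [-1,1] to RGB using H=(2+v)/3, S=1.0, L=0.5 (assumed reading)."""
    r, g, b = colorsys.hls_to_rgb((2.0 + v) / 3.0, 0.5, 1.0)
    # The result is clipped to [0,1], as described for the color space code.
    return tuple(min(max(c, 0.0), 1.0) for c in (r, g, b))
```

Under these assumptions, `v = -1` maps to green, `v = 0` to blue and `v = 1` to red, so the sign of a feature value is visible as hue.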
Dump Setting
- Dump feature tensors to files
- If checked, U-Net feature tensors are exported as files.
- Output path
- Specify the directory to output binaries. If it does not exist, it will be created.
Examples of extracted images
Images with steps=1,5,10 from left to right.
Feature extraction from Attention layer
UI description
Same as Feature extraction from U-Net.
Examples
The horizontal axis represents the token position. The beginning token and ending token are inserted, so the 75 images in between represent the influence of each token.
The vertical axis represents the heads of the attention layer. In the current model, h=8, so there will be 8 images in a row.
From these images you can make observations such as "pink hair seems to take effect in this layer".
Per-block Prompts
Overview
See the following article for details (in Japanese).
Model: waifu-diffusion-v1-3-float16 (84692140)
Prompt: a (~: IN00-OUT11: cute; M00: excellent :~) girl
Sampling Method: Euler a
Size: 512x512
CFG Scale: 7
Seed: 3292581281
The above images are, in order:
- generated by `a cute girl`
- with cute changed to excellent in IN00
- with cute changed to excellent in IN05
- with cute changed to excellent in M00
UI description
Same as Feature extraction from U-Net
- Output difference map of U-Net features between with and without Layer Prompt
- Adds to the outputs an image showing the difference in U-Net features between per-block prompts disabled and enabled.
Notation
Use the notation below in the prompt:
a (~: IN00-OUT11: cute ; M00: excellent :~) girl
In the above case, IN00-OUT11 (i.e. the whole generation process) uses
a cute girl
but M00 uses
a excellent girl
You can specify per-block prompts with the grammar below:
(~:
block-spec:prompt;
block-spec:prompt;
...
block-spec:prompt;
:~)
Spaces may be inserted after `(~:`, before `:~)`, before `:`, and after `;`. Note that the prompt text between `:` and `;` is used in the result as written, including spaces. The semicolon after the last prompt may be omitted.
The block specification (`block-spec` above) is as follows. In general, it is the same as in X/Y plot. If ranges overlap, the later one takes precedence.
single block: IN00
You can use IN00, IN01, ..., IN11, M00, OUT00, OUT01 ..., OUT11.
multiple blocks: IN00, IN01, M00
Comma separated block names.
range: IN00-OUT11
Hyphen separated block names.
Edges are included in the range.
IN11, M00 and OUT00 are connected.
range with steps: IN00-OUT11(+2)
`(+digits)` after the range defines the step.
`+1` is the same as a normal range.
`+2` means "every other block".
For instance, `IN00-OUT11(+2)` means:
IN00, IN02, IN04, IN06, IN08, IN10,
M00,
OUT01, OUT03, OUT05, OUT07, OUT09, OUT11
otherwise: _ (underscore)
This is a special symbol and has the lowest precedence.
If no other block spec matches, the prompt defined here is used.
Examples
A few examples.
1: (~: IN00: A ; IN01: B :~)
2: (~: IN00: A ; IN01: B ; IN02: C :~)
3: (~: IN00: A ; IN01: B ; IN02: C ; _ : D :~)
4: (~: IN00,IN01: A ; M00 : B :~)
5: (~: IN00-OUT11: A ; M00 : B :~)
1: use A in IN00, B in IN01, and nothing in the other blocks.
2: use A in IN00, B in IN01, C in IN02, and nothing in the other blocks.
3: use A in IN00, B in IN01, C in IN02, and D in the other blocks.
4: use A in IN00 and IN01, B in M00, and nothing in the other blocks.
5: use A from IN00 to OUT11 (all blocks), but B in M00.
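The precedence rules above (later entries override earlier ones, `_` is used only when nothing else matches, unmatched blocks get nothing) can be sketched as a small resolver. This is an illustration, not the extension's implementation: the names `prompts_per_block`, `_expand` and `_resolve` are hypothetical, and for readability the sketch strips the whitespace around each prompt, whereas the extension uses the prompt text as written.

```python
import re

# Block order assumed by the range syntax: IN00..IN11, M00, OUT00..OUT11.
BLOCKS = (
    [f"IN{i:02d}" for i in range(12)]
    + ["M00"]
    + [f"OUT{i:02d}" for i in range(12)]
)

_NOTATION = re.compile(r"\(~:(.*?):~\)", re.DOTALL)
_ITEM = re.compile(r"^(\w+)(?:\s*-\s*(\w+)(?:\s*\(\+(\d+)\))?)?$")


def _expand(spec: str) -> list[str]:
    """Expand "IN00, IN05-M00(+2)" style specs into block names."""
    names = []
    for item in spec.split(","):
        match = _ITEM.match(item.strip())
        if match is None:
            raise ValueError(f"invalid block spec: {item!r}")
        first, last, step = match.groups()
        start = BLOCKS.index(first)
        stop = BLOCKS.index(last or first)
        names.extend(BLOCKS[start : stop + 1 : int(step or 1)])
    return names


def _resolve(body: str, block: str) -> str:
    """Pick the prompt text that applies to one block."""
    chosen = None
    fallback = ""  # used when no entry matches and no "_" entry exists
    for entry in body.split(";"):
        if not entry.strip():
            continue  # tolerate a trailing semicolon
        spec, sep, text = entry.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in entry: {entry!r}")
        if spec.strip() == "_":
            fallback = text.strip()  # lowest precedence
        elif block in _expand(spec):
            chosen = text.strip()    # later entries override earlier ones
    return fallback if chosen is None else chosen


def prompts_per_block(prompt: str) -> dict[str, str]:
    """Return the effective prompt for each of the 25 blocks."""
    return {
        block: _NOTATION.sub(lambda m: _resolve(m.group(1), block), prompt)
        for block in BLOCKS
    }
```

For the prompt `a (~: IN00-OUT11: cute ; M00: excellent :~) girl`, this sketch yields `a excellent girl` for M00 and `a cute girl` for every other block, matching the description in the Notation section.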
Use with Dynamic Prompts
For experiments, Dynamic Prompts is useful.
For instance, if you want to see the effect of changing the prompt in only one block, enable Jinja Template in Dynamic Prompts and input the following prompt:
{% for layer in [ "IN00", "IN01", "IN02", "IN03", "IN04", "IN05", "IN06", "IN07", "IN08", "IN09", "IN10", "IN11", "M00", "OUT00", "OUT01", "OUT02", "OUT03", "OUT04", "OUT05", "OUT06", "OUT07", "OUT08", "OUT09", "OUT10", "OUT11" ] %}
{% prompt %}a cute school girl, pink hair, wide shot, (~:{{layer}}:bad anatomy:~){% endprompt %}
{% endfor %}
to check the effect of bad anatomy in each block.
Actual examples are here (in Japanese).
Test adding prompts to one specific block with prompts by block
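If Dynamic Prompts is not available, the same list of prompts can be produced with a few lines of Python and pasted in one by one. This is only a sketch of what the Jinja template above expands to; the function name `one_block_prompts` is hypothetical.

```python
# Block order: IN00..IN11, M00, OUT00..OUT11 (25 blocks in total).
BLOCKS = (
    [f"IN{i:02d}" for i in range(12)]
    + ["M00"]
    + [f"OUT{i:02d}" for i in range(12)]
)


def one_block_prompts(base: str, extra: str) -> list[str]:
    """Build one prompt per block, adding `extra` to that block only."""
    return [f"{base}, (~:{block}:{extra}:~)" for block in BLOCKS]


prompts = one_block_prompts("a cute school girl, pink hair, wide shot", "bad anatomy")
for line in prompts:
    print(line)
```

Each printed line corresponds to one iteration of the `{% for layer in [...] %}` loop in the template.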
TODO
- visualize self-attention layer













