- Addressing #2
- Implement multiple noise functions
- Update all sample images
pull/8/head
Haoming 2023-07-03 15:17:29 +08:00
parent 93ab335241
commit 8da191acb5
16 changed files with 169 additions and 75 deletions


@ -9,20 +9,19 @@ refer to the parameters and sample images below and play around with the values.
**Note:** Since this modifies the underlying latent noise, the composition may change drastically.
**Note:** Due to my scaling implementations, lower steps *(`< 10`)* cause the effects to be much stronger
#### Parameters
- **Enable:** Turn on & off this Extension
- **Alt:** Modify an alternative Tensor instead, causing the effects to be significantly stronger
- **Skip:** Skip the last percentage of steps and only process the first few steps
<p align="center"><img src="samples/Skip.jpg" width=512></p>
<p align="center">When <code>Alt.</code> is enabled, the image can get distorted at high values<br>Increase <code>Skip</code> to retain a stronger effect without the distortion</p>
- **Brightness:** Adjust the overall brightness of the image
- **Contrast:** Adjust the overall contrast of the image
- **Saturation:** Adjust the overall saturation of the image
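How a fractional **Skip** maps to a stopping step can be sketched in plain Python (the helper name is hypothetical; the extension's internal variable names may differ):

```python
def stop_step(total_steps: int, skip: float) -> int:
    """Number of steps still processed when the final
    `skip` fraction of the steps is skipped (hypothetical helper)."""
    return int(total_steps * (1.0 - skip))

# With 20 steps and Skip = 0.2, only the first 16 steps are modified
print(stop_step(20, 0.2))  # 16
```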
#### Color Channels
<table>
<thead align="center">
<tr>
@ -50,50 +49,73 @@ refer to the parameters and sample images below and play around with the values.
</tbody>
</table>
#### Advanced Settings
- **Process Hires. fix:** By default, this Extension only functions during the **txt2img** phase, so that **Hires. fix** may "fix" the artifacts introduced during **txt2img**. Enable this to also process the **Hires. fix** phase.
- This option does not affect **img2img**
- **Note:** Keep the **txt2img** base `steps` higher than **Hires. fix** `steps` if you enable this
##### Noise Settings
> let `x` denote the Tensor; let `y` denote the operation's value
<!-- "Straight", "Straight Abs.", "Cross", "Cross Abs.", "Ones", "N.Random", "U.Random", "Multi-Res", "Multi-Res Abs." -->
- **Straight:** All operations are calculated on the same Tensor
    - `x += x * y`
- **Cross:** All operations are calculated on the Tensor opposite of the `Alt.` setting
    - `x += x' * y`
- **Ones:** All operations are calculated on a Tensor filled with ones
    - `x += 1 * y`
- **N.Random:** All operations are calculated on a Tensor filled with random values from a normal distribution
    - `x += randn() * y`
- **U.Random:** All operations are calculated on a Tensor filled with random values from a uniform distribution
    - `x += rand() * y`
- **Multi-Res:** All operations are calculated on a Tensor generated with the multi-res noise algorithm
    - `x += multires() * y`
- **Abs.:** Calculate using the absolute values of the chosen Tensor instead
    - `x += abs(target) * y` *(where `target` is the Tensor chosen by the method above)*
<p align="center"><img src="samples/Bright.jpg" width=768></p>
<p align="center"><img src="samples/Dark.jpg" width=768></p>
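All of the methods above share one update rule and differ only in which Tensor plays the role of the multiplier. A minimal pure-Python sketch (a list stands in for a Tensor; names are illustrative):

```python
def apply(x, target, y):
    """Apply the shared update rule x += target * y element-wise."""
    return [xi + ti * y for xi, ti in zip(x, target)]

x = [1.0, 2.0, -4.0]
# "Straight": the multiplier is x itself, so each value scales by (1 + y)
print(apply(x, x, 0.5))          # [1.5, 3.0, -6.0]
# "Ones": the multiplier is all ones, i.e. a uniform additive shift of y
print(apply(x, [1.0] * 3, 0.5))  # [1.5, 2.5, -3.5]
```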
## Sample Images
- **Checkpoint:** [UHD-23](https://civitai.com/models/22371/uhd-23)
- **Pos. Prompt:** `(masterpiece, best quality), 1girl, solo, night, street, city, neon_lights`
- **Neg. Prompt:** `(low quality, worst quality:1.2)`, [`EasyNegative`](https://huggingface.co/datasets/gsdf/EasyNegative/tree/main), [`EasyNegativeV2`](https://huggingface.co/gsdf/Counterfeit-V3.0/tree/main/embedding)
- `Euler a`; `20 steps`; `7.5 CFG`; `Hires. fix`; `Latent (nearest)`; `16 H.steps`; `0.6 D.Str.`; `Seed:`**`3814649974`**
- *No offset noise models were used*
- `Straight Abs.`
<p align="center">
<b>Base</b><br>
<code>Extension Disabled</code><br>
<img src="samples/00.jpg" width=512>
</p>
<p align="center">
<b>Dark</b><br>
<code><b>Brightness:</b> -3; <b>Contrast:</b> 1.5</code><br>
<img src="samples/01.jpg" width=512>
</p>
<p align="center">
<b>Bright</b><br>
<code><b>Brightness:</b> 2.5; <b>Contrast:</b> 0.5; <b>Alt:</b> Enabled</code><br>
<img src="samples/02.jpg" width=512>
</p>
<p align="center">
<b>Chill</b><br>
<code><b>Brightness:</b> -2.5; <b>Contrast:</b> 1.25</code><br>
<code><b>R:</b> -1.5; <b>B:</b> 2.5</code><br>
<img src="samples/03.jpg" width=512>
</p>
<p align="center">
<b><s>Mexican Movie</s></b><br>
<code><b>Brightness:</b> 3; <b>Saturation:</b> 1.5</code><br>
<code><b>R:</b> 2; <b>G:</b> 1; <b>B:</b> -2</code><br>
<img src="samples/04.jpg" width=512>
</p>
<p align="center"><i>Notice the significant differences even when using the same seed</i></p>
@ -101,21 +123,23 @@ refer to the parameters and sample images below and play around with the values.
## Roadmap
- [X] Extension Released
- [X] Add Support for **X/Y/Z Plot**
- [X] Implement different Noise functions
- [ ] Add Randomize functions
- [ ] Implement a better scaling algorithm
- [ ] Fix the Brightness issues
- [ ] Add Support for **Inpaint**
- [ ] Add Gradient feature
- [ ] Append Parameters onto Metadata
<p align="center"><img src="samples/XYZ.jpg" width=768></p>
<p align="center"><code>X/Y/Z Plot Support</code></p>
## Known Issues
- Does not work with `DDIM` sampler
- Has little effect when used with certain **LoRA**s
- Too high **Brightness** causes the image to be blurry; too low **Brightness** causes the image to be noisy
- Values too extreme can cause distortions
- Using `Multi-Res` seems to fix the blurry issue *(but not the noise issue)*
- Low `steps` *(`< 10`)* may cause the effects to be stronger, due to the current scaling algorithm
<hr>
@ -138,11 +162,11 @@ After reading through and messing around with the code,
I found out that it is possible to directly modify the Tensors
representing the latent noise used by the Stable Diffusion process.
The dimensions of the Tensors are `(X, 4, H / 8, W / 8)`, which can be thought of like this:
> **X** batch of noise images, with **4** channels, each with **(W / 8) x (H / 8)** values
> **eg.** Generating a single 512x768 image will create a Tensor of size (1, 4, 96, 64)
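As a quick sanity check, the shape arithmetic can be written out directly (hypothetical helper name, plain Python):

```python
def latent_shape(batch: int, width: int, height: int) -> tuple:
    # (batch, 4 channels, H / 8, W / 8)
    return (batch, 4, height // 8, width // 8)

print(latent_shape(1, 512, 768))  # (1, 4, 96, 64)
```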
Then, I tried to play around with the values of each channel and ended up discovering these relationships.
Essentially, the 4 channels correspond to the **CMYK** color format,

Binary files changed (Stored with Git LFS; contents not shown):

samples/00.jpg (new file), samples/00.png, samples/01.jpg (new file), samples/01.png,
samples/02.jpg (new file), samples/02.png, samples/03.jpg (new file), samples/03.png,
samples/04.jpg (new file), samples/04.png, samples/Bright.jpg (new file),
samples/Dark.jpg (new file), samples/Skip.jpg, samples/XYZ.jpg


@ -1,10 +1,14 @@
from modules.sd_samplers_kdiffusion import KDiffusionSampler
import modules.scripts as scripts
from modules import devices, shared
import gradio as gr
import random
import torch
import json
VERSION = 'v1.2.0'
def clean_outdated(EXT_NAME:str):
    with open(scripts.basedir() + '/' + 'ui-config.json', 'r') as json_file:
        configs = json.loads(json_file.read())
@ -13,6 +17,44 @@ def clean_outdated(EXT_NAME:str):
    with open(scripts.basedir() + '/' + 'ui-config.json', 'w') as json_file:
        json.dump(cleaned_configs, json_file)
def ones(latent):
    return torch.ones_like(latent)

def gaussian_noise(latent):
    # Note: torch.rand_like samples from a uniform distribution;
    # despite the name, this backs the "U.Random" option
    return torch.rand_like(latent)

def normal_noise(latent):
    # torch.randn_like samples from a standard normal distribution ("N.Random")
    return torch.randn_like(latent)
def multires_noise(latent, use_zero:bool, iterations=8, discount=0.4):
    """
    Reference: https://wandb.ai/johnowhitaker/multires_noise/reports/Multi-Resolution-Noise-for-Diffusion-Model-Training--VmlldzozNjYyOTU2
    Credit: Kohya_SS
    """
    noise = torch.zeros_like(latent) if use_zero else ones(latent)
    batchSize = noise.size(0)
    height = noise.size(2)
    width = noise.size(3)

    device = devices.get_optimal_device()
    upsampler = torch.nn.Upsample(size=(height, width), mode="bilinear").to(device)

    for b in range(batchSize):
        for i in range(iterations):
            r = random.random() * 2 + 2
            wn = max(1, int(width / (r**i)))
            hn = max(1, int(height / (r**i)))
            for c in range(4):
                noise[b, c] += upsampler(torch.randn(1, 1, hn, wn).to(device))[0, 0] * discount**i
            if wn == 1 or hn == 1:
                break

    return noise / noise.std()
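The resolution schedule inside `multires_noise` can be illustrated without `torch`: each iteration draws a fresh scale factor `r` in `[2, 4)` and divides the noise resolution by `r**i`, stopping once either dimension collapses to 1 (a plain-Python sketch of the same loop):

```python
import random

def resolution_schedule(width, height, iterations=8):
    """Mirror the (hn, wn) sizes that multires_noise upsamples from."""
    sizes = []
    for i in range(iterations):
        r = random.random() * 2 + 2          # fresh scale factor in [2, 4)
        wn = max(1, int(width / (r ** i)))
        hn = max(1, int(height / (r ** i)))
        sizes.append((hn, wn))
        if wn == 1 or hn == 1:
            break
    return sizes

random.seed(0)
sizes = resolution_schedule(64, 96)
print(sizes[0])  # (96, 64) -- r**0 == 1, so the first pass is always full resolution
```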
class VectorscopeCC(scripts.Script):

    def __init__(self):
@ -21,13 +63,6 @@ class VectorscopeCC(scripts.Script):
        global og_callback
        og_callback = KDiffusionSampler.callback_state

        self.xyzCache = {}
        self.xyz_support()
@ -41,7 +76,12 @@ class VectorscopeCC(scripts.Script):
            return ["True", "False"]
        def choices_method():
            return ["Disabled", "Straight", "Straight Abs.", "Cross", "Cross Abs.", "Ones", "N.Random", "U.Random", "Multi-Res", "Multi-Res Abs."]

        for data in scripts.scripts_data:
            if data.script_class.__module__ == 'xyz_grid.py' and hasattr(data, "module"):
                xyz_grid = data.module
                break

        extra_axis_options = [
            xyz_grid.AxisOption("[Vec.CC] Enable", str, apply_field("Enable"), choices=choices_bool),
@ -66,7 +106,7 @@ class VectorscopeCC(scripts.Script):
        return scripts.AlwaysVisible

    def ui(self, is_img2img):
        with gr.Accordion(f"Vectorscope CC {VERSION}", open=False):
            with gr.Row():
                enable = gr.Checkbox(label="Enable")
@ -85,7 +125,7 @@ class VectorscopeCC(scripts.Script):
            with gr.Accordion("Advanced Settings", open=False):
                doHR = gr.Checkbox(label="Process Hires. fix")
                method = gr.Radio(["Straight", "Straight Abs.", "Cross", "Cross Abs.", "Ones", "N.Random", "U.Random", "Multi-Res", "Multi-Res Abs."], label="Noise Settings", value="Straight Abs.")

        return [enable, latent, bri, con, sat, early, r, g, b, doHR, method]
@ -131,10 +171,14 @@ class VectorscopeCC(scripts.Script):
                case 'DoHR':
                    doHR = self.parse_bool(v)
                case 'Method':
                    method = v

        self.xyzCache.clear()

        if method == 'Disabled':
            KDiffusionSampler.callback_state = og_callback
            return p

        steps = p.steps
        if not hasattr(p, 'enable_hr') and hasattr(p, 'denoising_strength') and not shared.opts.img2img_fix_steps:
            steps = int(steps * p.denoising_strength)
@ -145,7 +189,7 @@ class VectorscopeCC(scripts.Script):
            return p

        bri /= steps
        con = pow(con, 1.0 / steps) - 1
        sat = pow(sat, 1.0 / steps)
        r /= steps
        g /= steps
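The per-step scaling above can be checked in isolation: additive values are divided by the step count, while multiplicative ones take the steps-th root, so the effect compounds back to the user-facing value over a full run (plain-Python sketch with example values):

```python
steps = 20
con_user, sat_user = 1.5, 1.2   # user-facing multiplicative values
bri_user = 2.0                  # user-facing additive value

con_step = pow(con_user, 1.0 / steps)  # per-step factor (the script then subtracts 1
                                       # because it applies the factor in additive form)
sat_step = pow(sat_user, 1.0 / steps)
bri_step = bri_user / steps

# compounding each per-step value over all steps recovers the user value
print(abs(con_step ** steps - con_user) < 1e-9)  # True
print(abs(sat_step ** steps - sat_user) < 1e-9)  # True
print(abs(bri_step * steps - bri_user) < 1e-9)   # True
```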
@ -178,24 +222,44 @@ class VectorscopeCC(scripts.Script):
            if d["i"] > stop:
                return og_callback(self, d)

            source = d[mode]

            # "Straight", "Straight Abs.", "Cross", "Cross Abs.", "Ones", "N.Random", "U.Random", "Multi-Res", "Multi-Res Abs."
            if 'Straight' in method:
                target = d[mode]
            elif 'Cross' in method:
                cross = 'x' if mode == 'denoised' else 'denoised'
                target = d[cross]
            elif 'Multi-Res' in method:
                target = multires_noise(d[mode], 'Abs' in method)
            elif method == 'Ones':
                target = ones(d[mode])
            elif method == 'N.Random':
                target = normal_noise(d[mode])
            elif method == 'U.Random':
                target = gaussian_noise(d[mode])

            if 'Abs' in method:
                target = torch.abs(target)

            batchSize = d[mode].size(0)
            for i in range(batchSize):
                BRIGHTNESS = [source[i, 0], target[i, 0]]
                R = [source[i, 2], target[i, 2]]
                G = [source[i, 1], target[i, 1]]
                B = [source[i, 3], target[i, 3]]
                BRIGHTNESS[0] += BRIGHTNESS[1] * bri
                BRIGHTNESS[0] += BRIGHTNESS[1] * con

                R[0] -= R[1] * r
                G[0] += G[1] * g
                B[0] -= B[1] * b

                R[0] *= sat
                G[0] *= sat
                B[0] *= sat

            return og_callback(self, d)
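Tracing a single latent value through the callback shows why the per-step form compounds multiplicatively: with the "Straight" method, `BRIGHTNESS[0] += BRIGHTNESS[1] * bri` multiplies the value by `(1 + bri)` on every step (pure-Python sketch with example values):

```python
steps = 20
bri_user = 2.0
bri_step = bri_user / steps    # per-step delta, as computed earlier in the script

value = 1.0                    # one latent value in the brightness channel
for _ in range(steps):
    value += value * bri_step  # "Straight": source and target are the same value

# equivalent closed form: (1 + bri_user / steps) ** steps
print(abs(value - (1 + bri_step) ** steps) < 1e-9)  # True
```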