mirror of https://github.com/bmaltais/kohya_ss
308 lines
15 KiB
Markdown
308 lines
15 KiB
Markdown
# Kohya's GUI
|
||
|
||
[](https://github.com/bmaltais/kohya_ss/stargazers)
|
||
[](https://github.com/bmaltais/kohya_ss/network/members)
|
||
[](LICENSE.md)
|
||
[](https://github.com/bmaltais/kohya_ss/issues)
|
||
|
||
This is a GUI and CLI for training diffusion models.
|
||
|
||
This project provides a user-friendly Gradio-based Graphical User Interface (GUI) for [Kohya's Stable Diffusion training scripts](https://github.com/kohya-ss/sd-scripts).
|
||
Stable Diffusion training empowers users to customize image generation models by fine-tuning existing models, creating unique artistic styles,
|
||
and training specialized models like LoRA (Low-Rank Adaptation).
|
||
|
||
Key features of this GUI include:
|
||
* Easy-to-use interface for setting a wide range of training parameters.
|
||
* Automatic generation of the command-line interface (CLI) commands required to run the training scripts.
|
||
* Support for various training methods, including LoRA, Dreambooth, fine-tuning, and SDXL training.
|
||
|
||
Support for Linux and macOS is also available. While Linux support is actively maintained through community contributions, macOS compatibility may vary.
|
||
|
||
## Table of Contents
|
||
|
||
- [Installation Options](#installation-options)
|
||
- [Local Installation Overview](#local-installation-overview)
|
||
- [`uv` vs `pip` – What's the Difference?](#uv-vs-pip--whats-the-difference)
|
||
- [Cloud Installation Overview](#cloud-installation-overview)
|
||
- [Colab](#-colab)
|
||
- [Runpod, Novita, Docker](#runpod-novita-docker)
|
||
- [Custom Path Defaults](#custom-path-defaults)
|
||
- [LoRA](#lora)
|
||
- [Sample image generation during training](#sample-image-generation-during-training)
|
||
- [Troubleshooting](#troubleshooting)
|
||
- [Page File Limit](#page-file-limit)
|
||
- [No module called tkinter](#no-module-called-tkinter)
|
||
- [LORA Training on TESLA V100 - GPU Utilization Issue](#lora-training-on-tesla-v100---gpu-utilization-issue)
|
||
- [SDXL training](#sdxl-training)
|
||
- [Masked loss](#masked-loss)
|
||
- [Guides](#guides)
|
||
- [Using Accelerate Lora Tab to Select GPU ID](#using-accelerate-lora-tab-to-select-gpu-id)
|
||
- [Starting Accelerate in GUI](#starting-accelerate-in-gui)
|
||
- [Running Multiple Instances (linux)](#running-multiple-instances-linux)
|
||
- [Monitoring Processes](#monitoring-processes)
|
||
- [Interesting Forks](#interesting-forks)
|
||
- [Contributing](#contributing)
|
||
- [License](#license)
|
||
- [Change History](#change-history)
|
||
- [v25.0.3](#v2503)
|
||
- [v25.0.2](#v2502)
|
||
- [v25.0.1](#v2501)
|
||
- [v25.0.0](#v2500)
|
||
|
||
|
||
## Installation Options
|
||
|
||
You can run `kohya_ss` either **locally on your machine** or via **cloud-based solutions** like Colab or Runpod.
|
||
|
||
- If you have a GPU-equipped PC and want full control: install it locally using `uv` or `pip`.
|
||
- If your system doesn’t meet requirements or you prefer a browser-based setup: use Colab or a paid GPU provider like Runpod or Novita.
|
||
- If you are a developer or DevOps user, Docker is also supported.
|
||
|
||
---
|
||
|
||
### Local Installation Overview
|
||
|
||
You can install `kohya_ss` locally using either the `uv` or `pip` method. Choose one depending on your platform and preferences:
|
||
|
||
| Platform | Recommended Method | Instructions |
|
||
|--------------|----------------|---------------------------------------------|
|
||
| Linux | `uv` | [uv_linux.md](./docs/Installation/uv_linux.md) |
|
||
| Linux or Mac | `pip` | [pip_linux.md](./docs/Installation/pip_linux.md) |
|
||
| Windows | `uv` | [uv_windows.md](./docs/Installation/uv_windows.md) |
|
||
| Windows | `pip` | [pip_windows.md](./docs/Installation/pip_windows.md) |
|
||
|
||
#### `uv` vs `pip` – What's the Difference?
|
||
|
||
- `uv` is faster and isolates dependencies more cleanly, ideal if you want minimal setup hassle.
|
||
- `pip` is more traditional, easier to debug if issues arise, and works better with some IDEs or Python tooling.
|
||
- If unsure: try `uv`. If it doesn't work for you, fall back to `pip`.
|
||
|
||
### Cloud Installation Overview
|
||
|
||
#### 🦒 Colab
|
||
|
||
For browser-based training without local setup, use this Colab notebook:
|
||
<https://github.com/camenduru/kohya_ss-colab>
|
||
|
||
- No installation required
|
||
- Free to use (GPU availability may vary)
|
||
- Maintained by **camenduru**, not the original author
|
||
|
||
| Colab | Info |
|
||
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------ |
|
||
| [](https://colab.research.google.com/github/camenduru/kohya_ss-colab/blob/main/kohya_ss_colab.ipynb) | kohya_ss_gui_colab |
|
||
|
||
> 💡 If you encounter issues, please report them on camenduru’s repo.
|
||
|
||
**Special thanks**
|
||
I would like to express my gratitude to camenduru for their valuable contribution.
|
||
|
||
#### Runpod, Novita, Docker
|
||
|
||
These options are for users running training on hosted GPU infrastructure or containers.
|
||
|
||
- **[Runpod setup](docs/runpod_setup.md)** – Ready-made GPU background training via templates.
|
||
- **[Novita setup](docs/novita_setup.md)** – Similar to Runpod, but integrated into the Novita UI.
|
||
- **[Docker setup](docs/docker.md)** – For developers/sysadmins using containerized environments.
|
||
|
||
|
||
## Custom Path Defaults with `config.toml`
|
||
|
||
The GUI supports a configuration file named `config.toml` that allows you to set default paths for many of the input fields. This is useful for avoiding repetitive manual selection of directories every time you start the GUI.
|
||
|
||
**Purpose of `config.toml`:**
|
||
|
||
* Pre-fill default directory paths for pretrained models, datasets, output folders, LoRA models, etc.
|
||
* Streamline your workflow by having the GUI remember your preferred locations.
|
||
|
||
**How to Use and Customize:**
|
||
|
||
1. **Create your configuration file:**
|
||
* In the root directory of the `kohya_ss` repository, you'll find a file named `config example.toml`.
|
||
* Copy this file and rename the copy to `config.toml`. This `config.toml` file will be automatically loaded when the GUI starts.
|
||
2. **Edit `config.toml`:**
|
||
* Open `config.toml` with a text editor.
|
||
* The file uses TOML (Tom's Obvious, Minimal Language) format, which consists of `key = "value"` pairs.
|
||
* Modify the paths for the keys according to your local directory structure.
|
||
* **Important:**
|
||
* Use absolute paths (e.g., `C:/Users/YourName/StableDiffusion/Models` or `/home/yourname/sd-models`).
|
||
* Alternatively, you can use paths relative to the `kohya_ss` root directory.
|
||
* Ensure you use forward slashes (`/`) for paths, even on Windows, as this is generally more compatible with TOML and Python.
|
||
* Make sure the specified directories exist on your system.
|
||
|
||
**Structure of `config.toml`:**
|
||
|
||
The `config.toml` file can have several sections, typically corresponding to different training modes or general settings. Common keys you might want to set include:
|
||
|
||
* `model_dir`: Default directory for loading base Stable Diffusion models.
|
||
* `lora_model_dir`: Default directory for saving and loading LoRA models.
|
||
* `output_dir`: Default base directory for training outputs (images, logs, model checkpoints).
|
||
* `dataset_dir`: A general default if you store all your datasets in one place.
|
||
* Specific input paths for different training tabs like Dreambooth, Finetune, LoRA, etc. (e.g., `db_model_dir`, `ft_source_model_name_or_path`).
|
||
|
||
**Example Configurations:**
|
||
|
||
Here's an example snippet of what your `config.toml` might look like:
|
||
|
||
```toml
|
||
# General settings
|
||
model_dir = "C:/ai_stuff/stable-diffusion-webui/models/Stable-diffusion"
|
||
lora_model_dir = "C:/ai_stuff/stable-diffusion-webui/models/Lora"
|
||
vae_dir = "C:/ai_stuff/stable-diffusion-webui/models/VAE"
|
||
output_dir = "C:/ai_stuff/kohya_ss_outputs"
|
||
logging_dir = "C:/ai_stuff/kohya_ss_outputs/logs"
|
||
|
||
# Dreambooth specific paths
|
||
db_model_dir = "C:/ai_stuff/stable-diffusion-webui/models/Stable-diffusion"
|
||
db_reg_image_dir = "C:/ai_stuff/datasets/dreambooth_regularization_images"
|
||
# Add other db_... paths as needed
|
||
|
||
# Finetune specific paths
|
||
ft_model_dir = "C:/ai_stuff/stable-diffusion-webui/models/Stable-diffusion"
|
||
# Add other ft_... paths as needed
|
||
|
||
# LoRA / LoCon specific paths
|
||
lc_model_dir = "C:/ai_stuff/stable-diffusion-webui/models/Stable-diffusion" # Base model for LoRA training
|
||
lc_output_dir = "C:/ai_stuff/kohya_ss_outputs/lora"
|
||
lc_dataset_dir = "C:/ai_stuff/datasets/my_lora_project"
|
||
# Add other lc_... paths as needed
|
||
|
||
# You can find a comprehensive list of all available keys in the `config example.toml` file.
|
||
# Refer to it to customize paths for all supported options in the GUI.
|
||
```
|
||
|
||
**Using a Custom Config File Path:**
|
||
|
||
If you prefer to name your configuration file differently or store it in another location, you can specify its path using the `--config` command-line argument when launching the GUI:
|
||
|
||
* On Windows: `gui.bat --config D:/my_configs/kohya_settings.toml`
|
||
* On Linux/macOS: `./gui.sh --config /home/user/my_configs/kohya_settings.toml`
|
||
|
||
By effectively using `config.toml`, you can significantly speed up your training setup process. Always refer to the `config example.toml` for the most up-to-date list of configurable paths.
|
||
|
||
## LoRA
|
||
|
||
To train a LoRA, you can currently use the `train_network.py` code. You can create a LoRA network by using the all-in-one GUI.
|
||
|
||
Once you have created the LoRA network, you can generate images using auto1111 by installing [this extension](https://github.com/kohya-ss/sd-webui-additional-networks).
|
||
|
||
For more detailed information on LoRA training options and advanced configurations, please refer to our LoRA documentation:
|
||
- [LoRA Training Guide](docs/LoRA/top_level.md)
|
||
- [LoRA Training Options](docs/LoRA/options.md)
|
||
|
||
## Sample image generation during training
|
||
|
||
A prompt file might look like this, for example:
|
||
|
||
```txt
|
||
# prompt 1
|
||
masterpiece, best quality, (1girl), in white shirts, upper body, looking at viewer, simple background --n low quality, worst quality, bad anatomy, bad composition, poor, low effort --w 768 --h 768 --d 1 --l 7.5 --s 28
|
||
|
||
# prompt 2
|
||
masterpiece, best quality, 1boy, in business suit, standing at street, looking back --n (low quality, worst quality), bad anatomy, bad composition, poor, low effort --w 576 --h 832 --d 2 --l 5.5 --s 40
|
||
```
|
||
|
||
Lines beginning with `#` are comments. You can specify options for the generated image with options like `--n` after the prompt. The following options can be used:
|
||
|
||
- `--n`: Negative prompt up to the next option.
|
||
- `--w`: Specifies the width of the generated image.
|
||
- `--h`: Specifies the height of the generated image.
|
||
- `--d`: Specifies the seed of the generated image.
|
||
- `--l`: Specifies the CFG scale of the generated image.
|
||
- `--s`: Specifies the number of steps in the generation.
|
||
|
||
The prompt weighting such as `( )` and `[ ]` is working.
|
||
|
||
## Troubleshooting
|
||
|
||
If you encounter any issues, refer to the troubleshooting steps below.
|
||
|
||
### Page File Limit
|
||
|
||
If you encounter an X error related to the page file, you may need to increase the page file size limit in Windows.
|
||
|
||
### No module called tkinter
|
||
|
||
If you encounter an error indicating that the module `tkinter` is not found, try reinstalling Python 3.10 on your system.
|
||
|
||
### LORA Training on TESLA V100 - GPU Utilization Issue
|
||
|
||
See [Troubleshooting LORA Training on TESLA V100](docs/troubleshooting_tesla_v100.md) for details.
|
||
|
||
## SDXL training
|
||
|
||
For detailed guidance on SDXL training, please refer to the [official sd-scripts documentation](https://github.com/kohya-ss/sd-scripts/blob/main/README.md#sdxl-training) and relevant sections in our [LoRA Training Guide](docs/LoRA/top_level.md).
|
||
|
||
## Masked loss
|
||
|
||
The masked loss is supported in each training script. To enable the masked loss, specify the `--masked_loss` option.
|
||
|
||
> [!WARNING]
|
||
> The feature is not fully tested, so there may be bugs. If you find any issues, please open an Issue.
|
||
|
||
ControlNet dataset is used to specify the mask. The mask images should be the RGB images. The pixel value 255 in R channel is treated as the mask (the loss is calculated only for the pixels with the mask), and 0 is treated as the non-mask. The pixel values 0-255 are converted to 0-1 (i.e., the pixel value 128 is treated as the half weight of the loss). See details for the dataset specification in the [LLLite documentation](./docs/train_lllite_README.md#preparing-the-dataset).
|
||
|
||
## Guides
|
||
|
||
The following are guides extracted from issues discussions
|
||
|
||
### Using Accelerate Lora Tab to Select GPU ID
|
||
|
||
#### Starting Accelerate in GUI
|
||
|
||
- Open the kohya GUI on your desired port.
|
||
- Open the `Accelerate launch` tab
|
||
- Ensure the Multi-GPU checkbox is unchecked.
|
||
- Set GPU IDs to the desired GPU (like 1).
|
||
|
||
#### Running Multiple Instances (linux)
|
||
|
||
- For tracking multiple processes, use separate kohya GUI instances on different ports (e.g., 7860, 7861).
|
||
- Start instances using `nohup ./gui.sh --listen 0.0.0.0 --server_port <port> --headless > log.log 2>&1 &`.
|
||
|
||
#### Monitoring Processes
|
||
|
||
- Open each GUI in a separate browser tab.
|
||
- For terminal access, use SSH and tools like `tmux` or `screen`.
|
||
|
||
For more details, visit the [GitHub issue](https://github.com/bmaltais/kohya_ss/issues/2577).
|
||
|
||
## Interesting Forks
|
||
|
||
To finetune HunyuanDiT models or create LoRAs, visit this [fork](https://github.com/Tencent/HunyuanDiT/tree/main/kohya_ss-hydit)
|
||
|
||
## Contributing
|
||
|
||
Contributions are welcome! If you'd like to contribute to this project, please consider the following:
|
||
- For bug reports or feature requests, please open an issue on the [GitHub Issues page](https://github.com/bmaltais/kohya_ss/issues).
|
||
- If you'd like to submit code changes, please open a pull request. Ensure your changes are well-tested and follow the existing code style.
|
||
- For security-related concerns, please refer to our `SECURITY.md` file.
|
||
|
||
## License
|
||
|
||
This project is licensed under the Apache License 2.0. See the [LICENSE.md](LICENSE.md) file for details.
|
||
|
||
## Change History
|
||
|
||
### v25.0.3
|
||
|
||
- Upgrade Gradio, diffusers and huggingface-hub to latest release to fix issue with ASGI.
|
||
- Add a new method to setup and run the GUI. You will find two new script for both Windows (gui-uv.bat) and Linux (gui-uv.sh). With those scripts there is no need to run setup.bat or setup.sh anymore.
|
||
|
||
### v25.0.2
|
||
|
||
- Force gradio to 5.14.0 or greater so it is updated.
|
||
|
||
### v25.0.1
|
||
|
||
- Fix issue with requirements version causing huggingface download issues
|
||
|
||
### v25.0.0
|
||
|
||
- Major update: Introduced support for flux.1 and sd3, moving the GUI to align with more recent script functionalities.
|
||
- Users preferring the pre-flux.1/sd3 version can check out tag `v24.1.7`.
|
||
```shell
|
||
git checkout v24.1.7
|
||
```
|
||
- For details on new flux.1 and sd3 parameters, refer to the [sd-scripts README](https://github.com/kohya-ss/sd-scripts/blob/sd3/README.md).
|