mirror of https://github.com/vladmandic/automatic
Models
vladmandic edited this page 2026-04-18 19:16:35 +02:00
List of popular text-to-image generative models with their respective parameters and architecture overview
| Publisher | Model | Version | Size | Diffusion Architecture | Model Params | Text Encoder(s) | TE Params | Auto Encoder | License | Release date |
|---|---|---|---|---|---|---|---|---|---|---|
| StabilityAI | Stable Diffusion | 1.5 | 2.28GB | UNet | 0.86B | CLiP ViT-L | 0.12B | VAE | OpenRAIL | 2022 October |
| StabilityAI | Stable Diffusion | 2.1 | 2.58GB | UNet | 0.86B | CLiP ViT-H | 0.34B | VAE | OpenRAIL | 2022 December |
| StabilityAI | Stable Diffusion | XL | 6.94GB | UNet | 2.56B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | SDXL | OpenRAIL | 2023 July |
| StabilityAI | Stable Diffusion | 3.0 Medium | 15.14GB | MMDiT | 2.0B | CLiP ViT-L + ViT-G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | Proprietary | 2024 June |
| StabilityAI | Stable Diffusion | 3.5 Medium | 15.89GB | MMDiT | 2.25B | CLiP ViT-L + ViT-G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | Proprietary | 2024 October |
| StabilityAI | Stable Diffusion | 3.5 Large | 26.98GB | MMDiT | 8.05B | CLiP ViT-L + ViT-G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | Proprietary | 2024 October |
| StabilityAI | Stable Cascade | Medium | 11.82GB | Multi-stage UNet | 1.56B + 3.6B | CLiP ViT-G | 0.69B | 42x VQE | Proprietary | 2024 February |
| StabilityAI | Stable Cascade | Lite | 4.97GB | Multi-stage UNet | 0.7B + 1.0B | CLiP ViT-G | 0.69B | 42x VQE | Proprietary | 2024 February |
| Black Forest Labs | Flux | 1 Schnell | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | Flux | Apache 2.0 | 2024 August |
| Black Forest Labs | Flux | 1 Dev | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | Flux | Proprietary | 2024 August |
| Black Forest Labs | Flux | 1 Kontext-Dev | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | Flux | Proprietary | 2025 June |
| Black Forest Labs | Flux | 1 Krea-Dev | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | Flux | Proprietary | 2025 July |
| Black Forest Labs | Flux | 2 Dev | 114GB | MMDiT | 32B | Mistral Small 3.2-2506 | 24B | Flux 2 | Proprietary | 2025 November |
| Black Forest Labs | Flux | 2 Klein 9B | 34.7GB | MMDiT | 9B | Qwen3-8B | 8B | Flux 2 | Proprietary | 2026 January |
| Black Forest Labs | Flux | 2 Klein Base 9B | 34.7GB | MMDiT | 9B | Qwen3-8B | 8B | Flux 2 | Proprietary | 2026 January |
| Black Forest Labs | Flux | 2 Klein 4B | 16GB | MMDiT | 4B | Qwen3-4B | 4B | Flux 2 | Apache 2.0 | 2026 January |
| Black Forest Labs | Flux | 2 Klein Base 4B | 16GB | MMDiT | 4B | Qwen3-4B | 4B | Flux 2 | Apache 2.0 | 2026 January |
| lodestones | Chroma | 48 | 26.84GB | MMDiT | 8.9B | T5-XXL | 4.76B | Flux | Apache 2.0 | 2025 July |
| lodestones | Chroma | 1 HD | 26.84GB | MMDiT | 8.9B | T5-XXL | 4.76B | Flux | Apache 2.0 | 2025 July |
| lodestones | Chroma | 1 Base | 26.84GB | MMDiT | 8.9B | T5-XXL | 4.76B | Flux | Apache 2.0 | 2025 July |
| lodestones | Chroma | 1 Flash | 26.84GB | MMDiT | 8.9B | T5-XXL | 4.76B | Flux | Apache 2.0 | 2025 July |
| Ostris | Flex | 1 Alpha | 25.65GB | MMDiT | 8.16B | CLiP ViT-L + T5-XXL | 0.12B + 2.95B | Flux | Apache 2.0 | 2025 January |
| Ostris | Flex | 2 Preview | 25.65GB | MMDiT | 8.16B | CLiP ViT-L + T5-XXL | 0.12B + 2.95B | Flux | Apache 2.0 | 2025 April |
| FreePik | F-Lite | | 19.81GB | DiT | 9.8B | T5-XXL | 2.95B | Flux | OpenRAIL | 2025 May |
| FreePik | F-Lite | Texture | 19.81GB | DiT | 9.8B | T5-XXL | 2.95B | Flux | OpenRAIL | 2025 May |
| FreePik | F-Lite | 7B | 13.89GB | DiT | 6.9B | T5-XXL | 2.95B | Flux | OpenRAIL | 2025 May |
| NVLabs | Sana | 1.5 1.6B | 9.49GB | DiT | 1.60B | Gemma2 | 2.61B | DC-AE | Proprietary | 2025 March |
| NVLabs | Sana | 1.5 4.8B | 15.58GB | DiT | 4.72B | Gemma2 | 2.61B | DC-AE | Proprietary | 2025 March |
| NVLabs | Sana | 1.0 1600M | 12.63GB | DiT | 1.60B | Gemma2 | 2.61B | DC-AE | Proprietary | 2024 November |
| NVLabs | Sana | 1.0 600M | 7.51GB | DiT | 0.59B | Gemma2 | 2.61B | DC-AE | Proprietary | 2024 November |
| nVidia | Cosmos-Predict2 T2I | 2B | 13.32GB | DiT | 1.96B | T5-XXL | 4.86B | Wan | Proprietary | 2025 June |
| nVidia | Cosmos-Predict2 T2I | 14B | 37.36GB | DiT | 14.26B | T5-XXL | 4.86B | Wan | Proprietary | 2025 June |
| FAL | AuraFlow | 0.2 | 31.90GB | MMDiT | 6.8B | UMT5 | 12.1B | SDXL | Apache 2.0 | 2024 July |
| FAL | AuraFlow | 0.3 | 31.90GB | MMDiT | 6.8B | UMT5 | 12.1B | SDXL | Apache 2.0 | 2024 August |
| AlphaVLLM | Lumina | Next SFT | 8.67GB | NextDiT | 1.7B | Gemma | 2.5B | SDXL | Apache 2.0 | 2024 June |
| AlphaVLLM | Lumina | 2 | 20.75GB | NextDiT | 2.61B | Gemma-2 | 2.61B | Flux | Apache 2.0 | 2025 January |
| PixArt | Alpha | XL 2 | 21.3GB | DiT | 0.61B | T5-XXL | 4.76B | VAE | OpenRAIL | 2023 November |
| PixArt | Sigma | XL 2 | 21.3GB | DiT | 0.61B | T5-XXL | 4.76B | VAE | OpenRAIL | 2024 April |
| Segmind | SSD-1B | | 8.72GB | UNet | 1.33B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | SDXL | Apache 2.0 | 2023 October |
| Segmind | Vega | | 6.43GB | UNet | 0.75B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | SDXL | Apache 2.0 | 2023 November |
| Segmind | Tiny | | 1.03GB | UNet | 0.32B | CLiP ViT-L | 0.12B | SDXL | OpenRAIL | 2023 July |
| Thu-ML | UniDiffuser | v1 | 5.37GB | U-ViT | 0.95B | CLiP ViT-L + CLiP ViT-B | 0.12B + 0.16B | VAE | AGPL 3 | 2023 May |
| Kwai | Kolors | | 17.40GB | UNet | 2.58B | ChatGLM | 6.24B | VAE | Apache 2.0 | 2024 July |
| PlaygroundAI | Playground | 1.0 | 4.95GB | UNet | 0.86B | CLiP ViT-L | 0.12B | VAE | ? | 2023 December |
| PlaygroundAI | Playground | 2.x | 13.35GB | UNet | 2.56B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | VAE | Proprietary | 2023 December |
| Tencent | HunyuanDiT | 1.2 | 14.09GB | DiT | 1.5B | BERT + T5-XL | 3.52B + 1.67B | SDXL | Proprietary | 2024 May |
| Tencent | Hunyuan Image | 2.1 | 34.9GB | MMDiT | 17.4B | Qwen2.5-VL-7B + T5 | 8.29B + 0.22B | Hunyuan Image | Proprietary | 2025 August |
| Warp AI | Wuerstchen | | 12.16GB | Multi-stage UNet | 1.0B + 1.05B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | 42x VQE | MIT | 2023 August |
| Kandinsky | Kandinsky | 2.1 | 5.15GB | UNet | 1.25B | CLiP ViT-G | 0.69B | SBER MOVQGAN | Apache 2.0 | 2023 April |
| Kandinsky | Kandinsky | 2.2 | 5.15GB | UNet | 1.25B | CLiP ViT-G | 0.69B | SBER MOVQGAN | Apache 2.0 | 2023 July |
| Kandinsky | Kandinsky | 3.0 | 27.72GB | UNet | 3.05B | T5-XXXL | 8.72B | SBER MOVQGAN | Apache 2.0 | 2023 November |
| Kandinsky | Kandinsky | 5.0 T2I Lite | 33.20GB | DiT | 6B | Qwen2.5-VL-7B + CLiP ViT-L | 8.29B + 0.12B | Flux | MIT | 2025 November |
| Kandinsky | Kandinsky | 5.0 I2I Lite | 33.20GB | DiT | 6B | Qwen2.5-VL-7B + CLiP ViT-L | 8.29B + 0.12B | Flux | MIT | 2025 November |
| Z.ai | CogView | 3 Plus | 24.96GB | DiT | 2.85B | T5-XXL | 4.76B | CogView | Apache 2.0 | 2024 October |
| Z.ai | CogView | 4 | 30.39GB | DiT | 6.37B | GLM-4 | 9.40B | CogView | Apache 2.0 | 2025 March |
| Z.ai | GLM Image | | 35.8GB | Autoregressive transformer + DiT | 7B | GLM-4-9B-0414 | 9B | GLM Image | MIT | 2026 January |
| IDKiro | SDXS | | 2.05GB | UNet | 0.32B | CLiP ViT-L | 0.12B | VAE | OpenRAIL | 2024 March |
| Open-MUSE | aMUSEd | 256 | 3.41GB | ViT | 0.60B | CLiP ViT-L | 0.12B | VQ | OpenRAIL | 2023 December |
| Koala | Koala | 700M | 6.58GB | UNet | 0.78B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | VAE | Proprietary | 2024 January |
| Salesforce | BLIP-Diffusion | | 7.23GB | UNet | 0.86B | CLiP ViT-L + BLiP-2 | 0.12B + 0.49B | VAE | BSD 3 | 2023 July |
| DeepFloyd | IF | M | 12.79GB | Multi-stage UNet | 0.37B + 0.46B | T5-XXL | 4.76B | Pixel | Proprietary | 2023 April |
| DeepFloyd | IF | L | 15.48GB | Multi-stage UNet | 0.61B + 0.93B | T5-XXL | 4.76B | Pixel | Proprietary | 2023 April |
| MeissonFlow | Meissonic | | 3.64GB | DiT | 1.18B | CLiP ViT-H | 0.35B | VQ | Apache 2.0 | 2024 October |
| VectorSpaceLab | OmniGen | v1 | 15.47GB | Transformer | 3.76B | Phi-3 | 0 | VAE | MIT | 2024 October |
| VectorSpaceLab | OmniGen | v2 | 30.50GB | Transformer | 3.97B | Qwen-VL-2.5 | 3.75B | Flux | Apache 2.0 | 2025 June |
| HiDream-AI | HiDream | I1 Fast/Dev/Full | 42.71GB + 15.69GB | MMDiT | 17.10B | CLiP ViT-L + ViT-G + T5-XXL + Llama-3.1-8B | 0.12B + 0.69B + 2.95B + 4.54B | Flux | MIT | 2025 April |
| Alibaba/Wan-AI | WAN | 2.1 1.3B | 27.72GB | DiT | 1.42xB | UMT5-XXL | 5.68B | Wan | Apache 2.0 | 2025 February |
| Alibaba/Wan-AI | WAN | 2.1 14B | 78.52GB | DiT | 14.28B | UMT5-XXL | 5.68B | Wan | Apache 2.0 | 2025 February |
| Bria | Bria | 3.2 | 18.66GB | MMDiT | 3.78B | T5-XXL | 4.76B | 16ch VAE | Proprietary | 2025 June |
| Alibaba/Qwen | Qwen-Image | | 56.10GB | MMDiT | 20.43B | Qwen2.5-VL-7B | 8.29B | Qwen Image | Apache 2.0 | 2025 August |
| Alibaba/Qwen | Qwen-Image | 2512 | 56.10GB | MMDiT | 20.43B | Qwen2.5-VL-7B | 8.29B | Qwen Image | Apache 2.0 | 2025 December |
| Alibaba/Qwen | Qwen-Image-Edit | | 56.10GB | MMDiT | 20.43B | Qwen2.5-VL-7B | 8.29B | Qwen Image | Apache 2.0 | 2025 August |
| Alibaba/Qwen | Qwen-Image-Edit | 2509 | 56.10GB | MMDiT | 20.43B | Qwen2.5-VL-7B | 8.29B | Qwen Image | Apache 2.0 | 2025 September |
| Alibaba/Qwen | Qwen-Image-Edit | 2511 | 56.10GB | MMDiT | 20.43B | Qwen2.5-VL-7B | 8.29B | Qwen Image | Apache 2.0 | 2025 December |
| Alibaba/Qwen | Qwen-Image-Layered | | 56.10GB | MMDiT | 20.43B | Qwen2.5-VL-7B | 8.29B | Qwen Image Layered | Apache 2.0 | 2025 December |
| Photoroom | PRX | 1024 T2I beta | 15.5GB | MMDiT | 1.3B | T5-Gemma-2B-2B-UL2 | 2B | Flux | Apache 2.0 | 2025 November |
| Alibaba/Tongyi-MAI | Z-Image | Turbo | 32.9GB | S3-DiT | 6.1B | Qwen3-4B | 4B | Flux | Apache 2.0 | 2025 November |
| Meituan | Longcat Image | | 29.3GB | MMDiT | 6B | Qwen2.5-VL-7B | 8.29B | Flux | Apache 2.0 | 2025 December |
| Meituan | Longcat Image | Edit | 29.3GB | MMDiT | 6B | Qwen2.5-VL-7B | 8.29B | Flux | Apache 2.0 | 2025 December |
| AIDC-AI | Ovis Image | 7B | 23.38GB | MMDiT | 7B | Ovis2.5 2B | 2.5B | Flux | Apache 2.0 | 2025 November |
Notes
- Created using SD.Next built-in model analyzer
- Number of parameters is proportional to model complexity and ability to learn
  Quality of generated images is also influenced by training data and duration of training
- Size refers to original model variant in 16bit precision where available
  Quantized variations may be smaller
- Distilled variants are not included as typical goal-distilling does not change underlying model params
  e.g. Turbo/LCM/Hyper/Lightning/etc. or even Dev/Schnell
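
The relation between parameter counts and size at 16bit precision can be sketched with simple arithmetic: each parameter takes 2 bytes at 16 bits, so the weights take roughly `params × 2` bytes. The snippet below is only an illustration, not the method used by the SD.Next model analyzer. The helper name `estimate_size_gb` is hypothetical, and it assumes 1GB = 10^9 bytes and counts only the diffusion model and text encoder params listed in the table, so the auto encoder and any other packaged components are left out and the result will likely be lower than the listed size.

```python
def estimate_size_gb(params_billions, bits=16):
    """Approximate weight storage in GB (assuming 1GB = 10^9 bytes).

    params_billions: parameter counts in billions, e.g. model + text encoder(s)
    bits: storage precision per parameter (16 for fp16/bf16, 8 or 4 when quantized)
    """
    total_params = sum(params_billions) * 1e9
    total_bytes = total_params * bits / 8
    return total_bytes / 1e9


# Stable Diffusion 1.5 row: 0.86B UNet + 0.12B CLiP ViT-L
sd15_fp16 = estimate_size_gb([0.86, 0.12])          # 16bit weights
sd15_int8 = estimate_size_gb([0.86, 0.12], bits=8)  # hypothetical 8bit quantization

print(f"SD 1.5 at 16bit: ~{sd15_fp16:.2f}GB")
print(f"SD 1.5 at 8bit:  ~{sd15_int8:.2f}GB")
```

For the Stable Diffusion 1.5 row this gives about 1.96GB against the listed 2.28GB, and halving the precision halves the estimate, which is why quantized variations may be smaller.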