update torch-cuda and create agents.md

Signed-off-by: vladmandic <mandic00@live.com>
pull/4708/head
vladmandic 2026-03-25 08:23:55 +01:00
parent 400d284711
commit 7a47b6bbdb
5 changed files with 94 additions and 16 deletions


@@ -1,28 +1,41 @@
# Project Guidelines
# SD.Next: AGENTS.md Project Guidelines
SD.Next is a complex codebase with specific patterns and conventions.
General app structure is:
- Python backend server
Uses Torch for model inference, FastAPI for API routes and Gradio for creation of UI components.
- JavaScript/CSS frontend
## Tools
- `pyproject.toml` for Python configuration, including linting and type checking settings.
- `venv` for Python environment management, activated with `source venv/bin/activate`.
- `venv` for Python environment management, activated with `source venv/bin/activate` (Linux) or `venv\Scripts\activate` (Windows).
venv MUST be activated before running any Python commands or scripts to ensure correct dependencies and environment variables.
- `python` 3.10+.
- `pyproject.toml` for Python configuration, including linting and type checking settings.
- `eslint` configured for both core and UI code.
- `pnpm` for managing JavaScript dependencies and scripts, with key commands defined in `package.json`.
- `ruff` and `pylint` for Python linting, with configurations in `pyproject.toml` and executed via `pnpm ruff` and `pnpm pylint`.
- `pre-commit` hooks which also check line-endings and other formatting issues, configured in `.pre-commit-config.yaml`.
## Project Structure
- Entry/startup flow: `webui.sh` -> `launch.py` -> `webui.py` -> modules under `modules/`.
- Install: `installer.py` takes care of installing dependencies and setting up the environment.
- Core runtime state is centralized in `modules/shared.py` (shared.opts, model state, backend/device state).
- API/server routes are under `modules/api/`.
- UI codebase is split between base JS in `javascript/` and actual UI in `extensions-builtin/sdnext-modernui/`.
- Model and pipeline logic is split between `modules/sd_*` and `pipelines/`.
- Additional plug-ins live in `scripts/` and are used only when specified.
- Extensions live in `extensions-builtin/` and `extensions/` and are loaded dynamically.
- Tests and CLI scripts are under `test/` and `cli/`, with some API smoke checks in `test/full-test.sh`.
## Code Style
- Prefer existing project patterns over strict generic style rules;
this codebase intentionally allows patterns that default linters often flag, such as long lines.
## Build And Test
- Activate environment: `source venv/bin/activate` (always ensure this is active when working with Python code).
- Test startup: `python launch.py --test`
- Full startup: `python launch.py`
@@ -31,14 +44,17 @@
- JS checks: `pnpm eslint` and `pnpm eslint-ui`
## Conventions
- Keep PR-ready changes targeted to `dev` branch conventions from `CONTRIBUTING`.
- Keep PR-ready changes targeted to `dev` branch.
- Use conventions from `CONTRIBUTING`.
- Do not include unrelated edits or submodule changes when preparing contributions.
- Use existing CLI/API tool patterns in `cli/` and `test/` when adding automation scripts.
- Respect environment-driven behavior (`SD_*` flags and options) instead of hardcoding platform/model assumptions.
- For startup/init edits, preserve error handling and partial-failure tolerance in parallel scans and extension loading.
## Pitfalls
- Initialization order matters: startup paths in `launch.py` and `webui.py` are sensitive to import/load timing.
- Shared mutable global state can create subtle regressions; prefer narrow, explicit changes.
- Device/backend-specific code paths (CUDA/ROCm/IPEX/DirectML/OpenVINO) should not assume one platform.
- Extension loading is dynamic; failures may appear only when specific extensions or models are present.
- Device/backend-specific code paths (**CUDA/ROCm/IPEX/DirectML/OpenVINO**) should not assume one platform.
- Script and extension loading is dynamic; failures may appear only when specific extensions or models are present.

60
AGENTS.md Normal file

@@ -0,0 +1,60 @@
# SD.Next: AGENTS.md Project Guidelines
SD.Next is a complex codebase with specific patterns and conventions.
General app structure is:
- Python backend server
Uses Torch for model inference, FastAPI for API routes and Gradio for creation of UI components.
- JavaScript/CSS frontend
## Tools
- `venv` for Python environment management, activated with `source venv/bin/activate` (Linux) or `venv\Scripts\activate` (Windows).
venv MUST be activated before running any Python commands or scripts to ensure correct dependencies and environment variables.
- `python` 3.10+.
- `pyproject.toml` for Python configuration, including linting and type checking settings.
- `eslint` configured for both core and UI code.
- `pnpm` for managing JavaScript dependencies and scripts, with key commands defined in `package.json`.
- `ruff` and `pylint` for Python linting, with configurations in `pyproject.toml` and executed via `pnpm ruff` and `pnpm pylint`.
- `pre-commit` hooks which also check line-endings and other formatting issues, configured in `.pre-commit-config.yaml`.
## Project Structure
- Entry/startup flow: `webui.sh` -> `launch.py` -> `webui.py` -> modules under `modules/`.
- Install: `installer.py` takes care of installing dependencies and setting up the environment.
- Core runtime state is centralized in `modules/shared.py` (shared.opts, model state, backend/device state).
- API/server routes are under `modules/api/`.
- UI codebase is split between base JS in `javascript/` and actual UI in `extensions-builtin/sdnext-modernui/`.
- Model and pipeline logic is split between `modules/sd_*` and `pipelines/`.
- Additional plug-ins live in `scripts/` and are used only when specified.
- Extensions live in `extensions-builtin/` and `extensions/` and are loaded dynamically.
- Tests and CLI scripts are under `test/` and `cli/`, with some API smoke checks in `test/full-test.sh`.
## Code Style
- Prefer existing project patterns over strict generic style rules;
this codebase intentionally allows patterns that default linters often flag, such as long lines.
## Build And Test
- Activate environment: `source venv/bin/activate` (always ensure this is active when working with Python code).
- Test startup: `python launch.py --test`
- Full startup: `python launch.py`
- Full lint sequence: `pnpm lint`
- Python checks individually: `pnpm ruff`, `pnpm pylint`
- JS checks: `pnpm eslint` and `pnpm eslint-ui`
## Conventions
- Keep PR-ready changes targeted to `dev` branch.
- Use conventions from `CONTRIBUTING`.
- Do not include unrelated edits or submodule changes when preparing contributions.
- Use existing CLI/API tool patterns in `cli/` and `test/` when adding automation scripts.
- Respect environment-driven behavior (`SD_*` flags and options) instead of hardcoding platform/model assumptions.
- For startup/init edits, preserve error handling and partial-failure tolerance in parallel scans and extension loading.
## Pitfalls
- Initialization order matters: startup paths in `launch.py` and `webui.py` are sensitive to import/load timing.
- Shared mutable global state can create subtle regressions; prefer narrow, explicit changes.
- Device/backend-specific code paths (**CUDA/ROCm/IPEX/DirectML/OpenVINO**) should not assume one platform.
- Script and extension loading is dynamic; failures may appear only when specific extensions or models are present.
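
The guidance above on preserving partial-failure tolerance in parallel scans and extension loading can be illustrated with a minimal sketch. This is not SD.Next's actual loader: `load_all`, the loader names, and the thread-pool size are all hypothetical, and the only point is that one failing item is logged and recorded rather than aborting the whole startup.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("sketch")


def load_all(loaders):
    """Run every loader in parallel and collect failures instead of aborting (hypothetical helper)."""
    loaded, failed = {}, {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {name: pool.submit(fn) for name, fn in loaders.items()}
        for name, future in futures.items():
            try:
                loaded[name] = future.result()
            except Exception as e:  # broad on purpose: one bad item must not stop the rest
                failed[name] = e
                log.warning("failed to load %s: %s", name, e)
    return loaded, failed


def good():
    return "ok"


def broken():
    raise RuntimeError("missing dependency")


# Hypothetical extension names; the broken one is reported, the others still load.
loaded, failed = load_all({"ext_a": good, "ext_b": broken, "ext_c": good})
print(sorted(loaded), sorted(failed))
```

Returning the failures alongside the results lets the caller decide how loudly to report them, which keeps the tolerance explicit instead of silently swallowing errors.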


@@ -49,12 +49,11 @@ But also many smaller quality-of-life improvements - for full details, see [Chan
- **captioning** and **prompt enhance**: add support for all cloud-based Gemini models
*3.1/3.0/2.5 pro/flash/flash-lite*
- improve captioning and prompt enhance memory handling/offloading
- **Control**
- new **pre-processors**:
*anyline, depth_anything v2, dsine, lotus, marigold normals, oneformer, rtmlib pose, sam2, stablenormal, teed, vitpose*
- **Features**
- **Secrets** handling: new `secrets.json` and special handling for tokens/keys/passwords
previously these were treated like any other `config.json` param, which can cause security issues
- **Control**: many new **pre-processors**
*anyline, depth_anything v2, dsine, lotus, marigold normals, oneformer, rtmlib pose, sam2, stablenormal, teed, vitpose*
- pipelines: add **ZImageInpaint**
- rewritten **CivitAI** module
browse/discover mode with sort, period, type/base dropdowns; URL paste; subfolder sorting; auto-browse; dynamic dropdowns
@@ -66,12 +65,13 @@ But also many smaller quality-of-life improvements - for full details, see [Chan
- **ROCm** advanced configuration and tuning, thanks @resonantsky
see *main interface -> scripts -> rocm advanced config*
- **ROCm** support for additional AMD GPUs: `gfx103X`, thanks @crashingalexsan
- **Cuda** `torch==2.10` removed support for `rtx1000` series and older GPUs
use following before first startup to force installation of `torch==2.9.1` with `cuda==12.6`:
> `set TORCH_COMMAND='torch==2.9.1 torchvision==0.24.1 torchaudio==2.9.1 --index-url https://download.pytorch.org/whl/cu126'`
- **Cuda** update to `torch==2.11` with `cuda==13.0`
- **Ipex** update to `torch==2.11`
- **ROCm/Linux** update to `torch==2.11` with `rocm==7.2`
- **OpenVINO** update to `torch==2.11` and `openvino==2026.0`
- *note* **Cuda** `torch==2.10` removed support for `rtx1000` series and older GPUs
use the following before first startup to force installation of `torch==2.9.1` with `cuda==12.6`:
> `set TORCH_COMMAND='torch==2.9.1 torchvision==0.24.1 torchaudio==2.9.1 --index-url https://download.pytorch.org/whl/cu126'`
- **UI**
- legacy panels **T2I** and **I2I** are disabled by default
you can re-enable them in *settings -> ui -> hide legacy tabs*


@@ -3,11 +3,9 @@
## Release
- Implement: `unload_auxiliary_models`
- Switch to: `torch==2.11`
- Add notes: **Enso**
- Tips: **Color Grading**
- Regen: **Localization**
- AGENTS.md
## Internal
@@ -30,7 +28,7 @@
- Feature: Video tab add full API support
- Refactor: Unify *huggingface* and *diffusers* model folders
- Refactor: [GGUF](https://huggingface.co/docs/diffusers/main/en/quantization/gguf)
- Reimplement `llama` remover for Kanvas, pending end-to-end review of `Kanvas`
- Reimplement `llama` remover for Kanvas
## OnHold
@@ -72,9 +70,12 @@ TODO: Investigate which models are diffusers-compatible and prioritize!
- [Step1X-Edit](https://github.com/stepfun-ai/Step1X-Edit): Multimodal image editing decoding MLLM tokens via DiT
- [OneReward](https://github.com/bytedance/OneReward): Reinforcement learning grounded generative reward model for image editing
- [ByteDance DreamO](https://huggingface.co/ByteDance/DreamO): image customization framework for IP adaptation and virtual try-on
- [nVidia Cosmos-Transfer-2.5](https://github.com/huggingface/diffusers/pull/13066)
### Video
- [LTX-Condition](https://github.com/huggingface/diffusers/pull/13058)
- [LTX-Distilled](https://github.com/huggingface/diffusers/pull/12934)
- [OpenMOSS MOVA](https://huggingface.co/OpenMOSS-Team/MOVA-720p): Unified foundation model for synchronized high-fidelity video and audio
- [Wan family (Wan2.1 / Wan2.2 variants)](https://huggingface.co/Wan-AI/Wan2.2-Animate-14B): MoE-based foundational tools for cinematic T2V/I2V/TI2V
example: [Wan2.1-T2V-14B-CausVid](https://huggingface.co/lightx2v/Wan2.1-T2V-14B-CausVid)


@@ -556,7 +556,8 @@ def install_cuda():
    if args.use_nightly:
        cmd = os.environ.get('TORCH_COMMAND', '--upgrade --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu128 --extra-index-url https://download.pytorch.org/whl/nightly/cu130')
    else:
        cmd = os.environ.get('TORCH_COMMAND', 'torch==2.10.0+cu128 torchvision==0.25.0+cu128 --index-url https://download.pytorch.org/whl/cu128')
        # cmd = os.environ.get('TORCH_COMMAND', 'torch==2.10.0+cu128 torchvision==0.25.0+cu128 --index-url https://download.pytorch.org/whl/cu128')
        # assumed: TORCH_COMMAND carries pip arguments only, as in the other defaults, so the `pip install -U` prefix is dropped
        cmd = os.environ.get("TORCH_COMMAND", "torch==2.11.0+cu130 torchvision==0.26.0+cu130 --index-url https://download.pytorch.org/whl/cu130")
    return cmd
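
The override pattern in `install_cuda()` can be sketched in isolation: an environment variable, when set, replaces the built-in default wholesale. The helper name `resolve_torch_command` and the `environ` parameter are hypothetical and exist only to make the behavior testable. The default string is modeled on the CUDA default in the diff and treated here as pip arguments only, which is an assumption.

```python
import os

# Default pip arguments, modeled on the CUDA default in the diff (arguments only: an assumption).
DEFAULT_TORCH_ARGS = (
    "torch==2.11.0+cu130 torchvision==0.26.0+cu130 "
    "--index-url https://download.pytorch.org/whl/cu130"
)


def resolve_torch_command(environ=None, default=DEFAULT_TORCH_ARGS):
    """Return the TORCH_COMMAND override if present, otherwise the default (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return env.get("TORCH_COMMAND", default)


# Pinned build for older GPUs, taken from the changelog note in this commit.
pinned = (
    "torch==2.9.1 torchvision==0.24.1 torchaudio==2.9.1 "
    "--index-url https://download.pytorch.org/whl/cu126"
)

print(resolve_torch_command({}))                         # no override: built-in default
print(resolve_torch_command({"TORCH_COMMAND": pinned}))  # override wins unchanged
```

Because `dict.get` only falls back when the key is absent, a `TORCH_COMMAND` that is set but empty would be returned as an empty string rather than replaced by the default.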