From 7a47b6bbdb3b6c306eb310ef1f436f9bb7952334 Mon Sep 17 00:00:00 2001
From: vladmandic
Date: Wed, 25 Mar 2026 08:23:55 +0100
Subject: [PATCH] update torch-cuda and create agents.md

Signed-off-by: vladmandic
---
 .github/copilot-instructions.md | 28 +++++++++++----
 AGENTS.md                       | 60 +++++++++++++++++++++++++++++++++
 CHANGELOG.md                    | 12 +++----
 TODO.md                         |  7 ++--
 installer.py                    |  3 +-
 5 files changed, 94 insertions(+), 16 deletions(-)
 create mode 100644 AGENTS.md

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 6fb62bcfe..2ee05f966 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -1,28 +1,41 @@
-# Project Guidelines
+# SD.Next: AGENTS.md Project Guidelines
+
+SD.Next is a complex codebase with specific patterns and conventions.
+General app structure is:
+- Python backend server
+  Uses Torch for model inference, FastAPI for API routes and Gradio for creation of UI components.
+- JavaScript/CSS frontend
 
 ## Tools
-- `pyproject.toml` for Python configuration, including linting and type checking settings.
-- `venv` for Python environment management, activated with `source venv/bin/activate`.
+
+- `venv` for Python environment management, activated with `source venv/bin/activate` (Linux) or `venv\Scripts\activate` (Windows).
+  venv MUST be activated before running any Python commands or scripts to ensure correct dependencies and environment variables.
 - `python` 3.10+.
+- `pyproject.toml` for Python configuration, including linting and type checking settings.
 - `eslint` configured for both core and UI code.
 - `pnpm` for managing JavaScript dependencies and scripts, with key commands defined in `package.json`.
 - `ruff` and `pylint` for Python linting, with configurations in `pyproject.toml` and executed via `pnpm ruff` and `pnpm pylint`.
 - `pre-commit` hooks which also check line-endings and other formatting issues, configured in `.pre-commit-config.yaml`.
 
 ## Project Structure
+
 - Entry/startup flow: `webui.sh` -> `launch.py` -> `webui.py` -> modules under `modules/`.
+- Install: `installer.py` takes care of installing dependencies and setting up the environment.
 - Core runtime state is centralized in `modules/shared.py` (shared.opts, model state, backend/device state).
 - API/server routes are under `modules/api/`.
+- UI codebase is split between base JS in `javascript/` and actual UI in `extensions-builtin/sdnext-modernui/`.
 - Model and pipeline logic is split between `modules/sd_*` and `pipelines/`.
 - Additional plug-ins live in `scripts/` and are used only when specified.
 - Extensions live in `extensions-builtin/` and `extensions/` and are loaded dynamically.
 - Tests and CLI scripts are under `test/` and `cli/`, with some API smoke checks in `test/full-test.sh`.
 
 ## Code Style
+
 - Prefer existing project patterns over strict generic style rules;
   this codebase intentionally allows patterns often flagged in default linters such as allowing long lines, etc.
 
 ## Build And Test
+
 - Activate environment: `source venv/bin/activate` (always ensure this is active when working with Python code).
 - Test startup: `python launch.py --test`
 - Full startup: `python launch.py`
@@ -31,14 +44,17 @@
 - JS checks: `pnpm eslint` and `pnpm eslint-ui`
 
 ## Conventions
-- Keep PR-ready changes targeted to `dev` branch conventions from `CONTRIBUTING`.
+
+- Keep PR-ready changes targeted to `dev` branch.
+- Use conventions from `CONTRIBUTING`.
 - Do not include unrelated edits or submodule changes when preparing contributions.
 - Use existing CLI/API tool patterns in `cli/` and `test/` when adding automation scripts.
 - Respect environment-driven behavior (`SD_*` flags and options) instead of hardcoding platform/model assumptions.
 - For startup/init edits, preserve error handling and partial-failure tolerance in parallel scans and extension loading.
 
 ## Pitfalls
+
 - Initialization order matters: startup paths in `launch.py` and `webui.py` are sensitive to import/load timing.
 - Shared mutable global state can create subtle regressions; prefer narrow, explicit changes.
-- Device/backend-specific code paths (CUDA/ROCm/IPEX/DirectML/OpenVINO) should not assume one platform.
-- Extension loading is dynamic; failures may appear only when specific extensions or models are present.
+- Device/backend-specific code paths (**CUDA/ROCm/IPEX/DirectML/OpenVINO**) should not assume one platform.
+- Scripts and extension loading is dynamic; failures may appear only when specific extensions or models are present.
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 000000000..2ee05f966
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,60 @@
+# SD.Next: AGENTS.md Project Guidelines
+
+SD.Next is a complex codebase with specific patterns and conventions.
+General app structure is:
+- Python backend server
+  Uses Torch for model inference, FastAPI for API routes and Gradio for creation of UI components.
+- JavaScript/CSS frontend
+
+## Tools
+
+- `venv` for Python environment management, activated with `source venv/bin/activate` (Linux) or `venv\Scripts\activate` (Windows).
+  venv MUST be activated before running any Python commands or scripts to ensure correct dependencies and environment variables.
+- `python` 3.10+.
+- `pyproject.toml` for Python configuration, including linting and type checking settings.
+- `eslint` configured for both core and UI code.
+- `pnpm` for managing JavaScript dependencies and scripts, with key commands defined in `package.json`.
+- `ruff` and `pylint` for Python linting, with configurations in `pyproject.toml` and executed via `pnpm ruff` and `pnpm pylint`.
+- `pre-commit` hooks which also check line-endings and other formatting issues, configured in `.pre-commit-config.yaml`.
+
+## Project Structure
+
+- Entry/startup flow: `webui.sh` -> `launch.py` -> `webui.py` -> modules under `modules/`.
+- Install: `installer.py` takes care of installing dependencies and setting up the environment.
+- Core runtime state is centralized in `modules/shared.py` (shared.opts, model state, backend/device state).
+- API/server routes are under `modules/api/`.
+- UI codebase is split between base JS in `javascript/` and actual UI in `extensions-builtin/sdnext-modernui/`.
+- Model and pipeline logic is split between `modules/sd_*` and `pipelines/`.
+- Additional plug-ins live in `scripts/` and are used only when specified.
+- Extensions live in `extensions-builtin/` and `extensions/` and are loaded dynamically.
+- Tests and CLI scripts are under `test/` and `cli/`, with some API smoke checks in `test/full-test.sh`.
+
+## Code Style
+
+- Prefer existing project patterns over strict generic style rules;
+  this codebase intentionally allows patterns often flagged in default linters such as allowing long lines, etc.
+
+## Build And Test
+
+- Activate environment: `source venv/bin/activate` (always ensure this is active when working with Python code).
+- Test startup: `python launch.py --test`
+- Full startup: `python launch.py`
+- Full lint sequence: `pnpm lint`
+- Python checks individually: `pnpm ruff`, `pnpm pylint`
+- JS checks: `pnpm eslint` and `pnpm eslint-ui`
+
+## Conventions
+
+- Keep PR-ready changes targeted to `dev` branch.
+- Use conventions from `CONTRIBUTING`.
+- Do not include unrelated edits or submodule changes when preparing contributions.
+- Use existing CLI/API tool patterns in `cli/` and `test/` when adding automation scripts.
+- Respect environment-driven behavior (`SD_*` flags and options) instead of hardcoding platform/model assumptions.
+- For startup/init edits, preserve error handling and partial-failure tolerance in parallel scans and extension loading.
+
+## Pitfalls
+
+- Initialization order matters: startup paths in `launch.py` and `webui.py` are sensitive to import/load timing.
+- Shared mutable global state can create subtle regressions; prefer narrow, explicit changes.
+- Device/backend-specific code paths (**CUDA/ROCm/IPEX/DirectML/OpenVINO**) should not assume one platform.
+- Scripts and extension loading is dynamic; failures may appear only when specific extensions or models are present.
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 295279932..af91fa110 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -49,12 +49,11 @@ But also many smaller quality-of-life improvements - for full details, see [Chan
   - **captioning** and **prompt enhance**: add support for all cloud-based Gemini models
     *3.1/3.0/2.5 pro/flash/flash-lite*
   - improve captioning and prompt enhance memory handling/offloading
-- **Control**
-  - new **pre-processors**:
-    *anyline, depth_anything v2, dsine, lotus, marigold normals, oneformer, rtmlib pose, sam2, stablenormal, teed, vitpose*
 - **Features**
   - **Secrets** handling: new `secrets.json` and special handling for tokens/keys/passwords
     used to be treated like any other `config.json` param which can cause security issues
+  - **Control**: many new **pre-processors**
+    *anyline, depth_anything v2, dsine, lotus, marigold normals, oneformer, rtmlib pose, sam2, stablenormal, teed, vitpose*
   - pipelines: add **ZImageInpaint**
   - rewritten **CivitAI** module
     browse/discover mode with sort, period, type/base dropdowns; URL paste; subfolder sorting; auto-browse; dynamic dropdowns
@@ -66,12 +65,13 @@ But also many smaller quality-of-life improvements - for full details, see [Chan
   - **ROCm** advanced configuration and tuning, thanks @resonantsky
     see *main interface -> scripts -> rocm advanced config*
   - **ROCm** support for additional AMD GPUs: `gfx103X`, thanks @crashingalexsan
-  - **Cuda** `torch==2.10` removed support for `rtx1000` series and older GPUs
-    use following before first startup to force installation of `torch==2.9.1` with `cuda==12.6`:
-    > `set TORCH_COMMAND='torch==2.9.1 torchvision==0.24.1 torchaudio==2.9.1 --index-url https://download.pytorch.org/whl/cu126'`
+  - **Cuda** update to `torch=2.11` with `cuda=13.0`
   - **Ipex** update to `torch==2.11`
   - **ROCm/Linux** update to `torch==2.11` with `rocm==7.2`
   - **OpenVINO** update to `torch==2.11` and `openvino==2026.0`
+  - *note* **Cuda** `torch==2.10` removed support for `rtx1000` series and older GPUs
+    use following before first startup to force installation of `torch==2.9.1` with `cuda==12.6`:
+    > `set TORCH_COMMAND='torch==2.9.1 torchvision==0.24.1 torchaudio==2.9.1 --index-url https://download.pytorch.org/whl/cu126'`
 - **UI**
   - legacy panels **T2I** and **I2I** are disabled by default
     you can re-enable them in *settings -> ui -> hide legacy tabs*
diff --git a/TODO.md b/TODO.md
index 3468c97f5..ca6417b76 100644
--- a/TODO.md
+++ b/TODO.md
@@ -3,11 +3,9 @@
 ## Release
 
 - Implement: `unload_auxiliary_models`
-- Switch to: `torch==2.11`
 - Add notes: **Enso**
 - Tips: **Color Grading**
 - Regen: **Localization**
-- AGENTS.md
 
 ## Internal
 
@@ -30,7 +28,7 @@
 - Feature: Video tab add full API support
 - Refactor: Unify *huggingface* and *diffusers* model folders
 - Refactor: [GGUF](https://huggingface.co/docs/diffusers/main/en/quantization/gguf)
-- Reimplement `llama` remover for Kanvas, pending end-to-end review of `Kanvas`
+- Reimplement `llama` remover for Kanvas
 
 ## OnHold
 
@@ -72,9 +70,12 @@ TODO: Investigate which models are diffusers-compatible and prioritize!
 - [Step1X-Edit](https://github.com/stepfun-ai/Step1X-Edit):Multimodal image editing decoding MLLM tokens via DiT
 - [OneReward](https://github.com/bytedance/OneReward):Reinforcement learning grounded generative reward model for image editing
 - [ByteDance DreamO](https://huggingface.co/ByteDance/DreamO): image customization framework for IP adaptation and virtual try-on
+- [nVidia Cosmos-Transfer-2.5](https://github.com/huggingface/diffusers/pull/13066)
 
 ### Video
 
+- [LTX-Condition](https://github.com/huggingface/diffusers/pull/13058)
+- [LTX-Distilled](https://github.com/huggingface/diffusers/pull/12934)
 - [OpenMOSS MOVA](https://huggingface.co/OpenMOSS-Team/MOVA-720p): Unified foundation model for synchronized high-fidelity video and audio
 - [Wan family (Wan2.1 / Wan2.2 variants)](https://huggingface.co/Wan-AI/Wan2.2-Animate-14B): MoE-based foundational tools for cinematic T2V/I2V/TI2V
   example: [Wan2.1-T2V-14B-CausVid](https://huggingface.co/lightx2v/Wan2.1-T2V-14B-CausVid)
diff --git a/installer.py b/installer.py
index 53027f5b6..ac8802c34 100644
--- a/installer.py
+++ b/installer.py
@@ -556,7 +556,8 @@ def install_cuda():
     if args.use_nightly:
         cmd = os.environ.get('TORCH_COMMAND', '--upgrade --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu128 --extra-index-url https://download.pytorch.org/whl/nightly/cu130')
     else:
-        cmd = os.environ.get('TORCH_COMMAND', 'torch==2.10.0+cu128 torchvision==0.25.0+cu128 --index-url https://download.pytorch.org/whl/cu128')
+        # cmd = os.environ.get('TORCH_COMMAND', 'torch==2.10.0+cu128 torchvision==0.25.0+cu128 --index-url https://download.pytorch.org/whl/cu128')
+        cmd = os.environ.get("TORCH_COMMAND", "pip install -U torch==2.11.0+cu130 torchvision==0.26.0+cu130 --index-url https://download.pytorch.org/whl/cu130")
     return cmd
 
 
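
Review note: the `install_cuda()` hunk and the CHANGELOG note both rely on the same mechanism, where a `TORCH_COMMAND` environment variable, if set, replaces the built-in torch package spec. The sketch below illustrates only that lookup. `resolve_torch_command`, `CUDA_PINS` and `LEGACY_PINS` are hypothetical names that do not exist in `installer.py`; the pin strings are copied from the patch (package pins only), and how the installer later consumes the returned string is not shown here.

```python
import os

# Hypothetical helper (not part of installer.py): mirrors the
# os.environ.get('TORCH_COMMAND', default) lookup used in install_cuda().
def resolve_torch_command(default, env=None):
    """Return the TORCH_COMMAND override if present, else the default spec."""
    env = os.environ if env is None else env
    return env.get("TORCH_COMMAND", default)

# Package pins from the new CUDA default in the installer.py hunk.
CUDA_PINS = (
    "torch==2.11.0+cu130 torchvision==0.26.0+cu130 "
    "--index-url https://download.pytorch.org/whl/cu130"
)

# Pins suggested by the CHANGELOG note for GPUs no longer supported by torch 2.10+.
LEGACY_PINS = (
    "torch==2.9.1 torchvision==0.24.1 torchaudio==2.9.1 "
    "--index-url https://download.pytorch.org/whl/cu126"
)

# No override set: the built-in default is used.
default_choice = resolve_torch_command(CUDA_PINS, env={})

# Override set before first startup: the user-supplied spec wins.
legacy_choice = resolve_torch_command(CUDA_PINS, env={"TORCH_COMMAND": LEGACY_PINS})
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment; calling `resolve_torch_command(CUDA_PINS)` with no `env` argument reads `os.environ` instead.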