Allows users to migrate existing workloads, including checkpoint merging, model training, and model inference, onto AWS
 
 
 
 
Xiujuan Li ae280fe82b upgrade 2024-09-10 12:25:26 +08:00

README.md

Extension for Stable Diffusion on AWS

Extension for Stable Diffusion on AWS: Unlock the Power of image and video generation in the Cloud with Ease and Speed

This is a WebUI extension that helps users migrate existing workloads (inference, training, etc.) from a local or standalone server to the AWS Cloud. Key features include:

  • Support Stable Diffusion WebUI inference along with other extensions through BYOC (bring your own container) in the cloud.
  • Support LoRA model training through Kohya_ss in the cloud.
  • Support ComfyUI inference along with other extensions in the cloud. Users can conveniently release templates that require stable, continuous inference to the cloud, then make simple modifications (e.g., prompt adjustments) to the released templates while maintaining stable inference.

Architecture

The diagram below presents the architecture you can automatically deploy using the solution's implementation guide and accompanying AWS CloudFormation template.

[Image: architecture]

  1. Users in the WebUI console trigger requests to Amazon API Gateway, using an assigned API token for authentication. Note that no Amazon Web Services credentials are required on the WebUI side.

  2. Amazon API Gateway routes each request, based on its URL prefix, to the corresponding functional AWS Lambda function, which implements utility jobs (for example, model upload and checkpoint merge), model training, or model inference. At the same time, Lambda records the operation metadata (for example, inference parameters and model name) in Amazon DynamoDB for later query and association.

  3. For training, AWS Step Functions is invoked to orchestrate the process, which includes Amazon SageMaker for training and Amazon SNS for training status notifications. For inference, AWS Lambda invokes Amazon SageMaker to run asynchronous inference. Training data, models, and checkpoints are stored in an Amazon S3 bucket, separated by different prefixes.
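The prefix-based routing in step 2 and the prefix-separated S3 layout in step 3 can be illustrated with a minimal sketch. This is not the project's actual code: the URL prefixes (`/util`, `/train`, `/inference`), the handler functions, and the S3 key prefixes are all hypothetical names chosen for illustration, and in the deployed solution the routing is performed by API Gateway rather than by Python code.

```python
from typing import Callable, Dict

# Hypothetical handlers standing in for the functional Lambda functions.
def handle_util(path: str) -> str:
    return f"util job for {path}"

def handle_train(path: str) -> str:
    return f"training job for {path}"

def handle_inference(path: str) -> str:
    return f"inference job for {path}"

# Hypothetical URL prefixes; the real API paths may differ.
ROUTES: Dict[str, Callable[[str], str]] = {
    "/util": handle_util,
    "/train": handle_train,
    "/inference": handle_inference,
}

def route(path: str) -> str:
    """Dispatch a request path to the handler registered for its URL prefix."""
    for prefix, handler in ROUTES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return handler(path)
    raise KeyError(f"no handler registered for {path}")

# Hypothetical S3 key prefixes separating artifact types within one bucket.
S3_PREFIXES: Dict[str, str] = {
    "training_data": "train-data/",
    "model": "models/",
    "checkpoint": "checkpoints/",
}

def s3_key(kind: str, name: str) -> str:
    """Build an object key under the prefix reserved for the artifact type."""
    return S3_PREFIXES[kind] + name
```

For example, `route("/train/jobs")` dispatches to the training handler, and `s3_key("checkpoint", "v1.safetensors")` yields a key under the checkpoint prefix, so different artifact types never collide inside the same bucket.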

Quick Start

The extension supports three key features, and there are two deployment paths depending on which feature you would like to deploy.

  • If you'd like to adopt SD WebUI or Kohya in the cloud, please follow the instructions here.
  • If you'd like to adopt ComfyUI in the cloud, please follow the instructions here.

API Reference

To make invoking and debugging APIs more convenient for developers, we offer an API debugger. With this tool, you can view, with a single click, the complete set of APIs and corresponding parameters used to generate images through cloud-based inference.

  1. Click the button to refresh the inference history job list
  2. Pull down the inference job list, find and select the job
  3. Click the API button on the right

[Image: API debugger]

The complete API reference, with samples, can be found here.
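As a rough illustration of how a client might call the deployed API with its assigned token, the sketch below builds (but does not send) an HTTP request. It assumes the token is passed in the `x-api-key` header, which is the header Amazon API Gateway uses for API keys; the base URL, the `/inferences` path, and the payload fields are hypothetical placeholders, so consult the API reference for the real endpoints and parameters.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

def build_request(
    base_url: str,
    path: str,
    api_token: str,
    payload: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build a request carrying the API token; POST if a payload is given, else GET."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {
        # Assumption: the assigned API token is sent as an API Gateway API key.
        "x-api-key": api_token,
        "Content-Type": "application/json",
    }
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)

# Hypothetical usage; the URL, path, and payload are placeholders.
request = build_request(
    "https://example.invalid/prod/",
    "/inferences",
    "my-api-token",
    {"prompt": "a cat"},
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch runnable without network access and makes the headers easy to inspect while debugging.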

Version

Check our wiki for the latest and historical versions.

License

This project is licensed under the Apache-2.0 License.

Source Code Structure

.
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── NOTICE
├── README.md
├── THIRD-PARTY-LICENSES.txt
├── build_scripts -- scripts to build the Docker images; we use these scripts to build Docker images in the cloud
├── buildspec.yml -- buildspec file for CodeBuild; our code pipeline uses this buildspec to convert the CDK assets into CloudFormation templates
├── deployment    -- scripts to deploy the CloudFormation template
├── docs
├── infrastructure -- CDK project to deploy the middleware; all the middleware infrastructure code is in this directory
├── install.py -- installs dependencies for the extension
├── install.sh -- script to set the WebUI and extension to a specific version
├── javascript -- JavaScript code for the extension
├── middleware_api -- middleware API definition and Lambda code
├── sagemaker_entrypoint_json.py -- wrapper function for SageMaker
├── scripts -- extension-related code for WebUI
└── utils.py -- wrapper functions for configuration options