Deploy model endpoint for Stable Diffusion v2.1
In this tutorial, we will create a model endpoint for Stability AI’s Stable Diffusion v2.1 model.
The tutorial focuses on:
- Step-by-step model endpoint creation and image generation
- Creating a model endpoint with custom model weights
- Supported parameters for image generation
For this tutorial, we’ll use the Stable Diffusion v2.1 pre-built container. The advantage of using this is that the API handler is automatically created — no need to build one manually.
A guide on model endpoint creation and image generation
Step 1: Create a model endpoint for Stable Diffusion v2.1
- Go to the AI Platform.
- Select a project.
- Navigate to Model Endpoints.
- Click Create Endpoint.
- Choose the Stable-Diffusion v2.1 model card under Choose Framework.
- Pick a suitable GPU plan. You can keep default values for replicas, disk size, and endpoint details.
- Add environment variables if required, otherwise proceed.
- Skip custom model weights for now; we’ll use the default ones. (See Creating model endpoint with custom model weights if needed.)
- Complete endpoint creation.
Step 2: Generate your API token
The model endpoint API requires an auth token.
- Go to the API Tokens section in your project.
- Click Create Token or use an existing one.
- Copy the generated Auth Token — you’ll use it in the next step.
Step 3: Generate images using text prompts
Once your model endpoint is Ready, test it using the Sample API Request shown on the model details page.
- Launch an Instance with the PyTorch or StableDiffusion image.
- Open Jupyter Labs → start a new notebook (untitled.ipynb).
- Paste the Python Sample API Request from the endpoint.
- Replace $AUTH_TOKEN with your actual token.
- Run the cell to generate results.
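As a rough sketch of what the pasted request does, the snippet below posts a prompt together with an auth header. The URL, the Bearer scheme, the flat {"prompt": ...} body, and the helper names are assumptions made for illustration. The Sample API Request on your endpoint page is authoritative and should be used wherever the two differ.

```python
import json

# Hypothetical placeholder: copy the real URL from the Sample API Request.
ENDPOINT_URL = "https://example.invalid/your-endpoint-url"


def build_headers(auth_token):
    """Return request headers. The Bearer scheme is an assumption."""
    if not auth_token or auth_token == "$AUTH_TOKEN":
        raise ValueError("Replace $AUTH_TOKEN with a real API token")
    return {
        "Authorization": f"Bearer {auth_token}",
        "Content-Type": "application/json",
    }


def send_prompt(prompt, auth_token, url=ENDPOINT_URL, timeout=300):
    """POST a prompt to the endpoint and return the requests.Response."""
    import requests  # imported lazily so build_headers works without it

    payload = {"prompt": prompt}  # assumed body shape; check the sample request
    return requests.post(
        url,
        headers=build_headers(auth_token),
        data=json.dumps(payload),
        timeout=timeout,
    )

# Example (not executed here):
# response = send_prompt("a photo of an astronaut riding a horse", "<your token>")
```

Rejecting the literal $AUTH_TOKEN placeholder up front gives a clearer error than an authentication failure from the server.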
The endpoint will return tensors (PyTorch format) representing images. To visualize them:
import torch
import torchvision.transforms as transforms

def display_images(tensor_image_data_list):
    '''Convert PyTorch tensors to PIL images and display them.'''
    for tensor_data in tensor_image_data_list:
        tensor_image = torch.tensor(tensor_data.get("data"))
        pil_img = transforms.ToPILImage()(tensor_image)
        pil_img.show()
        # Uncomment to save images
        # pil_img.save(tensor_data.get("name"))

# `response` is the object returned by the Sample API Request pasted earlier
if response.status_code == 200:
    display_images(response.json().get("predictions"))
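Before converting, it can help to confirm what actually came back. The sketch below reports the shape of each prediction's nested-list data using only the standard library. ToPILImage expects a channel-first (C x H x W) tensor, so a three-element shape with the channel count first is what you would hope to see. The helper names and the fallback name image_N are hypothetical; the "name" and "data" keys are taken from the display code above.

```python
def nested_shape(data):
    """Return the shape of a rectangular nested list, e.g. (3, 2, 2)."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:
            break
        data = data[0]  # assumes every sibling list has the same length
    return tuple(shape)


def summarize_predictions(predictions):
    """Map each prediction's name to the shape of its tensor data."""
    summary = {}
    for index, item in enumerate(predictions):
        name = item.get("name", f"image_{index}")  # hypothetical fallback name
        summary[name] = nested_shape(item.get("data", []))
    return summary

# Example (in the notebook):
# print(summarize_predictions(response.json().get("predictions")))
```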
Your Stable Diffusion endpoint is now up and running!
You can modify text prompts or parameters to explore different image generations. See Supported parameters for image generation for more details.
Creating model endpoint with custom model weights
To use custom model weights for Stable Diffusion v2.1:
- Download the Stable-Diffusion-2-1 model.
- Upload it to your Model Bucket (EOS).
- Create an endpoint pointing to the uploaded weights.
Step 1.1: Define a model in the dashboard
- Go to the AI Platform.
- Choose your project.
- Navigate to Models → Create Model.
- Name it (e.g., stable-diffusion).
- Select Model Type: Custom.
- Click CREATE.
- Note the details of the created EOS bucket.
- Copy the Setup Host command from Setup MinIO CLI.
Step 1.2: Start a new Instance
- Go to Notebooks and launch one using Diffusers Image.
- Choose a GPU plan (recommended).
- In Jupyter Labs, open Terminal.
- Run the copied MinIO CLI Host setup command.
- Confirm that the mc CLI is ready.
Step 1.3: Download Stable Diffusion v2.1 model
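If the libraries are not already installed in the image, install them first from the terminal:
pip install diffusers transformers accelerate scipy safetensors
Then download and load the model in a notebook cell:
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
model_id = "stabilityai/stable-diffusion-2-1"
# float16 is intended for GPU; fall back to float32 when running on CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to(device)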
Run a quick test:
prompt = "a photograph of an astronaut riding a horse"
image = pipe(prompt).images[0]
image.show()
Step 2: Upload the model to EOS
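# Go to the snapshots directory of the downloaded model
cd $HOME/.cache/huggingface/hub/models--stabilityai--stable-diffusion-2-1/snapshots
# Copy everything to the model bucket (replace the target with your own bucket path)
mc cp -r * stable-diffusion/stable-diffusion-854588
# If the cd command fails, list the cache to locate the model directory
ls $HOME/.cache/huggingface/hub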
Step 3: Create an endpoint for your model
Return to the dashboard → Model Endpoints → Create Endpoint.
Select the model from your EOS bucket. If your model isn’t in the root directory, provide the path containing model_index.json.
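The snapshots directory in the Hugging Face cache normally holds one subdirectory per revision, so after the recursive copy model_index.json may sit one level below the bucket root. The sketch below, run against the local snapshots directory before uploading, lists the relative paths that contain model_index.json. The function name is hypothetical, and it assumes the bucket mirrors the local layout.

```python
from pathlib import Path


def find_model_dirs(root):
    """Return directories under root that contain model_index.json.

    Paths are relative to root and use forward slashes; "." means the
    file sits directly in root.
    """
    root = Path(root)
    return sorted(
        path.parent.relative_to(root).as_posix()
        for path in root.rglob("model_index.json")
    )

# Example (in the notebook):
# snapshots = Path.home() / ".cache/huggingface/hub/models--stabilityai--stable-diffusion-2-1/snapshots"
# print(find_model_dirs(snapshots))
```

Any entry other than "." is the path to provide when creating the endpoint.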
Supported parameters for image generation
Each request can include additional parameters to control image output.
prompt (str or list) — Text input guiding image generation.
{
"prompt": [
"a photo of an astronaut riding a horse",
"a photo of a cat playing football"
]
}
height / width (int, defaults to 768) — Image dimensions in pixels.
{
"prompt": "a photo of an astronaut riding a horse",
"height": 512,
"width": 1024
}
num_images_per_prompt (int, defaults to 1) — Number of images to generate per prompt.
{
"prompt": ["horse", "cat"],
"num_images_per_prompt": 2
}
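Assuming the endpoint follows the usual diffusers behavior, the number of returned images is the number of prompts multiplied by num_images_per_prompt. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def expected_image_count(prompt, num_images_per_prompt=1):
    """Estimate how many images a request should return.

    Assumes one batch of num_images_per_prompt images per prompt.
    """
    prompts = [prompt] if isinstance(prompt, str) else list(prompt)
    return len(prompts) * num_images_per_prompt


# Two prompts with two images each gives four images.
print(expected_image_count(["horse", "cat"], num_images_per_prompt=2))
```

Comparing this estimate with the length of the predictions list is a quick sanity check, and a reminder that larger batches need more GPU memory.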
generator (list[int]) — Random seed for deterministic output.
{
"prompt": ["horse", "cat"],
"generator": [1024, 700]
}
num_inference_steps (int, defaults to 50) — Denoising steps for quality control.
{
"prompt": "horse",
"generator": [2000],
"num_inference_steps": 80
}
negative_prompt (str or list) — Prompts specifying what not to include in the image.
{
"prompt": "a horse riding an astronaut",
"negative_prompt": "brown colored horse"
}
guidance_scale (float, defaults to 7.5) — Adjusts adherence to the prompt.
{
"prompt": "a photo of an astronaut riding a horse",
"guidance_scale": 11
}
guidance_rescale (float, defaults to 0.7) — Adjusts overexposure correction.
{
"prompt": "horse",
"generator": [2000],
"guidance_rescale": 0.3
}
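The parameters above can be combined in a single request body. The sketch below assembles one and rejects names that are not in the list above, which catches typos before the request is sent. The helper name is hypothetical. The divisible-by-8 check on height and width reflects the diffusers StableDiffusionPipeline, and it is an assumption that the hosted endpoint enforces the same rule.

```python
SUPPORTED_PARAMETERS = {
    "height", "width", "num_images_per_prompt", "generator",
    "num_inference_steps", "negative_prompt", "guidance_scale",
    "guidance_rescale",
}


def build_payload(prompt, **parameters):
    """Build a request body from a prompt plus optional parameters."""
    unknown = set(parameters) - SUPPORTED_PARAMETERS
    if unknown:
        raise ValueError(f"Unsupported parameters: {sorted(unknown)}")
    for key in ("height", "width"):
        value = parameters.get(key)
        # Assumed constraint, taken from the diffusers pipeline.
        if value is not None and value % 8 != 0:
            raise ValueError(f"{key} must be divisible by 8, got {value}")
    payload = {"prompt": prompt}
    payload.update(parameters)
    return payload


print(build_payload("horse", generator=[2000], num_inference_steps=80))
```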
Troubleshooting and best practices
- Verify MinIO credentials before upload.
- Ensure sufficient GPU resources before deployment.
- Use smaller prompts or lower inference steps for testing.
- If API calls fail, check logs under endpoint details.
- Keep guidance_scale between 6–9 for balanced image quality.
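When a call fails, surfacing the status code and a short excerpt of the body in the notebook can narrow things down before you open the endpoint logs. A minimal sketch with a hypothetical helper name, assuming a successful response is a JSON object with a "predictions" list as in the display code earlier:

```python
def predictions_or_raise(status_code, body):
    """Return the predictions list, or raise with the status and a body excerpt."""
    if status_code != 200:
        excerpt = str(body)[:200]  # keep the message short
        raise RuntimeError(f"Endpoint returned HTTP {status_code}: {excerpt}")
    predictions = body.get("predictions") if isinstance(body, dict) else None
    if not predictions:
        raise RuntimeError("HTTP 200 but the response contained no predictions")
    return predictions

# Example (in the notebook):
# body = response.json() if response.status_code == 200 else response.text
# predictions = predictions_or_raise(response.status_code, body)
```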