Versions of Stable Diffusion
Stable Diffusion v1.x (v1.1 to v1.4)
Stable Diffusion v1.x is the initial release of the Stable Diffusion model, developed by the CompVis group at Ludwig Maximilian University of Munich in collaboration with Stability AI and Runway.
- Core: Latent Diffusion with a U-Net; CLIP ViT-L/14 text encoder.
- Native resolution: 512x512 pixels.
- Data: LAION subsets with aesthetics filtering.
- Hugging Face: CompVis/stable-diffusion-v1-4
- GitHub: CompVis/stable-diffusion
- Paper: High-Resolution Image Synthesis with Latent Diffusion Models
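```python
from diffusers import StableDiffusionPipeline

# Download the v1.4 weights and move the pipeline to the GPU.
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe.to("cuda")

# Generate one image from the prompt and save it.
image = pipe("A fantasy landscape, trending on artstation").images[0]
image.save("fantasy_landscape.png")
```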
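To make "latent diffusion" concrete, here is a minimal sketch of the tensor shape the U-Net actually denoises. The downscale factor of 8 and the 4 latent channels are properties of the v1.x autoencoder that this note does not state, so treat them as assumptions; `latent_shape` is a hypothetical helper, not part of any library.

```python
# Sketch only: these two constants describe the SD v1.x autoencoder and are
# assumptions here (they are not stated in this note).
VAE_DOWNSCALE = 8      # each latent pixel covers an 8x8 patch of the image
LATENT_CHANNELS = 4    # channels in the latent tensor


def latent_shape(height, width, batch_size=1):
    """Return the (batch, channels, h, w) shape of the latent the U-Net denoises."""
    if height % VAE_DOWNSCALE or width % VAE_DOWNSCALE:
        raise ValueError("height and width must be multiples of 8")
    return (
        batch_size,
        LATENT_CHANNELS,
        height // VAE_DOWNSCALE,
        width // VAE_DOWNSCALE,
    )


print(latent_shape(512, 512))  # (1, 4, 64, 64)
```

Under these assumptions, the native 512x512 resolution corresponds to a 64x64 latent, which is why the diffusion process is much cheaper than working on pixels directly.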
Stable Diffusion 2.x (2.0, 2.1)
Stable Diffusion v2.x is an improved version of the original Stable Diffusion model, released by Stability AI.
- What changed vs 1.x:
  - New text encoder: OpenCLIP ViT-H/14 (different tokenization and semantics from v1.x).
  - Added a native 768x768 model alongside the 512x512 variant.
  - Expanded official variants: depth-to-image, inpainting, and x4 upscaler.
  - Stronger data filtering; the prompt vocabulary that works well shifted relative to v1.x.
- Hugging Face: stabilityai/stable-diffusion-2-1-base
- GitHub: Stability-AI/stablediffusion
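The variants listed above map onto different `diffusers` pipeline classes. The sketch below shows one way to select between them. Only the 512 base repository ID comes from this note; the other repository IDs and the helper name `load_sd2` are assumptions for illustration and should be checked against the Hugging Face Hub before use.

```python
# Variant name -> (diffusers pipeline class name, Hugging Face repo ID).
# Every repo ID except the 512 base model is an assumption, not from this note.
SD2_VARIANTS = {
    "base-512": ("StableDiffusionPipeline", "stabilityai/stable-diffusion-2-1-base"),
    "base-768": ("StableDiffusionPipeline", "stabilityai/stable-diffusion-2-1"),
    "depth": ("StableDiffusionDepth2ImgPipeline", "stabilityai/stable-diffusion-2-depth"),
    "inpainting": ("StableDiffusionInpaintPipeline", "stabilityai/stable-diffusion-2-inpainting"),
    "upscaler-x4": ("StableDiffusionUpscalePipeline", "stabilityai/stable-diffusion-x4-upscaler"),
}


def load_sd2(variant="base-512", device="cuda"):
    """Load one SD 2.x variant (hypothetical helper; downloads weights on first use)."""
    # Imported lazily so the table above can be inspected without a GPU stack.
    import torch
    import diffusers

    class_name, repo_id = SD2_VARIANTS[variant]
    pipeline_cls = getattr(diffusers, class_name)
    pipe = pipeline_cls.from_pretrained(repo_id, torch_dtype=torch.float16)
    return pipe.to(device)
```

Calling `load_sd2("base-512")` returns a text-to-image pipeline that is used like the v1.4 example; the depth, inpainting, and upscaler pipelines additionally expect an input image.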
SDXL 1.0 (Base + Refiner)
Also released by Stability AI.
- Architecture:
  - Two-stage diffusion process (Base + Refiner).
  - “Ensemble of experts” pipeline: the Base handles the high-noise (coarse) denoising steps and an optional Refiner handles the final low-noise steps.
  - Two text encoders: OpenCLIP ViT-bigG/14 and CLIP ViT-L/14.
- Native resolution: 1024x1024 pixels.
- Quality: Big jump in composition, color fidelity, photorealism, and prompt alignment vs 1.4/2.1.
- Trade-offs: Heavier and slower than prior versions.
- Hugging Face:
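The Base + Refiner hand-off can be sketched with `diffusers` as follows. The repository IDs, the 40-step budget, and the 0.8 split fraction are assumptions that do not come from this note, and `split_steps` only approximates how the schedule is divided.

```python
BASE_ID = "stabilityai/stable-diffusion-xl-base-1.0"        # assumed repo ID
REFINER_ID = "stabilityai/stable-diffusion-xl-refiner-1.0"  # assumed repo ID


def split_steps(num_steps, high_noise_frac):
    """Approximate how many scheduler steps the Base and the Refiner each run."""
    base_steps = round(num_steps * high_noise_frac)
    return base_steps, num_steps - base_steps


def generate(prompt, num_steps=40, high_noise_frac=0.8, device="cuda"):
    """Run the Base on the high-noise part, then the Refiner on the rest."""
    import torch
    from diffusers import DiffusionPipeline

    base = DiffusionPipeline.from_pretrained(
        BASE_ID, torch_dtype=torch.float16, variant="fp16", use_safetensors=True
    ).to(device)
    # The Refiner reuses the Base's second text encoder and VAE to save memory.
    refiner = DiffusionPipeline.from_pretrained(
        REFINER_ID,
        text_encoder_2=base.text_encoder_2,
        vae=base.vae,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to(device)

    # The Base stops early and hands over latents instead of a decoded image.
    latents = base(
        prompt=prompt,
        num_inference_steps=num_steps,
        denoising_end=high_noise_frac,
        output_type="latent",
    ).images
    # The Refiner picks up at the same point in the noise schedule.
    return refiner(
        prompt=prompt,
        num_inference_steps=num_steps,
        denoising_start=high_noise_frac,
        image=latents,
    ).images[0]


print(split_steps(40, 0.8))  # (32, 8)
```

Skipping the Refiner and decoding the Base output directly is also possible, which is why the Refiner is described as optional.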
SDXL-Turbo
(by Stability AI)
- Difference from SDXL 1.0:
  - Single U-Net model (no separate Refiner).
  - Optimized for speed and cost-efficiency.
  - Slightly lower quality than SDXL 1.0 but still superior to v1.4/2.1.
  - Distilled from SDXL 1.0 using Adversarial Diffusion Distillation (ADD).
  - Optimized around 512x512.
- Hugging Face: stabilityai/sdxl-turbo
- Project Page
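Because the model is distilled for few-step sampling, a typical call looks different from SDXL 1.0. The sketch below assumes the repository ID `stabilityai/sdxl-turbo`, a single inference step, and classifier-free guidance turned off (`guidance_scale=0.0`); none of these settings appear in this note, and both helper names are hypothetical.

```python
TURBO_ID = "stabilityai/sdxl-turbo"  # assumed repo ID


def turbo_call_kwargs(prompt, steps=1):
    """Build call arguments for few-step sampling at 512x512 without guidance."""
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": 0.0,  # assumption: the distilled model is run without CFG
        "height": 512,
        "width": 512,
    }


def generate_turbo(prompt, device="cuda"):
    """Generate one image in a single step (downloads weights on first use)."""
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        TURBO_ID, torch_dtype=torch.float16, variant="fp16"
    ).to(device)
    return pipe(**turbo_call_kwargs(prompt)).images[0]
```

Raising `steps` to a small number such as 2 to 4 is a reasonable experiment if single-step output looks too rough.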
Stable Diffusion 3.x (3.0, 3.5)
(by Stability AI)
- Architecture: Multimodal Diffusion Transformer (MMDiT) in place of the U-Net, trained with a flow-matching (rectified flow) objective.
- Strengths reported: Better multi-subject composition, improved spelling/text, and overall prompt adherence.
- Hugging Face:
- Project Page
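The flow-matching idea can be illustrated with a toy example in plain Python. This is a sketch of the generic straight-line (rectified flow) formulation, not the SD3 training code: it works on a 3-element list instead of image latents, and it uses an "oracle" velocity in place of a trained network. All function names are hypothetical.

```python
import random


def interpolate(x0, noise, t):
    """Point on the straight path: x_t = (1 - t) * x0 + t * noise."""
    return [(1 - t) * a + t * e for a, e in zip(x0, noise)]


def target_velocity(x0, noise):
    """Velocity along that path, d x_t / dt = noise - x0 (the regression target)."""
    return [e - a for a, e in zip(x0, noise)]


def euler_sample(x_start, velocity_fn, num_steps):
    """Integrate dx/dt = v(x, t) from t = 1 (pure noise) down to t = 0 (data)."""
    x = list(x_start)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = 1.0 - i * dt
        v = velocity_fn(x, t)
        x = [xi - dt * vi for xi, vi in zip(x, v)]
    return x


random.seed(0)
x0 = [0.5, -1.0, 2.0]                      # stand-in for a data sample
noise = [random.gauss(0, 1) for _ in x0]   # stand-in for Gaussian noise

# A trained model would predict the velocity; here the exact target is used.
oracle = lambda x, t: target_velocity(x0, noise)
recovered = euler_sample(noise, oracle, num_steps=4)
```

With the exact velocity, a few Euler steps carry the noise sample back to `x0` up to floating-point error. In a real model the network only approximates this velocity, so the number of steps and the solver affect quality.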