Download Z-Image Turbo: Open-Source AI Model

Z-Image Turbo is the first released model in the Z-Image series, designed to generate high-quality images in very few diffusion steps (8 internal steps). It is based on a Single-Stream Diffusion Transformer optimized with distillation techniques such as Decoupled-DMD and DMDR. At the time of writing, Z-Image Turbo is the only model in the series available for download, either manually or through compatible tools and frontends.


Download Z-Image Turbo

OUR RECOMMENDATION

The "official" download and usage method described in this article is not user-friendly, so we do not recommend it for beginners or non-technical users. If you prefer a simpler, guided setup, we strongly recommend following our ComfyUI installation guide instead.

The authors provide the model via Hugging Face using the hf command-line tool from the huggingface_hub package.

1. Install huggingface_hub

pip install -U huggingface_hub

2. Download the model

HF_XET_HIGH_PERFORMANCE=1 hf download Tongyi-MAI/Z-Image-Turbo

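This command downloads the official Tongyi-MAI/Z-Image-Turbo model files from Hugging Face into the local Hugging Face cache (by default under ~/.cache/huggingface/hub). To store the files elsewhere, add the --local-dir option followed by a target folder. The HF_XET_HIGH_PERFORMANCE=1 prefix asks the Xet transfer backend to use more network and disk resources for a faster download.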
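The same download can also be triggered from Python. The following is a minimal sketch, assuming the snapshot_download function from huggingface_hub; the helper name download_z_image_turbo and its arguments are illustrative and are not part of the authors' instructions.

```python
import os

REPO_ID = "Tongyi-MAI/Z-Image-Turbo"


def download_z_image_turbo(local_dir=None, high_performance=True):
    """Download the model snapshot and return the local folder path.

    Hypothetical helper: the name and arguments are chosen for illustration.
    """
    if high_performance:
        # Mirrors the HF_XET_HIGH_PERFORMANCE=1 prefix of the CLI command.
        # It is set before the download starts and never overrides a value
        # the user has already exported.
        os.environ.setdefault("HF_XET_HIGH_PERFORMANCE", "1")

    # Imported lazily so this file can be loaded without the package installed.
    from huggingface_hub import snapshot_download

    # With local_dir=None the files go to the default Hugging Face cache.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Example call (downloads several gigabytes, so it is left commented out):
# model_path = download_z_image_turbo(local_dir="models/z-image-turbo")
```

The function returns the path of the downloaded snapshot. That path can be passed to from_pretrained in place of the repository name when working offline.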

Install Diffusers (required for manual Python usage)

The authors recommend installing the latest version of diffusers from the official repository to ensure full Z-Image support.

pip install git+https://github.com/huggingface/diffusers
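After installing, it can be worth checking that the installed diffusers build actually exposes the pipeline class used later in this article. A minimal sketch follows; the helper name has_zimage_pipeline is hypothetical.

```python
def has_zimage_pipeline():
    """Return True if the installed diffusers build exposes ZImagePipeline."""
    try:
        from diffusers import ZImagePipeline  # noqa: F401
    except ImportError:
        # Either diffusers is not installed, or the installed version
        # does not yet include Z-Image support.
        return False
    return True


# Example call:
# print("ZImagePipeline available:", has_zimage_pipeline())
```

If the check returns False, reinstalling diffusers from the official repository as shown above is the first thing to try.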

Native PyTorch usage (optional)

The official repository provides a native PyTorch inference workflow via an example script. Both commands below must be run from the root of a local clone of that repository.

1. Install repository dependencies

pip install -e .

2. Run native inference

python inference.py

This method uses the native inference flow defined in the repository. Configuration details are handled inside inference.py as provided by the authors.

Using Z-Image with Diffusers (advanced mode)

Once diffusers is installed and the model is downloaded, you can generate images in Python using the ZImagePipeline.

import torch
from diffusers import ZImagePipeline

pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
pipe.to("cuda")

prompt = "Young Chinese woman in red Hanfu, intricate embroidery, elaborate hairstyle, night scene with soft lighting."

image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,
    guidance_scale=0.0,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("example.png")

Key points when using Diffusers

guidance_scale = 0.0 is required for Turbo models.
num_inference_steps = 9 results in 8 internal DiT steps.
torch_dtype=torch.bfloat16 is recommended on supported GPUs.
Flash Attention, model compilation, and CPU offloading are optional optimizations.
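The key points above can be collected into two small helpers. This is a sketch under stated assumptions: both function names are hypothetical, the "N scheduler steps give N - 1 DiT steps" rule is generalized from the single 9-to-8 case stated above, and the optimization calls (set_attention_backend, compile, enable_model_cpu_offload) assume a recent diffusers build that exposes them on the pipeline and its transformer.

```python
def turbo_call_kwargs(prompt, height=1024, width=1024, dit_steps=8):
    """Build keyword arguments for a ZImagePipeline call with a Turbo model."""
    return {
        "prompt": prompt,
        "height": height,
        "width": width,
        # Assumption: one more scheduler step than DiT steps (9 -> 8).
        "num_inference_steps": dit_steps + 1,
        # Required for Turbo models according to the key points above.
        "guidance_scale": 0.0,
    }


def apply_optional_optimizations(
    pipe, flash_attention=False, compile_model=False, cpu_offload=False
):
    """Apply the optional speed and memory settings to a loaded pipeline."""
    if flash_attention:
        # Needs a compatible GPU and the flash-attn package installed.
        pipe.transformer.set_attention_backend("flash")
    if compile_model:
        # Speeds up later runs; the first run is slower while compiling.
        pipe.transformer.compile()
    if cpu_offload:
        # For memory-constrained GPUs; replaces the explicit move to CUDA.
        pipe.enable_model_cpu_offload()
    else:
        pipe.to("cuda")
    return pipe


# Example use with an already loaded pipeline (left commented out):
# pipe = apply_optional_optimizations(pipe, cpu_offload=True)
# image = pipe(**turbo_call_kwargs("a lighthouse at dusk")).images[0]
```

All three optimizations default to off, so the helper without flags reproduces the plain pipe.to("cuda") setup.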

How Z-Image Turbo works (author overview)

Decoupled-DMD: the acceleration core

Decoupled-DMD is the few-step distillation algorithm behind Z-Image's 8-step model. The authors identify two independent mechanisms:

CFG Augmentation (CA): the main distillation driver.
Distribution Matching (DM): acts as a stabilizing regularizer.
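One way to read this split is as an algebraic identity. The sketch below assumes the standard classifier-free guidance formula with guidance weight $\alpha$; the symbols $s_{\text{real}}$ (teacher score) and $s_{\text{fake}}$ (score of the student's output distribution) are notation chosen here for illustration and do not come from this article.

```latex
\begin{align}
s_{\text{real}}^{\text{cfg}}
  &= s_{\text{real}}^{\text{uncond}}
   + \alpha \left( s_{\text{real}}^{\text{cond}} - s_{\text{real}}^{\text{uncond}} \right) \\
s_{\text{real}}^{\text{cfg}} - s_{\text{fake}}^{\text{cond}}
  &= \underbrace{\left( s_{\text{real}}^{\text{cond}} - s_{\text{fake}}^{\text{cond}} \right)}_{\text{Distribution Matching (DM)}}
   + \underbrace{(\alpha - 1) \left( s_{\text{real}}^{\text{cond}} - s_{\text{real}}^{\text{uncond}} \right)}_{\text{CFG Augmentation (CA)}}
\end{align}
```

With $\alpha = 1$ the CA term vanishes and only distribution matching remains. How the authors weight or schedule the two terms during training is not covered here.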

DMDR: combining DMD with reinforcement learning

Building on Decoupled-DMD, DMDR integrates Reinforcement Learning to improve semantic alignment, aesthetic quality, and structural coherence while preserving high-frequency details.

Citation

@article{team2025zimage,
  title={Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer},
  author={Z-Image Team},
  journal={arXiv preprint arXiv:2511.22699},
  year={2025}
}