Tags: fast, lightning, qwen-image, qwen-image-edit, comparison, text-rendering, editing, outpaint

Fast Inference with Lightning LoRAs

Use --fast to generate images 10x faster with Lightning distillation LoRAs. Side-by-side comparisons of text rendering, editing, and outpainting quality at 4, 8, and 50 steps.

Mar 24, 2026 · 8 min read

The --fast flag applies a Lightning distillation LoRA that cuts inference from 40-50 steps to just 4 or 8, roughly a 10x speedup at 4 steps with minimal quality loss. This guide shows exactly what you gain and what you trade off.

Prerequisites

Install the Lightning LoRA alongside your base model:

$ modl pull qwen-image # 19 GB (fp8)
$ modl pull qwen-image-2512-lightning # ~850 MB LoRA
 
# For editing
$ modl pull qwen-image-edit # 21 GB (GGUF Q8)
$ modl pull qwen-image-edit-lightning

How --fast works

Lightning distillation — first popularized with SDXL-Lightning — trains a LoRA that teaches the model to produce good output in far fewer denoising steps. The --fast flag handles the details: it loads the Lightning LoRA, sets guidance to 1.0 (required by the distillation), and adjusts the noise schedule.

# 4-step (fastest, ~10x speedup)
$ modl generate "a sunset over the ocean" --base qwen-image --fast
 
# 8-step (higher quality, ~6x speedup)
$ modl generate "a sunset over the ocean" --base qwen-image --fast 8
 
# Works with edit too
$ modl edit "make the sky orange" --image photo.png --fast

The only supported values are --fast (4 steps) and --fast 8 (8 steps) — these match the distillation training. Other step counts won’t produce good results. You can’t combine --fast with --lora since the Lightning LoRA occupies that slot. If you try, modl will error with a clear message.
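The rules above can be sketched as a small pre-flight check for wrapper scripts. This is a minimal sketch, not part of modl: the function name check_fast_args is hypothetical, and it assumes that only bare --fast and --fast 8 are valid, since those are the only forms this guide documents.

```shell
# Hypothetical pre-flight check (not part of modl): mirrors the rules
# described above so a wrapper can reject bad flag combinations early.
check_fast_args() {
  local fast=0 lora=0 steps=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --fast)
        fast=1
        # An optional numeric value may follow --fast (e.g. "--fast 8").
        case "${2:-}" in
          ''|*[!0-9]*) ;;              # no value, or next argument is not a number
          *) steps="$2"; shift ;;      # consume the step count
        esac
        ;;
      --lora)
        lora=1
        ;;
    esac
    shift
  done

  if [ "$fast" -eq 1 ] && [ "$lora" -eq 1 ]; then
    echo "error: --fast cannot be combined with --lora" >&2
    return 1
  fi
  # Assumption: only bare --fast (4 steps) and --fast 8 are documented,
  # so any other explicit value is rejected here.
  if [ -n "$steps" ] && [ "$steps" != "8" ]; then
    echo "error: --fast supports only 4 steps (no value) or 8 steps" >&2
    return 1
  fi
  return 0
}
```

A wrapper could call check_fast_args "$@" and only hand the arguments to modl when it returns 0. modl performs its own validation, so this is only useful for failing early in batch scripts.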

Tip:

Guidance is locked to 1.0 in --fast mode: the distillation was trained this way, and higher values produce artifacts. If you need to tweak guidance, use normal mode instead.

Speed and VRAM

All timings measured on an RTX 4090 (24 GB) at 1024x1024 resolution with the model already loaded (warm start). First-run times include model loading and will be significantly longer.

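| Model | Mode | Steps | ~Time (RTX 4090) | VRAM |
|---|---|---|---|---|
| qwen-image | normal | 50 | ~120s | ~20 GB |
| qwen-image | --fast 8 | 8 | ~15s | ~21 GB |
| qwen-image | --fast | 4 | ~8s | ~21 GB |
| qwen-image-edit | normal | 40 | ~100s | ~20 GB |
| qwen-image-edit | --fast 8 | 8 | ~15s | ~21 GB |
| qwen-image-edit | --fast | 4 | ~10s | ~21 GB |
| flux2-klein-9b | default | 4 | ~12s | ~18 GB |
| z-image-turbo | default | 4 | ~6s | ~12 GB |

Tip: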

VRAM is slightly higher with --fast because the Lightning LoRA (~850 MB) is loaded alongside the base model weights.
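The "~10x" and "~6x" labels used in this guide are round figures. As a quick sanity check, the speedup implied by the approximate warm-start timings in the table can be computed directly. The helper name speedup is hypothetical, and the inputs are the table's rounded values, so treat the results as rough.

```shell
# Speedup = normal-mode time / fast-mode time, using the approximate
# warm-start timings from the table (seconds, RTX 4090).
speedup() {
  awk -v normal="$1" -v fast="$2" 'BEGIN { printf "%.1fx\n", normal / fast }'
}

speedup 120 8    # qwen-image, --fast (4 steps)
speedup 120 15   # qwen-image, --fast 8
speedup 100 10   # qwen-image-edit, --fast (4 steps)
speedup 100 15   # qwen-image-edit, --fast 8
```

With the table's numbers this prints 15.0x, 8.0x, 10.0x, and 6.7x, so the real gain depends on the model and mode rather than being a fixed 10x.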

Why Klein 9B and Z-Image Turbo don’t need --fast

Not all fast models work the same way. Qwen-Image is a non-distilled base model designed for around 50 steps; --fast bolts on a Lightning LoRA to make it work at 4 or 8. Klein 9B and Z-Image Turbo were distilled at training time, so speed is baked into the weights and no LoRA is needed. They are also lighter on VRAM, in part because there is no extra LoRA to load.

The trade-off: Qwen with --fast gives you Klein/Turbo-like speeds while keeping Qwen’s quality advantages — especially text rendering, which neither Klein nor Z-Image Turbo can match reliably.

Text rendering: normal vs --fast

Qwen-Image 2512 is among the strongest open models for text rendering. All comparisons below use the same seed per prompt and are generated at 1024x1024.

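# 4 steps
$ modl generate "A coffee shop chalkboard sign reading ..." --base qwen-image --fast
# 8 steps
$ modl generate "A coffee shop chalkboard sign reading ..." --base qwen-image --fast 8
# normal (50 steps)
$ modl generate "A coffee shop chalkboard sign reading ..." --base qwen-image --steps 50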
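To repeat this three-way comparison for several prompts, the commands can be wrapped in a small script. The sketch below is illustrative only: DRY_RUN, run, and compare_modes are hypothetical names belonging to this script, not modl features. It uses only the flags shown in this guide. The guide's comparisons fix the seed per prompt, but the option for that is not shown here, so the sketch leaves it out.

```shell
# Sketch: print (or run) the same prompt in all three modes.
# DRY_RUN is a variable of this script, not a modl option.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"    # show the command instead of executing it
  else
    "$@"                  # execute with the original argument quoting
  fi
}

compare_modes() {
  local prompt="$1"
  run modl generate "$prompt" --base qwen-image --fast        # 4 steps
  run modl generate "$prompt" --base qwen-image --fast 8      # 8 steps
  run modl generate "$prompt" --base qwen-image --steps 50    # normal
}

compare_modes "Colorful street graffiti on a brick wall that reads 'DREAM BIG' in bold bubble letters"
```

By default the script only prints the three commands (the printed form drops the quotes around the prompt). Setting DRY_RUN=0 in the environment makes it call modl for real.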

Prompt 1: Coffee shop chalkboard

“A coffee shop chalkboard sign reading ‘Today’s Special: Oat Milk Latte $4.50’”

[Images: chalkboard at 50 steps (normal), 8 steps (--fast 8), and 4 steps (--fast)]

Text rendering stays legible down to 4 steps. Some fine detail softens.

Prompt 2: Graffiti wall

“Colorful street graffiti on a brick wall that reads ‘DREAM BIG’ in bold bubble letters”

[Images: graffiti at 50 steps (normal), 8 steps (--fast 8), and 4 steps (--fast)]

Bold text like graffiti holds up well even at 4 steps.

Prompt 3: Neon sign

“A neon sign at night reading ‘OPEN 24/7’ in glowing pink and blue”

[Images: neon sign at 50 steps (normal), 8 steps (--fast 8), and 4 steps (--fast)]

Neon glow effects are well-preserved. Character shapes stay accurate.

Prompt 4: Movie poster

“A vintage movie poster with the title ‘THE LAST VOYAGE’ and subtitle ‘Coming Summer 2026’”

[Images: movie poster at 50 steps (normal), 8 steps (--fast 8), and 4 steps (--fast)]

Dense text with multiple lines. Smaller subtitle text may degrade at 4 steps — use --fast 8 for text-heavy prompts.

Comparison: Z-Image Turbo

The same prompts on Z-Image Turbo — natively fast at 4 steps, no Lightning LoRA needed, and lighter on VRAM (~12 GB). Text rendering is less reliable than Qwen but the image quality is strong.

$ modl generate "A coffee shop chalkboard sign..." --base z-image-turbo

[Images: Z-Image Turbo results for the chalkboard, graffiti, neon, and poster prompts]

Z-Image Turbo: fast and lightweight but less accurate text. Use Qwen-Image when legible text matters.

General quality: across models

How does Qwen + --fast compare to natively-fast models for non-text prompts? All images at 1024x1024, same seed per prompt.

$ modl generate "Professional headshot..." --base qwen-image --fast
$ modl generate "Professional headshot..." --base flux2-klein-9b
$ modl generate "Professional headshot..." --base z-image-turbo

Portrait

“Professional headshot of a woman with short hair, studio lighting, neutral background”

[Images: portrait from Qwen at 50 steps (normal), Qwen --fast (4 steps), Klein 9B (4 steps), and Z-Image Turbo (4 steps)]

All models produce high-quality portraits at 4 steps. Style differences are model-specific, not speed-related.

Product shot

“A luxury watch on a marble surface, dramatic lighting”

[Images: watch from Qwen at 50 steps (normal), Qwen --fast (4 steps), Klein 9B (4 steps), and Z-Image Turbo (4 steps)]

For product photography without text, all fast models deliver strong results.

Editing: fast vs normal

--fast works with modl edit too. Same Lightning approach — dramatically faster with minimal quality loss.

$ modl edit "replace the sneaker with vintage leather boots" \
--image sneaker.png --fast
 
$ modl edit "transform into a watercolor painting" \
--image cafe.png --fast 8

Object replacement

“Replace the sneaker with vintage leather boots”

[Images: source sneaker, edit at 40 steps (normal), edit with --fast 8, edit with --fast (4 steps), and Klein 9B (4 steps)]

Object replacement: red sneaker → leather boots. Faithful edits at all speeds.

Style transfer

“Transform into a watercolor painting”

[Images: source cafe photo, edit at 40 steps (normal), edit with --fast 8, edit with --fast (4 steps), and Klein 9B (4 steps)]

Style transfer: photo → watercolor. The effect is clearly applied at all speeds.

Outpainting with --fast

Extend an image canvas using modl edit --size. The edit model fills in the extra space coherently.

$ modl edit "extend the scene with more outdoor seating" \
  --image cafe.png --size "16:9" --fast

[Images: source cafe photo, outpaint at 40 steps (normal), outpaint with --fast 8, outpaint with --fast (4 steps), and Klein 9B (4 steps)]

Outpainting: extending a cafe scene to 16:9 widescreen. All modes produce coherent extensions.

Where --fast struggles

The distillation trades detail for speed. Prompts that work well at 50 steps can degrade at 4:

  • Dense small text — subtitle lines, fine print, multi-paragraph text. Use --fast 8 instead.
  • Hair-like fine textures — fur, beard stubble, intricate fabric weave. May appear slightly softer or oversharpened.
  • Highly complex compositions — many subjects with detailed interactions. The model has fewer steps to resolve spatial relationships.

For these cases, --fast 8 recovers most of the detail. If quality is critical, use normal mode.

When to use what

| Mode | Best for | Trade-off |
|---|---|---|
| --fast | Prompt iteration, batch gen, style exploration | Fine detail softens, ~10x faster |
| --fast 8 | Text-heavy output, final renders at speed | Minimal loss, ~6x faster |
| Normal (no flag) | Hero images, dense text, custom LoRAs | Full quality, full time |

# Batch generation with --fast
$ modl generate "product photo of a red sneaker" \
--base qwen-image --fast --count 8

Supported models:

--fast currently supports qwen-image and qwen-image-edit. Other models like Z-Image Turbo and Klein 9B are already distilled at training time, so they run at 4 steps by default with no LoRA needed.
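For scripted pipelines, the "When to use what" guidance can be encoded as a lookup. This is a minimal sketch: the function name fast_flags_for and the use-case labels are hypothetical shorthand for the rows of the table, not anything modl defines.

```shell
# Hypothetical helper (not part of modl): map a use case to the flag
# recommended in the "When to use what" table.
fast_flags_for() {
  case "$1" in
    draft|batch|explore)
      printf '%s\n' "--fast" ;;      # prompt iteration, batch gen, style exploration
    text|final-fast)
      printf '%s\n' "--fast 8" ;;    # text-heavy output, final renders at speed
    hero|dense-text|custom-lora)
      printf '\n' ;;                 # normal mode: no flag at all
    *)
      echo "unknown use case: $1" >&2
      return 1 ;;
  esac
}

# Example (left commented out so the sketch runs without modl installed):
# modl generate "product photo of a red sneaker" --base qwen-image $(fast_flags_for draft)
```

The command substitution in the example is deliberately unquoted so that "--fast 8" splits into two arguments and the empty normal-mode result adds nothing.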

Explore from here

  1. Which model should I use? — Full comparison of all supported models
  2. Getting started — Install modl and generate your first image
  3. Structural editing — Advanced editing with ControlNet