← All Guides

Capabilities Reference

What can each model do? A task-oriented guide mapping every modl capability to the models that support it, with recommended picks and CLI commands.

Mar 19, 2026 · 6 min read
Capabilities overview — portrait, landscape, abstract, architecture, product photography

You know what you want to do but not which model to use. This guide is organized by task — find your task, see your options, pick a model.

For side-by-side visual comparisons, see the model comparison guide.

Text to image

All models support this. The question is which one to pick.

| Pick | Model | Why |
| --- | --- | --- |
| Fast iteration | flux-schnell | 4 steps, good quality, default model |
| Best quality | qwen-image | 20B params, best text rendering |
| Best open license | chroma | Apache 2.0, negative prompts, uncensored |
| Low VRAM | sd-1.5 | Runs on 4GB GPUs |
| Balanced | flux-dev | 28 steps, strong prompt following |
| Small + fast | flux2-klein-4b | 4B params, 4 steps, fits anywhere |
$ modl generate "a photo of a mountain lake" --base flux-schnell
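Picking a different model from the table should just be a matter of swapping `--base`, assuming the same flag syntax as the example above (prompts here are placeholders):

```shell
# Highest quality and best text rendering, at the cost of speed
$ modl generate "a photo of a mountain lake" --base qwen-image

# Smallest footprint for low-VRAM GPUs
$ modl generate "a photo of a mountain lake" --base sd-1.5
```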

Image to image (img2img)

Re-style an existing image. Lower --strength = closer to original.

Supported: flux-dev, flux-schnell, chroma, z-image, z-image-turbo, sdxl, sd-1.5

Not supported: qwen-image, qwen-image-edit, flux2 family (use modl edit instead)

$ modl generate "watercolor painting" --init-image photo.png --strength 0.6
Tip: For Flux 2 Klein models, use modl edit instead — they support instruction-based editing which is more flexible than img2img.
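To see how `--strength` behaves at the extremes, a sketch (the values are illustrative, not tuned recommendations):

```shell
# Subtle re-style — output stays close to the original photo
$ modl generate "watercolor painting" --init-image photo.png --strength 0.3

# Aggressive re-style — output mostly follows the prompt
$ modl generate "watercolor painting" --init-image photo.png --strength 0.9
```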

Inpainting

Regenerate a specific region of an image using a mask (white = edit, black = keep).

modl supports two inpainting methods: standard (diffusers pipeline or Flux Fill) and LanPaint (training-free, works with any supported model). The --inpaint flag controls which method to use — auto (default) picks the best one for your model.

| Pick | Model | Method | Why |
| --- | --- | --- | --- |
| Best quality | flux-fill-dev-onereward | Standard | RLHF-tuned, no boundary artifacts |
| Best quality (alt) | flux-fill-dev | Standard | Dedicated 384-ch inpainting model |
| Good default | flux-dev | Standard | Native inpainting, auto-routes to Fill if installed |
| Best LanPaint | z-image | LanPaint | Best quality with training-free inpainting |
| Fast | z-image-turbo | Standard/LanPaint | 8 steps, supports both methods |
| Edit models | flux2-klein-9b | LanPaint | No standard inpaint — LanPaint auto-selected |
| Low VRAM | sdxl | Standard | Native inpainting, large LoRA ecosystem |
# Create a mask from a bounding box
$ modl process segment photo.png --method bbox --bbox 100,200,400,500
 
# Inpaint the masked region (auto-selects best method)
$ modl generate "a garden with roses" --init-image photo.png --mask photo_mask.png
 
# Force LanPaint on a model that supports both
$ modl generate "a garden with roses" --base z-image --init-image photo.png --mask photo_mask.png --inpaint lanpaint
 
# Klein 9b auto-routes to LanPaint (no standard inpaint available)
$ modl generate "a garden with roses" --base flux2-klein-9b --init-image photo.png --mask photo_mask.png
Tip: modl auto-routes intelligently: Flux 1 models route to Flux Fill when installed, Klein models auto-select LanPaint, and Z-Image uses standard inpainting by default (force LanPaint with --inpaint lanpaint).

Creating masks

Four ways to create masks for inpainting:

| Method | Command | Use case |
| --- | --- | --- |
| Bounding box | `modl process segment --method bbox --bbox x1,y1,x2,y2` | Quick rectangular mask |
| SAM (point/box) | `modl process segment --method sam --point x,y` | Precise edges around objects |
| Background | `modl process segment --method background` | Mask everything except the subject |
| Ground + segment | `modl vision ground "cup" photo.png` then `modl process segment` | Find object by name, then mask it |
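Following the bbox example in the inpainting section, the other segment methods should look like this (`photo.png` and the point coordinates are placeholders; the exact mask filename produced may differ):

```shell
# Precise object mask from a single click point (SAM)
$ modl process segment photo.png --method sam --point 250,300

# Mask everything except the main subject
$ modl process segment photo.png --method background
```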

Instruction-based editing

Tell the model what to change in natural language — no mask needed. Klein models also accept multiple images, so you can pass a reference image alongside the source.

| Pick | Model | Why |
| --- | --- | --- |
| Best quality | qwen-image-edit | 20B params, best text editing, style transfer |
| Balanced | flux2-klein-9b | 9B, good quality, 4 steps, multi-image reference |
| Fast / low VRAM | flux2-klein-4b | 4B, fits on consumer GPUs, 4 steps, multi-image reference |
# Text instruction edit
$ modl edit "make the sky sunset orange" --image photo.png
$ modl edit "replace the chair with a sofa" --image room.png --base flux2-klein-4b
 
# Reference-based edit — pass a second image as reference
$ modl edit "replace her jacket with the jacket in the second image" \
--image photo.png --image jacket-ref.png --base flux2-klein-9b
Tip: modl edit is different from inpainting. Edit models understand instructions (“add sunglasses”, “change the color”) without needing a mask. Klein models accept multiple --image flags — pass a reference image as the second image for reference-based edits like clothing swaps or style transfer.

LoRA training

Fine-tune a model on your images to learn a character, style, or object.

| Pick | Model | Why |
| --- | --- | --- |
| Best default | flux-dev | 12B, quantized to ~12GB VRAM, great results |
| Fastest training | z-image-turbo | 6B params, ~1.3s/step on 5090 |
| Best for style | qwen-image | 20B, fits 24GB with 3-bit quantization |
| Low VRAM | sdxl | ~10GB, mature ecosystem |
| New gen | flux2-klein-4b | 4B, very fast to train, new architecture |

Not trainable: qwen-image-edit, flux-fill models (inference-only)

$ modl train --base flux-dev --lora-type character --dataset my-photos
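For a style LoRA on the fastest-training model, the same command pattern applies (the dataset name is a placeholder):

```shell
# Style LoRA on z-image-turbo — fastest training per the table above
$ modl train --base z-image-turbo --lora-type style --dataset my-paintings
```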

ControlNet

Structural guidance — maintain the pose, edges, or depth of a reference image.

| Model | Supported types |
| --- | --- |
| sdxl | canny, depth, pose, softedge, tile, scribble, hed, mlsd, normal |
| z-image-turbo | canny, hed, depth, pose, mlsd, scribble, gray |
| z-image | canny, hed, depth, pose, mlsd, scribble, gray |
| flux-dev | canny, depth, pose, softedge, gray |
| flux-schnell | canny, depth, pose, softedge, gray |
| qwen-image | canny, depth, pose, softedge |

Not supported: flux2 family, chroma, sd-1.5, qwen-image-edit

# Extract edges, then generate with structural guidance
$ modl process preprocess canny photo.png
$ modl generate "anime style" --base flux-dev --controlnet photo_canny.png
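Assuming `modl process preprocess` accepts the other type names from the table, and that the output filename follows the `photo_canny.png` pattern above, a depth-guided generation would look like:

```shell
# Extract a depth map, then generate with depth guidance
$ modl process preprocess depth photo.png
$ modl generate "a futuristic city street" --base sdxl --controlnet photo_depth.png
```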

Style reference

Use a reference image to guide the visual style of a generation.

| Model | Mechanism | Notes |
| --- | --- | --- |
| flux2-klein-4b | Multi-image edit | Pass reference as second `--image` via `modl edit` |
| flux2-klein-9b | Multi-image edit | Pass reference as second `--image` via `modl edit` |
| flux-dev | IP-Adapter | `--style-ref` on `modl generate`, requires flux-dev-ip-adapter |
| sdxl | IP-Adapter | `--style-ref` on `modl generate`, supports style/face/content types |
# Klein: reference-based style transfer via edit
$ modl edit "transform this photo into the style of the second image" \
--image photo.png --image style-ref.png --base flux2-klein-9b
 
# Flux Dev / SDXL: style-ref flag on generate
$ modl generate "a castle" --base flux-dev --style-ref painting.png
Tip: For models without style-ref or multi-image edit, train a style LoRA instead: modl train --base z-image-turbo --lora-type style --dataset my-paintings

Text rendering

Only qwen-image and qwen-image-edit can render legible text in images. All other models struggle with text.

$ modl generate "a coffee shop sign that says OPEN" --base qwen-image
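Since qwen-image-edit leads on text editing, existing text in an image can presumably be changed via `modl edit` (the instruction wording and filename here are illustrative):

```shell
# Edit existing text in an image with the text-capable edit model
$ modl edit "change the sign to say CLOSED" --image sign.png --base qwen-image-edit
```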

Quick decision tree

  1. “I just want to generate images fast” → flux-schnell
  2. “I need the best quality” → qwen-image (or flux-dev if you need inpainting/ControlNet)
  3. “I want to edit an existing image” → modl edit with qwen-image-edit
  4. “I want to inpaint a region” → flux-fill-dev-onereward (best quality) or any model with --mask (Klein models auto-route to LanPaint)
  5. “I want to train a LoRA” → flux-dev (best default) or z-image-turbo (fastest)
  6. “I need ControlNet” → sdxl (most types) or flux-dev (best quality)
  7. “I need text in the image” → qwen-image
  8. “I have a low VRAM GPU” → flux2-klein-4b (10GB) or sdxl (5GB fp8)