quickstart · install · generate

Getting Started with modl

Install modl, pull your first model, and generate an image — all in under 5 minutes. Then explore training, the web UI, and what to try next.

Mar 13, 2026 · 5 min read

Install modl

One command, no dependencies. Works on Linux and macOS with an NVIDIA GPU.

$ curl -fsSL https://modl.run/install.sh | sh
✓ Installed modl to ~/.modl/bin/modl
✓ Added to PATH

Verify it’s working:

$ modl --version
modl x.y.z

modl is a single Rust binary. It manages its own Python runtime for GPU work — you don’t need to install Python, create virtual environments, or touch pip.

Pull a model

Before you can generate images, you need a model. Let’s start with Z-Image Turbo — it’s fast (4 steps), high quality, and small enough for most GPUs.

$ modl pull z-image-turbo
▸ Downloading z-image-turbo (6.3 GB)...
▸ Downloading z-image-vae (335 MB)...
▸ Downloading z-image-text-encoder (9.8 GB)...
✓ Installed z-image-turbo + 2 dependencies

modl automatically downloads the model’s dependencies (VAE, text encoders) and stores everything in a content-addressed store at ~/.modl/store/.
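modl doesn't publish its exact store layout, but "content-addressed" generally means files are named by a hash of their bytes, so two models sharing the same VAE or text encoder never store it twice. A minimal sketch in Python — the SHA-256 choice and the git-style sharded layout are illustrative assumptions, not modl's documented scheme:

```python
import hashlib
from pathlib import Path

def store_path(data: bytes, store: Path = Path.home() / ".modl" / "store") -> Path:
    """Derive a content-addressed path: the file is named by the SHA-256 of its bytes."""
    digest = hashlib.sha256(data).hexdigest()
    # Shard by the first two hex chars so no single directory grows huge.
    return store / digest[:2] / digest[2:]

# Identical bytes always map to the same path, which is what makes dedup free:
# store_path(b"shared vae weights") == store_path(b"shared vae weights")
```

The practical consequence: pulling a second model that reuses a dependency costs no extra disk.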

Tip:

Don’t have much VRAM? modl auto-selects fp8 or fp16 variants based on your GPU. A 12GB card works fine for most models.

Other models to try

Model           Speed      Quality     VRAM
z-image-turbo   4 steps    Great       ~12 GB
z-image         30 steps   Excellent   ~12 GB
flux-schnell    4 steps    Great       ~12 GB
flux-dev        28 steps   Excellent   ~16 GB

You can see all available models with modl list --remote or at modl.run/models. For a full comparison of all 16 models — capabilities, sizes, and which to use when — see Which Model Should I Use?.

Generate your first image

$ modl generate "a plate of spaghetti, app icon, flat design"
▸ Loading z-image-turbo...
▸ Generating ████████████████ 4/4 steps
✓ Generated 1 image:
~/.modl/outputs/2026-03-13/001.png
[Image: first generated image — a plate of spaghetti in flat design app icon style]

Your first modl-generated image. No config files, no Python environments, no node graphs.

That’s it. One command, one image. modl picked the model (last pulled), the size (1024x1024), the steps (4 for turbo), and the seed (random).

Customize the output

Override any default with flags:

# Different size
$ modl generate "a mountain landscape at sunset" --size 16:9
 
# Specific model
$ modl generate "a portrait photo" --base flux-dev --steps 28
 
# Reproducible (same seed = same image)
$ modl generate "a cat astronaut" --seed 42
 
# Batch generate
$ modl generate "product on marble" --count 4
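If you repeat the same flag combinations often, a thin wrapper that assembles the command is handy. This helper is hypothetical — it is not part of modl and only emits the flags shown above:

```python
import subprocess

def generate(prompt, size=None, base=None, steps=None, seed=None, count=None):
    """Build a `modl generate` argv list; None means 'let modl pick its default'."""
    cmd = ["modl", "generate", prompt]
    for flag, value in (("--size", size), ("--base", base),
                        ("--steps", steps), ("--seed", seed), ("--count", count)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd

# Reproducible render, assuming modl is on PATH:
# subprocess.run(generate("a cat astronaut", seed=42), check=True)
```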

Launch the web UI

Everything modl does from the CLI, you can also do from a browser.

$ modl serve
▸ Server running at http://localhost:3333
[Screenshot: modl web UI — generate tab with model selection, prompt input, and generated image]

The modl web UI. Generate, browse outputs, manage models, train LoRAs — all from your browser.

The web UI runs locally on your machine. Your images, your GPU, your data.

Train a LoRA

This is where modl gets interesting. A LoRA teaches a model to render a specific subject (your face, your dog, your product) or style (watercolor, pixel art, anime).

# 1. Create a dataset from your photos
$ modl dataset create my-subject --from ~/photos/
✓ 20 images → ~/.modl/datasets/my-subject/
 
# 2. Auto-caption the images
$ modl dataset caption my-subject
✓ 20/20 captions written
 
# 3. Train
$ modl train --dataset my-subject --base z-image --name my-subject-v1
▸ Images: 20 → Steps: 1500 · Rank: 32 · LR: 1e-4
▸ Trigger word: OHWX
✓ Saved my-subject-v1.safetensors
 
# 4. Generate with your LoRA
$ modl generate "OHWX on a beach at sunset" --lora my-subject-v1

modl handles the Python runtime, ai-toolkit configuration, VRAM management, and training parameter selection. You provide photos and a name.

[Screenshot: modl training dashboard showing LoRA runs with sample outputs]

The training dashboard shows sample evolution across steps — you can see exactly when your LoRA locks in.

Tip:

Want the full walkthrough? See Train Your First Style LoRA for a deep dive on captioning strategy, presets, and evaluation.

Script it

Every modl command supports --json for machine-readable output. This means you can pipe it into scripts, chain commands, or let an LLM agent drive.

# Score an image's aesthetic quality
$ modl vision score image.png --json
{"score": 6.24, "mean_score": 6.24}
 
# Compare two images (CLIP similarity)
$ modl vision compare ref.png target.png --json
{"similarity": 0.82}
 
# Upscale to 4x resolution
$ modl process upscale image.png --output ./hires/
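With JSON output, "generate a batch, keep the best" is a few lines of glue. A sketch — the selection logic here is ours; only the `modl vision score --json` invocation and its `score` field come from the output shown above:

```python
import json
import subprocess

def score_image(path):
    """Run `modl vision score <path> --json` and parse the aesthetic score."""
    out = subprocess.run(["modl", "vision", "score", path, "--json"],
                         capture_output=True, text=True, check=True).stdout
    return json.loads(out)["score"]

def pick_best(scores):
    """Given a {path: score} mapping, return the highest-scoring path."""
    return max(scores, key=scores.get)

# Assuming modl is installed and the files exist:
# scores = {p: score_image(p) for p in ["001.png", "002.png", "003.png"]}
# print(pick_best(scores))
```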

See the AI Storybook guide for an example of chaining these into an automated pipeline.

What’s next

Explore from here

  1. Browse models — modl list --remote or modl.run/models. Try flux-dev for photorealism, z-image for illustrated styles.
  2. Train a LoRA — follow the Style LoRA guide or the AI Storybook guide.
  3. Use the web UI — modl serve for a visual interface with real-time generation progress.
  4. Edit images — modl edit image.png "add sunglasses" uses AI to modify existing images.
  5. Read the docs — modl.run/docs for the full command reference.

Recommended next

If you’re ready to train, start with Train Your First Style LoRA — it covers the captioning strategy that makes or breaks a LoRA. Or jump to the AI Storybook guide to see what a full modl pipeline looks like end to end.