generation · lora · klein · storybook · multi-character · edit

Illustrate a Children's Story with Multiple Characters

Generate a 6-page storybook with 3 characters using Klein 9B + a LoRA. The hard problem: keeping a kitten visible when a LoRA dog dominates every scene.

Mar 27, 2026 15 min read

What you’ll build

A 6-page illustrated children’s story with three characters — one locked with a LoRA, two described by prompt alone. The story is “Maxi and the Midnight Kitchen”: a Pomeranian hears a noise at night, finds a kitten raiding the cookie jar, and a little girl catches them both.

Page 1 — Maxi sleeping on the couch, one eye open

Page 2 — Maxi finds a kitten on the kitchen counter

Page 3 — Kitten pushes a glass toward the edge

Page 5 — Luna sets down a milk saucer for the kitten

Page 6 — All three asleep on the kitchen floor, morning light

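Every illustration was generated with Klein 9B + a custom Maxi LoRA, in Pixar 3D animated style at a 16:9 aspect ratio. Pages 5-6 were the hardest scenes: page 5 required reference images + LoRA together via modl edit, and page 6 came from generate-only with a kitten-first prompt and a lucky seed.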

Multi-character consistency is the hard problem this guide tackles. When a LoRA-locked character shares the frame with prompt-only characters, the LoRA dominates — smaller characters get absorbed or vanish entirely. We test two approaches: prompt-order tricks and reference images + LoRA.

The hard problem:

A LoRA modifies the model’s weights, giving one character an unfair advantage in every scene. Smaller characters (especially animals similar to the LoRA subject) get dropped or distorted. This guide documents the failures and two fixes — one fragile, one reliable.

Prerequisites

modl installed — curl -fsSL https://modl.run/install.sh | sh
Klein 9B — modl pull flux2-klein-9b (8.8 GB download, needs roughly 16 GB VRAM with a LoRA loaded)
A trained character LoRA — we used a Pomeranian LoRA trained on Klein 9B (see Train a Character LoRA). When you train a LoRA with modl, you get back an ID like train:maxi-klein-9b:c3bc24dc9980e627 — use yours in place of ours throughout this guide.
GPU with 16+ GB VRAM — tested on an RTX 4090 (24 GB). Klein 9B runs quantized at ~13 GB; the LoRA adds ~3 GB on top
A story — write one, or ask an LLM
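To confirm the GPU requirement before downloading anything, a quick check along these lines may help. It is a minimal sketch, assuming an NVIDIA GPU with nvidia-smi on the PATH; the function name has_enough_vram and the 16000 MiB threshold are our own illustrative choices (cards sold as 16 GB can report slightly under 16384 MiB), not part of modl.

```shell
#!/bin/sh
# Rounded threshold for the ~13 GB quantized model + ~3 GB LoRA from the list above.
# Assumption: 16000 MiB is used instead of 16384 so that 16 GB cards are not rejected.
REQUIRED_MB=16000

# has_enough_vram TOTAL_MB -> exit status 0 if TOTAL_MB meets the threshold
has_enough_vram() {
  [ "$1" -ge "$REQUIRED_MB" ]
}

# Query the first GPU if nvidia-smi is available; otherwise do nothing.
if command -v nvidia-smi >/dev/null 2>&1; then
  total_mb=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -n 1)
  case "$total_mb" in
    ''|*[!0-9]*)
      echo "Could not read VRAM from nvidia-smi; skipping check" ;;
    *)
      if has_enough_vram "$total_mb"; then
        echo "OK: ${total_mb} MiB VRAM"
      else
        echo "Warning: ${total_mb} MiB VRAM is below the ~16 GB this guide assumes"
      fi ;;
  esac
fi
```

On machines without nvidia-smi the script prints nothing, so it is safe to run anywhere.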

Step 1: The cast

Before generating story pages, create standalone reference images for each character. These serve two purposes: they’re visual anchors as you write prompts, and they become input images for Klein’s edit mode in multi-character scenes later.

Maxi (LoRA-locked)

Maxi is the main character. He has a trained LoRA, so his identity is locked. The trigger word OHWX activates the LoRA.

  $ modl generate "OHWX pomeranian sitting on a cozy couch, soft evening light, Pixar 3D animated style" \
       --base flux2-klein-9b --lora "train:maxi-klein-9b:c3bc24dc9980e627" \
       --size 3:4 --seed 42

Reference image of Maxi the Pomeranian on a couch

Maxi reference — seed 42. The LoRA handles identity, so every seed produces the same dog.

Because the LoRA fixes Maxi's identity across seeds, the reference image is less about finding the right likeness and more about confirming that the style works, and about having a clean reference to pass to modl edit later.

Tip:

Klein 9B defaults to 4 inference steps (it’s a distilled model). We omit --steps throughout this guide since 4 is the default.

Kitten (prompt-only)

The kitten has no LoRA — its appearance is controlled entirely by the prompt. This means it needs a detailed, consistent description in every scene.

$ modl generate "a tiny orange tabby kitten with bright blue eyes, fluffy fur, sitting on a kitchen counter, Pixar 3D animated style, warm tones" \
    --base flux2-klein-9b --size 3:4 --seed 42

Kitten references at seed 42 and 77. No LoRA — prompt-only consistency. The orange tabby with blue eyes holds up across seeds, though coat patterns and face shape vary.

Without a LoRA, different seeds produce different kittens. The description “tiny orange tabby kitten with bright blue eyes” was specific enough to keep it recognizable. Pick your best reference — you’ll use it as an input image later.
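Since prompt-only characters vary by seed, it can be convenient to generate several candidate references in one go and pick the best. The loop below is a sketch using only the flags shown above; the function name kitten_refs and the RUN switch are hypothetical helpers of ours. By default it only prints the commands, so you can inspect them first.

```shell
#!/bin/sh
# Dry-run by default: prints each command instead of running it.
# Set RUN=1 in the environment to actually invoke modl.
PROMPT="a tiny orange tabby kitten with bright blue eyes, fluffy fur, sitting on a kitchen counter, Pixar 3D animated style, warm tones"

# kitten_refs SEED... -> one kitten reference per seed
kitten_refs() {
  for seed in "$@"; do
    if [ "${RUN:-0}" = "1" ]; then
      modl generate "$PROMPT" --base flux2-klein-9b --size 3:4 --seed "$seed"
    else
      printf 'modl generate "%s" --base flux2-klein-9b --size 3:4 --seed %s\n' "$PROMPT" "$seed"
    fi
  done
}

# The two seeds used for the references in this guide
kitten_refs 42 77
```

The same loop works for Luna by swapping the prompt.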

Luna (prompt-only)

Luna is an 8-year-old girl in floral pajamas. Like the kitten, she’s prompt-only.

$ modl generate "an 8 year old girl with messy brown hair, freckles, wearing floral pajamas, standing in a hallway at night, Pixar 3D animated style" \
    --base flux2-klein-9b --size 3:4 --seed 42

Luna references at seed 42 and 77. Consistent across seeds without a LoRA — the messy brown hair, freckles, and floral pajamas combination gives the model enough anchors.

Luna was notably consistent. Her description was specific enough that she looked like the same character across seeds without any LoRA. Human characters tend to have more distinguishing features than animals, making them easier to keep consistent with prompts alone.

Tip:

A detailed outfit + hair + face description is often enough for human characters. You don’t always need a LoRA — save those for subjects where prompt-only consistency breaks down.

Step 2: Page by page

Each page combines the story text with a generation prompt. Pages 1-4 and page 6 used modl generate with the LoRA. Page 5 needed reference images + LoRA via modl edit. The debugging story behind pages 5-6 is in Step 3.

| Setting | Value | Why |
| --- | --- | --- |
| Base model | Klein 9B | Best multi-character coherence, native multi-image edit |
| LoRA | train:maxi-klein-9b:c3bc24dc9980e627 | Locks Maxi's identity |
| Aspect ratio | 16:9 | Cinematic storybook spread layout |
| Steps | 4 (default, omitted from commands) | Klein 9B is distilled; 4 steps is the sweet spot |
| Style suffix | Pixar 3D animated style, warm tones, cinematic | Consistent look across all pages |
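Because the style suffix is repeated verbatim on every page, one way to avoid typos is to keep it in a variable and append it to each scene description. This is only a convenience sketch; page_prompt and STYLE_SUFFIX are hypothetical names of ours, not modl features.

```shell
#!/bin/sh
# Shared style suffix from the settings above
STYLE_SUFFIX="Pixar 3D animated style, warm tones, cinematic"

# page_prompt SCENE -> prints SCENE with the shared style suffix appended
page_prompt() {
  printf '%s, %s\n' "$1" "$STYLE_SUFFIX"
}

# Reproduces the page 1 prompt used in this guide
page_prompt "OHWX pomeranian sleeping on a cozy couch at night, one eye slightly open, moonlight through window"
```

The result can be passed straight to the CLI, for example modl generate "$(page_prompt "...")" followed by the usual --base, --lora, --size and --seed flags.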

Page 1: Maxi sleeping

“Maxi was dreaming of squirrels when a tiny sound woke him up. Clink. Clink. Clink.”

Single character, simple scene. Just the LoRA doing its job.

$ modl generate "OHWX pomeranian sleeping on a cozy couch at night, one eye slightly open, moonlight through window, Pixar 3D animated style, warm tones, cinematic" \
    --base flux2-klein-9b --lora "train:maxi-klein-9b:c3bc24dc9980e627" --size 16:9 --seed 42

Maxi sleeping on the couch, one eye open

Page 1 — seed 42. Single character + LoRA. Worked on every seed tested.

Page 2: Finding the kitten

“He tiptoed to the kitchen and found… a kitten. A very small, very orange kitten, with its paw stuck in the cookie jar.”

First multi-character scene. The kitten shares the frame with the LoRA dog, but they’re spatially separated — Maxi in the doorway, kitten on the counter.

$ modl generate "OHWX pomeranian standing in a kitchen doorway looking at a tiny orange tabby kitten with blue eyes sitting on the counter with its paw in a cookie jar, nighttime kitchen, Pixar 3D animated style, warm tones, cinematic" \
    --base flux2-klein-9b --lora "train:maxi-klein-9b:c3bc24dc9980e627" --size 16:9 --seed 88

Maxi finds a kitten on the kitchen counter

Page 2 — seed 88. The LoRA keeps Maxi correct while the kitten comes from the prompt alone. Spatial separation between characters keeps both intact.

Page 3: The staredown

“They stared at each other. The kitten slowly pushed a glass toward the edge of the counter. Maxi’s eyes went wide.”

$ modl generate "OHWX pomeranian watching from the kitchen floor as a tiny orange tabby kitten with blue eyes on the counter reaches its paw toward a glass near the edge, tense moment, Pixar 3D animated style, warm tones, cinematic" \
    --base flux2-klein-9b --lora "train:maxi-klein-9b:c3bc24dc9980e627" --size 16:9 --seed 42

Kitten pushing a glass toward the edge while Maxi watches

Page 3 — seed 42. The glass between them sells the scene.

Page 4: Luna arrives

“CRASH. Luna appeared in the doorway, flashlight in hand. ‘MAXI. What did you DO?’”

The first 3-character scene. Luna is in the doorway, animals on the counter — everyone is spatially separated.

$ modl generate "an 8 year old girl with messy brown hair and freckles wearing floral pajamas holding a flashlight in a kitchen doorway, OHWX pomeranian and a tiny orange tabby kitten with blue eyes on the kitchen counter, dramatic flashlight beam, Pixar 3D animated style, warm tones, cinematic" \
    --base flux2-klein-9b --lora "train:maxi-klein-9b:c3bc24dc9980e627" --size 16:9 --seed 42

Luna in the doorway with a flashlight, catching Maxi and the kitten

Page 4 — seed 42. All three characters present. Worked on every seed tested — spatial separation makes this scene reliable.

Page 5: Milk saucer

“But Luna wasn’t really mad. She poured milk into a saucer and set it on the floor. ‘You must be hungry, little one.’”

This was the first scene where modl generate failed repeatedly — the kitten vanished or mutated in most seeds. The fix was switching to modl edit with reference images + LoRA (explained in Step 3).

  $ modl edit "a tiny orange tabby kitten with bright blue eyes drinking from a milk saucer on the kitchen floor, OHWX pomeranian watching nearby, an 8 year old girl with messy brown hair freckles and floral pajamas kneeling beside them, warm kitchen at night, Pixar 3D animated style, warm tones, cinematic" \
       --image ref-kitten-42.webp --image ref-maxi-couch.webp --image ref-luna-42.webp \
       --base flux2-klein-9b --lora "train:maxi-klein-9b:c3bc24dc9980e627" --seed 42 --size 16:9

All three characters around a milk saucer on the kitchen floor

Page 5: ref+LoRA via modl edit. All three characters present and distinct. With generate-only and the LoRA trigger first, this scene worked in only 1 of 6 seeds.

Page 6: Morning sleep

“When Luna’s parents found them the next morning, nobody moved. The cookie jar was empty. The milk was gone. And Maxi had a new friend.”

The hardest scene in the book. Three characters sleeping on the floor — physically close together, similar poses. Even the ref+LoRA approach struggled here (the kitten merged into Maxi when overlapping). This was the best result from generate-only, seed 42.

$ modl generate "a tiny orange tabby kitten curled up against OHWX pomeranian, both sleeping on a kitchen floor, an 8 year old girl with messy brown hair and freckles in floral pajamas asleep beside them, golden morning sunlight through window, overhead angle, Pixar 3D animated style, warm tones, cinematic" \
    --base flux2-klein-9b --lora "train:maxi-klein-9b:c3bc24dc9980e627" --size 16:9 --seed 42

All three characters asleep on the kitchen floor in morning light

Page 6 — seed 42, generate-only with kitten-first prompt order. The only seed out of 6 tested that kept all three characters distinct. Sleeping scenes with overlapping characters remain the hardest case.

Step 3: Multi-character scenes — the hard part

Pages 1-4 were straightforward. Pages 5-6 broke — and the debugging process revealed the core problem with multi-character LoRA generation. Here’s what happened, in order.

The failure

Page 5 (milk saucer) was the first scene where all three characters share the floor at close range. Using modl generate with the LoRA trigger first in the prompt:

  # Standard approach: LoRA trigger first
  $ modl generate "OHWX pomeranian and an 8 year old girl with messy brown hair and freckles in floral pajamas setting a milk saucer on the floor for a tiny orange tabby kitten with blue eyes, kitchen floor, Pixar 3D animated style, warm tones" \
       --base flux2-klein-9b --lora "train:maxi-klein-9b:c3bc24dc9980e627" --size 16:9 --seed 42
  # Result: Maxi and Luna present, kitten missing entirely

Seeds 42 and 77 — the kitten is just gone. The LoRA and Luna’s description consumed all the model’s attention budget.

The kitten vanished in 4 out of 6 attempts. The two remaining attempts produced a different failure — LoRA bleed, where the kitten appeared but with Pomeranian features:

LoRA bleed — kitten looks dog-like, seed 99

LoRA bleed — kitten looks dog-like, seed 200

Seeds 99 and 200 — the kitten appeared, but with Pomeranian features. The LoRA’s influence bleeds into “small fluffy animal” tokens.

The pattern: when a LoRA character shares the frame with prompt-only characters at close range, there’s an attention hierarchy. The LoRA always wins (it modifies the model’s weights). Detailed prompt characters usually survive. The smallest, simplest character gets dropped first. The kitten was the perfect victim — smallest physical size, no LoRA, and “small fluffy animal” tokens that overlap with the Pomeranian LoRA.

Workaround: prompt order

Putting the weakest character first in the prompt appears to give it priority in the model's text attention. In our tests, a subject named earlier in the prompt survived more often than one named last.

Kitten last: 1 of 6 seeds worked

“OHWX pomeranian… 8 year old girl… a tiny orange tabby kitten…”

Kitten first: 4 of 6 seeds worked

“a tiny orange tabby kitten with bright blue eyes… OHWX pomeranian… 8 year old girl…”

This helped — success rate went from ~1/6 to ~4/6. But it’s fragile. You’re still relying on seed luck, and for the sleeping scene (page 6) it only worked 1 out of 6 seeds.
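If you experiment with prompt order across several pages, keeping each character description in its own variable makes reordering less error-prone. The sketch below assembles the subject list with the weakest character first; join_subjects and the variable names are hypothetical helpers of ours, and the descriptions are the ones used in this guide.

```shell
#!/bin/sh
# Character descriptions used throughout the guide
KITTEN="a tiny orange tabby kitten with bright blue eyes"
MAXI="OHWX pomeranian"
LUNA="an 8 year old girl with messy brown hair and freckles in floral pajamas"

# join_subjects A B C ... -> "A, B, C"
# Pass the weakest character first so it leads the prompt.
join_subjects() {
  out=$1
  shift
  for s in "$@"; do
    out="$out, $s"
  done
  printf '%s\n' "$out"
}

# Kitten first, then the LoRA character, then Luna
join_subjects "$KITTEN" "$MAXI" "$LUNA"
```

You would still add the scene action and style suffix around this list; the helper only fixes the order of the characters.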

The fix: reference images + LoRA

The real solution uses Klein’s ability to accept multiple input images. Instead of describing characters with text alone, you pass the reference images from Step 1 as visual anchors. Klein’s pipeline uses these images as conditioning — each reference grounds a character’s appearance in pixel space, not just token space. Combined with --lora, the LoRA locks the trained character while the reference images lock everyone else.

  # Pass character references + LoRA together via modl edit
  $ modl edit "a tiny orange tabby kitten with bright blue eyes drinking from a milk saucer on the kitchen floor, OHWX pomeranian watching nearby, an 8 year old girl with messy brown hair freckles and floral pajamas kneeling beside them, warm kitchen at night, Pixar 3D animated style, warm tones, cinematic" \
       --image ref-kitten-42.webp \
       --image ref-maxi-couch.webp \
       --image ref-luna-42.webp \
       --base flux2-klein-9b \
       --lora "train:maxi-klein-9b:c3bc24dc9980e627" \
       --seed 42 --size 16:9

3-character milk scene with ref+LoRA — seed 42

3-character milk scene with ref+LoRA — seed 77

Seeds 42 and 77. Both seeds have all three characters present and distinct. The kitten is clearly an orange tabby, Maxi is himself (LoRA), Luna has messy brown hair and pajamas. 4 out of 4 seeds tested produced all three characters — 100% success rate.

This is the same scene where the kitten vanished in 4 out of 6 generate-only attempts. With reference images + LoRA, every seed tested worked.

Why this works:

Generate-only relies entirely on text attention — the LoRA dominates and weaker characters get dropped. Reference images give each character a visual signal that doesn’t compete with the LoRA’s weight modifications. The model sees what the kitten looks like instead of inferring it from text tokens alone.

The order of --image flags doesn’t affect the result — Klein treats the reference images as a set, not a sequence. We tested different orderings and saw no difference in output quality.
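When the number of references changes from page to page, the repeated --image flags can be built from a list of files. The following sketch only prints the resulting command so it can be checked before running; edit_cmd and the LORA variable are hypothetical names of ours, and the flags are limited to those shown in this guide.

```shell
#!/bin/sh
# edit_cmd PROMPT REF... -> prints a modl edit command with one --image per reference.
# Assumption: reference file names contain no spaces (they are printed unquoted).
edit_cmd() {
  prompt=$1
  shift
  cmd="modl edit \"$prompt\""
  for ref in "$@"; do
    cmd="$cmd --image $ref"
  done
  printf '%s --base flux2-klein-9b --lora "%s" --size 16:9\n' \
    "$cmd" "${LORA:-train:maxi-klein-9b:c3bc24dc9980e627}"
}

# Three references for the milk saucer scene
edit_cmd "scene description with all characters..." \
  ref-kitten-42.webp ref-maxi-couch.webp ref-luna-42.webp
```

Set LORA to your own training ID to override the default used here.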

2-character scenes with ref+LoRA

For pages with just Maxi and the kitten, reference images + LoRA is more than needed — but it produces more consistent kitten appearance than prompt-only generation, with zero LoRA bleed.

2-character scene with reference images + LoRA

Maxi + kitten with ref images + LoRA. Both characters clearly distinct — no bleed, no disappearance.

Where it still fails: physical overlap

The sleeping scene (page 6) pushes the limits even with reference images + LoRA. When characters are curled up together — physically overlapping — the LoRA bleed returns.

Sleeping scene — kitten merges with Maxi

Sleeping scene — LoRA strength 0.8, kitten still merges

Left: LoRA strength 1.0. Right: LoRA strength 0.8. In both cases, the kitten merges into Maxi when they’re physically overlapping. Lowering LoRA strength didn’t help — it just made the merged animal look more cat-like. 0 out of 4 ref+LoRA attempts produced a distinct kitten in the sleeping scene.

The reference images solve the attention problem (kitten doesn’t disappear), but they can’t solve the spatial overlap problem. When a LoRA character and a similar-looking prompt-only character are touching, the LoRA’s influence bleeds across the boundary.

For page 6, the best result came from generate-only with kitten-first prompt order and seed luck (1 out of 6 seeds worked — shown in Step 2 above).

Tip:

For sleeping or cuddling scenes, compose characters with slight separation — “kitten curled up near the dog’s paws” rather than “kitten curled up against the dog.” Or generate many seeds and pick the rare one that works.

When to use which approach

| Approach | Scene | Seeds that worked |
| --- | --- | --- |
| generate + LoRA | 1-2 characters, separated | 6/6 |
| generate + LoRA | 3 characters, separated (page 4) | 6/6 |
| generate + LoRA | 3 characters, close range (page 5) | 1/6 |
| generate + prompt order | 3 characters, close range (page 5) | 4/6 |
| generate + prompt order | 3 characters, overlapping (page 6) | 1/6 |
| edit + refs + LoRA | 3 characters, close range (page 5) | 4/4 |
| edit + refs + LoRA | 3 characters, overlapping (page 6) | 0/4 |

Use modl generate for simple scenes (1-2 characters, or well-separated groups). Switch to modl edit with reference images + LoRA when characters share close quarters. For physically overlapping characters (cuddling, sleeping), no approach is reliable — generate many seeds and pick the best.
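The rule of thumb above can be written down as a small lookup, which may be handy if you script a whole book. This is only a sketch of the guidance in this section; choose_approach and its layout labels (separated, close, overlapping) are our own hypothetical naming, and the thresholds come from the handful of seeds tested here.

```shell
#!/bin/sh
# choose_approach COUNT LAYOUT
#   COUNT:  number of characters in the scene
#   LAYOUT: separated | close | overlapping
choose_approach() {
  count=$1
  layout=$2
  case "$layout" in
    separated)
      echo "generate + LoRA" ;;
    close)
      # 1-2 characters are fine with generate; 3 at close range need references
      if [ "$count" -ge 3 ]; then
        echo "edit + refs + LoRA"
      else
        echo "generate + LoRA"
      fi ;;
    overlapping)
      echo "no reliable approach: generate many seeds and pick the best" ;;
    *)
      echo "unknown layout: $layout" >&2
      return 1 ;;
  esac
}

choose_approach 3 separated   # page 4
choose_approach 3 close       # page 5
choose_approach 3 overlapping # page 6
```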

What we learned

1. Reference images + LoRA is the breakthrough. The combination solves the attention competition problem. The LoRA locks one character’s identity; the reference images lock everyone else’s. For the close-range scene (page 5), this took us from ~1/6 to 4/4.

2. Prompt order is a workaround, not a fix. Putting the weakest character first in the prompt helps (~1/6 → ~4/6), but it’s seed-dependent. Reference images are more reliable.

3. Physical overlap defeats everything. When characters touch or overlap, LoRA bleed happens regardless of approach. Design your compositions with slight separation between characters.

4. Character references serve double duty. The reference images you generate in Step 1 aren’t just for your own visual reference — they become inputs to modl edit. Invest time in getting good, clean character references early.

5. Human characters are easier than animals. Luna was consistent across every seed with just a prompt. A specific description (age, hair, outfit, distinguishing feature) is often enough. Animals have fewer distinguishing tokens and are more susceptible to LoRA bleed.

6. Generate more seeds than you think you need. For single-character pages, 2 seeds were enough. For close-range 3-character generate-only pages, we needed 6 seeds to find one that worked. With ref+LoRA, every seed worked for non-overlapping scenes.
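To get a rough feel for how many seeds to budget, you can estimate the chance that at least one of n seeds works as 1 - (1 - p)^n, where p is the per-seed success rate. The sketch below is a back-of-the-envelope calculation, assuming seeds are independent and that the small-sample rates measured in this guide (for example 1/6) hold for your scene; neither assumption is guaranteed. The function name p_at_least_one is ours.

```shell
#!/bin/sh
# p_at_least_one SUCCESSES TRIES N
#   Estimates the chance that at least one of N seeds works,
#   given an observed rate of SUCCESSES out of TRIES.
p_at_least_one() {
  awk -v s="$1" -v t="$2" -v n="$3" \
    'BEGIN { printf "%.2f\n", 1 - (1 - s / t) ^ n }'
}

p_at_least_one 1 6 6    # overlapping scene at 1/6, six seeds   -> 0.67
p_at_least_one 1 6 12   # same rate, twelve seeds               -> 0.89
p_at_least_one 4 6 2    # kitten-first close range at 4/6, two  -> 0.89
```

Under these assumptions, six seeds at a 1/6 rate still leave about a one-in-three chance of no usable image, which matches how fragile the sleeping scene felt in practice.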

Generation settings

| Setting | Value |
| --- | --- |
| Base model | Klein 9B (flux2-klein-9b) |
| LoRA | train:maxi-klein-9b:c3bc24dc9980e627 |
| Aspect ratio | 16:9 (story pages), 3:4 (character refs) |
| Steps | 4 (Klein 9B default, omitted from commands) |
| Style | Pixar 3D animated style, warm tones, cinematic |

Multi-character approach

  # Step 1: Generate character references (3:4, one per character)
  $ modl generate "OHWX pomeranian..." --base flux2-klein-9b --lora your-lora --size 3:4
  $ modl generate "a tiny orange tabby kitten..." --base flux2-klein-9b --size 3:4
  $ modl generate "an 8 year old girl..." --base flux2-klein-9b --size 3:4

  # Step 2: Use refs + LoRA for multi-character scenes
  # Image order doesn't matter: Klein treats refs as a set
  $ modl edit "scene description with all characters..." \
       --image ref-kitten.webp --image ref-maxi.webp --image ref-luna.webp \
       --base flux2-klein-9b --lora your-lora --size 16:9

Related guides

Train a Character LoRA: train the LoRA used in this guide, with benchmarks across 5 models.

Make a Children’s Book (v1): the simpler single-character version using Z-Image + quality-checking pipeline.

Character Reference Sheet: generate consistent character views with Klein’s edit mode.