Creator How-To (Consistency & Branding)
Runway Gen‑4 “References” vs Prompts: What to Lock with Images (and What to Write) — a Practical Creator Playbook (as of 2026-03-24)
A practical playbook for Runway Gen‑4 References: what to lock with images vs what to write in prompts for consistent, controllable clips.
On this page
- Runway Gen‑4 “References” vs Prompts: What to Lock with Images (and What to Write) — a Practical Creator Playbook (as of 2026-03-24)
- What “References” actually solve (and what they don’t)
- References solve
- References don’t solve
- The 80/20 rule: let images lock identity; let text control the shot
- The “Reference Anchor Line” template (copy/paste)
- Three example prompts (ad, UGC, cinematic)
- What to put in each reference slot (Subject vs Scene vs Style)
- Subject reference: lock identity
- Scene reference: lock composition + lighting logic
- Style reference: lock palette + texture
- Prompt elements that still matter most with references
- Action: specify one beat at a time
- Camera: describe intent, not equipment lists
- Environment changes: call them out explicitly
- Common failure modes (even with references) and the fastest fixes
- Why drift happens
- Two rewrite patterns that fix most drift
- Drift troubleshooting checklist (quick)
- A 10-minute micro-workflow for a consistent 3-shot sequence
- Shot plan (3 shots)
- Workflow
- When to skip references and just prompt (and when you’ll regret it)
- Skip references when
- You’ll regret skipping references when
- FAQ
- 1) How many reference images can I use in Runway Gen‑4 References?
- 2) If I have references, do I still need to describe motion?
- 3) Should I use negative prompts to prevent mistakes?
- 4) What’s the safest way to improve results without breaking consistency?
- CTA: productionize your video workflow
Runway Gen‑4 “References” vs Prompts: What to Lock with Images (and What to Write) — a Practical Creator Playbook (as of 2026-03-24)
If you’re making short ads, UGC-style product clips, or branded social loops, the fastest way to waste time in Gen‑4 is to overprompt what you should have shown.
Here’s the decision-first split that keeps your iterations tight:
- References = visual consistency. Use them to anchor identity, brand look, and recurring scene cues; the Gen‑4 References feature supports up to three reference images. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
- Prompts = context, actions, and atmosphere. Use text to direct what happens (motion), how it’s shot (camera intent), and what changes between shots. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
Keep it simple, then add detail deliberately. Runway’s own Gen‑4 guidance emphasizes that the model thrives on prompt simplicity and recommends starting simple and iterating. (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
What “References” actually solve (and what they don’t)
References are best treated as locks—they reduce “identity drift” by giving the model concrete visual targets.
References solve
- Character/product continuity: same face, same packaging silhouette, and the same wardrobe (as long as your references keep it consistent).
- Brand look continuity: repeating a color palette, lighting direction, texture, or “finish” across shots.
- Scene continuity cues: recurring environment elements (a recognizable kitchen counter, a signature studio sweep, a hero prop placement).
This aligns with the common workflow description that references handle visual consistency while the prompt steers context, actions, and atmosphere. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
References don’t solve
- Motion clarity. Runway’s Gen‑4 Video guidance explicitly advises using the text prompt to focus on describing motion. (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
- Complex shot logic. If your prompt tries to cram three beats, two camera moves, and a wardrobe change into one generation, you’ll still get weirdness—references can’t “unconflict” your intent.
- Bad inputs. If your input/reference images are low quality or artifacted, you’re anchoring the model to problems. The Gen‑4 Video guide highlights using a high-quality input image free of visual artifacts for best results. (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
The 80/20 rule: let images lock identity; let text control the shot
Here’s the heuristic that stops overprompting:
- Put 80% of your “what it looks like” requirements into references.
- Put 80% of your “what happens” requirements into text.
Practically, that means:
- References: character/product, wardrobe, hair, makeup, packaging label layout, hero environment, signature lighting.
- Prompt: action verbs, camera movement, pacing, mood, and what changes between shot A → shot B.
When you do write, keep the prompt readable. For Gen‑4 Image, Runway recommends full sentences with natural language for more control. (https://help.runwayml.com/hc/en-us/articles/35694045317139-Gen-4-Image-Prompting-Guide)
Also: avoid “don’t / no / without” prompt patterns. Both Gen‑4 Video and Image guides warn that negative prompting isn’t supported (or may reduce adherence). (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide) (https://help.runwayml.com/hc/en-us/articles/35694045317139-Gen-4-Image-Prompting-Guide)
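To make the split concrete, here's how one shot might divide up, sketched as plain Python dicts. This is a planning aid only; the field names are mine, and nothing here is a Runway payload:

```python
# What the reference images are responsible for (don't re-describe this in text).
locked_by_references = {
    "subject": "hero skincare bottle (label, cap shape, proportions)",
    "materials": "frosted glass body, brushed-metal cap",
    "brand_look": "warm palette, soft studio key light",
}

# What the prompt must say, because no image can show it.
written_in_prompt = {
    "action": "the bottle rotates slowly on the turntable",
    "camera": "gentle push-in from medium to close-up",
    "change": "a soft spotlight glides across the label",
}
```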
The “Reference Anchor Line” template (copy/paste)
Drop this at the top of your prompt, then add your shot direction.
Reference Anchor Line (template):
Same [character/product] as in the reference images, maintaining the exact appearance (face/features or packaging), [wardrobe/materials], and [brand colors]. Keep [key identifying details] consistent.
Then, add one clean block for motion + camera + scene changes.
Runway’s Gen‑4 Video guide suggests iterating by adding one new prompt element at a time (subject motion, camera motion, scene motion, style descriptors). (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
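If you reuse the anchor line across many shots, a tiny helper keeps it identical every time. A minimal sketch; the function and its parameters are hypothetical, not part of any Runway tooling:

```python
def anchor_line(subject: str, appearance: str, materials: str,
                colors: str, details: str) -> str:
    """Fill the Reference Anchor Line template with one subject's locked traits."""
    return (
        f"Same {subject} as in the reference images, maintaining the exact "
        f"{appearance}, {materials}, and {colors}. Keep {details} consistent."
    )

# Example: the skincare-bottle ad from the next section.
print(anchor_line(
    subject="skincare bottle",
    appearance="label design and cap shape",
    materials="frosted-glass finish",
    colors="brand colors",
    details="the logo placement and label typography",
))
```

Re-printing the exact same string at the top of every shot prompt is what keeps the identity lock stable across a sequence.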
Three example prompts (ad, UGC, cinematic)
1) Direct-response ad shot (product hero):
Same skincare bottle as in the reference images, maintaining the exact label design, cap shape, and brand colors. Keep the bottle pristine and centered.
The bottle slowly rotates on a clean studio surface while a soft spotlight glides across the label. Camera does a gentle push-in from medium to close-up. Bright, high-end studio lighting with subtle reflections.
2) UGC-style demo (hand + product):
Same supplement jar as in the reference images, maintaining the exact label, colors, and lid. Keep the jar size and proportions consistent.
A person’s hands open the jar, scoop one serving, and tap the scoop back into the jar. Camera is handheld and close, slight natural shake, framed like a phone video on a kitchen counter. Warm indoor lighting.
3) Cinematic brand moment (character + mood):
Same character as in the reference images, maintaining the exact facial features, hairstyle, and wardrobe. Keep skin tone and makeup consistent.
The subject walks through a hallway and turns toward window light; hair moves subtly as they stop. Slow dolly sideways with shallow depth of field. Moody, soft contrast lighting with a restrained color palette.
Note how each prompt spends words on motion and camera intent, because Gen‑4 Video prompting guidance emphasizes describing motion in the text prompt. (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
What to put in each reference slot (Subject vs Scene vs Style)
You have up to three reference images—use them with purpose, not redundancy. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
Subject reference: lock identity
Pick a frame that instantly answers:
- Who/what is it?
- What are the defining details?
Good subject reference:
- Clear face (or clear product label), sharp focus, minimal motion blur.
- Wardrobe you plan to keep for multiple shots.
- Neutral pose/expression if you need the character to do many actions later.
Avoid: busy patterns that create false “identity features,” heavy compression artifacts, or extreme lens distortion that warps proportions.
Scene reference: lock composition + lighting logic
This is your “where” and “how it’s lit.”
Good scene reference:
- Readable layout (foreground/midground/background).
- Lighting direction you can reuse (soft window key, top-down studio, neon rim, etc.).
Avoid: cluttered backgrounds that fight your subject or force the model to invent details.
Style reference: lock palette + texture
Use this to keep brand consistency across different scenes.
Good style reference:
- A strong, consistent palette.
- A clear texture language (clean studio gloss, grainy doc look, soft filmic rolloff).
Avoid: heavily effect-driven images where the effect is the only thing the model learns.
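If you manage several campaigns, it helps to record each slot's job next to the image it points at. A sketch using a plain dataclass; the structure and example paths are my own, not a Runway format:

```python
from dataclasses import dataclass

@dataclass
class ReferenceSlot:
    role: str        # "subject" | "scene" | "style"
    image_path: str  # local file or URL you will upload
    locks: str       # what this image is responsible for

campaign_refs = [
    ReferenceSlot("subject", "refs/bottle_front.png", "label, cap shape, proportions"),
    ReferenceSlot("scene", "refs/kitchen_counter.png", "layout + soft window key light"),
    ReferenceSlot("style", "refs/brand_palette.png", "warm palette, clean studio gloss"),
]
```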
Prompt elements that still matter most with references
Even with great references, these prompt components do the heavy lifting:
Action: specify one beat at a time
If the output “drifts,” it’s often because the action is vague.
- Prefer: “The subject lifts the mug, takes one sip, and exhales.”
- Overstuffed: “The subject walks in, smiles, talks to camera, shows product, pours, drinks, winks.”
Runway recommends starting simple and iterating. (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
Camera: describe intent, not equipment lists
Instead of naming five lens specs, describe what you want viewers to feel:
- “Slow push-in to close-up”
- “Handheld, intimate framing”
- “Locked-off tripod shot, minimal movement”
Environment changes: call them out explicitly
If the subject moves rooms, changes lighting, or interacts with water/glass reflections—say so. Otherwise the model may “solve” the transition by altering identity.
Common failure modes (even with references) and the fastest fixes
Treat drift like a diagnosis, not a surprise.
Why drift happens
- Ambiguous action: the model fills in gaps with invented movement.
- Conflicting style words: “cinematic,” “raw phone footage,” and “high-fashion studio” in the same prompt.
- Wardrobe changes you didn’t intend: you described “a new outfit” or implied time passing.
- Camera move too complex: multiple moves (pan + dolly + zoom) plus fast action.
- Environment forces identity changes: reflections, harsh mixed lighting, crowded scenes.
Two rewrite patterns that fix most drift
Rewrite pattern A (reduce conflict):
- Keep the Reference Anchor Line.
- Remove all style adjectives except 1–2.
- Keep one motion for the subject and one for the camera.
Rewrite pattern B (sequence the intent):
- Generate the motion beat first (simple camera).
- Then iterate: add one camera refinement, or one style refinement—never both at once.
This matches Runway’s “add one new prompt element at a time” iteration approach. (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
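Pattern B is easy to mechanize: hold every field constant and vary exactly one per generation. A minimal sketch in plain Python; the field names and base prompt are illustrative, not a Runway format:

```python
base = {
    "anchor": ("Same character as in the reference images, maintaining the "
               "exact facial features, hairstyle, and wardrobe."),
    "action": "The subject lifts the mug and takes one sip.",
    "camera": "Locked-off tripod shot, minimal movement.",
    "style": "Soft contrast lighting.",
}

# Each iteration overrides exactly one key; everything else stays fixed.
iterations = [
    {**base, "camera": "Slow push-in to close-up."},
    {**base, "style": "Moody, restrained color palette."},
]

for i, variant in enumerate(iterations, 1):
    prompt = " ".join(variant[k] for k in ("anchor", "action", "camera", "style"))
    print(f"--- variant {i} ---\n{prompt}\n")
```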
Drift troubleshooting checklist (quick)
- Are my reference images high quality and free of obvious artifacts? (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
- Did I write one clear action beat (not three)?
- Did I avoid negative prompting (“no/without/don’t”)? (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
- Did I remove conflicting style words?
- Did I simplify the camera move to one primary motion?
- Did I accidentally imply a wardrobe or identity change?
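Most of this checklist is mechanical enough to script as a pre-flight check. A rough sketch; the word lists are heuristics I chose for illustration, not anything Runway publishes:

```python
import re

# Heuristic word lists for illustration only.
NEGATIVE = re.compile(r"\b(no|not|don't|without|never)\b", re.IGNORECASE)
STYLE_WORDS = {"cinematic", "raw", "handheld", "high-fashion", "studio",
               "documentary", "filmic", "grainy", "glossy"}

def lint_prompt(prompt: str) -> list[str]:
    """Run the drift checklist mechanically and return human-readable warnings."""
    warnings = []
    if NEGATIVE.search(prompt):
        warnings.append("Negative phrasing found; rewrite as a positive description.")
    tokens = set(re.findall(r"[a-z\-]+", prompt.lower()))
    style_hits = sorted(STYLE_WORDS & tokens)
    if len(style_hits) > 2:
        warnings.append(f"Possibly conflicting style words: {style_hits}")
    if prompt.count(",") + prompt.lower().count(" and ") > 6:
        warnings.append("Prompt may be overstuffed; aim for one action beat.")
    return warnings

print(lint_prompt("Cinematic raw handheld studio shot, no blur, don't change outfit"))
```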
A 10-minute micro-workflow for a consistent 3-shot sequence
This is built for short-form ads where you need continuity without burning an afternoon.
Shot plan (3 shots)
- Shot 1 (establish): clean intro of character/product in stable framing.
- Shot 2 (action): one interaction beat.
- Shot 3 (payoff): hero close-up or “result” moment.
Workflow
1. Pick references (2–3 min):
   - Subject reference (identity)
   - Scene reference (lighting/composition)
   - Style reference (palette/texture)
2. Write Shot 1 prompt (2 min):
   - Reference Anchor Line
   - Minimal action + minimal camera
3. Write Shot 2 prompt (2 min):
   - Same Anchor Line
   - One new action beat
4. Write Shot 3 prompt (2 min):
   - Same Anchor Line
   - Camera pushes closer + simple payoff
5. Iterate smart (2 min):
   - Change only one element per iteration (action, camera, scene motion, or style). (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
Runway notes that Gen‑4 generates videos at 5- or 10-second durations from an input image and a text prompt, so plan your beats to fit a short clip. (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
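Since each generation is a single short clip, the whole 3-shot plan fits in a few lines. A minimal sketch that only assembles the prompts locally; submit() is a stand-in for whatever generation call you use (Runway's app or your own API wrapper), not a real endpoint:

```python
ANCHOR = ("Same supplement jar as in the reference images, maintaining the exact "
          "label, colors, and lid. Keep the jar size and proportions consistent.")

SHOTS = {
    "establish": ("The jar sits centered on a kitchen counter. Locked-off shot, "
                  "soft morning light through a window."),
    "action": ("A person's hands open the jar and scoop one serving. Handheld, "
               "close framing, slight natural shake."),
    "payoff": "Slow push-in to a close-up of the label. Warm, clean lighting.",
}

def submit(prompt: str, duration: int = 5) -> None:
    # Placeholder: swap in your actual generation call here.
    print(f"[{duration}s clip] {prompt}\n")

for name, direction in SHOTS.items():
    submit(f"{ANCHOR} {direction}")
```

Sharing one ANCHOR constant is the whole trick: every shot restates the identity lock verbatim, so only the direction text varies between generations.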
When to skip references and just prompt (and when you’ll regret it)
Skip references when
- You’re exploring broad concepts (mood boards, visual directions) and don’t need identity continuity.
- Each clip is a one-off and brand consistency isn’t critical.
Gen‑4 Image is designed for strong visual fidelity and stylistic control, and its guide suggests simple prompting can work well (with complexity adding more stylistic control). (https://help.runwayml.com/hc/en-us/articles/35694045317139-Gen-4-Image-Prompting-Guide)
You’ll regret skipping references when
- You need the same character/product across multiple deliverables.
- You’re cutting a sequence and the “same person” must remain the same person.
- You’re doing performance marketing where repeated brand cues matter.
FAQ
1) How many reference images can I use in Runway Gen‑4 References?
Up to three reference images. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
2) If I have references, do I still need to describe motion?
Yes—Runway’s Gen‑4 Video prompting guidance says to use the text prompt to focus on describing motion. (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
3) Should I use negative prompts to prevent mistakes?
Runway’s guides advise against negative prompting; Gen‑4 Video notes negative phrasing isn’t supported and may be unpredictable, and Gen‑4 Image advises avoiding negative prompting for best adherence. (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide) (https://help.runwayml.com/hc/en-us/articles/35694045317139-Gen-4-Image-Prompting-Guide)
4) What’s the safest way to improve results without breaking consistency?
Iterate gradually—add one new element at a time (subject motion, camera motion, scene motion, or style descriptors). (https://help.runwayml.com/hc/en-us/articles/39789879462419-Gen-4-Video-Prompting-Guide)
CTA: productionize your video workflow
If you’re building a repeatable content pipeline (variants, localization, bulk generations, creative testing), you’ll want generation to be programmable.
- Explore the docs: Veo3Gen API
- See plans when you’re ready to scale: Pricing