Creator How-To (Image-to-Video)
Reference‑Anchored Image‑to‑Video: The “Same Character” Prompt Template for Clean Motion in Veo3Gen (inspired by Runway Gen‑4 References)
A copy‑paste “same character from reference” template for reference‑anchored image‑to‑video in Veo3Gen—plus motion prompts, examples, and fixes for drift.
On this page
- Why your character “drifts” (and why reference-first fixes it)
- What to prep before you generate: 5 traits your reference image must carry
- 1) Identity details (the non-negotiables)
- 2) Composition that matches the shot you want
- 3) Lighting direction and mood
- 4) Wardrobe and texture cues
- 5) Style continuity
- The core template: “Same character from reference” + motion-only prompting
- Copy-paste prompt template (bracketed placeholders)
- Motion stack: subject vs camera vs scene (with examples)
- Subject motion (what the character does)
- Camera motion (how the viewer moves)
- Scene motion (what in the environment changes)
- A note on complexity
- 7 copy‑paste prompt recipes (creator + small-business use cases)
- 1) Creator portrait (clean talking-head beat)
- 2) Product demo (hands + label reveal)
- 3) UGC-style ad (hook + gesture)
- 4) Lifestyle b-roll (coffee + glance)
- 5) Interior walkthrough (brand character in space)
- 6) Outdoor hero shot (sun + flare)
- 7) Urban street (parallax + vibe)
- Troubleshooting: when the model ignores the reference (change this first)
- Mini workflow: 3 runs to a keeper (iterate one variable at a time)
- Run 1: Lock identity + minimal motion
- Run 2: Add one motion layer
- Run 3: Add scene motion or texture
- Quick checklist (before you hit generate)
- FAQ
- What does “same character from reference” actually do?
- Should I describe the person’s face in the prompt?
- How do I move the character into a new scene without changing who they are?
- Why do simpler prompts often look better?
- CTA: Generate reference‑anchored clips at scale
- Try Veo3Gen (Affordable Veo 3.1 Access)
- Sources
Why your character “drifts” (and why reference-first fixes it)
If you’ve ever generated image-to-video and watched a character subtly turn into someone else mid-clip, you’ve hit identity drift: the model keeps “inventing” details frame to frame when it’s not strongly constrained.
A practical way to reduce drift is to go reference-first:
- Let the input image carry the look (face, hair, wardrobe, palette, lighting, framing).
- Use the text prompt mainly to specify motion (what moves, how the camera moves, what in the scene changes).
This mirrors a core idea described in guides for reference-based workflows: references are used to keep visual consistency, while prompts steer context, actions, and atmosphere. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
In other words: the more your prompt tries to re-describe the person, the more opportunities you give the model to “reinterpret” them.
What to prep before you generate: 5 traits your reference image must carry
Before you write a “same character” prompt, pick (or create) a reference image that already contains most of the decisions you care about.
1) Identity details (the non-negotiables)
Make sure the reference clearly shows distinguishing traits that must not change (e.g., haircut silhouette, eyebrow shape, signature accessory). The goal is that your text prompt doesn’t need to restate these.
2) Composition that matches the shot you want
If you want a medium close-up talking-head, use a medium close-up reference. If you want a full-body walk cycle, use a full-body reference. Mismatched framing is a common cause of warped hands/limbs.
3) Lighting direction and mood
Try to “bake in” the lighting direction (window light, backlight, neon spill). Scene-driven prompting that calls out camera and lighting can help, but your reference is the strongest anchor. (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/)
4) Wardrobe and texture cues
If your brand character always wears a specific jacket, make that jacket obvious in the reference. Texture cues (like film grain or soft depth of field) can also be prompted, but it’s best when the reference already points that way. (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/)
5) Style continuity
If the reference is photoreal, keep your motion prompt photoreal. If it’s stylized/illustrative, keep everything in that lane. Big style swings increase drift.
The core template: “Same character from reference” + motion-only prompting
Reference-based prompting guides often recommend starting with subject acknowledgment to anchor the prompt to the reference. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
Below is a Veo3Gen-friendly, reusable structure inspired by that approach (subject acknowledgment → action → environment → style/mood), but tuned for motion-only prompting.
Copy-paste prompt template (bracketed placeholders)
Use explicit identity-lock phrasing up front—then stop describing the face.
Template
Same character from reference, maintaining appearance and exact features. Do not change identity.
[SUBJECT] in [SETTING], wearing [WARDROBE].
Subject motion: [SUBJECT_MOTION].
Camera motion: [CAMERA_MOTION].
Scene motion: [SCENE_MOTION].
Lighting/mood: [LIGHTING_MOOD].
Style/texture: [STYLE_TEXTURE].
Why those identity-lock phrases help: they function as a clear “constraint header.” Reference-anchoring guidance emphasizes tying the prompt directly to the reference to avoid losing the subject’s identity. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
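As a sketch of how you might fill the bracketed placeholders programmatically, here is a minimal Python template builder. The function name and fields are illustrative assumptions, not part of any Veo3Gen API; it only assembles the prompt text from the template above.

```python
# Minimal sketch: fill the bracketed template fields and emit the final prompt.
# The identity-lock header is kept verbatim; all names here are illustrative.

IDENTITY_LOCK = (
    "Same character from reference, maintaining appearance and exact features. "
    "Do not change identity."
)

def build_prompt(subject, setting, wardrobe, subject_motion,
                 camera_motion, scene_motion, lighting_mood, style_texture):
    """Assemble the reference-anchored prompt from the template fields."""
    return "\n".join([
        IDENTITY_LOCK,
        f"{subject} in {setting}, wearing {wardrobe}.",
        f"Subject motion: {subject_motion}.",
        f"Camera motion: {camera_motion}.",
        f"Scene motion: {scene_motion}.",
        f"Lighting/mood: {lighting_mood}.",
        f"Style/texture: {style_texture}.",
    ])

prompt = build_prompt(
    subject="the character",
    setting="a sunlit kitchen",
    wardrobe="a denim jacket",
    subject_motion="lifts coffee cup, takes a small sip",
    camera_motion="slow push-in",
    scene_motion="steam rising from the cup",
    lighting_mood="warm morning window light",
    style_texture="film grain, soft depth of field",
)
print(prompt)
```

Keeping the header as a constant means it stays byte-for-byte identical across every run, which is exactly the consistency the identity-lock phrasing is meant to provide.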
Motion stack: subject vs camera vs scene (with examples)
When your reference already defines the “who,” most of your wins come from separating what moves into three layers.
Subject motion (what the character does)
Keep it concrete and filmable:
- “turns head to camera, smiles, then looks away”
- “reaches forward and taps the product label”
- “walks two steps, stops, adjusts jacket collar”
Camera motion (how the viewer moves)
Camera language is especially powerful when you keep it to one move.
Examples:
- “slow push-in from medium shot to close-up”
- “handheld micro-shake, subtle”
- “smooth left-to-right dolly, parallax visible”
Scene-driven prompting that includes camera angle/motion is commonly recommended for more professional results. (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/)
Scene motion (what in the environment changes)
This is where you add life without touching identity:
- “curtains gently moving from a breeze”
- “traffic bokeh drifting in background”
- “steam rising from coffee cup”
A note on complexity
If you’re not getting clean results, reduce scope. One guide explicitly advises avoiding overstuffed prompts and sticking to one strong visual idea. (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/)
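Following that one-strong-idea advice, the three motion layers can be represented as small option lists with a helper that selects at most one move per layer. This is a purely illustrative Python sketch (no Veo3Gen API is assumed); the option strings come from the examples above.

```python
# Illustrative sketch: pick at most one move per motion layer so the prompt
# never stacks competing instructions. Options come from the examples above.
SUBJECT_MOVES = [
    "turns head to camera, smiles, then looks away",
    "reaches forward and taps the product label",
    "walks two steps, stops, adjusts jacket collar",
]
CAMERA_MOVES = [
    "slow push-in from medium shot to close-up",
    "handheld micro-shake, subtle",
    "smooth left-to-right dolly, parallax visible",
]
SCENE_MOVES = [
    "curtains gently moving from a breeze",
    "traffic bokeh drifting in background",
    "steam rising from coffee cup",
]

def motion_block(subject_idx, camera_idx=None, scene_idx=None):
    """Build the motion lines; camera/scene are optional so simple runs stay simple."""
    lines = [f"Subject motion: {SUBJECT_MOVES[subject_idx]}."]
    if camera_idx is not None:
        lines.append(f"Camera motion: {CAMERA_MOVES[camera_idx]}.")
    if scene_idx is not None:
        lines.append(f"Scene motion: {SCENE_MOVES[scene_idx]}.")
    return "\n".join(lines)

print(motion_block(0))                 # subject motion only
print(motion_block(0, camera_idx=2))   # subject motion plus one camera move
```

Making the camera and scene layers optional arguments enforces the "reduce scope" advice structurally: you have to opt in to each additional layer.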
7 copy‑paste prompt recipes (creator + small-business use cases)
Each example below assumes you’ve already supplied a strong reference image.
1) Creator portrait (clean talking-head beat)
Same character from reference, maintaining appearance and exact features. Subject remains consistent.
Subject motion: subtle breathing, natural blink, small smile, then a gentle head nod. Camera motion: locked-off tripod, minimal movement. Scene motion: none. Lighting/mood: soft window light from the left casting long shadows. Style/texture: soft depth of field, light film grain.
(Example lighting/texture phrasing aligns with suggested prompt cues like soft window light, film grain, and soft depth of field.) (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/)
2) Product demo (hands + label reveal)
Same character from reference, maintaining appearance and exact features.
Subject motion: brings [PRODUCT] into frame, rotates it slowly to show the label, then points to the key feature. Camera motion: slow push-in to the product. Scene motion: gentle bounce of sleeve fabric as the hand moves. Lighting/mood: clean studio lighting, soft reflections. Style/texture: crisp detail, glass reflections on lens.
(“Glass reflections on lens” is an example texture cue recommended for richer results.) (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/)
3) UGC-style ad (hook + gesture)
Same character from reference, maintaining appearance and exact features.
Subject motion: quick lean toward camera like a hook, enthusiastic hand gesture, then holds [PRODUCT] near face. Camera motion: handheld, subtle micro-shake. Scene motion: background stays stable. Lighting/mood: bright indoor daylight. Style/texture: natural smartphone look, soft depth of field.
4) Lifestyle b-roll (coffee + glance)
Same character from reference, maintaining appearance and exact features.
Subject motion: lifts coffee cup, takes a small sip, relaxed exhale, looks out the window. Camera motion: slow right-to-left dolly. Scene motion: steam rises from the cup. Lighting/mood: warm morning window light. Style/texture: film grain, gentle bloom.
5) Interior walkthrough (brand character in space)
Same character from reference, maintaining appearance and exact features.
Subject motion: walks slowly through the room, pauses to touch the countertop, then turns back to camera. Camera motion: smooth gimbal follow, medium-wide. Scene motion: subtle curtain movement from a breeze. Lighting/mood: soft ambient interior light. Style/texture: clean, realistic, soft depth of field.
6) Outdoor hero shot (sun + flare)
Same character from reference, maintaining appearance and exact features.
Subject motion: steps into sunlight, hair moves slightly in wind, confident look to camera. Camera motion: slow push-in, slight tilt up. Scene motion: leaves moving in background. Lighting/mood: backlit by harsh sunlight, lens flare visible. Style/texture: high contrast, film grain.
(Backlit harsh sunlight and lens flare are cited examples of lighting prompt language.) (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/)
7) Urban street (parallax + vibe)
Same character from reference, maintaining appearance and exact features.
Subject motion: walks toward camera at a calm pace, glances to the side, then back forward. Camera motion: backward tracking shot, smooth, parallax visible. Scene motion: pedestrians and traffic bokeh drifting behind. Lighting/mood: evening street light, cinematic mood. Style/texture: soft depth of field, subtle film grain.
Troubleshooting: when the model ignores the reference (change this first)
Reference-based prompting guides emphasize anchoring the subject early and using progressive detailing—start simple, then add complexity. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
Use this table to decide your next change without randomly rewriting everything.
| Symptom | Likely cause | Best next change |
|---|---|---|
| Identity drift (face/hair changes) | Prompt is re-describing identity or adding too many competing details | Strengthen the first line: “same character from reference… exact features,” and remove extra facial descriptors |
| Hands look weird during product demo | Motion too complex + framing mismatch | Simplify subject motion to one action; use a reference with hands visible in similar framing |
| Background morphs or “melts” | Too much scene description or multiple environments | Reduce environment to one clear setting; avoid stacking many props |
| Character outfit changes | Wardrobe not clearly anchored | Ensure wardrobe is visible in reference; add a single wardrobe line and stop |
| Camera move feels chaotic | Too many camera instructions | Pick one camera move (push-in or dolly or handheld), not three |
| Overall looks “busy” or inconsistent | Overstuffed prompt | Keep one strong visual idea; cut adjectives and extra beats (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/) |
Mini workflow: 3 runs to a keeper (iterate one variable at a time)
A reliable beginner workflow is progressive detailing: start simple and layer in complexity only after you’ve confirmed identity is stable. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
Run 1: Lock identity + minimal motion
- Use only the identity-lock header.
- Add one small subject action (blink + head turn).
- No camera move.
Run 2: Add one motion layer
- Keep identity text identical.
- Add either a camera move or a richer subject action.
Run 3: Add scene motion or texture
- Add one environmental motion cue (steam, curtains, bokeh).
- Optionally add one texture cue (film grain or soft depth of field). (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/)
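The three runs above can be sketched as successive prompt versions, each adding one layer while the identity header stays identical. This is a hypothetical illustration of the iteration discipline, not a Veo3Gen call:

```python
# Sketch of progressive detailing: each run appends new lines while the
# identity header stays byte-for-byte identical across iterations.
IDENTITY_LOCK = (
    "Same character from reference, maintaining appearance and exact features. "
    "Do not change identity."
)

run1 = [IDENTITY_LOCK, "Subject motion: natural blink, gentle head turn."]
run2 = run1 + ["Camera motion: slow push-in."]  # add ONE motion layer
run3 = run2 + ["Scene motion: steam rising from the cup.",
               "Style/texture: light film grain."]

for name, run in [("run1", run1), ("run2", run2), ("run3", run3)]:
    print(name, "->", len(run), "lines")

# The identity header never changes between iterations:
assert run1[0] == run2[0] == run3[0]
```

Because each run is built from the previous one, a bad result always has exactly one candidate cause: the single line you just added.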
Quick checklist (before you hit generate)
- My prompt starts with: “same character from reference” + “maintaining appearance / exact features”
- I’m not re-describing the face (the reference already does)
- I chose one primary motion (subject or camera)
- The reference framing matches the intended action (hands/full body/close-up)
- I’m changing only one variable per iteration
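The checklist can be approximated as a small pre-generate lint. The keyword lists below are rough assumptions for illustration, not an official validator; tune them to your own prompt vocabulary.

```python
# Heuristic pre-generate lint for the checklist above. The keyword lists are
# rough assumptions; adjust them for your own prompt vocabulary.
FACE_WORDS = ("eyes", "nose", "jawline", "cheekbones", "lips")
CAMERA_MOVES = ("push-in", "dolly", "handheld", "tracking", "tilt", "pan")

def lint_prompt(prompt: str) -> list:
    """Return a list of checklist warnings; empty means ready to generate."""
    warnings = []
    low = prompt.lower()
    if not low.startswith("same character from reference"):
        warnings.append("missing identity-lock header")
    if any(w in low for w in FACE_WORDS):
        warnings.append("re-describing the face; let the reference carry identity")
    if sum(low.count(m) for m in CAMERA_MOVES) > 1:
        warnings.append("more than one camera move; pick one")
    return warnings

good = ("Same character from reference, maintaining appearance and exact features.\n"
        "Camera motion: slow push-in.")
bad = "A woman with sharp cheekbones. Camera: dolly then tilt up."
print(lint_prompt(good))  # []
print(lint_prompt(bad))
```

A lint like this won't catch everything (it can't see the reference image), but it cheaply enforces the two most common failure modes: re-describing identity and stacking camera moves.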
FAQ
What does “same character from reference” actually do?
It’s explicit subject acknowledgment that ties the prompt to the reference, which is recommended as a way to anchor outputs to the referenced subject. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
Should I describe the person’s face in the prompt?
Usually no. If the reference image already shows identity, use text to direct actions, environment, and mood—references handle visual consistency while prompts control context and atmosphere. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
How do I move the character into a new scene without changing who they are?
Use “contextual bridging”: keep identity anchored to the reference while changing the scenario in the prompt. (https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references)
Why do simpler prompts often look better?
Overstuffing prompts can dilute the main visual idea; guidance for pro-grade results recommends sticking to one strong idea and adding camera/lighting details intentionally. (https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/)
CTA: Generate reference‑anchored clips at scale
If you’re ready to turn this template into a repeatable pipeline—batch variants, A/B camera moves, or generate multiple clips to stitch into a sequence—Veo3Gen is built for programmatic workflows.
- Explore the developer workflow: /api → https://veo3gen.com/api
- See plans and usage options: /pricing → https://veo3gen.com/pricing
Keep the reference fixed, iterate one variable at a time, and you’ll get to “same character, clean motion” faster—without reinventing your prompt every run.
Try Veo3Gen (Affordable Veo 3.1 Access)
If you want to turn these tips into real clips today, try Veo3Gen.
Sources
- https://www.imagine.art/blogs/prompt-guide-runway-gen-4-references
- https://focalml.com/blog/exploring-runway-gen-4-tips-for-crafting-professional-grade-videos/