Creator How-To (Image-to-Video)
First & Last Frame in Veo 3.1 (as of 2026-02-24): A Creator FAQ for Smooth Transitions, Match Cuts, and Before→After Ads
Learn how to use Veo 3.1 first & last frame as creative anchors for match cuts, smooth transitions, and before→after ads—plus templates and fixes.
On this page
- What “First & Last Frame” is (and when it beats text-only prompting)
- Quick start: the 3-part prompt that makes transitions obey your frames
- The 3-part structure
- Copy/paste prompt skeleton (with labeled slots)
- FAQ: What kinds of transitions work best (match cut, whip pan, morph, rack focus)?
- What’s the easiest “high success” transition?
- Are whip pans reliable?
- Should I use “morph” transitions?
- When does rack focus help?
- FAQ: How to design your first/last images for continuity
- What must stay consistent?
- What should change to make the transition feel meaningful?
- Mini guide: how to build the two frames
- FAQ: How to control the camera move between frames (without drift)
- How do I prevent the camera from inventing a new shot?
- How do I make the transition feel like a real edit?
- FAQ: How to add audio direction that matches the transition (dialogue, SFX, music)
- FAQ: Common problems (and fixes)
- Problem: The last frame isn’t respected
- Problem: Character or brand drift (face changes, logo shifts)
- Problem: Unnatural morphing in the middle
- Problem: Lighting jumps between frames
- Problem: Camera jitter or wobble
- 5 ready-to-copy prompt templates
- 1) Before→after AI video ad (cleaning/beauty/upgrade)
- 2) Product reveal (hand-off to hero)
- 3) Location swap match cut
- 4) Character entrance (occlusion reveal)
- 5) Logo end-card landing
- A simple QC checklist before you render variations
- CTA: Build this workflow into your pipeline
- FAQ (quick)
- Does Veo 3.1 support first/last frame inputs?
- Is Veo 3.1 production-ready?
- Can I direct camera and lens behavior in prompts?
- Should I change style between the first and last frames?
- What’s the fastest way to improve results?
- Try Veo3Gen (Affordable Veo 3.1 Access)
What “First & Last Frame” is (and when it beats text-only prompting)
First & Last Frame is a control mode where you provide two endpoint images—your opening frame and your closing frame—and the model generates the motion and in-between moments that connect them.
Here’s the mental model that makes this feature useful in production:
- Endpoints are anchors: your first frame locks the start, your last frame locks the destination.
- The prompt directs the journey: your text describes how the camera and the scene travel from anchor A to anchor B.
As of 2026-02-24, this is one of the most practical ways to get transitions that feel edited rather than “generated,” especially for:
- Match cuts (same composition, different scene)
- Before→after ads (messy room → spotless room; dull hair → shiny hair)
- Product reveals (hand enters frame holding product → hero shot end-card)
- Location swaps (street → studio)
Veo 3.1 is described by Google Cloud as a state-of-the-art video generation model with professional-grade creative controls and rich synchronous audio. (https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-veo-3-1)
Quick start: the 3-part prompt that makes transitions obey your frames
If your outputs “wander,” it’s usually because the model has too many competing instructions. Keep your direction simple and structured.
The 3-part structure
- Restate the anchors (briefly describe what must be true in the first and last frames)
- Define the transition path (camera move + the single main change)
- Constrain style + timing + audio (avoid big mid-clip reinventions)
Replicate’s Veo 3.1 prompting guidance emphasizes specifying composition, lens/focus, style, and camera movement—those same elements matter even more when you’re trying to “bridge” two fixed endpoints. (https://replicate.com/blog/veo-3-1)
Copy/paste prompt skeleton (with labeled slots)
Use this as your default starting point:
Prompt:
- [First frame description]: …
- [Last frame description]: …
- [Transition + camera]: …
- [Style]: …
- [Timing]: …
- [Audio]: …
Audio is a first-class control in Veo 3.1 according to Google Cloud’s description of “rich synchronous audio,” so it’s worth planning the sound to match the cut instead of treating it as an afterthought. (https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-veo-3-1)
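If you assemble prompts in a script, the six labeled slots can be kept as structured fields and rendered in a fixed order. The sketch below is only an illustration: the `TransitionPrompt` class, its field names, and the empty-slot check are assumptions made for this example, not part of Veo 3.1 or any official SDK.

```python
from dataclasses import dataclass

# Maps each (hypothetical) field name to the slot label used in the skeleton.
SLOT_LABELS = {
    "first_frame": "First frame description",
    "last_frame": "Last frame description",
    "transition_camera": "Transition + camera",
    "style": "Style",
    "timing": "Timing",
    "audio": "Audio",
}


@dataclass
class TransitionPrompt:
    first_frame: str
    last_frame: str
    transition_camera: str
    style: str
    timing: str
    audio: str

    def render(self) -> str:
        """Return the six slots as labeled lines, refusing empty slots."""
        lines = []
        for field_name, label in SLOT_LABELS.items():
            value = getattr(self, field_name).strip()
            if not value:
                raise ValueError(f"slot [{label}] is empty")
            lines.append(f"[{label}]: {value}")
        return "\n".join(lines)


prompt = TransitionPrompt(
    first_frame="A small bathroom sink area, cluttered counter, dull mirror.",
    last_frame="Same sink composition, spotless counter, mirror streak-free.",
    transition_camera="A hand wipes across the lens; camera locked-off.",
    style="Bright commercial realism, no stylization.",
    timing="4 seconds total; wipe occurs around 2 seconds.",
    audio="Soft room tone, wipe swish at midpoint, subtle music lift.",
)
print(prompt.render())
```

Keeping the slots separate makes it easy to swap one field (for example, the camera line) while the anchors stay untouched.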
FAQ: What kinds of transitions work best (match cut, whip pan, morph, rack focus)?
What’s the easiest “high success” transition?
Match cut. Keep framing consistent across both images and change only the environment or the subject state (clean/dirty, empty/full, day/night). Your prompt’s job is then to describe one clear bridge: a subtle push-in, a head turn, a hand wiping across the lens, etc.
Are whip pans reliable?
They can be, but treat them as a motion mask. Compose both endpoints similarly, then request a fast lateral camera move with motion blur only in the middle of the clip, not throughout. If you ask for blur the whole time, you risk losing identity details.
Should I use “morph” transitions?
Use morphing sparingly for brand work. If you want a transformation (e.g., “before → after”), it’s often cleaner to describe a physical cause (wipe, splash, snap, fold, spin) rather than an abstract “morph,” which can introduce unintended mid-clip deformations.
When does rack focus help?
Rack focus is great when you want to change attention without changing the camera position much. Ask for:
- foreground focus at start,
- smooth pull to background by mid-clip,
- end in sharp focus on the final product/face.
Invideo’s 7-layer prompt formula explicitly includes Camera & Lens, Lighting, and Audio—those layers map well to transition planning because they constrain the “journey” between endpoints. (https://invideo.io/blog/google-veo-prompt-guide/)
FAQ: How to design your first/last images for continuity
What must stay consistent?
To sell a seamless transition, keep these stable across both frames:
- Subject identity: same person/product angle if you want a match cut
- Key props: logo placement, hero object, distinctive wardrobe elements
- Composition: horizon line, head size, product size in frame
- Lighting direction: same key light side if you want “one shot” continuity
What should change to make the transition feel meaningful?
Change only 1–2 things that communicate the story:
- Before→after state (messy → clean)
- Location (office → beach)
- Mood (neutral → celebratory)
- Product state (closed → open)
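One way to sanity-check a frame pair before generating anything is to describe each frame as a small dictionary and compare the two. This is a rough sketch under stated assumptions: the key names and the `check_frame_plan` helper are invented for illustration, and the "at most two story changes" limit simply mirrors the 1–2 guideline above.

```python
# Attributes that should match across both frames (hypothetical key names).
STABLE_KEYS = ("subject_identity", "key_props", "composition", "lighting_direction")
# Attributes that are allowed to carry the story change.
STORY_KEYS = ("state", "location", "mood", "product_state")


def check_frame_plan(first: dict, last: dict, max_story_changes: int = 2) -> list:
    """Return a list of warnings; an empty list means the plan looks consistent."""
    warnings = []
    for key in STABLE_KEYS:
        if first.get(key) != last.get(key):
            warnings.append(f"'{key}' differs between frames; keep it stable")
    changed = [key for key in STORY_KEYS if first.get(key) != last.get(key)]
    if not changed:
        warnings.append("no story change between frames")
    elif len(changed) > max_story_changes:
        warnings.append(f"too many story changes: {', '.join(changed)}")
    return warnings


first = {
    "subject_identity": "product bottle",
    "key_props": "bottle label facing camera",
    "composition": "sink centered, eye-level",
    "lighting_direction": "soft key from camera-left",
    "state": "cluttered counter, dull mirror",
    "location": "small bathroom",
}
# Only the state changes, so this plan produces no warnings.
last = dict(first, state="spotless counter, streak-free mirror")
print(check_frame_plan(first, last))  # prints []
```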
Mini guide: how to build the two frames
- Frame 1 (setup): include the “problem” clearly, but avoid clutter that you don’t want carried through.
- Frame 2 (payoff): keep the camera angle consistent; increase clarity—cleaner background, stronger product presence, more readable branding.
As of 2026-02-24, the best practical habit is to treat your first/last images like an editor would: same lens and framing if you want a cut that feels intentional.
FAQ: How to control the camera move between frames (without drift)
How do I prevent the camera from inventing a new shot?
- Specify one camera instruction: “slow dolly in,” “locked-off tripod,” or “gentle handheld.”
- Add a constraint: “maintain the same framing and angle as the first frame.”
Replicate’s guide highlights camera positioning and movement as promptable controls; use that, but avoid stacking multiple moves (e.g., dolly + orbit + crane) unless you truly need it. (https://replicate.com/blog/veo-3-1)
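A quick keyword check can catch stacked camera moves before you submit a prompt. The sketch below is a heuristic only: the `CAMERA_MOVES` keyword groups and both function names are assumptions for this example, and plain substring matching will miss phrasings that are not in the list.

```python
from typing import List, Optional

# Illustrative keyword groups; each group counts as one kind of camera instruction.
CAMERA_MOVES = {
    "static": ("locked-off", "tripod"),
    "dolly": ("dolly", "push-in"),
    "orbit": ("orbit",),
    "crane": ("crane",),
    "whip pan": ("whip pan",),
    "handheld": ("handheld",),
}


def find_camera_moves(text: str) -> List[str]:
    """Return the camera-move groups mentioned in the text."""
    lowered = text.lower()
    return [
        group
        for group, terms in CAMERA_MOVES.items()
        if any(term in lowered for term in terms)
    ]


def lint_camera_instruction(text: str) -> Optional[str]:
    """Return a warning string, or None when exactly one move is found."""
    moves = find_camera_moves(text)
    if not moves:
        return "no camera instruction found; add one"
    if len(moves) > 1:
        return f"stacked camera moves ({', '.join(moves)}); keep one"
    return None


print(lint_camera_instruction("Camera locked-off tripod, same framing throughout."))
# prints None
print(lint_camera_instruction("Slow dolly in, then orbit and crane up."))
# prints stacked camera moves (dolly, orbit, crane); keep one
```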
How do I make the transition feel like a real edit?
Give the model an editing motivation:
- “match cut on the subject’s silhouette”
- “wipe transition as a hand passes the lens”
- “whip pan to the right, cut lands perfectly on the same composition”
FAQ: How to add audio direction that matches the transition (dialogue, SFX, music)
Veo 3.1 is described as supporting rich synchronous audio, so you can direct sound alongside the visual transition. (https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-veo-3-1)
Practical audio patterns for transitions:
- Match cut: continuous ambient bed + subtle whoosh at cut point
- Before→after: “wipe” SFX or snap + music lift on the reveal
- Rack focus: soft ambient + gentle tonal rise as focus resolves
Keep audio prompts specific but short: what you want, when it peaks, and the mood.
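The three patterns above can be stored as a small lookup so the audio slot always states what you want, when it peaks, and the mood. This is a minimal sketch: the `AUDIO_PATTERNS` keys and the `audio_slot` helper are hypothetical names chosen for this example.

```python
# Audio patterns from the list above, keyed by transition type (keys are illustrative).
AUDIO_PATTERNS = {
    "match cut": "continuous ambient bed; subtle whoosh at the cut point",
    "before-after": "wipe SFX or snap; music lift on the reveal",
    "rack focus": "soft ambient; gentle tonal rise as focus resolves",
}


def audio_slot(transition: str, peak_s: float, mood: str) -> str:
    """Build a short audio line: what, when it peaks, and the mood."""
    if transition not in AUDIO_PATTERNS:
        raise ValueError(f"no audio pattern for transition '{transition}'")
    pattern = AUDIO_PATTERNS[transition]
    return f"{pattern}; peaks at about {peak_s:g} seconds; mood: {mood}"


print(audio_slot("match cut", 2, "calm"))
# prints continuous ambient bed; subtle whoosh at the cut point; peaks at about 2 seconds; mood: calm
```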
FAQ: Common problems (and fixes)
Problem: The last frame isn’t respected
Fixes:
- Restate the last frame in the prompt using concrete details (pose, object placement, text on screen).
- Reduce the amount of mid-clip action.
- Shorten the duration so there’s less “room” to drift.
Problem: Character or brand drift (face changes, logo shifts)
Fixes:
- Keep style constant from start to end; avoid switching aesthetics mid-clip.
- Repeat the identity anchors: “same person,” “same outfit,” “logo centered on chest.”
- If your workflow offers them, use character reference images, which Replicate notes Veo 3.1 supports. (https://replicate.com/blog/veo-3-1)
Problem: Unnatural morphing in the middle
Fixes:
- Replace “morph” with a physical transition (wipe, spin, tilt down, object passes lens).
- Simplify motion: one main action, one camera move.
Problem: Lighting jumps between frames
Fixes:
- Add a single lighting line: “consistent soft key from camera-left; no change in color temperature.”
- Avoid extreme changes like “golden hour” → “neon nightclub” unless you want a stylized jump.
Problem: Camera jitter or wobble
Fixes:
- Ask for “tripod-stable” or “smooth gimbal” explicitly.
- Reduce speed of camera move.
- Shorten duration or remove secondary actions.
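If you retry failed generations from a script, some of the quoted fix lines above can be appended automatically. The sketch below is an assumption-heavy illustration: the problem keys, the `[Constraint]` label, and the `add_fix_lines` helper are all invented here, and only the fixes that are phrased as prompt text are included.

```python
# Prompt lines quoted in the fixes above, keyed by hypothetical problem names.
FIX_LINES = {
    "identity_drift": "same person, same outfit, logo centered on chest",
    "lighting_jump": "consistent soft key from camera-left; no change in color temperature",
    "camera_jitter": "tripod-stable camera",
}


def add_fix_lines(prompt: str, problems: list) -> str:
    """Append one constraint line per distinct known problem."""
    extra = []
    for problem in problems:
        line = FIX_LINES[problem]  # raises KeyError for unknown problem names
        if line not in extra:
            extra.append(line)
    if not extra:
        return prompt
    constraints = "\n".join(f"[Constraint]: {line}" for line in extra)
    return prompt.rstrip() + "\n" + constraints


print(add_fix_lines("[Style]: Clean cinematic.", ["camera_jitter", "lighting_jump"]))
```

Fixes such as shortening the duration or removing secondary actions still need a manual edit of the timing and transition slots.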
5 ready-to-copy prompt templates
Adapt these by replacing the quoted text in each bracketed slot. Templates 2–5 abbreviate the first two labels to [First] and [Last].
1) Before→after AI video ad (cleaning/beauty/upgrade)
- [First frame description]: “A small bathroom sink area, cluttered counter, dull mirror, product bottle on the left.”
- [Last frame description]: “Same sink composition, spotless counter, mirror streak-free, same product bottle now centered and label facing camera.”
- [Transition + camera]: “A hand wipes across the lens left-to-right; as the hand passes, the scene becomes clean. Camera locked-off, same framing throughout.”
- [Style]: “Bright commercial realism, natural textures, no stylization.”
- [Timing]: “4 seconds total; wipe occurs around 2 seconds.”
- [Audio]: “Soft bathroom room tone, satisfying wipe ‘swish’ at midpoint, subtle music lift on the clean reveal.”
2) Product reveal (hand-off to hero)
- [First]: “Close-up of hands holding a closed product box at chest height.”
- [Last]: “Hero shot of the product out of the box on a clean surface, centered.”
- [Transition + camera]: “Slow push-in as the box opens; cut lands on the hero placement without changing angle.”
- [Style]: “Studio commercial, soft shadows.”
- [Timing]: “5 seconds.”
- [Audio]: “Box opening SFX, gentle whoosh as camera pushes in.”
3) Location swap match cut
- [First]: “Subject centered, medium shot, standing in a city street.”
- [Last]: “Same pose and framing, now in a minimalist studio.”
- [Transition + camera]: “Whip pan right with motion blur mid-clip; pan stops perfectly on the same composition in the studio.”
- [Style]: “Natural cinematic realism.”
- [Timing]: “4 seconds.”
- [Audio]: “City ambience fades into quiet room tone during the whip; brief whoosh.”
4) Character entrance (occlusion reveal)
- [First]: “Empty hallway, camera facing forward.”
- [Last]: “Same hallway framing, character now in foreground facing camera.”
- [Transition + camera]: “A foreground object passes close to lens (brief full occlusion). When it clears, the character is present.”
- [Style]: “Clean cinematic.”
- [Timing]: “3–4 seconds.”
- [Audio]: “Footsteps begin softly after occlusion; subtle riser.”
5) Logo end-card landing
- [First]: “Product on table, brand colors in background, space above product.”
- [Last]: “Same scene, logo lockup and tagline cleanly visible in the negative space.”
- [Transition + camera]: “Very slow push-in; minimal movement; logo resolves crisply in final second.”
- [Style]: “Commercial, high clarity.”
- [Timing]: “5 seconds; logo starts to appear at about 3.5 seconds and is fully resolved in the final second.”
- [Audio]: “Music sting resolves at the final frame; no distracting SFX.”
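When you adapt the [Timing] slot, it helps to confirm that the cue actually falls inside the clip. Here is a small sketch, assuming a hypothetical `timing_slot` helper that formats the line in the style of template 1; the validation rule is an illustration, not a Veo 3.1 requirement.

```python
def timing_slot(total_s: float, cue: str, cue_s: float) -> str:
    """Format a [Timing] line after checking the cue lands inside the clip."""
    if total_s <= 0:
        raise ValueError("total duration must be positive")
    if not 0 < cue_s < total_s:
        raise ValueError(f"cue at {cue_s:g}s is outside a {total_s:g}s clip")
    return f"{total_s:g} seconds total; {cue} occurs around {cue_s:g} seconds."


print(timing_slot(4, "wipe", 2))
# prints 4 seconds total; wipe occurs around 2 seconds.
```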
A simple QC checklist before you render variations
- First and last frames share the same framing (if match cut) or an intentionally planned change
- Only one main action is driving the transition
- Camera instruction is single and explicit (tripod / gimbal / slow dolly)
- Lighting notes don’t contradict your frames
- Audio cue matches the transition moment (wipe/whip/focus)
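The last checklist item can be approximated with a keyword comparison between the transition and audio slots. Treat this as a rough heuristic: the `MECHANISMS` tuple and the `audio_matches_transition` function are assumptions for this example, and a transition that names none of the three mechanisms passes automatically.

```python
# Transition mechanisms named in the checklist item above.
MECHANISMS = ("wipe", "whip", "focus")


def audio_matches_transition(transition: str, audio: str) -> bool:
    """True when every mechanism named in the transition is also named in the audio."""
    transition_text = transition.lower()
    audio_text = audio.lower()
    used = [m for m in MECHANISMS if m in transition_text]
    return all(m in audio_text for m in used)


print(audio_matches_transition(
    "A hand wipes across the lens left-to-right; camera locked-off.",
    "Soft bathroom room tone, satisfying wipe swish at midpoint.",
))  # prints True
print(audio_matches_transition(
    "Whip pan right with motion blur mid-clip.",
    "Soft ambient music only.",
))  # prints False
```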
CTA: Build this workflow into your pipeline
If you’re turning these templates into a repeatable production system—batching variations, testing multiple endpoints, or generating ad sets—Veo3Gen makes it easy to integrate.
- Explore the developer workflow in our docs: /api
- Estimate costs and plan production: /pricing
FAQ (quick)
Does Veo 3.1 support first/last frame inputs?
Yes. Replicate’s Veo 3.1 prompting post describes first/last frame input, and Google Cloud’s Veo 3.1 guide mentions a “first frame, last frame” capability in a quoted statement. (https://replicate.com/blog/veo-3-1) (https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-veo-3-1)
Is Veo 3.1 production-ready?
Google Cloud states Veo 3.1 is stable and generally available for production on Vertex AI. (https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-veo-3-1)
Can I direct camera and lens behavior in prompts?
Replicate’s guide includes prompting examples covering shot composition, focus/lens effects, and camera positioning and movement. (https://replicate.com/blog/veo-3-1)
Should I change style between the first and last frames?
As of 2026-02-24, big style changes often increase drift risk; for continuity, keep style stable and put the “story change” into props, environment, or state instead.
What’s the fastest way to improve results?
Reduce complexity: restate your anchors, choose one transition mechanism (wipe/whip/match), and keep timing short enough that the model doesn’t invent extra beats.
Try Veo3Gen (Affordable Veo 3.1 Access)
If you want to turn these tips into real clips today, try Veo3Gen:
Try Veo 3 & Veo 3 API for Free
Experience cinematic AI video generation at the industry's lowest price point. No credit card required to start.