AI Video Prompt “Anatomy” for Creators: The 6-Part Shot Formula (Subject → Action → Scene → Camera → Light → Style) + 10 Plug‑and‑Play Examples (as of 2026-04-15)

If your generations look almost right, but motion is vague, lighting feels flat, or the camera does something random, your prompt likely lacks a clear “shot card.” The prompt is what determines the content an AI video model produces from text (or images). (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

This post gives you a tool-agnostic prompt anatomy you can reuse across tools and workflows, then port directly into Veo3Gen.

What a “good” AI video prompt actually contains (and what to leave out)

Most creator prompts fail for one of two reasons:

They describe a vibe, not a shot. (Mood without subject/action/camera intent.)
They dump everything into one sentence. (The model can’t “prioritize” what you meant.)

A practical middle ground is to write prompts as six explicit slots you can fill quickly: Subject → Action → Scene → Camera → Light → Style. FlexClip teaches a similar structure: Subject + Action + Scene + (Camera Movement + Lighting + Style). (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

What to leave out (most of the time): brand lore, long backstory, multiple scene changes, and conflicting camera instructions. You can add more later—but start with one shot.

The 6-part prompt formula: Subject → Action → Scene → Camera → Light → Style

Below is the “anatomy,” explained in plain language, with micro-examples you can steal.

1) Subject (who/what the viewer should look at)

The Subject is the focus of the video—person, animal, plant, or object. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Micro-examples:

Subject: “A barista in a green apron, 30s”
Subject: “A matte-black smartwatch on a stone table”

Tip: add one or two identifiers (material, age range, color, defining feature). Not ten.

2) Action (what changes over time)

Action is the core of the prompt because it drives the storyline. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Micro-examples:

Action: “pours steamed milk into espresso, then taps the cup once”
Action: “a droplet hits the watch face and beads off”

Tip: If your output feels like a still image, your action is probably too static. Use a clear verb and a single beat.

3) Scene (where it happens + what’s around it)

Scene is the location and context: foreground, background, and elements that set the place. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Micro-examples:

Scene: “cozy café counter, blurred customers in background”
Scene: “minimal studio tabletop, soft gray backdrop, water droplets nearby”

Tip: choose one main environment, then one background detail for believability.

4) Camera (framing + lens feel + movement)

The Camera slot covers shot type, angle, and movement; together they add narrative and visual appeal. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Micro-examples:

Camera: “medium close-up, eye level, slow push-in”
Camera: “top-down product shot, gentle slide left”

FlexClip also notes you can combine movements (e.g., move down + zoom out; aerial + zoom in; handheld follow). (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Tip: put camera direction only here. If you scatter camera notes inside Scene or Style, the model is more likely to ignore them.

5) Light (mood, depth, time of day)

Lighting can significantly change mood and depth, with examples like warm light, morning light, spotlight, and backlighting. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Micro-examples:

Light: “soft morning window light, gentle shadows”
Light: “dramatic backlight with rim highlight on edges”

Tip: pick one key lighting idea + one modifier (soft/hard, warm/cool, directional/backlit).

6) Style (visual language + tone)

Style sets the overall look and emotional tone; it can include aesthetic and mood (e.g., anime, American comics). (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Micro-examples:

Style: “clean commercial, modern, minimal”
Style: “playful stop-motion look, slightly exaggerated motion”

Tip: if you’re making UGC-style ads, “style” can simply be “authentic handheld phone video, natural color.”

A fill-in template you can copy for Veo3Gen (text-to-video)

Powtoon recommends prompts that are descriptive and clear, refining a core idea by adding keywords and modifiers. (https://powtoonsupport.powtoon.com/hc/en-gb/articles/36565952594321-Text-to-Video-Prompt-Guide)

Copy/paste template (swap the brackets):

SUBJECT: [main person/object + 1–2 defining traits]
ACTION: [single clear action beat + any micro-gesture]
SCENE: [location + background detail + foreground props]
CAMERA: [shot size + angle + movement (push-in/orbit/dolly/handheld follow)]
LIGHT: [time-of-day or light source + softness/hardness + mood]
STYLE: [visual style + tone + optional genre (commercial/UGC/animation)]

TECH (optional): [duration], [aspect ratio], [resolution], [format]

If you like being explicit, Pyxeljam calls out “Set Technical Specs” (length, resolution, format) as a best practice. (https://pyxeljam.com/10-best-practices-for-writing-effective-ai-video-prompts/)
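If you fill the template often, it can help to keep the six slots as data and render the text from them. The following is a minimal Python sketch, not part of any Veo3Gen SDK: the `ShotCard` class, its `render` method, and `SLOT_ORDER` are hypothetical names invented for illustration, and the output format simply mirrors the labeled template above.

```python
from dataclasses import dataclass

# Slot order follows the formula: Subject -> Action -> Scene -> Camera -> Light -> Style.
SLOT_ORDER = ("subject", "action", "scene", "camera", "light", "style")


@dataclass
class ShotCard:
    """One shot, one value per slot (hypothetical helper, not a Veo3Gen type)."""
    subject: str
    action: str
    scene: str
    camera: str
    light: str
    style: str
    tech: str = ""  # optional: duration, aspect ratio, resolution, format

    def render(self) -> str:
        """Return the labeled, one-line-per-slot prompt text."""
        lines = [f"{name.upper()}: {getattr(self, name).strip()}" for name in SLOT_ORDER]
        if self.tech.strip():
            lines.append(f"TECH: {self.tech.strip()}")
        return "\n".join(lines)


card = ShotCard(
    subject="roasted coffee beans in a glass jar",
    action="beans pour in slow stream onto a wooden table and scatter",
    scene="rustic kitchen table, linen cloth, blurred kettle in background",
    camera="macro close-up, low angle, slow push-in",
    light="warm light, soft shadows, cozy mood",
    style="premium commercial b-roll, crisp texture emphasis",
)
print(card.render())
```

Keeping the slots as separate fields means a missing slot fails loudly when the card is created instead of silently producing a thinner prompt.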

Image-to-video variant: what changes when you already have a keyframe

When you start from an image, you usually don’t need to re-describe every surface detail—your main job is to define motion.

FlexClip suggests an image-to-video single-action structure like: Subject + Action + Background + Background Movement + Camera Movement. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Copy/paste image-to-video template:

SUBJECT: [what in the image should stay primary]
ACTION: [primary motion on subject]
BACKGROUND: [what’s behind the subject]
BACKGROUND MOVEMENT: [wind, people walking, traffic, parallax, subtle flicker]
CAMERA: [shot + movement]

(OPTIONAL) LIGHT: [small change only—e.g., clouds passing]
(OPTIONAL) STYLE: [keep consistent with the source image]
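The same idea can be sketched for the image-to-video template. This is an illustrative Python helper under stated assumptions: the function name `image_to_video_prompt` and its parameters are hypothetical, and it only assembles text. It does not upload an image or call any API.

```python
def image_to_video_prompt(
    subject: str,
    action: str,
    background: str,
    background_movement: str,
    camera: str,
    light: str = "",
    style: str = "",
) -> str:
    """Assemble the image-to-video template; LIGHT and STYLE are optional."""
    required = [
        ("SUBJECT", subject),
        ("ACTION", action),
        ("BACKGROUND", background),
        ("BACKGROUND MOVEMENT", background_movement),
        ("CAMERA", camera),
    ]
    optional = [("LIGHT", light), ("STYLE", style)]
    lines = [f"{label}: {value.strip()}" for label, value in required]
    # Optional slots are emitted only when they carry text.
    lines += [f"{label}: {value.strip()}" for label, value in optional if value.strip()]
    return "\n".join(lines)


prompt = image_to_video_prompt(
    subject="person in the provided image",
    action="subtle breathing motion and a small head turn",
    background="city street behind them",
    background_movement="pedestrians drift by softly, distant traffic motion",
    camera="medium close-up, gentle push-in",
    style="realistic, consistent with the source photo",
)
print(prompt)
```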

If you need multiple beats, FlexClip also outlines multi-action patterns like Subject 1 + Action 1 + Action 2 or Subject 1 + Action 1 + Subject 2 + Action 2. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Camera movement phrasing: keep it simple so it doesn’t get ignored

Camera is the slot where you should be unambiguous. Instead of “cinematic,” say what the camera does.

Common movement phrases (choose one):

Push-in: camera moves closer (adds emphasis)
Dolly-in / dolly-out: forward/back movement (similar to push, more “rig” feel)
Orbit: camera arcs around the subject (reveals shape)
Slide/truck: left/right move (parallax)
Handheld follow: camera tracks behind or beside moving subject

FlexClip explicitly notes you can combine movements, like moving down while zooming out, or handheld following a moving object. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Rule of thumb: One primary move per shot. If you must combine, keep it to two, and keep them in the Camera slot.
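The rule of thumb above can be checked mechanically before you generate. The sketch below is a rough Python lint, assuming prompts are stored as a dict of slot name to text. The `MOVES` list and the `camera_warnings` function are hypothetical, the keyword list is only the handful of phrases named in this section, and whole-word matching will miss synonyms such as "pan" or "tilt".

```python
import re

# Movement phrases taken from the list above; extend for your own vocabulary.
MOVES = ("push-in", "dolly-in", "dolly-out", "orbit", "slide", "truck", "handheld follow")


def find_moves(text: str) -> list[str]:
    """Return the movement phrases that appear as whole words in text."""
    lowered = text.lower()
    return [m for m in MOVES if re.search(rf"\b{re.escape(m)}\b", lowered)]


def camera_warnings(slots: dict[str, str]) -> list[str]:
    """Flag more than two camera moves, or moves placed outside the Camera slot."""
    warnings = []
    moves = find_moves(slots.get("camera", ""))
    if len(moves) > 2:
        warnings.append(f"camera slot combines {len(moves)} moves: {', '.join(moves)}")
    for name, text in slots.items():
        if name == "camera":
            continue
        stray = find_moves(text)
        if stray:
            warnings.append(f"camera move in {name} slot: {', '.join(stray)}")
    return warnings


shot = {
    "scene": "seamless studio backdrop, slow orbit around the shoe",
    "camera": "product close-up, push-in, then slide left, then orbit",
}
for warning in camera_warnings(shot):
    print(warning)
```

Because matching is on whole words, an Action such as "notebook slides into frame" is not flagged, but the check is still only a heuristic and does not replace reading the prompt.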

10 plug-and-play prompts (with the first variable to tweak)

Each example is ready to paste into Veo3Gen using the six slots. After you run it once, tweak the single “first variable” to iterate without breaking the concept.

1) UGC skincare ad (bathroom mirror)

First variable to tweak: Light

SUBJECT: a creator in their 20s holding a small pump bottle of serum
ACTION: applies one pump to fingertips and pats cheeks, smiles slightly
SCENE: bright bathroom vanity, mirror reflection, towel on rack in background
CAMERA: medium close-up at eye level, handheld phone feel, gentle sway
LIGHT: soft morning window light, clean highlights, natural shadows
STYLE: authentic UGC, natural color, minimal retouching

2) Product b-roll (coffee beans pour)

First variable to tweak: Camera

SUBJECT: roasted coffee beans in a glass jar
ACTION: beans pour in slow stream onto a wooden table and scatter
SCENE: rustic kitchen table, linen cloth, blurred kettle in background
CAMERA: macro close-up, low angle, slow push-in
LIGHT: warm light, soft shadows, cozy mood
STYLE: premium commercial b-roll, crisp texture emphasis

3) E-commerce hero shot (sneaker turntable)

First variable to tweak: Camera movement (orbit vs static)

SUBJECT: clean white sneaker with subtle stitching
ACTION: sneaker rotates smoothly like on a turntable
SCENE: seamless studio backdrop, faint floor reflection
CAMERA: product close-up, slight high angle, slow orbit
LIGHT: softbox key light with gentle fill, clean specular highlights
STYLE: modern ecommerce, minimal, high clarity

4) Food promo (chef plating a dish)

First variable to tweak: Scene

SUBJECT: chef in a professional kitchen
ACTION: finishes plating a dish, adds a garnish with tweezers
SCENE: stainless steel pass, warm restaurant ambience behind
CAMERA: over-the-shoulder, shallow depth feel, slow push-in
LIGHT: warm overhead kitchen lights, subtle rim light
STYLE: documentary food promo, grounded and appetizing

5) Talking-head cutaway (podcast desk insert)

First variable to tweak: Camera framing

SUBJECT: hands gesturing beside a notebook and pen
ACTION: hand underlines a key phrase, taps pen twice
SCENE: creator desk, laptop out of focus, coffee mug nearby
CAMERA: top-down close-up, steady shot, slight slide right
LIGHT: soft warm desk lamp, gentle falloff
STYLE: clean creator explainer, calm and focused

6) App demo vibe (phone + notifications)

First variable to tweak: Style

SUBJECT: smartphone on a clean desk showing a simple calendar app
ACTION: a finger swipes to schedule an event, subtle notification appears
SCENE: minimal workspace, plant blurred in background
CAMERA: close-up, slight angle, slow push-in
LIGHT: neutral soft light, minimal shadows
STYLE: sleek tech commercial, modern, uncluttered

7) Creator brand intro (logo reveal via practical motion)

First variable to tweak: Action

SUBJECT: a notebook with a simple logo sticker on the cover
ACTION: notebook slides into frame and stops, a hand opens it to first page
SCENE: wooden desk, minimal props, tidy background
CAMERA: medium close-up, gentle handheld, slight push-in
LIGHT: warm natural light, soft shadow shape
STYLE: cozy creator brand, approachable, handmade feel

8) Fitness micro-ad (jump rope)

First variable to tweak: Camera angle

SUBJECT: athlete skipping rope
ACTION: steady jump rope rhythm, rope blurs slightly as it passes
SCENE: outdoor park path, trees in background
CAMERA: medium shot, low angle, handheld follow
LIGHT: late afternoon sun, bright rim light, energetic mood
STYLE: athletic lifestyle, punchy and dynamic

9) Real estate detail (kitchen vignette)

First variable to tweak: Light (day vs evening)

SUBJECT: modern kitchen island with a bowl of fruit
ACTION: a hand places a glass of water down, slight condensation visible
SCENE: bright modern kitchen, clean lines, stools in background
CAMERA: wide-to-medium move, slow dolly-in
LIGHT: soft daylight fill, airy shadows
STYLE: polished real estate promo, inviting and clean

10) Image-to-video: portrait keyframe with subtle motion

First variable to tweak: Background movement

SUBJECT: person in the provided image
ACTION: subtle breathing motion and a small head turn
BACKGROUND: city street behind them
BACKGROUND MOVEMENT: pedestrians drift by softly, distant traffic motion
CAMERA: medium close-up, gentle push-in
LIGHT: keep consistent with the image, slight natural shimmer
STYLE: realistic, consistent with the source photo

Common prompt failures (and the one-line fix for each)

Motion looks frozen

Fix: Make Action a single, visible beat (pour, open, tap, turn, step) and keep it central—Action is what drives the story. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Lighting feels flat

Fix: Add one lighting intent (morning window light, backlight, spotlight). Lighting strongly affects mood and depth. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Camera instructions get ignored

Fix: Put camera movement and framing only in the Camera slot; keep it to one primary move.

The shot is “generic”

Fix: Be clear and specific about audience and goal—then add 1–2 modifiers that support that goal. (https://pyxeljam.com/10-best-practices-for-writing-effective-ai-video-prompts/)

Fast iteration: change ONE variable without breaking the shot

Treat your prompt like a locked shot card. Keep five slots constant, then iterate on one:

Want more premium feel? Change Light (softbox, rim, warm/cool).
Want more energy? Change Camera (handheld follow, faster push-in).
Want clearer storytelling? Change Action (add a micro-gesture).

Pyxeljam emphasizes “Test and Refine”—experiment, adjust, and track what performs better. (https://pyxeljam.com/10-best-practices-for-writing-effective-ai-video-prompts/)
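One way to keep iterations honest is to generate variants programmatically so that only the chosen slot differs. The following Python sketch assumes the shot card is a plain dict. The `vary` and `render` helpers are hypothetical names, and nothing here submits a job or tracks performance.

```python
def vary(base: dict[str, str], slot: str, options: list[str]) -> list[dict[str, str]]:
    """Return one copy of base per option, changing only the named slot."""
    if slot not in base:
        raise KeyError(f"unknown slot: {slot}")
    return [{**base, slot: option} for option in options]


def render(card: dict[str, str]) -> str:
    """Render a card as labeled lines, in the card's own key order."""
    return "\n".join(f"{name.upper()}: {text}" for name, text in card.items())


base = {
    "subject": "clean white sneaker with subtle stitching",
    "action": "sneaker rotates smoothly like on a turntable",
    "scene": "seamless studio backdrop, faint floor reflection",
    "camera": "product close-up, slight high angle, slow orbit",
    "light": "softbox key light with gentle fill, clean specular highlights",
    "style": "modern ecommerce, minimal, high clarity",
}

variants = vary(base, "light", [
    "dramatic backlight with rim highlight on edges",
    "soft morning window light, gentle shadows",
])
for variant in variants:
    print(render(variant))
    print()
```

Since `vary` copies the base card, the original stays untouched and you can compare each result against the same locked reference.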

Checklist: before you spend credits (the 60-second prompt review)

Can I point to one subject?
Is there one clear action that changes over time?
Is the scene specific (location + one detail)?
Did I specify shot + movement in the Camera slot?
Did I choose a lighting idea (time/source + mood)?
Is style aligned with the platform (UGC vs commercial vs animation)?
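Part of this review can be automated, though only the mechanical part. The Python sketch below checks that each of the six slots has text. The `missing_slots` function is a hypothetical helper, and it cannot judge whether an action is visible or a style fits the platform, so the questions above still need a human read.

```python
REQUIRED_SLOTS = ("subject", "action", "scene", "camera", "light", "style")


def missing_slots(slots: dict[str, str]) -> list[str]:
    """Return the names of required slots that are absent or blank."""
    return [name for name in REQUIRED_SLOTS if not slots.get(name, "").strip()]


draft = {
    "subject": "athlete skipping rope",
    "action": "steady jump rope rhythm, rope blurs slightly as it passes",
    "scene": "outdoor park path, trees in background",
    "camera": "",
    "style": "athletic lifestyle, punchy and dynamic",
}

gaps = missing_slots(draft)
if gaps:
    print("Fill these slots before generating:", ", ".join(gaps))
```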

FAQ

What’s the best AI video prompt structure to start with?

A six-slot structure—Subject, Action, Scene, Camera, Light, Style—maps cleanly to how many tools describe effective prompts. FlexClip publishes a comparable structure that includes camera movement, lighting, and style alongside subject/action/scene. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

How long should my prompt be?

Long enough to remove ambiguity, short enough to stay one shot. A clear, descriptive prompt is generally recommended, then refined by adding modifiers. (https://powtoonsupport.powtoon.com/hc/en-gb/articles/36565952594321-Text-to-Video-Prompt-Guide)

Should I include technical specs like duration and aspect ratio?

If your tool supports it, it can help. Pyxeljam lists setting technical specs (like length, resolution, and format) as a best practice. (https://pyxeljam.com/10-best-practices-for-writing-effective-ai-video-prompts/)

How does image-to-video prompting differ?

You’ll usually focus more on motion: subject motion, background movement, and camera movement. FlexClip outlines an image-to-video structure that highlights those elements. (https://help.flexclip.com/en/articles/10326783-how-to-write-effective-text-prompts-to-generate-ai-videos)

Ready to generate in Veo3Gen?

If you want to turn these shot-card prompts into repeatable workflows—UGC batches, product b-roll sets, or consistent creator brand clips—build directly on the Veo3Gen API: /api. When you’re ready to scale beyond testing, see plans and usage options here: /pricing.

