Veo on Vertex AI video generation prompt guide

Veo offers endless customization through textual prompts. This guide explains how to modify your Veo prompts to produce different results and effects.

For more information about best practices, see Best practices for Veo on Vertex AI.

Safety filters

Veo applies safety filters across Vertex AI to help ensure that generated videos and uploaded photos don't contain offensive content. For example, prompts that violate responsible AI guidelines are blocked.

If you suspect abuse of Veo or any generated output that contains inappropriate material or inaccurate information, use the Report suspected abuse on Google Cloud form.

Anatomy of a Veo prompt

When you use Veo to generate videos, using the correct keywords and prompt structure helps the model to generate the content that you want. Breaking your idea down into key components is the most effective way to guide Veo toward the outcome that you want.

The following sections explain how to use key elements and keywords in your prompts to guide Veo when generating videos.

You don't need to use all elements in every prompt, but understanding how each element works can help you apply them effectively in your Veo prompts.
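
As a rough illustration of how the elements in this guide can be combined, the following Python sketch assembles a prompt from whichever elements you supply and skips the rest. The `PromptElements` class and the `build_prompt` function are hypothetical helpers for illustration only, not part of any Veo or Vertex AI API, and the one-sentence-per-element layout is just one possible convention.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PromptElements:
    """Hypothetical container for the prompt elements described in this guide."""
    subject: Optional[str] = None
    action: Optional[str] = None
    scene: Optional[str] = None
    camera: Optional[str] = None
    lens: Optional[str] = None
    style: Optional[str] = None
    temporal: Optional[str] = None
    audio: Optional[str] = None


def build_prompt(elements: PromptElements) -> str:
    """Join the elements that are set into one prompt, skipping the rest."""
    parts = []
    for f in fields(elements):
        value = getattr(elements, f.name)
        if value and value.strip():
            # Normalize each element into its own sentence.
            parts.append(value.strip().rstrip(".") + ".")
    return " ".join(parts)


# Only four of the eight elements are supplied; the others are omitted.
prompt = build_prompt(PromptElements(
    subject="A playful Golden Retriever puppy",
    action="The puppy chases waves along the shoreline",
    scene="A sun-drenched tropical beach at golden hour",
    camera="Low-angle tracking shot",
))
print(prompt)
```

Keeping each element in its own field makes it easier to change one element at a time and compare the resulting videos.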

Subject

The subject is the "who" or "what" that the action of your generated video revolves around. Specificity helps avoid generic outputs.

The following are examples of subjects that you can use:

  • People:

    • Generic descriptors: man, woman, elderly person

    • Specific professions: "a seasoned detective", "a joyful baker", "a futuristic astronaut"

    • Historical figures

    • Mythical beings: "a mischievous fairy", "a stoic knight"

  • Animals or creatures:

    • Specific breeds of animals: "a playful Golden Retriever puppy", "a majestic bald eagle", "a sleek black panther"

    • Fantastical creatures: "a miniature dragon with iridescent scales", "a wise, ancient talking tree"

  • Objects:

    • Everyday items: "a vintage typewriter", "a steaming cup of coffee", "a worn leather-bound book"

    • Vehicles: "a classic 1960s muscle car", "a futuristic hovercraft", "a weathered pirate ship"

    • Abstract shapes: "glowing orbs", "crystalline structures"

You can combine people, animals, objects, or any mix of them in the same video (for example, "A group of diverse friends laughing around a campfire while a curious fox watches from the shadows", "a busy marketplace scene with vendors and shoppers").

Example: The following video and prompt demonstrate complex details with multiple subjects:

"A hyper-realistic, cinematic portrait of a wise, androgynous shaman of indeterminate age. Their weathered skin is etched with intricate, bioluminescent circuit-like tattoos that pulse with a soft, cyan light. They are draped in ceremonial robes woven from dark moss and shimmering, metallic fiber-optic threads. In one hand, they hold a gnarled wooden staff entwined with glowing energy conduits and topped with a floating, crystalline artifact. Perched on their shoulder is a small, mechanical owl with holographic wings and camera-lens eyes that blink with a soft, red light. Their expression is serene and ancient, eyes holding a deep, knowing look"

Action

Actions describe the "verb" of your video, or what is happening. Action brings the subject to life by describing movements, interactions, and subtle expressions.

The following are examples of actions that you can use:

  • Basic movements: walking, running, jumping, flying, swimming, dancing, spinning, falling, standing still, sitting

  • Interactions: talking, laughing, arguing, hugging, fighting, playing a game, cooking, building, writing, reading, observing

  • Emotional expressions: smiling, frowning, surprise, concentrating deeply, appearing thoughtful, showing excitement, crying

  • Subtle actions: a gentle breeze ruffling hair, leaves rustling, a subtle nod, fingers tapping impatiently, eyes blinking slowly

  • Transformations or processes: a flower blooming in fast-motion, ice melting, a city skyline developing over time (however, keep clip length in mind for events that occur over a longer period)

Example: The following video and prompt demonstrate directing a story by sequencing actions and emotional changes:

"A gloved hand carefully slices open the spine of an ancient, leather-bound book with a scalpel. The hand then delicately extracts a tiny, metallic data chip hidden within the binding. The character's eyes, previously focused and calm, widen in a flash of alarm as a floorboard creaks off-screen. They quickly palm the chip, their head snapping up to scan the dimly lit room, their body tense and listening for any other sound"

Scene or context

The scene or context describes the "where" and the "when" of your video: the environment that grounds the subject and establishes the video's mood and atmosphere.

The following are examples of scene or context that you can use:

  • Location (interior): a cozy living room with a crackling fireplace, a sterile futuristic laboratory, a cluttered artist's studio, a grand ballroom, a dusty attic

  • Location (exterior): a sun-drenched tropical beach, a misty ancient forest, a bustling futuristic cityscape at night, a serene mountain peak at dawn, a desolate alien planet

  • Time of day: golden hour, midday sun, twilight, deep night, pre-dawn

  • Weather: clear blue sky, overcast and gloomy, light drizzle, heavy thunderstorm with visible lightning, gentle snowfall, swirling fog

  • Historical or fantastical period: a medieval castle courtyard, a roaring 1920s jazz club, a cyberpunk alleyway, an enchanted forest glade

  • Atmospheric details: floating dust motes in a sunbeam, shimmering heat haze, reflections on wet pavement, leaves scattered by the wind

Example: The following video and prompt demonstrate building an immersive world:

"The scene is a rain-slicked, crumbling street in a forgotten city, shrouded in perpetual twilight. Giant, bioluminescent mushrooms have sprouted from the cracked asphalt, casting an eerie, pulsating green and purple glow onto the decaying facades of skeletal skyscrapers. A gentle, constant rain creates shimmering reflections in the puddles below, and the only sounds are the soft patter of rain and a low, otherworldly hum from the glowing fungi"

Camera angles

Camera angles define the shot's viewpoint, directly influencing how the audience perceives the subject.

Important: Some advanced camera angles are not officially supported. The results and reliability may vary depending on the overall prompt and your specific use case.

The following are examples of camera angles that you can use:

  • Eye-level shot: offers a neutral, common perspective, as if viewed from human height. For example, "eye-level shot of a woman sipping tea."

  • Low-angle shot: positions the camera below the subject, looking up, making the subject appear powerful or imposing. For example, "low-angle tracking shot of a superhero landing."

  • High-angle shot: places the camera above the subject, looking down, which can make the subject seem small, vulnerable, or part of a larger pattern. For example, "high-angle shot of a child lost in a crowd."

  • Bird's-eye view or top-down shot: a shot taken directly from above, offering a map-like perspective of the scene. For example, "bird's-eye view of a bustling city intersection."

  • Worm's-eye view: a very low-angle shot looking straight up from the ground, emphasizing height and grandeur. For example, "worm's-eye view of towering skyscrapers."

  • Dutch angle or canted angle: the camera is tilted to one side, creating a skewed horizon line, often used to convey unease, disorientation, or dynamism. For example, "dutch angle shot of a character running down a hallway."

  • Close-up: frames the subject tightly, typically focusing on the face to emphasize emotions or a specific detail. For example, "close-up of a character's determined eyes."

  • Extreme close-up: isolates a very small detail of the subject, such as an eye or a drop of water. For example, "extreme close-up of a drop of water landing on a leaf."

  • Medium shot: shows the subject from approximately the waist up, balancing detail with some environmental context. Commonly used for dialogue. For example, "medium shot of two people conversing."

  • Full shot or long shot: shows the entire subject from head to toe, with some of the surrounding environment visible. For example, "full shot of a dancer performing."

  • Wide shot or establishing shot: shows the subject within their broad environment, often used to establish location and context at the beginning of a sequence. For example, "wide shot of a lone cabin in a snowy landscape."

  • Over-the-shoulder shot: frames the shot from behind one person, looking over their shoulder at another person or object, common in conversations. For example, "over-the-shoulder shot during a tense negotiation."

  • Point-of-view shot: shows the scene from the direct visual perspective of a character, as if the audience is seeing through their eyes. For example, "POV shot as someone rides a rollercoaster."

Example: The following video and prompt demonstrate a bird's-eye view camera angle:

"A bird's-eye view of a vast, intricate maze made of high green hedges. A lone figure in a red coat is visible, moving through the labyrinthine paths below"

Example: The following video and prompt demonstrate an extreme close-up camera angle:

"An extreme close-up of a single, glistening drop of rain as it lands on the petal of a vibrant red rose, causing the petal to tremble slightly"

Camera movements

The camera's movements help introduce dynamism into the shot, creating a more cinematic experience.

The following are examples of camera movements that you can use:

  • Static shot (or fixed): the camera remains completely still, with no movement. For example, "static shot of a serene landscape."

  • Pan (left/right): the camera rotates horizontally left or right from a fixed position. For example, "slow pan left across a city skyline at dusk."

  • Tilt (up/down): the camera rotates vertically up or down from a fixed position. For example, "tilt down from the character's shocked face to the revealing letter in their hands."

  • Dolly (in/out): the camera physically moves closer to the subject or further away. For example, "dolly out from the character to emphasize their isolation."

  • Truck (left/right): the camera physically moves horizontally (sideways) left or right, often parallel to the subject or scene. For example, "truck right, following a character as they walk along a busy sidewalk."

  • Pedestal (up/down): the camera physically moves vertically up or down while maintaining a level perspective. For example, "pedestal up to reveal the full height of an ancient, towering tree."

  • Zoom (in/out): the camera's lens changes its focal length to magnify or de-magnify the subject. This is different from a dolly, as the camera itself doesn't move. For example, "slow zoom in on a mysterious artifact on a table."

  • Crane shot: the camera is mounted on a crane and moves vertically (up or down) or in sweeping arcs, often used for dramatic reveals or high-angle perspectives. For example, "crane shot revealing a vast medieval battlefield."

  • Aerial shot or drone shot: a shot taken from a high altitude, typically using an aircraft or drone, often involving smooth, flying movements. For example, "sweeping aerial drone shot flying over a tropical island chain."

  • Handheld or shaky cam: the camera is held by the operator, resulting in less stable, often jerky movements that can convey realism, immediacy, or unease. For example, "handheld camera shot during a chaotic marketplace chase."

  • Whip pan: an extremely fast pan that blurs the image, often used as a transition or to convey rapid movement or disorientation. For example, "whip pan from one arguing character to another."

  • Arc shot: the camera moves in a circular or semi-circular path around the subject. For example, "arc shot around a couple embracing in the rain."
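
Because results for some camera terms can vary from prompt to prompt, it can help to generate several prompt variants and compare the outputs side by side. The following Python sketch pairs a few camera angles with a few camera movements for one scene description. The `camera_variants` function, the shortened term lists, and the sentence template are illustrative assumptions, not a prescribed Veo format.

```python
from itertools import product

# A few terms taken from the lists in this guide; trim or extend as needed.
CAMERA_ANGLES = ["eye-level shot", "low-angle shot", "bird's-eye view"]
CAMERA_MOVEMENTS = ["static camera", "slow pan left", "dolly in"]


def camera_variants(base_description, angles, movements):
    """Return one prompt per (angle, movement) pair for side-by-side comparison."""
    return [
        f"{angle.capitalize()} of {base_description}, {movement}."
        for angle, movement in product(angles, movements)
    ]


variants = camera_variants(
    "a lone figure in a red coat walking through a hedge maze",
    CAMERA_ANGLES,
    CAMERA_MOVEMENTS,
)
for variant in variants:
    print(variant)
```

Three angles and three movements produce nine prompts. Changing only the camera terms while keeping the scene description fixed makes it easier to see which terms the model follows reliably.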

Example: The following video and prompt demonstrate a zoom-in camera movement:

"A slow, dramatic zoom in on a mysterious, ancient compass lying on a dusty map. The camera starts wide, showing the map and a flickering candle, then smoothly zooms in until the intricate, glowing symbols on the compass face fill the entire frame"

Example: The following video and prompt demonstrate an aerial drone camera shot:

"Sweeping aerial drone shot flying over a tropical island chain"

Lens and optical effects

Lens and optical effects change how the camera "sees" the world. Using lens and optical effects helps add professional polish and stylistic flair.

Important: Some advanced camera lenses are not officially supported. The results and reliability may vary depending on the overall prompt and your specific use case.

The following are examples of lens and optical effects that you can use:

  • Wide-angle lens: captures a broader field of view than a standard lens. It can exaggerate perspective, making foreground elements appear larger and creating a sense of grand scale or, at closer distances, distortion. For example, "wide-angle lens shot of a grand cathedral interior, emphasizing its soaring arches."

  • Telephoto lens: narrows the field of view and compresses perspective, making distant subjects appear closer and often isolating the subject by creating a shallow depth of field. For example, "telephoto lens shot capturing a distant eagle in flight against a mountain range."

  • Shallow depth of field: an optical effect where only a narrow plane of the image is in sharp focus, while the foreground or the background is blurred. The aesthetic quality of this blur is known as 'bokeh'. For example, "portrait of a man with a shallow depth of field, his face sharp against a softly blurred park background with beautiful bokeh."

  • Deep depth of field: keeps most or all of the image, from foreground to background, in sharp focus. For example, "landscape scene with deep depth of field, showing sharp detail from the wildflowers in the immediate foreground to the distant mountains."

  • Lens flare: an effect created when a bright light source directly strikes the camera lens, causing streaks, starbursts, or circles of light to appear in the image. Often used for dramatic or cinematic effect. For example, "cinematic lens flare as the sun dips below the horizon behind a silhouetted couple."

  • Rack focus: the technique of shifting the focus of the lens from one subject or plane of depth to another within a single, continuous shot. For example, "rack focus from a character's thoughtful face in the foreground to a significant photograph on the wall behind them."

  • Fisheye lens effect: an ultra-wide-angle lens that produces extreme barrel distortion, creating a circular or strongly convex, wide panoramic image. For example, "fisheye lens view from inside a car, capturing the driver and the entire curved dashboard and windscreen."

  • Vertigo effect (dolly zoom): a camera effect achieved by dollying the camera towards or away from a subject while simultaneously zooming the lens in the opposite direction. This keeps the subject roughly the same size in the frame, but the background perspective changes dramatically, often conveying disorientation or unease. For example, "vertigo effect (dolly zoom) on a character standing at the edge of a cliff, the background rushing away."

Example: The following video and prompt demonstrate a shallow depth of field optical effect:

"A cinematic close-up portrait of a woman sitting in a café at night, with a very shallow depth of field. Her face is in sharp focus, while the city lights outside the window behind her are transformed into soft, beautiful bokeh circles"

Example: The following video and prompt demonstrate a rack focus shot effect:

"A medium shot of a detective's hand in the foreground, holding a single, spent bullet casing. The camera then performs a slow rack focus, shifting from the casing to reveal the anxious face of a witness in the background, now in sharp focus"

Visual style & aesthetics

Visual style and aesthetics describe the overall artistic atmosphere of your video and are among the most impactful elements for creating a unique style.

This broad category can be broken down into four key components:

  • Lighting
  • Tone or mood
  • Artistic style
  • Ambiance

Lighting

Lighting effects change how the subject and surrounding areas are captured by the camera. Using lighting effects can help set a particular style.

The following are examples of lighting effects that you can use:

  • Natural light: "soft morning sunlight streaming through a window", "overcast daylight", "moonlight"

  • Artificial light: "warm glow of a fireplace", "flickering candlelight", "harsh fluorescent office lighting", "pulsating neon signs"

  • Cinematic lighting: "Rembrandt lighting on a portrait", "film noir style with deep shadows and stark highlights", "high-key lighting for a bright, cheerful scene", "low-key lighting for a dark, mysterious mood"

  • Specific effects: "volumetric lighting creating visible light rays", "backlighting to create a silhouette", "golden hour glow", "dramatic side lighting"

Tone or mood

Tone and mood effects describe the atmospheric quality, or the overall feeling of the video.

The following are examples of tone or mood effects that you can use:

  • Happy/joyful: Bright, vibrant, cheerful, uplifting, whimsical.

  • Sad/melancholy: Somber, muted colors, slow pace, poignant, wistful.

  • Suspenseful/tense: Dark, shadowy, quick cuts (if implying edit), sense of unease, thrilling.

  • Peaceful/serene: Calm, tranquil, soft, gentle, meditative.

  • Epic/grandiose: Sweeping, majestic, dramatic, awe-inspiring.

  • Futuristic/sci-fi: Sleek, metallic, neon, technological, dystopian, utopian.

  • Vintage/retro: Sepia tone, grainy film, specific era aesthetics (for example, "1950s Americana", "1980s vaporwave").

  • Romantic: Soft focus, warm colors, intimate.

  • Horror: Dark, unsettling, eerie, gory (though be mindful of content filters).

Artistic style

You can describe an artistic style for the model to take inspiration from while generating your video.

The following are examples of artistic style effects that you can use:

  • Photorealistic: "ultra-realistic rendering", "shot on 8K camera"

  • Cinematic: "cinematic film look", "shot on 35mm film", "anamorphicwidescreen"

  • Animation styles: "Japanese anime style", "classic Disney animation style", "Pixar-like 3D animation", "claymation style", "stop-motion animation", "cel-shaded animation"

  • Art movements/artists: "in the style of Van Gogh", "surrealist painting", "Impressionistic", "Art Deco design", "Bauhaus aesthetic"

  • Specific looks: "gritty graphic novel illustration", "watercolor painting coming to life", "charcoal sketch animation", "blueprint schematic style"

Example: The following video and prompt demonstrate a Japanese anime animation style:

"A dynamic scene in a vibrant Japanese anime style. A magical girl with silver hair and glowing blue eyes walks in a forest. The style features sharp lines, bright, saturated colors, and expressive"

Example: The following video and prompt demonstrate a vintage artistic style:

"A vintage 1920s street scene, sepia toned, film grain, with characters in period attire"

Ambiance

Ambiance describes the character of a place or environment that the video takes place in.

The following are examples of ambiance effects that you can use:

  • Color palettes: "monochromatic black and white", "vibrant and saturated tropical colors", "muted earthy tones", "cool blue and silver futuristic palette", "warm autumnal oranges and browns"

  • Atmospheric effects: "thick fog rolling across a moor", "swirling desert sands", "gentle falling snow creating a soft blanket", "heat haze shimmering above asphalt", "magical glowing particles in the air", "subsurface scattering on a translucent object"

  • Textural qualities: "rough-hewn stone walls", "smooth, polished chrome surfaces", "soft, velvety fabric", "dewdrops clinging to a spiderweb"
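
If you reuse the same look across many prompts, one option is to store the four visual style components (lighting, tone or mood, artistic style, and ambiance) as named presets. The following Python sketch shows one way to do that. The `VisualStyle` type, the preset names, and the "Style:" suffix are hypothetical conventions chosen for illustration; Veo doesn't require any particular layout.

```python
from typing import NamedTuple


class VisualStyle(NamedTuple):
    """Hypothetical grouping of the four components named in this section."""
    lighting: str
    mood: str
    artistic_style: str
    ambiance: str

    def as_clause(self) -> str:
        # Comma-separated keywords, in the order the guide lists the components.
        return ", ".join(self)


# Illustrative presets built from phrases that appear in this guide.
PRESETS = {
    "noir": VisualStyle(
        lighting="film noir style with deep shadows and stark highlights",
        mood="suspenseful",
        artistic_style="cinematic film look",
        ambiance="monochromatic black and white",
    ),
    "vintage": VisualStyle(
        lighting="warm glow of a fireplace",
        mood="wistful",
        artistic_style="shot on 35mm film",
        ambiance="muted earthy tones",
    ),
}


def apply_style(prompt: str, preset_name: str) -> str:
    """Append a preset's style clause to an existing prompt."""
    return f"{prompt.rstrip('.')}. Style: {PRESETS[preset_name].as_clause()}."


styled = apply_style("A detective walks down a rain-slicked street", "noir")
print(styled)
```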

Temporal elements

Temporal elements affect the flow of time in a video, which you can use to highlight changes even in short clips.

The following are examples of temporal elements that you can use:

  • Pacing: "slow-motion", "fast-paced action", "time-lapse"

  • Evolution (subtle for short clips): "a flower bud slowly unfurling", "a candle burning down slightly", "dawn breaking, the sky gradually lightening"

  • Rhythm: "pulsating light", "rhythmic movement"

Example: The following video and prompt demonstrate an evolution temporal effect:

"A close-up of a single red rose bud, its petals tightly closed. The camera remains static as the flower slowly and gracefully unfurls over the course of the shot, revealing its vibrant inner layers. The evolution is subtle, showing a clear but gradual change"

Example: The following video and prompt demonstrate a time-lapse temporal effect:

"A time-lapse of a bustling city skyline as day transitions to night. The camera is static. Watch as the sun sets, casting long shadows, and the city lights begin to twinkle on, with streaks of car headlights moving along the streets below"

Audio

Audio prompts help guide the visuals of the video in relation to sound. Audio direction can powerfully shape the action, pacing, and mood of the video.

Audio is supported by veo-3.0-generate-001 in Preview.

Clearly specify if you want audio. We recommend that you use separate sentences in your prompt to describe the audio. The following are examples of common audio elements you can use:

  • Sound effects: individual, distinct sounds that occur within the scene. For example, "the sound of a phone ringing", "water splashing in the background", "soft house sounds, the creak of a closet door, and a ticking clock."

  • Ambient noise: the general background noise that makes a location feel real. For example, "the sounds of city traffic and distant sirens", "waves crashing on the shore", "the quiet hum of an office."

  • Dialogue: spoken words from characters or a narrator. For example, "the man in the red hat says: Where is the rabbit?", "a voiceover with a polished British accent speaks in a serious, urgent tone", "two people discuss a movie."
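
To follow the recommendation of describing audio in separate sentences, you can append each audio element to the visual description as its own sentence. The following Python sketch shows one way to do that. The `add_audio` function and the "Sound effect:" and "Ambient noise:" labels are assumptions made for illustration; only the "says:" form for dialogue comes from the examples in this guide.

```python
def add_audio(visual_prompt, dialogue=(), sound_effects=(), ambient=()):
    """Append audio direction to a visual prompt, one sentence per audio element.

    dialogue is a sequence of (speaker, line) pairs, rendered in the
    "<speaker> says: <line>" form used by the examples in this guide.
    Each line is expected to carry its own end punctuation.
    """
    sentences = [visual_prompt.strip().rstrip(".") + "."]
    for speaker, line in dialogue:
        sentences.append(f"{speaker} says: {line}")
    for sound in sound_effects:
        sentences.append(f"Sound effect: {sound}.")
    for noise in ambient:
        sentences.append(f"Ambient noise: {noise}.")
    return " ".join(sentences)


audio_prompt = add_audio(
    "A medium shot in a dimly lit interrogation room",
    dialogue=[("The seasoned detective", "Your story has holes.")],
    sound_effects=["the slow, rhythmic ticking of a wall clock"],
    ambient=["faint rain against the window"],
)
print(audio_prompt)
```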

Example: The following video and prompt demonstrate using dialogue:

"A medium shot in a dimly lit interrogation room. The seasoned detective says: Your story has holes. The nervous informant, sweating under a single bare bulb, replies: I'm telling you everything I know. The only other sounds are the slow, rhythmic ticking of a wall clock and the faint sound of rain against the window"

Cinematic terms

You can use cinematic terms for editing style and specific techniques. For example, "match cut", "jump cut", "establishing shot sequence", "montage", "split diopter effect."

Example: The following video and prompt demonstrate using a jump cut technique:

"A person sitting in the same position but wearing different outfits, with sharp jump cuts between each outfit change. The background should stay static and the person should reappear instantly in the new outfit, creating a fast-paced, rhythmic jump cut effect. The lighting and framing should remain consistent to emphasize the sudden changes"

Negative prompts

Negative prompts are a tool that helps specify the elements that you don't want generated in your video. When you use a negative prompt, you describe the elements that the model shouldn't include when generating the video.

We recommend the following:

  • Not recommended: using instructive language or words such as "no" or "don't". For example, avoid prompts such as "no walls" or "don't show walls".

  • Recommended: Describe what you don't want to see. For example, "wall, frame", which means that you don't want a wall or a frame in the video.

Example: The following prompts compare the generated output without and with a negative prompt:

  • Prompt: "Generate a short, stylized animation of a large, solitary oak tree with leaves blowing vigorously in a strong wind. The tree should have a slightly exaggerated, whimsical form, with dynamic, flowing branches. The leaves should display a variety of autumn colors, swirling and dancing in the wind. The animation should feature a gentle, atmospheric soundtrack and use a warm, inviting color palette."

    Generated output: [video of the tree, generated without a negative prompt]

  • Prompt: the same prompt as in the previous item.

    Negative prompt: "urban background, man-made structures, dark, stormy, or threatening atmosphere"

    Generated output: [video of the tree, generated with the negative prompt]
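
If your list of unwanted elements starts out phrased as instructions, you can strip the instructive words before using it as a negative prompt. The following Python sketch is one possible approach. The `to_negative_prompt` function and its list of instructive phrases are assumptions made for illustration, and the pattern only handles simple leading phrases. How you pass the resulting string to the model depends on the client that you use.

```python
import re

# Leading instructive phrases that this guide advises against (not exhaustive).
_INSTRUCTIVE = re.compile(
    r"^\s*(?:no|not|without|never|avoid|do\s+not(?:\s+show)?|don'?t(?:\s+show)?)\s+",
    re.IGNORECASE,
)


def to_negative_prompt(unwanted):
    """Turn phrases like "no walls" into a comma-separated list such as "walls"."""
    cleaned = []
    for phrase in unwanted:
        item = _INSTRUCTIVE.sub("", phrase).strip().rstrip(".")
        # Skip empty results and duplicates, keeping the original order.
        if item and item not in cleaned:
            cleaned.append(item)
    return ", ".join(cleaned)


negative_prompt = to_negative_prompt(
    ["no walls", "Don't show frames", "urban background"]
)
print(negative_prompt)
```

The sketch removes only a leading instructive phrase, so a word such as "nothing" is left unchanged.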

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.