Best practices for Veo on Vertex AI

Veo helps you generate videos using text prompts. This guide provides best practices to help you start generating high-quality Veo videos.

For more information about writing effective prompts, see the Veo on Vertex AI video generation prompt guide.

Use clear and specific prompts

Clear and direct prompts that eliminate ambiguity help generate better video output.

  • Not recommended: "I envision a scene where, like, the main focus, a dude, is kinda sad, and it's like, dark, and the camera is sort of, from below, you know?"

  • Recommended: "Low-angle close-up shot of a man with a somber expression. The scene is dimly lit, conveying a melancholic mood"

Avoid quotation marks

To prevent the model from rendering text in the video, use a colon (:) after the speaker's action to denote speech and avoid using quotation marks (").

  • Not recommended: A woman says: "My name is Clara."

  • Recommended: A woman says: My name is Clara.
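The rewrite shown in the two bullets above can be automated before a prompt is sent. The following is a minimal sketch, not part of any Veo tooling: `unquote_speech` is a hypothetical helper name, and it assumes that every double-quoted span in the prompt is spoken dialogue.

```python
import re

# Matches an optional colon, then a straight or curly double-quoted span.
# Assumption: any double-quoted span in the prompt is spoken dialogue.
_QUOTED_SPEECH = re.compile(r'\s*:?\s*["“]([^"“”]*)["”]')


def unquote_speech(prompt: str) -> str:
    """Rewrite `says: "line"` (or `says "line"`) as `says: line`."""
    return _QUOTED_SPEECH.sub(lambda m: ": " + m.group(1).strip(), prompt)


print(unquote_speech('A woman says: "My name is Clara."'))
```

Because the helper treats every quoted span as speech, review its output when a prompt quotes something else, such as a title.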

Use multiple aspect ratios

Use aspect ratios to increase your video's performance on multiple platforms. Different platforms are optimized for different aspect ratios. Understanding your platforms' aspect ratios is critical for marketing and advertising.

The following are key aspect ratios and their primary uses:

  • 16:9: Also called "landscape" or "widescreen", considered the standard for televisions, monitors, most video displays, YouTube, presentations, and mobile phones in landscape mode. The 16:9 aspect ratio is also helpful for capturing more of the background, such as scenic landscapes.

  • 9:16: Also called portrait, vertical, or rotated widescreen. 9:16 is essential for mobile-first platforms like TikTok, Instagram Reels, and YouTube Shorts. The 9:16 aspect ratio is also helpful for portraits or tall objects with strong vertical orientations, such as buildings, trees, or waterfalls.
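When one campaign targets several platforms, it can help to work out up front which aspect ratios to generate. The sketch below encodes only the platform uses listed above; the table and the function name `aspect_ratios_for` are illustrative assumptions, not part of the Veo API.

```python
# Hypothetical lookup table built from the uses listed in this section.
PLATFORM_ASPECT_RATIOS = {
    "youtube": "16:9",
    "television": "16:9",
    "presentation": "16:9",
    "tiktok": "9:16",
    "instagram reels": "9:16",
    "youtube shorts": "9:16",
}


def aspect_ratios_for(platforms):
    """Return the distinct aspect ratios needed to cover the given platforms."""
    ratios = []
    for platform in platforms:
        key = platform.strip().lower()
        if key not in PLATFORM_ASPECT_RATIOS:
            raise ValueError(f"No aspect ratio recorded for {platform!r}")
        ratio = PLATFORM_ASPECT_RATIOS[key]
        if ratio not in ratios:  # keep first-seen order, skip duplicates
            ratios.append(ratio)
    return ratios


print(aspect_ratios_for(["YouTube", "TikTok", "YouTube Shorts"]))
```

Each returned ratio would then correspond to one generation request for the same prompt.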

Focus short videos on a single scene

For short videos, dedicate each prompt to a single, focused moment. Trying to chain multiple distinct events (A then B then C) in one prompt for a short video often leads to muddled or incomplete videos.

  • Not recommended: "A detective finds a clue in a library, then drives across the city at night, and then confronts a suspect in a warehouse"

  • Recommended: Generate each part as a separate clip:

    • Clip 1: "close-up on a detective's gloved hand brushing dust off an old book in a dark library, revealing a hidden symbol"

    • Clip 2: "a car driving through a neon-lit city at night, with rain streaking across the windshield, in a film noir style"

    • Clip 3: "inside a shadowy warehouse, a detective stands opposite a silhouetted figure, creating a tense atmosphere"
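A rough pre-flight check can flag prompts that chain events before they are submitted. The sketch below is only a heuristic under a stated assumption: that words such as "then" signal an "A then B then C" structure. The function name and the marker list are hypothetical.

```python
import re

# Words that often signal several events chained in one prompt.
# Assumption: this short list is a heuristic, not an exhaustive rule.
_SEQUENCE_MARKERS = re.compile(r"\b(?:then|after that|afterwards)\b", re.IGNORECASE)


def count_sequence_markers(prompt: str) -> int:
    """Count sequencing words; a nonzero count suggests splitting into clips."""
    return len(_SEQUENCE_MARKERS.findall(prompt))


chained = (
    "A detective finds a clue in a library, then drives across the city "
    "at night, and then confronts a suspect in a warehouse"
)
clips = [
    "close-up on a detective's gloved hand brushing dust off an old book "
    "in a dark library, revealing a hidden symbol",
    "a car driving through a neon-lit city at night, with rain streaking "
    "across the windshield, in a film noir style",
    "inside a shadowy warehouse, a detective stands opposite a silhouetted "
    "figure, creating a tense atmosphere",
]

print(count_sequence_markers(chained))
print([count_sequence_markers(clip) for clip in clips])
```

A count above zero is a prompt to review, not proof of a problem; a single scene can legitimately contain the word "then".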

Enhance your workflow with Gemini

Gemini can be a powerful partner throughout your entire video creation process, from ideation to evaluation.

Before Creation: Use Gemini as an expert prompter

Instead of starting from scratch, you can ask Gemini to act as an expert prompter. Have it refine your basic ideas into detailed, Veo-ready prompts. For example, you can give it an instruction such as the following:

"Act as an expert prompter for a generative AI video generation model. Look at this image, and write a prompt that INSTRUCTION. Ensure your prompt is comprehensive and detailed."

Replace INSTRUCTION with further instructions to the Veo model.
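If you reuse this instruction often, the placeholder substitution can be scripted. The following is a minimal sketch; `build_expert_prompter_request` is a hypothetical helper, and the sample instruction is invented for illustration. The code only builds the text and does not call Gemini.

```python
# Template text taken from this section; {instruction} stands in for INSTRUCTION.
EXPERT_PROMPTER_TEMPLATE = (
    "Act as an expert prompter for a generative AI video generation model. "
    "Look at this image, and write a prompt that {instruction}. "
    "Ensure your prompt is comprehensive and detailed."
)


def build_expert_prompter_request(instruction: str) -> str:
    """Fill the INSTRUCTION placeholder, avoiding a doubled trailing period."""
    cleaned = instruction.strip().rstrip(".")
    if not cleaned:
        raise ValueError("instruction must not be empty")
    return EXPERT_PROMPTER_TEMPLATE.format(instruction=cleaned)


print(build_expert_prompter_request("animates the truck driving away."))
```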

After Creation: Use Gemini as a "Second pair of eyes"

After your video is generated, Gemini can evaluate the final output, check it against company or brand guidelines, and flag any potentially problematic areas that may require human review.

Achieve character and voice consistency

Create a detailed character description: Your character description is the foundation for consistency. To ensure reusability and voice consistency, give your character a name and a specific voice style. Then, build out the description with a rich set of unchangeable features: physical build and age, hair color and style, facial structure, eye color and shape, and any defining marks. You can use Gemini to generate an exhaustive verbal description of your character's facial features.

Apply the description consistently: Copy and paste the entire, unchanged character description into your prompt for every new scene or action. Only modify the parts that describe the new action or setting. To improve your workflow, you can also use Gemini as a scene generator. Provide Gemini with your final character description and ask it to generate multiple scene prompts for you.

Use the same seed parameter: To ensure consistent visual, stylistic, and voice output across multiple scenes, use the same seed parameter.
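The three practices above can be sketched as a small prompt builder that keeps the character and voice text fixed and varies only the setting, action, and spoken line. This is an illustrative sketch: the `Character` class, its field names, and the seed value are assumptions, and the character text is borrowed from the example prompts in this section. The code only assembles prompt strings and does not call Veo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    """Fixed character and voice text, pasted unchanged into every prompt."""
    name: str
    description: str
    voice: str

    def scene_prompt(self, setting: str, action: str, line: str) -> str:
        # Only setting, action, and line vary between scenes.
        return (
            f"{setting} {self.description} {action} "
            f"{self.voice}, {self.name} says: {line}"
        )


CLARA = Character(
    name="Clara",
    description=(
        "Clara, a historian in her early 30s, with observant, dark brown eyes "
        "that hold a quiet intensity. She has chin-length, black hair styled in "
        "a classic bob. She is dressed in a sophisticated, dark navy-blue wool "
        "coat, with a silk scarf patterned with subtle gold and cream designs "
        "tied around her neck."
    ),
    voice=(
        "In a voice that is crisp and clear, with a thoughtful, analytical tone "
        "and a standard American accent"
    ),
)

# Hypothetical value: pass the same seed with every scene's request.
SHARED_SEED = 1234

scene_prompt = CLARA.scene_prompt(
    setting=(
        "An eye-level shot in a small, hidden Parisian courtyard, overgrown "
        "with ivy and lit by a single, warm gas lamp."
    ),
    action=(
        "She kneels down and runs her fingers over an ancient, carved symbol "
        "on a stone paver, almost completely obscured by moss. Her eyes light "
        "up with discovery."
    ),
    line="After all these years, I've found it",
)
print(scene_prompt)
```

Freezing the dataclass guards against accidental edits to the description between scenes, which is the failure this practice is meant to prevent.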

Example: The following video was generated using the same seed parameter and the following prompts. The character and voice descriptions are repeated verbatim in each of the following prompts:

Prompt for Scene 1:

"A medium shot, with the camera slowly dollying forward in a dimly lit, grand Parisian archive. Dust motes dance in a single beam of light from a high window. Clara, a historian in her early 30s, with observant, dark brown eyes that hold a quiet intensity. She has chin-length, black hair styled in a classic bob. She is dressed in a sophisticated, dark navy-blue wool coat, with a silk scarf patterned with subtle gold and cream designs tied around her neck. She stands before a large, ancient wooden table, carefully turning the fragile, yellowed page of a massive, leather-bound book. Her expression is one of deep concentration. In a voice that is crisp and clear, with a thoughtful, analytical tone and a standard American accent, Clara says: It has to be here"

Prompt for Scene 2:

"A wide shot of the Pont des Arts in Paris at twilight, the sky a mix of deep blue and soft orange. The lights of the city are beginning to twinkle on along the Seine. Clara, a historian in her early 30s, with observant, dark brown eyes that hold a quiet intensity. She has chin-length, black hair styled in a classic bob. She is dressed in a sophisticated, dark navy-blue wool coat, with a silk scarf patterned with subtle gold and cream designs tied around her neck. She leans against the railing, looking out at the water, a small, triumphant smile on her face. She pulls a folded, old map from her coat pocket and looks down at it. In a voice that is crisp and clear, with a thoughtful, analytical tone and a standard American accent, Clara says: I knew it. The path starts from here"

Prompt for Scene 3:

"An eye-level shot in a small, hidden Parisian courtyard, overgrown with ivy and lit by a single, warm gas lamp. Clara, a historian in her early 30s, with observant, dark brown eyes that hold a quiet intensity. She has chin-length, black hair styled in a classic bob. She is dressed in a sophisticated, dark navy-blue wool coat, with a silk scarf patterned with subtle gold and cream designs tied around her neck. She kneels down and runs her fingers over an ancient, carved symbol on a stone paver, almost completely obscured by moss. Her eyes light up with discovery. In a voice that is crisp and clear, with a thoughtful, analytical tone and a standard American accent, Clara says: After all these years, I've found it"

Image-to-video

The following sections are best practices that are important when using image-to-video.

Use a high-quality source image

When using the image-to-video feature, the quality of your source image is important. Veo uses the source image as the basis for everything that follows, including character detail, lighting, and overall artistic style.

A sharp, clear, and well-composed photograph yields a more coherent and higher-quality video. Think of your source image as the first frame of your film: the stronger the start, the better the finish.

Prompt for motion only

Your source image already provides the subject, scene, and style. Focus your prompt on the motion you want to see.

  • Not recommended: Re-describe the character, the background, or the lighting depicted in the image. Redundant prompts confuse the model and lead to poor results.

  • Recommended: Prompt for camera movement, subject animation, and environmental changes.

Use general terms for characters in the source image

In your motion prompt, refer to the character with general terms like "the subject", "the woman", "he", "she", or "they".

Direct the camera's movement

You can direct three types of movement, either alone or in combination.

  • Camera Motion: The camera moves, but the scene is static. This is the simplest and most reliable way to add dynamism.

    • Example: "Slow dolly in on the subject."

  • Subject Animation: The main character or object moves. Best for subtle, lifelike actions.

    • Example: "The character's hair and clothes flutter gently in the wind."

  • Environmental Animation: The background or atmosphere comes alive.

    • Example: "Fog rolls in slowly across the landscape."
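Because the three movement types can be used alone or together, a motion-only prompt can be assembled from whichever parts apply. The sketch below is one possible way to do that; `build_motion_prompt` and its parameter names are hypothetical, and the function only joins text.

```python
def build_motion_prompt(camera=None, subject=None, environment=None):
    """Join the supplied motion directions into one motion-only prompt."""
    parts = [p.strip() for p in (camera, subject, environment) if p and p.strip()]
    if not parts:
        raise ValueError("provide at least one type of movement")
    # End each direction with a period so the sentences stay separate.
    return " ".join(p if p.endswith(".") else p + "." for p in parts)


print(build_motion_prompt(
    camera="Slow dolly in on the subject.",
    environment="Fog rolls in slowly across the landscape",
))
```

Keeping the builder limited to motion fields makes it harder to slip a redundant description of the source image into the prompt.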

Example: The following video and prompt demonstrate subject animation using an image generated by Imagen 4:

An old, somewhat beat-up blue pickup truck in front of a field of sunflowers

"A sweeping drone-like aerial view starting from ground level and rising to reveal the entire landscape in epic proportions"

Summary of best practices

The following table summarizes the best practices recommended in this document:

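Topic            | Task
Prompts          | Use clear and specific prompts; avoid quotation marks
Video generation | Use multiple aspect ratios; focus short videos on a single scene; enhance your workflow with Gemini; achieve character and voice consistency
Image-to-video   | Use a high-quality source image; prompt for motion only; use general terms for characters in the source image; direct the camera's movement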

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.