Generative AI beginner's guide

This beginner's guide introduces you to the core technologies of generative AIand explains how they fit together to power chatbots and applications.Generative AI (also known asgenAI orgen AI) is a field of machine learning(ML) that develops and uses ML models for generating new content.

Generative AI models are often called large language models (LLMs) because oftheir large size and ability to understand and generate natural language.However, depending on the data that the models are trained on, these models canunderstand and generate content from multiple modalities, including text,images, videos, and audio. Models that work with multiple modalities of data arecalledmultimodal models.

Google provides theGeminifamily of generative AI models designed formultimodal use cases; capable ofprocessing information from multiple modalities, including images, videos,and text.

Content generation

In order for generative AI models to generate content that's useful inreal-world applications, they need to have the following capabilities:

  • Learn how to perform new tasks:

    Generative AI models are designed to perform general tasks. If you want amodel to perform tasks that are unique to your use case, then you need to beable to customize the model. OnVertex AI, you can customize your model through model tuning.

  • Access external information:

    Generative AI models are trained on vast amounts of data. However, in orderfor these models to be useful, they need to be able to access informationoutside of their training data. For example, if you want to create acustomer service chatbot that's powered by a generative AI model, the modelneeds to have access to information about the products and services that youoffer. In Vertex AI, you use the grounding and function callingfeatures to help the model access external information.

  • Block harmful content:

    Generative AI models might generate output that you don't expect, includingtext that's offensive or insensitive. To maintain safety and prevent misuse,the models need safety filters to block prompts and responses that aredetermined to be potentially harmful. Vertex AI has built-in safetyfeatures that promote the responsible use of our generative AI services.

The following diagram shows how these different capabilities work together togenerate content that you want:

Generative AI workflow diagram

Prompt

Prompt

The generative AI workflow typically starts with prompting. A prompt is a natural language request sent to a generative AI model to elicit a response back. Depending on the model, a prompt can containtext,images,videos,audio,documents, and other modalities or even multiple modalities (multimodal).

Creating a prompt to get the desired response from the model is a practice calledprompt design. While prompt design is a process of trial and error, there are prompt design principles and strategies that you can use to nudge the model to behave in the desired way. Vertex AI Studio offers a prompt management tool to help you manage your prompts.

Foundation models

Foundation models

Prompts are sent to a generative AI model for response generation. Vertex AI has a variety ofgenerative AI foundation models that are accessible through a managed API, including the following:

  • Gemini API: Advanced reasoning, multiturn chat, code generation, and multimodal prompts.
  • Imagen API: Image generation, image editing, and visual captioning.
  • MedLM: Medical question answering and summarization. (Deprecated)

The models differ in size, modality, and cost. You can explore Google models, as well as open models and models from Google partners, inModel Garden.

Model customization

Model customization

You can customize the default behavior of Google's foundation models so that they consistently generate the desired results without using complex prompts. This customization process is calledmodel tuning. Model tuning helps you reduce the cost and latency of your requests by allowing you to simplify your prompts.

Vertex AI also offersmodel evaluation tools to help you evaluate the performance of your tuned model. After your tuned model is production-ready, you candeploy it to an endpoint and monitor performance like in standard MLOps workflows.

Access external information

Augmentation

Vertex AI offers multiple ways to give the model access to external APIs and real-time information.

  • Grounding: Connects model responses to a source of truth, such as your own data or web search, helping to reduce hallucinations.
  • RAG: Connects models to external knowledge sources, such as documents and databases, to generate more accurate and informative responses.
  • Function calling: Lets the model interact with external APIs to get real-time information and perform real-world tasks.

Citation check

Citation check

After the response is generated, Vertex AI checks whethercitations need to be included with the response. If a significant amount of the text in the response comes from a particular source, that source is added to the citation metadata in the response.

Responsible AI and safety

Responsible AI and safety

The last layer of checks that the prompt and response go through before being returned is thesafety filters. Vertex AI checks both the prompt and response for how much the prompt or response belongs to asafety category. If the threshold is exceed for one or more categories, the response is blocked and Vertex AI returns afallback response.

Response

Response

If the prompt and response passes the safety filter checks, the response is returned. Typically, the response is returned all at once. However, with Vertex AI you can also receive responses progressively as it generates by enablingstreaming.

Get started

Try one of these quickstarts to get started with generative AI onVertex AI:

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.