Introduction to Vertex ML Metadata

A critical part of the scientific method is recording both your observations and the parameters of an experiment. In data science, it's also critical to track the parameters, artifacts, and metrics used in a machine learning (ML) experiment. This metadata helps you:

  • Analyze runs of a production ML system to understand changes in the quality of predictions.
  • Analyze ML experiments to compare the effectiveness of different sets of hyperparameters.
  • Track the lineage of ML artifacts, for example datasets and models, to understand just what contributed to the creation of an artifact or how that artifact was used to create descendant artifacts.
  • Rerun an ML workflow with the same artifacts and parameters.
  • Track the downstream usage of ML artifacts for governance purposes.

Vertex ML Metadata lets you record the metadata and artifacts produced by your ML system and query that metadata to help analyze, debug, and audit the performance of your ML system or the artifacts that it produces.

Vertex ML Metadata builds on the concepts used in the open source ML Metadata (MLMD) library that was developed by Google's TensorFlow Extended team.

Overview of Vertex ML Metadata

Vertex ML Metadata captures your ML system's metadata as a graph.

In the metadata graph, artifacts and executions are nodes, and events are edges that link artifacts as inputs or outputs of executions. Contexts represent subgraphs that are used to logically group sets of artifacts and executions.

You can apply key-value pair metadata to artifacts, executions, and contexts. For example, a model could have metadata that describes the framework used to train the model and performance metrics, such as the model's accuracy, precision, and recall.
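To make these concepts concrete, here is a minimal sketch of the metadata graph in plain Python. The class and instance names are illustrative only; they model the documented concepts (artifacts, executions, events, contexts, key-value metadata), not the actual Vertex ML Metadata SDK classes.

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    name: str
    metadata: dict = field(default_factory=dict)  # key-value pair metadata

@dataclass
class Execution:
    name: str
    inputs: list = field(default_factory=list)    # input events (edges)
    outputs: list = field(default_factory=list)   # output events (edges)
    metadata: dict = field(default_factory=dict)

@dataclass
class Context:
    name: str
    artifacts: list = field(default_factory=list)   # nodes grouped by this context
    executions: list = field(default_factory=list)

# Nodes: a dataset artifact and a model artifact with descriptive metadata.
dataset = Artifact("census-train", {"rows": 32561})
model = Artifact("census-model", {
    "framework": "tensorflow",   # framework used to train the model
    "accuracy": 0.86,            # performance metrics
    "precision": 0.84,
    "recall": 0.79,
})

# Execution node; its input/output lists play the role of event edges.
training = Execution("train-run-1",
                     inputs=[dataset], outputs=[model],
                     metadata={"learning_rate": 0.01})

# Context: logically groups the artifacts and executions of one pipeline run.
run = Context("pipeline-run-1",
              artifacts=[dataset, model], executions=[training])

print(run.executions[0].outputs[0].metadata["framework"])  # tensorflow
```

Following the edges from the context to its execution and then to the execution's output artifact is exactly the kind of traversal a lineage query performs over the real metadata graph.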

Learn more about tracking your ML system's metadata. If you're interested in analyzing metadata from Vertex AI Pipelines, check out this step-by-step tutorial.

ML artifact lineage

To understand changes in the performance of your ML system, you must be able to analyze the metadata produced by your ML workflow and the lineage of its artifacts. An artifact's lineage includes all the factors that contributed to its creation, as well as artifacts and metadata that descend from this artifact.

For example, a model's lineage could include the following:

  • The training, test, and evaluation data used to create the model.
  • The hyperparameters used during model training.
  • The code that was used to train the model.
  • Metadata recorded from the training and evaluation process, such as the model's accuracy.
  • Artifacts that descend from this model, such as the results of batch predictions.
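The lineage above can be sketched as a graph traversal: starting from an artifact, follow the event edges backward through the execution that produced it to that execution's inputs, and repeat. The graph contents below are made up for illustration; the real metadata lives in Vertex ML Metadata, not in local dictionaries.

```python
# Toy metadata graph: which execution output each artifact, and which
# artifacts each execution consumed (all names are illustrative).
produced_by = {            # artifact -> execution that produced it
    "model": "train",
    "batch-predictions": "batch-predict",
}
inputs_of = {              # execution -> its input artifacts
    "train": ["train-data", "eval-data", "trainer-code"],
    "batch-predict": ["model", "request-data"],
}

def upstream_lineage(artifact):
    """Return every artifact that contributed to creating `artifact`."""
    ancestors = set()
    stack = [artifact]
    while stack:
        current = stack.pop()
        execution = produced_by.get(current)
        for parent in inputs_of.get(execution, []):
            if parent not in ancestors:
                ancestors.add(parent)
                stack.append(parent)
    return ancestors

# The batch predictions descend from the model, which descends from the
# training data, evaluation data, and trainer code.
print(sorted(upstream_lineage("batch-predictions")))
# ['eval-data', 'model', 'request-data', 'train-data', 'trainer-code']
```

Downstream lineage (finding an artifact's descendants) is the same traversal with the edges reversed.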

By tracking your ML system's metadata using Vertex ML Metadata, you can answer questions like the following:

  • Which dataset was used to train a certain model?
  • Which of my organization's models have been trained using a certain dataset?
  • Which run produced the most accurate model, and what hyperparameters were used to train the model?
  • Which deployment targets was a certain model deployed to, and when was it deployed?
  • Which version of your model was used to create a prediction at a given point in time?
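A question like "which of my organization's models have been trained using a certain dataset?" reduces to filtering executions by their input events and collecting their outputs. Here is a small sketch over a hypothetical in-memory record of executions; in practice you would run this kind of query against the Vertex ML Metadata store.

```python
# Illustrative execution records: each consumed a dataset and output a model.
executions = [
    {"name": "train-a", "inputs": ["census-2023"], "outputs": ["model-a"]},
    {"name": "train-b", "inputs": ["census-2023"], "outputs": ["model-b"]},
    {"name": "train-c", "inputs": ["retail-2024"], "outputs": ["model-c"]},
]

def models_trained_on(dataset):
    """Return every model output by an execution that consumed `dataset`."""
    return sorted(
        model
        for execution in executions
        if dataset in execution["inputs"]
        for model in execution["outputs"]
    )

print(models_trained_on("census-2023"))  # ['model-a', 'model-b']
```

The same filter-and-collect pattern, run in the other direction, answers "which dataset was used to train a certain model?".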

Learn more about analyzing your ML system's metadata.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.