Check data quality for media recommendations
This page describes how to find out whether various metrics for your media data meet their requirement thresholds.
About checking media data quality
Because recent user events are so important for media recommendations, you must regularly check the quality of your ingested data and user events. You can do this by reviewing the Optimization tab for your media recommendations app to determine what improvements you can make to your data in order to optimize for better quality recommendations.
If a metric's threshold isn't met, then the metric has a warning status. In that case, review the metric and its description to determine what action you should take to improve your media data quality.
All models and objectives need to pass the General quality metric thresholds. Some models and objectives have additional App-specific quality metrics and thresholds. The general quality metrics are the same for all apps using the same data store, but app-specific quality metrics vary according to the app's model and objectives.
For information about the recommendation models and objectives, see About media app recommendations types.
Check data quality
Console
To check the quality of your media recommendations data, follow these steps:
In the Google Cloud console, go to the AI Applications page.
Click the name of the media recommendations app that you want to check data quality for.
In the navigation menu, click Data quality and click the Optimization tab. This page shows the status of various metrics for the data associated with your app.
Review the General quality and the App-specific quality statuses at the top of the page. The summary status at the top of the page shows as a warning if one or more metrics has exceeded its threshold.
The two metrics tables (General quality and App-specific quality) list the individual metrics.

In the metrics tables, click View details for more information about any metrics in the warning state.
Optional: If you want to see the threshold for a compliant metric, click View details. Thresholds for compliant metrics are not shown in the metrics table.
REST
Note: This feature is a Preview offering, subject to the "Pre-GA Offerings Terms" of the GCP Service Specific Terms. Pre-GA products and features may have limited support, and changes to pre-GA products and features may not be compatible with other pre-GA versions. For more information, see the launch stage descriptions. Further, by using this feature, you agree to the Generative AI Preview terms and conditions ("Preview Terms"). For this feature, you can process personal data as outlined in the Cloud Data Processing Addendum, subject to applicable restrictions and obligations in the Agreement (as defined in the Preview Terms).

Use the requirements:checkRequirement method to check the quality of your media recommendations data, as shown.
To check the quality from the command line, follow these steps:
Find your data store ID. If you already have your data store ID, skip to the next step.
In the Google Cloud console, go to the AI Applications page and in the navigation menu, click Data Stores.
Click the name of your data store.
On the Data page for your data store, get the data store ID.
Run the following curl command to learn whether your media recommendations data meets the thresholds for the general metrics:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -H "X-GFE-SSL: yes" \
  -H "X-Goog-User-Project: PROJECT_ID" \
  "https://discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/global/requirements:checkRequirement" \
  -d '{
    "location": "projects/PROJECT_ID/locations/global",
    "requirementType": "discoveryengine.googleapis.com/media_recs/general/all/warning",
    "resources": [
      {
        "labels": {
          "branch_id": "0",
          "collection_id": "default_collection",
          "datastore_id": "DATA_STORE_ID",
          "location_id": "global",
          "project_number": "PROJECT_ID"
        },
        "type": "discoveryengine.googleapis.com/Branch"
      },
      {
        "labels": {
          "collection_id": "default_collection",
          "datastore_id": "DATA_STORE_ID",
          "location_id": "global",
          "project_number": "PROJECT_ID"
        },
        "type": "discoveryengine.googleapis.com/DataStore"
      }
    ]
  }'

Replace the following:
PROJECT_ID: the ID of your Google Cloud project.
DATA_STORE_ID: the ID of the Vertex AI Search data store.
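The request in the curl command above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the function name `build_check_requirement_request` and its `requirement_suffix` parameter are hypothetical, and the field values simply mirror the curl template on this page. It only builds the URL and JSON body; sending the request would still need an access token, as in the curl command.

```python
import json

API_ROOT = "https://discoveryengine.googleapis.com/v1alpha"


def build_check_requirement_request(project_id, data_store_id,
                                    requirement_suffix="general/all"):
    """Return (url, body) mirroring the curl template on this page.

    Hypothetical helper: it does not send anything over the network.
    """
    location = f"projects/{project_id}/locations/global"
    # Labels shared by the Branch and DataStore resources. The template
    # on this page puts PROJECT_ID in "project_number", so this does too.
    common_labels = {
        "collection_id": "default_collection",
        "datastore_id": data_store_id,
        "location_id": "global",
        "project_number": project_id,
    }
    body = {
        "location": location,
        "requirementType": (
            "discoveryengine.googleapis.com/media_recs/"
            f"{requirement_suffix}/warning"
        ),
        "resources": [
            {
                # Only the Branch resource carries a branch_id label.
                "labels": {"branch_id": "0", **common_labels},
                "type": "discoveryengine.googleapis.com/Branch",
            },
            {
                "labels": dict(common_labels),
                "type": "discoveryengine.googleapis.com/DataStore",
            },
        ],
    }
    url = f"{API_ROOT}/{location}/requirements:checkRequirement"
    return url, body


url, body = build_check_requirement_request("my-project-123", "my-data-store")
print(url)
print(json.dumps(body, indent=2))
```

Keeping the shared labels in one dictionary avoids the two resource entries drifting apart when the data store ID changes.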
Example command and result
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -H "X-GFE-SSL: yes" \
  -H "X-Goog-User-Project: my-project-123" \
  "https://discoveryengine.googleapis.com/v1alpha/projects/my-project-123/locations/global/requirements:checkRequirement" \
  -d '{
    "location": "projects/123456/locations/global",
    "requirementType": "discoveryengine.googleapis.com/media_recs/general/all/warning",
    "resources": [
      {
        "labels": {
          "branch_id": "0",
          "collection_id": "default_collection",
          "datastore_id": "my-data-store",
          "location_id": "global",
          "project_number": "123456"
        },
        "type": "discoveryengine.googleapis.com/Branch"
      },
      {
        "labels": {
          "collection_id": "default_collection",
          "datastore_id": "my-data-store",
          "location_id": "global",
          "project_number": "123456"
        },
        "type": "discoveryengine.googleapis.com/DataStore"
      }
    ]
  }'
{
  "requirement": {
    "type": "discoveryengine.googleapis.com/media_recs/general/all/warning",
    "displayName": "Warning level requirements for all models and all business objectives.",
    "description": "Requirements for the media recommendations model that will result in performance issue if not met for all media recommendations models and all business objectives.",
    "condition": {
      "expression": "doc_with_same_title_percentage \u003c doc_with_same_title_percentage_threshold && most_common_visitor_id_percentage \u003c most_common_visitor_id_percentage_threshold && short_term_unjoined_events_percentage \u003c short_term_unjoined_events_percentage_threshold && long_term_unjoined_events_percentage \u003c long_term_unjoined_events_percentage_threshold"
    },
    "metricBindings": [
      {
        "variableId": "doc_with_same_title_percentage",
        "resourceType": "discoveryengine.googleapis.com/Branch",
        "metricFilter": "metric.type = 'discoveryengine.googleapis.com/branch/documents/items_with_same_title' AND metric.labels.is_percentage = 'True' AND resource.labels.project_number = '123456' AND resource.labels.branch_id = '0' AND resource.labels.datastore_id = 'my-data-store' AND resource.labels.location_id = 'global' AND resource.labels.collection_id = 'default_collection'",
        "description": "The percentage of the documents with the same title in a branch.",
        "category": "Document"
      },
      {
        "variableId": "most_common_visitor_id_percentage",
        "resourceType": "discoveryengine.googleapis.com/DataStore",
        "metricFilter": "metric.type = 'discoveryengine.googleapis.com/branch/datastore/user_events/most_used_visitor_id_events' AND metric.labels.is_percentage = 'True' AND resource.labels.datastore_id = 'my-data-store' AND resource.labels.project_number = '123456' AND resource.labels.location_id = 'global' AND resource.labels.collection_id = 'default_collection'",
        "description": "The percentage of the events with the same visitor id.",
        "category": "DataStore"
      },
      {
        "variableId": "short_term_unjoined_events_percentage",
        "resourceType": "discoveryengine.googleapis.com/DataStore",
        "metricFilter": "metric.type = 'discoveryengine.googleapis.com/datastore/user_events/unjoined_events_for_document_ids' AND metric.labels.is_percentage = 'True' AND metric.conditions.time_range = 'WEEK' AND resource.labels.datastore_id = 'my-data-store' AND resource.labels.project_number = '123456' AND resource.labels.location_id = 'global' AND resource.labels.collection_id = 'default_collection'",
        "description": "The percentage of events refers to a document id that is not in the catalog in the last 7 days.",
        "category": "DataStore"
      },
      {
        "variableId": "long_term_unjoined_events_percentage",
        "resourceType": "discoveryengine.googleapis.com/DataStore",
        "metricFilter": "metric.type = 'discoveryengine.googleapis.com/datastore/user_events/unjoined_events_for_document_ids' AND metric.labels.is_percentage = 'True' AND metric.conditions.time_range = 'NINETY_DAYS' AND resource.labels.datastore_id = 'my-data-store' AND resource.labels.project_number = '123456' AND resource.labels.location_id = 'global' AND resource.labels.collection_id = 'default_collection'",
        "description": "The percentage of events refers to a document id that is not in the catalog in the last 90 days.",
        "category": "DataStore"
      }
    ],
    "thresholdBindings": [
      {
        "variableId": "doc_with_same_title_percentage_threshold",
        "threshold_values": { "severity": "WARNING", "value": 1.0 },
        "description": "The threshold for the percentage of the documents with the same title in a branch."
      },
      {
        "variableId": "most_common_visitor_id_percentage_threshold",
        "threshold_values": { "severity": "WARNING", "value": 5.0 },
        "description": "The threshold for the percentage of the events with the same visitor id."
      },
      {
        "variableId": "short_term_unjoined_events_percentage_threshold",
        "threshold_values": { "severity": "WARNING", "value": 5.0 },
        "description": "The threshold for the percentage of the events refers to a document id that is not in the catalog in the last 7 days."
      },
      {
        "variableId": "long_term_unjoined_events_percentage_threshold",
        "threshold_values": { "severity": "WARNING", "value": 2.0 },
        "description": "The threshold for the percentage of the events refers to a document id that is not in the catalog in the last 90 days"
      }
    ]
  },
  "result": "WARNING",
  "requirementCondition": {
    "expression": "doc_with_same_title_percentage \u003c doc_with_same_title_percentage_threshold && most_common_visitor_id_percentage \u003c most_common_visitor_id_percentage_threshold && short_term_unjoined_events_percentage \u003c short_term_unjoined_events_percentage_threshold && long_term_unjoined_events_percentage \u003c long_term_unjoined_events_percentage_threshold"
  },
  "metricResults": [
    {
      "name": "short_term_unjoined_events_percentage",
      "value": { "doubleValue": 0 },
      "timestamp": "2024-06-06T03:03:13.416900898Z",
      "unit": "%",
      "metricType": "discoveryengine.googleapis.com/datastore/user_events/unjoined_events_for_document_ids"
    },
    {
      "name": "long_term_unjoined_events_percentage",
      "value": { "doubleValue": 0 },
      "timestamp": "2024-06-06T03:03:13.417962744Z",
      "unit": "%",
      "metricType": "discoveryengine.googleapis.com/datastore/user_events/unjoined_events_for_document_ids"
    },
    {
      "name": "most_common_visitor_id_percentage",
      "value": { "doubleValue": 0.8 },
      "timestamp": "2024-06-06T03:03:16.090037135Z",
      "unit": "%",
      "metricType": "discoveryengine.googleapis.com/datastore/user_events/most_used_visitor_id_events"
    },
    {
      "name": "doc_with_same_title_percentage",
      "value": { "doubleValue": 30.47 },
      "timestamp": "2024-06-06T03:03:17.599458357Z",
      "unit": "%",
      "metricType": "discoveryengine.googleapis.com/documents/items_with_same_title"
    }
  ],
  "oldestMetricTimestamp": "2024-06-06T03:03:13.416900898Z"
}

Review the output:
a. Look for the value of result:
   - If the value is SUCCESS, then your data passes the general requirements; continue to the step that checks the requirements for your model and objective combination.
   - If the value is WARNING, continue to step b.
   - If you don't see result in the output, there are a couple of possible reasons:
     - The PROJECT_ID or DATA_STORE_ID in the request is incorrect.
     - Some metric values are unavailable. Try again in 6 hours or reach out to a customer engineer for help.
b. Look for the expression (requirement.condition.expression): If this expression evaluates to false, then there is a problem with your data.
   Note: The less-than sign in the expression appears in unicode, \u003c, instead of as "<".
   The values of the metrics are in the metricResults.value fields. The warning threshold values are in the requirement.thresholdBindings.thresholdValues fields. The description fields can help you understand the purpose of each metric.
   For example, the value of doc_with_same_title_percentage is 30.47 and the warning threshold for doc_with_same_title_percentage_threshold is 1. Far more titles in the data store are identical than the threshold allows, which indicates a data problem that needs to be investigated.
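The review steps above can be sketched in Python. This is an illustrative helper under stated assumptions, not part of the API: the function name `triage` is hypothetical, it assumes each threshold's variableId is the metric name plus `_threshold` (as in the sample output), and it applies the strict less-than comparison used by the general requirement expression. The sample outputs on this page show the threshold field both as `threshold_values` (an object) and as `thresholdValues` (a list), so the sketch accepts either shape.

```python
def triage(response):
    """Classify a checkRequirement response and list metrics that miss
    their warning threshold. Hypothetical helper for illustration."""
    result = response.get("result")
    if result is None:
        # No result: the IDs may be wrong or metric values unavailable.
        return "UNKNOWN", []
    if result == "SUCCESS":
        return result, []

    # Collect the WARNING threshold for each threshold variable.
    thresholds = {}
    for binding in response["requirement"]["thresholdBindings"]:
        values = (binding.get("thresholdValues")
                  or binding.get("threshold_values") or [])
        if isinstance(values, dict):
            values = [values]
        for entry in values:
            if entry["severity"] == "WARNING":
                thresholds[binding["variableId"]] = entry["value"]

    # A metric passes only if value < threshold (general expression).
    failing = []
    for metric in response.get("metricResults", []):
        limit = thresholds.get(metric["name"] + "_threshold")
        value = metric["value"]["doubleValue"]
        if limit is not None and not value < limit:
            failing.append((metric["name"], value, limit))
    return result, failing


# Trimmed-down version of the example response on this page.
sample = {
    "requirement": {"thresholdBindings": [
        {"variableId": "doc_with_same_title_percentage_threshold",
         "threshold_values": {"severity": "WARNING", "value": 1.0}},
        {"variableId": "most_common_visitor_id_percentage_threshold",
         "threshold_values": {"severity": "WARNING", "value": 5.0}},
    ]},
    "result": "WARNING",
    "metricResults": [
        {"name": "most_common_visitor_id_percentage",
         "value": {"doubleValue": 0.8}},
        {"name": "doc_with_same_title_percentage",
         "value": {"doubleValue": 30.47}},
    ],
}
status, failing = triage(sample)
print(status, failing)
# prints: WARNING [('doc_with_same_title_percentage', 30.47, 1.0)]
```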
If the model and objective combination used for your recommendations app appears in this table, then you also need to call the check requirement method, updated with the values for your model and objective:
Model                  Objective                    MODEL_OBJ
Others You May Like    Conversion rate              oyml/cvr
Recommended for You    Conversion rate              rfy/cvr
More Like This         Conversion rate              mlt/cvr
Most Popular           Conversion rate              mp/cvr
Others You May Like    Watch duration per session   oyml/wdps
Recommended for You    Watch duration per session   rfy/wdps
More Like This         Watch duration per session   mlt/wdps

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -H "X-GFE-SSL: yes" \
  -H "X-Goog-User-Project: PROJECT_ID" \
  "https://discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/global/requirements:checkRequirement" \
  -d '{
    "location": "projects/PROJECT_ID/locations/global",
    "requirementType": "discoveryengine.googleapis.com/media_recs/MODEL_OBJ/warning",
    "resources": [
      {
        "labels": {
          "branch_id": "0",
          "collection_id": "default_collection",
          "datastore_id": "DATA_STORE_ID",
          "location_id": "global",
          "project_number": "PROJECT_ID"
        },
        "type": "discoveryengine.googleapis.com/Branch"
      },
      {
        "labels": {
          "collection_id": "default_collection",
          "datastore_id": "DATA_STORE_ID",
          "location_id": "global",
          "project_number": "PROJECT_ID"
        },
        "type": "discoveryengine.googleapis.com/DataStore"
      }
    ]
  }'

Replace the following:
PROJECT_ID: the ID of your Google Cloud project.
DATA_STORE_ID: the ID of the Vertex AI Search data store.
MODEL_OBJ: see the preceding table to choose the correct value for your recommendations app.
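Choosing the MODEL_OBJ value can be expressed as a simple lookup. The sketch below copies the model and objective table from this page into a Python dictionary; the names `MODEL_OBJ` and `requirement_type` are hypothetical and exist only for illustration. A combination that is not in the table returns `None`, which is taken here to mean that only the general check applies.

```python
# Model/objective combinations and their MODEL_OBJ values, copied from
# the table on this page.
MODEL_OBJ = {
    ("Others You May Like", "Conversion rate"): "oyml/cvr",
    ("Recommended for You", "Conversion rate"): "rfy/cvr",
    ("More Like This", "Conversion rate"): "mlt/cvr",
    ("Most Popular", "Conversion rate"): "mp/cvr",
    ("Others You May Like", "Watch duration per session"): "oyml/wdps",
    ("Recommended for You", "Watch duration per session"): "rfy/wdps",
    ("More Like This", "Watch duration per session"): "mlt/wdps",
}


def requirement_type(model, objective):
    """Return the app-specific requirementType string, or None when the
    combination is not listed in the table (hypothetical helper)."""
    code = MODEL_OBJ.get((model, objective))
    if code is None:
        return None
    return f"discoveryengine.googleapis.com/media_recs/{code}/warning"


print(requirement_type("More Like This", "Watch duration per session"))
# prints: discoveryengine.googleapis.com/media_recs/mlt/wdps/warning
```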
Example command and result
This example is for the More Like This model and the Watch duration per session objective:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -H "X-GFE-SSL: yes" \
  -H "X-Goog-User-Project: my-project-123" \
  "https://discoveryengine.googleapis.com/v1alpha/projects/my-project-123/locations/global/collections/default_collection/dataStores/my-data-store/branches/0/requirements:checkRequirement" \
  -d '{
    "location": "projects/my-project-123/locations/global",
    "requirementType": "discoveryengine.googleapis.com/media_recs/mlt/wdps/warning",
    "resources": [
      {
        "labels": {
          "branch_id": "0",
          "collection_id": "default_collection",
          "datastore_id": "my-data-store",
          "location_id": "global",
          "project_number": "my-project-123"
        },
        "type": "discoveryengine.googleapis.com/Branch"
      },
      {
        "labels": {
          "collection_id": "default_collection",
          "datastore_id": "my-data-store",
          "location_id": "global",
          "project_number": "my-project-123"
        },
        "type": "discoveryengine.googleapis.com/DataStore"
      }
    ]
  }'
{
  "requirement": {
    "type": "discoveryengine.googleapis.com/media_recs/mlt/wdps/warning",
    "displayName": "Warning level requirements for 'More Like This' models and 'Watch duration per session' business objectives.",
    "description": "Requirements for the media recommendations model that will result in performance issue if not met for the 'More Like This' model and the 'Watch duration per session' business objective.",
    "condition": {
      "expression": "invalid_sequence_percentage \u003c= invalid_sequence_percentage_threshold"
    },
    "metricBindings": [
      {
        "variableId": "invalid_sequence_percentage",
        "resourceType": "discoveryengine.googleapis.com/DataStore",
        "metricFilter": "metric.type = 'discoveryengine.googleapis.com/datastore/user_events/invalid_sequences_media_play_media_complete' AND metric.labels.is_percentage = 'True' AND resource.labels.location_id = 'global' AND resource.labels.collection_id = 'default_collection' AND resource.labels.project_number = '123456' AND resource.labels.datastore_id = 'my-data-store'",
        "description": "The percentage of invalid sequences for media play and media complete events sampled by randomly selected visitor ids.",
        "category": "DataStore"
      }
    ],
    "thresholdBindings": [
      {
        "variableId": "invalid_sequence_percentage_threshold",
        "thresholdValues": [
          { "severity": "WARNING", "value": 50 }
        ],
        "description": "The threshold for the percentage of invalid sequences sampled among all media play and media complete events."
      }
    ]
  },
  "result": "SUCCESS",
  "requirementCondition": {
    "expression": "invalid_sequence_percentage \u003c= invalid_sequence_percentage_threshold"
  },
  "metricResults": [
    {
      "name": "invalid_sequence_percentage",
      "value": { "doubleValue": 0 },
      "timestamp": "2024-06-06T02:32:00.460056386Z",
      "unit": "%",
      "metricType": "discoveryengine.googleapis.com/datastore/user_events/invalid_sequences_media_play_media_complete"
    }
  ],
  "oldestMetricTimestamp": "2024-06-06T02:32:00.460056386Z"
}

Review the output:
a. Look for the value of result:
   - If the value is SUCCESS, then your data meets the warning-level requirements for your model and objective.
   - If the value is WARNING, continue to step b.
   - If you don't see result in the output, there are a couple of possible reasons:
     - The PROJECT_ID or DATA_STORE_ID in the request is incorrect.
     - Some metric values are unavailable. Try again in 6 hours or reach out to a customer engineer for help.
b. Look for the expression (requirement.condition.expression). If this expression evaluates to false, then there is a problem with your data.
   Note: The less-than sign in the expression appears in unicode, \u003c, instead of as "<".
   The values of the metrics are in the metricResults.value fields, and the warning threshold values are in the requirement.thresholdBindings.thresholdValues fields. The description fields can help you understand the purpose of each metric.
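To make the note about \u003c concrete, the following Python sketch shows that a standard JSON parser decodes the escape back to "<", and then evaluates the decoded expression clause by clause. The helper `evaluate_expression` is hypothetical, and it assumes every clause has the form `metric OP threshold` joined by `&&`, with `OP` being `<` or `<=`, which matches the two sample expressions on this page but is not a documented grammar.

```python
import json
import operator

# Comparison operators that appear in the sample expressions.
OPERATORS = {"<": operator.lt, "<=": operator.le}


def evaluate_expression(expression, values):
    """Evaluate clauses such as 'a < b && c <= d'.

    Returns a dict mapping each clause to True or False. Assumes every
    clause is 'name OP name' with whitespace around OP (illustrative).
    """
    outcomes = {}
    for clause in expression.split("&&"):
        left, op, right = clause.split()
        outcomes[clause.strip()] = OPERATORS[op](values[left], values[right])
    return outcomes


# Raw JSON text as returned by the API, with the unicode escape intact.
raw = (r'{"expression": "invalid_sequence_percentage \u003c= '
       r'invalid_sequence_percentage_threshold"}')
expression = json.loads(raw)["expression"]
print(expression)
# prints: invalid_sequence_percentage <= invalid_sequence_percentage_threshold

# Metric value and threshold taken from the example response.
values = {
    "invalid_sequence_percentage": 0,
    "invalid_sequence_percentage_threshold": 50,
}
outcomes = evaluate_expression(expression, values)
print(outcomes)
```

Any clause that maps to False identifies the metric to investigate.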
Last updated 2026-02-19 UTC.