The ML.ROC_CURVE function
This document describes the ML.ROC_CURVE function, which you can use to evaluate metrics specific to binary classification models.
Syntax
```sql
ML.ROC_CURVE(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) }
  [, GENERATE_ARRAY(THRESHOLDS)]
  [, STRUCT(TRIAL_ID AS trial_id)])
```

Arguments
ML.ROC_CURVE takes the following arguments:
- PROJECT_ID: the project that contains the resource.
- DATASET: the dataset that contains the resource.
- MODEL_NAME: the name of the model.
- TABLE: the name of the input table that contains the evaluation data.

  If TABLE is specified, the input column names in the table must match the column names in the model, and their types should be compatible according to BigQuery implicit coercion rules. The input must have a column that matches the label column name that's provided during training. This value is provided using the input_label_cols option. If input_label_cols is unspecified, the column that's named label in the training data is used.

  If you don't specify either TABLE or QUERY_STATEMENT, ML.ROC_CURVE computes the curve results as follows:

  - If the data is split during training, the split evaluation data is used to compute the curve results.
  - If the data is not split during training, the entire training input is used to compute the curve results.

- QUERY_STATEMENT: a GoogleSQL query that is used to generate the evaluation data. For the supported SQL syntax of the QUERY_STATEMENT clause in GoogleSQL, see Query syntax.

  If QUERY_STATEMENT is specified, the input column names from the query must match the column names in the model, and their types should be compatible according to BigQuery implicit coercion rules. The input must have a column that matches the label column name provided during training. This value is provided using the input_label_cols option. If input_label_cols is unspecified, the column named label in the training data is used. The extra columns are ignored.

  If you used the TRANSFORM clause in the CREATE MODEL statement that created the model, then only the input columns present in the TRANSFORM clause must appear in QUERY_STATEMENT.

  If you don't specify either TABLE or QUERY_STATEMENT, ML.ROC_CURVE computes the curve results as follows:

  - If the data is split during training, the split evaluation data is used to compute the curve results.
  - If the data is not split during training, the entire training input is used to compute the curve results.

- THRESHOLDS: an ARRAY<FLOAT64> value that specifies the percentile values of the prediction output, supplied by the GENERATE_ARRAY function.
- TRIAL_ID: an INT64 value that identifies the hyperparameter tuning trial that you want the function to evaluate. The function uses the optimal trial by default. Only specify this argument if you ran hyperparameter tuning when creating the model.
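For a model created with hyperparameter tuning, you can evaluate a specific trial by passing the STRUCT argument on its own, without custom thresholds. A minimal sketch, where the model name `mydataset.mytunedmodel` and the trial ID 2 are hypothetical:

```sql
-- Evaluate ROC metrics for trial 2 of a hyperparameter-tuned model.
-- `mydataset.mytunedmodel` and the trial ID are hypothetical values.
SELECT
  *
FROM
  ML.ROC_CURVE(MODEL `mydataset.mytunedmodel`,
    TABLE `mydataset.mytable`,
    STRUCT(2 AS trial_id))
```

If you omit the STRUCT argument, the function evaluates the optimal trial.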
Output
ML.ROC_CURVE returns multiple rows with metrics for different threshold values for the model. The metrics include the following:
- threshold: a FLOAT64 value that contains the custom threshold for the binary classification model.
- recall: a FLOAT64 value that indicates the proportion of actual positive cases that were correctly predicted by the model.
- false_positive_rate: a FLOAT64 value that indicates the proportion of actual negative cases that were incorrectly predicted as positive by the model.
- true_positives: an INT64 value that contains the number of cases correctly predicted as positive by the model.
- false_positives: an INT64 value that contains the number of cases incorrectly predicted as positive by the model.
- true_negatives: an INT64 value that contains the number of cases correctly predicted as negative by the model.
- false_negatives: an INT64 value that contains the number of cases incorrectly predicted as negative by the model.
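Because the raw confusion-matrix counts are returned alongside the rates, you can recompute recall (true_positives / (true_positives + false_negatives)) directly from the counts to sanity-check a threshold. A minimal sketch, assuming a model `mydataset.mymodel` and table `mydataset.mytable` (placeholder names):

```sql
-- Recompute recall from the raw confusion-matrix counts at each
-- threshold; the two recall columns should agree row by row.
SELECT
  threshold,
  recall,
  true_positives / (true_positives + false_negatives) AS recall_from_counts
FROM
  ML.ROC_CURVE(MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`)
```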
Examples
The following examples assume your model and input table are in your default project.
Evaluate the ROC curve of a binary logistic regression model
The following query returns all of the output columns for ML.ROC_CURVE. You can graph the recall and false_positive_rate values for an ROC curve. The threshold values returned are chosen based on the percentile values of the prediction output.
```sql
SELECT
  *
FROM
  ML.ROC_CURVE(MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`)
```
Evaluate an ROC curve with custom thresholds
The following query returns all of the output columns for ML.ROC_CURVE. The threshold values returned are chosen based on the output of the GENERATE_ARRAY function.
```sql
SELECT
  *
FROM
  ML.ROC_CURVE(MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`,
    GENERATE_ARRAY(0.4, 0.6, 0.01))
```
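Instead of a table, you can also supply the evaluation data with a query, for example to evaluate only a filtered subset of rows. A hedged sketch, assuming `mydataset.mytable` has a column named `split` (a hypothetical column) that marks held-out rows:

```sql
-- Use a query instead of a table to supply the evaluation data.
-- The `split` column is a hypothetical way of marking held-out rows.
SELECT
  *
FROM
  ML.ROC_CURVE(MODEL `mydataset.mymodel`,
    (SELECT * FROM `mydataset.mytable` WHERE split = 'eval'),
    GENERATE_ARRAY(0.4, 0.6, 0.01))
```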
Evaluate the precision-recall curve
Instead of getting an ROC curve (the recall versus the false positive rate), the following query calculates a precision-recall curve by computing the precision from the true and false positive counts:
```sql
SELECT
  recall,
  true_positives / (true_positives + false_positives) AS precision
FROM
  ML.ROC_CURVE(MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`)
```
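Building on the same idea, you can combine precision and recall to find the threshold that maximizes the F1 score. A sketch under the same assumptions (hypothetical model and table names); SAFE_DIVIDE guards against thresholds with zero predicted positives:

```sql
-- Find the threshold with the highest F1 score, where
-- F1 = 2 * precision * recall / (precision + recall).
SELECT
  threshold,
  SAFE_DIVIDE(2 * precision * recall, precision + recall) AS f1_score
FROM (
  SELECT
    threshold,
    recall,
    SAFE_DIVIDE(true_positives,
      true_positives + false_positives) AS precision
  FROM
    ML.ROC_CURVE(MODEL `mydataset.mymodel`,
      TABLE `mydataset.mytable`)
)
ORDER BY f1_score DESC
LIMIT 1
```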
What's next
- For more information about model evaluation, see BigQuery ML model evaluation overview.
- For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models.
Last updated 2025-12-15 UTC.