Fairness: Evaluating for bias


When evaluating a model, metrics calculated against an entire test or validation set don't always give an accurate picture of how fair the model is. Great model performance overall for a majority of examples may mask poor performance on a minority subset of examples, which can result in biased model predictions. Using aggregate performance metrics such as precision, recall, and accuracy is not necessarily going to expose these issues.
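To make the masking effect concrete, here's a minimal sketch with entirely hypothetical labels and predictions, sized to match the 80/20 split discussed below. The specific per-group error counts are invented purely to illustrate how a strong aggregate number can hide weak subgroup performance:

```python
# Sketch: aggregate accuracy can mask poor performance on a minority group.
# All labels and predictions here are hypothetical, chosen only to
# illustrate the effect.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# 80 majority-group examples: the model gets 76 of them right.
maj_true = [1] * 40 + [0] * 40
maj_pred = [1] * 38 + [0] * 2 + [0] * 38 + [1] * 2

# 20 minority-group examples: the model predicts the negative class for
# everyone, so it gets only the 10 true negatives right.
min_true = [1] * 10 + [0] * 10
min_pred = [0] * 20

print(f"overall:  {accuracy(maj_true + min_true, maj_pred + min_pred):.2f}")
print(f"majority: {accuracy(maj_true, maj_pred):.2f}")
print(f"minority: {accuracy(min_true, min_pred):.2f}")
```

Overall accuracy comes out at 0.86, which looks respectable, yet the minority group sits at 0.50 (no better than chance) while the majority group enjoys 0.95. Only the disaggregated, per-group view reveals the gap.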

We can revisit our admissions model and explore some new techniques for how to evaluate its predictions for bias, with fairness in mind.

Suppose the admissions classification model selects 20 students to admit to the university from a pool of 100 candidates, belonging to two demographic groups: the majority group (blue, 80 students) and the minority group (orange, 20 students).

Figure 1. Candidate pool of 100 students: 80 students belong to the majority group (blue), and 20 students belong to the minority group (orange).

The model must admit qualified students in a manner that is fair to the candidates in both demographic groups.

How should we evaluate the model's predictions for fairness? There are a variety of metrics we can consider, each of which provides a different mathematical definition of "fairness." In the following sections, we'll explore three of these fairness metrics in depth: demographic parity, equality of opportunity, and counterfactual fairness.


Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-08-25 UTC.