Scoring

  • Recommendation systems often involve candidate generation followed by scoring and ranking to select items for display.

  • Candidate generation can leverage various sources like user features, popular items, or social graphs, which are then combined and scored by a separate model.

  • Using a single scoring model allows for better comparability and context consideration compared to relying on individual candidate generator scores.

  • Choose the scoring function carefully, because it directly affects the quality and relevance of recommendations: optimizing for clicks can favor click-bait, and optimizing for watch time can favor overly long content.

  • Positional bias should be addressed by aiming for position-independent rankings, potentially by scoring candidates as if they were in the top position.

After candidate generation, another model scores and ranks the generated candidates to select the set of items to display. The recommendation system may have multiple candidate generators that use different sources, such as the following:

  • Related items from a matrix factorization model.
  • User features that account for personalization.
  • "Local" vs "distant" items; that is, taking geographic information into account.
  • Popular or trending items.
  • A social graph; that is, items liked or recommended by friends.

The system combines these different sources into a common pool of candidates that are then scored by a single model and ranked according to that score. For example, the system can train a model to predict the probability of a user watching a video on YouTube given the following:

  • query features (for example, user watch history, language, country, time)
  • video features (for example, title, tags, video embedding)

The system can then rank the videos in the pool of candidates according to the prediction of the model.
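The pooling and ranking flow described above can be sketched as follows. This is a minimal illustration, not a production design: the `Candidate` class, the function names, the feature vectors, and the weights are all hypothetical, and a toy logistic function stands in for whatever trained model a real system would use.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    video_id: str
    source: str  # name of the generator that proposed this video


def merge_candidates(*pools):
    """Union several generator pools, keeping the first occurrence of each video."""
    seen = {}
    for pool in pools:
        for cand in pool:
            seen.setdefault(cand.video_id, cand)
    return list(seen.values())


def predict_watch_probability(query_features, video_features, weights, bias=0.0):
    """Toy logistic model over the concatenated query and video features."""
    x = list(query_features) + list(video_features)
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates, query_features, video_feature_table, weights):
    """Score every candidate with the same model, then sort by descending score."""
    scored = [
        (predict_watch_probability(query_features, video_feature_table[c.video_id], weights), c)
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical pools from two generators; "v2" is proposed by both.
related = [Candidate("v1", "matrix_factorization"), Candidate("v2", "matrix_factorization")]
popular = [Candidate("v2", "popular"), Candidate("v3", "popular")]
pool = merge_candidates(related, popular)

query = [1.0, 0.0]                      # made-up query features
videos = {"v1": [0.2, 0.1], "v2": [0.9, 0.4], "v3": [0.5, 0.0]}  # made-up video features
weights = [0.1, -0.3, 2.0, 1.0]         # made-up model weights

ranking = rank(pool, query, videos, weights)
ranked_ids = [cand.video_id for _, cand in ranking]
```

Because every candidate passes through the same scoring function, the resulting scores are directly comparable regardless of which generator proposed the item.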

Why not let the candidate generator score?

Since candidate generators compute a score (such as the similarity measure in the embedding space), you might be tempted to use them to do ranking as well. However, you should avoid this practice for the following reasons:

  • Some systems rely on multiple candidate generators. The scores of these different generators might not be comparable.
  • With a smaller pool of candidates, the system can afford to use more features and a more complex model that may better capture context.
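To make the comparability problem concrete, here is a small sketch with invented numbers. It assumes one generator emits cosine similarities and another emits raw view counts; both the score tables and the `naive_merge` helper are hypothetical.

```python
# Hypothetical raw scores from two generators on very different scales.
embedding_scores = {"v1": 0.92, "v2": 0.85}          # cosine similarity, range [-1, 1]
popularity_scores = {"v3": 15000.0, "v4": 1200.0}    # raw view counts


def naive_merge(*score_tables):
    """Rank items by each generator's raw score, ignoring that the scales differ."""
    merged = {}
    for table in score_tables:
        for item, score in table.items():
            merged[item] = max(score, merged.get(item, float("-inf")))
    return sorted(merged, key=merged.get, reverse=True)


naive_ranking = naive_merge(embedding_scores, popularity_scores)
# The popular items land on top only because view counts are numerically larger,
# not because they are better matches for this user.
```

In this toy case, every item from the popularity generator outranks every item from the embedding generator, which says nothing about relevance.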

Choosing an objective function for scoring

As you may remember from Introduction to ML Problem Framing, ML can act like a mischievous genie: very happy to learn the objective you provide, but you have to be careful what you wish for. This mischievous quality also applies to recommendation systems. The choice of scoring function can dramatically affect the ranking of items, and ultimately the quality of the recommendations.

Example:

Maximize Click Rate

If the scoring function optimizes for clicks, the system may recommend click-bait videos. This scoring function generates clicks but does not make for a good user experience. Users' interest may quickly fade.

Maximize Watch Time

If the scoring function optimizes for watch time, the system might recommend very long videos, which might lead to a poor user experience. Note that multiple short watches can be just as good as one long watch.

Increase Diversity and Maximize Session Watch Time

Recommend shorter videos, but ones that are more likely to keep the user engaged.
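The effect of the objective on the final ranking can be illustrated with a toy example. All three videos and their predicted values below are invented for illustration; in a real system each value would come from a trained model, and the field names (`p_click`, `watch_min`, `session_min`) are hypothetical.

```python
# Invented predictions for three hypothetical videos.
candidates = [
    {"id": "clickbait",    "p_click": 0.30, "watch_min": 1.0,  "session_min": 2.0},
    {"id": "long_lecture", "p_click": 0.05, "watch_min": 12.0, "session_min": 12.5},
    {"id": "short_series", "p_click": 0.15, "watch_min": 4.0,  "session_min": 20.0},
]


def rank_by(objective, items):
    """Return video ids sorted by the chosen predicted quantity, highest first."""
    ordered = sorted(items, key=lambda item: item[objective], reverse=True)
    return [item["id"] for item in ordered]


by_clicks = rank_by("p_click", candidates)        # favors the click-bait video
by_watch = rank_by("watch_min", candidates)       # favors the very long video
by_session = rank_by("session_min", candidates)   # favors the short, engaging series
```

The same pool produces three different top recommendations, one per objective, which is why the choice of objective deserves as much care as the model itself.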

Figure: The Google Play store home page, displaying new and updated games and recommended apps, with the bottom items highlighted.

Positional bias in scoring

Items that appear lower on the screen are less likely to be clicked than items appearing higher on the screen. However, when scoring videos, the system usually doesn't know where on the screen a link to that video will ultimately appear. Querying the model with all possible positions is too expensive. Even if querying multiple positions were feasible, the system still might not find a consistent ranking across multiple ranking scores.

Solutions

  • Create position-independent rankings.
  • Rank all the candidates as if they are in the top position on the screen.
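One way to rank candidates as if they were all in the top position is sketched below. It assumes, beyond what the text above states, that the model takes display position as an input feature, so that the position can be fixed to the top slot at serving time. The weights, the feature names, and the log-position form of the model are all hypothetical.

```python
import math

TOP_POSITION = 1  # assumption: positions are 1-indexed and 1 is the top slot

# Hypothetical weights of a toy click model that includes a position feature.
WEIGHTS = {"bias": -1.0, "relevance": 3.0, "log_position": -1.0}


def score(weights, relevance, position):
    """Toy click probability: logit is linear in relevance and log(position)."""
    z = (
        weights["bias"]
        + weights["relevance"] * relevance
        + weights["log_position"] * math.log(position)
    )
    return 1.0 / (1.0 + math.exp(-z))


def rank_position_independent(weights, candidates):
    """Score every candidate as if it were shown at TOP_POSITION, then sort."""
    ordered = sorted(
        candidates,
        key=lambda c: score(weights, c["relevance"], TOP_POSITION),
        reverse=True,
    )
    return [c["id"] for c in ordered]


# Invented example: B is more relevant but was previously shown lower on the screen.
candidates = [
    {"id": "A", "relevance": 0.4, "logged_position": 1},
    {"id": "B", "relevance": 0.6, "logged_position": 5},
]

# At their logged positions, A looks better only because it was shown higher.
biased_a = score(WEIGHTS, 0.4, 1)
biased_b = score(WEIGHTS, 0.6, 5)

ranking = rank_position_independent(WEIGHTS, candidates)
```

Fixing the position to the same value for every candidate removes the position term from the comparison, so the ordering depends only on the remaining features.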

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-08-25 UTC.