Pull subscriptions

This document provides an overview of a pull subscription, its workflow, and associated properties.

In a pull subscription, a subscriber client requests messages from the Pub/Sub server.

The pull mode can use one of the two service APIs, Pull or StreamingPull. To run the chosen API, you can select a Google-provided high-level client library, or a low-level auto-generated client library. You can also choose between asynchronous and synchronous message processing.

Note: For most use cases, we recommend the Google-provided high-level client library with the StreamingPull API and asynchronous message processing.

Before you begin

Before reading this document, ensure that you're familiar with the following:

Pull subscription workflow

For a pull subscription, your subscriber client initiates requests to a Pub/Sub server to retrieve messages. The subscriber client uses one of the following APIs:

  • Pull
  • StreamingPull

Most subscriber clients don't make these requests directly. Instead, the clients rely on the Google Cloud-provided high-level client library that performs streaming pull requests internally and delivers messages asynchronously. For a subscriber client that needs greater control over how messages are pulled, Pub/Sub provides a low-level, automatically generated gRPC library. This library makes pull or streaming pull requests directly. These requests can be synchronous or asynchronous.

The following two images show the workflow between a subscriber client and a pull subscription.

Figure 1. Workflow for a pull subscription

Figure 2. Workflow for a streaming pull subscription

Pull workflow

The pull workflow is as follows and references Figure 1:

  1. The subscriber client explicitly calls the pull method, which requests messages for delivery. This request is the PullRequest as shown in the image.

  2. The Pub/Sub server responds with zero or more messages and acknowledgment IDs. A response with zero messages or with an error does not necessarily indicate that there are no messages available to receive. This response is the PullResponse as shown in the image.

  3. The subscriber client explicitly calls the acknowledge method. The client uses the returned acknowledgment ID to acknowledge that the message is processed and need not be delivered again.
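
The three steps above can be sketched as a single function. This is a minimal illustration, not a definitive implementation: it assumes `subscriber` behaves like the Python `pubsub_v1.SubscriberClient` (its `pull` and `acknowledge` methods each take a `request` dictionary), and the names `pull_once` and `handle` are hypothetical.

```python
from typing import Callable, List


def pull_once(
    subscriber,
    subscription_path: str,
    handle: Callable[[bytes], None],
    max_messages: int = 10,
) -> int:
    """Run one pull/acknowledge cycle and return the number of messages acked.

    `subscriber` is assumed to expose pull() and acknowledge() in the style of
    pubsub_v1.SubscriberClient; pull_once and handle are illustrative names.
    """
    # Step 1: the explicit pull call (the PullRequest).
    response = subscriber.pull(
        request={"subscription": subscription_path, "max_messages": max_messages}
    )

    # Step 2: the PullResponse can hold zero messages even when a backlog exists.
    ack_ids: List[str] = []
    for received in response.received_messages:
        handle(received.message.data)
        ack_ids.append(received.ack_id)

    # Step 3: the explicit acknowledge call, skipped when nothing was received.
    if ack_ids:
        subscriber.acknowledge(
            request={"subscription": subscription_path, "ack_ids": ack_ids}
        )
    return len(ack_ids)
```

With the `google-cloud-pubsub` package installed, a call might look like `pull_once(pubsub_v1.SubscriberClient(), "projects/my-project/subscriptions/my-sub", print)`, where the project and subscription names are placeholders.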

For a single streaming pull request, a subscriber client can receive multiple responses because the connection stays open. In contrast, each pull request returns only one response.

Properties of a pull subscription

The properties that you configure for a pull subscription determine how messages are delivered to your subscriber clients. For more information, see subscription properties.

Pub/Sub service APIs

The Pub/Sub pull subscription can use one of the following two APIs for retrieving messages:

  • Pull
  • StreamingPull

Use unary Acknowledge and ModifyAckDeadline RPCs when you receive messages using these APIs. The two Pub/Sub APIs are described in the following sections.

StreamingPull API

Where possible, the Pub/Sub client libraries use StreamingPull for maximum throughput and lowest latency. Although you might never use the StreamingPull API directly, it's important to know how it differs from the Pull API.

The StreamingPull API relies on a persistent bidirectional connection to receive multiple messages as they become available. The following is the workflow:

  1. The client sends a request to the server to establish a connection. If the connection quota is exceeded, the server returns a resource exhausted error. The client library retries the out-of-quota errors automatically.

  2. If there is no error or the connection quota is available again, the server continuously sends messages to the connected client.

  3. If or when the throughput quota is exceeded, the server stops sending messages. However, the connection is not broken. Whenever there's sufficient throughput quota available again, the stream resumes.

  4. The client or the server eventually closes the connection.

The StreamingPull API keeps an open connection. The Pub/Sub servers periodically close the connection after a time period to avoid a long-running sticky connection. The client library automatically reopens a StreamingPull connection.

Messages are sent to the connection when they are available. The StreamingPull API thus minimizes latency and maximizes throughput for messages.
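
The connection lifecycle described above (open, retry on a quota error, receive until the stream closes, reopen) can be modeled with a short standard-library sketch. This is only a toy model under stated assumptions and is not how the client library is implemented: `ResourceExhausted`, `consume_stream`, and `open_stream` are hypothetical names, and a plain Python iterable stands in for the server stream.

```python
import time
from typing import Callable, Iterable


class ResourceExhausted(Exception):
    """Hypothetical stand-in for the server's resource exhausted error."""


def consume_stream(
    open_stream: Callable[[], Iterable[bytes]],
    handle: Callable[[bytes], None],
    max_attempts: int,
    backoff_seconds: float = 0.0,
) -> int:
    """Try to open a stream up to max_attempts times; return messages handled."""
    handled = 0
    for _ in range(max_attempts):
        try:
            stream = open_stream()  # Step 1: establish the connection.
        except ResourceExhausted:
            time.sleep(backoff_seconds)  # Connection quota exceeded: retry later.
            continue
        for data in stream:  # Steps 2-3: messages arrive while the stream is open.
            handle(data)
            handled += 1
        # Step 4: the stream ended, so the next iteration reopens it.
    return handled
```

A real subscriber would normally leave this loop to the high-level client library, which reopens the connection on its own.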

Note: The PHP client library does not support the StreamingPull API.

Read more about the StreamingPull RPC methods: StreamingPullRequest and StreamingPullResponse.

Pull API

This API is a traditional unary RPC that is based on a request and response model. A single pull response corresponds to a single pull request. The following is the workflow:

  1. The client sends a request to the server for messages. If the throughput quota is exceeded, the server returns a resource exhausted error.

  2. If there is no error or the throughput quota is available again, the server replies with zero or more messages and acknowledgment IDs.

When using the unary Pull API, a response with zero messages or with an error does not necessarily indicate that there are no messages available to receive.

Using the Pull API does not guarantee low latency and a high throughput of messages. To achieve high throughput and low latency with the Pull API, you must have multiple simultaneous outstanding requests. New requests are created when old requests receive a response. Architecting such a solution is error-prone and hard to maintain. We recommend that you use the StreamingPull API for such use cases.
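
To make the "multiple simultaneous outstanding requests" pattern concrete, the following sketch keeps a fixed number of requests in flight and issues a new one whenever an earlier one returns. It is an illustration only: `pull_batch` is a hypothetical callable that wraps one unary pull and returns the message payloads, and error handling, acknowledgment, and shutdown are left out, which is part of why this design is hard to maintain.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable, List


def pull_concurrently(
    pull_batch: Callable[[], List[bytes]],
    outstanding: int,
    total_requests: int,
) -> List[bytes]:
    """Issue total_requests pulls, keeping up to `outstanding` in flight."""
    results: List[bytes] = []
    issued = 0
    with ThreadPoolExecutor(max_workers=outstanding) as pool:
        pending = set()
        while issued < total_requests or pending:
            # Top up to the target number of in-flight requests.
            while issued < total_requests and len(pending) < outstanding:
                pending.add(pool.submit(pull_batch))
                issued += 1
            # Block until at least one request has received its response.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                results.extend(future.result())
    return results
```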

Use the Pull API instead of the StreamingPull API only if you require strict control over the following:

  • The number of messages that the subscriber client can process
  • The client memory and resources

You can also use this API when your subscriber is a proxy between Pub/Sub and another service that operates in a more pull-oriented way.

Read more about the Pull REST method: projects.subscriptions.pull.

Read more about the Pull RPC methods: PullRequest and PullResponse.

Types of message processing modes

Choose one of the following pull modes for your subscriber clients.

Asynchronous pull mode

Asynchronous pull mode decouples the receiving of messages from the processing of messages in a subscriber client. This mode is the default for most subscriber clients. Asynchronous pull mode can use the StreamingPull API or unary Pull API. Asynchronous pull can also use the high-level client library or low-level auto-generated client library.
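
The decoupling of receiving from processing can be illustrated with a bounded queue and a few worker threads. This is a standard-library sketch of the general idea, not the client library's internals: `receive_and_process`, `incoming`, and `process` are hypothetical names, and acknowledgment is omitted.

```python
import queue
import threading
from typing import Callable, Iterable, List

_STOP = object()  # Sentinel that tells a worker thread to exit.


def receive_and_process(
    incoming: Iterable[bytes],
    process: Callable[[bytes], None],
    workers: int = 4,
    max_buffered: int = 100,
) -> List[Exception]:
    """Receive on the calling thread and process on worker threads.

    Returns the exceptions raised by `process`, if any.
    """
    buffer: "queue.Queue" = queue.Queue(maxsize=max_buffered)
    errors: List[Exception] = []

    def worker() -> None:
        while True:
            item = buffer.get()
            if item is _STOP:
                return
            try:
                process(item)
            except Exception as exc:  # Keep the worker alive after a failure.
                errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()

    # Receiving side: it only enqueues, so slow processing does not stall it
    # until the bounded buffer fills up.
    for data in incoming:
        buffer.put(data)

    # One sentinel per worker, then wait for the remaining work to finish.
    for _ in threads:
        buffer.put(_STOP)
    for thread in threads:
        thread.join()
    return errors
```

The bounded buffer doubles as a crude limit on how many messages are held in memory at once.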

You can learn more about client libraries later in this document.

Synchronous pull mode

In synchronous pull mode, the receiving and processing of messages occur in sequence and are not decoupled from each other. Hence, much as StreamingPull compares with the unary Pull API, asynchronous processing offers lower latency and higher throughput than synchronous processing.

Use synchronous pull mode only for applications where low latency and high throughput matter less than other requirements. For example, an application might be limited to using only the synchronous programming model. Or, an application with resource constraints might require more exact control over memory, network, or CPU. In such cases, use synchronous mode with the unary Pull API.

Pub/Sub client libraries

Pub/Sub offers a high-level and a low-level auto-generated client library.

High-level Pub/Sub client library

The high-level client library provides options for controlling the acknowledgment deadlines by using lease management. These options are more granular than when you configure the acknowledgment deadlines by using the console or the CLI at the subscription level. The high-level client library also implements support for features such as ordered delivery, exactly-once delivery, and flow control.

We recommend using asynchronous pull and the StreamingPull API with the high-level client library. Not all languages that are supported for Google Cloud also support the Pull API in the high-level client library.
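
As a rough sketch of the recommended combination, the function below registers a callback through the high-level client and lets the library deliver messages asynchronously. It assumes `subscriber` behaves like the Python `pubsub_v1.SubscriberClient`, whose `subscribe` method returns a future. The names `listen` and `handle` are hypothetical, and `flow_control` is assumed to be a `pubsub_v1.types.FlowControl` value when provided.

```python
import concurrent.futures
from typing import Callable, Optional


def listen(
    subscriber,
    subscription_path: str,
    handle: Callable[[bytes], None],
    timeout: Optional[float] = None,
    flow_control=None,
) -> None:
    """Receive messages through a callback until `timeout` seconds elapse.

    `subscriber` is assumed to expose subscribe() in the style of
    pubsub_v1.SubscriberClient; listen and handle are illustrative names.
    """

    def callback(message) -> None:
        handle(message.data)
        message.ack()  # Acknowledge only after handling succeeds.

    # Pass flow control settings through only when the caller supplied them.
    kwargs = {} if flow_control is None else {"flow_control": flow_control}
    streaming_pull_future = subscriber.subscribe(
        subscription_path, callback=callback, **kwargs
    )
    try:
        # Blocks the calling thread; callbacks run on library-managed threads.
        streaming_pull_future.result(timeout=timeout)
    except (TimeoutError, concurrent.futures.TimeoutError):
        streaming_pull_future.cancel()  # Trigger the shutdown.
        streaming_pull_future.result()  # Block until the shutdown completes.
```

With the `google-cloud-pubsub` package installed, `subscriber` would typically be `pubsub_v1.SubscriberClient()`, and a value such as `pubsub_v1.types.FlowControl(max_messages=100)` would cap the number of outstanding messages.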

To use the high-level client libraries, see Pub/Sub client libraries.

Low-level auto-generated Pub/Sub client library

A low-level client library is available for cases where you must use the Pull API directly. You can use synchronous or asynchronous processing with the low-level auto-generated client library. You must manually code features such as ordered delivery, exactly-once delivery, flow control, and lease management when you use the low-level auto-generated client library.
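
As one example of what coding lease management by hand can involve, the sketch below extends a message's acknowledgment deadline on a timer while the work runs, and acknowledges only on success. It assumes `subscriber` exposes `modify_ack_deadline` and `acknowledge` in the style of the Python `pubsub_v1.SubscriberClient`. The name `process_with_lease` and the default intervals are hypothetical choices, not recommended values.

```python
import threading
from typing import Callable


def process_with_lease(
    subscriber,
    subscription_path: str,
    ack_id: str,
    work: Callable[[], None],
    extend_every: float = 10.0,
    deadline_seconds: int = 30,
) -> None:
    """Run `work` while periodically extending the message's ack deadline."""
    done = threading.Event()

    def keep_alive() -> None:
        # Event.wait returns False on timeout, meaning work is still running.
        while not done.wait(extend_every):
            subscriber.modify_ack_deadline(
                request={
                    "subscription": subscription_path,
                    "ack_ids": [ack_id],
                    "ack_deadline_seconds": deadline_seconds,
                }
            )

    extender = threading.Thread(target=keep_alive, daemon=True)
    extender.start()
    try:
        work()  # If this raises, the message is not acknowledged.
    finally:
        done.set()
        extender.join()

    subscriber.acknowledge(
        request={"subscription": subscription_path, "ack_ids": [ack_id]}
    )
```

Because an exception from `work` skips the acknowledgment, the message becomes eligible for redelivery once its deadline passes.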

You can use the synchronous processing model when you use the low-level auto-generated client library for all supported languages. You might use the low-level auto-generated client library and synchronous pull in cases where using the Pull API directly makes sense. For example, you might have existing application logic that relies on this model.

To use the low-level auto-generated client libraries directly, see Pub/Sub APIs overview.


Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-11-24 UTC.