PubSub: Fix pubsub Streaming Pull shutdown on RetryError #7863


Merged
plamut merged 1 commit into googleapis:master from plamut:iss-7709 on May 7, 2019

Conversation

@plamut (Contributor) commented on May 6, 2019 (edited)

Closes #7709.

If a gRPC channel is in TRANSIENT_FAILURE state for too long, the retry timeout configured in the subscriber client config kicks in and a RetryError is raised in a background thread, but the client keeps running and the error is not propagated to the top-level code.

This PR makes sure that the following happens:

  • The streaming pull manager shutdown is triggered, shutting down all background threads.
  • The reason for the shutdown (RetryError) is propagated to the main thread that awaits future.result(), giving the user code a chance to catch the error and react to it (a usage sketch follows this list).
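
For illustration only, here is a minimal subscriber sketch (not part of this PR) showing where the propagated error becomes visible to user code; the project/subscription IDs and the callback are placeholders:

from google.api_core.exceptions import RetryError
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("PROJECT_ID", "SUBSCRIPTION_ID")

def callback(message):
    # Placeholder message handler.
    message.ack()

future = subscriber.subscribe(subscription_path, callback)

try:
    # With this fix, a RetryError raised in a background thread (channel stuck
    # in TRANSIENT_FAILURE past the configured retry timeout) resolves the
    # future and is re-raised here instead of being silently swallowed.
    future.result()
except RetryError:
    # The streaming pull manager has already shut down its background threads;
    # the application can log, alert, or re-subscribe at this point.
    print("Streaming pull gave up after a prolonged channel failure.")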

How to test

I was not able to reproduce the actual error users reported in a real setup (a sample pubsub app deployed to K8s), but figured out what is probably happening and faked the error.

Steps to reproduce:

  • Patch the installed grpc/_channel.py to fake a transient channel failure:

--- /home/peter/workspace/google-cloud-python/venv-3.6/lib/python3.6/site-packages/grpc/_channel.py     2019-04-23 17:01:39.282064676 +0200
+++ /home/peter/workspace/google-cloud-python/venv-3.6/lib/python3.6/site-packages/grpc/_channel.py     2019-04-25 15:49:05.220317794 +0200
@@ -456,6 +456,11 @@
 def _end_unary_response_blocking(state, call, with_call, deadline):
+    state.code = grpc.StatusCode.UNAVAILABLE
+    state.details = "channel is in **fake** TRANSIENT_FAILURE state"
+    state.debug_error_string = (
+        "transient failure is faked during a fixed time window in an hour"
+    )
     if state.code is grpc.StatusCode.OK:
         if with_call:
             rendezvous = _Rendezvous(state, call, None, deadline)
  • Adjust the total_timeout_millis setting in subscriber client config to 10 seconds, in order to not wait for too long (one programmatic way to do this is sketched after this list).
  • Run a sample subscriber script (example). Prerequisites: a Google service account configured for the subscriber client, and an active subscription to a topic.
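
For the timeout step, one way to shorten the retry budget without hand-editing the installed subscriber_client_config.py is to pass a modified copy of the default config to the client. This is a hedged sketch only, assuming the gapic config module path and the interfaces/retry_params layout of the 2019-era library; verify the keys against your installed version:

import copy

from google.cloud import pubsub_v1
from google.cloud.pubsub_v1.gapic import subscriber_client_config

# Copy the default config and shorten every retry budget to ~10 seconds so the
# RetryError shows up quickly (key layout assumed, check your installed config).
client_config = copy.deepcopy(subscriber_client_config.config)
retry_params = client_config["interfaces"]["google.pubsub.v1.Subscriber"]["retry_params"]
for params in retry_params.values():
    params["total_timeout_millis"] = 10 * 1000

subscriber = pubsub_v1.SubscriberClient(client_config=client_config)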

Actual result (before the fix):
A RetryError occurs in the background after ~10 seconds, some of the threads exit, but the subscriber client keeps running, and the error is not propagated to the main thread (the future returned by the subscribe() method is not resolved).

Expected result (after the fix):
Everything gets shut down cleanly, and RetryError is propagated to and raised in the main code.

If a RetryError occurs, it is time to stop waiting for the underlying gRPC channel to recover from a transient failure, and a clean shutdown needs to be triggered. This commit assures that this indeed happens (it used to happen on terminal channel errors only).
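
A rough conceptual sketch of the behavior the commit message describes (not the PR's literal diff; the manager/close names stand in for the streaming pull manager's private internals):

from google.api_core.exceptions import GoogleAPICallError, RetryError

def _rpc_done(manager, reason):
    # Called from a background thread when the streaming pull RPC terminates.
    if isinstance(reason, (GoogleAPICallError, RetryError)):
        # Before this change only terminal channel errors triggered a clean
        # shutdown; now a RetryError does too. Closing the manager stops the
        # background threads and sets the exception on the future returned by
        # subscribe(), so future.result() re-raises it in the main thread.
        manager.close(reason=reason)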
plamut added the api: pubsub (Issues related to the Pub/Sub API.) label on May 6, 2019
plamut requested a review from crwilcox as a code owner on May 6, 2019 14:49
googlebot added the cla: yes (This human has signed the Contributor License Agreement.) label on May 6, 2019
plamut added the kokoro:force-run (Add this label to force Kokoro to re-run the tests.) label on May 7, 2019
yoshi-kokoro removed the kokoro:force-run label on May 7, 2019
plamut added the kokoro:force-run label on May 7, 2019
yoshi-kokoro removed the kokoro:force-run label on May 7, 2019
@plamut (Contributor, Author) commented:

As discussed offline, the failing reCAPTCHA Enterprise build is not related, and we agreed to merge this.

plamut merged commit e00f6b3 into googleapis:master on May 7, 2019

Reviewers

@crwilcox: awaiting requested review
@tseaver: approved these changes

Assignees

No one assigned

Labels

api: pubsub (Issues related to the Pub/Sub API.), cla: yes (This human has signed the Contributor License Agreement.)

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

Uncaught exceptions within the streaming pull code.

4 participants

@plamut, @tseaver, @googlebot, @yoshi-kokoro
