This PR only adds support for setting the connection replication mode in the connection configuration and support for `CopyBoth` mode queries. This is the minimum support needed for downstream users to implement replication support on their own.

@jeff-davis After studying the code I think we don't need to add `unpipelined_send` nor worry about protocol desyncs.

The reason we don't need `unpipelined_send` is that, just like `CopyIn`, when performing a CopyBoth operation the connection enters a special mode where it consumes `FrontendMessage`s from a sub-receiver. These sub-receivers are the `CopyInReceiver` and `CopyBothReceiver` structs.

The reason we don't need to worry about protocol desyncs is that once the connection enters a sub-mode (with `Copy{In,Both}Receiver`), all other sends to the main command queue are ignored while the subprotocol is running. Only after the sub-receiver is exhausted, which always happens with a `CopyDone`/`CopyFail`/`Sync` message, does the system resume consuming the main queue.
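To make the queueing argument concrete, here is a toy model of the behaviour described above. It is only a sketch and not the tokio-postgres implementation: `Frontend`, `ToyConnection` and `next_to_send` are hypothetical names, and the real connection is asynchronous and deals with encoded messages.

```rust
use std::collections::VecDeque;

/// Hypothetical, simplified frontend messages (not the tokio-postgres types).
#[derive(Debug, Clone, PartialEq)]
pub enum Frontend {
    Query(String),
    CopyData(Vec<u8>),
    CopyDone,
    CopyFail,
    Sync,
}

/// Toy connection: a main command queue plus an optional sub-receiver queue.
#[derive(Default)]
pub struct ToyConnection {
    pub main: VecDeque<Frontend>,
    pub sub: Option<VecDeque<Frontend>>,
}

impl ToyConnection {
    /// Returns the next message to write to the socket.
    /// While a sub-receiver is active, the main queue is not consulted at all.
    pub fn next_to_send(&mut self) -> Option<Frontend> {
        if let Some(sub) = self.sub.as_mut() {
            // An empty sub-queue means "nothing to send yet"; the main queue stays parked.
            let msg = sub.pop_front()?;
            // A terminal message ends the sub-protocol and re-enables the main queue.
            if matches!(msg, Frontend::CopyDone | Frontend::CopyFail | Frontend::Sync) {
                self.sub = None;
            }
            return Some(msg);
        }
        self.main.pop_front()
    }
}
```

In this model a `Query` pushed onto `main` while the copy is running is only written after the sub-receiver has produced its terminal message, which is the property that rules out interleaving.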

The PR includes two tests that verify the client can resume normal query processing: one that gracefully shuts down the replication stream and one that simply drops the handles.

petrosagg mentioned this pull request

May 25, 2021

Support for physical and logical replication #752

Closed

petrosagg force-pushed the copy-both branch 3 times, most recently from 5db9588 to 71ad03e

May 26, 2021 09:24

jeff-davis reviewed

May 26, 2021

Looks great. A few questions/concerns.

tokio-postgres/src/copy_both.rs Outdated

		/// coming from the server. If it is not, `Sink::close` may hang forever waiting for the stream
		/// messages to be consumed.
		///
		/// The copy should be explicitly completed via the `Sink::close` method to ensure all data is

jeff-davis (Contributor), May 26, 2021

It seems like dropping it is enough?

tokio-postgres/src/copy_both.rs Outdated

Comment on lines 100 to 251

		/// The stream side must be consumed even if not required in order to process the messages
		/// coming from the server. If it is not, `Sink::close` may hang forever waiting for the stream
		/// messages to be consumed.

jeff-davis (Contributor), May 26, 2021

Can you expand on the reasoning here? `close()` or `drop()` should certainly ensure that a `CopyDone` is sent. And if the receiver is closed, that should cause future messages received in this sub-protocol to be discarded, eventually allowing the protocol to resume normal operations. Right?

tokio-postgres/src/copy_both.rs Outdated

Comment on lines 151 to 153

		// Indicate to CopyBothReceiver to produce a Sync message instead of CopyDone
		let _ = this.error_sender.take().unwrap().send(());
		return Poll::Ready(Some(Err(Error::db(error))));

jeff-davis (Contributor), May 26, 2021

Does this assume that the error happened during `START_REPLICATION`? What about an error that happened during the stream; shouldn't we still send a `CopyDone`?

jeff-davis reviewed

May 27, 2021

tokio-postgres/src/client.rs


		/// Executes a CopyBoth query, returning a combined Stream+Sink type to read and write copy
		/// data.
		pub async fn copy_both_simple<T>(&self, query: &str) -> Result<CopyBothDuplex<T>, Error>

jeff-davis (Contributor), May 27, 2021

After the replication stream, if the timeline is historical, Postgres will send a tuple as a response. So we actually need a function that returns something like `Result<(CopyBothDuplex<T>, Option<SimpleQueryMessage>), Error>` (or maybe `Result<(CopyBothDuplex<T>, Option<Vec<SimpleQueryMessage>>), Error>` in case other commands are added in the future which use `CopyBoth` and return a set).

It's actually very specific to `START_REPLICATION` (and even more specifically, to physical replication), so it might make sense to have a more specific name or at least clarify what it's expecting the command to do. Maybe something like `copy_both_simple_with_result()`?

petrosagg (Contributor, Author), May 27, 2021

That's a good point. I'll take a look at how we can expose this to users, ideally in a generic way.

jeff-davis (Contributor), May 28, 2021

In case you missed my other comment, there's a similar issue for `BASE_BACKUP`, except with `CopyOut` instead. That can be a separate PR, though.

jeff-davis (Contributor) commented May 27, 2021

`START_REPLICATION` and `BASE_BACKUP` don't follow a simple pattern. `START_REPLICATION` begins streaming in `CopyBoth` mode, and when the copy is finished, it then sometimes sends a single tuple (when streaming from a timeline other than the current one). `BASE_BACKUP` first sends two result sets (one tuple in the first set, multiple tuples in the second set), then it sends the copy data in `CopyOut` mode.

Supporting these commands requires some special methods on `Client` that won't really be useful for anything else, so I think we should name them specifically, e.g. `client.start_replication_command()` and `client.base_backup_command()`.

jeff-davis (Contributor) commented May 27, 2021

		//! It is recommended that you use a PostgreSQL server patch version
		//! of at least: 14.0, 13.2, 12.6, 11.11, 10.16, 9.6.21, or
		//! 9.5.25. Earlier patch levels have a bug that doesn't properly
		//! handle pipelined requests after streaming has stopped.

In my PR, I had the above comment. Does that apply to this PR, as well?
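For callers who want to act on that recommendation at runtime, one option is to compare the server's `server_version_num` setting against the patch levels listed in the quoted comment. The sketch below is an illustration only: `has_streaming_stop_fix` and `MIN_FIXED` are hypothetical names, and how the version number is fetched (for example via `SHOW server_version_num`) is left to the caller.

```rust
/// (release series base, first fixed `server_version_num`), taken from the
/// patch levels in the quoted doc comment: 9.5.25, 9.6.21, 10.16, 11.11, 12.6, 13.2.
const MIN_FIXED: [(u32, u32); 6] = [
    (90500, 90525),
    (90600, 90621),
    (100000, 100016),
    (110000, 110011),
    (120000, 120006),
    (130000, 130002),
];

/// Returns `Some(true)` if the server is at or above the recommended patch
/// level, `Some(false)` if it is below it, and `None` for release series
/// older than 9.5, which the comment does not cover.
pub fn has_streaming_stop_fix(server_version_num: u32) -> Option<bool> {
    // 14.0 and everything after it is listed as fixed.
    if server_version_num >= 140000 {
        return Some(true);
    }
    // From version 10 on the number is major * 10000 + minor;
    // before that it is major * 10000 + minor * 100 + patch.
    let series = if server_version_num >= 100000 {
        server_version_num / 10000 * 10000
    } else {
        server_version_num / 100 * 100
    };
    MIN_FIXED
        .iter()
        .find(|entry| entry.0 == series)
        .map(|entry| server_version_num >= entry.1)
}
```

A client could log a warning when this returns `Some(false)` instead of refusing to connect, since the bug only matters for requests pipelined after streaming has stopped.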

jeff-davis (Contributor) commented May 27, 2021

It would be good to have a way to send Standby Status Updates and Hot Standby Feedback. Not all of this has to be in this PR, though. I'm just commenting on everything necessary to make a full-featured and independent crate for replication.
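As a rough idea of what such a helper could look like, the sketch below encodes a Standby Status Update payload following the layout in the PostgreSQL streaming replication protocol documentation: a byte `r`, three 64-bit LSNs (written, flushed, applied), a 64-bit client clock in microseconds since 2000-01-01 UTC, and a reply-requested flag. The function names are hypothetical and not part of this PR; the resulting bytes would still have to be sent inside a `CopyData` message.

```rust
/// Seconds between the Unix epoch and 2000-01-01 00:00:00 UTC.
const PG_EPOCH_OFFSET_SECS: i64 = 946_684_800;

/// Converts microseconds since the Unix epoch to microseconds since 2000-01-01 UTC.
pub fn unix_to_pg_micros(unix_micros: i64) -> i64 {
    unix_micros - PG_EPOCH_OFFSET_SECS * 1_000_000
}

/// Encodes the payload of a Standby Status Update (34 bytes, big-endian fields).
pub fn encode_standby_status_update(
    write_lsn: u64,
    flush_lsn: u64,
    apply_lsn: u64,
    pg_micros: i64,
    reply_requested: bool,
) -> Vec<u8> {
    let mut buf = Vec::with_capacity(34);
    buf.push(b'r'); // message identifier
    buf.extend_from_slice(&write_lsn.to_be_bytes()); // last WAL byte + 1 written to disk
    buf.extend_from_slice(&flush_lsn.to_be_bytes()); // last WAL byte + 1 flushed to disk
    buf.extend_from_slice(&apply_lsn.to_be_bytes()); // last WAL byte + 1 applied
    buf.extend_from_slice(&pg_micros.to_be_bytes()); // client clock at time of transmission
    buf.push(reply_requested as u8); // 1 asks the server to reply immediately
    buf
}
```

Hot Standby Feedback uses a different identifier and field set, so it would need its own encoder.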

petrosagg force-pushed the copy-both branch 2 times, most recently from 935bd20 to f5e018f

May 27, 2021 22:28

petrosagg commented

May 27, 2021

@jeff-davis thank you for the thorough review!

Unfortunately, last night I realised that the original design had a big hole that allowed de-syncing the protocol. I just pushed a new version that I think is Correct (tm) now.

When doing a CopyBoth query the architecture looks something like this:

                                        |
        <tokio_postgres owned>          |    <userland owned>
                                        |
pg -> Connection -> CopyBothReceiver ---+---> CopyBothDuplex
                                        |          /   \
                                        |         v     v
                                        |      Sink    Stream

The original version of the feature handled the state machine of the CopyBoth sub-protocol as part of the Stream implementation of `CopyBothDuplex` and treated `CopyBothReceiver` as a dumb relay of messages into the connection. Therein lies the problem. A user could create a `CopyBothDuplex` and drop it immediately. In that case the dumb `CopyBothReceiver` would unconditionally send a `CopyDone` to the server and finish. But what if the server sent an error in the meantime? There was nothing to handle this fact and no Sync message was being sent.

So this re-work of the feature flips this relationship around. `CopyBothReceiver` contains all the logic to drive the sub-protocol forward and ensures that no matter what the correct messages are being exchanged with the server. The user is free to drop their end at any point.
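The idea of letting the receiver own the decision can be pictured as a small transition function. This is a deliberately reduced model and not the code in the PR: `State`, `Event`, `Outgoing` and `step` are hypothetical names, and the choice of replies only follows what is described in this thread (a `CopyDone` when the user drops their end, a Sync once the server has reported an error).

```rust
/// Where the sub-protocol currently stands (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Streaming,  // both directions open
    ClientDone, // our CopyDone has been sent, the server is still talking
    Finished,   // sub-protocol over, main queue may resume
}

/// Things that can happen while the receiver drives the sub-protocol.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    DuplexDropped,  // the user dropped or closed their end
    ServerCopyDone, // the server ended its side of the copy
    ServerError,    // the server reported an error
}

/// Message the receiver hands to the connection, if any.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outgoing {
    CopyDone,
    Sync,
}

/// One transition; every (state, event) pair has a defined outcome.
pub fn step(state: State, event: Event) -> (State, Option<Outgoing>) {
    match (state, event) {
        (State::Streaming, Event::DuplexDropped) => (State::ClientDone, Some(Outgoing::CopyDone)),
        (State::Streaming, Event::ServerCopyDone) => (State::Finished, Some(Outgoing::CopyDone)),
        // An error always ends with a Sync, even if CopyDone was already sent.
        (State::Streaming, Event::ServerError) | (State::ClientDone, Event::ServerError) => {
            (State::Finished, Some(Outgoing::Sync))
        }
        (State::ClientDone, Event::ServerCopyDone) => (State::Finished, None),
        (State::ClientDone, Event::DuplexDropped) => (State::ClientDone, None),
        (State::Finished, _) => (State::Finished, None),
    }
}
```

The drop-then-error sequence, which the original design missed, here yields a `CopyDone` followed by a `Sync` regardless of what the user does with their handle.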

I've also added a ton of comments and a few diagrams to make reading and reviewing the code easier. If it's not too much to ask, I'd hugely appreciate another round of review. I think we're close!

Next week I plan to compile a modified version of postgres that errors out on purpose to test out the error paths of this PR and also see how we can incorporate getting results back for things like timeline changes etc.

petrosagg force-pushed the copy-both branch from f5e018f to b95f38b

June 1, 2021 08:07

petrosagg mentioned this pull request

Jun 1, 2021

update definition of docker image #781

Closed

petrosagg (Contributor, Author) commented Jun 1, 2021

@jeff-davis

> `START_REPLICATION` and `BASE_BACKUP` don't follow a simple pattern.

It looks like the general pattern is that the database can respond with result sets, copy outs, or copy boths. Based on that, I think what we need here is to define something like:

enum ResponsePart {
    RowStream(RowStream),
    SimpleQueryStream(SimpleQueryStream),
    CopyOutStream(CopyOutStream),
    CopyBothStream(CopyBothStream),
}

And then provide a method to send a query and get a Stream of `ResponsePart`s back. This will leave the interpretations of the various segments to the user issuing the query or some other higher level downstream library. How does this sound?
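To illustrate how a response could be cut into such parts, here is a simplified, synchronous sketch. It is not the proposed API: `Msg`, `Part` and `split_parts` are hypothetical, the message set is reduced to a handful of tags, and each part only records how many rows or data chunks it contained.

```rust
/// Reduced set of backend message tags (hypothetical, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Msg {
    RowDescription,
    DataRow,
    CommandComplete,
    CopyOutResponse,
    CopyBothResponse,
    CopyData,
    CopyDone,
    ReadyForQuery,
}

/// One segment of a response, with the number of rows or chunks it carried.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Part {
    Rows(usize),
    CopyOut(usize),
    CopyBoth(usize),
}

/// Groups a flat message sequence into response parts.
pub fn split_parts(msgs: &[Msg]) -> Vec<Part> {
    let mut parts = Vec::new();
    let mut current: Option<Part> = None;
    for msg in msgs {
        match msg {
            // A new header closes the previous part (if any) and opens the next one.
            Msg::RowDescription => parts.extend(current.replace(Part::Rows(0))),
            Msg::CopyOutResponse => parts.extend(current.replace(Part::CopyOut(0))),
            Msg::CopyBothResponse => parts.extend(current.replace(Part::CopyBoth(0))),
            Msg::DataRow => {
                if let Some(Part::Rows(n)) = current.as_mut() {
                    *n += 1;
                }
            }
            Msg::CopyData => {
                if let Some(Part::CopyOut(n) | Part::CopyBoth(n)) = current.as_mut() {
                    *n += 1;
                }
            }
            // Terminators close whatever part is open.
            Msg::CommandComplete | Msg::CopyDone | Msg::ReadyForQuery => {
                parts.extend(current.take())
            }
        }
    }
    parts.extend(current.take());
    parts
}
```

With this shape, a `BASE_BACKUP`-style exchange as described earlier in the thread comes out as two `Rows` parts followed by a `CopyOut` part, and a `START_REPLICATION` from a historical timeline as a `CopyBoth` part followed by a `Rows` part; interpreting them is left to the caller.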

> In my PR, I had the above comment. Does that apply to this PR, as well?

hm do you have a link to the patch? I thought pipelined queries were not allowed over a replication connection

jeff-davis (Contributor) commented Jun 1, 2021

> And then provide a method to send a query and get a Stream of `ResponsePart`s back. This will leave the interpretations of the various segments to the user issuing the query or some other higher level downstream library. How does this sound?

I like it. Maybe it can even be refactored so that there's one generic entry point, `query_simple_extended()`, that returns a stream of `ResponsePart`s; and the other entry points are just special cases of that one. But the refactoring might be better as a separate PR.

> hm do you have a link to the patch? I thought pipelined queries were not allowed over a replication connection

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a58db3aa10e62e4228aa409ba006014fa07a8ca2

Is there a reason you thought pipelined queries would not be allowed in replication mode? And if so, how would you prevent them, since the whole protocol implementation is built around pipelining, without explicitly blocking in certain cases?

petrosagg force-pushed the copy-both branch from b95f38b to cee0938

June 2, 2021 09:41

petrosagg (Contributor, Author) commented Jun 2, 2021

> Is there a reason you thought pipelined queries would not be allowed in replication mode? And if so, how would you prevent them, since the whole protocol implementation is built around pipelining, without explicitly blocking in certain cases?

You're 100% right. I confused pipelined queries with extended query mode, which is the one that's not allowed in replication mode. So yeah, your original comment still applies in that users should pay attention to this bug. I added this note in the documentation of the `ReplicationMode` enum and the `Config::replication_mode` function.
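For readers unfamiliar with how the mode is usually selected, libpq documents a `replication` connection parameter where boolean-true values request physical replication and `database` requests logical replication. The sketch below parses that parameter into an enum; it is an assumption-laden illustration, not the parser in this PR. The variants `Physical` and `Logical` and the function `parse_replication_param` are hypothetical, and the values accepted by this crate's connection string may differ.

```rust
/// Hypothetical mirror of a replication mode setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReplicationMode {
    Physical,
    Logical,
}

/// Parses the value of a `replication=...` connection parameter using the
/// values documented for libpq. `Ok(None)` means an ordinary connection.
pub fn parse_replication_param(value: &str) -> Result<Option<ReplicationMode>, String> {
    match value {
        "true" | "on" | "yes" | "1" => Ok(Some(ReplicationMode::Physical)),
        "database" => Ok(Some(ReplicationMode::Logical)),
        "false" | "off" | "no" | "0" => Ok(None),
        other => Err(format!("invalid replication value: {}", other)),
    }
}
```

Whichever mode is selected, the note about server patch levels applies, since the bug concerns requests pipelined after streaming has stopped.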

jeff-davis reviewed

Jun 2, 2021

tokio-postgres/src/copy_both.rs Outdated

Comment on lines 26 to 37

		/// CopyOut-->ServerError<--CopyIn
		///    \           |           /
		///     `---,      |      ,---'
		///          \     |     /
		///           v    v    v
		///            CopyNone
		///                |
		///                v
		///          CopyComplete
		///                |
		///                v
		///         CommandComplete

jeff-davis (Contributor), Jun 2, 2021

I don't think this is the right place in the state machine for `ServerError`. If a server error happens in `CopyBoth` mode, that's the odd (and hard-to-test) case where an error is thrown by the server in the middle of streaming. From the docs, it looks like Postgres will just return the `ErrorResponse` and then `ReadyForQuery`, without either of the `CommandComplete` messages.

You can test this by hacking up `test_decoding` to throw an error randomly every 10 records or something.

vkrasnov referenced this pull request in readysettech/rust-postgres

Jun 4, 2021

Small changes to expose WAL

e899d18

jeff-davis (Contributor) commented Jun 6, 2021

Summary of the remaining issues as I see them:

- Need an API to execute a query using the simple protocol and get back a stream of `ResponsePart`s (or another API with similar functionality).
- Change the way errors during a `CopyBoth` stream are handled.
  1. The simplest answer is to close the whole client if this happens. That might be best, because handling the errors will be hard to test and not especially useful. Note: this reasoning does not apply to ordinary errors that happen as a part of the `START_REPLICATION` command, which should be properly handled; only errors that happen after entering `CopyBoth` mode can reasonably cause the client to close the connection.
  2. If we can reasonably test it, handling the errors would be the most proper thing to do, and the client can potentially try again with a new `START_REPLICATION` command without closing the entire connection. Honestly, I don't see a real use case here, but it does seem a bit more proper.
- It would be nice to offer an API to send Standby Status Updates and Hot Standby Feedback.

After these are done, I'll be able to port my other replication code to a new crate, and if that works, then I think this PR is ready.

petrosagg force-pushed the copy-both branch 7 times, most recently from 50c81b8 to 77dff40

June 7, 2021 17:32

Martichou commented Mar 17, 2022

@sfackler any news on this? I'm relying on this PR for a project and I think a lot of people would benefit from it.
It would be great (and kind) to review the work that @petrosagg and @jeff-davis did on rust-postgres.

benesch (Contributor) commented Mar 17, 2022

@Martichou, you're welcome to use our fork (https://github.com/materializeInc/rust-postgres) for the moment! It's got this PR and #774 together, and we integrate new changes from rust-postgres periodically.

(We would, of course, love to get these PRs upstreamed, but @sfackler's time has seemed quite limited lately.)

Martichou commented Mar 17, 2022

@benesch Thanks ! I was doing the same (maintaining a fork), but I'll use yours from now on ;)

ruslantalpa mentioned this pull request

Sep 13, 2022

implement support for streaming replication WIP (for #116) #652

Closed

petrosagg force-pushed the copy-both branch 3 times, most recently from 915ef69 to 2fed91d

January 13, 2023 21:36

manfredcml commented Mar 5, 2023

@sfackler It'd be great if you can review this PR by @petrosagg and @jeff-davis. Merging this will be helpful for those who are currently maintaining or using the forks.

dtbuchholz mentioned this pull request

Oct 26, 2023

[NOT-64] Weeknotes individual update: October 30, 2023 tablelandnetwork/weeknotes#62

Closed

imor commented Aug 15, 2024 (edited)

@sfackler would you be interested in replication support? I know this PR has conflicts, but I can clean this one up. Or, if you are not happy with the current design, maybe open a fresh one? We are forced to use a fork from MaterializeInc (cc @benesch) in `pg_replicate`, which is not ideal because it prevents us from publishing to crates.io.

benesch (Contributor) commented Aug 15, 2024

Hey @imor! We were excited to see you build `pg_replicate` on top of our fork. We'd be delighted to see this merge upstream so that we could shut down our fork. Unfortunately, we've not been able to get Steven's attention on this PR in the last three years.

petrosagg force-pushed the copy-both branch from 2fed91d to d8986ff

August 16, 2024 08:00

petrosagg (Contributor, Author) commented Aug 16, 2024

I rebased this PR on top of current `master`.

imor commented Aug 19, 2024

> Unfortunately, we've not been able to get Steven's attention on this PR in the last three years.

@benesch That's sad. Were there any concrete reasons given as to why this PR is not accepted? I don't see any comment from Steven in this PR. In the worst case, if the status quo doesn't change, would it be too much to ask to publish your fork on crates.io? It would allow downstream users to publish their crates as well. Not ideal, I know, but I'm looking for options here.

benesch (Contributor) commented Aug 19, 2024

> That's sad. Were there any concrete reasons given as to why this PR is not accepted? I don't see any comment from Steven in this PR.

Nope, we’ve never heard anything, I’m afraid. I think Steven maintains this project on a volunteer basis though so I certainly don’t blame him. :)

Re a fork: I’d like to avoid having us (@MaterializeInc) on the critical path for new releases of `pg_replicate`. E.g., if there are bug fixes you need merged. We’ve accumulated a few Materialize specific patches in our fork too that might not be broadly interesting. You’re welcome to just publish our fork to crates.io under a crate name you control, though!

petrosagg (Contributor, Author) commented Aug 19, 2024

> You’re welcome to just publish our fork to crates.io under a crate name you control, though!

Keep in mind that our fork contains more or less the code in this PR but also a lot more (the entire logical replication decoding logic). The idea was that once this PR was merged, the logical replication protocol would be handled by a separate crate layered on top of tokio-postgres, which would then expose the right functionality (i.e. CopyBoth queries). I came up with this layering after we had our fork working, to reduce the size of this PR and make it more likely to be reviewed and merged. So what might make sense to publish is a crate of this PR, and then we can extract the logical decoding bits into another one. In the event that this PR eventually gets merged, we can deprecate the published fork crate but keep the logical replication crate.

imor commented Aug 20, 2024

> I think Steven maintains this project on a volunteer basis though so I certainly don’t blame him. :)

Oh, there's no question of blaming Steven. All I have is gratitude for this great library. I understand how hard it can be to maintain a popular open source project.

Regarding the way forward, I like @petrosagg's suggestion about publishing a separate crate. It is a very practical solution that avoids fragmentation by a fork. I'm willing to help with this effort, so let me know how I can be of use.

benesch (Contributor) commented Aug 21, 2024 (edited)

I sent @sfackler an email asking if he has bandwidth to review/merge this PR. If we don't hear back in a few days, then we can investigate the backup plan of publishing a crate which is just rust-postgres plus this one PR.

petrosagg force-pushed the copy-both branch 3 times, most recently from 6340072 to 63b8392

August 21, 2024 17:40

petrosagg mentioned this pull request

Aug 21, 2024

*: upgrade postgres crates MaterializeInc/materialize#29157

Merged

5 tasks

petrosagg (Contributor, Author) commented Aug 21, 2024

Heads up that I tidied up our fork and brought in all changes from the master branch of this repo plus this PR. The logical replication functionality has been extracted in a separate crate here https://github.com/MaterializeInc/rust-postgres/tree/master/postgres-replication and is the crate that we could in theory publish if this PR ever gets merged or gets published under a different name.

jeff-davis and others added 5 commits

September 2, 2024 18:23

Make simple_query::encode() pub(crate).

95a3c98

Connection string config for replication.

bd96437

Co-authored-by: Petros Angelatos <petrosagg@gmail.com>

implement Stream for Responses

92899e8

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

add copy_both_simple method

bed87a9

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

ci: enable logical replication in the test image

88edd68

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

petrosagg force-pushed the copy-both branch from 63b8392 to 88edd68

September 2, 2024 15:23

qianyiwen2019 commented Nov 4, 2024

This has saved me a lot of time, thanks to petrosagg.
Waiting for the merge.

Support CopyBoth queries and replication mode in config #778
petrosagg commented May 25, 2021