Commit d2c2551

Add file.
1 parent dd47964, commit d2c2551

1 file changed: +189 -0 lines changed

doc/TODO.detail/async
@@ -0,0 +1,189 @@
From owner-pgsql-hackers@hub.org Sun Jan 23 13:31:03 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
        by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA28482
        for <pgman@candle.pha.pa.us>; Sun, 23 Jan 2000 13:31:01 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id NAA08409 for <pgman@candle.pha.pa.us>; Sun, 23 Jan 2000 13:04:34 -0500 (EST)
Received: from localhost (majordom@localhost)
        by hub.org (8.9.3/8.9.3) with SMTP id MAA65651;
        Sun, 23 Jan 2000 12:57:33 -0500 (EST)
        (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Sun, 23 Jan 2000 12:57:20 -0500
Received: (from majordom@localhost)
        by hub.org (8.9.3/8.9.3) id MAA65548
        for pgsql-hackers-outgoing; Sun, 23 Jan 2000 12:56:20 -0500 (EST)
        (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
        by hub.org (8.9.3/8.9.3) with ESMTP id MAA65492
        for <pgsql-hackers@postgreSQL.org>; Sun, 23 Jan 2000 12:55:41 -0500 (EST)
        (envelope-from tgl@sss.pgh.pa.us)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
        by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA06211;
        Sun, 23 Jan 2000 12:55:36 -0500 (EST)
To: Alfred Perlstein <bright@wintelcom.net>
cc: pgsql-hackers@postgreSQL.org
Subject: Re: pg_dump possible fix, need testers. (was: Re: [HACKERS] pg_dump disaster)
In-reply-to: <20000123022341.J26520@fw.wintelcom.net>
References: <20000122211427.C26520@fw.wintelcom.net> <200001230525.AAA08020@candle.pha.pa.us> <20000122220256.H26520@fw.wintelcom.net> <5120.948606837@sss.pgh.pa.us> <20000123022341.J26520@fw.wintelcom.net>
Comments: In-reply-to Alfred Perlstein <bright@wintelcom.net>
        message dated "Sun, 23 Jan 2000 02:23:41 -0800"
Date: Sun, 23 Jan 2000 12:55:36 -0500
Message-ID: <6208.948650136@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Sender: owner-pgsql-hackers@postgreSQL.org
Status: ORr

>> Um, I didn't have any trouble at all reproducing Patrick's complaint.
>> pg_dump any moderately large table (I used tenk1 from the regress
>> database) and try to load the script with psql. Kaboom.

> This is after or before my latest patch?

Before. I haven't updated since yesterday...

> I can't seem to reproduce this problem,

Odd. Maybe there is something different about the kernel's timing of
message sending on your platform. I see it very easily on HP-UX 10.20,
and Patrick sees it very easily on whatever he's using (NetBSD, I think).
You might try varying the situation a little, say
        psql mydb <dumpfile
        psql -f dumpfile mydb
        psql mydb
                \i dumpfile
and the same with -h localhost (to get a TCP/IP connection instead of
Unix domain). At the moment (pre-patch) I see failures with the
first two of these, but not with the \i method. -h doesn't seem to
matter for me, but it might for you.

> Telling me something is wrong without giving suggestions on how
> to fix it, nor direct pointers to where it fails doesn't help me
> one bit. You're not offering constructive criticism, you're not
> even offering valid criticism, you're just waving your finger at
> "problems" that you say exist but don't pin down to anything specific.

I have been explaining it as clearly as I could. Let's try it
one more time.

> I spent hours looking over what I did to pqFlush and pqPutnBytes
> because of what you said earlier when all the bug seems to have
> come down to is that I missed that the socket is set to non-blocking
> in all cases now.

Letting the socket mode default to blocking will hide the problems from
existing clients that don't care about non-block mode. But people who
try to actually use the nonblock mode are going to see the same kinds of
problems that psql is exhibiting.

> The old sequence of events that happened was as follows:

>   user sends data almost filling the output buffer...
>   user sends another line of text overflowing the buffer...
>   pqFlush is invoked blocking the user until the output pipe clears...
>   and repeat.

Right.

> The nonblocking code allows sends to fail so the user can abort
> sending stuff to the backend in order to process other work:

>   user sends data almost filling the output buffer...
>   user sends another line of text that may overflow the buffer...
>   pqFlush is invoked,
>     if the pipe can't be cleared an error is returned allowing the user to
>       retry the send later.
>     if the flush succeeds then more data is queued and success is returned

But you haven't thought through the mechanics of the "error is returned
allowing the user to retry" code path clearly enough. Let's take
pqPutBytes for an example. If it returns EOF, is that a hard error or
does it just mean that the application needs to wait a while? The
application *must* distinguish these cases, or it will do the wrong
thing: for example, if it mistakes a hard error for "wait a while",
then it will wait forever without making any progress or producing
an error report.

You need to provide a different return convention that indicates
what happened, say
        EOF (-1)  => hard error (same as old code)
        0         => OK
        1         => no data was queued due to risk of blocking
And you need to guarantee that the application knows what the state is
when the can't-do-it-yet return is made; note that I specified "no data
was queued" above. If pqPutBytes might queue some of the data before
returning 1, the application is in trouble again. While you apparently
foresaw that in recoding pqPutBytes, your code doesn't actually work.
There is the minor code bug that you fail to update "avail" after the
first pqFlush call, and the much more fundamental problem that you
cannot guarantee to have queued all or none of the data. Think about
what happens if the passed nbytes is larger than the output buffer size.
You may pass the first pqFlush successfully, then get into the loop and
get a won't-block return from pqFlush in the loop. What then?
You can't simply refuse to support the case nbytes > bufsize at all,
because that will cause application failures as well (too long query
sends it into an infinite loop trying to queue data, most likely).
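The three-way convention proposed above can be sketched roughly as follows. This is an illustration only, not libpq code: `OutBuf`, `put_bytes_all_or_none`, and the `PUT_*` names are hypothetical, and a small fixed array stands in for the connection's output buffer. Note that the sketch inherits the limitation the mail describes: a request larger than the whole buffer can never succeed and would be reported as "would block" forever.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes following the proposed convention. */
#define PUT_HARD_ERROR  (-1)  /* same value the old code returned as EOF */
#define PUT_OK            0
#define PUT_WOULD_BLOCK   1   /* nothing was queued; caller may retry later */

/* Toy stand-in for a connection's output buffer (not libpq's PGconn). */
typedef struct {
    char   buf[16];
    size_t used;
    int    broken;            /* nonzero simulates a dead connection */
} OutBuf;

/* Queue all nbytes or none of them; a partial write never happens. */
static int put_bytes_all_or_none(OutBuf *out, const char *data, size_t nbytes)
{
    assert(out != NULL && data != NULL);

    if (out->broken)
        return PUT_HARD_ERROR;

    /* Not enough room: report it and leave the buffer state untouched. */
    if (nbytes > sizeof(out->buf) - out->used)
        return PUT_WOULD_BLOCK;

    memcpy(out->buf + out->used, data, nbytes);
    out->used += nbytes;
    return PUT_OK;
}
```

Because the "would block" return guarantees that nothing was queued, a caller can retry the identical call later without adjusting any pointers or counts.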

A possible answer is to specify that a return of +N means "N bytes
remain unqueued due to risk of blocking" (after having queued as much
as you could). This would put the onus on the caller to update his
pointers/counts properly; propagating that into all the internal uses
of pqPutBytes would be no fun. (Of course, so far you haven't updated
*any* of the internal callers to behave reasonably in case of a
won't-block return; PQfn is just one example.)
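To make the caller-side burden of the "+N bytes remain" convention concrete, here is a minimal sketch. All names (`PartBuf`, `put_bytes_partial`, `drain`, `send_all`) are hypothetical, and `drain` merely pretends the socket accepted everything; real code would wait for the socket to become writable and flush instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy output buffer; deliberately tiny so partial queueing is easy to see. */
typedef struct {
    char   buf[8];
    size_t used;
} PartBuf;

/* Returns -1 on hard error, otherwise the number of bytes NOT queued. */
static long put_bytes_partial(PartBuf *out, const char *data, size_t nbytes)
{
    size_t room, take;

    if (out == NULL || data == NULL)
        return -1;

    room = sizeof(out->buf) - out->used;
    take = nbytes < room ? nbytes : room;   /* queue as much as fits */
    memcpy(out->buf + out->used, data, take);
    out->used += take;
    return (long) (nbytes - take);
}

/* Stand-in for a successful flush: pretend the socket took everything. */
static void drain(PartBuf *out)
{
    out->used = 0;
}

/* The bookkeeping every caller would have to repeat under this convention. */
static int send_all(PartBuf *out, const char *data, size_t nbytes, int *rounds)
{
    assert(rounds != NULL);
    *rounds = 0;
    while (nbytes > 0) {
        long left = put_bytes_partial(out, data, nbytes);

        if (left < 0)
            return -1;
        data += nbytes - (size_t) left;     /* skip what was queued */
        nbytes = (size_t) left;
        (*rounds)++;
        if (left > 0)
            drain(out);                     /* real code: wait, then flush */
    }
    return 0;
}
```

Every internal caller would need a loop like `send_all`, which is the "no fun" part the mail refers to.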

Another possible answer is to preserve pqPutBytes' old API, "queue or
bust", by the expedient of enlarging the output buffer to hold whatever
we can't send immediately. This is probably more attractive, even
though a long query might suck up a lot of space that won't get
reclaimed as long as the connection lives. If you don't do this then
you are going to have to make a lot of ugly changes in the internal
callers to deal with won't-block returns. Actually, a bulk COPY IN
would probably be the worst case --- the app could easily load data into
the buffer far faster than it could be sent. It might be best to extend
PQputline to have a three-way return and add code there to limit the
growth of the output buffer, while allowing all internal callers to
assume that the buffer is expanded when they need it.
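A rough sketch of the "queue or bust" alternative, under stated assumptions: `GrowBuf`, `put_bytes_grow`, `putline_limited`, and the `PUTLINE_SOFT_LIMIT` threshold are all hypothetical names and values chosen for illustration. The internal routine always queues by growing the buffer, while only the bulk-load entry point applies a growth limit and reports a three-way result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy growable output buffer (not libpq's real data structure). */
typedef struct {
    char  *buf;
    size_t used;
    size_t cap;
} GrowBuf;

/* "Queue or bust": returns 0 on success, -1 only if memory runs out.
 * Overflow checks on the capacity arithmetic are omitted for brevity. */
static int put_bytes_grow(GrowBuf *out, const char *data, size_t nbytes)
{
    assert(out != NULL && data != NULL);

    if (nbytes > out->cap - out->used) {
        size_t newcap = out->cap ? out->cap : 64;
        char  *newbuf;

        while (newcap - out->used < nbytes)
            newcap *= 2;
        newbuf = realloc(out->buf, newcap);
        if (newbuf == NULL)
            return -1;
        out->buf = newbuf;
        out->cap = newcap;
    }
    memcpy(out->buf + out->used, data, nbytes);
    out->used += nbytes;
    return 0;
}

/* Hypothetical cap on buffered data, applied only at the bulk-load entry. */
#define PUTLINE_SOFT_LIMIT 256

/* Three-way return: -1 hard error, 0 queued, 1 nothing queued (retry later). */
static int putline_limited(GrowBuf *out, const char *line)
{
    if (out->used >= PUTLINE_SOFT_LIMIT)
        return 1;               /* caller should flush before sending more */
    return put_bytes_grow(out, line, strlen(line)) == 0 ? 0 : -1;
}
```

The design choice here is that a single oversized line is still accepted once (the buffer simply grows), but a fast producer is pushed back as soon as the backlog passes the soft limit.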

pqFlush has the same kind of interface design problem: the same EOF code
is returned for either a hard error or can't-flush-yet, but it would be
disastrous to treat those cases alike. You must provide a 3-way return
code.

Furthermore, the same sort of 3-way return code convention will have to
propagate out through anything that calls pqFlush (with corresponding
documentation updates). pqPutBytes can be made to hide a pqFlush
won't-block return by trying to enlarge the output buffer, but in most
other places you won't have a choice except to punt it back to the caller.
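As a sketch of what a three-way flush could look like on a POSIX system, the following separates "would block" from a hard error by inspecting `errno` after `write()`. It is not libpq's pqFlush: `FlushBuf`, `flush_nonblocking`, and `set_nonblocking` are hypothetical names, and the example assumes a plain file descriptor rather than a real connection object.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Toy output buffer tied to a file descriptor. */
typedef struct {
    int    fd;
    char   buf[8192];
    size_t used;
} FlushBuf;

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Returns 0 if everything was sent, 1 if data remains because the write
 * would block, and -1 on a hard error. */
static int flush_nonblocking(FlushBuf *out)
{
    assert(out != NULL);

    while (out->used > 0) {
        ssize_t n = write(out->fd, out->buf, out->used);

        if (n < 0) {
            if (errno == EINTR)
                continue;                   /* interrupted: just retry */
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return 1;                   /* can't flush yet */
            return -1;                      /* hard error */
        }
        /* Keep whatever was not written at the front of the buffer. */
        memmove(out->buf, out->buf + n, out->used - (size_t) n);
        out->used -= (size_t) n;
    }
    return 0;
}
```

A caller that receives 1 would typically wait for the descriptor to become writable (for example with `select()` or `poll()`) and call the function again, while -1 should be reported as a failure immediately.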

PQendcopy has the same interface design problem. It used to be that
(unless you passed a null pointer) PQendcopy would *guarantee* that
the connection was no longer in COPY state on return --- by resetting
it, if necessary. So the return code was mainly informative; the
application didn't have to do anything different if PQendcopy reported
failure. But now, a nonblocking application does need to pay attention
to whether PQendcopy completed or not --- and you haven't provided a way
for it to tell. If 1 is returned, the connection might still be in
COPY state, or it might not (PQendcopy might have reset it). If the
application doesn't distinguish these cases then it will fail.

I also think that you want to take a hard look at the automatic "reset"
behavior upon COPY failure, since a PQreset call will block the
application until it finishes. Really, what is needed to close down a
COPY safely in nonblock mode is a pair of entry points along the line of
"PQendcopyStart" and "PQendcopyPoll", with API conventions similar to
PQresetStart/PQresetPoll. This gives you the ability to do the reset
(if one is necessary) without blocking the application. PQendcopy
itself will only be useful to blocking applications.
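The start/poll pair suggested above might take a shape like the following. This is purely a sketch of the calling convention: "PQendcopyStart" and "PQendcopyPoll" are only proposed names in the mail, and the `endcopy_start`, `endcopy_poll`, `EndCopyPollStatus`, and `ToyConn` names below are hypothetical stand-ins with simulated connection state. The status values are modeled loosely on the way PQresetPoll tells the caller whether to wait for readability or writability.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical polling status, loosely modeled on what PQresetPoll reports. */
typedef enum {
    ENDCOPY_POLL_FAILED,
    ENDCOPY_POLL_READING,   /* wait until the socket is readable, poll again */
    ENDCOPY_POLL_WRITING,   /* wait until the socket is writable, poll again */
    ENDCOPY_POLL_OK         /* connection has left COPY state */
} EndCopyPollStatus;

/* Simulated connection: counters replace real socket I/O. */
typedef struct {
    int in_copy;            /* still in COPY state? */
    int unsent;             /* pending output chunks */
    int awaiting_reply;     /* backend replies still expected */
    int failed;             /* nonzero simulates a communications failure */
} ToyConn;

/* Begin closing down the COPY; returns 1 if polling may proceed, else 0. */
static int endcopy_start(ToyConn *conn)
{
    assert(conn != NULL);
    if (conn->failed || !conn->in_copy)
        return 0;
    conn->awaiting_reply = 1;   /* the terminator will need an answer */
    return 1;
}

/* Advance one step without blocking and say what to wait for next. */
static EndCopyPollStatus endcopy_poll(ToyConn *conn)
{
    assert(conn != NULL);
    if (conn->failed)
        return ENDCOPY_POLL_FAILED;
    if (conn->unsent > 0) {
        conn->unsent--;
        return ENDCOPY_POLL_WRITING;
    }
    if (conn->awaiting_reply > 0) {
        conn->awaiting_reply--;
        return ENDCOPY_POLL_READING;
    }
    conn->in_copy = 0;
    return ENDCOPY_POLL_OK;
}
```

With this shape, the application always knows whether the connection is still in COPY state: it is, until the poll function returns the OK (or FAILED) status.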

> I'm sorry if they don't work for some situations other than COPY IN,
> but it's functionality that I needed and I expect to be expanded on
> by myself and others that take interest in nonblocking operation.

I don't think that the nonblock code is anywhere near production quality
at this point. It may work for you, if you don't stress it too hard and
never have a communications failure; but I don't want to see us ship it
as part of Postgres unless these issues get addressed.

                        regards, tom lane

************

