Commit 9832a23

Modify COPY TO to emit carriage returns and newlines as backslash escapes
(backslash-r, backslash-n) for protection against newline-conversion
munging. In future we will also tweak COPY FROM, but this part of the
change should be backwards-compatible. Per pghackers discussion.

Also, update COPY reference page to describe the backslash conversions
more completely and accurately.

1 parent c16ef16, commit 9832a23
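
To make the intended output format concrete, here is a minimal Python sketch of the escaping that the commit message describes for COPY TO. It is not the backend's C implementation. The helper names `escape_copy_field` and `format_copy_row` are hypothetical, and the mapping is taken from the escape table in the patched copy.sgml, assuming the default tab delimiter and the default `\N` null string.

```python
# Hypothetical helpers illustrating the documented COPY text escaping.
# This is an illustration of the rules, not PostgreSQL's own code.

# Control characters that the patched documentation says COPY TO emits
# as backslash sequences, plus backslash itself.
_ESCAPES = {
    "\\": "\\\\",
    "\b": "\\b",
    "\f": "\\f",
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
    "\v": "\\v",
}


def escape_copy_field(value, delimiter="\t", null="\\N"):
    """Return one attribute value in COPY text form (None stands for NULL)."""
    if value is None:
        return null
    out = []
    for ch in value:
        if ch in _ESCAPES:
            out.append(_ESCAPES[ch])
        elif ch == delimiter:
            # A non-default delimiter appearing in data is backslash-quoted.
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def format_copy_row(values, delimiter="\t"):
    """Join escaped fields with the delimiter and end the row with a newline."""
    return delimiter.join(escape_copy_field(v, delimiter) for v in values) + "\n"
```

Because data newlines and carriage returns never appear raw in the output, a tool that rewrites line endings in transit can only touch the row terminators, which is the protection the commit message refers to.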

2 files changed: +165 -77 lines changed

doc/src/sgml/ref/copy.sgml

Lines changed: 97 additions & 45 deletions
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/ref/copy.sgml,v 1.27 2002/01/20 22:19:56 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/ref/copy.sgml,v 1.28 2002/02/12 21:25:34 tgl Exp $
 PostgreSQL documentation
 -->

@@ -74,7 +74,7 @@ COPY [ BINARY ] <replaceable class="parameter">table</replaceable> [ WITH OIDS ]
 <term><replaceable class="parameter">filename</replaceable></term>
 <listitem>
 <para>
-The absolute Unix file name of the input or output file.
+The absolute Unix path name of the input or output file.
 </para>
 </listitem>
 </varlistentry>
@@ -225,7 +225,7 @@ ERROR: <replaceable>reason</replaceable>
 By default, a text copy uses a tab ("\t") character as a delimiter
 between fields. The field delimiter may be changed to any other single
 character with the keyword phrase USING DELIMITERS. Characters
-in data fields which happen to match the delimiter character will
+in data fields that happen to match the delimiter character will
 be backslash quoted.
 </para>

@@ -265,8 +265,8 @@ ERROR: <replaceable>reason</replaceable>
 by the <application>PostgreSQL</application> user (the user ID the
 server runs as), not the client.
 <command>COPY</command> naming a file is only allowed to database
-superusers, since it allows writing on any file that the backend has
-privileges to write on.
+superusers, since it allows reading or writing any file that the backend
+has privileges to access.

 <tip>
 <para>
@@ -297,57 +297,109 @@ ERROR: <replaceable>reason</replaceable>
 <title>File Formats</title>
 <refsect2>
 <refsect2info>
-<date>2001-01-02</date>
+<date>2002-02-12</date>
 </refsect2info>
 <title>Text Format</title>
 <para>
-When <command>COPY TO</command> is used without the BINARY option,
-the file generated will have each row (instance) on a single line, with each
-column (attribute) separated by the delimiter character. Embedded
-delimiter characters will be preceded by a backslash character
-("\"). The attribute values themselves are strings generated by the
-output function associated with each attribute type. The output
-function for a type should not try to generate the backslash
-character; this will be handled by <command>COPY</command> itself.
+When <command>COPY</command> is used without the BINARY option,
+the file read or written is a text file with one line per table row.
+Columns (attributes) in a row are separated by the delimiter character.
+The attribute values themselves are strings generated by the
+output function, or acceptable to the input function, of each
+attribute's data type. The specified null-value string is used in
+place of attributes that are NULL.
 </para>
 <para>
-The actual format for each instance is
-<programlisting>
-&lt;attr1&gt;&lt;<replaceable class=parameter>separator</replaceable>&gt;&lt;attr2&gt;&lt;<replaceable class=parameter>separator</replaceable>&gt;...&lt;<replaceable class=parameter>separator</replaceable>&gt;&lt;attr<replaceable class="parameter">n</replaceable>&gt;&lt;newline&gt;
-</programlisting>
-Note that the end of each row is marked by a Unix-style newline
-("\n"). <command>COPY FROM</command> will not behave as desired
-if given a file containing DOS- or Mac-style newlines.
+If WITH OIDS is specified, the OID is read or written as the first column,
+preceding the user data columns. (An error is raised if WITH OIDS is
+specified for a table that does not have OIDs.)
 </para>
 <para>
-The OID is emitted as the first column if WITH OIDS is specified.
-(An error is raised if WITH OIDS is specified for a table that does not
-have OIDs.)
+End of data can be represented by a single line containing just
+backslash-period (<literal>\.</>). An end-of-data marker is
+not necessary when reading from a Unix file, since the end of file
+serves perfectly well; but an end marker must be provided when copying
+data to or from a client application.
 </para>
 <para>
-If <command>COPY TO</command> is sending its output to standard
-output instead of a file, after the last row it will send a backslash ("\")
-and a period (".") followed by a newline.
-Similarly, if <command>COPY FROM</command> is reading
-from standard input, it will expect a backslash ("\") and a period
-(".") followed by a newline, as the first three characters on a
-line to denote end-of-file. However, <command>COPY FROM</command>
-will terminate correctly (followed by the backend itself) if the
-input connection is closed before this special end-of-file pattern is
-found.
+Backslash characters (<literal>\</>) may be used in the
+<command>COPY</command> data to quote data characters that might otherwise
+be taken as row or column delimiters. In particular, the following
+characters <emphasis>must</> be preceded by a backslash if they appear
+as part of an attribute value: backslash itself, newline, and the current
+delimiter character.
 </para>
 <para>
-The backslash character has other special meanings. A literal backslash
-character is represented as two
-consecutive backslashes ("\\"). A literal tab character is represented
-as a backslash and a tab. (If you are using something other than tab
-as the column delimiter, backslash that delimiter character to include
-it in data.) A literal newline character is
-represented as a backslash and a newline. When loading text data
-not generated by <application>PostgreSQL</application>,
-you will need to convert backslash
-characters ("\") to double-backslashes ("\\") to ensure that they
-are loaded properly.
+The following special backslash sequences are recognized by
+<command>COPY FROM</command>:
+
+<informaltable>
+<tgroup cols="2">
+<thead>
+<row>
+<entry>Sequence</entry>
+<entry>Represents</entry>
+</row>
+</thead>
+
+<tbody>
+<row>
+<entry><literal>\b</></entry>
+<entry>Backspace (ASCII 8)</entry>
+</row>
+<row>
+<entry><literal>\f</></entry>
+<entry>Form feed (ASCII 12)</entry>
+</row>
+<row>
+<entry><literal>\n</></entry>
+<entry>Newline (ASCII 10)</entry>
+</row>
+<row>
+<entry><literal>\r</></entry>
+<entry>Carriage return (ASCII 13)</entry>
+</row>
+<row>
+<entry><literal>\t</></entry>
+<entry>Tab (ASCII 9)</entry>
+</row>
+<row>
+<entry><literal>\v</></entry>
+<entry>Vertical tab (ASCII 11)</entry>
+</row>
+<row>
+<entry><literal>\</><replaceable>digits</></entry>
+<entry>Backslash followed by one to three octal digits specifies
+the character with that numeric code</entry>
+</row>
+</tbody>
+</tgroup>
+</informaltable>
+
+Presently, <command>COPY TO</command> will never emit an octal-digits
+backslash sequence, but it does use the other sequences listed above
+for those control characters.
+</para>
+<para>
+Never put a backslash before a data character <literal>N</> or period
+(<literal>.</>). Such pairs will be mistaken for the default null string
+or the end-of-data marker, respectively. Any other backslashed character
+that is not mentioned in the above table will be taken to represent itself.
+</para>
+<para>
+It is strongly recommended that applications generating COPY data convert
+data newlines and carriage returns to the <literal>\n</> and
+<literal>\r</> sequences respectively. At present
+(<productname>PostgreSQL</productname> 7.2 and older versions) it is
+possible to represent a data carriage return without any special quoting,
+and to represent a data newline by a backslash and newline. However,
+these representations will not be accepted by default in future releases.
+</para>
+<para>
+Note that the end of each row is marked by a Unix-style newline
+("\n"). Presently, <command>COPY FROM</command> will not behave as
+desired if given a file containing DOS- or Mac-style newlines.
+This is expected to change in future releases.
 </para>
 </refsect2>
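
The backslash sequences that the patched page lists for COPY FROM can be sketched as a small decoder. This is a hedged Python illustration of the documented rules, not the server's parser. The function name `unescape_copy_field` is hypothetical, the default null string `\N` is assumed, and octal codes are mapped with `chr()` for simplicity where the real server works on bytes in the database encoding.

```python
# Hypothetical decoder for one field of COPY text data, following the
# sequence table in the patched copy.sgml. Illustration only.

_SEQUENCES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}
_OCTAL = "01234567"


def unescape_copy_field(raw, null="\\N"):
    """Decode one already-split field; return None for the null string."""
    if raw == null:
        return None
    out = []
    i = 0
    n = len(raw)
    while i < n:
        ch = raw[i]
        if ch != "\\" or i + 1 >= n:
            # Ordinary character (or a trailing lone backslash, kept as-is).
            out.append(ch)
            i += 1
            continue
        nxt = raw[i + 1]
        if nxt in _SEQUENCES:
            out.append(_SEQUENCES[nxt])
            i += 2
        elif nxt in _OCTAL:
            # One to three octal digits give the character code.
            j = i + 1
            while j < n and j < i + 4 and raw[j] in _OCTAL:
                j += 1
            out.append(chr(int(raw[i + 1:j], 8)))
            i = j
        else:
            # Any other backslashed character represents itself.
            out.append(nxt)
            i += 2
    return "".join(out)
```

The sketch assumes the row has already been split on unescaped delimiters and that the end-of-data line (`\.`) was handled before fields are decoded, which is why the null string is checked against the whole field rather than inside the loop.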