Movatterモバイル変換


[0]ホーム

URL:



Facebook
Postgres Pro
Facebook
Downloads
8.4. Binary Data Types
Prev UpChapter 8. Data TypesHome Next

8.4. Binary Data Types#

Thebytea data type allows storage of binary strings; seeTable 8.6.

Table 8.6. Binary Data Types

NameStorage SizeDescription
bytea1 or 4 bytes plus the actual binary stringvariable-length binary string

A binary string is a sequence of octets (or bytes). Binary strings are distinguished from character strings in two ways. First, binary strings specifically allow storing octets of value zero and othernon-printable octets (usually, octets outside the decimal range 32 to 126). Character strings disallow zero octets, and also disallow any other octet values and sequences of octet values that are invalid according to the database's selected character set encoding. Second, operations on binary strings process the actual bytes, whereas the processing of character strings depends on locale settings. In short, binary strings are appropriate for storing data that the programmer thinks of asraw bytes, whereas character strings are appropriate for storing text.

Thebytea type supports two formats for input and output:hex format andPostgreSQL's historicalescape format. Both of these are always accepted on input. The output format depends on the configuration parameterbytea_output; the default is hex. (Note that the hex format was introduced inPostgreSQL 9.0; earlier versions and some tools don't understand it.)

TheSQL standard defines a different binary string type, calledBLOB orBINARY LARGE OBJECT. The input format is different frombytea, but the provided functions and operators are mostly the same.

Thehex format encodes binary data as 2 hexadecimal digits per byte, most significant nibble first. The entire string is preceded by the sequence\x (to distinguish it from the escape format). In some contexts, the initial backslash may need to be escaped by doubling it (seeSection 4.1.2.1). For input, the hexadecimal digits can be either upper or lower case, and whitespace is permitted between digit pairs (but not within a digit pair nor in the starting\x sequence). The hex format is compatible with a wide range of external applications and protocols, and it tends to be faster to convert than the escape format, so its use is preferred.

Example:

SET bytea_output = 'hex';SELECT '\xDEADBEEF'::bytea;   bytea------------ \xdeadbeef

8.4.2. bytea Escape Format#

Theescape format is the traditionalPostgreSQL format for thebytea type. It takes the approach of representing a binary string as a sequence of ASCII characters, while converting those bytes that cannot be represented as an ASCII character into special escape sequences. If, from the point of view of the application, representing bytes as characters makes sense, then this representation can be convenient. But in practice it is usually confusing because it fuzzes up the distinction between binary strings and character strings, and also the particular escape mechanism that was chosen is somewhat unwieldy. Therefore, this format should probably be avoided for most new applications.

When enteringbytea values in escape format, octets of certain valuesmust be escaped, while all octet valuescan be escaped. In general, to escape an octet, convert it into its three-digit octal value and precede it by a backslash. Backslash itself (octet decimal value 92) can alternatively be represented by double backslashes.Table 8.7 shows the characters that must be escaped, and gives the alternative escape sequences where applicable.

Table 8.7. bytea Literal Escaped Octets

Decimal Octet ValueDescriptionEscaped Input RepresentationExampleHex Representation
0zero octet'\000''\000'::bytea\x00
39single quote'''' or'\047'''''::bytea\x27
92backslash'\\' or'\134''\\'::bytea\x5c
0 to 31 and 127 to 255non-printable octets'\xxx' (octal value)'\001'::bytea\x01

The requirement to escapenon-printable octets varies depending on locale settings. In some instances you can get away with leaving them unescaped.

The reason that single quotes must be doubled, as shown inTable 8.7, is that this is true for any string literal in an SQL command. The generic string-literal parser consumes the outermost single quotes and reduces any pair of single quotes to one data character. What thebytea input function sees is just one single quote, which it treats as a plain data character. However, thebytea input function treats backslashes as special, and the other behaviors shown inTable 8.7 are implemented by that function.

In some contexts, backslashes must be doubled compared to what is shown above, because the generic string-literal parser will also reduce pairs of backslashes to one data character; seeSection 4.1.2.1.

Bytea octets are output inhex format by default. If you changebytea_output toescape,non-printable octets are converted to their equivalent three-digit octal value and preceded by one backslash. Mostprintable octets are output by their standard representation in the client character set, e.g.:

SET bytea_output = 'escape';SELECT 'abc \153\154\155 \052\251\124'::bytea;     bytea---------------- abc klm *\251T

The octet with decimal value 92 (backslash) is doubled in the output. Details are inTable 8.8.

Table 8.8. bytea Output Escaped Octets

Decimal Octet ValueDescriptionEscaped Output RepresentationExampleOutput Result
92backslash\\'\134'::bytea\\
0 to 31 and 127 to 255non-printable octets\xxx (octal value)'\001'::bytea\001
32 to 126printable octetsclient character set representation'\176'::bytea~

Depending on the front end toPostgreSQL you use, you might have additional work to do in terms of escaping and unescapingbytea strings. For example, you might also have to escape line feeds and carriage returns if your interface automatically translates these.


Prev Up Next
8.3. Character Types Home 8.5. Date/Time Types
pdfepub
Go to PostgreSQL 17
By continuing to browse this website, you agree to the use of cookies. Go toPrivacy Policy.

[8]ページ先頭

©2009-2025 Movatter.jp