9.5. Binary String Functions and Operators
This section describes functions and operators for examining and manipulating binary strings, that is values of typebytea. Many of these are equivalent, in purpose and syntax, to the text-string functions described in the previous section.
SQL defines some string functions that use key words, rather than commas, to separate arguments. Details are inTable 9.11.PostgreSQL also provides versions of these functions that use the regular function invocation syntax (seeTable 9.12).
Table 9.11. SQL Binary String Functions and Operators
Additional binary string manipulation functions are available and are listed inTable 9.12. Some of them are used internally to implement theSQL-standard string functions listed inTable 9.11.
Table 9.12. Other Binary String Functions
Function Description Example(s) |
|---|
Returns the number of bits set in the binary string (also known as“popcount”).
|
Removes the longest string containing only bytes appearing in
|
Extractsn'th bit from binary string.
|
Extractsn'th byte from binary string.
|
Returns the number of bytes in the binary string.
|
Returns the number of characters in the binary string, assuming that it is text in the given
|
Removes the longest string containing only bytes appearing in
|
Computes the MD5hash of the binary string, with the result written in hexadecimal.
|
Removes the longest string containing only bytes appearing in
|
Setsn'th bit in binary string to
|
Setsn'th byte in binary string to
|
Computes the SHA-224hash of the binary string.
|
Computes the SHA-256hash of the binary string.
|
Computes the SHA-384hash of the binary string.
|
Computes the SHA-512hash of the binary string.
|
Extracts the substring of
|
Functionsget_byte andset_byte number the first byte of a binary string as byte 0. Functionsget_bit andset_bit number bits from the right within each byte; for example bit 0 is the least significant bit of the first byte, and bit 15 is the most significant bit of the second byte.
For historical reasons, the functionmd5 returns a hex-encoded value of typetext whereas the SHA-2 functions return typebytea. Use the functionsencode anddecode to convert between the two. For example writeencode(sha256('abc'), 'hex') to get a hex-encoded text representation, ordecode(md5('abc'), 'hex') to get abytea value.
Functions for converting strings between different character sets (encodings), and for representing arbitrary binary data in textual form, are shown inTable 9.13. For these functions, an argument or result of typetext is expressed in the database's default encoding, while arguments or results of typebytea are in an encoding named by another argument.
Table 9.13. Text/Binary String Conversion Functions
Function Description Example(s) |
|---|
Converts a binary string representing text in encoding
|
Converts a binary string representing text in encoding
|
Converts a
|
Encodes binary data into a textual representation; supported
|
Decodes binary data from a textual representation; supported
|
Theencode anddecode functions support the following textual formats:
- base64
The
base64format is that ofRFC 2045 Section 6.8. As per theRFC, encoded lines are broken at 76 characters. However instead of the MIME CRLF end-of-line marker, only a newline is used for end-of-line. Thedecodefunction ignores carriage-return, newline, space, and tab characters. Otherwise, an error is raised whendecodeis supplied invalid base64 data — including when trailing padding is incorrect.- escape
The
escapeformat converts zero bytes and bytes with the high bit set into octal escape sequences (\nnn), and it doubles backslashes. Other byte values are represented literally. Thedecodefunction will raise an error if a backslash is not followed by either a second backslash or three octal digits; it accepts other byte values unchanged.- hex
The
hexformat represents each 4 bits of data as one hexadecimal digit,0throughf, writing the higher-order digit of each byte first. Theencodefunction outputs thea-fhex digits in lower case. Because the smallest unit of data is 8 bits, there are always an even number of characters returned byencode. Thedecodefunction accepts thea-fcharacters in either upper or lower case. An error is raised whendecodeis given invalid hex data — including when given an odd number of characters.
See also the aggregate functionstring_agg inSection 9.21 and the large object functions inSection 35.4.