The bytea data type allows storage of binary strings;
see Table 5-6.
Table 5-6. Binary String Types
Type Name | Storage | Description |
---|
bytea | 4 bytes plus the actual binary string | Variable (not specifically limited)
length binary string |
A binary string is a sequence of octets (or bytes). Binary
strings are distinguished from characters strings by two
characteristics: First, binary strings specifically allow storing
octets of zero value and other "non-printable"
octets. Second, operations on binary strings process the actual
bytes, whereas the encoding and processing of character strings
depends on locale settings.
When entering bytea values, octets of certain values
must be escaped (but all octet values
may be escaped) when used as part of a string
literal in an SQL statement. In general, to
escape an octet, it is converted into the three-digit octal number
equivalent of its decimal octet value, and preceded by two
backslashes. Some octet values have alternate escape sequences, as
shown in Table 5-7.
Table 5-7. bytea Literal Escaped Octets
Decimal Octet Value | Description | Input Escaped Representation | Example | Printed Result |
---|
0 | zero octet | '\\000' | SELECT '\\000'::bytea; | \000 |
39 | single quote | '\'' or '\\047' | SELECT '\''::bytea; | ' |
92 | backslash | '\\\\' or '\\134' | SELECT '\\\\'::bytea; | \\ |
Note that the result in each of the examples in Table 5-7 was exactly one
octet in length, even though the output representation of the zero
octet and backslash are more than one character. Bytea
output octets are also escaped. In general, each
"non-printable" octet decimal value is converted into
its equivalent three digit octal value, and preceded by one backslash.
Most "printable" octets are represented by their standard
representation in the client character set. The octet with decimal
value 92 (backslash) has a special alternate output representation.
Details are in Table 5-8.
Table 5-8. bytea Output Escaped Octets
Decimal Octet Value | Description | Output Escaped Representation | Example | Printed Result |
---|
92 | backslash | \\ | SELECT '\\134'::bytea; | \\ |
0 to 31 and 127 to 255 | "non-printable" octets | \### (octal value) | SELECT '\\001'::bytea; | \001 |
32 to 126 | "printable" octets | ASCII representation | SELECT '\\176'::bytea; | ~ |
To use the bytea escaped octet notation, string
literals (input strings) must contain two backslashes because they
must pass through two parsers in the PostgreSQL
server. The first backslash is interpreted as an escape character
by the string-literal parser, and therefore is consumed, leaving
the characters that follow. The remaining backslash is recognized
by the bytea input function as the prefix of a three
digit octal value. For example, a string literal passed to the
backend as '\\001' becomes
'\001' after passing through the string-literal
parser. The '\001' is then sent to the
bytea input function, where it is converted to a
single octet with a decimal value of 1.
For a similar reason, a backslash must be input as
'\\\\' (or '\\134'). The first
and third backslashes are interpreted as escape characters by the
string-literal parser, and therefore are consumed, leaving two
backslashes in the string passed to the bytea input function,
which interprets them as representing a single backslash.
For example, a string literal passed to the
server as '\\\\' becomes '\\'
after passing through the string-literal parser. The
'\\' is then sent to the bytea input
function, where it is converted to a single octet with a decimal
value of 92.
A single quote is a bit different in that it must be input as
'\'' (or '\\047'),
not as '\\''. This is because,
while the literal parser interprets the single quote as a special
character, and will consume the single backslash, the
bytea input function does not
recognize a single quote as a special octet. Therefore a string
literal passed to the backend as '\'' becomes
''' after passing through the string-literal
parser. The ''' is then sent to the
bytea input function, where it is retains its single
octet decimal value of 39.
Depending on the front end to PostgreSQL you use,
you may have additional work to do in terms of escaping and
unescaping bytea strings. For example, you may also
have to escape line feeds and carriage returns if your interface
automatically translates these. Or you may have to double up on
backslashes if the parser for your language or choice also treats
them as an escape character.
The SQL standard defines a different binary
string type, called BLOB or BINARY LARGE
OBJECT. The input format is different compared to
bytea, but the provided functions and operators are
mostly the same.