Packed Data

Quick Reference

These tables summarize the directives for packing and unpacking.

For Integers

Directive     | Meaning
--------------|---------------------------------------------------------------
C             | 8-bit unsigned (unsigned char)
S             | 16-bit unsigned, native endian (uint16_t)
L             | 32-bit unsigned, native endian (uint32_t)
Q             | 64-bit unsigned, native endian (uint64_t)
J             | pointer width unsigned, native endian (uintptr_t)
c             | 8-bit signed (signed char)
s             | 16-bit signed, native endian (int16_t)
l             | 32-bit signed, native endian (int32_t)
q             | 64-bit signed, native endian (int64_t)
j             | pointer width signed, native endian (intptr_t)
S_ S!         | unsigned short, native endian
I I_ I!       | unsigned int, native endian
L_ L!         | unsigned long, native endian
Q_ Q!         | unsigned long long, native endian
              |   (raises ArgumentError if the platform has no long long type)
J!            | uintptr_t, native endian (same as J)
s_ s!         | signed short, native endian
i i_ i!       | signed int, native endian
l_ l!         | signed long, native endian
q_ q!         | signed long long, native endian
              |   (raises ArgumentError if the platform has no long long type)
j!            | intptr_t, native endian (same as j)
S> s> S!> s!> | each the same as the directive without >, but big endian
L> l> L!> l!> |   S> is the same as n
I!> i!>       |   L> is the same as N
Q> q> Q!> q!> |
J> j> J!> j!> |
S< s< S!< s!< | each the same as the directive without <, but little endian
L< l< L!< l!< |   S< is the same as v
I!< i!<       |   L< is the same as V
Q< q< Q!< q!< |
J< j< J!< j!< |
n             | 16-bit unsigned, network (big-endian) byte order
N             | 32-bit unsigned, network (big-endian) byte order
v             | 16-bit unsigned, VAX (little-endian) byte order
V             | 32-bit unsigned, VAX (little-endian) byte order
U             | UTF-8 character
w             | BER-compressed integer
For Floats

Directive | Meaning
----------|--------------------------------------------------
D d       | double-precision, native format
F f       | single-precision, native format
E         | double-precision, little-endian byte order
e         | single-precision, little-endian byte order
G         | double-precision, network (big-endian) byte order
g         | single-precision, network (big-endian) byte order
For Strings

Directive | Meaning
----------|-----------------------------------------------------------------
A         | arbitrary binary string (remove trailing nulls and ASCII spaces)
a         | arbitrary binary string
Z         | null-terminated string
B         | bit string (MSB first)
b         | bit string (LSB first)
H         | hex string (high nibble first)
h         | hex string (low nibble first)
u         | UU-encoded string
M         | quoted-printable, MIME encoding (see RFC 2045)
m         | base64 encoded string (RFC 2045) (default)
          |   (base64 encoded string (RFC 4648) if followed by 0)
P         | pointer to a structure (fixed-length string)
p         | pointer to a null-terminated string
Additional Directives for Packing

Directive | Meaning
----------|----------------------------------------------------------------
@         | moves to absolute position
X         | back up a byte
x         | null byte
Additional Directives for Unpacking

Directive | Meaning
----------|----------------------------------------------------------------
@         | skip to the offset given by the length argument
X         | skip backward one byte
x         | skip forward one byte
Packing and Unpacking

Certain Ruby core methods deal with packing and unpacking data:

Method Array#pack: Formats each element in array self into a binary string; returns that string.

Method String#unpack: Extracts data from string self, forming objects that become the elements of a new array; returns that array.

Method String#unpack1: Does the same, but unpacks and returns only the first extracted object.
Each of these methods accepts a string template, consisting of zero or more directive characters, each followed by zero or more modifier characters.

Examples (directive 'C' specifies 'unsigned character'):

  [65].pack('C')      # => "A"  # One element, one directive.
  [65, 66].pack('CC') # => "AB" # Two elements, two directives.
  [65, 66].pack('C')  # => "A"  # Extra element is ignored.
  [65].pack('')       # => ""   # No directives.
  [65].pack('CC')               # Extra directive raises ArgumentError.

  'A'.unpack('C')   # => [65]      # One character, one directive.
  'AB'.unpack('CC') # => [65, 66]  # Two characters, two directives.
  'AB'.unpack('C')  # => [65]      # Extra character is ignored.
  'A'.unpack('CC')  # => [65, nil] # Extra directive generates nil.
  'AB'.unpack('')   # => []        # No directives.
The string template may contain any mixture of valid directives (directive 'c' specifies 'signed character'):

  [65, -1].pack('cC')  # => "A\xFF"
  "A\xFF".unpack('cC') # => [65, 255]
The string template may contain whitespace (which is ignored) and comments, each of which begins with character '#' and continues up to and including the next following newline:

  [0, 1].pack("  C  #foo \n  C  ")   # => "\x00\x01"
  "\0\1".unpack("  C  #foo \n  C  ") # => [0, 1]
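Because whitespace and comments are ignored, a template can also document a binary layout. The sketch below assumes a made-up three-field record; the constant RECORD_TEMPLATE and the field names in its comments are illustrative only and are not part of Ruby:

```ruby
# Hypothetical record layout; each directive is annotated inside the template.
RECORD_TEMPLATE = <<~TEMPLATE
  C   # version: 8-bit unsigned
  n   # length: 16-bit unsigned, big-endian
  a4  # tag: 4-byte binary string
TEMPLATE

packed = [1, 513, 'RUBY'].pack(RECORD_TEMPLATE)
packed.bytes                    # => [1, 2, 1, 82, 85, 66, 89]
packed.unpack(RECORD_TEMPLATE)  # => [1, 513, "RUBY"]
```

Using the same constant for packing and unpacking keeps the two sides of the format in sync.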
Any directive may be followed by either of these modifiers:

'*' - The directive is to be applied as many times as needed:

  [65, 66].pack('C*') # => "AB"
  'AB'.unpack('C*')   # => [65, 66]

Integer count - The directive is to be applied count times:

  [65, 66].pack('C2') # => "AB"
  [65, 66].pack('C3') # Raises ArgumentError.
  'AB'.unpack('C2')   # => [65, 66]
  'AB'.unpack('C3')   # => [65, 66, nil]

Note: Directives in %w[A a Z m] use count differently; see String Directives.
If an element does not fit the provided directive, only its least significant bits are encoded:

  [257].pack("C").unpack("C") # => [1]
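The truncation is equivalent to masking each integer to the width of the directive. A minimal sketch, with values chosen only for illustration:

```ruby
values = [257, 256, -1, 0x1234]

# 'C' stores one byte per element, so only the low 8 bits survive.
packed = values.pack('C*')
packed.bytes                 # => [1, 0, 255, 52]
values.map { |v| v & 0xFF }  # => [1, 0, 255, 52]

# Wider directives truncate the same way; 'n' keeps the low 16 bits.
[65_537].pack('n').unpack1('n')  # => 1
```

No error is raised in either case, so range checks have to happen before packing if silent truncation is not acceptable.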
Packing Method

Method Array#pack accepts optional keyword argument buffer that specifies the target string (instead of a new string):

  [65, 66].pack('C*', buffer: 'foo') # => "fooAB"
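One use of buffer is to build a message from several pack calls without creating intermediate strings. A short sketch, assuming a mutable string created with String.new:

```ruby
buffer = String.new                     # empty, mutable binary string
[1, 2].pack('C*', buffer: buffer)       # appends "\x01\x02"
result = [3].pack('C', buffer: buffer)  # appends "\x03" to the same string

buffer.bytes           # => [1, 2, 3]
result.equal?(buffer)  # => true
```

The return value is the buffer itself, so the call can be chained or ignored.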
The method can accept a block:

  # Packed string is passed to the block.
  [65, 66].pack('C*') {|s| p s } # => "AB"
Unpacking Methods

Methods String#unpack and String#unpack1 each accept an optional keyword argument offset that specifies an offset into the string:

  'ABC'.unpack('C*', offset: 1)  # => [66, 67]
  'ABC'.unpack1('C*', offset: 1) # => 66

Both methods can accept a block:

  # Each unpacked object is passed to the block.
  ret = []
  "ABCD".unpack("C*") {|c| ret << c }
  ret # => [65, 66, 67, 68]

  # The single unpacked object is passed to the block.
  'AB'.unpack1('C*') {|ele| p ele } # => 65
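The offset argument makes it possible to walk through a string record by record without slicing it. The sketch below assumes a hypothetical format in which each record is a 1-byte length followed by that many bytes; the helper name read_records is illustrative, not a Ruby API:

```ruby
# Hypothetical format: 1-byte length, then that many bytes of payload.
def read_records(data)
  records = []
  offset = 0
  while offset < data.bytesize
    length = data.unpack1('C', offset: offset)
    records << data.unpack1("a#{length}", offset: offset + 1)
    offset += 1 + length
  end
  records
end

data = [3, 'abc', 2, 'hi'].pack('Ca3Ca2')  # => "\x03abc\x02hi"
read_records(data)                         # => ["abc", "hi"]
```

The sketch does no validation; a truncated final record would simply yield a shorter string.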
Integer Directives

Each integer directive specifies the packing or unpacking for one element in the input or output array.

8-Bit Integer Directives

'c' - 8-bit signed integer (like C signed char):

  [0, 1, 255].pack('c*')    # => "\x00\x01\xFF"
  s = [0, 1, -1].pack('c*') # => "\x00\x01\xFF"
  s.unpack('c*')            # => [0, 1, -1]

'C' - 8-bit unsigned integer (like C unsigned char):

  [0, 1, 255].pack('C*')    # => "\x00\x01\xFF"
  s = [0, 1, -1].pack('C*') # => "\x00\x01\xFF"
  s.unpack('C*')            # => [0, 1, 255]
16-Bit Integer Directives

's' - 16-bit signed integer, native-endian (like C int16_t):

  [513, -514].pack('s*')      # => "\x01\x02\xFE\xFD"
  s = [513, 65022].pack('s*') # => "\x01\x02\xFE\xFD"
  s.unpack('s*')              # => [513, -514]

'S' - 16-bit unsigned integer, native-endian (like C uint16_t):

  [513, -514].pack('S*')      # => "\x01\x02\xFE\xFD"
  s = [513, 65022].pack('S*') # => "\x01\x02\xFE\xFD"
  s.unpack('S*')              # => [513, 65022]

'n' - 16-bit network integer, big-endian:

  s = [0, 1, -1, 32767, -32768, 65535].pack('n*')
  # => "\x00\x00\x00\x01\xFF\xFF\x7F\xFF\x80\x00\xFF\xFF"
  s.unpack('n*')
  # => [0, 1, 65535, 32767, 32768, 65535]

'v' - 16-bit VAX integer, little-endian:

  s = [0, 1, -1, 32767, -32768, 65535].pack('v*')
  # => "\x00\x00\x01\x00\xFF\xFF\xFF\x7F\x00\x80\xFF\xFF"
  s.unpack('v*')
  # => [0, 1, 65535, 32767, 32768, 65535]
32-Bit Integer Directives

'l' - 32-bit signed integer, native-endian (like C int32_t):

  s = [67305985, -50462977].pack('l*')
  # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
  s.unpack('l*')
  # => [67305985, -50462977]

'L' - 32-bit unsigned integer, native-endian (like C uint32_t):

  s = [67305985, 4244504319].pack('L*')
  # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
  s.unpack('L*')
  # => [67305985, 4244504319]

'N' - 32-bit network integer, big-endian:

  s = [0, 1, -1].pack('N*')
  # => "\x00\x00\x00\x00\x00\x00\x00\x01\xFF\xFF\xFF\xFF"
  s.unpack('N*')
  # => [0, 1, 4294967295]

'V' - 32-bit VAX integer, little-endian:

  s = [0, 1, -1].pack('V*')
  # => "\x00\x00\x00\x00\x01\x00\x00\x00\xFF\xFF\xFF\xFF"
  s.unpack('V*')
  # => [0, 1, 4294967295]
64-Bit Integer Directives

'q' - 64-bit signed integer, native-endian (like C int64_t):

  s = [578437695752307201, -506097522914230529].pack('q*')
  # => "\x01\x02\x03\x04\x05\x06\a\b\xFF\xFE\xFD\xFC\xFB\xFA\xF9\xF8"
  s.unpack('q*')
  # => [578437695752307201, -506097522914230529]

'Q' - 64-bit unsigned integer, native-endian (like C uint64_t):

  s = [578437695752307201, 17940646550795321087].pack('Q*')
  # => "\x01\x02\x03\x04\x05\x06\a\b\xFF\xFE\xFD\xFC\xFB\xFA\xF9\xF8"
  s.unpack('Q*')
  # => [578437695752307201, 17940646550795321087]
Platform-Dependent Integer Directives

'i' - Platform-dependent width signed integer, native-endian (like C int):

  s = [67305985, -50462977].pack('i*')
  # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
  s.unpack('i*')
  # => [67305985, -50462977]

'I' - Platform-dependent width unsigned integer, native-endian (like C unsigned int):

  s = [67305985, -50462977].pack('I*')
  # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
  s.unpack('I*')
  # => [67305985, 4244504319]

'j' - Pointer-width signed integer, native-endian (like C intptr_t):

  s = [67305985, -50462977].pack('j*')
  # => "\x01\x02\x03\x04\x00\x00\x00\x00\xFF\xFE\xFD\xFC\xFF\xFF\xFF\xFF"
  s.unpack('j*')
  # => [67305985, -50462977]

'J' - Pointer-width unsigned integer, native-endian (like C uintptr_t):

  s = [67305985, 4244504319].pack('J*')
  # => "\x01\x02\x03\x04\x00\x00\x00\x00\xFF\xFE\xFD\xFC\x00\x00\x00\x00"
  s.unpack('J*')
  # => [67305985, 4244504319]
Other Integer Directives

'U' - UTF-8 character:

  s = [4194304].pack('U*')
  # => "\xF8\x90\x80\x80\x80"
  s.unpack('U*')
  # => [4194304]

'w' - BER-encoded integer (see BER encoding):

  s = [1073741823].pack('w*')
  # => "\x83\xFF\xFF\xFF\x7F"
  s.unpack('w*')
  # => [1073741823]
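To make the 'w' encoding concrete, the following sketch reimplements it by hand: the integer is split into 7-bit groups, most significant group first, and every byte except the last has its high bit set. The helper name ber_encode is hypothetical and exists only for comparison with the built-in directive:

```ruby
# Hand-written equivalent of the 'w' directive for non-negative integers.
def ber_encode(n)
  raise ArgumentError, 'negative value' if n.negative?

  bytes = [n & 0x7F]                     # last byte: high bit clear
  n >>= 7
  while n > 0
    bytes.unshift((n & 0x7F) | 0x80)     # earlier bytes: high bit set
    n >>= 7
  end
  bytes.pack('C*')
end

ber_encode(128).bytes                              # => [129, 0]
ber_encode(1073741823) == [1073741823].pack('w')   # => true
```

Small values take one byte and larger values grow by one byte per 7 bits, which is why the 30-bit example above needs five bytes.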
Modifiers for Integer Directives

For the following directives, a '!' or '_' modifier may be appended to select the underlying platform's native size:

'i', 'I' - C int, always native size.
's', 'S' - C short.
'l', 'L' - C long.
'q', 'Q' - C long long, if available.
'j', 'J' - C intptr_t, always native size.

Native size modifiers are silently ignored for directives that are always native size.
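The effect of the native size modifiers is easiest to see by comparing byte sizes. The sizes of the C types are platform-dependent, so the sketch below states exact values only for the fixed-width forms:

```ruby
# Without a modifier, 's' and 'l' have the same width on every platform.
[0].pack('s').bytesize   # => 2
[0].pack('l').bytesize   # => 4

# With '!' (or '_'), the width follows the platform's C types.
[0].pack('s!').bytesize  # size of C short (commonly 2)
[0].pack('l!').bytesize  # size of C long (4 or 8 on common platforms)

# 'i' and 'j' are always native size, so the modifier changes nothing.
[0].pack('i!') == [0].pack('i')  # => true
[0].pack('j_') == [0].pack('j')  # => true
```

Data that must be portable between machines is usually better served by the fixed-width directives.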
An endian modifier may also be appended to any of the directives above:

'>' - Big-endian.
'<' - Little-endian.
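A brief illustration of the endian modifiers, with values chosen arbitrarily:

```ruby
[1].pack('s>').bytes  # => [0, 1]  (big-endian)
[1].pack('s<').bytes  # => [1, 0]  (little-endian)

# For unsigned 16- and 32-bit values these match the dedicated directives.
[513].pack('S>') == [513].pack('n')  # => true
[513].pack('L<') == [513].pack('V')  # => true

# Unlike 'N', a signed directive with an endian modifier preserves the sign.
s = [-2].pack('l>')
s.unpack1('l>')  # => -2
s.unpack1('N')   # => 4294967294
```

The explicit modifiers give the same bytes on every platform, which the native-endian forms do not guarantee.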
Float Directives

Each float directive specifies the packing or unpacking for one element in the input or output array.

Single-Precision Float Directives

'F' or 'f' - Native format:

  s = [3.0].pack('F') # => "\x00\x00@@"
  s.unpack('F')       # => [3.0]

'e' - Little-endian:

  s = [3.0].pack('e') # => "\x00\x00@@"
  s.unpack('e')       # => [3.0]

'g' - Big-endian:

  s = [3.0].pack('g') # => "@@\x00\x00"
  s.unpack('g')       # => [3.0]
Double-Precision Float Directives

'D' or 'd' - Native format:

  s = [3.0].pack('D') # => "\x00\x00\x00\x00\x00\x00\b@"
  s.unpack('D')       # => [3.0]

'E' - Little-endian:

  s = [3.0].pack('E') # => "\x00\x00\x00\x00\x00\x00\b@"
  s.unpack('E')       # => [3.0]

'G' - Big-endian:

  s = [3.0].pack('G') # => "@\b\x00\x00\x00\x00\x00\x00"
  s.unpack('G')       # => [3.0]
A float value may be infinity or not-a-number:

  inf = 1.0/0.0                  # => Infinity
  [inf].pack('f')                # => "\x00\x00\x80\x7F"
  "\x00\x00\x80\x7F".unpack('f') # => [Infinity]

  nan = inf/inf                  # => NaN
  [nan].pack('f')                # => "\x00\x00\xC0\x7F"
  "\x00\x00\xC0\x7F".unpack('f') # => [NaN]
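The examples above use 3.0, which both precisions represent exactly. The sketch below uses 0.1 instead to show, as an illustration only, that a round trip through a single-precision directive can change the value:

```ruby
# 0.1 has no exact binary representation; single precision rounds it further.
single = [0.1].pack('f').unpack1('f')
double = [0.1].pack('d').unpack1('d')

single == 0.1  # => false
double == 0.1  # => true

# 'e' and 'g' differ only in byte order.
[3.0].pack('e') == [3.0].pack('g').reverse  # => true
```

Ruby floats are double precision, so only the double-precision directives round-trip every value unchanged.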
String Directives

Each string directive specifies the packing or unpacking for one byte in the input or output string.

Binary String Directives

'A' - Arbitrary binary string (space padded; count is width); nil is treated as the empty string:

  ['foo'].pack('A')  # => "f"
  ['foo'].pack('A*') # => "foo"
  ['foo'].pack('A2') # => "fo"
  ['foo'].pack('A4') # => "foo "
  [nil].pack('A')    # => " "
  [nil].pack('A*')   # => ""
  [nil].pack('A2')   # => "  "
  [nil].pack('A4')   # => "    "

  "foo\0".unpack('A')      # => ["f"]
  "foo\0".unpack('A4')     # => ["foo"]
  "foo\0bar".unpack('A10') # => ["foo\x00bar"] # Reads past "\0".
  "foo ".unpack('A')       # => ["f"]
  "foo ".unpack('A4')      # => ["foo"]
  "foo".unpack('A4')       # => ["foo"]

  russian = "\u{442 435 441 442}" # => "тест"
  russian.size                    # => 4
  russian.bytesize                # => 8
  [russian].pack('A')             # => "\xD1"
  [russian].pack('A*')            # => "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82"
  russian.unpack('A')             # => ["\xD1"]
  russian.unpack('A2')            # => ["\xD1\x82"]
  russian.unpack('A4')            # => ["\xD1\x82\xD0\xB5"]
  russian.unpack('A*')            # => ["\xD1\x82\xD0\xB5\xD1\x81\xD1\x82"]

'a' - Arbitrary binary string (null padded; count is width):

  ["foo"].pack('a')    # => "f"
  ["foo"].pack('a*')   # => "foo"
  ["foo"].pack('a2')   # => "fo"
  ["foo\0"].pack('a4') # => "foo\x00"
  [nil].pack('a')      # => "\x00"
  [nil].pack('a*')     # => ""
  [nil].pack('a2')     # => "\x00\x00"
  [nil].pack('a4')     # => "\x00\x00\x00\x00"

  "foo\0".unpack('a')     # => ["f"]
  "foo\0".unpack('a4')    # => ["foo\x00"]
  "foo ".unpack('a4')     # => ["foo "]
  "foo".unpack('a4')      # => ["foo"]
  "foo\0bar".unpack('a4') # => ["foo\x00"] # Reads past "\0".

'Z' - Same as 'a', except that null is added or ignored with '*':

  ["foo"].pack('Z*')      # => "foo\x00"
  [nil].pack('Z*')        # => "\x00"

  "foo\0".unpack('Z*')    # => ["foo"]
  "foo".unpack('Z*')      # => ["foo"]
  "foo\0bar".unpack('Z*') # => ["foo"] # Does not read past "\0".
Bit String Directives

'B' - Bit string (most significant bit first):

  ['11111111' + '00000000'].pack('B*') # => "\xFF\x00"
  ['10000000' + '01000000'].pack('B*') # => "\x80@"

  ['1'].pack('B0') # => ""
  ['1'].pack('B1') # => "\x80"
  ['1'].pack('B2') # => "\x80\x00"
  ['1'].pack('B3') # => "\x80\x00"
  ['1'].pack('B4') # => "\x80\x00\x00"
  ['1'].pack('B5') # => "\x80\x00\x00"
  ['1'].pack('B6') # => "\x80\x00\x00\x00"

  "\xff\x00".unpack("B*") # => ["1111111100000000"]
  "\x01\x02".unpack("B*") # => ["0000000100000010"]

  "".unpack("B0")     # => [""]
  "\x80".unpack("B1") # => ["1"]
  "\x80".unpack("B2") # => ["10"]
  "\x80".unpack("B3") # => ["100"]

'b' - Bit string (least significant bit first):

  ['11111111' + '00000000'].pack('b*') # => "\xFF\x00"
  ['10000000' + '01000000'].pack('b*') # => "\x01\x02"

  ['1'].pack('b0') # => ""
  ['1'].pack('b1') # => "\x01"
  ['1'].pack('b2') # => "\x01\x00"
  ['1'].pack('b3') # => "\x01\x00"
  ['1'].pack('b4') # => "\x01\x00\x00"
  ['1'].pack('b5') # => "\x01\x00\x00"
  ['1'].pack('b6') # => "\x01\x00\x00\x00"

  "\xff\x00".unpack("b*") # => ["1111111100000000"]
  "\x01\x02".unpack("b*") # => ["1000000001000000"]

  "".unpack("b0")     # => [""]
  "\x01".unpack("b1") # => ["1"]
  "\x01".unpack("b2") # => ["10"]
  "\x01".unpack("b3") # => ["100"]
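The difference between the two bit orders is easiest to see on a single byte. A small sketch using the value 5 as an arbitrary example:

```ruby
byte = [5].pack('C')  # => "\x05"
byte.unpack1('B8')    # => "00000101"  (most significant bit first)
byte.unpack1('b8')    # => "10100000"  (least significant bit first)

# 'B8' agrees with Integer#to_s(2) padded to eight digits.
5.to_s(2).rjust(8, '0')  # => "00000101"

# Packing a bit string reverses the conversion.
['00000101'].pack('B8').unpack1('C')  # => 5
```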
Hex String Directives

'H' - Hex string (high nibble first):

  ['10ef'].pack('H*') # => "\x10\xEF"
  ['10ef'].pack('H0') # => ""
  ['10ef'].pack('H3') # => "\x10\xE0"
  ['10ef'].pack('H5') # => "\x10\xEF\x00"

  ['fff'].pack('H3') # => "\xFF\xF0"
  ['fff'].pack('H4') # => "\xFF\xF0"
  ['fff'].pack('H5') # => "\xFF\xF0\x00"
  ['fff'].pack('H6') # => "\xFF\xF0\x00"
  ['fff'].pack('H7') # => "\xFF\xF0\x00\x00"
  ['fff'].pack('H8') # => "\xFF\xF0\x00\x00"

  "\x10\xef".unpack('H*') # => ["10ef"]
  "\x10\xef".unpack('H0') # => [""]
  "\x10\xef".unpack('H1') # => ["1"]
  "\x10\xef".unpack('H2') # => ["10"]
  "\x10\xef".unpack('H3') # => ["10e"]
  "\x10\xef".unpack('H4') # => ["10ef"]
  "\x10\xef".unpack('H5') # => ["10ef"]

'h' - Hex string (low nibble first):

  ['10ef'].pack('h*') # => "\x01\xFE"
  ['10ef'].pack('h0') # => ""
  ['10ef'].pack('h3') # => "\x01\x0E"
  ['10ef'].pack('h5') # => "\x01\xFE\x00"

  ['fff'].pack('h3') # => "\xFF\x0F"
  ['fff'].pack('h4') # => "\xFF\x0F"
  ['fff'].pack('h5') # => "\xFF\x0F\x00"
  ['fff'].pack('h6') # => "\xFF\x0F\x00"
  ['fff'].pack('h7') # => "\xFF\x0F\x00\x00"
  ['fff'].pack('h8') # => "\xFF\x0F\x00\x00"

  "\x01\xfe".unpack('h*') # => ["10ef"]
  "\x01\xfe".unpack('h0') # => [""]
  "\x01\xfe".unpack('h1') # => ["1"]
  "\x01\xfe".unpack('h2') # => ["10"]
  "\x01\xfe".unpack('h3') # => ["10e"]
  "\x01\xfe".unpack('h4') # => ["10ef"]
  "\x01\xfe".unpack('h5') # => ["10ef"]
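A common use of 'H*' is converting between raw bytes and a hex dump. The following sketch uses the ASCII string "Ruby" as sample data:

```ruby
"Ruby".unpack1('H*')     # => "52756279"
['52756279'].pack('H*')  # => "Ruby"

# 'h' swaps the two nibbles of every byte.
"Ruby".unpack1('h*')     # => "25572697"
```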
Pointer String Directives

'P' - Pointer to a structure (fixed-length string):

  s = ['abc'].pack('P') # => "\xE0O\x7F\xE5\xA1\x01\x00\x00"
  s.unpack('P*')        # => ["abc"]
  ".".unpack("P")       # => []
  ("\0" * 8).unpack("P") # => [nil]
  [nil].pack("P")       # => "\x00\x00\x00\x00\x00\x00\x00\x00"

'p' - Pointer to a null-terminated string:

  s = ['abc'].pack('p') # => "(\xE4u\xE5\xA1\x01\x00\x00"
  s.unpack('p*')        # => ["abc"]
  ".".unpack("p")       # => []
  ("\0" * 8).unpack("p") # => [nil]
  [nil].pack("p")       # => "\x00\x00\x00\x00\x00\x00\x00\x00"
Other String Directives

'M' - Quoted printable, MIME encoding; text mode, but input must use LF and output LF; (see RFC 2045):

  ["a b c\td \ne"].pack('M') # => "a b c\td =\n\ne=\n"
  ["\0"].pack('M')           # => "=00=\n"

  ["a"*1023].pack('M') == ("a"*73+"=\n")*14+"a=\n" # => true
  ("a"*73+"=\na=\n").unpack('M') == ["a"*74]       # => true
  (("a"*73+"=\n")*14+"a=\n").unpack('M') == ["a"*1023] # => true

  "a b c\td =\n\ne=\n".unpack('M') # => ["a b c\td \ne"]
  "=00=\n".unpack('M')             # => ["\x00"]

  "pre=31=32=33after".unpack('M') # => ["pre123after"]
  "pre=\nafter".unpack('M')       # => ["preafter"]
  "pre=\r\nafter".unpack('M')     # => ["preafter"]
  "pre=".unpack('M')              # => ["pre="]
  "pre=\r".unpack('M')            # => ["pre=\r"]
  "pre=hoge".unpack('M')          # => ["pre=hoge"]
  "pre==31after".unpack('M')      # => ["pre==31after"]
  "pre===31after".unpack('M')     # => ["pre===31after"]

'm' - Base64 encoded string; count specifies input bytes between each newline, rounded down to nearest multiple of 3; if count is zero, no newlines are added; (see RFC 4648):

  [""].pack('m')             # => ""
  ["\0"].pack('m')           # => "AA==\n"
  ["\0\0"].pack('m')         # => "AAA=\n"
  ["\0\0\0"].pack('m')       # => "AAAA\n"
  ["\377"].pack('m')         # => "/w==\n"
  ["\377\377"].pack('m')     # => "//8=\n"
  ["\377\377\377"].pack('m') # => "////\n"

  "".unpack('m')       # => [""]
  "AA==\n".unpack('m') # => ["\x00"]
  "AAA=\n".unpack('m') # => ["\x00\x00"]
  "AAAA\n".unpack('m') # => ["\x00\x00\x00"]
  "/w==\n".unpack('m') # => ["\xFF"]
  "//8=\n".unpack('m') # => ["\xFF\xFF"]
  "////\n".unpack('m') # => ["\xFF\xFF\xFF"]
  "A\n".unpack('m')    # => [""]
  "AA\n".unpack('m')   # => ["\x00"]
  "AA=\n".unpack('m')  # => ["\x00"]
  "AAA\n".unpack('m')  # => ["\x00\x00"]

  [""].pack('m0')             # => ""
  ["\0"].pack('m0')           # => "AA=="
  ["\0\0"].pack('m0')         # => "AAA="
  ["\0\0\0"].pack('m0')       # => "AAAA"
  ["\377"].pack('m0')         # => "/w=="
  ["\377\377"].pack('m0')     # => "//8="
  ["\377\377\377"].pack('m0') # => "////"

  "".unpack('m0')     # => [""]
  "AA==".unpack('m0') # => ["\x00"]
  "AAA=".unpack('m0') # => ["\x00\x00"]
  "AAAA".unpack('m0') # => ["\x00\x00\x00"]
  "/w==".unpack('m0') # => ["\xFF"]
  "//8=".unpack('m0') # => ["\xFF\xFF"]
  "////".unpack('m0') # => ["\xFF\xFF\xFF"]

'u' - UU-encoded string:

  [""].pack("u")       # => ""
  ["a"].pack("u")      # => "!80``\n"
  ["aaa"].pack("u")    # => "#86%A\n"

  "".unpack("u")         # => [""]
  "#86)C\n".unpack("u")  # => ["abc"]
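The base64 examples above all fit on one output line, so they do not show how the count for 'm' controls line breaks. The sketch below illustrates that, and also the stricter unpacking behavior of 'm0'; the default of 45 input bytes per line and the ArgumentError raised by 'm0' are stated from knowledge of current Ruby behavior rather than from the text above:

```ruby
# count = 3: a newline after every 3 input bytes (4 output characters).
['abcdef'].pack('m3')  # => "YWJj\nZGVm\n"

# count = 0: no newlines at all.
['abcdef'].pack('m0')  # => "YWJjZGVm"

# Default: a newline after every 45 input bytes (60 output characters).
['a' * 46].pack('m') == 'YWFh' * 15 + "\n" + "YQ==\n"  # => true

# 'm' tolerates newlines when unpacking; 'm0' is strict and rejects them.
"YWJj\nZGVm\n".unpack1('m')  # => "abcdef"
begin
  "YWJj\nZGVm\n".unpack1('m0')
rescue ArgumentError => e
  e.class  # => ArgumentError
end
```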
Offset Directives

'@' - Begin packing at the given byte offset; for packing, null fill or shrink if necessary:

  [1, 2].pack("C@0C")     # => "\x02"
  [1, 2].pack("C@1C")     # => "\x01\x02"
  [1, 2].pack("C@5C")     # => "\x01\x00\x00\x00\x00\x02"
  [*1..5].pack("CCCC@2C") # => "\x01\x02\x05"

For unpacking, the position cannot move outside the string:

  "\x01\x00\x00\x02".unpack("C@3C") # => [1, 2]
  "\x00".unpack("@1C")              # => [nil]
  "\x00".unpack("@2C")              # Raises ArgumentError.

'X' - For packing, shrink the output by the given number of bytes:

  [0, 1, 2].pack("CCXC")  # => "\x00\x02"
  [0, 1, 2].pack("CCX2C") # => "\x02"

For unpacking, rewind the unpacking position by the given number of bytes:

  "\x00\x02".unpack("CCXC") # => [0, 2, 2]

The position cannot move outside the string:

  [0, 1, 2].pack("CCX3C")    # Raises ArgumentError.
  "\x00\x02".unpack("CX3C")  # Raises ArgumentError.

'x' - Skip forward by the given number of bytes; for packing, null fill as necessary:

  [].pack("x0") # => ""
  [].pack("x")  # => "\x00"
  [].pack("x8") # => "\x00\x00\x00\x00\x00\x00\x00\x00"

For unpacking, the position cannot move outside the string:

  "\x00\x00\x02".unpack("CxC") # => [0, 2]
  "\x00\x00\x02".unpack("x3C") # => [nil]
  "\x00\x00\x02".unpack("x4C") # Raises ArgumentError.