Original author(s) | Ken Thompson, Dennis Ritchie (AT&T Bell Laboratories) |
---|---|
Developer(s) | Various open-source and commercial developers |
Initial release | November 3, 1971; 53 years ago (1971-11-03) |
Written in | C |
Operating system | Unix, Unix-like, V, Plan 9, Inferno |
Platform | Cross-platform |
Type | Command |
License | Plan 9: MIT License |
Filename extension | |
---|---|
Internet media type | application/x-archive[1] |
Magic number | !<arch> |
Type of format | archive format |
Container for | usually object files (.o) |
Standard | Not standardized, several variants exist |
Open format? | Yes[2] |
The archiver, also known simply as ar, is a Unix utility that maintains groups of files as a single archive file. Today, ar is generally used only to create and update static library files that the link editor or linker uses and for generating .deb packages for the Debian family; it can be used to create archives for any purpose, but has been largely replaced by tar for purposes other than static libraries.[3] An implementation of ar is included as one of the GNU Binutils.[2]
In the Linux Standard Base (LSB), ar has been deprecated and is expected to disappear in a future release of that standard. The rationale provided was that "the LSB does not include software development utilities nor does it specify .o and .a file formats."[4]
The ar format has never been standardized; modern archives are based on a common format with two main variants, BSD and System V (initially known as COFF, and also used by GNU, ELF, and Windows).
Historically there have been other variants[5] including V6, V7, AIX (small and big), and Coherent, which all vary significantly from the common format.[6]
Debian ".deb" archives use the common format.
An ar file begins with a global header, followed by a header and data section for each file stored within the ar file.
Each data section is 2-byte aligned. If it would end on an odd offset, a newline ('\n', 0x0A) is used as filler.
The file signature is a single field containing the magic ASCII string "!<arch>" followed by a single LF control character (0x0A).
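As a minimal sketch of the signature check described above (the names `AR_MAGIC` and `has_ar_signature` are illustrative, not taken from any ar implementation), a reader might compare the first eight bytes of a file against the magic string:

```python
# The 8-byte global header: the ASCII string "!<arch>" followed by LF (0x0A).
AR_MAGIC = b"!<arch>\n"


def has_ar_signature(data: bytes) -> bool:
    """Return True if data begins with the common-format ar signature."""
    return data[:len(AR_MAGIC)] == AR_MAGIC
```

A caller would typically read the first eight bytes of the file in binary mode and pass them to this function before attempting to parse any member headers.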
Each file stored in an ar archive includes a file header to store information about the file. The common format is as follows. Numeric values are encoded in ASCII, and all values are right-padded with ASCII spaces (0x20).
Offset | Length | Name | Format |
---|---|---|---|
0 | 16 | File identifier | ASCII |
16 | 12 | File modification timestamp (in seconds) | Decimal |
28 | 6 | Owner ID | Decimal |
34 | 6 | Group ID | Decimal |
40 | 8 | File mode (type and permission) | Octal |
48 | 10 | File size in bytes | Decimal |
58 | 2 | Ending characters | 0x60 0x0A |
Because the headers contain only printable ASCII characters and line feeds, an archive containing only text files still appears to be a text file itself.
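The fixed-width layout in the table can be illustrated with a short round-trip sketch. The type `ArHeader` and the functions `format_header` and `parse_header` are hypothetical names chosen for this example, and treating an all-blank numeric field as zero is an assumption rather than something the format specifies.

```python
from typing import NamedTuple

HEADER_SIZE = 60  # 16 + 12 + 6 + 6 + 8 + 10 + 2 bytes


class ArHeader(NamedTuple):
    name: str   # file identifier, as stored (may include a trailing '/')
    mtime: int  # modification time in seconds
    uid: int    # owner ID
    gid: int    # group ID
    mode: int   # file type and permission bits
    size: int   # data size in bytes, excluding padding


def _num(field: bytes, base: int = 10) -> int:
    # Assumption: a field consisting only of spaces is read as 0.
    text = field.decode("ascii").strip()
    return int(text, base) if text else 0


def parse_header(raw: bytes) -> ArHeader:
    """Decode one 60-byte member header of the common format."""
    if len(raw) != HEADER_SIZE:
        raise ValueError("an ar member header is exactly 60 bytes")
    if raw[58:60] != b"\x60\x0a":
        raise ValueError("missing header ending characters 0x60 0x0A")
    return ArHeader(
        name=raw[0:16].decode("ascii").rstrip(" "),
        mtime=_num(raw[16:28]),
        uid=_num(raw[28:34]),
        gid=_num(raw[34:40]),
        mode=_num(raw[40:48], 8),  # the mode field is octal
        size=_num(raw[48:58]),
    )


def format_header(header: ArHeader) -> bytes:
    """Encode a header; every field is right-padded with ASCII spaces."""
    raw = b"".join((
        header.name.encode("ascii").ljust(16),
        str(header.mtime).encode("ascii").ljust(12),
        str(header.uid).encode("ascii").ljust(6),
        str(header.gid).encode("ascii").ljust(6),
        format(header.mode, "o").encode("ascii").ljust(8),
        str(header.size).encode("ascii").ljust(10),
        b"\x60\x0a",
    ))
    if len(raw) != HEADER_SIZE:
        raise ValueError("a field overflowed its fixed width")
    return raw
```

Because every field is plain ASCII text, the encoded header can be inspected in any text editor, which is the property the paragraph above describes.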
The members are aligned to even byte boundaries. "Each archive file member begins on an even byte boundary; a newline is inserted between files if necessary. Nevertheless, the size given reflects the actual size of the file exclusive of padding."[7]
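The alignment rule can be sketched as a small member iterator. This is an illustrative reader for the common format only (the function name `iter_members` is hypothetical); it returns names exactly as stored and does not interpret any variant-specific extensions.

```python
def iter_members(archive: bytes):
    """Yield (stored_name, data) for each member of a common-format archive."""
    if archive[:8] != b"!<arch>\n":
        raise ValueError("missing ar signature")
    pos = 8
    while pos < len(archive):
        header = archive[pos:pos + 60]
        if len(header) < 60 or header[58:60] != b"`\n":
            raise ValueError(f"bad member header at offset {pos}")
        name = header[0:16].decode("ascii").rstrip(" ")
        size = int(header[48:58].decode("ascii"))
        start = pos + 60
        yield name, archive[start:start + size]
        # The size field excludes padding, so skip one filler byte
        # after odd-sized data to reach the next even offset.
        pos = start + size + (size % 2)
```

Since the signature is 8 bytes and each header is 60 bytes, a data section that starts on an even offset ends on an odd one exactly when its size is odd, which is why the sketch can decide on padding from the size alone.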
Due to the limitations of file name length and format, the GNU and BSD variants each devised their own method of storing long filenames. A description of these extensions is found in libbfd.[8] Although the common format does not suffer from the year 2038 problem, many implementations of the ar utility do and may need to be modified in the future to correctly handle timestamps in excess of 2147483647.
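The timestamp point can be made concrete with a little arithmetic. In this sketch the names are illustrative: the 12-digit decimal field can hold values far beyond 2147483647, so the limit lies in how a given implementation stores the parsed number, not in the file format.

```python
INT32_MAX = 2**31 - 1   # 2147483647, the largest signed 32-bit value
FIELD_MAX = 10**12 - 1  # largest value that fits in 12 decimal digits


def timestamp_fits_int32(field: bytes) -> bool:
    """Return True if a 12-byte timestamp field fits in a signed 32-bit integer."""
    return int(field.decode("ascii")) <= INT32_MAX
```

An implementation that parses the field into a 64-bit integer would not be affected by the limit this check illustrates.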
Depending on the format, many ar implementations include a global symbol table (aka armap, directory or index) for fast linking without needing to scan the whole archive for a symbol. POSIX recognizes this feature, and requires ar implementations to have an -s option for updating it. Most implementations put it at the first file entry.[9]
BSD ar stores filenames right-padded with ASCII spaces. This causes issues with spaces inside filenames. 4.4BSD ar stores extended filenames by placing the string "#1/" followed by the file name length in the file name field, and storing the real filename in front of the data section.[6]
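A reader for the "#1/" convention might look like the following sketch. The function name `split_bsd_member` is hypothetical, and two details are assumptions not stated above: that the header's size field counts the embedded name as part of the payload, and that the stored name may be padded with NUL bytes.

```python
def split_bsd_member(name_field: bytes, payload: bytes):
    """Return (filename, data) for one BSD-style member.

    name_field is the 16-byte file identifier from the header; payload is
    everything the header's size field covers.
    """
    ident = name_field.decode("ascii").rstrip(" ")
    if ident.startswith("#1/"):
        # Extended name: the digits after "#1/" give the stored name length.
        name_len = int(ident[3:])
        raw_name = payload[:name_len]
        # Assumption: writers may pad the stored name with NUL bytes.
        return raw_name.rstrip(b"\0").decode("utf-8"), payload[name_len:]
    # Short name: stored directly in the space-padded identifier field.
    return ident, payload
```

For example, an identifier of "#1/20" means the first 20 bytes of the payload hold the filename and the file's contents start after them.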
The BSD ar utility traditionally does not handle the building of a global symbol lookup table, and delegates this task to a separate utility named ranlib,[10] which inserts an architecture-specific file named __.SYMDEF as the first archive member.[11] Some descendants put a space and "SORTED" after the name to indicate a sorted version.[12] A 64-bit variant called __.SYMDEF_64 exists on Darwin.
Since POSIX added the requirement for the -s option as a replacement for ranlib, however, newer BSD ar implementations have been rewritten to have this feature. FreeBSD in particular dropped the SYMDEF table format and adopted the System V style table.[13]
System V ar uses a '/' character (0x2F) to mark the end of the filename; this allows for the use of spaces without the use of an extended filename. It stores multiple extended filenames in the data section of a file with the name "//"; this record is referred to by later headers. A header references an extended filename by storing a "/" followed by a decimal offset to the start of the filename in the extended filename data section. The format of the "//" file itself is simply a list of the long filenames, each separated by one or more LF characters. Note that the decimal offsets count characters, not lines or strings, within the "//" file. The "//" file is usually the second entry of the archive, after the symbol table, which is always the first.
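The lookup described above can be sketched as follows. The function name `resolve_sysv_name` is hypothetical, and stripping a trailing "/" from each long name is an assumption that matches archives in which every entry of the "//" table ends with "/" before the LF.

```python
SPECIAL_NAMES = ("/", "//", "/SYM64/")  # symbol tables and the long-name table


def resolve_sysv_name(name_field: bytes, long_names: bytes = b"") -> str:
    """Resolve a System V style file identifier to a filename.

    long_names is the data section of the "//" member, if the archive has one.
    """
    ident = name_field.decode("ascii").rstrip(" ")
    if ident in SPECIAL_NAMES:
        return ident
    if ident.startswith("/") and ident[1:].isdigit():
        # "/<offset>": a character offset into the long-name table.
        offset = int(ident[1:])
        end = long_names.find(b"\n", offset)
        if end == -1:
            end = len(long_names)
        name = long_names[offset:end].decode("utf-8")
    else:
        # Short name stored in place, terminated by '/'.
        name = ident
    # Assumption: a trailing '/' is a terminator, not part of the name.
    return name[:-1] if name.endswith("/") else name
```

Because the terminator is a "/" rather than padding, a short name such as "my file.o/" resolves to a filename containing a space without needing the long-name table.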
System V ar uses the special filename "/" to denote that the following data entry contains a symbol lookup table, which is used in ar libraries to speed up access. This symbol table is built in three parts which are recorded together as contiguous data.
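The three parts are not itemized above. The sketch below assumes the layout commonly documented for the System V/GNU table: a 32-bit big-endian symbol count, that many 32-bit big-endian offsets to member headers, and a run of NUL-terminated symbol names in the same order. The function name `parse_sysv_symbol_table` is illustrative.

```python
import struct


def parse_sysv_symbol_table(data: bytes):
    """Return a list of (symbol_name, member_header_offset) pairs.

    data is the data section of the "/" member. Assumed layout:
    count, then count offsets, then count NUL-terminated names.
    """
    (count,) = struct.unpack_from(">I", data, 0)
    offsets = struct.unpack_from(f">{count}I", data, 4)
    names_blob = data[4 + 4 * count:]
    # Splitting on NUL leaves a trailing empty piece; keep the first count names.
    names = names_blob.split(b"\0")[:count]
    return [(name.decode("ascii"), off) for name, off in zip(names, offsets)]
```

A linker-like tool could use the returned offsets to seek directly to the member that defines a wanted symbol instead of scanning every member.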
Some System V systems do not use the format described above for the symbol lookup table. For operating systems such as HP-UX 11.0, this information is stored in a data structure based on the SOM file format.
The special file "/" is not terminated with a specific sequence; the end is assumed once the last symbol name has been read.
To overcome the 4 GiB file size limit, some operating systems such as Solaris 11.2, as well as GNU, use a variant lookup table. Instead of 32-bit integers, 64-bit integers are used in the symbol lookup tables. The string "/SYM64/" is used instead of "/" as the identifier for this table.[14]
The Windows (PE/COFF) variant is based on the SysV/GNU variant. The first entry "/" has the same layout as the SysV/GNU symbol table. The second entry is another "/", a Microsoft extension that stores an extended symbol cross-reference table. This one is sorted and uses little-endian integers.[5][15] The third entry is the optional "//" long name data as in SysV/GNU.[16]
The versions of ar in GNU binutils and Elfutils have an additional "thin archive" format with the magic number !<thin>. A thin archive only contains a symbol table and references to the files. The file format is essentially a System V format archive where every file is stored without its data section. Every filename is stored as a "long" filename, and they are to be resolved as if they were symbolic links.[17]
To create an archive from files class1.o, class2.o, class3.o, the following command would be used:
ar rcs libclass.a class1.o class2.o class3.o
Unix linkers, usually invoked through the C compiler cc, can read ar files and extract object files from them, so if libclass.a is an archive containing class1.o, class2.o and class3.o, then
cc main.c libclass.a
or (if libclass.a is placed in a standard library path, like /usr/local/lib)
cc main.c -lclass
or (during linking)
ld ... main.o -lclass ...
is the same as:
cc main.c class1.o class2.o class3.o
ar – Shell and Utilities Reference, The Single UNIX Specification, Version 4 from The Open Group
ar(5) – FreeBSD File Formats Manual
ar: create and maintain library archives – Shell and Utilities Reference, The Single UNIX Specification, Version 4 from The Open Group
ar(1) – Plan 9 Programmer's Manual, Volume 1
ar(1) – Inferno General commands Manual
ar(1) – Linux User Commands Manual
ar(1) – FreeBSD General Commands Manual
ar(1) – Version 7 Unix Programmer's Manual
ar(5) – FreeBSD File Formats Manual, an account of Unix formats