Guide to Unix/Commands/File Compression

From Wikibooks, open books for an open world

gzip

gzip compresses files. Each input file is compressed into its own output file. The compressed file consists of a GNU zip header and deflated data.

If given a file as an argument, gzip compresses the file, adds a ".gz" suffix, and deletes the original file. With no arguments, gzip compresses the standard input and writes the compressed file to standard output.

Some useful options are:

-c  Write compressed file to stdout. Do not delete original file.
-d  Act like gunzip.
-1  Performance: Use fast compression (somewhat bigger result)
-9  Performance: Use best compression (somewhat slower)

Examples:

Compress the file named README. Creates README.gz and deletes README.

$ gzip README

Compress the file called README. The standard output (which is the compressed file) is redirected by the shell to gzips/README.gz. Keeps README.

$ gzip -c README > gzips/README.gz

Use gzip without arguments to compress README.

$ < README gzip > gzips/README.gz
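The behaviours described above can be tried safely in a scratch directory. The following is a minimal sketch, not a canonical recipe: the sample file contents and the names fast.gz and best.gz are made up for illustration, and mktemp and seq are assumed to be available.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample file: 1000 numbered lines, which compress well.
seq 1 1000 > README

# -c writes the compressed data to stdout, so README is kept.
gzip -c README > README.gz

# Fast (-1) versus best (-9) compression, both reading standard input.
gzip -1 < README > fast.gz
gzip -9 < README > best.gz

# Print the sizes in bytes for comparison.
wc -c README README.gz fast.gz best.gz
```

The exact sizes depend on the gzip version, so compare them on your own system rather than relying on fixed numbers.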

Links:

GNU Gzip, a manual, gnu.org

gunzip

gunzip uncompresses a file that was compressed with "gzip" or "compress". It handles both the GNU zip format of gzip and the older Unix compress format. It selects files by their extension (".gz", ".Z", or several others) and confirms the format from the magic number at the start of the file.

Some useful options are:

-c  Write uncompressed data to stdout. Do not delete original file.

Undo the effect of gzip README by replacing the compressed version of the file with the original, uncompressed version. Creates README and deletes README.gz.

$ gunzip README.gz

Write the uncompressed contents of README.gz to standard output. Pipe it into a pager for easy reading of a compressed file.

$ gunzip -c README.gz | more

Another way to do that is:

$ gunzip < README.gz | more

Some people name files package.tgz as short for package.tar.gz.
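As a rough illustration of the difference between gunzip with and without -c, the sketch below runs both against a small made-up file. The file contents and the name copy.txt are assumptions for the example only.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-line sample file.
printf 'line one\nline two\n' > README

# Compress in place: README is replaced by README.gz.
gzip README

# -c sends the uncompressed data to stdout and keeps README.gz.
gunzip -c README.gz | wc -l

# Reading from standard input also leaves README.gz alone.
gunzip < README.gz > copy.txt

# Without -c, README is restored and README.gz is deleted.
gunzip README.gz
ls
```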

Links:

GNU Gzip, a manual, gnu.org

zcat

zcat is the same thing as uncompress -c, though on many systems it is actually the same as "gzcat" and gunzip -c.

Links:

zcat, opengroup.org
zcat, freebsd.org
GNU Gzip, a manual, gnu.org

gzcat

gzcat is the same as gunzip -c, which is gzip -dc.

tar

tar creates archives without compression. It is not covered by modern POSIX, which covers pax instead; yet tar continues to be widely used. An archive contains one or more files or directories.

Options to tar are confusing. Specify a mode every time.

Modes:

-c create an archive (files to archive, archive from files)
-x extract an archive (archive to files, files from archive)
-t list an archive (lists the files in the archive)

Options:

-f FILE name of archive; must be specified unless using a tape drive for the archive
-v be verbose, list all files being archived/extracted
-p preserve permissions and (if possible) user/group when extracting.
-z create/extract archive with gzip/gunzip
-j create/extract archive with bzip2/bunzip2
-J create/extract archive with XZ

Examples:

Compress (gzip) and package (tar) the directory myfiles to create myfiles.tar.gz:

$ tar -czvf myfiles.tar.gz myfiles

Uncompress (gunzip) and unpack the compressed package, extracting its contents from myfiles.tar.gz:

$ tar -xzvf myfiles.tar.gz
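The three modes can be exercised together as a round trip. This is a sketch under stated assumptions: the file names and contents are invented, and the -C option (change to a directory before extracting) is not described above but is accepted by both GNU tar and BSD tar.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical directory tree to archive.
mkdir -p myfiles/sub
echo "alpha" > myfiles/a.txt
echo "beta" > myfiles/sub/b.txt

# -c: create a gzip-compressed archive.
tar -czf myfiles.tar.gz myfiles

# -t: list the archive without extracting it.
tar -tzf myfiles.tar.gz

# -x: extract into a separate directory (restore is a made-up name).
mkdir restore
tar -xzf myfiles.tar.gz -C restore

# Compare the original tree with the extracted copy.
diff -r myfiles restore/myfiles && echo "round trip OK"
```

Extracting into a separate directory keeps the comparison honest, since the original tree is never overwritten.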

There are two different conventions concerning gzipped tarballs. One often encounters .tar.gz. The other popular choice is .tgz. Slackware packages use the latter convention.

If you have access to a tape device or other backup medium, then you can use it instead of an archive file. If the material to be archived exceeds the capacity of the backup medium, the program will prompt the user to insert a new tape or diskette.

Use the following command to back up the myfiles directory to floppies:

$ tar -cvf /dev/fd0 myfiles

Restore that backup with:

$ tar -xvf /dev/fd0

You can also specify standard input or output with -f - instead of an archive file or device. It is possible to copy between directories by piping two "tar" commands together. For example, suppose we have two directories, from-stuff and to-stuff:

$ ls -F
from-stuff/  to-stuff/

As described in Running Linux, one can mirror everything from from-stuff to to-stuff by running the following from inside from-stuff:

$ tar cf - . | (cd ../to-stuff; tar xvf -)

Reference: Welsh, Matt, Matthias Kalle Dalheimer and Lar Kaufman (1999), Running Linux. Third edition, O'Reilly and Associates.
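The tar-to-tar copy can be tried on a scratch tree. The sketch below is illustrative only: the file docs/notes.txt is invented, and the subshell uses && rather than ; so that the second tar does not run if the cd fails, which is a small variation on the command shown in the text.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical source tree and an empty destination.
mkdir -p from-stuff/docs to-stuff
echo "notes" > from-stuff/docs/notes.txt

# Run the pipeline from inside the source directory.
cd from-stuff || exit 1
tar cf - . | (cd ../to-stuff && tar xf -)
cd ..

# Both trees should now have identical contents.
diff -r from-stuff to-stuff && echo "mirror OK"
```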

Links:

tar, The Single UNIX ® Specification, Version 2, 1997, opengroup.org
C.4 Utilities, opengroup.org - indicates tar as removed
tar man page, man.cat-v.org
tar, freebsd.org
GNU tar, a manual, gnu.org
tar (computing), wikipedia.org

cpio

cpio is used for creating archives. When creating an archive, a list of files is fed to its standard input (rather than specifying the files on the command line). This file list is typically created by ls, find or locate and then piped directly to cpio; but it can also first be filtered or edited with commands like grep, sed, sort and others. A (pre-edited) list stored as a file can also be used, by using cat to feed the pipeline or simply by redirecting the shell's standard input (<).

cpio works in one of three modes:

cpio -o - Copy-Out mode: Files are copied out from the filesystem to create an archive. Usually the archive is created by simply using the shell to redirect cpio's output to a file (with >).
cpio -i - Copy-In mode: Files from an existing archive are restored/extracted, and copied back in to the filesystem.
cpio -p - Pass-Through mode: cpio is used to copy files from one location in the directory tree to another, without an actual archive being made.

In addition comes:

cpio -t - List archive: The content of an archive is listed without extracting it.
cpio -tv - Here the verbose option (-v) will cause a "long listing", with permissions, size and ownership.

Adding the verbose option (-v) in Copy-In, Copy-Out and Pass-Through mode will cause cpio to list the files as they're extracted/archived/copied.

Using ls to create an archive (verbosely) with all doc-files in the current directory:

$ ls *.doc | cpio -ov > word-docs.cpio

Using find to create an archive with all txt-files in and below the current directory:

$ find . -name "*.txt" | cpio -ov > text-files.cpio

Using find and fgrep to create an archive of just the txt-files containing the word wiki (any case):

$ find . -name "*.txt" -exec fgrep -l -i "wiki" {} \; | cpio -ov > wiki.cpio

For fgrep the option -i means "ignore case", and the option -l causes it to just list the filenames of files matching the pattern.

Using an existing list of files:

$ cpio -ov < file-list.txt > archive.cpio

Using several lists of files, but only after sorting them and removing duplicates with sort and uniq:

$ cat files1 files2 files3 | sort | uniq | cpio -ov > myfiles.cpio

To add more files, use the append option (-A). Specify the archive file with the file option (-F):

$ cat files4 | cpio -ovA -F myfiles.cpio

To extract files (being verbose):

$ cpio -iv < myfiles.cpio

cpio doesn't create directories by default, so use the option -d to make it create them.

To extract files, while creating directories as needed:

$ cpio -ivd < myfiles.cpio

To list the content of an archive, short listing:

$ cpio -t < myfiles.cpio

To list the content of an archive, long listing:

$ cpio -tv < myfiles.cpio
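Because cpio takes its file list on standard input, most of the work lies in preparing that list. The following sketch builds a de-duplicated list from two made-up list files and feeds it to cpio only if cpio is installed; all file names and contents here are assumptions for illustration.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical files to archive.
echo "about the wiki" > a.txt
echo "nothing here" > b.txt
echo "WIKI again" > c.txt

# Two overlapping lists; ./c.txt appears in both.
printf './a.txt\n./c.txt\n' > files1
printf './c.txt\n./b.txt\n' > files2

# Merge the lists and drop the duplicate entry.
cat files1 files2 | sort | uniq > file-list.txt
cat file-list.txt

# Create and list the archive, if cpio is available on this system.
if command -v cpio >/dev/null 2>&1; then
  cpio -o < file-list.txt > myfiles.cpio
  cpio -t < myfiles.cpio
fi
```

Without the sort and uniq step, ./c.txt would be handed to cpio twice.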

Links:

cpio, The Single ... Specification, Version 2, 1997, opengroup.org
C.4 Utilities, opengroup.org
cpio man page, man.cat-v.org
GNU CPIO, a manual, gnu.org

pax

Provides archiving services like tar but with different command-line syntax; provides more archive formats than tar. Because pax does not assume a tape device, some prefer it to tar.

Archive formats to be supported at minimum per POSIX are cpio, pax, and ustar. The FreeBSD pax tool does not support the pax archive format; the pax format is supported by AIX and Solaris.

Although covered by POSIX, pax is usually not installed by default in Linux distributions; tar sees continued use instead. Even when installed as an additional package, pax for Linux does not support the POSIX-required pax archive format.

Links:

pax, opengroup.org
pax in The Single UNIX ® Specification, Version 2, 1997, opengroup.org
pax, freebsd.org
pax(1p), man7.org - from POSIX Programmer's Manual, not from actual Linux
pax, opensource.apple.com
pax for AIX 7.3, ibm.com
pax for Solaris 11.1, docs.oracle.com
pax, heirloom.sourceforge.net
paxutils, savannah.gnu.org - has no published releases; there is a git repository
pax (command), wikipedia.org
Package: pax, packages.ubuntu.com
Package: pax, packages.debian.org
pax, pkgs.org
Pax-20201030, linuxfromscratch.org
pax lack of support for "pax" format fails LSB, 2009, bugs.launchpad.net/paxmirabilis
pax is not compliant with the latest version of POSIX yet, 2014, bugs.launchpad.net/ubuntu
mircpio in MirBSD, github.com

bzip2

bzip2 and bunzip2 are similar to "gzip"/"gunzip" but with a different compression method. Compression is generally better but slower than "gzip". Decompression is faster than compression.

An option of -1 through -9 can be used to specify how well bzip2 should compress. The number sets the size, in steps of 100 kB, of the "chunks" that bzip2 compresses at a time, so bzip2 -5 foo.bar compresses foo.bar in chunks of 500 kB each. Generally, larger chunks mean better compression (but probably slower). Only undamaged "chunks" can be recovered with bzip2recover from a damaged bzip2 file, so if you compressed with 900 kB chunks, you lose 900 kB of your file if one chunk is damaged, but only 100 kB if you used 100 kB chunks (bzip2 -1). By default bzip2 uses 900 kB chunks for the best possible compression.

bzcat is the same as bunzip2 -c, which is bzip2 -dc.
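The relationship between the level option and the chunk size is simple arithmetic, sketched below. The helper chunk_kb is a hypothetical name introduced for this example; it does not call bzip2 and only restates the 100 kB-per-level rule described above.

```shell
# chunk_kb LEVEL: print the chunk size in kB for a bzip2 level of 1 to 9.
# Hypothetical helper; it only multiplies the level by 100.
chunk_kb() {
  echo $(( $1 * 100 ))
}

# Chunk size is also the amount of data at risk if one chunk is damaged.
for level in 1 5 9; do
  echo "bzip2 -$level: $(chunk_kb "$level") kB per chunk"
done
```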

Links:

bzip2, freebsd.org

zip

Adds files to a compressed zip archive. You can extract files from a zip archive using unzip. The zip format is a common archiving file format used on Microsoft Windows PCs. A zip archive has members compressed individually; imagine gzip of every file before tar-ing them, but with a different format.

As with gzip, the quality of the compression can be specified by giving a number between 1 and 9 as an option (e.g. zip -5). 1 is quickest but gives low-quality compression; 9 gives the highest quality of compression but is slow. In addition, 0 can be used (i.e. zip -0) to specify that the files should just be "stored" and not compressed (a compression of 0%), thus making it possible to use zip to make uncompressed archives.

Note that a zip archive contains individually compressed files collected into a single file. This is the opposite of how it's done for most other compressed Unix archives (e.g. tar.gz and tar.bz2), where the files/directories are first collected into a single file, an archive (e.g. cpio or tar), and then this single file is compressed (e.g. using gzip or bzip2).

Examples:

zip archive.zip file.txt
- Adds the file to the archive. If the archive does not exist, creates it.
zip archive file.txt
- As above; adds the .zip extension automatically, creating archive.zip.
cat filelist.txt | zip archive.zip -@
- Adds the files listed in filelist.txt to the archive.
zip -0 archive.zip file.txt
- Adds the file to the archive without compressing it, merely storing the file.
zip archive.zip file1.txt file2.txt file3.txt
- Adds multiple files to the archive.
zip -r archive.zip .
- Adds all files in the current directory and the sub-directories into the archive except for the archive itself, preserving the directory nesting information.
zip -r -j archive.zip .
- As above, but without the directory nesting information. Thus, each file is tracked under its file name only in the archive.
zip -h2
- Outputs extended help, longer than the -h one.

Links:

zip, freebsd.org
Info-ZIP, wikipedia.org
Zip (file format), wikipedia.org

unzip

Extracts files from zip archives. See also zip. You can get a Windows version of Info-ZIP unzip from GnuWin32. FreeBSD appears to be using a custom version of unzip, distinct from Info-ZIP yet largely compatible with it.

Examples:

unzip archive.zip
- Extracts all files from the archive.
unzip archive.zip file.txt
- Extracts a particular file from the archive.
unzip -l archive.zip
- Lists files contained in the archive without extracting them.

Links:

unzip, freebsd.org
infozip, sourceforge.net
unzip.c as part of FreeBSD, github.com

compress

compress creates files in a compressed file format that is popular on UNIX systems. Files compressed with compress have a ".Z" extension appended to their names.

Links:

compress, opengroup.org
compress man page, man.cat-v.org
compress, wikipedia.org

uncompress

Extracts files from an archive created by compress.

Links:

uncompress, opengroup.org
compress, uncompress, zcat, man.cat-v.org
uncompress, freebsd.org
compress, wikipedia.org

unar

Extracts files from a variety of compression formats, including 7z (7-zip) and RAR. License: LGPL. A companion utility to show the archive file listing is lsar.

Links: