Extensions over GNU
Though the main goal of the project is compatibility, uutils supports a few features that are not supported by GNU coreutils. We take care not to introduce features that are incompatible with GNU coreutils. Below is a list of uutils extensions.
General
GNU coreutils provides two ways to define short options taking an argument:
$ ls -w 80
$ ls -w80
We support a third way:
$ ls -w=80
env
GNU env allows the empty string to be used as an environment variable name. This is unsupported by uutils, which shows a warning on any such assignment.
env has an additional -f/--file flag that can parse .env files and set variables accordingly. This feature is adopted from dotenv-style packages.
cp
cp can display a progress bar when the -g/--progress flag is set.
mv
mv can display a progress bar when the -g/--progress flag is set.
hashsum
This utility does not exist in GNU coreutils. hashsum is a utility that supports computing checksums with several algorithms. The flags and options are identical to the *sum family of utils (sha1sum, sha256sum, b2sum, etc.).
b3sum
This utility does not exist in GNU coreutils. The behavior is modeled after both the b2sum utility of GNU and the b3sum utility by the BLAKE3 team. It supports the --no-names option, which does not appear in the GNU util.
more
We provide a simple implementation of more, which is not part of GNU coreutils. We do not aim for full compatibility with the more utility from util-linux. Features from more modern pagers (like less and bat) are therefore welcomed.
cut
cut can separate fields by whitespace (Space and Tab) with the -w flag. This feature is adopted from FreeBSD.
fmt
fmt has additional flags for prefixes: -P/--skip-prefix, -x/--exact-prefix, and -X/--exact-skip-prefix. With -m/--preserve-headers, an attempt is made to detect and preserve mail headers in the input. -q/--quick breaks lines more quickly, and -T/--tab-width defines the number of spaces representing a tab when determining the line length.
printf
printf uses arbitrary precision decimal numbers to parse and format floating point numbers. GNU coreutils uses long double, whose actual size may be a double precision 64-bit float (e.g. 32-bit arm), an extended precision 80-bit float (x86(-64)), or a quadruple precision 128-bit float (e.g. arm64).
Practically, this means that printing a number with a large precision will stay exact:
printf "%.48f\n" 0.1
0.100000000000000000000000000000000000000000000000 << uutils on all platforms
0.100000000000000000001355252715606880542509316001 << GNU coreutils on x86(-64)
0.100000000000000000000000000000000004814824860968 << GNU coreutils on arm64
0.100000000000000005551115123125782702118158340454 << GNU coreutils on armv7 (32-bit)
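The gap between an exact decimal and a fixed-width binary float can be reproduced outside of either implementation. The following Python sketch is only an illustration and says nothing about how uutils is implemented: it assumes decimal.Decimal as a stand-in for an arbitrary precision decimal type, and a Python float as a stand-in for a 64-bit double.

```python
from decimal import Decimal

# Parsing the literal as a decimal keeps the value exactly 1/10,
# so every requested digit after the first is a zero.
exact = format(Decimal("0.1"), ".48f")

# A 64-bit binary float stores only the nearest representable value,
# so asking for 48 digits exposes the binary rounding error.
binary64 = format(0.1, ".48f")

print(exact)
print(binary64)
```

The second value corresponds to the 64-bit double case listed above; the 80-bit and 128-bit cases cannot be reproduced with a plain Python float.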
Hexadecimal floats
For the hexadecimal float format (%a), POSIX only states that one hexadecimal digit should be present left of the decimal point (0xh.hhhhp±d [1]), but does not say how many bits should be included (between 1 and 4). On x86(-64), the first digit always includes 4 bits, so its value is always between 0x8 and 0xf, while on other architectures, only 1 bit is included, so the value is always 0x1.
However, the first digit will of course be 0x0 if the number is zero. Also, rounding numbers may cause the first digit to be 0x1 on x86(-64) (e.g. 0xf.fffffffp-5 rounds to 0x1.00p-1), or 0x2 on other architectures.
We chose to replicate x86-64 behavior on all platforms.
Additionally, the default precision of the hexadecimal float format (%a without any precision specifier) is expected to be "sufficient for exact representation of the value" [1]. This is not possible in uutils as we store arbitrary precision numbers that may be periodic in hexadecimal form (0.1 = 0xc.ccc...p-7), so we revert to the number of digits that would be required to exactly print an extended precision 80-bit float, emulating GNU coreutils behavior on x86(-64). An 80-bit float has 64 bits in its integer and fractional part, so 16 hexadecimal digits are printed in total (1 digit before the decimal point, 15 after).
Practically, this means that the default hexadecimal floating point output is identical to x86(-64) GNU coreutils:
printf "%a\n" 0.1
0xc.ccccccccccccccdp-7 << uutils on all platforms
0xc.ccccccccccccccdp-7 << GNU coreutils on x86-64
0x1.999999999999999999999999999ap-4 << GNU coreutils on arm64
0x1.999999999999ap-4 << GNU coreutils on armv7 (32-bit)
We can print an arbitrary number of digits if a larger precision is requested, and the leading digit will still be in the 0x8-0xf range:
printf "%.32a\n" 0.1
0xc.cccccccccccccccccccccccccccccccdp-7 << uutils on all platforms
0xc.ccccccccccccccd00000000000000000p-7 << GNU coreutils on x86-64
0x1.999999999999999999999999999a0000p-4 << GNU coreutils on arm64
0x1.999999999999a0000000000000000000p-4 << GNU coreutils on armv7 (32-bit)
Note: The architecture-specific behavior on non-x86(-64) platforms may change in the future.
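The x86-64 style normalization described in this section (4 bits in the leading digit, 15 fractional digits by default, and a carry into 0x1 when rounding overflows) can be sketched in a few lines. This is a minimal Python illustration, not the uutils implementation: the helper name hex_float_x86_style is hypothetical, fractions.Fraction stands in for an arbitrary precision number, and the output chosen for zero is an assumption.

```python
from fractions import Fraction


def hex_float_x86_style(value, frac_digits=15):
    """Hypothetical helper: hex float text with a leading digit in 0x8-0xf.

    Assumes frac_digits >= 1.
    """
    value = Fraction(value)
    if value == 0:
        # Assumed form for zero; only the leading 0x0 digit comes from the text.
        return "0x0." + "0" * frac_digits + "p+0"
    sign = "-" if value < 0 else ""
    value = abs(value)

    # Pick the binary exponent so that 8 <= value < 16,
    # i.e. the first hex digit carries 4 significant bits.
    exponent = 0
    while value >= 16:
        value /= 2
        exponent += 1
    while value < 8:
        value *= 2
        exponent -= 1

    # Keep one integer digit plus frac_digits fractional hex digits, rounded.
    scale = 16 ** frac_digits
    mantissa = round(value * scale)

    # Rounding up can carry into a new digit (0xf.ff... -> 0x10.00...);
    # shift by one hex digit so that the leading digit becomes 0x1.
    if mantissa >= 16 * scale:
        mantissa //= 16
        exponent += 4

    lead, frac = divmod(mantissa, scale)
    return f"{sign}0x{lead:x}.{frac:0{frac_digits}x}p{exponent:+d}"


print(hex_float_x86_style(Fraction(1, 10)))
print(hex_float_x86_style(Fraction(0xFFFFFFFF, 16**7 * 32), 2))
```

The first call uses the exact value 1/10 with the default 15 fractional digits; the second passes 0xf.fffffffp-5 with two fractional digits to exercise the carry case.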
seq
Unlike GNU coreutils, seq always uses arbitrary precision decimal numbers, no matter the parameters (integers, decimal numbers, positive or negative increments, format specified, etc.), so its output will be more correct than GNU coreutils for some inputs (e.g. small fractional increments where GNU coreutils uses long double).
The only limitation is that the position of the decimal point is stored in an i64, so values smaller than 10**(-2**63) will underflow to 0, and some values larger than 10**(2**63) may overflow to infinity.
See also the comments under printf for formatting precision and differences.
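Why decimal arithmetic helps with fractional increments can be shown with a small Python sketch. It is an illustration under stated assumptions, not a model of either implementation: decimal_seq and float_seq are hypothetical helpers that only handle positive steps, decimal.Decimal stands in for an arbitrary precision decimal, and a 64-bit Python float stands in for a fixed-width binary float (GNU coreutils uses long double, which this does not reproduce).

```python
from decimal import Decimal


def decimal_seq(first, step, last):
    """Hypothetical helper: step through the range with exact decimal values."""
    current, increment, end = Decimal(first), Decimal(step), Decimal(last)
    values = []
    while current <= end:
        values.append(str(current))
        current += increment
    return values


def float_seq(first, step, last):
    """Same loop with 64-bit binary floats, which accumulate rounding error."""
    current = first
    values = []
    while current <= last:
        values.append(current)
        current += step
    return values


decimal_values = decimal_seq("0", "0.1", "1")
float_values = float_seq(0.0, 0.1, 1.0)

print(decimal_values[-1])
print(float_values[-1])
```

With decimal values the last element is exactly the requested end of the range, while the binary float version drifts slightly below it.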
seq provides -t/--terminator to set the terminator character.
sort
When sorting with -g/--general-numeric-sort, arbitrary precision decimal numbers are parsed and compared, unlike GNU coreutils, which uses platform-specific long double floating point numbers.
Extremely large or small values can still overflow or underflow to infinity or zero; see the note in seq.
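The effect of comparing parsed decimals rather than fixed-width floats can be sketched as follows. This Python snippet is only an illustration: decimal.Decimal stands in for an arbitrary precision decimal, a 64-bit Python float stands in for a fixed-width binary float, and the two input lines are made-up sample data.

```python
from decimal import Decimal

# Two sample lines that differ only in the 22nd decimal place.
lines = ["0.1000000000000000000002", "0.1000000000000000000001"]

# As exact decimals the two values are distinct and sort in numeric order.
by_decimal = sorted(lines, key=Decimal)

# As 64-bit binary floats both lines parse to the same value,
# so a float comparison cannot tell them apart.
same_as_float = float(lines[0]) == float(lines[1])

print(by_decimal)
print(same_as_float)
```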
ls
GNU ls provides two ways to use a long listing format: -l and --format=long. We support a third way: --long.
GNU ls --sort=VALUE only supports special non-default sort orders. We support --sort=name, which makes it possible to override an earlier value.
du
du allows birth and creation as values for the --time argument to show the creation time. It also provides a -v/--verbose flag.
id
id has three additional flags:
-P displays the id as a password file entry
-p makes the output human-readable
-A displays the process audit user ID
uptime
Similar to the procps implementation and unlike GNU coreutils, uptime provides -s/--since to show since when the system has been up.
base32/base64/basenc
Just like on macOS, base32/base64/basenc provides -D to decode data.
shred
The number of random passes is deterministic in both GNU and uutils. However, uutils shred computes the number of random passes in a simplified way, specifically max(3, x / 10), where x is the overall number of passes. This is very close but not identical to the number of random passes that GNU would do. It also satisfies an expectation that reasonable users might have, namely that the number of random passes increases monotonically with the number of passes overall; GNU shred violates this assumption.
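The formula is small enough to evaluate directly. The Python sketch below only restates max(3, x / 10) and checks the monotonicity property; the function name random_passes is hypothetical, and reading the division as integer division is an assumption, since the text does not say how the result is rounded.

```python
def random_passes(total_passes):
    """Hypothetical helper: evaluate max(3, x / 10) for x total passes.

    Assumes integer (floor) division.
    """
    return max(3, total_passes // 10)


# The result never decreases as the total number of passes grows.
counts = [random_passes(n) for n in range(1, 201)]
is_monotonic = all(a <= b for a, b in zip(counts, counts[1:]))

print(random_passes(3), random_passes(100), is_monotonic)
```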
unexpand
GNU unexpand provides --first-only to convert only leading sequences of blanks. We support a second way: -f, like busybox.
Using -U/--no-utf8, you can interpret input files as 8-bit ASCII rather than UTF-8.
expand
expand also offers the -U/--no-utf8 option to interpret input files as 8-bit ASCII instead of UTF-8.