Modernish is a library for writing robust, portable, readable, and powerful programs for POSIX-based shells and utilities.

modernish/modernish

For code examples, see EXAMPLES.md and share/doc/modernish/examples.

modernish – harness the shell

  • Sick of quoting hell and split/glob pitfalls?
  • Tired of brittle shell scripts going haywire and causing damage?
  • Mystified by line noise commands like [, [[, (( ?
  • Is scripting basic things just too hard?
  • Ever wish that find were a built-in shell loop?
  • Do you want your script to work on nearly any shell on any Unix-like OS?

Modernish is a library for shell script programming which provides features like safer variable and command expansion, new language constructs for loop iteration, and much more. Modernish programs are shell programs; the new constructs are mixed with shell syntax so that the programmer can take advantage of the best of both.

There is no compiled code to install, as modernish is written entirely in the shell language. It can be deployed in embedded or multi-user systems in which new binary executables may not be introduced for security reasons, and is portable among numerous shell implementations. The installer can also bundle a reduced copy of the library with your scripts, so they can run portably with a known version of modernish without requiring prior installation.

Join us and help breathe some new life into the shell! We are looking for testers, early adopters, and developers. Download the latest release or check out the very latest development code from the master branch. Read through the documentation below. Play with the example scripts and write your own. Try to break the library and send reports of breakage.

Getting started

Run install.sh and follow instructions, choosing your preferred shell and install location. After successful installation you can run modernish shell scripts and write your own. Run uninstall.sh to remove modernish.

Both the install and uninstall scripts are interactive by default, but support fully automated (non-interactive) operation as well. Command line options are as follows:

install.sh [ -n ] [ -s shell ] [ -f ] [ -P pathspec ] [ -d installroot ] [ -D prefix ] [ -B scriptfile ... ]

  • -n: non-interactive operation
  • -s: specify default shell to execute modernish
  • -f: force unconditional installation on specified shell
  • -P: specify an alternative DEFPATH for the installation (be careful; usually not recommended)
  • -d: specify root directory for installation
  • -D: extra destination directory prefix (for packagers)
  • -B: bundle modernish with your scripts (-D required, -n implied), see Appendix F

uninstall.sh [ -n ] [ -f ] [ -d installroot ]

  • -n: non-interactive operation
  • -f: delete */modernish directories even if files are left
  • -d: specify root directory of modernish installation to uninstall

Two basic forms of a modernish program

In the simple form, modernish is added to a script written for a specific shell. In the portable form, your script is shell-agnostic and may run on any shell that can run modernish.

Simple form

The simplest way to write a modernish program is to source modernish as a dot script. For example, if you write for bash:

    #! /bin/bash
    . modernish
    use safe
    use sys/base
    ...your program starts here...

The modernish use command loads modules with optional functionality. The safe module initialises the safe mode. The sys/base module contains modernish versions of certain basic but non-standardised utilities (e.g. readlink, mktemp, which), guaranteeing that modernish programs all have a known version at their disposal. There are many other modules as well. See Modules for more information.

The above method makes the program dependent on one particular shell (in this case, bash). So it is okay to mix and match functionality specific to that particular shell with modernish functionality.

(On zsh, there is a way to integrate modernish with native zsh scripts. See Appendix E.)

Portable form

The most portable way to write a modernish program is to use the special generic hashbang path for modernish programs. For example:

    #! /usr/bin/env modernish
    #! use safe
    #! use sys/base
    ...your program begins here...

For portability, it is important there is no space after env modernish; NetBSD and OpenBSD consider trailing spaces part of the name, so env will fail to find modernish.

A program in this form is executed by whatever shell the user who installed modernish on the local system chose as the default shell. Since you as the programmer can't know what shell this is (other than the fact that it passed some rigorous POSIX compliance testing executed by modernish), a program in this form must be strictly POSIX compliant – except, of course, that it should also make full use of the rich functionality offered by modernish.

Note that modules are loaded in a different way: the use commands are part of a hashbang comment (starting with #! like the initial hashbang path). Only such lines that immediately follow the initial hashbang path are evaluated; even an empty line in between causes the rest to be ignored. This special way of pre-loading modules is needed to make any aliases they define work reliably on all shells.

Interactive use

Modernish is primarily designed to enhance shell programs/scripts, but also offers features for use in interactive shells. For instance, the new repeat loop construct from the var/loop module can be quite practical to repeat an action x times, and the safe module on interactive shells provides convenience functions for manipulating, saving and restoring the state of field splitting and globbing.

To use modernish on your favourite interactive shell, you have to add it to your .profile, .bashrc or similar init file.

Important: Upon initialising, modernish adapts itself to other settings, such as the locale. It also removes certain aliases that may keep modernish from initialising properly. So you have to organise your .profile or similar file in the following order:

  • first, define general system settings (PATH, locale, etc.);
  • then, . modernish and use any modules you want;
  • then define anything that may depend on modernish, and set your aliases.

Non-interactive command line use

After installation, the modernish command can be invoked as if it were a shell, with the standard command line options from other shells (such as -c to specify a command or script directly on the command line), plus some enhancements. The effect is that the shell chosen at installation time will be run enhanced with modernish functionality. It is not possible to use modernish as an interactive shell in this way.

Usage:

  1. modernish [ --use=module | shelloption ... ] [ scriptfile ] [ arguments ]
  2. modernish [ --use=module | shelloption ... ] -c [ script [ me-name [ arguments ] ] ]
  3. modernish --test [ testoption ... ]
  4. modernish [ --version | --help ]

In the first form, the script in the file scriptfile is loaded and executed with any arguments assigned to the positional parameters.

In the second form, -c executes the specified modernish script, optionally with the me-name assigned to $ME and the arguments assigned to the positional parameters.

The --use option pre-loads any given modernish modules before executing the script. The module argument to each specified --use option is split using standard shell field splitting. The first field is the module name and any further fields become arguments to that module's initialisation routine.

Any given short-form or long-form shelloptions are set or unset before executing the script. Both POSIX shell options and shell-specific options are supported, depending on the shell executing modernish. Using the shell option -e or -o errexit is an error, because modernish does not support it and would break.

The --test option runs the regression test suite and exits. This verifies that the modernish installation is functioning correctly. See Appendix B for more information.

The --version and --help options output the relevant information and exit.

Non-interactive usage examples

  • Count to 10 using a basic loop:
    modernish --use=var/loop -c 'LOOP for i=1 to 10; DO putln "$i"; DONE'
  • Run a portable-form modernish program using zsh and enhanced-prompt xtrace:
    zsh /usr/local/bin/modernish -o xtrace /path/to/program.sh

Shell capability detection

Modernish includes a battery of shell feature, quirk and bug detection tests, each of which is given a special capability ID. See Appendix A for a list of shell capabilities that modernish currently detects, as well as further general information on the capability detection framework.

thisshellhas is the central function of the capability detection framework. It not only tests for the presence of shell features/quirks/bugs, but can also detect specific shell built-in commands, shell reserved words, shell options (short or long form), and signals.

Modernish itself extensively uses capability detection to adapt itself to the shell it's running on. This is how it works around shell bugs and takes advantage of efficient features not all shells have. But any script using the library can do this in the same way, with the help of this function.

Test results are cached in memory, so repeated checks using thisshellhas are efficient and there is no need to avoid calling it to optimise performance.

Usage:

thisshellhas item ...

  • If item contains only ASCII capital letters A-Z, digits 0-9 or _, return the result status of the associated modernish capability detection test.
  • If item is any other ASCII word, check if it is a shell reserved word or built-in command on the current shell.
  • If item is -- (end-of-options delimiter), disable the recognition of operators starting with - for subsequent items.
  • If item starts with --rw= or --kw=, check if the identifier immediately following these characters is a shell reserved word (a.k.a. shell keyword).
  • If item starts with --bi=, similarly check for a shell built-in command.
  • If item starts with --sig=, check if the shell knows about a signal (usable by kill, trap, etc.) by the name or number following the =. If a number > 128 is given, the remainder of its division by 128 is checked. If the signal is found, its canonicalised signal name is left in the REPLY variable, otherwise REPLY is unset. (If multiple --sig= items are given and all are found, REPLY contains only the last one.)
  • If item is -o followed by a separate word, check if this shell has a long-form shell option by that name.
  • If item is any other letter or digit preceded by a single -, check if this shell has a short-form shell option by that character.
  • item can also be one of the following two operators.
    • --cache runs all external modernish shell capability tests that have not yet been run, causing the cache to be complete.
    • --show performs a --cache and then outputs all the IDs of positive results, one per line.

thisshellhas continues to process items until one of them produces a negative result or is found invalid, at which point any further items are ignored. So the function only returns successfully if all the items specified were found on the current shell. (To check if either one item or another is present, use separate thisshellhas invocations separated by the || shell operator.)

Exit status: 0 if this shell has all the items in question; 1 if not; 2 if an item was encountered that is not recognised as a valid identifier.

Note: The tests for the presence of reserved words, built-in commands, shell options, and signals are different from capability detection tests in an important way: they only check if an item by that name exists on this shell, and don't verify that it does the same thing as on another shell.
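The modulo rule for --sig= numbers above 128 can be worked through by hand. The following is a small sketch in plain POSIX sh that does not call thisshellhas; the variable names are hypothetical, and the value 143 is chosen on the assumption that the shell reports a command killed by signal n with exit status 128 + n, as many shells do.

```shell
# Plain POSIX sh, no modernish: reduce an exit status above 128 to a signal.
status=143                          # hypothetical status of a command killed by a signal
signum=$(( status % 128 ))          # remainder of the division by 128: 143 % 128 = 15
signame=$(kill -l "$signum")        # ask the shell for the name of signal 15
printf 'status %s -> signal number %s, name %s\n' "$status" "$signum" "$signame"
```

This only shows the arithmetic; it does not leave a canonicalised name in REPLY the way the --sig= operator is described to do.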

Names and identifiers

All modernish functions require portable variable and shell function names, that is, ones consisting of ASCII uppercase and lowercase letters, digits, and the underscore character _, and that don't begin with a digit. For shell option names, the constraints are the same except a dash - is also accepted. An invalid identifier is generally treated as a fatal error.

Internal namespace

Function-local variables are not supported by the standard POSIX shell; only global variables are provided for. Modernish needs a way to store its internal state without interfering with the program using it. So most of the modernish functionality uses an internal namespace _Msh_* for variables, functions and aliases. All these names may change at any time without notice. Any names starting with _Msh_ should be considered sacrosanct and untouchable; modernish programs should never directly use them in any way. Of course this is not enforceable, but names starting with _Msh_ should be uncommon enough that no unintentional conflict is likely to occur.

Modernish system constants

Modernish provides certain constants (read-only variables) to make life easier. These include:

  • $MSH_VERSION: The version of modernish.
  • $MSH_PREFIX: Installation prefix for this modernish installation (e.g. /usr/local).
  • $MSH_MDL: Main modules directory.
  • $MSH_AUX: Main helper scripts directory.
  • $MSH_CONFIG: Path to modernish user configuration directory.
  • $ME: Path to the current program. Replacement for $0. This is necessary if the hashbang path #!/usr/bin/env modernish is used, or if the program is launched like sh /path/to/bin/modernish /path/to/script.sh, as these set $0 to the path to bin/modernish and not your program's path.
  • $MSH_SHELL: Path to the default shell for this modernish installation, chosen at install time (e.g. /bin/sh). This is a shell that is known to have passed all the modernish tests for fatal bugs. Cross-platform scripts should use it instead of hard-coding /bin/sh, because on some operating systems (NetBSD, OpenBSD, Solaris) /bin/sh is not POSIX compliant.
  • $SIGPIPESTATUS: The exit status of a command killed by SIGPIPE (a broken pipe). For instance, if you use grep something somefile.txt | more and you quit more before grep is finished, grep is killed by SIGPIPE and exits with that particular status. Hardened commands or functions may need to handle such a SIGPIPE exit specially to avoid unduly killing the program. The exact value of this exit status is shell-specific, so modernish runs a quick test to determine it at initialisation time.
    If SIGPIPE was set to ignore by the process that invoked the current shell, $SIGPIPESTATUS can't be detected and is set to the special value 99999. See also the description of the WRN_NOSIGPIPE ID for thisshellhas.
  • $DEFPATH: The default system path guaranteed to find compliant POSIX utilities, as given by getconf PATH.
  • $ERROR: A guaranteed unset variable that can be used to trigger an error that exits the (sub)shell, for instance, : "${4+${ERROR:?excess arguments}}" (error on 4 or more arguments)
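The $ERROR idiom relies only on standard parameter expansion, so it can be tried in any POSIX shell. Here is a minimal sketch, assuming modernish is not loaded: ERROR is unset by hand to stand in for the guaranteed-unset constant, and the function name max_three is hypothetical.

```shell
# Plain POSIX sh sketch. With modernish loaded, omit this line:
# there, ERROR is a read-only constant that is already guaranteed unset.
unset -v ERROR

# Hypothetical function that accepts at most three arguments.
max_three() {
    # ${4+...} expands its contents only if a 4th argument exists; the inner
    # ${ERROR:?...} then fails because ERROR is unset, exiting the (sub)shell.
    : "${4+${ERROR:?excess arguments}}"
    printf 'got %d argument(s)\n' "$#"
}

max_three a b c
# Run the failing call in a subshell so that only the subshell exits.
( max_three a b c d ) 2>/dev/null || printf 'call with 4 arguments was rejected\n'
```

The exact non-zero status and the wording of the error message differ between shells, so the sketch deliberately does not depend on either.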

Control character, whitespace and shell-safe character constants

POSIX does not provide for the quoted C-style escape codes commonly used in bash, ksh and zsh (such as $'\n' to represent a newline character), leaving the standard shell without a convenient way to refer to control characters. Modernish provides control character constants (read-only variables) with hexadecimal suffixes $CC01 .. $CC1F and $CC7F, as well as $CCe, $CCa, $CCb, $CCf, $CCn, $CCr, $CCt, $CCv (corresponding with printf backslash escape codes). This makes it easy to insert control characters in double-quoted strings.
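For comparison, this is roughly what a script has to do by hand when no such constants exist. The sketch is plain POSIX sh without modernish, and the variable names nl and tab are hypothetical stand-ins for what $CCn and $CCt provide.

```shell
# Plain POSIX sh, no modernish: build a newline and a tab manually.
nl='
'
tab=$(printf '\t')    # safe: command substitution only strips trailing newlines

# Once defined, they can be inserted in double-quoted strings,
# which is how constants such as $CCn and $CCt are meant to be used.
msg="name${tab}value${nl}second line"
printf '%s\n' "$msg"
```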

More convenience constants, handy in bracket glob patterns for use with case or modernish match:

  • $CONTROLCHARS: All ASCII control characters.
  • $WHITESPACE: All ASCII whitespace characters.
  • $ASCIIUPPER: The ASCII uppercase letters A to Z.
  • $ASCIILOWER: The ASCII lowercase letters a to z.
  • $ASCIIALNUM: The ASCII alphanumeric characters 0-9, A-Z and a-z.
  • $SHELLSAFECHARS: Safe-list for shell-quoting.
  • $ASCIICHARS: The complete set of ASCII characters (minus NUL).

Usage examples:

    # Use a glob pattern to check against control characters in a string:
    if str match "$var" "*[$CONTROLCHARS]*"; then
        putln "\$var contains at least one control character"
    fi
    # Use '!' (not '^') to check for characters *not* part of a particular set:
    if str match "$var" "*[!$ASCIICHARS]*"; then
        putln "\$var contains at least one non-ASCII character"
    fi
    # Safely split fields at any whitespace, comma or slash (requires safe mode):
    use safe
    LOOP for --split=$WHITESPACE,/ field in $my_items; DO
        putln "Item: $field"
    DONE

Reliable emergency halt

The die function reliably halts program execution, even from within subshells, optionally printing an error message. Note that die is meant for an emergency program halt only, i.e. in situations where continuing would mean the program is in an inconsistent or undefined state. Shell scripts running in an inconsistent or undefined state may wreak all sorts of havoc. They are also notoriously difficult to terminate correctly, especially if the fatal error occurs within a subshell: exit won't work then. That's why die is optimised for killing all the program's processes (including subshells and external commands launched by it) as quickly as possible. It should never be used for exiting the program normally.
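The statement that exit cannot halt a program from within a subshell is easy to check in plain POSIX sh. The sketch below does not use modernish or die; it only demonstrates the problem described above, with hypothetical variable names.

```shell
# Plain POSIX sh: `exit` inside a subshell ends only that subshell.
reached=no
( exit 1 ) || printf 'the subshell returned status %s\n' "$?"
reached=yes       # the main program simply carries on
printf 'main shell still running: %s\n' "$reached"

# A command substitution is a subshell too, so the same thing happens:
status=0
value=$(printf 'partial output'; exit 3) || status=$?
printf 'got "%s" with status %s, and the script is still alive\n' "$value" "$status"
```

Unless every such status is checked by hand, a fatal error inside a subshell goes unnoticed by the main program, which is the gap die is described as closing.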

On interactive shells, die behaves differently. It does not kill or exit your shell; instead, it issues SIGINT to the shell to abort the execution of your running command(s), which is equivalent to pressing Ctrl+C. In addition, if die is invoked from a subshell such as a background job, it kills all processes belonging to that job, but leaves other running jobs alone.

Usage: die [ message ]

If the trap stack module is active, a special DIE pseudosignal can be trapped (using plain old trap or pushtrap) to perform emergency cleanup commands upon invoking die.

If the MSH_HAVE_MERCY variable is set in a script and die is invoked from a subshell, then die will only terminate the current subshell and its subprocesses and will not execute DIE traps, allowing the script to resume execution in the parent process. This is for use in special cases, such as regression tests, and is strongly discouraged for general use. Modernish unsets the variable on init so it cannot be inherited from the environment.

Low-level shell utilities

Outputting strings

The POSIX shell lacks a simple, straightforward and portable way to output arbitrary strings of text, so modernish adds two commands for this.

  • put prints each argument separated by a space, without a trailing newline.
  • putln prints each argument, terminating each with a newline character.

There is no processing of options or escape codes. (Modernish constants $CCn, etc. can be used to insert control characters in double-quoted strings. To process escape codes, use printf instead.)

The echo command is notoriously unportable and kind of broken, so is deprecated in favour of put and putln. Modernish does provide its own version of echo, but it is only activated for portable-form scripts. Otherwise, the shell-specific version of echo is left intact. The modernish version of echo does not interpret any escape codes and supports only one option, -n, which, like BSD echo, suppresses the final newline. However, unlike BSD echo, if -n is the only argument, it is not interpreted as an option and the string -n is printed instead. This makes it safe to output arbitrary data using this version of echo as long as it is given as a single argument (using quoting if needed).
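To make the described behaviour of put and putln concrete, here is a rough approximation built on printf. The names my_put and my_putln are hypothetical, and this is not how modernish implements the real commands; the sketch only mirrors the behaviour stated above (arguments are printed as they are, with no option or escape processing).

```shell
# Hypothetical stand-ins built on printf; not modernish's own code.
my_put() {
    # Print the arguments separated by single spaces, without a trailing newline.
    _first=yes
    for _arg in "$@"; do
        if [ "$_first" = yes ]; then
            printf '%s' "$_arg"
            _first=no
        else
            printf ' %s' "$_arg"
        fi
    done
}

my_putln() {
    # Print each argument followed by a newline.
    for _arg in "$@"; do
        printf '%s\n' "$_arg"
    done
}

my_put -n 'a\tb' stays literal
my_putln ''                          # an empty argument just ends the line
my_putln 'first line' 'second line'
```

Because every argument is passed to printf through %s, a leading -n or a backslash sequence is printed literally, which is the property that makes such commands safe for arbitrary data.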

Legibility aliases: not, so, forever

Modernish sets three aliases that can help to make the shell language look slightly friendlier. Their use is optional.

not is a new synonym for !. They can be used interchangeably.

so is a command that tests if the previous command exited with a status of zero, so you can test the preceding command's success with if so or if not so.

forever is a new synonym for while :;. This allows simple infinite loops of the form: forever do stuff; done.

Enhanced exit

The exit command can be used as normal, but has gained capabilities.

Extended usage: exit [ -u ] [ status [ message ] ]

  • As per standard, if status is not specified, it defaults to the exit status of the command executed immediately prior to exit. Otherwise, it is evaluated as a shell arithmetic expression. If it is invalid as such, the shell exits immediately with an arithmetic error.
  • Any remaining arguments after status are combined, separated by spaces, and taken as a message to print on exit. The message shown is preceded by the name of the current program ($ME minus directories). Note that it is not possible to skip status while specifying a message.
  • If the -u option is given, and the shell function showusage is defined, that function is run in a subshell before exiting. It is intended to print a message showing how the command should be invoked. The -u option has no effect if the script has not defined a showusage function.
  • If status is non-zero, the message and the output of the showusage function are redirected to standard error.

chdir

chdir is a robust cd replacement for use in scripts.

The standard cd command is designed for interactive shells and appropriate to use there. However, for scripts, its features create serious pitfalls:

  • The $CDPATH variable is searched. A script may inherit a user's exported $CDPATH, so cd may change to an unintended directory.
  • cd cannot be used with arbitrary directory names (such as untrusted user input), as some operands have special meanings, even after --. POSIX specifies that - changes directory to $OLDPWD. On zsh (even in sh mode on zsh <= 5.7.1), numeric operands such as +12 or -345 represent directory stack entries. All such paths need escaping by prefixing ./.
  • Symbolic links in directory path components are not resolved by default, leaving a potential symlink attack vector.

Thus, robust and portable use of cd in scripts is unreasonably difficult. The modernish chdir function calls cd in a way that takes care of all these issues automatically: it disables $CDPATH and special operand meanings, and resolves symbolic links by default.
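The three countermeasures named above can be sketched in a few lines of plain POSIX sh. The helper safe_cd below is hypothetical and much cruder than chdir (for one thing, failure is not treated as fatal); it is only meant to show what emptying $CDPATH, prefixing ./ and passing -P look like when written out by hand.

```shell
# Hypothetical helper, NOT modernish's chdir: a bare-bones hardened cd.
safe_cd() {
    case $1 in
    ( /* ) CDPATH='' cd -P -- "$1" ;;      # absolute path: use as given
    ( * )  CDPATH='' cd -P -- "./$1" ;;    # relative path: "./" defuses "-", "+12", etc.
    esac
}

# Demo in a scratch directory that contains a subdirectory literally named "-".
scratch=${TMPDIR:-/tmp}/safe_cd_demo.$$
mkdir -p "$scratch/-" && cd "$scratch" || exit 1
safe_cd - && pwd      # ends up in the "-" directory instead of jumping to $OLDPWD
cd / && rm -rf "$scratch"
```

The empty CDPATH assignment applies to that one cd invocation only, so the caller's environment is left untouched.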

Usage: chdir [ -f ] [ -L ] [ -P ] [ -- ] directorypath

Normally, failure to change the present working directory to directorypath is a fatal error that ends the program. To tolerate failure, add the -f option; in that case, exit status 0 signifies success and exit status 1 signifies failure, and scripts should always check and handle exceptions.

The options -L (logical: don't resolve symlinks) and -P (physical: resolve symlinks) are the same as in cd, except that -P is the default. Note that on a shell with BUG_CDNOLOGIC (NetBSD sh), the -L option to chdir does nothing.

To use arbitrary directory names (e.g. directory names input by the user or other untrusted input) always use the -- separator that signals the end of options, or paths starting with - may be misinterpreted as options.

insubshell

The insubshell function checks if you're currently running in a subshell environment (usually called simply subshell).

A subshell is a copy of the parent shell that starts out as an exact duplicate (including non-exported variables, functions, etc.), except for traps. A new subshell is invoked by constructs like (parentheses), $(command substitutions), pipe|lines, and & (to launch a background subshell). Upon exiting a subshell, all changes to its state are lost.

This is not to be confused with a newly initialised shell that is merely a child process of the current shell, which is sometimes (confusingly and wrongly) called a "subshell" as well. This documentation avoids such a misleading use of the term.

Usage: insubshell [ -p | -u ]

This function returns success (0) if it was called from within a subshell and non-success (1) if not. One of two options can be given:

  • -p: Store the process ID (PID) of the current subshell or main shell in REPLY.
  • -u: Store an identifier in REPLY that is useful for determining if you've entered a subshell relative to a previously stored identifier. The content and format are unspecified and shell-dependent.
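Two properties of subshells mentioned above can be observed in plain POSIX sh. The sketch does not use insubshell and says nothing about how that function works internally; it only illustrates why a dedicated check is useful. Variable names are hypothetical.

```shell
# Plain POSIX sh: state changed in a subshell is lost when the subshell exits.
colour=red
( colour=blue; printf 'inside subshell: %s\n' "$colour" )
printf 'back in main shell: %s\n' "$colour"

# $$ keeps the main shell's process ID even inside a subshell,
# so comparing it cannot tell a subshell apart from the main shell.
main_pid=$$
sub_pid=$( printf '%s' "$$" )
if [ "$main_pid" = "$sub_pid" ]; then
    printf 'the value of $$ is the same in both: %s\n' "$main_pid"
fi
```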

isset

isset checks if a variable, shell function or option is set, or has certain attributes. Usage:

  • isset varname: Check if a variable is set.
  • isset -v varname: Same as above.
  • isset -x varname: Check if variable is exported.
  • isset -r varname: Check if variable is read-only.
  • isset -f funcname: Check if a shell function is set.
  • isset -optionletter (e.g. isset -C): Check if shell option is set.
  • isset -o optionname: Check if shell option is set by long name.

Exit status: 0 if the item is set; 1 if not; 2 if the argument is not recognised as a valid identifier. Unlike most other modernish commands, isset does not treat an invalid identifier as a fatal error.

When checking a shell option, a nonexistent shell option is not an error, but returns the same result as an unset shell option. (To check if a shell option exists, use thisshellhas.)

Note: just isset -f checks if shell option -f (a.k.a. -o noglob) is set, but with an extra argument, it checks if a shell function is set. Similarly, isset -x checks if shell option -x (a.k.a. -o xtrace) is set, but isset -x varname checks if a variable is exported. If you use unquoted variable expansions here, make sure they're not empty, or the shell's empty removal mechanism will cause the wrong thing to be checked (even in the safe mode).
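The simplest case, checking whether a variable is set, together with the 0/1/2 exit status convention, can be approximated in plain POSIX sh. The function var_is_set is a hypothetical sketch and not modernish's isset; it covers variables only and uses ASCII ranges on the assumption of a C-like locale.

```shell
# Hypothetical helper, not modernish's isset.
# Exit status: 0 = set, 1 = unset, 2 = not a valid portable variable name.
var_is_set() {
    case $1 in
    ( '' | [0-9]* | *[!A-Za-z0-9_]* )
        return 2 ;;
    esac
    # ${name+x} expands to "x" if name is set (even if empty), otherwise to nothing.
    # The name was validated above, so passing it to eval is safe.
    eval "[ -n \"\${$1+x}\" ]"
}

empty_var=''
unset -v missing_var
var_is_set empty_var   && printf 'empty_var is set (although empty)\n'
var_is_set missing_var || printf 'missing_var: status %s\n' "$?"
var_is_set 'bad name'  || printf 'bad name: status %s\n' "$?"
```

The distinction between an empty and an unset variable is the reason for using ${name+x} rather than a plain emptiness check.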

setstatus

setstatus manually sets the exit status $? to the desired value. The function exits with the status indicated. This is useful in conditional constructs if you want to prepare a particular exit status for a subsequent exit or return command to inherit under certain circumstances. The status argument is parsed as a shell arithmetic expression. A negative value is treated as a fatal error. The behaviour of values greater than 255 is not standardised and depends on your particular shell.
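The idea of preparing a status for a later bare return to inherit can be sketched in plain POSIX sh. The functions set_status and check_disk are hypothetical; set_status is a simplified stand-in that omits the negative-value check described above.

```shell
# Hypothetical stand-in, not modernish's setstatus.
set_status() {
    return "$(( $1 ))"      # evaluate the argument arithmetically and return it
}

# Hypothetical example function that prepares its exit status conditionally.
check_disk() {
    if [ "$1" = full ]; then
        set_status 2+1      # prepare status 3 ...
    else
        set_status 0
    fi
    return                  # ... which the bare `return` then passes on
}

check_disk full || printf 'check_disk returned %s\n' "$?"
check_disk ok   && printf 'check_disk succeeded\n'
```

A return without an operand uses the status of the last command executed, which here is whatever set_status left behind.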

Testing numbers, strings and files

The test/[ command is the bane of casual shell scripters. Even advanced shell programmers are frequently caught unaware by one of the many pitfalls of its arcane, hackish syntax. It attempts to look like shell grammar without being shell grammar, causing myriad problems (1, 2). Its -a, -o, ( and ) operators are inherently and fatally broken as there is no way to reliably distinguish operators from operands, so POSIX deprecates their use; however, most manual pages do not include this essential information, and even the few that do will not tell you what to do instead.
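One of the best-known pitfalls of this kind is easy to reproduce in plain POSIX sh. The sketch below uses only the standard [ command and a hypothetical variable; it shows how an unquoted empty expansion silently changes the meaning of a test.

```shell
# Plain POSIX sh: an unquoted empty variable flips the result of [ -n ... ].
empty=''

if [ -n $empty ]; then       # after empty removal this is just: [ -n ]
    printf 'unquoted: [ claims the string is NOT empty\n'
fi

if [ -n "$empty" ]; then
    printf 'quoted: not empty\n'
else
    printf 'quoted: empty, as expected\n'
fi
```

With a single remaining argument, [ merely checks that the argument itself (here the string -n) is non-empty, so the first test succeeds even though the variable is empty.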

Ksh, zsh and bash offer a [[ alternative that fixes many of these problems, as it is integrated into the shell grammar. Nevertheless, it increases confusion, as entirely different grammar and quoting rules apply within [[...]] than outside it, yet many scripts end up using them interchangeably. It is also not available on all POSIX shells. (To make matters worse, Busybox ash has a false-friend [[ that is just an alias of [, with none of the shell grammar integration!)

Finally, the POSIX test/[ command is incompatible with the modernish "safe mode" which aims to eliminate most of the need to quote variables. See use safe for more information.

Modernish deprecates test/[ and [[ completely. Instead, it offers a comprehensive alternative command design that works with the usual shell grammar in a safer way while offering various feature enhancements. The following replacements are available:

Integer number arithmetic tests and operations

To test if a string is a valid number in shell syntax, str isint is available. See String tests.

The arithmetic command let

An implementation of let as in ksh, bash and zsh is now available to all POSIX shells. This makes C-style signed integer arithmetic evaluation available to every supported shell, with the exception of the unary ++ and -- operators (which are a nonstandard shell capability detected by modernish under the ID of ARITHPP).

This means let should be used for operations and tests, e.g. both let "x=5" and if let "x==5"; then ... are supported (note: single = for assignment, double == for comparison). See POSIX 2.6.4 Arithmetic Expansion for more information on the supported operators.

Multiple expressions are supported, one per argument. The exit status of let is zero (the shell's idea of success/true) if the last expression argument evaluates to non-zero (the arithmetic idea of true), and 1 otherwise.

It is recommended to adopt the habit to quote each let expression with "double quotes", as this consistently makes everything work as expected: double quotes protect operators that would otherwise be misinterpreted as shell grammar, while shell expansions starting with $ continue to work.
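The exit status rule (arithmetic non-zero means shell success) is the part that most often surprises. It can be illustrated with a small stand-in built on POSIX $(( )) arithmetic expansion. The function my_let is hypothetical and is not modernish's let; it does no validation or error handling.

```shell
# Hypothetical stand-in for let, built on POSIX $(( )); not modernish's code.
my_let() {
    _value=0
    for _expr in "$@"; do
        _value=$(( $_expr ))     # evaluate each expression argument in turn
    done
    [ "$_value" -ne 0 ]          # status 0 only if the last value is non-zero
}

my_let "x = 5" "y = x * 2" && printf 'x=%s y=%s\n' "$x" "$y"
my_let "y == 10"           && printf 'y equals 10\n'
my_let "y - 10"            || printf 'y - 10 is zero, so the status is %s\n' "$?"
```

Each expression is double-quoted, following the recommendation above, so that characters such as * and = reach the arithmetic evaluator unchanged.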

Arithmetic shortcuts

Various handy functions that make common arithmetic operations and comparisons easier to program are available from the var/arith module.

String and file tests

The following notes apply to all commands described in the subsections of this section:

  1. "True" is understood to mean exit status 0, and "false" is understood to mean a non-zero exit status – specifically 1.
  2. Passing more than the number of arguments specified for each command is a fatal error. (If the safe mode is not used, excessive arguments may be generated accidentally if you forget to quote a variable. The test result would have been wrong anyway, so modernish kills the program immediately, which makes the problem much easier to trace.)
  3. Passing fewer than the number of arguments specified to the command is assumed to be the result of removal of an empty unquoted expansion. Where possible, this is not treated as an error, and an exit status corresponding to the omitted argument(s) being empty is returned instead. (This helps make the safe mode possible; unlike with test/[, paranoid quoting to avoid empty removal is not needed.)

String tests

The str function offers various operators for tests on strings. For example, str in $foo "bar" tests if the variable foo contains "bar".

The str function takes unary (one-argument) operators that check a property of a single word, binary (two-argument) operators that check a word against a pattern, as well as an option that makes binary operators check multiple words against a pattern.

Unary string tests

Usage: str operator [ word ]

The word is checked for the property indicated by operator; if the result is true, str returns status 0, otherwise it returns status 1.

The available unary string test operators are:

  • empty: The word is empty.
  • isint: The word is a decimal, octal or hexadecimal integer number in valid POSIX shell syntax, safe to use with let, $((...)) and other arithmetic contexts on all POSIX-derived shells. This operator ignores leading (but not trailing) spaces and tabs.
  • isvarname: The word is a valid portable shell variable or function name.

If word is omitted, it is treated as empty, on the assumption that it is an unquoted empty variable. Passing more than one argument after the operator is a fatal error.

Binary string matching tests

Usage: str operator [ [ word ] pattern ]

The word is compared to the pattern according to the operator; if it matches, str returns status 0, otherwise it returns status 1. The available binary matching operators are:

  • eq: word is equal to pattern.
  • ne: word is not equal to pattern.
  • in: word includes pattern.
  • begin: word begins with pattern.
  • end: word ends with pattern.
  • match: word matches pattern as a shell glob pattern (as in the shell's native case construct). A pattern that ends in an unescaped backslash is considered invalid and causes str to return status 2.
  • ematch: word matches pattern as a POSIX extended regular expression. An empty pattern is a fatal error. (In UTF-8 locales, check if thisshellhas WRN_EREMBYTE before matching multi-byte characters.)
  • lt: word lexically sorts before (is 'less than') pattern.
  • le: word is lexically 'less than or equal to' pattern.
  • gt: word lexically sorts after (is 'greater than') pattern.
  • ge: word is lexically 'greater than or equal to' pattern.

If word is omitted, it is treated as empty on the assumption that it is an unquoted empty variable, and the single remaining argument is assumed to be the pattern. Similarly, if both word and pattern are omitted, an empty word is matched against an empty pattern. Passing more than two arguments after the operator is a fatal error.

Multi-matching option

Usage: str -M operator [ [ word ... ] pattern ]

The -M option causes str to compare any number of words to the pattern. The available operators are the same as the binary string matching operators listed above.

All matching words are stored in the REPLY variable, separated by newline characters ($CCn) if there is more than one match. If no words match, REPLY is unset.

The exit status returned by str -M is as follows:

  • If no words match, the exit status is 1.
  • If one word matches, the exit status is 0.
  • If between two and 254 words match, the exit status is the number of matches.
  • If 255 or more words match, the exit status is 255.

Usage example: the following matches a given GNU-style long-form command line option $1 against a series of available options. To make it possible for the options to be abbreviated, we check if any of the options begin with the given argument $1.

if str -M begin --fee --fi --fo --fum --foo --bar --baz --quux "$1"; then
     putln "OK. The given option $1 matched $REPLY"
else
     case $? in
     ( 1 )  putln "No such option: $1" >&2 ;;
     ( * )  putln "Ambiguous option: $1" "Did you mean:" "$REPLY" >&2 ;;
     esac
fi
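
The exit status rules for str -M can be modelled in plain POSIX sh. The following is only a sketch of the behaviour described above for the begin operator, not modernish's own implementation; the names multi_begin, _pat, _n and _nl are hypothetical, and the sketch assumes at least one argument (the pattern) is given.

```sh
# A newline, used to separate multiple matches in REPLY.
_nl='
'

# multi_begin WORD... PATTERN
# Hypothetical stand-in for the described behaviour of 'str -M begin'.
multi_begin() {
	# The pattern is the last argument; everything before it is a word.
	eval "_pat=\${$#}"
	unset -v REPLY
	_n=0
	while [ "$#" -gt 1 ]; do
		case $1 in
		( "$_pat"* )
			# Append with a newline separator only if REPLY is already set.
			REPLY=${REPLY+$REPLY$_nl}$1
			_n=$(( _n + 1 )) ;;
		esac
		shift
	done
	# Map the match count to the documented exit status.
	if [ "$_n" -eq 0 ]; then
		return 1
	elif [ "$_n" -eq 1 ]; then
		return 0
	elif [ "$_n" -ge 255 ]; then
		return 255
	else
		return "$_n"
	fi
}
```

With this sketch, multi_begin --fee --fi --fo --fum --foo --bar --baz --quux --fo returns status 2 and leaves the two matches --fo and --foo in REPLY, which is the 'ambiguous option' case of the usage example.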

File type tests

These avoid the snags with symlinks you get with [ and [[. By default, symlinks are not followed. Add -L to operate on files pointed to by symlinks instead of symlinks themselves (the -L makes no difference if the operands are not symlinks).

These commands all take one argument. If the argument is absent, they return false. More than one argument is a fatal error. See notes 1-3 in the parent section.

is present file: Returns true if the file is present in the file system (even if it is a broken symlink).

is -L present file: Returns true if the file is present in the file system and is not a broken symlink.

is sym file: Returns true if the file is a symbolic link (symlink).

is -L sym file: Returns true if the file is a non-broken symlink, i.e. a symlink that points (either directly or indirectly via other symlinks) to a non-symlink file that is present in the file system.

is reg file: Returns true if file is a regular data file.

is -L reg file: Returns true if file is either a regular data file or a symlink pointing (either directly or indirectly via other symlinks) to a regular data file.

Other commands are available that work exactly like is reg and is -L reg but test for other file types. To test for them, replace reg with one of:

  • dir for a directory
  • fifo for a named pipe (FIFO)
  • socket for a socket
  • blockspecial for a block special file
  • charspecial for a character special file

File comparison tests

The following notes apply to these commands:

  • Symlinks are not resolved/followed by default. To operate on files pointed to by symlinks, add -L before the operator argument, e.g. is -L newer.
  • Omitting any argument is a fatal error, because no empty argument (removed or otherwise) would make sense for these commands.

is newer file1 file2: Compares file timestamps, returning true if file1 is newer than file2. Also returns true if file1 exists, but file2 does not; this is consistent for all shells (unlike test file1 -nt file2).

is older file1 file2: Compares file timestamps, returning true if file1 is older than file2. Also returns true if file1 does not exist, but file2 does; this is consistent for all shells (unlike test file1 -ot file2).

is samefile file1 file2: Returns true if file1 and file2 are the same file (hardlinks).

is onsamefs file1 file2: Returns true if file1 and file2 are on the same file system. If any non-regular, non-directory files are specified, their parent directory is tested instead of the file itself.

File status tests

These always follow symlinks.

is nonempty file: Returns true if the file exists, is not a broken symlink, and is not empty. Unlike [ -s file ], this also works for directories, as long as you have read permission in them.

is setuid file: Returns true if the file has its set-user-ID flag set.

is setgid file: Returns true if the file has its set-group-ID flag set.

I/O tests

is onterminal FD: Returns true if file descriptor FD is associated with a terminal. The FD may be a non-negative integer number or one of the special identifiers stdin, stdout and stderr which are equivalent to 0, 1, and 2. For instance, is onterminal stdout returns true if commands that write to standard output (FD 1), such as putln, would write to the terminal, and false if the output is redirected to a file or pipeline.

File permission tests

Any symlinks given are resolved, as these tests would be meaningless for a symlink itself.

can read file: True if the file's permission bits indicate that you can read the file, i.e., if an r bit is set and applies to your user.

can write file: True if the file's permission bits indicate that you can write to the file: for non-directories, if a w bit is set and applies to your user; for directories, both w and x.

can exec file: True if the file's type and permission bits indicate that you can execute the file: for regular files, if an x bit is set and applies to your user; for other file types, never.

can traverse file: True if the file is a directory and its permission bits indicate that a path can traverse through it to reach its subdirectories: for directories, if an x bit is set and applies to your user; for other file types, never.

The stack

In modernish, every variable and shell option gets its own stack. Arbitrary values/states can be pushed onto the stack and popped off it in reverse order. For variables, both the value and the set/unset state are (re)stored.

Usage:

  • push [ --key=value ] item [ item ... ]
  • pop [ --keepstatus ] [ --key=value ] item [ item ... ]

where item is a valid portable variable name, a short-form shell option (dash plus letter), or a long-form shell option (-o followed by an option name, as two arguments).

Before pushing or popping anything, both functions check if all the given arguments are valid, and pop checks that all items have a non-empty stack. This allows pushing and popping groups of items with a check for the integrity of the entire group. pop exits with status 0 if all items were popped successfully, and with status 1 if one or more of the given items could not be popped (and no action was taken at all).

The --key= option is an advanced feature that can help different modules or functions to use the same variable stack safely. If a key is given to push, then for each item, the given key value is stored along with the variable's value for that position in the stack. Subsequently, restoring that value with pop will only succeed if the key option with the same key value is given to the pop invocation. Similarly, popping a keyless value only succeeds if no key is given to pop. If there is any key mismatch, no changes are made and pop returns status 2. Note that this is a robustness/convenience feature, not a security feature; the keys are not hidden in any way.

If the --keepstatus option is given, pop will exit with the exit status of the command executed immediately prior to calling pop. This can avoid the need for awkward workarounds when restoring variables or shell options at the end of a function. However, note that this makes failure to pop (stack empty or key mismatch) a fatal error that kills the program, as pop no longer has a way to communicate this through its exit status.
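
To illustrate why the set/unset state has to be recorded alongside the value, here is a minimal plain POSIX sh sketch that saves and restores a single variable. It is a one-level illustration only, not a stack and not how modernish implements push and pop; save_var, restore_var and the _saved_* variables are hypothetical names.

```sh
# save_var NAME: remember the value of NAME and whether it was set at all.
save_var() {
	# ${NAME+s} expands to "s" only if NAME is set, even if it is empty.
	eval "_saved_set_$1=\${$1+s}; _saved_val_$1=\${$1-}"
}

# restore_var NAME: put NAME back exactly as save_var found it.
restore_var() {
	if eval "[ -n \"\${_saved_set_$1}\" ]"; then
		eval "$1=\${_saved_val_$1}"
	else
		# The variable was unset when saved, so unset it again
		# rather than leaving it set to an empty value.
		unset -v "$1"
	fi
}
```

Without the saved set/unset flag, a variable that was originally unset would come back as set-but-empty, which matters to code that uses expansions such as ${var+s} or runs under set -u.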

The shell options stack

push and pop allow saving and restoring the state of any shell option available to the set builtin. The precise shell options supported (other than the ones guaranteed by POSIX) depend on the shell modernish is running on. To facilitate portability, nonexistent shell options are treated as unset.

Long-form shell options are matched to their equivalent short-form shell options, if they exist. For instance, on all POSIX shells, -f is equivalent to -o noglob, and push -o noglob followed by pop -f works correctly. This also works for shell-specific short & long option equivalents.
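
The equivalence of -f and -o noglob can be observed in plain POSIX sh through the $- parameter, which lists the currently active single-letter options. This small sketch only demonstrates the shell behaviour that the matching relies on; the variable names are made up for the example.

```sh
# Turn the option on using its long form...
set -o noglob
case $- in
( *f* )	after_long_on=set ;;
( * )	after_long_on=unset ;;
esac

# ...and off again using its short form.
set +f
case $- in
( *f* )	after_short_off=set ;;
( * )	after_short_off=unset ;;
esac
```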

On shells with a dynamic no option name prefix, that is on ksh, zsh and yash (where, for example, noglob is the opposite of glob), the no prefix is ignored, so something like push -o glob followed by pop -o noglob does the right thing. But this depends on the shell and should never be used in portable scripts.

The trap stack

Modernish can also make traps stack-based, so that each program component or library module can set its own trap commands without interfering with others. This functionality is provided by the var/stack/trap module.

Modules

As modularity is one of modernish's design principles, much of its essential functionality is provided in the form of loadable modules, so the core library is kept lean. Modules are organised hierarchically, with names such as safe, var/loop and sys/cmd/harden. The use command loads and initialises a module or a combined directory of modules.

Internally, modules exist in files with the name extension .mm in subdirectories of lib/modernish/mdl; for example, the module var/stack/trap corresponds to the file lib/modernish/mdl/var/stack/trap.mm.

Usage:

  • use modulename [ argument ... ]
  • use [ -q | -e ] modulename
  • use -l

The first form loads and initialises a module. All arguments, including the module name, are passed on to the dot script unmodified, so modules know their own name and can implement option parsing to influence their initialisation. See also Two basic forms of a modernish program for information on how to use modules in portable-form scripts.

In the second form, the -q option queries if a module is loaded, and the -e option queries if a module exists. use returns status 0 for yes, 1 for no, and 2 if the module name is invalid.

The -l option lists all currently loaded modules in the order in which they were originally loaded. Just add | sort for alphabetical order.

If a directory of modules, such as sys/cmd or even just sys, is given as the modulename, then all the modules in that directory and any subdirectories are loaded recursively. In this case, passing extra arguments is a fatal error.

If a module file X.mm exists along with a directory X, resolving to the same modulename, then use will load the X.mm module file without automatically loading any modules in the X directory, because it is expected that X.mm handles the submodules in X manually. (This is currently the case for var/loop, which auto-loads submodules containing loop types on first use.)

The complete lib/modernish/mdl directory path, which depends on where modernish is installed, is stored in the system constant $MSH_MDL.

The following subchapters document the modules that come with modernish.

use safe

The safe module sets the 'safe mode' for the shell. It removes most of the need to quote variables, parameter expansions, command substitutions, or glob patterns. It uses shell settings and modernish library functionality to secure and demystify split and glob mechanisms. This creates a new and safer way of shell script programming, essentially building a new shell language dialect while still running on all POSIX-compliant shells.

Why the safe mode?

One of the most common headaches with shell scripting is caused by a fundamental flaw in the shell as a scripting language: constantly active field splitting (a.k.a. word splitting) and pathname expansion (a.k.a. globbing). To cope with this situation, it is hammered into programmers of shell scripts to be absolutely paranoid about properly quoting nearly everything, including variable and parameter expansions, command substitutions, and patterns passed to commands like find.

These mechanisms were designed for interactive command line usage, where they do come in very handy. But when the shell language is used as a programming language, splitting and globbing often end up being applied unexpectedly to unquoted expansions and command substitutions, helping cause thousands of buggy, brittle, or outright dangerous shell scripts.

One could blame the programmer for forgetting to quote an expansion properly, or one could blame a pitfall-ridden scripting language design where hammering punctilious and counterintuitive habits into casual shell script programmers is necessary. Modernish does the latter, then fixes it.

How the safe mode works

Every POSIX shell comes with a little-used ability to disable global field splitting and pathname expansion: IFS=''; set -f. An empty IFS variable disables split; the -f (or -o noglob) shell option disables pathname expansion. The safe mode sets these, and two others (see below).
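
The effect of these two settings can be tried in plain POSIX sh, without modernish. In this sketch, count_args and the variable names are made up for the example, and the first count assumes the shell starts with its default IFS. The settings are applied inside command substitutions so that they do not leak into the rest of the script.

```sh
# Print the number of arguments received.
count_args() { echo "$#"; }

words='one two three'
pattern='*'

# Default settings: the unquoted expansion is split into three fields.
# (An unquoted $pattern would additionally be expanded to file names.)
legacy_count=$(count_args $words)

# With IFS emptied and globbing off, the same unquoted expansions
# each stay a single, literal argument.
safe_count=$(IFS=''; set -f; count_args $words)
safe_pattern=$(IFS=''; set -f; printf '%s' $pattern)
```

Here legacy_count is 3, safe_count is 1, and safe_pattern holds a literal asterisk rather than a list of file names.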

The reason these safer settings are hardly ever used is that they are not practical to use with the standard shell language. For instance, for textfile in *.txt, or for item in $(some command) which both (!) field-splits and pathname-expands the output of a command, all break.

However, that is where modernish comes in. It introduces several powerful new loop constructs, as well as arbitrary code blocks with local settings, each of which has straightforward, intuitive operators for safely applying field splitting or pathname expansion, to specific command arguments only. By default, they are not both applied to the arguments, which is much safer. And your script code as a whole is kept safe from them at all times.

With global field splitting and pathname expansion removed, a third issue still affects the safe mode: the shell's empty removal mechanism. If the value of an unquoted expansion like $var is empty, it will not expand to an empty argument, but will be removed altogether, as if it were never there. This behaviour cannot be disabled.

Thankfully, the vast majority of shell and Un*x commands order their arguments in a way that is actually designed with empty removal in mind, making it a good thing. For instance, when doing ls $option some_dir, if $option is -l the listing will be long-format and if it is empty it will be removed, which is the desired behaviour. (An empty argument there would cause an error.)
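
Empty removal is easy to observe in plain POSIX sh by counting the arguments a command receives. The helper count_args and the variable names below are made up for this sketch.

```sh
# Print the number of arguments received.
count_args() { echo "$#"; }

option=''

# Unquoted and empty: the expansion is removed, leaving one argument.
removed=$(count_args $option some_dir)

# Quoted and empty: an empty argument is passed, so there are two.
kept=$(count_args "$option" some_dir)
```

So removed is 1 and kept is 2; quoting is the 'very occasional' way to stop empty removal when an empty argument really is wanted.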

However, one command that is used in almost all shell scripts, test/[, is completely unable to cope with empty removal due to its idiosyncratic and counterintuitive syntax. Potentially empty operands come before options, so operands removed as empty expansions cause errors or, worse, false positives. Thus, the safe mode does not remove the need for paranoid quoting of expansions used with test/[ commands. Modernish fixes this issue by deprecating test/[ completely and offering a safe command design to use instead, which correctly deals with empty removal.
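
A classic false positive of this kind can be reproduced in plain POSIX sh. The variable names are made up for the sketch.

```sh
var=''

# Unquoted: $var is removed, leaving '[ -n ]'. With a single argument,
# [ only checks that the string '-n' itself is non-empty, so it succeeds.
if [ -n $var ]; then unquoted=true; else unquoted=false; fi

# Quoted: '[ -n "" ]' correctly reports that the string is empty.
if [ -n "$var" ]; then quoted=true; else quoted=false; fi
```

The unquoted test claims the empty variable is non-empty (unquoted is true), while the quoted one gives the right answer (quoted is false).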

With the 'safe mode' shell settings, plus the safe, explicit and readable split and glob operators and test/[ replacements, the only quoting requirements left are:

  1. a very occasional need to stop empty removal from happening;
  2. to quote "$@" and "$*" until shell bugs are fixed (see notes below).

In addition to the above, the safe mode also sets these shell options:

  • set -C (set -o noclobber) to prevent accidentally overwriting files using output redirection. To force overwrite, use >| instead of >.
  • set -u (set -o nounset) to make it an error to use unset (that is, uninitialised) variables by default. You'll notice this will catch many typos before they cause you hard-to-trace problems. To bypass the check for a specific variable, use ${var-} instead of $var (be careful).
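
Both options are standard and can be tried in plain POSIX sh. The sketch below assumes a mktemp utility that supports -d (not specified by POSIX, but widely available); all variable names are made up for the example. The failing commands are run in subshells so that the error does not end the script.

```sh
tmpdir=$(mktemp -d)
file=$tmpdir/data
echo first > "$file"

set -C
# With noclobber, a plain '>' refuses to overwrite an existing file.
if (echo second > "$file") 2>/dev/null; then clobbered=yes; else clobbered=no; fi
# '>|' explicitly overrides noclobber.
echo third >| "$file"
content=$(cat "$file")
set +C

set -u
unset -v maybe
# Expanding an unset variable is now an error...
if (: "$maybe") 2>/dev/null; then unset_ok=yes; else unset_ok=no; fi
# ...unless the check is bypassed with a default-value expansion.
value=${maybe-}
set +u

rm -r "$tmpdir"
```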

Important notes for safe mode

  • The safe mode is not compatible with existing conventional shell scripts, written in what we could now call the 'legacy mode'. Essentially, the safe mode is a new way of shell script programming. That is why it is not enabled by default, but activated by loading the safe module. It is highly recommended that new modernish scripts start out with use safe.
  • The shell applies entirely different quoting rules to string matching glob patterns within case constructs. The safe mode changes nothing here.
  • Due to shell bugs ID'ed as BUG_PP_*, the positional parameters expansions $@ and $* should still always be quoted. As of late 2018, these bugs have been fixed in the latest or upcoming release versions of all supported shells. But, until buggy versions fall out of use and modernish no longer supports any BUG_PP_* shell bugs, quoting "$@" and "$*" remains mandatory even in safe mode (unless you know with certainty that your script will be used on a shell with none of these bugs).
  • The behaviour of "$*" changes in safe mode. It uses the first character of $IFS as the separator for combining all positional parameters into one string. Since IFS is emptied in safe mode, there is no separator, so it will string them together unseparated. You can use something like push IFS; IFS=' '; var="$*"; pop IFS or LOCAL IFS=' '; BEGIN var="$*"; END to use the space character as a separator. (If you're outputting the positional parameters, note that the put command always separates its arguments by spaces, so you can safely pass it multiple arguments with "$@" instead.)
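
The dependence of "$*" on the first character of IFS is standard shell behaviour and can be checked in plain POSIX sh. In this sketch the variable names are made up, the first assignment assumes the shell's default IFS, and the IFS changes are confined to command substitutions.

```sh
set -- alpha beta gamma

# Default IFS: the first character is a space.
joined_default="$*"

# Empty IFS, as in safe mode: the parameters are joined with no separator.
joined_empty=$(IFS=''; printf '%s' "$*")

# Explicit separator: only the first character of IFS is used.
joined_comma=$(IFS=',;'; printf '%s' "$*")
```

This gives "alpha beta gamma", "alphabetagamma" and "alpha,beta,gamma" respectively.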

Extra options for the safe mode

Usage: use safe [ -k | -K ] [ -i ]

The -k and -K module options install an extra handler that reliably kills the program if it tries to execute a command that is not found, on shells that have the ability to catch and handle 'command not found' errors (currently bash, yash, and zsh). This helps catch typos, forgetting to load a module, etc., and stops your program from continuing in an inconsistent state and potentially causing damage. The MSH_NOT_FOUND_OK variable may be set to temporarily disable this check. The uppercase -K module option aborts the program on shells that cannot handle 'command not found' errors (so should not be used for portable scripts), whereas the lowercase -k variant is ignored on such shells.

If the -i option is given, or the shell is interactive, two extra one-letter functions are loaded, s and g. These are pre-command modifiers for use when split and glob are globally disabled; they allow running a single command with local split and glob applied to that command's arguments only. They also have some options designed to manipulate, examine, save, restore, and generally experiment with the global split and glob state on interactive shells. Type s --help and g --help for more information. In general, the safe mode is designed for scripts and is not recommended for interactive shells.

use var/loop

The var/loop module provides an innovative, robust and extensible shell loop construct. Several powerful loop types are provided, while advanced shell programmers may find it easy and fun to create their own. This construct is also ideal for the safe mode: the for, select and find loop types allow you to selectively apply field splitting and/or pathname expansion to specific arguments without subjecting a single line of your code to them.

The basic form is a bit different from native shell loops. Note the caps:
LOOP looptype arguments; DO
     your commands here
DONE

The familiar do...done block syntax cannot be used because the shell will not allow modernish to add its own functionality to it. The DO...DONE block does behave in the same way as do...done: you can append redirections at the end, pipe commands into a loop, etc. as usual. The break and continue shell builtin commands also work as normal.

Remember: using lowercase do...done with modernish LOOP will cause the shell to throw a misleading syntax error. So will using uppercase DO...DONE with the shell's native loops. To help you remember to use the uppercase variants for modernish loops, the LOOP keyword itself is also in capitals.

Loops exist in submodules of var/loop named after the loop type; for instance, the find loop lives in the var/loop/find module. However, the core var/loop module will automatically load a loop type's module when that loop is first used, so use-ing individual loop submodules at your script's startup time is optional.

The LOOP block internally uses file descriptor 8 to do its thing. If your script happens to use FD 8 for other purposes, you should know that FD 8 is made local to each loop block, and always appears initially closed within DO...DONE.

Simple repeat loop

This simply iterates the loop the number of times indicated. Before the first iteration, the argument is evaluated as a shell integer arithmetic expression as in let, and its value is used as the number of iterations.

LOOP repeat 3; DO
     putln "This line is repeated 3 times."
DONE

BASIC-style arithmetic for loop

This is a slightly enhanced version of the FOR loop in BASIC. It is more versatile than the repeat loop but still very easy to use.

LOOP for varname=initial to limit [ step increment ]; DO
     some commands
DONE

To count from 1 to 20 in steps of 2:

LOOP for i=1 to 20 step 2; DO
     putln "$i"
DONE

Note the varname=initial needs to be one argument as in a shell assignment (so no spaces around the =).

If "step increment" is omitted, increment defaults to 1 if limit is equal to or greater than initial, or to -1 if limit is less than initial (so counting backwards 'just works').

Technically precise description: On entry, the initial, limit and increment values are evaluated once as shell arithmetic expressions as in let, the value of initial is assigned to varname, and the loop iterates. Before every subsequent iteration, the value of increment (as determined on the first iteration) is added to the value of varname, then the limit expression is re-evaluated; as long as the current value of varname is less (if increment is non-negative) or greater (if increment is negative) than or equal to the current value of limit, the loop reiterates.
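
For readers who think in native shell terms, the counting behaviour can be approximated with a plain POSIX sh while loop. This is a rough sketch for the common case where the initial value has not already passed the limit, not modernish's implementation; count_by and its variables are hypothetical names, and it simply prints each value.

```sh
# count_by INITIAL LIMIT [INCREMENT]
# Hypothetical plain-sh approximation of the BASIC-style counting loop.
count_by() {
	i=$(( $1 ))
	if [ "$#" -ge 3 ]; then
		inc=$(( $3 ))
	elif [ "$(( $2 ))" -ge "$i" ]; then
		inc=1		# default: count upwards
	else
		inc=-1		# default: count backwards
	fi
	while
		# The limit expression is re-evaluated on every check.
		if [ "$inc" -ge 0 ]; then
			[ "$i" -le "$(( $2 ))" ]
		else
			[ "$i" -ge "$(( $2 ))" ]
		fi
	do
		printf '%s\n' "$i"
		i=$(( i + inc ))
	done
}
```

For example, count_by 1 20 2 prints the odd numbers from 1 to 19, and count_by 3 1 prints 3, 2, 1.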

C-style arithmetic for loop

A C-style for loop akin to for (( )) in ksh93, bash and zsh is now available on all POSIX-compliant shells, with a slightly different syntax. The one loop argument contains three arithmetic expressions (as in let), separated by semicolons within that argument. The first is only evaluated before the first iteration, so is typically used to assign an initial value. The second is evaluated before each iteration to check whether to continue the loop, so it typically contains some comparison operator. The third is evaluated before the second and further iterations, and typically increases or decreases a value. For example, to count from 1 to 10:

LOOP for "i=1; i<=10; i+=1"; DO
     putln "$i"
DONE

However, using complex expressions allows doing much more powerful things. Any or all of the three expressions may also be left empty (with their separating ; character remaining). If the second expression is empty, it defaults to 1, creating an infinite loop.

(Note that ++i and i++ can only be used on shells with ARITHPP, but i+=1 or i=i+1 can be used on all POSIX-compliant shells.)
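
The role of the three expressions can be seen by writing the same count-to-10 loop by hand in plain POSIX sh, with each expression placed where the description above says it is evaluated. This is an illustrative sketch only; the variable total is added just to give the loop body something to do.

```sh
total=0

# First expression: evaluated once, before the first iteration.
: "$(( i = 1 ))"

# Second expression: evaluated before each iteration; non-zero means continue.
while [ "$(( i <= 10 ))" -ne 0 ]; do
	total=$(( total + i ))

	# Third expression: evaluated before the second and further iterations.
	: "$(( i += 1 ))"
done
```

After the loop, total is 55 and i is 11.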

Enumerative for/select loop with safe split/glob

The enumerative for and select loop types mirror those already present in native shell implementations. However, the modernish versions provide safe field splitting and globbing (pathname expansion) functionality that can be used without globally enabling split or glob for any of your code, which is ideal for the safe mode. They also add a unique operator for processing text in fixed-size slices. The select loop type brings select functionality to all POSIX shells and not just ksh, zsh and bash.

Usage:

LOOP [ for | select ] [ operators ] varname in argument ... ; DO commands ; DONE

Simple usage example:

LOOP select --glob textfile in *.txt; DO
     putln "You chose text file $textfile."
DONE

If the loop type is for, the loop iterates once for each argument, storing it in the variable named varname.

If the loop type is select, the loop presents before each iteration a numbered menu that allows the user to select one of the arguments. The prompt from the PS3 variable is displayed and a reply read from standard input. The literal reply is stored in the REPLY variable. If the reply was a number corresponding to an argument in the menu, that argument is stored in the variable named varname. Then the loop iterates. If the user enters ^D (end of file), REPLY is cleared and the loop breaks with an exit status of 1. (To break the menu loop under other conditions, use the break command.)

The following operators are supported. Note that the split and glob operators are only for use in the safe mode.

  • One of --split or --split=characters. This operator safely applies the shell's field splitting mechanism to the arguments given. The simple --split operator applies the shell's default field splitting by space, tab, and newline. If you supply one or more of your own characters to split by, each of these characters will be taken as a field separator if it is whitespace, or field terminator if it is non-whitespace. (Note that shells with QRK_IFSFINAL treat both whitespace and non-whitespace characters as separators.)
  • One of --glob or --fglob. These operators safely apply shell pathname expansion (globbing) to the arguments given. Each argument is taken as a pattern, whether or not it contains any wildcard characters. For any resulting pathname that starts with - or + or is identical to ! or (, ./ is prefixed to keep various commands from misparsing it as an option or operand. Non-matching patterns are treated as follows:
    • --glob: Any non-matching patterns are quietly removed. If none match, the loop will not iterate but break with exit status 103.
    • --fglob: All patterns must match. Any nonexistent path terminates the program. Use this if your program would not work after a non-match.
  • --base=string. This operator prefixes the given string to each of the arguments, after first applying field splitting and/or pathname expansion if specified. If --glob or --fglob are given, then the string is used as a base directory path for pathname expansion, without expanding any wildcard characters in that base directory path itself. If such base directory can't be entered, then if --glob was given, the loop breaks with status 98, or if --fglob was given, the program terminates.
  • One of --slice or --slice=number. This operator divides the arguments in slices of up to number characters. The default slice size is 1 character, allowing for easy character-by-character processing. (Note that shells with WRN_MULTIBYTE will not slice multi-byte characters correctly.)

If multiple operators are given, their mechanisms are applied in the following order: split, glob, base, slice.
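
The ./ prefixing rule described for the glob operators can be expressed in a few lines of plain POSIX sh. The function below is a hypothetical illustration of the rule as stated, not the code modernish uses; safe_path is a made-up name.

```sh
# safe_path PATHNAME
# Print PATHNAME, prefixed with ./ if a command could mistake it for an
# option (leading - or +) or for an expression operator (! or open paren).
safe_path() {
	case $1 in
	( -* | +* | '!' | '(' )
		printf './%s\n' "$1" ;;
	( * )
		printf '%s\n' "$1" ;;
	esac
}
```

For instance, safe_path -rf prints ./-rf, so a file by that name can be passed to a command such as rm without being parsed as options, while safe_path notes.txt prints the name unchanged.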

The find loop

This powerful loop type turns your local POSIX-compliant find utility into a shell loop, safely integrating both find and xargs functionality into the POSIX shell. The infamous pitfalls and limitations of using find and xargs as external commands are gone, as all the results from find are readily available to your main shell script. Any "dangerous" characters in file names (including whitespace and even newlines) "just work", especially if the safe mode is also active. This gives you the flexibility to use either the find expression syntax, or shell commands (including your own shell functions), or some combination of both, to decide whether and how to handle each file found.
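
As a point of comparison, one of the well-known pitfalls of using find as an external command can be reproduced in plain POSIX sh: processing its output line by line miscounts a file whose name contains a newline. The sketch assumes a mktemp utility that supports -d; the variable names are made up for the example.

```sh
tmpdir=$(mktemp -d)
# Two files, one of which has a newline in its name.
touch "$tmpdir/plain" "$tmpdir/two
lines"

# Naive: assumes one output line per file, so the second file counts twice.
naive_count=$(( $(find "$tmpdir" -type f | wc -l) ))

# Robust in plain sh: have find pass the names as arguments to a child shell.
# The processing then happens in that child, not in the main script.
robust_count=$(find "$tmpdir" -type f -exec sh -c 'echo "$#"' sh {} +)

rm -r "$tmpdir"
```

Here naive_count is 3 while robust_count is 2. The -exec workaround is correct, but the files are handled in a separate shell process, which is the kind of limitation the find loop is described as removing.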

Usage:

LOOP find [ options ] varname [ in path ... ] [ find-expression ] ; DO commands ; DONE

LOOP find [ options ] --xargs[=arrayname] [ in path ... ] [ find-expression ] ; DO commands ; DONE

The loop recursively walks down the directory tree for each path given. For each file encountered, it uses the find-expression to decide whether to iterate the loop with the path to the file stored in the variable referenced by varname. The find-expression is a standard find utility expression except as described below.

Any number of paths to search may be specified after the in keyword. By default, a nonexistent path is a fatal error. The entire in clause may be omitted, in which case it defaults to in . so the current working directory will be searched. Any argument that starts with a -, or is identical to ! or (, indicates the end of the paths and the beginning of the find-expression; if you need to explicitly specify a path with such a name, prefix ./ to it.

Except for syntax errors, any errors or warnings issued by find are considered non-fatal and will cause the exit status of the loop to be non-zero, so your script has the opportunity to handle the exception.

Available options

  • Any single-letter options supported by your local find utility. Note that POSIX specifies -H and -L only, so portable scripts should only use these. Options that require arguments (-f on BSD find) are not supported.
  • --xargs. This operator is specified instead of the varname; it is a syntax error to have both. Instead of one iteration per found item, as many items as possible per iteration are stored into the positional parameters (PPs), so your program can access them in the usual way (using "$@" and friends). Note that --xargs therefore overwrites the current PPs (however, a shell function or LOCAL block will give you local PPs). Modernish clears the PPs upon completion of the loop, but if the loop is exited prematurely (such as by break), the last chunk survives.
    • On shells with the KSHARRAY capability, an extra variant is available: --xargs=arrayname which uses the named array instead of the PPs. It otherwise works identically.
  • --try. If this option is specified, then if one of the primaries used in the find-expression is not supported by either the find utility used by the loop or by modernish itself, LOOP find will not throw a fatal error but will instead quietly abort the loop without iterating it, set the loop's exit status to 128, and leave the invalid primary in the REPLY variable. (Expression errors other than 'unknown primary' remain fatal errors.)
  • One of --split or --split=characters. This operator, which is only accepted in the safe mode, safely applies the shell's field splitting mechanism to the path name(s) given (but not to any patterns in the find-expression, which are passed on to the find utility as given). The simple --split operator applies the shell's default field splitting by space, tab, and newline. Alternatively, you can supply one or more characters to split by. If any pathname resulting from the split starts with - or + or is identical to ! or (, ./ is prefixed.
  • One of --glob or --fglob. These operators are only accepted in the safe mode. They safely apply shell pathname expansion (globbing) to the path name(s) given (but not to any patterns in the find-expression, which are passed on to the find utility as given). All path names are taken as patterns, whether or not they contain any wildcard characters. If any pathname resulting from the expansion starts with - or + or is identical to ! or (, ./ is prefixed. Non-matching patterns are treated as follows:
    • --glob: Any pattern not matching an existing path will output a warning to standard error and set the loop's exit status to 103 upon normal completion, even if other existing paths are processed successfully. If none match, the loop will not iterate.
    • --fglob: Any pattern not matching an existing path is a fatal error.
  • --base=basedirectory. This operator prefixes the given basedirectory to each of the path names (and thus to each path found by find), after first applying field splitting and/or pathname expansion if specified. If --glob or --fglob are given, then wildcard characters are only expanded in the path names and not in the prefixed basedirectory. If the basedirectory can't be entered, then either the loop breaks with status 98, or if --fglob was given, the program terminates.

Available find-expression operands

LOOP find can use all expression operands supported by your local find utility; see its manual page. However, portable scripts should use only operands specified by POSIX along with the modernish additions described below.

The modernish -iterate expression primary evaluates as true and causes the loop to iterate, executing your commands for each matching file. It may be used any number of times in the find-expression to start a corresponding series of loop iterations. If it is not given, the loop acts as if the entire find-expression is enclosed in parentheses with -iterate appended. If the entire find-expression is omitted, it defaults to -iterate.

The modernish -ask primary asks the user for confirmation. The text of the prompt may be specified in one optional argument (which cannot start with - or be equal to ! or (). Any occurrences of the characters {} within the prompt text are replaced with the current pathname. If not specified, the default prompt is: "{}"? If the answer is affirmative (y or Y in the POSIX locale), -ask yields true, otherwise false. This can be used to make any part of the expression conditional upon user input, and (unlike commands in the shell loop body) is capable of influencing directory traversal mid-run.

The standard -exec and -ok primaries are integrated into the main shell environment. When used with LOOP find, they can call a shell builtin command or your own shell function directly in the main shell (no subshell). The exit status of that command is used in the find expression as a true/false value capable of influencing directory traversal (for example, when combined with -prune), just as if it were an external command -exec'ed with the standard utility.

Some familiar, easy-to-use but non-standard find operands from GNU and/or BSD may be used with LOOP find on all systems. Before invoking the find utility, modernish translates them internally to portable equivalents. The following expression operands are made portable:

  • The -or, -and and -not operators: same as -o, -a, !.
  • The -true and -false primaries, which always yield true/false.
  • The BSD-style -depth n primary, e.g. -depth +4 yields true on depth greater than 4 (minimum 5), -depth -4 yields true on depth less than 4 (maximum 3), and -depth 4 yields true on a depth of exactly 4.
  • The GNU-style -mindepth and -maxdepth global options. Unlike BSD -depth, these GNU-isms are pseudo-primaries that always yield true and affect the entire LOOP find operation.
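
To give a feel for what a portable equivalent can look like, here is a plain-sh sketch that gets the effect of GNU's -maxdepth 1 using POSIX primaries only. It is an illustration, not necessarily the translation modernish performs internally, and the function name list_top_level_files is hypothetical.

```shell
# Hypothetical helper: list regular files directly inside directory $1,
# using only POSIX find primaries (no -maxdepth).
list_top_level_files() {
    (
        cd "$1" || exit 1
        # '.' itself fails '! -name .', so it is neither pruned nor printed.
        # Every other path passes that test and is pruned, so find never
        # descends below depth 1.
        find . ! -name . -prune -type f -print
    )
}
```

Running list_top_level_files on a directory prints paths such as ./a.txt but nothing from its subdirectories.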

Expression primaries that write output (-print and friends) may be used for debugging or logging the loop. Their output is redirected to standard error.

Picking a find utility

Upon initialisation, the var/loop/find module searches for a POSIX-compliant find utility under various names in $DEFPATH and then in $PATH. To see a trace of the full command lines of utility invocations when the loop runs, set the _loop_DEBUG variable to any value.

For debugging or system-specific usage, it is possible to use a certain find utility in preference to any others on the system. To do this, add an argument to a use var/loop/find command before the first use of the loop. For example:

  • use var/loop/find bsdfind (prefer utility by this name)
  • use var/loop/find /opt/local/bin (look for a utility here first)
  • use var/loop/find /opt/local/bin/gfind (try this one first)

Compatibility mode for obsolete find utilities

Some systems come with obsolete or broken find utilities that don't fully support -exec ... {} + aggregating functionality as specified by POSIX. Normally, this is a fatal error, but passing the -b/-B option to the use command, e.g. use var/loop/find -b, enables a compatibility mode that tolerates this defect. If no compliant find is found, then an obsolete or broken find is used as a last resort, a warning is printed to standard error, and the variable _loop_find_broken is set. The -B option is equivalent to -b but does not print a warning. Loop performance may suffer as modernish adapts to using the older -exec ... {} \; form, which is very inefficient.

Scripts using this compatibility mode should handle their logic using shell code in the loop body as much as possible (after DO) and use only simple find expressions (before DO), as obsolete utilities are often buggy and breakage is likely if complex expressions or advanced features are used.

find loop usage examples

Simple example script: without the safe mode, the *.txt pattern must be quoted to prevent it from being expanded by the shell.

    . modernish
    use var/loop
    LOOP find TextFile in ~/Documents -name '*.txt'
    DO
        putln "Found my text file: $TextFile"
    DONE

Example script with safe mode: the --glob option expands the patterns of the in clause, but not the expression – so it is not necessary to quote any pattern.

    . modernish
    use safe
    use var/loop
    LOOP find --glob lsProg in /*bin /*/*bin -type f -name ls*
    DO
        putln "This command may list something: $lsProg"
    DONE

Example use of the modernish -ask primary: ask the user if they want to descend into each directory found. The shell loop body could skip unwanted results, but cannot physically influence directory traversal, so skipping large directories would take long. A find expression can prevent directory traversal using the standard -prune primary, which can be combined with -ask, so that unwanted directories never iterate the loop in the first place.

    . modernish
    use safe
    use var/loop
    LOOP find file in ~/Documents \
        -type d \( -ask 'Descend into "{}" directory?' -or -prune \) \
        -or -iterate
    DO
        put "File found: "
        ls -li $file
    DONE

Creating your own loop

The modernish loop construct is extensible. To define a new loop type, you only need to define a shell function called _loopgen_type where type is the loop type. This function, called the loop iteration generator, is expected to output lines of text to file descriptor 8, containing properly shell-quoted iteration commands for the shell to run, one line per iteration.

The internal commands expanded from LOOP, DO and DONE (which are defined as aliases) launch that loop iteration generator function in the background with safe mode enabled, while causing the main shell to read lines from that background process through a pipe, evaling each line as a command before iterating the loop. As long as that iteration command finishes with an exit status of zero, the loop keeps iterating. If it has a nonzero exit status or if there are no more commands to read, iteration terminates and execution continues beyond the loop.
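
The general principle can be imitated in plain POSIX sh. The sketch below is not modernish code: gen_countdown, the FIFO set-up and the variable names are assumptions made for illustration, and mktemp -d is assumed to be available. A background function writes one iteration command per line, and the main shell reads each line from descriptor 8, evals it and runs the loop body while the command succeeds.

```shell
# Hypothetical generator: emits one iteration command per line.
# The values here are plain digits; arbitrary data would need shell-quoting.
gen_countdown() {
    n=$1
    while [ "$n" -gt 0 ]; do
        printf 'i=%s\n' "$n"
        n=$((n - 1))
    done
}

fifo_dir=$(mktemp -d) || exit 1
mkfifo "$fifo_dir/pipe" || exit 1

gen_countdown 3 >"$fifo_dir/pipe" &    # generator runs in the background
exec 8<"$fifo_dir/pipe"                # main shell reads it on descriptor 8

seen=
# Read one command and eval it; keep iterating while both succeed.
while IFS= read -r iter_cmd <&8 && eval "$iter_cmd"; do
    seen="$seen$i "                    # loop body, runs in the main shell
done

exec 8<&-
wait
rm -rf "$fifo_dir"
```

Because the body runs in the main shell rather than at the end of a pipeline, the variable seen keeps its value after the loop.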

Instead of the normal internal namespace which is considered off-limits for modernish scripts, var/loop and its submodules use a _loop_* internal namespace for variables, which is also for use by user-implemented loop iteration generator functions.

The above is just the general principle. For the details, study the comments and the code in lib/modernish/mdl/var/loop.mm and the loop generators in lib/modernish/mdl/var/loop/*.mm.

use var/local

This module defines a new LOCAL...BEGIN...END shell code block construct with local variables, local positional parameters and local shell options. The local positional parameters can be filled using safe field splitting and pathname expansion operators similar to those in the LOOP construct described above.

Usage: LOCAL [ localitem | operator ... ] [ -- [ word ... ] ]; BEGIN commands; END

The commands are executed once, with the specified localitems applied. Each localitem can be:

  • A variable name with or without a = immediately followed by a value. This renders that variable local to the block, initially either unsetting it or assigning the value, which may be empty.
  • A shell option letter immediately preceded by a - or + sign. This locally turns that shell option on or off, respectively. This follows the counterintuitive syntax of set. Long-form shell options like -o optionname and +o optionname are also supported. It depends on the shell what options are supported. Specifying a nonexistent option is a fatal error. Use thisshellhas to check for a non-POSIX option's existence on the current shell before using it.

Modernish implements LOCAL blocks as one-time shell functions that use the stack to save and restore variables and settings. So the return command exits the block, causing the global variables and settings to be restored and resuming execution at the point immediately following END. Like any shell function, a LOCAL block exits with the exit status of the last command executed within it, or with the status passed on by or given as an argument to return.

The positional parameters ($@, $1, etc.) are always local to the block, but a copy is inherited from outside the block by default. Any changes to the positional parameters made within the block will be discarded upon exiting it.

However, if a double-dash -- argument is given in the LOCAL command line, the positional parameters outside the block are ignored and the set of words after -- (which may be empty) becomes the positional parameters instead.
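
The behaviour described above follows from ordinary shell function semantics, which a plain-sh sketch can show without modernish. The function local_block and the manual save/restore of one variable are assumptions standing in for what the stack does in a real LOCAL block.

```shell
greeting=global
set -- one two three

# Hypothetical stand-in for a LOCAL block: an ordinary shell function.
local_block() {
    saved_greeting=$greeting    # save the global value
    greeting=local
    set -- only                 # changes the function's own parameters only
    inside="greeting=$greeting argc=$#"
    greeting=$saved_greeting    # restore on the way out
}

local_block "$@"    # the block receives a copy of the outer parameters
after="greeting=$greeting argc=$#"
```

After the call, the outer shell still has three positional parameters and the original value of greeting, while inside records the values that were in effect within the function.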

These words can be modified prior to entering the LOCAL block using the following operators. The safe glob and split operators are only accepted in the safe mode. The operators are:

  • One of --split or --split=characters. This operator safely applies the shell's field splitting mechanism to the words given. The simple --split operator applies the shell's default field splitting by space, tab, and newline. If you supply one or more of your own characters to split by, each of these characters will be taken as a field separator if it is whitespace, or field terminator if it is non-whitespace. (Note that shells with QRK_IFSFINAL treat both whitespace and non-whitespace characters as separators.)
  • One of --glob or --fglob. These operators safely apply shell pathname expansion (globbing) to the words given. Each word is taken as a pattern, whether or not it contains any wildcard characters. For any resulting pathname that starts with - or + or is identical to ! or (, ./ is prefixed to keep various commands from misparsing it as an option or operand. Non-matching patterns are treated as follows:
    • --glob: Any non-matching patterns are quietly removed.
    • --fglob: All patterns must match. Any nonexistent path terminates the program. Use this if your program would not work after a non-match.
  • --base=string. This operator prefixes the given string to each of the words, after first applying field splitting and/or pathname expansion if specified. If --glob or --fglob are given, then the string is used as a base directory path for pathname expansion, without expanding any wildcard characters in that base directory path itself. If such base directory can't be entered, then if --glob was given, all words are removed, or if --fglob was given, the program terminates.
  • One of --slice or --slice=number. This operator divides the words in slices of up to number characters. The default slice size is 1 character, allowing for easy character-by-character processing. (Note that shells with WRN_MULTIBYTE will not slice multi-byte characters correctly.)

If multiple operators are given, their mechanisms are applied in the following order: split, glob, base, slice.

Important var/local usage notes

  • Due to the limitations of aliases and shell reserved words, LOCAL has to use its own BEGIN...END block instead of the shell's do...done. Using the latter results in a misleading shell syntax error.
  • LOCAL blocks do not mix well with use of the shell capability LOCALVARS (shell-native functionality for local variables), especially not on shells with QRK_LOCALUNS or QRK_LOCALUNS2. Using both with the same variables causes unpredictable behaviour, depending on the shell.
  • Warning! Never use break or continue within a LOCAL block to resume or break from enclosing loops outside the block! Shells with QRK_BCDANGER allow this, preventing END from restoring the global settings and corrupting the stack; shells without this quirk will throw an error if you try this. A proper way to do what you want is to exit the block with a nonzero status using something like return 1, then append something like || break or || continue to END. Note that this caveat only applies when crossing BEGIN...END boundaries. Using continue and break to continue or break loops entirely within the block is fine.

use var/arith

These shortcut functions are alternatives for using let.

Arithmetic operator shortcuts

inc, dec, mult, div, mod: simple integer arithmetic shortcuts. The first argument is a variable name. The optional second argument is an arithmetic expression, but a sane default value is assumed (1 for inc and dec, 2 for mult and div, 256 for mod). For instance, inc X is equivalent to X=$((X+1)) and mult X Y-2 is equivalent to X=$((X*(Y-2))).

ndiv is like div but with correct rounding down for negative numbers. Standard shell integer division simply chops off any digits after the decimal point, which has the effect of rounding down for positive numbers and rounding up for negative numbers. ndiv consistently rounds down.
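
The difference between truncating and rounding down can be shown with plain shell arithmetic. The following is a minimal sketch of floor division, not the ndiv implementation; floor_div is a hypothetical name, and unlike ndiv it takes two numbers and prints the result instead of updating a variable.

```shell
# Hypothetical helper: integer division that always rounds toward
# negative infinity. Usage: floor_div DIVIDEND DIVISOR
floor_div() {
    q=$(( $1 / $2 ))    # truncates toward zero
    r=$(( $1 % $2 ))    # remainder takes the sign of the dividend
    # Truncation rounded up if there is a remainder and the
    # operands have opposite signs; correct by subtracting one.
    if [ "$r" -ne 0 ] && [ "$(( (r < 0) != ($2 < 0) ))" -eq 1 ]; then
        q=$((q - 1))
    fi
    printf '%s\n' "$q"
}
```

For example, $(( -7 / 2 )) yields -3, whereas floor_div -7 2 prints -4.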

Arithmetic comparison shortcuts

These have the same name as their test/[ option equivalents. Unlike with test, the arguments are shell integer arith expressions, which can be anything from simple numbers to complex expressions. As with $(( )), variable names are expanded to their values even without the $.

    Function:         Returns successfully if:
    eq <expr> <expr>  the two expressions evaluate to the same number
    ne <expr> <expr>  the two expressions evaluate to different numbers
    lt <expr> <expr>  the 1st expr evaluates to a smaller number than the 2nd
    le <expr> <expr>  the 1st expr eval's to smaller than or equal to the 2nd
    gt <expr> <expr>  the 1st expr evaluates to a greater number than the 2nd
    ge <expr> <expr>  the 1st expr eval's to greater than or equal to the 2nd
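
As a rough idea of how such comparisons can be built on arithmetic expansion, here is a plain-sh sketch. The names my_lt and my_eq are hypothetical stand-ins, and the real functions may well validate their arguments differently.

```shell
# Hypothetical stand-ins for lt and eq: each argument is an arithmetic
# expression, evaluated inside $(( )) so variable names need no '$'.
my_lt() { [ "$(( ($1) < ($2) ))" -ne 0 ]; }
my_eq() { [ "$(( ($1) == ($2) ))" -ne 0 ]; }

X=3
if my_lt 'X + 1' 5; then
    verdict=smaller
else
    verdict=not_smaller
fi
```

Since the arguments are evaluated as expressions, they should come from the script itself rather than from untrusted input.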

use var/assign

This module is provided to solve a common POSIX shell language annoyance: in a normal shell variable assignment, only literal variable names are accepted, so it is impossible to use a variable whose name is stored in another variable. The only way around this is to use eval which is too difficult to use safely. Instead, you can now use the assign command.

Usage: assign [ [ +r ] variable=value ... ] | [ -r variable=variable2 ... ] ...

assign safely processes assignment-arguments in the same form as customarily given to the readonly and export commands, but it only assigns values to variables without setting any attributes. Each argument is grammatically an ordinary shell word, so any part or all of it may result from an expansion. The absence of a = character in any argument is a fatal error. The text preceding the first = is taken as the variable name in which to store the value; an invalid variable name is a fatal error. No whitespace is accepted before the = and any whitespace after the = is part of the value to be assigned.

The -r (reference) option causes the part to the right of the = to be taken as a second variable name variable2, and its value is assigned to variable instead. +r turns this option back off.

Examples: Each of the lines below assigns the value 'hello world' to the variable greeting.

    var=greeting; assign $var='hello world'
    var=greeting; assign "$var=hello world"
    tag='greeting=hello world'; assign "$tag"
    var=greeting; gvar=myinput; myinput='hello world'; assign -r $var=$gvar
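
To illustrate why a careful wrapper around eval is needed, here is a plain-sh sketch of the idea: validate the name, then let eval see only a reference to the value, never the value itself. The function my_assign is hypothetical, omits the -r option, and returns an error status where assign would treat the problem as fatal.

```shell
# Hypothetical helper: my_assign NAME=VALUE ...
my_assign() {
    for as_arg do
        case $as_arg in
        *=*) ;;
        *)  echo "my_assign: no '=' in argument" >&2
            return 1 ;;
        esac
        as_name=${as_arg%%=*}    # text before the first '='
        as_value=${as_arg#*=}    # text after the first '='
        case $as_name in
        '' | [0-9]* | *[!A-Za-z0-9_]*)
            echo "my_assign: invalid variable name" >&2
            return 1 ;;
        esac
        # Only the validated name is expanded before eval parses the
        # command; the value is expanded by eval itself, so it is never
        # interpreted as shell code.
        eval "$as_name=\$as_value"
    done
}
```

With var=greeting, the command my_assign "$var=hello world" stores hello world in greeting.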

use var/readf

readf reads arbitrary data from standard input into a variable until end of file, converting it into a format suitable for passing to the printf utility. For example, readf var <foo; printf "$var" >bar will copy foo to bar. Thus, readf allows storing both text and binary files into shell variables in a textual format suitable for manipulation with standard shell facilities.

All non-printable, non-ASCII characters are converted to printf octal or one-letter escape codes, except newlines. Not encoding newline characters allows for better processing by line-based utilities such as grep, sed, awk, etc. However, if the file ends in a newline, that final newline is encoded to \n to protect it from being stripped by command substitutions.
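
The round trip through printf can be tried without modernish using a much cruder encoding. The sketch below is not how readf works: it encodes every byte as a three-digit octal escape, including printable characters and newlines, so the result is four times the size of the input. The name encode_octal is hypothetical, and files containing NUL bytes are not covered.

```shell
# Hypothetical helper: print the contents of file $1 as a printf format
# string in which every byte is written as a three-digit octal escape.
encode_octal() {
    od -An -v -t o1 <"$1" |
        awk '{ for (i = 1; i <= NF; i++) printf "\\%s", $i }'
}
```

A copy is then made with encoded=$(encode_octal foo); printf "$encoded" >bar. Since even the final newline is encoded, command substitution has nothing to strip.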

Usage: readf [ -h ] varname

The -h option disables conversion of high-byte characters (accented letters, non-Latin scripts). Do not use for binary files; this is only guaranteed to work for text files in an encoding compatible with the current locale.

Caveats:

  • Best for small-ish files. The encoded file is stored in memory (a shell variable). For a binary file, encoding in printf format typically about doubles the size, though it could be up to four times as large.
  • If the shell executing your program does not have printf as a builtin command, the external printf command will fail if the encoded file size exceeds the maximum length of arguments to external commands (getconf ARG_MAX will obtain this limit for your system). Shell builtin commands do not have this limit. Check for a printf builtin using thisshellhas if you need to be sure, and always harden printf!

use var/shellquote

This module provides an efficient, fast, safe and portable shell-quoting algorithm for quoting arbitrary data in such a way that the quoted values are safe to pass to the shell for parsing as string literals. This is essential for any context where the shell must grammatically parse untrusted input, such as when supplying arbitrary values to trap or eval.

The shell-quoting algorithm is optimised to minimise exponential growth when quoting repeatedly. By default, it also ensures that quoted strings are always one single printable line, making them safe for terminal output and processing by line-oriented utilities.

shellquote

Usage: shellquote [ -f|+f|-P|+P ] varname[=value] ...

The values of the variables specified by name are shell-quoted and stored back into those variables. Repeating a variable name will add another level of shell-quoting. If a = plus a value (which may be empty) is appended to the varname, that value is shell-quoted and assigned to the variable.

Options modify the algorithm for variable names following them, as follows:

  • By default, newlines and any control characters are converted into ${CC*} expansions and quoted with double quotes, ensuring that the quoted string consists of a single line of printable text. The -P option forces pure POSIX quoted strings that may span multiple lines; +P turns this back off.

  • By default, a value is only quoted if it contains characters not present in $SHELLSAFECHARS. The -f option forces unconditional quoting, disabling optimisations that may leave shell-safe characters unquoted; +f turns this back off.

shellquote will die if you attempt to quote an unset variable (because there is no value to quote).
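
For comparison, the classic pure-POSIX quoting technique (roughly the kind of output the -P option is described as forcing) fits in one line of sed. This sketch is not the shellquote algorithm: it always quotes, does nothing to keep the result on a single line, and the name sq_quote is hypothetical.

```shell
# Hypothetical helper: print $1 as one single-quoted shell word.
# Every embedded ' becomes '\'' and the whole value is wrapped in '...'.
sq_quote() {
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}

untrusted="it's \$(not run); \`nor this\`"
eval "copy=$(sq_quote "$untrusted")"
```

After the eval, copy holds exactly the same text as untrusted, and neither command substitution in it has been executed.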

shellquoteparams

The shellquoteparams command shell-quotes the current positional parameters in place using the default quoting method of shellquote. No options are supported and any attempt to add arguments results in a syntax error.

use var/stack

Modules that extend the stack.

use var/stack/extra

This module contains stack query and maintenance functions.

If you only need one or two of these functions, they can also be loaded as individual submodules of var/stack/extra.

For the four functions below, item can be:

  • a valid portable variable name
  • a short-form shell option: dash plus letter
  • a long-form shell option: -o followed by an option name (two arguments)
  • --trap=SIGNAME to refer to the trap stack for the indicated signal (as set by pushtrap from var/stack/trap)

stackempty [ --key=value ] [ --force ] item: Tests if the stack for an item is empty. Returns status 0 if it is, 1 if it is not. The key feature works as in pop: by default, a key mismatch is considered equivalent to an empty stack. If --force is given, this function ignores keys altogether.

clearstack [ --key=value ] [ --force ] item [ item ... ]: Clears one or more stacks, discarding all items on it. If (part of) the stack is keyed or a --key is given, only clears until a key mismatch is encountered. The --force option overrides this and always clears the entire stack (be careful, e.g. don't use within LOCAL ... BEGIN ... END). Returns status 0 on success, 1 if that stack was already empty, 2 if there was nothing to clear due to a key mismatch.

stacksize [ --silent | --quiet ] item: Leaves the size of a stack in the REPLY variable and, if option --silent or --quiet is not given, writes it to standard output. The size of the complete stack is returned, even if some values are keyed.

printstack [ --quote ] item: Outputs a stack's content. Option --quote shell-quotes each stack value before printing it, allowing for parsing multi-line or otherwise complicated values. Column 1 to 7 of the output contain the number of the item (down to 0). If the item is set, column 8 and 9 contain a colon and a space, and if the value is non-empty or quoted, column 10 and up contain the value. Sets of values that were pushed with a key are started with a special line containing --- key:value. A subsequent set pushed with no key is started with a line containing --- (key off). Returns status 0 on success, 1 if that stack is empty.

use var/stack/trap

This module provides pushtrap and poptrap. These functions integrate with the main modernish stack to make traps stack-based, so that each program component or library module can set its own trap commands without interfering with others.

This module also provides a new DIE pseudosignal that allows pushing traps to execute when die is called.

Note an important difference between the trap stack and stacks for variables and shell options: pushing traps does not save them for restoring later, but adds them alongside other traps on the same signal. All pushed traps are active at the same time and are executed from last-pushed to first-pushed when the respective signal is triggered. Traps cannot be pushed and popped using push and pop but use dedicated commands as follows.

Usage:

  • pushtrap [ --key=value ] [ --nosubshell ] [ -- ] command sigspec [ sigspec ... ]
  • poptrap [ --key=value ] [ -R ] [ -- ] sigspec [ sigspec ... ]

pushtrap works like regular trap, with the following exceptions:

  • Adds traps for a signal without overwriting previous ones.
  • An invalid signal is a fatal error. When using non-standard signals, check if thisshellhas --sig=yoursignal before using it.
  • Unlike regular traps, a stack-based trap does not cause a signal to be ignored. Setting one will cause it to be executed upon the shell receiving that signal, but after the stack traps complete execution, modernish re-sends the signal to the main shell, causing it to behave as if no trap were set (unless a regular POSIX trap is also active). Thus, pushtrap does not accept an empty command as it would be pointless.
  • Each stack trap is executed in a new subshell to keep it from interfering with others. This means a stack trap cannot change variables except within its own environment, and exit will only exit the trap and not the program. The --nosubshell option overrides this behaviour, causing that particular trap to be executed in the main shell environment instead. This is not recommended if not absolutely needed, as you have to be extra careful to avoid exiting the shell or otherwise interfering with other stack traps. This option cannot be used with DIE traps.
  • Each stack trap is executed with $? initially set to the exit status that was active at the time the signal was triggered.
  • Stack traps do not have access to the positional parameters.
  • pushtrap stores current $IFS (field splitting) and $- (shell options) along with the pushed trap. Within the subshell executing each stack trap, modernish restores IFS and the shell options f (noglob), u (nounset) and C (noclobber) to the values in effect during the corresponding pushtrap. This is to avoid unexpected effects in case a trap is triggered while temporary settings are in effect. The --nosubshell option disables this functionality for the trap pushed.
  • The --key option applies the keying functionality inherited from plain push to the trap stack. It works the same way, so the description is not repeated here.
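
The core idea of stacking trap commands, with the last-pushed one running first and each in its own subshell, can be imitated in a few lines of plain sh. This sketch is far simpler than pushtrap: it handles only EXIT, has no keys, does not preserve $?, IFS or shell options, and push_exit_trap and exit_stack are hypothetical names.

```shell
exit_stack=

# Hypothetical helper: add a command to run when the shell exits.
push_exit_trap() {
    # Prepend, so the last-pushed command runs first; the parentheses
    # give each command its own subshell.
    exit_stack="( $1 )
$exit_stack"
    trap 'eval "$exit_stack"' EXIT
}

# Demonstrated inside a command substitution, so that the traps of the
# current shell stay untouched.
demo_output=$(
    push_exit_trap 'echo "pushed first"'
    push_exit_trap 'echo "pushed second"'
    exit 0
)
```

Here demo_output receives the line pushed second followed by the line pushed first.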

poptrap takes just signal names or numbers as arguments. It takes the last-pushed trap for each signal off the stack. By default, it discards the trap commands. If the -R option is given, it stores commands to restore those traps into the REPLY variable, in a format suitable for re-entry into the shell. Again, the --key option works as in plain pop.

With the sole exception of DIE traps, all stack-based traps, like native shell traps, are reset upon entering a subshell. However, commands for printing traps will print the traps for the parent shell, until another trap, pushtrap or poptrap command is invoked, at which point all memory of the parent shell's traps is erased.

Trap stack compatibility considerations

Modernish tries hard to avoid incompatibilities with existing trap practice. To that end, it intercepts the regular POSIX trap command using an alias, reimplementing and interfacing it with the shell's builtin trap facility so that plain old regular traps play nicely with the trap stack. You should not notice any changes in the POSIX trap command's behaviour, except for the following:

  • The regular trap command does not overwrite stack traps (but still overwrites existing regular traps).
  • Unlike zsh's native trap command, signal names are case insensitive.
  • Unlike dash's native trap command, signal names may have the SIG prefix; that prefix is quietly accepted and discarded.
  • Setting an empty trap action to ignore a signal only works fully (passing the ignoring on to child processes) if there are no stack traps associated with the signal; otherwise, an empty trap action merely suppresses the signal's default action for the current process – e.g., after executing the stack traps, it keeps the shell from exiting.
  • The trap command with no arguments, which prints the traps that are set in a format suitable for re-entry into the shell, now also prints the stack traps as pushtrap commands. (bash users might notice the SIG prefix is not included in the signal names written.)
  • The bash/yash-style -p option, including its yash-style --print equivalent, is now supported on all shells. If further arguments are given after that option, they are taken as signal specifications and only the commands to recreate the traps for those signals are printed.
  • Saving the traps to a variable using command substitution (as in: var=$(trap)) now works on every shell supported by modernish, including (d)ash, mksh and zsh which don't support this natively.
  • To reset (unset) a trap, the modernish trap command accepts both valid POSIX syntax and legacy bash/(d)ash/zsh syntax, like trap INT to unset a SIGINT trap (which only works if the trap command is given exactly one argument). Note that this is for compatibility with existing scripts only.
  • Bypassing the trap alias to set a trap using the shell builtin command will cause an inconsistent state. This may be repaired with a simple trap command; as modernish prints the traps, it will quietly detect ones it doesn't yet know about and make them work nicely with the trap stack.

POSIX traps for each signal are always executed after that signal's stack-based traps; this means they should not rely on modernish modules that use the trap stack to clean up after themselves on exit, as those cleanups would already have been done.

The new DIE pseudosignal

The var/stack/trap module adds a new DIE pseudosignal whose traps are executed upon invoking die. This allows for emergency cleanup operations upon fatal program failure, as EXIT traps cannot be executed after die is invoked.

  • On non-interactive shells (as well as subshells of interactive shells), DIE is its own pseudosignal with its own trap stack and POSIX trap. In order to kill the malfunctioning program as quickly as possible (hopefully before it has a chance to delete all your data), die doesn't wait for those traps to complete before killing the program. Instead, it executes each DIE trap simultaneously as a background job, then gathers the process IDs of the main shell and all its subprocesses, sending SIGKILL to all of them except any DIE trap processes. Unlike other traps, DIE traps are inherited by and survive in subshell processes, and pushtrap may add to them within the subshell. Whatever shell process invokes die will fork all DIE trap actions before being SIGKILLed itself. (Note that any DIE traps pushed or set within a subshell will still be forgotten upon exiting the subshell.)
  • On an interactive shell (not including its subshells), DIE is simply an alias for INT, and INT traps (both POSIX and stack) are cleared out after executing them once. This is because die uses SIGINT for command interruption on interactive shells, and it would not make sense to execute emergency cleanup commands repeatedly. As a side effect of this special handling, INT traps on interactive shells do not have access to the positional parameters and cannot return from functions.

use var/string

String comparison and manipulation functions.

use var/string/touplow

toupper and tolower: convert case in variables.

Usage:

  • toupper varname [ varname ... ]
  • tolower varname [ varname ... ]

Arguments are taken as variable names (note: they should be given without the $) and case is converted in the contents of the specified variables, without reading input or writing output.

toupper and tolower try hard to use the fastest available method on the particular shell your program is running on. They use built-in shell functionality where available and working correctly, otherwise they fall back on running an external utility.

Which external utility is chosen depends on whether the current locale uses the Unicode UTF-8 character set or not. For non-UTF-8 locales, modernish assumes the POSIX/C locale and tr is always used. For UTF-8 locales, modernish tries hard to find a way to correctly convert case even for non-Latin alphabets. A few shells have this functionality built in with typeset. The rest need an external utility. Modernish initialisation tries tr, awk, GNU awk and GNU sed before giving up and setting the variable MSH_2UP2LOW_NOUTF8. If isset MSH_2UP2LOW_NOUTF8, it means modernish is in a UTF-8 locale but has not found a way to convert case for non-ASCII characters, so toupper and tolower will convert only ASCII characters and leave any other characters in the string alone.
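
The external-utility fallback for the POSIX/C locale can be sketched in plain sh as follows. This is an illustration of the tr approach only, with no attempt at the built-in fast paths or UTF-8 handling described above; ascii_toupper and its internal variable are hypothetical names.

```shell
# Hypothetical helper: convert ASCII letters in the named variable
# to upper case, in place. Usage: ascii_toupper VARNAME
ascii_toupper() {
    case $1 in
    '' | [0-9]* | *[!A-Za-z0-9_]*) return 2 ;;    # not a valid variable name
    esac
    eval "tu_val=\$$1"
    # The appended X keeps command substitution from stripping
    # trailing newlines; it is removed again below.
    tu_val=$(printf '%sX' "$tu_val" | LC_ALL=C tr '[:lower:]' '[:upper:]')
    eval "$1=\${tu_val%X}"
}
```

As with toupper, the argument is a variable name without the $, for example ascii_toupper title.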

use var/string/trim

trim: strip whitespace from the beginning and end of a variable's value. Whitespace is defined by the [:space:] character class. In the POSIX locale, this is tab, newline, vertical tab, form feed, carriage return, and space, but in other locales it may be different. (On shells with BUG_NOCHCLASS, $WHITESPACE is used to define whitespace instead.) Optionally, a string of literal characters can be provided in the second argument. Any characters appearing in that string will then be trimmed instead of whitespace.

Usage: trim varname [ characters ]
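
On shells whose pattern matching supports character classes, whitespace trimming needs nothing but parameter expansion. The following plain-sh sketch shows that technique; it is not the trim implementation, does not support the optional characters argument, and trim_var is a hypothetical name.

```shell
# Hypothetical helper: strip leading and trailing whitespace from the
# named variable, in place. Usage: trim_var VARNAME
trim_var() {
    case $1 in
    '' | [0-9]* | *[!A-Za-z0-9_]*) return 2 ;;    # not a valid variable name
    esac
    eval "tr_val=\$$1"
    # ${tr_val%%[![:space:]]*} is the leading run of whitespace;
    # remove it as a literal prefix.
    tr_val=${tr_val#"${tr_val%%[![:space:]]*}"}
    # ${tr_val##*[![:space:]]} is the trailing run of whitespace;
    # remove it as a literal suffix.
    tr_val=${tr_val%"${tr_val##*[![:space:]]}"}
    eval "$1=\$tr_val"
}
```

Interior whitespace is left alone, and a value consisting only of whitespace becomes empty.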

use var/string/replacein

replacein: Replace leading, trailing (-t) or all (-a) occurrences of a string by another string in a variable.

Usage: replacein [ -t | -a ] varname oldstring newstring

use var/string/append

append and prepend: Append or prepend zero or more strings to a variable, separated by a string of zero or more characters, avoiding the hairy problem of dangling separators.

Usage: append|prepend [ --sep=separator ] [ -Q ] varname [ string ... ]

If the separator is not specified, it defaults to a space character. If the -Q option is given, each string is shell-quoted before appending or prepending.
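
The dangling-separator problem and its usual cure can be shown in plain sh. This sketch is not the append implementation: the hypothetical append_sep takes the separator as a positional argument, has no -Q option, and treats an empty variable the same as an unset one, which is an assumption of this example.

```shell
# Hypothetical helper: append_sep VARNAME SEPARATOR [STRING ...]
append_sep() {
    ap_name=$1
    ap_sep=$2
    shift 2
    case $ap_name in
    '' | [0-9]* | *[!A-Za-z0-9_]*) return 2 ;;    # not a valid variable name
    esac
    eval "ap_val=\${$ap_name-}"
    for ap_str do
        # Insert the separator only if something is already there.
        ap_val=${ap_val:+$ap_val$ap_sep}$ap_str
    done
    eval "$ap_name=\$ap_val"
}
```

Starting from an empty dirs, the calls append_sep dirs : /bin /usr/bin and append_sep dirs : /opt/bin leave /bin:/usr/bin:/opt/bin, with no leading or trailing colon.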

use var/unexport

The unexport function clears the "export" bit of a variable, conserving its value, and/or assigns values to variables without setting the export bit. This works even if set -a (allexport) is active, allowing an "export all variables, except these" way of working.

Usage is like export, with the caveat that variable assignment arguments containing non-shell-safe characters or expansions must be quoted as appropriate, unlike in some specific shell implementations of export. (To get rid of that headache, use safe.)

Unlike export, unexport does not work for read-only variables.

use var/genoptparser

As the getopts builtin is not portable when used in functions, this module provides a command that generates modernish code to parse options for your shell function in a standards-compliant manner. The generated parser supports short-form (one-character) options which can be stacked/combined.

Usage: generateoptionparser [ -o ] [ -f func ] [ -v varprefix ] [ -n options ] [ -a options ] [ varname ]

  • -o: Write parser to standard output.
  • -f: Function name to prefix to error messages. Default: none.
  • -v: Variable name prefix for options. Default:opt_.
  • -n: String of options that do not take arguments.
  • -a: String of options that require arguments.
  • varname: Store parser in specified variable. Default:REPLY.

At least one of -n and -a is required. All other arguments are optional. Option characters must be valid components of portable variable names, so they must be ASCII upper- or lowercase letters, digits, or the underscore.

generateoptionparser stores the generated parser code in a variable: either REPLY or the varname specified as the first non-option argument. This makes it possible to generate and use the parser on the fly with a command like eval "$REPLY" immediately following the generateoptionparser invocation.

For better efficiency and readability, it will often be preferable to insert the option parser code directly into your shell function instead. The -o option writes the parser code to standard output, so it can be redirected to a file, inserted into your editor, etc.

Parsed options are shifted out of the positional parameters while setting or unsetting corresponding variables, until a non-option argument, a -- end-of-options delimiter argument, or the end of arguments is encountered. Unlike with getopts, no additional shift command is required.

Each specified option gets a corresponding variable with a name consisting of the varprefix (default: opt_) plus the option character. If an option is not passed to your function, the parser unsets its variable; otherwise it sets it to either the empty value or its option-argument if it requires one. Thus, your function can check if any option x was given using isset, for example, if isset opt_x; then...
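To make the behaviour described above more concrete, here is a hand-written sketch in plain POSIX shell of the kind of loop such a parser performs. It is not the code that generateoptionparser emits. The function myfunc and its options -a, -b (no argument) and -o (with argument) are hypothetical, and unlike the generated parser this sketch does not handle stacked options such as -ab.

```sh
# Hand-written illustration only; not generateoptionparser output.
# myfunc and the options a, b and o are hypothetical.
myfunc() {
    # Options that are not given must end up unset.
    unset -v opt_a opt_b opt_o
    while [ "$#" -gt 0 ]; do
        case $1 in
        --) shift; break ;;         # end-of-options delimiter
        -a) opt_a='' ;;             # given, no argument: set to the empty value
        -b) opt_b='' ;;
        -o) if [ "$#" -lt 2 ]; then
                echo "myfunc: -o: option requires an argument" >&2
                return 2
            fi
            opt_o=$2; shift ;;      # store and consume the option-argument
        -?*) echo "myfunc: invalid option: $1" >&2
            return 2 ;;
        *)  break ;;                # first non-option argument
        esac
        shift
    done
    # ${var+set} expands to "set" only if the variable is set (even if empty).
    echo "a=${opt_a+set} b=${opt_b+set} o=${opt_o-} rest=$*"
}
```

After the loop, only the non-option arguments remain in the positional parameters, which is why no extra shift is needed.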

use sys/base

Some very common and essential utilities are not specified by POSIX, differ widely among systems, and are not always available. For instance, the which and readlink commands have incompatible options on various GNU and BSD variants and may be absent on other Unix-like systems. The sys/base module provides a complete re-implementation of such non-standard but basic utilities, written as modernish shell functions. Using the modernish version of these utilities can help a script to be fully portable. These versions also have various enhancements over the GNU and BSD originals, some of which are made possible by their integration into the modernish shell environment.

use sys/base/mktemp

A cross-platform shell implementation of mktemp that aims to be just as safe as native mktemp(1) implementations, while avoiding the problem of having various mutually incompatible versions and adding several unique features of its own.

Creates one or more unique temporary files, directories or named pipes, atomically (i.e. avoiding race conditions) and with safe permissions. The path name(s) are stored in REPLY and optionally written to stdout.

Usage: mktemp [ -dFsQCt ] [ template ... ]

  • -d: Create a directory instead of a regular file.
  • -F: Create a FIFO (named pipe) instead of a regular file.
  • -s: Silent. Store output in $REPLY, don't write any output or message.
  • -Q: Shell-quote each unit of output. Separate by spaces, not newlines.
  • -C: Automated cleanup. Pushes a trap to remove the files on exit. On an interactive shell, that's all this option does. On a non-interactive shell, the following applies: Clean up on receiving SIGPIPE and SIGTERM as well. On receiving SIGINT, clean up if the option was given at least twice, otherwise notify the user of files left. On the invocation of die, clean up if the option was given at least three times, otherwise notify the user of files left.
  • -t: Prefix one temporary files directory to all the templates: $XDG_RUNTIME_DIR or $TMPDIR if set, or /tmp. The templates may not contain any slashes. If the template has neither any trailing Xes nor a trailing dot, a dot is added before the random suffix.

The template defaults to “/tmp/temp.”. A suffix of random shell-safe ASCII characters is added to the template to create the file. For compatibility with other mktemp implementations, any optional trailing X characters in the template are removed. The length of the suffix will be equal to the number of Xes removed, or 10, whichever is more. The longer the random suffix, the higher the security of using mktemp in a shared directory such as /tmp.

Since /tmp is a world-writable directory shared by other users, for best security it is recommended to create a private subdirectory using mktemp -d and work within that.

Option -C cannot be used without option -s when in a subshell. Modernish will detect this and treat it as a fatal error. The reason is that a typical command substitution like tmpfile=$(mktemp -C) is incompatible with auto-cleanup, as the cleanup EXIT trap would be triggered not upon exiting the program but upon exiting the command substitution subshell that just ran mktemp, thereby immediately undoing the creation of the file. Instead, do something like: mktemp -sC; tmpfile=$REPLY
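The subshell problem can be reproduced without modernish. The following plain POSIX shell sketch uses the system's native mktemp and a hypothetical helper, make_tmp_with_cleanup, that imitates "create a file and register cleanup on exit". It only illustrates why the EXIT trap fires too early; it is not how modernish implements -C.

```sh
# Plain POSIX shell demonstration; uses the native mktemp, not modernish.
# make_tmp_with_cleanup is a hypothetical helper.
make_tmp_with_cleanup() {
    tmpfile=$(mktemp "${TMPDIR:-/tmp}/demo.XXXXXXXXXX") || return
    trap 'rm -f "$tmpfile"' EXIT    # runs when the *current* shell process exits
    printf '%s\n' "$tmpfile"
}

# The function runs inside the command substitution subshell, so its EXIT
# trap fires when that subshell ends, before the main shell can use the file.
tmp_path=$(make_tmp_with_cleanup)
if [ -e "$tmp_path" ]; then
    echo "file still exists: $tmp_path"
else
    echo "file was already removed when the subshell exited"
fi
rm -f "$tmp_path"    # tidy up in case a shell behaves differently
```

Storing the result in a variable of the main shell, as mktemp -sC does with REPLY, avoids the subshell altogether.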

This module depends on the trap stack to do auto-cleanup (the -C option), so it will automatically use var/stack/trap on initialisation.

use sys/base/readlink

readlink reads the target of a symbolic link, robustly handling strange filenames such as those containing newline characters. It stores the result in the REPLY variable and optionally writes it on standard output.

Usage: readlink [ -nsefmQ ] path [ path ... ]

  • -n: If writing output, don't add a trailing newline. This does not remove the separating newlines if multiple paths are given.
  • -s: Silent operation: don't write output, only store it in REPLY.
  • -e, -f, -m: Canonicalise. Convert each path found into a canonical and absolute path that can be used starting from any working directory. Relative paths are resolved starting from the present working directory. Double slashes are removed. Any special pathname components . and .. are resolved. All symlinks encountered are followed, but a path does not need to contain any symlinks. UNC network paths (as on Cygwin) are supported. These options differ as follows:
    • -e: All pathname components must exist to produce a result.
    • -f: All but the last pathname component must exist to produce a result.
    • -m: No pathname component needs to exist; this always produces a result. Nonexistent pathname components are simulated as regular directories.
  • -Q: Shell-quote each unit of output. Separate by spaces instead of newlines. This generates a list of arguments in shell syntax, guaranteed to be suitable for safe parsing by the shell, even if the resulting pathnames should contain strange characters such as spaces or newlines and other control characters.

The exit status of readlink is 0 on success and 1 if the path either is not a symlink, or could not be canonicalised according to the option given.

use sys/base/rev

rev copies the specified files to the standard output, reversing the order of characters in every line. If no files are specified, the standard input is read.

Usage: like rev on Linux and BSD, which is like cat except that - is a filename and does not denote standard input. No options are supported.

use sys/base/seq

A cross-platform implementation of seq that is more powerful and versatile than native GNU and BSD seq(1) implementations. The core is written in bc, the POSIX arbitrary-precision calculator language. That means this seq inherits the capacity to handle numbers with a precision and size only limited by computer memory, as well as the ability to handle input numbers in any base from 1 to 16 and produce output in any base 1 and up.

Usage: seq [ -w ] [ -L ] [ -f format ] [ -s string ] [ -S scale ] [ -B base ] [ -b base ] [ first [ incr ] ] last

seq prints a sequence of arbitrary-precision floating point numbers, one per line, from first (default 1), to as near last as possible, in increments of incr (default 1). If first is larger than last, the default incr is -1. An incr of zero is treated as a fatal error.

  • -w: Equalise width by padding with leading zeros. The longest of the first, incr or last arguments is taken as the length that each output number should be padded to.
  • -L: Use the current locale's radix point in the output instead of the full stop (.).
  • -f: printf-style floating-point format. The format string is passed on (with an added \n) to awk's builtin printf function. Because of that, the -f option can only be used if the output base is 10. Note that awk's floating point precision is limited, so very large or long numbers will be rounded.
  • -s: Instead of writing one number per line, write all numbers on one line separated by string and terminated by a newline character.
  • -S: Explicitly set the scale (number of digits after the radix point). Defaults to the largest number of digits after the radix point among the first, incr or last arguments.
  • -B: Set input and output base from 1 to 16. Defaults to 10.
  • -b: Set arbitrary output base from 1. Defaults to input base. See the bc(1) manual for more information on the output format for bases greater than 16.

The -S, -B and -b options take shell integer numbers as operands. This means a leading 0X or 0x denotes a hexadecimal number and a leading 0 denotes an octal number.

For portability reasons, modernish seq uses a full stop (.) for the radix point, regardless of the system locale. This applies both to command arguments and to output. The -L option causes seq to use the current locale's radix point character for output only.

Differences with GNU and BSD seq

The -S, -B and -b options are modernish innovations. The -w, -f and -s options are inspired by GNU and BSD seq. The following differences apply:

  • Like GNU and unlike BSD, the separator specified by the -s option is not appended to the final number and there is no -t option to add a terminator character.
  • Like GNU and unlike BSD, the -s option-argument is taken as literal characters and is not parsed for backslash escape codes like \n.
  • Unlike GNU and like BSD, the output radix point defaults to a full stop, regardless of the current locale.
  • Unlike GNU and like BSD, if incr is not specified, it defaults to -1 if first > last, 1 otherwise. For example, seq 5 1 counts backwards from 5 to 1, and specifying seq 5 -1 1 as with GNU is not needed.
  • Unlike GNU and like BSD, an incr of zero is not accepted. To output the same number or string infinite times, use yes instead.
  • Unlike both GNU and BSD, the -f option accepts any format specifiers accepted by awk's printf() function.

The sys/base/seq module depends on, and automatically loads, var/string/touplow.

use sys/base/shuf

Shuffle lines of text. A portable reimplementation of a commonly used GNU utility.

Usage:

  • shuf [ -n max ] [ -r rfile ] file
  • shuf [ -n max ] [ -r rfile ] -i low-high
  • shuf [ -n max ] [ -r rfile ] -e argument ...

By default, shuf reads lines of text from standard input, or from file (the file - signifies standard input). It writes the input lines to standard output in random order.

  • -i: Use sequence of non-negative integers low through high as input.
  • -e: Instead of reading input, use the arguments as lines of input.
  • -n: Output a maximum of max lines.
  • -r: Use rfile as the source of random bytes. Defaults to /dev/urandom.

Differences with GNU shuf:

  • Long option names are not supported.
  • The -o/--output-file option is not supported; use output redirection. Safely shuffling files in-place is not supported; use a temporary file.
  • --random-source=file is changed to -r file.
  • The -z/--zero-terminated option is not supported.

use sys/base/tac

tac (the reverse of cat) is a cross-platform reimplementation of the GNU tac utility, with some extra features.

Usage: tac [ -rbBP ] [ -s separator ] file [ file ... ]

tac outputs the files in reverse order of lines/records. If file is - or is not given, tac reads from standard input.

  • -s: Specify the record (line) separator. Default: linefeed.
  • -r: Interpret the record separator as an extended regular expression. This allows using separators that may vary. Each separator is preserved in the output as it is in the input.
  • -b: Assume the separator comes before each record in the input, and also output the separator before each record. Cannot be combined with -B.
  • -B: Assume the separator comes after each record in the input, but output the separator before each record. Cannot be combined with -b.
  • -P: Paragraph mode: output text last paragraph first. Input paragraphs are separated from each other by at least two linefeeds. Cannot be combined with any other option.
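To illustrate what "reverse order of records" means for the default and paragraph modes, here is a rough sketch using POSIX awk. The helper names reverse_lines and reverse_paragraphs are hypothetical, and this is not the modernish implementation: it reads standard input only and supports no separator options.

```sh
# Illustration with POSIX awk; not the modernish implementation.
# reverse_lines and reverse_paragraphs are hypothetical helper names.
reverse_lines() {
    # Store every line, then print them from last to first.
    awk '{ rec[NR] = $0 } END { for (i = NR; i > 0; i--) print rec[i] }'
}

reverse_paragraphs() {
    # RS="" makes awk treat runs of blank lines as record separators,
    # so each record is one paragraph (possibly spanning several lines).
    awk 'BEGIN { RS = "" }
         { par[NR] = $0 }
         END { for (i = NR; i > 0; i--) { print par[i]; if (i > 1) print "" } }'
}
```

For example, printf '%s\n' 1 2 3 | reverse_lines writes the three lines in the order 3, 2, 1.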

Differences between GNU tac and modernish tac:

  • The -B and -P options were added.
  • The -r option interprets the record separator as an extended regular expression. This is an incompatibility with GNU tac unless expressions are used that are valid as both basic and extended regular expressions.
  • In UTF-8 locales, multi-byte characters are recognised and reversed correctly.

use sys/base/which

The modernish which utility finds external programs and reports their absolute paths, offering several unique options for reporting, formatting and robust processing. The default operation is similar to GNU which.

Usage: which [ -apqsnQ1f ] [ -P number ] program [ program ... ]

By default, which finds the first available path to each given program. If program is itself a path name (contains a slash), only that path's base directory is searched; if it is a simple command name, the current $PATH is searched. Any relative paths found are converted to absolute paths. Symbolic links are not followed. The first path found for each program is written to standard output (one per line), and a warning is written to standard error for every program not found. The exit status is 0 (success) if all programs were found, 1 otherwise.

which also leaves its output in the REPLY variable. This may be useful if you run which in the main shell environment. The REPLY value will not survive a command substitution subshell as in ls_path=$(which ls).

The following options modify the default behaviour described above:

  • -a: List all programs that can be found in the directories searched, instead of just the first one. This is useful for finding duplicate commands that the shell would not normally find when searching its $PATH.
  • -p: Search in $DEFPATH (the default standard utility PATH provided by the operating system) instead of in the user's $PATH, which is vulnerable to manipulation.
  • -q: Be quiet: suppress all warnings.
  • -s: Silent operation: don't write output, only store it in the REPLY variable. Suppress warnings except, if you run which -s in a subshell, a warning that the REPLY variable will not survive the subshell.
  • -n: When writing to standard output, do not write a final newline.
  • -Q: Shell-quote each unit of output. Separate by spaces instead of newlines. This generates a one-line list of arguments in shell syntax, guaranteed to be suitable for safe parsing by the shell, even if the resulting pathnames should contain strange characters such as spaces or newlines and other control characters.
  • -1 (one): Output the results for at most one of the arguments in descending order of preference: once a search succeeds, ignore the rest. Suppress warnings except a subshell warning for -s. This is useful for finding a command that can exist under several names, for example: which -f -1 gnutar gtar tar
    This option modifies which's exit status behaviour: which -1 returns successfully if at least one command was found.
  • -f: Throw a fatal error in cases where which would otherwise return status 1 (non-success).
  • -P: Strip the indicated number of pathname elements from the output, starting from the right. -P1: strip /program; -P2: strip /*/program, etc. This is useful for determining the installation root directory for an installed package.
  • --help: Show brief usage information.
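For readers unfamiliar with how a $PATH search works, the default operation can be approximated in plain POSIX shell as follows. This is a simplified sketch with a hypothetical function name, find_in_path; it does not convert relative paths to absolute ones, supports none of the options above, and is not the modernish code.

```sh
# Simplified $PATH search in plain POSIX shell; find_in_path is hypothetical.
# The function body is a subshell ( ... ), so the IFS and 'set -f' changes
# cannot leak into the caller.
find_in_path() (
    cmd=$1
    IFS=:       # split $PATH on colons
    set -f      # no pathname expansion while splitting
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.      # an empty element means the current directory
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
            return 0                # first match wins
        fi
    done
    return 1
)
```

Because the body runs in a subshell, a sketch like this could not leave its result in a variable such as REPLY, which is the same limitation noted above for running which inside a command substitution.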

use sys/base/yes

yes very quickly outputs infinite lines of text, each consisting of its space-separated arguments, until terminated by a signal or by a failure to write output. If no argument is given, the default line is y. No options are supported.

This infinite-output command is useful for piping into commands that need an indefinite input data stream, or to automate a command requiring interactive confirmation.

Modernish yes is like GNU yes in that it outputs all its arguments, whereas BSD yes only outputs the first. It can output multiple gigabytes per second on modern systems.

use sys/cmd

Modules in this category contain functions for enhancing the invocation of commands.

use sys/cmd/extern

extern is like command but always runs an external command, without having to know or determine its location. This provides an easy way to bypass a builtin, alias or function. It does the same $PATH search the shell normally does when running an external command. For instance, to guarantee running external printf just do: extern printf ...

Usage: extern [ -p ] [ -v ] [ -u varname ... ] [ varname=value ... ] command [ argument ... ]

  • -p: The command, as well as any commands it further invokes, are searched in $DEFPATH (the default standard utility PATH provided by the operating system) instead of in the user's $PATH, which is vulnerable to manipulation.
    • extern -p is much more reliable than the shell's builtin command -p because: (a) many existing shell installations use a wrong search path for command -p; (b) command -p does not export the default PATH, so something like command -p sudo cp foo /bin/bar searches only sudo in the secure default path and not cp.
  • -v: don't execute command but show the full path name of the command that would have been executed. Any extra arguments are taken as more command paths to show, one per line. extern exits with status 0 if all the commands were found, 1 otherwise. This option can be combined with -p.
  • -u: Temporary export override. Unset the given variable in the environment of the command executed, even if it is currently exported. Can be specified multiple times.
  • varname=value assignment-arguments: These variables/values are temporarily exported to the environment during the execution of the command.
    • This is provided because assignments preceding extern cause unwanted, shell-dependent side effects, as extern is a shell function. Be sure to provide assignment-arguments following extern instead.
    • Assignment-arguments after a -- end-of-options delimiter are not parsed; this allows commands containing a = sign to be executed.

use sys/cmd/harden

The harden function allows implementing emergency halt on error for any external commands and shell builtin utilities. It is modernish's replacement for set -e a.k.a. set -o errexit (which is fundamentally flawed, not supported and will break the library). It depends on, and auto-loads, the sys/cmd/extern module.

harden sets a shell function with the same name as the command hardened, so it can be used transparently. This function hardens the given command by checking its exit status against values indicating error or system failure. Exactly what exit statuses signify an error or failure depends on the command in question; this should be looked up in the POSIX specification (under "Utilities") or in the command's man page or other documentation.

If the command fails, the function installed by harden calls die, so it will reliably halt program execution, even if the failure occurred within a subshell.

Usage:

harden [ -f funcname ] [ -[cSpXtPE] ] [ -e testexpr ] [ var=value ... ] [ -u var ... ] command_name_or_path [ command_argument ... ]

The -f option hardens the command as the shell function funcname instead of defaulting to command_name_or_path as the function name. (If the latter is a path, that's always an invalid function name, so the use of -f is mandatory.) If command_name_or_path is itself a shell function, that function is bypassed and the builtin or external command by that name is hardened instead. If no such command is found, harden dies with the message that hardening shell functions is not supported. (Instead, you should invoke die directly from your shell function upon detecting a fatal error.)

The -c option causes command_name_or_path to be hardened and run immediately instead of setting a shell function for later use. This option is meant for commands that run once; it is not efficient for repeated use. It cannot be used together with the -f option.

The -S option allows specifying several possible names/paths for a command. It causes the command_name_or_path to be split by comma and interpreted as multiple names or paths to search. The first name or path found is used. Requires -f.

The -e option, which defaults to >0, indicates the exit statuses corresponding to a fatal error. It depends on the command what these are; consult the POSIX spec and the manual pages. The status test expression testexpr, argument to the -e option, is like a shell arithmetic expression, with the binary operators == != <= >= < > turned into unary operators referring to the exit status of the command in question. Assignment operators are disallowed. Everything else is the same, including && (logical and) and || (logical or) and parentheses. Note that the expression needs to be quoted as the characters used in it clash with shell grammar tokens.
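As a rough mental model, a command hardened with -e '>1' behaves like a wrapper function that runs the command, tests the exit status, and halts on a match. The plain POSIX shell sketch below imitates that for grep under the hypothetical name hard_grep. It is not what harden generates: in particular, a plain exit (with an arbitrarily chosen status of 128) stands in for die, and unlike die it only ends the current shell or subshell.

```sh
# Plain POSIX shell imitation of the idea behind: harden -e '>1' grep
# hard_grep is a hypothetical name; 'exit 128' is a stand-in for die.
hard_grep() {
    command grep "$@"
    hard_status=$?
    if [ "$hard_status" -gt 1 ]; then       # the test expression '>1'
        echo "hard_grep: grep failed with status $hard_status" >&2
        exit 128
    fi
    return "$hard_status"                   # 0 = match, 1 = no match: both fine
}

if printf '%s\n' apple banana | hard_grep -q banana; then
    echo "found"
fi
```

A status of 1 (no match) is passed through to the caller as an ordinary result, while a status of 2 or more (such as an unreadable file) stops the program.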

The -X option causes harden to always search for and harden an external command, even if a built-in command by that name exists.

The -E option causes the hardening function to consider it a fatal error if the hardened command writes anything to the standard error stream. This option allows hardening commands (such as bc) where you can't rely on the exit status to detect an error. The text written to standard error is passed on as part of the error message printed by die. Note that:

  • Intercepting standard error necessitates that the command be executed from a subshell. This means any builtins or shell functions hardened with -E cannot influence the calling shell (e.g. harden -E cd renders cd ineffective).
  • -E does not disable exit status checks; by default, any exit status greater than zero is still considered a fatal error as well. If your command does not even reliably return a 0 status upon success, then you may want to add -e '>125', limiting the exit status check to reserved values indicating errors launching the command and signals caught.

The -p option causes harden to search for commands using the system default path (as obtained with getconf PATH) as opposed to the current $PATH. This ensures that you're using a known-good external command that came with your operating system. By default, the system-default PATH search only applies to the command itself, and not to any commands that the command may search for in turn. But if the -p option is specified at least twice, the command is run in a subshell with PATH exported as the default path, which is equivalent to adding a PATH=$DEFPATH assignment argument (see below).

Examples:

harden make                            # simple check for status > 0
harden -f tar '/usr/local/bin/gnutar'  # id.; be sure to use this 'tar' version
harden -e '> 1' grep                   # for grep, status > 1 means error
harden -e '==1 || >2' gzip             # 1 and >2 are errors, but 2 isn't (see manual)

Important note on variable assignments

As far as the shell is concerned, hardened commands are shell functions and not external or builtin commands. This essentially changes one behaviour of the shell: variable assignments preceding the command will not be local to the command as usual, but will persist after the command completes. (POSIX technically makes that behaviour optional but all current shells behave the same in POSIX mode.)

For example, this means that something like

harden -e '>1' grep
# [...]
LC_ALL=C grep regex some_ascii_file.txt

should never be done, because the meant-to-be-temporary LC_ALL locale assignment will persist and is likely to cause problems further on.

To solve this problem, harden supports adding these assignments as part of the hardening command, so instead of the above you do:

harden -e '>1' LC_ALL=C grep
# [...]
grep regex some_ascii_file.txt

With the -u option, harden also supports unsetting variables for the duration of a command, e.g.:

harden -e '>1' -u LC_ALL grep

The -u option may be specified multiple times. It causes the hardened command to be invoked from a subshell with the specified variables unset.
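The reason a subshell makes such changes harmless is that nothing done inside it can alter the calling shell's variables. The following plain POSIX shell sketch shows the idea with a hypothetical wrapper, c_grep, and a hypothetical variable, SOME_VAR; it is an illustration of the principle, not a description of how harden is implemented.

```sh
# c_grep is a hypothetical wrapper. Its body is a subshell ( ... ), so the
# LC_ALL assignment and the unset below cannot leak into the calling shell.
c_grep() (
    LC_ALL=C
    export LC_ALL           # the command sees LC_ALL=C in its environment
    unset -v SOME_VAR       # hypothetical variable hidden from the command
    command grep "$@"
)

printf '%s\n' abc | c_grep -q abc && echo "matched"
echo "LC_ALL in the calling shell: ${LC_ALL-<unset>}"
```

The second echo reports whatever LC_ALL was before the call, showing that the temporary setting did not persist.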

Hardening while allowing for broken pipes

If you're piping a command's output into another command that may close the pipe before the first command is finished, you can use the -P option to allow for this:

harden -e '==1 || >2' -P gzip        # also tolerate gzip being killed by SIGPIPE
gzip -dc file.txt.gz | head -n 10    # show first 10 lines of decompressed file

head will close the pipe of gzip input after ten lines; the operating system kernel then kills gzip with the PIPE signal before it's finished, causing a particular exit status that is greater than 128. This exit status would normally make harden kill your entire program, which in the example above is clearly not the desired behaviour. If the exit status caused by a broken pipe were known, you could specifically allow for that exit status in the status expression. The trouble is that this exit status varies depending on the shell and the operating system. The -P option was made to solve this problem: it automatically detects and whitelists the correct exit status corresponding to SIGPIPE termination on the current system.
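One way to see which exit status a given shell reports for SIGPIPE termination is to let a child shell kill itself with that signal and inspect the result. This is only a sketch of the general idea in plain POSIX shell, with a hypothetical variable name; whether modernish detects the status this way is an assumption not confirmed here.

```sh
# Sketch: discover which exit status this shell reports for a SIGPIPE death.
# A child shell kills itself with SIGPIPE; the parent then inspects $?.
sigpipe_status=0
sh -c 'kill -s PIPE "$$"' || sigpipe_status=$?

case $sigpipe_status in
0)  # The child survived, so SIGPIPE is being ignored in this process tree.
    echo "SIGPIPE appears to be ignored; there is no status to allow for" ;;
*)  echo "exit status for SIGPIPE on this system: $sigpipe_status" ;;
esac
```

On many shells the reported value is 128 plus the signal number, but as noted above this is not universal, which is why hard-coding it in a status expression is fragile.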

Tolerating SIGPIPE is an option and not the default, because in many contexts it may be entirely unexpected and a symptom of a severe error if a command is killed by a broken pipe. It is up to the programmer to decide which commands should expect SIGPIPE and which shouldn't.

Tip: It could happen that the same command should expect SIGPIPE in one context but not another. You can create two hardened versions of the same command, one that tolerates SIGPIPE and one that doesn't. For example:

harden -f hardGrep -e '>1' grep     # hardGrep does not tolerate being aborted
harden -f pipeGrep -e '>1' -P grep  # pipeGrep for use in pipes that may break

Note: If SIGPIPE was set to ignore by the process invoking the current shell, the -P option has no effect, because no process or subprocess of the current shell can ever be killed by SIGPIPE. However, this may cause various other problems and you may want to refuse to let your program run under that condition. thisshellhas WRN_NOSIGPIPE can help you easily detect that condition so your program can make a decision. See the WRN_NOSIGPIPE description for more information.

Tracing the execution of hardened commands

The -t option will trace command output. Each execution of a command hardened with -t causes the command line to be output to standard error, in the following format:

[functionname]> commandline

where functionname is the name of the shell function used to harden the command and commandline is the actual command executed. The commandline is properly shell-quoted in a format suitable for re-entry into the shell; however, command lines longer than 512 bytes will be truncated and the unquoted string (TRUNCATED) will be appended to the trace. If standard error is on a terminal that supports ANSI colours, the tracing output will be colourised.

The -t option was added to harden because the commands that you harden are often the same ones you would be particularly interested in tracing. The advantage of using harden -t over the shell's builtin tracing facility (set -x or set -o xtrace) is that the output is a lot less noisy, especially when using a shell library such as modernish.

Note: Internally, -t uses the shell file descriptor 9, redirecting it to standard error (using exec 9>&2). This allows tracing to continue to work normally even for commands that redirect standard error to a file (which is another enhancement over set -x on most shells). However, this does mean harden -t conflicts with any other use of the file descriptor 9 in your shell program.

If file descriptor 9 is already open before harden is called, harden does not attempt to override this. This means tracing may be redirected elsewhere by doing something like exec 9>trace.out before calling harden. (Note that redirecting FD 9 on the harden command itself will not work as it won't survive the run of the command.)
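The effect of keeping a spare copy of standard error can be demonstrated in plain POSIX shell. In the sketch below, trace_run is a hypothetical helper that imitates the trace format shown earlier; unlike harden -t it does not shell-quote or truncate the command line, and it is not the modernish code.

```sh
# Sketch of tracing via a spare file descriptor; trace_run is hypothetical.
exec 9>&2                       # FD 9 becomes a copy of the current standard error

trace_run() {
    printf '[%s]> %s\n' "$1" "$*" >&9   # the trace goes to FD 9, not to FD 2
    "$@"
}

# The trace line still reaches the original standard error,
# even though the command's own standard error is discarded.
trace_run ls /nonexistent/path 2>/dev/null || :
```

Redirecting FD 9 beforehand (for example with exec 9>trace.out instead of the first line) would send the trace lines to a file while leaving standard error untouched.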

Simple tracing of commands

Sometimes you just want to trace the execution of some specific commands as in harden -t (see above) without actually hardening them against command errors; you might prefer to do your own error handling. trace makes this easy. It is modernish's replacement or complement for set -x a.k.a. set -o xtrace. Unlike harden -t, it can also trace shell functions.

Usage 1: trace [ -f funcname ] [ -[cSpXE] ] [ var=value ... ] [ -u var ... ] command_name_or_path [ command_argument ... ]

For non-function commands, trace acts as a shortcut for harden -t -P -e '>125 && !=255' command_name_or_path. Any further options and arguments are passed on to harden as given. The result is that the indicated command is automatically traced upon execution. A bonus is that you still get minimal hardening against fatal system errors. Errors in the traced command itself are ignored, but your program is immediately halted with an informative error message if the traced command:

  • cannot be found (exit status 127);
  • was found but cannot be executed (exit status 126);
  • was killed by a signal other than SIGPIPE (exit status > 128, except the shell-specific exit status for SIGPIPE, and except 255 which is used by some utilities, such as ssh and rsync, to return an error).

Note: The caveat for command-local variable assignments for harden also applies to trace. See Important note on variable assignments above.

Usage 2: [ #! ] trace -f funcname

If no further arguments are given, trace -f will trace the shell function funcname without applying further hardening (except against nonexistence). trace -f can be used to trace the execution of modernish library functions as well as your own script's functions. The trace output for shell functions shows an extra () following the function name.

Internally, this involves setting an alias under the function's name, so the limitations of the shell's alias expansion mechanism apply: only function calls that the shell had not yet parsed before calling trace -f will be traced. So you should use trace -f at the beginning of your script, before defining your own functions. To facilitate this, trace -f does not check that the function funcname exists while setting up tracing, but only when attempting to execute the traced function.

In portable-form modernish scripts, trace -f should be used as a hashbang command to be compatible with alias expansion on all shells. Only the trace -f form may be used that way. For example:

#! /usr/bin/env modernish
#! use safe -k
#! use sys/cmd/harden
#! trace -f push
#! trace -f pop
...your program begins here...

use sys/cmd/mapr

mapr (map records) is an alternative to xargs that shares features with the mapfile command in bash 4.x. It is fully integrated into your script's main shell environment, so it can call your shell functions as well as builtin and external utilities. It depends on, and auto-loads, the sys/cmd/procsubst module.

Usage: mapr [ -d delimiter | -P ] [ -s count ] [ -n number ] [ -m length ] [ -c quantum ] callback [ argument ... ]

mapr reads delimited records from the standard input, invoking the specified callback command once or repeatedly as needed, with batches of input records as arguments. The callback may consist of multiple arguments. By default, an input record is one line of text.

Options:

  • -d delimiter: Use the single character delimiter to delimit input records, instead of the newline character. A NUL (0) character and multi-byte characters are not supported.
  • -P: Paragraph mode. Input records are delimited by sequences consisting of a newline plus one or more blank lines, and leading or trailing blank lines will not result in empty records at the beginning or end of the input. Cannot be used together with -d.
  • -s count: Skip and discard the first count records read.
  • -n number: Stop processing after passing a total of number records to invocation(s) of callback. If -n is not supplied or number is 0, all records are passed, except those skipped using -s.
  • -m length: Set the maximum argument length in bytes of each callback command call, including the callback command argument(s) and the current batch of up to quantum input records. The length of each argument is increased by 1 to account for the terminating null byte. The default maximum length depends on constraints set by the operating system for invoking external commands. If length is 0, this limit is disabled.
  • -c quantum: Pass at most quantum arguments at a time to each call to callback. If -c is not supplied or if quantum is 0, the number of arguments per invocation is not limited except by -m; whichever limit is reached first applies.

Arguments:

  • callback: Call the callback command with the collected arguments each time quantum lines are read. The callback command may be a shell function or any other kind of command, and is executed from the same shell environment that invoked mapr. If the callback command exits or returns with status 255 or is interrupted by the SIGPIPE signal, mapr will not process any further batches but immediately exit with the status of the callback command. If it exits with another exit status 126 or greater, a fatal error is thrown. Otherwise, mapr exits with the status of the last-executed callback command.
  • argument: If there are extra arguments supplied on the mapr command line, they will be added before the collected arguments on each invocation on the callback command.
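The core idea of batching input records into callback arguments can be sketched in plain POSIX shell. The function map_lines and the callback show below are hypothetical, and the sketch is far simpler than mapr: it supports only a fixed quantum and a callback without extra arguments, has no -d, -P, -s, -n or -m handling, and does not treat any exit status specially.

```sh
# Simplified batching loop in plain POSIX shell; map_lines is hypothetical.
# usage: map_lines quantum callback   (records are lines on standard input)
map_lines() {
    _quantum=$1 _callback=$2
    set --                              # reuse the positional parameters as the batch
    while IFS= read -r _line || [ -n "$_line" ]; do
        set -- "$@" "$_line"            # each line becomes exactly one argument
        if [ "$#" -ge "$_quantum" ]; then
            "$_callback" "$@" || return # stop at the first failing batch
            set --
        fi
    done
    if [ "$#" -gt 0 ]; then
        "$_callback" "$@"               # final, possibly smaller batch
    fi
}

# Hypothetical callback: report how many arguments it received.
show() { echo "$# args: $*"; }

printf '%s\n' 'a b' c d | map_lines 2 show
```

Note that in this pipeline most shells run map_lines in a subshell, so a callback could not change variables of the main shell; reading from a file with a redirection avoids that in this sketch.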

Differences from mapfile

mapr was inspired by the bash 4.x builtin command mapfile a.k.a. readarray, and uses similar options, but there are important differences.

  • mapr passes all the records as arguments to the callback command.
  • mapr does not support assigning records directly to an array. Instead, all handling is done through the callback command (which could be a shell function that assigns its arguments to an array.)
  • The callback command is specified directly instead of with a -C option, and it may consist of several arguments (as with xargs).
  • The record separator itself is never included in the arguments passed to the callback command (so there is no -t option to remove it).
  • mapr supports paragraph mode.
  • If the callback command exits with status 255, processing is aborted.

Differences from xargs

mapr shares important characteristics with xargs while avoiding its myriad pitfalls.

  • Instead of being an external utility, mapr is fully integrated into the shell. The callback command can be a shell function or builtin, which can directly modify the shell environment.
  • mapr is line-oriented by default, so it is safe to use for input arguments that contain spaces or tabs.
  • mapr does not parse or modify the input arguments in any way, e.g. it does not process and remove quotes from them like xargs does.
  • mapr supports paragraph mode.

use sys/cmd/procsubst

This module provides a portable process substitution construct, the advantage being that this is not limited to bash, ksh or zsh but works on all POSIX shells capable of running modernish. It is not possible for modernish to introduce the original ksh syntax into other shells. Instead, this module provides a % command for use within a $(command substitution).

The % command takes one simple command as its arguments, executes it in the background, and writes a file name from which to read its output. So if % is used within a command substitution as intended, that file name is passed on to the invoking command as an argument.

The % command supports one option, -o. If that option is given, then it is expected that, instead of reading input, the invoking command writes output to the file name passed on to it, so that the command invoked by % -o can read that data from its standard input.

Example syntax comparison:
| ksh/bash/zsh | modernish |
| --- | --- |
| diff -u <(ls) <(ls -a) | diff -u $(% ls) $(% ls -a) |
| IFS=' ' read -r user vsz args < <(ps -o 'user= vsz= args=' -p $$) | IFS=' ' read -r user vsz args < $(% ps -o 'user= vsz= args=' -p $$) |
| { some commands; } > >(tee stdout.log) 2> >(tee stderr.log) (both `tee` commands write terminal output to standard output) | { some commands; } > $(% -o tee stdout.log) 2> $(% -o tee stderr.log) (both `tee` commands write terminal output to standard error) |

Unlike the bash/ksh/zsh version, modernish process substitution only works with simple commands. This includes shell function calls, but not aliases or anything involving shell grammar or reserved words (such as redirections, pipelines, loops, etc.). To use such complex commands, enclose them in a shell function and call that function from the process substitution.

Also note that anything that a command invoked by % -o writes to its standard output is redirected to standard error. The main shell environment's standard output is not available because the command substitution subsumes it.
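
The mechanism described above (run a simple command in the background and hand back a file name that delivers its output) can be approximated in plain POSIX shell with a named pipe. The following is only a minimal sketch of that general idea, not the modernish implementation; fifo_subst is a hypothetical name, and mktemp -d and mkfifo are assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper illustrating the idea behind '%'; NOT the modernish code.
# Usage: somecommand "$(fifo_subst simplecommand [args...])"
fifo_subst() {
    _ps_dir=$(mktemp -d) || return
    _ps_fifo=$_ps_dir/fifo
    mkfifo "$_ps_fifo" || return
    # Run the simple command in the background, writing into the FIFO.
    # The background job's own standard output is pointed away from the
    # command substitution's pipe; otherwise the substitution would wait
    # for the job to finish before returning the file name.
    ( "$@" > "$_ps_fifo"; rm -rf "$_ps_dir" ) > /dev/null &
    # The FIFO's path is what the invoking command receives as an argument.
    printf '%s\n' "$_ps_fifo"
}

result=$(cat "$(fifo_subst printf '%s\n' one two)")
printf '%s\n' "$result"
```

As with the % command, only a simple command can be passed this way; anything needing pipelines or redirections would have to be wrapped in a shell function first.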

use sys/cmd/source

The source command sources a dot script like the . command, but additionally supports passing arguments to sourced scripts like you would pass them to a function. It mostly mimics the behaviour of the source command built in to bash and zsh.

If a filename without a directory path is given, then, unlike the . command, source looks for the dot script in the current directory by default, as well as searching $PATH.

It is a fatal error to attempt to source a directory, a file with no read permission, or a nonexistent file.
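
Passing arguments to a dot script can be sketched in plain POSIX shell by running the . command from inside a function, where the positional parameters are the function's own arguments. This is an illustration of the general technique only, not the modernish source implementation; source_with_args is a hypothetical name, and the sketch implements neither the current-directory lookup nor the fatal error handling described above.

```shell
#!/bin/sh
# Hypothetical helper; NOT the modernish 'source' implementation.
# Inside a function, the positional parameters are the function's own
# arguments, and a dot script run from there sees them as "$@".
source_with_args() {
    _src_script=$1
    shift
    . "$_src_script"
}

# Demonstration with a throwaway dot script (mktemp yields a path with a
# directory part, so '.' does not search $PATH for it).
script=$(mktemp) || exit
printf '%s\n' 'greeting="$1, $2!"' > "$script"
source_with_args "$script" hello world
rm -f "$script"
printf '%s\n' "$greeting"
```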

use sys/dir

Functions for working with directories.

use sys/dir/countfiles

countfiles: Count the files in a directory using nothing but shell functionality, so without external commands. (It's amazing how many pitfalls this has, so a library function is needed to do it robustly.)

Usage: countfiles [ -s ] directory [ globpattern ... ]

Count the number of files in a directory, storing the number in REPLY and (unless -s is given) printing it to standard output. If any globpatterns are given, only count the files matching them.
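
To give an impression of the pitfalls involved, here is a minimal pure-shell counting sketch. It is not the countfiles implementation: count_entries is a hypothetical name, the choice to count hidden entries is an assumption of this sketch, and it presumes default POSIX globbing behaviour (pathname expansion enabled, unmatched patterns left literal).

```shell
#!/bin/sh
# Hypothetical helper; NOT the modernish 'countfiles' implementation.
# Counts directory entries, hidden ones included, and stores the total in REPLY.
count_entries() {
    _ce_n=0
    # Three patterns cover regular names, dot files, and names starting
    # with two dots, while never matching '.' and '..' themselves.
    for _ce_f in "$1"/* "$1"/.[!.]* "$1"/..?*; do
        # A pattern without matches is left as a literal string, so check
        # that the entry exists (-L also catches dangling symlinks).
        if [ -e "$_ce_f" ] || [ -L "$_ce_f" ]; then
            _ce_n=$((_ce_n + 1))
        fi
    done
    REPLY=$_ce_n
}

dir=$(mktemp -d) || exit
: > "$dir/plain"
: > "$dir/.hidden"
: > "$dir/with space"
: > "$dir/..odd"
count_entries "$dir"
count=$REPLY
rm -rf "$dir"
printf '%s\n' "$count"
```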

use sys/dir/mkcd

The mkcd function makes one or more directories, then, upon success, changes into the last-mentioned one. mkcd inherits mkdir's usage, so options depend on your system's mkdir; only the POSIX options are guaranteed. When mkcd is run from a script, it uses cd -P to change the working directory, resolving any symlinks in the present working directory path.
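
The behaviour described above can be sketched in a few lines of plain POSIX shell. This is an illustrative approximation rather than the modernish code; mk_and_cd is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper; NOT the modernish 'mkcd' implementation.
mk_and_cd() {
    # Pass all arguments (options included) straight to mkdir.
    mkdir "$@" || return
    # Loop over the arguments so _mc_last ends up holding the final one.
    for _mc_last do :; done
    # -P resolves symlinks; '--' guards against names starting with '-'.
    cd -P -- "$_mc_last"
}

base=$(mktemp -d) || exit
cd "$base" || exit
mk_and_cd -p one/two three
printf '%s\n' "${PWD##*/}"
```

Because the working directory only changes after mkdir succeeds, a failed mkdir leaves the caller where it was.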

use sys/term

Utilities for working with the terminal.

use sys/term/putr

This module provides commands to efficiently output a string repeatedly.

Usage:

  • putr [ number | - ] string
  • putrln [ number | - ] string

Output the string number times. When using putrln, add a newline at the end.

If a - is given instead of a number, then the string is repeated as many times as fit on one line of the terminal: the line length of the terminal divided by the length of the string, rounded down.

Note that, unlike with put and putln, only a single string argument is accepted.

Example: putrln - '=' prints a full terminal line of equals signs.
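
The arithmetic behind the - argument can be shown with a naive repetition loop in plain POSIX shell. This is a sketch for illustration only and makes no attempt at the efficiency of the real commands; repeat_str is a hypothetical name, and a fixed terminal width of 80 columns is assumed instead of querying the terminal.

```shell
#!/bin/sh
# Hypothetical helper; NOT the modernish 'putr' implementation.
# Usage: repeat_str number string
repeat_str() {
    _rs_count=$1
    _rs_out=
    while [ "$_rs_count" -gt 0 ]; do
        _rs_out=$_rs_out$2
        _rs_count=$((_rs_count - 1))
    done
    printf '%s' "$_rs_out"
}

# Emulating the '-' argument: repetitions = terminal width / string length,
# rounded down by integer division. A fixed width of 80 is assumed here.
columns=80
str='=-='
line=$(repeat_str "$((columns / ${#str}))" "$str")
printf '%s\n' "$line"
```

With a three-character string and 80 columns, the string is repeated 26 times, giving a line of 78 characters.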

use sys/term/readkey

readkey: read a single character from the keyboard without echoing back to the terminal. Buffering is done so that multiple waiting characters are read one at a time.

Usage: readkey [ -E ERE ] [ -t timeout ] [ -r ] [ varname ]

-E: Only accept characters that match the extended regular expression ERE (the type of RE used by grep -E/egrep). readkey will silently ignore input not matching the ERE and wait for input matching it.

-t: Specify a timeout in seconds (one significant digit after the decimal point). After the timeout expires, no character is read and readkey returns status 1.

-r: Raw mode. Disables INTR (Ctrl+C), QUIT, and SUSP (Ctrl+Z) processing as well as translation of carriage return (13) to linefeed (10).

The character read is stored into the variable referenced by varname, which defaults to REPLY if not specified.

This module depends on the trap stack to save and restore the terminal state if the program is stopped while reading a key, so it will automatically use var/stack/trap on initialisation.


Appendix A: List of shell cap IDs

This appendix lists all the shell capabilities, quirks, and bugs that modernish can detect in the current shell, so that modernish scripts can easily query the results of these tests and decide what to do. Certain problematic system conditions are also detected this way and listed here.

The all-caps IDs below are all usable with the thisshellhas function. This makes it easy for a cross-platform modernish script to be aware of relevant conditions and decide what to do.

Each detection test has its own little test script in the lib/modernish/cap directory. These tests are executed on demand, the first time the capability or bug in question is queried using thisshellhas. See README.md in that directory for further information. The test scripts also document themselves in the comments.

Capabilities

Modernish currently identifies and supports the following non-standard shell capabilities:

  • ADDASSIGN: Add a string to a variable using additive assignment, e.g. VAR+=string
  • ANONFUNC: zsh anonymous functions (basically the native zsh equivalent of modernish's var/local module)
  • ARITHCMD: standalone arithmetic evaluation using a command like ((expression)).
  • ARITHFOR: ksh93/C-style arithmetic for loops of the form for ((exp1; exp2; exp3)) do commands; done.
  • ARITHPP: support for the ++ and -- unary operators in shell arithmetic.
  • CESCQUOT: Quoting with C-style escapes, like $'\n' for newline.
  • DBLBRACKET: The ksh88-style [[ double-bracket command ]], implemented as a reserved word, integrated into the main shell grammar, and with a different grammar applying within the double brackets. (ksh93, mksh, bash, zsh, yash >= 2.48)
  • DBLBRACKETERE: DBLBRACKET plus the =~ binary operator to match a string against an extended regular expression.
  • DBLBRACKETV: DBLBRACKET plus the -v unary operator to test if a variable is set. Named variables only. (Testing positional parameters (like [[ -v 1 ]]) does not work on bash or ksh93; check $# instead.)
  • DOTARG: Dot scripts support arguments.
  • HERESTR: Here-strings, an abbreviated kind of here-document.
  • KSH88FUNC: define ksh88-style shell functions with the function keyword, supporting dynamically scoped local variables with the typeset builtin. (mksh, bash, zsh, yash, et al)
  • KSH93FUNC: the same, but with static scoping for local variables. (ksh93 only) See Q28 at the ksh93 FAQ for an explanation of the difference.
  • KSHARRAY: ksh93-style arrays. Supported on bash, zsh (under emulate sh), mksh, and ksh93.
  • LEPIPEMAIN: execute last element of a pipe in the main shell, so that things like somecommand | read somevariable work. (zsh, AT&T ksh, bash 4.2+)
  • LINENO: the $LINENO variable contains the current shell script line number.
  • LOCALVARS: the local command creates dynamically scoped local variables within functions defined using standard POSIX syntax.
  • NONFORKSUBSH: as a performance optimisation, subshells are implemented without forking a new process, so they share a PID with the main shell. (AT&T ksh93; it has many bugs related to this, but there's a nice workaround: ulimit -t unlimited forces a subshell to fork, making those bugs disappear! See also BUG_FNSUBSH.)
  • PRINTFV: The shell's printf builtin has the -v option to print to a variable, which avoids forking a command substitution subshell.
  • PROCREDIR: the shell natively supports <(process redirection), a special kind of redirection that connects standard input (or standard output) to a background process running your command(s). This exists on yash. Note this is not combined with a redirection like < <(command). Contrast with bash/ksh/zsh's PROCSUBST where this <(syntax) substitutes a file name.
  • PROCSUBST: the shell natively supports <(process substitution), a special kind of command substitution that substitutes a file name, connecting it to a background process running your command(s). This exists on ksh93 and zsh. (Bash has it too, but its POSIX mode turns it off, so modernish can't use it.) Note this is usually combined with a redirection, like < <(command). Contrast this with yash's PROCREDIR where the same <(syntax) is itself a redirection.
  • PSREPLACE: Search and replace strings in variables using special parameter substitutions with a syntax vaguely resembling sed.
  • RANDOM: the $RANDOM pseudorandom generator. Modernish seeds it if detected. The variable is then set to read-only whether the generator is detected or not, in order to block it from losing its special properties by being unset or overwritten, and to stop it being used if there is no generator. This is because some of modernish depends on RANDOM either working properly or being unset.
    (The use case for non-readonly RANDOM is setting a known seed to get reproducible pseudorandom sequences. To get that in a modernish script, use awk's srand(yourseed) and int(rand()*32768).)
  • ROFUNC: Set functions to read-only with readonly -f. (bash, yash)
  • TESTERE: The regular test/[ builtin command supports the =~ binary operator to match a string against an extended regular expression.
  • TESTO: The test/[ builtin supports the -o unary operator to check if a shell option is set.
  • TRAPPRSUBSH: The ability to obtain a list of the current shell's native traps from a command substitution subshell, for example: var=$(trap), as long as no new traps have been set within that command substitution. Note that the var/stack/trap module transparently reimplements this feature on shells without this native capability.
  • TRAPZERR: This feature ID is detected if the ERR trap is an alias for the ZERR trap. According to the zsh manual, this is the case for zsh on most systems, i.e. those that don't have a SIGERR signal. (The trap stack uses this feature test.)
  • VARPREFIX: Expansions of type ${!prefix@} and ${!prefix*} yield all names of set variables beginning with prefix in the same way and with the same quoting effects as $@ and $*, respectively. This includes the name prefix itself, unless the shell has BUG_VARPREFIX. (bash; AT&T ksh93)
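
The general idea of a capability test can be illustrated without modernish. The sketch below probes by hand for what the list above calls PRINTFV and then picks an implementation accordingly. It is a hypothetical probe written for this illustration, not one of the test scripts from lib/modernish/cap.

```shell
#!/bin/sh
# Hypothetical probe; NOT one of the modernish test scripts.
# The probe runs in a subshell with output discarded, so an error message
# or a stray variable cannot affect the main shell.
if ( printf -v _probe %s ok && [ "$_probe" = ok ] ) > /dev/null 2>&1; then
    has_printfv=yes
else
    has_printfv=no
fi

# Pick an implementation based on the probe result.
if [ "$has_printfv" = yes ]; then
    printf -v padded '%05d' 42      # no command substitution subshell needed
else
    padded=$(printf '%05d' 42)      # portable fallback
fi
printf '%s\n' "$padded"
```

In a modernish script, the hand-rolled probe would be replaced by querying the ID with thisshellhas, which runs and caches the corresponding test on demand.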

Quirks

Modernish currently identifies and supports the following shell quirks:

  • QRK_32BIT: mksh: the shell only has 32-bit arithmetic. Since every modern system these days supports 64-bit long integers even on 32-bit kernels, we can now count this as a quirk.
  • QRK_ANDORBG: On zsh, the & operator takes the last simple command as the background job and not an entire AND-OR list (if any). In other words, a && b || c & is interpreted as a && b || { c & } and not { a && b || c; } &.
  • QRK_ARITHEMPT: In yash, with POSIX mode turned off, a set but empty variable yields an empty string when used in an arithmetic expression, instead of 0. For example, foo=''; echo $((foo)) outputs an empty line.
  • QRK_ARITHWHSP: In yash and FreeBSD /bin/sh, trailing whitespace from variables is not trimmed in arithmetic expansion, causing the shell to exit with an 'invalid number' error. POSIX is silent on the issue. The modernish isint function (to determine if a string is a valid integer number in shell syntax) is QRK_ARITHWHSP compatible, tolerating only leading whitespace.
  • QRK_BCDANGER: break and continue can affect non-enclosing loops, even across shell function barriers (zsh, Busybox ash; older versions of bash, dash and yash). (This is especially dangerous when using var/local which internally uses a temporary shell function to try to protect against breaking out of the block without restoring global parameters and settings.)
  • QRK_EMPTPPFLD: Unquoted $@ and $* do not discard empty fields. POSIX says for both unquoted $@ and unquoted $* that empty positional parameters may be discarded from the expansion. AFAIK, just one shell (yash) doesn't.
  • QRK_EMPTPPWRD: POSIX says that empty "$@" generates zero fields but empty '' or "" or "$emptyvariable" generates one empty field. But it leaves unspecified whether something like "$@$emptyvariable" generates zero fields or one field. Zsh, pdksh/mksh and (d)ash generate one field, as seems logical. But bash, AT&T ksh and yash generate zero fields, which we consider a quirk. (See also BUG_PP_01)
  • QRK_EVALNOOPT: eval does not parse options, not even --, which makes it incompatible with other shells: on the one hand, (d)ash does not accept eval -- "$command" whereas on other shells this is necessary if the command starts with a -, or the command would be interpreted as an option to eval. A simple workaround is to prefix arbitrary commands with a space. Both situations are POSIX compliant, but since they are incompatible without a workaround, the minority situation is labeled here as a QuiRK.
  • QRK_EXECFNBI: In pdksh and zsh, exec looks up shell functions and builtins before external commands, and if it finds one it does the equivalent of running the function or builtin followed by exit. This is probably a bug in POSIX terms; exec is supposed to launch a program that overlays the current shell, implying the program launched by exec is always external to the shell. However, since the POSIX language is rather vague and possibly incorrect, this is labeled as a shell quirk instead of a shell bug.
  • QRK_FNRDREXIT: On FreeBSD sh and NetBSD sh, an error in a redirection attached to a function call causes the shell to exit. This affects redirections of all functions, including modernish library functions as well as functions set by harden.
  • QRK_GLOBDOTS: Pathname expansion of .* matches the pseudonames . and .. so that, e.g., cp -pr .* backup/ cannot be used to copy all your hidden files. (bash < 5.2, (d)ash, AT&T ksh != 93u+m, yash)
  • QRK_HDPARQUOT: Double quotes within certain parameter substitutions in here-documents aren't removed (FreeBSD sh; bosh). For instance, if var is set, ${var+"x"} in a here-document yields "x", not x. POSIX considers it undefined to use double quotes there, so they should be avoided for a script to be fully POSIX compatible. (Note this quirk does not apply for substitutions that remove patterns, such as ${var#"$x"} and ${var%"$x"}; those are defined by POSIX and double quotes are fine to use.) (Note 2: single quotes produce widely varying behaviour and should never be used within any form of parameter substitution in a here-document.)
  • QRK_IFSFINAL: in field splitting, a final non-whitespace IFS delimiter character is counted as an empty field (yash < 2.42, zsh, pdksh). This is a QRK (quirk), not a BUG, because POSIX is ambiguous on this.
  • QRK_LOCALINH: On a shell with LOCALVARS, local variables, when declared without assigning a value, inherit the state of their global namesake, if any. (dash, FreeBSD sh)
  • QRK_LOCALSET: On a shell with LOCALVARS, local variables are immediately set to the empty value upon being declared, instead of being initially without a value. (zsh)
  • QRK_LOCALSET2: Like QRK_LOCALSET, but only if the variable by the same name in the global/parent scope is unset. If the global variable is set, then the local variable starts out unset. (bash 2 and 3)
  • QRK_LOCALUNS: On a shell with LOCALVARS, local variables lose their local status when unset. Since the variable name reverts to global, this means that unset will not necessarily unset the variable! (yash, pdksh/mksh. Note: this is actually a behaviour of typeset, to which modernish aliases local on these shells.)
  • QRK_LOCALUNS2: This is a more treacherous version of QRK_LOCALUNS that is unique to bash. The unset command works as expected when used on a local variable in the same scope that variable was declared in; however, it makes local variables global again if they are unset in a subscope of that local scope, such as a function called by the function where it is local. (Note: since QRK_LOCALUNS2 is a special case of QRK_LOCALUNS, modernish will not detect both.) On bash >= 5.0, modernish eliminates this quirk upon initialisation by setting shopt -s localvar_unset.
  • QRK_OPTABBR: Long-form shell option names can be abbreviated down to a length where the abbreviation is not redundant with other long-form option names. (ksh93, yash)
  • QRK_OPTCASE: Long-form shell option names are case-insensitive. (yash, zsh)
  • QRK_OPTDASH: Long-form shell option names ignore the -. (ksh93, yash)
  • QRK_OPTNOPRFX: Long-form shell option names use a dynamic no prefix for all options (including POSIX ones). For instance, glob is the opposite of noglob, and nonotify is the opposite of notify. (ksh93, yash, zsh)
  • QRK_OPTULINE: Long-form shell option names ignore the _. (yash, zsh)
  • QRK_PPIPEMAIN: On zsh <= 5.5.1, in all elements of a pipeline, parameter expansions are evaluated in the current environment (with any changes they make surviving the pipeline), though the commands themselves of every element but the last are executed in a subshell. For instance, given unset or empty v, in the pipeline cmd1 ${v:=foo} | cmd2, the assignment to v survives, though cmd1 itself is executed in a subshell.
  • QRK_SPCBIXP: Variable assignments directly preceding special builtin commands are exported, and persist as exported. (bash; yash)
  • QRK_UNSETF: If 'unset' is invoked without any option flag (-v or -f), and no variable by the given name exists but a function does, the shell unsets the function. (bash)
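
Several of these quirks can be sidestepped by writing code that does not depend on the differing behaviour. As one example, relating to QRK_ANDORBG, the sketch below uses an explicit brace group so that the whole AND-OR list is evaluated inside the background job regardless of how the shell binds the & operator. The commands and the temporary file are made up for the demonstration.

```shell
#!/bin/sh
# Explicit grouping makes the intent independent of how the shell binds '&'.
out=$(mktemp) || exit

# The whole AND-OR list is evaluated inside one background job.
{ false && echo "b ran" || echo "c ran"; } > "$out" &
wait "$!"

result=$(cat "$out")
rm -f "$out"
printf '%s\n' "$result"
```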

Bugs

Modernish currently identifies and supports the following shell bugs:

  • BUG_ALIASCSHD: A spurious syntax error occurs if a here-document containing a command substitution is used within two aliases that define a block. The syntax error reporting a missing } occurs because the alias terminating the block is not correctly expanded. This bug affects var/local and var/loop as they define blocks this way. Workaround: make a shell function that handles the here-document and call that shell function from the block/loop instead. Bug found on: dash <= 0.5.10.2; Busybox ash <= 1.31.1.
  • BUG_ALIASPOSX: Running any command "foo" in POSIX mode like POSIXLY_CORRECT=y foo will globally disable alias expansion on a non-interactive shell (killing modernish), unless POSIX mode is globally enabled. Bug found on bash 4.2 through 5.0. Note: on bash versions with this bug, modernish automatically enables POSIX mode to avoid triggering it. A side effect is that process substitution (PROCSUBST) isn't available.
  • BUG_ARITHINIT: Using unset or empty variables (dash <= 0.5.9.1 on macOS) or unset variables (yash <= 2.44) in arithmetic expressions causes the shell to exit, instead of taking them as a value of zero.
  • BUG_ARITHLNNO: The shell supports $LINENO, but the variable is considered unset in arithmetic contexts, like $(( LINENO > 0 )). This makes it error out under set -u and default to zero otherwise. Workaround: use shell expansion like $(( $LINENO > 0 )). (FreeBSD sh)
  • BUG_ARITHNAN: The case-insensitive special floating point constants Inf and NaN are recognised in arithmetic evaluation, overriding any variables with the names Inf, NaN, INF, nan, etc. (AT&T ksh93; zsh 5.6 - 5.8)
  • BUG_ARITHSPLT: Unquoted $((arithmetic expressions)) are not subject to field splitting as expected. (zsh, mksh <= R49)
  • BUG_ASGNCC01: if IFS contains a $CC01 (^A) character, unquoted expansions in shell assignments discard that character (if present). Found on: bash 4.0-4.3
  • BUG_ASGNLOCAL: If you have a function-local variable (see LOCALVARS) with the same name as a global variable, and within the function you run a shell builtin command preceded by a temporary variable assignment, then the global variable is unset. (zsh <= 5.7.1)
  • BUG_BRACQUOT: shell quoting within bracket patterns has no effect (zsh < 5.3; ksh93). This bug means the - retains its special meaning of 'character range', and an initial ! (and, on some shells, ^) retains the meaning of negation, even in quoted strings within bracket patterns, including quoted variables.
  • BUG_CASEEMPT: An empty case list on a single line, as in case x in esac, is a syntax error. (AT&T ksh93)
  • BUG_CASELIT: If a case pattern doesn't match as a pattern, it's tried again as a literal string, even if the pattern isn't quoted. This can result in false positives when a pattern doesn't match itself, like with bracket patterns. This contravenes POSIX and breaks use cases such as input validation. (AT&T ksh93) Note: modernish match works around this.
  • BUG_CASEPAREN: case patterns without an opening parenthesis (i.e. with only an unbalanced closing parenthesis) are misparsed as a syntax error within command substitutions of the form $( ). Workaround: include the opening parenthesis. Found on: bash 3.2
  • BUG_CASESTAT: The case conditional construct prematurely clobbers the exit status $?. (found in zsh < 5.3, Busybox ash <= 1.25.0, dash < 0.5.9.1)
  • BUG_CDNOLOGIC: The cd built-in command lacks the POSIX-specified -L option and does not support logical traversal; it always acts as if the -P (physical traversal) option was passed. This also renders the -L option to modernish chdir ineffective. (NetBSD sh)
  • BUG_CDPCANON: cd -P (and hence also modernish chdir) does not correctly canonicalise/normalise a directory path that starts with three or more slashes; it reduces these to two initial slashes instead of one in $PWD. (zsh <= 5.7.1)
  • BUG_CMDEXEC: using command exec (to open a file descriptor, using command to avoid exiting the shell on failure) within a function causes bash <= 4.0 to fail to restore the global positional parameters when leaving that function. It also renders bash <= 4.0 prone to hanging.
  • BUG_CMDEXPAN: if the command command results from an expansion, it acts like command -v, showing the path of the command instead of executing it. For example: v=command; "$v" ls or set -- command ls; "$@" don't work. (AT&T ksh93)
  • BUG_CMDOPTEXP: the command builtin does not recognise options if they result from expansions. For instance, you cannot conditionally store -p in a variable like defaultpath and then do command $defaultpath someCommand. (found in zsh < 5.3)
  • BUG_CMDPV: command -pv does not find builtins ({pd,m}ksh), does not accept the -p and -v options together (zsh < 5.3) or ignores the -p option altogether (bash 3.2); in any case, it's not usable to find commands in the default system PATH.
  • BUG_CMDSETPP: using command set -- has no effect; it does not set the positional parameters. For compat, use set without command. (mksh <= R57)
  • BUG_CMDSPASGN: preceding a special builtin with command does not stop preceding invocation-local variable assignments from becoming global. (AT&T ksh93)
  • BUG_CMDSPEXIT: preceding a special builtin (other than eval, exec, return or exit) with command does not always stop it from exiting the shell if the builtin encounters an error. (bash <= 4.0; zsh <= 5.2; mksh; ksh93)
  • BUG_CSNHDBKSL: Backslashes within non-expanding here-documents within command substitutions are incorrectly expanded to perform newline joining, as opposed to left intact. (bash <= 4.4)
  • BUG_CSUBBTQUOT: A spurious syntax error is thrown when using double quotes within a backtick-style command substitution that is itself within double quotes. (AT&T ksh93 < 93u+m 2022-05-20)
  • BUG_CSUBLNCONT: Backslash line continuation is not processed correctly within modern-form $(command substitutions). (AT&T ksh93 < 93u+m 2022-05-21)
  • BUG_CSUBRMLF: A bug affecting the stripping of final linefeeds from command substitutions. If a command substitution does not produce any output to substitute and is concatenated in a string or here-document, then the shell removes any concurrent linefeeds occurring directly before the command substitution in that string or here-document. (dash <= 0.5.10.2, Busybox ash, FreeBSD sh)
  • BUG_CSUBSTDO: If standard output (file descriptor 1) is closed before entering a command substitution, and any other file descriptors are redirected within the command substitution, commands such as echo or putln will not work within the command substitution, acting as if standard output is still closed (AT&T ksh93 <= AJM 93u+ 2012-08-01). Workaround: see cap/BUG_CSUBSTDO.t.
  • BUG_DEVTTY: the shell can't redirect output to /dev/tty if set -C/set -o noclobber (part of safe mode) is active. Workaround: use >| /dev/tty instead of > /dev/tty. Bug found on: bash on certain systems (at least QNX and Interix).
  • BUG_DOLRCSUB: parsing problem where, inside a command substitution of the form $(...), the sequence $$'...' is treated as $'...' (i.e. as a use of CESCQUOT), and $$"..." as $"..." (bash-specific translatable string). (Found in bash up to 4.4)
  • BUG_DQGLOB: globbing is not properly deactivated within double-quoted strings. Within double quotes, a * or ? immediately following a backslash is interpreted as a globbing character. This applies to both pathname expansion and pattern matching in case. Found in: dash. (The bug is not triggered when using modernish match.)
  • BUG_EXPORTUNS: Setting the export flag on an otherwise unset variable causes a set and empty environment variable to be exported, though the variable continues to be considered unset within the current shell. (FreeBSD sh < 13.0)
  • BUG_FNSUBSH: Function definitions within subshells (including command substitutions) are ignored if a function by the same name exists in the main shell, so the wrong function is executed. unset -f is also silently ignored. ksh93 (all current versions as of November 2018) has this bug. It only applies to non-forked subshells. See NONFORKSUBSH.
  • BUG_FORLOCAL: a for loop in a function makes the iteration variable local to the function, so it won't survive the execution of the function. Found on: yash. This is intentional and documented behaviour on yash in non-POSIX mode, but in POSIX terms it's a bug, so we mark it as such.
  • BUG_GETOPTSMA: The getopts builtin leaves a : instead of a ? in the specified option variable if a given option that requires an argument lacks an argument, and the option string does not start with a :. (zsh)
  • BUG_HDOCBKSL: Line continuation using backslashes in expanding here-documents is handled incorrectly. (zsh up to 5.4.2)
  • BUG_HDOCMASK: Here-documents (and here-strings, see HERESTR) use temporary files. This fails if the current umask setting disallows the user to read, so the here-document can't read from the shell's temporary file. Workaround: ensure user-readable umask when using here-documents. (bash, mksh, zsh)
  • BUG_IFSCC01PP: If IFS contains a $CC01 (^A) control character, the expansion "$@" (even quoted) is gravely corrupted. Since many modernish functions use this to loop through the positional parameters, this breaks the library. (Found in bash < 4.4)
  • BUG_IFSGLOBC: In glob pattern matching (such as in case and [[), if a wildcard character is part of IFS, it is matched literally instead of as a matching character. This applies to glob characters *, ?, [ and ]. Since nearly all modernish functions use case for argument validation and other purposes, nearly every modernish function breaks on shells with this bug if IFS contains any of these characters! (Found in bash < 4.4)
  • BUG_IFSGLOBP: In pathname expansion (filename globbing), if a wildcard character is part of IFS, it is matched literally instead of as a matching character. This applies to glob characters *, ?, [ and ]. (Bug found in bash, all versions up to at least 4.4)
  • BUG_IFSGLOBS: in glob pattern matching (as in case or parameter substitution with # and %), if IFS starts with ? or * and the "$*" parameter expansion inserts any IFS separator characters, those characters are erroneously interpreted as wildcards when quoted "$*" is used as the glob pattern. (AT&T ksh93)
  • BUG_IFSISSET: AT&T ksh93 (2011/2012 versions): ${IFS+s} always yields s even if IFS is unset. This applies to IFS only.
  • BUG_ISSETLOOP: AT&T ksh93: Expansions like ${var+set} remain static when used within a for, while or until loop; the expansions don't change along with the state of the variable, so they cannot be used to check whether a variable is set within a loop if the state of that variable may change in the course of the loop.
  • BUG_KBGPID: AT&T ksh93: If a single command ending in & (i.e. a background job) is enclosed in a { braces; } block with an I/O redirection, the $! special parameter is not set to the background job's PID.
  • BUG_KUNSETIFS: AT&T ksh93: Unsetting IFS fails to activate default field splitting if the following conditions are met: 1. IFS is set and empty (i.e. split is disabled) in the main shell, and at least one expansion has been processed with that setting; 2. The code is currently executing in a non-forked subshell (see NONFORKSUBSH).
  • BUG_LNNONEG: $LINENO becomes wildly inaccurate, even negative, when dotting/sourcing scripts. Bug found on: dash with LINENO support compiled in.
  • BUG_LOOPRET1: If a return command is given with a status argument within the set of conditional commands in a while or until loop (i.e., between while/until and do), the status argument is ignored and the function returns with status 0 instead of the specified status. Found on: dash <= 0.5.8; zsh <= 5.2
  • BUG_LOOPRET2: If a return command is given without a status argument within the set of conditional commands in a while or until loop (i.e., between while/until and do), the exit status passed down from the previous command is ignored and the function returns with status 0 instead. Found on: dash <= 0.5.10.2; AT&T ksh93; zsh <= 5.2
  • BUG_LOOPRET3: If a return command is given within the set of conditional commands in a while or until loop (i.e., between while/until and do), and the return status (either the status argument to return or the exit status passed down from the previous command by return without a status argument) is non-zero, and the conditional command list itself yields false (for while) or true (for until), and the whole construct is executed in a dot script sourced from another script, then too many levels of loop are broken out of, causing program flow corruption or premature exit. Found on: zsh <= 5.7.1
  • BUG_MULTIBIFS: We're on a UTF-8 locale and the shell supports UTF-8characters in general (i.e. we don't haveWRN_MULTIBYTE) – however, usingmulti-byte characters asIFS field delimiters still doesn't work. Forexample,"$*" joins positional parameters on the first byte ofIFSinstead of the first character. (ksh93, mksh, FreeBSD sh, Busybox ash)
  • BUG_NOCHCLASS: POSIX-mandated character[:classes:] within bracket[expressions] are not supported in glob patterns. (mksh)
  • BUG_NOEXPRO: Cannot export read-only variables. (zsh <= 5.7.1 in sh mode)
  • BUG_OPTNOLOG: on dash, setting -o nolog causes $- to wreak havoc: trying to expand $- silently aborts parsing of an entire argument, so e.g. "one,$-,two" yields "one,". (Same applies to -o debug.)
  • BUG_PP_01: POSIX says that empty "$@" generates zero fields but empty '' or "" or "$emptyvariable" generates one empty field. This means concatenating "$@" with one or more other, separately quoted, empty strings (like "$@""$emptyvariable") should still produce one empty field. But on bash 3.x, this erroneously produces zero fields. (See also QRK_EMPTPPWRD)
  • BUG_PP_02: Like BUG_PP_01, but with unquoted $@ and only with "$emptyvariable"$@, not $@"$emptyvariable". (mksh <= R50f; FreeBSD sh <= 10.3)
  • BUG_PP_03: When IFS is unset or empty (zsh 5.3.x) or empty (mksh <= R50), assigning var=$* only assigns the first field, failing to join and discarding the rest of the fields. Workaround: var="$*" (POSIX leaves var=$@, etc. undefined, so we don't test for those.)
  • BUG_PP_03A: When IFS is unset, assignments like var=$* incorrectly remove leading and trailing spaces (but not tabs or newlines) from the result. Workaround: quote the expansion. Found on: bash 4.3 and 4.4.
  • BUG_PP_03B: When IFS is unset, assignments like var=${var+$*}, etc. incorrectly remove leading and trailing spaces (but not tabs or newlines) from the result. Workaround: quote the expansion. Found on: bash 4.3 and 4.4.
  • BUG_PP_03C: When IFS is unset, assigning var=${var-$*} only assigns the first field, failing to join and discarding the rest of the fields. (zsh 5.3, 5.3.1) Workaround: var=${var-"$*"}
  • BUG_PP_04A: Like BUG_PP_03A, but for conditional assignments within parameter substitutions, as in : ${var=$*} or : ${var:=$*}. Workaround: quote either $* within the expansion or the expansion itself. (bash <= 4.4)
  • BUG_PP_04E: When assigning the positional parameters ($*) to a variable using a conditional assignment within a parameter substitution, e.g. : ${var:=$*}, the fields are always joined and separated by spaces, except if IFS is set and empty. Workaround as in BUG_PP_04A. (bash 4.3)
  • BUG_PP_04_S: When IFS is null (empty), the result of a substitution like ${var=$*} is incorrectly field-split on spaces. The assignment itself succeeds normally. Found on: bash 4.2, 4.3
  • BUG_PP_05: POSIX says that empty $@ and $* generate zero fields, but with null IFS, empty unquoted $@ and $* yield one empty field. Found on: dash 0.5.9 and 0.5.9.1; Busybox ash.
  • BUG_PP_06A: POSIX says that unquoted $@ and $* initially generate as many fields as there are positional parameters, and then (because $@ or $* is unquoted) each field is split further according to IFS. With this bug, the latter step is not done if IFS is unset (i.e. default split). Found on: zsh < 5.4
  • BUG_PP_07: unquoted $* and $@ (including in substitutions like ${1+$@} or ${var-$*}) do not perform default field splitting if IFS is unset. Found on: zsh (up to 5.3.1) in sh mode
  • BUG_PP_07A: When IFS is unset, unquoted $* undergoes word splitting as if IFS=' ', and not the expected IFS=" ${CCt}${CCn}". Found on: bash 4.4
  • BUG_PP_08: When IFS is empty, unquoted $@ and $* do not generate one field for each positional parameter as expected, but instead join them into a single field without a separator. Found on: yash < 2.44 and dash < 0.5.9 and Busybox ash < 1.27.0
  • BUG_PP_08B: When IFS is empty, unquoted $* within a substitution (e.g. ${1+$*} or ${var-$*}) does not generate one field for each positional parameter as expected, but instead joins them into a single field without a separator. Found on: bash 3 and 4
  • BUG_PP_09: When IFS is non-empty but does not contain a space, unquoted $* within a substitution (e.g. ${1+$*} or ${var-$*}) does not generate one field for each positional parameter as expected, but instead joins them into a single field separated by spaces (even though, as said, IFS does not contain a space). Found on: bash 4.3
  • BUG_PP_10: When IFS is null (empty), assigning var=$* removes any $CC01 (^A) and $CC7F (DEL) characters. (bash 3, 4)
  • BUG_PP_10A: When IFS is non-empty, assigning var=$* prefixes each $CC01 (^A) and $CC7F (DEL) character with a $CC01 character. (bash 4.4)
  • BUG_PP_1ARG: When IFS is empty on bash <= 4.3 (i.e. field splitting is off), ${1+"$@"} or "${1+$@}" is counted as a single argument instead of each positional parameter as separate arguments. This also applies to prepending text only if there are positional parameters with something like "${1+foobar $@}".
  • BUG_PP_MDIGIT: Multiple-digit positional parameters don't require expansion braces, so e.g. $10 = ${10} (dash; Busybox ash). This is classed as a bug because it causes a straight-up incompatibility with POSIX scripts. POSIX says: "The parameter name or symbol can be enclosed in braces, which are optional except for positional parameters with more than one digit [...]".
  • BUG_PP_MDLEN: For ${#x} expansions where x >= 10, only the first digit of the positional parameter number is considered, e.g. ${#10}, ${#12}, ${#123} are all parsed as if they are ${#1}. Then, string parsing is aborted so that further characters or expansions, if any, are lost. Bug found in: dash 0.5.11 - 0.5.11.4 (fixed in dash 0.5.11.5)
  • BUG_PSUBASNCC: in an assignment parameter substitution of the form ${foo=value}, if the characters $CC01 (^A) or $CC7F (DEL) are in the value, all their occurrences are stripped from the expansion (although the assignment itself is done correctly). If the expansion is quoted, only $CC01 is stripped. This bug is independent of the state of IFS, except if IFS is null, the assignment in ${foo=$*} (unquoted) is buggy too: it strips $CC01 from the assigned value. (Found on bash 4.2, 4.3, 4.4)
  • BUG_PSUBBKSL1: A backslash-escaped } character within a quoted parameter substitution is not unescaped. (bash 3.2, dash <= 0.5.9.1, Busybox 1.27 ash)
  • BUG_PSUBEMIFS: if IFS is empty (no split, as in safe mode), then if a parameter substitution of the forms ${foo-$*}, ${foo+$*}, ${foo:-$*} or ${foo:+$*} occurs in a command argument, the characters $CC01 (^A) or $CC7F (DEL) are stripped from the expanded argument. (Found on: bash 4.4)
  • BUG_PSUBEMPT: Expansions of the form ${V-} and ${V:-} are not subject to normal shell empty removal if that parameter is unset, causing unexpected empty arguments to commands. Workaround: ${V+$V} and ${V:+$V} work as expected. (Found on FreeBSD 10.3 sh)
  • BUG_PSUBIFSNW: When field-splitting unquoted parameter substitutions like ${var#foo}, ${var##foo}, ${var%foo} or ${var%%foo} on non-whitespace IFS, if there is an initial empty field, a spurious extra initial empty field is generated. (mksh)
  • BUG_PSUBNEWLN: Due to a bug in the parser, parameter substitutions spread over more than one line cause a syntax error. Workaround: instead of a literal newline, use $CCn. (found in dash <= 0.5.9.1 and Busybox ash <= 1.28.1)
  • BUG_PSUBSQUOT: in pattern matching parameter substitutions (${param#pattern}, ${param%pattern}, ${param##pattern} and ${param%%pattern}), if the whole parameter substitution is quoted with double quotes, then single quotes in the pattern are not parsed. POSIX says they are to keep their special meaning, so that glob characters may be quoted. For example: x=foobar; echo "${x#'foo'}" should yield bar but with this bug yields foobar. (dash <= 0.5.9.1; Busybox 1.27 ash)
  • BUG_PSUBSQHD: Like BUG_PSUBSQUOT, but for a parameter substitution included in a here-document instead of quoted with double quotes. (dash <= 0.5.9.1; mksh)
  • BUG_PUTIOERR: Shell builtins that output strings (echo, printf, ksh/zsh print), and thus also modernish put and putln, do not check for I/O errors on output. This means a script cannot check for them, and a script process in a pipe can get stuck in an infinite loop if SIGPIPE is ignored.
  • BUG_READWHSP: If there is more than one field to read, read does not trim trailing IFS whitespace. (dash 0.5.7, 0.5.8)
  • BUG_REDIRIO: the I/O redirection operator <> (open a file descriptor for both read and write) defaults to opening standard output (i.e. is short for 1<>) instead of defaulting to opening standard input (0<>) as POSIX specifies. (AT&T ksh93)
  • BUG_REDIRPOS: Buggy behaviour occurs if a redirection is positioned in between two variable assignments in the same command. On zsh 5.0.x, a parse error is thrown. On zsh 5.1 to 5.4.2, anything following the redirection (other assignments or command arguments) is silently ignored.
  • BUG_SCLOSEDFD: bash < 5.0 and dash fail to establish a block-local scope for a file descriptor that is added to the end of the block as a redirection that closes that file descriptor (e.g. } 8<&- or done 7>&-). If that FD is already closed outside the block, the FD remains global, so you can't locally exec it. So with this bug, it is not straightforward to make a block-local FD appear initially closed within a block. Workaround: first open the FD, then close it – for example: done 7>/dev/null 7>&- will establish a local scope for FD 7 for the preceding do...done block while still making FD 7 appear initially closed within the block.
  • BUG_SETOUTVAR: The set builtin (with no arguments) only prints native function-local variables when called from a shell function. (yash <= 2.46)
  • BUG_SHIFTERR0: The shift builtin silently returns a successful exit status (0) when attempting to shift a number greater than the current amount of positional parameters. (Busybox ash <= 1.28.4)
  • BUG_SPCBILOC: Variable assignments preceding special builtins create a partially function-local variable if a variable by the same name already exists in the global scope. (bash < 5.0 in POSIX mode)
  • BUG_TESTERR1A: test/[ exits with a non-error false status (1) if an invalid argument is given to an operator. (AT&T ksh93)
  • BUG_TESTILNUM: On dash (up to 0.5.8), giving an illegal number to test -t or [ -t causes some kind of corruption so the next test/[ invocation fails with an "unexpected operator" error even if it's legit.
  • BUG_TESTONEG: The test/[ builtin supports a -o unary operator to check if a shell option is set, but it ignores the no prefix on shell option names, so something like [ -o noclobber ] gives a false positive. Bug found on yash up to 2.43. (The TESTO feature test implicitly checks against this bug and won't detect the feature if the bug is found.)
  • BUG_TRAPEMPT: The trap builtin does not quote empty traps in its output, rendering the output unsuitable for shell re-input. For instance, trap '' INT; trap outputs "trap -- INT" instead of "trap -- '' INT". (found in mksh <= R56c)
  • BUG_TRAPEXIT: the shell's trap builtin does not know the EXIT trap by name, but only by number (0). Using the name throws a "bad trap" error. Found in klibc 2.0.4 dash.
  • BUG_TRAPFNEXI: When a function issues a signal whose trap exits the shell, the shell is not exited immediately, but only on return from the function. (zsh)
  • BUG_TRAPRETIR: Using return within eval triggers infinite recursion if both a RETURN trap and the functrace shell option are active. This bug in bash-only functionality triggers a crash when using modernish, so to avoid this, modernish automatically disables the functrace shell option if a RETURN trap is set or pushed and this bug is detected. (bash 4.3, 4.4)
  • BUG_TRAPSUB0: Subshells in traps fail to pass down a nonzero exit status of the last command they execute, under certain conditions or consistently, depending on the shell. (bash <= 4.0; dash 0.5.9 - 0.5.10.2; yash <= 2.47)
  • BUG_TRAPUNSRE: When a trap unsets itself and then resends its own signal, the execution of the trap action (including functions called by it) is not interrupted by the now-untrapped signal; instead, the process terminates after completing the entire trap routine. (bash <= 4.2; zsh)
  • BUG_UNSETUNXP: If an unset variable is given the export flag using the export command, a subsequent unset command does not remove that export flag again. Workaround: assign to the variable first, then unset it to unexport it. (Found on AT&T ksh JM-93u-2011-02-08; Busybox 1.27.0 ash)
  • BUG_VARPREFIX: On a shell with the VARPREFIX feature, expansions of type ${!prefix@} and ${!prefix*} do not find the variable name prefix itself. (AT&T ksh93)
  • BUG_ZSHNAMES: A series of lowercase names, normally okay for script use as per POSIX convention, is reserved for special use. Unsetting these names is impossible in most cases, and changing them may corrupt important shell or system settings. This may conflict with simple-form modernish scripts. This bug is detected on zsh when it was not initially invoked in emulation mode, and emulation mode was enabled using emulate sh post invocation instead (which does not disable these conflicting parameters). As of zsh 5.6, the list of variable names affected is: aliases argv builtins cdpath commands dirstack dis_aliases dis_builtins dis_functions dis_functions_source dis_galiases dis_patchars dis_reswords dis_saliases fignore fpath funcfiletrace funcsourcetrace funcstack functions functions_source functrace galiases histchars history historywords jobdirs jobstates jobtexts keymaps mailpath manpath module_path modules nameddirs options parameters patchars path pipestatus prompt psvar reswords saliases signals status termcap terminfo userdirs usergroups watch widgets zsh_eval_context zsh_scheduled_events
  • BUG_ZSHNAMES2: Two lowercase variable names histchars and signals, normally okay for script use as per POSIX convention, are reserved for special use on zsh, even if zsh is initialised in sh mode (via a sh symlink or using the --emulate sh option at startup). Bug found on: zsh <= 5.7.1. The bug is only detected if BUG_ZSHNAMES is not detected, because this bug's effects are included in that one's.
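
Several of the positional-parameter entries above (BUG_PP_03, BUG_PP_03A, BUG_PP_03C, BUG_PP_04A) name the same workaround: quote the expansion. The following is a minimal sketch in plain POSIX shell that does not require modernish to be loaded. The variable names joined, fallback and maybe_unset are made up for this illustration.

```shell
# Positional parameters with leading and trailing spaces, which the
# affected shells may strip or truncate when $* is left unquoted.
set -- ' one' 'two ' 'three'

# With IFS unset, "$*" joins the parameters with a single space.
unset -v IFS

# Quoted assignment instead of var=$* (BUG_PP_03, BUG_PP_03A).
joined="$*"

# Quoted default value instead of ${var-$*} (BUG_PP_03C).
unset -v maybe_unset
fallback=${maybe_unset-"$*"}

# Brackets make leading and trailing spaces visible.
printf '[%s]\n' "$joined" "$fallback"
```

Both variables should hold the same joined string with every space preserved, regardless of which of the listed shells runs the snippet.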

Warning IDs

Warning IDs do not identify any characteristic of the shell, but instead warn about a potentially problematic system condition that was detected at initialisation time.

  • WRN_EREMBYTE: The current system locale setting supports Unicode UTF-8 multi-byte/variable-length characters, but the utility used by str ematch to match extended regular expressions (EREs) does not support them and treats all characters as single bytes. This means multi-byte characters will be matched as multiple characters, and character [:classes:] within bracket expressions will only match ASCII characters.
  • WRN_MULTIBYTE: The current system locale setting supports Unicode UTF-8 multi-byte/variable-length characters, but the current shell does not support them and treats all characters as single bytes. This means counting or processing multi-byte characters with the current shell will produce incorrect results. Scripts that need compatibility with this system condition should check if thisshellhas WRN_MULTIBYTE and resort to a workaround that uses external utilities where necessary.
  • WRN_NOSIGPIPE: Modernish has detected that the process that launched the current program has set SIGPIPE to ignore, an irreversible condition that is in turn inherited by any process started by the current shell, and their subprocesses, and so on. The system constant $SIGPIPESTATUS is set to the special value 99999 and neither the current shell nor any process it spawns is now capable of receiving SIGPIPE. The -P option to harden is also rendered ineffective. Depending on how a given command foo is implemented, it is now possible that a pipeline such as foo | head -n 10 never ends; if foo doesn't check for I/O errors, the only way it would ever stop trying to write lines is by receiving SIGPIPE as head terminates. Programs that use commands in this fashion should check if thisshellhas WRN_NOSIGPIPE and either employ workarounds or refuse to run if so.
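
As a sketch of the workaround pattern suggested for WRN_MULTIBYTE, the function below counts characters with the shell only when modernish reports no warning, and falls back to an external utility otherwise. The function name char_count is hypothetical. The check for whether thisshellhas is defined at all is an addition made here so that the snippet also runs when modernish has not been loaded; it is not something the modernish documentation prescribes.

```shell
char_count() {
	if command -v thisshellhas >/dev/null 2>&1 \
	&& ! thisshellhas WRN_MULTIBYTE
	then
		# modernish is loaded and raised no warning: trust the shell.
		printf '%s\n' "${#1}"
	else
		# Warning present, or modernish not loaded: let wc(1) count
		# characters according to the current locale, then strip the
		# padding some wc implementations add around the number.
		printf '%s' "$1" | wc -m | tr -d '[:space:]'
		echo
	fi
}

char_count 'hello'
```

Falling back to the external utility whenever the state is unknown is the conservative choice: it costs a fork but does not depend on the shell's own multi-byte support.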

Appendix B: Regression test suite

Modernish comes with a suite of regression tests to detect bugs in modernish itself, which can be run using modernish --test after installation. By default, it will run all the tests verbosely but without tracing the command execution. The install.sh installer will run modernish --test -eqq on the selected shell before installation.

A few options are available to specify after --test:

  • -h: show help.
  • -e: disable or reduce expensive (i.e. slow or memory-hogging) tests.
  • -q: quieter operation; report expected fails [known shell bugs] and unexpected fails [bugs in modernish]. Add -q again for quietest operation (report unexpected fails only).
  • -s: entirely silent operation.
  • -t: run only specific test sets or tests. Test sets are those listed in the full default output of modernish --test. This option requires an option-argument in the following format:
    testset1:num1,num2,/testset2:num1,num2,/
    The colon followed by numbers is optional; if omitted, the entire set will be run, otherwise the given numbered tests will be run in the given order. Example: modernish --test -t match:2,4,7/arith/shellquote:1 runs test 2, 4 and 7 from the match set, the entire arith set, and only test 1 from the shellquote set. A testset can also be given as the incomplete beginning of a name or as a shell glob pattern. In that case, all matching sets will be run.
  • -x: trace each test using the shell's xtrace facility. Each trace is stored in a separate file in a specially created temporary directory. By default, the trace is deleted if a test does not produce an unexpected fail. Add -x again to keep expected fails as well, and again to keep all traces regardless of result. If any traces were saved, modernish will tell you the location of the temporary directory at the end, otherwise it will silently remove the directory again.
  • -E: don't run any tests, but output a command to open the tests that would have been run in your editor. The editor from the VISUAL or EDITOR environment variable is used, with vi as a default. This option should be used together with -t to specify tests. All other options are ignored.
  • -F: takes an argument with the name or path to a find utility to prefer when testing LOOP find. More info here.

These short options can be combined so, for example, --test -qxx is the same as --test -q -x -x.
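
To make the -t option-argument format concrete, here is an illustrative sketch in plain POSIX shell that breaks such an argument into test sets and test numbers. It is not modernish's own parser: the function name parse_testspec and its variables (spec, item, set_name, nums) are hypothetical, and since POSIX shell has no local variables they are global. Prefix and glob matching of set names is not attempted.

```shell
parse_testspec() {
	spec=$1
	while [ -n "$spec" ]; do
		# Take the part before the first slash; keep the rest for later.
		case $spec in
		*/*) item=${spec%%/*}; spec=${spec#*/} ;;
		*)   item=$spec; spec= ;;
		esac
		[ -n "$item" ] || continue
		# An optional colon separates the set name from its test numbers.
		case $item in
		*:*) set_name=${item%%:*}; nums=${item#*:}; nums=${nums%,} ;;
		*)   set_name=$item; nums= ;;
		esac
		if [ -n "$nums" ]; then
			printf '%s: tests %s\n' "$set_name" "$nums"
		else
			printf '%s: all tests\n' "$set_name"
		fi
	done
}

parse_testspec 'match:2,4,7/arith/shellquote:1'
```

For the example argument from the list above, this reports tests 2,4,7 for match, all tests for arith, and test 1 for shellquote.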

Difference between capability detection and regression tests

Note the difference between these regression tests and the cap tests listed in Appendix A. The latter are tests for whatever shell is executing modernish: they detect capabilities (features, quirks, bugs) of the current shell. They are meant to be run via thisshellhas and are designed to be taken advantage of in scripts. On the other hand, these tests run by modernish --test are regression tests for modernish itself. It does not make sense to use these in a script.

New/unknown shell bugs can still cause modernish regression tests to fail, of course. That's why some of the regression tests also check for consistency with the results of the capability detection tests: if there is a shell bug in a widespread release version that modernish doesn't know about yet, this in turn is considered to be a bug in modernish, because one of its goals is to know about all the shell bugs in all released shell versions currently seeing significant use.

Testing modernish on all your shells

The testshells.sh program in share/doc/modernish/examples can be used to run the regression test suite on all the shells installed on your system. You could put it as testshells in some convenient location in your $PATH, and then simply run:

testshells modernish --test

(adding any further options you like – for instance, you might like to add -q to avoid very long terminal output). On first run, testshells will generate a list of shells it can find on your system and it will give you a chance to edit it before proceeding.

Appendix C: Supported locales

modernish, like most shells, fully supports two system locales: POSIX (a.k.a. C, a.k.a. ASCII) and Unicode's UTF-8. It will work in other locales, but things like converting to upper/lower case, and matching single characters in patterns, are not guaranteed.

Caveat: some shells or operating systems have bugs that prevent (or lack features required for) full locale support. If portability is a concern, check for thisshellhas WRN_MULTIBYTE or thisshellhas BUG_NOCHCLASS where needed. See Appendix A.

Scripts/programs should not change the locale (LC_* or LANG) after initialising modernish. Doing this might break various functions, as modernish sets specific versions depending on your OS, shell and locale. (Temporarily changing the locale is fine as long as you don't use modernish features that depend on it – for example, setting a specific locale just for an external command. However, if you use harden, see the important note in its documentation!)
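
Setting a locale just for one external command can be done with an assignment prefix, which affects only that command's environment. The following is a minimal sketch in plain POSIX shell with no modernish features involved; the variable names before, after and sorted are made up for this illustration.

```shell
# Remember the shell's own setting to show that it stays untouched.
before=${LC_ALL-unset}

# The LC_ALL=C prefix applies to this one invocation of sort only,
# so the lines are ordered by byte value.
sorted=$(printf '%s\n' b A c | LC_ALL=C sort)

after=${LC_ALL-unset}

printf '%s\n' "$sorted"
printf 'LC_ALL before: %s, after: %s\n' "$before" "$after"
```

Because the shell's own LC_ALL never changes, anything that was set up at initialisation time for the original locale is left alone.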

Appendix D: Supported shells

Modernish builds on the POSIX 2018 Edition standard, so it should run on any sufficiently POSIX-compliant shell and operating system. It uses both bug/feature detection and regression testing to determine whether it can run on any particular shell, so it does not block or support particular shell versions as such. However, modernish has been confirmed to run correctly on the following shells:

  • bash 3.2 or higher
  • Busybox ash 1.20.0 or higher, excluding 1.28.x (also possibly excluding anything older than 1.27.x on UTF-8 locales, depending on your operating system)
  • dash (Debian sh) 0.5.7 or higher, excluding 0.5.10, 0.5.10.1, 0.5.11-0.5.11.4
  • FreeBSD sh 11.0 or higher
  • gwsh
  • ksh 93u+ 2012-08-01, 93u+m
  • mksh version R55 or higher
  • NetBSD sh 9.0 or higher
  • yash 2.40 or higher (2.44+ for POSIX mode)
  • zsh 5.3 or higher

Currently known not to run modernish due to excessive bugs:

Appendix E: zsh: integration with native scripts

This appendix is specific to zsh.

While modernish duplicates some functionality already available natively on zsh, it still has plenty to add. However, writing a normal simple-form modernish script turns emulate sh on for the entire script, so you lose important aspects of the zsh language.

But there is another way – modernish functionality may be integrated with native zsh scripts using 'sticky emulation', as follows:

emulate -R sh -c '. modernish'

This causes modernish functions to run in sh mode while your script will still run in native zsh mode with all its advantages. The following notes apply:

  • Using the safe mode is not recommended, as zsh does not apply split/glob to variable expansions by default, and the modernish safe mode would defeat the ${~var} and ${=var} flags that apply these on a case by case basis. This does mean that:
    • The --split and --glob operators to constructs such as LOOP find are not available. Use zsh expansion flags instead.
    • Quoting literal glob patterns to commands like find remains necessary.
  • Using LOCAL is not recommended. Anonymous functions are the native zsh equivalent.
  • Native zsh loops should be preferred over modernish loops, except where modernish adds functionality not available in zsh (such as LOOP find or user-programmed loops).

See man zshbuiltins under emulate, option -c, for more information.

Appendix F: Bundling modernish with your script

The modernish installer install.sh can bundle one or more scripts with a stripped-down version of the modernish library. This allows the bundled scripts to run with a known version of modernish, whether or not modernish is installed on the user's system. Like modernish itself, bundling is cross-platform and portable (or as portable as your script is).

Bundled scripts are not modified. Instead, for each script, a wrapper script is installed under the same name in the installation root directory. This wrapper automatically looks for a suitable POSIX-compliant shell that passes the modernish battery of fatal bug tests, then sets up the environment to run the real script with modernish on that shell. Your modernish script can be run through the supplied wrapper script from any directory location on any POSIX-compliant operating system, as long as all files remain in the same location relative to each other.

Bundling is always a non-interactive installer operation, with options specified on the command line. The installer usage for bundling is as follows:

install.sh -B -D rootdir [ -d subdir ] [ -s shell ] scriptfile [ scriptfile ... ]

The -B option enables bundling mode. The option does not itself take an option-argument. Instead, any number of scriptfiles to bundle can be given as arguments following all other options. All scripts are bundled with a single copy of modernish. The bundling operation does not deal with any auxiliary files the scripts may require (other than modernish modules); any such need to be added manually after bundling is complete.

The -D option specifies the path to the bundled installation's root directory, where wrapper scripts are installed. This option is mandatory. If the directory doesn't exist, it is created.

The -d option specifies the subdirectory of the -D root directory where the bundled scripts and modernish are installed. It can contain slashes to install the bundle at a deeper directory level. The default subdirectory is bndl. The option-argument can be empty or /, in which case the bundle is installed directly into the installation root directory.

The -s option specifies a preferred shell for the bundled scripts. A shell name or a full path to a shell can be given. Wrapper scripts try the full path first (if any), then try to find a shell with its basename, and then try to find a shell with that basename minus any version number (e.g. bash instead of bash-5.0 or ksh instead of ksh93). If all that doesn't produce a shell that passes fatal bugs tests, it continues with the normal shell search.
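
The search order described for -s can be sketched as a list of candidates derived from the preferred shell. This is an illustration in plain POSIX shell, not the wrapper's actual code: the function name shell_candidates and its variables (pref, base, stem) are hypothetical, and the rule used here for stripping a version number (removing trailing digits, dots and hyphens) is an assumption based only on the two examples given above.

```shell
shell_candidates() {
	pref=$1
	base=${pref##*/}

	# Assumed rule: drop trailing digits, dots and hyphens,
	# so that bash-5.0 becomes bash and ksh93 becomes ksh.
	stem=$base
	while :; do
		case $stem in
		*[0-9] | *[-.]) stem=${stem%?} ;;
		*) break ;;
		esac
	done

	# 1. The full path, if one was given.
	case $pref in
	*/*) printf '%s\n' "$pref" ;;
	esac
	# 2. The basename, to be looked up in $PATH.
	printf '%s\n' "$base"
	# 3. The basename minus any version number, if that differs.
	[ "$stem" = "$base" ] || [ -z "$stem" ] || printf '%s\n' "$stem"
}

shell_candidates /usr/local/bin/bash-5.0
shell_candidates ksh93
```

Each candidate would then still have to pass the fatal bug tests before being used; that step is outside the scope of this sketch.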

This means the script won't fail to launch if the preferred shell can't be found. Instead, it is up to the script itself to refuse to run if required shell-specific conditions are not met. Scripts should use the thisshellhas function to check for any nonstandard capabilities required, or any bugs or quirks that the script is incompatible with (or indeed requires!).

Bundling is supported for both portable-form and simple-form modernish scripts. The installer automatically adapts the wrapper scripts to the form used. For simple-form scripts, the directory containing the bundled modernish core library (by default, .../bndl/bin/modernish) is prefixed to $PATH so that . modernish works. Since simple-form scripts are often more shell-specific, you may want to specify a preferred shell with the -s option.

To save space, the bundled copy of the modernish library is reduced such that all comments are stripped from the code, interactive use is not supported, the regression test suite is not included, thisshellhas does not have the --cache and --show operators, and the cap/*.t capability detection scripts are "statically linked" (directly included) into bin/modernish instead of shipped as separate files. A README.modernish file is added with a short explanation, the licence, and a link for people to get the complete version of modernish. Please do not remove this when distributing bundled scripts.


EOF

