No New Privileges Flag
The execve system call can grant a newly-started program privileges that its parent did not have. The most obvious examples are setuid/setgid programs and file capabilities. To prevent the parent program from gaining these privileges as well, the kernel and user code must be careful to prevent the parent from doing anything that could subvert the child. For example:
- The dynamic loader handles LD_* environment variables differently if a program is setuid.
- chroot is disallowed to unprivileged processes, since it would allow /etc/passwd to be replaced from the point of view of a process that inherited chroot.
- The exec code has special handling for ptrace.
These are all ad-hoc fixes. The no_new_privs bit (since Linux 3.5) is a new, generic mechanism to make it safe for a process to modify its execution environment in a manner that persists across execve. Any task can set no_new_privs. Once the bit is set, it is inherited across fork, clone, and execve and cannot be unset. With no_new_privs set, execve() promises not to grant the privilege to do anything that could not have been done without the execve call. For example, the setuid and setgid bits will no longer change the uid or gid; file capabilities will not add to the permitted set, and LSMs will not relax constraints after execve.
To set no_new_privs, use:
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
Be careful, though: LSMs might also not tighten constraints on exec in no_new_privs mode. (This means that setting up a general-purpose service launcher to set no_new_privs before execing daemons may interfere with LSM-based sandboxing.)
Note that no_new_privs does not prevent privilege changes that do not involve execve(). An appropriately privileged task can still call setuid(2) and receive SCM_RIGHTS datagrams.
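The behavior described above (setting the bit, inheritance across fork, and the inability to unset it) can be checked with a small sketch. The helper names here are hypothetical and exist only for illustration; the sketch assumes a Linux 3.5 or later kernel and a libc whose `<sys/prctl.h>` exposes `PR_SET_NO_NEW_PRIVS` and `PR_GET_NO_NEW_PRIVS`.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* All function names below are illustrative, not part of any kernel API. */

/* Set no_new_privs on the calling thread. Returns 0 on success, -1 on error. */
int set_no_new_privs(void)
{
    return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Read the bit back: 1 if set, 0 if clear, -1 on error. */
int get_no_new_privs(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}

/*
 * Fork a child and report whether the child sees the bit set.
 * Returns 1 if the child inherited it, 0 if not, -1 on error.
 */
int child_no_new_privs(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(get_no_new_privs() == 1 ? 1 : 0);
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/*
 * Attempt to clear the bit by passing 0 as the second argument.
 * Returns 0 if the call unexpectedly succeeded, otherwise the errno value.
 */
int try_clear_no_new_privs(void)
{
    errno = 0;
    if (prctl(PR_SET_NO_NEW_PRIVS, 0, 0, 0, 0) == 0)
        return 0;
    return errno;
}
```

A caller would typically invoke `set_no_new_privs()` once, early, before any fork or execve whose children should be covered. On current kernels the attempt to clear the bit is rejected with EINVAL, and `get_no_new_privs()` keeps returning 1 afterwards.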
There are two main use cases for no_new_privs so far:
- Filters installed for the seccomp mode 2 sandbox persist across execve and can change the behavior of newly-executed programs. Unprivileged users are therefore only allowed to install such filters if no_new_privs is set.
- By itself, no_new_privs can be used to reduce the attack surface available to an unprivileged user. If everything running with a given uid has no_new_privs set, then that uid will be unable to escalate its privileges by directly attacking setuid, setgid, and fcap-using binaries; it will need to compromise something without the no_new_privs bit set first.
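The first use case can be sketched as follows: an unprivileged process sets no_new_privs and then installs a seccomp mode 2 filter. The filter here is deliberately trivial (it allows every system call) so that the example stays harmless; a real filter would at least check the architecture and the system call number. The function names are hypothetical, and the sketch assumes a kernel built with seccomp filter support.

```c
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>

/*
 * Install a one-instruction seccomp filter that allows everything.
 * Illustrative helper; returns 0 on success, -1 on error (errno set).
 * Without no_new_privs (or CAP_SYS_ADMIN) this is expected to fail
 * with EACCES.
 */
int install_allow_all_filter(void)
{
    struct sock_filter insns[] = {
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
        .filter = insns,
    };

    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog, 0, 0);
}

/*
 * Set no_new_privs first, then install the filter. This ordering is
 * what lets an unprivileged task use seccomp mode 2.
 */
int install_filter_unprivileged(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return install_allow_all_filter();
}
```

Both the bit and the filter persist across a later execve, so a launcher could call `install_filter_unprivileged()` and then exec the program it wants to confine.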
In the future, other potentially dangerous kernel features could become available to unprivileged tasks if no_new_privs is set. In principle, several options to unshare(2) and clone(2) would be safe when no_new_privs is set, and no_new_privs + chroot is considerably less dangerous than chroot by itself.