SafeSetID
SafeSetID is an LSM module that gates the setid family of syscalls to restrict UID/GID transitions from a given UID/GID to only those approved by a system-wide allowlist. These restrictions also prohibit the given UIDs/GIDs from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID/GID mappings.
Background
In the absence of file capabilities, processes spawned on a Linux system that need to switch to a different user must be spawned with CAP_SETUID privileges. CAP_SETUID is granted to programs running as root or those running as a non-root user that have been explicitly given the CAP_SETUID runtime capability. It is often preferable to use Linux runtime capabilities rather than file capabilities, because using file capabilities to run a program with elevated privileges opens up possible security holes: any user with access to the file can exec() that program to gain the elevated privileges.
While it is possible to implement a tree of processes by giving full CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a tree of processes under non-root user(s) in the first place. Specifically, since CAP_SETUID allows changing to any user on the system, including the root user, it is an overpowered capability for what is needed in this scenario, especially since programs often only call setuid() to drop privileges to a lesser-privileged user -- not elevate privileges. Unfortunately, there is no generally feasible way in Linux to restrict the potential UIDs that a user can switch to through setuid() beyond allowing a switch to any user on the system. This SafeSetID LSM seeks to provide a solution for restricting setid capabilities in such a way.
The main use case for this LSM is to allow a non-root program to transition to other untrusted uids without full-blown CAP_SETUID capabilities. The non-root program would still need CAP_SETUID to do any kind of transition, but the additional restrictions imposed by this LSM would mean it is a “safer” version of CAP_SETUID, since the non-root program cannot take advantage of CAP_SETUID to do any unapproved actions (e.g. setuid to uid 0 or create/enter a new user namespace). The higher-level goal is to allow for uid-based sandboxing of system services without having to give out CAP_SETUID all over the place just so that non-root programs can drop to even-lesser-privileged uids. This is especially relevant when one non-root daemon on the system should be allowed to spawn other processes as different uids, but it’s undesirable to give the daemon a basically-root-equivalent CAP_SETUID.
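The allowlist idea can be illustrated with a small userspace model. This is only a sketch, not the kernel implementation: the function name `transition_allowed` and the representation of a policy as a set of `(source, destination)` pairs are assumptions made for illustration. It also assumes that an ID with no policy entries is left unrestricted by the LSM, and it ignores the distinction between real, effective and saved IDs.

```python
def transition_allowed(policy, src, dst):
    """Toy model of an allowlist check for a setid transition.

    policy: set of (src, dst) integer pairs (hypothetical representation).
    Returns True if the model would let `src` become `dst`.
    """
    # IDs that appear as a source in at least one rule are "restricted".
    restricted = {rule_src for rule_src, _ in policy}
    if src not in restricted:
        # Assumption: no applicable policy means this model adds no restriction.
        return True
    # A restricted ID may only move to destinations listed for it.
    return (src, dst) in policy


policy = {(123, 456)}
print(transition_allowed(policy, 123, 456))  # listed transition
print(transition_allowed(policy, 123, 0))    # restricted UID trying to become root
print(transition_allowed(policy, 789, 0))    # UID without any policy entry
```

In this model, the process running as uid 123 still needs CAP_SETUID to call setuid() at all; the policy only narrows what that capability can be used for.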
Other Approaches Considered
Solve this problem in userspace
For candidate applications that would like to have restricted setid capabilities as implemented in this LSM, an alternative option would be to simply take away setid capabilities from the application completely and refactor the process spawning semantics in the application (e.g. by using a privileged helper program to do process spawning and UID/GID transitions). Unfortunately, there are a number of semantics around process spawning that would be affected by this, such as fork() calls where the program doesn’t immediately call exec() after the fork(), parent processes specifying custom environment variables or command line args for spawned child processes, or inheritance of file handles across a fork()/exec(). Because of this, a solution that uses a privileged helper in userspace would likely be less appealing to incorporate into existing projects that rely on certain process-spawning semantics in Linux.
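As an illustration of the spawning pattern mentioned above, the following sketch forks a child that keeps running the parent's code (no exec()) and writes to a pipe it inherited across the fork(). The helper name `run_in_child` is hypothetical, and the place where a real service would change UID is marked only with a comment; a privileged helper program could not easily reproduce this pattern, because the child depends on in-memory state and open file descriptors of the parent.

```python
import os


def run_in_child(work):
    """Fork, run work() in the child without exec(), return (output, exit code)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: same program image, inherited file descriptors and memory.
        status = 1
        try:
            os.close(read_fd)
            # A real service would drop privileges here, e.g. with os.setuid().
            os.write(write_fd, work())
            status = 0
        finally:
            # Leave immediately so the child never returns into the parent's code.
            os._exit(status)
    # Parent: close the write end so read() sees EOF when the child exits.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, wait_status = os.waitpid(pid, 0)
    return b"".join(chunks), os.WEXITSTATUS(wait_status)


output, code = run_in_child(lambda: b"hello from child")
print(output, code)
```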
Use user namespaces
Another possible approach would be to run a given process tree in its own user namespace and give programs in the tree setid capabilities. In this way, programs in the tree could change to any desired UID/GID in the context of their own user namespace, and only approved UIDs/GIDs could be mapped back to the initial system user namespace, effectively preventing privilege escalation. Unfortunately, it is not generally feasible to use user namespaces in isolation, without pairing them with other namespace types, which is not always an option. Linux checks for capabilities based on the user namespace that “owns” some entity. For example, Linux has the notion that network namespaces are owned by the user namespace in which they were created. A consequence of this is that capability checks for access to a given network namespace are done by checking whether a task has the given capability in the context of the user namespace that owns the network namespace -- not necessarily the user namespace under which the given task runs. Therefore spawning a process in a new user namespace effectively prevents it from accessing the network namespace owned by the initial namespace. This is a deal-breaker for any application that expects to retain the CAP_NET_ADMIN capability for the purpose of adjusting network configurations. Using user namespaces in isolation also causes problems for other system interactions, including use of pid namespaces and device creation.
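The ownership rule described above can be sketched with a deliberately simplified model. The class `UserNamespace` and the function `has_capability_over` are hypothetical names, and the model omits parts of the real kernel check (for example, the special treatment of the user who created a namespace). It only captures the point made here: a capability held in a child user namespace does not apply to resources owned by the initial user namespace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, eq=False)
class UserNamespace:
    name: str
    parent: Optional["UserNamespace"] = None


def has_capability_over(task_ns, task_caps, cap, owning_ns):
    """Simplified check: does a task in task_ns hold `cap` over a resource
    whose owning user namespace is owning_ns?"""
    ns = owning_ns
    # Walk from the owning namespace up towards the initial namespace.
    while ns is not None:
        if ns is task_ns:
            # The task lives in the owner (or one of its ancestors).
            return cap in task_caps
        ns = ns.parent
    # The task's namespace is not the owner or an ancestor of it.
    return False


init_ns = UserNamespace("init")
child_ns = UserNamespace("child", parent=init_ns)
caps = {"CAP_NET_ADMIN"}

# The network namespace here is assumed to be owned by the initial user namespace.
print(has_capability_over(init_ns, caps, "CAP_NET_ADMIN", init_ns))
print(has_capability_over(child_ns, caps, "CAP_NET_ADMIN", init_ns))
```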
Use an existing LSM
None of the other in-tree LSMs have the capability to gate setid transitions, or even employ the security_task_fix_setuid hook at all. SELinux says of that hook: “Since setuid only affects the current process, and since the SELinux controls are not based on the Linux identity attributes, SELinux does not need to control this operation.”
Directions for use
This LSM hooks the setid syscalls to make sure transitions are allowed if an applicable restriction policy is in place. Policies are configured through securityfs by writing to the safesetid/uid_allowlist_policy and safesetid/gid_allowlist_policy files at the location where securityfs is mounted. The format for adding a policy is ‘<UID>:<UID>’ or ‘<GID>:<GID>’, using literal numbers, and ending with a newline character, for example ‘123:456\n’. Writing an empty string “” will flush the policy. Again, configuring a policy for a UID/GID will prevent that UID/GID from obtaining auxiliary setid privileges, such as allowing a user to set up user namespace UID/GID mappings.
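A minimal sketch of producing policy text in the format described above follows. The helper names `format_policy` and `write_uid_policy` are hypothetical. The mount point `/sys/kernel/security` is the conventional location for securityfs but is an assumption about the target system, and combining several rules in a single write is likewise an assumption. Writing the file requires sufficient privileges and a kernel with SafeSetID enabled.

```python
from pathlib import Path

# Assumed securityfs mount point; adjust for the system at hand.
SECURITYFS = Path("/sys/kernel/security")


def format_policy(rules):
    """Turn (source_id, destination_id) pairs into newline-terminated policy lines."""
    lines = []
    for src, dst in rules:
        if not isinstance(src, int) or not isinstance(dst, int) or src < 0 or dst < 0:
            raise ValueError(f"IDs must be non-negative integers: {src!r}:{dst!r}")
        lines.append(f"{src}:{dst}\n")
    return "".join(lines)


def write_uid_policy(rules, root=SECURITYFS):
    """Write the formatted rules to the UID policy file under `root`."""
    path = Path(root) / "safesetid" / "uid_allowlist_policy"
    path.write_text(format_policy(rules))


print(repr(format_policy([(123, 456)])))
```

A GID policy would be written the same way to safesetid/gid_allowlist_policy.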
Note on GID policies and setgroups()
In v5.9 we are adding support for limiting CAP_SETGID privileges as was done previously for CAP_SETUID. However, for compatibility with common sandboxing related code conventions in userspace, we currently allow arbitrary setgroups() calls for processes with CAP_SETGID restrictions. Until we add support in a future release for restricting setgroups() calls, these GID policies add no meaningful security. setgroups() restrictions will be enforced once we have the policy checking code in place, which will rely on GID policy configuration code added in v5.9.