On Thu, Oct 29, 2020 at 11:37:23AM -0500, Eric W. Biederman wrote:
> First and foremost: A uid shift on write to a filesystem is a security
> bug waiting to happen. This is especially true in the context of
> facilities like io_uring, that play very aggressive games with how
> process context makes it to system calls.
>
> The only reason containers were not immediately exploitable when
> io_uring was introduced is because the mechanisms are built so that
> even if something escapes containment the security properties still
> apply. Changes to the uid when writing to the filesystem do not have
> that property. The tiniest slip in containment will be a security
> issue.
>
> This is not even the least bit theoretical. I have seen reports of how
> shiftfs+overlayfs created a situation where anyone could read
> /etc/shadow.

This bug was the result of a complex interaction with several
contributing factors. It's fair to say that one component was overlayfs
writing through an id-shifted mount, but the primary cause was related
to how copy-up was done coupled with allowing unprivileged overlayfs
mounts in a user ns. Checks that the mounter had access to the lower fs
file were not done before copying data up, and so the file was copied up
temporarily to the id shifted upperdir. Even though it was immediately
removed, other factors made it possible for the user to get the file
contents from the upperdir.
Regardless, I do think you raise a good point. We need to be wary of any
place the kernel could open files through a shifted mount, especially
when the open could be influenced by userspace.
Perhaps kernel file opens through shifted mounts should be opt-in.
I.e. unless a flag is passed, or a different open interface used, the
open will fail if the dentry being opened is subject to id shifting.
This way any kernel writes which would be subject to id shifting will
only happen through code which has been written to take it into account.
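
As a rough illustration of the opt-in idea, here is a userspace C model
of the check. Every name in it (KOPEN_ALLOW_IDSHIFT, struct model_dentry,
model_kernel_open) is made up for this sketch and does not correspond to
any existing kernel interface; returning -EPERM is likewise only an
assumption about what "the open will fail" might look like.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag: the caller declares it was written with id
 * shifting in mind. */
#define KOPEN_ALLOW_IDSHIFT 0x1u

/* Toy stand-in for a dentry; it only records whether the mount it
 * lives on shifts ids. */
struct model_dentry {
	bool id_shifted;
};

/*
 * Model of an in-kernel file open. Returns 0 on success, or -EPERM
 * when the dentry is subject to id shifting and the caller did not
 * explicitly opt in.
 */
static int model_kernel_open(const struct model_dentry *d,
			     unsigned int flags)
{
	if (d->id_shifted && !(flags & KOPEN_ALLOW_IDSHIFT))
		return -EPERM;
	return 0;
}
```

With a default-deny check like this, an existing in-kernel caller that
passes no flag keeps working on ordinary mounts and fails on shifted
ones until someone audits it and adds the flag.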
Seth