On Thu, May 30, 2019 at 1:09 PM Serge E. Hallyn <serge(a)hallyn.com> wrote:
> On Wed, May 29, 2019 at 06:39:48PM -0400, Paul Moore wrote:
> > On Wed, May 29, 2019 at 6:28 PM Tycho Andersen <tycho(a)tycho.ws> wrote:
> > > On Wed, May 29, 2019 at 12:03:58PM -0400, Paul Moore wrote:
> > > > On Wed, May 29, 2019 at 11:34 AM Tycho Andersen <tycho(a)tycho.ws> wrote:
> > > > > On Wed, May 29, 2019 at 11:29:05AM -0400, Paul Moore wrote:
> > > > > > On Wed, May 29, 2019 at 10:57 AM Tycho Andersen <tycho(a)tycho.ws> wrote:
> > > > > > > On Mon, Apr 08, 2019 at 11:39:09PM -0400, Richard Guy Briggs wrote:
...
> > > > > > The current thinking
> > > > > > is that you would only change the audit container ID from one
> > > > > > set/inherited value to another if you were nesting containers, in
> > > > > > which case the nested container orchestrator would need to be granted
> > > > > > CAP_AUDIT_CONTROL (which everyone to date seems to agree is a workable
> > > > > > compromise).
> > >
> > > won't work in user namespaced containers, because they will never be
> > > capable(CAP_AUDIT_CONTROL); so I don't think this will work for
> > > nesting as is. But maybe nobody cares :)
> >
> > That's fun :)
> >
> > To be honest, I've never been a big fan of supporting nested
> > containers from an audit perspective, so I'm not really too upset
> > about this. The k8s/cri-o folks seem okay with this, or at least I
> > haven't heard any objections; lxc folks, what do you have to say?
>
> I actually thought the answer to this (when last I looked, "some time" ago)
> was that userspace should track an audit message saying "task X in
> container Y is changing its auditid to Z", and then decide to also track Z.
> This should be doable, but a lot of extra work in userspace.
>
> Per-userns containerids would also work. So task X1 is in containerid
> 1 on the host and creates a new task Y in new userns; it continues to
> be reported in init_user_ns as containerid 1 forever; but in its own
> userns it can request to be known as some other containerid. Audit
> socks would be per-userns, allowing root in a container to watch for
> audit events in its own (and descendent) namespaces.
>
> But again I'm sure we've gone over all this in the last few years.
>
> I suppose we can look at this as a "first step", and talk about
> making it user-ns-nestable later. But agreed it's not useful in a
> lot of situations as is.
[REMINDER: It is an "*audit* container ID" and not a general
"container ID" ;) Smiley aside, I'm not kidding about that part.]

I'm not interested in supporting/merging something that isn't useful;
if this doesn't work for your use case then we need to figure out what
would work. It sounds like nested containers are much more common in
the lxc world, can you elaborate a bit more on this?

As far as the possible solutions you mention above, I'm not sure I
like the per-userns audit container IDs, I'd much rather just emit the
necessary tracking information via the audit record stream and let the
log analysis tools figure it out. However, the bigger question is how
to limit (re)setting the audit container ID when you are in a non-init
userns. For reasons already mentioned, using capable() is a
non-starter for everything but the initial userns, and using ns_capable()
is equally poor as it essentially allows any userns the ability to
munge its audit container ID (obviously not good). It appears we
need a different method for controlling access to the audit container
ID.
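
To make the capable() vs. ns_capable() problem concrete, here is a toy
Python model, not kernel code. The class names, the simplified
"capability applies to your own userns and its descendants" rule, and the
function signatures are all assumptions made for illustration; the real
kernel checks are more involved.

```python
from dataclasses import dataclass
from typing import Optional

CAP_AUDIT_CONTROL = "CAP_AUDIT_CONTROL"


@dataclass(eq=False)
class UserNS:
    """Hypothetical stand-in for a user namespace."""
    name: str
    parent: Optional["UserNS"] = None


INIT_USER_NS = UserNS("init_user_ns")


@dataclass
class Task:
    """Hypothetical task: capabilities are held relative to user_ns."""
    user_ns: UserNS
    caps: frozenset


def ns_capable(task: Task, ns: UserNS, cap: str) -> bool:
    # Simplified rule (an assumption): a capability held in the task's
    # own userns counts for that userns and for any descendant of it.
    cur = ns
    while cur is not None:
        if cur is task.user_ns:
            return cap in task.caps
        cur = cur.parent
    return False


def capable(task: Task, cap: str) -> bool:
    # capable() is modeled as a check against the initial userns only.
    return ns_capable(task, INIT_USER_NS, cap)


container_ns = UserNS("container", parent=INIT_USER_NS)
host_root = Task(INIT_USER_NS, frozenset({CAP_AUDIT_CONTROL}))
container_root = Task(container_ns, frozenset({CAP_AUDIT_CONTROL}))
```

In this model capable(container_root, CAP_AUDIT_CONTROL) is always false,
which is the nesting problem, while ns_capable(container_root,
container_ns, CAP_AUDIT_CONTROL) is true for root in any userns, which is
why it gives no real protection on its own.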
Punting this to a LSM hook is an obvious thing to do, and something we
might want to do anyway, but currently audit doesn't rely on the LSM
for proper/safe operation and I'm not sure I want to change that now.

The next obvious thing is to create some sort of access control knob
in audit itself. Perhaps an auditctl operation that would allow the
administrator to specify which containers, via their corresponding
audit container IDs, are allowed to change their audit container ID?
The permission granting would need to be done in the init userns, but
it would allow containers with a non-init userns the ability to change
their audit container ID. We would probably still want a
ns_capable(CAP_AUDIT_CONTROL) restriction in this case.
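
A rough sketch of how such a knob might behave, again as a toy Python
model rather than a proposed implementation. Every name here
(AuditContidPolicy, grant, may_change, and the boolean parameters) is
invented for illustration and does not correspond to any existing
auditctl option or kernel interface. The assumption that the initial
userns only needs the capability check is mine as well.

```python
class AuditContidPolicy:
    """Hypothetical allowlist of audit container IDs that may be changed."""

    def __init__(self):
        self._allowed = set()

    def grant(self, contid: int, *, in_init_userns: bool) -> None:
        # Permission granting is only accepted from the initial userns.
        if not in_init_userns:
            raise PermissionError("grants must come from the init userns")
        self._allowed.add(contid)

    def may_change(self, current_contid: int, *, in_init_userns: bool,
                   has_ns_cap_audit_control: bool) -> bool:
        # The ns_capable(CAP_AUDIT_CONTROL)-style check still applies.
        if not has_ns_cap_audit_control:
            return False
        # Assumption: in the init userns the capability alone suffices.
        if in_init_userns:
            return True
        # In a non-init userns the admin must have allowed this container.
        return current_contid in self._allowed


policy = AuditContidPolicy()
policy.grant(1, in_init_userns=True)
```

With this policy, a task in audit container 1 that holds the capability in
its own userns may change its audit container ID, while a task in audit
container 2 may not until the administrator grants it.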
Does anyone else have any other ideas?
--
paul moore
www.paul-moore.com