On Thu, Mar 28, 2019 at 5:40 PM Richard Guy Briggs
<rgb(a)redhat.com> wrote:
> On 2019-03-28 11:46, Paul Moore wrote:
> > On Wed, Mar 27, 2019 at 9:12 PM Richard Guy Briggs <rgb(a)redhat.com>
wrote:
> > >
> > > On 2019-03-27 23:42, Ondrej Mosnacek wrote:
> > > > On Fri, Mar 15, 2019 at 7:35 PM Richard Guy Briggs
<rgb(a)redhat.com> wrote:
> > > > > Audit events could happen in a network namespace outside of a
task
> > > > > context due to packets received from the net that trigger an
auditing
> > > > > rule prior to being associated with a running task. The
network
> > > > > namespace could be in use by multiple containers by association
to the
> > > > > tasks in that network namespace. We still want a way to
attribute
> > > > > these events to any potential containers. Keep a list per
network
> > > > > namespace to track these audit container identifiiers.
> > > > >
> > > > > Add/increment the audit container identifier on:
> > > > > - initial setting of the audit container identifier via /proc
> > > > > - clone/fork call that inherits an audit container identifier
> > > > > - unshare call that inherits an audit container identifier
> > > > > - setns call that inherits an audit container identifier
> > > > > Delete/decrement the audit container identifier on:
> > > > > - an inherited audit container identifier dropped when child
set
> > > > > - process exit
> > > > > - unshare call that drops a net namespace
> > > > > - setns call that drops a net namespace
> > > > >
> > > > > See:
https://github.com/linux-audit/audit-kernel/issues/92
> > > > > See:
https://github.com/linux-audit/audit-testsuite/issues/64
> > > > > See:
https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> > > > > Signed-off-by: Richard Guy Briggs <rgb(a)redhat.com>
> > > > > ---
> > > > > include/linux/audit.h | 19 ++++++++++++
> > > > > kernel/audit.c | 86
+++++++++++++++++++++++++++++++++++++++++++++++++--
> > > > > kernel/nsproxy.c | 4 +++
> > > > > 3 files changed, 106 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > > > > index fa19fa408931..70255c2dfb9f 100644
> > > > > --- a/include/linux/audit.h
> > > > > +++ b/include/linux/audit.h
> > > > > @@ -27,6 +27,7 @@
> > > > > #include <linux/ptrace.h>
> > > > > #include <linux/namei.h> /* LOOKUP_* */
> > > > > #include <uapi/linux/audit.h>
> > > > > +#include <linux/refcount.h>
> > > > >
> > > > > #define AUDIT_INO_UNSET ((unsigned long)-1)
> > > > > #define AUDIT_DEV_UNSET ((dev_t)-1)
> > > > > @@ -99,6 +100,13 @@ struct audit_task_info {
> > > > >
> > > > > extern struct audit_task_info init_struct_audit;
> > > > >
> > > > > +struct audit_contid {
> > > > > + struct list_head list;
> > > > > + u64 id;
> > > > > + refcount_t refcount;
> > > >
> > > > Hm, since we only ever touch the refcount under a spinlock, I wonder
> > > > if we could just make it a regular unsigned int (we don't need
the
> > > > atomicity guarantees). OTOH, refcount_t comes with some extra
overflow
> > > > checking, so it's probably better to leave it as is...
> > >
> > > Since the update is done using rcu-safe methods, do we even need the
> > > spin_lock? Neil? Paul?
> >
> > As discussed, the refcount field is protected against simultaneous
> > writes by the spinlock that protects additions/removals from the list
> > as a whole so I don't believe the refcount_t atomicity is critical in
> > this regard.
> >
> > Where it gets tricky, and I can't say I'm 100% confident on my answer
> > here, is if refcount was a regular int and we wanted to access it
> > outside of a spinlock (to be clear, it doesn't look like this patch
> > currently does this). With RCU, if refcount was a regular int
> > (unsigned or otherwise), I believe it would be possible for different
> > threads of execution to potentially see different values of refcount
> > (assuming one thread was adding/removing from the list). Using a
> > refcount_t would protect against this, alternatively, taking the
> > spinlock should also protect against this.
>
> Ok, from the above it isn't clear to me if you are happy with the
> current code or would prefer any changes, or from below that you still
> need to work it through to make a pronouncement. It sounds to me you
> would be ok with *either* spinlock *or* refcount_t, but don't see the
> need for both.
To be fair you didn't ask if I was "happy" with the approach above,
you asked if we needed the spinlock/refcount_t. I believe I answered
that question as comprehensively as I could, but perhaps you wanted a
hard yes or no? In that case, since refcount_t is obviously safer, I
would stick with that for now just to limit the number of possible
failures. If someone smarter than you or I comes along and
definitively says you are 100% safe to use an int, then go ahead and
use an int.
Beyond that, I'm still in the process of reviewing your patches, but I
haven't finished yet, so no "pronouncement" or whatever you want to
call it.
We definately need the spinlock, as its not meant to protect the refcount
alterations, its meant to protect parallel modifications to the list pointers.
Without the spinlock, the list can become corrupted (protecting the refcount is
just a byproduct).
Because of that byproduct, the atomicity of the refcount isn't required, and so
we could modify it to be an int or some other non-atomic ordinal type, but as I
noted, I don't think thats a good idea. The refcount type helps denote clearly
what the variable is used for, making it more readable, and given the relative
non-performance critical path that the reference count is read/written in, and
the relatively more expensive locking we are already doing there, I think there
is more value in using the refcount to make the code legible than making
marginally more performant by altering it to be an int type.
Neil