On 2018-04-18 15:39, Stefan Berger wrote:
 On 04/18/2018 03:23 PM, Richard Guy Briggs wrote:
 > On 2018-04-18 14:45, Stefan Berger wrote:
 > > On 03/15/2018 11:58 PM, Richard Guy Briggs wrote:
 > > > On 2018-03-15 16:27, Stefan Berger wrote:
 > > > > On 03/01/2018 02:41 PM, Richard Guy Briggs wrote:
 > > > > > Implement the proc fs write to set the audit container ID of a
process,
 > > > > > emitting an AUDIT_CONTAINER record to document the event.
 > > > > > 
 > > > > > This is a write from the container orchestrator task to a proc
entry of
 > > > > > the form /proc/PID/containerid where PID is the process ID of
the newly
 > > > > > created task that is to become the first task in a container, or
an
 > > > > > additional task added to a container.
 > > > > > 
 > > > > > The write expects up to a u64 value (unset:
18446744073709551615).
 > > > > > 
 > > > > > This will produce a record such as this:
 > > > > > type=UNKNOWN[1333] msg=audit(1519903238.968:261): op=set pid=596
uid=0 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 auid=0 tty=pts0 ses=1
opid=596 old-contid=18446744073709551615 contid=123455 res=0
 > > > > > 
 > > > > > The "op" field indicates an initial set.  The
"pid" to "ses" fields are
 > > > > > the orchestrator while the "opid" field is the
object's PID, the process
 > > > > > being "contained".  Old and new container ID values
are given in the
 > > > > > "contid" fields, while res indicates its success.
 > > > > > 
 > > > > > It is not permitted to self-set, unset or re-set the container
ID.  A
 > > > > > child inherits its parent's container ID, but then can be
set only once
 > > > > > after.
 > > > > > 
 > > > > > See: 
https://github.com/linux-audit/audit-kernel/issues/32
 > > > > > 
 > > > > > 
 > > > > >     /* audit_rule_data supports filter rules with both integer
and string
 > > > > >      * fields.  It corresponds with AUDIT_ADD_RULE,
AUDIT_DEL_RULE and
 > > > > > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
 > > > > > index 4e0a4ac..0ee1e59 100644
 > > > > > --- a/kernel/auditsc.c
 > > > > > +++ b/kernel/auditsc.c
 > > > > > @@ -2073,6 +2073,92 @@ int audit_set_loginuid(kuid_t loginuid)
 > > > > >     	return rc;
 > > > > >     }
 > > > > > 
 > > > > > +static int audit_set_containerid_perm(struct task_struct *task,
u64 containerid)
 > > > > > +{
 > > > > > +	struct task_struct *parent;
 > > > > > +	u64 pcontainerid, ccontainerid;
 > > > > > +	pid_t ppid;
 > > > > > +
 > > > > > +	/* Don't allow to set our own containerid */
 > > > > > +	if (current == task)
 > > > > > +		return -EPERM;
 > > > > > +	/* Don't allow the containerid to be unset */
 > > > > > +	if (!cid_valid(containerid))
 > > > > > +		return -EINVAL;
 > > > > > +	/* if we don't have caps, reject */
 > > > > > +	if (!capable(CAP_AUDIT_CONTROL))
 > > > > > +		return -EPERM;
 > > > > > +	/* if containerid is unset, allow */
 > > > > > +	if (!audit_containerid_set(task))
 > > > > > +		return 0;
 > > > > I am wondering whether there should be a check for the target process
that
 > > > > will receive the containerid to not have CAP_SYS_ADMIN that would
otherwise
 > > > > allow it to arbitrarily unshare()/clone() and leave the set of
namespaces
 > > > > that may make up the container whose containerid we assign here?
 > > > This is a reasonable question.  This has been debated and I understood
 > > > the conclusion was that without a clear definition of a
"container", the
 > > > task still remains in that container that just now has more
 > > > sub-namespaces (in the case of hierarchical namespaces), we don't
want
 > > > to restrict it in such a way and that allows it to create nested
 > > > containers.  I see setns being more problematic if it could switch to
 > > > another existing namespace that was set up by the orchestrator for a
 > > > different container.  The coming v2 patchset acknowledges this situation
 > > > with the network namespace being potentially shared by multiple
 > > > containers.
 > > Are you going to post v2 soon? We would like to build on top of it for IMA
 > > namespacing and auditing inside of IMA namespaces.
 > I don't know if it addresses your specific needs, but V2 was posted on
 > March 16th along with userspace patches:
 > 	
https://www.redhat.com/archives/linux-audit/2018-March/msg00110.html
 > 	
https://www.redhat.com/archives/linux-audit/2018-March/msg00124.html
 > 
 > V3 is pending.
 Thanks. I hadn't actually looked at primarily due to the ghak and ghau in
 the title. Whatever these may mean. 
They are Github issue numbers:
GHAK: GitHub Audit Kernel
GHAU: GitHub Audit Userspace
GHAD: GitHub Audit Documentation
GHAT: GitHub Audit Testsuite
 Does V2 or will V3 prevent a privileged process to setns() to a
whole
 different set of namespaces and still be audited with that initial container
 id ? 
No, not significantly different from V1 in that respect.
It does not prevent setns(), but will maintain its containerid.
It will prevent games by blocking a child and parent from setting each
other's containerids.
It does check that the task being conainered does not yet have any
children or peer threads.
- RGB
--
Richard Guy Briggs <rgb(a)redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635