Re: Preferred subj= with multiple LSMs

Friday, 19 July 2019

On Wed, Jul 17, 2019 at 7:02 PM Casey Schaufler <casey(a)schaufler-ca.com> wrote:
...
 On 7/17/2019 9:23 AM, Paul Moore wrote:
 > On Wed, Jul 17, 2019 at 11:49 AM Casey Schaufler <casey(a)schaufler-ca.com> wrote:
 >> On 7/17/2019 5:14 AM, Paul Moore wrote:
 >>> On Tue, Jul 16, 2019 at 7:47 PM Casey Schaufler <casey(a)schaufler-ca.com> wrote:
 >>>> On 7/16/2019 4:13 PM, Paul Moore wrote:
 >>>>> On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey(a)schaufler-ca.com> wrote:
 >>>>>> It sounds as if some variant of the Hideous format:
 >>>>>>
 >>>>>>         subj=selinux='a:b:c:d',apparmor='z'
 >>>>>>         subj=selinux/a:b:c:d/apparmor/z
 >>>>>>         subj=(selinux)a:b:c:d/(apparmor)z
 >>>>>>
 >>>>>> would meet Steve's searchability requirements, but with significant
 >>>>>> parsing performance penalties.
 >>>>> I think "hideous format" sums it up nicely.  Whatever we choose here
 >>>>> we are likely going to be stuck with for some time and I'm near to
 >>>>> 100% that multiplexing the labels onto a single field is going to be a
 >>>>> disaster.
 >>>> If the requirement is that subj= be searchable I don't see much of
 >>>> an alternative to a Hideous format. If we can get past that, and say
 >>>> that all subj_* have to be searchable we can avoid that set of issues.
 >>>> Instead of:
 >>>>
 >>>>         s = strstr(source, "subj=")
 >>>>         search_after_subj(s, ...);
 >>> This example does a lot of hand waving in search_after_subj(...)
 >>> regarding parsing the multiplexed LSM label.  Unless we restrict the
 >>> LSM label formats (which seems both wrong, and too late IMHO)
 >> I don't think it's too late, and I think it would be healthy
 >> to restrict LSM "contexts" to character sets that make command
 >> line specification possible. Embedded newlines? Ewwww.
 > That would imply that the delimiter you would choose for the
 > multiplexed approach would be something odd (I think you suggested
 > 0x02, or similar, earlier) which would likely require the multiplexed
 > subj field to become a hex encoded field which would be very
 > unfortunate in my opinion and would technically break with the current
 > subj/obj field format spec.  Picking a normal-ish delimiter, and
 > restricting its use by LSMs seems wrong to me.

 Just say "no" to hex encoding!

Yes, it's best avoided.

...
 BTW, keys are not hex encoded.

The kernel keyring keys?  Not really relevant here I don't think.

...
 We've never had to think about having general rules on
 what security modules do before, because with only one
 active each could do whatever it wanted without fear of
 conflict. If there is already a character that none of
 the existing modules use, how would it be wrong to
 reserve it?

"We've never had to think about having general rules on what security
modules do before..."

We famously haven't imposed restrictions on the label format before
now, and this seems like a pretty poor reason to start.

...
 > It's important to remember that Steve's strstr() comment only reflects
 > his set of userspace tools.  When you start talking about log
 > aggregation and analytics, it seems very likely that there are other
 > tools in use, likely with their own parsers that do much more
 > complicated searches than a simple strstr() call.

 Point. But long term, they'll have to be updated to accommodate
 whatever we decide on. Which makes the "simple" case, where one
 security module is in use all the more important.

Both the multiplexed and subj_X proposals handle the single major LSM
case the same: identical to what we have now.  Regardless of how
important the single major LSM case may be, it isn't a distinguishing
factor in this discussion.

...
 >>>> we have
 >>>>
 >>>>         s = source
 >>>>         for (i = 0; i < lsm_slots ; i++) {
 >>>>                 s = strstr(s, "subj_")
 >>>>                 if (!s)
 >>>>                         break;
 >>>>                 s = search_after_subj_(s, lsm_slot_name[i], ...)
 >>> The hand waving here in search_after_subj_(...) is much less;
 >>> essentially you just match "subj_X" and then you can take the field
 >>> value as the LSM's label without having to know the format, the policy
 >>> loaded, etc.  It is both safer and doesn't require knowledge of the
 >>> LSMs (the LSM "name" can be specified as a parameter to the search
 >>> tool).
 >> You can do that with the Hideous format as well. I wouldn't
 >> say which would be easier without delving into the audit user
 >> space.

 > No, you can't.  You still need to parse the multiplexed mess, that's
 > the problem.

 You move the parsing problem to the record, where you have to
 look for subj_selinux= instead of having the parsing problem in
 the subj= field, where you look for something like selinux=
 within the field. Neither looks like the work of an afternoon to
 get right.

Finding subj_X in an audit record is no different than finding any
other field in a record.  Parsing the multiplexed label mess is a
whole different problem and prone to lots of mistakes.
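
A minimal sketch of the first half of that point, assuming a record is
a run of space-separated key=value pairs whose values contain no
spaces.  That layout, the find_field() helper, and the sample record
below are assumptions for illustration, not code from the audit
userspace tools.  The lookup treats the value as opaque text, so it
needs no knowledge of any LSM's label format:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the value of field `name` from a
 * space-separated key=value record into `out`.
 * Returns 0 on success, -1 if the field is absent or does not fit.
 */
static int find_field(const char *rec, const char *name,
                      char *out, size_t outlen)
{
        size_t nlen = strlen(name);
        const char *p = rec;

        if (nlen == 0)
                return -1;

        while ((p = strstr(p, name)) != NULL) {
                /* Accept only a whole field name: at a boundary, then '='. */
                if ((p == rec || p[-1] == ' ') && p[nlen] == '=') {
                        const char *v = p + nlen + 1;
                        size_t vlen = strcspn(v, " ");

                        if (vlen >= outlen)
                                return -1;
                        memcpy(out, v, vlen);
                        out[vlen] = '\0';
                        return 0;
                }
                p += nlen;      /* partial match, keep scanning */
        }
        return -1;
}
```

With a made-up record such as
"type=SYSCALL pid=42 subj_selinux=a:b:c:d subj_apparmor=z", calling
find_field(rec, "subj_apparmor", buf, sizeof(buf)) leaves "z" in buf,
and the same call works unchanged for any other field name.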

...
 It probably looks like I'm arguing for the Hideous format option.
 That would require less work and code disruption, so it is tempting
 to push for it. But I would have to know the user space side a
 whole lot better than I do to feel good about pushing anything that
 isn't obviously a good choice. I kind of prefer Paul's "subj=?"
 approach, but as it's harder, I don't want to spend too much time
 on it if it gets me a big, juicy, well deserved NAK.

I didn't want to have to NAK this, but if that is what it is going to
take, so be it ... as it currently stands I'm NAK'ing the
multiplexed approach.  You don't have to go with the subj_X approach,
but the multiplexed approach is a terrible idea and I can almost
guarantee that we would be regretting that choice in a few years' time.
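
To make the parsing concern concrete, here is a deliberately naive
sketch for the "selinux/a:b:c:d/apparmor/z" variant quoted earlier.
The mux_label() helper and the sample fields are hypothetical; the
only outside fact relied on is that AppArmor profile names are
commonly path names and so contain '/':

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, naive parser for a multiplexed "lsm/label/lsm/label"
 * field.  It takes the text after "<lsm>/" up to the next '/' as the
 * label.  Returns 0 on success, -1 if the LSM name is not found or
 * the label does not fit in `out`.
 */
static int mux_label(const char *field, const char *lsm,
                     char *out, size_t outlen)
{
        size_t llen = strlen(lsm);
        const char *p = field;

        if (llen == 0)
                return -1;

        while ((p = strstr(p, lsm)) != NULL) {
                if ((p == field || p[-1] == '/') && p[llen] == '/') {
                        const char *v = p + llen + 1;
                        /* The delimiter doubles as label content. */
                        size_t vlen = strcspn(v, "/");

                        if (vlen >= outlen)
                                return -1;
                        memcpy(out, v, vlen);
                        out[vlen] = '\0';
                        return 0;
                }
                p += llen;
        }
        return -1;
}
```

For "selinux/a:b:c:d/apparmor/z" this returns "z" as expected, but for
"selinux/a:b:c:d/apparmor//usr/sbin/cupsd" it reports success with an
empty label, because the first character of the label is also the
delimiter.  Getting this right needs either escaping rules or
per-LSM knowledge of what a label may contain.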

-- 
paul moore
www.paul-moore.com
